<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=DistOS_2023W_2023-02-27</id>
	<title>DistOS 2023W 2023-02-27 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=DistOS_2023W_2023-02-27"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2023W_2023-02-27&amp;action=history"/>
	<updated>2026-04-08T03:23:07Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2023W_2023-02-27&amp;diff=24366&amp;oldid=prev</id>
		<title>Soma: Created page with &quot;==Notes== &lt;pre&gt; Web Scale ---------  * Midterm grading is ongoing, hopefully will be finished this week * Proposal deadline extended to Friday, will try to give you some material this week to help   - I&#039;ve been ignoring some of you on Teams, I will be replying today   Up to this point in the class, we&#039;ve really been focused on distributed systems for running classic UNIX-like workloads  - individual developer/engineer working at a workstation on their stuff  Key problems...&quot;</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2023W_2023-02-27&amp;diff=24366&amp;oldid=prev"/>
		<updated>2023-02-27T17:56:40Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;==Notes== &amp;lt;pre&amp;gt; Web Scale ---------  * Midterm grading is ongoing, hopefully will be finished this week * Proposal deadline extended to Friday, will try to give you some material this week to help   - I&amp;#039;ve been ignoring some of you on Teams, I will be replying today   Up to this point in the class, we&amp;#039;ve really been focused on distributed systems for running classic UNIX-like workloads  - individual developer/engineer working at a workstation on their stuff  Key problems...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Notes==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Web Scale&lt;br /&gt;
---------&lt;br /&gt;
&lt;br /&gt;
* Midterm grading is ongoing, hopefully will be finished this week&lt;br /&gt;
* Proposal deadline extended to Friday, will try to give you some material this week to help&lt;br /&gt;
  - I&amp;#039;ve been ignoring some of you on Teams, I will be replying today&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Up to this point in the class, we&amp;#039;ve really been focused on distributed systems for running classic UNIX-like workloads&lt;br /&gt;
 - individual developer/engineer working at a workstation on their stuff&lt;br /&gt;
&lt;br /&gt;
Key problems here are&lt;br /&gt;
 - making small files available to any workstation&lt;br /&gt;
 - authentication at a workstation (Kerberos, out of scope)&lt;br /&gt;
 - running jobs on remote computers&lt;br /&gt;
   - process migration is nice but not so common a need in practice&lt;br /&gt;
&lt;br /&gt;
We&amp;#039;ve had hints at larger problems&lt;br /&gt;
 - DSM can be used for large-scale scientific applications&lt;br /&gt;
&lt;br /&gt;
Scientific applications will consume as many resources as you throw at them&lt;br /&gt;
 - but mostly they&amp;#039;ve been addressed through specialized solutions&lt;br /&gt;
 - classic method is distributed apps built on MPI (a message passing API)&lt;br /&gt;
 - look up &amp;quot;Beowulf clusters&amp;quot; for classic implementations&lt;br /&gt;
&lt;br /&gt;
These aren&amp;#039;t so &amp;quot;distributed OS&amp;quot;-like, because it is like coding in assembly language - not much abstraction beyond what UNIX+networking provides&lt;br /&gt;
&lt;br /&gt;
But the web arrived in the 1990s and changed everything&lt;br /&gt;
&lt;br /&gt;
Why did the web change the computing landscape?  There were new problems to solve!&lt;br /&gt;
&lt;br /&gt;
Today it is commonplace to have web applications that are accessed by millions of people concurrently.&lt;br /&gt;
* Initially web applications ran on a single computer with just a web server process.&lt;br /&gt;
* Then, that process was connected to a backend database (MySQL) and Perl scripts (CGI).&lt;br /&gt;
* This was all great for a modest number of users.  But then the world got on the world wide web and we needed to SCALE&lt;br /&gt;
&lt;br /&gt;
The first company to really try to scale with the growth of the web was Google&lt;br /&gt;
 - before, companies bought the biggest, most expensive computers they could and load balanced between them carefully&lt;br /&gt;
 - Google realized that you could use lots of cheap computers if you could deal with their cheapness in software (i.e., distributing workload, dealing with failures)&lt;br /&gt;
 - and this really mattered for search engines&lt;br /&gt;
&lt;br /&gt;
Search engines were the first hard problem because to index the web you had to download it, and the web was growing exponentially in size&lt;br /&gt;
 - but how do you make an application layer for software to run on&lt;br /&gt;
   that takes advantage of many many computers?&lt;br /&gt;
&lt;br /&gt;
Papers for the second half of the semester are to give you an idea of how large-scale systems are built to support web applications.  We need:&lt;br /&gt;
 - filesystems&lt;br /&gt;
 - databases&lt;br /&gt;
 - computation/data processing&lt;br /&gt;
 - plumbing to support the above (coordination services)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Do these things make a &amp;quot;distributed operating system&amp;quot;?&lt;br /&gt;
 - yes and no&lt;br /&gt;
 - yes: real applications are built on top of these abstractions&lt;br /&gt;
 - no: abstractions aren&amp;#039;t general purpose, need different ones&lt;br /&gt;
   for different applications (an OS normally has one set of abstractions that&lt;br /&gt;
   everyone uses)&lt;br /&gt;
&lt;br /&gt;
Why can&amp;#039;t we just have one unifying distributed OS then?&lt;br /&gt;
 - communication and fault tolerance&lt;br /&gt;
 - communication: it is always expensive, and how to minimize it&lt;br /&gt;
   depends on the application.  (Parallel programming is always hard)&lt;br /&gt;
     - really, we make most of the app embarrassingly parallel&lt;br /&gt;
       (requiring no coordination), and the remainder becomes the&lt;br /&gt;
       hard part that requires special engineering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Soma</name></author>
	</entry>
</feed>