<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=DistOS_2021F_2021-10-12</id>
	<title>DistOS 2021F 2021-10-12 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=DistOS_2021F_2021-10-12"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2021F_2021-10-12&amp;action=history"/>
	<updated>2026-05-12T23:28:38Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2021F_2021-10-12&amp;diff=23411&amp;oldid=prev</id>
		<title>Soma: /* Notes */</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2021F_2021-10-12&amp;diff=23411&amp;oldid=prev"/>
		<updated>2021-10-13T02:24:33Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Notes&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;a href=&quot;https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2021F_2021-10-12&amp;amp;diff=23411&amp;amp;oldid=23405&quot;&gt;Show changes&lt;/a&gt;</summary>
		<author><name>Soma</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2021F_2021-10-12&amp;diff=23405&amp;oldid=prev</id>
		<title>Soma: Created page with &quot;==Notes==  &lt;pre&gt; Lecture 9 ---------  Plan for next week for groups  - generated by a script  - posted at start of class, you&#039;ll need to    manually go to your assigned room...&quot;</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2021F_2021-10-12&amp;diff=23405&amp;oldid=prev"/>
		<updated>2021-10-13T02:20:21Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;==Notes==  &amp;lt;pre&amp;gt; Lecture 9 ---------  Plan for next week for groups  - generated by a script  - posted at start of class, you&amp;#039;ll need to    manually go to your assigned room...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Notes==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Lecture 9&lt;br /&gt;
---------&lt;br /&gt;
&lt;br /&gt;
Plan for next week for groups&lt;br /&gt;
 - generated by a script&lt;br /&gt;
 - posted at start of class, you&amp;#039;ll need to&lt;br /&gt;
   manually go to your assigned room&lt;br /&gt;
 - based on what you submit&lt;br /&gt;
 - 4 groups:&lt;br /&gt;
   - high response grades&lt;br /&gt;
   - low response grades&lt;br /&gt;
   - high quiz grades&lt;br /&gt;
   - low quiz grades&lt;br /&gt;
 - discussion groups will first be formed randomly within each of these four&lt;br /&gt;
   groups, and then from nearby groups (high with high, low with low)&lt;br /&gt;
 - will respect group exclusion list&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Zookeeper &amp;amp; Chubby&lt;br /&gt;
 - why do we need these?&lt;br /&gt;
 - and why don&amp;#039;t we have them on individual systems, or on older&lt;br /&gt;
   distributed systems (Plan 9, NFS, LOCUS, etc.)?&lt;br /&gt;
&lt;br /&gt;
We need them because&lt;br /&gt;
 - failures: hosts die, so we need more than one to run the service&lt;br /&gt;
   - so need to be able to recover&lt;br /&gt;
 - but really, it is *because* it is distributed that we need these complex algorithms&lt;br /&gt;
&lt;br /&gt;
Servers A and B&lt;br /&gt;
Clients X and Y&lt;br /&gt;
&lt;br /&gt;
A and B are supposed to provide access to the &amp;quot;same&amp;quot; data&lt;br /&gt;
&lt;br /&gt;
Client X reads and writes to A&lt;br /&gt;
Client Y reads and writes to B&lt;br /&gt;
&lt;br /&gt;
&amp;quot;like networked mutex-semaphore systems&amp;quot;&lt;br /&gt;
  - except there is NO SHARED MEMORY&lt;br /&gt;
&lt;br /&gt;
You want multiple hosts to behave as if they were the same host&lt;br /&gt;
 - unified view of state that, externally, is always consistent&lt;br /&gt;
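&lt;br /&gt;
A toy sketch of the A/B, X/Y scenario above. The Replica class and the&lt;br /&gt;
shared write log are made up for illustration; this is not how Chubby or&lt;br /&gt;
Zookeeper are implemented, and it skips the hard part (agreeing on the order):&lt;br /&gt;
&lt;br /&gt;
```python
class Replica:
    # Hypothetical stand-in for server A or B: a plain key-value store.
    def __init__(self):
        self.data = {}

    def write(self, key, value):
        self.data[key] = value


# No coordination: X writes to A, Y writes to B, and the copies diverge.
a, b = Replica(), Replica()
a.write('leader', 'X')
b.write('leader', 'Y')
diverged = a.data != b.data

# With one agreed order of writes applied everywhere, the copies match.
# Agreeing on this order is the part that consensus protocols solve.
log = [('leader', 'X'), ('leader', 'Y')]
c, d = Replica(), Replica()
for replica in (c, d):
    for key, value in log:
        replica.write(key, value)
consistent = c.data == d.data
print(diverged, consistent)
```
&lt;br /&gt;
Without shared memory, the only way both copies end up the same is if&lt;br /&gt;
both servers apply the same writes in the same order.&lt;br /&gt;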
&lt;br /&gt;
With older systems, if you wanted a unified view of state, you would just keep it on one host&lt;br /&gt;
 - but here we need more than one, for scalability, performance, and reliability&lt;br /&gt;
&lt;br /&gt;
Chubby really takes this seriously&lt;br /&gt;
Zookeeper can get these sorts of guarantees but allows flexibility&lt;br /&gt;
 - trade off consistency for performance&lt;br /&gt;
&lt;br /&gt;
consistency: act like copies are &amp;quot;the same&amp;quot;&lt;br /&gt;
  - changes are visible everywhere at the same time&lt;br /&gt;
&lt;br /&gt;
Classic single system operating systems take shared state as a given&lt;br /&gt;
 - get it for free with shared RAM&lt;br /&gt;
 - in multi-core systems (and NUMA systems in particular),&lt;br /&gt;
   this can take a hit, but the hardware makes it mostly true&lt;br /&gt;
   - and when it isn&amp;#039;t we can use special CPU instructions&lt;br /&gt;
     (e.g., memory barriers and atomic operations) to make sure it is true&lt;br /&gt;
&lt;br /&gt;
Note that Zookeeper is open source and is widely used&lt;br /&gt;
 - Chubby is Google-specific&lt;br /&gt;
 - and I don&amp;#039;t know of an open source implementation of it&lt;br /&gt;
&lt;br /&gt;
The Hadoop project is a collection of projects for building large-scale distributed systems&lt;br /&gt;
 - many based on papers published by Google&lt;br /&gt;
 - all developed by parties other than Google, mostly major internet corporations (Yahoo early on, later continued by Facebook etc.)&lt;br /&gt;
&lt;br /&gt;
Kubernetes was developed by Google&lt;br /&gt;
 - I think they finally realized that people were developing tech&lt;br /&gt;
   outside their bubble and it meant that hires would know others&amp;#039;&lt;br /&gt;
   tech, not theirs&lt;br /&gt;
&lt;br /&gt;
Note that Google doesn&amp;#039;t use Kubernetes internally&lt;br /&gt;
 - they have Borg, which we will discuss next week&lt;br /&gt;
&lt;br /&gt;
Sometimes you need everyone to be on the same page; that&amp;#039;s when you use these sorts of services&lt;br /&gt;
 - not for data, but for metadata, config, etc&lt;br /&gt;
 - they pay a huge price for consistency guarantees&lt;br /&gt;
    - limited data storage&lt;br /&gt;
    - limited access methods&lt;br /&gt;
    - limited scalability (they don&amp;#039;t scale per se, they allow&lt;br /&gt;
      other systems to scale)&lt;br /&gt;
&lt;br /&gt;
Chubby turns 5 computers into 1 file server&lt;br /&gt;
  - but not really more than 5&lt;br /&gt;
  - but 5 is enough for fault tolerance, and load is relatively low&lt;br /&gt;
    because &amp;quot;locks&amp;quot; are coarse-grained&lt;br /&gt;
  - workload could be served by fewer machines, but these can be&lt;br /&gt;
    distributed around infrastructure&lt;br /&gt;
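&lt;br /&gt;
Why 5 is enough: a minimal sketch, assuming the replicas need a strict&lt;br /&gt;
majority to make progress (as in Paxos-style protocols). The function names&lt;br /&gt;
are made up for illustration:&lt;br /&gt;
&lt;br /&gt;
```python
def quorum_size(replicas):
    # Smallest group that is a strict majority of the replicas.
    return replicas // 2 + 1


def tolerated_failures(replicas):
    # Replicas that can be down while a majority is still reachable.
    return replicas - quorum_size(replicas)


for n in (3, 4, 5, 6):
    print(n, quorum_size(n), tolerated_failures(n))
```
&lt;br /&gt;
Under this assumption 5 replicas survive 2 failures, and a 6th replica&lt;br /&gt;
adds load without adding any fault tolerance.&lt;br /&gt;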
&lt;br /&gt;
Zookeeper is based on Zab&lt;br /&gt;
Chubby is based on Paxos&lt;br /&gt;
etcd is based on Raft&lt;br /&gt;
&lt;br /&gt;
Each algorithm has its tradeoffs&lt;br /&gt;
 - the important thing is that you need something that provides a consistent view in order to build larger distributed systems&lt;br /&gt;
 - seems to be easier if you do this as a service rather than a library&lt;br /&gt;
   an application uses&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Soma</name></author>
	</entry>
</feed>