DistOS 2021F 2021-10-07

Notes

Lecture 9
---------

Plan for next week for groups
 - generated by a script
 - posted at start of class, you'll need to
   manually go to your assigned room
 - based on what you submit
 - 4 groups:
   - high response grades
   - low response grades
   - high quiz grades
   - low quiz grades
 - groups will be first assigned inside each group randomly and then
   based on nearby groups (high with high, low with low)
 - will respect group exclusion list


Zookeeper & Chubby
 - why do we need these?
 - and why don't we have them on individual systems, or on older
   distributed systems (Plan 9, NFS, LOCUS etc)

We need them because
 - failures: hosts die, so we need more than one to run the service
   - so need to be able to recover
 - but really, it is *because* it is distributed that we need these complex algorithms

Servers A and B
Clients X and Y

A and B are supposed to provide access to the "same" data

Client X reads and writes to A
Client Y reads and writes to B

"like networked mutex-semaphore systems"
  - except there is NO SHARED MEMORY

You want multiple hosts to behave as if they were the same host
 - unified view of state that, externally, is always consistent

With older systems, if you wanted a unified view of state, it would just go to one host
 - but here we need more than one, for scalability, performance, and reliability

Chubby really takes this seriously
Zookeeper can get these sorts of guarantees but allows flexibility
 - trade off consistency for performance

consistency: act like copies are "the same"
  - changes are visible everywhere at the same time

Classic single system operating systems take shared state as a given
 - get in for free with shared RAM
 - in NUMA systems (i.e., any modern system with multiple cores),
   this can take a hit, but the hardware makes it mostly true
   - and when it isn't we can use special CPU instructions to
     make sure it is true

Note that Zookeeper is open source and is widely used
 - Chubby is google-specific
 - and I don't know of an open source implementation of it

Hadoop project is a gathering of projects for building large-scale distributed systems
 - many based on papers published by Google
 - all developed by parties other than Google, but major internet corporations (Yahoo was one of these, but continued with Facebook etc)

Kubernetes was developed by Google
 - I think they finally realized that people were developing tech
   outside their bubble and it meant that hires would know others'
   tech, not theirs

Note that Google doesn't use Kubernetes internally
 - they have Borg, which we will discuss next week

Sometimes you need everyone to be on the same page, that's when you use these sorts of services
 - not for data, but for metadata, config, etc
 - they pay a huge price for consistency guarantees
    - limited data storage
    - limited access methods
    - limited scalability (they don't scale per se, they allow
      other systems to scale)

Chubby turns 5 computers into 1 file server
  - but not really more than 5
  - but 5 is enough for fault tolerance, and load is relatively low
    because "locks" are coarse-grained
  - workload could be served by fewer machines, but these can be
    distributed around infrastructure

Zookeeper is based on Zab
Chubby is based on Paxos
etcd is based on Raft

Each algorithm has its tradeoffs
 - importance is you need something that provides a consistent view to build larger distributed systems
 - seems to be easier if you do this as a service rather than a library
   an application uses