DistOS 2021F 2021-10-07
==Notes==
<pre>
Lecture 9
---------

Plan for next week for groups
- generated by a script
- posted at start of class, you'll need to
  manually go to your assigned room
- based on what you submit
- 4 groups:
  - high response grades
  - low response grades
  - high quiz grades
  - low quiz grades
- groups will be first assigned inside each group randomly and then
  based on nearby groups (high with high, low with low)
- will respect group exclusion list

Zookeeper & Chubby
- why do we need these?
- and why don't we have them on individual systems, or on older
  distributed systems (Plan 9, NFS, LOCUS, etc.)?

We need them because
- failures: hosts die, so we need more than one to run the service
  - so we need to be able to recover
- but really, it is *because* it is distributed that we need these complex algorithms

Servers A and B
Clients X and Y
A and B are supposed to provide access to the "same" data
Client X reads and writes to A
Client Y reads and writes to B

"like networked mutex-semaphore systems"
- except there is NO SHARED MEMORY

You want multiple hosts to behave as if they were the same host
- unified view of state that, externally, is always consistent
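
An illustrative Java sketch of the problem (my own example, names made up): two
plain HashMaps stand in for servers A and B, and with no coordination protocol
between them the two copies of the "same" key silently diverge:

    import java.util.HashMap;
    import java.util.Map;

    public class DivergingReplicas {
        public static void main(String[] args) {
            // Servers A and B each hold their own copy of the "same" data.
            Map<String, String> serverA = new HashMap<>();
            Map<String, String> serverB = new HashMap<>();

            // Client X writes only to A; client Y writes only to B.
            serverA.put("owner-of-lock", "client-X");
            serverB.put("owner-of-lock", "client-Y");

            // Each client reads from its own server, and each believes it holds the lock.
            System.out.println("A says: " + serverA.get("owner-of-lock")); // client-X
            System.out.println("B says: " + serverB.get("owner-of-lock")); // client-Y
            // With no shared memory and no coordination service, nothing
            // ever reconciles these two copies.
        }
    }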

With older systems, if you wanted a unified view of state, it would just go to one host
- but here we need more than one, for scalability, performance, and reliability

Chubby really takes this seriously
Zookeeper can get these sorts of guarantees but allows flexibility
- trade off consistency for performance

consistency: act like copies are "the same"
- changes are visible everywhere at the same time
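
An illustrative sketch of that tradeoff with ZooKeeper's Java client (the
ensemble address and the znode /app/flag are assumptions, not from the lecture):
a plain read is answered by whatever replica you are connected to and may be
slightly stale; calling sync() first makes that replica catch up with the
leader before answering, at the cost of an extra round trip:

    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.ZooKeeper;

    public class ReadConsistencyDemo {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 10_000, event -> {});

            // Fast path: served locally by our replica, possibly stale.
            byte[] maybeStale = zk.getData("/app/flag", false, null);

            // Stronger path: sync() brings our replica up to the leader's
            // state, so the next read reflects all committed writes.
            CountDownLatch synced = new CountDownLatch(1);
            zk.sync("/app/flag", (rc, path, ctx) -> synced.countDown(), null);
            synced.await();
            byte[] upToDate = zk.getData("/app/flag", false, null);

            System.out.println(new String(maybeStale) + " / " + new String(upToDate));
            zk.close();
        }
    }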

Classic single system operating systems take shared state as a given
- get it for free with shared RAM
- in NUMA systems (i.e., any modern system with multiple cores),
  this can take a hit, but the hardware makes it mostly true
  - and when it isn't, we can use special CPU instructions to
    make sure it is true
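
For contrast, a small Java sketch of the single-machine case: classes like
AtomicLong map onto the CPU's atomic read-modify-write instructions, so every
core observes one consistent counter with no extra protocol:

    import java.util.concurrent.atomic.AtomicLong;

    public class SingleHostSharedState {
        public static void main(String[] args) throws InterruptedException {
            // One counter in shared RAM; increments use hardware atomic
            // instructions (compare-and-swap / fetch-and-add).
            AtomicLong counter = new AtomicLong();

            Runnable work = () -> {
                for (int i = 0; i < 1_000_000; i++) {
                    counter.incrementAndGet();
                }
            };
            Thread t1 = new Thread(work);
            Thread t2 = new Thread(work);
            t1.start(); t2.start();
            t1.join();  t2.join();

            System.out.println(counter.get()); // always 2000000
        }
    }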

Note that Zookeeper is open source and is widely used
- Chubby is Google-specific
  - and I don't know of an open source implementation of it

The Hadoop project is a collection of projects for building large-scale distributed systems
- many based on papers published by Google
- all developed by parties other than Google, mostly other major internet
  corporations (Yahoo was one of the first; Facebook and others continued the work)

Kubernetes was developed by Google
- I think they finally realized that people were developing tech
  outside their bubble and it meant that hires would know others'
  tech, not theirs

Note that Google doesn't use Kubernetes internally
- they have Borg, which we will discuss next week

Sometimes you need everyone to be on the same page; that's when you use these sorts of services
- not for data, but for metadata, config, etc.
- they pay a huge price for consistency guarantees
  - limited data storage
  - limited access methods
  - limited scalability (they don't scale per se, they allow
    other systems to scale)
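
An illustrative sketch of the metadata/config use case with ZooKeeper's Java
client (the znode path /app/config and the ensemble address are assumptions):
every node reads one small config blob and gets a watch notification when it
changes:

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class ConfigWatch {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 10_000, event -> {});

            // One-shot watch: fires when /app/config changes; real code would
            // re-read the data and re-register the watch at that point.
            Watcher onChange = (WatchedEvent e) -> {
                if (e.getType() == Watcher.Event.EventType.NodeDataChanged) {
                    System.out.println("config changed, reload it");
                }
            };

            byte[] config = zk.getData("/app/config", onChange, null);
            System.out.println("current config: " + new String(config));

            Thread.sleep(60_000); // keep the session alive so the watch can fire
            zk.close();
        }
    }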

Chubby turns 5 computers into 1 file server
- but not really more than 5
- but 5 is enough for fault tolerance, and load is relatively low
  because "locks" are coarse-grained
- workload could be served by fewer machines, but these can be
  distributed around infrastructure
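
Chubby itself isn't public, but its main coarse-grained pattern (electing a
master by holding one lock) can be sketched with ZooKeeper's Java API; the
paths and host name here are made up, and the parent node /service is assumed
to exist:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class MasterElection {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 10_000, event -> {});
            try {
                // Ephemeral node: it exists only while our session is alive,
                // so the "lock" is released automatically if this host dies.
                zk.create("/service/leader", "host-1".getBytes(),
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                // Held for minutes or hours at a time, hence "coarse-grained".
                System.out.println("I am the master");
            } catch (KeeperException.NodeExistsException e) {
                // Someone else holds the lock; watch it so we can retry later.
                zk.exists("/service/leader",
                          ev -> System.out.println("leader changed, retry"));
                System.out.println("standing by as a replica");
            }
        }
    }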

Zookeeper is based on Zab
Chubby is based on Paxos
etcd is based on Raft

Each algorithm has its tradeoffs
- the important point is that you need something that provides a consistent view to build larger distributed systems
- it seems to be easier to do this as a service rather than as a library
  an application uses
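
Zab, Paxos, and Raft differ in the details, but all three commit an update only
once a majority of replicas acknowledge it; a small sketch of the quorum
arithmetic behind the "5 computers" sizing above:

    public class QuorumMath {
        // With n = 2f + 1 replicas, any two majorities overlap in at least one
        // replica, so the service stays consistent while tolerating f crashes.
        static int quorumSize(int n)      { return n / 2 + 1; }
        static int faultsTolerated(int n) { return (n - 1) / 2; }

        public static void main(String[] args) {
            int n = 5; // a typical Chubby cell / ZooKeeper ensemble size
            System.out.println("acks needed to commit: " + quorumSize(n));      // 3
            System.out.println("crashes tolerated:     " + faultsTolerated(n)); // 2
        }
    }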
- | |||
</pre>