DistOS 2021F 2021-10-12
Notes
<pre>
Lecture 10
----------
BigTable & MapReduce

Questions
 BigTable
  - single row transactions
  - sparse semi-structured data, what do they mean?
 MapReduce
  - what happens when the master goes down?
  - restarting tasks seems wasteful, why do that?
  - single master, failure unlikely how?!
What's the problem MapReduce is trying to solve?
 - how to process *large* amounts of data quickly
 - really, summarize data
 - think of this as web crawl data

Example problem: what's the frequency of each word on the entire web (as crawled by Google)?
What's the strategy?
 - map: analyze data in small chunks in parallel,
   i.e., run a program on each bit of data separately
   (example: have a program calculate word frequencies in
    a web page)
 - reduce: combine results to summarize
   - must be an operation that doesn't depend on order
     of operations
   - get a list, produce a smaller list where every element is
     of the same type as before
   - example: combine word frequencies for individual pages
     to get aggregate word frequencies

When fully reduced, you have word frequencies for the entire web.
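
A minimal word-count sketch of the strategy above (Python; hypothetical helper
names, not Google's actual MapReduce API):

  from collections import Counter
  import re

  def map_page(page_text):
      # map: emit word frequencies for one page, independently of every other page
      return Counter(re.findall(r"[a-z']+", page_text.lower()))

  def reduce_counts(partials):
      # reduce: merge the per-page counts into aggregate counts
      total = Counter()
      for c in partials:
          total.update(c)
      return total

  pages = ["the cat sat", "the dog sat", "cat and dog"]
  print(reduce_counts(map_page(p) for p in pages))
  # word totals for the whole "crawl": the=2, cat=2, sat=2, dog=2, and=1
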
Can every problem be expressed as a map and a reduce?
 - no!
Can a lot of problems that people care about be expressed this way?
 - yes!

map is embarrassingly parallel
 - so it can be maximally sped up
 - think of this as extracting the features you care about
reduce can be slow, but normally it shouldn't be that hard
 - can't have order dependency though
 - here we process the features, but in a non-order-dependent way
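
The "no order dependency" point is the key property: merging counts is
associative and commutative, so partial results can be combined in any order
or grouping. A tiny self-contained check (Python, toy data):

  from collections import Counter

  partials = [Counter(a=2, b=1), Counter(b=3), Counter(a=1, c=4)]
  left_to_right = sum(partials, Counter())
  right_to_left = sum(reversed(partials), Counter())
  print(left_to_right == right_to_left)   # True: grouping/order doesn't matter
  # a "reduce" like subtracting each partial from the previous one would fail this
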
Why do pieces "start over" when a node crashes during a MapReduce operation?
 - we don't have access to the local state that was computed
   - local storage is way faster for this sort of thing!
 - and even if you did, you can't trust that state
   - because it could have been messed up for a while

What about the master crashing?
 - if it is a long-running job, the app developer should
   do their own checkpointing
 - if it isn't, we don't care!
   - and we can generally scale the machines used
     (at a place like Google) to make individual runs pretty quick
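
A rough sketch of the re-execution idea (hypothetical names, not the paper's
actual scheduler): the master only tracks which tasks are done, so any task
owned by a worker that dies is simply marked pending and rerun elsewhere;
deterministic tasks make starting over safe:

  import random

  def run_job(tasks, workers, is_alive, run_on):
      pending, results = set(tasks), {}
      while pending:
          task = pending.pop()
          worker = random.choice([w for w in workers if is_alive(w)])
          try:
              results[task] = run_on(worker, task)   # output stays on the worker's local disk
          except ConnectionError:
              pending.add(task)                      # worker died mid-task: just start over
      return results
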
MapReduce is everywhere now
 - sometimes too much: it can be used when the paradigm
   doesn't really fit the task
 - a bad fit: anything that has more data dependencies, i.e., depends on
   order of execution

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf
 - the "dwarves" of parallel computation
 - they talk about what can be parallelized and what can't
 - most of what we talk about in this class is embarrassingly parallel
   - except for the parts that aren't
BigTable
 - a "database" that makes use of parallel storage, i.e., GFS
 - overall very parallel, but certain parts are not
 - what is very centralized?
   - look at Figure 4: most data is in SSTables stored in GFS,
     but the root isn't, it is in Chubby
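
A toy, dict-backed sketch of that lookup hierarchy (hypothetical layout; the
real system resolves a row key through a Chubby file, the root tablet, and
METADATA tablets, with the tablet data itself living in SSTables on GFS):

  CHUBBY  = {"/bigtable/root": "root-tablet"}          # tiny, centralized, consistent
  TABLETS = {
      "root-tablet":   {("webtable", "com.example/"): "meta-tablet-7"},
      "meta-tablet-7": {("webtable", "com.example/"): "user-tablet-42"},
  }

  def locate_tablet(table, row_key):
      root = CHUBBY["/bigtable/root"]             # only this pointer needs Chubby
      meta = TABLETS[root][(table, row_key)]      # root tablet -> METADATA tablet
      return TABLETS[meta][(table, row_key)]      # METADATA tablet -> user tablet

  print(locate_tablet("webtable", "com.example/"))    # 'user-tablet-42'
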
Chubby gives robust config information
 - the state of BigTable at a high level
 - has to be very centralized, very available

Note that Chubby *doesn't* scale well, it only scales a bit
 - like 5 machines
 - but it scales enough to do what it needs to
 - it is the part of the state of the system that
   you can NEVER allow to be inconsistent
Parallel systems are hard because they can get partitioned in various
ways and state can become inconsistent
 - and communication to maintain consistency goes up exponentially
   with size if done naively

Chubby is a fully connected graph amongst its nodes
 - all messages are seen by all nodes
But efficient parallel systems can't be fully connected
 - way, way too much communication
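
A quick back-of-the-envelope check on why full connectivity only works at
Chubby's scale: even the simplest all-to-all messaging scheme already grows
quadratically (toy numbers):

  # links in a fully connected graph of n nodes: n*(n-1)/2
  # messages per update if every node tells every other node: n*(n-1)
  for n in (5, 50, 5000):
      print(n, n * (n - 1) // 2, n * (n - 1))
  # 5 nodes: 10 links, 20 messages (fine for Chubby)
  # 5000 nodes: about 12.5 million links (hopeless, so big systems can't be fully connected)
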
How does BigTable handle mutable state?
 - can you update arbitrary entries? not directly!
 - data is immutable
 - if you want to change something, you have to write it to a
   new table and eventually consolidate to get rid of old data
 - a bit like a log-structured filesystem

immutability is a very powerful idea in distributed systems
 - if state can't change, you never have to communicate updates
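
A toy sketch of the write-new-then-consolidate idea (hypothetical, in-memory;
real BigTable buffers writes in a memtable and stores data in immutable
SSTables): updates never modify old data, they append newer versions, and a
periodic compaction rewrites everything into one fresh immutable file,
dropping what has been superseded:

  import itertools
  _seq = itertools.count()
  sstables = []                          # each entry is immutable, never edited in place

  def write(key, value):
      sstables.append({key: (value, next(_seq))})    # new data goes into a new "file"

  def read(key):
      versions = [t[key] for t in sstables if key in t]
      return max(versions, key=lambda v: v[1])[0]    # newest version wins

  def compact():
      merged = {}
      for t in sstables:
          for k, v in t.items():
              if k not in merged or v[1] > merged[k][1]:
                  merged[k] = v
      sstables[:] = [merged]             # old versions only disappear here

  write("row1", "old"); write("row1", "new")
  print(read("row1"))                    # 'new', but 'old' is still stored until compact()
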
Does this architecture have privacy implications?
 - deletes don't immediately delete data
 - modern solution: delete the encryption key
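
A minimal sketch of the "delete the encryption key" idea, often called
crypto-shredding (hypothetical code, assuming the pyca/cryptography package):
each record is encrypted under its own key, so deleting the key makes every
copy still sitting in old immutable files unreadable, even though the
ciphertext itself is never erased:

  from cryptography.fernet import Fernet, InvalidToken

  keys = {}        # small, mutable, easy to delete from
  stored = {}      # big, append-only, hard to scrub

  def put(record_id, plaintext):
      keys[record_id] = Fernet.generate_key()        # one key per record
      stored[record_id] = Fernet(keys[record_id]).encrypt(plaintext)

  def forget(record_id):
      del keys[record_id]      # ciphertext stays behind, but is now useless

  put("user42", b"embarrassing search history")
  forget("user42")
  try:
      Fernet(Fernet.generate_key()).decrypt(stored["user42"])
  except InvalidToken:
      print("unrecoverable without the original key")
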
</pre>