DistOS 2021F 2021-10-12

From Soma-notes

Notes

Lecture 10
----------
BigTable & MapReduce

Questions

BigTable
 - single row transactions
 - sparse semi-structured data, what do they mean?

MapReduce
 - what happens when the master goes down?
 - restarting tasks seems wasteful, why do that?
 - single master, failure unlikely how?!

What's the problem MapReduce is trying to solve?
 - how to process *large* amounts of data quickly
   - really, summarize data
 - think of this as web crawl data

example problem: what's the frequency of words on the entire web (as crawled by google?)

What's the strategy?
 - map: analyze data in small chunks in parallel,
        i.e., run a program on each bit of data separately

        (example: have a program calculate word frequencies in
	 a web page)
 - reduce: combine results to summarize
    - must be an operation that doesn't depend on order
      of operations
    - get a list, produce a smaller list where every element is
      of the same type as before
    - example: combine word frequencies for individual pages
      to get aggregate word frequencies

When fully reduced, you have word frequencies for the entire web.

Can every problem be expressed as a map and a reduce?
 - no!

Can a lot of problems that people care about be expressed this way?
 - yes!

map is embarassingly parallel
  - so can be maximally sped up
  - think of this as extracting the features you care about

reduce can be slow, but normally it shouldn't be that hard
 - can't have order dependency though
 - here we process the features, but in a non-order-dependent way

Why do pieces "start over" when a node crashes during a mapreduce operation?
 - we don't have access to the local state that was computed
    - local storage is way faster for this sort of thing!
 - and even if you did, you can't trust that state
    - because it could have been messed up for a while

What about master crashing?
 - if it is a long running job, the app developer should
   do their own checkpointing
 - if it isn't, we don't care!
 - and we can generally scale the machines used
   (at a place like google) to make individual runs pretty quick

MapReduce is everywhere now
 - sometimes too much, it can be used when the paradigm
   doesn't really fit the task
    - anything that has more data dependencies, i.e. depends on
      order of execution
    
https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf
 - "dwarves" of parallel computation
 - they talk about what can be parallized and what can't
 - most of what we talk about in this class is embararringly parallel
    - except for the parts that aren't

BigTable
 - "database" that makes use of parallel storage, i.e., GFS
 - overall very parallel, but certain parts are not
    - what is very centralized?
    - look at Figure 4, most data is in SSTables stored in GFS, but the root isn't, it is in Chubby

Chubby gives robust config information
 - the state of BigTable at a high level
 - has to be very centralized, very available

Note that Chubby *doesn't* scale well, it only scales a bit
 - like 5 machines
 - but it scales enough to do what it needs to
 - it is the part of the state of the system that
   you can NEVER allow to be inconsistent

Parallel systems are hard because they can get partitioned in various ways and state can become inconsistent
 - and communication to maintain consistency goes up exponentially with size if done naively

Chubby is a fully connected graph amongst its nodes
 - all messages are seen by all nodes

But efficient parallel systems can't be fully connected
 - way, way too much communication

How does BigTable handle mutable state?
 - can you update arbitrary entries?  not directly!
    - data is immutable
 - if you want to change something, you have to write it to a
   new table and eventually consolidate to get rid of old data
    - a bit like a log-structured filesystem

immutability is a very powerful idea in distributed systems
 - if state can't change, you never have to communicate updates

Does this architecture have privacy implications?
 - deletes don't immediately delete data
   - modern solution: delete encryption key