Difference between revisions of "DistOS 2014W Lecture 21"

Revision as of 22:57, 19 April 2014

Presentation

Marking

marked mostly on presentation, not content
basically we want to communicate the basic structure of the paper, and do so in a way that isn't boring

Content

concrete, not "head in the clouds"
present the area
compare and contrast the papers
10 minutes talk, 5 minutes feedback
basic argument
basic references

Form

show the work we've done on paper
try to get feedback
think of it as a rough draft
try to get people to read the paper
enthusiasm
powerpoints are easier
don't read slides
no whole sentences on slides
look at talks by Mark Shuttleworth

MapReduce

A clever observation that a simple solution could solve most distributed problems. It's all about programming to an abstraction that is efficiently parallelizable. Note that it's not actually a simple solution, because it sits atop a mountain of code. It requires something like BigTable which requires something like GFS, which requires something like Chubby. Despite this, it allows for programmers to easily do distributed computation using a simple framework that hides the messy details of parrallelization.

Restricted programming model
Interestingly large scale problems can be implemented with this
Easy to program, powerful for certain classes of problems, it scales well.
MapReduce job model is VERY limited though. You can't do things like simulations.
MapReduce is problem specific.
- Naiad is less problem specific and allows you to do more.

Programming to an abstraction that is efficiently parllel. We have learnt all about infrastructure until now. Classic OS abstractions were about files. Now we used programming abstraction.

Example: word frequency in a document.

How does it work?

Two steps. Map and Reduce. The user writes theses.
- Map takes a single input key-value pair (eg. a named document) and converts it to an intermediate (k,v) representation. A list of new key-values.
- Reduce: Take the intermediate representation and merge the values.

Implementation

Uses commodity HW and GFS.
Master/Slave relationship amongst machines. Master delegates tasks to slaves.
Intermediate representation saved as files.
Many MapReduce jobs can happen in sequence.

Naiad

Where MapReduce was suited for a specific family of solutions, Naiad tries to generalize the solution to apply parallelization to a much wider family. Naiad supports MapReduce style solutions, but also many other solutions. However, the tradeoff was simplicity. It's like we took MapReduce and took away its low barrier to entry. The idea is to create a constrained graph that can easily be parallelized.

More complicated than Map Reduce
Talks about Timely dataflow graphs
Its all about Graph algorithms - Graph abstraction
Restrictions on graphs so that they can be mapped to parllel computation
How to fit anything to this model is a big question.
More general than map reduce.

After reading the MapReduce paper, you could easily write a map reduce job. After reading the Naiad, you can't. Naiad is super complicated.
Their model is super complicated. It doesn't minimize our cognitive load.
Doesn't scale at all. After about 40 nodes, there is no improvement in performance. MapReduce can scale to thousands of nodes and scales forever.
Nobody wants to use it because the abstraction is complicated.

@@ Line 72: / Line 72: @@
 * Their model is super complicated. It doesn't minimize our cognitive load.
 * Doesn't scale at all. After about 40 nodes, there is no improvement in performance. MapReduce can scale to thousands of nodes and scales forever.
-* Nobody wants to use it because the abstraction sucks.
+* Nobody wants to use it because the abstraction is complicated.

Difference between revisions of "DistOS 2014W Lecture 21"

Revision as of 22:57, 19 April 2014

Contents

Presentation

Marking

Content

Form

MapReduce

How does it work?

Implementation

Naiad

Navigation menu

Difference between revisions of "DistOS 2014W Lecture 21"

Revision as of 22:57, 19 April 2014

Presentation

Marking

Content

Form

MapReduce

How does it work?

Implementation

Naiad

Navigation menu

Search