DistOS 2023W 2023-03-15

From Soma-notes
Revision as of 11:44, 15 March 2023 by Soma

Notes

Bigtable & MapReduce
--------------------

When you think about Bigtable, focus on Figure 1 (to understand what it is doing) and Figure 4 (to understand how it does it).
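
As a rough aid for reading Figure 1, the data model can be sketched as a sorted map from (row key, column key, timestamp) to an uninterpreted value. The class below is a toy in-memory stand-in, not Bigtable's API: the name TinyTable and its put/get/scan methods are assumptions made for illustration only.

```python
from collections import defaultdict


class TinyTable:
    """Toy in-memory stand-in for the Bigtable data model (hypothetical, not the real API)."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        """Store one timestamped version of a cell."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def scan(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row, rows in sorted order."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                for column in sorted(self._rows[row]):
                    for ts in sorted(self._rows[row][column], reverse=True):
                        yield row, column, ts, self._rows[row][column][ts]
```

With a row key such as "com.cnn.www" (the reversed-URL style used in Figure 1), pages from the same domain sort next to each other, so a scan over a row range reads them together. Note what is missing compared to a relational database: no schema for values, no joins, no multi-row transactions.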

Remember that GFS requires applications to store structured, self-describing records, because data can be duplicated (for example, when an append is retried). Bigtable is one of the ways GFS files can be organized.
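
To make the point about duplicated data concrete, here is a minimal sketch of how a reader might cope with it, assuming each record carries a unique ID and a checksum. The record layout and the helper names make_record and read_valid_unique are hypothetical; they are not taken from GFS or Bigtable.

```python
import zlib


def make_record(record_id, payload):
    """Build a self-validating record: (id, payload, checksum). Layout is an assumption."""
    body = f"{record_id}:{payload}".encode()
    return (record_id, payload, zlib.crc32(body))


def read_valid_unique(records):
    """Yield each valid record once, skipping corrupt entries and duplicates."""
    seen = set()
    for record_id, payload, checksum in records:
        body = f"{record_id}:{payload}".encode()
        if zlib.crc32(body) != checksum:
            continue  # checksum mismatch: treat as padding or a damaged fragment
        if record_id in seen:
            continue  # same ID seen before: a duplicate of an earlier record
        seen.add(record_id)
        yield record_id, payload
```

The file system does not do this filtering for you; the structure has to live in the data itself, which is why a layer that imposes structure on GFS files is useful.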

To what extent is Bigtable a database?


For MapReduce, think about the kinds of tasks Google wanted to perform on its web crawls
 - generating an index, for example
 - gathering statistics

Consider the complexity of the tasks you could do with map and those you could do with reduce, noting that map is embarrassingly parallel and reduce isn't.
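
The contrast can be sketched with a toy inverted index, one of the crawl tasks mentioned above. This is a single-machine illustration, not Google's MapReduce library: the names map_fn, shuffle, reduce_fn, and build_index are assumptions, and a thread pool stands in for a cluster of workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_fn(doc):
    """Emit (word, doc_id) pairs for one document; needs no other document."""
    doc_id, text = doc
    return [(word, doc_id) for word in text.lower().split()]


def shuffle(mapped):
    """Group all values by key; this step needs output from every map call."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_fn(word, doc_ids):
    """Collapse one key's values into a sorted, de-duplicated posting list."""
    return word, sorted(set(doc_ids))


def build_index(docs):
    """Run map over all documents, group by word, then reduce each group."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_fn, docs))  # each call is independent of the others
    groups = shuffle(mapped)  # synchronization point: waits for all map output
    return dict(reduce_fn(word, ids) for word, ids in groups.items())
```

Each map_fn call touches only its own document, so the calls can run anywhere in any order. A reduce_fn call cannot start until every value for its key has arrived, which is where the data movement and waiting come from.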

Answer in groups
 - To what extent is Bigtable a database?
 - How does the design of GFS influence the implementation of Bigtable?
 - What problems can be solved with MapReduce? What problems can't be solved (efficiently)?