Talk:Google OS

Group 1: Google FS

Questions from Group 2 (Chubby):

What applications were producing all the appends? Indexing search data collected by Google
What happens when a master fails? A new master is chosen (the Chubby paper notes that GFS uses a Chubby lock to appoint its master), and it rebuilds its state by loading the latest checkpoint and replaying the operation log
How could you use GFS outside of Google? Any large archiving project, preferably one where data is generated/read sequentially and rarely removed
How does data get deleted? The file is renamed to mark it for deletion; a periodic maintenance process removes marked files if they were deleted more than three days ago
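
The rename-then-collect deletion described above can be sketched as follows. This is only an illustrative toy: the `Namespace` class, the `.deleted.` prefix and the way the deletion time is embedded in the hidden name are assumptions made for the example, not GFS's actual implementation.

```python
HIDDEN_PREFIX = ".deleted."          # hypothetical naming scheme
GRACE_SECONDS = 3 * 24 * 60 * 60     # three days, as in the answer above


class Namespace:
    """Toy in-memory namespace mapping file names to contents."""

    def __init__(self):
        self.files = {}

    def delete(self, name, now):
        # Deletion is only a rename: the data stays until a later scan removes it.
        hidden = f"{HIDDEN_PREFIX}{int(now)}.{name}"
        self.files[hidden] = self.files.pop(name)
        return hidden

    def scan(self, now):
        # Periodic maintenance pass: drop hidden files older than the grace period.
        removed = []
        for hidden in list(self.files):
            if not hidden.startswith(HIDDEN_PREFIX):
                continue
            deleted_at = int(hidden[len(HIDDEN_PREFIX):].split(".", 1)[0])
            if now - deleted_at > GRACE_SECONDS:
                del self.files[hidden]
                removed.append(hidden)
        return removed
```

Until the scan runs, the hidden file is still present, so in this sketch a deletion could be undone by renaming the file back.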

Questions from Group 3 (Bigtable):

What big assumptions did they make for GFS that were against the norm? Large chunk size, files updated primarily via appends, large file sizes, chronic hardware failure
How is the master chosen and what happens when it fails? (See #2 above)
How did GFS remain in sync with the system? Did their method slow the system? The master asks the chunkservers which chunks they have; slowdown is minimal because the chunkservers are doing most of the work
What is lazy space allocation? New chunks are only allocated when needed (e.g., if receiving 50 GB of data, GFS wouldn't allocate 50 GB worth of chunks right away, but one chunk at a time as needed)
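
A small sketch of the lazy allocation idea, assuming the 64 MB chunk size from the GFS paper. The `LazyFile` class and its `append` method are hypothetical names used only for illustration; they count chunks rather than storing data.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, the size used in the GFS paper


class LazyFile:
    """Tracks how many chunks a file has allocated as data is appended."""

    def __init__(self):
        self.size = 0
        self.chunks = 0

    def append(self, nbytes):
        self.size += nbytes
        # Ceiling division: chunks needed to hold everything written so far.
        needed = -(-self.size // CHUNK_SIZE)
        newly_allocated = needed - self.chunks
        self.chunks = needed
        return newly_allocated
```

A new chunk is counted only when an append crosses a chunk boundary, so a large incoming transfer grows the file one chunk at a time.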

Group 2: Chubby

Questions from Group 1, GFS, and to-the-point answers

What were some unexpected uses for Chubby? as a name service (in place of DNS)
How does Chubby recover if a client fails while holding a lock? the client stops sending keep-alives, so its session lease expires and Chubby releases its locks
How does Chubby's access control differ from UNIX files? permissions don't depend on permissions of super-directories (uses Access Control List files in an ACL directory)
How did they use Chubby's C++ code within Java projects? a protocol-conversion server that exports a simple RPC protocol matching the Chubby client API, so Java code does not have to link the C++ client library
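
The keep-alive answer above (a failed client stops renewing, so its lock is released) can be sketched with a toy lease table. `LockService`, its method names and the lease length are assumptions for illustration and do not reflect Chubby's real protocol or API.

```python
class LockService:
    """Toy lock table with session leases; not Chubby's real protocol."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.lease_expiry = {}   # client -> time at which its lease runs out
        self.lock_holder = {}    # lock name -> client holding it

    def keep_alive(self, client, now):
        # Each keep-alive extends the client's session lease.
        self.lease_expiry[client] = now + self.lease_seconds

    def acquire(self, client, lock, now):
        self.expire(now)
        if client not in self.lease_expiry:
            return False         # no live session
        if lock in self.lock_holder:
            return False         # someone else holds the lock
        self.lock_holder[lock] = client
        return True

    def expire(self, now):
        # Drop sessions whose lease has run out, along with the locks they held.
        dead = {c for c, t in self.lease_expiry.items() if t <= now}
        for c in dead:
            del self.lease_expiry[c]
        self.lock_holder = {
            name: c for name, c in self.lock_holder.items() if c not in dead
        }
```

For example, if client "a" holds a lock and then stops calling `keep_alive`, a later `acquire` by client "b" succeeds once the lease for "a" has expired.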

Questions from Group 3, Bigtable

What was Chubby designed for and did it meet expectations?
Was coarse-grained locking a good idea and how did it impact applications?
How is the master elected and is it elected consistently?
How did Chubby integrate into other systems? Was it lightweight for applications?

Group 3: Bigtable

Questions from: Group 1, GFS

  1. How is Bigtable like/unlike a relational database?

  ans. Bigtable is unlike a relational database because:
             - It stores data in SSTables, which are not in proper relational form.
             - A tablet can store multiple versions of the same data based on timestamps.
             - The language used to query Bigtable does not support full relational database functionality.
       Bigtable is like a relational database because:
             - Server-side scripts can be used to filter results, similar to some SQL queries.
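
The point about multiple timestamped versions can be illustrated with a toy in-memory table. `VersionedTable` and its methods are hypothetical names for this sketch; the row and column names follow the Webtable example in the Bigtable paper.

```python
from collections import defaultdict


class VersionedTable:
    """Toy map of (row, column) -> {timestamp: value}."""

    def __init__(self):
        self.cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        # Writing with a new timestamp adds a version instead of overwriting.
        self.cells[(row, column)][timestamp] = value

    def get(self, row, column, max_versions=1):
        # Return (timestamp, value) pairs, newest first.
        versions = self.cells.get((row, column), {})
        newest_first = sorted(versions.items(), reverse=True)
        return newest_first[:max_versions]


table = VersionedTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3</html>")
table.put("com.cnn.www", "contents:", 5, "<html>v5</html>")
table.put("com.cnn.www", "contents:", 6, "<html>v6</html>")
latest = table.get("com.cnn.www", "contents:")
```

A relational row would normally hold one value per column; here a read can ask for the newest version or for several recent ones.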

  2. What is the role of SSTable in Bigtable?
  
  ans. It is the file format Bigtable uses to store its data: an immutable, sorted map from keys to values.

  3. For the webTable, why are domain names stored in reverse order?

  ans. Reversing the domain name groups pages from the same domain into contiguous rows, which makes queries over a host or domain more efficient.
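
A short sketch of why reversed names help, assuming rows are kept in lexicographic order by key (as in Bigtable). The `row_key` helper is a hypothetical name for this example.

```python
def row_key(hostname, path="/"):
    """Build a row key with the hostname components reversed."""
    reversed_host = ".".join(reversed(hostname.lower().split(".")))
    return reversed_host + path


urls = [
    ("maps.google.com", "/index.html"),
    ("www.cnn.com", "/"),
    ("mail.google.com", "/"),
]
# Sorting the keys stands in for the table's sorted row order.
keys = sorted(row_key(host, path) for host, path in urls)
```

After sorting, the two google.com pages sit next to each other, so a scan over the prefix `com.google.` touches one contiguous range of rows.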

  4. How did Bigtable use Chubby?

  ans. The master server uses Chubby to track tablet servers.

Questions from Group 2, Chubby

  1. Why did they not implement the full relational model?

  ans. They did not need a full relational model. They only implemented what they thought they needed at the time.

  2. When could major compaction occur?

  ans. A major compaction is a merging compaction that rewrites all of a tablet's SSTables into a single SSTable. Bigtable cycles through its tablets and applies major compactions regularly, which reclaims the space held by deleted data.
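
A minimal sketch of what a major compaction does, assuming (as the Bigtable paper describes) that it rewrites all SSTables into one and drops deleted entries. Plain dicts stand in for SSTables, and `TOMBSTONE` and `major_compact` are hypothetical names for this example.

```python
TOMBSTONE = object()  # hypothetical marker for a deleted key


def major_compact(sstables):
    """Merge SSTables (oldest first) into one sorted table without tombstones."""
    merged = {}
    for table in sstables:          # later tables override earlier ones
        merged.update(table)
    # Keep keys in sorted order and drop anything marked as deleted.
    return {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}


older = {"a": 1, "b": 2, "c": 3}
newer = {"a": 10, "b": TOMBSTONE, "d": 4}
compacted = major_compact([older, newer])
```

The result holds only live data: the overwritten value of "a" and the deleted key "b" no longer take up space.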

  3. How does Bigtable handle fine-grained locking?

  ans.

  4. What are the similarities in the architecture between GFS and Bigtable?

  ans. They are similar in master selection and the use of Chubby.