DistOS 2014W Lecture 12
Introduction
Chubby, developed at Google, was designed to be a coarse-grained locking service for use within loosely coupled distributed systems (i.e., a network consisting of a large number of small machines). The key contribution was the implementation of Chubby itself (i.e., no new algorithms were designed or introduced).
Its purpose is to allow clients to synchronize their activities and to agree on basic information about their environment. It is used to varying degrees by other Google projects such as GFS, MapReduce, and BigTable.
By coarse-grained locking, we mean locking resources for extended lengths of time. For example, an elected primary would handle all access to given data for hours or days.
It is basically an ultra-reliable and highly available file system for very small files that is used as a locking service.
Anil: "Once implemented, Chubby abstracts away all the crazy complicated stuff so you can more easily build your distributed system". Chubby is a tool that gives Google devs important guarantees to build on.
Design
The funny thing is that Chubby is essentially a filesystem (with files, file permissions, reading/writing, a hierarchical structure, etc.) with a few caveats: any file can act as a reader/writer lock, and only whole-file operations are performed (i.e., the whole file is read or written), as the files are quite small (256 KB max).
All the locks are fully advisory, meaning others can "go around" whoever holds the lock to access the resource (for reading and, sometimes, writing). Mandatory locks, by contrast, give the holder completely exclusive access to a resource.
It can be noted that Linux also uses advisory locks, whereas Windows uses mandatory locks. This could be a shortcoming of Windows: because the locks on files are never broken, the system must be completely rebooted when anything about the system changes. With advisory locks, as in Linux, the system need only be rebooted when the kernel is modified or updated.
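The advisory behaviour described above can be illustrated with a small in-memory sketch. This is not Chubby's API: the class `AdvisoryFile` and its method names are hypothetical, and the only details taken from these notes are that any file can act as a reader/writer lock, that reads and writes cover the whole file, and that files are capped at 256 KB.

```python
class AdvisoryFile:
    """Toy in-memory stand-in for one small file that doubles as a lock.

    Hypothetical illustration only; not the real Chubby interface.
    """

    MAX_SIZE = 256 * 1024  # size cap mentioned in the notes

    def __init__(self):
        self.contents = b""
        self.writer = None    # client holding the lock in exclusive mode
        self.readers = set()  # clients holding the lock in shared mode

    def acquire(self, client, exclusive):
        """Try to take the lock; return True on success, False on conflict."""
        if exclusive:
            if self.writer is None and not self.readers:
                self.writer = client
                return True
            return False
        if self.writer is None:
            self.readers.add(client)
            return True
        return False

    def release(self, client):
        """Drop whatever lock `client` holds on this file, if any."""
        if self.writer == client:
            self.writer = None
        self.readers.discard(client)

    def write(self, client, data):
        """Whole-file write. Advisory: the lock is NOT checked here."""
        if len(data) > self.MAX_SIZE:
            raise ValueError("file too large")
        self.contents = data

    def read(self, client):
        """Whole-file read. Advisory: the lock is NOT checked here either."""
        return self.contents


f = AdvisoryFile()
alice_has_lock = f.acquire("alice", exclusive=True)  # True
bob_has_lock = f.acquire("bob", exclusive=True)      # False: lock is taken
f.write("bob", b"bob went around the lock")          # still succeeds
seen_by_alice = f.read("alice")                      # bob's data
```

A mandatory lock would make `write` fail for `bob` here. With advisory locks, correctness depends on every client choosing to call `acquire` before touching the file.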
Chubby also functions as a name server, but only really for functional names/roles, such as the mail server or a GFS server (i.e., Chubby is mainly used as a name server for logical/symbolic names for roles). It is a centralized place that maps names to resources through a unified interface. The name-value mappings in Chubby allow for a consistent, real-time, overall view of the entire system.
As a name server, Chubby provides guarantees not given with DNS (e.g., DNS is subject to a stale cache) as Chubby provides a unified view of the way things are in the system.
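One way to picture how locking and naming fit together is the sketch below: whoever acquires the lock on a role's file becomes the primary and writes its address into that file, and everyone else resolves the role by reading the file. Everything here is an assumption made for illustration: `ToyCell`, `run_for_primary`, `lookup`, the path `/roles/mail/primary`, and the addresses are all hypothetical, and the single in-memory dictionary stands in for the replicated service.

```python
class ToyCell:
    """In-memory stand-in for the service: path -> small file plus lock holder."""

    def __init__(self):
        self.files = {}  # path -> contents (str)
        self.locks = {}  # path -> client currently holding the lock

    def try_acquire(self, path, client):
        """Return True if `client` holds the lock on `path` after this call."""
        holder = self.locks.setdefault(path, client)
        return holder == client

    def release(self, path, client):
        """Release the lock, but only if `client` is the current holder."""
        if self.locks.get(path) == client:
            del self.locks[path]

    def write(self, path, contents):
        self.files[path] = contents

    def read(self, path):
        return self.files.get(path)


def run_for_primary(cell, role_path, my_address):
    """Try to become primary for a role; advertise the address on success."""
    if cell.try_acquire(role_path, my_address):
        cell.write(role_path, my_address)
        return True
    return False


def lookup(cell, role_path):
    """Resolve a role name to the address the current primary wrote."""
    return cell.read(role_path)


cell = ToyCell()
ROLE = "/roles/mail/primary"  # hypothetical role name

first_won = run_for_primary(cell, ROLE, "10.0.0.1:25")   # True
second_won = run_for_primary(cell, ROLE, "10.0.0.2:25")  # False
primary_before = lookup(cell, ROLE)                      # "10.0.0.1:25"

# The first primary goes away; the standby takes over the role.
cell.release(ROLE, "10.0.0.1:25")
standby_won = run_for_primary(cell, ROLE, "10.0.0.2:25")  # True
primary_after = lookup(cell, ROLE)                        # "10.0.0.2:25"
```

Because every client reads the same file, there is no per-client cache that can go stale in the way a DNS entry can. In this toy version the old primary releases the lock explicitly; how a real deployment detects a dead lock holder is outside what these notes cover.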
Chubby was made coarse-grained for scalability: coarse-grained locks are acquired rarely, so they place little load on the lock service, whereas fine-grained locks wouldn't scale well. It can also be noted that fine-grained locks could be implemented on top of the coarse-grained locks. The entire point of Chubby was to give ultra-high availability and integrity.
Implementation
- Uses Paxos, which is an insanely complicated way of solving the distributed consensus problem.
- Given many proposed values, it chooses one to be agreed upon.
- A master Chubby server with 4 slaves (5 servers in total make up a Chubby cell)
- Master and slaves have all the data.
- Nothing particularly special about the master
- If the master fails, one slave is elected as a new master
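To make "given many proposed values, it chooses one" concrete, here is a minimal sketch of textbook single-decree Paxos over five in-memory acceptors, matching the five-server cell above. It is not Chubby's implementation: there is no networking, persistence, or retry logic, and the names `Acceptor`, `propose`, and the `replica-*` values are hypothetical.

```python
class Acceptor:
    """One server's Paxos acceptor state (kept in memory for illustration)."""

    def __init__(self):
        self.promised = 0           # highest proposal number promised so far
        self.accepted_n = 0         # number of the accepted proposal, 0 if none
        self.accepted_value = None  # value of the accepted proposal, if any

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, 0, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(reachable, n, value, cluster_size):
    """Run one proposal round; return the chosen value, or None on failure."""
    majority = cluster_size // 2 + 1

    # Phase 1: collect promises from the acceptors we can reach.
    promises = []
    for acceptor in reachable:
        ok, acc_n, acc_value = acceptor.prepare(n)
        if ok:
            promises.append((acc_n, acc_value))
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n > 0:
        value = prev_value

    # Phase 2: ask the same acceptors to accept the (possibly adopted) value.
    accepted = sum(1 for acceptor in reachable if acceptor.accept(n, value))
    return value if accepted >= majority else None


servers = [Acceptor() for _ in range(5)]

first = propose(servers, 1, "replica-A", 5)       # "replica-A" is chosen
second = propose(servers, 2, "replica-B", 5)      # still "replica-A"
three_up = propose(servers[:3], 3, "replica-C", 5)  # majority of 3: "replica-A"
two_up = propose(servers[:2], 4, "replica-D", 5)    # no majority: None
```

Once a value has been chosen, later proposers can only rediscover it, which is why `second` and `three_up` return `replica-A` rather than their own values. The round with only two reachable servers fails outright, since two out of five is not a majority.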
Use cases
Discussion
Where else do we see things such as Chubby? Where would you want this consistent, overall view?
You would want this consistent view in any sort of synchronized set of files across a set of systems, such as Dropbox. The main tenets of Chubby's design would hold wherever you want to make sure there is an online consensus. It should be noted that this is not like version control: with version control, everyone has their own copy, and the copies are merged later. In this type of system, however, there is only one version available throughout the distributed system. Chubby's design would differ from Dropbox in that Dropbox is designed so that you can work offline and then synchronize your changes once you are online again (i.e., there can sometimes be more than one version of a file, meaning you lack the consistent, overall view given by Chubby).