DistOS 2015W Session 12
Haystack
- Facebook's Photo Application Storage System.
- Previous Fb photo storage based on NFS design. The reason why NFS dint work is because it gave 3 reads for every photo. The issue here was that they needed 1 read per photo.
- Main goals of Haystack:
- High throughput with low latency. It uses one disk operation to provide these.
- Fault tolerance
- Cost effective
- SImple
- Facebook utilises CDN to serve popular images and further uses haystack to respond to photo requests in the long tail effectively.
- Haystack reduces the memory used for filesystem metadata
- It has 2 types of metadata:
- Application metadata
- File System metadata
- The architecture consists of 3 components:
- Haystack Store
- Haystack Directory
- Haystack Cache
- Pitchfork and bulk sync were used to tolerate faults. the fault tolerance works in a very profound way to make haystack feasible and reliable
Comet
- Introduced the concept of distributed shared memory (DSM). In a DSM, RAMs from multiple servers would appear as if they are all belonging to one server, allowing better scalability for caching.
- Comet model works by offloading the computation intensive process from the mobile to only one server.
- The offloading process works by passing the computation intensive process to the server and hold it on the mobile device. Once the process on the server completes, it returns the results and the handle back to the mobile device. In other words, the process does not get physically offloaded to the server but instead it runs on the server and stopped on the mobile device.
F4
- Warm Blob Storage System.
- Warm Blob is a immutable data that gets cool very rapidly.
- F4 reduce the space usage by 3.6 to 2.8 or 2.1 replication factor using Reed Solomon coding and XOR coding respectively but still provides consistency.
- Reed Solomon coding basically use(10,4) which means 10 data and 4 parity blocks in a stripe, and can thus tolerate losing up to 4 blocks whch means it can tolerate 4 rack failure and use 1.4 expansion factor.Two copies of this would be 2* 1.4= 2.8 effective replication factor.
- XOR coding use(2,1) across three data center and use 1.5 expansion factor which gives 1.5*1.4= 2.1 effective replication factor
Sapphire
- Represents a building block towards building this global distributed systems. The main critique to it is that it didn’t present a specific use case upon which their design is built upon.
- Sapphire does not show their scalability boundaries. There is no such distributed system model that can be “one size fits all”, most probably it will break in some large scale distributed application.
- Reaching this global distributed system that address all the distributed OS use cases will be a cumulative work of many big bodies and building it block by block and then this system will evolve by putting all these different building blocks together. In other words, reaching a global distributed system will come from a “bottom up not top down approach” [Somayaji, 2015].