NASD, GoogleFS, Farsite
Readings
Garth A. Gibson et al., "A Cost-Effective, High-Bandwidth Storage Architecture" (1998)
Sanjay Ghemawat et al., "The Google File System" (2003)
William J. Bolosky et al., "The Farsite Project: A Retrospective" (2007)
Questions
- What were the target environments for these filesystems? How did these environments shape their assumptions?
- Farsite was geared towards distributing a company's resources
- Just imagine debugging that sucker
- GFS is very much geared toward Google's specific requirements
- What are the key ideas behind each filesystem?
- Scalability
- Separating control & metadata and the data itself
- Separate everything: machines, protocols, etc.
- Farsite?
- No real notion of striping, their model was "small files distributed everywhere"
- GFS?
- Lots of BIG FILES!
- What are the strengths and weaknesses of each design?
- Would you want to play with Farsite?
- Very baroque, "like Windows" :)
- While GFS is not any more applicable to average users, it has a much simpler design
- NASD?
- Minus crypto, this is very close to what NAS is now
- Good idea in principle, but the added hardware requirement likely prevented it from actually being deployed
- What are the strengths and weaknesses of each implementation?
- Which system is best suited for today's Internet? How about tomorrow's?
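The "separate control and metadata from the data itself" idea noted above can be sketched roughly as follows. This is a minimal illustration in the style of GFS, not code from any of the papers: the `Master` and `ChunkServer` classes, their methods, and the use of `(filename, chunk_index)` as a chunk key are all hypothetical simplifications. The 64 MB chunk size is the value given in the GFS paper.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # chunk size from the GFS paper (64 MB)


class Master:
    """Hypothetical metadata server: knows where chunks live, holds no file data."""

    def __init__(self):
        # (filename, chunk_index) -> list of data-server names (assumed layout)
        self.locations = {}

    def lookup(self, filename, chunk_index):
        return self.locations[(filename, chunk_index)]


class ChunkServer:
    """Hypothetical data server: holds chunk bytes only."""

    def __init__(self):
        self.chunks = {}  # (filename, chunk_index) -> bytes

    def read(self, key, offset, length):
        return self.chunks[key][offset:offset + length]


def read_range(master, servers, filename, offset, length):
    """Read a byte range that lies within a single chunk."""
    # The client turns a file offset into a chunk index plus an offset in that chunk.
    chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
    # Control path: one small metadata request to the master.
    replicas = master.lookup(filename, chunk_index)
    # Data path: the bytes come straight from a data server, bypassing the master.
    server = servers[replicas[0]]
    return server.read((filename, chunk_index), chunk_offset, length)
```

The point of the split is that the master only ever answers small lookups, so the bulk data traffic scales with the number of data servers rather than with the metadata server.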
Questions for NASD
- Is giving direct access between client and drive a good idea?
- Are there substantial advantages in storing variable-length objects over fixed-sized blocks?
- Is putting the filesystem on the drive a good idea? Should more control and awareness be given to hardware devices?
- What are the strengths and weaknesses of the capability-based cryptography which NASD makes use of?
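To make the capability question concrete, here is a rough sketch of how a NASD-style capability check could work. It assumes the file manager and the drive share a secret key, and that a capability has a public part (object, rights, expiry) plus a private key derived from it with a keyed hash. The function names, the `|`-separated field encoding, and the choice of HMAC-SHA256 are illustrative assumptions, not the paper's actual wire format.

```python
import hashlib
import hmac


def mint_capability(drive_secret: bytes, object_id: int, rights: str, expires: int):
    """File-manager side: return (public_part, private_key) for a client."""
    public = f"{object_id}|{rights}|{expires}".encode()
    # The private key is a MAC of the public part under the manager/drive secret.
    private = hmac.new(drive_secret, public, hashlib.sha256).digest()
    return public, private


def sign_request(private_key: bytes, request: bytes) -> bytes:
    """Client side: prove possession of the private key without sending it."""
    return hmac.new(private_key, request, hashlib.sha256).digest()


def drive_check(drive_secret: bytes, public: bytes, request: bytes,
                tag: bytes, now: int, op: str) -> bool:
    """Drive side: verify a request without contacting the file manager."""
    _object_id, rights, expires = public.decode().split("|")
    if now > int(expires) or op not in rights:
        return False
    # Recompute the private key from the public part, then check the request MAC.
    private = hmac.new(drive_secret, public, hashlib.sha256).digest()
    expected = hmac.new(private, request, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

In this sketch the strength is that the drive keeps no per-client state and the file manager stays off the data path. The weakness is revocation: a capability stays valid until it expires unless something the drive checks is changed.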
Questions for GoogleFS
- How does the Google file system implement security?
- Doesn't
- Is using a central server (point of access) a good design decision?
- It certainly works
- Makes administration easier
- As long as redundant and fast, why bother with the hassle of synchronization?
- Is removing random writes a good idea?
- They didn't actually remove them, but they are horribly inefficient
- BigTable specifically reduces the instances of random write and implements a way to append the same information
- Implementing this style would have killed their model
- Is the speedup attained by GFS's record-append method worth the added application overhead?
- Needing to manage duplication yourself
- Guaranteed access to specific offsets, which helps consistency, though wastes space
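The "manage duplication yourself" cost of record append can be illustrated with a small reader-side filter. Record append may leave padding and duplicate records in a file, and the GFS paper suggests that writers attach checksums and unique identifiers so that readers can discard them. The sketch below assumes a region is just a Python list in which `None` stands for padding; the record layout and function names are hypothetical.

```python
import zlib


def make_record(record_id: str, payload: bytes) -> dict:
    """Writer side: tag each record with a unique id and a checksum."""
    return {"id": record_id, "payload": payload, "crc": zlib.crc32(payload)}


def read_valid_records(region):
    """Reader side: yield each valid payload once, in file order."""
    seen = set()
    for entry in region:
        if entry is None:
            continue  # padding inserted by the file system
        if zlib.crc32(entry["payload"]) != entry["crc"]:
            continue  # corrupted or partially written record
        if entry["id"] in seen:
            continue  # duplicate left behind by a retried append
        seen.add(entry["id"])
        yield entry["payload"]
```

For example, a region holding record "a", padding, record "a" again (a retry), and record "b" yields the payloads of "a" and "b" once each. The extra work lives entirely in the application, which is the overhead the question above is asking about.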
Questions for Farsite
- Byzantine fault tolerance?
- Have several entities, some of which may be compromised in some way. They might either be corrupted, compromised, or simply down.
- Assumptions for a Byzantine protocol? Failures are independent, so they are not colluding.
- Good model for hardware failures
- Bad model for software failures (infection, etc)
- Not really the appropriate solution, software is your main likely culprit, not hardware problems.
- Tried to implement a simpler version using checksums
- How is it similar to and different from OceanStore?
- Uses crypto (same)
- Uses commodity hardware (different)
- Byzantine Fault tolerance (same)
- Namespaces are different
- Simpler version than OceanStore
- Only one administrative domain (someone HAS admin access)
- Planned for complete distribution, though ended up implementing a central server
- Made sure that every machine was identified through different keys
- OceanStore was originally designed for dedicated distributed network servers; Farsite was designed for local commodity machines
- What's up with the file lease mechanism?
- Four kinds
- Likely discovered an application class that broke and needed different semantics
- Unable to give truly seamless access as if local
- Content, name, mode, and access leases
- Likely a Windows semantics problem, not a file-system problem. But because the file system, rather than the OS, was expected to accommodate, many 'hacks' were added
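Returning to the Byzantine fault tolerance question, the independence assumption can be made concrete with the standard sizing rule for Byzantine protocols: tolerating f faulty replicas takes 3f + 1 replicas, and a client can trust a result once f + 1 replicas report the same value, since at least one of them must be correct. The sketch below is a generic illustration of that rule, not Farsite's actual protocol; the function names are hypothetical.

```python
from collections import Counter


def replicas_needed(f: int) -> int:
    """Replicas required to tolerate f Byzantine faults (standard 3f + 1 bound)."""
    return 3 * f + 1


def accept_reply(replies, f: int):
    """Return a value reported by at least f + 1 replicas, or None if there is none."""
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    # With at most f faulty replicas, f + 1 matching replies include a correct one.
    return value if count >= f + 1 else None
```

The guarantee only holds while at most f replicas misbehave. A software bug or infection that hits every replica at once breaks the independence assumption, which is the objection raised in the notes above.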
Questions for Farsite retrospective
- If using different programming methods... how does this file-system work given different programming models
- Details of Byzantine fault tolerance
- Mentioned that they started to use formal methods, really good for their design
- A bit of a reality check that was necessary for this paper. Ultimate realization was that the proposed system was a little grandiose.
Notes
- GFS is great because it works on 'crap hardware'
- OceanStore is likely better for regular document storage
- NASD on top of GFS? More messages, likely too slow and could defeat the purpose of GFS
- How do you go to the REALLY LARGE SCALE and have things work?
- Great question, only application specific?
- Definitely a need for large scale resource sharing
- Currently there is no unified way to share resources across administrative domains, so resources are all siloed