COMP 3000 Essay 1 2010 Question 9

From Soma-notes
Revision as of 22:01, 6 October 2010 by Gbint (talk | contribs) (added data deduplication to the overview of ZFS features; minor editing of existing text)

Question

What requirements distinguish the Zettabyte File System (ZFS) from traditional file systems? How are those requirements realized in ZFS, and how do other operating systems address those same requirements? (Please discuss legacy, current, and in-development systems.)

Answer

ZFS was developed by Sun Microsystems (now owned by Oracle) as a server-class file system. This differs from most file systems, which were developed as desktop file systems that could also be used on servers. Because the server was the target, particular attention was paid to data integrity, storage capacity, and speed.

One of the most significant ways in which ZFS differs from traditional file systems is its level of abstraction. A traditional file system abstracts away the physical properties of the medium on which it resides (hard disk, flash drive, CD-ROM, etc.). ZFS goes a step further and abstracts away whether the file system lives on one or on many pieces of hardware or media. Examples include a single hard drive, an array of hard drives, or a number of hard drives on non-co-located systems.

One of the mechanisms that allows this abstraction is that the volume manager, which is normally a program separate from the file system in traditional designs, is built into ZFS.

ZFS is a 128-bit file system, which allows it to address 2^128 bytes of storage.
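To give a sense of scale, the arithmetic behind the 128-bit limit can be worked out directly. The short Python sketch below is only an illustration: the constant names are made up for this example, and the comparison against a 64-bit limit and the use of the binary zettabyte (2^70 bytes) are assumptions chosen for the comparison, not figures taken from the ZFS documentation.

```python
# Illustrative arithmetic only; all names here are hypothetical.

# Theoretical number of addressable bytes with 128-bit addressing.
LIMIT_128_BYTES = 2 ** 128

# The same figure for a 64-bit file system, for comparison.
LIMIT_64_BYTES = 2 ** 64

# One binary zettabyte (zebibyte); assumed unit for the conversion below.
ZETTABYTE = 2 ** 70


def ratio_128_to_64() -> int:
    """How many times larger the 128-bit limit is than the 64-bit limit."""
    return LIMIT_128_BYTES // LIMIT_64_BYTES


def limit_in_zettabytes() -> int:
    """The 128-bit limit expressed in binary zettabytes."""
    return LIMIT_128_BYTES // ZETTABYTE


if __name__ == "__main__":
    print("128-bit limit in bytes:", LIMIT_128_BYTES)
    print("times larger than 64-bit limit:", ratio_128_to_64())
    print("128-bit limit in zettabytes:", limit_in_zettabytes())
```

The ratio works out to 2^64, meaning the 128-bit address space is as many times larger than a 64-bit one as a 64-bit one is larger than a single byte.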


Major Features of ZFS

Data Integrity

  • Checksums
  • Self-monitoring and self-healing using mirroring and copy-on-write
  • Transaction-based file I/O
  • System snapshots
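The way checksums and mirroring combine to give self-healing can be sketched in a few lines. The following Python example is a simplified model, not ZFS code: the class name MirroredBlock, the two-copy layout, and the choice of SHA-256 are all assumptions made for illustration. It shows the general idea that a read verifies each copy against a stored checksum, returns a copy that passes, and overwrites any copy that fails.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum of a block (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical model: one logical block kept as two mirrored copies.

    The expected checksum is stored apart from the copies, so a damaged
    copy cannot vouch for itself.
    """

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        # Find the first copy whose contents match the stored checksum.
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("all copies failed checksum verification")

        # Self-heal: rewrite any copy that does not match the checksum.
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) != self.expected:
                self.copies[i] = bytearray(good)
        return good


if __name__ == "__main__":
    block = MirroredBlock(b"important data")
    block.copies[0][0] ^= 0xFF           # simulate silent corruption
    print(block.read())                  # served from the intact mirror
    print(bytes(block.copies[0]))        # the damaged copy was repaired
```

In this model the repair happens as a side effect of an ordinary read, which is what lets corruption be corrected without a separate recovery step.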

Data Deduplication

Duplicated data is physically recorded only once, with the shared blocks mapped into every file that contains them. Think of an email database where there may be 100 copies of the same message with the same 20 MB attachment: with deduplication, the attachment's blocks are stored once rather than 100 times. The overall physical storage required can be reduced, which can have important consequences for data center power, space, and cooling needs.
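The block-mapping idea can be illustrated with a small content-addressed store. This Python sketch is a toy model under stated assumptions: the DedupStore class, its method names, the tiny block size, and the use of SHA-256 as the block identifier are all hypothetical choices for the example and do not describe how ZFS lays out its deduplication tables.

```python
import hashlib


class DedupStore:
    """Hypothetical block store that keeps each distinct block only once."""

    def __init__(self, block_size: int = 4):
        self.block_size = block_size
        self.blocks = {}  # digest -> block contents (stored once)
        self.files = {}   # file name -> ordered list of block digests

    def write(self, name: str, data: bytes) -> None:
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this content has not been seen before.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name: str) -> bytes:
        # Reassemble the file from the shared blocks it references.
        return b"".join(self.blocks[d] for d in self.files[name])

    def physical_bytes(self) -> int:
        """Bytes actually stored, counting each distinct block once."""
        return sum(len(b) for b in self.blocks.values())


if __name__ == "__main__":
    store = DedupStore(block_size=4)
    attachment = b"ABCDEFGH"             # stands in for the 20 MB attachment
    for i in range(100):                 # 100 messages with the same attachment
        store.write(f"mail{i}", attachment)
    print(len(store.files), "files,", store.physical_bytes(), "bytes stored")
```

In this toy run, 100 logical copies of an 8-byte payload occupy 8 bytes of block storage plus the per-file lists of digests, which mirrors the email example above on a small scale.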