Lecture 9
---------
Beyond filesystems (& better filesystems)
- what happens when they get corrupted?
- what happens when we need to recover old data?
- what about multiple storage devices?
Journaled filesystems
- avoid long fsck's
- strategy: write everything TWICE
- the first write is sequential, the second is random (to each block's final location)
- first write pass is to a log called the journal: a sequential data structure of updates to the FS
- example: changed inode 2203, updated data block 18523, etc.
- to get a full view of the FS, you have to check the journal and the regular fs
- limit the size of the journal and periodically commit its changes to the rest of the FS
- why it helps: corruption comes from interrupted writes, and the journal records which updates were in progress
- with this, an fsck is just playing back the journal instead of checking the whole FS
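
The journaling steps above can be sketched as a toy model. This is only an illustration, not how any real filesystem is implemented: the class JournaledDisk, its method names, and the use of a dict for the "disk" are all assumptions made for the example.

```python
class JournaledDisk:
    """Toy journaled store (hypothetical; block numbers map to data)."""

    def __init__(self, journal_limit=4):
        self.blocks = {}                  # the regular FS: final block locations
        self.journal = []                 # sequential log of (block, data) updates
        self.journal_limit = journal_limit

    def write(self, block, data):
        # First write: append the update to the journal (sequential).
        self.journal.append((block, data))
        # Keep the journal small by committing once it reaches the limit.
        if len(self.journal) >= self.journal_limit:
            self.commit()

    def commit(self):
        # Second write: copy each update to its final location (random).
        for block, data in self.journal:
            self.blocks[block] = data
        self.journal.clear()

    def read(self, block):
        # Full view of the FS: newest journal entry first, then the regular FS.
        for b, data in reversed(self.journal):
            if b == block:
                return data
        return self.blocks.get(block)

    def recover(self):
        # After a crash, the "fsck" is just playing back the journal.
        # Replaying an update twice is harmless: it writes the same data again.
        self.commit()


disk = JournaledDisk(journal_limit=4)
disk.write(2203, "inode v2")
disk.write(18523, "data v2")
# Pretend the machine crashed here: nothing has reached the regular FS yet.
crashed_view = dict(disk.blocks)
disk.recover()
```

After recover(), both updates are in disk.blocks and the journal is empty, without scanning any other block.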
Journaled filesystems work well when the workload is mostly reads. But what if it is mostly writes?
- bad because we're writing everything twice, so we get half the performance (or much less)
- but what if we only wrote once to the log?
=> log-structured filesystems
Log-structured filesystems have a nice feature:
- they don't write to the same block too often
- writing to the same block with flash storage too many times is BAD, destroys the block
- flash firmware implements a log-structured-like "filesystem" layer
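
A minimal sketch of why writing only to a log spreads writes out, assuming a made-up LogStructuredStore class. Cleaning (reclaiming stale blocks) is left out, so the toy simply refuses to write once the log is full; real log-structured filesystems and flash firmware are far more involved.

```python
class LogStructuredStore:
    """Toy log-structured store (hypothetical names throughout)."""

    def __init__(self, num_physical):
        self.physical = [None] * num_physical   # physical blocks
        self.writes = [0] * num_physical        # write count per physical block
        self.mapping = {}                       # logical block -> physical block
        self.head = 0                           # next free position in the log

    def write(self, logical, data):
        if self.head >= len(self.physical):
            # A real system would clean up stale blocks here and reuse them.
            raise RuntimeError("log full (no cleaner in this sketch)")
        # Always write at the head of the log, never in place.
        self.physical[self.head] = data
        self.writes[self.head] += 1
        # The newest copy wins; the old physical block becomes stale.
        self.mapping[logical] = self.head
        self.head += 1

    def read(self, logical):
        return self.physical[self.mapping[logical]]


store = LogStructuredStore(num_physical=8)
for version in range(8):
    store.write(7, f"v{version}")   # same logical block, eight times

latest = store.read(7)
max_writes = max(store.writes)
```

Logical block 7 was written eight times, but no physical block was written more than once. Updating in place would have hit the same physical block all eight times.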
Logical Volume Management (LVM)
- filesystem just needs an array of blocks
- why not merge arrays of blocks from multiple devices?
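
One way to picture merging block arrays is plain concatenation: logical block numbers run through the first device, then the second, and so on. The sketch below assumes that layout and uses a hypothetical LogicalVolume class; it is not the on-disk format or API of any actual volume manager.

```python
import bisect


class LogicalVolume:
    """Toy volume: several devices presented as one array of blocks."""

    def __init__(self, device_sizes):
        self.devices = [[None] * n for n in device_sizes]
        # starts[i] is the first logical block number stored on device i.
        self.starts = []
        total = 0
        for n in device_sizes:
            self.starts.append(total)
            total += n
        self.size = total

    def locate(self, block):
        # Translate a logical block number into (device index, offset).
        if not 0 <= block < self.size:
            raise IndexError(f"block {block} out of range")
        dev = bisect.bisect_right(self.starts, block) - 1
        return dev, block - self.starts[dev]

    def write(self, block, data):
        dev, offset = self.locate(block)
        self.devices[dev][offset] = data

    def read(self, block):
        dev, offset = self.locate(block)
        return self.devices[dev][offset]


# Three devices of 100, 50 and 200 blocks look like one 350-block array.
volume = LogicalVolume([100, 50, 200])
volume.write(120, "hello")
where = volume.locate(120)   # lands on the second device, offset 20
```

The filesystem on top only sees block numbers 0 to 349 and never needs to know which device a block lives on.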
Modern filesystems combine LVM and a regular FS