Operating Systems 2020W Lecture 10
Video
The video from the lecture given on February 7, 2020, is now available.
Notes
Lecture 10
----------

Topics:
 - VM issues
 - signals and sleep
 - filesystems
 - mount
 - dd
 - superblocks, inode blocks, data blocks
 - fsck
 - sshfs

On the VMs, the root filesystem is on /dev/mapper/COMPbase--vg-root; this is an LVM volume.

LVM is the Logical Volume Manager
 - allows you to combine multiple disks and partitions into one virtual device
 - really not needed for a simple VM, but Ubuntu defaults to using one

We want to play with filesystems
 - but we don't have external devices we can add easily (at least on OpenStack currently)
 - so we want to make filesystems in files, i.e., treat a file like a block device
 - normally you don't do this, but it can be useful
 - this is how virtual machine disks are implemented

To use a new filesystem:
 - prepare the block device (plug in a USB stick, or make an empty file)
 - create a filesystem on the block device
 - mount the new filesystem on a chosen mount point (an empty directory)

The dd command:
 - allows you to copy from an input file to an output file
 - its key feature is that it lets you specify the number of "blocks" to copy and how big they should be
 - a block to dd is just the size of the buffer to use when reading and writing
 - so if bs=4096 and count=10, then dd will issue 10 reads and 10 writes, each using a 4096-byte buffer (asking to read 4096 bytes and writing up to 4096 bytes, depending on how many were actually read)

Why can't we just do "cp /dev/zero newfile" rather than "dd if=/dev/zero of=newfile bs=4096 count=10000"?

We can't use cp because you can't copy a character device: it will potentially provide an infinite amount of data.  So we use dd to control how much we read and how those reads are done.

Yes, there are other ways to make a file full of zeros, but dd is generally useful
 - it can fill a file with random bytes by reading from /dev/urandom
 - it can manipulate arbitrary portions of large files (copy, erase)
 - it is very useful for directly reading from and writing to actual devices (hard disks, etc.)

(We will be making our own character devices later this term.)

Remember that device files have their own implementations of read, write, etc.  They can do anything, including returning random bytes or zeros (or digits of pi).

mount: add a filesystem to the filesystem hierarchy
 - "mount fs dir" means make the files in filesystem fs appear under directory dir
 - dir was an empty directory before, but after this command all the files in fs will appear in it

Example:
 - if you insert a USB stick on an Ubuntu system, you'll see its files appear in /media/<user>/<device name>

Mass storage devices have "less" storage than advertised because they advertise storage in base-10 units but we use base-2 in practice, e.g., the difference between 1000 (10^3) and 1024 (2^10).  The difference gets really big when we talk about gigabytes and up!

(But we also lose some space to filesystem overhead; by default ext4 reserves 5% of the space so it always has some extra to play with.  The root user can use this reserved space, and you can change how much is reserved.)

When we treat a file as a block device, we do this by associating it with a "loopback" block device, e.g., /dev/loop0.  You don't have to play with this device directly; normally it is allocated for us automatically (but it shows up when we do df).
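Putting these pieces together, here is a minimal sketch of the file-as-a-filesystem workflow, assuming an Ubuntu-style system with sudo access; the file name myfs.img and mount point /mnt/myfs are made up for illustration:

  # make a ~40 MB file full of zeros to act as our "disk"
  dd if=/dev/zero of=myfs.img bs=4096 count=10000

  # create an ext4 filesystem inside the file
  # (mkfs may ask for confirmation, since this is a regular file rather than a block device)
  mkfs.ext4 myfs.img

  # mount it on an empty directory; the loop option attaches the file to a free /dev/loopN device
  sudo mkdir -p /mnt/myfs
  sudo mount -o loop myfs.img /mnt/myfs

  # the new filesystem (and its loop device) now shows up in df
  df -h /mnt/myfs

  # when done, unmount it
  sudo umount /mnt/myfs

The "-o loop" option is what asks the kernel to allocate the loopback block device for us, which is why a /dev/loopN entry appears in df while the image is mounted.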
One key reason we make filesystems is to provide isolation
 - when you fill up a USB stick, you don't fill up your main filesystem
 - similarly, when you fill up a virtual disk, it won't fill up the rest of the system; it will just use up the maximum space the virtual disk was given

Imagine running a program so it only had access to the files in a special virtual filesystem
 - it literally couldn't see anything else, let alone mess with anything else
 - snaps and containers are built around this idea

You can manipulate the parameters of a filesystem with utilities
 - for ext4, you can use e2label (to give/change its name) and tune2fs for other parameters (e.g., how much space is reserved for the root user)

In a UNIX filesystem, the blocks are divided between three types
 - data blocks (contents of files)
 - inode blocks (file metadata)
   - directory blocks are normally a kind of inode block
 - superblocks

Superblocks hold metadata about the whole filesystem (rather than individual files)
 - what type of filesystem it is (ext4? vfat?)
 - what the parameters of the fs are (how big? block size? how many inodes?)
 - if you lose the superblock, you lose the filesystem
 - which is why you normally have backup superblocks

Normally a filesystem will take up an entire device (so mkfs.ext4 asks the kernel how big the device is and adjusts accordingly).  We control the size of a virtual filesystem by controlling how big a file we allocate (with dd or similar).

"Losing" a superblock means erasing or corrupting it.  It isn't quite deletion, because you can't add or remove blocks from a device once it is created (how would you "grow" a USB stick?).

Note that filesystems are complex data structures.  If you look up a filesystem's superblock and low-level format, expect to find way more info than you might otherwise expect!

Note that a filesystem's block size might not match the block size of the underlying device
 - but normally it will be a multiple of it, e.g., the device has a block size of 1024 but the filesystem's block size is 4096
 - we need to be able to easily translate between device blocks and filesystem blocks

If you delete the primary superblock (which is block 0 or 1 of most filesystems), you'll make the filesystem unmountable.
 - but you can recover by repairing the filesystem using a backup superblock
 - fsck can normally do it, but you may have to tell it where to find the backup superblock

Superblocks don't change often (and what does change is not too important), so they don't have to be updated all the time.  They are there for disasters.

Filesystems should be recoverable, a feature that isn't needed by most data structures
 - the data structures you've studied: how well do they deal with corrupted or deleted pointers?  Generally not well!
 - filesystems are tolerant of such problems because storage can go bad

Note that tools like fsck *are not* data recovery tools
 - they try to save the filesystem, not files
 - they can delete files in order to restore integrity to a filesystem

If a filesystem is corrupted, *never* write to it.
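A sketch of the superblock experiments described above, reusing the hypothetical (and unmounted) myfs.img image from earlier; the offsets given are the usual ext4 ones, but check the output of mkfs.ext4 or dumpe2fs for the real backup superblock locations on your filesystem:

  # inspect the filesystem's parameters, including backup superblock locations
  dumpe2fs myfs.img | less

  # change the label and the percentage of space reserved for root
  e2label myfs.img scratch
  tune2fs -m 1 myfs.img

  # deliberately wipe the primary superblock (for ext4 it starts 1024 bytes
  # into the device and is 1024 bytes long); never do this to data you care about
  dd if=/dev/zero of=myfs.img bs=1024 seek=1 count=1 conv=notrunc

  # mounting will now fail; repair using a backup superblock
  # (common locations: block 8193 with 1024-byte blocks, block 32768 with 4096-byte blocks)
  fsck.ext4 -b 32768 myfs.img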
Instead,
 - get your data off the filesystem
 - THEN try to repair it

A great tool for copying data off a failing disk is dd
 - you can get a raw filesystem image that can later be examined using advanced tools
 - there are fancy versions of dd that can get data even when reads fail on parts of the disk

Most of the time nowadays, if a device has errors, the device should be thrown away (recycled)
 - there are low-level errors all the time on modern storage devices
 - these errors are hidden by the device controllers
 - when they can no longer hide the errors, that means there are too many to hide, so the device is going bad fast
 - GET YOUR DATA OUT ASAP and replace it!

Professional data recovery people can recover data when drives have been damaged in all kinds of ways
 - they charge for this
 - and they have real limits
 - HAVE BACKUPS

When you run fsck, it may find inodes that are allocated but have no hard links to them
 - file contents without a filename

fsck will give these inodes a name, e.g., for inode 5200 it will create /lost+found/#5200

If you find files in lost+found, it means fsck put them there
 - which means your filesystem was corrupted, but maybe it is okay now?
 - hope you had backups!

A filesystem is a set of files that are accessed using common code, with all the files under a mount point
 - most commonly on a device (e.g., a hard drive)
 - but it can be virtual (/proc)
 - and it can be remote (NFS, Samba/CIFS, Ceph, sshfs)
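A sketch of getting data off a failing disk before attempting any repair, assuming (hypothetically) that the dying ext4 partition shows up as /dev/sdb1 and that there is room for the rescue.img copy; the "fancy versions of dd" mentioned above include tools like GNU ddrescue:

  # copy the raw partition into an image, continuing past read errors (noerror)
  # and padding unreadable blocks with zeros (sync) instead of aborting
  sudo dd if=/dev/sdb1 of=rescue.img bs=4096 conv=noerror,sync

  # work on the copy, never the failing original: check it, then look around
  fsck.ext4 -f rescue.img
  sudo mkdir -p /mnt/rescue
  sudo mount -o loop,ro rescue.img /mnt/rescue
  ls /mnt/rescue/lost+found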