Operating Systems 2022F Lecture 12


Video

Video from the lecture given on October 20, 2022 is now available:

Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)

Notes

Lecture 12
----------

To update your VM
 - back up first!  (these updates are very safe, but it's still good to have a backup)

As root: (so with sudo)
 - apt update   (update package list)
 - apt dist-upgrade  (update packages)
 - apt autoremove (remove packages no longer needed)
 - apt clean (delete downloaded archives)
 - snap refresh (update snaps)
 - reboot (only needed if the kernel has been updated)

You do updates mainly to keep up to date on security patches
Not so important on the class VMs, but if you're ever maintaining systems you should do this
 - can set unattended upgrades as well, see the "unattended-upgrades" package

In general, Ubuntu packages are documented in /usr/share/doc/<package name>


Note "apt clean" removes .deb's that were downloaded to /var/cache/apt/archives.  If you don't run it, they'll just accumulate there.

You can install packages by running
  apt install <package name>

To remove
  apt remove <package name>

To remove all configuration for the package (remove leaves config info)
  apt purge <package name>

Also, to search
  apt search <pattern>


In UNIX, files have a logical size and a physical size
 - logical: how many bytes can be read from it, its conventional size
 - physical: how much space it takes up on disk, is always in terms
             of blocks

For normal files, the physical size >= logical size, because we almost always
lose a bit of space in the last block (we may need only a few more bytes but
have to use a whole new 4K block, for example).

For example, consider a file with 4097 bytes in it, non-zero, on a filesystem
with 4K blocks (thus, on a device with 1K, 2K, or 4K physical blocks)
 - logical size: 4097 bytes
 - physical size: 8192 bytes (2 blocks), first block stores 4096 bytes,
   next stores just one byte
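
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rounding, assuming no holes and no compression; the function names are made up for this example.

```python
def physical_size(logical_size: int, block_size: int = 4096) -> int:
    """Bytes occupied on disk by a file with no holes and no compression."""
    blocks = -(-logical_size // block_size)  # ceiling division: round up to whole blocks
    return blocks * block_size


def wasted_bytes(logical_size: int, block_size: int = 4096) -> int:
    """Space lost in the last, partly filled block."""
    return physical_size(logical_size, block_size) - logical_size


if __name__ == "__main__":
    print(physical_size(4097))  # 8192, i.e. 2 blocks
    print(wasted_bytes(4097))   # 4095
```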

UNIX also has an optimization for files when they contain long strings of null bytes (zero bytes).  We can make a "hole" where we don't need to actually allocate space on disk, we can just say "these are zero"

When we ran mkfs.ext4, we punched a bunch of holes into the file
 - some blocks were used to store filesystem metadata
 - but rest were set to zero and thus could be de-allocated, turned into holes

If the physical size is less than the logical size, you have holes in the file
 - and if you just copied the bytes to another file, you'd get a file
   whose physical size >= logical size
    - holes have to be created, they aren't added automatically

To be specific
 - write system calls never generate holes
 - you have to do an "lseek" call to make a hole, i.e., move where you are writing.
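
A minimal sketch of making a hole with lseek, using Python's thin wrappers around the system calls. The function name and the gap size are made up for this example, and it assumes st_blocks counts 512-byte units (true on Linux). Whether the gap really ends up unallocated depends on the filesystem, so the physical size is reported, not promised.

```python
import os


def make_file_with_hole(path: str, gap: int = 1024 * 1024):
    """Write some data, seek past a gap without writing, then write again."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"start")
        os.lseek(fd, gap, os.SEEK_CUR)  # move the write position; nothing is written
        os.write(fd, b"end")
    finally:
        os.close(fd)
    st = os.stat(path)
    # st_size is the logical size; st_blocks * 512 approximates the physical size
    return st.st_size, st.st_blocks * 512
```

Reading the file back returns null bytes for the skipped range, the same as if zeros had been written there.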

If you make a file with truncate, you get a file that is one big hole
 - zero space on disk
 - but whatever logical size you want
 - if you read it, you'll get all null bytes
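
The same thing the truncate command does can be sketched from Python. The function name is made up for this example; the physical size is computed from st_blocks (assumed to be in 512-byte units, as on Linux) and is typically zero or close to it on filesystems that support holes.

```python
import os


def make_sparse_file(path: str, size: int):
    """Create a file of the given logical size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; the new range reads back as null bytes
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # (logical, approximate physical)
```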

Holes in files are just a simplified compression scheme
 - if a block is zero, no need to allocate space on disk for it
 - in the inode, it keeps track of holes, basically ranges of blocks
   that don't exist and are just zeros

So remember for every file, we have an inode
 - a file is a file name and a reference to an inode
 - the inode stores file metadata (as can be seen with stat)
 - it also stores info on the data blocks

So an inode is a block that contains the inode data structure.  This data structure is in two parts: info on file metadata, and info on data blocks
 - metadata: uid, gid, timestamps, size - what stat reports
 - data blocks: list of data blocks

Now the list of data blocks can be simple like "14, 17, 22, 2552", but
most filesystems employ various data structures and have provisions for
this list to point to other blocks where the list is continued.
 - have to account for very large and very small files
 - many filesystems use "extents", i.e., block ranges (e.g., 22-555)
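
To see why extents help, here is a toy sketch that collapses a plain block list into (first, last) ranges. It only illustrates the idea; it is not the on-disk format of ext4 or any other real filesystem, and the function name is made up.

```python
def to_extents(blocks):
    """Collapse a sorted list of block numbers into (first, last) ranges."""
    extents = []
    for b in blocks:
        if extents and b == extents[-1][1] + 1:
            # block continues the current run, so grow the last extent
            extents[-1] = (extents[-1][0], b)
        else:
            # gap in the numbering, so start a new extent
            extents.append((b, b))
    return extents


if __name__ == "__main__":
    print(to_extents([14, 17, 22, 23, 24, 2552]))
    # [(14, 14), (17, 17), (22, 24), (2552, 2552)]
```

A file stored in one contiguous run needs a single range no matter how large it is, while a plain list grows with every block.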

Remember there is no explicit "end of file" marker
 - we just know the logical size, so in the last block the OS will
   only return the part that is valid

So what is a hole?
 - in the list of blocks, you put a special code (perhaps zero) that
   indicates this block isn't allocated, it is just zeros.
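
A toy model of the last few points: an "inode" with a logical size and a block list, where a special code marks a hole. Everything here (the class, the HOLE value, the dict standing in for a disk) is made up for illustration and is far simpler than a real filesystem.

```python
BLOCK_SIZE = 4096
HOLE = 0  # hypothetical code meaning "not allocated, reads as zeros"


class ToyFile:
    def __init__(self, size, block_list, disk):
        self.size = size              # logical size in bytes
        self.block_list = block_list  # block numbers, HOLE for unallocated
        self.disk = disk              # dict: block number -> BLOCK_SIZE bytes

    def read_all(self):
        out = bytearray()
        for block_no in self.block_list:
            if block_no == HOLE:
                out += bytes(BLOCK_SIZE)    # hole: synthesize zeros
            else:
                out += self.disk[block_no]  # real block: fetch from "disk"
        return bytes(out[:self.size])       # no EOF marker, just clip to size

    def physical_size(self):
        return BLOCK_SIZE * sum(1 for b in self.block_list if b != HOLE)
```

With a block list like [14, HOLE, 17] and a logical size of 8193, reading gives 4096 bytes from block 14, 4096 zeros, and one byte from block 17, while only two blocks are counted as physical space.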

Logical size of a file is normally all we care about as a developer
 - that's how many bytes you can read from a file, period

Physical size is an implementation detail
 - can vary across filesystem implementations
 - on a filesystem that compresses transparently, can be highly variable
 - holes can reduce the physical size

You only care about physical size if you're concerned about how much disk
space is being occupied (i.e., you're running out of space)
  - check df, that will tell you how much space is available on
    a filesystem

Normally physical >= logical, but the difference is at most one byte less
than a block ( < 4K generally)
 - this assumes no compression
 - holes are a simplistic kind of compression

On temp directories, do a chmod +t, which sets the "sticky bit"
 - normally in a world-writable directory, anybody can delete any file
 - but with +t, anybody can create files, but only the owner
   can delete them
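
The same permission bits can be set from a program. A small sketch, assuming you own the directory being created; the function name is made up, and mode 1777 is the usual /tmp-style combination of world-writable plus sticky.

```python
import os
import stat


def make_shared_dir(path: str) -> int:
    """Create a world-writable directory with the sticky bit set."""
    os.mkdir(path)
    # chmod explicitly: the mode passed to mkdir would be filtered by the umask
    os.chmod(path, 0o777 | stat.S_ISVTX)  # equivalent to chmod 1777
    return stat.S_IMODE(os.stat(path).st_mode)
```

In ls -ld output the sticky bit shows up as a trailing "t" in the permissions, e.g. drwxrwxrwt.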

So how do we get data out of a file /mnt/f1, when /mnt is a filesystem stored in /home/soma/Tut5/foo?
 - read /mnt/f1, sees data is in the block device /dev/loop5
 - /dev/loop5 says look in the file /home/soma/Tut5/foo
 - when reading /home/soma/Tut5/foo, it says to look at the block
   device associated with the / filesystem, namely /dev/mapper/vg0-lv--0

We call /dev/loopX a loopback device because access to it has to "loop" again through the filesystem layers to get the data.  We go down to the block level and then we go back up again to the file level, only to go back down to blocks again, but on a different device (this one hopefully getting us to real device storage).

We're playing with loopback devices because it isn't easy to add new devices to your class VMs.  (On your own systems, you can play with devices by plugging
in USB drives)

But this sort of thing happens all the time when we play with virtual machines


So why /dev/mapper?
 - because we actually are using something called LVM, Logical Volume Manager
   - abstraction over disk blocks
   - allows multiple disks to be combined into one block device,
     so one filesystem can span disks

On the actual class VM, we have these block devices for storage
 - /dev/vda2, a partition on /dev/vda which is our first virtual disk
   - in this partition is an LVM partition which is where the root filesystem is
 

"Sectors" are more of a classical hard disk terminology
 - cylinders, heads, sectors - have to specify all
   to specify a part of a disk
 - nowadays we just treat it as a linear array of blocks

(You can just say sectors = blocks)