Operating Systems 2021F Lecture 14

Video

Video from the lecture given on November 4, 2021 is now available:

Video is also available through Brightspace (Resources->Class zoom meetings->Cloud Recordings tab)

Notes

Lecture 14
----------
 - Happy Diwali! (for those who celebrate)

 - grading: will upload A2 after class, midterm will be soon
   (by next class)
 - schedule will be posted when midterm grades are posted, will
   announce on Teams hopefully this weekend

 - today: more about files & filesystems for T6, which I'll post this afternoon

 - 3rd party builds of Firefox can't use the Firefox branding without
   special agreement.  Many Linux distributions can use the Firefox logos, but internally they give their builds slightly different names

(For the final I'll probably try to message on Teams to schedule interviews)


Files & filesystems
 - a file is a mapping from a hierarchical key (the filename) to an arbitrarily sized value (the contents of the file)
 - there's also metadata associated with each file
    - permissions, ownership, timestamps, etc
 - metadata is associated with the value, not the key
    - goes with file contents, not the file name

So files (in UNIX-like systems) are actually a mapping between a name and
a data structure that represents the value and associated metadata
  - that data structure is known as an "inode"
    (I think the i stands for indirection but I'm not sure; "index" is the more commonly cited guess.)

stat system call returns the metadata associated with an inode
 - there's also a command line wrapper around stat

Example:

  File: foo
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fd00h/64768d	Inode: 45220577    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/    soma)   Gid: ( 1000/    soma)
Access: 2021-11-04 10:13:10.897903575 -0400
Modify: 2021-11-04 10:13:10.897903575 -0400
Change: 2021-11-04 10:13:10.897903575 -0400
 Birth: 2021-11-04 10:13:10.897903575 -0400


From the stat man page:
    struct stat {
        dev_t     st_dev;         /* ID of device containing file */
        ino_t     st_ino;         /* Inode number */
        mode_t    st_mode;        /* File type and mode
                                     (permission bits) */
        nlink_t   st_nlink;       /* Number of hard links */
        uid_t     st_uid;         /* User ID of owner */
        gid_t     st_gid;         /* Group ID of owner */
        dev_t     st_rdev;        /* Device ID (if special file) */
        off_t     st_size;        /* Total size, in bytes */
        blksize_t st_blksize;     /* Block size for filesystem I/O */
        blkcnt_t  st_blocks;      /* Number of 512B blocks allocated */

        struct timespec st_atim;  /* Time of last access */
        struct timespec st_mtim;  /* Time of last modification */
        struct timespec st_ctim;  /* Time of last status change */
    };
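
A minimal sketch (not from the lecture) of reading some of these fields from Python, whose os.stat wraps the stat system call. The scratch file "foo" and the describe helper are made up for the example.

```python
import os
import stat
import tempfile

def describe(path):
    """Return a few struct stat fields for path as a dict."""
    st = os.stat(path)  # wraps the stat system call
    return {
        "inode": st.st_ino,
        "links": st.st_nlink,
        "size": st.st_size,
        "mode": stat.filemode(st.st_mode),      # human-readable permission string
        "is_regular": stat.S_ISREG(st.st_mode),
        "uid": st.st_uid,
    }

# Hypothetical scratch file, created only for this example.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "foo")
    open(path, "w").close()   # empty regular file, like the one in the stat output
    info = describe(path)

print(info)
```

The inode number and uid will differ from machine to machine; the size and link count should match the empty file shown in the command output.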


Key concepts that we haven't covered:
 - blocks
 - multiple timestamps
 - hard links (NOT symbolic links)
 - device IDs
 - inode number

(Mentioned mounting, will discuss more soon)

So why three time stamps?
 - Access: when last read
 - Modify: when last written (data change)
 - Change: when inode last changed   <--- chmod
           (metadata change)
 - Birth: file creation (mostly not specified/used on UNIX)
    - not in struct stat; on Linux it is available through statx() (stx_btime)
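
To see the Modify/Change difference, here is a small sketch (assuming a Linux-like system) that does a chmod and compares timestamps before and after. The file name and the 0.05 second pause are arbitrary choices for the example.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "foo")
    with open(path, "w") as f:
        f.write("hello")
    before = os.stat(path)

    time.sleep(0.05)          # let the clock move past the timestamp granularity
    os.chmod(path, 0o600)     # metadata-only change: contents are untouched
    after = os.stat(path)

mtime_unchanged = after.st_mtime_ns == before.st_mtime_ns
ctime_advanced = after.st_ctime_ns > before.st_ctime_ns
print("mtime unchanged:", mtime_unchanged)
print("ctime advanced: ", ctime_advanced)
```

chmod only touches the inode, so the Modify time should stay put while the Change time moves forward.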

A quirk of UNIX files
 - a file access (read) results in a disk write (to update the accessed
   time stamp value in the inode)
 - on modern systems, this is often a bad idea (too much SSD writing)
    - so depending on how the filesystem was mounted (e.g. with the
      noatime or relatime options on Linux), the accessed timestamp
      may be approximate or just wrong

If something is done "lazily" in CS, it means it is put off as long as possible
 - try to minimize the work since if it is put off long enough it may
   not need to be done

So with reading, if you do 100 reads from a file, technically it should be writing the same inode to disk 100 times
 - but if we're lazy and only do the write, say, every few seconds,
   we'd only do one write to the inode for the accessed timestamp

Buffering I/O is a way to be lazy
  - reduces number of writes in exchange for higher latency in some circumstances
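
A toy model of the lazy timestamp update described above. This is only an illustration, not how any real kernel does it; the LazyInode class and its counters are invented for the sketch.

```python
class LazyInode:
    """Toy inode: reads update the access time in memory only;
    the (expensive) disk write happens when flush() is called."""

    def __init__(self):
        self.atime = None
        self.dirty = False      # True if memory is newer than disk
        self.disk_writes = 0    # how many simulated inode writes we did

    def read(self, now):
        self.atime = now        # cheap: memory only
        self.dirty = True

    def flush(self):
        if self.dirty:          # write only if something changed
            self.disk_writes += 1
            self.dirty = False

inode = LazyInode()
for i in range(100):
    inode.read(now=i)           # 100 reads, no disk writes yet

inode.flush()                   # periodic flush, e.g. every few seconds
inode.flush()                   # nothing changed since, so no write
print(inode.disk_writes)        # prints 1
```

The cost of being lazy shows up if the machine crashes before the flush: the in-memory timestamp is lost.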


inode numbers
 - unique ID for an inode
 - filenames are really name -> inode number mappings
 - inode numbers are unique *within a given filesystem*
    - each filesystem has its own inode number namespace
      (the inode numbers on your USB disk filesystem have nothing
       to do with the inode numbers in your SSD filesystem)
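
Since inode numbers are only unique per filesystem, identifying a file system-wide takes the device ID as well. A minimal sketch, with file_id as a made-up helper name:

```python
import os
import tempfile

def file_id(path):
    """(device ID, inode number) pair; the inode number alone is only
    unique within one filesystem."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    open(a, "w").close()
    open(b, "w").close()
    id_a, id_b = file_id(a), file_id(b)

same_device = id_a[0] == id_b[0]       # same directory, same filesystem
different_inode = id_a[1] != id_b[1]   # two distinct files
print(id_a, id_b)
```

Python's os.path.samefile makes this same comparison of device and inode number.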

If you give the -i option to ls, it will show you inode numbers.

A filename is better thought of as a pathname
 - it is the combination of names of every containing directory
   plus the final name

A directory is a data structure (really, a type of inode) which maps names to inode numbers
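
A sketch of looking at a directory as a name -> inode number table, roughly what ls -i shows. The directory_entries helper and the file names are invented for the example.

```python
import os
import tempfile

def directory_entries(path):
    """Return {name: inode number} for the entries of a directory."""
    with os.scandir(path) as it:
        return {entry.name: entry.inode() for entry in it}

with tempfile.TemporaryDirectory() as d:
    for name in ("a.txt", "b.txt"):
        open(os.path.join(d, name), "w").close()
    os.mkdir(os.path.join(d, "sub"))
    entries = directory_entries(d)

print(entries)
```

Note that os.scandir leaves out the "." and ".." entries, even though they are stored in the directory like any other name.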

When you're in a directory, "." refers to this directory
 - "." is a filename like any other that refers to an inode

Hard link count
 - number of names an inode has
   - basically, a reference count
 - regular files will normally have a link count of 1
   - for its filename
 - directories will have a link count of at least 2
   - the directory name in the containing folder
   - the "." link in its own directory
   - every subdirectory has a parent link, those increase
     the link count too
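
A quick sketch of the link count acting as a reference count for a regular file: os.link adds a second name for the same inode. The names "first" and "second" are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "first")
    second = os.path.join(d, "second")
    with open(first, "w") as f:
        f.write("data")

    links_before = os.stat(first).st_nlink
    os.link(first, second)                  # add a second name (hard link)
    links_after = os.stat(first).st_nlink
    same_inode = os.stat(first).st_ino == os.stat(second).st_ino

print(links_before, links_after, same_inode)
```

Neither name is the "real" one; both are just directory entries pointing at the same inode.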

So in every directory there are always two entries
 - "."  <-- this directory (like "self" in an OO language)
 - ".." <-- the parent (containing) directory
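
The two special entries can be checked by comparing inode numbers. A minimal sketch, with a made-up "child" directory:

```python
import os
import tempfile

def ino(path):
    """Inode number that path resolves to."""
    return os.stat(path).st_ino

with tempfile.TemporaryDirectory() as parent:
    child = os.path.join(parent, "child")
    os.mkdir(child)
    dot_is_self = ino(os.path.join(child, ".")) == ino(child)
    dotdot_is_parent = ino(os.path.join(child, "..")) == ino(parent)

print(dot_is_self, dotdot_is_parent)
```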

By default "." is not in your PATH
 - having it there is considered a security vulnerability
 - imagine a web server where you can upload files to a special directory
 - sysadmin is working in the directory, attacker has uploaded a file "ls"
 - now when the sysadmin types "ls" they could get the attacker's ls, not
   /bin/ls

What does it mean to delete a file on a UNIX-like system?
 - you don't!
 - you just "unlink" files
    - remove directory entries
 - when an inode ref count goes to 0, it is considered deleted
   and is de-allocated
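
A sketch of unlinking in action: removing one name leaves the inode and its data alone as long as another name remains. The names "name1" and "name2" are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    name1 = os.path.join(d, "name1")
    name2 = os.path.join(d, "name2")
    with open(name1, "w") as f:
        f.write("still here")
    os.link(name1, name2)           # inode now has two names

    os.unlink(name1)                # remove one directory entry
    name1_gone = not os.path.exists(name1)
    with open(name2) as f:
        surviving_data = f.read()   # inode and data are untouched
    links_left = os.stat(name2).st_nlink

    os.unlink(name2)                # last name removed: count reaches 0
    name2_gone = not os.path.exists(name2)

print(name1_gone, surviving_data, links_left, name2_gone)
```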

In general files are never immediately deleted
 - just like with a program, if you de-allocate a data structure it is
   still there in memory
 - and yes, this is lazy

If you want data to be erased from disk when you delete, use an
encrypted filesystem
 - because the key for the file will be written over when it is deleted
 - (encrypted data with lost key is considered erased if your encryption
    is any good)

inode ref count is the hard link count

Note that any commands that try to overwrite the data on disk won't work,
especially on an SSD.
  - need to explain filesystems a bit first to see why
  - data will be overwritten, but eventually (could be very long)

So what's a filesystem?
 - an "on disk" data structure for storing files
   (i.e., inodes and associated data)


To understand filesystems, you have to understand how disks store data

What does a "disk" store?  How does it appear to the OS?

 - array of fixed-sized blocks
 - block number: which block
 - block contents: a chunk of 512, 1024, 4096, 8192, or another
   power-of-two number of bytes
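
A sketch of the "array of fixed-sized blocks" view, using an ordinary file as a stand-in for the disk. The 4096-byte block size and the read_block/write_block helpers are assumptions made for the example.

```python
import tempfile

BLOCK_SIZE = 4096   # assumed block size for this sketch

def write_block(disk, block_number, data):
    """Write exactly one block at the given block number."""
    assert len(data) == BLOCK_SIZE
    disk.seek(block_number * BLOCK_SIZE)
    disk.write(data)

def read_block(disk, block_number):
    """Read exactly one block at the given block number."""
    disk.seek(block_number * BLOCK_SIZE)
    return disk.read(BLOCK_SIZE)

with tempfile.TemporaryFile() as disk:        # a file standing in for a disk
    disk.truncate(8 * BLOCK_SIZE)             # an 8-block "device"
    write_block(disk, 3, b"x" * BLOCK_SIZE)
    block3 = read_block(disk, 3)
    block4 = read_block(disk, 4)              # never written: all zero bytes

print(block3[:4], block4[:4])
```

All access goes through a block number; there is no notion of files or names at this level.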

This is why UNIX has block devices & character devices
 - block device: you access data on them a block at a time
 - character device: you access data one character (byte) at a time

So a filesystem is a data structure for storing inodes & file data in blocks of some fixed size
 - some blocks are inode blocks
 - rest are data blocks

(Non-UNIX filesystems will divide these into directory and data blocks)

Each device has a native block size
 - nowadays 4K or 8K
 - so, decided by the OS (firmware) on the disk
    - yes, disks of any kind are really their own computers

The filesystem block size should be a multiple of the device block size
 - because we always read or write an entire block at once
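
The arithmetic behind whole-block I/O, as a small sketch. Both helper names are invented, and this ignores anything filesystem-specific.

```python
def device_blocks_per_fs_block(fs_block_size, device_block_size):
    """How many device blocks make up one filesystem block."""
    if fs_block_size % device_block_size != 0:
        raise ValueError("filesystem block size must be a multiple "
                         "of the device block size")
    return fs_block_size // device_block_size

def blocks_needed(size_bytes, block_size):
    """Whole blocks needed to hold size_bytes (ceiling division)."""
    return (size_bytes + block_size - 1) // block_size

print(device_blocks_per_fs_block(4096, 512))   # prints 8
print(blocks_needed(4097, 4096))               # prints 2
```

One byte past a block boundary costs a whole extra block, since blocks are the smallest unit that can be read or written.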


Character devices are for anything that isn't mass storage
 - so keyboards, mouse/touchpad, terminals
   - typically things for interacting with humans

look at /dev to see what different kinds of devices you have

you can't delete an inode directly, you can just remove its names
 - unlink all its names

When we "read" a file, we are just accessing data in its underlying block device

Instead of using a real block device, we can use a simulated one
for playing with filesystems
 - turn a file into a block device
 - can then make filesystems in it that can be mounted

this is all stuff for tutorial 6, so don't worry if you're lost
 - you may want to come back to what I've done here
 - we'll do it again a few times

symbolic links are filenames that refer to filenames, not inodes
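
A sketch contrasting the two kinds of link: the hard link keeps working after the original name is removed, while the symbolic link is left pointing at a name that no longer exists. The names "target", "hard", and "soft" are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "target")
    hard = os.path.join(d, "hard")
    soft = os.path.join(d, "soft")
    with open(target, "w") as f:
        f.write("data")

    os.link(target, hard)         # hard link: another name for the inode
    os.symlink("target", soft)    # symbolic link: stores the name "target"

    stored_name = os.readlink(soft)
    soft_has_own_inode = os.lstat(soft).st_ino != os.stat(target).st_ino
    hard_shares_inode = os.stat(hard).st_ino == os.stat(target).st_ino

    os.unlink(target)             # remove the original name
    hard_still_works = os.path.exists(hard)
    soft_is_dangling = os.path.islink(soft) and not os.path.exists(soft)

print(stored_name, soft_has_own_inode, hard_shares_inode,
      hard_still_works, soft_is_dangling)
```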