Operating Systems 2017F Lecture 15
Video
The video from the lecture given on Nov. 7, 2017 is now available. Unfortunately, the video cut out halfway through; however, audio is also available.
Notes
In Class
Lecture 15
----------
What's a filesystem?
 - persistent data structure organized around fixed allocation units (blocks)
 - maps hierarchical names (keys) to values
 - provides a file-like API (open, read, write, close, etc)
What does it mean to "make" a filesystem?
 - initialize the data structure
 - "formatting" a disk
Physical versus logical size of a file
 - logical: the "size" your program sees when accessing the file
 - physical: how much space the file takes up on disk
Physical is in terms of blocks - fixed units of storage allocation
 - ext4 uses 4k blocks by default
 - default for many command line utilities is 1k blocks
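The difference between the two sizes can be seen from user space with stat(). The following is a minimal sketch, assuming Linux, where st_blocks is counted in 512-byte units regardless of the filesystem's own block size. The helper names file_sizes and print_file_sizes are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Fill in the logical and physical size of the file at `path`.
 * Returns 0 on success, -1 if stat() fails. */
int file_sizes(const char *path, long long *logical, long long *physical)
{
    struct stat st;

    assert(path != NULL && logical != NULL && physical != NULL);
    if (stat(path, &st) != 0)
        return -1;

    *logical = (long long)st.st_size;           /* bytes the program sees */
    *physical = (long long)st.st_blocks * 512;  /* assumption: 512-byte units (Linux) */
    return 0;
}

/* Print both sizes for one file (hypothetical helper). */
int print_file_sizes(const char *path)
{
    long long logical, physical;

    if (file_sizes(path, &logical, &physical) != 0) {
        perror(path);
        return -1;
    }
    printf("%s: logical %lld bytes, physical %lld bytes\n",
           path, logical, physical);
    return 0;
}
```

Calling print_file_sizes() on a one-byte file on ext4 will typically report a physical size of one full block, since space is allocated in whole blocks.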
Kernel programming
 - you may destroy the system you are working on AT ANY TIME
 - HAVE GOOD BACKUPS
 - rsync is your friend
Kernel modules
 - way of splitting up kernel functionality so everything doesn't have to load
   at boot
    - code loaded as part of the initial boot image is hard to get rid of at
      runtime
 - why do we need modules? why not processes?
   - no new mechanisms
   - increased security (restricted access)
   - "microkernel" approach
   - instead of code talking in supervisor mode, processes do IPC
      - filesystems
      - drivers
      - networking
   - minix, QNX, GNU hurd
 - Linux is a "monolithic" kernel.  Why?
   - performance: context switches are expensive
     - techniques to make microkernels fast can be adopted by
       monolithic kernels to make them even faster
   - security benefit is illusory
     - if you control the filesystem process, you already own everything
Additional
--> Core kernel functionality is implemented via modules --> use lsmod to see modules that are loaded
What is a monolithic kernel? --> a type of OS architecture where the entire OS runs in kernel space --> Linux can still dynamically load/unload modules at runtime
make localmodconfig: --> takes the output of lsmod and configures your kernel to build only the modules that are currently loaded
ones.c program:
/dev/ones: --> permissions are read only
struct file_operations ones_fops: --> defines what happens when you open a file and when you read from it; release defines what happens when you're done with it (not the same thing as close)
ones_read(): --> len is the number of bytes to read --> offset tells you where you are in the file --> put_user() takes care of whatever needs to be done to write into that process's memory properly
ones_release:
Why are we using printk instead of printf? --> printf is not defined (i.e. the C library is not available in the kernel) --> the kernel doesn't depend on any libraries; all code belongs to the kernel itself --> printk is the kernel's own implementation of printf (outputs to the kernel log --> /var/log/kern.log)
vfs = virtual filesystem layer
How do we limit access to user space processes? --> Do a permission check
--> kernels need to be updated regularly to correct bugs that leave the kernel vulnerable to programs trying to gain access to important user space processes --> unlikely() = tells the compiler that this branch is not likely to be taken, so it optimizes for the other path
vfs_read:
file->f_op->read: --> this is how our read function will be called
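The dispatch described above (a permission check, then a call through file->f_op->read) can be imitated in user space with a struct of function pointers. This is only a sketch under stated assumptions: every my_ name is hypothetical, the real kernel structures have many more fields, and filling the buffer with '1' characters is a guess at what a ones-style device returns. __builtin_expect is the GCC/Clang builtin that the kernel's unlikely() macro is built on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's unlikely() branch hint (GCC/Clang). */
#define my_unlikely(x) __builtin_expect(!!(x), 0)

#define MY_MODE_READ 0x1  /* hypothetical "opened for reading" flag */

struct my_file;

/* Table of operations, in the spirit of struct file_operations. */
struct my_file_operations {
    long (*read)(struct my_file *f, char *buf, size_t len, long *offset);
};

struct my_file {
    unsigned mode;                         /* access the opener was granted */
    long pos;                              /* current file offset */
    const struct my_file_operations *f_op; /* per-file-type handlers */
};

/* Read handler for a ones-like device: fill the buffer with '1' characters.
 * In the kernel, bytes would be copied to the process with put_user(). */
static long my_ones_read(struct my_file *f, char *buf, size_t len, long *offset)
{
    (void)f;
    memset(buf, '1', len);
    *offset += (long)len;
    return (long)len;
}

const struct my_file_operations my_ones_fops = {
    .read = my_ones_read,
};

/* Generic read path: check permission, then dispatch through f_op. */
long my_vfs_read(struct my_file *f, char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    if (my_unlikely(!(f->mode & MY_MODE_READ)))
        return -1;  /* not opened for reading */
    if (my_unlikely(f->f_op == NULL || f->f_op->read == NULL))
        return -1;  /* this file type has no read handler */
    return f->f_op->read(f, buf, len, &f->pos);
}
```

The generic code never needs to know which kind of file it is reading; a different file type would simply supply a different operations table.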
From the Text Book (Filesystem Implementation)
VSFS - Illustrates a typical Unix file system using vsfs (Very Simple File System). - Introduces some of the basic on-disk structures, access methods, and various policies
	
The mental model of File Systems
- Want to answer these questions:
- What on-disk structures store the file system's data and metadata?
- What happens when a process opens a file?
- Which on-disk structures are accessed during a read or write?
 
- The file system is pure software that implements:
- a data-structure to organize its data and metadata.
- access methods - how it maps the calls made by a process (open(), read(), write()) onto its structures
 
Data structure
- A file system divides the hard disk into blocks; a common block size is 4KB.
- The majority of the blocks hold user data (the data region); a smaller set of blocks holds the inode table
- Two bitmaps, one for the inode table and the other for the user data blocks, track which entries are in use
- Finally, a superblock contains information about the filesystem: how many inodes and data blocks there are, and the type of filesystem
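The layout above can be sketched as a superblock struct plus a function that computes where each region starts. This is an illustrative sketch, not the textbook's actual on-disk format: the struct fields, the function name vsfs_layout, and the choice of one block per bitmap placed right after the superblock are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical superblock: records where each region of the disk begins. */
struct vsfs_superblock {
    uint32_t block_size;          /* bytes per block */
    uint32_t num_inodes;          /* inodes that fit in the inode table */
    uint32_t num_data_blocks;     /* blocks in the data region */
    uint32_t inode_bitmap_block;  /* block holding the inode bitmap */
    uint32_t data_bitmap_block;   /* block holding the data bitmap */
    uint32_t inode_table_start;   /* first block of the inode table */
    uint32_t data_region_start;   /* first block of user data */
};

/* Lay out a disk as: superblock, inode bitmap, data bitmap,
 * inode table, then the data region (assumed ordering). */
struct vsfs_superblock vsfs_layout(uint32_t total_blocks, uint32_t block_size,
                                   uint32_t inode_size,
                                   uint32_t inode_table_blocks)
{
    struct vsfs_superblock sb;

    assert(inode_size > 0 && block_size >= inode_size);
    assert(total_blocks > 3 + inode_table_blocks);

    sb.block_size = block_size;
    sb.inode_bitmap_block = 1;    /* block 0 is the superblock itself */
    sb.data_bitmap_block = 2;
    sb.inode_table_start = 3;
    sb.data_region_start = 3 + inode_table_blocks;
    sb.num_inodes = inode_table_blocks * (block_size / inode_size);
    sb.num_data_blocks = total_blocks - sb.data_region_start;
    return sb;
}
```

For a 64-block disk with 4KB blocks, 256-byte inodes, and 5 inode-table blocks, this gives 80 inodes and 56 data blocks starting at block 8.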
Mounting the filesystem
- The operating system reads the superblock and then attaches the volume to the filesystem tree
File Organization
- The inode (index node) number indexes into the inode table to find the block that holds the desired inode. Once the inode is retrieved, all of the corresponding file information is known. Some of the metadata include: uid (who owns it), size (how many bytes are in the file), gid (which group the file belongs to), blocks (how many blocks have been allocated to this file), etc.
- inodes refer to data blocks using a multi-level index made up of direct pointers and indirect pointers. Direct pointers reference data blocks; indirect pointers reference blocks that contain pointers to data blocks. Adding levels of pointers allows the inode to reference very large files
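Both ideas reduce to short arithmetic, sketched below. The function names and parameters are hypothetical, and the numbers in the usage note (256-byte inodes, 4-byte block pointers, 12 direct pointers) are assumptions for illustration rather than properties of any particular filesystem.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset on disk of inode number `inum`, given the inode size and
 * the byte address where the inode table starts. */
uint64_t inode_offset(uint64_t inum, uint64_t inode_size,
                      uint64_t inode_table_start)
{
    return inode_table_start + inum * inode_size;
}

/* Largest file (in bytes) an inode can address with `direct` direct
 * pointers plus one pointer for each indirection level 1..`levels`
 * (single indirect, double indirect, ...). */
uint64_t max_file_size(uint64_t direct, unsigned levels,
                       uint64_t block_size, uint64_t ptr_size)
{
    assert(ptr_size > 0 && block_size >= ptr_size);

    uint64_t ptrs_per_block = block_size / ptr_size;
    uint64_t blocks = direct;
    uint64_t reach = 1;

    for (unsigned i = 0; i < levels; i++) {
        reach *= ptrs_per_block;  /* data blocks reachable at this level */
        blocks += reach;
    }
    return blocks * block_size;
}
```

With 4KB blocks and 4-byte pointers, each indirect block holds 1024 pointers, so 12 direct pointers plus one single indirect pointer cover (12 + 1024) blocks, or 4144KB. Adding a double indirect pointer raises that to just over 4GB.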
Directory Organization
- A directory is a list of pairs (entry name, inode number). The entry name is the string naming a file or subdirectory inside the directory.
- 'dot' is the current directory and 'dot-dot' is the parent directory
- Directories are stored in a file system the same way regular files are - they have an inode. The only difference is that directories are marked as such in the inode's type field.
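A directory lookup is then a search through those (entry name, inode number) pairs. The sketch below assumes fixed-width entries and treats inode number 0 as "unused slot"; the struct, the 28-byte name field, and the function name dir_lookup are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VSFS_NAME_MAX 28  /* hypothetical fixed-width name field */

struct vsfs_dirent {
    uint32_t inum;             /* inode number; 0 marks an unused slot */
    char name[VSFS_NAME_MAX];  /* NUL-terminated entry name */
};

/* Return the inode number for `name`, or 0 if no entry matches. */
uint32_t dir_lookup(const struct vsfs_dirent *entries, size_t count,
                    const char *name)
{
    assert(entries != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (entries[i].inum != 0 && strcmp(entries[i].name, name) == 0)
            return entries[i].inum;
    }
    return 0;
}
```

Resolving a path such as /foo/bar repeats this lookup once per component, reading each directory's data blocks along the way.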
Free Space Management
- When a file is created the file system searches the inode bitmap and allocates a free inode to the new file.
- The filesystem uses a pre-allocation policy to allocate data blocks for a new file. It tries to allocate contiguous blocks, thus improving performance.
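Searching a bitmap for a free entry can be sketched as follows. This is a plain first-free scan, not the pre-allocation policy mentioned above, and the function names are hypothetical; a set bit means "in use".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Find the first clear bit among the first `nbits` bits, set it, and
 * return its index. Returns -1 if every entry is already in use. */
long bitmap_alloc(uint8_t *bitmap, size_t nbits)
{
    assert(bitmap != NULL);
    for (size_t i = 0; i < nbits; i++) {
        uint8_t mask = (uint8_t)(1u << (i % 8));
        if ((bitmap[i / 8] & mask) == 0) {
            bitmap[i / 8] |= mask;  /* mark the entry as allocated */
            return (long)i;
        }
    }
    return -1;
}

/* Mark entry `index` as free again. */
void bitmap_free(uint8_t *bitmap, size_t index)
{
    assert(bitmap != NULL);
    bitmap[index / 8] &= (uint8_t)~(1u << (index % 8));
}
```

The same two functions work for both the inode bitmap and the data bitmap; only the number of bits differs.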
Reading and Writing a file
- Open: the filesystem traverses the pathname to locate the file's inode, starting at the root directory (whose inode is known to the filesystem once it is mounted). Once the permissions have been verified, a file descriptor for the file is returned to the user. Note that several disk reads happen during open in order to find the correct inode number.
- Read: The first read starts at file offset 0, and the filesystem consults the inode to find where the corresponding block is on disk. It copies the file contents into memory and advances the file offset so that the next read knows where to begin. Disk I/O is performed as needed.
- Close: The file descriptor is deallocated.
- Writing: Each write that allocates a new block logically generates five I/Os: one to read the data bitmap, one to write the bitmap, two more to read and then write the inode, and finally one to write the data block itself.
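From user space, the open/read/close sequence and the advancing file offset look like this. It is a small sketch using the standard POSIX calls; the function name read_twice and its parameters are made up for illustration, and the caller is assumed to supply a buffer of at least 2 * chunk bytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open `path`, read two chunks of up to `chunk` bytes each into `buf`,
 * and report the file offset after each read. Returns the total number
 * of bytes read, or -1 on error. */
long read_twice(const char *path, char *buf, size_t chunk,
                off_t *after_first, off_t *after_second)
{
    assert(path != NULL && buf != NULL);
    assert(after_first != NULL && after_second != NULL);

    int fd = open(path, O_RDONLY);  /* pathname lookup + permission check */
    if (fd < 0)
        return -1;

    ssize_t n1 = read(fd, buf, chunk);  /* starts at offset 0 */
    if (n1 < 0) {
        close(fd);
        return -1;
    }
    *after_first = lseek(fd, 0, SEEK_CUR);  /* query the current offset */

    ssize_t n2 = read(fd, buf + n1, chunk);  /* continues where read 1 ended */
    if (n2 < 0) {
        close(fd);
        return -1;
    }
    *after_second = lseek(fd, 0, SEEK_CUR);

    close(fd);  /* releases the file descriptor */
    return (long)(n1 + n2);
}
```

The program never passes an offset to read(); the kernel keeps it with the open file and updates it after every call.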
Caching and Buffering
- Caching is used to eliminate disk I/O for future accesses of popular files. Blocks of popular files are kept in memory so that future accesses do not have to go through the process of reading from disk.
- Write buffering is used by the filesystem as a way of minimizing the number of disk accesses. The system collects writes in a buffer and schedules them together, so that repeated writes to the same block can be combined into a single I/O.
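The saving from write buffering can be shown with a toy model that only counts I/Os. Everything here is hypothetical (the 64-block disk, the struct, and the wb_ function names), and no data is actually stored: a write just marks a block dirty, and a flush issues one disk write per dirty block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WB_BLOCKS 64  /* hypothetical number of blocks on the toy disk */

struct write_buffer {
    bool dirty[WB_BLOCKS];  /* blocks with buffered, unwritten changes */
    size_t disk_writes;     /* I/Os actually issued so far */
};

/* Record a write in memory only; no disk I/O happens here. */
void wb_write(struct write_buffer *wb, size_t block)
{
    assert(wb != NULL && block < WB_BLOCKS);
    wb->dirty[block] = true;
}

/* Issue one disk write per dirty block, in ascending block order.
 * Returns the number of writes issued by this flush. */
size_t wb_flush(struct write_buffer *wb)
{
    assert(wb != NULL);
    size_t issued = 0;

    for (size_t b = 0; b < WB_BLOCKS; b++) {
        if (wb->dirty[b]) {
            /* a real filesystem would submit the block to the disk here */
            wb->dirty[b] = false;
            issued++;
        }
    }
    wb->disk_writes += issued;
    return issued;
}
```

Three writes to block 3 and one write to block 5, followed by a flush, cost two disk writes instead of four. The trade-off is that buffered writes are lost if the system crashes before the flush.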