Operating Systems 2022F Lecture 15

From Soma-notes

Video

Video from the lecture given on November 8, 2022 is now available:

Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)

Notes

Lecture 15
----------

* Washer delivery happening shortly, so I *may* have to step out for a few minutes, sorry
* A2 is graded, I just need to post, should be later today
* Midterm is being marked but we'll need until next week, sorry
* update on Assignment 3
  - I have it mostly done but trying to figure out one thing, should be out in a day
  - but we're going to talk about it today anyway

Assignment 3 from Fall 2021: 3000makefs.sh

* When you run this script, you get what seems like a new system
 - it has its own filesystem with its own /etc, /usr, /bin, and so on

This is part of the tech used to build small Linux systems

Key technologies to understand:
 - making a filesystem in a file
 - chroot to make this new filesystem /
 - busybox
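
The first item, a filesystem in a file, starts with nothing more than an ordinary file of a fixed size. Below is a minimal Python sketch of that first step only. The function name is hypothetical and is not taken from 3000makefs.sh. Formatting the file with a mkfs tool and mounting it are separate steps that need root and are not shown.

```python
import os

def make_image(path, size_bytes):
    """Create an empty file of size_bytes that could later hold a filesystem.

    Returns (apparent size, bytes actually allocated on disk).
    """
    with open(path, "wb") as f:
        # truncate() sets the length without writing data, so on most
        # filesystems the file is sparse and uses almost no disk space yet.
        f.truncate(size_bytes)
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512
```

For example, make_image("rootfs.img", 64 * 1024 * 1024) would create a 64 MiB image; both the name and the size are made-up values for illustration.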

What is busybox?
 - one program that has the (basic) functionality of an entire UNIX/Linux system (userland)
    - with a Linux kernel + busybox, you have a functional Linux system
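
busybox manages to be "one program" by looking at the name it was invoked under (argv[0]) and running the matching built-in applet, which is why its commands are usually installed as links to the single busybox binary. Below is a toy Python sketch of that dispatch idea. The applets, the "multicall" name, and the functions are made up for illustration and are not busybox's code.

```python
import os

def applet_echo(args):
    # Join the arguments with spaces, like a very basic echo.
    return " ".join(args)

def applet_basename(args):
    # Strip the directory part of a path, like a very basic basename.
    return os.path.basename(args[0].rstrip("/"))

# Table mapping command names to the function that implements each one.
APPLETS = {"echo": applet_echo, "basename": applet_basename}

def run(argv):
    """Pick an applet from the name the program was invoked under."""
    name = os.path.basename(argv[0])
    args = argv[1:]
    # Invoked under its own (hypothetical) name: the applet is the first argument.
    if name == "multicall" and args:
        name, args = args[0], args[1:]
    if name not in APPLETS:
        raise ValueError("unknown applet: " + name)
    return APPLETS[name](args)
```

Here run(["/bin/echo", "hi", "there"]) and run(["multicall", "echo", "hi", "there"]) reach the same function, just as a link named echo and "busybox echo" reach the same code.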

So why don't we use busybox all the time?
 - because its versions of the standard commands are rather basic
   (fewer options and features than the full versions on a regular system)

How does a UNIX-like system start up?
 - three stages: bootloader, kernel startup, userland startup
 - the bootloader is responsible for loading the kernel
    - which may mean doing some hardware init
    - on modern Linux systems, most commonly this is GRUB
    - but can be a series of programs
 - the kernel then starts, this is what initializes the hardware,
   creates the process abstraction, manages all resources
    - we'll be exploring this more soon
 - userland starts with init
    - basically, everything you can see with ps is userland
      (except the entries shown in [], which are kernel threads)
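
The [] entries can be recognized because kernel threads have no command line: on Linux, /proc/<pid>/cmdline is empty for them, and ps falls back to printing the short name in brackets. Below is a rough Python sketch of that display rule, assuming the usual /proc layout. The function names are hypothetical and this is not ps's actual code.

```python
def ps_style_name(comm, cmdline):
    """Format a process name roughly the way ps does.

    comm:    short name, as found in /proc/<pid>/comm
    cmdline: raw bytes from /proc/<pid>/cmdline (NUL-separated arguments)
    """
    if not cmdline:
        # No command line: typical of a kernel thread. (A zombie process
        # also has an empty cmdline, so this is only a heuristic.)
        return "[" + comm + "]"
    args = cmdline.rstrip(b"\0").split(b"\0")
    return " ".join(a.decode(errors="replace") for a in args)

def name_for_pid(pid):
    """Read both files for a live process and format its name."""
    with open(f"/proc/{pid}/comm") as f:
        comm = f.read().strip()
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        cmdline = f.read()
    return ps_style_name(comm, cmdline)
```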

 - the BIOS or EFI loads the bootloader
    - EFI can actually load a kernel directly

"init" is the name given to the first process that runs on a UNIX system
 - always PID 1
 - it has no parent process in userspace: the kernel starts it directly
   (on Linux, the parent PID of 1 is reported as 0)
 - when it terminates, the system shuts down
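
On Linux, the PID and parent PID of any process can be read from /proc/<pid>/stat. Below is a small Python sketch, assuming the documented layout of that file (PID, name in parentheses, state, parent PID, and so on). The function names are hypothetical.

```python
def parse_stat(line):
    """Extract pid, name, state, and parent pid from a /proc/<pid>/stat line."""
    # The name may itself contain spaces or ')', so split on the LAST ')'.
    left, _, right = line.rpartition(")")
    pid_str, _, comm = left.partition(" (")
    fields = right.split()
    return {
        "pid": int(pid_str),
        "comm": comm,
        "state": fields[0],
        "ppid": int(fields[1]),
    }

def parent_of(pid):
    """Return the parent PID of a running process."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat(f.read())["ppid"]
```

Calling parent_of(1) on a running system reports the parent recorded for init.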

Note that MOST processes sleep most of the time
 - background processes are waiting for something to do


3000makefs uses chroot to have a new root filesystem
 - so we can't see other filesystems unless we mount them from their devices
 - but we can still interact with the rest of the system
   (other processes, the network, the same kernel)
   - and in fact only our view of files changes; we aren't really isolated from anything else
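
Below is a minimal Python sketch of what chroot changes, assuming it is run as root (otherwise the call fails with a permission error). A child process changes its root to a directory and reports what it now sees as /, while the parent's view is untouched. The function name is hypothetical.

```python
import json
import os

def list_root_inside(new_root):
    """Return the names a chrooted child process sees in its "/".

    Raises PermissionError if the caller is not allowed to chroot.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: change its own root directory, then look at "/".
        try:
            os.close(read_fd)
            try:
                os.chroot(new_root)
                os.chdir("/")
                reply = {"names": sorted(os.listdir("/"))}
            except OSError as exc:
                reply = {"error": exc.errno}
            os.write(write_fd, json.dumps(reply).encode())
        finally:
            os._exit(0)  # never fall through into the parent's code
    # Parent: its root is unchanged; it only reads the child's report.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        reply = json.loads(pipe.read().decode())
    os.waitpid(pid, 0)
    if "error" in reply:
        raise OSError(reply["error"], os.strerror(reply["error"]))
    return reply["names"]
```

The child only loses its view of files outside new_root. It is still an ordinary process on the same system, which is why chroot by itself is weak isolation.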

Have you heard of containers?
 - basically the same as a chroot environment, except we also split up processes, users, etc - everything in userspace
 - idea is to make systems separate, even though they share a kernel
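
On Linux, this splitting up is done with kernel namespaces, and each process's namespace membership is visible under /proc/<pid>/ns. Below is a short Python sketch, assuming a Linux /proc is mounted. The function name is hypothetical.

```python
import os

def namespace_ids(pid="self"):
    """Map each namespace type (pid, mnt, net, user, ...) to its identifier."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symbolic link whose target looks like "pid:[4026531836]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Some entries may be unreadable without extra privileges.
            continue
    return result
```

Two processes share a namespace of a given type when their identifiers for that type match. A process inside a container normally shows different identifiers from one running directly on the host.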

Whenever you see a loopback device as the device associated with a filesystem,
it means the data for that filesystem is *in a file* on another filesystem

/dev/loopX is just a device that is used to make a file behave like a block device
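
To spot loopback-backed filesystems on a running system, one option is to scan the mount table. Below is a minimal Python sketch, assuming the /proc/mounts line format (device, mount point, filesystem type, options, ...). The function name and the sample mount points in the comments are made up.

```python
def loop_backed_mounts(mounts_text):
    """Return (device, mount point, fs type) for every loop-device mount."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device; loop devices are named /dev/loop0, /dev/loop1, ...
        if len(fields) >= 3 and fields[0].startswith("/dev/loop"):
            found.append((fields[0], fields[1], fields[2]))
    return found

# Typical use on a live system:
#   with open("/proc/mounts") as f:
#       print(loop_backed_mounts(f.read()))
```

On Linux, the file behind a given loop device can usually be found in /sys/block/loopN/loop/backing_file.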

So why are Firefox and other software being distributed in containers?
 - because then Mozilla can package it up in a consistent way
   that will work the same on different systems
 - remember a container contains programs, libraries, and anything else you want
   - it is an "entire system"
 - to a first approximation then, anything else installed on the system can't
   mess with how firefox works, because they are kept separate


Any modern program depends on many files
 - libraries, system data, config data, etc
 - if it acts strange/has a bug, the problem could be with the files
   it depends on, and not its code
 - containers package all this up so you get a consistent execution environment

Why are we using loopback devices?  To provide filesystem isolation
 - otherwise we can't make sure files in one container don't take away
   resources in another (say, a log file gets too big)
 
So where is the code for the kernel?
 - basic kernel image: in /boot, vmlinuz...
 - kernel modules: /lib/modules/<version>
 - modules are bits of kernel code that are loaded at runtime as needed
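
Both locations can be inspected from userspace. Below is a small Python sketch, assuming Linux: the module directory is named after the running kernel's release string, and /proc/modules lists the modules currently loaded (name, size, reference count, ...). The function names are hypothetical.

```python
import os

def modules_dir():
    """Directory holding the modules built for the running kernel."""
    return os.path.join("/lib/modules", os.uname().release)

def parse_proc_modules(text):
    """Parse the contents of /proc/modules into a list of dicts."""
    mods = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mods.append({
                "name": fields[0],
                "size": int(fields[1]),      # memory used by the module, in bytes
                "refcount": int(fields[2]),  # how many users the module has
            })
    return mods

# Typical use on a live system:
#   with open("/proc/modules") as f:
#       for m in parse_proc_modules(f.read()):
#           print(m["name"], m["size"])
```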

When we talk about userspace, we are talking about regular programs and their files
When we talk about kernelspace, we are talking about the kernel & modules
 - note the kernel doesn't load libraries or any other files generally
   - it facilitates programs accessing files

We enter the kernel via two routes (which are two versions of the same interface):
  - system calls
  - interrupts (from timers, keyboard, hard drive, network, etc)
     - also from dividing by zero or accessing invalid memory (segfault)
			       
We exit the kernel via the scheduler
  - determines what process to run next

Basic flow of control for a system call

  Userspace                     Kernel space
  -------------------          ----------------
  make system call
                               start system call dispatcher
                               handle system call
                               call scheduler
                                 - find program that is in a runnable state
                                 - start it
  continue chosen program
    execution
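
From a program's point of view, all of this is hidden behind an ordinary function call. Below is a small Python sketch in which each os call is a thin wrapper around a system call, so each one goes through the flow above. The function name is hypothetical.

```python
import os

def pipe_roundtrip(message):
    """Send bytes through a kernel pipe and read them back.

    The message should be smaller than the pipe buffer, or write() would block.
    """
    read_fd, write_fd = os.pipe()                  # kernel creates the pipe
    try:
        os.write(write_fd, message)                # write system call
        return os.read(read_fd, len(message))      # read system call
    finally:
        os.close(read_fd)                          # close system calls
        os.close(write_fd)
```

The data never leaves the machine, but it does pass through kernel memory: userspace hands it to the kernel in write and gets it back in read.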


Implicit above is lots of saving and restoring of CPU state


Note that when we return to userspace after a system call, we aren't guaranteed to resume the process that made the system call
  - if the call blocks (say, waiting on I/O) or the scheduler prefers a different runnable process, another process that was waiting runs first
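
The kernel keeps per-process counts of how often a process was switched out: voluntarily (it blocked inside the kernel, for example in a system call that had to wait) or involuntarily (the scheduler preempted it). Below is a minimal Python sketch that reads those counters, assuming a UNIX-like system where the resource module is available. The function names are hypothetical.

```python
import resource

def context_switches():
    """Return (voluntary, involuntary) context switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw

def switches_during(func):
    """Run func() and report how many context switches happened meanwhile."""
    v0, i0 = context_switches()
    func()
    v1, i1 = context_switches()
    return v1 - v0, i1 - i0
```

Running switches_during on something that sleeps or waits for I/O would typically show the voluntary count going up, since each blocking call lets the scheduler hand the CPU to another process.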


When you look in /proc and /sys, you are interacting directly with the kernel
 - interface to kernel data structures
 - but you access them using a filesystem interface
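
Because the interface is a filesystem, ordinary file reads are enough to query the kernel. Below is a minimal Python sketch, assuming a Linux system with /proc mounted. The function name is hypothetical.

```python
def read_kernel_value(path):
    """Read one value exposed by the kernel as a file under /proc or /sys."""
    with open(path) as f:
        return f.read().strip()

# Typical use on a live system:
#   read_kernel_value("/proc/sys/kernel/ostype")     # kernel name
#   read_kernel_value("/proc/sys/kernel/osrelease")  # kernel version string
#   read_kernel_value("/proc/uptime")                # seconds since boot, idle time
```

None of these files exist on disk. The kernel generates their contents at the moment they are read.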