Operating Systems 2021F Lecture 19

Video

Video from the lecture given on November 23, 2021 is now available:

Video is also available through Brightspace (Resources->Class zoom meetings->Cloud Recordings tab)

Notes

Lecture 19
----------
 - plan for rest of term
 - kernel modules, T7
 - concurrency, T8

Tutorial 8 is about the producer consumer problem
 - classic problem in concurrency (one of the simpler ones)

When you think producer/consumer, think pipes (FIFOs)

If we run something like

  cat /dev/urandom | less

Note that the cat command doesn't keep running; it is paused when less
stops asking for input
  - standard out from cat is going to standard in for less,
    via an anonymous pipe
  - the left side of the pipe pauses when the right side isn't actively
    reading (producer pauses when consumer doesn't need more input)
  - similarly, the consumer will pause when there is no output from
    the producer
  - note there is a buffer between the producer and consumer
    (as part of the pipe)
      - so whether it is full or empty governs whether the producer
        or consumer is going to be paused


Basic structure of a producer consumer problem

                  ----------
  producer  <->   | buffer |  <->  consumer
                  ----------

Buffer has many slots, think of it as a circular array
  - producer writes entries to buffer
  - consumer reads and removes entries from buffer

Producer will sleep when buffer is full
  - and will wait for consumer to wake it up
Consumer will sleep when buffer is empty
  - and will wait for producer to wake it up
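
The circular-array view of the buffer can be sketched in C as below. This is
only an illustration: the names (struct ring, ring_put, ring_get, RING_SLOTS)
are hypothetical and not taken from the tutorial code, and there is no
sleeping or waking here. The functions simply return false at the points
where a real producer or consumer would go to sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4   /* hypothetical size, chosen only for illustration */

/* Circular buffer: head is the next slot to read, count is slots in use. */
struct ring {
    int    slot[RING_SLOTS];
    size_t head;
    size_t count;
};

static bool ring_full(const struct ring *r)  { return r->count == RING_SLOTS; }
static bool ring_empty(const struct ring *r) { return r->count == 0; }

/* Producer side. Returns false when the buffer is full, which is the
 * point where a real producer would sleep until the consumer wakes it. */
static bool ring_put(struct ring *r, int value)
{
    if (ring_full(r))
        return false;
    /* The write position wraps around the end of the array. */
    r->slot[(r->head + r->count) % RING_SLOTS] = value;
    r->count++;
    return true;
}

/* Consumer side. Returns false when the buffer is empty, which is the
 * point where a real consumer would sleep until the producer wakes it. */
static bool ring_get(struct ring *r, int *value)
{
    assert(value != NULL);
    if (ring_empty(r))
        return false;
    *value = r->slot[r->head];
    r->head = (r->head + 1) % RING_SLOTS;   /* read position wraps too */
    r->count--;
    return true;
}
```

Starting from a zero-initialized struct ring, ring_put succeeds RING_SLOTS
times and then fails until ring_get frees a slot.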

Note that if producer and consumer sleep at the same time,
the programs deadlock and no progress will ever be made again
  - unless there are timeouts

This can happen because of a failure in communication, generally due to timing
 - producer and consumer sleep "at the same time"

In above example, cat is the producer (it writes to standard out)
 and less is the consumer (it reads from standard in)

But if the buffer isn't full and isn't empty
 - both producer and consumer run at the same time

To maximize throughput, you ideally never have the producer or consumer
go to sleep
  - but not possible if they don't work at the same rate

So the buffer and the sleep/wake mechanisms ensure that progress continues to be made at the maximum rate possible

Recall: what is | doing?
  - the shell makes a pipe system call
  - you get two file descriptors connected together, so writes to one become reads from the other
     - buffer is implemented in the kernel to coordinate reads and writes
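
Roughly what a shell sets up for | can be sketched with pipe() and fork().
This is a minimal sketch, not the shell's actual code: the function name
send_through_pipe is made up for illustration, and a real shell would also
dup2() the pipe ends onto standard out and standard in and then exec the
two programs.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg from a child process to the parent through an anonymous pipe.
 * The parent stores what it read in out (NUL-terminated).
 * Returns the number of bytes read, or -1 on error. */
static ssize_t send_through_pipe(const char *msg, char *out, size_t outlen)
{
    assert(outlen > 0);

    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: the producer */
        close(fd[0]);               /* child does not read */
        size_t len = strlen(msg);
        ssize_t w = write(fd[1], msg, len);
        close(fd[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* parent: the consumer */
    close(fd[1]);                   /* so read() sees EOF once the child is done */
    ssize_t total = 0, n;
    while ((size_t)total < outlen - 1 &&
           (n = read(fd[0], out + total, outlen - 1 - (size_t)total)) > 0)
        total += n;
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

The read() in the parent blocks until the child has written something, which
is the "consumer sleeps when the buffer is empty" case handled by the kernel.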


Note in 3000pc-fifo, the buffer is in the kernel
 - we don't control it directly
 - so we don't see its size
 - The only shared resource between the processes is the pipe
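
Although the kernel's buffer can't be seen directly, its size can be probed
from userspace. The sketch below (pipe_capacity is a hypothetical helper, not
part of the tutorial code) makes the write end non-blocking and counts how
many bytes are accepted before the kernel reports that the pipe is full.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Count how many bytes a fresh pipe accepts before a write would block.
 * Returns the byte count, or -1 on error. */
static long pipe_capacity(void)
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    /* Non-blocking write end: a full buffer gives EAGAIN instead of
     * putting the writer to sleep the way a normal producer would be. */
    int flags = fcntl(fd[1], F_GETFL);
    if (flags == -1 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    char chunk[512] = {0};
    long total = 0;
    int  err = 0;
    for (;;) {
        ssize_t n = write(fd[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        err = errno;
        if (n == -1 && err == EINTR)
            continue;               /* interrupted by a signal, retry */
        break;                      /* expected: EAGAIN, buffer is full */
    }

    close(fd[0]);
    close(fd[1]);
    assert(total >= 0);
    return (err == EAGAIN || err == EWOULDBLOCK) ? total : -1;
}
```

Nothing ever reads from the pipe here, so the count is the buffer's capacity.
The Linux pipe(7) man page documents a default of 65,536 bytes, but the value
is configurable and should not be assumed.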

But, 3000pc-rendezvous* uses a shared buffer
 - note that this is two processes sharing a portion of memory
   using mmap
 - could get the same thing using multiple threads in one process
   but this is safer and should be as performant
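
Two processes sharing memory via mmap can be sketched as follows. This is the
textbook two-counting-semaphore scheme with one producer and one consumer,
and it is an assumption for illustration, not the tutorial's code: struct
shared, SLOTS, and produce_and_consume are all hypothetical names. Error
handling for sem_wait is omitted to keep the sketch short.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SLOTS 8             /* hypothetical buffer size */

struct shared {
    sem_t empty_slots;      /* producer sleeps on this when the buffer is full  */
    sem_t full_slots;       /* consumer sleeps on this when the buffer is empty */
    int   slot[SLOTS];
};

/* Fork a producer that sends 1..n through a shared circular buffer.
 * The parent consumes the values and returns their sum, or -1 on error. */
static long produce_and_consume(int n)
{
    assert(n >= 0);

    /* MAP_SHARED | MAP_ANONYMOUS: the mapping is shared with the child
     * created by fork() instead of being copied. */
    struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;

    /* Second argument 1 = semaphore is shared between processes. */
    if (sem_init(&s->empty_slots, 1, SLOTS) == -1 ||
        sem_init(&s->full_slots, 1, 0) == -1) {
        munmap(s, sizeof *s);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        munmap(s, sizeof *s);
        return -1;
    }

    if (pid == 0) {                         /* child: producer */
        for (int i = 0; i < n; i++) {
            sem_wait(&s->empty_slots);      /* sleep while buffer is full */
            s->slot[i % SLOTS] = i + 1;
            sem_post(&s->full_slots);       /* wake the consumer */
        }
        _exit(0);
    }

    long sum = 0;                           /* parent: consumer */
    for (int i = 0; i < n; i++) {
        sem_wait(&s->full_slots);           /* sleep while buffer is empty */
        sum += s->slot[i % SLOTS];
        sem_post(&s->empty_slots);          /* wake the producer */
    }

    waitpid(pid, NULL, 0);
    sem_destroy(&s->empty_slots);
    sem_destroy(&s->full_slots);
    munmap(s, sizeof *s);
    return sum;
}
```

On older glibc versions this needs to be linked with -pthread.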

T8 is really an example of why multithreaded programming is no fun
 - and generally not worth the effort

Note that 3000pc-rendezvous is broken
 - 3000pc-rendezvous-timeout is the fixed version

Specifically, 3000pc-rendezvous can deadlock
 - even though it seems to use semaphores correctly
 - but on a multicore system, it isn't enough


What is a kernel module?
 - inserting code into the Linux kernel
 - lsmod - see the modules currently loaded
    (where do you think it gets its info?)

In general, monolithic kernels allow for code to be added to them
 - most commonly, for device drivers

Linux kernel modules are used for device drivers, but are also
used for many other things (e.g., anything that might not
always be needed)

Note that kernel modules run in supervisor mode on the CPU
 - same as all other kernel code
 - so, can do anything that the kernel can do
 - strictly more powerful than code running as root in a process

A process with root privileges is still executing in user mode on the CPU
 - just when it makes system calls, the kernel will likely always
   say "yes"
 - still has to follow the system call interface, can't
   just mess with kernel data structures
     - unless it decides to load a kernel module!

Currently the Linux kernel is written in C
 - but some people are trying to get parts written in Rust
 - note that C was designed to be a portable assembler


userspace - code running in user mode on the CPU, processes
kernelspace - code running in supervisor mode on the CPU, kernel code
              (including modules)


When you make a kernel module, note that you can't include standard C library headers, only kernel headers.  Why?
 - kernel code can't make system calls directly
   - it implements system calls, so can't depend on them
   - a system call is a userspace -> kernel space switch,
     but in kernel code we're already in kernel space running
     in supervisor mode on the CPU
   - kernel code *can* call the functions that implement system
     calls, but it gets weird because that code assumes it is
     working on behalf of a specific process
 - virtually all regular libraries depend on system calls
 - but for anything you need the Linux kernel has something
   equivalent (but it may have a very different interface)

When you "print" in the kernel, there is no standard out or standard error.  So where does it go?
 - kernel log, can be seen in /var/log/kern.log or dmesg
 - messages are written using printk() or a macro that
   turns into printk()
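
For reference, a minimal module that does nothing but log looks roughly like
the sketch below. It is a generic hello-world, not one of the class modules,
and the names hello_init and hello_exit are hypothetical. It only includes
kernel headers, so it can't be compiled as a normal program; it has to be
built against the kernel headers with the kernel's build system and loaded
with insmod. pr_info() is one of the macros that turns into printk().

```c
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Hypothetical hello-world module for illustration");

/* Runs in supervisor mode when the module is loaded. */
static int __init hello_init(void)
{
    pr_info("hello: module loaded\n");    /* goes to the kernel log */
    return 0;                             /* nonzero would abort loading */
}

/* Runs when the module is removed with rmmod. */
static void __exit hello_exit(void)
{
    pr_info("hello: module unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);
```

After loading, the messages show up in dmesg rather than on any terminal.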

If /var/log/kern.log gets too big, you can just delete it
 - but you'll have to reboot (or at least restart the logging daemon)
   to get the space back
 - because the logger still has it open and keeps writing to it, and
   as long as a file is open its inode refcount won't be zero
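
The same behaviour can be seen from userspace with any file. In the sketch
below (read_after_unlink is a hypothetical helper and the temp file name is
made up), the file's name is deleted while it is still open, and it can still
be written and read through the open file descriptor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temp file, delete its name right away, then write to it and
 * read it back through the still-open descriptor.
 * Returns the link count reported by fstat, or -1 on error.
 * On success, buf holds what was read back (NUL-terminated). */
static long read_after_unlink(char *buf, size_t buflen)
{
    assert(buflen > 0);

    char path[] = "/tmp/unlink-demo-XXXXXX";    /* hypothetical name */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    /* The name is gone after this, but the inode stays alive because
     * this process still has the file open. */
    if (unlink(path) == -1) {
        close(fd);
        return -1;
    }

    const char *msg = "still here";
    size_t len = strlen(msg);
    struct stat st;
    if (write(fd, msg, len) != (ssize_t)len ||
        fstat(fd, &st) == -1 ||
        lseek(fd, 0, SEEK_SET) == -1) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, buf, buflen - 1);
    close(fd);      /* last reference dropped: only now is the space freed */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return (long)st.st_nlink;
}
```

The link count is expected to be 0 while the data is still readable, which is
why deleting a log file that a daemon holds open doesn't free any space.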

Note you'll get "tainting" messages when you load the class kernel modules
 - this is because they aren't digitally signed with an authorized key
 - many desktop Linux systems run in "lockdown" mode and won't
   allow any unauthorized modules to be loaded

So what about eBPF-based tools?
 - eBPF loads code into the kernel, but much more safely
    - restrictions on execution
    - verifier checks the bytecode before it is JIT compiled
      and inserted into the kernel

There are no checks on regular kernel modules beyond code signing
and basic sanity checking

That's how eBPF can see so much
 - it is running in the kernel!

ptrace-based programs can't see so much
 - they are just using the ptrace system call to observe &
   manipulate one other process
 - this system call can mess with the observed process,
   hence it isn't suitable for production generally


Kernel modules can easily corrupt the kernel.  With eBPF, it is almost impossible to mess things up
 - eBPF isn't even Turing complete (no unbounded loops)

.bt code is for bpftrace, which generates eBPF bytecode that is inserted into the kernel
 - eBPF is a machine-code-like bytecode
    - so portable across CPUs
 - highly restricted, but the restrictions are enforced by formal verification, not by a virtual machine