Operating Systems 2022F Lecture 13
Video
Video from the lecture given on November 1, 2022 is now available.
Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)
Notes
Lecture 13
----------

Bit behind on things
 - grading in progress
 - should have A2 posted shortly
 - midterm will take another week
 - A3 will be out this week, hopefully by Thursday
   - it is just based on T5 and T6, so make sure you understand those well
   - currently it is due on Nov 9th but you'll get more time, at least
     until the 11th, still figuring that out

When doing the tutorials
 - make sure you understand all the code!
 - if there are things you don't understand, investigate/ask questions

Topic of today: Concurrency

This is a HARD topic; it causes even the most advanced developers
headaches
 - so bad, most avoid it as much as possible

Why is it so hard? Two or more processes/threads running "at the same
time" (generally in parallel, but not always), coordinating their
activities through shared resources and/or messages
 - resources are mostly files on UNIX-like systems, but could be a
   device or anything at all
 - particularly tricky when memory is shared

What's hard is race conditions
 - the relative order of execution affects the outcome

Threads vs processes

A process is
 - an address space + one or more execution contexts

A thread is
 - an execution context in an address space

So a process contains one or more threads.

What is an execution context?
 - registers
 - stack

So a multithreaded process has multiple stacks, one for each thread
 - also, each thread logically has its own set of registers

Note that a multithreaded process can be executing 2 or more
instructions at the same time (whenever its threads are running on
different cores).

Why don't I like processes with multiple threads?
 - because they can step on each other *so* easily
 - all memory & many other resources are shared
 - and so there are lots of opportunities for interaction

Threads are one way to get concurrent computation. But there are other
ways
 - in particular, we will focus on multiple processes interacting as a
   safer way to do concurrency
 - but if you look at the programming literature, there are other ways

The trick to sane concurrency is limiting the interactions
 - uncontrolled interactions mean chaos

Just think about two threads modifying a linked list at the same time
 - how can you insert a node without corrupting the data structure?

On modern CPUs, most instructions are NOT atomic
 - it is possible for them to be divided, because they don't all
   execute in one go
 - modern CPUs divide instructions into micro-ops that are executed in
   parallel as much as possible
 - and, of course, there is the disconnect between memory and registers

Where is a variable located?
 - if we say X is on the stack, when we change its value it *isn't* on
   the stack, it is in a register
 - if Thread 1 wants to add 1 to X, it will load X, modify it, and
   then write X
 - if Thread 2 accesses X while Thread 1 is modifying it, what value
   will it see?
 - Thread 2 can't see Thread 1's registers

Because different cores are working with registers, and registers are
private to the core, the states of variables can get skewed between
threads.
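To make the load/modify/store problem concrete, here is a minimal
sketch (not from the lecture; the names and loop count are made up for
illustration) of the race described above. Two threads each add 1 to a
shared variable a million times; compile with "gcc -pthread" and the
final count usually comes out below 2000000, because updates get lost
when the threads interleave:

    #include <pthread.h>
    #include <stdio.h>

    long counter = 0;  /* shared by both threads */

    void *add_many(void *arg)
    {
        for (int i = 0; i < 1000000; i++) {
            /* One line of C, but really load, add, store: the other
               thread can run between those steps, and then one of
               the two updates is overwritten. */
            counter = counter + 1;
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        pthread_create(&t1, NULL, add_many, NULL);
        pthread_create(&t2, NULL, add_many, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        /* We "should" get 2000000; lost updates make it less. */
        printf("counter = %ld\n", counter);
        return 0;
    }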
If we want consistent views of data between threads, we have to
coordinate, or we have to do things in a way that doesn't require
coordination (i.e., separate data access).

We want to enforce mutual exclusion
 - make sure that if one bit of code is accessing data, nobody else
   can do it
 - so that code can assume that its data won't change from under it

This can manifest as a security problem: TOCTTOU,
"time of check to time of use"

Consider this primitive attempt to lock something:

    if (lock == 0) {
        lock = 1;
        /* do something, access shared resource */
        lock = 0;
    }

So when lock is 1, the lock is "held". So whoever set it to 1 has the
lock until it is released (set to 0).

THE ABOVE CODE IS WRONG

We need the check (the comparison) and the change (the assignment) to
lock to happen atomically, i.e., we can't have any other thread mess
with the value of lock between checking its value and changing it.

Note we should wait while the lock is held, so something like this:

    while (lock == 1) {}  /* spin while lock is held */
    /* now lock is 0 */
    lock = 1;  /* grab the lock */
    /* do something */
    lock = 0;  /* release the lock */

This is also wrong, again because we aren't using atomic operations.
Again we would need to grab the lock right when we check it.

Big thing to know: YOU CAN'T WRITE THIS CODE CORRECTLY IN C
 - you have to use special assembly language instructions to do a
   test & set operation atomically
 - we don't want to use these instructions by default because THEY
   ARE SLOW (they force the CPU cores to coordinate at a very low
   level, rather than doing their own work)

So whenever you use any sort of binary semaphore or similar mechanism
for locking, there is some deep assembly language level magic
happening under the hood
 - don't write your own!
 - even experts will try to create semaphores and will mess them up!
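For illustration only (the advice above stands: don't write your own
locks in real code), here is a sketch of what the broken code becomes
once the check and the change are a single atomic operation. C11's
<stdatomic.h> is one portable way to reach the special test & set
instructions mentioned above without writing assembly by hand; the
acquire/release function names are made up for this example:

    #include <stdatomic.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;  /* starts cleared ("0") */

    void acquire(void)
    {
        /* atomic_flag_test_and_set sets the flag and returns its OLD
           value as one indivisible operation, so no other thread can
           sneak in between checking the lock and grabbing it. */
        while (atomic_flag_test_and_set(&lock))
            ;  /* spin while someone else holds the lock */
    }

    void release(void)
    {
        atomic_flag_clear(&lock);  /* set back to "0" */
    }

Note this is still a spin lock: a waiting thread burns CPU instead of
going to sleep, which is part of why these operations are used
sparingly and hidden inside library primitives.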
A semaphore is a variable that allows for signalling & messaging
between concurrent computations (threads) using shared memory
 - the metaphor is the signal lights that keep two trains from
   running into each other

Two kinds of semaphores: counting and binary
 - counting can hold an arbitrary integer
 - binary is 1 or 0

We normally use binary semaphores for locks
 - but they can be used for other things
 - called here a "mutex" (for mutual exclusion)

So there are many standard problems in concurrency; we're going to
focus on one: the producer-consumer problem

(If you want to learn about a complex concurrency problem, look up
Paxos)

There are also file locks, but we aren't going to discuss them
 - they aren't normally used on UNIX systems
 - files are locked all the time on Windows, however
 - used to ensure exclusive access

Remember that semaphores require shared memory to work.
 - if there isn't shared memory, you're doing something else, probably
   sending messages back and forth

Producer-consumer is all about coordinating relative rates of execution
 - producer "makes" data
 - consumer "uses" data

The producer and consumer don't run at the same rates
 - so sometimes the producer is faster, sometimes the consumer is
   faster

How do you make sure that they coordinate?
 - you want the one that needs to wait to go to sleep until it is
   needed (so you save CPU resources, no need to run if there is
   nothing to do)

Key idea: buffer between the producer and consumer
 - shared data structure for storing the work of the producer
 - producer puts things into the buffer, consumer takes things out

When the buffer is full, the producer should wait until it isn't full.
When the buffer is empty, the consumer should wait until it isn't
empty.

UNIX has very good support for producer/consumer between two processes
 - one process can write, and the consumer can read
 - don't do it to a file, but to a pipe, using the same read/write
   system calls you'd use on a file

If we type something like "ls /usr/bin | less" we are doing
producer/consumer
 - ls is the producer, less is the consumer
 - when less has filled the screen, ls will stop writing

The pipe operator is actually creating two file descriptors
 - one assigned to standard out for the producer
 - one assigned to standard in for the consumer

What is between these two file descriptors? The kernel, implementing a
shared data structure to coordinate the reads and writes.

The | operator is implemented using the pipe(2) library
function/system call (see the sketch at the end of these notes)
 - if it is in <unistd.h>, it is probably a system call
 - use strace to verify

If you want to have two processes talk to each other, one solution is
to have two pipes, one for each direction.
 - but also, you could use a socket

So how does the signalling/start/stop happen with pipes?
 - the kernel pauses the producer when the pipe's buffer is filled:
   the write system call that the producer makes doesn't finish until
   there is room to save the output
 - the kernel pauses the consumer when the pipe's buffer is empty:
   the read system call that the consumer makes doesn't finish until
   there is something to read

Remember that read and write system calls can pause (block) for
arbitrary periods of time by default. You have to pass an option (to
the open) to make them return immediately if the operation would
otherwise have to wait.
 - but pipes always block

As long as both the producer and consumer are running, the kernel will
manage access to the pipe by causing read/write system calls on the
pipe to block
 - but if the consumer has terminated, the producer will get a SIGPIPE
   signal when it writes to the pipe; if the producer has terminated,
   the consumer's read will just return end of file

Note that pipes work on the principle of first in/first out, so they
are also called FIFOs.
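Here is a rough sketch (an illustration, not the actual shell source,
with most error checking omitted) of what the shell does for
"ls /usr/bin | less": create the pipe, fork the producer and the
consumer, and use dup2 to wire standard out and standard in to the
pipe's two file descriptors:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd[2];  /* fd[0]: read end, fd[1]: write end */

        if (pipe(fd) < 0) {
            perror("pipe");
            exit(1);
        }

        if (fork() == 0) {              /* producer: ls /usr/bin */
            dup2(fd[1], STDOUT_FILENO); /* stdout now feeds the pipe */
            close(fd[0]);
            close(fd[1]);
            execlp("ls", "ls", "/usr/bin", (char *)NULL);
            perror("execlp ls");
            exit(1);
        }

        if (fork() == 0) {              /* consumer: less */
            dup2(fd[0], STDIN_FILENO);  /* stdin now drains the pipe */
            close(fd[0]);
            close(fd[1]);
            execlp("less", "less", (char *)NULL);
            perror("execlp less");
            exit(1);
        }

        /* The parent must close both ends, or the consumer will
           never see end of file. */
        close(fd[0]);
        close(fd[1]);
        wait(NULL);
        wait(NULL);
        return 0;
    }

The kernel buffer sitting between fd[1] and fd[0] is what makes ls
block when less falls behind: exactly the producer/consumer
coordination described above.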