Operating Systems 2022F Lecture 6

Video

Video from the lecture given on September 27, 2022 is now available:

Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)

Notes

Lecture 6
---------

 - validator not working? Read the error messages; if they don't make
   sense, please PM me

signals
-------

Mental model first

What is a signal?
 - a short message sent to a process by the kernel
 - only a set number of them, and the signal number is the only
   data sent
 - but really, the data isn't exactly sent
 - instead, the kernel causes the designated "signal handler" to be run

signals cause an immediate change in process flow of control
  - it is like jumping to a function, but without any explicit call
  - more like an exception in a higher level language

So your process is executing its code, one instruction at a time
 - then a signal is received
 - the kernel causes the process to stop running the code it was running,
   instead making it jump to a signal handler
 - when the signal handler returns (it's a function), execution continues
   from where the process was before the interruption

ALMOST all signals have signal handlers
 - two don't: SIGKILL and SIGSTOP
 - the process can't do anything about these, the kernel does it all
   (either terminate the process or pause its execution)

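Every other signal can have a handler that is called

Every signal also has a default action
   (what happens when a process receives a given signal without having
    installed its own handler)
 - the kernel carries out default actions itself, no userspace function runs
 - these mostly just terminate the process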


Signals are an interrupt-like mechanism
 - but interrupts are normally generated by hardware and are handled by the kernel

So how does a signal handler return to previously-executing code?
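 - the kernel saves the interrupted state (current instruction address and registers) on the stack before the signal handler is invoked
 - when the handler returns, that state is restored and execution picks up where it left off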


Signals are used to control the execution of a process and to handle certain kinds of errors.

Note that a signal is kind of the opposite of a system call
  - kernel calling userspace (process) code,
    rather than a process calling kernel code
  - note that a signal handler can make system calls...
    but it has to be VERY CAREFUL about what it does
      - only certain ones are safe

In "man 7 signal" man page there is a table listing all the signals
 - the Action column means the following
    - Term: terminate the process
    - Core: "dump core" and terminate the process
            (core was an old kind of RAM, so dump the contents of RAM to disk)
    - Ign: ignore
    - Stop: stop (pause) the process
    - Cont: continue the process if it is currently stopped

These are default actions
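
A small C sketch for watching default actions from the outside: a child resets a signal to its default (SIG_DFL), sends it to itself, and the parent checks how the child ended. The function name default_action_result is hypothetical. This is only meant for Term and Ign signals; a Core signal may leave a core file behind, and a Stop signal would leave the child paused so the waitpid() here would never return.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that sends itself signum with the default action in
   place.  Returns the number of the signal that terminated the child,
   0 if the child survived the signal and exited normally, or -1 on
   error. */
int default_action_result(int signum)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        signal(signum, SIG_DFL);   /* make sure the default is in effect */
        raise(signum);
        _exit(0);                  /* reached only if we weren't terminated */
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

With SIGTERM (Action: Term) the result should be SIGTERM itself; with SIGCHLD (Action: Ign) it should be 0, because the child ignores the signal and exits normally.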

Why dump core?
 - so you can find out what happened!  (for debugging)
 
So note that by default, processes are not multithreaded

A process is:
 - address space plus one or more execution contexts

A thread is:
 - an execution context inside of an address space

"execution context"
 - the state of the CPU (its registers)

When you switch to the kernel from a process or switch between processes
 - the CPU state has to be saved and later restored
 - processes don't get their own registers, they have to take turns

Traditional UNIX is single threaded
 - and signals were designed for single-threaded processes

Literally, in a multithreaded process you have multiple "currently executing" instructions and multiple stacks
  - one for each running thread
  - and they have to be very careful not to interfere with each other

Multithreaded programming is how one process can make use of multiple cores
  - otherwise you need separate processes to run on each core,
    each with their own memory

In a traditional (single threaded) UNIX process, it is either running regular application code OR running signal handler code, never both at the same time.

Signals are a mess on multithreaded processes on UNIX
 - really, best to avoid as much as possible
 - we really need a better mechanism

We've developed multithreaded processes to make efficient use of multiple cores
 - but I think it was a VERY BAD IDEA
 - if you need to share memory, just share the memory that you need
   (we can do that pretty easily)
 - with threads, you SHARE EVERYTHING, which is a recipe for chaos
   (you never need to share everything)
   - we will demonstrate these issues later in the term
   - I'm just mentioning it now to distinguish it from signal processing


When we create a child process with fork, the parent process gets SIGCHLD when the child process terminates
 - it is important to handle the death of your children, otherwise you get zombies

A zombie process is a process that has terminated (and so is dead) but is still listed as a process
 - zombies are unkillable because they are already dead
 - but if you kill their parent process they go away
    - because the system becomes their parent and collects them

Zombies exist because all processes have a return value
 - somebody must get this return value
 - if nobody gets it, the system has to hold onto the value until someone does
   (that's all a zombie is, it is a placeholder for the return value
    of a process)

The return value of a process is the return value of main() or the argument passed to exit().

The parent gets a child's return value by calling wait()
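
A minimal C sketch of a parent collecting a child's return value. It uses waitpid() rather than plain wait() so that it picks up this specific child; the name child_exit_status is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with the given code and return the value the
   parent receives.  Returns -1 on error or if the child did not exit
   normally. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);           /* child: this is its "return value" */

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

child_exit_status(42) should return 42. Only the low 8 bits of the value reach the parent, so child_exit_status(256) comes back as 0.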

So how are dead child processes handled if the parent won't call wait/waitpid?
 - they will stick around for the lifetime of the parent process
 - when the parent dies, we have (terminated) child processes with no parent
 - this is a problem, because every process has to fit into the tree of processes somehow
    - so we "re-parent" the child processes (a new parent adopts them); there is
      always a designated process to handle such orphans
    - (by default it is process #1, but it varies on modern systems)
    - the new parent will normally call wait and so the dead children's
      return values will be received and they can be put to rest

Each wait only works for one child process, so if you have multiple children you'll have to call wait multiple times
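 - the process should get a SIGCHLD for each, but SIGCHLDs that arrive while one is already pending are merged, so don't count on exactly one signal per child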

Note that 3000shell does one thing that is kind of bad
 - you shouldn't be doing printf's in a signal handler, not so safe
   - you want to mess with process state as little as possible while
     in a handler because the process isn't necessarily prepared for what
     the handler will do, as it can happen at any point

Make sure your tutorials 1 & 2 have been marked by the end of tomorrow!
Assignment 1 is due tomorrow night!