Operating Systems 2020W Lecture 20

Video

Video from the lecture given on March 25, 2020 is now available.
Notes

Lecture 20
----------
 - Assignment 3 solutions
 - scheduling

It is okay if you are a few minutes late submitting your assignment, but please upload your solutions now.

Assignment 4 is due April 3rd
 - but it will consist of multiple-choice questions, which will be posted by Monday
 - General questions are up now; you should try answering those
   (solutions for them will be posted, but we won't grade your answers;
    we will only auto-grade the multiple-choice questions)

Final exam will be similar to the midterm, but covering the whole class
 - difficulty will be similar: if you look back at the midterm,
   open book would only have helped for a few questions

You'll submit a text file via cuLearn
 - I'll give you all a template for the text file along with the questions
 - Please fill out the template so we can use scripts to assist with grading

Final exam grade replaces midterm if you do better on the final

(And I will make adjustments to compensate for A3 difficulty, so
 focus on learning the material!)

Review session for final
 - last class will be solutions for A4 plus exam review

Participation marks cannot lower your grade
 - will also calculate grades with 20% assignments, 20% tutorials, 0%
   participation

So hopefully most of you will get a 4% boost from participation

Interviews will start by April 20th.


Assignment 3 solutions
 - Q1: pipes (both named pipes & those created by pipe())
 - Q2: understanding the logic behind semaphores and condition variables
       (separate from their implementation)
       Why two cores?
         - without 2 cores, only one process/thread can execute at a time
         - that means the only concurrency you get is one process/thread
           being interrupted at arbitrary times so another can run
           (in practice the timing tends not to be so arbitrary)
         - e.g., you'll see deadlock much less often with only one core
 - Q3: see how to make threads, compare with making processes with fork
    - multiple ways to get a working implementation
    - TAs will do their best to grade fairly, don't need to do exactly
      what was in the solutions
        - probably better to just add parameters to the shared struct!
        - main thing is that you used threads and not processes,
          and got rid of the mmap and the shared-memory hints to pthreads
          (the calls to setpshared)
 - Q4: understand the steps in creating a device in the kernel
       see that the device files created are nothing special;
       they could have been created manually
       (note that device files in /dev/pts are an exception, because
        they are actually device files in a special filesystem,
        so they have weird semantics.  Devices in /dev are "normal")
       you can copy "normal" device files between filesystems
       most of the time
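One way to see that a device file is "nothing special" is to stat it from userspace: all it carries is a file type (character or block) plus a major/minor device number. The sketch below is only an illustration in Python; the helper name describe_device is made up here, and /dev/null is used simply because it should exist on any Linux system.

```python
import os
import stat

def describe_device(path):
    """Return (kind, major, minor) for a device file, or None if path is not one."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "char"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev holds the device number the kernel uses to pick the driver
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)

if __name__ == "__main__":
    print(describe_device("/dev/null"))
```

Passing the same type and numbers to mknod (as root) should produce an equivalent device file somewhere else, which is the sense in which the files could have been created manually.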

 - Q5: understand how character device modules work
       - how read/write access is determined
       - how the fops struct controls what functions are
         called for file operations
       understand the significance of uid vs euid (vs. pid)
        - be able to express it in English
       get experience navigating the kernel source code
        - see how to call kernel code by following examples in
	  the code base
       understand better the task abstraction for processes & threads
       kernel space vs user space (and transferring data safely between them)
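A rough userspace illustration of the task abstraction, and of the Q3 point that threads share data without any mmap: a thread is a separate kernel task (its own thread ID) inside the same process (same PID and memory), while a forked child is a new process with its own copy of memory. This is a sketch in Python, not the assignment code; the names shared, worker, run_thread and run_fork are made up for the example, and threading.get_native_id() needs Python 3.8 or later.

```python
import os
import threading

shared = {"count": 0}   # lives in the one address space all threads share

def worker(ids):
    shared["count"] += 1
    # same PID as the main thread, but a distinct kernel task (thread) ID
    ids.append((os.getpid(), threading.get_native_id()))

def run_thread():
    """Run worker in a new thread and return its (pid, native thread id)."""
    ids = []
    t = threading.Thread(target=worker, args=(ids,))
    t.start()
    t.join()
    return ids[0]

def run_fork():
    """Fork a child that updates its own copy of shared; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        # child: this update happens in a copy and is invisible to the parent
        shared["count"] += 100
        os._exit(0)
    os.waitpid(pid, 0)
    return pid

if __name__ == "__main__":
    print("main   :", (os.getpid(), threading.get_native_id()))
    print("thread :", run_thread())
    print("child  :", run_fork())
    print("count seen by parent:", shared["count"])
```

The parent only ever sees the thread's increment; to see the child's update it would need explicitly shared memory, which is what the mmap in the process-based version was for.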

    using helper functions in the kernel is the best way to access things
      - I got a bit sloppy in one part, where I directly accessed task->cred->uid
      
    How do you debug kernel modules?
      - printk is most straightforward
      - there are ways to use gdb (kgdb) but the setup is non-trivial
      - there's a reason why it is good to code in userspace!
      - and why eBPF is so wonderful!
    Why use current_ns?  Because it seems dangerous (to me) to use any other namespace
       - but I could be convinced otherwise
 - Q6: realize that some things (like usernames) exist purely in userspace;
       the kernel knows nothing about them
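The uid/euid/pid distinction and the Q6 point about usernames can both be seen from userspace. In the sketch below (an illustration only; the function name whoami is made up), the three numbers come from the kernel via system calls, while the name is looked up afterwards in the user database by library code running in userspace.

```python
import os
import pwd

def whoami():
    """Return the numeric IDs the kernel tracks plus the name userspace maps them to."""
    uid = os.getuid()     # real UID: who started the process
    euid = os.geteuid()   # effective UID: what permission checks use
    try:
        # the name comes from the user database (e.g. /etc/passwd), not the kernel
        name = pwd.getpwuid(euid).pw_name
    except KeyError:
        name = None       # a UID with no name is still perfectly valid to the kernel
    return {"pid": os.getpid(), "uid": uid, "euid": euid, "name": name}

if __name__ == "__main__":
    print(whoami())
```

uid and euid are normally equal; they differ when, for example, the program file is setuid.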


(Optional info)
Traditionally on Linux systems, logs are in /var/log
 - /var is for data that is variable (get it?)
 - logs can be put there by individual applications, normally in
   their own subdirectory
 - traditionally syslogd would handle the other log files, but
   now that is part of systemd
 - in fact, with systemd these text logs are copies of data in a binary
   log (stored in /var/log/journal); you can read this log directly
   with journalctl
 - kernel logs go to the console initially, then go to a userspace program
   that has requested access to the kernel logging buffer
     - traditionally klogd, now part of systemd
     - copies of kernel logs are put in /var/log/kern.log, can
       get current ones with dmesg
 - log rotation is responsible for files with a number after them
   - periodically current logs are moved to a numbered log file
   - older logs are compressed, and oldest are deleted
   - keeps logs from taking up all of the disk (generally)
 - journalctl has lots of ways to select which logs are shown
   - or you use grep on the text logs
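If you want to search the rotated text logs yourself, the numbered and compressed files have to be read in order. A minimal sketch follows, assuming the common naming pattern of base, base.1, base.2.gz and so on (the exact names depend on how log rotation is configured); the function names rotated_logs and read_lines are made up for the example.

```python
import gzip
import os
import re

def rotated_logs(directory, base):
    """Return paths for base, base.1, base.2.gz, ... ordered newest first."""
    pattern = re.compile(re.escape(base) + r"(?:\.(\d+))?(?:\.gz)?$")
    found = []
    for name in os.listdir(directory):
        m = pattern.match(name)
        if m:
            # the unnumbered file is the current log, so it sorts first
            generation = int(m.group(1)) if m.group(1) else 0
            found.append((generation, os.path.join(directory, name)))
    return [path for _, path in sorted(found)]

def read_lines(path):
    """Yield lines from a plain or gzip-compressed log file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as f:
        for line in f:
            yield line.rstrip("\n")
```

For example, looping over rotated_logs("/var/log", "kern.log") and filtering read_lines(path) for a substring does roughly what grep (plus zgrep for the compressed files) would do.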

systemd is a project to replace much of the core userspace functionality on
Linux systems.  Controversial but now ubiquitous.  Replaces lots of low-level
stuff, including
 - init
 - networking
 - logging
 - kernel module loading
 - container management (partial)
 - getty (login from a console)
 - time management
 - others
The complaint is that systemd replaced stuff that wasn't broken.  The response
is that a unified approach to this functionality provides lots of benefits
(power management and fast boot in particular)

See https://www.freedesktop.org/wiki/Software/systemd/


Process states
 - # cores = # processes (maximum) in the running state
 - everyone else is waiting for some reason
   - waiting for I/O (e.g. waiting for the disk)
   - sleeping (nothing to do)
   - ready/runnable (waiting to be scheduled, to get access to a core);
     Linux shows these as "R (running)" too, and uses "I (idle)" for
     idle kernel threads

grep State /proc/*/status | grep -v idle | grep -v sleeping | less
 - to see what is not sleeping or idle
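The same information can be tallied instead of paged through. The sketch below reads the State line of every /proc/<pid>/status file and counts how many processes are in each state; it assumes a Linux /proc, and the function names parse_state and count_states are made up for the example.

```python
import collections
import glob

def parse_state(status_text):
    """Extract e.g. 'S (sleeping)' from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None

def count_states():
    """Tally the State field across all processes currently visible in /proc."""
    counts = collections.Counter()
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as f:
                state = parse_state(f.read())
        except OSError:
            continue  # the process exited between listing and reading
        if state:
            counts[state] += 1
    return counts

if __name__ == "__main__":
    for state, n in count_states().most_common():
        print(n, state)
```

On a mostly quiet machine nearly everything should show up as sleeping or idle, with only a handful of tasks in any other state.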