Operating Systems 2022F Lecture 5

Video

Video from the lecture given on September 22, 2022 is now available:

Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)

Notes

Lecture 5
---------

Today: Tutorial 3, along with some old business

Admin notes
 - Assignment 1 is due Sept 28th before midnight
 - Tutorials 1 & 2 are also due at the same time
    - really, you should have them done by now
    - MAKE SURE YOU GET CHECKED OFF ASAP
 - Don't do the assignment before doing the tutorials!

 - change in TA & tutorial policy
    - you can go to ANY TA to get marked for a tutorial
    - your assigned TA will be following up with anyone who is running
      behind on tutorials (i.e., starting tomorrow they will start
      checking in with you)

 - Why aren't there tutorial solutions?
    - because the answers aren't the point!
    - you should know when you understand something
        - there are no solution keys once you leave school
    - if you focus on the answer rather than techniques for figuring
      out answers (beyond searching), you won't do well in this class
      because you'll never develop enough of a mental model to do the work



Ctrl-D means end of file on a UNIX terminal, so many command line programs will terminate when you type it (they see that their input has ended).
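
To a program, end of file just looks like a read that returns zero bytes. Here is a minimal sketch of that in Python (my choice of language, not the course's), with a pipe standing in for the terminal: closing the write end plays the role of typing Ctrl-D. The names read_until_eof and eof_demo are made up for illustration.

```python
import os

def read_until_eof(fd):
    """Read from fd until read() returns no bytes, which is how EOF shows up."""
    chunks = []
    while True:
        data = os.read(fd, 4096)
        if data == b"":          # zero bytes read means end of file
            break
        chunks.append(data)
    return b"".join(chunks)

def eof_demo():
    """Write one line into a pipe, close the writer, and read until EOF."""
    r, w = os.pipe()
    os.write(w, b"hello\n")
    os.close(w)                  # no writers left: the reader now sees EOF
    result = read_until_eof(r)
    os.close(r)
    return result
```

A program such as cat runs essentially this loop on standard input, which is why it exits when it sees end of file.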

"Background processing"
 - normally when you run a command in a shell, it waits for it to finish
 - if you run a command "in the background", it won't wait
    - so the shell & the command are running at the same time
 - specifically
    - with foreground processing, the shell does a blocking wait
      on the child.
    - with background processing, the shell continues execution and doesn't
      do a wait until it gets a message (signal) that the child has exited
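
The difference between the two kinds of wait can be sketched as follows. This is only an illustration, not how bash or 3000shell is written: the function names are made up, and instead of reacting to a signal the background version simply checks on the child with a non-blocking wait (WNOHANG) before collecting it.

```python
import os
import time

def spawn_sleeper(seconds):
    """Fork a child that sleeps briefly and then exits with status 0."""
    pid = os.fork()
    if pid == 0:
        # Child: do the "work", then exit without returning to the caller.
        try:
            time.sleep(seconds)
        finally:
            os._exit(0)
    return pid

def run_foreground(seconds):
    """Blocking wait: the parent cannot continue until the child is done."""
    pid = spawn_sleeper(seconds)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

def run_background(seconds):
    """No blocking wait up front: the parent keeps going and reaps the child later."""
    pid = spawn_sleeper(seconds)
    done_pid, _ = os.waitpid(pid, os.WNOHANG)   # returns (0, 0) if still running
    still_running = (done_pid == 0)
    # ... a shell would print its prompt and read more commands here ...
    _, status = os.waitpid(pid, 0)              # eventually collect the exit status
    return still_running, os.WEXITSTATUS(status)
```

In run_background the parent gets control back while the child is still sleeping, which is the situation the shell is in after "cmd &".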

In classic UNIX systems, we have two core system calls for creating processes and running programs:
 - fork duplicates the current process
 - execve replaces the program *in the current process* with that from a file

(fork/execve is not ideal in many circumstances, and people are currently developing alternative APIs for Linux, but for a shell the pair is near perfect)

When I type "ls" at a shell prompt
 - the shell (bash, 3000shell) first "forks", creating a child process
   that is a duplicate of it (i.e., it is also running bash or 3000shell,
   in exactly the same state except for the return value of fork)
 - the child process (the new one that was just created) does an execve
   of /usr/bin/ls, thus replacing the bash binary with ls
 - the parent process (still running bash) waits for the child process to
   terminate (do an exit system call), and once it does it prints a new prompt
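
The three steps above can be sketched in a few lines. This is a rough sketch rather than the actual bash or 3000shell code; run_command is a made-up name, Python's os.fork, os.execve and os.waitpid wrap the corresponding system calls, and the exit status 127 on a failed execve just follows the usual shell convention.

```python
import os

def run_command(path, argv):
    """Run the program at `path` the way a shell runs a foreground command."""
    pid = os.fork()                      # step 1: duplicate this process
    if pid == 0:
        # Child: fork returned 0 here. Replace this program with the new one.
        try:
            os.execve(path, argv, os.environ)   # step 2: does not return on success
        finally:
            os._exit(127)                # only reached if execve failed
    # Parent: fork returned the child's pid. Block until the child exits.
    _, status = os.waitpid(pid, 0)       # step 3
    return os.WEXITSTATUS(status)
```

For example, run_command("/usr/bin/ls", ["ls"]) would list the current directory and return the exit status of ls, assuming ls lives at that path on your system.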

If I type "ls &"
 - it is exactly the same except for the last step: the shell doesn't wait, it just prints the next prompt right away.
 
   
Why does the output look weird with "ls &"?
  - because the shell prints its prompt right away, so the prompt shows up before the output of ls

Note that in bash, if you say "exec <cmd>", it skips the fork and just does an execve (so <cmd> replaces the shell itself)

What is execve actually doing?
 - we will be discussing this in more detail, but basically it throws away
   everything that was in the process, replacing it with the code from the new program
     - some bits are preserved, but they are very limited
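
Two of the things that do survive an execve are the process ID and open file descriptors. A small sketch that shows both, assuming a Python interpreter is available at sys.executable (pid_after_exec is a made-up name): the child points its standard output at a pipe, execs a new program, and that new program reports its own PID back through the pipe.

```python
import os
import sys

def pid_after_exec():
    """Return (pid of the forked child, pid reported by the program it exec'd)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            os.dup2(w, 1)    # make the pipe the child's standard output
            code = "import os; print(os.getpid())"
            os.execve(sys.executable, [sys.executable, "-c", code], os.environ)
        finally:
            os._exit(127)    # only reached if execve failed
    os.close(w)
    with os.fdopen(r) as pipe:
        reported = int(pipe.read())
    os.waitpid(pid, 0)
    return pid, reported
```

The two numbers come out the same: execve swapped the program, but it is still the same process, and its standard output still points at the pipe.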

Note that this fork/execve thing happens in a shell every time you run an *external* command.  Internal ones are just handled by the shell process.


On a standard UNIX machine, there are many processes always running "in the background" providing services
 - these are referred to as daemons
   (not evil, more useful servants)

For example, sshd is running on your virtual machines (the d is for daemon)
  - it is listening for incoming ssh connection requests
  - this is what lets you connect to your VM!

processes are the basic unit of concurrency on UNIX systems
  - threads were later added as part of this, but at its heart
    UNIX just wants to deal with processes
      - process: something created by fork

Now if you do strace on a modern Linux system, you won't see "fork", you'll see "clone"
  - it is a more general system call, superset of fork's functionality

envp, environ - all refer to environment variables
 - key/value combinations
 - each is one string, with the first part being the key, then an =, then the value
 - passed in via execve: whenever a new program runs, it is given its environment
   (along with argv) by the program that was previously running in that process
 - used to specify the process's "environment"
    - whatever info might be relevant, can be almost anything
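
The "KEY=value" string format can be illustrated with a short sketch. The helper names to_envp and from_envp are made up; the one detail worth noticing is that only the first = separates key from value, so a value may itself contain an = sign.

```python
def to_envp(env):
    """Turn a dict into a list of "KEY=value" strings, the form envp takes."""
    return [f"{key}={value}" for key, value in env.items()]

def from_envp(envp):
    """Split each "KEY=value" string at the first '=' only."""
    result = {}
    for entry in envp:
        key, _, value = entry.partition("=")
        result[key] = value
    return result
```

For instance, to_envp({"HOME": "/home/student"}) gives ["HOME=/home/student"] (the value here is just an example).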


Remember that environment variables are copied
 - when you do a fork, the child process has a copy of the parent's environment variables (because it has a copy of all of the parent's memory)
 - on execve, the new program has whatever the old program gave it as envp to execve
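
The execve half of this can be sketched as follows: the new program sees whatever envp it was handed, not whatever the old program happened to have. This assumes a Python interpreter at sys.executable; exec_and_read and the variable names used with it are made up for illustration.

```python
import os
import sys

def exec_and_read(envp, name):
    """Exec a new Python with `envp` as its environment; return its value for `name`."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            os.dup2(w, 1)    # the new program's standard output goes into the pipe
            code = "import os, sys; print(os.environ.get(sys.argv[1], '<unset>'))"
            # The third argument is the envp the new program will start with.
            os.execve(sys.executable, [sys.executable, "-c", code, name], envp)
        finally:
            os._exit(127)    # only reached if execve failed
    os.close(w)
    with os.fdopen(r) as pipe:
        value = pipe.read().strip()
    os.waitpid(pid, 0)
    return value
```

If the caller leaves a variable out of envp, the new program reports it as unset, even though the caller itself still has it.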

So that means changes to environment variables are always local
 - there is no global state across processes for environment variables
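
A quick way to convince yourself that changes are local: let a forked child change a variable and then check what the parent sees. A minimal sketch, with a made-up function name and made-up variable names:

```python
import os

def child_changes_variable(name, parent_value, child_value):
    """Set a variable, let a forked child change it, and report what each side sees."""
    os.environ[name] = parent_value
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            os.environ[name] = child_value          # changes the child's copy only
            os.write(w, os.environ[name].encode())  # tell the parent what we see
        finally:
            os._exit(0)
    os.close(w)
    with os.fdopen(r, "rb") as pipe:
        seen_by_child = pipe.read().decode()
    os.waitpid(pid, 0)
    return os.environ[name], seen_by_child
```

The parent still has its original value after the child has changed its own copy and exited.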

But if you define environment variables as part of your startup configuration,
every process in your session gets the same values, so they can seem "global" even though they were just copied at each step along the way
  - unless some process in the middle decided to change them


Ones that we are using here:
  PATH: directories to search for program binaries
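
Searching PATH amounts to trying each directory in order and taking the first match. A rough sketch of the idea (find_in_path is a made-up name, and this version simply skips empty PATH entries, which real shells handle differently):

```python
import os

def find_in_path(name, path=None):
    """Return the first executable called `name` in the PATH directories, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue                 # ignore empty entries in this sketch
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate         # first match wins, so directory order matters
    return None
```

Python's standard library has shutil.which for the same job; the loop is spelled out here only to show the search.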

"session" - a user's session, starts when they log in and ends when they log out
 - configured by dot files:
    - .profile, .bash_profile   <--- note .profile is for all shells;
    - .bashrc, .bash_aliases    <--- the rest start with "bash" because
                                <--- they are for bash!