Operating Systems 2022F Lecture 21

From Soma-notes

Video

Video from the lecture given on December 1, 2022 is now available:

Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)

Notes

Lecture 21
----------

 - Assignment 4 due in a week
 - advice?  well I gave a lot of hints last class


Standard view of hardware, kernel, and processes:

	   +------+ 	+------+    +-------+
	   |   e  |    	|   b  |    |   t   |
	   |   m  | 	|   a  |    |   o   |
	   |   a  | 	|   s  |    |   p   |
	   |   c  |  	|   h  |    |       |
	   |   s  |  	|      |    |       |
	   +------+ 	+------+    +-------+
		     ---system calls---
	  +-----------------------------------+
	  |                                   |
	  |           Kernel                  |
	  +-----------------------------------+
		     interrupts, DMA
	  +-----------------------------------+
	  |                                   |
	  |           Hardware (I/O)          |
	  +-----------------------------------+

DMA - direct memory access
 - device reading/writing RAM directly
 - without DMA, CPU must send/receive data itself


       +---------------+		  +-----------------+
       |               |		  |                 |
       |    Emacs      |       	       	  |    Bash         |
       |       	       |		  |                 |
       |               |		  |                 |
       |               |		  |		    |
       |               |		  |                 |
       |               |		  |                 |
       |               |		  |                 |
       |               |		  |                 |
       |               |		  |                 |
       |               |		  |                 |
       |  +----------+ |		  |   +---------+   |
       |  |          | |		  |   |         |   |
       |  |  libc    | |		  |   |  libc   |   |
       |  +----------+ |		  |   +---------+   |
       +---------------+		  +-----------------+


So what about libc, how is it actually shared between these processes?

We *could* just load these into each address space separately, but that
would be wasteful.

So what do we do instead?
 - we load one copy of libc into (physical) memory
 - we then *share* that copy between processes that load libc
 - HOW?

The sharing happens at the page table level
 - bash and emacs have their own page tables, but
 - for shared libraries, portions of that page table are shared

Note for this sharing to happen, we *have* to be doing dynamic linking

When you look at the physical address for functions in libc, they should remain consistent between runs (because libc isn't moving, it is already loaded)

But what is the mechanism for making sure memory is shared?  How do we load in the library file so that its memory will be shared?
 - what system call?
 - mmap!

When we mmap a file, we're actually saying that a given range of virtual addresses is equivalent to the contents of a specific file.  So "loading" libc is actually mapping the libc file into the process's address space
 - when multiple processes map the same file, the kernel only has to
   load the file once.  Only the mapping is duplicated, and only for the top of
   the page table subtree

So if you map in a file read/write (as a shared mapping), multiple processes will see the same data.  CPU caching can still mess things up, though, so you'll still want to ensure mutual exclusion
 - but most of the time when we mmap files, we do it read only
   (because it is code)
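
A minimal sketch of the shared-mapping idea, in Python rather than C.  It assumes a Unix-like system, uses a scratch temp file as a stand-in for the shared file, and uses two mappings inside one process as a stand-in for two separate processes (the names view_a and view_b are made up for illustration):

```python
import mmap
import os
import tempfile

# Scratch file standing in for a file that several processes map.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from disk\n")
os.close(fd)

# Two independent mappings of the same file.  On Unix, Python's mmap
# defaults to a shared, read/write mapping.
with open(path, "r+b") as f1, open(path, "r+b") as f2:
    view_a = mmap.mmap(f1.fileno(), 0)   # length 0 maps the whole file
    view_b = mmap.mmap(f2.fileno(), 0)

    view_a[0:5] = b"HELLO"               # write through the first mapping
    seen_by_b = bytes(view_b[0:5])       # read through the second mapping

    view_a.close()
    view_b.close()

os.remove(path)
print(seen_by_b)
```

Both mappings refer to the same cached pages of the file, so the write made through one is visible through the other without any explicit copy.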

What defines a process's address space?  Its page table!
 - really, it is just a data structure that defines how to map
   virtual to physical addresses
 - the address space is a virtual address space
 - it can map into physical memory in all kinds of ways
    - parts could have NO physical memory for them
    - parts can be duplicated between processes (shared mmap'd memory,
      from a file or anonymous)

An anonymous mmap is just an mmap with no file behind it.
 - useful for allocating memory!
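
As a rough illustration, Python's mmap module can request an anonymous mapping by passing -1 in place of a file descriptor.  The four-page size below is an arbitrary choice for the sketch:

```python
import mmap

PAGE = mmap.PAGESIZE

# fileno -1 asks for an anonymous mapping: memory with no file behind it.
buf = mmap.mmap(-1, 4 * PAGE)

first = buf[0]                 # fresh anonymous memory reads as zero
buf[0:4] = b"data"             # use it like any other buffer
contents = bytes(buf[0:4])
size = len(buf)

buf.close()
print(first, contents, size == 4 * PAGE)
```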

How could parts of an address space have no physical memory associated with them?
 - was allocated but hasn't been used.  the kernel can be lazy, will
   often only allocate pages when they are needed, not when they are asked for
 - or, could instead be on disk (in a file or the pagefile/swap)


  Processes A, B, C, and D, how are they in physical memory?


  |AABACCAXXXDDAAXXBBACDA....AA..BB|   <--- it is chaos!  but we don't notice!

  (of course some pages are shared, we'll call them X above)

Memory management is fundamentally lazy
 - only does the work it has to when needed (much of the time)
 - memory isn't set aside when it is allocated but when it is used
    - so data from a file is loaded as the file is accessed
    - code from an executable is loaded when it is needed

This is called demand paging (loading pages on demand)

So another benefit of dynamic linking is the whole file doesn't have to be loaded - only the portion that is being used
 - rest will stay on disk
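
Demand paging can be sketched as a toy simulation.  This is not how the kernel implements it; the LazyMapping class, its fields, and the 10-page "library" are all hypothetical, and a plain bytes object stands in for the file on disk:

```python
PAGE_SIZE = 4096

class LazyMapping:
    """Toy model of a mapped file whose pages are loaded only on first use."""

    def __init__(self, backing):
        self.backing = backing     # stands in for the file on disk
        self.resident = {}         # page number -> bytes held "in RAM"
        self.faults = 0            # how many pages had to be loaded

    def read(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        if page not in self.resident:              # not present: "page fault"
            self.faults += 1
            start = page * PAGE_SIZE
            self.resident[page] = self.backing[start:start + PAGE_SIZE]
        return self.resident[page][offset]

library = bytes(range(256)) * 160      # 40960 bytes, i.e. 10 pages
m = LazyMapping(library)
m.read(0)
m.read(100)                            # same page as address 0, no new fault
m.read(5 * PAGE_SIZE + 1)              # a different page, one more fault
print(m.faults, len(m.resident))
```

Only two of the ten pages end up resident, because only two were ever touched.  The rest "stay on disk".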


Remember at the heart of a page table is a page table entry (PTE).  I've been saying this is essentially a pointer, but it is more than that.  It is a page number (the upper portion of an address, minus the offset) PLUS page metadata.
 - because we don't need the offset, we have room for metadata (up to 12 bits with 4K pages, since the offset is 12 bits)

What metadata is there per page?  It is just a small number of bits
 - read, write, execute bits
 - present/valid (so if the page isn't loaded, set this to 0)
 - dirty bit (has anyone written to this page)
 - accessed bit (has anyone read from or written to this page)
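
The "page number plus metadata" layout can be sketched with some bit arithmetic.  The flag positions below are an assumption made up for illustration and do not match any particular CPU; real layouts differ by architecture.  4K pages are assumed:

```python
PAGE_SHIFT = 12                        # 4K pages: the low 12 bits are the offset
LOW_MASK = (1 << PAGE_SHIFT) - 1       # selects those low 12 bits

# Illustrative flag positions (hypothetical, not a real CPU's layout).
PRESENT, WRITABLE, EXECUTABLE, ACCESSED, DIRTY = (1 << i for i in range(5))

def make_pte(frame, flags):
    """Pack a physical page (frame) number and metadata bits into one integer."""
    return (frame << PAGE_SHIFT) | (flags & LOW_MASK)

def decode_pte(pte):
    """Split a PTE back into (frame number, metadata bits)."""
    return pte >> PAGE_SHIFT, pte & LOW_MASK

def translate(pte, vaddr):
    """Combine the PTE's frame number with the offset from the virtual address."""
    frame, flags = decode_pte(pte)
    if not flags & PRESENT:
        raise LookupError("page fault: page not present")
    return (frame << PAGE_SHIFT) | (vaddr & LOW_MASK)

pte = make_pte(0x1A2B3, PRESENT | ACCESSED)
phys = translate(pte, 0x7FFF0123)
print(hex(phys))
```

The offset (0x123 here) passes through translation unchanged, which is why the PTE has no need to store it.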

Read/write/execute bits are a big way we detect "segfaults"
  - writing to read-only memory

If you access an address that is not present, the kernel has a choice
  - signal an error, this memory isn't allocated
  - or, load the necessary data/allocate the page

Because memory can be stolen from underneath a process, it is possible
for a process to be delayed at any time even just when accessing memory
 - because the kernel may need to do some work to make that memory "valid"

So what happens if the kernel runs out of pages?
 - it is very easy for the kernel to "overcommit" memory
 - try it on your VM, you can probably allocate an 8G array even though the machine only has 4G of RAM.  Just don't access it all and everything will be *fine*

When you look at top, it will tell you stats about RAM.  To understand these, you have to understand how virtual memory is managed
 - free Mem is the amount of RAM available (pages that are ready to be allocated)
 - used Mem is physical memory being used by running processes
 - buff/cache is physical memory storing copies of data that is on disk
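
On Linux these numbers come from /proc/meminfo.  A small sketch of pulling out "free" and "buff/cache" follows.  The sample values are made up for illustration, and the assumption that buff/cache is roughly Buffers + Cached + SReclaimable should be checked against your version of top/free:

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into {field name: value}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])   # values are in kB
    return fields

def summarize(fields):
    # Assumption: buff/cache is roughly Buffers + Cached + SReclaimable.
    cache = fields["Buffers"] + fields["Cached"] + fields.get("SReclaimable", 0)
    return {"free": fields["MemFree"], "buff/cache": cache}

# Made-up sample values, for illustration only (not from a real machine).
SAMPLE = """\
MemTotal:        4000000 kB
MemFree:          500000 kB
Buffers:          100000 kB
Cached:          1400000 kB
SReclaimable:     200000 kB
"""

summary = summarize(parse_meminfo(SAMPLE))
print(summary)
```

On a real Linux system you could pass open("/proc/meminfo").read() instead of SAMPLE.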

Note that we are using RAM both for running program memory and to store copies of files from disk.

Why store copies of files?
 - to make things faster!

But...what if you write to a file, it is stored in RAM, and then the computer loses power?
 - you lose the changes to the file!
 - so it is important for changes to files to be written to disk regularly
 - but, won't that slow things down?  yes it can, but
   normally it can happen "in the background"

Normally process memory is volatile, meaning if you lose power it is gone.
When a process writes data to disk, it is normally persistent
across power loss
  - but actually, writing to a file normally just writes it to RAM
  - later on (generally within a few seconds) it gets written to disk

Systems that need to be robust to power loss can use battery-backed RAM to store info on recently written files very fast

This is "buffered I/O" - faster, but less reliable
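
When a program can't wait for the background write, it can ask the kernel to flush a file explicitly.  A minimal sketch using fsync, with a scratch temp file as a stand-in for real data:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"important record\n")   # normally lands in RAM (page cache) first
    os.fsync(fd)                          # ask the kernel to push it to the device now
finally:
    os.close(fd)

with open(path, "rb") as f:
    data = f.read()
os.remove(path)
print(data)
```

The read at the end would succeed with or without the fsync, since reads are served from the cached copy.  The difference only shows up if the power goes out in between.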

When the OS runs into memory pressure (gets low on the number of free pages), it will have to take steps to free up RAM.  How does it decide what to remove?
 - to a first approximation, it wants to get rid of what was least recently used (LRU)
    - because if you haven't used it in a long time, you probably won't need it

One problem with LRU - how do you keep track of when a page was most recently used?
 - ideally you'd use a time stamp, but that is a) too much data and b) too expensive to update

We normally use what are known as "clock algorithms" to approximate LRU.
Basic idea is we periodically go through memory and mark everything as having not been accessed (set the accessed bit to 0)
 - hardware sets the PTE bit to 1 if the page is accessed
 - so then when we look in memory and see a page with its access bit set to 0, we know that page hasn't been accessed since the last "sweep" of the "clock hand"
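
A toy simulation of one common variant, the "second chance" clock, where the hand clears accessed bits as it passes and evicts the first page whose bit is already 0.  The Clock class and its touch method are hypothetical names for this sketch, and a Python list stands in for the accessed bits that hardware would set:

```python
class Clock:
    """Toy second-chance clock over a fixed number of page frames."""

    def __init__(self, nframes):
        self.pages = [None] * nframes       # which page each frame holds
        self.accessed = [False] * nframes   # stands in for the PTE accessed bit
        self.hand = 0

    def touch(self, page):
        """Access a page; return the page evicted to make room (or None)."""
        if page in self.pages:
            self.accessed[self.pages.index(page)] = True   # "hardware" sets the bit
            return None
        # Sweep: clear accessed bits until a frame with bit 0 turns up.
        while self.accessed[self.hand]:
            self.accessed[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        victim = self.pages[self.hand]
        self.pages[self.hand] = page
        self.accessed[self.hand] = True
        self.hand = (self.hand + 1) % len(self.pages)
        return victim

clock = Clock(3)
evicted = [clock.touch(p) for p in ["A", "B", "C", "D", "B", "E"]]
print(evicted, clock.pages)
```

Loading D sweeps the whole clock and evicts A.  B is then accessed again, so when E arrives the hand skips B and evicts C, which hasn't been touched since the sweep.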

Remember there is one page table entry per page
 - the PTE stores the metadata associated with the page

So our page freeing algorithm looks for
 - pages that haven't been accessed that aren't dirty
 
So what about dirty pages?
 - a separate "hand" (algorithm) sweeps through memory looking for pages with the dirty bit set.  It then writes those to disk (either to the corresponding file or to the pagefile)
 - once written to disk, the dirty bit can be cleared

Dirty bit => we can't throw this page away without losing data

In general we'll treat dirty pages belonging to files differently than those that just belong to a process

The TLB is just a cache of recently used PTEs (and their corresponding virtual addresses) that can all be queried in parallel
PTEs are the component parts of page tables

What happens if we still can't get enough free pages?  What if too many are being accessed & written to?
 - well, your system will slow down as you get close to limits
 - but if things get really bad, drastic action has to be taken
 - in Linux, there is an "oom killer" that gets deployed and will
   terminate processes as needed to get memory usage under control
     - lots of controversy about how the oom killer works, it can
       REALLY mess things up so it has gotten regular adjustments

oom = out of memory

One thing - kernel memory isn't swappable, it is always in RAM.  So if
you have a memory leak in a kernel module, it will continuously reduce
the amount of RAM available to processes.

How would you avoid ever running out of RAM?  You'd have to be more conservative in memory allocations, not be so lazy
 - but that would mean lower performance because you reduce RAM usage

Linux kind of operates like a financial institution, it leverages (memory) debt to maximize performance
 - risking bankruptcy!