Operating Systems 2015F Lecture 18


Video

The video from the lecture given on November 13, 2015 is now available.

Notes

Lecture 18
----------

Virtual Memory

swapping, er...paging

Swapping technically means writing a currently running program's entire memory image to disk and loading another one to run

Paging means moving pages out of RAM and onto disk

Question: why does Linux use much more than the bare minimum of RAM?  What is it doing with all of that extra memory?

Answer: very aggressive caching of disk contents
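
One way to see this on a Linux box: compare MemFree and Cached in /proc/meminfo (real fields exported by the kernel). A minimal C sketch:

    /* Show how much RAM is truly free vs. used as page cache.
     * Reads /proc/meminfo, provided by every modern Linux kernel. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *f = fopen("/proc/meminfo", "r");
        if (!f) { perror("/proc/meminfo"); return 1; }

        char line[128];
        while (fgets(line, sizeof(line), f)) {
            /* MemFree: unused RAM; Cached: disk contents cached in RAM */
            if (strncmp(line, "MemTotal:", 9) == 0 ||
                strncmp(line, "MemFree:", 8) == 0 ||
                strncmp(line, "Cached:", 7) == 0)
                fputs(line, stdout);
        }
        fclose(f);
        return 0;
    }

On most machines that have been up for a while, Cached dwarfs MemFree: the "extra" memory is all disk cache.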

Problem: But what if I need more RAM for programs?

Need algorithms for moving pages between RAM and disk

Types of page contents:
* Program data
* Program code
* File contents

Note that program code is also (read-only) file contents

Really, you have:
1. File contents, read-only
2. File contents, read-write
3. Program data

You need more RAM.  What do you kick out first?
 - not such an easy question!

Kick out (evict) read-only file contents first
Pro: we don't have to write anything to disk!
Con: Hey, I was running that code!

Evict read-write file contents
Pro: hey, they're supposed to go to disk anyway, so go ahead and write them
Con: but we have to wait for the writes to finish first

Evict program data...to swap
Pro: Hey, we didn't need that data anyway
Con: Hey, we *did* need that data!


Hardware helps with the decision of which pages to kick out
* accessed bit
* dirty bit

Ideal eviction targets are pages with accessed bit clear and dirty bit clear (clean, non-accessed pages)

Next eviction target: non-accessed dirty pages.  Save them to disk, then evict

After that, evict accessed but clean pages...but this is bad if done frequently (the system is "thrashing", evicting pages that are still in active use)
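
A rough sketch of this ordering (a toy model, not kernel code; the bit positions follow the x86 page-table-entry layout, where bit 5 is accessed and bit 6 is dirty):

    /* Toy eviction-priority ranking from the accessed and dirty bits.
     * Bit positions follow the x86 PTE layout; this is a sketch only. */
    #include <stdint.h>

    #define PTE_ACCESSED (1u << 5)  /* hardware sets this on any access */
    #define PTE_DIRTY    (1u << 6)  /* hardware sets this on any write  */

    /* Lower number = better eviction candidate. */
    int eviction_priority(uint32_t pte)
    {
        int accessed = (pte & PTE_ACCESSED) != 0;
        int dirty    = (pte & PTE_DIRTY)    != 0;

        if (!accessed && !dirty) return 0;  /* clean + cold: free to drop   */
        if (!accessed &&  dirty) return 1;  /* write back first, then evict */
        if ( accessed && !dirty) return 2;  /* in use: evicting risks thrashing */
        return 3;                           /* in use and dirty: last resort */
    }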

Dirty pages are easy - just have a process constantly writing dirty pages to disk, thus "cleaning" them
  - sweeps through memory periodically looking for dirty pages
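
A sketch of such a cleaner; page_is_dirty(), write_page_to_disk(), and clear_dirty_bit() are hypothetical helpers standing in for real kernel machinery:

    /* Background "cleaner" sweep: write dirty pages back to disk so they
     * become cheap to evict later.  The three helpers are hypothetical. */
    #define NFRAMES 4096

    int  page_is_dirty(int frame);
    void write_page_to_disk(int frame);
    void clear_dirty_bit(int frame);

    void cleaner_sweep(void)
    {
        for (int frame = 0; frame < NFRAMES; frame++) {
            if (page_is_dirty(frame)) {
                write_page_to_disk(frame);  /* flush contents      */
                clear_dirty_bit(frame);     /* page is clean again */
            }
        }
    }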

Ideally what you'd want is a timestamp of last access for every page (true LRU), but the hardware only gives you a single accessed bit, so we approximate

For accessed bits, periodically sweep through memory setting accessed bits to 0.  If it is 1, then you know the page has been accessed since the last sweep.

These "sweep" type algorithms are called clock algorithms because you can think of the pages as positions on a very very big clock, with a hand moving from one to the next.

Page replacement algorithms can get very complex because, in their ideal form, they predict the future.  Best approximation is a bunch of heuristics and adaptive mechanisms based on past behavior.

But page replacement needs to make reasonable choices, especially on systems with no swap, where dirty program data has nowhere to go and cannot be evicted at all


When a program asks for memory, how do you know if you have enough?  When do you say no?

Conservative way: reserve pages for every byte requested, and say yes only once you've reserved them all

The Linux kernel is not conservative at all: it "overcommits" memory

Example: fork
 - logically, doubles RAM usage
 - at least, doubles program data RAM usage
 - but in practice it causes only a modest increase in RAM usage, thanks to copy-on-write
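
A small demo (safe to run on Linux): the parent touches 256 MB, forks, and the child never writes, so copy-on-write means the fork adds almost no real RAM usage:

    /* fork() logically duplicates the parent's memory, but Linux shares
     * the pages copy-on-write: nothing is copied until someone writes. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        size_t size = 256 * 1024 * 1024;     /* 256 MB */
        char *buf = malloc(size);
        if (!buf) { perror("malloc"); return 1; }
        memset(buf, 'x', size);              /* actually touch every page */

        pid_t pid = fork();                  /* "doubles" RAM usage... */
        if (pid == 0) {
            /* Child only reads, so no pages are ever copied. */
            printf("child sees buf[0] = %c\n", buf[0]);
            _exit(0);
        }
        wait(NULL);
        free(buf);
        return 0;
    }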

Example: malloc
 - are you really going to use all that memory at once?
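
To see this non-conservative behavior directly: on a typical 64-bit Linux box with default overcommit settings, the malloc below usually succeeds even if 64 GB exceeds RAM plus swap, because nothing is really claimed until the pages are touched:

    /* Overcommit in action: malloc() hands out a huge "loan" without
     * checking there is memory to back it.  Pages only become real when
     * written, which is also when the OOM killer can strike. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t size = (size_t)64 * 1024 * 1024 * 1024;  /* 64 GB */
        char *p = malloc(size);
        if (p == NULL) {
            printf("malloc said no\n");             /* conservative answer */
        } else {
            printf("malloc said yes to 64 GB\n");   /* overcommitted */
            p[0] = 'x';  /* only now does the first page actually exist */
            free(p);
        }
        return 0;
    }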

Linux kernel memory allocation is like banking
 - gives out loans at the drop of a hat, takes deposits...
 - assumes everyone won't want their money at once