COMP 3000 2012 Week 11 Notes

CPU SCHEDULING

Kernel operations:
- switch to supervisor-> caused by interrupt
- when we switch back into user, we need to figure out what process to run next. THIS IS SCHEDULING

Which process does the scheduler run next?

Scheduling is HARD. Figuring out fine grained scheduling is still an open problem. Everything we have is a bit "hacky"

CFS (completely fair scheduler) is the linux scheduler

I/O scheduling

FSCK walks the FS checking for errors. Takes long time. Bad idea. Journaled FSes fixes this.

Journaled FSes

Transactions do not happen directly to disk.
We first write to a journal.
Later, the journal will be read and the transactions will write to disk
These transactions happen in a stream (sequentially), which will be much faster than random writes
We're actually writing twice as much (time, transactions, not data) to the disk
we actually write to the rest of the disk in the background, when nothing is relying on I/O

If you're writing a lot to your FS, though, this'll be slow. We're writing twice to disk.
- To solve this issue, convert the FS to a log structured FS.
- the entire file system becomes the FS.

Log structured FS

All data is added sequentially to the disk
the actual log of additions becomes the FS, telling us where to find data on that disk

SSDs use wear leveling. Each memory cell has a dedicated lifespan. We try to lengthen their life by spreading the wear across the SSD.

Top command. Memory Types:

virt- size of virtual address space - number of pages (doesn't typically matter)
RES - how many frames in physical ram are taken up (should be smaller than virt)
SHR - Shared memory between processes (shared frames amongst ALL processes)

[1]
Memory that you need for a certain task to make efficient progress.
if you don't have enough space on the working set, your process will be spending alot of time writing and reading from disk
there are many ways to manage the working set (page replacement)
Optimal WS: The optimal working set is the one where you know ahead of time which pages you'll need to load into memory and those you won't. You're effetiively predicting the future
LRU: least recently used (good for non streaming)
- Updating the last access time for every page would have alot of overhead, so we approximate this instead....
  - We can use the page table entry's metadata to help with this
  - Basic strategy: Clock algorithm. Sets zero to all page entry access metada every second. If we haven't accessed that page, the bit will still be zero.

Page replacement algorithm

- - If a page is not dirty and hasn't been accessed, kick it out
Please Expand
Dirty: If a page has data waiting to be written on disk, it is considered 'dirty'
Accessed: Some measure of the last time it was accessed, if it hasn't been accessed in a while (from the last sweep?) it is considered 'not accessed'

Ubuntu's memory management is optimistic

File caches

write back caching: when writing to a file, a promise is made that we've written the bits to disk, despite not having done.
allows the coder not to worry.

Unified Page Cache please expand

TLB: Cachei of page table entries of working set of current process Address space is page table

Copy on write

how forks are made
after fork, parent and child share memory until memory is accessed
share pages until write
detect by marking all pages read only
informally, it is making a shallow copy. It'll make a "deeper copy" when needed.

Fairness