COMP 3000 2012 Week 11 Notes
CPU Scheduling
- Kernel operations:
  - A switch to supervisor mode is caused by an interrupt (or a trap/system call).
  - When we switch back into user mode, the kernel has to figure out which process to run next. THIS IS SCHEDULING.
- Process (CPU) states:
  - ready (runnable, waiting for the CPU)
  - running
  - loading (being loaded into memory)
  - blocked (waiting for I/O)
- Scheduling takes processes and changes their state (e.g., ready -> running when dispatched, running -> blocked when waiting on I/O, blocked -> ready when the I/O completes)
Which process does the scheduler run next?
- To avoid starvation, use a round-robin algorithm: every ready process eventually gets a turn.
- Preemption: there is a maximum time (a time slice) a process can use the CPU before it is taken away.
- Prioritize processes that have just unblocked: those using limited resources are prioritized.
- Priorities can be set explicitly with nice values and other tweaking (a sketch using nice follows below).
- Scheduling is HARD. Fine-grained scheduling is still an open problem; everything we have is a bit "hacky".
CFS (the Completely Fair Scheduler) is the Linux scheduler.
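A minimal sketch of a process adjusting its own priority with the real nice(2) and getpriority(2) calls; the increment of 5 is arbitrary:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/resource.h>

    int main(void) {
        /* Current nice value: 0 by default; higher means lower priority. */
        int before = getpriority(PRIO_PROCESS, 0);

        /* Lower our own priority by 5 (only root may raise priority). */
        nice(5);

        int after = getpriority(PRIO_PROCESS, 0);
        printf("nice value: %d -> %d\n", before, after);
        return 0;
    }

From the shell, nice -n 5 ./program has the same effect.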
I/O scheduling
- Disk scheduling.
- Elevator algorithm (a sketch follows this list):
  - "Get on the elevator when it passes by, get off when it gets to your floor" -- but you're not controlling the elevator.
  - There are exceptions, though: some processes "can control the elevator".
- To avoid long "queues", we batch I/O requests together (I/O reordering).
- We employ HD caches to help deal with batching and I/O reordering.
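A minimal sketch of the elevator idea (really the LOOK variant, which turns around at the last request rather than the edge of the disk); the cylinder numbers here are invented:

    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b) {
        return *(const int *)a - *(const int *)b;
    }

    int main(void) {
        int pending[] = {98, 183, 37, 122, 14, 124, 65, 67}; /* requested cylinders */
        int n = sizeof pending / sizeof pending[0];
        int head = 53; /* current head position, sweeping upward */

        qsort(pending, n, sizeof pending[0], cmp);

        /* Sweep up: service every request at or above the head, in order... */
        for (int i = 0; i < n; i++)
            if (pending[i] >= head) printf("service cylinder %d\n", pending[i]);

        /* ...then reverse and service the rest on the way back down. */
        for (int i = n - 1; i >= 0; i--)
            if (pending[i] < head) printf("service cylinder %d\n", pending[i]);
        return 0;
    }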
Filesystems -- Journaled (revisited)
FSCK walks the whole FS checking for errors; this takes a long time. Bad idea. Journaled FSes fix this.
Journaled FSes
- Transactions do not happen directly to disk.
- We first write each transaction to a journal.
- Later, the journal is read and the transactions are applied to the disk proper (see the write-ahead sketch after this list).
- These journal writes happen in a stream (sequentially), which is much faster than random writes.
- We're actually writing twice as often (twice the transactions, not twice the data).
- The writes to the rest of the disk happen in the background, when nothing is relying on I/O.
- If you're writing a lot to your FS, though, this will be slow: we're writing everything twice.
- To solve this issue, convert the FS to a log-structured FS,
- where the log itself becomes the entire file system.
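A minimal sketch of the write-ahead ordering described above, using plain POSIX calls. The function name and record format are invented, partial writes are unhandled, and a real journal record would also log the offset and length so recovery can replay it:

    #include <fcntl.h>
    #include <unistd.h>

    /* The update is durable once the journal record reaches the disk,
     * even if the in-place write below never happens before a crash. */
    int journaled_write(int journal_fd, int data_fd, off_t off,
                        const void *buf, size_t len) {
        /* 1. Append the intended change to the journal (sequential write). */
        if (write(journal_fd, buf, len) < 0) return -1;

        /* 2. Force the journal record to disk before touching real data. */
        if (fsync(journal_fd) < 0) return -1;

        /* 3. Apply the change in place; this step can be deferred to the
         *    background, which is why the double write is usually hidden. */
        if (pwrite(data_fd, buf, len, off) < 0) return -1;
        return 0;
    }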
Log-structured FS
- All data is appended sequentially to the disk.
- The actual log of additions becomes the FS, telling us where to find data on that disk.
SSDs use wear leveling. Each memory cell has a limited lifespan (a finite number of write/erase cycles). We try to lengthen the drive's life by spreading the wear across the whole SSD, as sketched below.
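A sketch of the core wear-leveling decision; the block count and bookkeeping are invented, standing in for what an SSD's flash translation layer tracks:

    #define NBLOCKS 1024

    /* Erase count per flash block, tracked persistently by the drive. */
    static unsigned erase_count[NBLOCKS];

    /* When a fresh block is needed, pick the least-worn one so that
     * wear spreads evenly instead of burning out a few hot blocks. */
    int pick_block(void) {
        int best = 0;
        for (int i = 1; i < NBLOCKS; i++)
            if (erase_count[i] < erase_count[best])
                best = i;
        erase_count[best]++;   /* this block is about to be erased/reused */
        return best;
    }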
Top
Top command. Memory columns (a sketch reading them straight from /proc follows this list):
- VIRT: size of the virtual address space, in pages (typically doesn't matter much)
- RES: how many frames of physical RAM are taken up (should be smaller than VIRT)
- SHR: shared memory between processes (frames shared amongst ALL processes)
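On Linux these numbers come from the proc filesystem; a minimal sketch reading the first three fields of /proc/self/statm (total, resident, and shared pages), which is roughly where tools like top get them:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        long page = sysconf(_SC_PAGESIZE);   /* bytes per page/frame */
        unsigned long size, resident, shared;

        FILE *f = fopen("/proc/self/statm", "r");
        if (!f) return 1;
        /* First three fields: total pages, resident pages, shared pages. */
        if (fscanf(f, "%lu %lu %lu", &size, &resident, &shared) != 3) {
            fclose(f);
            return 1;
        }
        fclose(f);

        printf("VIRT %lu KiB  RES %lu KiB  SHR %lu KiB\n",
               size * page / 1024, resident * page / 1024,
               shared * page / 1024);
        return 0;
    }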
Working set
- See: http://en.wikipedia.org/wiki/Working_set
- The memory that you need for a certain task to make efficient progress.
- If you don't have enough room for the working set, your process will spend a lot of time writing to and reading from disk (thrashing).
- There are many ways to manage the working set (page replacement):
- Optimal WS: the one where you know ahead of time which pages you'll need to load into memory and which you won't. You're effectively predicting the future.
- LRU: least recently used (good for non-streaming workloads).
  - Updating the last access time for every page would have a lot of overhead, so we approximate it instead:
    - We can use the page table entry's metadata to help with this.
    - Basic strategy: the clock algorithm. Periodically (say, every second) clear the accessed bit in every page table entry's metadata; if we haven't accessed that page since, the bit will still be zero. (A sketch follows this list.)
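A minimal sketch of a clock-style sweep combining the accessed and dirty bits (the enhanced second-chance variant). The frame table here is invented; a real kernel reads these bits out of hardware page-table entries:

    #include <stdbool.h>

    #define NFRAMES 256

    struct frame {
        bool accessed;   /* hardware sets this when the page is touched */
        bool dirty;      /* hardware sets this when the page is written */
    };

    static struct frame frames[NFRAMES];
    static int hand;     /* the clock hand sweeps circularly over frames */

    /* Pick a victim frame, preferring pages that are neither accessed
     * nor dirty. Clearing the accessed bit as the hand passes gives each
     * page a second chance: if it isn't touched again before the hand
     * comes back around, it becomes a candidate for eviction. */
    int clock_evict(void) {
        for (;;) {
            struct frame *f = &frames[hand];
            int here = hand;
            hand = (hand + 1) % NFRAMES;
            if (!f->accessed && !f->dirty)
                return here;           /* clean and cold: cheapest to evict */
            if (f->accessed)
                f->accessed = false;   /* second chance */
            else
                f->dirty = false;      /* stands in for scheduling writeback */
        }
    }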
Misc
Page replacement algorithm
- If a page is not dirty and hasn't been accessed, kick it out (as in the clock sketch above).
- Dirty: if a page has data waiting to be written to disk, it is considered 'dirty'.
- Accessed: some measure of how recently the page was used; if it hasn't been touched since the last sweep, it is considered 'not accessed'.
Ubuntu's (i.e., Linux's) memory management is optimistic: memory is overcommitted on the assumption that processes won't actually touch every page they ask for.
File caches
- Write-back caching: when writing to a file, the kernel promises that the bits have been written to disk, despite not having done so yet (see the sketch after this list).
- This allows the coder not to worry about disk latency.
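The promise is visible in the POSIX API: write() returns as soon as the data is in the kernel's cache, and a program that really needs durability must ask for it explicitly. A minimal sketch (the file name is arbitrary):

    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) return 1;

        /* Returns quickly: the bytes only reached the kernel's cache. */
        write(fd, "hello\n", 6);

        /* Blocks until the cached bytes are actually on the disk. */
        fsync(fd);

        close(fd);
        return 0;
    }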
Unified Page Cache
The kernel keeps a single cache of file pages: memory-mapped files and ordinary read()/write() I/O go through the same cached pages, rather than through separate buffer and page caches.
TLB: a cache of the page table entries for the working set of the current process.
A process's address space is defined by its page table.
Copy on write
- How forks are made cheap.
- After a fork, parent and child share memory rather than copying it.
- Pages stay shared until one of them writes to a page.
- Writes are detected by marking all shared pages read-only: the first write to a page faults, and the kernel copies just that page.
- Informally, it is making a shallow copy; a "deeper copy" is made only when needed (demonstration below).
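A small demonstration of the effect: after the child's write, the two processes see different values because the kernel copied the written page.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int x = 42;   /* lives on a page the child initially shares */

    int main(void) {
        pid_t pid = fork();   /* cheap: pages shared, marked read-only */
        if (pid == 0) {
            x = 99;           /* first write faults; kernel copies the page */
            printf("child:  x = %d\n", x);   /* 99 */
            return 0;
        }
        waitpid(pid, NULL, 0);
        printf("parent: x = %d\n", x);       /* still 42: pages diverged */
        return 0;
    }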
Fairness
- Devoting resources to the programs you want to favour is hard to implement.