COMP 3000 2012 Week 11 Notes
CPU Scheduling
- Kernel operations:
  - A switch to supervisor mode is caused by an interrupt (or a trap/system call).
  - When we switch back into user mode, the kernel has to figure out which process to run next. THIS IS SCHEDULING.
- Process (CPU) states:
  - ready (runnable, waiting for the CPU)
  - running
  - loading (being loaded into memory)
  - blocked (waiting for I/O)
- Scheduling takes processes and changes their state (e.g., ready -> running when dispatched, running -> blocked when waiting on I/O, blocked -> ready when the I/O completes)
Which process does the scheduler run next?
- To avoid starvation, use a round-robin algorithm: every ready process eventually gets a turn.
- Preemption: there is a maximum time (a time slice) a process can use the CPU before it is taken away.
- Prioritize processes that have just unblocked: those using limited resources are prioritized.
- Priorities can be set explicitly with nice values and other tweaking (a sketch using nice follows below).
- Scheduling is HARD. Fine-grained scheduling is still an open problem; everything we have is a bit "hacky".
CFS (the Completely Fair Scheduler) is the Linux scheduler.
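A minimal sketch of a process adjusting its own priority with the real nice(2) and getpriority(2) calls; the increment of 5 is arbitrary:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/resource.h>

    int main(void) {
        /* Current nice value: 0 by default; higher means lower priority. */
        int before = getpriority(PRIO_PROCESS, 0);

        /* Lower our own priority by 5 (only root may raise priority). */
        nice(5);

        int after = getpriority(PRIO_PROCESS, 0);
        printf("nice value: %d -> %d\n", before, after);
        return 0;
    }

From the shell, nice -n 5 ./program has the same effect.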
I/O scheduling
- Disk scheduling.
- Elevator algorithm (a sketch follows this list):
  - "Get on the elevator when it passes by, get off when it gets to your floor" -- but you're not controlling the elevator.
  - There are exceptions, though: some processes "can control the elevator".
- To avoid long "queues", we batch I/O requests together (I/O reordering).
- We employ HD caches to help deal with batching and I/O reordering.
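A minimal sketch of the elevator idea (really the LOOK variant, which turns around at the last request rather than the edge of the disk); the cylinder numbers here are invented:

    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b) {
        return *(const int *)a - *(const int *)b;
    }

    int main(void) {
        int pending[] = {98, 183, 37, 122, 14, 124, 65, 67}; /* requested cylinders */
        int n = sizeof pending / sizeof pending[0];
        int head = 53; /* current head position, sweeping upward */

        qsort(pending, n, sizeof pending[0], cmp);

        /* Sweep up: service every request at or above the head, in order... */
        for (int i = 0; i < n; i++)
            if (pending[i] >= head) printf("service cylinder %d\n", pending[i]);

        /* ...then reverse and service the rest on the way back down. */
        for (int i = n - 1; i >= 0; i--)
            if (pending[i] < head) printf("service cylinder %d\n", pending[i]);
        return 0;
    }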
Filesystems -- Journaled (revisited)
FSCK walks the whole FS checking for errors; this takes a long time. Bad idea. Journaled FSes fix this.
Journaled FSes
- Transactions do not happen directly to disk.
- We first write each transaction to a journal.
- Later, the journal is read and the transactions are applied to the disk proper (see the write-ahead sketch after this list).
- These journal writes happen in a stream (sequentially), which is much faster than random writes.
- We're actually writing twice as often (twice the transactions, not twice the data).
- The writes to the rest of the disk happen in the background, when nothing is relying on I/O.
- If you're writing a lot to your FS, though, this will be slow: we're writing everything twice.
- To solve this issue, convert the FS to a log-structured FS,
- where the log itself becomes the entire file system.
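A minimal sketch of the write-ahead ordering described above, using plain POSIX calls. The function name and record format are invented, partial writes are unhandled, and a real journal record would also log the offset and length so recovery can replay it:

    #include <fcntl.h>
    #include <unistd.h>

    /* The update is durable once the journal record reaches the disk,
     * even if the in-place write below never happens before a crash. */
    int journaled_write(int journal_fd, int data_fd, off_t off,
                        const void *buf, size_t len) {
        /* 1. Append the intended change to the journal (sequential write). */
        if (write(journal_fd, buf, len) < 0) return -1;

        /* 2. Force the journal record to disk before touching real data. */
        if (fsync(journal_fd) < 0) return -1;

        /* 3. Apply the change in place; this step can be deferred to the
         *    background, which is why the double write is usually hidden. */
        if (pwrite(data_fd, buf, len, off) < 0) return -1;
        return 0;
    }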
Log-structured FS
- All data is appended sequentially to the disk.
- The actual log of additions becomes the FS, telling us where to find data on that disk.
SSDs use wear leveling. Each memory cell has a limited lifespan (a finite number of write/erase cycles). We try to lengthen the drive's life by spreading the wear across the whole SSD, as sketched below.
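A sketch of the core wear-leveling decision; the block count and bookkeeping are invented, standing in for what an SSD's flash translation layer tracks:

    #define NBLOCKS 1024

    /* Erase count per flash block, tracked persistently by the drive. */
    static unsigned erase_count[NBLOCKS];

    /* When a fresh block is needed, pick the least-worn one so that
     * wear spreads evenly instead of burning out a few hot blocks. */
    int pick_block(void) {
        int best = 0;
        for (int i = 1; i < NBLOCKS; i++)
            if (erase_count[i] < erase_count[best])
                best = i;
        erase_count[best]++;   /* this block is about to be erased/reused */
        return best;
    }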
Top
Top command. Memory columns (a sketch reading them straight from /proc follows this list):
- VIRT: size of the virtual address space, in pages (typically doesn't matter much)
- RES: how many frames of physical RAM are taken up (should be smaller than VIRT)
- SHR: shared memory between processes (frames shared amongst ALL processes)
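On Linux these numbers come from the proc filesystem; a minimal sketch reading the first three fields of /proc/self/statm (total, resident, and shared pages), which is roughly where tools like top get them:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        long page = sysconf(_SC_PAGESIZE);   /* bytes per page/frame */
        unsigned long size, resident, shared;

        FILE *f = fopen("/proc/self/statm", "r");
        if (!f) return 1;
        /* First three fields: total pages, resident pages, shared pages. */
        if (fscanf(f, "%lu %lu %lu", &size, &resident, &shared) != 3) {
            fclose(f);
            return 1;
        }
        fclose(f);

        printf("VIRT %lu KiB  RES %lu KiB  SHR %lu KiB\n",
               size * page / 1024, resident * page / 1024,
               shared * page / 1024);
        return 0;
    }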
Working set
- See: http://en.wikipedia.org/wiki/Working_set
- The memory that you need for a certain task to make efficient progress.
- If you don't have enough room for the working set, your process will spend a lot of time writing to and reading from disk (thrashing).
- There are many ways to manage the working set (page replacement):
- Optimal WS: the one where you know ahead of time which pages you'll need to load into memory and which you won't. You're effectively predicting the future.
- LRU: least recently used (good for non-streaming workloads).
  - Updating the last access time for every page would have a lot of overhead, so we approximate it instead:
    - We can use the page table entry's metadata to help with this.
    - Basic strategy: the clock algorithm. Periodically (say, every second) clear the accessed bit in every page table entry's metadata; if we haven't accessed that page since, the bit will still be zero. (A sketch follows this list.)
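A minimal sketch of a clock-style sweep combining the accessed and dirty bits (the enhanced second-chance variant). The frame table here is invented; a real kernel reads these bits out of hardware page-table entries:

    #include <stdbool.h>

    #define NFRAMES 256

    struct frame {
        bool accessed;   /* hardware sets this when the page is touched */
        bool dirty;      /* hardware sets this when the page is written */
    };

    static struct frame frames[NFRAMES];
    static int hand;     /* the clock hand sweeps circularly over frames */

    /* Pick a victim frame, preferring pages that are neither accessed
     * nor dirty. Clearing the accessed bit as the hand passes gives each
     * page a second chance: if it isn't touched again before the hand
     * comes back around, it becomes a candidate for eviction. */
    int clock_evict(void) {
        for (;;) {
            struct frame *f = &frames[hand];
            int here = hand;
            hand = (hand + 1) % NFRAMES;
            if (!f->accessed && !f->dirty)
                return here;           /* clean and cold: cheapest to evict */
            if (f->accessed)
                f->accessed = false;   /* second chance */
            else
                f->dirty = false;      /* stands in for scheduling writeback */
        }
    }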
Misc
Page replacement algorithm
- If a page is not dirty and hasn't been accessed, kick it out (as in the clock sketch above).
- Dirty: if a page has data waiting to be written to disk, it is considered 'dirty'.
- Accessed: some measure of how recently the page was used; if it hasn't been touched since the last sweep, it is considered 'not accessed'.
Ubuntu's (i.e., Linux's) memory management is optimistic: memory is overcommitted on the assumption that processes won't actually touch every page they ask for.
File caches
- Write-back caching: when writing to a file, the kernel promises that the bits have been written to disk, despite not having done so yet (see the sketch after this list).
- This allows the coder not to worry about disk latency.
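The promise is visible in the POSIX API: write() returns as soon as the data is in the kernel's cache, and a program that really needs durability must ask for it explicitly. A minimal sketch (the file name is arbitrary):

    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) return 1;

        /* Returns quickly: the bytes only reached the kernel's cache. */
        write(fd, "hello\n", 6);

        /* Blocks until the cached bytes are actually on the disk. */
        fsync(fd);

        close(fd);
        return 0;
    }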
Unified Page Cache
The kernel keeps a single cache of file pages: memory-mapped files and ordinary read()/write() I/O go through the same cached pages, rather than through separate buffer and page caches.
TLB: a cache of the page table entries for the working set of the current process.
A process's address space is defined by its page table.
Copy on write
- How forks are made cheap.
- After a fork, parent and child share memory rather than copying it.
- Pages stay shared until one of them writes to a page.
- Writes are detected by marking all shared pages read-only: the first write to a page faults, and the kernel copies just that page.
- Informally, it is making a shallow copy; a "deeper copy" is made only when needed (demonstration below).
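A small demonstration of the effect: after the child's write, the two processes see different values because the kernel copied the written page.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int x = 42;   /* lives on a page the child initially shares */

    int main(void) {
        pid_t pid = fork();   /* cheap: pages shared, marked read-only */
        if (pid == 0) {
            x = 99;           /* first write faults; kernel copies the page */
            printf("child:  x = %d\n", x);   /* 99 */
            return 0;
        }
        waitpid(pid, NULL, 0);
        printf("parent: x = %d\n", x);       /* still 42: pages diverged */
        return 0;
    }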
Fairness
- Devoting resources to the programs you want to favour is hard to implement.