Operating Systems 2019W Lecture 14
Video
The lecture given on March 6, 2019, is now available.
Notes
Lecture 14
----------

Virtual memory!

Every process has its own virtual address space.

On every memory access, each virtual address is translated into a
physical address. The kernel must keep track of these per-process
virtual->physical address mappings.

Theoretically, you could have a table mapping each virtual address to
each physical address
 - but mappings on a per-byte basis would be *way* too inefficient

Instead, process memory is divided into fixed-sized pages
 - normally 4K
 - but sometimes also really large (2M+) for special purposes

Before pages, there were segments.

Segments have a base address and a bound (size)
 - variable length
 - typically had a semantic purpose (code, data, stack, etc.)

Segment terminology is still used when discussing parts of an
executable (e.g., parts of an ELF file).

You could relocate segments
 - all memory accesses could be relative to a "base" register
   (see the base+bound sketch after these notes)

But the world moved to a "flat" memory model (i.e., no segments)
 - segments can be confusing when they overlap
 - but the real problem is external fragmentation

Internal fragmentation
 - space lost when allocating using fixed-sized chunks
   (e.g., 4K at a time)

External fragmentation
 - space divided into discontiguous chunks
 - cannot make larger contiguous allocations
 - happens when using variable-sized memory allocations

XXX....XXXX...XXXX
 - 7 units are available, but 4 is the largest contiguous piece
 - what if an allocation for 6 comes in?
 - the only way would be to compact memory: move things around until
   you got a large enough contiguous block

Virtual memory is a solution to the external fragmentation problem
 - virtual addresses can be contiguous even when physical addresses aren't

To make virtual memory work, we need a mapping of virtual to physical
addresses at a page-level resolution: 4K => 4K

********
Sidebar: the memory hierarchy
 - fastest: small & volatile
 - slowest: large & persistent

PROGRAMMER/COMPILER/HARDWARE MANAGED
    CPU registers
HARDWARE MANAGED
    TLB
    CPU cache (L1)   <- smallest, fastest
    CPU cache (L2)
    CPU cache (L3)   <- often shared between cores
OS MANAGED - virtual memory
    DRAM
    --------
    XPoint?
OS MANAGED - filesystems
    SSD/flash memory
    spinning hard drives
APP MANAGED
    tapes
****************

4K page (virtual memory) -> 4K frame (physical memory)

This mapping will be used by the CPU, but managed by the OS.

*Page tables* do this mapping
 - but it is more a (very wide) tree, not a table

851F1 521   <- 32-bit virtual address, written in hex
^^^^^ ^^^
page# page offset

I just need to translate the page # to a frame #, then use the
frame # plus the page offset to get the physical address
(sketched in code after these notes)
 - upper 20 bits: page number
 - lower 12 bits: page offset

So we need a way to translate 20 bits (page #) to 20 bits (frame #).

Could just use an array with 2^20 entries. But most processes only
need a small fraction of 2^20 entries
 - so we want a sparse data structure

Remember we want to do all memory allocation in 4K chunks
 - even for the page table!

How many mappings can I store in 4K?
 - I can store 1024 (1K) 32-bit entries in 4K
   (4096 bytes / 4 bytes per entry = 1024 = 2^10)

1st-level page table: 1024 entries pointing to 2nd-level page tables
Each 2nd-level page table: 1024 entries for pages (PTEs)
 * We can have up to 1024 2nd-level page tables, giving us 2^20
   entries (see the page-table walk sketch below)

But we're missing something - how many memory accesses do we need to
resolve one address?!
 - with a two-level table, two extra reads (one per level) for every
   memory access

TLB: "translation lookaside buffer"
 - caches virtual->physical mappings (PTEs), so most translations
   skip the page-table walk

Page table entries have frame #'s AND metadata
 - valid?
 - modified? (dirty)
 - accessed? (recently)   <- NOT a time stamp
 - permission bits: rwx
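A minimal sketch of segment-style (base + bound) translation, under
the simplified assumption of a single segment per access; segment_t
and seg_translate are made-up names for illustration:

  #include <stdint.h>

  /* Segment-style translation: every access is relative to a base
     register and checked against a bound (illustrative, not any
     particular architecture). */
  typedef struct {
      uint32_t base;   /* where the segment starts in physical memory */
      uint32_t bound;  /* size of the segment */
  } segment_t;

  /* Returns 1 and fills *paddr on success, 0 on a violation. */
  int seg_translate(const segment_t *seg, uint32_t vaddr, uint32_t *paddr)
  {
      if (vaddr >= seg->bound)
          return 0;               /* out of bounds: fault */
      *paddr = seg->base + vaddr; /* relocation: just add the base */
      return 1;
  }

Relocating the segment only means changing base; no addresses inside
the segment have to be rewritten.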
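A sketch of the page #/offset split for the 0x851F1521 address above,
assuming 4K pages on a 32-bit machine; the frame number used at the
end is made up for illustration:

  #include <stdio.h>
  #include <stdint.h>

  /* Split a 32-bit virtual address into page number and page offset.
     With 4K pages: upper 20 bits = page #, lower 12 bits = offset. */
  int main(void)
  {
      uint32_t vaddr  = 0x851F1521;    /* the address from the notes */
      uint32_t page   = vaddr >> 12;   /* upper 20 bits: 0x851F1 */
      uint32_t offset = vaddr & 0xFFF; /* lower 12 bits: 0x521   */

      printf("page # = 0x%05X, offset = 0x%03X\n", page, offset);

      /* Translation replaces the page # with a frame #; the offset
         passes through unchanged. Frame 0x00042 is hypothetical. */
      uint32_t frame = 0x00042;
      uint32_t paddr = (frame << 12) | offset;
      printf("physical address = 0x%08X\n", paddr);
      return 0;
  }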
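A sketch of the two-level page-table walk. The index widths
(10/10/12) and flag-bit positions follow 32-bit x86, but the 1st
level is represented with plain C pointers rather than frame numbers
to keep it short, so treat it as illustrative rather than real
hardware behaviour:

  #include <stdint.h>

  /* PTE metadata bits (positions as on 32-bit x86). The upper
     20 bits of a valid PTE hold the frame #. */
  #define PTE_VALID    0x001  /* mapping present?  */
  #define PTE_WRITE    0x002  /* writable?         */
  #define PTE_USER     0x004  /* user-accessible?  */
  #define PTE_ACCESSED 0x020  /* touched recently (one bit, NOT a time stamp) */
  #define PTE_DIRTY    0x040  /* modified since load? */

  typedef uint32_t pte_t;

  /* Walk a two-level page table: a 1024-entry 1st level (directory),
     each entry pointing at a 1024-entry 2nd-level table of PTEs.
     Note the walk costs two memory reads before the actual data
     access - hence the TLB. Returns 0 on a fault. */
  int translate(pte_t *dir[1024], uint32_t vaddr, uint32_t *paddr)
  {
      uint32_t top = (vaddr >> 22) & 0x3FF; /* bits 31-22: 1st-level index */
      uint32_t mid = (vaddr >> 12) & 0x3FF; /* bits 21-12: 2nd-level index */
      uint32_t off = vaddr & 0xFFF;         /* bits 11-0:  page offset     */

      pte_t *table = dir[top];              /* memory access #1 */
      if (table == 0)
          return 0;                         /* no 2nd-level table: fault */

      pte_t pte = table[mid];               /* memory access #2 */
      if (!(pte & PTE_VALID))
          return 0;                         /* not mapped: fault */

      *paddr = (pte & 0xFFFFF000) | off;    /* frame # + offset */
      return 1;
  }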
Does the CPU or OS manage TLB entries?
 - depends on the architecture
 - most nowadays have the CPU walk the page table

How does mmap work in this? Lazy allocation:
 - pages are allocated on demand
 - disk is read on demand
(see the mmap sketch below)

Lazy allocation allows for aggressive file caching and memory
overcommitment
 - improving performance in general, but it can lead to bad situations
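A small usage sketch of lazy allocation via mmap: the mmap call
itself just records the mapping in the page tables; file data is read
page by page as the loop first touches each page. Error handling is
minimal:

  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <sys/stat.h>

  int main(int argc, char *argv[])
  {
      if (argc != 2) {
          fprintf(stderr, "usage: %s <file>\n", argv[0]);
          return 1;
      }

      int fd = open(argv[1], O_RDONLY);
      if (fd < 0) { perror("open"); return 1; }

      struct stat sb;
      if (fstat(fd, &sb) < 0) { perror("fstat"); return 1; }

      /* No file data is read here - the kernel just records the
         mapping. */
      char *p = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
      if (p == MAP_FAILED) { perror("mmap"); return 1; }

      /* Each 4K page is faulted in (and the file read) only when
         first touched by this loop. */
      long sum = 0;
      for (off_t i = 0; i < sb.st_size; i += 4096)
          sum += p[i];

      printf("touched %lld bytes, checksum %ld\n",
             (long long)sb.st_size, sum);

      munmap(p, sb.st_size);
      close(fd);
      return 0;
  }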