Operating Systems 2020W Lecture 16
Video
Video from the lecture given on March 11, 2020 is now available.
Notes
Lecture 16 ---------- Virtual memory - virtual memory vs physical memory - page tables - page table entries (PTEs) - TLB - mmap and virtual memory - swapping & paging What is virtual memory? - abstraction over physical memory that allows for memory to be shared with running programs not having to do anything special - each program sees its own memory map To implement virtual memory, have to map between virtual addresses and physical addresses - processes use virtual addresses - CPU maps virtual addresses to physical addresess to find actual memory (to read or write) In other words, when, in assembly/machine code, there is an address in a register or an instruction refers to an address, the CPU must translate this virtual address to a physical address before going to RAM and getting the value (or changing it) So this translation should seem really really slow, and it is...if you did it purely in software. But there is hardware to help. But how do we store this mapping between virtual and physical addresses? - every process has a "page table" associated with it. - this is the data structure that defines the mapping between virtual and physical memory Basic question: what units does the kernel use to manage memory? - on disk, kernel uses blocks (1k or 4k, fixed size) - in RAM, memory is addressable in bytes, so shouldn't we allocate it in bytes? NO - RAM is allocated in fixed-sized units, typically 4k or 8K - known as pages, NOT blocks - often times, the page size and the block size of a system are equal (both 4K), but this can vary We use fixed-sized units of allocation to avoid memory fragmentation - keep them small to minimize internal fragmentation (space lost to small allocations, e.g. need one byte but allocate 4k) - fixed sized prevents external allocation (any memory allocation can be accomplished if there are a sufficient number of pages/blocks) Physical RAM ************ Program 1 ... ... ------------ Program 2 ... ... ... ------------ Free Space ... ... <--- put Program 3 here ... ------------ Kernel ... ... ... ************ Want to load program 3, could put it in free space BUT normally, memory map is much messier Physical RAM ************ Program 1 ... ... ------------ Free Space ... ------------ Program 2 ... ... ... ------------ Free Space ... <--- put Program 3 here, but not enough room! ------------ Kernel ... ... ... ************ Fragmentation also takes place in filesystems - so we try to allocate "extents" which are ranges of contiguous blocks, faster to access But in RAM fragmentation is much more painful - want to assemble a 1M array out of scattered 4k pieces? So really virtual memory is an abstraction over RAM much like a file is an abstraction over disk blocks The purposes of virtual memory are - give programmers the illusion that their program occupies a contiguous range of bytes in memory - allow programs to be bigger than physical memory (Look up segmentation in the textbook to see another scheme for abstrating memory) The limits to physical memory is determined by cost, space, and power constraints (4G versus 16G versus 256M versus 16K) Virtual memory is limited by the size of pointers/addresses for the CPU - 32 bit processors normally have 32 bit addresses (max 4G) - 64 bit processors normally have 64 bit addresses - 8 bit processors normally had 16 bit adresses (max 64k) 2^64 is a very big number, don't expect to ever see a system with that much RAM! - so on 64 bit systems, virtual addresses are 64 bit but physical addresses are smaller (e.g. 48 bits) Address spaces is all about limits of naming - need a unique address for every byte in memory - there are ways around this, but they are very painful (e.g, bank switching) Virtual memory is made up - process sees that memory, but it doesn't all have to be allocated - e.g., process is using 64-bit pointers, so can address 2^64 bytes of memory...but if it accesses more than the kernel has allocated it (e.g., 1M), access to unallocated memory ranges results in segfaults A segfault happens when a process accesses a virtual address that has no corresponding physical address So what is a page table? - data structure for mapping virtual->physical addresses - does not have mappings on an address-by-address basis, that would be crazy inefficient - instead, mappings are virtual page->physical page - can copy page offset from virtual to physical address Page tables are basically arrays stored in a memory-effient way - use virtual page number as index into array - value at that index is the physical page number But if you made this a real array, it would be crazy inefficient - would need to be huge, e.g. 2^52 entries for a 64-bit address space with 4k pages (4k = 2^12) - so we need something like a sparse array Is data in swap persistent? - it isn't supposed to be, is deallocated when procesess terminate - but...data isn't normally overwritten on disk - solution: look up encrypted swap Why not make a page table a hash table? - want something that can be implemented quickly in hardware - calculating values on each memory access would be slower generally How do we do this "sparse array"? Big idea: split virtual address up into portions that each becomes an index into a page - need to make page tables out of pages! e.g., divide a 32 bit address as follows: ----------- ---------- ------------ 10 bits 10 bits 12 bits = 32 bits total lev 1 lev 2 offset level 1 = offset into top page of page table level 2 = offset into level two of page table offset = offset into data page 2^10 = 1024, thus allows for 1024 4-byte values in 4k 4 bytes = 32 bits = enough to store a pointer/address value (actually stores a page table entry as we'll discuss below) (These offsets are a lot like how we find locations of inodes in ext4) (Computer scientists have a limited number of ideas) So a page table is just a bunch of pages which define virtual to physical mappings. When you get a virtual address (on 32 biut) - use first 10 bits to find PTE in level 1 of the page table - use found PTE to find location of level 2 of page table page - use next 10 bits to find PTE in level 2 page table page - use found PTE to find location of desired data page - use last 12 bits to index into desired data page - this is the memory you want! PTE = page table entry page table entry is - a physical page number (for 32 bit sytems, 20 bits) - metadata about the page If you had 4G of RAM, that is 2^32 2^32 = 2^20 * 2^12 num size pages of page (4k) For 64-bit systems it is the same, but just gets messier (because virtual addresses are larger than physical addresses) can have 3 or 4 levels of pages We split page tables into levels because we want entries for a level to fit into one page There is 1 level 1 page, for 32 bit systems, per process up to 2^20 level 2 pages, for 32-bit systems, per process (but most procesess will only allocate some of these unless it uses 4G of RAM on its own) Remember every process needs its own page table - start with one level 1 page and 1 level 2 page - as process uses more RAM, add level 2 pages Why "only" 64 bits? This is 4G^4G 2^64 = (2^32)^32 It is very big I haven't talked about performance of virtual memory - note that each level of the page table is a memory access - so to look up one virtual address, we have to look at - the level 1 page table page - a level 2 page table page - the data page - 3 physical memory accesses for every virtual memory access - NOT A GOOD DEAL Solution: have a very fast cache of virtual->physical memory mappings - this cache is called the TLB - TLB=translation lookaside buffer, nobody knows why this name is used nowadays (ok, go to the IBM design literature and you can see the rationale but it makes no sense today) - just remember the TLB is a cache of recently used PTEs (page table entries) CPU looks in the TLB for a virtual addresss, - if it finds it, we're done, and we have lost almost no time - if we don't find it in the TLB, go look at the page table - lost time, but access will succeed - if we don't find it in the page table, tell the OS - because the process tried to access an invalid range of its virtual address space On Intel and AMD systems, you cannot directly control the TLB entries, CPU does its own thing (it looks at the page table directly) But on other processers (I believe SPARC), the hardware doesn't look at page tables, the OS does. So the kernel can directly control TLB entries What is the metadata associated with each PTE? - there's the physical page number (that's the data) - read, write, execute bits (like a file, but for each page) - valid bit - dirty bit - accessed bit - others? We have per-page rwx bits so the CPU can alert the kernel when a process does an improper memory access - write to read-only memory - execute to no-execute memory Big help for security! Remember the kernel isn't running most of the time - it is only called in when required - most of the time the CPU is running userspace processes - so need to make sure kernel is woken up when key decisions have to be made Valid bit is used to say whether the PTE refers to a real physical page - you can have a physical page number of zero! - so need another bit to say whether it is valid or not - needed for when that particular page isn't allocated, OR if it refers to data that is on disk (and so needs to be read in) (His name is Roshi, he's a boxer) Dirty bit says whether the page has been modified since allocated or read from disk - clean pages can be tossed out of RAM without consequence, can be read from disk later - dirty pages have to be written to disk ("cleaned"), until then the MUST stay in RAM Kernel will try to kick out clean pages when RAM is needed - changing dirty to clean pages happens, but is a lower priority unless RAM is getting scarce Accessed bit says whether the page has been recently accessed - 1 bit to represent time - not efficient to store a whole timestamp - but how is 1 bit enough? clock algorithms (next time) The kernel has its own page table for its code and data - thus it mostly uses virtual addresses itself - no fun to program with fragmented memory! - but it deals with physical addresses when necessary (userspace can't see them) What if your system doesn't have swap? - you either tell processes to free memory or you kill them - this is what happens on Android, iOS (no swap), they kill procesess and restart them all the time - not completely sure why they don't have swap...but it is probably primarily for power consumption reasons (which is mostly how phones get weird) Pages appear sequential in virtual memory but are fragmented in physical memory - just like how data in files is sequential but blocks of a file can be scattered across a disk Next week, tutorial will show you virtual versus physical addresses using a kernel module and special userspace program (developed again by William!)