Operating Systems 2019W Lecture 14

From Soma-notes


The lecture given on March 6, 2019 is now available.


Lecture 14
Virtual memory!

every process has its own virtual address space

on every memory access, each virtual address is translated into a physical address

The kernel must keep track of these per-process virtual->physical address mappings

Theoretically, you could have a table mapping each virtual address to each physical address

Mappings on a per-byte basis would be *way* too inefficient

process memory is divided into fixed-sized pages
 - normally 4K
 - but sometimes also really large (2M+) for special purposes

Before pages, there were segments

Segments have a base address and a bound (size)
 - variable length
 - typically had a semantic purpose (code, data, stack, etc)

Segment terminology still used when discussing parts of an executable
 (e.g. parts of an ELF file)

You could relocate segments
 - all memory accesses could be relative to a "base" register

But the world moved to a "flat" memory model (i.e. no segments)
 - segments can be confusing when they overlap
 - but the real problem is external fragmentation

internal fragmentation
 - space lost when allocating using fixed-sized chunks (e.g., 4K at a time)

external fragmentation
 - space divided into discontiguous chunks
 - cannot make larger contiguous allocations
 - happens when using variable-sized memory allocations


Example: 7 units of free memory are available,
but the largest contiguous piece is only 4 units

what if an allocation for 6 comes in?

Only way would be to compact memory - move things around until you got
a large enough contiguous block

Virtual memory is a solution for the external fragmentation problem
 - virtual addresses can be contiguous even when physical addresses aren't

To make virtual memory work, we need a mapping of virtual to physical
addresses at a page-level resolution

 4K page => 4K frame


Sidebar: the memory hierarchy
 - fastest: small & volatile
 - slowest: large & persistent

 CPU registers

 CPU cache (L1) <- smallest, fastest
 CPU cache (L2)
 CPU cache (L3) <- often shared between cores

OS MANAGED - Virtual memory

OS MANAGED - filesystems
 SSD/flash memory
 Spinning Hard drives



 4K Page (virtual memory) -> 4K frame (physical memory)

This mapping will be used by the CPU, but managed by the OS

*Page tables* do this mapping
 - but it is more a (very wide) tree, not a table

851F1 521   <- 32-bit virtual address (hex)
  ^    ^
page#  page offset

I just need to translate the page # to a frame #

use the frame # plus the page offset to get the physical address

upper 20 bits: page number
lower 12 bits: page offset

need a way to translate 20 bits (page #) to 20 bits (frame #)

Could just use an array with 2^20 entries

But most processes only need a small fraction of 2^20 entries
 - so we want a sparse data structure

Remember we want to do all memory allocation in 4K chunks - even for the page table!

How many mappings can I store in 4K?

I can store 1024 (1K) 32-bit entries in 4K

1024 = 2^10

1st-level page table (the page directory): 1024 entries, each pointing to a 2nd-level page table

each 2nd-level page table: 1024 entries (PTEs), each mapping a page to a frame

* We can have up to 1024 2nd level page tables, giving us 2^20 entries

But we're missing something
 - how many memory accesses do we need to resolve one address?!
 - with a 2-level table, every load or store needs 2 extra accesses just to walk the table

TLB: "translation lookaside buffer"
 - caches virtual->physical mappings (PTEs)

Page table entries have frame #'s AND metadata
 - valid?
 - modified? (dirty)
 - accessed? (recently)  <--- NOT a time stamp
 - permission bits: rwx

Does the CPU or OS manage TLB entries?
 - depends on the architecture
 - most nowadays have the CPU walk the page table

How does mmap work in this?

Lazy allocation
 - pages are allocated on demand
 - disk is read on demand

Lazy allocation allows for aggressive file caching and memory overcommitment
 - improves performance in general, but can lead to bad situations
   (e.g., the kernel killing processes when overcommitted memory is actually used)