Operating Systems 2019W Lecture 14 - Revision history

Soma: Created page with "==Video== The lecture given on March 6, 2019 [https://homeostasis.scs.carleton.ca/~soma/os-2019w/lectures/comp3000-2019w-lec14-20190306.m4v is now available]. ==Notes==
2019-03-06T22:29:05Z

==Video==

The lecture given on March 6, 2019 [https://homeostasis.scs.carleton.ca/~soma/os-2019w/lectures/comp3000-2019w-lec14-20190306.m4v is now available].

==Notes==

<pre>
Lecture 14
----------
Virtual memory!

every process has its own virtual address space

on every memory access, each virtual address is translated into a physical address

The kernel must keep track of these per-process virtual->physical address mappings

Theoretically, you could have a table mapping each virtual address to each physical address

Mappings on a per-byte basis would be way too inefficient

process memory is divided into fixed-sized pages
- normally 4K
- but sometimes also really large (2M+) for special purposes
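
A rough back-of-the-envelope check of why per-byte mappings are too
inefficient. This is only a sketch: it assumes a 32-bit address space,
4-byte table entries, and a simple flat table; the names are made up
for illustration.

```python
ADDRESS_BITS = 32      # assumed 32-bit virtual address space
ENTRY_BYTES = 4        # assumed size of one mapping entry
PAGE_SIZE = 4096       # 4K pages


def flat_table_bytes(granularity):
    """Size of a flat table with one entry per `granularity` bytes."""
    entries = 2**ADDRESS_BITS // granularity
    return entries * ENTRY_BYTES


per_byte = flat_table_bytes(1)          # one entry per byte
per_page = flat_table_bytes(PAGE_SIZE)  # one entry per 4K page

print(per_byte // 2**30, "GiB of table per process with per-byte mappings")
print(per_page // 2**20, "MiB of table per process with per-page mappings")
```

Under these assumptions the per-byte table is four times larger than
the address space it describes, while the per-page table is 4096 times
smaller than the per-byte one.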

Before pages, there were segments

Segments have a base address and a bound (size)
- variable length
- typically had a semantic purpose (code, data, stack, etc)

Segment terminology still used when discussing parts of an executable
(e.g. parts of an ELF file)

You could relocate segments
- all memory accesses could be relative to a "base" register
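
A minimal sketch of base-and-bound translation, assuming the hardware
checks every offset against the bound before adding the base. The
SegmentFault name and the example addresses are hypothetical.

```python
class SegmentFault(Exception):
    """Raised when an offset falls outside the segment's bound."""


def translate(base, bound, offset):
    """Turn a segment-relative offset into a physical address."""
    if not 0 <= offset < bound:
        raise SegmentFault(f"offset {offset:#x} outside bound {bound:#x}")
    return base + offset


# Hypothetical code segment loaded at 0x4000 with a size of 0x1000 bytes
print(hex(translate(0x4000, 0x1000, 0x10)))   # 0x4010
# Relocating the segment only means changing the base
print(hex(translate(0x9000, 0x1000, 0x10)))   # 0x9010
```

The program itself only ever sees the offset, which is why the segment
can be moved without changing the code.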

But the world moved to a "flat" memory model (i.e. no segments)
- segments can be confusing when they overlap
- but the real problem is external fragmentation

internal fragmentation
- space lost when allocating using fixed-sized chunks (e.g., 4K at a time)
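
A small worked example of internal fragmentation, assuming requests
are always rounded up to whole 4K pages (the function name is made up
for illustration):

```python
PAGE_SIZE = 4096  # 4K chunks


def internal_fragmentation(request_bytes):
    """Bytes wasted when a request is rounded up to whole pages."""
    pages = -(-request_bytes // PAGE_SIZE)   # ceiling division
    return pages * PAGE_SIZE - request_bytes


print(internal_fragmentation(5000))   # 3192: two pages allocated
print(internal_fragmentation(4096))   # 0: exact fit
```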

external fragmentation
- space divided into discontiguous chunks
- cannot make larger contiguous allocations
- happens when using variable-sized memory allocations

XXX....XXXX...XXXX

7 units available
4 is the largest contiguous piece

what if an allocation for 6 comes in?

Only way would be to compact memory - move things around until you got
a large enough contiguous block
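
The example above can be checked with a short sketch. It treats the
memory map as a string ('X' used, '.' free); the helper names are
hypothetical, and the compaction here ignores the cost of actually
copying data.

```python
def free_runs(memory):
    """Lengths of each contiguous run of free ('.') units."""
    return [len(run) for run in memory.replace("X", " ").split()]


def compact(memory):
    """Slide all used ('X') units to the front, leaving one free run."""
    used = memory.count("X")
    return "X" * used + "." * (len(memory) - used)


memory = "XXX....XXXX...XXXX"
runs = free_runs(memory)
print(sum(runs), max(runs))                   # 7 4
print(max(runs) >= 6)                         # False: a request for 6 fails
print(max(free_runs(compact(memory))) >= 6)   # True after compaction
```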

Virtual memory is a solution for the external fragmentation problem
- virtual addresses can be contiguous even when physical addresses aren't

To make virtual memory work, we need a mapping of virtual to physical
addresses at a page-level resolution

4K => 4K

Sidebar: the memory hierarchy
- fastest: small & volatile
- slowest: large & persistent

PROGRAMMER/COMPILER/HARDWARE MANAGED
CPU registers

HARDWARE MANAGED
TLB
CPU cache (L1) <- smallest, fastest
CPU cache (L2)
CPU cache (L3) <- often shared between cores

OS MANAGED - Virtual memory
DRAM
--------
XPoint?

OS MANAGED - filesystems
SSD/flash memory
Spinning Hard drives

APP MANAGED
Tapes

********

4K Page (virtual memory) -> 4K frame (physical memory)

This mapping will be used by the CPU, but managed by the OS

Page tables do this mapping
- but it is more a (very wide) tree, not a table

851F1 521   <- 32-bit virtual address
^     ^
page# page offset

I just need to translate the page # to a frame #

use the frame # plus the page offset to get the physical address

upper 20 bits: page number
lower 12 bits: page offset
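
The split can be sketched with shifts and masks. The frame number
0x00042 below is a made-up mapping, used only to show how the offset
is carried over unchanged.

```python
PAGE_OFFSET_BITS = 12                      # lower 12 bits: page offset
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1


def split(vaddr):
    """Return (page number, page offset) for a 32-bit virtual address."""
    return vaddr >> PAGE_OFFSET_BITS, vaddr & OFFSET_MASK


def join(frame, offset):
    """Combine a frame number and a page offset into a physical address."""
    return (frame << PAGE_OFFSET_BITS) | offset


page, offset = split(0x851F1521)
print(hex(page), hex(offset))        # 0x851f1 0x521

# Hypothetical mapping: suppose page 0x851F1 lives in frame 0x00042
print(hex(join(0x00042, offset)))    # 0x42521
```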

need a way to translate 20 bits (page #) to 20 bits (frame #)

Could just use an array with 2^20 entries

But most processes only need a small fraction of 2^20 entries
- so we want a sparse data structure

Remember we want to do all memory allocation in 4K chunks - even for the page table!

How many mappings can I store in 4K?

I can store 1024 (1K) 32-bit entries in 4K

1024 = 2^10

1st level page table: 1024 entries for 2nd-level page tables (PTEs)

each 2nd level page table: 1024 entries for pages (PTEs)

* We can have up to 1024 2nd level page tables, giving us 2^20 entries
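
A minimal sketch of the two-level lookup, using Python lists in place
of 4K frames. The helper names and the frame number are hypothetical;
real page table entries also carry metadata bits, which are left out
here.

```python
INDEX_BITS = 10     # 1024 entries per table
OFFSET_BITS = 12    # 4K pages


class PageFault(Exception):
    """Raised when no mapping exists for an address."""


def indices(vaddr):
    """Split an address into (1st-level index, 2nd-level index, offset)."""
    top = vaddr >> (OFFSET_BITS + INDEX_BITS)
    second = (vaddr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    return top, second, offset


def map_page(top_table, vaddr, frame):
    """Record a mapping, creating the 2nd-level table only if needed."""
    top, second, _ = indices(vaddr)
    if top_table[top] is None:
        top_table[top] = [None] * 1024
    top_table[top][second] = frame


def translate(top_table, vaddr):
    """Walk both levels and return the physical address."""
    top, second, offset = indices(vaddr)
    table = top_table[top]
    if table is None or table[second] is None:
        raise PageFault(hex(vaddr))
    return (table[second] << OFFSET_BITS) | offset


top_table = [None] * 1024
map_page(top_table, 0x851F1521, 0x00042)
print(hex(translate(top_table, 0x851F1521)))   # 0x42521
print(sum(t is not None for t in top_table))   # 1
```

Only one 2nd-level table exists after the single mapping, which is the
sparseness the tree structure is meant to provide.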

But we're missing something
- how many memory accesses do we need to resolve one address?!

TLB: "translation lookaside buffer"
- caches virtual->physical mappings (PTEs)
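
With the two-level table, every access would need two extra memory
reads (one per level) before the real one. A toy model of how a TLB
avoids that is sketched below; it assumes an unbounded cache with no
eviction, which real TLBs do not have, and the walk function is a
stand-in for a real page-table walk.

```python
class TLB:
    """Tiny model of a TLB: a cache from page number to frame number."""

    def __init__(self, walk):
        self.walk = walk            # function doing the full page-table walk
        self.entries = {}
        self.memory_accesses = 0

    def translate(self, vaddr):
        page, offset = vaddr >> 12, vaddr & 0xFFF
        if page not in self.entries:
            # Miss: two extra memory reads, one per page-table level
            self.memory_accesses += 2
            self.entries[page] = self.walk(page)
        self.memory_accesses += 1   # the access the program asked for
        return (self.entries[page] << 12) | offset


# Hypothetical walk function: pretend every page maps to frame page + 1
tlb = TLB(walk=lambda page: page + 1)
for vaddr in (0x851F1521, 0x851F1522, 0x851F1FFF):
    tlb.translate(vaddr)
print(tlb.memory_accesses)   # 5: one miss (3 accesses) plus two hits (1 each)
```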

Page table entries have frame #'s AND metadata
- valid?
- modified? (dirty)
- accessed? (recently) <--- NOT a time stamp
- permission bits: rwx
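
One way to picture a 32-bit entry: 20 bits of frame number plus 12
bits left over for metadata. The bit positions below are invented for
illustration and do not match any particular architecture.

```python
# Hypothetical flag layout in the low 12 bits of an entry
VALID, DIRTY, ACCESSED = 1 << 0, 1 << 1, 1 << 2
READ, WRITE, EXEC = 1 << 3, 1 << 4, 1 << 5
FRAME_SHIFT = 12    # frame number stored in the upper 20 bits


def make_pte(frame, flags):
    """Pack a frame number and flag bits into one entry."""
    return (frame << FRAME_SHIFT) | flags


def frame_of(pte):
    """Extract the frame number from an entry."""
    return pte >> FRAME_SHIFT


def on_write(pte):
    """Model a write: check permissions, then set accessed and dirty."""
    if not pte & VALID or not pte & WRITE:
        raise PermissionError("page fault on write")
    return pte | ACCESSED | DIRTY


pte = make_pte(0x00042, VALID | READ | WRITE)
pte = on_write(pte)
print(hex(frame_of(pte)), bool(pte & DIRTY), bool(pte & EXEC))
# 0x42 True False
```

Note that "accessed" is a single bit that is set on use, not a record
of when the access happened.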

Does the CPU or OS manage TLB entries?
- depends on the architecture
- most nowadays have the CPU walk the page table

How does mmap work in this?

Lazy allocation
- pages are allocated on demand
- disk is read on demand

Lazy allocation allows for aggressive file caching and memory overcommitment
- improving performance in general but can lead to bad situations
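
A small sketch of an anonymous mapping using Python's mmap module. It
reserves many pages but touches only a few. Whether physical frames
are really assigned only on first touch depends on the OS; the code
cannot observe that directly, so treat the comments as a description
of typical behavior rather than something the program verifies.

```python
import mmap

PAGE = mmap.PAGESIZE            # typically 4096
SIZE = 1024 * PAGE              # reserve 1024 pages of virtual address space

# Anonymous mapping (fileno -1): no file backs it
region = mmap.mmap(-1, SIZE)

# On systems that allocate lazily, frames are typically assigned only
# when a page is first touched, not when the mapping is created.
region[0] = 1
region[500 * PAGE] = 2

# Pages that were never written read back as zero
values = (region[0], region[500 * PAGE], region[PAGE])
print(len(region) // PAGE, "pages mapped")
print(values)
region.close()
```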
</pre>