<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=Operating_Systems_2020W_Lecture_16</id>
	<title>Operating Systems 2020W Lecture 16 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=Operating_Systems_2020W_Lecture_16"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Operating_Systems_2020W_Lecture_16&amp;action=history"/>
	<updated>2026-06-02T22:26:51Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Operating_Systems_2020W_Lecture_16&amp;diff=22581&amp;oldid=prev</id>
		<title>Soma: Created page with &quot;==Video==  Video from the lecture given on March 11, 2020 [https://homeostasis.scs.carleton.ca/~soma/os-2020w/lectures/comp3000-2020w-lec16-20200311.m4v is now available].  ==...&quot;</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Operating_Systems_2020W_Lecture_16&amp;diff=22581&amp;oldid=prev"/>
		<updated>2020-03-20T02:39:32Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;==Video==  Video from the lecture given on March 11, 2020 [https://homeostasis.scs.carleton.ca/~soma/os-2020w/lectures/comp3000-2020w-lec16-20200311.m4v is now available].  ==...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Video==&lt;br /&gt;
&lt;br /&gt;
Video from the lecture given on March 11, 2020 [https://homeostasis.scs.carleton.ca/~soma/os-2020w/lectures/comp3000-2020w-lec16-20200311.m4v is now available].&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Lecture 16&lt;br /&gt;
----------&lt;br /&gt;
&lt;br /&gt;
Virtual memory&lt;br /&gt;
 - virtual memory vs physical memory&lt;br /&gt;
 - page tables&lt;br /&gt;
 - page table entries (PTEs)&lt;br /&gt;
 - TLB&lt;br /&gt;
 - mmap and virtual memory&lt;br /&gt;
 - swapping &amp;amp; paging&lt;br /&gt;
&lt;br /&gt;
What is virtual memory?&lt;br /&gt;
 - abstraction over physical memory that allows for&lt;br /&gt;
   memory to be shared with running programs not having to do&lt;br /&gt;
   anything special&lt;br /&gt;
 - each program sees its own memory map&lt;br /&gt;
&lt;br /&gt;
To implement virtual memory, have to map between virtual addresses&lt;br /&gt;
and physical addresses&lt;br /&gt;
 - processes use virtual addresses&lt;br /&gt;
 - CPU maps virtual addresses to physical addresess to find actual memory&lt;br /&gt;
   (to read or write)&lt;br /&gt;
&lt;br /&gt;
In other words, when, in assembly/machine code, there is an address&lt;br /&gt;
in a register or an instruction refers to an address, the CPU must translate this virtual address to a physical address before going to RAM and getting the value (or changing it)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So this translation should seem really really slow, and it is...if you&lt;br /&gt;
did it purely in software.  But there is hardware to help.&lt;br /&gt;
&lt;br /&gt;
But how do we store this mapping between virtual and physical addresses?&lt;br /&gt;
 - every process has a &amp;quot;page table&amp;quot; associated with it. &lt;br /&gt;
 - this is the data structure that defines the mapping between virtual&lt;br /&gt;
   and physical memory&lt;br /&gt;
&lt;br /&gt;
Basic question: what units does the kernel use to manage memory?&lt;br /&gt;
 - on disk, kernel uses blocks (1k or 4k, fixed size)&lt;br /&gt;
 - in RAM, memory is addressable in bytes, so shouldn&amp;#039;t we allocate it&lt;br /&gt;
   in bytes?  NO&lt;br /&gt;
 - RAM is allocated in fixed-sized units, typically 4k or 8K&lt;br /&gt;
   - known as pages, NOT blocks&lt;br /&gt;
&lt;br /&gt;
 - often times, the page size and the block size of a system are equal&lt;br /&gt;
   (both 4K), but this can vary&lt;br /&gt;
&lt;br /&gt;
We use fixed-sized units of allocation to avoid memory fragmentation&lt;br /&gt;
  - keep them small to minimize internal fragmentation (space lost&lt;br /&gt;
    to small allocations, e.g. need one byte but allocate 4k)&lt;br /&gt;
  - fixed sized prevents external allocation (any memory allocation&lt;br /&gt;
    can be accomplished if there are a sufficient number of pages/blocks)&lt;br /&gt;
&lt;br /&gt;
Physical RAM&lt;br /&gt;
************&lt;br /&gt;
Program 1&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
------------&lt;br /&gt;
Program 2&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
------------&lt;br /&gt;
Free Space&lt;br /&gt;
...&lt;br /&gt;
...             &amp;lt;--- put Program 3 here&lt;br /&gt;
...&lt;br /&gt;
------------&lt;br /&gt;
Kernel&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
************&lt;br /&gt;
&lt;br /&gt;
Want to load program 3, could put it in free space&lt;br /&gt;
BUT normally, memory map is much messier&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Physical RAM&lt;br /&gt;
************&lt;br /&gt;
Program 1&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
------------&lt;br /&gt;
Free Space&lt;br /&gt;
...&lt;br /&gt;
------------&lt;br /&gt;
Program 2&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
------------&lt;br /&gt;
Free Space&lt;br /&gt;
...             &amp;lt;--- put Program 3 here, but not enough room!&lt;br /&gt;
------------&lt;br /&gt;
Kernel&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
...&lt;br /&gt;
************&lt;br /&gt;
&lt;br /&gt;
Fragmentation also takes place in filesystems&lt;br /&gt;
 - so we try to allocate &amp;quot;extents&amp;quot; which are ranges of contiguous blocks,&lt;br /&gt;
   faster to access&lt;br /&gt;
&lt;br /&gt;
But in RAM fragmentation is much more painful&lt;br /&gt;
 - want to assemble a 1M array out of scattered 4k pieces?&lt;br /&gt;
&lt;br /&gt;
So really virtual memory is an abstraction over RAM much like a file&lt;br /&gt;
is an abstraction over disk blocks&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The purposes of virtual memory are&lt;br /&gt;
 - give programmers the illusion that their program occupies a contiguous&lt;br /&gt;
   range of bytes in memory&lt;br /&gt;
 - allow programs to be bigger than physical memory&lt;br /&gt;
&lt;br /&gt;
(Look up segmentation in the textbook to see another scheme for&lt;br /&gt;
abstrating memory)&lt;br /&gt;
&lt;br /&gt;
The limits to physical memory is determined by cost, space, and power constraints (4G versus 16G versus 256M versus 16K)&lt;br /&gt;
&lt;br /&gt;
Virtual memory is limited by the size of pointers/addresses for the CPU&lt;br /&gt;
 - 32 bit processors normally have 32 bit addresses (max 4G)&lt;br /&gt;
 - 64 bit processors normally have 64 bit addresses&lt;br /&gt;
 - 8 bit processors normally had 16 bit adresses (max 64k)&lt;br /&gt;
&lt;br /&gt;
2^64 is a very big number, don&amp;#039;t expect to ever see a system with that much RAM!&lt;br /&gt;
 - so on 64 bit systems, virtual addresses are 64 bit but physical addresses are&lt;br /&gt;
   smaller (e.g. 48 bits)&lt;br /&gt;
&lt;br /&gt;
Address spaces is all about limits of naming&lt;br /&gt;
 - need a unique address for every byte in memory&lt;br /&gt;
 - there are ways around this, but they are very painful (e.g, bank switching)&lt;br /&gt;
&lt;br /&gt;
Virtual memory is made up&lt;br /&gt;
 - process sees that memory, but it doesn&amp;#039;t all have to be allocated&lt;br /&gt;
 - e.g., process is using 64-bit pointers, so can address 2^64 bytes of&lt;br /&gt;
   memory...but if it accesses more than the kernel has allocated it&lt;br /&gt;
   (e.g., 1M), access to unallocated memory ranges results in segfaults&lt;br /&gt;
&lt;br /&gt;
A segfault happens when a process accesses a virtual address that has&lt;br /&gt;
no corresponding physical address&lt;br /&gt;
&lt;br /&gt;
So what is a page table?&lt;br /&gt;
 - data structure for mapping virtual-&amp;gt;physical addresses&lt;br /&gt;
 - does not have mappings on an address-by-address basis,&lt;br /&gt;
   that would be crazy inefficient&lt;br /&gt;
 - instead, mappings are virtual page-&amp;gt;physical page&lt;br /&gt;
   - can copy page offset from virtual to physical address&lt;br /&gt;
&lt;br /&gt;
Page tables are basically arrays stored in a memory-effient way&lt;br /&gt;
 - use virtual page number as index into array&lt;br /&gt;
 - value at that index is the physical page number&lt;br /&gt;
&lt;br /&gt;
But if you made this a real array, it would be crazy inefficient&lt;br /&gt;
 - would need to be huge, e.g. 2^52 entries for a 64-bit address space&lt;br /&gt;
   with 4k pages (4k = 2^12)&lt;br /&gt;
 - so we need something like a sparse array&lt;br /&gt;
&lt;br /&gt;
Is data in swap persistent?&lt;br /&gt;
 - it isn&amp;#039;t supposed to be, is deallocated when procesess terminate&lt;br /&gt;
 - but...data isn&amp;#039;t normally overwritten on disk&lt;br /&gt;
 - solution: look up encrypted swap&lt;br /&gt;
&lt;br /&gt;
Why not make a page table a hash table?&lt;br /&gt;
 - want something that can be implemented quickly in hardware&lt;br /&gt;
 - calculating values on each memory access would be slower generally&lt;br /&gt;
&lt;br /&gt;
How do we do this &amp;quot;sparse array&amp;quot;?&lt;br /&gt;
&lt;br /&gt;
Big idea: split virtual address up into portions that each becomes an index&lt;br /&gt;
into a page&lt;br /&gt;
 - need to make page tables out of pages!&lt;br /&gt;
&lt;br /&gt;
e.g., divide a 32 bit address as follows:&lt;br /&gt;
&lt;br /&gt;
 ----------- ---------- ------------&lt;br /&gt;
   10 bits     10 bits   12 bits      = 32 bits total&lt;br /&gt;
&lt;br /&gt;
   lev 1       lev 2     offset&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
level 1 = offset into top page of page table&lt;br /&gt;
level 2 = offset into level two of page table&lt;br /&gt;
offset  = offset into data page&lt;br /&gt;
&lt;br /&gt;
2^10 = 1024, thus allows for 1024 4-byte values in 4k&lt;br /&gt;
 4 bytes = 32 bits = enough to store a pointer/address value&lt;br /&gt;
   (actually stores a page table entry as we&amp;#039;ll discuss below)&lt;br /&gt;
&lt;br /&gt;
(These offsets are a lot like how we find locations of inodes in ext4)&lt;br /&gt;
(Computer scientists have a limited number of ideas)&lt;br /&gt;
&lt;br /&gt;
So a page table is just a bunch of pages which define virtual to physical mappings.  When you get a virtual address (on 32 biut)&lt;br /&gt;
 - use first 10 bits to find PTE in level 1 of the page table&lt;br /&gt;
     - use found PTE to find location of level 2 of page table page&lt;br /&gt;
 - use next 10 bits to find PTE in level 2 page table page&lt;br /&gt;
     - use found PTE to find location of desired data page&lt;br /&gt;
 - use last 12 bits to index into desired data page&lt;br /&gt;
     - this is the memory you want!&lt;br /&gt;
&lt;br /&gt;
PTE = page table entry&lt;br /&gt;
&lt;br /&gt;
page table entry is&lt;br /&gt;
 - a physical page number (for 32 bit sytems, 20 bits)&lt;br /&gt;
 - metadata about the page&lt;br /&gt;
&lt;br /&gt;
If you had 4G of RAM, that is 2^32&lt;br /&gt;
  2^32 = 2^20 * 2^12&lt;br /&gt;
         num    size&lt;br /&gt;
         pages  of page&lt;br /&gt;
                (4k)&lt;br /&gt;
&lt;br /&gt;
For 64-bit systems it is the same, but just gets messier&lt;br /&gt;
 (because virtual addresses are larger than physical addresses)&lt;br /&gt;
 can have 3 or 4 levels of pages&lt;br /&gt;
&lt;br /&gt;
We split page tables into levels because we want entries for a level to&lt;br /&gt;
fit into one page&lt;br /&gt;
&lt;br /&gt;
There is            1 level 1 page, for 32 bit systems, per process&lt;br /&gt;
           up to 2^20 level 2 pages, for 32-bit systems, per process&lt;br /&gt;
	    (but most procesess will only allocate some of these&lt;br /&gt;
	     unless it uses 4G of RAM on its own)&lt;br /&gt;
&lt;br /&gt;
Remember every process needs its own page table&lt;br /&gt;
 - start with one level 1 page and 1 level 2 page&lt;br /&gt;
 - as process uses more RAM, add level 2 pages&lt;br /&gt;
  &lt;br /&gt;
Why &amp;quot;only&amp;quot; 64 bits?&lt;br /&gt;
&lt;br /&gt;
This is 4G^4G&lt;br /&gt;
  2^64 = (2^32)^32&lt;br /&gt;
&lt;br /&gt;
It is very big&lt;br /&gt;
&lt;br /&gt;
I haven&amp;#039;t talked about performance of virtual memory&lt;br /&gt;
 - note that each level of the page table is a memory access&lt;br /&gt;
 - so to look up one virtual address, we have to look at&lt;br /&gt;
    - the level 1 page table page&lt;br /&gt;
    - a level 2 page table page&lt;br /&gt;
    - the data page&lt;br /&gt;
 - 3 physical memory accesses for every virtual memory access&lt;br /&gt;
 - NOT A GOOD DEAL&lt;br /&gt;
&lt;br /&gt;
Solution: have a very fast cache of virtual-&amp;gt;physical memory mappings&lt;br /&gt;
 - this cache is called the TLB&lt;br /&gt;
 - TLB=translation lookaside buffer, nobody knows why this name&lt;br /&gt;
   is used nowadays (ok, go to the IBM design literature and you can&lt;br /&gt;
   see the rationale but it makes no sense today)&lt;br /&gt;
 - just remember the TLB is a cache of recently used PTEs&lt;br /&gt;
   (page table entries)&lt;br /&gt;
&lt;br /&gt;
CPU looks in the TLB for a virtual addresss,&lt;br /&gt;
 - if it finds it, we&amp;#039;re done, and we have lost almost no time&lt;br /&gt;
 - if we don&amp;#039;t find it in the TLB, go look at the page table - lost time,&lt;br /&gt;
   but access will succeed&lt;br /&gt;
 - if we don&amp;#039;t find it in the page table, tell the OS&lt;br /&gt;
    - because the process tried to access an invalid range of its&lt;br /&gt;
      virtual address space&lt;br /&gt;
&lt;br /&gt;
On Intel and AMD systems, you cannot directly control the TLB entries, CPU&lt;br /&gt;
does its own thing (it looks at the page table directly)&lt;br /&gt;
&lt;br /&gt;
But on other processers (I believe SPARC), the hardware doesn&amp;#039;t look at&lt;br /&gt;
page tables, the OS does.  So the kernel can directly control TLB entries&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
What is the metadata associated with each PTE?&lt;br /&gt;
 - there&amp;#039;s the physical page number (that&amp;#039;s the data)&lt;br /&gt;
 - read, write, execute bits (like a file, but for each page)&lt;br /&gt;
 - valid bit&lt;br /&gt;
 - dirty bit&lt;br /&gt;
 - accessed bit&lt;br /&gt;
 - others?&lt;br /&gt;
&lt;br /&gt;
We have per-page rwx bits so the CPU can alert the kernel when&lt;br /&gt;
a process does an improper memory access&lt;br /&gt;
 - write to read-only memory&lt;br /&gt;
 - execute to no-execute memory&lt;br /&gt;
Big help for security!&lt;br /&gt;
&lt;br /&gt;
Remember the kernel isn&amp;#039;t running most of the time&lt;br /&gt;
 - it is only called in when required&lt;br /&gt;
 - most of the time the CPU is running userspace processes&lt;br /&gt;
 - so need to make sure kernel is woken up when key decisions have&lt;br /&gt;
   to be made&lt;br /&gt;
&lt;br /&gt;
Valid bit is used to say whether the PTE refers to a real physical page&lt;br /&gt;
 - you can have a physical page number of zero!&lt;br /&gt;
 - so need another bit to say whether it is valid or not&lt;br /&gt;
 - needed for when that particular page isn&amp;#039;t allocated,&lt;br /&gt;
   OR if it refers to data that is on disk (and so needs&lt;br /&gt;
   to be read in)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(His name is Roshi, he&amp;#039;s a boxer)&lt;br /&gt;
&lt;br /&gt;
Dirty bit says whether the page has been modified since allocated&lt;br /&gt;
or read from disk&lt;br /&gt;
 - clean pages can be tossed out of RAM without consequence,&lt;br /&gt;
   can be read from disk later&lt;br /&gt;
 - dirty pages have to be written to disk (&amp;quot;cleaned&amp;quot;), until then&lt;br /&gt;
   the MUST stay in RAM&lt;br /&gt;
&lt;br /&gt;
Kernel will try to kick out clean pages when RAM is needed&lt;br /&gt;
 - changing dirty to clean pages happens, but is a lower&lt;br /&gt;
   priority unless RAM is getting scarce&lt;br /&gt;
&lt;br /&gt;
Accessed bit says whether the page has been recently accessed&lt;br /&gt;
 - 1 bit to represent time&lt;br /&gt;
 - not efficient to store a whole timestamp&lt;br /&gt;
 - but how is 1 bit enough?  clock algorithms (next time)&lt;br /&gt;
&lt;br /&gt;
The kernel has its own page table for its code and data&lt;br /&gt;
 - thus it mostly uses virtual addresses itself&lt;br /&gt;
 - no fun to program with fragmented memory!&lt;br /&gt;
 - but it deals with physical addresses when necessary&lt;br /&gt;
   (userspace can&amp;#039;t see them)&lt;br /&gt;
&lt;br /&gt;
What if your system doesn&amp;#039;t have swap?&lt;br /&gt;
 - you either tell processes to free memory or you kill them&lt;br /&gt;
 - this is what happens on Android, iOS (no swap),&lt;br /&gt;
   they kill procesess and restart them all the time&lt;br /&gt;
 - not completely sure why they don&amp;#039;t have swap...but it is probably&lt;br /&gt;
   primarily for power consumption reasons (which is mostly how&lt;br /&gt;
   phones get weird)&lt;br /&gt;
&lt;br /&gt;
Pages appear sequential in virtual memory but are fragmented in physical memory&lt;br /&gt;
 - just like how data in files is sequential but blocks of a file can&lt;br /&gt;
   be scattered across a disk&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Next week, tutorial will show you virtual versus physical addresses&lt;br /&gt;
using a kernel module and special userspace program (developed again by William!)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Soma</name></author>
	</entry>
</feed>