<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=Operating_Systems_2022F_Lecture_19</id>
	<title>Operating Systems 2022F Lecture 19 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/index.php?action=history&amp;feed=atom&amp;title=Operating_Systems_2022F_Lecture_19"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Operating_Systems_2022F_Lecture_19&amp;action=history"/>
	<updated>2026-04-05T22:41:57Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Operating_Systems_2022F_Lecture_19&amp;diff=24149&amp;oldid=prev</id>
		<title>Soma: Created page with &quot;==Video==  Video from the lecture given on November 22, 2022 is now available: * [https://homeostasis.scs.carleton.ca/~soma/os-2022f/lectures/comp3000-2022f-lec19-20221122.m4v video] * [https://homeostasis.scs.carleton.ca/~soma/os-2022f/lectures/comp3000-2022f-lec19-20221122.cc.vtt auto-generated captions] Video is also available through Brightspace (Resources-&gt;Zoom meeting-&gt;Cloud Recordings tab)  ==Notes==  &lt;pre&gt; Lecture 19 ----------  - No class Thursday (US Thanksgivi...&quot;</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Operating_Systems_2022F_Lecture_19&amp;diff=24149&amp;oldid=prev"/>
		<updated>2022-11-23T00:08:48Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;==Video==  Video from the lecture given on November 22, 2022 is now available: * [https://homeostasis.scs.carleton.ca/~soma/os-2022f/lectures/comp3000-2022f-lec19-20221122.m4v video] * [https://homeostasis.scs.carleton.ca/~soma/os-2022f/lectures/comp3000-2022f-lec19-20221122.cc.vtt auto-generated captions] Video is also available through Brightspace (Resources-&amp;gt;Zoom meeting-&amp;gt;Cloud Recordings tab)  ==Notes==  &amp;lt;pre&amp;gt; Lecture 19 ----------  - No class Thursday (US Thanksgivi...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Video==&lt;br /&gt;
&lt;br /&gt;
Video from the lecture given on November 22, 2022 is now available:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/os-2022f/lectures/comp3000-2022f-lec19-20221122.m4v video]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/os-2022f/lectures/comp3000-2022f-lec19-20221122.cc.vtt auto-generated captions]&lt;br /&gt;
Video is also available through Brightspace (Resources-&amp;gt;Zoom meeting-&amp;gt;Cloud Recordings tab)&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Lecture 19&lt;br /&gt;
----------&lt;br /&gt;
 - No class Thursday (US Thanksgiving)&lt;br /&gt;
 - A3 &amp;amp; A4 updates&lt;br /&gt;
&lt;br /&gt;
Topics for today&lt;br /&gt;
 - virtual memory&lt;br /&gt;
 - task &amp;amp; scheduling&lt;br /&gt;
&lt;br /&gt;
A3 was still officially due last night before midnight&lt;br /&gt;
 - but I&amp;#039;m now allowing submissions through Thursday&lt;br /&gt;
 - so solutions will be posted on Friday&lt;br /&gt;
&lt;br /&gt;
So also more time for T5 &amp;amp; T6&lt;br /&gt;
 - but really you should be working on T8 and T9 now&lt;br /&gt;
&lt;br /&gt;
A4 status&lt;br /&gt;
 - code is finished &amp;amp; posted&lt;br /&gt;
 - just updating the questions&lt;br /&gt;
&lt;br /&gt;
Tasks in the Linux kernel&lt;br /&gt;
 - tasks are what the kernel schedules to run on a core&lt;br /&gt;
 - associated with a task is CPU state (registers) and an address space&lt;br /&gt;
    - note the address space may be shared with other tasks&lt;br /&gt;
 - so processes and kernel threads both are &amp;quot;tasks&amp;quot; in the kernel&lt;br /&gt;
&lt;br /&gt;
Tasks are represented by &amp;quot;struct task_struct&amp;quot;&lt;br /&gt;
And when the kernel is processing a system call, the task that made the system call is referred to as &amp;quot;current&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Remember that kernel code is running in three contexts&lt;br /&gt;
 - on behalf of a task, processing a system call&lt;br /&gt;
 - on behalf of a device, processing an interrupt&lt;br /&gt;
 - in a kernel task, doing background processing of some kind&lt;br /&gt;
    - many drivers have kernel threads so they can do work whenever they need to&lt;br /&gt;
    - seems that interrupts are handled by specific kernel tasks,&lt;br /&gt;
      for some drivers at least&lt;br /&gt;
 - kernel tasks are shown in square brackets ([]) in a ps listing&lt;br /&gt;
 - don&amp;#039;t even bother trying to send signals to these, the kernel will ignore you&lt;br /&gt;
   (I&amp;#039;m pretty sure, might be fun to try)&lt;br /&gt;
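As a rough way to spot these kernel tasks yourself, here is a minimal Python sketch, assuming a Linux /proc filesystem. It flags processes whose /proc/PID/cmdline is empty - the usual signature of a kernel thread, and the same ones ps shows in []. (Inside a container you may see none at all.)

```python
# Sketch: find kernel threads by scanning /proc.
# Kernel threads have an empty /proc/PID/cmdline; we print their
# names in brackets, the way ps does. Assumes a Linux /proc.
import os

def kernel_threads():
    names = []
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/cmdline" % pid, "rb") as f:
                cmdline = f.read()
            with open("/proc/%s/comm" % pid) as f:
                comm = f.read().strip()
        except OSError:
            continue                    # process may have exited
        if cmdline == b"":
            names.append("[%s]" % comm)
    return names

print(kernel_threads())
```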
&lt;br /&gt;
If you look at any process in /proc, there is a task subdirectory that shows all of the kernel-schedulable threads associated with that process&lt;br /&gt;
 - really, /proc shows you a list of tasks, not processes/threads&lt;br /&gt;
&lt;br /&gt;
So what really is a process then?&lt;br /&gt;
 - it is a group of tasks&lt;br /&gt;
 - there are various ways they can be associated with each other, but&lt;br /&gt;
   if you look in /proc/&amp;lt;PID&amp;gt;/task and see a bunch of entries, those all together are the process, with one of them potentially being the &amp;quot;lead task&amp;quot;&lt;br /&gt;
 - a single-threaded process will have only one directory in /proc/&amp;lt;PID&amp;gt;/task&lt;br /&gt;
 - we&amp;#039;re not going to discuss this further&lt;br /&gt;
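A quick way to watch the task/process relationship from a script, as a Python sketch assuming a Linux /proc (the actual TID values will vary from run to run):

```python
# Observe the tasks of the current process via /proc/self/task.
# A single-threaded script has exactly one task, whose TID is its PID.
import os
import threading

print(os.listdir("/proc/self/task"))       # one entry: our own PID

# Start a second thread: the process now has two schedulable tasks.
t = threading.Thread(target=lambda: threading.Event().wait(0.2))
t.start()
print(len(os.listdir("/proc/self/task")))  # 2 while the thread is alive
t.join()
```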
&lt;br /&gt;
Let&amp;#039;s talk about address spaces&lt;br /&gt;
 - abstraction around virtual addresses&lt;br /&gt;
&lt;br /&gt;
What does this mean?&lt;br /&gt;
 - when we work with memory addresses in assembly in UNIX, we are almost always using virtual addresses&lt;br /&gt;
 - how do we know?  because different processes can use the *same* addresses with no issues&lt;br /&gt;
&lt;br /&gt;
Emacs accessing address 0x12345678 and Firefox accessing 0x12345678 are COMPLETELY DIFFERENT THINGS&lt;br /&gt;
&lt;br /&gt;
But, the CPU needs a single way to access all of the RAM that it has&lt;br /&gt;
This numbering is known as physical addresses&lt;br /&gt;
&lt;br /&gt;
Now, do we NEED virtual addresses?  No!&lt;br /&gt;
 - older computers and smaller, embedded ones often don&amp;#039;t have them&lt;br /&gt;
 - instead, we can just split up memory between programs:&lt;br /&gt;
    - one will get 20000-40000, another will get 40000-50000, and so on&lt;br /&gt;
    - but man, that means a program can&amp;#039;t ever count on getting access&lt;br /&gt;
      to any specific range of memory - makes life complicated for&lt;br /&gt;
      programmers&lt;br /&gt;
        - and if one program misbehaves it can trash the others&amp;#039; memory - memory protection is nice!&lt;br /&gt;
    - so hardware people gave us hardware to make using virtual addresses fast&lt;br /&gt;
&lt;br /&gt;
If you want to see what life is like without virtual memory, look up TSR&amp;#039;s for MS-DOS...or get used to just running one program at a time, having to reboot between programs.&lt;br /&gt;
&lt;br /&gt;
Virtual memory is implemented with a combination of hardware and software&lt;br /&gt;
 - hardware: the TLB and page table lookups&lt;br /&gt;
 - software: operating system kernel managing page tables&lt;br /&gt;
&lt;br /&gt;
Page tables are the data structure that defines virtual-&amp;gt;physical memory&lt;br /&gt;
mappings for every task (process)&lt;br /&gt;
&lt;br /&gt;
So logically, what happens *on every memory access*&lt;br /&gt;
 - program references virtual address v&lt;br /&gt;
 - the system looks up v in the page table and finds the corresponding physical address p&lt;br /&gt;
 - it then goes to physical memory at p&lt;br /&gt;
 - and does the read or write there as needed&lt;br /&gt;
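The steps above can be sketched as a toy translation function in Python. The page table here is just a dict holding one made-up mapping, and 4K pages are assumed:

```python
# Toy model of the lookup done (logically) on every memory access,
# assuming 4K pages. The "page table" is just a dict with one
# made-up virtual-to-physical page mapping.
PAGE_SHIFT = 12
PAGE_SIZE = 2 ** PAGE_SHIFT           # 4096

page_table = {0x12345: 0x00042}       # vpn 0x12345 maps to ppn 0x42

def translate(vaddr):
    vpn = vaddr >> PAGE_SHIFT         # virtual page number (the prefix)
    offset = vaddr % PAGE_SIZE        # byte within the page
    ppn = page_table[vpn]             # KeyError here would be a page fault
    return ppn * PAGE_SIZE + offset   # physical address

print(hex(translate(0x12345678)))     # 0x42678
```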
&lt;br /&gt;
So this means that every virtual memory access requires&lt;br /&gt;
 - looking things up in the page table, THEN&lt;br /&gt;
 - accessing physical memory to do a read/write&lt;br /&gt;
&lt;br /&gt;
The page table is ALSO in RAM, so we&amp;#039;re doubling (or more) the amount of time needed on every memory access.&lt;br /&gt;
  - the page table is actually a big tree, so may need more than one memory access to look up a value (we have to &amp;quot;walk the tree&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So there has to be a way to speed this up!  That&amp;#039;s the TLB.&lt;br /&gt;
 - the TLB is a cache of recently accessed virtual-&amp;gt;physical memory mappings&lt;br /&gt;
 - it is SUPER FAST (basically the fastest memory in the system)&lt;br /&gt;
 - it is also not that large&lt;br /&gt;
&lt;br /&gt;
The TLB is implemented differently from other caches because it is more like a hash table (in hardware terms, a content-addressable memory)&lt;br /&gt;
 - you query it with a virtual address, it then checks ALL of its entries&lt;br /&gt;
   to see if any match and then returns the corresponding physical address&lt;br /&gt;
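A software analogue of this behaviour, as a rough Python sketch: a small dict acting as the TLB in front of a made-up page table, again assuming 4K pages. (A real TLB checks all its entries in parallel in hardware; a dict lookup is the closest software stand-in.)

```python
# TLB as a small cache in front of the page table, assuming 4K pages.
PAGE_SHIFT = 12
page_table = {0x12345: 0x00042}       # made-up mapping, as before
tlb = {}                              # vpn to ppn; small and very fast

def lookup(vaddr):
    vpn = vaddr >> PAGE_SHIFT
    if vpn in tlb:
        return tlb[vpn], "TLB hit"    # no page table walk needed
    ppn = page_table[vpn]             # slow path: walk the page table
    tlb[vpn] = ppn                    # cache the mapping for next time
    return ppn, "TLB miss"

print(lookup(0x12345678))             # first access to this page: miss
print(lookup(0x12345ABC))             # same page, different offset: hit
```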
&lt;br /&gt;
But really, these lookups aren&amp;#039;t happening on a per-byte level, but are at a per-page level.&lt;br /&gt;
&lt;br /&gt;
What is a page?&lt;br /&gt;
 - a block but for RAM (disks have blocks, RAM has pages)&lt;br /&gt;
 - a page is generally 4K but can be much larger (4M)&lt;br /&gt;
   - larger pages are used for specialized purposes normally,&lt;br /&gt;
     mapping device memory for example&lt;br /&gt;
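You can ask the OS what page size it is using; a Python one-liner via sysconf (on most Linux systems this prints 4096):

```python
# Query the system page size via sysconf.
import os
print(os.sysconf("SC_PAGE_SIZE"))   # typically 4096 on Linux
```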
&lt;br /&gt;
So the page table is mapping virtual to physical *pages*&lt;br /&gt;
  - reduces the number of bits that have to be mapped&lt;br /&gt;
&lt;br /&gt;
Think of every address as having a prefix plus an offset&lt;br /&gt;
 - prefix is mapped between virtual and physical&lt;br /&gt;
 - offset doesn&amp;#039;t change, it is just an index into the page&lt;br /&gt;
   (you can think of it as a 4K or 8K array of bytes indexed by the offset)&lt;br /&gt;
&lt;br /&gt;
Page tables are interesting as a data structure&lt;br /&gt;
 - because we want to make sure parts of the table *also* fit in pages,&lt;br /&gt;
   and use those pages efficiently&lt;br /&gt;
 - much as filesystem data structures all fit into multiples of blocks&lt;br /&gt;
&lt;br /&gt;
Filesystems are optimized for persistence and robustness, while data structures in RAM are optimized for performance&lt;br /&gt;
&lt;br /&gt;
Virtual memory is the set of technologies that allows each process to have its own address space&lt;br /&gt;
&lt;br /&gt;
One nice feature of pages and blocks being basically the same size (or multiples of each other) is we can easily move data between RAM and disk&lt;br /&gt;
 - when we run out of RAM we can just store process data on disk&lt;br /&gt;
 - this is what a &amp;quot;page file&amp;quot; or &amp;quot;swap file&amp;quot; is used for&lt;br /&gt;
    - swap is an older term for when an entire process would be put on disk,&lt;br /&gt;
      rather than specific pages&lt;br /&gt;
    - nowadays only portions of processes are kicked out of RAM and onto disk&lt;br /&gt;
&lt;br /&gt;
Page tables are data structures for organizing page table entries (PTEs)&lt;br /&gt;
  - a PTE identifies a specific page and associated metadata (valid or not, read/write/execute permissions, etc)&lt;br /&gt;
  - the TLB uses PTEs to handle requests to map a virtual address prefix to a physical address prefix&lt;br /&gt;
  - when the TLB doesn&amp;#039;t have a suitable PTE, then the CPU goes and checks the designated page table for that process&lt;br /&gt;
&lt;br /&gt;
So as memory has gotten bigger, page tables have gotten deeper&lt;br /&gt;
For 32-bit address spaces it was pretty simple&lt;br /&gt;
 - a PTE is 4 bytes, a page is 4K&lt;br /&gt;
 - so one page can hold 1024 PTE&amp;#039;s&lt;br /&gt;
 - an offset for a byte in 4K requires 12 bits (2^12 = 4096)&lt;br /&gt;
&lt;br /&gt;
So addresses were split into three fields:&lt;br /&gt;
  10 bits for index in level 1 of the page table (the root of the tree)&lt;br /&gt;
  10 bits for index in level 2 of the page table (middle nodes)&lt;br /&gt;
  12 bits for offset in the page&lt;br /&gt;
 ----&lt;br /&gt;
  32 bits&lt;br /&gt;
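The three-field split above can be checked with a little Python arithmetic on a made-up 32-bit address:

```python
# Split a 32-bit virtual address into the 10/10/12 fields above,
# for 4K pages and 4-byte PTEs (1024 PTEs per page).
addr = 0xDEADBEEF

offset = addr % (2 ** 12)             # low 12 bits: byte within the page
l2_index = (addr >> 12) % (2 ** 10)   # next 10 bits: PTE in a level-2 page
l1_index = addr >> 22                 # top 10 bits: entry in the root page

# Recombining the fields gives back the original address.
assert l1_index * (2 ** 22) + l2_index * (2 ** 12) + offset == addr
print(l1_index, l2_index, offset)     # 890 731 3823
```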
&lt;br /&gt;
Much messier with 64 bits, hence the four (or even five) page table levels the Linux kernel handles currently!&lt;br /&gt;
&lt;br /&gt;
Note that the page table is a tree, but rather than having 2 pointers in each node, we have 1024 typically.&lt;br /&gt;
&lt;br /&gt;
I showed you some code from the Linux kernel to make things concrete,&lt;br /&gt;
but I&amp;#039;m not expecting you to really understand it&lt;br /&gt;
 - I only understand bits and pieces!&lt;br /&gt;
&lt;br /&gt;
What I *do* want you to understand are the high-level concepts and to&lt;br /&gt;
see how those concepts are reflected in what you can observe on a real Linux system.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Soma</name></author>
	</entry>
</feed>