Operating Systems 2022F Lecture 4

From Soma-notes
Revision as of 17:31, 20 September 2022 by Soma

Video

Video from the lecture given on September 20, 2022 is now available:

Video is also available through Brightspace (Resources->Zoom meeting->Cloud Recordings tab)

Notes

Lecture 4
---------

* Assignment 1 has been released, due Sept. 28th
   - based on T1 & T2, if you've done them it should be straightforward
   - please use the template and validate after completing the assignment
   - we use scripts to split up the assignment by question so they
     can be graded all at once
   - collaborators should be at the top; resources (e.g., web pages)
     should be listed with the question it helped with (remember
     grading is done on a per-question basis)
   - NOTE THAT T1 & T2 ARE DUE WITH A1!
   - man pages are technically resources, but you don't need to list them;
     I just assume you're using them

Regarding tutorials
 - I don't give out solutions for them, because if you do them right you *know* your answer is correct.
   - I don't want you to just be able to answer the questions I come up with
   - you should be able to answer the questions you or someone else asks you!

So what I just showed was the difference between library and system calls
 - you may want to review later
 - In C we can only call functions; we have to drop down to assembly
   to make actual system calls
   - so the C library includes function wrappers for commonly used system calls
 - Notice I chose a system call that doesn't take any arguments
   - system calls pass arguments a bit differently than C functions
     (on x86-64 Linux the fourth argument goes in %r10 rather than %rcx),
     so the C library versions have to move things around for
     the system calls to work
     
Note that there isn't a 1:1 mapping between library calls and system calls
 - some library functions make no system calls (e.g., atoi)
 - some may make many system calls
 - and some are thin wrappers around basic system calls (those are 1:1, like getpid)

There is a C library function called "syscall"
 - it is a generic wrapper for system calls, you give it the system call
   number and the arguments and it invokes the system call
 - you use this if there isn't a specialized wrapper for the system call


So, what is the "call" assembly language instruction actually doing?
 - it is actually a combination of two operations:
    push %rip  (not a real instruction, %rip can't be pushed directly; read it as "push the address of the next instruction")
    jmp <given address>

 - jmp -> jump (goto) the given address, i.e., start executing there
 - push is push onto the stack

So what is the stack?
 - you know the stack data structure right?
   - first in, last out
 - operations: push, pop
   - push: store something at the top of the stack
   - pop: get something at the top of the stack and remove it

modern processors all have a stack data structure that is used to keep program state as it runs
 - stores function return addresses, local variables, and function arguments
    - (local variables and arguments are first stored in registers,
       use the stack when there isn't room)

(Instruction pipelining is not something we can directly observe from
 assembly/machine language, that is an optimization at a lower level that
 we can pretend doesn't exist)

So when we enter a function
 - put arguments in correct places (registers/stack via push)
 - call <address of function entry>
   - pushes current address (actually address of next instruction
     in current function)
   - jumps to called function entry

And when we exit a function
 - clean up stack (may have used for local variables)
 - ret
   - pops stack, puts it in instruction pointer, "jumping" to where we were

Modern CPUs are mostly load/store architectures
 - load data from RAM into registers
 - manipulate data in registers
 - store register data into RAM

(as opposed to an architecture which manipulates RAM directly)


Recall that RAM is (for a programmer) a giant array of bytes
 - indexed by memory address

A "pointer" in C is just a memory address in assembly

Modern CPUs have "caches" - L1, L2, L3
 - higher speed memory
 - sit between registers and RAM (main memory)
 - NOT DIRECTLY CONTROLLED from machine code
   - CPU uses them automatically, transparently
   - you can structure your code to work well with caches,
     but it is implicit
 - needed because RAM is very, very slow compared to CPU registers
    - so otherwise the CPU would be waiting all the time for data
      to come in from RAM

registers are directly accessed via assembly language/machine language
cache is not!

When you load data from RAM (i.e., do a movl instruction from a memory
address) it will likely go through one or more levels of cache
 - again, the CPU handles this automatically

It used to be that machine language/assembly language directly corresponded to how the CPU worked
  - now, however, it is just another API, and LOTS of things are going on underneath
  - just, we mostly can't control the computer at that level, it does its
    own thing
     (well, CPU manufacturers can, but they don't let others have
      that level of control)

Memory hierarchy is an important concept
 - goes from small amounts of fast storage to vast amounts of slow storage
 - to go fast, you have to make sure your code isn't waiting on data to arrive
   - so you have to manage the hierarchy properly

registers
L1 cache (SRAM)
L2 cache (SRAM)
L3 cache (SRAM)
DRAM
SSD
Hard drives
Optical media
Tapes


DRAM: dynamic RAM
SRAM: static RAM
RAM: Random Access Memory

SRAM is "static" because data in it doesn't have to be refreshed
DRAM has to be periodically refreshed otherwise it forgets
 (refreshed - each cell is periodically read and rewritten, because the
  stored charge leaks away and the data would otherwise be lost)

DRAM is higher density (fewer transistors per bit stored) but slower
SRAM is lower density (more transistors per bit stored) but is faster


Hennessy & Patterson, "Computer Architecture: A Quantitative Approach"
 - GREAT BOOK to learn about computer architecture and the tradeoffs
 - will not be covering it here!


So the memory hierarchy tells us about relative performance, but it doesn't tell us about abstractions

Operating systems abstract memory resources to allow multiple programs to run concurrently

Specifically, each process has its own private, virtual view of RAM
 - it can't see it all
 - and in fact, it sees a nice, contiguous range of memory containing its code and data and NOBODY ELSE'S
   - we call this virtual memory because it is a lie
   - data is actually stored all over the place in RAM,
     we just maintain a table of mappings so it looks nice, clean, and contiguous (and we only see what we're supposed to see)

- when you program in assembly language, you're dealing with addresses, but they are VIRTUAL ADDRESSES, not "physical addresses" (direct references to locations in DRAM)


Older computers did not have virtual memory, they only had physical memory
 - and only one program would run on them at a time
 - virtualization was introduced to allow for the computer to be easily shared between multiple programs running at the same time
 - so really, the process abstraction lets us pretend it is the 1980's or earlier

the OS kernel (e.g., the Linux kernel) deals in both virtual and physical addresses
  - but even it mostly uses virtual addresses, and only manipulates
    physical addresses when it has to

In modern systems, we deal with "systems on a chip" consisting of billions of transistors
 - EVERYTHING is in here somewhere
 - (well, DRAM is generally separate but can be packaged together)

Because they are so complex and it is so hard to change software in incompatible ways, there are many lies that hardware tells to software
 - mostly can ignore, except when you can't for security and performance reasons

Code

getpid-test.c

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        pid_t mypid;

        mypid = getpid();

        printf("My PID is %d\n", mypid);
        
        return 0;
}

getpid-test-mod.s

This is getpid-test.c's assembly, but with the call to the getpid() library function replaced with a direct system call.

	.file	"getpid-test.c"
	.text
	.section	.rodata.str1.1,"aMS",@progbits,1
.LC0:
	.string	"My PID is %d\n"
	.text
	.globl	main
	.type	main, @function
main:
.LFB41:
	.cfi_startproc
	endbr64
	subq	$8, %rsp
	.cfi_def_cfa_offset 16
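	movl	$39, %eax		# 39 is the getpid system call number on x86-64 Linux
	syscall				# enter the kernel; the result comes back in %eax
	movl	%eax, %edx		# third argument to __printf_chk: the PID
	leaq	.LC0(%rip), %rsi	# second argument: address of the format string
	movl	$1, %edi		# first argument: flag value for __printf_chk
	movl	$0, %eax		# variadic call: no vector registers used
	call	__printf_chk@PLT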
	movl	$0, %eax
	addq	$8, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE41:
	.size	main, .-main
	.ident	"GCC: (Ubuntu 11.2.0-19ubuntu1) 11.2.0"
	.section	.note.GNU-stack,"",@progbits
	.section	.note.gnu.property,"a"
	.align 8
	.long	1f - 0f
	.long	4f - 1f
	.long	5
0:
	.string	"GNU"
1:
	.align 8
	.long	0xc0000002
	.long	3f - 2f
2:
	.long	0x3
3:
	.align 8
4: