Operating Systems 2022F Lecture 4
Revision as of 21:29, 20 September 2022
Video
Notes
Lecture 4
---------

* Assignment 1 has been released, due Sept. 28th
  - based on T1 & T2, if you've done them it should be straightforward
  - please use the template and validate after completing the assignment
    - we use scripts to split up the assignment by question so they can be graded all at once
  - collaborators should be at the top; resources (e.g., web pages) should be listed with the question it helped with (remember grading is done on a per-question basis)
  - NOTE THAT T1 & T2 ARE DUE WITH A1!
  - man pages are technically resources but you don't need to list them, I just assume you're using them

Regarding tutorials
 - I don't give out solutions for them, because if you do them right you *know* your answer is correct.
 - I don't want you to just be able to answer the questions I come up with
   - you should be able to answer the questions you or someone else asks you!

So what I just showed was the difference between library and system calls
 - you may want to review later
 - In C we can only call functions, we have to drop down to assembly to make actual system calls
   - so the C library includes function wrappers for commonly used system calls
 - Notice I chose a system call that doesn't take any arguments
   - system calls process arguments a bit differently than C functions, so the C library versions have to move things around for the system calls to work

Note that there isn't a 1:1 mapping between library calls and system calls
 - some library functions make no system calls (e.g., atoi)
 - some may make many system calls
 - and some are thin wrappers around basic system calls (those are 1:1, like getpid)

There is a C library function called "syscall"
 - it is a generic wrapper for system calls, you give it the system call number and the arguments and it invokes the system call
 - you use this if there isn't a specialized wrapper for the system call

So, what is the "call" assembly language instruction actually doing?
 - it is actually a combination of two operations:
     push %rip  (conceptual only: x86-64 has no instruction that pushes %rip directly)
     jmp <given address>
 - jmp -> jump (goto) the given address, i.e., start executing there
 - push is push onto the stack

So what is the stack?
 - you know the stack data structure, right?
   - first in, last out
   - operations: push, pop
   - push: store something at the top of the stack
   - pop: get something at the top of the stack and remove it

Modern processors all have a stack data structure that is used to keep program state as it runs
 - stores function return addresses, local variables, and function arguments
 - (local variables and arguments are first stored in registers, use the stack when there isn't room)

(Instruction pipelining is not something we can directly observe from assembly/machine language; that is an optimization at a lower level that we can pretend doesn't exist)

So when we enter a function
 - put arguments in correct places (registers/stack via push)
 - call <address of function entry>
   - pushes current address (actually address of next instruction in current function)
   - jumps to called function entry

And when we exit a function
 - clean up stack (may have used for local variables)
 - ret
   - pops stack, puts it in instruction pointer, "jumping" to where we were

Modern CPUs are mostly load/store architectures
 - load data from RAM into registers
 - manipulate data in registers
 - store register data into RAM
 (as opposed to an architecture which manipulates RAM directly)

Recall that RAM is (for a programmer) a giant array of bytes
 - indexed by memory address

A "pointer" in C is just a memory address in assembly

Modern CPUs have "caches" - L1, L2, L3
 - higher speed memory
 - sit between registers and RAM (main memory)
 - NOT DIRECTLY CONTROLLED from machine code
   - CPU uses them automatically, transparently
   - you can structure your code to work well with caches, but it is implicit
 - needed because RAM is very, very slow compared to CPU registers
   - so otherwise the CPU would be waiting all the time for data to come in from RAM

Registers are directly accessed via assembly language/machine language
Cache is not!

When you load data from RAM (e.g., do a movl instruction from a memory address) it will likely go through one or more levels of cache
 - again, the CPU handles this automatically

It used to be that machine language/assembly language directly corresponded to how the CPU worked
 - now, however, it is just another API, and LOTS of things are going on underneath
 - just, we mostly can't control the computer at that level, it does its own thing
   (well, CPU manufacturers can, but they don't let others have that level of control)

Memory hierarchy is an important concept
 - goes from small amounts of fast storage to vast amounts of slow storage
 - to go fast, you have to make sure your code isn't waiting on data to arrive
   - so you have to manage the hierarchy properly

  registers
  L1 cache (SRAM)
  L2 cache (SRAM)
  L3 cache (SRAM)
  DRAM
  SSD
  Hard drives
  Optical media
  Tapes

DRAM: dynamic RAM
SRAM: static RAM
RAM: Random Access Memory

SRAM is "static" because data in it doesn't have to be refreshed
DRAM has to be periodically refreshed, otherwise it forgets
 (refreshed - bit patterns updated, basically needs power periodically otherwise will lose integrity)

DRAM is higher density (fewer transistors per bit stored) but slower
SRAM is lower density (more transistors per bit stored) but is faster

Hennessy & Patterson, "Computer Architecture: A Quantitative Approach"
 - GREAT BOOK to learn about computer architecture and the tradeoffs
 - will not be covering it here!
So the memory hierarchy tells us about relative performance, but it doesn't tell us about abstractions

Operating systems abstract memory resources to allow multiple programs to run concurrently

Specifically, each process has its own private, virtual view of RAM
 - it can't see it all
 - and in fact, it sees a nice, contiguous range of memory containing its code and data and NOBODY ELSE'S
 - we call this virtual memory because it is a lie
   - data is actually stored all over the place in RAM, we just maintain a table of mappings so it looks nice, clean, and contiguous (and we only see what we're supposed to see)
 - when you program in assembly language, you're dealing with addresses, but they are VIRTUAL ADDRESSES, not "physical addresses" (direct references to locations in DRAM)

Older computers did not have virtual memory, they only had physical memory
 - and only one program would run on them at a time
 - virtualization was introduced to allow for the computer to be easily shared between multiple programs running at the same time
 - so really, the process abstraction lets us pretend it is the 1980's or earlier

The OS kernel (e.g., the Linux kernel) deals in both virtual and physical addresses
 - but even it mostly uses virtual addresses, and only manipulates physical addresses when it has to

In modern systems, we deal with "systems on a chip" consisting of billions of transistors
 - EVERYTHING is in here somewhere
 - (well, DRAM is generally separate but can be packaged together)

Because they are so complex and it is so hard to change software in incompatible ways, there are many lies that hardware tells to software
 - mostly can ignore, except when you can't for security and performance reasons
Code
getpid-test.c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
        pid_t mypid;

        mypid = getpid();
        printf("My PID is %d\n", mypid);

        return 0;
}
getpid-test-mod.s
This is the assembly GCC generated for getpid-test.c, but with the call to the getpid() library function replaced with a direct system call (system call number 39 in %eax, followed by the syscall instruction).
        .file   "getpid-test.c"
        .text
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "My PID is %d\n"
        .text
        .globl  main
        .type   main, @function
main:
.LFB41:
        .cfi_startproc
        endbr64
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    $39, %eax
        syscall
        movl    %eax, %edx
        leaq    .LC0(%rip), %rsi
        movl    $1, %edi
        movl    $0, %eax
        call    __printf_chk@PLT
        movl    $0, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE41:
        .size   main, .-main
        .ident  "GCC: (Ubuntu 11.2.0-19ubuntu1) 11.2.0"
        .section        .note.GNU-stack,"",@progbits
        .section        .note.gnu.property,"a"
        .align 8
        .long   1f - 0f
        .long   4f - 1f
        .long   5
0:
        .string "GNU"
1:
        .align 8
        .long   0xc0000002
        .long   3f - 2f
2:
        .long   0x3
3:
        .align 8
4: