Operating Systems 2021F Lecture 6
Video
Video from the lecture given on September 28, 2021 is now available:
Video is also available through Brightspace (Resources->Class zoom meetings->Cloud Recordings tab)
Notes
Lecture 6
---------
 - deadlines
   - A1 is due Friday, Oct 1
     - but will be accepted until October 5th, 10 AM, with no penalty (we'll discuss solutions in class)
     - but T1 & T2 are due by midnight on October 1st
   - Tutorial 3 came out yesterday, officially due in a week
     - but will be accepted until the official due date of A2
   - missing an official deadline doesn't carry a grading penalty
     - until the hard deadline
     - but if you miss one, you should consider yourself behind

What is the material for the midterm?
 - the two assignments
 - I literally take the assignment questions as a basis for the midterm questions
   (making a question up to cover the concepts of an assignment question or two)
 - midterm is all short answer, should be answerable without referring to notes or using a computer
   - but it is open book, open note, open internet
   - just NO COLLABORATION
 - remember there will be randomized interviews after
   - to make sure they were graded fairly
   - but yes, if it is clear what you know is different from what is on the test...we'll have a problem
 - My dog's name is Roshi

Topics for today
----------------
 - stack vs. heap
 - assembly directives
 - standard I/O, I/O redirection
 - terminals

To try and understand a memory map, it is helpful to run with setarch -R
 - disables ASLR (address space layout randomization), a security mitigation against code injection attacks

Note that the addresses in a process are consistent from run to run with setarch -R
 - because each process has its own address space
 - process address spaces are "virtual"
   - no direct correspondence to addresses of actual RAM in your computer (physical addresses)
   - OS manages virtual <-> physical address mappings
 - memview gives us a view of virtual addresses
   - cannot access physical addresses outside of the kernel
   - (page tables are the data structure that maintains virtual <-> physical mappings, but we'll talk about that later)
 - In a classic view of C programs, variables are stored in the "stack" or the "heap"
   - stack is for local variables
   - heap is for malloc'd variables
   - but the memory map is more complex
 - normally there is an area for runtime data storage in memory that holds the stack and the heap
   - heap is allocated starting from the bottom (lowest address) and the stack is allocated from the top (highest address)
   - sbrk(0) gives you the address of the beginning of the free space in the heap (the current program break)
   - as you allocate things dynamically, sbrk(0) increases
 - note that code & static data (data fixed at compile time) are stored together (e.g., strings embedded in the code)

A segment is just a variable-sized area of memory
 - generally with specific semantics (purpose)
   - .text is normally the code
   - .rodata is read-only data known at compile time
 - main thing to know is the stack and the heap don't really have anything in the on-disk segments
   - at most you have a declaration
   - they are then dynamically allocated at runtime
 - stack/heap is really one segment used for dynamic allocation
   - but nowadays they are logically separate because we don't want the stack and the heap to ever collide
   - we add barriers between them
 - a "quad" is a 64-bit quantity
   - I think this is historic, quad word, when a "word" was 16 bits (i.e., when registers only held 16 bits, not 64-bit quantities)
 - With static linking, all references are resolved at link time
 - With dynamic linking, some references have to be resolved at runtime (the runtime dynamic linker has to run before main starts)
   - this is why dynamically linked programs have many more system calls before main than do statically linked programs
   - it is literally loading files from disk, e.g., the C library
 - make sure you understand the output of memview
   - and try similar things in 3000quiz to see its memory map

I would suggest translating the output of 3000memview into a diagram
 - label it with addresses
 - see where things are relative to each other
 - you should see a clear picture
   - and if parts don't make sense, ask!

Standard I/O
 - you've learned about standard in, out, and error
 - correspond to file descriptors 0, 1, and 2 as we previously discussed
 - but how do we change them, and where do they point by default?

In most shells, you can use <, >, and | to redirect standard in and out
 - > redirect standard output
 - < redirect standard input
 - | redirect the standard output of one program to the standard input of another

These operators can work with arbitrary file descriptors, just give the number before them (for < and > at least)

Example, standard error redirection is 2> (so 1> is the same as >)

If I don't redirect standard in, out, and error, where are they going by default?
 - they are going to a file, but what file?

Whether you are in a graphical text window locally or have ssh'd to a remote system, you'll probably see bash's file descriptors referring to /dev/pts/? (where ? is a small number)

What is /dev/pts?
 - pseudo TTYs

A teletype is like a fancy telegraph
 - you type in things locally, they appear at a remote printer
 - remote person types, you see it locally, printed out
 - OLD-fashioned version of texting

When computers came around, early interactive interfaces were teletypes
 - computer was one side, rather than a human
 - was printing to paper rather than a screen
 - could produce A LOT of paper

Eventually the paper was replaced with a CRT (cathode ray tube)
 - see the VT100, a "video terminal"
 - teletype but with screen output, not paper output
 - pure text! (except sometimes for weird fonts with shapes)

There were LOTS of video terminals
 - many incompatibilities
 - so UNIX developed ways of dealing with different terminals
   - terminfo database
   - the TERM environment variable says which terminal you're using
 - when you connect to your class VM, what's the value of TERM for you?  It may be different from mine

Why does the type of terminal matter?
 - because not all text interfaces are the same!
 - when I run a program in a terminal that shows colors or positions text at specific locations on the screen, it is using special escape codes to do all that
   - it is just sending data to standard out, encoded so the terminal understands it
 - the terminfo files tell common libraries how to use common terminal capabilities
 - but how do you get a program like gnome-terminal to behave like a teletype, and how can ssh do the same?
   - they implement the pseudo tty interface via /dev/pts
   - any program that implements the pseudo tty interface can behave like a terminal to a program

So question, why does it have to be a special device?
 - because terminals are devices with special capabilities
   - special things for interacting with keyboard and screen
   - big thing traditionally is speed: we'd connect via modems, so buffering and local echo were necessary
 - different from a regular file

When you interact with a terminal, you're dealing with tech with 50+ years of history
 - lots of weird bits if you really dig in

Terminal tricks
 - reset state: reset
 - disable local echo (say for password entry): stty -echo

man stty to see all the things you can do with a terminal