Operating Systems 2014F Lecture 6

Audio from the lecture given on September 24, 2014 is now available.

Library: code you did not write, but which is sitting in your address space. The routines in there can be called, but at first look it is opaque: I call it to do something, and it should do what I ask. On older systems, that's what the operating system is: just routines. You might call them directly, or you might call a dispatcher. If you don't like what the operating system does, you just write your own routine to do it.

What happens if the operating system changes? Your code becomes tied to the exact version of the operating system.

When you call a library you call an API. You can access private functions / data structures directly and it will work, but when the API changes your code will break.

When you used an Apple II or Commodore 64, you wouldn't normally have a hard disk with everything you were running; you would load off of a floppy. There might be multiple programs on that disk. It had the operating system on the disk as well. It would first load the OS and then the program would load off the disk. Every bootable floppy would have a copy of the OS. Stick in the first floppy of the program, and at some point the program would ask you to put in another disk.

What would you do for copy protection? You would modify the operating system on the disk. The operating system had such low level control. Take one, create one from scratch or modify one. On these systems, the OS is just a library. A library that can do I/O stuff.

If you don't like it and you want to mess with it, you can. You can overwrite the OS. It's just routines to help you. This is good when your program essentially controls the machine. Let's say I want to run more than one program. Instead of code and data..

Every time you see something new, it is built upon something older. An operating system, code and data, all sitting in one address space. We'll give everyone their own address space. Everyone will have their code and data.

A process cannot see the other processes running. Processes are address spaces with some number of virtual CPUs attached. Single-threaded processes have one virtual CPU. A library is code that is linked into your address space. This code knows how to access the larger operating system. You want a clean API that takes care of the mess so you don't have to. Which of the libraries are part of the operating system, and which aren't?

No one talks directly to the kernel. Something in the libraries has to know something about the kernel. What abstracts the kernel? What services does this OS kernel provide? It provides access to disk, memory, GPU / video hardware: all the different hardware. At what level of abstraction does this stop? Does the OS provide access to graphics on the screen as a simple memory frame buffer, or is it a higher-level library that provides windows and widgets? Does it provide compositing? These are all libraries that are provided in some way. They may in fact delegate some tasks to a special process. Why do you do this? Because the kernel gets too big. You may want to provide some high-level functionality, so you dump it into another process. At the end of the day it is a bit arbitrary where you say this functionality is in the operating system and this is not. Modern operating systems (Windows, iOS, Android) provide an embeddable web view object as a standard service. You can't easily change it, although you can install alternate browsers on those systems. Are we saying the web browser is part of the operating system now? It kind of is. It's big, and it doesn't look like the other stuff we're talking about.

What defines the operating system changes over time depending on what you are expecting of it. Why don't we optimize that old stuff and simplify it? There's still code that assumes the old ways. The only time you break those old systems is when you bring a new operating system out. Most of the code is code from older operating systems, with relatively few things changed.

Traps do not work like regular function calls because they have to change the state. CPU has to change modes. Going from user mode to supervisor mode. In Supervisor mode, the CPU allows everything to be accessed. In user mode, access is restricted according to what was set in supervisor mode. Just remember the first abstraction for creating , it was a thing embodied in hardware. People tried to run multiple programs in that address space, and realized it was unpleasant, so they built hardware. How this happens, there are a lot of details to figure out. How the CPU is switched between things. Here we are talking about how memory is switched.
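As a rough illustration of the difference between an ordinary library call and a trap into the kernel, here is a minimal sketch that assumes Linux with glibc. The function names `pid_via_wrapper` and `pid_via_raw_syscall` are made up for this example. Both ask the kernel for the process ID; the second goes through the generic `syscall()` entry point, which loads the system call number and executes the trap instruction that switches the CPU from user mode to supervisor mode.

```c
#define _DEFAULT_SOURCE
#include <sys/syscall.h>   /* SYS_getpid: system call number (Linux-specific) */
#include <sys/types.h>
#include <unistd.h>        /* getpid(), syscall() */

/* Ask for the process ID through the normal C library wrapper. */
long pid_via_wrapper(void)
{
    return (long)getpid();
}

/* Ask for the same thing by system call number. syscall() is itself a
 * library routine, but all it does is set up the number and arguments
 * and trap into the kernel. */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}
```

Calling both functions from the same process should give the same value, since they reach the same kernel service by two routes.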

How do you say which memory is going to be active? Two basic ways. More general mechanism.

1) mmap -

When you access memory that you have not specifically asked for, the kernel sends an error to your process (segmentation fault). If you don't catch that error, your process will terminate.

The library is a file on disk. This also happens for your executable. This isn't done directly through an mmap call; it is what execve does. It maps the file, and pieces are loaded as the program is accessed. If you only use 10% of the code in your executable, only 10% of the code will be loaded. When you think of a map, think of it as a reference: if you access this byte range in memory, go to this address and load it, as needed. Wait, if I do that on every memory access, wouldn't that be slow? Prelinking. Yes, it can be slow; this is one way of getting things into your address space. It happens to all of the libraries using mmap. For the most part those are loaded read-only. On modern systems, you normally do not write self-modifying code. We avoid it, except in specialized situations.
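To make the "map as a reference" idea concrete, here is a minimal sketch assuming a POSIX system. The function name `first_byte_via_mmap` is hypothetical and only for illustration. It maps a file read-only and then reads one byte; the `mmap` call only sets up the mapping, and the file contents are brought in when the memory is actually touched.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>      /* open() */
#include <stddef.h>     /* size_t, NULL */
#include <sys/mman.h>   /* mmap(), munmap() */
#include <sys/stat.h>   /* fstat() */
#include <unistd.h>     /* close() */

/* Map a file read-only and return its first byte, or -1 on failure. */
int first_byte_via_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    /* This only establishes the mapping; no file data has to be read yet. */
    unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                            MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    /* Touching the byte is what makes the kernel fetch that page. */
    int value = p[0];

    munmap(p, (size_t)st.st_size);
    return value;
}
```

The mapping is read-only (`PROT_READ`), which matches how library code is usually mapped; a write through `p` would produce a segmentation fault.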

You have in the data area from the top of memory:

1) the stack 2) the heap

Does it have to be done this way? Nope. But it turns out these things have distinct uses. The stack is for the function call stack. Say function A calls B, B calls C, and C calls D. When function D finishes, how do you resume C from where it was? That's a stack: when you enter a function you push its return address onto the stack, and when you exit the function you pop it off. Since you are already having to push return addresses on and off of the stack, what also gets pushed are arguments and local variables: the storage associated with functions. You can write programs that just pass everything via arguments and do everything with local variables. But there is one fundamental limitation of using the stack: data can only persist for the scope of the function in which it is allocated. You could allocate a bunch of stuff in main, but when main exits everyone exits. If you allocate it in any other function, you could still keep a pointer to it after the function returns, but that memory is going to be reused for something else. Bad thing to do in C. Local variables live on the stack.

For short term state.

function call stack: return addresses, arguments, local variables. (for short term state)

heap: "long term" state. - stick data structures with arbitrary
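A small sketch of the stack versus heap lifetime difference, with hypothetical function names. The broken stack version is left in a comment so the block stays safe to compile; the heap version returns storage that persists until the caller frees it.

```c
#include <stdlib.h>   /* malloc() */
#include <string.h>   /* strlen(), strcpy() */

/* BROKEN, shown only as a comment: buf lives in this function's stack
 * frame, so the returned pointer refers to memory that the next
 * function call is free to reuse.
 *
 *     char *make_greeting_on_stack(void)
 *     {
 *         char buf[16] = "hello";
 *         return buf;
 *     }
 */

/* Heap version: the storage outlives the function that allocated it.
 * The caller owns the result and must call free() on it. */
char *make_greeting_on_heap(void)
{
    const char *text = "hello";
    char *buf = malloc(strlen(text) + 1);
    if (buf != NULL)
        strcpy(buf, text);
    return buf;
}
```

A caller would use the result and then pass it to `free()` once the data is no longer needed.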

What happens if you accidentally free something, and then use it? Some malloc implementations try to put things into here to detect those problems. But malloc cannot just move data around behind the program's back, because all it gives the program is a pointer. The pointer is a reference to an address. Malloc can't go around and change that pointer. If you want to be able to move memory around, you have to use a handle. A handle is a pointer to a pointer.
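Here is a minimal sketch of the handle idea. The names `handle`, `handle_new`, `handle_relocate`, and `handle_free` are invented for this example and are not a standard API. The program holds the address of a master pointer; the allocator is free to move the block as long as it updates the master pointer, so the program's handle stays valid.

```c
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcpy() */

typedef void **handle;   /* a handle is a pointer to a pointer */

/* Allocate a block plus a master pointer to it; return the handle,
 * or NULL if either allocation fails. */
handle handle_new(size_t size)
{
    void **master = malloc(sizeof *master);
    if (master == NULL)
        return NULL;
    *master = malloc(size);
    if (*master == NULL) {
        free(master);
        return NULL;
    }
    return master;
}

/* Move the block somewhere else, as a compacting allocator might.
 * Only the master pointer changes; the handle itself does not.
 * Returns 0 on success, -1 if no new block could be allocated. */
int handle_relocate(handle h, size_t size)
{
    void *fresh = malloc(size);
    if (fresh == NULL)
        return -1;
    memcpy(fresh, *h, size);
    free(*h);
    *h = fresh;
    return 0;
}

/* Release both the block and its master pointer. */
void handle_free(handle h)
{
    if (h != NULL) {
        free(*h);
        free(h);
    }
}
```

The cost is an extra dereference on every access (`*h` before use), and the program must not cache `*h` across any point where the block might be moved.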

What is the modern solution for doing this? Where is the real problem with this? All the data structures sitting in the heap have different lifespans. Is there room for a new one in there? Maybe not, but if I compacted everything I might have room over here. Modern programming environments (Java, JavaScript, Python) don't rely on something as primitive as malloc. You don't want to manually free memory. You just allocate it and go. Garbage collection is the collective result of programmers going "I hate managing memory." They make a program do it for them. It collects the garbage. I allocated this memory for a data structure. Am I done using this data structure? I don't want to have to manually call free. C and C++ are primitive programming languages: you are responsible for managing your own memory.