SystemsSec 2018W Lecture 21
Last class we talked about virtualization and the security it gives you. With user mode and supervisor mode we get different privilege levels, and the CPU enforces them for you. But what about when you don’t have hardware to help? How do you implement privileges inside a process? In a browser, for example, code in a page shouldn’t have access to everything the browser can do. Modern browsers use the OS multiprocess model for protection: separate tabs run in separate processes. Inside a single process we still want to limit access to resources, and we do this with an abstraction: the language runtime.
In classic compiled languages the pipeline is: source -> preprocessor -> compiler -> linker -> binary. The compiler takes the language and converts it into code that the processor executes. The linker just takes pieces of code and sticks them together, resolving symbolic references into actual addresses. By “processor” we don’t strictly mean the bare hardware: processors have their own small translation step that turns instructions into another, lower-level language. But the idea is that you can walk from your source code to the assembly and map the instructions onto each other.
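To make the linker step concrete, here is a minimal sketch of symbol resolution. The object-file format (dicts with "defines" and "code", and ("ref", name) placeholders) is invented for illustration and does not correspond to any real format; each code entry is assumed to occupy one address.

```python
# Toy linker. The object format below is hypothetical, for illustration only.
def link(objects, base=0x1000):
    """Lay objects out back to back and patch symbolic references with addresses."""
    symbols = {}
    address = base
    # Pass 1: give every defined symbol an absolute address.
    for obj in objects:
        for name, offset in obj["defines"].items():
            symbols[name] = address + offset
        address += len(obj["code"])
    # Pass 2: replace ("ref", name) placeholders with the resolved addresses.
    image = []
    for obj in objects:
        for word in obj["code"]:
            if isinstance(word, tuple) and word[0] == "ref":
                if word[1] not in symbols:
                    raise KeyError(f"undefined symbol: {word[1]}")
                image.append(symbols[word[1]])
            else:
                image.append(word)
    return image, symbols

main_o = {"defines": {"main": 0}, "code": ["call", ("ref", "helper"), "ret"]}
helper_o = {"defines": {"helper": 0}, "code": ["nop", "ret"]}
image, symbols = link([main_o, helper_o])
```

After linking, the symbolic reference to helper in main_o has been replaced by the address where helper_o ended up in the combined image.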
What is a language runtime and how do we implement it? The language runtime is the code that provides the environment your code runs in. For example, we expect command line arguments to be passed to main, and something has to do that. So any language higher level than assembly has some sort of language runtime.
In most languages, apart from C, we don’t have direct access to resources. C does have a language runtime, but it is very lightweight and mostly runs before main is called. Execution of a C program doesn’t start at the beginning of main: a whole bunch of system calls are made automatically before main runs. Once your C code is actually running, you’re basically running directly on the hardware, so there is no abstraction layer. Most other languages aren’t like that.
Most other languages, classically, used an interpreter. An interpreter is similar to a shell prompt: it takes instructions one line at a time, interprets each one, and then executes it. This is advantageous because it isn’t very complex; it is essentially a big set of case statements. It also allows fine-grained control, like array bounds checking, but we lose efficiency because it goes line by line.
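The "big set of case statements" idea can be sketched as follows. The instruction set (push, add, load, store) is made up for this example; the point is that every memory access passes through the interpreter, so it can check array bounds each time.

```python
def interpret(program, memory_size=4):
    """Run a list of (opcode, operand...) tuples one instruction at a time."""
    memory = [0] * memory_size
    stack = []
    for instr in program:
        op = instr[0]
        # The dispatch below is the "big set of case statements".
        if op == "push":
            stack.append(instr[1])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "store":
            index = instr[1]
            if not 0 <= index < len(memory):  # bounds check on every access
                raise IndexError(f"store out of bounds: {index}")
            memory[index] = stack.pop()
        elif op == "load":
            index = instr[1]
            if not 0 <= index < len(memory):  # bounds check on every access
                raise IndexError(f"load out of bounds: {index}")
            stack.append(memory[index])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack, memory

stack, memory = interpret(
    [("push", 2), ("push", 3), ("add",), ("store", 1), ("load", 1)]
)
```

A program that tries ("store", 9) with the default memory size is stopped with an IndexError instead of writing outside the array. The cost is that the check and the dispatch are repeated for every instruction.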
Can we get both safety and efficiency? Just-in-time (JIT) compilation does the compilation in chunks as the program runs. But compiling is slow, so we get around this by having multiple compilers, typically a fast one that produces code quickly and a slower one that optimizes code that runs often. Compilation is not the same thing as optimization, but a JIT can optimize, especially loops. By using JIT compilation, modern languages can get close to C’s speed.
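As a rough sketch of the "compile as you go" idea, the toy below interprets a sequence of operations until it has been run a threshold number of times, then builds a Python function for it and uses that from then on. Everything here (TinyJit, the op format, the threshold) is hypothetical, operands are assumed to be integers, and generating a Python lambda stands in for emitting machine code, which a real JIT would do.

```python
OPS = {"add": "+", "mul": "*"}

def interpret(ops, x):
    """Slow path: walk the operations one by one on every call."""
    for name, operand in ops:
        if name == "add":
            x = x + operand
        elif name == "mul":
            x = x * operand
        else:
            raise ValueError(f"unknown op: {name}")
    return x

def compile_ops(ops):
    """Fast path: turn the whole sequence into a single Python function."""
    body = "x"
    for name, operand in ops:
        body = f"({body} {OPS[name]} {operand!r})"
    return eval(f"lambda x: {body}")  # stand-in for emitting machine code

class TinyJit:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = {}     # how often each op sequence has been interpreted
        self.compiled = {}   # op sequence -> compiled function

    def run(self, ops, x):
        if ops in self.compiled:
            return self.compiled[ops](x)
        self.counts[ops] = self.counts.get(ops, 0) + 1
        if self.counts[ops] >= self.threshold:
            self.compiled[ops] = compile_ops(ops)  # code is hot: compile it
        return interpret(ops, x)

jit = TinyJit(threshold=2)
ops = (("mul", 2), ("add", 1))
results = [jit.run(ops, n) for n in range(4)]
```

The first two calls go through the interpreter; later calls use the compiled function. Code that runs only once never pays the compilation cost.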
Checking code at runtime isn’t efficient, but a JIT compiler can guarantee some properties so that they don’t need to be checked. There can be bugs in the JIT, though, and getting around the checks through such a bug gives an attacker C-like power. So how do we handle this? Instead of producing a binary, the compiler produces byte code, which is then run through a JIT to be translated into machine code. You can annotate the byte code with restrictions and context, and the JIT uses this information to enforce safety and to optimize.
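To see what "the compiler produces byte code" looks like in practice, Python's standard dis module can show the byte code that CPython's compiler generates for a function. This only illustrates the source-to-byte-code half: CPython mainly interprets this byte code rather than JIT-compiling it, and the exact instruction names differ between Python versions. The function add_one is just an example.

```python
import dis

def add_one(x):
    return x + 1

code = add_one.__code__
# Raw byte code as stored in the code object.
print(code.co_code.hex())
# The same byte code decoded into named instructions.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]
print(opnames)
```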
In this context, what is sandboxing? It means isolating code and limiting its access, kind of like a process. How do we implement this? Sandboxing is a goal; there are no implementation-specific details attached to it. NaCl (Native Client) runs x86 machine code in the browser but sandboxes it: something checks that the code is safe, ruling out instructions that violate the rules, before running it. But it needed to be more portable, so they made PNaCl, which runs byte code. The standardized successor to this idea is WebAssembly. Lesson: sandboxing is not a technology, it’s a goal: limiting access, usually by filtering, maybe with CPU help, etc.
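The "check the code before running it" approach can be sketched with a toy validator. This is not how NaCl works internally; the instruction set, the ALLOWED list, and the rules are invented for illustration. The validator scans the whole program once and refuses to run it if any instruction is forbidden or addresses memory outside the sandbox.

```python
ALLOWED = {"push", "add", "load", "store"}  # hypothetical safe instruction set

def validate(program, memory_size):
    """Return a list of rule violations; an empty list means the program may run."""
    problems = []
    for pc, instr in enumerate(program):
        op = instr[0]
        if op not in ALLOWED:
            problems.append(f"{pc}: forbidden opcode {op!r}")
        elif op in ("load", "store") and not 0 <= instr[1] < memory_size:
            problems.append(f"{pc}: address {instr[1]} outside sandbox")
    return problems

def run_sandboxed(program, memory_size=4):
    problems = validate(program, memory_size)
    if problems:
        raise PermissionError("; ".join(problems))
    memory, stack = [0] * memory_size, []
    # Opcodes and addresses were checked up front, so the loop does not re-check them.
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(instr[1])
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "load":
            stack.append(memory[instr[1]])
        elif op == "store":
            memory[instr[1]] = stack.pop()
    return memory

result = run_sandboxed([("push", 7), ("store", 2)])
```

A program containing ("syscall",) or ("store", 9) is rejected with a PermissionError before any of it executes. This is the filtering flavor of sandboxing: the rules are enforced by inspection rather than by hardware.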
What do we mean when we say that OSs sandbox apps? OS-level virtualization is at the heart of apps on mobile devices and containers on servers. In the system model, it’s like putting a box around one or more processes. The disadvantage is that kernels weren’t originally designed to do resource isolation, so resource sharing is complicated.