OSWebSec: Code Injection Attacks
Operating System and web security Notes September 25, 2012
Good literature review - can be hard to do
Here is this body of work: what connects the pieces, and what are the patterns? What did the authors say, and how are the papers related? A review works like a compression algorithm that gives others a view of what is in there: what exactly the patterns are, what the papers have in common, and how they fall in chronological order. Whatever the patterns are, that's what your review is about. You are basically making an argument about what the patterns are, and citations are part of building that argument. A research proposal is exactly the same thing: you show that the patterns have a structure, but you also show that something is missing. This part has been covered, this part hasn't. Look at the possibilities that are there.
Smash the Stack
Look at the source code, or look for crashes (seg faults): I gave it some weird input and then it crashed. Bring out GDB, look at the stack, and see what happened. If it's this type of pattern, this is what you can do. There is a classic structure to shellcode: it gets you a shell. If you get a root shell on a Unix box, you own the machine. How do you get shellcode? What attackers really want is to run a system call. They want to run execve("/bin/sh" ---);
In the ideal world you would just inject into the process. System calls are not regular function calls; on 32-bit x86 Linux they go through int 0x80. Presumably you want to make sure you can interact with the shell in some way. If you are doing this over a network connection and you just do an execve, the open descriptors (including the connection) are inherited. All of this is done just by giving a program input. In programming language terms, what are we trying to construct? If I weren't talking about C, but about Lisp or Python? I want eval: I want the program to call eval on my data, turning my data into code. I mention this because it is going to become more important later on. We are just trying to construct eval the hard way: give it some data and turn it into code. The program doesn't want to be eval, so you have to kind of work with it.
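The eval framing is easy to see in a language that has eval built in. A minimal Python sketch (the function names `handle_as_data` and `handle_with_eval` are made up for illustration) of the difference between a program that treats input as data and one that turns it into code:

```python
def handle_as_data(user_input: str) -> int:
    """Treats the input purely as data: it only measures its length."""
    return len(user_input)


def handle_with_eval(user_input: str) -> object:
    """Passes the input to eval, so the input is executed as code."""
    return eval(user_input)


attacker_input = "sum(range(5))"
print(handle_as_data(attacker_input))    # the string is only measured
print(handle_with_eval(attacker_input))  # the string runs as an expression
```

A C program has no eval to call, which is why the attacker has to build the same effect out of a memory error.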
There is nothing special about execve per se; it's just the most straightforward way of getting what we want. You want as small a payload as possible.
How are we going to execute it? How are we going to trick it? To get this to run:
- send code as data
- get the instruction pointer to point to the code (how do I change the instruction pointer?)
  - overwrite the IP stored in memory (you have to know the address of the next instruction)
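The steps above can be imitated with a toy stack frame. This is a simulation in Python, not a real exploit. The layout (a 16-byte buffer directly followed by a 4-byte little-endian saved IP), the addresses, and all names are assumptions for illustration. A real x86 frame would usually also have a saved frame pointer in between.

```python
import struct

BUF_SIZE = 16          # size of the local buffer (assumed)
FRAME_BASE = 0x1000    # made-up address where the buffer starts


def make_frame(saved_ip: int) -> bytearray:
    """Build a toy frame: the buffer, then the saved instruction pointer."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", saved_ip))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the frame without checking BUF_SIZE."""
    frame[0:len(data)] = data


def read_saved_ip(frame: bytearray) -> int:
    """Read back the saved instruction pointer stored after the buffer."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]


frame = make_frame(0x08048ABC)
# 16 bytes fill the buffer; the next 4 bytes land on the saved IP.
payload = b"\x90" * BUF_SIZE + struct.pack("<I", FRAME_BASE)
unchecked_copy(frame, payload)
print(hex(read_saved_ip(frame)))  # now points back into the buffer
```

Input that fits in the buffer leaves the saved IP alone. Only the bytes past `BUF_SIZE` change where the function would return.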
The NOP sled and the repeated return addresses account for the inexact nature of where the address will be. This is a little fragile: you are putting things in specific places, and even in the best scenarios that is tricky. Attackers attempting to exploit buffer overflows will attack a bunch of machines. It won't work on all of them, but it should at least get some.
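Why the sled helps with an inexact address can be shown with a toy model. This is a sketch under assumptions: memory is a byte string, the "CPU" only understands NOP (0x90, the x86 one-byte no-op), and a marker byte stands in for the start of the shellcode. The names are hypothetical.

```python
NOP = 0x90        # x86 one-byte no-op
PAYLOAD = 0xCC    # stand-in marker for the first payload byte, not real shellcode
JUNK = 0x00       # filler that the toy CPU cannot execute


def build_buffer(sled_len: int, junk_len: int) -> bytes:
    """Layout: junk bytes, then the NOP sled, then the payload marker."""
    return bytes([JUNK] * junk_len + [NOP] * sled_len + [PAYLOAD])


def reaches_payload(memory: bytes, start: int) -> bool:
    """Toy CPU: slide forward over NOPs; succeed only if the payload is hit."""
    ip = start
    while ip < len(memory) and memory[ip] == NOP:
        ip += 1
    return ip < len(memory) and memory[ip] == PAYLOAD


def count_hits(memory: bytes) -> int:
    """Number of guessed start addresses that end up running the payload."""
    return sum(reaches_payload(memory, s) for s in range(len(memory)))


print(count_hits(build_buffer(sled_len=0, junk_len=32)))   # 1 good guess out of 33
print(count_hits(build_buffer(sled_len=24, junk_len=8)))   # 25 good guesses out of 33
```

Without a sled only the exact address works. With a sled, any guess that lands in it slides into the payload.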
Return oriented programming
Intro / motivation
What is the paper about, and why should we care about what they are doing? The attack uses code that is already marked as trusted on your system. The exploit is now valid on multiple architectures. Return oriented programming had already been published; this paper shows it is general, because it works on SPARC too. They also made it really easy to do. SPARC is the opposite of x86: if these two are exploitable, then all the ones in between are exploitable too.
Return-to-libc attacks use shared libraries to do the attack: you don't inject the code, you just inject the call to the library. You make a function call to the system function and give it the argument /bin/sh. So why is this scarier than return-to-libc attacks? (Defenses against return to libc reduce the functionality of the library functions.) You could potentially use it for other libraries, but they've done it for the C library. They are always claiming that “we are doing this wonderful thing!”
Variable-length instructions make it easier: variable length gives more entry points, similar to how having more code in memory gives you more places to choose from as attack points. Instruction length isn't a protection either way, though: the attack is still very doable on an architecture with fixed-length instructions. They are trying to distinguish this paper from the last paper.
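The point about variable-length instructions giving more entry points can be seen on a few hand-assembled x86 bytes. In this sketch the byte string and the list of intended instruction starts are written by hand for illustration; they do not come from the paper. `B8` followed by four bytes is `mov eax, imm32`, `90` is `nop`, and `C3` is `ret`.

```python
RET = 0xC3  # x86 near-return opcode

# Intended decoding: mov eax, 0xC3 ; nop ; ret
code = bytes([0xB8, 0xC3, 0x00, 0x00, 0x00, 0x90, 0xC3])
intended_starts = {0, 5, 6}  # byte offsets where the intended instructions begin


def ret_offsets(code: bytes) -> list:
    """Every byte offset holding the RET opcode, aligned to an instruction or not."""
    return [i for i, b in enumerate(code) if b == RET]


all_rets = ret_offsets(code)
unintended = [i for i in all_rets if i not in intended_starts]
print(all_rets)     # offsets of every 0xC3 byte
print(unintended)   # the 0xC3 hidden inside the mov's immediate operand
```

Jumping to offset 1 decodes the immediate byte as a `ret` that the programmer never wrote. With fixed-length, aligned instructions that kind of unintended entry point is not available, so the attacker has to work with the returns that are really there.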
Related work
Stack canaries (ProPolice) for detecting stack overwrites, address obfuscation, and code islands. You still care about our work because it's still possible to do these attacks. W^X (write XOR execute): non-executable memory. If you overwrite other instruction pointers you can potentially get control. It only comes into play when you have control of the IP. Once the attacker is in your stack space, the idea is to make the attacker unable to predict the layout of memory. The ethics part is something to keep in mind. This being published in the open literature: well, attackers are already doing this, so we should tell everyone about it. There is also the aspect that these guys are really smart and are spending a lot of time thinking about it. Giving out these gadgets: here, hit me please.
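The stack canary idea from the related work can be sketched with a toy frame. This is a simulation with an assumed layout (16-byte buffer, 4-byte canary, 4-byte saved IP) and hypothetical names. It is not how ProPolice itself is implemented.

```python
import os
import struct

BUF_SIZE = 16  # size of the local buffer (assumed)


def make_frame(canary: bytes, saved_ip: int) -> bytearray:
    """Toy frame: buffer, then the canary, then the saved instruction pointer."""
    return bytearray(BUF_SIZE) + bytearray(canary) + bytearray(struct.pack("<I", saved_ip))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the frame without checking BUF_SIZE."""
    frame[0:len(data)] = data


def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """The check made before returning: is the canary still the expected value?"""
    return bytes(frame[BUF_SIZE:BUF_SIZE + len(canary)]) == canary


canary = os.urandom(4)              # random value the attacker cannot predict
frame = make_frame(canary, 0x08048ABC)
print(canary_intact(frame, canary))   # untouched frame passes the check
unchecked_copy(frame, b"A" * 24)      # overflow runs through canary and saved IP
print(canary_intact(frame, canary))   # the overwrite is detected before the return
```

A linear overflow has to cross the canary to reach the saved IP, so the damage is noticed. The canary does nothing about other ways of getting control of an instruction pointer.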
Dealing with SPARC
Fixed-length versus variable-length instructions: with fixed length they have to use the ends of functions. Return bytes show up very frequently in these things, and that's how they maintain control. If you don't have the return bytes you are screwed. What is SPARC used on? Solaris. SPARC is a RISC architecture with register windows (sliding windows), a quirky thing. The main point is that it's a RISC architecture used for a lot of servers.
Gadgets
There are two particular things. What is the gadget itself? They have a table of them: one or two instructions just before a return that already exist in the code. You want to keep control, so gadgets usually do not have branches and sit right next to a return. In a sufficiently large, complicated program you can find one of everything next to a return. Once you have this collection you can do whatever you want. Return-to-libc functions are chained. The challenge is to provide arguments to each function: building that string, and placing your future arguments in that string so that the calls in the sequence invoke each other. That's what they solve for us in the next section. It doesn't have to be libc; it could be Chrome, something that has thousands of return statements all over. Gadgets are simple pieces of code.
API
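The gadget chaining described in these notes can be imitated with a toy interpreter. This is a Python sketch, not the paper's gadget set. The three gadgets, their addresses, and the single accumulator register are invented for illustration. Each gadget does one small thing and then "returns", which here means taking the next address from the attacker-controlled stack.

```python
def pop_acc(state, stack):
    """Gadget: load the next stack word into the accumulator."""
    state["acc"] = stack.pop(0)


def add_acc(state, stack):
    """Gadget: add the next stack word to the accumulator."""
    state["acc"] += stack.pop(0)


def double_acc(state, stack):
    """Gadget: double the accumulator; takes no operand from the stack."""
    state["acc"] *= 2


# Made-up addresses where these gadgets would sit in existing code.
GADGETS = {0x100: pop_acc, 0x200: add_acc, 0x300: double_acc}


def run_chain(stack_words):
    """Run the chain: every 'ret' pops the next gadget address off the stack."""
    state = {"acc": 0}
    stack = list(stack_words)
    while stack:
        addr = stack.pop(0)          # the return: the stack decides what runs next
        GADGETS[addr](state, stack)  # the gadget may consume its operands
    return state["acc"]


# The attacker's stack interleaves gadget addresses with their arguments.
chain = [0x100, 5, 0x200, 7, 0x300]
print(run_chain(chain))  # (5 + 7) * 2 = 24
```

No new code is injected: the program is the sequence of addresses and arguments on the stack, and placing each argument where its gadget will pop it is the bookkeeping the notes call the challenge.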