Exploit generation, a specialisation of testing?

It sounds like a silly question, doesn’t it? Nobody would consider exploit development to be a special case of vulnerability detection. That said, all research on exploit generation that relies on program analysis/verification theory has essentially ridden on the coat-tails of research and tools developed for test-case generation. (From now on, assume these are the projects I’m discussing. Other approaches exist based on pattern matching over program memory, but they are riddled with their own problems.) The almost standard approach to test-case generation combines data flow analysis with some sort of decision procedure: we build formulae over the paths executed and solve them to create inputs that exercise new paths. This is also the exact approach taken by all exploit generation projects.
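To make the "formulae over executed paths" loop concrete, here is a minimal sketch of it. Everything in it is an assumption made for illustration: `program` is a made-up target, the recorded branch predicates stand in for real data flow analysis, and `solve` is a brute-force search over a tiny domain standing in for a real decision procedure.

```python
from itertools import product


def program(x, y, trace):
    """Toy program under test; records each branch as (predicate, outcome)."""
    def branch(pred):
        outcome = pred(x, y)
        trace.append((pred, outcome))
        return outcome

    if branch(lambda a, b: a > 10):
        if branch(lambda a, b: a + b == 20):
            return "deep"
        return "mid"
    return "shallow"


def solve(constraints, domain=range(32)):
    """Stand-in decision procedure: brute-force search over a tiny input domain."""
    for x, y in product(domain, repeat=2):
        if all(pred(x, y) == want for pred, want in constraints):
            return (x, y)
    return None  # no input in the domain satisfies the constraints


def generate_tests(seed):
    """Explore new paths by negating one branch at a time along each executed path."""
    worklist, seen_paths, tests = [seed], set(), []
    while worklist:
        inp = worklist.pop()
        trace = []
        result = program(*inp, trace)
        path = tuple(outcome for _, outcome in trace)
        if path in seen_paths:
            continue  # this path has already been analysed
        seen_paths.add(path)
        tests.append((inp, result))
        for i, (pred, outcome) in enumerate(trace):
            # Keep the path prefix, flip branch i, and ask for a matching input.
            new_inp = solve(trace[:i] + [(pred, not outcome)])
            if new_inp is not None:
                worklist.append(new_inp)
    return tests


if __name__ == "__main__":
    for inp, result in generate_tests((0, 0)):
        print(inp, result)
```

Note that every new input is derived purely from the constraints of a path that has already been executed, which is the property the rest of this post leans on.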

There are pros and cons to this relationship. For instance, some activities are crucial to both test-case generation and exploit generation, e.g., data flow and taint analysis. Algorithms for these activities are almost standardised at this stage, and when we work on exploit generation we can basically lift code from test generation projects. Even for these activities, though, exploit generation presents enough differences and opportunities that some re-engineering is worthwhile. For example, during my research I extended the taint analysis to reflect the complexity of the instructions involved in tainting a location. When building a formula to constrain a buffer to shellcode we can then use this information to pick the locations that result in the least complex formulae. An exploit (usually) only needs a single successful formula, so we can pick and choose the locations we want to use. Testing, on the other hand, typically requires exhaustive generation, so this optimisation hasn’t previously been applied there because the benefits are less evident (though it still might be a decent way of increasing the number of test cases generated in a set time frame).
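As a rough illustration of complexity-aware taint tracking, the sketch below attaches a cost to each tainted location and then picks the cheapest contiguous run of tainted bytes to hold shellcode. It is not the implementation from my research: the `OP_COST` weights, the trace format, the example addresses and the "sum of source costs" metric are all assumptions chosen to keep the example small.

```python
# Hypothetical weights: how much each operation complicates the resulting formula.
OP_COST = {"mov": 0, "add": 1, "xor": 1, "mul": 3}


def propagate(trace, input_locs):
    """Return {location: complexity} for every location tainted after the trace."""
    taint = {loc: 0 for loc in input_locs}  # raw input bytes cost nothing
    for dst, op, srcs in trace:
        tainted_srcs = [s for s in srcs if s in taint]
        if tainted_srcs:
            taint[dst] = OP_COST[op] + sum(taint[s] for s in tainted_srcs)
        else:
            taint.pop(dst, None)  # overwritten with untainted data
    return taint


def cheapest_run(taint, length):
    """Find the contiguous run of tainted memory with the lowest total complexity.

    Memory locations are ints, registers are strings. Returns (start, cost) or None.
    """
    best = None
    for start in sorted(a for a in taint if isinstance(a, int)):
        window = range(start, start + length)
        if all(a in taint for a in window):
            cost = sum(taint[a] for a in window)
            if best is None or cost < best[1]:
                best = (start, cost)
    return best


# Made-up trace of (destination, operation, sources) records.
INPUTS = ["in0", "in1", "in2", "in3"]
TRACE = [
    ("eax", "mul", ["in0", "in1"]),   # eax depends on a multiplication
    (0x100, "mov", ["eax"]),          # complexity 3
    (0x101, "add", ["in1", "in2"]),   # complexity 1
    (0x102, "mov", ["in2"]),          # complexity 0
    (0x103, "mov", ["in3"]),          # complexity 0
    (0x104, "mov", ["const"]),        # untainted
]
taint = propagate(TRACE, INPUTS)

if __name__ == "__main__":
    print(cheapest_run(taint, 2))
```

With this trace, a two-byte payload is best placed at `0x102`, where both bytes are plain copies of input, rather than at `0x100`, where the formula would have to reason about a multiplication.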

The two problems share other similarities as well. In both cases we find ourselves often dissatisfied with the results of single path analysis. When generating an exploit the initial path we analyse might not be exploitable but there may be another path to the same vulnerability point that is. Again in this case we can look to test case generation research for answers. It is a common problem to want to focus on testing different sub-paths to a given point in a program and so there are algorithms that use cut points and iterative back-tracking to find relevant paths. So with such research available one might begin to think that exploit generation is a problem that will be inadvertently ‘solved’ as we get better at test case generation.

Wrong.

With test case generation, all test cases are essentially direct derivatives of the analysis of a previous test case. We build a formula that describes a run of the program, negate a few constraints or add on some new ones, and generate a new input. Continue until boredom (or some slightly more scientific measure). What I am getting at is that all the information required for the next test is contained within the path executed by a previous test. Now consider an overflow on Windows where we can corrupt the most significant byte of a function pointer that is eventually used. If you decide to go down the ‘heap spray’ route to exploit this vulnerability you immediately hit a crucial divergence from test case generation. In order to successfully manipulate the structure of a program’s heap(s) we will almost always require information that is not contained in the path executed to trigger the vulnerability initially. Discovering heap manipulation primitives is a problem that requires an entirely different approach to the test case generation approach of data flow analysis + decision procedure over a single path. It is also not a problem that is ever likely to be solved by test case generation research, as it really isn’t an issue in that domain. Whole classes of vulnerabilities relating to memory initialisation present similar difficulties.

What about vulnerability classes that fit slightly better into the mould carved out by test generation research? One of the classes I considered during my thesis was write-4-bytes-anywhere style vulnerabilities. Presuming we have a list of targets to overwrite in such cases (e.g. the .dtors address), this is a solvable problem. But what if we only control the least significant byte (or word) and can’t modify the address to equal one of the standard targets? Manually, one would usually look at what interesting variables fall within the controllable range, looking for those that will be at a static offset from the pointer base. But what is an ‘interesting variable’? Let’s assume there are function pointers within that range. How do we automatically detect them? Well, we’d need to monitor the usage of all byte sequences within the range we can corrupt. It’s a problem we can approach using data flow/taint analysis, but once you start to consider that solution it starts to look a lot like a multi-path analysis problem over a single path. We are no longer considering just data that is definitely tainted by user input; we are considering data that might be, and as we can only control a single write we have different ‘paths’ depending on what bytes we choose to modify... and we’re doing this analysis over a single concrete path? Fun.
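The first step of that search, working out which observed uses of memory fall inside the range a partial overwrite can reach, might be sketched as follows. The pointer value, the `(address, kind)` records and the `"indirect_call"` label are all hypothetical; in practice they would come from monitoring the concrete run, and this sketch says nothing about the harder question of which of the resulting ‘paths’ is actually exploitable.

```python
def reachable_range(pointer, controlled_bytes):
    """Lowest and highest address reachable when only the low bytes can change."""
    mask = (1 << (8 * controlled_bytes)) - 1
    base = pointer & ~mask  # the upper bytes stay fixed
    return base, base + mask


def candidate_targets(pointer, controlled_bytes, observed_uses, kind="indirect_call"):
    """Addresses of the given kind that the corrupted write pointer can be aimed at.

    observed_uses: hypothetical (address, kind) records gathered from the trace.
    """
    lo, hi = reachable_range(pointer, controlled_bytes)
    return sorted(addr for addr, k in observed_uses if k == kind and lo <= addr <= hi)


# Made-up example data.
WRITE_POINTER = 0x0804A1C4
OBSERVED_USES = [
    (0x0804A1F0, "indirect_call"),  # function pointer close to the original target
    (0x0804A120, "data_read"),      # in range, but not used as a call target
    (0x0804A2F0, "indirect_call"),  # only reachable with two controlled bytes
    (0x0804A010, "indirect_call"),  # only reachable with two controlled bytes
]

if __name__ == "__main__":
    for n in (1, 2):
        hits = candidate_targets(WRITE_POINTER, n, OBSERVED_USES)
        print(n, [hex(a) for a in hits])
```

Each candidate returned here corresponds to a different choice of bytes to write, which is where the ‘multiple paths over a single concrete path’ flavour of the problem comes from.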

I guess the core issue is that test-case generation and exploit generation are close enough that we can get adequate results by applying the algorithms developed for the former to the latter. To get consistently good results though we need to consider the quirks and edge cases presented by exploit generation as a separate problem. Obviously there are many useful algorithms from test case generation research that can be applied to exploit generation but to apply these blindly misses opportunities for optimisations and improvements (e.g. the formula complexity issue mentioned). On the other hand there are problems that will likely never be considered by individuals working on test case generation; these problems will require focused attention in their own right if we are to begin to generate exploits for certain vulnerability classes.