At INFILTRATE ’14 I gave a talk on the topic of fuzzing language interpreters. The slides are now available here. The results generated by the system presented and, subsequent, related work, were sufficiently good that my bottleneck quite soon moved from bug discovery to crash triage, which ended up forming the basis for my talk at INFILTRATE ’16.
The question addressed in the talk is ‘How can we fuzz stateful APIs in an efficient manner?’, where ‘efficient’ in this case means we want to reduce the number of wasted tests. For a stateful API, it’s probable that a given API call requires a preceding set of API calls to have set the environment into a particular state before it does anything ‘interesting’. If we don’t make an attempt to set up the environment correctly then we will likely end up with a lot of our tests discarded without exercising new functionality.
As you’ve probably guessed, and as I would assume many other people have concluded before me: a good source of such information is existing code written using the API. In particular, regression tests (and other test types) often provide self-contained examples of how to exercise a particular API call.
The talk itself is in two parts: firstly, a minimal system which works versus ‘easy’ targets, such as PHP and Ruby, and secondly, a more complex system which works versus more difficult targets, such as the JS interpreters found in web browsers.
Given that this talk was two years ago, the ideas in it have evolved somewhat and if you are interested in fuzzing either category of software mentioned above I would offer the following advice:
- For the easier targets, compile them with ASAN, extract the tests as mentioned and mutate them using something like radamsa. ASAN plus some batch scripts to automate execution and crash detection is sufficient to reproduce the results mentioned and also to find bugs in the latest versions of these interpreters. The system mentioned in the slides ended up being overkill for these targets.
On the topic of program analysis: if you are interested in learning about SMT-based analysis engines the early-bird rate for the public editions of ‘Advanced Tool Development with SMT Solvers’ is available for another week. The details, including a syllabus, are here, and if you are interested drop me an email via firstname.lastname@example.org.