Fuzzing Language Interpreters Using Regression Tests

At INFILTRATE ’14 I gave a talk on the topic of fuzzing language interpreters. The slides are now available here. The results generated by the system presented, and by subsequent related work, were good enough that my bottleneck quite soon moved from bug discovery to crash triage, which ended up forming the basis for my talk at INFILTRATE ’16.

The question addressed in the talk is ‘How can we fuzz stateful APIs in an efficient manner?’, where ‘efficient’ in this case means we want to reduce the number of wasted tests. For a stateful API, it’s probable that a given API call requires a preceding set of API calls to have put the environment into a particular state before it does anything ‘interesting’; for example, a call to fwrite is only worth making once an earlier fopen has returned a valid file handle. If we don’t make an attempt to set up the environment correctly then we will likely end up with a lot of our tests being discarded without exercising new functionality.

As you’ve probably guessed, and as I would assume many other people have concluded before me: a good source of such information is existing code written using the API. In particular, regression tests (and other test types) often provide self-contained examples of how to exercise a particular API call.

The talk itself is in two parts: firstly, a minimal system which works versus ‘easy’ targets, such as PHP and Ruby, and secondly, a more complex system which works versus more difficult targets, such as the JS interpreters found in web browsers.

Given that this talk was two years ago, the ideas in it have evolved somewhat and if you are interested in fuzzing either category of software mentioned above I would offer the following advice:

  1. For the easier targets, compile them with ASAN, extract the tests as mentioned and mutate them using something like radamsa. ASAN plus some batch scripts to automate execution and crash detection is sufficient to reproduce the results mentioned, and also to find bugs in the latest versions of these interpreters (a rough sketch of such a loop is shown after this list). The system mentioned in the slides ended up being overkill for these targets.
  2. For the harder targets, a reimplementation of LangFuzz by Holler et al. is probably a good first port of call. Also, rather than going down the somewhat insane route of hand-coding Javascript parsing and AST traversal in Go, the esprima and estraverse libraries make the process a lot less painful. My only excuse for not doing this to begin with is along the lines of ‘I’ve just learned Go and I must use it for everything!’. I have too much free time.
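
For concreteness, here is a minimal sketch of the sort of loop point 1 describes. The seed directory, output paths and interpreter binary are placeholders, and crash detection simply looks for an ASAN report on stderr; treat it as an illustration of the workflow rather than the exact scripts I used.

import glob
import random
import subprocess

# Placeholder locations: a directory of snippets extracted from the
# interpreter's regression suite, and an ASAN-instrumented php binary.
seeds = glob.glob("extracted_tests/*.php")
target = "./php-asan/sapi/cli/php"

for i in range(100000):
    # radamsa reads the chosen seed and writes a mutated version to stdout
    mutated = subprocess.check_output(["radamsa", random.choice(seeds)])
    with open("testcase.php", "wb") as f:
        f.write(mutated)
    try:
        proc = subprocess.run([target, "testcase.php"],
                              capture_output=True, timeout=10)
    except subprocess.TimeoutExpired:
        continue
    # ASAN prints its report to stderr when it detects memory corruption
    if b"AddressSanitizer" in proc.stderr:
        # assumes a crashes/ directory already exists
        with open("crashes/crash-%d.php" % i, "wb") as f:
            f.write(mutated)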

On the topic of program analysis: if you are interested in learning about SMT-based analysis engines the early-bird rate for the public editions of ‘Advanced Tool Development with SMT Solvers’ is available for another week. The details, including a syllabus, are here, and if you are interested drop me an email via contact@vertex.re.

Some Early-Stage Work on Statistical Crash Triage

Last week at Infiltrate I presented some early-stage work on crash triage under the title “Automated Root Cause Identification for Crashing Executions”. The slides can be found here, and the presentation notes contain further details in draft form.

A better title for this work would probably have been “Statistical Crash Triage”, or possibly something involving the term “predicate synthesis”, as really those things are at the core of what was presented. The main algorithm can be summarised fairly quickly as follows (a rough sketch of the final scoring step is given after the list):

  1. Run the target application on all the good inputs and all the bad inputs. Record coverage information.
  2. Use the coverage information to predict what functions are likely relevant to the crash.
  3. Rewrite those functions to insert instrumentation which will record all computed values and other properties.
  4. Rerun the instrumented version of the target application on all the good and bad inputs.
  5. From the recorded state, and predicate templates, synthesise and evaluate predicates over the program’s variables.
  6. Apply a lightweight statistical analysis to the evaluations of these predicates to shake out those relevant to the crash from those that are not.
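
To make step 6 concrete, here is a rough sketch of one way the scoring could look. The statistic used (the difference between how often a predicate holds in crashing runs versus passing runs) and the example predicates are my own choices for illustration; the slides describe the actual analysis.

# evaluations maps a synthesised predicate to a pair:
# (crashing runs in which it held, passing runs in which it held)
def rank_predicates(evaluations, num_crashing, num_passing):
    scores = {}
    for pred, (true_in_crash, true_in_pass) in evaluations.items():
        crash_rate = true_in_crash / float(num_crashing)
        pass_rate = true_in_pass / float(num_passing)
        # 1.0 means the predicate held on every crash and never on a passing run
        scores[pred] = crash_rate - pass_rate
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# illustrative predicates only
ranked = rank_predicates({"len >= 256": (40, 1), "ptr == NULL": (3, 2)},
                         num_crashing=40, num_passing=200)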

The actual details are in the slides so I won’t go into them any further. The advantages of this approach are that it isn’t tuned towards one bug type or class of applications, and it simply relies on large amounts of input data and lightweight analyses to discover ‘interesting’ things. Alongside that, the analyses at each stage are parallelisable, so it’s quite simple to scale up. On a reasonably sized target (PHP) the approach is fast enough to fit into a normal workflow.

The most significant downside is that this generality means that if a bug is best described by a very complicated/precise predicate it may not be discovered (although simpler predicates may be, which can help a user manually discover the precise predicate more easily). In terms of the work as a whole there is also a weakness in the limited evaluation so far, but this is something I’m currently in the process of addressing. In particular, I’m working on testing with a wider variety of memory management issues, as those pose some unique challenges for the algorithm above.

Besides the analysis algorithm itself, I also want to quickly mention CrashCorpus*. This is a corpus I’m currently working on for the evaluation of crash triage (and related) tools. A significant issue with the academic literature is that the algorithms tend to be tested almost exclusively on toy programs or synthetic bugs. The goal of CrashCorpus is to eliminate that by providing easy to build, easy to run, real-world targets. I’ll have more info on this once I get closer to a public release.

* I’m still building towards a public release of the corpus as it needs more targets before it is truly useful, but if you’d like to contribute drop me an email (sean _at_ vertex.re). All I require is a set of crashing inputs, a set of non-crashing inputs and the source code for the target. I’ll take care of the rest.

 

Training Dates Confirmed (Plus a Contest for Students)

Good news everyone! The location and dates for the public edition of “Advanced Tool Development with SMT Solvers” have been settled and are as follows:

New York, August 15th-17th
London, August 22nd-24th

The full details, including pricing, can be found here, while the syllabus is here. I’ve been teaching this class since early 2013 to private groups so I’m quite looking forward to a public edition. If you’re interested in bug finding, RE, exploit dev or more general automated program analysis, and want an exercise-driven introduction to SMT-based analyses, then send an email to contact@vertex.re to reserve your place now. There’s an early-bird discount available until April 29th (UPDATE: deadline extended until May 13th!), and further discounts available for groups, so drop me an email for a quote!

Now, for the other part of this post: there will be a seat at each training* up for grabs for full-time students** who are not otherwise employed. I considered a few different options for ‘challenges’ but, realistically, I would much rather see what fun things people have come up with on their own than review responses to a contrived challenge of my own devising. So, we’re going to go with something simple (and hassle-free for everyone involved!): published something awesome in the past 6 months? Great! Send an email with a link to contact@vertex.re and whoever makes me go “Well that’s f**king awesome!” with the greatest degree of enthusiasm wins. Simple.

Ideally, your entry will have something to do with program analysis, bug finding, exploit dev, RE and such things, but use your imagination here. GitHub repos, blog posts, bug reports, exploits, academic papers, CTF write-ups, and so on, are all fair game.

To enter, send the following to contact@vertex.re:

  • A link to a write-up, paper or code, which has been published in the past 6 months
  • Your name
  • The university/college at which you are enrolled, and the title of the programme***
  • Which location (NYC or London) you can make it to

The contest will run until April 29th (UPDATE: deadline extended until May 13th) and I’ll notify the winners shortly thereafter.

* Just so it’s clear: students will still be responsible for their own travel, accommodation and all other costs except for the cost of the training.

** In order to get the most out of the course you will still need to meet the minimum requirements set out in the syllabus. In short: you’ll need a solid grasp of Python, and be comfortable reading plenty of x86 assembly.

*** If you win we’ll also need some proof that you are in fact enrolled in this programme full-time.

 

Public Edition of “Advanced Tool Development with SMT Solvers” Coming Soon!

SMT solvers are an interesting, and powerful, technology that can be leveraged as a reasoning engine in a variety of solutions. The most popular uses are symbolic execution systems, like KLEE and angr, and concolic execution/whitebox fuzzing systems, such as SAGE. Due to the nature of the technology, and the overwhelming number of academic publications on the topic, it can take a while to get to grips with the underlying theory, and even longer to figure out the strengths and limitations of such tools in practice.

For the above reasons, since 2013 I have taught the “Advanced Tool Development with SMT Solvers” course to students from a variety of backgrounds (you can see a high level syllabus here). It’s a fun course to teach as it’s a bit different from your usual info-sec training, and the material is constantly evolving as program analysis research progresses. While the material regularly changes, my goals for the class have remained the same: after the class students should be able to perform an informed cost-benefit analysis when considering integrating SMT-backed technologies into their workflow, use and extend publicly available SMT-based tools, begin developing their own SMT-based systems, and keep up with the published state of the art.

On to the point of this post: bar one public version in Iceland in 2013, I have exclusively hosted private versions of this class. Running one privately requires an organisation to have at least 6 individuals interested in the course, and obviously that isn’t always possible. As a result I’ve had a number of requests for a public edition, which until now I haven’t got around to following up on! Over the summer/early fall I intend to host one or more public editions and I’m currently soliciting feedback as to where they should be held. If you are interested, there is a poll here where you can let me know your location of choice (at the moment London and New York seem to be the most popular).

I’ll make a decision on the location in the next few weeks and if you’d like to be sent the details email contact@vertex.re or follow me on Twitter (@seanhn).

(If you do happen to have 6 or more people from your organisation interested in taking the course it is likely to be significantly more cost effective to organise a private, on-site training at your location. To do so email contact@vertex.re.)

Rust Compiler Plugins: A Simple Example

As my adventure at Persistence Labs has come to an end I’ve decided to move my blogging back here. In the interests of getting back to writing, I’m going to start with something rather mundane but also easily summarised: Rust is awesome. You should seriously consider putting some time into figuring out if it’s useful for your engineering tasks. For program analysis related experimentation I’ve found it provides a nice balance between execution speed and productivity.

That’s it really, but in the interests of verbosity I’m going to quickly run through one rather pleasing discovery I made when flicking through the documentation a few months back. Most conversations on Rust’s advantages revolve around its approach to types, memory safety and other core language and runtime features. Way down at the end of the Rust book, in the ‘nightly’ section, I discovered that alongside all of these advantages Rust also makes it very simple to write compiler plugins, which can be used for a variety of ends. A minor feature in comparison to the other strides Rust makes, but an important one to include as a (nearly) first-class member of the language. Tooling is a significant factor in determining language adoption, and standard infrastructure for developing such tooling is a useful foundation.

Compiler plugins are still considered unstable in Rust, and are enabled only on the nightly build, but they’re easy to work with. An example of a linter is provided in the Rust repository, but no sample of a standalone linter package, and its inclusion/use to check other code, is provided (that I could see). To figure out what exactly is involved in that, and to get a feel for how one writes and uses a compiler plugin, I created pedantrs, a very simple compiler plugin which can be included in other Rust projects and will run a few very basic lint checks. The linter isn’t intended to be used “for real”, but it will hopefully provide answers if you’re wondering how to create a simple compiler plugin project and make use of it. Here I’ll just run through some of the things that are underdocumented [1] elsewhere or took me some experimentation to figure out.

Crate Setup

Compiler plugins are libraries and can be created as per usual via cargo new. You indicate that you are creating a compiler plugin by setting plugin = true in the [lib] section of the Cargo.toml.

Early vs Late Lint Passes

The plugin registry provides both the register_early_lint_pass and register_late_lint_pass functions, but the documentation doesn’t have a whole lot to say about when each should be used. Early lint pass functions are provided with an EarlyContext, while late lint pass functions are provided with a LateContext. As the documentation says, and as was clarified by the helpful folks on #rust, the former provides context for checking of the AST before it is lowered to HIR, while the latter provides context after type checking has occurred. Most significantly, the LateContext contains a ctxt instance containing the information generated by the type checker. If you need access to the latter information then you need to register a late pass, while if you only need AST information then you can use an early pass.

The lints which pedantrs provides can function without any type information as they are quite simple, but the lints built into Rust provide examples of late passes. It’s worth noting that even access to things like variable names requires the context provided to a late pass, as demonstrated in the builtin bad style checker.

Using the Plugin

Utilising the plugin/linter is quite simple. In the demo folder of pedantrs you’ll find another project, which lists pedantrs as a dependency in its Cargo.toml file. The plugin is then activated in the main.rs file, via:

#![feature(plugin)]
#![plugin(pedantrs)]

When you run cargo build in the demo folder you should then see something like the following output:

$ cargo build
Compiling unicode-normalization v0.1.1
Compiling clippy v0.0.23 (https://github.com/Manishearth/rust-clippy#31969a38)
Compiling pedantrs v0.1.0 (file:///Users/sean/Documents/git/pedantrs/demo)
Compiling demo v0.1.0 (file:///Users/sean/Documents/git/pedantrs/demo)
src/main.rs:6:1: 6:39 warning: public constant is missing documentation, #[warn(pub_const_docs)] on by default
src/main.rs:6 pub const UNDOCUMENTED_CONST: i32 = 6;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...
src/main.rs:62:13: 62:56 warning: function has an excessive number of arguments, #[warn(fn_arg_list_length)] on by default
src/main.rs:62 let _ = |_: i32, _: i32, _: i32, _: i32, _: i32| {};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sample Lints

The lints in pedantrs aren’t particularly interesting, but their source can be found here. The warnings shown in the output above (an undocumented public constant and a closure with an excessive number of arguments) give a flavour of what they check.

And that’s about it. Rust is fun. Go play.

[1] On the topic of documentation: the Rust internals documentation can be found here and is incredibly useful.

 

Moving location!

A few months back I started Persistence Labs with the goal of developing better tools for bug discovery, reverse engineering and exploit development. I’ve also moved my blog over to that domain and the new RSS feed is here.

Anyway, that’s about it really =) I’ll be making any future blog posts over there, starting with the release of a new research paper by Agustin Gianni and me titled Augmenting Vulnerability Analysis of Binary Code, published at ACSAC this year. It describes an approach to attack surface identification and code prioritisation during vulnerability auditing. Go check it out!

SMT Solvers for Software Security (USENIX WOOT’12)

At WOOT’12 a paper co-written by Julien Vanegue, Rolf Rolles and me will be presented under the title “SMT Solvers for Software Security”. An up-to-date version can be found in the Articles/Presentation section of this site.

In short, the message of this paper is “SMT solvers are well capable of handling decision problems from security properties. However, specific problem domains usually require domain specific modeling approaches. Important limitations, challenges, and research opportunities remain in developing appropriate models for the three areas we discuss – vulnerability discovery, exploit development, and bypassing of copy protection”. The motivation for writing this paper is to discuss these limitations, why they exist, and hopefully encourage more work on the modeling and constraint generation sides of these problems.

A quick review of the publication lists from major academic conferences focused on software security will show a massive number of papers discussing solutions based on SMT technology. There are good reasons for this: 1) SMT-backed approaches such as symbolic/concolic execution have proved to be powerful tools on certain problems, and 2) there is an increasing number of freely available frameworks.

The primary domain where SMT solvers have shone, in my opinion, is in the discovery of bugs related to unsafe integer arithmetic using symbolic/concolic execution. There’s a fairly obvious reason why this is the case: the quantifier-free, fixed-size bitvector logic supported by SMT solvers provides direct support for the precise representation of arithmetic at the assembly level. In other words, one does not have to do an excessive amount of work when modeling the semantics of a program to produce a representation suitable for the detection of unsafe arithmetic. It suffices to perform a near direct translation from the executed instructions to the primitives provided by SMT solvers.
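
As a minimal illustration of why this logic is such a good fit (this example is mine, not from the paper), here is how one might ask Z3, via its Python bindings, whether a 32-bit length computation can wrap around:

from z3 import BitVec, Solver, sat

length = BitVec("length", 32)
s = Solver()
# Can length + 16 wrap and end up less than length? (signed comparison here;
# z3.ULT would be used for the unsigned variant)
s.add(length + 16 < length)
if s.check() == sat:
    print("overflow possible, e.g. length =", s.model()[length])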

The exploit generation part of the paper deals with what happens when one takes the technology for solving the above problem and applies it to a new problem domain; in particular, a domain in which a model produced simply by tracking transformations and constraints on input data no longer contains enough information to inform a solution. For example, in the case of exploit generation, such models do not account for things like the relationship between user input and memory layout. Obviously enough, when reasoning about a formula produced from such a model a solver cannot account for information that is not present. Thus, no amount of computational capacity or solver improvement can produce an effective solution.

SMT solvers are powerful tools and symbolic/concolic execution can be an effective technique. However, one thing I’ve learned over the past few years is that they don’t remove the obligation and effort required to accurately model the problem you’re trying to solve. You can throw generic symbolic execution frameworks at a problem but if you’re interested in anything more complex than low level arithmetic relationships you’ve got work to do!

Better Interpreter Fuzzing with Clang

Last weekend I decided to finally put some effort into investigating Clang, the C/C++/Objective-C frontend for LLVM. Clang is interesting as it is not only designed to provide efficient parsing and processing during compilation, but also as a platform for program analysis tools. Other options for static analysis of C/C++ exist e.g. GCC extensions, cscope hackery, and the various things built on top of CIL (e.g. Frama-C). However, Clang is attractive to me as it supports both C and C++, has a very permissive license, includes a well designed, extensive and documented API and is itself written in C++ (which might be a negative point depending on your view of the world but I like it :P). It also appears to have the backing of both Apple and Google, with a number of their engineers contributing regularly to both LLVM and Clang. All in all, it feels like a maturing project with a skilled and active community behind it.

Clang supports three approaches to writing analysis tools. Using libclang is the encouraged method for most tasks and is probably what you’re looking for if all you want to do is iterate over ASTs and pull out data. The next option is to write a plugin for Clang to be run during the build phase of a project. This is slightly more involved and doesn’t have the benefit of allowing you to use a Python/whatever wrapper around the API. However, if you want to do things like automated source rewriting, or more involved static analysis as part of a build, then this is where you want to be. The final approach is using libtooling. I haven’t looked into writing anything based on this yet but it appears to offer the benefits of writing a standalone tool provided by libclang with the power of full AST control offered by a plugin. The only potential downsides, in comparison to libclang, that I can see are that you cannot write your code in a higher level language and it does not have access to as convenient an API.

After a bit of digging around in documentation and source, I think libclang offers the most gentle introduction to Clang as a program analysis platform. It’s not as powerful as the other two methods but has a stable API (check out include/clang-c/Index.h for the C interface and bindings/python/clang/cindex.py for the Python bindings) and supports a nice visitor based approach for writing tools that traverse an AST. From reading over the above code, and following Eli Bendersky’s introduction to using the Python bindings, you should be able to get up and running fairly easily. As a side note, libclang has progressed somewhat since that blog post so the limitations related to missing cursor kinds are now largely alleviated. There are still some missing features but we’ll get to that later.

A relevant question about now is probably “What the hell has all of that got to do with interpreter fuzzing?”. In general, not a whole lot =) However, when trying to think of some small problem I could tackle to get a feel for libclang I recalled an issue I previously encountered when fuzzing PHP. When fuzzing an interpreter there are typically two high-level stages that are somewhat unique to it: 1) discover the functions you can call, and 2) discover how to correctly call them, e.g. the number and types of their arguments.

In this regard, PHP is a slightly easier target than say, Python or Ruby, because on the first point it has a lot of global functions that can be called without instantiating a class or figuring out what module to import. These functions can be easily discovered using get_defined_functions. Using this approach it is simple to enumerate a fairly large attack surface that calls directly into the C backend (somewhere upwards of 500 functions in a default configuration) and get fuzzing.
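
For example, something along the following lines can be used to pull that list out of the interpreter itself (the exact invocation is illustrative; any way of running a small PHP snippet will do):

import subprocess

# get_defined_functions() returns an array whose 'internal' key lists every
# function implemented in the C backend
php_code = r'$f = get_defined_functions(); echo implode("\n", $f["internal"]);'
out = subprocess.check_output(["php", "-r", php_code])
functions = out.decode().splitlines()
print("%d internal functions" % len(functions))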

Point 2 remains a problem however. Each PHP function implemented in the C backend takes a number of typed parameters. Some type conversion can and does take place, which means you can specify a boolean when an int is required or other such things. Naturally, there are some types that are completely incompatible, e.g. a function callback is required and you provide an integer type. If you get this wrong the function will bail out without performing any further processing. As well as incorrect argument types, specifying an incorrect number of arguments will also often result in the function execution terminating before anything interesting happens. Precise implementation details can be found in the zend_parse_va_args and zend_parse_arg_impl functions of Zend/zend_API.c.

When fuzzing PHP we are then left with a bit of a conundrum. We can easily discover a substantial attack surface but we still need to figure out the types and number of arguments that each function expects. One solution is to just play a numbers game. Fuzzing an interpreter is fast… really fast. It is also easily distributed and instrumented. So, if we just guess at the types and parameters we’ll probably be correct enough of the time [1]. This has a slight issue, in that there is going to be a bias towards functions that take a smaller number of arguments for purely probabilistic reasons. An improvement is to just parse the error messages from PHP which give an indication as to why the function bailed out e.g. incorrect types or incorrect number of arguments. This is all kind of ugly however and it would be nice if we could just know up front the correct argument types and the exact number required.
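
As a rough sketch of that error-message approach (not the code I actually used, and the message format shown is just what PHP 5.x's argument parsing typically emits), one could probe a function and read the expected parameter count back out of the warning:

import re
import subprocess

def probe_arg_count(func_name):
    # Call the function with no arguments and let PHP complain about it
    code = "%s();" % func_name
    proc = subprocess.run(["php", "-d", "display_errors=1", "-r", code],
                          capture_output=True, text=True)
    # e.g. "Warning: levenshtein() expects at least 2 parameters, 0 given"
    m = re.search(r"expects (?:exactly|at least) (\d+) parameter",
                  proc.stdout + proc.stderr)
    return int(m.group(1)) if m else None

print(probe_arg_count("levenshtein"))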

Enter Clang! An efficient solution for this problem is simply to parse every C file included in the PHP build, find all calls to zend_parse_parameters, extract the format string that specifies the arguments and their types, and then relate this information back to the function names discovered via get_defined_functions. It’s a fairly straightforward task on the PHP code base and could probably be achieved via regular expressions and some scripting. That said, using libclang we can come up with a much tidier and less fragile solution. Before I delve into the solution I should mention that I’ve uploaded the code to GitHub under the name InterParser, minus the fuzzer, should you want to try it out for yourself, extend it or whatever.

I’ll explain the code using questions for which the answers were not immediately obvious from the Clang documentation or are things that were interesting/surprising to me.

How do I run my libclang based tool on all files in a build?

The solution to this is also a solution to the next problem so I’ll deal with it there!

How do I know the correct compiler arguments to use when parsing a file?

The clang_parseTranslationUnit function in Index.h allows you to specify the arguments that would be passed to the compiler when parsing a particular source file. In some cases it is necessary to specify these arguments correctly or the code that is parsed will not accurately reflect the code that is included when you run make. For example, if a chunk of code is enclosed within #if BLA / #endif tags, and BLA is supposed to be defined (or not) via the compiler arguments, then unless you provide the same argument when creating the translation unit the code will not be parsed.

This problem, and the previous one, could easily be solved if we knew 1) every file processed when make is run, and 2) the command line arguments passed to the compiler on each invocation. After browsing the Clang documentation for a solution, a friend of mine suggested the obvious answer of just wrapping the compiler in a script that logs the relevant information during the build. The script creplace.py provides this functionality. By pointing the CC and CXX environment variables at it you should get a log of all files processed by the compiler when make is run, as well as the arguments passed.
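
The idea is roughly as follows (this is a simplified sketch rather than the creplace.py from the repository): log each source file and the full argument list, then hand everything off to the real compiler so the build proceeds as normal.

#!/usr/bin/env python
import json
import os
import subprocess
import sys

REAL_CC = os.environ.get("REAL_CC", "clang")
LOG = os.environ.get("CC_LOG", "/tmp/cc_invocations.log")

args = sys.argv[1:]
sources = [a for a in args if a.endswith((".c", ".cc", ".cpp"))]
with open(LOG, "a") as log:
    for src in sources:
        log.write(json.dumps({"file": os.path.abspath(src), "args": args}) + "\n")

# Forward the invocation to the real compiler so the build still succeeds
sys.exit(subprocess.call([REAL_CC] + args))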

The ccargparse.py module provides the load_project_data function which you can then use to build a Python dictionary of the above information. If there’s an inbuilt Clang API to do all this I didn’t find it [2], but if not you should be able to just drop in my code and work from there! See the main function of parse_php.py for exact usage details.

Once the compiler arguments have been extracted and converted to a list they can just be provided to the index.parse method when constructing the translation unit for a given file. e.g.

import clang.cindex as clang
from ccargparse import load_project_data

comp_args = load_project_data(cc_file)
args = comp_args[file_path]
index = clang.Index.create()
tu = index.parse(file_path, args=args)

How do I find all functions within a file?

Our first task is to find all calls to zend_parse_parameters and note the name of the top level backend API function, so that we can later relate this to a user-space PHP function call. To do this we first need to find all function declarations within a translation unit for a source file, e.g. the process_all_functions method:

for c in tu.cursor.get_children():
    if c.kind == clang.CursorKind.FUNCTION_DECL:
        f = c.location.file
        if file_filter and f.name != file_filter:
            continue
        fmt_str = process_function(c)

This is a fairly simple task — we iterate over all children of the top level translation unit for the given source file and compare the cursor kind against FUNCTION_DECL, which indicates a function declaration. It’s worth noting that the children of a translation unit can include various things from files that are #include‘d. As a result, you may want to check that the file name associated with the location of the child is the name of the file passed to index.parse.

How do I find all calls to a particular function?

For each function declaration we can then iterate over all of its children and search for those with the kind CALL_EXPR. This is performed in the process_function method. Firstly, we need to check if the function being called has the correct name. The first child of a CALL_EXPR AST node will represent the function itself, with the rest representing the arguments to this function.

The last statement isn’t quite correct. The C and C++ standards specify a number of conversions that can take place on a function call, so the direct children of a CALL_EXPR AST node will represent these conversions and then the direct child of these conversion nodes will contain the details we require. Clang does have a mode to dump the AST for a given source file so if you’re curious what the AST actually looks like try passing -ast-dump or -ast-dump-xml as parameters to the clang compiler.

As an example, on a function call we encounter a CALL_EXPR node and its first child represents the decay of the function to a pointer as specified by the C/C++ standards. This node will have the kind UNEXPOSED_EXPR. It will have a single child of type DECL_REF_EXPR from which we can retrieve the function name via the displayname attribute. The following code in process_function performs the function name extraction and comparison.

if n.kind == clang.CursorKind.CALL_EXPR:
    unexposed_exprs = list(n.get_children())
    func_name_node = get_child(unexposed_exprs[0], 0)
    if func_name_node.displayname == "zend_parse_parameters":
        ...

How do I extract the parameters to a function call?

Once we know we have the correct function call we then need to extract the format string parameter that provides the type specification for the user-space API call. As with the node representing the function, arguments may also undergo conversions. The second argument to zend_parse_parameters is the one we are interested in, and it is usually specified as a string literal. However, string literal arguments undergo an implicit conversion, so to account for this we again go through an UNEXPOSED_EXPR node representing the conversion to access the node for the string literal itself.

Things get a little inconvenient at this point. Once we have the node representing the string literal we ideally want some way to access the actual value specified in the source code. It turns out that this functionality hasn’t been added to the Python bindings around libclang yet. Fortunately, a guy by the name of Gregory Szorc has been working on a branch of libclang and the associated Python bindings that adds the get_tokens method to AST nodes. His code is great but also contains a bug when get_tokens is called on an empty string literal. Until that gets fixed I’ve mirrored his cindex.py and enumerations.py files. Drop both into your clang/bindings/python/clang directory and you should be good to go!

The actual code to extract the string literal given the node representing it is then straightforward.

if tkn_container.kind != clang.CursorKind.STRING_LITERAL:
    raise VariableArgumentError()

tokens = list(tkn_container.get_tokens())
if tokens[0] is None:
    return ""

The check on the kind of the node is required as, in a minuscule number of cases, the format string parameter is passed via a variable rather than being directly specified. This situation is so rare that it isn’t worth handling. The only other slightly odd thing is that when get_tokens is called on an AST node representing a string literal, like the first parameter in func("xyz", a), it will return a generator for two elements. The first will be the string literal itself, including the double quotes, while the second will represent the comma separator. I’m not sure if this is a bug or not but it’s not a big deal. If the string literal is empty then the generator will return None.

Conclusion

At this point we can associate the functions calling zend_parse_parameters with the format string passed to that call, and thus the types and number of arguments that each function expects. In a fairly small amount of Python we have a robust solution to problem 2, discussed earlier. A question you may have is “How do you then associate the function name/argument specification from the C code with the corresponding user-space PHP function?”. Well, PHP is nice in that it doesn’t do a massive amount of name mangling: consistently, if we have a user-space function called func_one then the backend C function will be called zif_func_one (in the code it will probably look like PHP_FUNCTION(func_one), but Clang will expand the macro for you).

When parse_php.py terminates it will output something like the following:

INFO:main:API info for 518 functions in 68 files written to php_func_args.txt

Inspecting the output file we’ll see function to type specification mappings:

# /Users/sean/Documents/git/php-src/ext/standard/levenshtein.c
zif_levenshtein ss sss sslll
# /Users/sean/Documents/git/php-src/main/main.c
zif_set_time_limit l
# /Users/sean/Documents/git/php-src/ext/standard/versioning.c
zif_version_compare ss|s
...

This tells us that a user-space PHP script can correctly call levenshtein with two strings, three strings or two strings and three longs. It can call set_time_limit with a single long only. version_compare can be called with two or three strings and so on and so forth. So, with this information at hand we can construct a PHP fuzzer that no longer has to guess at the number of arguments to a function or their types.
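
To give an idea of how a fuzzer consumes these specifications, here is a hypothetical sketch that turns one of the format strings above into a PHP call. It only handles a few of the zend_parse_parameters type characters ('s' string, 'l' long, 'b' boolean) and crudely treats everything after '|' as if it were required:

import random

def gen_args(spec):
    spec = spec.replace("|", "")
    generators = {
        "s": lambda: '"%s"' % ("A" * random.randint(0, 1024)),
        "l": lambda: str(random.choice([0, 1, -1, 2**31 - 1, random.getrandbits(32)])),
        "b": lambda: random.choice(["true", "false"]),
    }
    return ", ".join(generators[c]() for c in spec if c in generators)

print("version_compare(%s);" % gen_args("ss|s"))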

Extensions

The most obvious extension of this code that comes to mind is in pulling out numeric values and string literals from comparison operators. For example, say we’re analyzing a function with the following code:

if (x == 0xc0)
    ...
else
    ...

It might be an idea to pull out the constant 0xc0 and add it to a list of possible values for numeric input types when fuzzing the function in question. Naturally, you’ll end up with extra constants that are not compared with the function inputs but I would guess that this approach would lead to some gain in code coverage during fuzzing.
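
A hypothetical sketch of that extraction with libclang might look like the following. The operator itself isn’t exposed through the Python bindings, so this simply collects every integer literal appearing under a binary operator, which over-approximates the set of comparison constants; the literal text is recovered via get_tokens, as with the format strings earlier.

import clang.cindex as clang

def collect_constants(cursor, found):
    # Gather integer literals that appear as operands of binary operators
    if cursor.kind == clang.CursorKind.BINARY_OPERATOR:
        for child in cursor.get_children():
            if child.kind == clang.CursorKind.INTEGER_LITERAL:
                tokens = [t.spelling for t in child.get_tokens()]
                if tokens:
                    found.add(tokens[0])
    for child in cursor.get_children():
        collect_constants(child, found)

Running this over each FUNCTION_DECL found earlier would give a per-function pool of candidate numeric values to mix into the fuzzer’s inputs.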

Resources

Introduction to writing Clang tools
Eli Bendersky’s blog post on parsing C with libclang
Unofficial Clang tutorials

[1] This is what I did when I first fuzzed PHP a long time ago. You hit a lot of incorrect type/argument count errors but you can get so many executions per second that in general it works out just fine =)

[2] Two days after I wrote this it seems that support for the compilation database functionality was added to the Python bindings. You can read about this feature here. Using this feature is probably a more portable way of gathering compiler arguments and pumping them into whatever APIs require them.

Anatomy of a Symbolic Emulator, Part 3: Processing Symbolic Data & Generating New Inputs

In this final video in the series we go over how to generate new inputs for a program once we detect a user-influenced conditional branch. At the end there is also an example of the type of condition/resulting formula that we get from working on a real file parser, in this case libwebp.

(You probably want to click the “Watch on YouTube” option on the bottom right of the video and set the quality to 720p)

Conclusion
This type of emulation, input generation and formula checking does not need to be limited to conditional jumps. As I discussed in a previous post you can use a similar approach to discover variable ranges, check for variable relationships and assist in figuring out complex algorithms. For example, one could generate a query to the solver every time an argument to malloc is found to be influenced by the user, or export a list of all functions that operate on user-influenced data to IDA for manual review. (In fact, a light-weight version of this approach in combination with fuzzing and an IDA importer is possibly more generally useful to an individual auditor than going down the route of full on whitebox fuzzing. More on that later =))

Anyway, I hope these videos provide some insight into how a whitebox fuzzer might work as well as one approach to building a symbolic emulator. To give an idea of the effort involved – the combined whitebox fuzzing, trace parsing and emulation code (along with supporting libraries) comes to around 10,000 lines of Python. Of this, the emulator itself is only 3000 lines or so. The PIN tracer is just under 1000 lines of C++.

Tracing is currently fairly unoptimised, and parsing something like a video or image while tracing can result in a factor of 10-100 increase in running time. This usually means a wait of 30 seconds, which isn’t too bad for whitebox fuzzing as tracing is not performed too often, but for other uses of a symbolic emulator (like tracing while fuzzing normally) this will require some work. The emulator itself is Python based and as such is not lightning fast. In the default run-mode it emulates ~5000 instructions per second. What this translates to is about 30-40 minutes per trace of an average file parser. This isn’t as bad as you might think however, as the test cases generated tend to be much more effective at hitting new code than what you would get from dumb fuzzing. Despite this we still need performance gains and I’m working on a few different solutions for that. Somewhere around 30,000+ instructions per second would be what I would consider approaching acceptable =)

To preempt the inevitable questions – for now JESTER is not publicly available but that may change in the future. It’s very much a research prototype at the moment where we’re testing out several approaches to improving performance and general usefulness. However, if you are interested in working on this type of research post a comment with a contact address (it won’t appear publicly) as I’m fairly sure we are currently hiring.

Anatomy of a Symbolic Emulator, Part 2: Introducing Symbolic Data

In the previous post I discussed one way to go about gathering a trace for emulation. In this one I’m going to talk about how we go about emulating such a trace, how and why we hook functions as they are emulated, and how symbolic operations are performed.

As before, this post is accompanied by a video which demonstrates the code in action. Unlike the previous post I’ve decided to skip the paragraphs of rambling and instead most of the info is in the actual video itself =)

Topics covered:
– Introducing symbolic data via function hooks
– Performing computations on symbolic data

(You probably want to click the “Watch on YouTube” option on the bottom right of the video and set the quality to 720p. Btw, near the end of the video I said something along the lines of “one of the advantages of whitebox fuzzing over symbolic emulation”. That makes no sense =) What I meant to say was “one of the advantages of whitebox fuzzing over normal symbolic execution”.)