Rust Compiler Plugins: A Simple Example

As my adventure at Persistence Labs has come to an end I’ve decided to move my blogging back here. In the interests of getting back to writing, I’m going to start with something rather mundane but also easily summarised: Rust is awesome. You should seriously consider putting some time into figuring out if it’s useful for your engineering tasks. For program analysis related experimentation I’ve found it provides a nice balance between execution speed and productivity.

That’s it really, but in the interests of verbosity I’m going to quickly run through one rather pleasing discovery I made when flicking through the documentation a few months back. Most conversations on Rust’s advantages revolve around approaches to types, memory safety and other core language and runtime features. Way down at the end of the Rust book, in the ‘nightly’ section, I discovered that alongside all of these advantages Rust also makes it very simple to write compiler plugins, which can be used for a variety of ends. It is a minor feature in comparison to the other strides Rust makes, but an important one to have included as a (nearly) first-class member of the language. Tooling is a significant factor in determining language adoption, and standard infrastructure to develop such tooling is a useful foundation.

Compiler plugins are still considered unstable in Rust, and are enabled only on the nightly build, but they’re easy to work with. An example of a linter is provided in the Rust repository, but as far as I could see there is no sample of a standalone linter package, or of how to include and use one to check other code. To figure out what exactly is involved in that, and to get a feel for how one writes and uses a compiler plugin, I created pedantrs, a very simple compiler plugin which can be included in other Rust projects and will run a few very basic lint checks. The linter isn’t intended to be used “for real”, but it will hopefully provide answers if you’re wondering how to create a simple compiler plugin project and make use of it. Here I’ll just run through some of the things that are underdocumented [1] elsewhere or took me some experimentation to figure out.

Crate Setup

Compiler plugins are libraries and can be created as per usual via cargo new. You indicate that you are creating a compiler plugin by setting plugin = true in the [lib] section of the Cargo.toml.
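For reference, the relevant part of the plugin crate’s Cargo.toml might look roughly like the fragment below. The package name and version are taken from the build output later in this post; treat the rest as a minimal sketch rather than the exact file in the pedantrs repository.

```toml
[package]
name = "pedantrs"
version = "0.1.0"

[lib]
# Marks this library as a compiler plugin (nightly-only at the time of writing).
plugin = true
```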

Early vs Late Lint Passes

The plugin registry provides both the register_early_lint_pass and register_late_lint_pass functions, but the documentation doesn’t have a whole lot to say about when each should be used. Early lint pass functions are provided with an EarlyContext, while late lint pass functions are provided with a LateContext. As the documentation says, and as was clarified by the helpful folks on #rust, the former provides context for checking of the AST before it is lowered to HIR, while the latter provides context after type checking has occurred. Most significantly, the LateContext contains a ctxt instance containing the information generated by the type checker. If you need access to the latter information then you need to register a late pass, while if you only need AST information then you can use an early pass.

The lints which pedantrs provides are quite simple and can function without any type information, but the lints built into Rust provide examples of late passes. It’s worth noting that even access to things like variable names requires the context provided to a late pass, as demonstrated in the built-in bad style checker.

Using the Plugin

Utilising the plugin/linter is quite simple. In the demo folder of pedantrs you’ll find another project, which lists pedantrs as a dependency in its Cargo.toml file. The plugin is then activated in the main.rs file, via:

#![feature(plugin)]
#![plugin(pedantrs)]
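The dependency entry in the demo project’s Cargo.toml could be as small as the fragment below. The relative path is an assumption based on the demo folder living inside the pedantrs repository; adjust it to wherever the plugin crate actually sits.

```toml
[dependencies]
# Hypothetical path: assumes the plugin crate is the parent directory of demo/.
pedantrs = { path = ".." }
```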

When you run cargo build in the demo folder you should then see something like the following output:

$ cargo build
Compiling unicode-normalization v0.1.1
Compiling clippy v0.0.23 (https://github.com/Manishearth/rust-clippy#31969a38)
Compiling pedantrs v0.1.0 (file:///Users/sean/Documents/git/pedantrs/demo)
Compiling demo v0.1.0 (file:///Users/sean/Documents/git/pedantrs/demo)
src/main.rs:6:1: 6:39 warning: public constant is missing documentation, #[warn(pub_const_docs)] on by default
src/main.rs:6 pub const UNDOCUMENTED_CONST: i32 = 6;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...
src/main.rs:62:13: 62:56 warning: function has an excessive number of arguments, #[warn(fn_arg_list_length)] on by default
src/main.rs:62 let _ = |_: i32, _: i32, _: i32, _: i32, _: i32| {};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
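The second warning above comes down to counting arguments and comparing against a limit. As a rough illustration of that kind of check, here is a toy model in plain Rust. It does not use the compiler’s plugin API at all: FnDecl, check_arg_list_length and the max_args threshold are all hypothetical names made up for this sketch, and a real lint pass would walk the compiler’s own AST types instead.

```rust
/// Hypothetical, simplified stand-in for a function declaration.
/// A real lint pass would inspect the compiler's AST nodes instead.
struct FnDecl {
    name: &'static str,
    args: Vec<&'static str>,
}

/// Returns one warning message for each declaration that has more than
/// `max_args` arguments. The threshold is an assumption of this sketch.
fn check_arg_list_length(decls: &[FnDecl], max_args: usize) -> Vec<String> {
    decls
        .iter()
        .filter(|d| d.args.len() > max_args)
        .map(|d| {
            format!(
                "warning: function `{}` has an excessive number of arguments ({} > {})",
                d.name,
                d.args.len(),
                max_args
            )
        })
        .collect()
}

fn main() {
    let decls = vec![
        FnDecl { name: "short", args: vec!["a", "b"] },
        FnDecl { name: "long", args: vec!["a", "b", "c", "d", "e"] },
    ];
    // Only `long` exceeds the (made-up) limit of four arguments.
    for warning in check_arg_list_length(&decls, 4) {
        println!("{}", warning);
    }
}
```

In the real plugin the same comparison would happen inside a lint pass callback, with the warning reported through the lint context rather than collected into a Vec.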

Sample Lints

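The lints in pedantrs aren’t particularly interesting, but their source can be found here. Two of them appear in the output above: pub_const_docs, which warns about public constants that are missing documentation, and fn_arg_list_length, which warns about functions with an excessive number of arguments.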

And that’s about it. Rust is fun. Go play.

[1] On the topic of documentation: the Rust internals documentation can be found here and is incredibly useful.

 

Moving location!

A few months back I started Persistence Labs with the goal of developing better tools for bug discovery, reverse engineering and exploit development. I’ve also moved my blog over to that domain and the new RSS feed is here.

Anyway, that’s about it really =) I’ll be making any future blog posts over there, starting with the release of a new research paper by Agustin Gianni and me, titled Augmenting Vulnerability Analysis of Binary Code, published at ACSAC this year. It describes an approach to attack surface identification and code prioritisation during vulnerability auditing. Go check it out!

Blackhat USA paper

I submitted an abstract etc. for a Blackhat talk a few days ago. The title is “Automatic exploit generation for complex programs” and the following is the abstract:

The topic of this presentation is the automatic generation of control flow hijacking exploits. I will explain how we can generate functional exploits that execute shellcode when provided with a known ’bad’ input, such as the crashing input from a fuzzing session, and sample shellcode. The theories presented are derived from software verification and I will explain their relevance to the problem at hand and the benefits of using them compared to approaches based on ad-hoc pattern matching in memory.

The novel aspect of this approach is the combination of techniques from data flow analysis and symbolic execution for the purpose of exploit generation. We track input data as it passes through a running program and taints other variables; in parallel we also track all constraints and modifications imposed on such data. As a result, we can precisely locate all memory regions influenced by the tainted input. We can then apply a constraint solver to generate an exploit.

This technique is effective in environments where the input data is subjected to complex, low level manipulations that may be difficult and time consuming for a human to unravel. I will demonstrate that this approach can be used in the presence of ASLR, non-executable regions and other protections for which known work-arounds exist.

During the presentation I will show functioning exploits generated by this technique and describe their creation in detail. I will also discuss a number of auxiliary benefits of the tool and possible extensions. These include the ability to denote sections of a given input used in determining the path taken, in memory allocation routines and in length constraints. Possible uses of this information are in generating more reliable versions of known exploits and in guiding a fuzzer.

 
So, in a nutshell I’m using dynamic data flow analysis in combination with path constraint gathering and SAT/SMT solving to generate an input for a program that will result in shellcode execution…. assuming it works 😉 I should know by June 1st if it was accepted or not.
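To make the data flow half of that a little more concrete, here is a minimal sketch of taint propagation in Rust. It is not the tool from the talk: the Tainted type, its methods and the transformation in main are all invented for illustration, and the constraint gathering and SAT/SMT solving parts are left out entirely. Each value simply carries the set of input offsets that influenced it.

```rust
use std::collections::BTreeSet;

/// A byte value together with the set of input offsets that influenced it.
/// Hypothetical type for this sketch only.
#[derive(Clone, Debug)]
struct Tainted {
    value: u8,
    sources: BTreeSet<usize>,
}

impl Tainted {
    /// A constant that does not depend on the input.
    fn constant(value: u8) -> Self {
        Tainted { value, sources: BTreeSet::new() }
    }

    /// The input byte at `offset`, tainted by its own offset.
    fn input(value: u8, offset: usize) -> Self {
        let mut sources = BTreeSet::new();
        sources.insert(offset);
        Tainted { value, sources }
    }

    /// Wrapping addition; the result is tainted by both operands.
    fn add(&self, other: &Tainted) -> Tainted {
        Tainted {
            value: self.value.wrapping_add(other.value),
            sources: self.sources.union(&other.sources).cloned().collect(),
        }
    }

    /// XOR; taint propagates in the same way as for addition.
    fn xor(&self, other: &Tainted) -> Tainted {
        Tainted {
            value: self.value ^ other.value,
            sources: self.sources.union(&other.sources).cloned().collect(),
        }
    }
}

fn main() {
    let input = [0x41u8, 0x42, 0x43];
    let bytes: Vec<Tainted> = input
        .iter()
        .enumerate()
        .map(|(offset, &b)| Tainted::input(b, offset))
        .collect();

    // A made-up transformation: out = (in[0] + in[2]) ^ 0x20
    let out = bytes[0].add(&bytes[2]).xor(&Tainted::constant(0x20));
    println!(
        "value = {:#x}, influenced by input offsets {:?}",
        out.value, out.sources
    );
}
```

A full system along the lines described in the abstract would also record the operations applied to each tainted value as constraints, so that a solver could later work out which input bytes produce a desired value at a given location.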

Update: The talk was rejected. Success!… or not.