Validity, Satisfiability and Code Semantics

In a recent post to DD I mentioned some interesting features that will be available in Immunity Debugger 2.0. These features rely on a translation layer from x86 code to SMT formulae that Pablo constructed as part of his work on DEPLIB 2.0. In the past few weeks I got some time to put together a few scripts for ID that replicate some of the more useful tasks that we can use a solver for. In this post I am going to elaborate on one such script which is designed to find ROP gadgets meeting user specified conditions.

find_gadget.py is a relatively simple piece of software which is a testament to Pablo’s API development, the Python programming language and, I think quite importantly, the suitability of mathematical logic for reasoning about machine code. This final point is of course unsurprising but it provides good motivation when considering the merits of abstracting x86 code to more logical/mathematical representations for analysis.

We begin our gadget search by running gadgets.py. Currently this script finds any instruction sequence ending in a ret instruction (with support for pop REG/jmp REG etc. in the works). This takes about 360 seconds to run on QuickTimeAuthoring.qtx (2.16MB) and finds 80,000 candidate gadgets. For now no further analysis is done on these gadgets but we’re working on some fun stuff in that area.

At this point we can use find_gadget.py to search the candidate gadgets for one that meets whatever semantics we have in mind. For now the semantics are specified via the -d, -n and -v/-s options, i.e. DESTINATION RELATION SOURCE/VALUE. For example, to represent EAX <= [EBX+24] we would use -d EAX -n <= -s [EBX+24]. (This is a little cumbersome and not as flexible as one might like so we built a BNF grammar based on pyparsing that allows for arbitrary semantics to be specified as constraints and that should be swapped in pretty soon.) The full set of arguments are shown in this screenshot:

find_gadget.py arguments — Arguments to find_gadget.py

Once we have specified our arguments the script gets on with the fun part. Our algorithm is as follows:


for gadget in candidiate_gadgets:
    sa = SequenceAnalyzer(reg_model, flag_model, mem_model) # 1
    sa.analyze(gadget.addr, gadget.depth) # 2
    
    if relation == '=':
         rel_expr = sa.solver.eqExpr(dest, src) # 3
    elif relation == '<=':
         ...

    preserve_expr = None
    if preserve_regs != None: # 4
         for reg in preserve_regs:
              eq_expr = sa.solver.eqExpr(sa.regs_before[reg], sa.regs_after[reg])
              preserve_expr = sa.solver.boolAndExpr(preserveExpr, eq_expr)
        rel_expr = sa.solver.boolAndExpr(rel_expr, preserve_expr)

    res = sa.solver.queryFormula(rel_expr) # 5
    if generic_gadgets_only:
         if res == VALID:
              # VALID GADGET FOUND
     else:
          if res == SATISFIABLE:
              # SATISFIABLE GADGET FOUND

#1 Modelling registers, flags and memory
One of the arguments you may have noticed to the script is -c. This argument tells the script that we want to look for ‘context specific’ gadgets that satisfy our semantics when current values in registers, memory and flags are taken into account. The alternative is to look for generic gadgets that will meet the required semantics regardless of the context e.g. inc eax will only set eax to 0 if it previously contained -1 while xor eax, eax will always set it to zero. We’ll discuss this a little more on point #5.

#2 Gadgets to equations
The main coding effort in building this framework was building a transformer for each x86 instruction into the domain of our theorem prover. This requires one to encode accurately the semantics of the instruction over flags, registers and memory and is the backbone of any tool that reasons about assembly code using a solver. This long and drawn out functionality is hidden behind the analyze function which iterates over each instruction and updates the solver state. Once this has completed our solver internally has an accurate representation of the gadgets semantics in the form of several equations.

# 3 & # 4 Adding our gadget constraints
Once one has a representation of a sequence of instructions as equations the next step is typically to append some further constraints derived from the task at hand and then query the solver to determine if they are valid or satisfiable, and if so, to retrieve an input that makes it so. In our case our constraints simply specify a relation between an output register/memory location and an input register/memory location or a constant. We also allow the user to specify a number of registers that the gadget should preserve. We end up with a formula specifying something like (ESP_after == EAX_before+4 AND EBX_after == EBX_before), which is essentially a stack swap with the buffer pointed to by EAX and preserving the EBX register.

# 5 Context specific vs. generic gadgets
After encoding our constraints as a formula in steps #3 and #4 we can then query the solver to determine the status of these constraints in the context of the gadget encoding constructed in step #2. What do we mean by ‘status’? Well, when considering the SAT problem a formula can either be valid, invalid, satisfiable or unsatisfiable. A formula is satisfiable if there exists an assignment to free variables that makes it true and is unsatisfiable otherwise. A formula is valid if its negation is unsatisfiable or invalid otherwise. In other words, a formula is valid if any variable assignment makes it true. Why is this important? If we’re looking for generic gadgets then we want them to meet our constraints regardless of the context, therefore we leave all memory locations, registers and flags free and require that our constraints are VALID in the context of the gadget. That is, there exists no assignment to the memory locations, registers and flags that would result in the gadget not meeting our constraints.

The next few screenshots show some usage examples.

EAX == 0x1 — A generic gadget to set EAX to 1

EAX == 0x1, preserve ESI — A generic gadget to set EAX to 1 while preserving ESI

Multiple context specific gadgets — Finding multiple gadgets to achieve the same result

find_gadget.py is a very simple script but at the same time can find some very complicated instruction sequences in a short amount of time. In particular it can find tricky stack swapping gadgets and can process about 200,000 gadgets in 3 hours, which is probably a bit quicker than doing it by hand. In some upcoming posts I’ll elaborate a bit more on what can be done with the x86 -> SMT transformer -> solver code. For an idea of some of the things we’re hoping to implement many of the papers published in the last few years on symbolic execution and derived techniques are a good starting point and quite a few are linked from the Practical Software Verification Wiki

Sean Heelan's Blog

Software Exploitation and Optimisation

Validity, Satisfiability and Code Semantics

Share this:

Related