# Augment your Auditing with a Theorem Prover

A better post title may have been ‘Outsourcing your thinking when lack of sleep makes basic arithmetic difficult’. Anyways, consider the following code from the  ArrayBuffer::tryAllocate function found in WebCore/html/canvas/ArrayBuffer.cpp.

```85	void* ArrayBuffer::tryAllocate(unsigned numElements, unsigned elementByteSize)
86	{
87	    void* result;
88	    // Do not allow 32-bit overflow of the total size
89	    if (numElements) {
90	        unsigned totalSize = numElements * elementByteSize;
91	        if (totalSize / numElements != elementByteSize)
92	            return 0;
93	    }
94	    if (WTF::tryFastCalloc(numElements, elementByteSize).getValue(result))
95	        return result;
96	    return 0;
97	}
```

Lets ignore for now whether or not you know that the check on line 91 is valid or not. For the sake of argument, assume you’re unsure about whether this code does in fact prevent a situation where totalSize can overflow and the allocation function on line 94 can still get called. In such a situation you could try and reason through on paper whether the check is safe or not, but this is potentially error prone. The other option is to model the code and throw a theorem prover at it. The advantage of this approach is that if the code is safe then you end up with a proof of that fact, while if it’s unsafe you end up with a set of inputs that violate the safety check.

The following is my model of the above code:

```[sean@sean-laptop bin]\$ cat test.smt
(benchmark uint_ovf
:status unknown
:logic QF_BV

:extrafuns ((totalSize BitVec[32])(numElements BitVec[32])(elSize BitVec[32]))
:extrafuns ((a BitVec[64])(b BitVec[64])(big32 BitVec[64]))

; if (numElements) {
:assumption (bvugt numElements bv0[32])

; unsigned totalSize = numElements * elementByteSize;
:assumption (= totalSize (bvmul numElements elSize))

; totalSize / numElements != elementByteSize
:assumption (= elSize (bvudiv totalSize numElements))

; Check if an overflow is possible in the presence of the
; above conditions
:assumption (= big32 bv4294967295[64])
:assumption (= a (zero_extend[32] numElements))
:assumption (= b (zero_extend[32] elSize))
:formula (bvugt (bvmul a b) big32)
)```

(The above .smt file is in SMTLIB format. Further information can be found at [1], [2] and [3])
The above models tryAllocate pretty much exactly. The final three assumptions and the formula are used to check if the integer overflow can occur. Mixing bitvectors of different types isn’t allowed for most operations so it is necessary first to extend numElements and elSize into 64 bit variables. We then check for overflow by multiplying these 64 bit extensions by each other and checking if the result can be greater than 0xffffffff (big32) while also satisfying the conditions imposed in modelling the function.

```[sean@sean-laptop bin]\$ ./yices -V
Yices 2.0 prototype. Copyright SRI International, 2009
GMP 4.3.1. Copyright Free Software Foundation, Inc.
Build date: Fri Apr 23 11:15:16 PDT 2010
Platform: x86_64-unknown-linux-gnu (static)
[sean@sean-laptop bin]\$ time ./yices -f < test.smt
unsat

real	0m0.587s
user	0m0.576s
sys	0m0.008s
```

And there we have it, our proof of safety (modulo modelling errors on my behalf and implementation errors in yices). Given the assumptions specified it is not possible for the multiplication at line 90 to overflow and still satisfy the condition at line 91. Total time from starting modelling to a proof, about 10 minutes.

Edit: I modified the above formula and some of the text after noticing an error. Hence any comments below may refer to an older version of the post.

## 2 thoughts on “Augment your Auditing with a Theorem Prover”

1. A student showed me your blog.

First, your formalization in Yices is not entirely faithful, for several reasons.

1. The code does not require elementByteSize to be non-0, but you do. You claim that you need this assumption because otherwise you would wind up with a divide by 0, but the only division in the code (line 91) is dividing by numElements.

2. It’s been a while since I’ve looked at the semantics of C, but my recollection is that the size of integers is implementation dependent, so assuming a 32-bit number is problematic.

3. With the same caveat as in 2, it isn’t clear to me that the C semantics for multiplication when overflow occurs and what Yices does are the same. Does the semantics of C even specify what result you get when you get overflow?

4. The reason why you added the elementByteSize is > 0 is probably because Yices would give a counterexample to your requirement, which is that totalSize *))

(local (defun ubsize () 32000))

(defthm ubsize-thm
(natp (ubsize))))

; The following function is a recognizer for unsigned integers.
; Such an integer has to be a natural number < 2^(ubsize)

(defun usbvp (x)
(and (natp x)
(<= 0 x)
( *)

; This is a faithful way of representing C multiplication. If you give
; me unsigned integers and if the result of multiplying them is also
; an unsigned integer, then we return it; otherwise we return (whatever).

(defun c-mult (x y)
(if (and (usbvp x)
(usbvp y)
(usbvp (* x y)))
(* x y)
(whatever)))

; This theorem states that if numElements and elementByteSize are both
; unsigned integers (line 85 of the code), if numElements is not 0
; (line 91 of the code), totalSize is c-mult of numElements and
; elementByteSize (line 90), and if totalSize/numElements =
; elementByteSize (line 91 fails), then no overflow occurred, i.e.,
; totalSize is really the product of numElements and elementByteSize,
; where we are really using God’s multiplication.

(thm
(implies (and (usbvp numElements)
(usbvp elementByteSize)
(/= numElements 0)
(= totalSize (c-mult numElements elementByteSize))
(= (/ totalSize numElements) elementByteSize))
(= totalSize (* numElements elementByteSize))))

• seanhn says:

1. Yup, I’ve no idea why I wrote that. The real reason I added it is that you can trivially satisfy the formula if elementSize can be 0. Basically any numElements value that isn’t itself 0 will do. For whatever reason, when writing the blog post I decided this was inconsequential though (most likely because from looking through the code base I know this isn’t going to happen in any paths I saw, but it does provide something to keep in mind. A precondition for the functions safe execution I guess). I’ll edit the post, thanks.

2. In practice I’ll happily assume 32 bits for now.

3. For unsigned integers the result is defined and they should naturally wrap around 2^32 – 1. As far as I know Yices has the same functionality for its bitvector arithmetic although if it doesn’t I’d like to know.

4. I’m not entirely sure what you intended by this part. For unsigned arithmetic I’d consider it safe to use the normal semantics of modulo bitvector arithmetic as provided by most SMT solvers.