Fault Localisation – Sean Heelan's Blog

Anyone who’s spent time doing vulnerability analysis on C/C++ has had the experience of floundering around in a debugger for hours on end trying to figure out the source of a mysterious crash.

The reason this can be an incredibly time-consuming and frustrating process is simple: memory corruption is often quite subtle in its effect. The fact that corruption has occurred may not actually become apparent until far later in the program’s execution, when all trace of the buggy function is gone from the call-stack. On top of that, due to randomisation of things such as memory layout and allocation patterns, the same root cause can manifest in a variety of different ways across multiple runs.

For example, let’s say we’re analysing an interpreter, e.g. php, and the following occurs: an API call triggers a function containing a bug, and a write to buffer X overflows into the memory occupied by object Y, smashing some pointers internal to Y. The API call returns, the interpreted program eventually ends and the interpreter begins to shut down. During the shutdown process it notifies Y, which tries to use its (now corrupted) internal pointers and crashes. By the time the crash occurs, there is no trace of the buggy API call on the call-stack, nor is there any apparent link in the source code between the contents of buffer X and the corrupted internal pointer of object Y.

The above scenario isn’t limited to buffer overflows on the heap. Use-after-frees and other memory life-time management issues can have similar effects. At a high level the question we want to answer when debugging such situations is straightforward: ‘Where did the data that corrupted this location get written?’. Taint tracking solutions attempt to solve this problem by tracking data forwards from a taint source. For various reasons, existing implementations tend to offer only two of fast, accurate and detailed, and all three are often required for a trace of tainted data to be useful.
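To make the idea of forward taint tracking concrete, here is a minimal sketch in C. It is a toy model and not how any real taint tracker is implemented: the flat `mem` array, the one-label-per-byte `shadow` array and the helper names `tainted_write` and `taint_of` are all assumptions made up for illustration. Every byte written from an input source is tagged with that source's label, so that later we can ask which input last wrote a given location.

```c
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 64

/* Toy "process memory" plus one shadow byte of taint label per data byte. */
unsigned char mem[MEM_SIZE];
unsigned char shadow[MEM_SIZE];   /* 0 means "not written by any input" */

/* Hypothetical helper: copy n bytes of input into mem at offset dst and
 * label every written byte with `label`. Returns 0 on success, or -1 if the
 * write would fall outside the toy memory. */
int tainted_write(size_t dst, const unsigned char *src, size_t n,
                  unsigned char label)
{
    if (dst > MEM_SIZE || n > MEM_SIZE - dst)
        return -1;
    memcpy(mem + dst, src, n);
    memset(shadow + dst, label, n);
    return 0;
}

/* Hypothetical helper: which input label, if any, last wrote the byte at
 * offset addr? Out-of-range offsets are reported as untainted. */
unsigned char taint_of(size_t addr)
{
    return addr < MEM_SIZE ? shadow[addr] : 0;
}
```

If we pretend that buffer X occupies offsets 0 to 15 and object Y starts at offset 16, then a 24-byte write labelled 7 at offset 0 leaves `taint_of(16)` equal to 7, which answers the question of where Y's corrupted bytes came from. The cost of doing this for every instruction of a real program is where the trade-off between speed, accuracy and detail comes from.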

This brings me to rr, a tool developed by Robert O’Callahan of Mozilla. The project page is here, and this video is a good introduction to the tool. rr can do really fast record and replay, but on top of that it can also do really fast reverse execution. During replay, the user interface provided is gdb’s, and thus you effectively get your standard debugger interface but with a guarantee that repeated runs will be identical, as well as the ability to execute in reverse. The key to solving our heap overflow problem is that not only can you execute in reverse, but that, during reverse execution, watchpoints (and normal breakpoints) still function. In other words, if we want to know where some data in memory came from, we can set a watchpoint on it, execute reverse-continue and the program will execute in reverse until the point where the data was written [1].

I hope the following example highlights how useful this capability can be.

The php interpreter recently had a potentially remotely exploitable issue due to a flaw in the libgd library, which it bundles. The libgd issue was assigned CVE-2016-3074 and the PHP bug tracker entry is here. The reporter rather helpfully included a PoC exploit, which you can find at the previous link, that is intended to spawn a reverse shell. This PoC is a useful starting point for a ‘real’ exploit, but is pretty much ‘hit and hope’ in terms of reliability: it sprays data after an overflowed buffer, seemingly in the hope of corrupting a function pointer called before the corruption causes the process to die for some other reason [2]. For fun I decided to build a more reliable exploit around the primitives offered by the bug, and the first step of this was figuring out exactly why the existing PoC was failing.

When I first ran the PoC, mod_php died in a strlen function call due to being passed a pointer to an unmapped region of memory. The backtrace at that point (in normal gdb) looked as follows:

[root@localhost ~]# gdb --pid=1963
GNU gdb (GDB) Fedora 7.10.1-31.fc23
0x00007f7ca5dc6a68 in accept4 (fd=4, addr=addr@entry=..., addr_len=addr_len@entry=0x7ffe3580e2e0, flags=flags@entry=524288) at ../sysdeps/unix/sysv/linux/accept4.c:40
40      return SYSCALL_CANCEL (accept4, fd, addr.__sockaddr__, addr_len, flags);
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
strlen () at ../sysdeps/x86_64/strlen.S:106
106        movdqu    (%rax), %xmm12
(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x00007f7ca312aad9 in format_converter (odp=0x7ffe3580bff0, fmt=0x7f7ca37dca49 "s(%d) :  Freeing 0x%.8lX (%zu bytes), script=%s\n", ap=0x7ffe3580c040)
    at php-7.0.5-debug/main/snprintf.c:993
#2  0x00007f7ca312b40c in strx_printv (ccp=0x7ffe3580c05c, buf=0x7ffe3580c380 "[Thu May 26 12:32:29 2016]  Script:  '/var/www/html/upload.php'\n", len=512, format=0x7f7ca37dca48 "%s(%d) :  Freeing 0x%.8lX (%zu bytes), script=%s\n", 
    ap=0x7ffe3580c040) at php-7.0.5-debug/main/snprintf.c:1248
#3  0x00007f7ca312b645 in ap_php_snprintf (buf=0x7ffe3580c380 "[Thu May 26 12:32:29 2016]  Script:  '/var/www/html/upload.php'\n", len=512, format=0x7f7ca37dca48 "%s(%d) :  Freeing 0x%.8lX (%zu bytes), script=%s\n")
    at php-7.0.5-debug/main/snprintf.c:1293
#4  0x00007f7ca3126483 in php_message_handler_for_zend (message=4, data=0x7ffe3580d410) at php-7.0.5-debug/main/main.c:1444
#5  0x00007f7ca31b5523 in zend_message_dispatcher (message=4, data=0x7ffe3580d410) at php-7.0.5-debug/Zend/zend.c:998
#6  0x00007f7ca3182bef in zend_mm_check_leaks (heap=0x7f7c96a00040) at php-7.0.5-debug/Zend/zend_alloc.c:2129
#7  0x00007f7ca3182f2e in zend_mm_shutdown (heap=0x7f7c96a00040, full=0, silent=0) at php-7.0.5-debug/Zend/zend_alloc.c:2201
#8  0x00007f7ca3183d1b in shutdown_memory_manager (silent=0, full_shutdown=0) at php-7.0.5-debug/Zend/zend_alloc.c:2637
#9  0x00007f7ca3127255 in php_request_shutdown (dummy=0x0) at php-7.0.5-debug/main/main.c:1849
#10 0x00007f7ca3274e8a in php_apache_request_dtor (r=0x55a587a49780) at php-7.0.5-debug/sapi/apache2handler/sapi_apache2.c:518
...

Again using normal gdb we can quickly figure out that the issue is that strlen is passed a pointer to unmapped memory, and that the source of that pointer is the php_message_handler_for_zend function (frame 4 above). e.g.:

(gdb) f 1
#1 0x00007f7ca312aad9 in format_converter (odp=0x7ffe3580bff0, fmt=0x7f7ca37dca49 "s(%d) : Freeing 0x%.8lX (%zu bytes), script=%s\n", ap=0x7ffe3580c040)
 at php-7.0.5-debug/main/snprintf.c:993
993 s_len = strlen(s);
(gdb) l
988 
989 case 's':
990 case 'v':
991 s = va_arg(ap, char *);
992 if (s != NULL) {
993     s_len = strlen(s);
994     if (adjust_precision && precision < s_len) {
995         s_len = precision;
996     }
997 } else {
(gdb) p/x s
$1 = 0xa8bc37
(gdb) x/x s
0xa8bc37: Cannot access memory at address 0xa8bc37

This tells us that the argument to strlen is a vararg. We can figure out where that came from easily enough:

(gdb) f 2
#2 0x00007f7ca312b40c in strx_printv (ccp=0x7ffe3580c05c, buf=0x7ffe3580c380 "[Thu May 26 12:32:29 2016] Script: '/var/www/html/upload.php'\n", len=512, format=0x7f7ca37dca48 "%s(%d) : Freeing 0x%.8lX (%zu bytes), script=%s\n", 
 ap=0x7ffe3580c040) at php-7.0.5-debug/main/snprintf.c:1248
1248 cc = format_converter(&od, format, ap);
(gdb) f 3
#3 0x00007f7ca312b645 in ap_php_snprintf (buf=0x7ffe3580c380 "[Thu May 26 12:32:29 2016] Script: '/var/www/html/upload.php'\n", len=512, format=0x7f7ca37dca48 "%s(%d) : Freeing 0x%.8lX (%zu bytes), script=%s\n")
 at php-7.0.5-debug/main/snprintf.c:1293
1293 strx_printv(&cc, buf, len, format, ap);
(gdb) f 4
#4 0x00007f7ca3126483 in php_message_handler_for_zend (message=4, data=0x7ffe3580d410) at php-7.0.5-debug/main/main.c:1444
1444 snprintf(memory_leak_buf, 512, "%s(%d) : Freeing 0x%.8lX (%zu bytes), script=%s\n", t->filename, t->lineno, (zend_uintptr_t)t->addr, t->size, SAFE_FILENAME(SG(request_info).path_translated));
(gdb) p/x t->filename
$2 = 0xa8bc37

From the above code we can see that php_message_handler_for_zend is where the corrupted pointer is read and utilised as an argument to snprintf, while ap_php_snprintf and strx_printv just pass through the ap variable, from which the invalid pointer is eventually extracted and used in format_converter.
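The pass-through pattern described above can be reproduced in a few lines of C. This is only a sketch of the shape of the call chain, not the PHP code: the function names `toy_snprintf_len`, `toy_printv` and `toy_format_converter` are invented for illustration, and the real functions do a great deal more. The point is that the pointer is supplied at the outermost call, travels inside the va_list without being looked at, and is only extracted and dereferenced in the innermost function, which is why the crash surfaces three frames away from where the bad value entered.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Innermost layer: pulls the char * out of the va_list and measures it,
 * which is the step that faults if the pointer is bad. */
size_t toy_format_converter(va_list ap)
{
    char *s = va_arg(ap, char *);
    return s != NULL ? strlen(s) : 0;
}

/* Middle layer: forwards ap without inspecting it. */
size_t toy_printv(va_list ap)
{
    return toy_format_converter(ap);
}

/* Outer layer: the variadic entry point where the pointer is supplied.
 * The format string is accepted but ignored in this sketch. */
size_t toy_snprintf_len(const char *format, ...)
{
    va_list ap;
    size_t len;

    va_start(ap, format);
    len = toy_printv(ap);
    va_end(ap);
    return len;
}
```

Calling `toy_snprintf_len("%s", "upload.php")` returns the length of the string, computed in the innermost function. Passing an invalid non-NULL pointer instead would fault inside strlen, just as in the backtrace.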

We are now back at the problem I stated in the beginning: We have a corrupted memory location, &(t->filename), and we want to know the source of that corruption. In a ‘normal’ dataflow tracking scenario we could read the source code and find locations that write to &(t->filename) and work from there. When a variable is influenced by a heap overflow of an unrelated memory buffer, that is no longer an option. Once corruption has occurred we’ve moved from the state-space which is apparent by inspecting the source to the state-space of a ‘weird machine’ which, among other things, is dependent on memory layout.

Fortunately, ‘What is the source of data at memory location x?’ is exactly the kind of question that rr can help us answer. We begin by starting the target (apache in this case) under rr, and triggering the crash. This is simply achieved via rr httpd -X. We can then replay the session up until the crash point as follows:

[root@localhost tmp]# rr replay
GNU gdb (GDB) Fedora 7.10.1-31.fc23
0x00007f8710739c80 in _start () from /lib64/ld-linux-x86-64.so.2
(rr) c
Continuing.
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using localhost.localdomain. Set the 'ServerName' directive globally to suppress this message
[New Thread 57236.57239]

Program received signal SIGSEGV, Segmentation fault.
strlen () at ../sysdeps/x86_64/strlen.S:106
106        movdqu    (%rax), %xmm12

From our previous analysis we know that the data whose source we want to find is the memory backing t->filename in frame 4, so we can switch to that frame and add a watchpoint:

(rr) f 4
#4  0x00007f870c079483 in php_message_handler_for_zend (message=4, data=0x7ffd14c702c0) at php-7.0.5-debug/main/main.c:1444
1444                        snprintf(memory_leak_buf, 512, "%s(%d) :  Freeing 0x%.8lX (%zu bytes), script=%s\n", t->filename, t->lineno, (zend_uintptr_t)t->addr, t->size, SAFE_FILENAME(SG(request_info).path_translated));
(rr) p &(t->filename)
$1 = (const char **) 0x7ffd14c702d0
(rr) watch -l t->filename
Hardware watchpoint 1: -location t->filename
(rr) reverse-continue 
Continuing.

Program received signal SIGSEGV, Segmentation fault.
strlen () at ../sysdeps/x86_64/strlen.S:106
106        movdqu    (%rax), %xmm12
(rr) reverse-continue 
Continuing.
Hardware watchpoint 1: -location t->filename

Old value = 0xa8bc37 <error: Cannot access memory at address 0xa8bc37>
New value = 0xff15e8fef2615642 <error: Cannot access memory at address 0xff15e8fef2615642>
0x00007f870c0d5bab in zend_mm_check_leaks (heap=0x7f86ffa00040) at php-7.0.5-debug/Zend/zend_alloc.c:2123
2123                                leak.filename = dbg->filename;
(rr) p &(leak.filename)
$2 = (const char **) 0x7ffd14c702d0
(rr) p/x dbg->filename
$3 = 0xa8bc37

From the above we discover that the write to t->filename (address 0x7ffd14c702d0) happened at line 2123, in the zend_mm_check_leaks function. The value was sourced from dbg->filename, which already contains the invalid pointer, so we repeat the process to discover the source of that variable.

(rr) delete 1
(rr) watch -l dbg->filename
Hardware watchpoint 2: -location dbg->filename
(rr) reverse-continue 
Continuing.
Hardware watchpoint 2: -location dbg->filename

Old value = 0xa8bc37 <error: Cannot access memory at address 0xa8bc37>
New value = 0x0
__mempcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:219
219        movq     %r8,  8(%rdi)
(rr) bt
#0  __mempcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:219
#1  0x00007f870ec8f55e in __GI__IO_file_xsgetn (fp=0x56128d40a9a0, data=<optimized out>, n=18446744073709551615) at fileops.c:1392
#2  0x00007f870ec848b6 in __GI__IO_fread (buf=<optimized out>, size=1, count=18446744073709551615, fp=0x56128d40a9a0) at iofread.c:42
#3  0x00007f870be6ed79 in fileGetbuf (ctx=0x7f86ffa5a0e0, buf=0x7f86ffa5e0a0, size=-1) at php-7.0.5-debug/ext/gd/libgd/gd_io_file.c:91
#4  0x00007f870be6e47e in php_gd_gdGetBuf (buf=0x7f86ffa5e0a0, size=-1, ctx=0x7f86ffa5a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_io.c:131
#5  0x00007f870be6c79c in _gd2ReadChunk (offset=1176, compBuf=0x7f86ffa5e0a0 '\220' <repeats 40 times>, "\242", <incomplete sequence \354\266>, compSize=-1, chunkBuf=0x7f86ffa70000 'A' <repeats 100 times>, chunkLen=0x7ffd14c6e898, 
    in=0x7f86ffa5a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:213
#6  0x00007f870be6cad8 in php_gd_gdImageCreateFromGd2Ctx (in=0x7f86ffa5a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:311
#7  0x00007f870be6c801 in php_gd_gdImageCreateFromGd2 (inFile=0x56128d40a9a0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:232
#8  0x00007f870be5d337 in _php_image_create_from (execute_data=0x7f86ffa130d0, return_value=0x7f86ffa130c0, image_type=9, tn=0x7f870c5b4e77 "GD2", func_p=0x7f870be6c7d9 <php_gd_gdImageCreateFromGd2>, 
    ioctx_func_p=0x7f870be6c86c <php_gd_gdImageCreateFromGd2Ctx>) at php-7.0.5-debug/ext/gd/gd.c:2385
#9  0x00007f870be5d572 in zif_imagecreatefromgd2 (execute_data=0x7f86ffa130d0, return_value=0x7f86ffa130c0) at php-7.0.5-debug/ext/gd/gd.c:2483
...
#28 0x000056128c671437 in main (argc=2, argv=0x7ffd14c71688) at main.c:777

Executing in reverse we discover the corruption was caused by a call to __GI__IO_fread with a count of -1, which is the overflow leveraged by the PoC. At this point we might like to know a few other things, such as how much data was actually read in, as well as how big the destination buffer is.
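The jump from size=-1 in fileGetbuf to count=18446744073709551615 in __GI__IO_fread is just the usual C conversion from a negative signed value to size_t. The snippet below isolates that conversion; the function name `as_fread_count` is made up for illustration, and it stands in for the implicit conversion that happens when a signed size is passed as fread's count argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the implicit int -> size_t conversion at a call
 * such as fread(buf, 1, size, fp). Conversion to an unsigned type is modulo
 * 2^N, so -1 becomes the largest value size_t can hold. */
size_t as_fread_count(int size)
{
    return (size_t)size;
}
```

On a platform where size_t is 64 bits wide, `as_fread_count(-1)` is 18446744073709551615, the value shown in frame 2 of the backtrace. With that count, the only limit on how much fread writes is how much data is left in the file.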

The first value is trivial to find, in the same way as you normally would with gdb, as it is the return value of the fread function.

(rr) f 3
#3 0x00007f7958587d79 in fileGetbuf (ctx=0x7f794c05a0e0, buf=0x7f794c05e0a0, size=-1) at php-7.0.5-debug/ext/gd/libgd/gd_io_file.c:91
91 return fread(buf, 1, size, fctx->f);
(rr) finish
Run till exit from #3 0x00007f7958587d79 in fileGetbuf (ctx=0x7f794c05a0e0, buf=0x7f794c05e0a0, size=-1) at php-7.0.5-debug/ext/gd/libgd/gd_io_file.c:91
php_gd_gdGetBuf (buf=0x7f794c05e0a0, size=-1, ctx=0x7f794c05a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_io.c:132
132 }
Value returned is $2 = 612

To find the size of the destination buffer (which is at address 0x7f86ffa5e0a0) we can look up the backtrace for the first function that doesn’t take the buffer as an argument (php_gd_gdImageCreateFromGd2Ctx), set a breakpoint on the line where it calls the first function that does (_gd2ReadChunk), and replay the execution again:

(rr) f 6
#6  0x00007f870be6cad8 in php_gd_gdImageCreateFromGd2Ctx (in=0x7f86ffa5a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:311
311                    if (!_gd2ReadChunk(chunkIdx[chunkNum].offset, compBuf, chunkIdx[chunkNum].size, (char *) chunkBuf, &chunkLen, in)) {
(rr) b 311
Breakpoint 6 at 0x7f870be6ca88: file php-7.0.5-debug/ext/gd/libgd/gd_gd2.c, line 311.
(rr) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/sbin/httpd 

[Thread 57236.57236] #1 stopped.
0x00007f8710739c80 in _start () from /lib64/ld-linux-x86-64.so.2
(rr) c
Continuing.
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using localhost.localdomain. Set the 'ServerName' directive globally to suppress this message
warning: Temporarily disabling breakpoints for unloaded shared library "/usr/lib64/httpd/modules/libphp7.so"
[New Thread 57236.57239]

Breakpoint 6, php_gd_gdImageCreateFromGd2Ctx (in=0x7f86ffa5a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:311
311                    if (!_gd2ReadChunk(chunkIdx[chunkNum].offset, compBuf, chunkIdx[chunkNum].size, (char *) chunkBuf, &chunkLen, in)) {
(rr) p compBuf
$13 = 0x7f86ffa5e0a0 ""

Now we’re back to the familiar problem of having a variable (compBuf) and wanting to know where its value came from, so we set a watchpoint and switch from forwards execution to reverse execution to find the allocation site of the buffer, and its size.

(rr) watch -l compBuf
Hardware watchpoint 7: -location compBuf
(rr) reverse-continue 
Continuing.
Hardware watchpoint 7: -location compBuf

Old value = 0x7f86ffa5e0a0 ""
New value = 0x0
0x00007f870be6ca1d in php_gd_gdImageCreateFromGd2Ctx (in=0x7f86ffa5a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:292
292            compBuf = gdCalloc(compMax, 1);
(rr) p compMax
$14 = 112
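Putting the two numbers together, 612 bytes were read into a buffer allocated with a compMax of 112 bytes. The sketch below works out how far past the end of the buffer that write extends, and shows the kind of check that would reject the PoC's chunk size. Both function names are hypothetical, and `chunk_size_ok` is only an illustration of a bounds check under the assumption that the chunk size is a signed int; it is not the fix that was applied upstream.

```c
#include <stddef.h>

/* Number of bytes that land beyond the end of a buffer of `capacity` bytes
 * when `written` bytes are stored starting at its first byte. */
size_t overflow_bytes(size_t written, size_t capacity)
{
    return written > capacity ? written - capacity : 0;
}

/* Hypothetical guard: accept a chunk size only if it is positive and fits in
 * the destination buffer. Returns 1 if the size is acceptable, 0 otherwise. */
int chunk_size_ok(int requested, size_t capacity)
{
    return requested > 0 && (size_t)requested <= capacity;
}
```

Here `overflow_bytes(612, 112)` is 500, so the PoC writes 500 bytes over whatever follows the buffer on the heap, and `chunk_size_ok(-1, 112)` is 0, so a check of this shape would have refused the read before it started.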

Finally, for the sake of completeness, it might be useful to know what the data represented before it was corrupted, which will tell us what the key difference between crashing and non-crashing runs is. Since the data flows straight from the overflow location to its first use in zend_mm_check_leaks, which is a function internal to the allocator and which operates on heap metadata, a reasonable assumption is that the overflow corrupts something other than dbg->filename, which tricks the allocator into thinking that dbg->filename is a valid pointer to a string. If we look at zend_mm_check_leaks we can confirm this may be the case:

(rr) l
2118 j = 0;
2119 while (j < bin_elements[bin_num]) {
2120     if (dbg->size != 0) {
2121         leak.addr = (zend_mm_debug_info*)((char*)p + ZEND_MM_PAGE_SIZE * i + bin_data_size[bin_num] * j);
2122         leak.size = dbg->size;
2123         leak.filename = dbg->filename;
2124         leak.orig_filename = dbg->orig_filename;
2125         leak.lineno = dbg->lineno;
2126         leak.orig_lineno = dbg->orig_lineno;

We can assume based on the above that the entire dbg object has been corrupted, and because dbg->size has been changed to a non-zero value we have ended up at the crash location that we have. This can be confirmed by tracing the value to its source in the same way as before:

(rr) awatch -l dbg->size
Hardware access (read/write) watchpoint 4: -location dbg->size
(rr) reverse-continue 
Continuing.
Hardware access (read/write) watchpoint 4: -location dbg->size

Value = 4665427
0x00007f7db1825b9c in zend_mm_check_leaks (heap=0x7f7da5200040) at /home/sean/Documents/Git/EntomologyPrivate/PHP/CVE-2016-3074/php-7.0.5-debug/Zend/zend_alloc.c:2122
2122 leak.size = dbg->size;
(rr) 
Continuing.
Hardware access (read/write) watchpoint 4: -location dbg->size

Value = 4665427
0x00007f7db1825b56 in zend_mm_check_leaks (heap=0x7f7da5200040) at /home/sean/Documents/Git/EntomologyPrivate/PHP/CVE-2016-3074/php-7.0.5-debug/Zend/zend_alloc.c:2120
2120 if (dbg->size != 0) {
(rr) 
Continuing.
Hardware access (read/write) watchpoint 4: -location dbg->size

Old value = 4665427
New value = 0
__mempcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:218
218 movq %rax, (%rdi)
(rr) bt
#0 __mempcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:218
#1 0x00007f7db43df55e in __GI__IO_file_xsgetn (fp=0x563db9fbb9a0, data=<optimized out>, n=18446744073709551615) at fileops.c:1392
#2 0x00007f7db43d48b6 in __GI__IO_fread (buf=<optimized out>, size=1, count=18446744073709551615, fp=0x563db9fbb9a0) at iofread.c:42
#3 0x00007f7db15bed79 in fileGetbuf (ctx=0x7f7da525a0e0, buf=0x7f7da525e0a0, size=-1) at php-7.0.5-debug/ext/gd/libgd/gd_io_file.c:91
#4 0x00007f7db15be47e in php_gd_gdGetBuf (buf=0x7f7da525e0a0, size=-1, ctx=0x7f7da525a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_io.c:131
#5 0x00007f7db15bc79c in _gd2ReadChunk (offset=1176, compBuf=0x7f7da525e0a0 '\220' <repeats 40 times>, "\242", <incomplete sequence \354\266>, compSize=-1, chunkBuf=0x7f7da5270000 'A' <repeats 100 times>, chunkLen=0x7fff6af192a8, 
 in=0x7f7da525a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:213
#6 0x00007f7db15bcad8 in php_gd_gdImageCreateFromGd2Ctx (in=0x7f7da525a0e0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:311
#7 0x00007f7db15bc801 in php_gd_gdImageCreateFromGd2 (inFile=0x563db9fbb9a0) at php-7.0.5-debug/ext/gd/libgd/gd_gd2.c:232
...

Above we can see that it was, as we presumed, the overflow that corrupted dbg->size. To figure out what its value was before it was corrupted, and what code set that value, we just reverse-continue again.

(rr) reverse-continue 
Continuing.
Hardware watchpoint 2: -location dbg->size

Old value = 0
New value = <unreadable>
_raw_syscall () at /home/sean/Documents/Git/rr/rr/src/preload/raw_syscall.S:120
120 callq *32(%rsp)
(rr) bt
#0 _raw_syscall () at /home/sean/Documents/Git/rr/rr/src/preload/raw_syscall.S:120
#1 0x00007f795cc261ab in traced_raw_syscall (call=<optimized out>) at /home/sean/Documents/Git/rr/rr/src/preload/preload.c:282
#2 sys_readlink (call=<optimized out>) at /home/sean/Documents/Git/rr/rr/src/preload/preload.c:1914
#3 syscall_hook_internal (call=0x7fff28911668) at /home/sean/Documents/Git/rr/rr/src/preload/preload.c:2345
#4 0x00007f795cc25b81 in syscall_hook (call=0x7fff28911668) at /home/sean/Documents/Git/rr/rr/src/preload/preload.c:2396
#5 0x00007f795cc29cea in _syscall_hook_trampoline () at /home/sean/Documents/Git/rr/rr/src/preload/syscall_hook.S:170
#6 0x00007f795cc29cfb in _syscall_hook_trampoline_48_3d_01_f0_ff_ff () at /home/sean/Documents/Git/rr/rr/src/preload/syscall_hook.S:221
#7 0x00007f795b42b960 in mmap64 () at ../sysdeps/unix/syscall-template.S:84
#8 0x00007f79587eb2dd in zend_mm_mmap (size=4190208) at /home/sean/Documents/Git/EntomologyPrivate/PHP/CVE-2016-3074/php-7.0.5-debug/Zend/zend_alloc.c:477
#9 0x00007f79587eba73 in zend_mm_chunk_alloc_int (size=2097152, alignment=2097152) at /home/sean/Documents/Git/EntomologyPrivate/PHP/CVE-2016-3074/php-7.0.5-debug/Zend/zend_alloc.c:767
#10 0x00007f79587ede46 in zend_mm_init () at /home/sean/Documents/Git/EntomologyPrivate/PHP/CVE-2016-3074/php-7.0.5-debug/Zend/zend_alloc.c:1828
...

(rr) f 8
#8 0x00007f79587eb2dd in zend_mm_mmap (size=4190208) at /home/sean/Documents/Git/EntomologyPrivate/PHP/CVE-2016-3074/php-7.0.5-debug/Zend/zend_alloc.c:477
477 ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0);

We can see that our hypothesis is correct and that prior to the overflow dbg->size was zero, having been allocated during the initialisation of the memory manager and not used after that. At this point we have a fairly good idea of why the PoC is triggering the crash we’re seeing, and for the purposes of this post we’re done!
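The chain of events can be modelled with a small C sketch. It is a deliberately simplified stand-in and not the Zend allocator: the struct layouts, the field `filename_bits` and the function names are all invented for illustration, and the data buffer and debug record are placed in a single struct so that the "overflow" stays inside one object and the example remains well defined. What it preserves is the logic we traced: a debug record with a zero size is skipped by the leak check, and an overlong write that makes the size non-zero causes the record, garbage filename included, to be reported.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the allocator's debug record. */
struct toy_debug_info {
    size_t size;             /* 0 means the slot was never used */
    uintptr_t filename_bits; /* pointer value kept as an integer so the
                                example never dereferences it */
};

/* A data buffer followed directly by debug metadata, in one object. */
struct toy_slab {
    unsigned char data[16];
    struct toy_debug_info dbg;
};

/* Same shape as the check at line 2120: only records with a non-zero size
 * are treated as leaks and passed on for reporting. */
int toy_is_reported_as_leak(const struct toy_slab *slab)
{
    return slab->dbg.size != 0;
}

/* Hypothetical unchecked fill: writes n bytes starting at the first byte of
 * the data buffer. n is clamped to the size of the slab so that the write
 * never leaves the object. */
void toy_unchecked_fill(struct toy_slab *slab, unsigned char byte, size_t n)
{
    if (n > sizeof *slab)
        n = sizeof *slab;
    memset(slab, byte, n);
}
```

Starting from a zeroed slab, filling 16 bytes leaves the record unreported, while filling the whole slab makes `toy_is_reported_as_leak` return 1 with `filename_bits` holding attacker-controlled bytes. In the real crash, that garbage value is what ends up being handed to strlen.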

All of the above takes a few minutes in total, as reverse execution has a barely noticeable overhead and thus can be integrated into your normal workflow. Higher overhead tools, such as taint trackers, have their place and can offer more automated or more complete answers, but rr finds a sweet spot between performance and capability that makes it a very interesting tool.

Entirely Unrelated Training Update

There are still seats available for the London/NYC editions of Advanced Tool Development with SMT Solvers, scheduled for this August. These are likely to be the only public editions in the near future, so if you’d like to attend send an email to contact@vertex.re ASAP.

[1] It doesn’t actually execute in reverse in the sense of walking backwards through the instruction stream. The video linked to above provides some detail on what actually happens.

[2] I can only guess this is the intended operation of the PoC as I never got it to work in its original form.

Tracking Down Heap Overflows with rr
