Overview
We reported an overflow vulnerability in the base64 decode function of Exim on 5 February, 2018, identified as CVE-2018-6789. This bug has existed since the first commit of exim, hence ALL versions are affected. According to our research, it can be leveraged to gain pre-auth remote code execution, and at least 400k servers are at risk. The patched version 4.90.1 has been released, and we recommend upgrading exim immediately.
Affected
- All Exim versions below 4.90.1
Vulnerability Analysis
The bug is a miscalculation of the decode buffer length in the b64decode function:
base64.c: 153 b64decode
b64decode(const uschar *code, uschar **ptr)
{
int x, y;
uschar *result = store_get(3*(Ustrlen(code)/4) + 1);
*ptr = result;
// perform decoding
}
As shown above, exim allocates a buffer of 3*(len/4)+1 bytes to store the decoded base64 data. However, when the input is not a valid base64 string and its length is 4n+3, exim allocates 3n+1 bytes but consumes 3n+2 bytes while decoding. This causes a one-byte heap overflow (also known as an off-by-one).
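The length arithmetic can be checked with a minimal C sketch. The helper names alloc_size, decoded_size and overflow_bytes are hypothetical and are not Exim functions. decoded_size assumes plain base64 bit arithmetic (6 bits per character, so a trailing group of 3 characters yields 2 bytes) and, like the count above, ignores any terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes requested by the vulnerable b64decode for an input of len characters. */
size_t alloc_size(size_t len)
{
    return 3 * (len / 4) + 1;
}

/* Bytes written by a decoder that accepts an unpadded trailing group
   (assumption: 6 bits per character, only whole bytes are emitted). */
size_t decoded_size(size_t len)
{
    size_t tail = len % 4;
    return 3 * (len / 4) + (tail * 6) / 8;
}

/* Number of bytes written past the end of the buffer, or 0 if it fits. */
size_t overflow_bytes(size_t len)
{
    size_t need = decoded_size(len);
    size_t have = alloc_size(len);
    return need > have ? need - have : 0;
}

/* Usage: only lengths of the form 4n+3 overflow, and by exactly one byte. */
void demo_lengths(void)
{
    assert(overflow_bytes(7) == 1);   /* n = 1: 4 bytes allocated, 5 written */
    assert(overflow_bytes(8) == 0);   /* well-formed length: 7 allocated, 6 written */
}
```

In this model, lengths of 4n, 4n+1 and 4n+2 stay within the buffer, which matches the statement that only the 4n+3 case overflows.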
Generally, this bug is harmless because the overwritten memory is usually unused. However, when the string has certain specific lengths, this byte overwrites critical data. In addition, the byte is controllable, which makes exploitation more feasible.
Because base64 decoding is such a fundamental function, this bug is easy to trigger and can lead to remote code execution.
Exploitation
To estimate the severity of this bug, we developed an exploit targeting the SMTP daemon of exim. The exploitation mechanism used to achieve pre-auth remote code execution is described in the following paragraphs. To leverage this one-byte overflow, it is necessary to trick the memory management mechanism. Basic knowledge of heap exploitation [ref] is highly recommended before reading this section.
We developed the exploit with:
- Debian (stretch) and Ubuntu (zesty)
- The SMTP daemon from the Exim4 package installed with apt-get (4.89/4.88)
- The CRAM-MD5 authenticator enabled in the config (uncommented in the default config); any other authenticator using base64 also works
- Basic SMTP commands (EHLO, MAIL FROM/RCPT TO) and AUTH
Memory allocation
First, we review the source code and search for useful memory allocations. As we mentioned in the previous article, exim uses self-defined functions for dynamic allocation:
extern BOOL store_extend_3(void *, int, int, const char *, int); /* The */
extern void store_free_3(void *, const char *, int); /* value of the */
extern void *store_get_3(int, const char *, int); /* 2nd arg is */
extern void *store_get_perm_3(int, const char *, int); /* __FILE__ in */
extern void *store_malloc_3(int, const char *, int); /* every call, */
extern void store_release_3(void *, const char *, int); /* so give its */
extern void store_reset_3(void *, const char *, int); /* correct type */
The functions store_malloc() and store_free() call malloc() and free() of glibc directly. On every allocation, glibc takes a slightly bigger chunk (0x10 bytes more), stores its metadata in the first 0x10 bytes (x86-64), and then returns the location of the data. The structure of a chunk is as follows.
Metadata includes the size of the previous chunk (the one exactly above in memory), the size of the current chunk, and some flags. The lowest three bits of size are used to store flags. In this example, a size of 0x81 implies that the current chunk is 0x80 bytes and the previous chunk is in use.
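As a minimal sketch, the size field can be split into its size and flag parts as follows. The flag names follow glibc's malloc.c; the helper functions chunk_size and prev_in_use are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits kept in the lowest three bits of a glibc chunk's size field. */
#define PREV_INUSE     0x1u
#define IS_MMAPPED     0x2u
#define NON_MAIN_ARENA 0x4u
#define SIZE_BITS      (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

/* Real chunk size: the size field with the flag bits masked off. */
uint64_t chunk_size(uint64_t size_field)
{
    return size_field & ~(uint64_t)SIZE_BITS;
}

/* Non-zero when the chunk directly above in memory is in use. */
int prev_in_use(uint64_t size_field)
{
    return (size_field & PREV_INUSE) != 0;
}

/* Usage: the 0x81 example from the text. */
void demo_size_field(void)
{
    assert(chunk_size(0x81) == 0x80);
    assert(prev_in_use(0x81));
}
```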
Most released chunks used in exim are put into a doubly linked list called the unsorted bin. Glibc maintains it according to the flags, and merges adjacent released chunks into a bigger chunk to avoid fragmentation. For every allocation request, glibc checks these chunks in FIFO (first in, first out) order and reuses them.
For performance reasons, exim maintains its own linked list of storeblocks, managed with store_get(), store_release(), store_extend() and store_reset().
The main feature of storeblocks is that every block is at least 0x2000 bytes, which restricts our exploitation. Note that a storeblock is also the data of a chunk. Therefore, in memory, each storeblock sits directly after the metadata of the glibc chunk that contains it.
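The resulting chunk sizes can be estimated with a short sketch. It rests on assumptions not stated in the text: a 0x10-byte storeblock header (a next pointer plus a length on x86-64) in front of the 0x2000-byte payload, and glibc's usual x86-64 rounding (8-byte size field, 16-byte alignment, 0x20-byte minimum). The names request_to_chunk and storeblock_chunk are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86-64 glibc parameters. */
#define SIZE_SZ            8u     /* bytes used by the size field */
#define MALLOC_ALIGN_MASK 15u     /* chunks are 16-byte aligned */
#define MIN_CHUNK_SIZE  0x20u     /* smallest chunk glibc hands out */

/* Assumed storeblock header: next pointer + length. */
#define STOREBLOCK_HEADER 0x10u

/* Chunk size glibc would use for a malloc() request of req bytes. */
size_t request_to_chunk(size_t req)
{
    size_t sz = (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~(size_t)MALLOC_ALIGN_MASK;
    return sz < MIN_CHUNK_SIZE ? MIN_CHUNK_SIZE : sz;
}

/* Chunk size for a storeblock carrying payload bytes of data. */
size_t storeblock_chunk(size_t payload)
{
    return request_to_chunk(STOREBLOCK_HEADER + payload);
}

/* Usage: a minimum-size storeblock. */
void demo_storeblock_chunk(void)
{
    assert(storeblock_chunk(0x2000) == 0x2020);
    assert(3 * storeblock_chunk(0x2000) == 0x6060);
}
```

Under these assumptions a minimum-size storeblock occupies a 0x2020-byte chunk, whose size field reads 0x2021 when the previous chunk is in use, and three such chunks merged together span 0x6060 bytes. Both values agree with the figures used in the exploit steps.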
Here we list functions used to arrange heap data:
- EHLO hostname
For each EHLO (or HELO) command, exim stores the pointer to the hostname in sender_helo_name.
  - store_free() the old name
  - store_malloc() for the new name
/* Discard any previous helo name */

if (sender_helo_name != NULL)
  {
  store_free(sender_helo_name);
  sender_helo_name = NULL;
  }
...
if (yield) sender_helo_name = string_copy_malloc(start);
return yield;
- Unrecognized command
For every unrecognized command with unprintable characters, exim allocates a buffer to convert it to printable characters.
  - store_get() to store the error message
smtp_in.c: 5725 smtp_setup_msg
done = synprot_error(L_smtp_syntax_error, 500, NULL,
  US"unrecognized command");
- AUTH
In most authentication procedures, exim uses base64 encoding to communicate with the client. The encoded and decoded strings are stored in a buffer allocated by store_get().
  - store_get() for strings
  - can contain unprintable characters, NULL bytes
  - not necessarily null terminated
- Reset in EHLO/HELO, MAIL, RCPT
When a command is done correctly, smtp_reset() is called. This function calls store_reset() to reset the block chain to a reset point, which means all storeblocks allocated by store_get() after the last command are released.
  - store_reset() to the reset point (set at the beginning of the function)
  - release blocks added at a time
smtp_in.c: 3771 smtp_setup_msg
int
smtp_setup_msg(void)
{
int done = 0;
BOOL toomany = FALSE;
BOOL discarded = FALSE;
BOOL last_was_rej_mail = FALSE;
BOOL last_was_rcpt = FALSE;
void *reset_point = store_get(0);

DEBUG(D_receive) debug_printf("smtp_setup_msg entered\n");

/* Reset for start of new message. We allow one RSET not to be counted as a
nonmail command, for those MTAs that insist on sending it between every
message. Ditto for EHLO/HELO and for STARTTLS, to allow for going in and out of
TLS between messages (an Exim client may do this if it has messages queued up
for the host). Note: we do NOT reset AUTH at this point. */

smtp_reset(reset_point);
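The reset-point behaviour can be modelled with a toy pool allocator. This is a simplified sketch, not Exim's store.c: every toy_ name is hypothetical, the block header layout is an assumption, and only the 0x2000-byte minimum and the "free everything after the reset point" behaviour are taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MIN_BLOCK 0x2000u   /* minimum payload per block, as in the text */

typedef struct toy_block {
    struct toy_block *next;
    size_t length;              /* payload bytes following this header */
} toy_block;

static toy_block *toy_chain;    /* first block in the chain */
static toy_block *toy_current;  /* block currently serving requests */
static char *toy_next;          /* next free byte in toy_current */
static size_t toy_left;         /* free bytes left in toy_current */

/* Bump-allocate size bytes, adding a new block when the current one is full. */
void *toy_store_get(size_t size)
{
    size = (size + 7) & ~(size_t)7;              /* keep 8-byte alignment */
    if (toy_current == NULL || size > toy_left) {
        size_t length = size > TOY_MIN_BLOCK ? size : TOY_MIN_BLOCK;
        toy_block *b = malloc(sizeof(toy_block) + length);
        assert(b != NULL);
        b->next = NULL;
        b->length = length;
        if (toy_current != NULL) toy_current->next = b; else toy_chain = b;
        toy_current = b;
        toy_next = (char *)(b + 1);
        toy_left = length;
    }
    void *p = toy_next;
    toy_next += size;
    toy_left -= size;
    return p;
}

/* Rewind to reset_point and free every block added after it. */
void toy_store_reset(void *reset_point)
{
    uintptr_t p = (uintptr_t)reset_point;
    toy_block *b = toy_chain;
    while (b != NULL) {                          /* find the owning block */
        uintptr_t start = (uintptr_t)(b + 1);
        if (p >= start && p <= start + b->length) break;
        b = b->next;
    }
    assert(b != NULL);                           /* must come from toy_store_get */

    toy_block *dead = b->next;
    b->next = NULL;
    toy_current = b;
    toy_next = (char *)reset_point;
    toy_left = (size_t)((uintptr_t)(b + 1) + b->length - p);
    while (dead != NULL) {                       /* these chunks go back to glibc */
        toy_block *n = dead->next;
        free(dead);
        dead = n;
    }
}

/* Number of blocks currently in the chain. */
size_t toy_block_count(void)
{
    size_t n = 0;
    for (toy_block *b = toy_chain; b != NULL; b = b->next) n++;
    return n;
}

/* Usage: take a reset point, grow the chain, then rewind. */
void demo_reset(void)
{
    void *reset_point = toy_store_get(0);   /* like store_get(0) in smtp_setup_msg */
    toy_store_get(0x3000);
    toy_store_get(0x3000);
    assert(toy_block_count() == 3);
    toy_store_reset(reset_point);
    assert(toy_block_count() == 1);
}
```

In this model, everything allocated after the reset point is handed back to free() in one pass, which is why a successful command can release several storeblocks at once.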
Exploit steps
To leverage this off-by-one, the chunk beneath the decoded base64 data should be easy to free and controllable. After several attempts, we found that sender_helo_name is a better choice. We arrange the heap layout to leave a freed chunk above sender_helo_name for the base64 data.
- Put a huge chunk into unsorted bin
First of all, we send an EHLO message with a huge hostname to make exim allocate and deallocate, leaving a chunk of length 0x6060 (3 storeblocks long) in the unsorted bin.
- Cut the first storeblock
Then we send an unrecognized string to trigger store_get() and allocate a storeblock inside the freed chunk.
- Cut the second storeblock and release the first one
We send an EHLO message again to get the second storeblock. The first block is then freed because smtp_reset() is called after EHLO is done.
After the heap layout is prepared, we can use the off-by-one to overwrite the original chunk size. We modify 0x2021 to 0x20f1, which slightly extends the chunk.
- Send base64 data and trigger off-by-one
To trigger the off-by-one, we start an AUTH command to send base64 data. The overflow byte precisely overwrites the first byte of the next chunk and extends the next chunk.
- Forge a reasonable chunk size
Because the chunk is extended, the start of the chunk after it now falls somewhere inside the original one. Therefore, we need to make that location look like a normal chunk to pass the sanity checks in glibc. We send another base64 string here, because forging a chunk size requires NULL bytes and unprintable characters.
- Release the extended chunk
To control the content of the extended chunk, we need to release the chunk first because we cannot edit it directly. That is, we should send a new EHLO message to release the old host name. However, a normal EHLO message calls smtp_reset() after it succeeds, which may make the program abort or crash. To avoid this, we send an invalid host name such as a+.
- Overwrite the next pointer of overlapped storeblock
After the chunk is released, we can retrieve it with AUTH and overwrite part of the overlapped storeblock. Here we use a trick called partial write. With this, we can modify the pointer without breaking ASLR (Address Space Layout Randomization). We partially change the next pointer to point to a storeblock containing ACL (Access Control List) strings. The ACL strings are pointed to by a set of global pointers such as:
uschar *acl_smtp_auth;
uschar *acl_smtp_data;
uschar *acl_smtp_etrn;
uschar *acl_smtp_expn;
uschar *acl_smtp_helo;
uschar *acl_smtp_mail;
uschar *acl_smtp_quit;
uschar *acl_smtp_rcpt;
These pointers are initialized at the beginning of the exim process and set according to the configuration. For example, if there is a line acl_smtp_mail = acl_check_mail in the configuration, the pointer acl_smtp_mail points to the string acl_check_mail. Whenever MAIL FROM is used, exim performs an ACL check, which expands acl_check_mail first. While expanding, exim tries to execute commands if it encounters ${run{cmd}}, so we achieve code execution as long as we control the ACL strings. In addition, we do not need to hijack the program control flow directly, and therefore we can easily bypass mitigations such as PIE (Position Independent Executables) and NX.
- Reset storeblocks and retrieve the ACL storeblock
Now the ACL storeblock is in the linked list chain. It will be released once smtp_reset() is triggered, and then we can retrieve it again by allocating multiple blocks.
- Overwrite ACL strings and trigger ACL check
Finally, we overwrite the whole block containing ACL strings. Now we send commands such as EHLO, MAIL, RCPT to trigger ACL checks. Once we touch an ACL defined in the configuration, we achieve remote code execution.
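Two of the steps above rely on changing only the lowest bytes of a little-endian field: the one-byte overflow that turns 0x2021 into 0x20f1, and the partial write of the next pointer. The sketch below illustrates that byte-level effect on local buffers only. The helper names are hypothetical, and the pointer value and the two-byte write width in the demo are made-up numbers, not values from the exploit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value as little-endian bytes, the x86-64 memory layout. */
void store_le64(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) p[i] = (unsigned char)(v >> (8 * i));
}

/* Read a 64-bit little-endian value back from memory. */
uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Overwrite only the lowest n bytes of a stored field. */
void overwrite_low_bytes(unsigned char *field, const unsigned char *src, size_t n)
{
    assert(n <= 8);
    for (size_t i = 0; i < n; i++) field[i] = src[i];
}

void demo_partial_writes(void)
{
    /* Off-by-one: a single byte lands on the first byte of the size field. */
    unsigned char size_field[8];
    unsigned char one = 0xf1;
    store_le64(size_field, 0x2021);
    overwrite_low_bytes(size_field, &one, 1);
    assert(load_le64(size_field) == 0x20f1);

    /* Partial write: the upper bytes of the pointer are left untouched. */
    unsigned char ptr_field[8];
    unsigned char low[2] = { 0x30, 0x12 };
    store_le64(ptr_field, 0x0000555555770010ULL);   /* made-up address */
    overwrite_low_bytes(ptr_field, low, 2);
    assert(load_le64(ptr_field) == 0x0000555555771230ULL);
}
```

With the size field changed from 0x2021 to 0x20f1, the chunk appears 0xd0 bytes longer, so glibc expects the following chunk header 0xd0 bytes later than before. That is the location where a plausible size has to be forged in the "Forge a reasonable chunk size" step.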
Fix
Upgrade to 4.90.1 or above
Timeline
- 5 February, 2018 09:10 Reported to Exim
- 6 February, 2018 23:23 CVE received
- 10 February, 2018 18:00 Patch released
Credits
Vulnerabilities found by Meh, DEVCORE research team.
meh [at] devco [dot] re
Reference
https://exim.org/static/doc/security/CVE-2018-6789.txt
https://git.exim.org/exim.git/commit/cf3cd306062a08969c41a1cdd32c6855f1abecf1
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-6789
http://www.openwall.com/lists/oss-security/2018/02/07/2
Heap exploitation materials
- Heap Exploitation: A tutorial of heap exploitation by Dhaval Kapil
- how2heap: A repo for learning heap exploitation by Shellphish
- Heap exploitation: (Chinese) A slide introducing basic glibc heap exploitation by Angelboy
- Advanced heap exploitation: (Chinese) A slide of advanced heap exploitation techniques by Angelboy
- The poisoned NUL byte: An article of Null byte off-by-one exploitation by Project Zero