Exim Off-by-one RCE: Exploiting CVE-2018-6789 with Full Mitigation Bypass

Overview

We reported an overflow vulnerability in the base64 decode function of Exim on 5 February 2018, identified as CVE-2018-6789. The bug has existed since the first commit of Exim, so all versions are affected. According to our research, it can be leveraged to gain pre-auth remote code execution, and at least 400k servers are at risk. The patched version 4.90.1 has been released, and we recommend upgrading Exim immediately.

Affected

All Exim versions below 4.90.1

Vulnerability Analysis

The bug is a miscalculation of the decode buffer length in the b64decode function:
base64.c: 153 b64decode

b64decode(const uschar *code, uschar **ptr)
{
int x, y;
uschar *result = store_get(3*(Ustrlen(code)/4) + 1);

*ptr = result;
// perform decoding
}

As shown above, Exim allocates a buffer of 3*(len/4)+1 bytes to store the decoded base64 data. However, when the input is not a valid base64 string and its length is 4n+3, Exim allocates 3n+1 bytes but writes 3n+2 bytes while decoding. This causes a one-byte heap overflow (an off-by-one).
In most cases the bug is harmless because the overwritten byte lands in memory that is usually unused. However, when the string has certain specific lengths, the byte overwrites critical data. In addition, the value of this byte is controllable, which makes exploitation more feasible.
Because base64 decoding is such a fundamental function, the bug is easy to trigger, and it can lead to remote code execution.
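The size mismatch can be checked with a little arithmetic. The sketch below is not Exim code. `b64_alloc_size` mirrors the `store_get()` argument shown above, while `b64_bytes_stored` is a simplified model that assumes a decoder emitting three bytes per complete four-character group, one byte after two trailing characters, two bytes after three trailing characters, and a terminating NUL only when decoding succeeds. All three function names are hypothetical.

```c
#include <stddef.h>

/* Buffer size requested before the fix: 3*(len/4) + 1. */
size_t b64_alloc_size(size_t len)
{
    return 3 * (len / 4) + 1;
}

/* Simplified model (an assumption, not Exim's code) of how many bytes
   the decode loop stores for `len` base64 characters without '='. */
size_t b64_bytes_stored(size_t len)
{
    size_t stored = 3 * (len / 4);   /* 3 bytes per complete group */

    switch (len % 4) {
    case 0: stored += 1; break;      /* terminating NUL on success */
    case 1: break;                   /* fails before storing anything */
    case 2: stored += 1; break;      /* one byte, then fails */
    case 3: stored += 2; break;      /* two bytes, then fails */
    }
    return stored;
}

/* Number of bytes written past the end of the buffer (0 if none). */
size_t b64_overflow(size_t len)
{
    size_t need = b64_bytes_stored(len);
    size_t have = b64_alloc_size(len);
    return need > have ? need - have : 0;
}
```

Under this model only lengths of the form 4n+3 overflow, and always by exactly one byte. For example, a 7-character input gets a 4-byte buffer but stores 5 bytes.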

Exploitation

To estimate the severity of this bug, we developed an exploit targeting the SMTP daemon of Exim. The mechanism used to achieve pre-auth remote code execution is described in the following paragraphs. To leverage this one-byte overflow, it is necessary to trick the memory management mechanism. We highly recommend having basic knowledge of heap exploitation [ref] before reading this section.

We developed the exploit with:

Debian(stretch) and Ubuntu(zesty)
SMTP daemon of Exim4 package installed with apt-get (4.89/4.88)
CRAM-MD5 authenticator enabled in the config (uncommented in the default config); any other authenticator using base64 also works
Basic SMTP commands (EHLO, MAIL FROM/RCPT TO) and AUTH

Memory allocation

First, we review the source code and search for useful memory allocations. As we mentioned in the previous article, Exim uses its own functions for dynamic allocation:

extern BOOL    store_extend_3(void *, int, int, const char *, int);  /* The */
extern void    store_free_3(void *, const char *, int);     /* value of the */
extern void   *store_get_3(int, const char *, int);         /* 2nd arg is   */
extern void   *store_get_perm_3(int, const char *, int);    /* __FILE__ in  */
extern void   *store_malloc_3(int, const char *, int);      /* every call,  */
extern void    store_release_3(void *, const char *, int);  /* so give its  */
extern void    store_reset_3(void *, const char *, int);    /* correct type */

The functions store_malloc() and store_free() call glibc's malloc() and free() directly. On every allocation, glibc takes a slightly bigger chunk (0x10 bytes more), stores its metadata in the first 0x10 bytes (on x86-64), and returns the location of the data. The following illustration describes the structure of a chunk:

The metadata includes the size of the previous chunk (the one directly above in memory), the size of the current chunk, and some flags. The lowest three bits of the size field store the flags. In this example, a size of 0x81 means that the current chunk is 0x80 bytes and the previous chunk is in use.
Most of the released chunks used by Exim are put into a doubly linked list called the unsorted bin. glibc maintains it according to the flags and merges adjacent released chunks into a bigger chunk to avoid fragmentation. For every allocation request, glibc checks these chunks in FIFO (first in, first out) order and reuses them.
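The chunk metadata described above can be modelled in a few lines of C. This is a minimal sketch for x86-64, not glibc's actual definition: the flag names follow glibc, while `SIZE_FLAG_MASK`, `struct chunk_header`, `chunk_size` and `prev_in_use` are names chosen here for illustration.

```c
#include <stdint.h>

/* Flag bits kept in the low three bits of the size field. */
#define PREV_INUSE      0x1u
#define IS_MMAPPED      0x2u
#define NON_MAIN_ARENA  0x4u
#define SIZE_FLAG_MASK  (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

/* Model of the 0x10-byte header that precedes the user data. */
struct chunk_header {
    uint64_t prev_size;  /* size of the previous chunk, meaningful only if it is free */
    uint64_t size;       /* size of this chunk (header included) plus flag bits */
};

/* Chunk size with the flag bits masked off. */
uint64_t chunk_size(uint64_t size_field)
{
    return size_field & ~(uint64_t)SIZE_FLAG_MASK;
}

/* Non-zero if the previous chunk is marked as in use. */
int prev_in_use(uint64_t size_field)
{
    return (size_field & PREV_INUSE) != 0;
}
```

With these helpers, the size field 0x81 from the example decodes to a chunk size of 0x80 with the previous chunk in use.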

For performance reasons, Exim maintains its own linked-list structure on top of this, managed through store_get(), store_release(), store_extend() and store_reset().
[Figure: architecture of a storeblock]
The main feature of storeblocks is that every block holds at least 0x2000 bytes, which becomes a restriction on our exploitation. Note that a storeblock is itself the data of a glibc chunk, so in memory the chunk metadata comes first, followed by the storeblock header and then the storeblock data.
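To make the sizes concrete, here is a rough model of a minimum-size storeblock and of the glibc chunk that holds it. It is a sketch under assumptions: x86-64, a storeblock header consisting of a next pointer and a length, and glibc's usual rounding of a request to a 16-byte-aligned chunk. The struct, macro and function names are illustrative and not taken from Exim or glibc.

```c
#include <stdint.h>

#define STOREBLOCK_DATA_SIZE 0x2000u  /* minimum data area per storeblock */
#define MALLOC_SIZE_SZ       8u       /* sizeof(size_t) on x86-64 */
#define MALLOC_ALIGN_MASK    0xfu     /* chunks are 16-byte aligned */
#define PREV_INUSE           0x1u

/* Model of the header at the start of each storeblock. */
struct storeblock_model {
    uint64_t next;    /* pointer to the next storeblock in the chain */
    uint64_t length;  /* usable bytes that follow this header */
};

/* Chunk size used for a malloc() request of `request` bytes
   (ignores the minimum chunk size, irrelevant for large requests). */
uint64_t chunk_size_for_request(uint64_t request)
{
    return (request + MALLOC_SIZE_SZ + MALLOC_ALIGN_MASK)
           & ~(uint64_t)MALLOC_ALIGN_MASK;
}

/* Size field of the chunk holding a minimum-size storeblock,
   assuming the previous chunk is in use. */
uint64_t storeblock_size_field(void)
{
    uint64_t request = sizeof(struct storeblock_model) + STOREBLOCK_DATA_SIZE;
    return chunk_size_for_request(request) | PREV_INUSE;
}
```

Under these assumptions a default storeblock occupies a 0x2020-byte chunk whose size field reads 0x2021, and three such chunks side by side span 0x6060 bytes, which matches the values that appear in the exploit steps.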

Here we list functions used to arrange heap data:

EHLO hostname
For each EHLO (or HELO) command, Exim stores a pointer to the hostname in sender_helo_name.

- store_free() for the old name
- store_malloc() for the new name

smtp_in.c: 1833 check_helo

  /* Discard any previous helo name */

  if (sender_helo_name != NULL)
    {
    store_free(sender_helo_name);
    sender_helo_name = NULL;
    }
  ...
  if (yield) sender_helo_name = string_copy_malloc(start);
  return yield;
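The order of operations in this excerpt matters for the exploit: the old name is freed before the new one is validated, so a rejected hostname still releases the previous buffer. The following self-contained sketch models that behaviour. It is not Exim code: `helo_name` stands in for sender_helo_name, and `helo_chars_ok` is a hypothetical, deliberately simplified validity rule (letters, digits, '.' and '-').

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static char *helo_name;  /* stands in for sender_helo_name */

/* Hypothetical, simplified validity rule for a HELO name. */
static int helo_chars_ok(const char *s)
{
    if (*s == '\0') return 0;
    for (; *s != '\0'; s++)
        if (!isalnum((unsigned char)*s) && *s != '.' && *s != '-')
            return 0;
    return 1;
}

/* Same order as the excerpt: free the old name first, then keep a
   malloc()ed copy only if the new name was accepted. */
int set_helo_name(const char *start)
{
    int yield;

    if (helo_name != NULL) {
        free(helo_name);
        helo_name = NULL;
    }
    yield = helo_chars_ok(start);
    if (yield) {
        size_t n = strlen(start) + 1;
        helo_name = malloc(n);
        if (helo_name == NULL) abort();
        memcpy(helo_name, start, n);
    }
    return yield;
}

/* Accessor for the currently stored name (NULL if none). */
const char *current_helo_name(void)
{
    return helo_name;
}
```

In this model, calling `set_helo_name("a+")` after a valid name returns 0 and leaves no name stored, but the previous allocation has already been handed back to free().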

Unrecognized command
For every unrecognized command that contains unprintable characters, Exim allocates a buffer to convert it to a printable form.
- store_get() to store the error message
smtp_in.c: 5725 smtp_setup_msg
```
  done = synprot_error(L_smtp_syntax_error, 500, NULL,
    US"unrecognized command");
```
AUTH
In most authentication procedures, Exim uses base64 encoding to communicate with the client. The encoded and decoded strings are stored in buffers allocated by store_get().
- store_get() for strings
- can contain unprintable characters and NUL bytes
- not necessarily NUL-terminated

Reset in EHLO/HELO, MAIL, RCPT
When a command completes successfully, smtp_reset() is called. This function calls store_reset() to roll the block chain back to a reset point, which means that all storeblocks allocated by store_get() after the last command are released.

- store_reset() back to the reset point (set at the beginning of the function)
- releases all blocks added since then at once

smtp_in.c: 3771 smtp_setup_msg

  int
  smtp_setup_msg(void)
  {
  int done = 0;
  BOOL toomany = FALSE;
  BOOL discarded = FALSE;
  BOOL last_was_rej_mail = FALSE;
  BOOL last_was_rcpt = FALSE;
  void *reset_point = store_get(0);

  DEBUG(D_receive) debug_printf("smtp_setup_msg entered\n");

  /* Reset for start of new message. We allow one RSET not to be counted as a
  nonmail command, for those MTAs that insist on sending it between every
  message. Ditto for EHLO/HELO and for STARTTLS, to allow for going in and out of
  TLS between messages (an Exim client may do this if it has messages queued up
  for the host). Note: we do NOT reset AUTH at this point. */

  smtp_reset(reset_point);
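The reset-point idea can be illustrated with a toy allocator. This is a minimal sketch and not Exim's store.c: it has a single pool, and the real store_reset() contains additional logic that is omitted here. All `toy_` names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_BLOCK_SIZE 0x2000u

typedef struct toy_block {
    struct toy_block *next;
    size_t length;               /* bytes of data following the header */
} toy_block;

static toy_block *chain_base, *current_block;
static char *next_yield;         /* next free byte in current_block */
static size_t yield_length;      /* free bytes left in current_block */

/* Bump-allocate `size` bytes, adding a new block when needed. */
void *toy_get(size_t size)
{
    size = (size + 7) & ~(size_t)7;          /* keep results 8-byte aligned */
    if (current_block == NULL || size > yield_length) {
        size_t length = size > TOY_BLOCK_SIZE ? size : TOY_BLOCK_SIZE;
        toy_block *nb = malloc(sizeof *nb + length);
        if (nb == NULL) abort();
        nb->next = NULL;
        nb->length = length;
        if (chain_base == NULL) chain_base = nb;
        else current_block->next = nb;
        current_block = nb;
        next_yield = (char *)(nb + 1);
        yield_length = length;
    }
    void *p = next_yield;
    next_yield += size;
    yield_length -= size;
    return p;
}

/* Roll back to `reset_point` and free every block added after it. */
void toy_reset(void *reset_point)
{
    uintptr_t p = (uintptr_t)reset_point;
    toy_block *b = chain_base;

    while (b != NULL) {                      /* find the containing block */
        uintptr_t start = (uintptr_t)(b + 1);
        if (p >= start && p <= start + b->length) break;
        b = b->next;
    }
    if (b == NULL) abort();

    current_block = b;
    next_yield = reset_point;
    yield_length = (size_t)((uintptr_t)(b + 1) + b->length - p);

    toy_block *dead = b->next;
    b->next = NULL;
    while (dead != NULL) {
        toy_block *n = dead->next;
        free(dead);
        dead = n;
    }
}

/* Number of blocks currently in the chain. */
size_t toy_block_count(void)
{
    size_t n = 0;
    for (toy_block *b = chain_base; b != NULL; b = b->next) n++;
    return n;
}
```

As in smtp_setup_msg(), `toy_get(0)` records the current position without consuming memory. Passing that pointer to `toy_reset()` later hands all blocks allocated in between back to free(), which is what returns storeblocks to glibc during the exploit.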

Exploit steps

To leverage this off-by-one, the chunk directly beneath the decoded base64 data should be easy to free and should have controllable content. After several attempts, we found that sender_helo_name is the better choice. We arrange the heap layout to leave a freed chunk above sender_helo_name for the base64 data.

Put a huge chunk into unsorted bin
First of all, we send an EHLO message with a huge hostname to make Exim allocate and deallocate it, leaving a chunk of 0x6060 bytes (three storeblocks long) in the unsorted bin.
Cut the first storeblock
Then we send an unrecognized string to trigger store_get() and allocate a storeblock inside the freed chunk.
Cut the second storeblock and release the first one
We send an EHLO message again to take the second storeblock. The first block is then freed by the smtp_reset() call that follows a completed EHLO.

After the heap layout is prepared, we can use the off-by-one to overwrite the original chunk size. We change 0x2021 to 0x20f1, which slightly extends the chunk.
Send base64 data and trigger off-by-one
To trigger the off-by-one, we start an AUTH command to send base64 data. The overflowing byte overwrites exactly the lowest byte of the next chunk's size field and thereby extends that chunk.
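The effect of the single overflowing byte on the size field can be written out as arithmetic. This is a small sketch assuming a little-endian 64-bit size field, as on x86-64. The function names are illustrative.

```c
#include <stdint.h>

/* Replace only the least significant byte of a size field, as a
   one-byte overflow into a little-endian field would. */
uint64_t overwrite_low_byte(uint64_t size_field, uint8_t byte)
{
    return (size_field & ~(uint64_t)0xff) | byte;
}

/* How many bytes the chunk grows by, ignoring the three flag bits. */
uint64_t chunk_growth(uint64_t old_field, uint64_t new_field)
{
    return (new_field & ~(uint64_t)7) - (old_field & ~(uint64_t)7);
}
```

Overwriting the low byte of 0x2021 with 0xf1 yields 0x20f1: the PREV_INUSE flag stays set and the chunk appears 0xd0 bytes longer than it really is, so it now overlaps the start of the chunk that follows it.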
Forge a reasonable chunk size
Because the chunk is extended, the start of the next chunk now falls somewhere inside the original next chunk. Therefore, we need to make that location look like a normal chunk header to pass the sanity checks in glibc. We send another base64 string here, because forging a chunk size requires NUL bytes and unprintable characters.
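The kind of sanity check involved can be sketched as follows. This is a simplified paraphrase of two checks that glibc's free() applies to the chunk following the one being released. It is written from general knowledge of glibc rather than taken from this article, and glibc performs further checks that are not modelled. `system_mem` stands for the arena's total size. The function names are illustrative.

```c
#include <stdint.h>

#define SIZE_SZ    8u     /* sizeof(size_t) on x86-64 */
#define PREV_INUSE 0x1u

/* Offset, from the start of a chunk, at which glibc expects the
   header of the following chunk. */
uint64_t next_header_offset(uint64_t size_field)
{
    return size_field & ~(uint64_t)7;
}

/* Returns 1 if a forged size field for the following chunk would pass
   both modelled checks, 0 otherwise. */
int next_chunk_plausible(uint64_t next_size_field, uint64_t system_mem)
{
    uint64_t size = next_size_field & ~(uint64_t)7;

    if (next_size_field <= 2 * SIZE_SZ) return 0;   /* too small */
    if (size >= system_mem) return 0;               /* larger than the arena */
    if (!(next_size_field & PREV_INUSE)) return 0;  /* chunk being freed must be in use */
    return 1;
}
```

After the extension to 0x20f1, the following header is expected 0x20f0 bytes from the start of the chunk, so a plausible size value has to be planted there. Any such value stored as an 8-byte little-endian field contains NUL bytes in its upper half (0x2021 is stored as 21 20 00 00 00 00 00 00), which is why plain printable text is not enough.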
Release the extended chunk
To control the content of the extended chunk, we need to release it first, because we cannot edit it directly. That is, we should send a new EHLO message to release the old host name. However, a normal EHLO message calls smtp_reset() after it succeeds, which may make the program abort or crash. To avoid this, we send an invalid host name such as a+.
Overwrite the next pointer of overlapped storeblock

After the chunk is released, we can retrieve it with AUTH and overwrite part of the overlapped storeblock. Here we use a trick called partial write, which lets us modify the pointer without having to defeat ASLR (Address Space Layout Randomization). We partially overwrite the next pointer so that it points to a storeblock containing ACL (Access Control List) strings. The ACL strings are referenced by a set of global pointers such as:
```
 uschar *acl_smtp_auth;
 uschar *acl_smtp_data;
 uschar *acl_smtp_etrn;
 uschar *acl_smtp_expn;
 uschar *acl_smtp_helo;
 uschar *acl_smtp_mail;
 uschar *acl_smtp_quit;
 uschar *acl_smtp_rcpt;
```
These pointers are initialized when the Exim process starts and are set according to the configuration. For example, if the configuration contains the line acl_smtp_mail = acl_check_mail, the pointer acl_smtp_mail points to the string acl_check_mail. Whenever MAIL FROM is used, Exim performs an ACL check, which expands acl_check_mail first. While expanding, Exim tries to execute a command if it encounters ${run{cmd}}, so we achieve code execution as long as we control the ACL strings. In addition, we do not need to hijack the program's control flow directly, and therefore we can easily bypass mitigations such as PIE (Position Independent Executable) and NX.
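The partial write mentioned above relies on the fact that ASLR randomizes the upper part of an address while nearby heap objects share it. A minimal sketch of the idea follows, with a hypothetical helper name and made-up addresses in the usage note.

```c
#include <stdint.h>

/* Replace the `nbytes` least significant bytes of `ptr` with those of
   `low`, leaving the higher (randomized) bytes untouched. */
uint64_t partial_overwrite(uint64_t ptr, uint64_t low, unsigned nbytes)
{
    if (nbytes >= 8) return low;
    uint64_t mask = ((uint64_t)1 << (8 * nbytes)) - 1;
    return (ptr & ~mask) | (low & mask);
}
```

For instance, replacing the two low bytes of the illustrative pointer 0x55d7f2a3c010 with 0x1234 gives 0x55d7f2a31234: the result still lies in the same randomized region, although the attacker never learned the upper bytes.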
Reset storeblocks and retrieve the ACL storeblock
Now the ACL storeblock is part of the linked-list chain. It is released as soon as smtp_reset() is triggered, and we can then retrieve it again by allocating multiple blocks.
Overwrite ACL strings and trigger ACL check
Finally, we overwrite the whole block containing the ACL strings. We then send commands such as EHLO, MAIL and RCPT to trigger ACL checks. As soon as one of them reaches an ACL defined in the configuration, we achieve remote code execution.

Fix

Upgrade to 4.90.1 or above

Timeline

5 February, 2018 09:10 Reported to Exim
6 February, 2018 23:23 CVE received
10 February, 2018 18:00 Patch released

Credits

Vulnerabilities found by Meh, DEVCORE research team.
meh [at] devco [dot] re

Reference

https://exim.org/static/doc/security/CVE-2018-6789.txt
https://git.exim.org/exim.git/commit/cf3cd306062a08969c41a1cdd32c6855f1abecf1
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-6789
http://www.openwall.com/lists/oss-security/2018/02/07/2

Heap exploitation materials

Heap Exploitation: A tutorial on heap exploitation by Dhaval Kapil
how2heap: A repo for learning heap exploitation by Shellphish
Heap exploitation: (Chinese) Slides introducing basic glibc heap exploitation by Angelboy
Advanced heap exploitation: (Chinese) Slides on advanced heap exploitation techniques by Angelboy
The poisoned NUL byte: An article on NUL-byte off-by-one exploitation by Project Zero