Lessons from Securing FreeRDP

2024-1-1 23:1:29 Author: eyalitkin.wordpress.com

Introduction

The story behind this 2-part blog series started quite a while ago, in September 2018, when I began vulnerability research on a (then) novel attack vector, “Reverse RDP”, while working at Check Point Research (CPR). Luckily, this research project proved itself right away, and in October of the same year we started the coordinated disclosure process with the affected projects: rdesktop and FreeRDP (both open source) and Microsoft’s MSTSC.

The research itself included several rounds of vulnerability research and publications, and was supposed to reach its end in July 2020 with the publication of a full remote code execution chain (demo) on the FreeRDP-based Apache Guacamole. Sounds like the end of the story, doesn’t it?

Well, not only was this not the end; chronologically speaking, we were barely halfway there.

While finding a vulnerability, reporting it and waiting for the project to fix it might take some time, eventually we are talking about an isolated fix for a single vulnerability. It is common practice to measure these processes on the order of a few months (a 90-day disclosure process). Not exactly a smooth waiting process, yet it could be worse. In addition, one might wonder about the overall benefit of such an isolated patching process. After all, similar vulnerabilities will probably still be discovered in said product, turning the whole thing into a cat-and-mouse game between attackers and defenders.

In this blog post we will present the technical details of the attempt to provide a complete fix to the root cause of the software vulnerabilities found in FreeRDP, and the timeline of this process. Our case study will be a patch I submitted to the project in October 2021 and that just recently (mid-December 2023) was announced as part of the latest release (3.0.0) of the project. Yup, you read that right. The fix was merged two years ago, was available on the development branch, and yet it was officially released only in the past few days.

To give you a sense of the security implications of this fix, here are some statistics about it. Had the fix been merged back in 2018, it would have blocked more than half of the info-leak vulnerabilities that were reported to the project. In my initial role as a security researcher, I found this timeline staggering. After all, what could be the reason for a two-year fix process? And still, in my current capacity as a software developer, I can say that it was actually a good and reasonable software release process on behalf of the project.

The goal of this article series is to present FreeRDP through two perspectives: Information security research on the one hand, and software development on the other. In this process, I hope to shed some light on the side that in my opinion gets too little attention and consideration from the infosec world: The team that actually develops the project.

Let’s start.

Remote Desktop Protocol & FreeRDP

Before diving into the attack vector, let’s first present the protocol that will be at the core of this article – Remote Desktop Protocol, or RDP for short. This protocol is frequently used by technical users and IT personnel to connect remotely to a target machine. The protocol itself was designed by Microsoft and is mostly known to users through the mstsc.exe program – Microsoft’s own built-in RDP client that is shipped as part of the Windows operating system.

While the program itself is closed-source, much like most of Microsoft’s products, there are still some open source projects that implement the protocol at various maturity levels. Such implementations include the relatively basic rdesktop (which seems to be discontinued), and the more mature FreeRDP. The latter is so common that quite a few commercial network solutions are based on top of it, some of which we will encounter later on. In addition, FreeRDP was chosen to be part of the Secure Open Source Rewards program (sos.dev), a program that aims to provide financial incentive for improving the security of open source projects that were classified as “critical”.

Reverse RDP

In the common scenario, users will use an RDP client to connect to a remote RDP server running on the target machine. This could be a virtual machine, our own personal computer located in the office while we work from home, or could “just” be the computer at the workstation near us when we are too lazy to stand up and walk to it. The goal of this remote connection is to have full access to the remote machine, limited by the permissions of our user. That is, full control (code execution) is by design: this is what the protocol was built for.

However, what happens if an attacker is able to reverse the nature of the protocol? Will a malicious RDP server be able to exploit a vulnerability in the (quite complex) RDP protocol so as to gain control over the unsuspecting client attempting to connect to it?

Figure 1: Illustration of the Reverse RDP attack vector – A client connecting to a malicious RDP server, which in turn leverages the protocol to attack the client back.

There are several common scenarios in which such an attack could allow an attacker to significantly upgrade their foothold in the network:

Attacking an IT staff member trying to connect to a corporate machine in order to solve a technical issue. Such an attack will allow us to gain control over the coveted workstation of a network admin in the corporate network.
Attacking a malware researcher trying to connect to a virtual machine that acts as a sandbox for an analyzed malware sample. Such an attack will effectively lead to a sandbox escape.

There is a third scenario, allowing an attacker in control of a single machine to upgrade their foothold into almost full control over the corporate network. We will elaborate more on this scenario later on.

The research methodology we chose followed a curve of increasing complexity: we started by studying the protocol through the open source projects, from the basic (rdesktop) to the mature (FreeRDP), while saving mstsc.exe for last. Our base assumption was that along the way we would familiarize ourselves with the protocol and the common implementation mistakes in it, so that we would be fully prepared to tackle Microsoft’s closed-source RDP client. This is especially important given that mstsc.exe dominates the market of RDP clients.

Implementation Vulnerabilities in RDP Clients

Soon enough, we found evidence pointing to the fact that Reverse RDP is a viable attack vector. At least in the studied open source clients we managed to find several basic implementation vulnerabilities in the modules responsible for parsing the RDP messages originating from the server. We classified these vulnerabilities into two main categories:

Out-of-bounds reads that leak memory contents back to the attacker – Info Leak.
Memory corruption vulnerabilities that might even allow for Remote Code Execution over the connecting client.

Here is an example of two such vulnerabilities that we found, one of each category:

rdesktop – CVE 2018-8798 – Info Leak

Figure 2: The fix for the info-leak vulnerability CVE 2018-8798 in rdesktop – Added check that the message contains at least 4 bytes.

This is a classic vulnerability, showing that the input is handled without first checking that the assumed content was ever sent by the server. As a matter of fact, this is a repeated pattern in which the developers must remember to check for the minimal message size before parsing any of the fields in the message.

In this specific case, two fields of 2 bytes each are read and sent back right away. This means that the bug leaks the 4 bytes of memory located right after the end of our message. From an attacker’s perspective, the hope is that these bytes will hold some sensitive information, possibly a pointer that will teach us something about the memory layout of the victim program.
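To make the pattern concrete, here is a minimal sketch in C. It is not rdesktop's code: the msg_stream type and all function names are hypothetical, and the "memory right after the message" is simulated by placing an empty message at the start of a larger array, so that the over-read stays well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal stream: a cursor over a received message. */
typedef struct {
    const uint8_t *data; /* start of the received message */
    size_t length;       /* number of bytes the server actually sent */
    size_t offset;       /* current read position */
} msg_stream;

/* Reads one little-endian 16-bit field and advances the cursor. */
static uint16_t read_u16(msg_stream *s)
{
    uint16_t v = (uint16_t)(s->data[s->offset] | (s->data[s->offset + 1] << 8));
    s->offset += 2;
    return v;
}

/* Vulnerable pattern: assumes both fields were sent, never looks at length. */
static void parse_vulnerable(msg_stream *s, uint16_t out[2])
{
    out[0] = read_u16(s);
    out[1] = read_u16(s);
}

/* Fixed pattern: rejects any message too short to hold the two fields. */
static int parse_checked(msg_stream *s, uint16_t out[2])
{
    if (s->length - s->offset < 4)
        return -1; /* message too short, discard it */
    out[0] = read_u16(s);
    out[1] = read_u16(s);
    return 0;
}
```

With a zero-length message, parse_vulnerable() hands back whatever 4 bytes happen to follow the message in memory, while parse_checked() refuses to parse at all.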

FreeRDP – CVE 2018-8786 – Heap-based Buffer Overflow

Figure 3: The fix for a memory corruption vulnerability CVE 2018-8786 in FreeRDP – Saving the calculated value in a variable of type UINT32 instead of UINT16.

Once again, one can notice that the handling of the incoming message starts with a call to Stream_GetRemainingLength() before extracting the number of rectangles we will need to handle.

Still, the vulnerability in this case originates from the multiplication operation done on the input, and specifically the fact we store the result in a UINT16 variable even though the calculated value can reach a range of 17 bits. As an illustration:

bitmapUpdate->count = 0x8001
bitmapUpdate->count * 2 = 0x10002
UINT16 count = 0x10002 & 0xFFFF = 2

Later on, the loop that handles the incoming rectangles will run for 0x8001 iterations, while the memory allocation will use the smaller value of 2, thus leading to a Heap-based memory corruption.
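The truncation above can be reproduced with a few lines of C. This is only a sketch of the arithmetic, not the FreeRDP code; the two helper names are made up for the illustration.

```c
#include <stdint.h>

/* Buggy variant: the doubled count is squeezed back into 16 bits,
 * so the 17th bit of the result is silently dropped. */
static uint16_t doubled_count_u16(uint16_t count)
{
    return (uint16_t)(count * 2);
}

/* Fixed variant: the result is kept in a 32-bit variable,
 * which is wide enough for any doubled 16-bit value. */
static uint32_t doubled_count_u32(uint16_t count)
{
    return (uint32_t)count * 2u;
}
```

For count = 0x8001 the first helper returns 2 and the second returns 0x10002, which is exactly the gap between the size that gets allocated and the number of rectangles the loop later writes.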

Summary of the found vulnerabilities

After the first iteration of our research, we found the following vulnerabilities:

rdesktop (19): CVE 2018-8791, CVE 2018-8792, CVE 2018-8793, CVE 2018-8794, CVE 2018-8795, CVE 2018-8796, CVE 2018-8797, CVE 2018-8798, CVE 2018-8799, CVE 2018-8800, CVE 2018-20174, CVE 2018-20175, CVE 2018-20176, CVE 2018-20177, CVE 2018-20178, CVE 2018-20179, CVE 2018-20180, CVE 2018-20181, CVE 2018-20182
FreeRDP (6): CVE 2018-8784, CVE 2018-8785, CVE 2018-8786, CVE 2018-8787, CVE 2018-8788, CVE 2018-8789
Mstsc (1): No CVE

26 vulnerabilities overall, spanning 3 different implementations. The vulnerability we found in mstsc.exe allows an attacker to leverage a copy-paste event during the RDP connection in order to drop fully controlled files in arbitrary paths in the victim’s file system. As an example, such an attack primitive could allow us to write our malware to the Startup folder of the user, leading to its execution once the victim’s computer reboots. This attack was demonstrated in the following video.

It took a while (more about it in Part 2) until Microsoft was convinced of the validity of our report, and eventually they issued the following CVE IDs for our findings: CVE 2019-0887 and later on also CVE 2020-0655 (Microsoft’s initial fix was incomplete).

Diving Into the Root Cause of the Issues in FreeRDP – Apache Guacamole as a Case Study

In early 2020, during the early stages of the Covid-19 lockdown, our team received a new topic for our research project – Apache Guacamole. The product, which is based on FreeRDP, is a common remoting solution that provides access to a corporate computer when working from home. Here is an illustration of the network architecture of a common deployment of the product:

Figure 4: An example network architecture for a deployment of Apache Guacamole.

As is evident from the above figure, the product acts as a proxy that on the one hand acts as an authentication server and on the other hand makes use of an RDP, VNC or SSH client to connect to the corporate machine (based on the server’s configuration).

The implication of this architecture is that when an employee connects to their organizational machine, they allow an attacker that controls said machine to launch a Reverse RDP attack on the Apache Guacamole Proxy server. An attacker’s best hope in such a case would be that any employee connecting to their machine will allow them to fully control the Proxy server, through which all employee connections in the organization pass. Such an attack will upgrade the attacker from controlling a single computer in the network to having the lucrative access to the sessions of all employees. This is a classic example of a Man-In-The-Middle attack.

And indeed, during our research we found vulnerabilities that allowed us to build a full attack chain that demonstrates this exact attack scenario: Taking over the Proxy and from there fully controlling the sessions of the rest of the employees. Here is a video of our demonstration.

This attack made use of two of the vulnerabilities we found in our research:

CVE 2020-9497 – A controlled 16-bit length field info-leak (just like Heartbleed).
CVE 2020-9498 – Dangling pointer allowing for memory corruption leading to the demonstrated remote code execution.

The first vulnerability was leveraged to leak internal memory addresses, in order to bypass ASLR and learn about the program’s internal memory layout. The second vulnerability was then used to take over the machine itself. For more details about our research, please refer to the full research blog.

Core Issues in FreeRDP’s Software Design

During our research, we noticed two main issues in FreeRDP’s software design:

Fragile design: Before parsing any field in an incoming message, the developer must first check that the message is even big enough to hold said field, otherwise we will read out-of-bounds.

Figure 5: Calling Stream_GetRemainingLength() to check that the message contains at least 2 bytes, followed by the parsing of the numberOrders field, which is 2 bytes long.

Singular defense line: The input validation function was defined with a return value of type size_t, which is unsigned, instead of a signed numerical type. As such, reading past the message bounds will cause any future input validation check to pass, regardless of the message’s actual length.

Figure 6: Implementation of Stream_GetRemainingLength() showing a return value of the type size_t (unsigned).

For example, reading 2 bytes out of bounds will lead to a remaining message length of 2⁶⁴-2 bytes (on a 64-bit platform). In practice, this means that any future check of “Can we read X more bytes?” will pass with room to spare.

The combination of these two design issues is the leading factor for many of the vulnerabilities found in FreeRDP over the years. In addition, even if we magically train all developers of the project to be aware of the importance of proper input validation checks, and even if we ignore human errors, it is hard to believe that this knowledge will correctly pass on to any new developer working on this project or its extensions. After all, FreeRDP supports extensions, and in the Apache Guacamole case our attack was aimed at an extension implemented by external developers.
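The wraparound is easy to reproduce. The sketch below is a simplification and not FreeRDP's implementation: toy_stream tracks the read position as an index rather than a pointer, and both function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stream: 'position' counts the bytes consumed so far. */
typedef struct {
    size_t length;   /* bytes in the message */
    size_t position; /* bytes already read (may exceed length after a bug) */
} toy_stream;

/* Unsigned remaining length: wraps around once position > length. */
static size_t remaining_unsigned(const toy_stream *s)
{
    return s->length - s->position;
}

/* The usual "can we read n more bytes?" check built on top of it. */
static int can_read_unsigned(const toy_stream *s, size_t n)
{
    return remaining_unsigned(s) >= n;
}
```

For a 10-byte message whose cursor already sits at offset 12, remaining_unsigned() returns SIZE_MAX - 1, so can_read_unsigned() happily approves a read of any realistic size.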

In this state, in which the security of the project depends upon all developers remembering to correctly place input validation checks in every new piece of code, the project is forever doomed to be vulnerable. Given this vulnerable software design, and given the limited attention of maintainers who focus on one tree at a time and fail to see the wider forest, how can we systematically solve this problem?

In my opinion, the only way to solve this situation is through the intervention of an external party. In most cases, not just any external party, but external security researchers that have the needed security awareness to pinpoint the root cause in the software design. The underlying assumption in this viewpoint is that security researchers are willing to work together with the project maintainers in order to significantly improve the security of the project. This is a very strong assumption that we will tackle in Part 2 of this series.

Fixing FreeRDP

As presented earlier, the problem we will be targeting is the attacker’s ability to read outside of the message bounds, with an emphasis on leveraging this ability as an info-leak attack primitive. One of the reasons for this emphasis is that without such a primitive, the attacker’s ability to bypass ASLR, or any other security mitigation in the heap, is limited. This way, we can methodically build layer upon layer of defense, each of which targets a single, well-defined, simple problem.

In order to systematically address this issue, we must tackle the two issues in FreeRDP’s software design. This could be done either by fixing them directly or by reducing the attacker’s incentive to exploit them.

Fixing the Input Validation Function

To the innocent bystander, this is the simplest fix in the world: We simply replace the unsigned size_t type with the signed ssize_t one. And voilà: we added a single character (and fixed some possible compilation issues) and the issue is now gone.

Figure 7: Updating the function prototype of Stream_GetRemainingLength() to have a return value of ssize_t.

This change will ensure that even if we missed some input validation check, any future check in the message path will catch the attempt to read out of bounds, and will discard the message.
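A minimal sketch of the signed variant, again using a hypothetical index-based toy_stream rather than FreeRDP's real structure, and assuming a POSIX platform that provides ssize_t:

```c
#include <stddef.h>
#include <sys/types.h> /* ssize_t (POSIX) */

/* Hypothetical stream: 'position' counts the bytes consumed so far. */
typedef struct {
    size_t length;   /* bytes in the message */
    size_t position; /* bytes already read (may exceed length after a bug) */
} toy_stream;

/* Signed remaining length: goes negative once we have read out of bounds. */
static ssize_t remaining_signed(const toy_stream *s)
{
    return (ssize_t)s->length - (ssize_t)s->position;
}

/* The comparison is done in the signed domain on purpose. Comparing the
 * result directly against a size_t would convert a negative value back
 * into a huge unsigned one and undo the benefit of the change. */
static int can_read_signed(const toy_stream *s, size_t n)
{
    return remaining_signed(s) >= (ssize_t)n;
}
```

For the same 10-byte message with the cursor at offset 12, remaining_signed() now returns -2 and every later check fails, so the message gets discarded. The comment on the comparison also hints at why callers may need adjusting once the return type changes.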

Important Note: While this might seem like a small change (single character), from the perspective of the development world this is actually a very big deal. As seen in the Apache Guacamole case, this function is part of the API that FreeRDP provides for implementing extensions. As such, this is an API change.

This change might seem negligible, and most probably it won’t break anyone’s code. However, it is a breaking API change. As such, it should be handled accordingly to avoid harming the user experience of our customers. One possible harm occurs when extensions that directly used this type are recompiled, suddenly forcing their developers to adjust their code as well.

Centralized Sanity Check

One of the core issues in the current software design is that we need to place input validation checks throughout our code, before we access any input field. It is clearly evident that this is not a robust approach, and there will be code paths in which these checks will be forgotten.

“It is indisputable that the defense line will eventually be breached”
Carl Von Clausewitz

As such, instead of eliminating the problem, we can try to detect it in a way that will punish attackers trying to leverage it. While every incoming message may pass through countless code paths, depending on its content, it will always end up in the same place – being returned to the message pool upon deallocation. If we add a sanity check to the deallocation function to catch possible out-of-bounds access violations, then we can handle these anomalies once detected.

Figure 8: Updating the StreamPool_Return() function that returns the message back to the message pool, so that it will verify the message’s validity.

The validation function itself needs only to check the very basic assumptions about the message’s internal structure:

The reading/writing head (s->pointer) must always be at an address of at least the message’s start.
The number of bytes read must not be larger than the message’s length.
The number of bytes written must not be larger than the allocated message capacity.

Figure 9: The added sanity check function that ensures the message’s validity.

Once any of the above conditions is found to be violated, the STREAM_ASSERT() macro will log this incident and terminate the process right away.

Figure 10: Logging the event and closing the process in case a condition is not met.
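The three conditions and the terminate-on-violation behavior can be sketched as follows. This is an illustration under assumptions, not the code from the figures: toy_stream, stream_is_sane() and toy_pool_return() are hypothetical names, and a single head position is checked against both the length and the capacity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stream layout, loosely modelled on the fields named above. */
typedef struct {
    uint8_t *buffer;  /* start of the message */
    uint8_t *pointer; /* current reading/writing head */
    size_t length;    /* length of the message */
    size_t capacity;  /* size of the allocated buffer */
} toy_stream;

/* Returns 1 when all three basic assumptions hold, 0 otherwise. */
static int stream_is_sane(const toy_stream *s)
{
    if (s->pointer < s->buffer)
        return 0; /* head moved before the message start */
    size_t used = (size_t)(s->pointer - s->buffer);
    if (used > s->length)
        return 0; /* read past the end of the message */
    if (used > s->capacity)
        return 0; /* wrote past the allocated capacity */
    return 1;
}

/* Every message ends up here, so this is the single choke point. */
static void toy_pool_return(toy_stream *s)
{
    if (!stream_is_sane(s)) {
        fprintf(stderr, "stream sanity check failed, terminating\n");
        abort(); /* log and terminate, as the assert macro is described to do */
    }
    s->pointer = s->buffer; /* reset the head before reuse (pool logic omitted) */
}
```

The design choice worth noting is that the check lives in one function that every message passes through, so it does not rely on each parser remembering to validate its own reads.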

In essence, this change blocks the two-phase approach (info-leak and then memory corruption), as the program will detect the info-leak and close the process right away, well before the second stage of the attack can occur. From an attacker’s perspective, there are two possible implications:

Info-leak for bypassing ASLR or any other security mitigation: In such cases, closing the process renders the leaked information useless. This is obviously under the assumption that the address range is randomly generated every time a new process is being spawned.

Info-leak of a system-wide secret: If applicable, an attacker would still be able to leverage the vulnerability to leak this asset. In cases where this asset is useful well beyond the runtime of the current process, the attacker still learned something of value. Still, such an attacker would weigh the benefits of such an attack against the risk of generating a log event and terminating the entire process. After all, an attacker will always try to avoid exposure, and these are two very noisy operations.

Side Note: Sadly, the first assumption doesn’t hold in the Apache Guacamole case. Apache Guacamole made the interesting design choice of using only fork() for every new process, without also calling execve(). As a result, every child inherits the memory layout of its parent, so an address leaked from one process is still valid in the next one. This problematic choice played a key role in the attack we demonstrated.
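The effect of fork() without execve() can be observed with a small POSIX experiment. The sketch below has nothing to do with Guacamole's code: layout_marker and fork_preserves_layout() are hypothetical names, and the child simply reports the address at which it sees a global object.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int layout_marker; /* any object whose address reveals the layout */

/* Returns 1 if a fork()ed child sees layout_marker at the same address
 * as its parent, 0 if not, and -1 if a system call failed. */
static int fork_preserves_layout(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: no execve(), so the address space is a copy of the parent's. */
        uintptr_t addr = (uintptr_t)&layout_marker;
        close(fds[0]);
        ssize_t w = write(fds[1], &addr, sizeof addr);
        _exit(w == (ssize_t)sizeof addr ? 0 : 1);
    }

    /* Parent: read the address the child observed and compare it. */
    uintptr_t child_addr = 0;
    close(fds[1]);
    ssize_t r = read(fds[0], &child_addr, sizeof child_addr);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (r != (ssize_t)sizeof child_addr)
        return -1;
    return child_addr == (uintptr_t)&layout_marker;
}
```

Since the child is a copy of the parent, the addresses match. Only a fresh execve() would give the new process a newly randomized layout, which is what the first implication above relies on.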

Implications of our fix

While the code fixes we described were quite small, their quality and the locations in which they were placed led to a significant boost in the project’s overall security posture. Had these systematic fixes existed in the project way back in 2018, they would have blocked, or downgraded to DoS, the following vulnerabilities:

FreeRDP (23): CVE 2018-8789, CVE 2020-4032, CVE 2020-11018, CVE 2020-11040, CVE 2020-11042, CVE 2020-11043, CVE 2020-11045, CVE 2020-11046, CVE 2020-11047, CVE 2020-11048, CVE 2020-11049, CVE 2020-11058, CVE 2020-11085, CVE 2020-11086, CVE 2020-11087, CVE 2020-11088, CVE 2020-11089, CVE 2020-11099, CVE 2020-11526, CVE 2020-13396, CVE 2022-39316, CVE 2022-39319, CVE 2023-39350
Apache Guacamole (3*): CVE 2020-9497

23 out of the 81 vulnerabilities reported in FreeRDP, as part of our research or afterwards, would have been blocked or were already blocked by this fix. This means a staggering 28% of the vulnerabilities in the project. And the fix only took 2 days of work.

In addition, we should remember that our goal was only to address info-leak vulnerabilities (45) and not all of the vulnerabilities (81). Counting 23 out of 45, we get that slightly more than half (51%) of the info-leak vulnerabilities would have been blocked.

Additional Notes:

Apache Guacamole only allocated 1 CVE ID to the 3 info-leak vulnerabilities that we reported in their product.
The report of the researchers that reported CVE 2020-11526 also correctly identifies the software design issue in the prototype of the input validation function.
The description of the CVE 2023-39350 vulnerability directly mentions our fix in FreeRDP as reducing the vulnerability from info-leak to a mere DoS.

Timeline

At the beginning of the blog post I mentioned that the fixing process took just over two years. Yet, it is important to understand how we reached this timeline. As one might see from my initial pull request, which described the problem, the leading concepts for addressing the issues, my suggestion and the discussion with the maintainers, the entire process took less than two days. OK, so what took them so long to release a version with this fix?

Here enters the timeline of the software development world. First and foremost, a fix is introduced to the main development branch of the project. From there, if needed, it is ported/cherry-picked to the active release branches. Even if we had ported this fix to an actively supported release branch, we would have needed to wait for the release of a new version on said branch for it to include our fix.

When addressing a security vulnerability, we expect a fix within a time window of 90 days, effectively forcing the project to ensure there will be a release in this time span for all affected versions. However, in this case this isn’t really a security vulnerability, therefore it is only natural that our fix will just be added to the list of changes waiting to be released in the next release cycle.

And here is where things get complicated. The nature of our fixes meant that they introduced an API change. If included in a future version release, it might lead to an API break for existing customers (binary compatibility) or for customers that are rebuilding their solution on top of our project (source compatibility). So when are we allowed to introduce such a change?

The software development world pretty much aligned itself to the following scheme in which project versions are marked as X.Y.Z:

X – Major version.
Y – Minor version.
Z – Patch version.

This method, called Semantic Versioning, also defines the scope of changes that is allowed in each version change:

Major version update – We can break the API.
Minor version update – We can add new features, while keeping the API intact.
Patch version update – Only small code fixes, without new features.

This means that if our fix broke the API, we must wait for the next major version of the project. In our case, the current version was 2.Y.Z, hence we should wait for 3.0.0.
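The rule can be written down as a tiny decision function. This is a sketch of the scheme as described above, with hypothetical type and function names, and not a full Semantic Versioning implementation (pre-release tags and build metadata are ignored).

```c
/* A version in the X.Y.Z form described above. */
typedef struct {
    unsigned major_no; /* X */
    unsigned minor_no; /* Y */
    unsigned patch_no; /* Z */
} semver;

/* The kind of change a release is going to contain. */
typedef enum {
    CHANGE_FIX,      /* small code fix, no new features */
    CHANGE_FEATURE,  /* new feature, API kept intact */
    CHANGE_API_BREAK /* existing API is changed or removed */
} change_kind;

/* Returns the earliest version that is allowed to carry the change. */
static semver next_version(semver cur, change_kind kind)
{
    switch (kind) {
    case CHANGE_API_BREAK:
        return (semver){ cur.major_no + 1, 0, 0 };
    case CHANGE_FEATURE:
        return (semver){ cur.major_no, cur.minor_no + 1, 0 };
    default:
        return (semver){ cur.major_no, cur.minor_no, cur.patch_no + 1 };
    }
}
```

Feeding any 2.Y.Z version together with CHANGE_API_BREAK into next_version() yields 3.0.0, which is exactly the release our fix had to wait for.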

For most mature software projects, bumping the major version is not to be taken lightly. By definition, it tells our customers that we might have broken their API, and we could scare them off. In addition, we want to guarantee our customers a given time span (2-3 years) of active support for said version. As such, if we release a new major version every 6 months, we will find ourselves struggling to support numerous versions in parallel, with all implied costs.

One classic example is the time it took OpenSSL to transition from version 1.X (March 2010) to version 3.X (September 2021). Yes, they skipped major version 2, as that number had already been used by the OpenSSL FIPS module. This shows that for such a widely used project, one that is deployed in an almost endless number of environments, the time span required for bumping the major version was more than a decade.

Going back to FreeRDP, maybe a time span of two years is not that bad. Especially when these were 2 out of the 5 years that passed from 2.0.0 to 3.0.0.

What does it all mean? Well, there are some benefits to isolated code patches, as they will reach all actively supported versions in a matter of a few months. Still, it also means that larger code changes require meticulous planning. If possible, one should avoid breaking the project’s API, so that the change can be released faster (several orders of magnitude faster, actually).

And yet, even a period of two years eventually passes, hence it is important to work with the project maintainers so that one of these days even our systematic fix to the project will be merged, instead of a large set of isolated patches. If our change was able to address 28% of the project’s vulnerabilities, this is indeed something we should keep aiming for, even if it takes some time for it to be published.

And what do we do once the new major version is released? We start working on the next one. This is how software development works: we are here for the long run.

In the next Episodes

Given the benefits of a tailor-made security hardening such as the one we designed for FreeRDP, one might ask why such fixes are so scarce. And indeed, in Part 2 we will discuss the wider reflections about the interactions between the InfoSec community and the Development community.

Stay tuned.