In late January 2022, several reports on social media indicated that a new Microsoft Windows privilege escalation vulnerability (CVE-2022-21882) was being exploited in the wild. These reports prompted us to do an analysis of CVE-2022-21882, which turned out to be a vulnerability in the Win32k.sys user-mode callback function xxxClientAllocWindowClassExtraBytes.
In 2021, a very similar vulnerability (CVE-2021-1732) was reported to – and patched by – Microsoft. We decided to take a closer look at both vulnerabilities to better understand the code involved in each. In our initial analysis we wanted to determine why the patch for CVE-2021-1732 was not sufficient to prevent CVE-2022-21882.
This is part one of a series that will cover Win32k internals and exploitation in general using these two vulnerabilities and their related proof-of-concept (PoC) exploits as examples.
Here, we cover a significant amount of background information, including years’ worth of research conducted by several excellent researchers, to bring the reader up to date on the latest implementations of Win32k and its associated exploitation methodologies. For an even deeper understanding of the subject, we also recommend reading all of the linked references at the end of the post.
Both vulnerabilities discussed in this series are detected and blocked by the Cortex XDR Anti-LPE protection module. The exploits for both vulnerabilities are data-only: they copy the privilege token of the NT AUTHORITY\SYSTEM account over the token of the current (exploit) process to escalate privileges. The XDR Anti-LPE modules monitor for this specific type of privilege escalation technique.
Related Unit 42 Topics: Microsoft Windows, CVE-2021-1732, CVE-2022-21882
Introduction to CVE-2021-1732 and CVE-2022-21882
Win32k – History and Background
Basic Windows GUI API Background
Creating a Window
Window Messages and Window Procedures
Window Structures
User-mode Callbacks
Conclusion
Additional Resources
Quite a lot has been written on Windows development via the Win32 API, and Windows internals. However, in our experience, very few security-related sources cover development via the graphical user interface (GUI) or its underlying internals. This interface is implemented within Win32k.sys, Win32kbase.sys and Win32kfull.sys.
Therefore, we decided to do some research to get better acquainted with the Windows GUI internals and associated APIs. We read a few whitepapers on Win32k exploitation written over the past 10 years or so, as well as the Microsoft Developer Network (MSDN) documentation on the Win32 API.
We don’t want to assume any level of expertise with the underlying code involved. Therefore, to ease the understanding of the two vulnerabilities being analyzed as well as other Win32k.sys vulnerabilities and exploits, in our first post we will cover some background on the relevant APIs, objects and data structures involved.
Because exploitation of these vulnerabilities and the patch bypass method was relatively easy to understand, we chose these two recent examples to walk through some of the Win32k internals to help people understand how they are commonly leveraged to obtain read/write primitives. It also provides us with a good opportunity to discuss common Win32k exploit targets (user-mode callbacks) within the Win32k.sys codebase.
Prior to Windows NT 4.0, Microsoft implemented the GUI functionalities of the Win32 API within a user-mode process called the Client-Server Runtime SubSystem (CSRSS.exe). However, context switches between user-mode and kernel-mode were computationally expensive and required large memory overhead.
To eliminate these issues and speed up the overall Windows operating system, Microsoft decided to move the Windows subsystem (Window Manager, GDI and graphics drivers) to the kernel. This transition started with Windows NT 4.0 in 1996.
This change was implemented through a kernel-mode driver called Win32k.sys, in what is now known as the kernel-mode Windows subsystem. The user-mode component of the Windows subsystem still resides within CSRSS.
Although the move to the kernel greatly reduced the overhead required, Microsoft had to resort to some old tricks, such as caching management data structures within the user-mode portion of the client’s address space. In fact, to further avoid context switches, some management structures have historically been stored exclusively in user-mode. However, in an effort to eliminate kernel address leaks, Microsoft has started to implement methods that use user-mode and kernel-mode copies of these structures to prevent kernel addresses from being stored in user-mode structures.
Additionally, because Win32k needed a way to access these user-mode structures and support some existing user-mode functionality such as window hooking, user-mode callbacks were implemented to facilitate these tasks.
“User-mode callbacks allow Win32k to make calls back into user-mode and perform tasks such as invoking application-defined hooks, providing event notifications, and copying data to/from user-mode,” Tarjei Mandt wrote in a detailed whitepaper. His research was also presented at Black Hat USA in 2011. In doing so, he demonstrated the challenges Microsoft faced in implementing user-mode callbacks and preserving data integrity.
Mandt demonstrated that many objects were not being properly locked before making user-mode callbacks, which allowed user-mode code to destroy these objects during the user-mode callback, resulting in Use-After-Free (UAF) vulnerabilities. Although Microsoft has addressed many of the issues Mandt pointed out in 2011, user-mode callbacks are still abused today.
In 2019, Gil Dabah published a paper that built upon Mandt’s research. He discovered that even if user-mode code destroys objects that are correctly locked during user-mode callbacks, the destroyed objects can have secondary effects on other objects that are not locked correctly. These effects resulted in secondary object destruction and further UAF vulnerabilities.
Before we discuss Win32k internals, we will briefly cover a simple C program that creates and destroys a window using the Win32 API. This will allow us to begin to understand how graphics windows are programmatically created and manipulated. It will also allow us to examine the underlying structures that define each window and their menus.
We’ll be referring to the sample code in Figures 1-3 below to discuss the basics of window creation and the underlying structures used to define windows and menus. Comments have been added to the sample code to make it as understandable as possible.
As shown in Figure 1, the sample program starts off by defining a window class. A process must register a window class before it can create a window of the type defined within the WNDCLASSEX structure. First a window class object is declared WNDCLASSEX wcx = { }, then the window class structure is filled in.
The elements of the window class are as follows:
Now that the attributes of the window class have been defined, we need to register the class with the system using RegisterClassEx(), shown in Figure 2 below. On failure, RegisterClassEx() returns 0. Otherwise, it returns a class atom that uniquely identifies the class being registered. Registering the window class makes the class and its associated structure members known to Windows.
Once the window class has been registered, we can create an instance of it by calling CreateWindowExA(), shown in Figure 3 below.
The arguments of CreateWindowEx() are shown in Figure 4.
A brief description of each argument is listed below:
Once the call to CreateWindowEx() returns, the window exists internally (which is to say, memory has been allocated and its structures populated) but it is not yet shown. To display the window, we call the ShowWindow() function.
ShowWindow() takes the handle returned by CreateWindowEx() and the state variable nCmdShow, obtained from WinMain(). The nCmdShow variable determines how the window will be displayed on screen: normal, maximized or minimized, for example.
ShowWindow() only controls how the application window is displayed. This includes elements such as a title bar, a menu bar, the window menu, the minimize button, etc. The client area is the area where the application displays data, such as where you type text in a text editor. The client area is painted by calling the UpdateWindow() function.
If you specify the WS_VISIBLE window style in the dwStyle parameter of the CreateWindowEx() function, you do not need to call the ShowWindow() function. Showing the window is implied, and Windows will take care of it for you. On a similar note, if you do not specify the WS_VISIBLE style and you also do not call the ShowWindow() function, the window will remain hidden from view.
After the call to UpdateWindow(), the window is fully visible and ready for use. When writing a simpler console application for Windows, the application makes explicit function calls in response to user input from the console.
In a windowed application, a user can typically interact with the application by entering text, clicking through buttons and menus, or just by moving the mouse. Each of these actions has its own special functionality. To make this work, Microsoft implemented an event-driven system that relays messages from user input (e.g., keyboard, mouse or touch) to the various windows in each application. These messages are handled by a function within each window, known as the window procedure.
Windows maintains a message queue for each thread. It translates any user input event that affects the state of a window into a message and places that message into the queue. The application processes these messages by executing code similar to that in Figure 5 below.
The GetMessage() function retrieves the next message from the message queue. The MSG parameter is a structure that contains the message information required for the assigned window procedure to properly handle the message.
Among the members of the MSG structure are a handle (hwnd) to the window whose window procedure receives the message, and a message that contains an identifier that determines what request is being asked of the window procedure. For example, if the message contains a WM_PAINT message, it tells the window procedure that the window’s client area has changed and must be repainted.
The TranslateMessage() function translates virtual-key messages into character messages, but this is not important for the current discussion. The DispatchMessage() function sends the message to the window identified by the window handle in the msg structure, to be handled by the window procedure defined by that window class.
Up to this point, the example code has accomplished the overhead of defining the window class by performing the following actions:
It’s the window procedure that determines what is displayed and how it responds to user input. Windows provides a default window procedure to handle any window messages that an application does not process, and it provides the minimal functionality for any window to function properly.
The window procedure is where all the functionality of the window is defined and, as one might guess, it can be quite involved. For our purposes we are currently only interested in Microsoft’s default window procedure, DefWindowProc().
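To make the message flow described above more concrete, here is a minimal, portable C sketch of a message queue, a dispatch loop and a window procedure that falls back to a default handler. It deliberately does not use the Win32 API. The names post_message, get_message, run_message_loop, my_proc and default_proc, as well as the MSG_* constants, are our own stand-ins for PostMessage(), GetMessage(), the message loop, an application window procedure, DefWindowProc() and the WM_* constants. Unlike the real GetMessage(), this model returns instead of blocking when the queue is empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message identifiers; real code would use WM_* constants. */
enum { MSG_PAINT = 1, MSG_KEYDOWN = 2, MSG_QUIT = 3 };

#define QUEUE_CAPACITY 16

typedef struct Window Window;
typedef int (*WindowProc)(Window *w, int msg, int param);

struct Window {
    WindowProc proc;      /* plays the role of WNDCLASSEX.lpfnWndProc */
    int paint_count;      /* messages handled by the application */
    int default_count;    /* messages left to the default procedure */
};

typedef struct {
    Window *target;       /* plays the role of MSG.hwnd */
    int msg;              /* plays the role of MSG.message */
    int param;
} Message;

typedef struct {
    Message items[QUEUE_CAPACITY];
    size_t head, tail;
} MessageQueue;

/* Append a message; returns 0 if the queue is full. */
int post_message(MessageQueue *q, Window *w, int msg, int param) {
    if (q->tail == QUEUE_CAPACITY) return 0;
    q->items[q->tail++] = (Message){ w, msg, param };
    return 1;
}

/* Fetch the next message; returns 0 on MSG_QUIT or when the queue is empty. */
int get_message(MessageQueue *q, Message *out) {
    if (q->head == q->tail) return 0;
    *out = q->items[q->head++];
    return out->msg != MSG_QUIT;
}

/* Stand-in for DefWindowProc(): handles whatever the application ignores. */
int default_proc(Window *w, int msg, int param) {
    (void)msg; (void)param;
    w->default_count++;
    return 0;
}

/* Application window procedure: handles MSG_PAINT, defers everything else. */
int my_proc(Window *w, int msg, int param) {
    switch (msg) {
    case MSG_PAINT:
        w->paint_count++;
        return 0;
    default:
        return default_proc(w, msg, param);
    }
}

/* The message loop: fetch, then dispatch to the target window's procedure. */
int run_message_loop(MessageQueue *q) {
    Message m;
    int dispatched = 0;
    while (get_message(q, &m)) {
        assert(m.target != NULL && m.target->proc != NULL);
        m.target->proc(m.target, m.msg, m.param);
        dispatched++;
    }
    return dispatched;
}
```

Posting a paint message, a key message and a quit message to a window that uses my_proc results in two dispatched messages: one counted by the application handler and one by the default handler. Anything queued after the quit message is never dispatched.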
As mentioned earlier, Windows now manages GUI objects such as menus, windows, etc., in the kernel via Win32k.sys. When a window object is created, its properties are tracked within a data structure known as tagWND.
Unfortunately, Microsoft removed many of the Win32k debugging symbols, making it much more difficult to gain transparency into these structures. Based on some reverse engineering, Figure 6 shows what the structure looks like in Windows 10 version 21H1.
This is not a comprehensive list of members, but only those important to this discussion. Looking at HMAllocObject during a call to xxxCreateWindowEx, where the allocation of the structure occurs, we can confirm this structure is 0x150 (336) bytes in size.
The WinDbg output just prior to the call to HMAllocObject is shown in Figure 7. You can see the fourth argument, which represents the allocation size, is stored in the r9 register and is equal to 0x150.
The tagWND structure, shown in Figure 6, used to be referenced within the Win32ClientInfo entry of the Thread Environment Block (TEB). It has since been removed to prevent easy leaking of kernel-mode addresses.
The first entry in the kernel tagWND structure is the window handle. Each window will have a representative tagWND structure associated with it in the kernel.
During the analysis of CVE-2022-21882, this structure will be important, but for now, we’ll focus on offset 0x28. I’ve labeled it as *pWND because Microsoft no longer supplies symbols. Additionally, Microsoft no longer provides a name for this structure — however, in the past it has been referred to as state or WW. According to Microsoft, these names are deprecated and no longer used internally.
What is important to know about this pointer is that it’s a user-mode version of the tagWND data that does not include kernel addresses, and it is structured differently than its parent tagWND structure. This child structure exists both in the kernel as well as in user-mode. This is how Windows manages the data in an attempt to avoid leaking kernel addresses, because any user-mode application will work with the copy of the tagWND structure located on the user-mode desktop heap and hence will not be able to see any kernel-mode addresses.
I will continue to refer to the child structure as a tagWND structure. Be aware, it is structured differently (as shown in Figure 8, below) than the parent tagWND structure above, but is still commonly referred to as tagWND in other research blogs and papers.
The child tagWND structure is shown in Figure 8, and the elements and their offsets were confirmed through reverse engineering. The gaps were not analyzed and are not important for this discussion.
Many of the elements of the WNDCLASSEX structure discussed in the section about creating a window can be seen within the tagWND structure. Therefore, it’s pretty clear that when a window is created, the properties assigned via the WNDCLASSEX structure are passed to the kernel and stored within the tagWND structures. The properties are then propagated to the user copies in both the kernel and user-mode desktop heaps.
Figures 9 and 10 show both the parent tagWND and the kernel copy of the user-mode safe tagWND structures respectively.
The Figure above is the parent tagWND, and you can see that the handle (offset 0x00) is the same as that of the copy tagWND below. You can also see that the parent structure has kernel addresses, while the user-mode safe copy only has user-mode addresses. Lastly, notice the parent tagWND+0x28 is a pointer to the child tagWND copy’s address.
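The relationship between the two structures can be summarized with a small layout sketch. It encodes only the facts stated in this post: the parent structure is 0x150 bytes, the handle is at offset 0x00 in both structures, and the parent holds a pointer to the child copy at offset 0x28. The type names ParentWnd and UserSafeWnd, the style member and the not_modeled_* padding arrays are hypothetical placeholders, not Microsoft’s definitions, and everything else in the real structures is intentionally left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-mode-safe copy: carries the handle, no kernel pointers. */
typedef struct {
    uint64_t handle;   /* 0x00: same handle value as the parent */
    uint32_t style;    /* placeholder for properties copied from the class */
} UserSafeWnd;

/* Hypothetical parent (kernel) structure; unknown regions are padding. */
typedef struct {
    uint64_t      handle;                 /* 0x00: window handle */
    unsigned char not_modeled_08[0x20];   /* 0x08..0x27: not modeled */
    UserSafeWnd  *pwnd;                   /* 0x28: pointer to the child copy */
    unsigned char not_modeled_tail[0x150 - 0x28 - sizeof(UserSafeWnd *)];
} ParentWnd;

_Static_assert(offsetof(ParentWnd, pwnd) == 0x28,
               "child pointer is expected at offset 0x28");
_Static_assert(sizeof(ParentWnd) == 0x150,
               "parent is expected to be 0x150 (336) bytes");

/* Populate the parent and propagate the shared properties to the child. */
void parent_wnd_init(ParentWnd *parent, UserSafeWnd *child,
                     uint64_t handle, uint32_t style) {
    assert(parent != NULL && child != NULL);
    memset(parent, 0, sizeof *parent);
    parent->handle = handle;
    parent->pwnd = child;
    child->handle = handle;
    child->style = style;
}
```

After parent_wnd_init() runs, following the pointer at offset 0x28 of the parent leads to a copy whose handle matches the parent’s handle, which mirrors the relationship described above.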
Historically, there have been a few methods to leak the kernel mode addresses of window objects. All objects within Win32k that store properties set by user-mode code (e.g., windows, menus) are commonly referred to as user objects.
All user objects, the user-mode copy of the tagWND structure being one of many, are indexed within a per-session handle table commonly known as the UserHandleTable. (Though the tagWND structure wasn’t always a user-mode safe copy and did once contain kernel addresses.)
It used to be possible to locate tagWND objects via the UserHandleTable through an exported structure within User32.dll, called gSharedInfo. This is no longer possible as of Windows 10 version 1703. Due to Microsoft’s continued efforts to eliminate kernel address leaks, they have removed the kernel addresses of objects within the desktop heap from the UserHandleTable.
The Window Manager validates handles with a non-exported function, HMValidateHandle, which is located in User32.dll. Prior to Windows 10 version 1803, the Window Manager returned the kernel-mode pointer to the object whose handle was to be validated, and it was commonly used to leak this address. Even though the kernel address leak has been fixed, this method will be important when we look at the two vulnerabilities later.
Exploit writers are so interested in locating tagWND structures because, historically, setting tagWND.cbExtra to a large value allowed an arbitrary write into an adjacent tagWND structure using the SetWindowLong function. However, as of Windows 10 version 1703, bytes written by SetWindowLong are no longer written to the kernel, except under a specific condition that will be discussed later during the analysis of the exploit. This fix has effectively killed this technique for creating an arbitrary write.
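The reason a corrupted cbExtra value matters can be illustrated with a toy model. The sketch below is not Win32k code and touches no real window memory: ToyHeap is simply two hypothetical window records stored back to back, and toy_set_long is our own stand-in for a SetWindowLong-style write whose only guard is a comparison against the record’s cb_extra field. All names and sizes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXTRA_SIZE 8u

/* Hypothetical window record: a length field followed by its extra bytes. */
typedef struct {
    uint32_t cb_extra;            /* number of extra bytes the window owns */
    uint8_t  extra[EXTRA_SIZE];   /* the extra bytes themselves */
} ToyWnd;

/* Toy heap: two records laid out back to back. */
typedef struct {
    ToyWnd wnd[2];
} ToyHeap;

void toy_heap_init(ToyHeap *heap) {
    memset(heap, 0, sizeof *heap);
    heap->wnd[0].cb_extra = EXTRA_SIZE;
    heap->wnd[1].cb_extra = EXTRA_SIZE;
}

/* Write a 32-bit value at `index` within a window's extra bytes.
 * Returns 1 if the write happened, 0 if it was rejected. */
int toy_set_long(ToyHeap *heap, size_t which, uint32_t index, uint32_t value) {
    assert(heap != NULL && which < 2);

    /* The only guard in this model: the write must fit within cb_extra. */
    if ((uint64_t)index + sizeof value > heap->wnd[which].cb_extra) return 0;

    uint64_t offset = (uint64_t)which * sizeof(ToyWnd)
                    + offsetof(ToyWnd, extra) + index;

    /* Model-only safety net so the demo never leaves the toy heap. */
    if (offset + sizeof value > sizeof *heap) return 0;

    memcpy((uint8_t *)heap + (size_t)offset, &value, sizeof value);
    return 1;
}
```

With an intact cb_extra, a write at index 8 of the first record is rejected. Once cb_extra of the first record is set to a large value (standing in for the effect of a bug), the same call succeeds and lands on the cb_extra field of the second record, which is the essence of the historical technique described above.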
User objects are stored in one of three types of memory within the kernel: the desktop heap, the shared heap or the session pool. For the purposes of this discussion, we are interested in the desktop heap because the objects we’ll be working with are stored here.
Each desktop gets its own desktop heap. Since Windows stores management structures in the desktop heap, which resides in the kernel, there needs to be a way for user-mode applications to access these structures.
Historically, Windows created a user-mode mapped copy of the desktop heap that contained kernel-mode pointers to relevant structures. Today, these pointers have been replaced with user-mode pointers, indicating that Windows has begun to create an isolated copy of the desktop heap in user-mode to eliminate the disclosure of kernel-mode pointer addresses.
Typically when you are trying to exploit a kernel vulnerability, you need a few things to enable exploitation from user-mode. One of these is a way to determine where objects of interest are located within the kernel in order to get around kernel address space layout randomization (KASLR). Therefore, knowing where the desktop heap is located, as well as the ability to find specific objects of interest within the desktop heap is highly desirable.
Since Windows 10 version 1607, Microsoft has been adding mitigations in an attempt to prevent exploit writers from locating the desktop heap and its associated objects in the kernel. These mitigations include removal of kernel addresses from the UserHandleTable, as discussed above, as well as removal of kernel pointer references to the desktop heap within the Win32ClientInfo structure located in each thread’s Thread Environment Block (TEB). Additionally, HMValidateHandle now returns user-mode (versus kernel-mode) pointers for any object handles passed to it.
For more information on the history of Microsoft’s Win32k kernel mitigations, see Morten Schenk’s 2017 Black Hat USA presentation.
We should note that there is an assumption that we are operating under a low-integrity process. If the process is medium-integrity or higher, it is trivial to use API functions such as EnumDeviceDrivers and NtQuerySystemInformation to obtain kernel pointers of interest.
The final topic to cover in this first post on the background of these PoCs is user-mode callbacks.
Because the Windows subsystem is primarily located within the Windows kernel, while the windows themselves operate in user-mode, Win32k must make frequent calls from the kernel into user-mode. User-mode callbacks provide a mechanism to implement items such as application-defined hooks, event notifications and copying data to/from the kernel from/to user-mode.
Win32k calls KeUserModeCallback, with the associated ApiNumber of the user-mode function it wants to call, when making a user-mode callback. The ApiNumber is an index into a function table, located within User32.dll (USER32!apfnDispatch). The address of this table is copied to the process environment block (PEB) (PEB.KernelCallbackTable) during initialization of User32.dll within each process.
During the forthcoming analysis of the exploits, we will show how user-mode callbacks are hooked via the KernelCallbackTable and show what the table looks like in WinDbg. The function prototype of KeUserModeCallback and its associated parameters are shown in Figure 11 below.
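As a rough illustration of the table lookup, the following portable C sketch models a callback table indexed by an ApiNumber, with an input buffer passed in and an output buffer filled by the callee. It is a model only: CallbackTable, model_user_mode_callback, cb_echo and cb_hooked are names we made up, the table has four entries purely for brevity, and none of this reflects the real contents of USER32!apfnDispatch. Replacing a table entry, as the last function pointer assignment in the usage note does, is the general idea behind hooking a callback through the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 4

/* Hypothetical callback signature: input in, output and its length out. */
typedef int (*CallbackFn)(const void *input, size_t input_len,
                          void *output, size_t *output_len);

/* Stand-in for the function table that the ApiNumber indexes into. */
typedef struct {
    CallbackFn entries[TABLE_SIZE];
} CallbackTable;

/* Original handler: copies the input buffer to the output buffer. */
int cb_echo(const void *input, size_t input_len,
            void *output, size_t *output_len) {
    if (input_len > *output_len) return -1;
    memcpy(output, input, input_len);
    *output_len = input_len;
    return 0;
}

/* Replacement handler: ignores the input and returns a fixed marker. */
int cb_hooked(const void *input, size_t input_len,
              void *output, size_t *output_len) {
    static const char marker[] = "hooked";
    (void)input; (void)input_len;
    if (sizeof marker > *output_len) return -1;
    memcpy(output, marker, sizeof marker);
    *output_len = sizeof marker;
    return 0;
}

/* Look up the handler for api_number and invoke it with the buffers.
 * Returns -1 for an out-of-range or empty slot. */
int model_user_mode_callback(const CallbackTable *table, unsigned api_number,
                             const void *input, size_t input_len,
                             void *output, size_t *output_len) {
    assert(table != NULL && output_len != NULL);
    if (api_number >= TABLE_SIZE || table->entries[api_number] == NULL) {
        return -1;
    }
    return table->entries[api_number](input, input_len, output, output_len);
}
```

With cb_echo installed at index 1, a call with ApiNumber 1 returns the input unchanged. After assigning cb_hooked to entries[1], the same call returns the marker instead, without the caller having changed anything.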
The user-mode callback input parameters are passed via the InputBuffer, while the output from the callback function is returned within the OutputBuffer. Upon invoking a system call, nt!KiSystemService or nt!KiFastCallEntry stores a trap frame (TRAP_FRAME) on the kernel thread stack to save the current thread context and enable the restoration of registers upon returning to user-mode.
To make the transition back to user-mode in a user-mode callback, KeUserModeCallback first copies the InputBuffer to the user-mode stack using the trap frame information held by the thread object. It then creates a new trap frame with EIP set to ntdll!KiUserCallbackDispatcher, replaces the thread object's TrapFrame pointer, and finally calls nt!KiServiceExit to return execution to the user-mode callback dispatcher.
Once the user-mode callback has completed, NtCallbackReturn is called to resume execution in the kernel. This function copies the result of the callback back to the original kernel stack and restores the saved trap frame (PreviousTrapFrame) and kernel stack stored in the KERNEL_STACK_CONTROL structure. Before jumping to the location where it previously left off (in nt!KiCallUserMode), the kernel callback stack is deleted.
The Window Manager uses the Executive Resource (ERESOURCE) synchronization primitive, as opposed to exclusive locks, when operating on Win32k management structures. The ERESOURCE primitive allows multiple threads to access a shared resource in the cases where each thread is only attempting to read the resource in question. The ERESOURCE primitive is also known as a single writer, multiple readers primitive. Once the ERESOURCE is initialized, threads can acquire an exclusive lock (for writes) using ExAcquireResourceExclusiveLite, or a shared lock (for reads) by calling ExAcquireResourceSharedLite. The thread then releases the resource by calling ExReleaseResourceLite. There is a requirement for normal kernel APCs to be disabled to use the acquire APIs discussed here, and this is done by calling KeEnterCriticalRegion, prior to the acquire call, and KeLeaveCriticalRegion after the release call.
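The single writer, multiple readers rule can be sketched as a tiny state machine. This is a single-threaded model, not the ERESOURCE implementation: ToyResource and the toy_* functions are hypothetical names, and each acquire function behaves like a non-waiting attempt that reports failure rather than blocking. Details of the real primitive, such as ownership tracking, recursive acquisition and waiter queues, are omitted.

```c
#include <assert.h>

/* Hypothetical resource state: either many readers or one writer. */
typedef struct {
    int shared_holders;   /* number of readers currently holding it */
    int exclusive_held;   /* 1 while a writer holds it */
} ToyResource;

/* Shared (read) acquisition succeeds unless a writer holds the resource. */
int toy_acquire_shared(ToyResource *r) {
    if (r->exclusive_held) return 0;
    r->shared_holders++;
    return 1;
}

/* Exclusive (write) acquisition succeeds only when nobody holds it. */
int toy_acquire_exclusive(ToyResource *r) {
    if (r->exclusive_held || r->shared_holders > 0) return 0;
    r->exclusive_held = 1;
    return 1;
}

/* Release one hold: the writer's if present, otherwise one reader's. */
void toy_release(ToyResource *r) {
    if (r->exclusive_held) {
        r->exclusive_held = 0;
        return;
    }
    assert(r->shared_holders > 0);
    r->shared_holders--;
}
```

Two shared acquisitions succeed side by side, an exclusive attempt fails while either is outstanding, and once both are released the exclusive attempt succeeds and in turn blocks new readers.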
If Win32k didn’t release the resource upon calling a user-mode callback, and that user-mode callback allowed an application to freeze the GUI subsystem, Win32k would not be able to perform other tasks while the GUI subsystem was frozen. Therefore, Win32k always releases the resource upon calling a user-mode callback. The code in Figure 14 below demonstrates how this occurs.
This practice creates a dilemma. Because user-mode code is free to do things like modify the properties of objects and reallocate arrays, upon returning from a user-mode callback, Win32k must ensure that referenced objects are still in an expected state. Operating on such objects without performing the proper checks or object locking can and does create security vulnerabilities.
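The pattern of releasing the resource, calling out, and then re-checking state can be sketched as follows. This is a simplified model under our own assumptions: ToySession, validate_handle, destroy_window and xxx_set_extra are hypothetical names, handles are plain array indexes, and the lock is a flag rather than a real synchronization primitive. It shows revalidation by handle only; it does not model the object locking that the text also mentions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 4

/* Hypothetical window object tracked by a handle (its array index). */
typedef struct {
    int in_use;        /* 0 once the window has been destroyed */
    int extra_value;   /* some property the kernel-side code updates */
} ToyWindow;

typedef struct {
    ToyWindow objects[MAX_OBJECTS];
    int lock_held;     /* 1 while the kernel-side code holds the resource */
} ToySession;

typedef void (*UserCallback)(ToySession *s, int handle);

/* Return the object for a handle, or NULL if it is invalid or destroyed. */
ToyWindow *validate_handle(ToySession *s, int handle) {
    if (handle < 0 || handle >= MAX_OBJECTS) return NULL;
    return s->objects[handle].in_use ? &s->objects[handle] : NULL;
}

void destroy_window(ToySession *s, int handle) {
    ToyWindow *w = validate_handle(s, handle);
    if (w != NULL) w->in_use = 0;
}

/* xxx-style routine: drops the lock, calls out, then revalidates.
 * Returns 0 on success, -1 if the window is gone after the callback. */
int xxx_set_extra(ToySession *s, int handle, int value, UserCallback cb) {
    assert(s != NULL && s->lock_held);
    if (validate_handle(s, handle) == NULL) return -1;

    s->lock_held = 0;              /* release before leaving for user-mode */
    if (cb != NULL) cb(s, handle); /* user-mode code may change anything */
    s->lock_held = 1;              /* reacquire on return */

    /* Do not trust any pointer obtained before the callback: look it up again. */
    ToyWindow *w = validate_handle(s, handle);
    if (w == NULL) return -1;
    w->extra_value = value;
    return 0;
}

/* Callbacks used to exercise both outcomes. */
void benign_callback(ToySession *s, int handle) { (void)s; (void)handle; }
void hostile_callback(ToySession *s, int handle) { destroy_window(s, handle); }
```

With benign_callback the update goes through. With hostile_callback the window is destroyed while the lock is released, the second lookup fails, and the routine returns an error instead of writing to a destroyed object. Skipping that second lookup is the kind of mistake that leads to the use-after-free conditions discussed in this post.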
In fact, Tarjei Mandt’s 2011 paper, "Kernel Attacks Through User-Mode Callbacks," identified multiple instances of these types of vulnerabilities, and was the basis for research on Win32k exploitation for many years. Microsoft subsequently reviewed the Windows user-mode callback functions to ensure proper verification or locking of objects, making it significantly more difficult to exploit this class of bugs.
In 2019, Gil Dabah demonstrated that, although Microsoft had effectively eliminated Win32k bugs through operations on objects modified through direct calls to user-mode callbacks, it was still possible to create kernel-mode to user-mode states that indirectly modified objects (e.g. destroying a parent window while performing operations on a child window) in order to leverage similar bugs. The vulnerabilities identified were much more complex, making them harder to identify and likely much smaller in number.
Microsoft deemed user-mode callbacks important enough to track that the functions involved were given special prefixes. User-mode callback function names are prefixed with either xxx or zzz. Those prefixed with xxx leave the critical region and call the user-mode callback, just as we described above. Those prefixed with zzz invoke asynchronous or deferred callbacks. We’ll only be concerned with xxx type callbacks for this discussion.
In our first installment of this series, we’ve covered using the Win32 API to create GUI objects such as windows and menus. We covered the user-mode and kernel-mode data structures that are used to manage these objects and how they have changed over the years to help optimize and secure the transition between user-mode and kernel-mode.
In our next post on Tuesday, June 20, we’ll walk through a PoC for CVE-2022-21882 and explain what the code is doing. Finally, we’ll discuss the vulnerability, how it’s used to elevate privileges, and why the patch for CVE-2021-1732 wasn’t sufficient to prevent CVE-2022-21882.
Both vulnerabilities discussed in this series are detected and blocked by the Cortex XDR Anti-LPE protection module. The XDR Anti-LPE modules monitor for techniques such as these data-only exploits, which copy the privilege token of the NT AUTHORITY\SYSTEM account over the token of the current process for privilege escalation.