Introduction
Process Injection is a popular technique used by Red Teams and threat actors for defense evasion, privilege escalation, and other interesting use cases. At the time of publishing, MITRE ATT&CK includes 12 (remote) process injection sub-techniques. Of course, there are numerous other examples as well as various derivatives.
Recently, I was researching remote process injection and looking for a few under-the-radar techniques that were either not well documented or had minimal core requirements for functionality. Although the classic recipe of VirtualAllocEx() -> WriteProcessMemory() -> CreateRemoteThread() is a stable option, EDR products scrutinize that combination too heavily for it to be used effectively in a minimalist fashion.
In this post, we’ll explore a couple of entry point process injection techniques that do not require explicit memory allocation or direct use of methods that create threads or manipulate thread contexts.
AddressOfEntryPoint Process Injection
Repeat after me: when in doubt, go to Red Team Notes for a solution. This is where I came across this great write-up by @spotheplanet that showcases how to leverage the AddressOfEntryPoint relative virtual address for code injection.
When a Portable Executable (PE) is loaded into memory, the AddressOfEntryPoint is the address of the entry point relative to the image base (Microsoft Learn). In a PE exe file/image, the AddressOfEntryPoint field is located in the Optional Header:
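As a minimal sketch of where the field sits, the following portable C++ reads the AddressOfEntryPoint RVA out of raw PE header bytes using the documented layout: e_lfanew at offset 0x3C of the DOS header, and AddressOfEntryPoint 16 bytes into the Optional Header for both PE32 and PE32+. The helper names read_le and entry_point_rva are hypothetical and only for illustration; on Windows you would normally use the IMAGE_DOS_HEADER and IMAGE_NT_HEADERS structures from windows.h instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: read a little-endian unsigned integer of type T
// from buf at offset off, or return nullopt if the read would go out of range.
template <typename T>
std::optional<T> read_le(const std::vector<std::uint8_t>& buf, std::size_t off) {
    assert(sizeof(T) <= sizeof(std::uint64_t));
    if (off > buf.size() || buf.size() - off < sizeof(T)) return std::nullopt;
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value = static_cast<T>(value | (static_cast<T>(buf[off + i]) << (8 * i)));
    return value;
}

// Hypothetical helper: return the AddressOfEntryPoint RVA found in raw PE
// header bytes, or nullopt if the bytes do not look like a PE image.
std::optional<std::uint32_t> entry_point_rva(const std::vector<std::uint8_t>& image) {
    // IMAGE_DOS_HEADER: e_magic ("MZ") at 0x00, e_lfanew at 0x3C.
    auto mz = read_le<std::uint16_t>(image, 0x00);
    if (!mz || *mz != 0x5A4D) return std::nullopt;
    auto lfanew = read_le<std::uint32_t>(image, 0x3C);
    if (!lfanew) return std::nullopt;

    // IMAGE_NT_HEADERS starts with the 4-byte signature "PE\0\0".
    auto sig = read_le<std::uint32_t>(image, *lfanew);
    if (!sig || *sig != 0x00004550) return std::nullopt;

    // The Optional Header follows the signature (4 bytes) and the
    // IMAGE_FILE_HEADER (20 bytes). Magic is 0x10B (PE32) or 0x20B (PE32+).
    const std::size_t opt = static_cast<std::size_t>(*lfanew) + 4 + 20;
    auto magic = read_le<std::uint16_t>(image, opt);
    if (!magic || (*magic != 0x10B && *magic != 0x20B)) return std::nullopt;

    // AddressOfEntryPoint is at offset 16 of the Optional Header in both formats.
    return read_le<std::uint32_t>(image, opt + 16);
}
```

The returned value is relative to the image base, so the in-memory entry point address is the image base plus this RVA.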
Abusing the AddressOfEntryPoint field is not an entirely new concept. Although not always functional in implementation, the code at the address that AddressOfEntryPoint points to can be stomped (overwritten) with shellcode in an arbitrary PE file so that the injected shellcode runs at program start (as demonstrated here). Interestingly, the technique is also achievable in the context of a remote process.
When a process is created, the first two modules loaded into memory are the program image and ntdll.dll. When the process is created in a suspended state, these are still the only two modules loaded:
Essentially, the operating system does just enough bootstrapping to load the bare essentials; however, the code at AddressOfEntryPoint has not yet been called to begin formal program execution. So, you may be asking: how does one find the AddressOfEntryPoint in a suspended process to inject code?
Following the Red Team Notes write-up, the process is summarized as follows:
- Obtain the PEB address of the remote process, which holds a pointer to the image base, via NtQueryInformationProcess().
- Read the target process image base address from its PEB offset via ReadProcessMemory().
- Read and capture the target process image headers via ReadProcessMemory().
- Locate the AddressOfEntryPoint field in the target process optional header and add it to the image base to get the entry point address.
- Overwrite the code at the entry point address with the desired shellcode via WriteProcessMemory().
- Resume the process (primary thread) from its suspended state via ResumeThread().
Using the sample code provided, our shellcode is successfully injected and executed in the remote process:
Note: For a 64-bit code example of this technique, check out this GitHub project by Tim White.
‘ThreadQuery’ Process Injection
Maybe not as well known as NtQueryInformationProcess(), a similar-in-name method exported from ntdll.dll is NtQueryInformationThread():
While reading the Microsoft documentation for this function, a statement in the ThreadInformationClass parameter section stuck out:
“If this parameter is the ThreadQuerySetWin32StartAddress value of the THREADINFOCLASS enumeration, the function returns the start address of the thread”
Microsoft Docs
Although very interesting, information about the THREADINFOCLASS enum was not readily accessible on the Microsoft site. However, a quick Google search leads us to the ProcessHacker GitHub repo page containing a definition for the enum:
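For reference, here is a short excerpt of that enum as a sketch. It lists only the first ten members, as I recall them from the public Process Hacker (phnt) headers. The namespace and enum type name are hypothetical, chosen so the excerpt does not clash with the THREADINFOCLASS already declared in winternl.h.

```cpp
// Excerpt only: the first ten THREADINFOCLASS members, in the order used by
// the public Process Hacker (phnt) headers. Later members are omitted.
namespace sketch {
enum ThreadInfoClassExcerpt {
    ThreadBasicInformation = 0,
    ThreadTimes = 1,
    ThreadPriority = 2,
    ThreadBasePriority = 3,
    ThreadAffinityMask = 4,
    ThreadImpersonationToken = 5,
    ThreadDescriptorTableEntry = 6,
    ThreadEnableAlignmentFaultFixup = 7,
    ThreadEventPair = 8,  // named ThreadEventPair_Reusable in some header versions
    ThreadQuerySetWin32StartAddress = 9
};
}  // namespace sketch

// The member of interest for this post has the numerical value 0x09.
static_assert(sketch::ThreadQuerySetWin32StartAddress == 0x09,
              "ThreadQuerySetWin32StartAddress is expected to be 9");
```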
As the enum definition shows, a lot of information can be pulled through THREADINFOCLASS. For our purposes, we are most interested in the ThreadQuerySetWin32StartAddress information class. If we take what we already know about a suspended process, the program entry point has not been called (yet). So, the start address returned for ThreadQuerySetWin32StartAddress when querying the primary process thread is likely going to be the address of the program entry point. Let’s explore this assumption…
First, we must figure out how to actually obtain a handle to the primary process thread. Fortunately, this is quite trivial since we start the process with CreateProcess(). The handle is readily available in the hThread member of the PROCESS_INFORMATION structure that CreateProcess() fills in. Conveniently, Microsoft states:
[PROCESS_INFORMATION] contains information about a newly created process and its primary thread. It is used with the CreateProcess, CreateProcessAsUser, CreateProcessWithLogonW, or CreateProcessWithTokenW function.
Microsoft Docs
As such, we call NtQueryInformationThread() on the primary thread handle with the ThreadQuerySetWin32StartAddress information class (numerical value 0x09 in the THREADINFOCLASS enum) to obtain the thread’s start address.
Next, we write our shellcode to the returned start address with WriteProcessMemory() and leverage ResumeThread() to resume the thread, which launches the shellcode.
Putting it all together, this simple C++ program should accomplish the task (targeting notepad.exe):
#include <stdio.h>
#include <windows.h>
#include <winternl.h>
#pragma comment(lib, "ntdll")

int main()
{
    // Embed our shellcode bytes
    unsigned char shellcode[]{ 0x56,0x48,0x89, ... };

    // Start target process in a suspended state
    STARTUPINFOA si = { sizeof(si) };  // zero-initialize and set cb, as CreateProcessA() expects
    PROCESS_INFORMATION pi = {};
    if (!CreateProcessA(0, (LPSTR)"c:\\windows\\system32\\notepad.exe", 0, 0, 0, CREATE_SUSPENDED, 0, 0, &si, &pi))
    {
        printf("CreateProcessA failed: %lu\n", GetLastError());
        return 1;
    }

    // Get start address of primary thread (9 = ThreadQuerySetWin32StartAddress)
    ULONG64 threadAddr = 0;
    ULONG retlen = 0;
    NtQueryInformationThread(pi.hThread, (THREADINFOCLASS)9, &threadAddr, sizeof(PVOID), &retlen);
    printf("Found primary thread start address: %I64x\n", threadAddr);

    // Overwrite memory at the thread start address with our shellcode
    WriteProcessMemory(pi.hProcess, (LPVOID)threadAddr, shellcode, sizeof(shellcode), NULL);

    // Resume primary thread to execute shellcode
    ResumeThread(pi.hThread);
    return 0;
}
Once we compile and run the application, it appears everything works as intended.
Before declaring victory, let’s modify our code slightly and analyze the program operation to validate (or debunk) our initial assumption…
ThreadQuerySetWin32StartAddress Analysis
First, we comment out the ResumeThread() call in the program, recompile, and run. This, of course, creates the target (notepad.exe) process in a suspended state. We will resume the process manually when necessary.
In our program output, NtQueryInformationThread() returns a memory address of 0x7ff6a0ff3f40 when querying for ThreadQuerySetWin32StartAddress:
Analyzing the suspended process in ProcessHacker, we see a single thread pointing to a start address of 0x7ffdaf6a2680.
Once we attach the x64dbg debugger to the suspended program, the program state resumes but the single thread remains suspended. The instruction pointer currently points to the start address of the single thread for execution of the ntdll:RtlUserThreadStart() function.
For clarity, the start address shown for the currently suspended thread is not the program entry point. The call to RtlUserThreadStart() is actually a part of the initial process start-up and initialization routine, which runs before control reaches the entry point.
Moving forward, we manually resume the suspended thread to continue through the remainder of the process initialization, and then add a breakpoint in the debugger for the ThreadQuerySetWin32StartAddress returned memory address (0x7ff6a0ff3f40). When we run the application, the breakpoint hits on the resolved program entry point address:
Stepping through the remainder of the program, the shellcode is successfully executed:
Note: Overwriting the entry point may result in unstable program functionality (e.g. if the shellcode is large).
Defensive Considerations
- While taking a look at the thread stacks, I noticed an interesting method call for _report_securityfailure. This is a feature of VTGuard which “detects an invalid virtual function table which can occur if an exploit is trying to control execution flow via a controlled C++ object in memory”.
Tracing for such stack events and correlating them with System/Application/Security-Mitigations Event Log errors may provide an interesting detection opportunity. (Please reach out if you have more information on this!)
- The following POC Yara rule may be useful for identifying suspicious PE files that leverage methods associated with entry point process injection:
import "pe"
rule Identify_EntryPoint_Process_Injection
{
    meta:
        author = "@bohops"
        description = "Identify suspicious methods in PE files that may be used for entry point process injection"
    strings:
        $a = "CreateProcess"
        $b = "WriteProcessMemory"
        $c = "NtWriteVirtualMemory"
        $d = "ResumeThread"
        $e = "NtQueryInformationThread"
        $f = "NtQueryInformationProcess"
    condition:
        pe.is_pe and $a and ($b or $c) and $d and ($e or $f)
}
Conclusion
As always, thank you for taking the time to read this post.
-bohops