Over the span of the previous two blog posts in the series, I showed why the majority of Cobalt Strike (CS) BOFs are incompatible with Brute Ratel C4 (BRC4) and what you can do about it. I also presented CS2BR itself: it’s a tool that makes patching BOFs to be compatible with BRC4 a breeze. However, we also found some limitations to CS2BR’s current approach.
In this (final?) post in this series, we’ll take a look at one of CS2BR’s shortcomings: its reliance on source-code for patching. We’ll see how this can be resolved and – spoiler alert – why we couldn’t (yet!) but decided to pull the plug on it. That’s right: this blog post won’t present a fancy new solution but the challenges you’ll encounter when you go down this rabbit hole.
This post will get a bit more technical than its predecessors. Don’t worry though, I’ll try my best not to get lost in itty-bitty details. So feel free to grab a coffee and prepare for a journey into the wonderful world of object files, how you can mess with them, and what I did to them.
When I finished work on CS2BR’s source code patching, I realized that there were two major issues with it that caused me headaches and that I wasn’t happy with:
While the first issue is just somewhat awkward, the second one can be a real showstopper in some cases:
So wouldn’t it be great if we didn’t need access to source code? And wouldn’t it be cool to avoid recompilation of BOFs? There surely has to be a way to do this, right?
Since BOFs are object files (hence the name, beacon object files) and CS2BR’s compatibility layer can be compiled into an object file, we might just be able to merge both of them into a single object file.
And indeed, it appears that you can merge object files using ld
, the GNU linker:
ld --relocatable cs2br.o bof.o -o brc4bof.o
That’s the basic premise. Before we continue with the details of this idea, let’s have a brief look at the “Common Object File Format” (COFF) that our object files come in.
At their core, object files are an intermediate format of executables: they contain compiled code and data but aren’t directly executable. Here’s the source code of a simple BRC4 BOF that prints its input arguments:
#include "badger_exports.h" void coffee(char** argv, int argc, WCHAR** dispatch) { int i = 0; for (; i < argc; i++) { BadgerDispatch(dispatch, "Arg #%i: \"%s\"\n", (i+1), argv[i]); } }
This can be compiled into a COFF file using a C compiler such as mingw
on a Linux machine:
$ x86_64-w64-mingw32-gcc -o minimal.o -c minimal.c $ file minimal.o minimal.o: Intel amd64 COFF object file, no line number info, not stripped, 7 sections, symbol offset=0x216, 19 symbols, 1st section name ".text"
We can get detailed information about the compiled object file using the nd
or objdump
utilities:
objdump
output$ objdump -x minimal.o
minimal.o: file format pe-x86-64
minimal.o
architecture: i386:x86-64, flags 0x00000039:
HAS_RELOC, HAS_DEBUG, HAS_SYMS, HAS_LOCALS
start address 0x0000000000000000
Characteristics 0x4
line numbers stripped
Time/Date Wed Dec 31 19:00:00 1969
Magic 0000
MajorLinkerVersion 0
MinorLinkerVersion 0
SizeOfCode 0000000000000000
SizeOfInitializedData 0000000000000000
SizeOfUninitializedData 0000000000000000
AddressOfEntryPoint 0000000000000000
BaseOfCode 0000000000000000
ImageBase 0000000000000000
SectionAlignment 00000000
FileAlignment 00000000
MajorOSystemVersion 0
MinorOSystemVersion 0
MajorImageVersion 0
MinorImageVersion 0
MajorSubsystemVersion 0
MinorSubsystemVersion 0
Win32Version 00000000
SizeOfImage 00000000
SizeOfHeaders 00000000
CheckSum 00000000
Subsystem 00000000 (unspecified)
DllCharacteristics 00000000
SizeOfStackReserve 0000000000000000
SizeOfStackCommit 0000000000000000
SizeOfHeapReserve 0000000000000000
SizeOfHeapCommit 0000000000000000
LoaderFlags 00000000
NumberOfRvaAndSizes 00000000
The Data Directory
Entry 0 0000000000000000 00000000 Export Directory [.edata (or where ever we found it)]
Entry 1 0000000000000000 00000000 Import Directory [parts of .idata]
Entry 2 0000000000000000 00000000 Resource Directory [.rsrc]
Entry 3 0000000000000000 00000000 Exception Directory [.pdata]
Entry 4 0000000000000000 00000000 Security Directory
Entry 5 0000000000000000 00000000 Base Relocation Directory [.reloc]
Entry 6 0000000000000000 00000000 Debug Directory
Entry 7 0000000000000000 00000000 Description Directory
Entry 8 0000000000000000 00000000 Special Directory
Entry 9 0000000000000000 00000000 Thread Storage Directory [.tls]
Entry a 0000000000000000 00000000 Load Configuration Directory
Entry b 0000000000000000 00000000 Bound Import Directory
Entry c 0000000000000000 00000000 Import Address Table Directory
Entry d 0000000000000000 00000000 Delay Import Directory
Entry e 0000000000000000 00000000 CLR Runtime Header
Entry f 0000000000000000 00000000 Reserved
The Function Table (interpreted .pdata section contents)
vma: BeginAddress EndAddress UnwindData
0000000000000000: 0000000000000000 000000000000006a 0000000000000000
Dump of .xdata
0000000000000000 (rva: 00000000): 0000000000000000 - 000000000000006a
Version: 1, Flags: none
Nbr codes: 3, Prologue size: 0x08, Frame offset: 0x0, Frame reg: rbp
pc+0x08: alloc small area: rsp = rsp - 0x30
pc+0x04: FPReg: rbp = rsp + 0x0 (info = 0x0)
pc+0x01: push rbp
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000070 0000000000000000 0000000000000000 0000012c 2**4
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000000 0000000000000000 0000000000000000 00000000 2**4
ALLOC, LOAD, DATA
2 .bss 00000000 0000000000000000 0000000000000000 00000000 2**4
ALLOC
3 .rdata 00000010 0000000000000000 0000000000000000 0000019c 2**4
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .xdata 0000000c 0000000000000000 0000000000000000 000001ac 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .pdata 0000000c 0000000000000000 0000000000000000 000001b8 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
6 .rdata$zzz 00000020 0000000000000000 0000000000000000 000001c4 2**4
CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
[ 0](sec -2)(fl 0x00)(ty 0)(scl 103) (nx 1) 0x0000000000000000 minimal.c
File
[ 2](sec 1)(fl 0x00)(ty 20)(scl 2) (nx 1) 0x0000000000000000 coffee
AUX tagndx 0 ttlsiz 0x0 lnnos 0 next 0
[ 4](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x6a nreloc 2 nlnno 0
[ 6](sec 2)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 8](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 10](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .rdata
AUX scnlen 0xf nreloc 0 nlnno 0
[ 12](sec 5)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .xdata
AUX scnlen 0xc nreloc 0 nlnno 0
[ 14](sec 6)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .pdata
AUX scnlen 0xc nreloc 3 nlnno 0
[ 16](sec 7)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .rdata$zzz
AUX scnlen 0x17 nreloc 0 nlnno 0
[ 18](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000000 __imp_BadgerDispatch
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000046 IMAGE_REL_AMD64_REL32 .rdata
0000000000000050 IMAGE_REL_AMD64_REL32 __imp_BadgerDispatch
RELOCATION RECORDS FOR [.pdata]:
OFFSET TYPE VALUE
0000000000000000 IMAGE_REL_AMD64_ADDR32NB .text
0000000000000004 IMAGE_REL_AMD64_ADDR32NB .text
0000000000000008 IMAGE_REL_AMD64_ADDR32NB .xdata
This gave us quite a lot of information of which I’d like to highlight and unpack three particularly important bits:
.text
section with the CODE
flag set will usually contain compiled code whereas the .rdata
section with the READONLY
and DATA
flags will contain readonly data (such as strings used in the application)..text
), functions (e.g. coffee
) and imports (e.g. __imp_BadgerDispatch
).BadgerDispatch
call references the string Arg #%i: \"%s\"\n
which is located in the .rdata
section. The first relocation entry in the .text
section indicates that at offset 0x46
into the .text
section, there is a reference to the .rdata
symbol (which points to the .rdata
section), which needs to be resolved as a relative address.objdump
$ objdump -s -j .rdata minimal.o
minimal.o: file format pe-x86-64
Contents of section .rdata:
0000 41726720 2325693a 20222573 220a0000 Arg #%i: "%s"...
$ objdump -S -j .text minimal.o
minimal.o: file format pe-x86-64
Disassembly of section .text:
0000000000000000 <coffee>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 30 sub $0x30,%rsp
8: 48 89 4d 10 mov %rcx,0x10(%rbp)
c: 89 55 18 mov %edx,0x18(%rbp)
f: 4c 89 45 20 mov %r8,0x20(%rbp)
13: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
1a: eb 3e jmp 5a <coffee+0x5a>
1c: 8b 45 fc mov -0x4(%rbp),%eax
1f: 48 98 cltq
21: 48 8d 14 c5 00 00 00 lea 0x0(,%rax,8),%rdx
28: 00
29: 48 8b 45 10 mov 0x10(%rbp),%rax
2d: 48 01 d0 add %rdx,%rax
30: 48 8b 10 mov (%rax),%rdx
33: 8b 45 fc mov -0x4(%rbp),%eax
36: 8d 48 01 lea 0x1(%rax),%ecx
39: 48 8b 45 20 mov 0x20(%rbp),%rax
3d: 49 89 d1 mov %rdx,%r9
40: 41 89 c8 mov %ecx,%r8d
43: 48 8d 15 00 00 00 00 lea 0x0(%rip),%rdx # 4a <coffee+0x4a>
4a: 48 89 c1 mov %rax,%rcx
4d: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 54 <coffee+0x54>
54: ff d0 call *%rax
56: 83 45 fc 01 addl $0x1,-0x4(%rbp)
5a: 8b 45 fc mov -0x4(%rbp),%eax
5d: 3b 45 18 cmp 0x18(%rbp),%eax
60: 7c ba jl 1c <coffee+0x1c>
62: 90 nop
63: 90 nop
64: 48 83 c4 30 add $0x30,%rsp
68: 5d pop %rbp
69: c3 ret
6a: 90 nop
6b: 90 nop
6c: 90 nop
6d: 90 nop
6e: 90 nop
6f: 90 nop
These are the central parts of COFF files that are relevant to this blog post.
Continuing with the idea of merging object files, it turns out that it’s not just going to be a simple ld
. Let’s compare a regular BOF in Cobalt Strike to a CS2BR BOF in BRC4:
Pictured above is a regular CS BOF: It resides in a beacon, is executed via its go
entrypoint and can make use of serveral CS BOF APIs. In order to execute the BOF, the beacon acts as a linker: it maps the BOF’s sections into memory, resolves CS BOF API imports to the beacon’s internal implementations and resolves relocations. That’s the regular flow of things.
Here’s how the general CS2BR approach works: it provides the CS BOF APIs as part of its compatibility layer. This layer in turn uses the BRC4 BOF APIs which are implemented in the BRC4 badger. From our perspective, a badger loads & executes a BOF similar to how CS does.
When we patch a BOF’s source code via CS2BR and compile it afterwards, the coffee
entrypoint will be included in the BOF and able to invoke the original go
entrypoint (*). Additionally, calls to the CS BOF API will be “rerouted” to CS2BR’s compatibility layer (*). When both BOF and the CS2BR compatibility layer are compiled separately though, we need to ensure that those two connections are made when we merge the object files. For simplicity’s sake, let’s refer to the compiled CS BOF as bof.o
and to the compiled CS2BR compatibility layer as cs2br.o
:
coffee
entrypoint in cs2br.o
needs to reference the go
entrypoint in bof.o
. When the files are merged, this reference must be resolved.bof.o
must be “re-wired” so they don’t reference imports but cs2br.o
‘s implementations instead.Well, this doesn’t sound super complicated, does it?
Now it’s only a matter of putting everything together. We’ll start with the entrypoint:
In order to reference bof.o
‘s go
entrypoint from cs2br.o
, we can leverage the fact that such operations are precisely what object files and linkers are great at accomplishing: by defining go
as an external symbol in cs2br.o
, a linker will resolve it when also supplying it with bof.o
which provides this exact symbol. So here’s the single line we add to CS2BR’s badger_stub.c
that contains our custom coffee
entrypoint:
extern void go(void *, int);
Now, when we compile CS2BR’s entrypoint in badger_stub.c
and its compatibility layer beacon_wrapper.h
, we observe the resulting cs2br.o
‘s symbols. Also, for comparison, let’s also inspect bof.o
‘s symbols:
$ objdump -x cs2br.o | grep go [ 52](sec 0)(fl 0x00)(ty 20)(scl 2) (nx 0) 0x0000000000000000 go $ objdump -x bof.o | grep go [ 2](sec 1)(fl 0x00)(ty 20)(scl 2) (nx 1) 0x0000000000000000 go
We can use Microsoft’s documentation on the PE format (which also covers COFF) to better understand what those entries mean:
sec
: “The signed integer that identifies the section, using a one-based index into the section table. Some values have special meaning […].”
0
(IMAGE_SYM_UNDEFINED
): “[…] A value of zero indicates that a reference to an external symbol is defined elsewhere. […]”ty
: “A number that represents type. Microsoft tools set this field to 0x20 (function) or 0x0 (not a function). […]”scl
: “An enumerated value that represents storage class. […]”
2
(IMAGE_SYM_CLASS_EXTERNAL
): “[…] The Value field indicates the size if the section number is IMAGE_SYM_UNDEFINED (0). If the section number is not zero, then the Value field specifies the offset within the section.”Using this information, we can deduct that:
cs2br.o
‘s go
symbol is an external symbol defined elsewhere.bof.o
‘s go
symbol is located in section 1 (.text
) and located right at the start of the section (offset 0
).When we merge them using ld
(ld --relocatable bof.o cs2br.o -o brbof.o --oformat pe-x86-64
) and inspect them in a disassembler like Ghidra, we see that the linking worked as expected and cs2br.o
‘s coffee
actually calls bof.o
‘s go
:
Nice, the first thing is done. This was pretty easy!
In the previous section we declared go
as an external symbol in cs2br.o
‘s source code. This allowed us to have the linker resolve the reference to the supplied bof.o
‘s implementation of go
.
Rewiring the CS BOF API imports of bof.o
to cs2br.o
‘s implementations isn’t as straightforward though. Let’s have a look at the symbols involved:
$ objdump -x cs2br.o | grep BeaconPrintf [ 24](sec 1)(fl 0x00)(ty 20)(scl 2) (nx 0) 0x00000000000005e1 BeaconPrintf $ objdump -x bof.o | grep BeaconPrintf [ 18](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000000 __imp_BeaconPrintf 0000000000000027 IMAGE_REL_AMD64_REL32 __imp_BeaconPrintf
From this output we learn that:
cs2br.o
exports BeaconPrintf
as a symbol that
.text
)ty 20
)bof.o
exports __imp_BeaconPrintf
as a symbol that
__imp_
prefix, indicating that this function was declared using __declspec(import)
and needs to be imported at runtimeIMAGE_SYM_UNDEFINED
)ty 0
)bof.o
also references __imp_BeaconPrintf
in a relocation in the .text
section. Which makes sense considering that BeaconPrintf
is imported from the CS BOF API and its implementation is not included in the BOF’s source code.The fact that __imp_BeaconPrintf
referes to an import makes it special and more tricky to handle:
Contrary to how cs2br.o
called go
(which was a call to an address relative to the CALL
statement), bof.o
calls BeaconPrintf
by absolute address that is read from the place in memory where __imp_BeaconPrintf
is located. In other words, __imp_BeaconPrintf
is a pointer to the actual implementation and a loader needs to calculate and populate this address at runtime.
If we wanted to make the linker resolve these references in bof.o
like it did with the go
symbol in cs2br.o
above, we would need cs2br.o
to export not the function implementations but pointers to those implementations. Then we’d still need to rename all the imported functions in bof.o
so they don’t have the __imp_
prefix in their names anymore or else a loader might attempt to import them again (and fail doing so).
There are two major challenges to this though:
I’ll cover those next before continuing with the process of merging object files.
As outlined above, there are two major challenges related to the tooling I needed to overcome at this point.
There are lots of COFF parsers out there that allow you to parse existing or create new COFF files. Only very few also allow for modification of existing files though. Since I wanted to stick with Python for the tooling for this project and couldn’t find a suitable solution for my needs, I decided to implement this functionality based on a Python library I programmed in the past: structex.
The idea of structex is that, as a developer, you don’t imperatively write down code to serialize or deserialize individual fields of data structures but instead describe the data structure to your application. The library then does the heavy-lifting and figures out which field is at what offset and does all the (de-)serialization for you. Then you can just have your application map data structures to some binary buffer and access fields of those structures like you access fields in Python classes. Here’s a brief example:
class MachineType(IntEnum): IMAGE_FILE_MACHINE_I386 = 0x014c IMAGE_FILE_MACHINE_IA64 = 0x0200 IMAGE_FILE_MACHINE_AMD64 = 0x8664 class coff_file_header(Struct): # ... _machine: int = Primitive(uint16_t) NumberOfSections: int = Primitive(uint16_t) #... @property def Machine(self) -> MachineType: return MachineType(self._machine) # Load BOF into memory & parse header memory = BufferMemory.from_file('bof.o') bof_header = coff_file_header(memory, 0) print(bof_header.Machine.name) # Prints: IMAGE_FILE_MACHINE_AMD64 bof_header.NumberOfSections = 0 # Write modified BOF back to disk memory.to_file('bof_modified.o') # bof_modified.o has now set its NumberOfSections to 0
All that I needed to do then was write down the data structures used in COFF, add some property
decorators for even easier handling, and implement some bits of custom logic (e.g. reading & modifying the COFF string table). This allowed me to easily parse, inspect and modify any BOF files.
Implants are mainly designed to operate covertly, leave very few traces, and avoid getting noticed (and for that matter, sometimes even actively evade detection). This can make them hard to locate, observe and make sense of – not exactly ideal conditions for debugging. So I went out looking for alternatives.
It’s safe to assume that any program that executes BOFs does that in a way that is somewhat similar to TrustedSec’s COFFLoader. So why not use COFFLoader then? Well, it doesn’t support BRC4’s BOF API. Considering that COFFLoader is open source and the BRC4 API is pretty limited (as shown in our first blog post TODO: Insert link), it wasn’t terribly difficult to implement that functionality. I basically only needed to
InternalFunctions
array to point to the Badger*
APIs,InternalFunctions
,Badger
prefix andSince I didn’t want to spend much time on this, I kept the implementations of the BRC4 APIs very simple and didn’t add any sanity checks (or even proper formatting):
size_t BadgerStrlen(CHAR* buf) { returnstrlen(buf); } size_t BadgerWcslen(WCHAR* buf) { return wcslen(buf); } void* BadgerMemcpy(void* dest, const void* src, size_t len) { return memcpy(dest, src, len); } void* BadgerMemset(void* dest, int val, size_t len) { return memset(dest, val, len); } int BadgerStrcmp(const char* p1, const char* p2) { return strcmp(p1, p2); } int BadgerWcscmp(const wchar_t* s1, const wchar_t* s2) { return wcscmp(s1, s2); } int BadgerAtoi(char* string) { return atoi(string); } PVOID BadgerAlloc(SIZE_T length) { return malloc(length); } VOID BadgerFree(PVOID* memptr) { free(*memptr); } BOOL BadgerSetdebug() { return TRUE; } ULONG BadgerGetBufferSize(PVOID buffer) { return 0; } int BadgerDispatch(WCHAR** dispatch, const char* __format, ...) { va_list args; va_start(args, __format); vprintf(__format, args); va_end(args); } int BadgerDispatchW(WCHAR** dispatch, const WCHAR* __format, ...) { va_list args; va_start(args, __format); vwprintf(__format, args); va_end(args); }
I’m not very familiar with using gbd
for debugging and do most of my coding in Visual Studio and Visual Studio Code and debugging in x64dbg. That’s why I also used this opportunity to set up COFFLoader as a Visual Studio solution. Now I could use COFFLoader to run my BOFs and Visual Studio and x64dbg to debug both COFFLoader and my CS2BR BOFs, neat!
On the matter of actually rewiring CS BOF API imports, there are two things to consider:
Writing this, I realize that all of this sounds pretty abstract, so let’s have a look at an example:
Our
bof.o
sends text back to operators by using theBeaconPrintf
API. Because of that,bof.o
imports the API by defining a__imp_BeaconPrintf
symbol. This symbol refers to a place in memory where a pointer to the actualBeaconPrintf
is stored.
For binary patching in CS2BR this means that we need to overwrite these pointers in bof.o
to the actual implementations somehow so they point to CS2BR’s methods. These pointers are set by the loader (e.g. COFFLoader) though and that’s something we can’t control before or even at compile-time. So the question becomes: How can we make the loader point imports to CS2BR’s methods instead?
After staring at Ghidra, x64dbg, objdump output and my Python source code for more days than I’m comfortable to admit, I worked out a solution to this problem. It consists of some preparations and two processing phases that I’ll further detail in the following paragraphs.
The general idea is pretty simple:
CS2BR defines pointers (prefixed with __cs2br_
) to its compatibility layer’s methods. These pointers will also end up in its symbol table. After merging both object files, the __imp_
symbols (that originated from bof.o
) to CS BOF APIs are replaced with the __cs2br_
symbols (provided by cs2br.o
). This leaves us with symbols that are referenced relative to instructions and contain pointers to our desired CS2BR compatibility layer methods.
Here’s how the complete workflow is implemented in CS2BR:
As described earlier in this blog post, the compiled CS2BR object file needs to contain an external reference to the go
entrypoint. To do so, I just added the a declaration of this method to CS2BR’s the stub: extern void go(void *, int);
This will make ld
correctly resolve this symbol to the BOF’s entrypoint when we merge both object files.
Next, I added pointers to all of the CS BOF APIs implemented in CS2BR’s compatibility layer:
void* __cs2br_BeaconDataParse __attribute__((section(".data"))) = &BeaconDataParse; void* __cs2br_BeaconDataInt __attribute__((section(".data"))) = &BeaconDataInt; void* __cs2br_BeaconDataShort __attribute__((section(".data"))) = &BeaconDataShort; // ...
Before merging object files, CS2BR identifies all CS BOF API import symbols (named __imp_Beacon*
) and reconfigures them:
for symbol_name in cs_patches: symbol = osrcbof.get_symbol_by_name(f"__imp_{symbol_name}") symbol.Value = 0 symbol.SectionNumber = 0 symbol.StorageClass = StorageClassType.IMAGE_SYM_CLASS_EXTERNAL symbol.Name = symbol_name symbol.Name = f"__cs2br_{symbol_name}" symbol._type = 0
This reconfiguration achieves that the symbols are
0
, storage class IMAGE_SYM_CLASS_EXTERNAL
, type 0
) and__imp_*
to __cs2br_*
, which alles ld
to resolve them to cs2br.o
‘s defined symbols upon merging.Then CS2BR renames the symbols of windows APIs that are available to CS BOFs by default (LoadLibrary
, GetModuleHandle
, GetProcAddress
and FreeLibrary
) so they have the __imp_KERNEL32$
prefix. This ensures that, if any of those APIs are used by the BOF, BRC4 imports and links them before executing the BOF.
Both object files (bof.o
and cs2br.o
) are merged using ld
. The resulting object file contains the sections and symbols of both files.
At this point, both COFFLoader and BRC4 should be able to load and execute the patched BOF. Instead, COFFLoader just crashed and BRC4 gave me the silent treatment.
It turned out that the relocations were flawed and presumably not recalculated by ld
. I’ll briefly describe that bug right now, you can skip to my workaround if you want to.
Relocations are a tricky topic. In fact I don’t think I got my head fully wrapped around the topic myself. When I tested my BOFs at that point and saw COFFLoader crashing, I did a lot of manual investigation by debugging COFFLoader and tracing back why it crashed. Let’s have a look at an example:
We’ll execute a very simple BOF that only formats and outputs a string using BeaconPrintf
:
#include <windows.h> #include "beacon.h" VOID go(IN PCHAR Args, IN ULONG Length) { BeaconPrintf(CALLBACK_OUTPUT, "Hi from CS2BR %i\n", 1337); return; }
When executing the BOF in COFFLoader, it would end up executing some data, not actual instructions:
Inspecting the address of RIP in the dump, we can see that RIP lies in the .rdata
section of the BOF as we can clearly see the strings used in cs2br.o
‘s entrypoint:
By restarting and carefully stepping through the program we see that the coffee
entrypoint is invoked correctly, so that bit works just fine:
It also reaches the go
entrypoint:
The next call will fail though. It retrieves the address of the method to call from a pointer (mov rax, qword ptr ds:[7ff45d050068]
) and calls that. Taking a look at the memory dump of the address of the pointer, we see that this is our .data
section:
The 0xDEADBEEFDEADBEF
is a dummy value I made COFFLoader pass to the coffee
entrypoint to use as the _dispatch
variable. CS2BR saves this _dispatch
variable as a global variable in .data
as can be seen in the objdump
output:
$ objdump -x minimal.BR_bin21.o | grep "sec 3" [ 36](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .data ... 0x0000000000000068 __cs2br_BeaconPrintf [ 43](sec 3)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000050 __cs2br_BeaconFormatPrintf [ 44](sec 3)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000088 __cs2br_BeaconIsAdmin [ 45](sec 3)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000000 _dispatch [ 46](sec 3)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000070 __cs2br_BeaconOutput [ 47](sec 3)(fl 0x00)(ty 0)(scl 2) (nx 0) ...
As expected, the call fails at this point as it jumps to 0x00007FF45D0705E1
which is just some random offset into a method:
It should be pointing to 0x00007FF45D070621
though, as the .text
section is mapped to 0x00007FF45D070000
and BeaconPrintf
‘s offset into this section is 0x621
. Apparently, the value of the pointer to BeaconPrintf
is a whopping 0x40
bytes short. This left me confused for quite a while. And just by accident, I noticed something in the objdump
output:
Sections: Idx Name Size VMA LMA File off Algn 0 .text 00000dc0 0000000000000000 0000000000000000 000000b4 2**4 CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE ... 5 .data 00000200 0000000000000000 0000000000000000 00001380 2**5 CONTENTS, ALLOC, LOAD, RELOC, DATA ... SYMBOL TABLE: ... [ 2](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000000 .text ... [ 28](sec 1)(fl 0x00)(ty 20)(scl 2) (nx 0) 0x0000000000000621 BeaconPrintf ... [ 34](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x0000000000000040 .text ... [ 42](sec 3)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x0000000000000068 __cs2br_BeaconPrintf ... RELOCATION RECORDS FOR [.text]: OFFSET TYPE VALUE ... 0000000000000027 IMAGE_REL_AMD64_REL32 __cs2br_BeaconPrintf-0x0000000000000068 ... RELOCATION RECORDS FOR [.data]: OFFSET TYPE VALUE ... 0000000000000068 IMAGE_REL_AMD64_ADDR64 .text-0x0000000000000040 ...
Did you spot it? There are two .text
symbols, of which one has an offset of 0x40
into the .text
section. That same odd symbol is used in relocations of the __cs2br_*
symbols.
The ADDR64 relocations for the entries in .data
could be read as: “Read the relocation’s current value from its offset into .data
(aka its ‘addend’), add to it the absolute address of the .text-0x40
symbol, and write the calculated sum back at the relocation entry’s offset in .data
.” This doesn’t quite work though: these relocations aren’t relative to a symbol but to the section their symbols reside in. Thus, COFFLoader correctly resolves the relocation to the address of the .text
section plus the relocation’s addend 5E1
. We know the relocation’s addend is 5E1
by simply extracting it:
# 5096 is the decimal representation of 1380h (.data offset into the file) + 68h (relocation offset) od -j 5096 -N 8 -t x8 minimal.BR_bin.o 0011750 00000000000005e1 0011760
Here’s where the workaround finally comes into play!
Lastly, CS2BR recalculates relocations that
IMAGE_REL_AMD64_ADDR64
and.text-0x40
).For each of those relocations, it will acquire their current addend, add to it the value of the associated symbol, and write the newly calculated addend back to the image, as can be seen here with the __cs2br_BeaconPrintf
symbol:
[INFO] Pointing relocation .data:0x68 from .text:0x40+0x5e1 (=> __cs2br_BeaconPrintf) to .text:0x621 (=> BeaconPrintf)...
Patching a BOF using cs2br is very simple. One only needs to compile the compatibility layer (cs2br.o
) and supply & run the patchbin.py
script with paths to the BOF file to patch and the cs2br.o
:
Running a BOF that was binary-patched by CS2BR works great in COFFLoader:
Not so much in BRC4 though:
At this point, there wasn’t much I could do. I certainly didn’t feel like putting more work into it and testing against a black box didn’t make much sense.
I did reach out to Chetan Nayak, the developer of BRC4, via Discord a couple of times during the project. Since this was an internal project at the time however, I couldn’t share CS2BR’s source code. Provided with a fully patched binary, they said the entrypoint was not found and never executed by the badger. Apparently, debugging this blob could take a lot of time and they can’t provide support for such BOFs.
This marks the end of my work on this project – for now.
This was one long blog post to write. Working on the tool and debugging BOFs certainly took a long time – I honestly underestimated the effort of documenting all of it in this post though. So, let’s have a look at what CS2BR accomplished:
CS2BR’s source-code patching approach works very well and enables operators to use well-known and battle-tested BOFs that were formerly (almost) exclusive to CS now in BRC4. While it requires access to source code and recompilation of BOFs, it does provide a solid compatibility layer.
In its current iteration, CS2BR is able to patch binary CS BOFs and make them (on paper!) compatible with BRC4. It works well in a modified COFFLoader that provides a simple BRC4 BOF API but doesn’t seem to work with BRC4’s badgers. The reason as to why it doesn’t is a mystery to me. As such, this iteration of CS2BR effectively isn’t usable. Since this is an open-source project, everyone is free to have a look for themselves and maybe someone finds a solution – in which case I would be thrilled to learn all about it!
Both approaches, the source-code and binary patching, make use of the same custom entrypoint which, depending on the exact BOF being executed, requires encoding input parameters with the provided Python script. It would be nice to automate parts of those by parsing the CNA scripts that accompany the BOFs and making use of the BRC4 Ratel Server API to simplify the process.
To me, this project was a rewarding, albeit intense and at times frustrating, journey and deep-dive into BOF development and the COFF format. I certainly learned a lot! To be frank though, the fact that this isn’t a success-story leaves me quite unsatisfied.
I’ll be laying down my work on this project for now and provide support for the source-code patching approach. Maybe some day Chetan finds some time to look into his BOF loader, though, and lets me know what’s wrong with CS2BR’s approach to patching.
Since you made it this far, I can only assume that you are very interested in the topic (or skipped a fair bunch of this blogpost). I would love to know your thoughts on the topic, so please leave a reply!
Moritz Thomas
Moritz is a senior IT security consultant and red teamer at NVISO.
When he isnât infiltrating networks or exfiltrating data, he is usually knees deep in research and development, working on new techniques and tools in red teaming.