Reading Time: 6 minutes
It’s no secret that, since the beginning of the year, I’ve spent a good amount of time learning how to fuzz different Windows software, triaging crashes, filling CVE forms, writing harnesses and custom tools to aid in the process.
Today I would like to sneak peek into my high-level process of designing a Homemade Fuzzing Platform, which I’m currently using to fuzz some targets leveraging millions of samples; how I did it, what setup and technologies I’ve used etc…
This blog post is a re-post of the original article “Homemade Fuzzing Platform Recipe” that I have written for Yarix on YLabs.
As a target, I’ve chosen a popular music player (which results and crashes I’m going to disclose in a future blog post) and I’ve decided to focus solely on the MP3 file format as input vector.
I had roughly 20 million MP3 samples, each iteration (launching the target application, reproducing the MP3 sample, terminating the application) took between 3 and 4 seconds. I was not able to use multithreading as no more than a single instance of the target application can run at the same time. It was immediately clear that with a single machine (instance from now on) I would not be able to cut the chase as 20 M samples x 4 seconds =~ 30 months of execution time.
From that time constraint, it was obvious that I would have needed something more powerful in order to exhaust the space of the samples in a timely human manner.
When designing the platform, I’ve used to handle the fuzzing process, the following requirements were taken into considerations:
All the project is based on an HP ProLiant DL380 Gen 10 server with Intel Xeon 4214R CPU @2.40 GHz and 128 GB of RAM.
On top of the bare metal server, I’ve installed VMWare ESXI. VMware will allow for the needed scalability, deploying multiple VMs when increased fuzzing throughput capacity is needed.
Maestro (the instances’ orchestrator) was developed over the classical XAMP stack (Windows, Apache, PHP, MySQL) mainly because I’m familiar with it as well as the team. Apache and PHP should provide enough capabilities and performance given the fact that Maestro is mainly a GUI allowing users to:
Maestro-FFF is the Python software running on the deployed instances, it is responsible for:
I’ve chosen Python mainly because I’m familiar with it and because it was easy to implement, from scratch, a multithreaded application able to support different operative systems.
While the main “body” is developed in Python, the fault detection mechanism is a C executable that leverage the libdasm library to detect and report crashes.
It reports back to Maestro-FFF the occurred exception, faulting memory address and instruction as well as it dumps the status of the registers.
While it is a good point to start from, every crash must be loaded into WinDbg at a later time, leveraging its “!exploitable” plugin, the plugin output will be used to identify unique crashes and then crashes will be triaged to determine their exploitability.
I’ve deployed 30 instances of Maestro-FFF that exhausted the space of the fuzzing corpus in a month of continuous work.
Among all the different roadblocks I’ve encountered during this project, the absence of a working logging functionality in Python was the most problematic; while the logging module is awesome, it lacks support on Windows when multithreading/multiprocessing is used. Even if I’ve resolved the problem in the hackiest way possible (acquiring a global lock on the file in a thread and releasing it once it finished), I’m still hoping that I’ve missed something in the documentation or some non-standard python modules. So, if you know a Python multiprocessing/multithreading logging module compatible with Windows, please let me know!
While I’ve started my career as a web developer and later transitioned into the cyber security field, I’ve definitely jumped out of my comfort zone glueing all these different pieces of software and components together. I had to employ skills spanning many different IT fields and where I’m not proficient. While it’s unarguably clear to me that some things could be improved (or completely rethought and/or refactored to a more professional level, using modern and more performing technologies), I’ll recommend everyone taking this journey and trying to write some homemade fuzzers for themselves as the number of pieces needed to complete this project are vast but provide an enormous wealth of experience.
In the next blog post I’ll disclose the fuzzed target, Maestro/Maestro-FFF metrics, triage process, crashes, CVEs, and exploit PoC.