Trail of Bits has qualified for the final round of DARPA’s AI Cyber Challenge (AIxCC)! Our Cyber Reasoning System (CRS), Buttercup, placed in the top 7 of 39 teams competing in the semifinal round held at DEF CON 2024.

Competition Overview

The AIxCC semifinal featured a series of challenges based on real-world software, including nginx, Jenkins, Apache Tika, SQLite, and the Linux kernel. Teams’ CRSs had to automatically discover and patch vulnerabilities in these complex codebases within strict time and resource constraints.

DARPA created an elaborate AIxCC Village at DEF CON for the competition. Its cityscape, named “Northbridge” and described as “a futuristic cyber city that is under siege by a hacker with the alias ‘rat,’” served as a backdrop for the high-stakes contest. The AIxCC Village attracted an impressive 12,500 visitors over the course of DEF CON.

The AIxCC stage at DEF CON featured talks from cybersecurity and AI leaders, including Dr. Kathleen Fisher (Director of DARPA’s Information Innovation Office), Heather Adkins (VP of Security Engineering at Google), and industry panels on topics like “The Modern Evolution of LLMs” and “How Competitions Can Fuel Innovation.” These sessions provided valuable context around the competition and its broader implications for cybersecurity.

Buttercup’s Performance

Buttercup performed exceptionally well in the semifinals, particularly in the nginx round, where it dominated the achievements leaderboard by:

  • Being first to successfully patch an nginx vulnerability
  • Being first to patch 6 bugs overall
  • Being first to discover 3 bugs

Our CRS appears to have excelled at patching vulnerabilities, and a patch was worth roughly 3x the points of a bug discovery alone.

Competition Highlights

The competition used an achievements-based leaderboard that showed which teams were “first to discover” and “first to patch” each vulnerability. This scoring system added an element of mystery to the event, as teams could only see part of the overall picture. While we don’t know the exact final scores or how many teams found the same bugs after the initial discoveries, we’re proud of Buttercup’s strong showing on the achievements board.

Our CEO, Dan Guido, live-tweeted the competition as it unfolded, providing insights and interpreting the achievements for the community.

We’re looking forward to more detailed information about the performance of all the CRSs during the semifinals. This data should provide valuable insight into the strengths of the competing systems and where each can improve.

We’re honored to advance alongside some of the brightest minds in cybersecurity. The other finalists joining us at DEF CON 2025 are:

  • 42-b3yond-6ug
  • all_you_need_is_fuzzing_brain
  • Lacrosse
  • Shellphish
  • Team Atlanta
  • Theori

Each team has shown exceptional skill in developing AI-powered cybersecurity systems. Notably, Team Atlanta’s CRS discovered a real null dereference bug in SQLite during the competition, demonstrating the potential real-world impact of AIxCC.

Looking Ahead

Advancing to the finals is a major milestone, but our work is far from over. Over the next year, we’ll refine and enhance Buttercup’s capabilities as we prepare for the final round at DEF CON 2025. The top three teams in the finals will receive major cash prizes, with $4 million going to the winner.

We want to thank our incredible team of engineers who poured their expertise and passion into creating Buttercup. We’re also grateful to DARPA for organizing this groundbreaking competition that is pushing the boundaries of AI-powered cybersecurity.

Stay tuned for more updates as we continue our AIxCC journey. The future of automated vulnerability discovery and remediation is bright, and we’re excited to be at the forefront.

More About AIxCC

For those interested in learning more about the competition, the AIxCC website features a collection of educational videos, including talks and interviews captured at DEF CON.