In 2021, the Cybersecurity and Infrastructure Security Agency (CISA) began publishing the “Known Exploited Vulnerabilities (KEV) Catalog.” Entries in this catalog are vulnerabilities that have been reported through the Common Vulnerabilities and Exposures (CVE®) program and are observed to be (or have been) actively exploited.
I think this part in particular is important for software engineers and management to be aware of:
CISA recommends that organizations monitor the KEV catalog and use its content to help prioritize remediation activities in their systems to reduce the likelihood of compromise.
Last year I did a similar thing (see also the spreadsheet), with a substantially different classification system but with essentially the same outcome.
Hand-waving away the differences in the bug classification for a moment, we see that MITRE’s result is similar to mine: they found 46% memory unsafety (with use-after-free (UAF) leading), while my (incomplete) result was 40% memory unsafety. So we’re in the same ballpark, which is nice.
Of the various kinds of memory unsafety, why should UAF be so prominent in known exploitation in 2023? My take is that for arbitrarily complex object graphs, nothing but heap-walking garbage collection (GC) is reliable for achieving temporal safety. Less expensive lifetime management approaches, like reference counting, arena allocation, and so on, don’t get you all the way there. Temporal unsafety is much harder in general to fix efficiently than is spatial unsafety.
Moreover, browsers, which by definition must have complex object graphs (due to, for example, the HTML DOM, JavaScript, and cross-process IPC with entangled object lifetimes), are naturally a big focus of exploit developers’ attention. For efficiency reasons, browsers tend not to use GC for much of the browser’s own internals. I think that accounts for the prominence of UAF in the 2023 catalog.
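To make the cost of temporal safety a bit more concrete, here is a minimal sketch of one of the cheaper lifetime management approaches, reference counting, using the standard `std::shared_ptr` and `std::weak_ptr`. `Widget` and `ReadId` are hypothetical names I made up for illustration; this is not how any particular browser does it. A raw `Widget*` held past its owner’s lifetime would be a UAF, while the weak handle turns that into a failure you can check for.

```cpp
#include <cassert>
#include <memory>

// Hypothetical object type, for illustration only.
struct Widget {
  int id;
};

// Copies the widget's id into `*out` if the widget is still alive.
// Returns false, leaving `*out` untouched, if it has already been destroyed.
bool ReadId(const std::weak_ptr<Widget>& handle, int* out) {
  assert(out != nullptr);
  // lock() yields an owning pointer, or null once the last owner is gone.
  if (std::shared_ptr<Widget> alive = handle.lock()) {
    *out = alive->id;
    return true;
  }
  return false;
}
```

Note what this costs even in the simple case: a control block per object, counter updates whenever ownership is shared, and a check at every use. It also does nothing about cycles of owning pointers, which is one reason reference counting alone is not a general answer for complex object graphs.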
Let’s consider MITRE’s classification system, though.
At its core, I find that the Common Weakness Enumeration (CWE) taxonomy tries for more precision than we can get or even need, and that it obscures more than it enlightens. (I find the same is true of the Common Vulnerability Scoring System (CVSS).)
For example, consider the 2nd- and 3rd-most prevalent bug classes, CWE-122 heap-based buffer overflow and CWE-787 out-of-bounds write. It’s not immediately clear what the important differences between these 2 taxa are. After reading their definitions, I find it even less clear. Were the heap overflows all reads, not writes? Were the OOB writes all on the stack?
At the high level of analysis we are doing here — that is, helping managers and engineers allocate their time and attention most effectively — the read vs. write distinction matters, but I’m not sure the heap vs. stack (vs. BSS) part is the biggest deal. It matters to exploit developers, but the solutions look similar and have similar costs to develop.
There is also a CWE-788 access of memory location after end of buffer taxon. Where does that fit in?
This is important, because it might be that the 2nd- and 3rd-most significant categories actually outrank UAF, if you treat them as essentially the same: as spatial unsafety. That might significantly impact an engineering team’s cost/benefit analysis: solving temporal safety is very hard (expensive), while solving spatial safety is typically much easier (cheaper). Consider a C++ vector type:
    // `InlinedVector::operator[](...)`
    //
    // Returns a `reference` to the `i`th element of the inlined vector.
    reference operator[](size_type i) ABSL_ATTRIBUTE_LIFETIME_BOUND {
      ABSL_HARDENING_ASSERT(i < size());
      return data()[i];
    }
If the biggest security problem of 2023 can be solved by sprinkling 1 line of code in the right places in core libraries, that’s a very different story than if the biggest problem requires fancy allocation and deallocation strategies, as solving UAF typically does. Allocation and deallocation necessarily have a much greater impact on software efficiency and development cost than does the spatial safety fix above.
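To illustrate why I don’t think the heap vs. stack distinction changes the fix much, here is a minimal sketch of the same one-line check applied to raw buffers. `CheckedStore` and `CheckedLoad` are hypothetical helpers I made up for this post, not part of Abseil or any real library, and they report failure to the caller rather than crashing as a hardening assert would.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, for illustration only.

// Stores `value` at `buffer[index]` only when `index` is in bounds.
bool CheckedStore(int* buffer, std::size_t length, std::size_t index,
                  int value) {
  if (index >= length) return false;  // The one-line spatial safety check.
  buffer[index] = value;
  return true;
}

// Loads `buffer[index]` into `*out` only when `index` is in bounds.
bool CheckedLoad(const int* buffer, std::size_t length, std::size_t index,
                 int* out) {
  assert(out != nullptr);
  if (index >= length) return false;  // Same check, this time for a read.
  *out = buffer[index];
  return true;
}
```

Nothing in either function knows or cares whether `buffer` points at a stack array, a heap allocation, or a global. The check is identical, and so is its cost.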
Similarly, the 4th-biggest problem of 2023 in MITRE’s analysis is CWE-20 improper input validation. As defined, that could be a key contributing factor to all of the other top 9 problems — and often is. (That ABSL_HARDENING_ASSERT above is proper input validation, for example.)
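Here is a minimal sketch of how missing input validation and spatial unsafety can be the same bug seen from 2 angles. The wire format (1 length byte followed by that many payload bytes) and the name `ParsePayload` are assumptions I invented for illustration. Without the marked check, a sender who lies about the length gets an out-of-bounds read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical wire format, for illustration only: one length byte followed
// by that many payload bytes.
bool ParsePayload(const std::vector<std::uint8_t>& message, std::string* out) {
  assert(out != nullptr);
  if (message.empty()) return false;  // No length byte at all.
  const std::size_t claimed = message[0];
  const std::size_t available = message.size() - 1;
  // Input validation: never trust the length the sender claims.
  if (claimed > available) return false;
  out->assign(message.begin() + 1,
              message.begin() + 1 + static_cast<std::ptrdiff_t>(claimed));
  return true;
}
```

Whether you file the missing check under CWE-20 or under an out-of-bounds read, the fix is the same single comparison.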
Along the same lines, type confusion (#8) could lead to spatial unsafety, could be an exploitable outcome of UAF, and could be part of the exploitation of deserialization of untrusted data (#6).
Although CWE is Kind Of A Lot, MITRE’s analysis is, at a high level, correct and useful in directing our work:
eval bugs (#5, #6, possibly #8)
Our goal should be to get things to where logic and (rarely) configuration bugs are our biggest problems. To get there, we gotta hammer on memory unsafety and eval.
Finally, if spatial unsafety really is the current biggest problem, that is great news. Time to go write a bunch of 1-liners, and write tests! 🙂