gold --detect-odr-violations
The gold linker supports an option, --detect-odr-violations, that detects ODR violations based on debug information. The option considers symbols whose names start with _Z and looks for pairs of STB_WEAK definitions with different st_size or different st_type values. These symbols are candidates for ODR violations.
gold then parses the DWARF line tables. For each candidate, gold checks whether both definitions have associated line table information and whether their file:line sets are disjoint. If both conditions hold, gold issues a warning. If either definition lacks debug information, no warning is issued.
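The two steps above can be sketched as follows. This is a simplified model rather than gold's actual code: the WeakDef struct and the isCandidate and shouldWarn helpers are hypothetical names, and a definition without debug info is modelled as an empty location set.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified view of one STB_WEAK definition seen by the linker.
struct WeakDef {
  std::string name;  // mangled symbol name
  uint64_t size;     // st_size
  unsigned type;     // st_type
  // file:line pairs from the DWARF line table; empty if there is no debug info.
  std::set<std::pair<std::string, unsigned>> locations;
};

// Step 1: same C++ symbol (name starts with _Z) with differing st_size or st_type.
bool isCandidate(const WeakDef &a, const WeakDef &b) {
  return a.name == b.name && a.name.rfind("_Z", 0) == 0 &&
         (a.size != b.size || a.type != b.type);
}

// Step 2: warn only if both definitions have line info and the sets are disjoint.
bool shouldWarn(const WeakDef &a, const WeakDef &b) {
  if (!isCandidate(a, b) || a.locations.empty() || b.locations.empty())
    return false;
  for (const auto &loc : a.locations)
    if (b.locations.count(loc))
      return false;  // a shared file:line means the definitions likely agree
  return true;
}
```

Two definitions of _Z3foov with sizes 16 and 24 coming from a.h:3 and b.h:7 would trigger a warning in this model, while the same pair would stay silent if either side had no line table entries.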
The check uses source locations as a proxy for ODR violations. The proxy is usually good but imprecise: the first line of a function may differ among relocatable object files due to different optimization behaviors, so the check may report spurious ODR violations. The check also does not detect differing class definitions or templates.
The feature is not implemented in other linkers. See "Future direction" for a probably better alternative.
ODR hash
In 2017, Clang implemented an AST-based ODR hash feature. Each definition is given a hash value. When definitions are merged, their hash values are compared and an error is reported if they do not match.
This feature works with both Clang header modules and C++ modules.
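The compare-on-merge idea can be sketched with a toy model. Clang hashes AST nodes; the sketch below instead hashes a token list with FNV-1a as a stand-in, and toyOdrHash and MergeTable are hypothetical names that do not correspond to Clang's implementation.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an AST-based hash: FNV-1a over the tokens of a definition.
uint64_t toyOdrHash(const std::vector<std::string> &tokens) {
  uint64_t h = 14695981039346656037ULL;
  for (const std::string &tok : tokens) {
    for (unsigned char c : tok) {
      h ^= c;
      h *= 1099511628211ULL;
    }
    h ^= 0xff;  // token separator, so {"ab"} and {"a", "b"} hash differently
    h *= 1099511628211ULL;
  }
  return h;
}

// Hypothetical merge table: entity name -> hash of the first definition seen.
struct MergeTable {
  std::map<std::string, uint64_t> seen;

  // Returns false when a later definition's hash mismatches the first one.
  bool merge(const std::string &name, uint64_t hash) {
    auto res = seen.emplace(name, hash);
    return res.second || res.first->second == hash;
  }
};
```

Merging two definitions of foo whose bodies differ in a single token makes merge return false, which corresponds to the point where a real implementation would report an error.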
echo 'module B { header "B.h" } module C { header "C.h" }' > module.modulemap
-fmodules implies -fimplicit-module-maps, which loads module.modulemap. The two #include directives are translated to module loads. When the definitions of foo in B and C are merged, an error is issued.
In module 'C' imported from A.cc:2:
Let's see an example of C++ modules.
echo 'import B; import C; int main() { return foo(); }' > A.cc
In file included from A.cc:1:
AddressSanitizer detect_odr_violation
For an instrumented translation unit, there is a global constructor that calls __asan_register_globals to register some kinds of global variables (non-thread-local, defined, with external/private/internal LLVM linkage, and a few other conditions). This can be used to check whether two global variables of the same name are defined in different modules. AddressSanitizer does not check functions (the more interesting case).
Poisoning based detection
The runtime poisons the red zone of a to-be-registered global variable (compiler-rt/lib/asan/asan_globals.cpp). If the variable is already poisoned when registration is attempted, it has been registered by another component, and the runtime reports an ODR violation error.
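A toy simulation may make the mechanism concrete. It assumes shadow memory can be modelled as a set of poisoned byte addresses; ToyAsanRuntime and its members are hypothetical names, and the real runtime in asan_globals.cpp does considerably more.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Toy model: shadow memory tracked as a set of poisoned byte addresses.
struct ToyAsanRuntime {
  std::set<uintptr_t> poisonedBytes;

  // True if any byte in [beg, beg + len) is poisoned.
  bool anyPoisoned(uintptr_t beg, size_t len) const {
    auto it = poisonedBytes.lower_bound(beg);
    return it != poisonedBytes.end() && *it < beg + len;
  }

  // Registers a global and poisons its red zone.
  // Returns true if the region was already poisoned, i.e. an earlier
  // registration covered the same address.
  bool registerGlobal(uintptr_t beg, size_t size, size_t sizeWithRedzone) {
    assert(size <= sizeWithRedzone);
    bool alreadyRegistered = anyPoisoned(beg, sizeWithRedzone);
    for (uintptr_t p = beg + size; p < beg + sizeWithRedzone; ++p)
      poisonedBytes.insert(p);
    return alreadyRegistered;
  }
};
```

When two modules define the same interposable variable, both registrations resolve to the same address, so the second call to registerGlobal finds the red zone left by the first and returns true.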
echo 'int var; int main() { return var; }' > a.cc
% ./a
The default detect_odr_violation=2 mode additionally disallows symbol interposition on variables. Change long in b.cc to int and we will still see an odr-violation error. detect_odr_violation=1 suppresses errors if the registered variable is of the same size.
% ASAN_OPTIONS=detect_odr_violation=1 ./a
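The difference between the two modes can be summarized as a small decision function. This is a sketch of the policy as described here, not the runtime's code: shouldReport is a hypothetical helper, and treating level 0 as "checking disabled" is an assumption that the text above does not spell out.

```cpp
#include <cstddef>

// Hypothetical helper: decides whether a second registration at the same
// address is reported, given the detect_odr_violation level.
bool shouldReport(int level, size_t registeredSize, size_t newSize) {
  if (level <= 0)
    return false;                      // assumed: checking disabled
  if (level == 1)
    return registeredSize != newSize;  // same-size duplicates are tolerated
  return true;                         // level 2 (default): any duplicate
}
```

With both definitions declared as int, shouldReport(2, 4, 4) is true while shouldReport(1, 4, 4) is false, matching the behavior described for the long-to-int change in b.cc.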
This approach has a drawback when a global variable is defined in a non-instrumented module and an instrumented module, and the linker selects the non-instrumented component.
The variable metadata references the interposable variable symbol. If an instrumented global variable is interposed by an uninstrumented one, the runtime may poison bytes not belonging to the global variable. Since poisoning writes to shadow memory, this is usually benign. However, global variable instrumentation increases the alignment of a global variable (to at least 32) and checks that the metadata-referenced variable symbol has an alignment of at least shadow granularity (8). If the referenced variable symbol resolves to a non-instrumented module, the alignment check may fail (if the symbol is less aligned) and in this case the runtime reports a bogus odr-violation error as well.
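The alignment condition itself is simple arithmetic. The following sketch assumes the shadow granularity of 8 mentioned above; passesAlignmentCheck is a hypothetical name and the addresses in the usage note are made up.

```cpp
#include <cstdint>

// Shadow granularity used by the alignment check described above.
constexpr uintptr_t kShadowGranularity = 8;

// Hypothetical helper: true if the address that the metadata symbol resolves
// to is aligned to the shadow granularity.
bool passesAlignmentCheck(uintptr_t resolvedAddress) {
  return resolvedAddress % kShadowGranularity == 0;
}
```

An instrumented definition aligned to 32 (say at 0x4020) passes, whereas a char placed right after another char in a non-instrumented object (say at 0x4011) fails, and that failure is what surfaces as the bogus odr-violation report.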
Let's see an example. I add a dummy variable to make var not 8-byte aligned in a.o (this is not guaranteed but works in practice).
echo 'char pad, var; int main() { return pad + var; }' > a.cc
% ./a
ODR indicator
http://reviews.llvm.org/D15642 introduced a new mode: for a variable var, a one-byte variable __odr_asan_gen_var is created with the original linkage (essentially only external). If var is defined in two instrumented modules, their __odr_asan_gen_var symbols refer to the same copy due to symbol interposition. When registering var, the runtime sets the associated __odr_asan_gen_var to 1. Before doing so, it checks whether __odr_asan_gen_var is already 1; if so, the variable has an ODR violation.
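The indicator protocol can be sketched in a few lines. ToyGlobal and registerWithIndicator are hypothetical names, and the shared byte in the usage note stands in for the single __odr_asan_gen_var copy that symbol interposition selects.

```cpp
#include <cstdint>

// Toy metadata for one instrumented definition of a global variable.
struct ToyGlobal {
  const char *name;
  uint8_t *odrIndicator;  // address of the interposable one-byte indicator
};

// Returns true if the indicator was already set, i.e. another instrumented
// module has registered a variable of the same name.
bool registerWithIndicator(const ToyGlobal &g) {
  if (*g.odrIndicator == 1)
    return true;
  *g.odrIndicator = 1;
  return false;
}
```

If two ToyGlobal entries for var point at the same indicator byte, the first registration sets it and the second one reports the violation. No shadow memory is consulted, which is why this scheme does not depend on where the variable itself resolves.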
To prevent the metadata-referenced symbol from being interposed by another component, a private alias for var is created and referenced in the metadata. This ensures that the metadata refers to the module's own copy.
echo 'int var; int main() { return var; }' > a.cc
For Clang 16, I landed https://reviews.llvm.org/D137227 to use -fsanitize-address-use-odr-indicator by default for non-Windows targets.
KCFI
Clang has recently implemented a forward-edge control flow integrity instrumentation which does not require link-time optimization: KCFI. One side product of this feature is related to ODR violation detection.
For an address-taken function, a weak absolute symbol __kcfi_typeid_<function> is defined. The symbol is weak, but imagine that it were STB_GLOBAL instead: a linker could then detect differing values. GNU ld has a hack whereby duplicate absolute definitions with the same value do not trigger an error, and ld.lld has ported the behavior. While such a scheme would work, using magic symbols is not proper usage of a linker and I would object to such an attempt.
Future direction
As Clang has implemented the heavy lifting of ODR hashes, we can implement a feature that collects the hashes into a custom section. We can then change lld to scan this section and find differing values.
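What such a scan might look like is sketched below. Everything here is an assumption made for illustration: no section name or record layout has been defined, so OdrRecord (a name hash followed by an ODR hash, both 64-bit, in host byte order), encodeRecords, and scanOdrSections are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record layout for the custom section.
struct OdrRecord {
  uint64_t nameHash;  // identifies the entity
  uint64_t odrHash;   // hash of its definition
};

struct Mismatch {
  uint64_t nameHash;
  std::string firstFile, secondFile;
};

using Input = std::pair<std::string, std::vector<uint8_t>>;  // file, section bytes

// Helper for building test inputs: serializes records into raw section bytes.
std::vector<uint8_t> encodeRecords(const std::vector<OdrRecord> &records) {
  std::vector<uint8_t> bytes(records.size() * sizeof(OdrRecord));
  if (!bytes.empty())
    std::memcpy(bytes.data(), records.data(), bytes.size());
  return bytes;
}

// Scans every input's section and reports entities whose hash differs from
// the first one seen. Trailing partial records are ignored.
std::vector<Mismatch> scanOdrSections(const std::vector<Input> &inputs) {
  std::map<uint64_t, std::pair<uint64_t, std::string>> first;
  std::vector<Mismatch> out;
  for (const Input &in : inputs) {
    const std::vector<uint8_t> &data = in.second;
    for (size_t off = 0; off + sizeof(OdrRecord) <= data.size();
         off += sizeof(OdrRecord)) {
      OdrRecord r;
      std::memcpy(&r, data.data() + off, sizeof r);
      auto res = first.emplace(r.nameHash, std::make_pair(r.odrHash, in.first));
      if (!res.second && res.first->second.first != r.odrHash)
        out.push_back({r.nameHash, res.first->second.second, in.first});
    }
  }
  return out;
}
```

Given b.o and c.o that both carry a record for the same name hash but with different ODR hashes, the scan returns one Mismatch naming both files, which is the information a linker diagnostic would need.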