Clang and GCC 4.9 implemented LeakSanitizer in 2013. LeakSanitizer (LSan) is a memory leak detector. It intercepts memory allocation functions and by default detects memory leaks at atexit time. The implementation lives purely in the runtime (compiler-rt/lib/lsan); no compiler instrumentation is needed.

LSan has very little architecture-specific code and supports many 64-bit targets. Some 32-bit targets (e.g. Linux arm/x86-32) are supported as well, but the false negative rate may be high: with a small pointer size, random data is more likely to look like a pointer to a live chunk. Every supported operating system needs to provide some way to "stop the world".
Usage

LSan can be used in 3 ways:

- Standalone (-fsanitize=leak)
- AddressSanitizer (-fsanitize=address)
- HWAddressSanitizer (-fsanitize=hwaddress)
The most common way to use LSan is clang -fsanitize=address (or GCC). For LSan-supported targets (#define CAN_SANITIZE_LEAKS 1), the AddressSanitizer (ASan) runtime enables LSan by default.
```sh
% cat a.c
#include <stdlib.h>
int main(void) {
  void **p = malloc(42); // direct leak: no remaining reference at exit
  *p = malloc(43);       // indirect leak: only reachable via the leaked chunk
}
% clang -fsanitize=address a.c
% ./a.out
...
Direct leak of 42 byte(s) in 1 object(s) allocated from:
    ...

Indirect leak of 43 byte(s) in 1 object(s) allocated from:
    ...
```
As a runtime-only feature, -fsanitize=leak is mainly for link actions. It does affect compile actions for the following C/C++ preprocessor feature.
```c
#if __has_feature(leak_sanitizer)
// code specific to builds with LeakSanitizer
#endif
```
Implementation overview

Standalone LSan intercepts malloc-family functions. It uses the templatized SizeClassAllocator{32,64} with chunk metadata. The interceptors record the allocation information (requested size, stack trace), as in the sketch below.
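Conceptually, an interceptor allocates a chunk and fills in per-chunk metadata. The following is a hypothetical simplification (AllocatorAllocate, MetadataFor, and CaptureStackTrace are made-up stand-ins); the real code lives in lsan_interceptors.cpp and lsan_allocator.cpp, and intercepts malloc itself rather than defining a new function:

```c
#include <stddef.h>
#include <stdint.h>

// Simplified stand-ins for the runtime's internals (hypothetical).
enum ChunkTag { kDirectlyLeaked, kIndirectlyLeaked, kReachable, kIgnored };
struct ChunkMetadata {
  size_t requested_size;
  uint32_t stack_trace_id;
  enum ChunkTag tag;
};
void *AllocatorAllocate(size_t size);       // SizeClassAllocator{32,64}
struct ChunkMetadata *MetadataFor(void *p); // chunk -> metadata mapping
uint32_t CaptureStackTrace(void);

// Allocate, then record the requested size and the allocation stack trace.
// A new chunk starts in the "directly leaked" state until a leak check
// proves it reachable.
void *lsan_malloc(size_t size) {
  void *p = AllocatorAllocate(size);
  struct ChunkMetadata *m = MetadataFor(p);
  m->requested_size = size;
  m->stack_trace_id = CaptureStackTrace();
  m->tag = kDirectlyLeaked;
  return p;
}
```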
AddressSanitizer already intercepts malloc-family functions. Its chunk metadata has extra bits for LSan.
By default, the common options detect_leaks and leak_check_at_exit are enabled. The runtime installs a hook with atexit which performs the leak check. Alternatively, the user can call __lsan_do_leak_check to request a leak check before exit.
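For example, a program linked with -fsanitize=leak can trigger a check itself using the declaration from <sanitizer/lsan_interface.h>:

```c
#include <sanitizer/lsan_interface.h>
#include <stdlib.h>

int main(void) {
  void *p = malloc(42);
  p = NULL;               // drop the only reference: the chunk is unreachable
  __lsan_do_leak_check(); // report leaks now instead of at atexit time
  // With a nonzero exitcode (default 23), the process exits upon leaks.
}
```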
Upon a leak check, the runtime performs a job very similar to a mark-and-sweep garbage collection algorithm. It suspends all threads ("stop the world") and scans root regions to find allocations reachable by the process. Root regions include:

- ignored allocations due to __lsan_ignore_object or __lsan_disable
- global regions (__lsan::ProcessGlobalRegions). On Linux, these refer to memory mappings due to writable PT_LOAD program headers. This ensures that allocations reachable from a global variable are not reported as leaks.
- for each thread: registers (default use_registers=1), the stack (default use_stack=1), thread-local storage (default use_tls=1), and additional pointers in the thread context
- root regions due to __lsan_register_root_region
- operating system specific allocations. This is currently macOS-specific: libdispatch and Foundation memory regions, etc.
The runtime uses a flood fill algorithm to find allocations reachable from a region. It is conservative and scans all aligned words that look like a pointer. (It is not feasible to determine whether a pointer-like word is used as an integer or floating point number rather than as a pointer.) If a word looks like a pointer into the heap (many 64-bit targets use [0x600000000000, 0x640000000000)), the runtime checks whether it refers to an active chunk. In Valgrind's terms, a reference can be a "start-pointer" (a pointer to the start of the chunk) or an "interior-pointer" (a pointer to the middle of the chunk).
Finally, the runtime iterates over all active allocations and reports leaks for unmarked allocations.
Metadata

Each allocation reserves 2 bits to record its state: leaked/reachable/ignored. For better diagnostics, "leaked" can be direct or indirect. If a chunk is marked as leaked, all chunks reachable from it are marked as indirectly leaked. (*p = malloc(43); in the very beginning example is an indirect leak.)
```c
enum ChunkTag {
  kDirectlyLeaked = 0, // default
  kIndirectlyLeaked = 1,
  kReachable = 2,
  kIgnored = 3
};
```
Standalone LSan uses this chunk metadata struct:
```c
struct ChunkMetadata {
  u8 allocated : 8; // Must be first.
  ChunkTag tag : 2;
#if SANITIZER_WORDSIZE == 64
  uptr requested_size : 54;
#else
  uptr requested_size : 32;
  uptr padding : 22;
#endif
  u32 stack_trace_id;
};
```
ASan just stores a 2-bit ChunkTag in its existing chunk metadata (__asan::ChunkHeader::lsan_tag). Similarly, HWASan stores a 2-bit ChunkTag in its existing chunk metadata (__hwasan::Metadata::lsan_tag).
Allocator

Both standalone LSan and ASan use the templatized SizeClassAllocator{32,64} (primary allocator) with a thread-local cache (SizeClassAllocator{32,64}LocalCache). The cache defines many size classes and maintains free lists for these classes. Upon an allocation request, if the free list for the requested size class is empty, the cache calls Refill to grab more chunks from SizeClassAllocator{32,64}. If the free list is not empty, the cache hands over a chunk from the free list to the user, as sketched below.
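A hypothetical, simplified shape of this fast/slow path split (the real code is SizeClassAllocator{32,64}LocalCache in sanitizer_common):

```c
#include <stddef.h>

// Simplified per-thread free list for one size class (hypothetical).
struct FreeList {
  void **chunks; // cached chunks owned by this thread
  size_t count;
};
// Slow path: takes a lock and grabs a batch of chunks from the shared
// SizeClassAllocator{32,64}.
void Refill(struct FreeList *list, unsigned class_id);

// Fast path: pop a chunk from the thread-local free list, refilling it from
// the shared allocator when it runs empty.
void *CacheAllocate(struct FreeList *lists, unsigned class_id) {
  struct FreeList *list = &lists[class_id];
  if (list->count == 0)
    Refill(list, class_id);
  return list->chunks[--list->count];
}
```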
SizeClassAllocator64 has a total allocator space of kSpaceSize bytes. The space is split into multiple regions of the same size (kRegionSize), each serving a single size class. When the cache calls Refill, SizeClassAllocator64 takes the region's lock and calls mmap to allocate a new memory mapping (and, if kMetadataSize != 0, another memory mapping for metadata). For an active allocation, it is very efficient to compute its index in the region and its associated metadata.
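The arithmetic is roughly the following (illustrative constants; a simplification of the real metadata lookup in sanitizer_allocator_primary64.h):

```c
#include <stddef.h>
#include <stdint.h>

// Illustrative values; the real ones depend on the size class and platform.
enum { kChunkSize = 128, kMetaSize = 16 };

// Each region serves one size class, so a chunk's index is a division, and
// its metadata is an array lookup (here assumed to start at meta_beg).
size_t ChunkIndex(uintptr_t addr, uintptr_t region_beg) {
  return (addr - region_beg) / kChunkSize;
}
void *ChunkMetadataAddr(uintptr_t addr, uintptr_t region_beg,
                        uintptr_t meta_beg) {
  return (void *)(meta_beg + ChunkIndex(addr, region_beg) * kMetaSize);
}
```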
Stop the world

On Linux, the runtime invokes the clone syscall to create a tracer thread. The tracer thread calls ptrace with PTRACE_ATTACH to stop every other thread.
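A heavily simplified sketch of the idea (error handling and the clone setup are omitted; the real code is the StopTheWorld implementation in sanitizer_stoptheworld_linux_libcdep.cpp). Note that this must run in a separate tracer task, since a task cannot ptrace threads of its own thread group:

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

// Attach to every thread of the target process. Each PTRACE_ATTACH stops one
// thread; waitpid confirms the stop before moving on.
void SuspendAllThreads(pid_t pid) {
  char path[64];
  snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
  DIR *d = opendir(path);
  struct dirent *e;
  while ((e = readdir(d))) {
    pid_t tid = (pid_t)atoi(e->d_name);
    if (tid <= 0) continue; // skip "." and ".."
    ptrace(PTRACE_ATTACH, tid, NULL, NULL);
    waitpid(tid, NULL, __WALL);
  }
  closedir(d);
}
```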
Runtime options

Specify the environment variable LSAN_OPTIONS to toggle runtime behaviors.

For standalone LSan, exitcode=23 is the default. The runtime calls _exit(23) upon leaks. For ASan-integrated LSan, exitcode=1 is the default.
LSAN_OPTIONS=use_registers=0:use_stack=0:use_tls=0 can remove some default root regions.

leak_check_at_exit=0 disables registering an atexit hook for leak checking.

report_objects=1 reports the addresses of individual leaked objects.
detect_leaks=0 disables all leak checking, including user-requested checks via __lsan_do_leak_check or __lsan_do_recoverable_leak_check. This is similar to defining extern "C" int __lsan_is_turned_off() { return 1; } in the program. Using standalone LSan with detect_leaks=0 has performance characteristics similar to using the pure SizeClassAllocator{32,64}: there is nearly no extra overhead except stack trace collection.
If __lsan_default_options is defined, its return value is parsed like LSAN_OPTIONS.
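For example (the hook is declared in <sanitizer/lsan_interface.h>; the option string here is just an illustration):

```c
// Baked-in defaults; LSAN_OPTIONS in the environment still takes precedence.
const char *__lsan_default_options(void) {
  return "report_objects=1:use_tls=0";
}
```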
Issue suppression

Call __lsan_ignore_object to ignore an allocation. __lsan_disable ignores all allocations for the current thread until __lsan_enable is called. The runtime still scans each ignored allocation: an allocation reachable from an ignored allocation is not considered a leak. See the example below.
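Both interfaces are declared in <sanitizer/lsan_interface.h>:

```c
#include <sanitizer/lsan_interface.h>
#include <stdlib.h>

int main(void) {
  void *p = malloc(42);
  __lsan_ignore_object(p); // never reported, even if unreachable at exit

  __lsan_disable();     // allocations below are ignored for this thread...
  void *q = malloc(43); // ...so this chunk is not reported either
  __lsan_enable();
  (void)p;
  (void)q;
}
```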
__lsan_register_root_region registers a region as a root. The runtime scans the intersection of the region and valid memory mappings (/proc/self/maps on Linux). LSAN_OPTIONS=use_root_regions=0 disables the registered regions.
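For example, an anonymous memory mapping is not a default root, so pointers stored only there would otherwise be missed:

```c
#include <sanitizer/lsan_interface.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
  void **arena = (void **)mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  __lsan_register_root_region(arena, 4096);
  arena[0] = malloc(42); // reachable via the root region: not reported
  // __lsan_unregister_root_region(arena, 4096) would undo the registration.
}
```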
We can provide a suppression file with LSAN_OPTIONS=suppressions=a.supp. The file contains one suppression rule per line, each rule being of the form leak:<pattern>. For a leak, the runtime checks every frame in the stack trace. A frame has a module name associated with the call site address (executable or shared object) and, when symbolization is available, a source file name and a function name. If any module name, source file name, or function name matches a pattern (substring matching using glob), the leak is suppressed.

Note: symbolization requires debug information and a symbolizer (the internal symbolizer (not built by default) or an llvm-symbolizer in a PATH directory).
Let's see an example.

```sh
cat > a.c <<'eof'
#include <stdlib.h>
void *foo() { return malloc(42); }
int main(void) { foo(); }
eof
clang -fsanitize=leak -g a.c
cat > a.supp <<'eof'
leak:foo
eof
LSAN_OPTIONS=suppressions=a.supp ./a.out  # the leak in foo is suppressed
```
Miscellaneous

Standalone LeakSanitizer can be used with SanitizerCoverage: clang -fsanitize=leak -fsanitize-coverage=func,trace-pc-guard a.c

The testsuite has been moved. Use git log -- compiler-rt/lib/lsan/lit_tests/TestCases/pointer_to_self.cc to do archaeology for old tests.
Valgrind's Memcheck performs leak checking at exit (--leak-check defaults to summary). The feature has been available since the initial revision (git log -- vg_memory.c) in 2002.

Google open sourced HeapLeakChecker as part of gperftools. It is part of TCMalloc and used with debugallocation.cc. Multi-threaded allocations need to grab a global lock and are much slower than with a sanitizer allocator.
heap_check_max_pointer_offset (default: 2048) specifies the largest offset within an allocation that is scanned for pointers. Pointers stored beyond this offset are missed, so the default makes it vulnerable to false positives.
heaptrack is a heap memory profiler which supports leak checking.