C/C++ projects can benefit from using precompiled headers to improve compile time. GCC added support for precompiled headers in 2003 (version 3.4), and the current documentation can be found at https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html.
This article focuses on Clang precompiled headers (PCH). Let's begin with an example.
cat > a1.cc <<'eof'
We compile b.hh using -c, just like we would compile a non-header file. Clang parses the file, performs semantic analysis, and writes the precompiled header (as a serialized AST file) into b.hh.pch.
When compiling a.cc, we use -include-pch to load the precompiled header as a prefix header. This means that the translation unit will get two copies of b.hh: one from b.hh.pch and one from the textual b.hh. The same applies to a1.cc. To avoid a redefinition of 'fb' error, b.hh should have a header guard or use #pragma once.
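The effect of the header guard can be reproduced in a single file. The following is only a sketch: both "copies" of the header are pasted inline, and the guard macro name B_HH and the body of fb are assumptions made for illustration, not the contents of the example files.

```cpp
// First copy: think of this as the declarations loaded from b.hh.pch.
// B_HH and the body of fb are hypothetical.
#ifndef B_HH
#define B_HH
inline int fb() { return 42; }
#endif

// Second copy: think of this as the textual #include "b.hh" in a.cc.
// B_HH is already defined, so the preprocessor skips the whole block
// and fb is not defined a second time.
#ifndef B_HH
#define B_HH
inline int fb() { return 42; }
#endif

// A caller sees exactly one definition of fb.
int call_fb() { return fb(); }
```

Removing the guard from both copies makes the compiler report a redefinition of fb, which is the error described above.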
Now, let's examine the steps in detail.
PCH generation
Given a header file as input, Clang determines the input type as either c-header (.h) or c++-header (.hh/.hpp/.hxx) based on the file extension.
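As a rough illustration of this mapping, here is a small sketch. The function header_input_type is hypothetical and is not Clang's driver code; it covers only the extensions listed above and returns an empty string for anything else.

```cpp
#include <string>

// Hypothetical helper: maps a header file name to the input type named in
// the text. Only .h, .hh, .hpp and .hxx are handled.
std::string header_input_type(const std::string &path) {
  const std::string::size_type dot = path.rfind('.');
  if (dot == std::string::npos)
    return "";  // no extension at all
  const std::string ext = path.substr(dot);
  if (ext == ".h")
    return "c-header";
  if (ext == ".hh" || ext == ".hpp" || ext == ".hxx")
    return "c++-header";
  return "";  // not a header extension covered by this sketch
}
```

For example, header_input_type("b.hh") yields "c++-header", while header_input_type("b.h") yields "c-header", which is why a C++ header named b.h needs an explicit -x option.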
For compilation actions, either clang or clang++ can be used. If we treat .h as a C++ header, we need to specify -xc++-header (e.g., clang -c -xc++-header b.h -o b.h.pch). (It's worth noting that the behavior of clang++ -c a.h, which treats the .h file as a C++ header, is deprecated. Other than that, the only significant difference between clang and clang++ is the linking process, specifically whether the C++ standard library is linked.)
When the input type is c-header or c++-header, the Clang driver selects the -emit-pch frontend action. (Note: c++-user-header/c++-system-header are used for C++ modules and have different functionality.)
Conventionally, the extension used for Clang precompiled headers is .pch (similar to MSVC). However, to match GCC, when the -o option is omitted, the default output file is input_file + ".gch" (see Driver::GetNamedOutputPath).
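The naming rule can be written down in a few lines. This is a sketch only: pch_output_path is a hypothetical helper that models the rule stated above, not the logic of Driver::GetNamedOutputPath, and an empty second argument stands for an omitted -o.

```cpp
#include <string>

// Hypothetical helper: returns the PCH output path for a header.
// An empty dash_o models a command line without -o.
std::string pch_output_path(const std::string &input,
                            const std::string &dash_o = "") {
  if (!dash_o.empty())
    return dash_o;        // -o was given: use it verbatim
  return input + ".gch";  // GCC-compatible default
}
```

So pch_output_path("b.hh") gives "b.hh.gch", and getting the conventional b.hh.pch name requires passing it explicitly.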
The frontend parses the file, performs semantic analysis, and writes the precompiled header (as a serialized AST file) (see PCHGenerator). For the serialized format, refer to Precompiled Header and Modules Internals.
Using PCH
-include-pch b.hh.pch (PreprocessorOptions::ImplicitPCHInclude) loads the precompiled header b.hh.pch as a prefix header.
We can also write -include b.hh, and Clang will probe b.hh.pch/b.hh.gch and use the file if present. This is a behavior ported from GCC.
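The probing can be pictured with a toy model. Everything here is an assumption made for illustration: probe_pch is a hypothetical function, the set of strings stands in for the file system, and the probe order simply follows the order in which the two names are listed above.

```cpp
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical model of the -include probe. Returns the precompiled header
// that would be used, or an empty string if the textual header is used.
std::string probe_pch(const std::string &header,
                      const std::set<std::string> &existing_files) {
  for (const char *suffix : {".pch", ".gch"}) {
    const std::string candidate = header + suffix;
    if (existing_files.count(candidate) != 0)
      return candidate;  // first match wins in this sketch
  }
  return "";  // no precompiled header next to the textual one
}
```

The point of the model is that the same -include b.hh command line can take two different paths depending on which files happen to exist.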
-include-pch may specify a directory. Clang will search for a suitable precompiled header in the directory (see ASTReader::isAcceptableASTFile). The directory may contain precompiled headers for different compiler options. This is another behavior ported from GCC.
echo 'extern int X;' > d.hh
% clang++ -c -DX=z -include d.hh e.cc
PCH validation
When we generate and use a precompiled header with different compiler options, the behavior will be a combination of those options. Consequently, the behavior of -include b.hh may differ depending on the presence of b.hh.pch/b.hh.gch.
To identify this common pitfall, Clang performs PCH validation (see PCHValidator) to check for inconsistent options, similar to how MSVC handles it. The validated options include those that can affect AST generation, such as language options (-std=), target options (-triple), file system options, header search options, and preprocessor options.
Modules employ the same validation mechanism, but PCH validation is stricter (!AllowCompatibleConfigurationMismatch). This means that COMPATIBLE_LANGOPT/COMPATIBLE_ENUM_LANGOPT/COMPATIBLE_VALUE_LANGOPT options (e.g., whether the built-in macro __OPTIMIZE__ is defined) must match as well.
If either the precompiled header or the user code is compiled with a -D option, the other side should either use the same -D option or omit it entirely.
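The rule for -D can be stated as a small predicate. This sketch encodes only the sentence above, not the logic of PCHValidator; the names MacroDefs and defines_agree are hypothetical.

```cpp
#include <map>
#include <string>

// Macro name -> value, as given by -DNAME=VALUE on one side.
using MacroDefs = std::map<std::string, std::string>;

// Hypothetical check: a macro defined on both sides must have the same
// value; a macro defined on only one side is accepted.
bool defines_agree(const MacroDefs &pch, const MacroDefs &user) {
  for (const auto &entry : pch) {
    const auto it = user.find(entry.first);
    if (it != user.end() && it->second != entry.second)
      return false;  // same macro, conflicting values
  }
  return true;
}
```

Under this model, building the PCH with -DB=1 and the user code with -DB=2 is rejected, while -DB=1 on one side and no -DB on the other is accepted.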
clang -c -xc++-header b.h -o b.pch -DB=1
Performance optimization
In order to achieve better performance, it is possible to make certain compromises on properties such as language standard conformance.
-fpch-instantiate-templates
-fpch-instantiate-templates allows pending template instantiations to be performed in the PCH file. This means that these instantiations do not need to be repeated in every translation unit that includes the PCH file. This optimization can significantly improve the speed of certain projects. However, the option changes the instantiation points of certain function templates, which is non-conforming. Nevertheless, the altered behavior is generally harmless in most cases.
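To make "pending template instantiation" concrete, consider header content like the following sketch. The names twice and use_twice are hypothetical and unrelated to the article's example files.

```cpp
// Imagine these two definitions live in a header that is precompiled.
template <typename T>
T twice(T x) {
  return x + x;
}

// This inline function uses twice<int>, so the header leaves an
// instantiation of twice<int> pending at its end.
inline int use_twice() { return twice(21); }
```

Without the option, each translation unit that loads the PCH instantiates twice<int> again. With -fpch-instantiate-templates, the instantiation is performed once, when the PCH is generated.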
#ifndef HEADER
% clang++ -c -xc++-header a.cc -o a.pch
-fpch-codegen
Modular code generation was initially implemented for modules. It was later extended to support precompiled headers by https://reviews.llvm.org/D69778. To utilize this feature, you can specify -Xclang -fmodules-codegen as a command-line option or use the driver option -fpch-codegen.
When generating a serialized AST file for PCH or modules, Clang identifies non-always-inline functions that do not depend on template parameters and have linkages other than GVA_Internal or GVA_AvailableExternally. These functions are then serialized (see ASTWriter::ModularCodegenDecls).
In an importer that encounters such a definition, the linkage is adjusted to GVA_AvailableExternally. This allows for discarding of the definition if it is not required for optimizations.
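The selection rule and the importer-side adjustment can be summarized in a toy model. This is a sketch of the prose above, not Clang's implementation: the enumerator names are the linkage names used in this article, while FunctionInfo, selected_for_modular_codegen and linkage_in_importer are hypothetical.

```cpp
// Linkage kinds as named in the text.
enum GVALinkage {
  GVA_Internal,
  GVA_AvailableExternally,
  GVA_DiscardableODR,
  GVA_StrongExternal,
  GVA_StrongODR
};

// Hypothetical summary of the properties the rule looks at.
struct FunctionInfo {
  GVALinkage linkage;
  bool always_inline;
  bool depends_on_template_params;
};

// True if the definition would be recorded for modular code generation.
bool selected_for_modular_codegen(const FunctionInfo &f) {
  return !f.always_inline && !f.depends_on_template_params &&
         f.linkage != GVA_Internal && f.linkage != GVA_AvailableExternally;
}

// Linkage an importer would use for the same definition.
GVALinkage linkage_in_importer(const FunctionInfo &f) {
  return selected_for_modular_codegen(f) ? GVA_AvailableExternally
                                         : f.linkage;
}
```

In this model an ordinary inline function (GVA_DiscardableODR) is selected and becomes GVA_AvailableExternally in importers, while a function with internal linkage is left alone.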
Let's consider an example using the files a.cc, a1.cc, and b.h from the initial example provided at the beginning of this article.
echo 'module b { header "b.h" }' > module.modulemap
Both a.cc and a1.cc include b.h and obtain an inline definition of fb. In a regular build, the fb definition has GVA_DiscardableODR linkage and is compiled twice into a.o and a1.o. These duplicate definitions are then deduplicated by the linker, following COMDAT semantics.
In a modular code generation build, fb is assigned GVA_StrongODR linkage in b.pcm and is emitted into b.o. The copies of fb in a.cc and a1.cc are adjusted to GVA_AvailableExternally. They are used for optimizations by callers but are not emitted otherwise. In a -O0 build, the GVA_AvailableExternally definitions are simply discarded. Regardless, both the code generator and the linker have reduced work, resulting in decreased build time.
However, there are two primary differences in behavior.
First, if b.h contains a GVA_StrongExternal definition, a regular build will encounter a linker error due to a duplicate symbol, because both a.o and a1.o define it. However, in the prebuilt modules build using -fmodules-codegen, this error does not occur.
Second, in a regular build, if fb is unused, no translation unit will contain its COMDAT definition. On the other hand, in the prebuilt modules build using -fmodules-codegen, we compile the prebuilt module b.pcm into b.o and link b.o into the executable, always resulting in a definition of fb, even when using -O1 or above. To discard fb, linker garbage collection can be leveraged by using -ffunction-sections -Wl,--gc-sections.
If b.h contains an inline variable with an initializer involving a side effect (e.g., inline int vb = (puts("vb"), 1);), the modular code generation build will always observe the side effect. In contrast, a regular build may not observe the side effect if, for example, the containing header is not included in any translation unit.
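Here is a minimal sketch of such a variable, requiring C++17 for inline variables. The accessor read_vb is hypothetical and exists only so that other code can use the variable.

```cpp
#include <cstdio>

// The parentheses matter: they make the initializer a comma expression,
// so puts runs as a side effect and vb is initialized to 1. Without them,
// the comma would be parsed as a separator between declarators.
inline int vb = (std::puts("vb"), 1);

// Hypothetical accessor used by the rest of the program.
int read_vb() { return vb; }
```

In a regular build the call to puts only happens if some translation unit in the program actually includes the definition. In a modular code generation build the definition is compiled into b.o, so linking b.o is enough for the side effect to take place.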
Nevertheless, these behavior differences are almost always benign, and the speedup gained in build time may outweigh the downsides.
Note: Clang exhibits a similar behavior when compiling module interface units and module partitions for strong definitions.
Modular code generation was extended to support PCH in Clang 11.
We specify -fpch-codegen to make the driver pass -fmodules-codegen to the frontend.
clang++ -c -xc++-header -fpch-codegen b.h -o b.pch
When using -fpch-codegen, unlike when using PCH without it, it is necessary to compile the PCH file b.pch into b.o and link b.o into the executable. If b.o is not linked, a linker error for an undefined fb() will occur.
-fpch-debuginfo
-fpch-debuginfo serves a similar purpose to -fpch-codegen, but specifically for debug information descriptions of types.
Here is an example of using MSVC-style precompiled headers with clang-cl (a CL-style driver mode):
clang-cl /c /Ycb.hh a.cc
Let's consider the initial example provided at the beginning of this article.
The /Ycb.hh option instructs the Clang driver to perform two frontend actions. First, Clang parses the base source file a.cc up to and including #include "b.hh", performs semantic analysis, and writes the precompiled header into b.pch. It replaces the header file extension with .pch, unlike GCC. Second, Clang compiles a.cc using -include-pch b.pch, but it skips preprocessing tokens up to and including #include "b.hh" (see Preprocessor::SkippingUntilPCHThroughHeader).
The /Yub.hh option is similar to the second frontend action of /Ycb.hh. It compiles a1.cc using -include-pch b.pch, but it also skips preprocessing tokens up to and including #include "b.hh".
Internally, /Ycb.hh and /Yub.hh instruct the driver to pass -pch-through-header=b.hh to the frontend. This helps Clang detect common pitfalls by examining whether the source file contains the #include "b.hh" directive.
It is also possible to use /Yc and /Yu without specifying a filename. In this case, the precompiled header region is determined by #pragma hdrstop or the end of the source file. For more details, refer to /Yc (Create Precompiled Header File).
Additionally, when using clang-cl /Yc, the cc1 option -building-pch-with-obj is passed to the frontend to serialize dllexport declarations.
TODO: PCH signature and linker
Precompiled preamble
TODO
-fno-pch-timestamp
TODO