Control-flow integrity

UNDER CONSTRUCTION

Control-flow integrity (CFI) refers to techniques which prevent control-flow hijacking attacks. This article describes some compiler/hardware features with a focus on llvm-project implementations.

CFI is commonly divided into forward-edge CFI (e.g. indirect calls) and backward-edge CFI (e.g. function returns). AIUI, exception handling and symbol interposition are not categorized.

Let's start with backward-edge CFI. Fine-grained technologies check that a return address refers to a possible caller. This is unimplemented and the additional guarantee is possibly less useful.

Coarse-grained technologies just check that return addresses have not been tampered with. Return addresses are typically stored in a memory region called the "stack", along with function arguments, local variables, and register save areas. Stack smashing is an attack that overwrites the return address to hijack the control flow. The name was popularized by Aleph One (Elias Levy)'s paper Smashing The Stack For Fun And Profit.

StackGuard/Stack Smashing Protector

StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks (1998) detects hijacked return addresses on the stack.

GCC 4.1 implemented -fstack-protector and -fstack-protector-all. More variants were added over the years, e.g. -fstack-protector-strong in GCC 4.9.
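
The check that these options insert can be modeled in plain C. The following is only a sketch under stated assumptions: the frame layout (buffer, then canary, then return address) and the names `struct frame`, `frame_enter`, `unchecked_copy`, and `frame_intact` are hypothetical, and the real instrumentation is emitted by the compiler rather than written by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a protected frame: the canary sits between the
   local buffer and the saved return address. */
struct frame {
  char buf[8];
  uintptr_t canary;
  uintptr_t return_address;
};

/* Models the prologue: place a copy of the guard value in the frame. */
static void frame_enter(struct frame *f, uintptr_t guard, uintptr_t ret) {
  memset(f->buf, 0, sizeof f->buf);
  f->canary = guard;
  f->return_address = ret;
}

/* Models an unchecked copy that starts at buf and may run past its end. */
static void unchecked_copy(struct frame *f, const void *src, size_t n) {
  assert(n <= sizeof *f); /* stay inside the model */
  memcpy(f, src, n);
}

/* Models the epilogue check: returns 1 if the canary is unchanged. */
static int frame_intact(const struct frame *f, uintptr_t guard) {
  return f->canary == guard;
}
```

A linear overflow that reaches `return_address` has to cross `canary` first, so `frame_intact` returns 0 before the corrupted return address would be used.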

Retguard

OpenBSD introduced Retguard in 2017. See RETGUARD and stack canaries. It is more effective than StackGuard, at the cost of more expensive function prologues/epilogues.

For each instrumented function, a cookie is allocated from a pool of 4000 entries. In the prologue, return_address ^ cookie is pushed next to the return address. The epilogue pops the XOR value and the return address and verifies that they match ((value ^ cookie) == return_address).

Not encrypting the return address directly is important to preserve return address prediction for the CPU. The two int3 instructions are to disrupt ROP gadgets which may form from je ...; retq (02 cc cc c3 is addb %ah, %cl; int3; retq). If a static branch predictor exists (likely not-taken for a forward branch) the initial prediction is likely wrong.

// From https://www.openbsd.org/papers/eurobsdcon2018-rop.pdf
// prologue
ffffffff819ff700: 4c 8b 1d 61 21 24 00 mov 2367841(%rip),%r11 # <__retguard_2759>
ffffffff819ff707: 4c 33 1c 24          xor (%rsp),%r11
ffffffff819ff70b: 55                   push %rbp
ffffffff819ff70c: 48 89 e5             mov %rsp,%rbp
ffffffff819ff70f: 41 53                push %r11 

// epilogue
ffffffff8115a457: 41 5b                pop %r11
ffffffff8115a459: 5d                   pop %rbp
ffffffff8115a45a: 4c 33 1c 24          xor (%rsp),%r11
ffffffff8115a45e: 4c 3b 1d 03 74 ae 00 cmp 11432963(%rip),%r11 # <__retguard_2759>
ffffffff8115a465: 74 02                je ffffffff8115a469
ffffffff8115a467: cc                   int3
ffffffff8115a468: cc                   int3
ffffffff8115a469: c3                   retq
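
The XOR arithmetic in the listing can be restated in C. This is a sketch only: `retguard_cookie` is a hypothetical stand-in for the per-function `__retguard_N` value, the function names are made up for illustration, and `assert` stands in for the `int3` trap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-function cookie (__retguard_N). */
static const uintptr_t retguard_cookie = (uintptr_t)0x9e3779b97f4a7c15ULL;

/* Prologue: compute return_address ^ cookie; the result is saved in the
   frame (push %r11 in the listing). */
static uintptr_t retguard_prologue(uintptr_t return_address) {
  return return_address ^ retguard_cookie;
}

/* Epilogue: XOR the saved value with the return address currently on the
   stack and compare against the cookie. Returns 1 on a match. */
static int retguard_epilogue(uintptr_t saved, uintptr_t return_address) {
  return (saved ^ return_address) == retguard_cookie;
}

/* Models je/int3/int3/retq: return only if the check passes. */
static uintptr_t retguard_return(uintptr_t saved, uintptr_t return_address) {
  assert(retguard_epilogue(saved, return_address));
  return return_address;
}
```

Overwriting either the saved value or the return address alone makes the comparison fail; an attacker would need to know the cookie to produce a consistent pair.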

SafeStack

Code-Pointer Integrity (2014) proposed an instrumentation which was merged into LLVM in 2015. Use clang -fsanitize=safe-stack.

The pass moves some stack objects into a separate stack (normally referenced by a thread-local variable __safestack_unsafe_stack_ptr or via a function call __safestack_pointer_address). The moved objects are those that the pass cannot prove to be free of out-of-bounds accesses (the analysis relies mainly on ScalarEvolution).

As an example, the local variable a below is moved to the unsafe stack as there is a risk that bar may have out-of-bounds accesses.

void bar(int *);
void foo() {
  int a;
  bar(&a);
}

foo:                                    # @foo
# %bb.0:                                # %entry
        pushq   %r14
        pushq   %rbx
        pushq   %rax
        movq    __safestack_unsafe_stack_ptr@GOTTPOFF(%rip), %rbx
        movq    %fs:(%rbx), %r14
        leaq    -16(%r14), %rax
        movq    %rax, %fs:(%rbx)
        leaq    -4(%r14), %rdi
        callq   bar@PLT
        movq    %r14, %fs:(%rbx)
        addq    $8, %rsp
        popq    %rbx
        popq    %r14
        retq
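
The pointer manipulation in the assembly can be mirrored in C. This is a simplified sketch: `unsafe_stack` and `unsafe_stack_ptr` are hypothetical stand-ins for the runtime's unsafe stack and the thread-local `__safestack_unsafe_stack_ptr` (modeled here as ordinary statics), `bar` gets a made-up body, and `foo_model` returns the value only so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unsafe stack, counted in ints (4 ints model 16 bytes). */
static int unsafe_stack[1024];
static int *unsafe_stack_ptr = unsafe_stack + 1024; /* grows downwards */

/* Made-up callee: writes through the escaped pointer. */
static void bar(int *p) { *p = 42; }

static int foo_model(void) {
  int *saved = unsafe_stack_ptr;   /* movq %fs:(%rbx), %r14  */
  int *frame = saved - 4;          /* leaq -16(%r14), %rax   */
  assert(frame >= unsafe_stack);   /* the model has no guard page */
  unsafe_stack_ptr = frame;        /* movq %rax, %fs:(%rbx)  */
  int *a = saved - 1;              /* leaq -4(%r14), %rdi    */
  bar(a);                          /* callq bar              */
  int value = *a;
  unsafe_stack_ptr = saved;        /* movq %r14, %fs:(%rbx)  */
  return value;
}
```

Because `a` lives on the unsafe stack, an out-of-bounds write in `bar` cannot reach the return address, which stays on the regular stack.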

compiler-rt/lib/safestack/ provides the runtime.

Shadow call stack

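A shadow stack keeps a second copy of each return address in a separate memory region. On return, this copy can be compared with the return address on the regular stack. If the two do not match, it indicates that the return address was altered.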

Userspace

`-fsanitize=shadow-call-stack`

See https://clang.llvm.org/docs/ShadowCallStack.html. The instrumentation stores the return address in a shadow stack during a function call. Upon return, the return address is popped from the shadow stack. The return address is also stored on the regular stack for return address prediction and compatibility with unwinders, but is otherwise unused.
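
The behavior described above can be sketched as a small C model. Everything here is hypothetical (the array `scs`, its depth, and the function names); the real instrumentation keeps the shadow stack pointer in a reserved register and is emitted by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCS_DEPTH 64

/* Hypothetical shadow call stack. */
static uintptr_t scs[SCS_DEPTH];
static size_t scs_top;

static void scs_push(uintptr_t return_address) {
  assert(scs_top < SCS_DEPTH);
  scs[scs_top++] = return_address;
}

static uintptr_t scs_pop(void) {
  assert(scs_top > 0);
  return scs[--scs_top];
}

/* Models one instrumented call and returns the address the function
   returns to. If smash is nonzero, the copy on the regular stack is
   overwritten while the function runs. */
static uintptr_t call_and_return(uintptr_t return_address, int smash) {
  uintptr_t regular_stack_copy = return_address; /* kept for unwinders */
  scs_push(return_address);                      /* prologue */
  if (smash)
    regular_stack_copy = 0xdeadbeef;             /* attacker's overwrite */
  (void)regular_stack_copy;                      /* epilogue ignores it */
  return scs_pop();                              /* epilogue */
}
```

In this model, the overwritten copy is never consulted, so control returns to the original caller either way.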

To use this for AArch64, run clang --target=aarch64-unknown-linux-gnu -fsanitize=shadow-call-stack -ffixed-x18. This is implemented for RISC-V as well. Interestingly, you can still use -ffixed-x18, as x18 (aka s2) is a callee-saved register.

GCC 12.0 ported the feature.

In the Linux kernel, select CONFIG_SHADOW_CALL_STACK to use this technology.

Hardware-assisted

Intel Control-flow Enforcement Technology and AMD Shadow Stack

Shadow stacks are supported by Intel's 11th Gen processors and AMD Zen 3.

A RET instruction pops the return address from both the regular stack and the shadow stack, and compares them. A control protection exception (#CP) is raised in case of a mismatch.
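
As a rough illustration of the RET semantics described above, the following C model pops from both stacks and compares. The type and function names (`struct cpu_model`, `model_call`, `model_ret`, `RET_CP_FAULT`) are hypothetical; on real hardware the comparison is performed by the CPU and the shadow stack is not writable by ordinary stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEPTH 16

enum ret_status { RET_OK, RET_CP_FAULT }; /* RET_CP_FAULT models #CP */

struct cpu_model {
  uintptr_t stack[DEPTH];  size_t sp;  /* regular stack */
  uintptr_t shadow[DEPTH]; size_t ssp; /* shadow stack */
};

/* CALL pushes the return address onto both stacks. */
static void model_call(struct cpu_model *c, uintptr_t return_address) {
  assert(c->sp < DEPTH && c->ssp < DEPTH);
  c->stack[c->sp++] = return_address;
  c->shadow[c->ssp++] = return_address;
}

/* RET pops both copies; *target is written only if they match. */
static enum ret_status model_ret(struct cpu_model *c, uintptr_t *target) {
  assert(c->sp > 0 && c->ssp > 0);
  uintptr_t from_stack = c->stack[--c->sp];
  uintptr_t from_shadow = c->shadow[--c->ssp];
  if (from_stack != from_shadow)
    return RET_CP_FAULT;
  *target = from_stack;
  return RET_OK;
}
```

Unlike the software shadow call stack, a mismatch here is detected and reported rather than silently ignored.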

On Windows the technology is used by Hardware-enforced Stack Protection. In the MSVC linker, /cetcompat marks an executable image as compatible with Control-flow Enforcement Technology (CET) Shadow Stack.

ARMv8.3 Pointer Authentication

Instructions are provided to sign a pointer with a 64-bit context value (usually zero, X16, or SP) and a 128-bit secret key. The instructions are allocated from the HINT space for compatibility with older CPUs.

A major use case is to sign/authenticate return addresses with PACIASP/AUTIASP.
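
The sign/authenticate round trip can be illustrated with a toy C model. This is emphatically not the real algorithm: the XOR-fold below is trivially forgeable, the key is shortened to 64 bits, the 48-bit address width is an assumption, and all names (`toy_mac`, `toy_sign`, `toy_auth`) are hypothetical. It only shows the idea that a code derived from the pointer, the context, and a key is stored in the unused upper bits and checked before use.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 48 /* assumed virtual address width */
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Toy 16-bit code: XOR of the four 16-bit chunks of addr ^ context ^ key.
   A stand-in for the real keyed cipher, for illustration only. */
static uint64_t toy_mac(uint64_t addr, uint64_t context, uint64_t key) {
  uint64_t x = addr ^ context ^ key;
  return (x ^ (x >> 16) ^ (x >> 32) ^ (x >> 48)) & 0xffff;
}

/* Models PACIASP: place the code in the upper bits of the pointer. */
static uint64_t toy_sign(uint64_t ptr, uint64_t context, uint64_t key) {
  assert((ptr & ~ADDR_MASK) == 0);
  return ptr | (toy_mac(ptr, context, key) << ADDR_BITS);
}

/* Models AUTIASP: returns 1 and stores the stripped pointer in *out if
   the code matches, otherwise returns 0. */
static int toy_auth(uint64_t signed_ptr, uint64_t context, uint64_t key,
                    uint64_t *out) {
  uint64_t addr = signed_ptr & ADDR_MASK;
  if ((signed_ptr >> ADDR_BITS) != toy_mac(addr, context, key))
    return 0;
  *out = addr;
  return 1;
}
```

Using SP as the context ties a signed return address to the stack frame it was created in, so replaying it with a different SP value fails authentication in this model.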

ld.lld added support in https://reviews.llvm.org/D62609 (with substantial changes afterwards). If -z pac-plt is specified, autia1716 is used for a PLT entry. A relocatable file with .note.gnu.property and the GNU_PROPERTY_AARCH64_FEATURE_1_PAC bit cleared gets a warning.

Now let's discuss forward-edge CFI technologies.

`-fsanitize=cfi`

See https://clang.llvm.org/docs/ControlFlowIntegrity.html. Clang has implemented a number of CFI schemes under this umbrella option. They all rely on link-time optimizations.

Control Flow Guard

Windows 8.1 introduced Control Flow Guard. It was implemented in llvm-project in 2019. Use clang-cl /guard:cf or clang --target=x86_64-pc-windows-gnu -mguard=cf.

The compiler instruments indirect calls to call a global function pointer (___guard_check_icall_fptr or __guard_dispatch_icall_fptr) and records valid indirect call targets in special sections (.gfids$y, .giats$y, .gljmp$y). An instrumented file with applicable functions defines the @feat.00 symbol with at least one bit of 0x4800.

The linker combines the sections, marks additional symbols (e.g. /entry), and creates address tables (__guard_fids_table, __guard_iat_table, __guard_longjmp_table (unless /guard:nolongjmp), __guard_eh_cont_table).

At run-time, the global function pointer refers to a function which verifies that an indirect call target is valid.

leaq   target(%rip), %rax
callq  *%rax

=>

leaq   target(%rip), %rax
callq  *__guard_dispatch_icall_fptr(%rip)
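
The run-time check can be approximated in C. This is a simplified sketch: `guard_table` is a hypothetical stand-in for the linker-generated table, `guard_check` does a linear search for clarity, the handler functions are made up, and a failed check is reported through a return value here whereas the real check would terminate the process.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x) { return 2 * x; }
static int not_registered(int x) { return -x; }

/* Hypothetical stand-in for the table of valid indirect call targets. */
static const handler_fn guard_table[] = {add_one, twice};

/* Returns 1 if target is a recorded indirect call target. */
static int guard_check(handler_fn target) {
  for (size_t i = 0; i < sizeof guard_table / sizeof guard_table[0]; i++)
    if (guard_table[i] == target)
      return 1;
  return 0;
}

/* Models the global function pointer that instrumented code calls. */
static int (*guard_check_icall_fptr)(handler_fn) = guard_check;

/* Models an instrumented indirect call. Returns 0 without calling the
   target if the check fails. */
static int checked_call(handler_fn target, int arg, int *result) {
  assert(result != NULL);
  if (!guard_check_icall_fptr(target))
    return 0;
  *result = target(arg);
  return 1;
}
```

Routing the check through a function pointer lets the loader decide at run time which check routine is used, without recompiling the instrumented code.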

FineIBT

`-fsanitize=kcfi`

Introduced to llvm-project in 2022-11 (milestone: 16.0.0). I was glad to be a reviewer.

Hardware-assisted

Intel Indirect Branch Tracking

This is part of Intel Control-flow Enforcement Technology. When enabled, the CPU ensures that every indirect branch lands on a special instruction (endbr32 or endbr64), otherwise a control-protection (#CP) exception is raised.
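
The landing-pad rule can be illustrated by checking the bytes at a branch target. The encodings `f3 0f 1e fa` (endbr64) and `f3 0f 1e fb` (endbr32) are the architectural ones; the function `ibt_target_ok` and its parameters are hypothetical, and real hardware enforces the rule with an internal state machine rather than a byte comparison in software.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const unsigned char ENDBR64[4] = {0xf3, 0x0f, 0x1e, 0xfa};
static const unsigned char ENDBR32[4] = {0xf3, 0x0f, 0x1e, 0xfb};

/* Returns 1 if an indirect branch to code[offset] would land on the
   endbr instruction for the given mode, 0 if it would raise #CP. */
static int ibt_target_ok(const unsigned char *code, size_t size,
                         size_t offset, int is_64bit) {
  assert(code != NULL);
  if (offset > size || size - offset < 4)
    return 0;
  const unsigned char *expected = is_64bit ? ENDBR64 : ENDBR32;
  return memcmp(code + offset, expected, 4) == 0;
}
```

For a function that begins with `endbr64; push %rbp; mov %rsp,%rbp; ret`, only offset 0 is accepted in 64-bit mode; a branch into the middle of the function is rejected.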

ld.lld added support in 2020-01:

PLT entries need endbr
If all relocatable files with .note.gnu.property have the GNU_PROPERTY_X86_FEATURE_1_SHSTK bit set, or -z shstk is specified, the output will have the bit.


ARMv8.5 Branch Target Identification