Advanced C++ Workshop 2023 (02)
View all talks at Advanced C++ Workshop (CSCS).
This talk covers the basics of the C++ build infrastructure and related resources.
Build Essentials
Working with C++, as with any compiled language, requires a complete toolchain. A typical toolchain includes a compiler (e.g., g++/clang++/MSVC), an assembler, a linker, and binary utilities (e.g., archivers and inspection tools such as ar, nm, objdump/llvm-objdump). In addition, most real projects rely on a build system to manage dependencies and configurations.
One exception is Cling, the C++ interpreter used in ROOT. It is built on top of Clang and LLVM and JIT-compiles C++ so that code can be executed interactively, as if interpreted.
Compiler
The most important component is, of course, the compiler. It bridges high-level programming languages and low-level machine instructions. The assembler handles the narrower step of translating assembly into machine code.
Beyond translation, the compiler (and its accompanying toolchain) abstracts away differences in architecture and ABI. The output of compilation is usually a relocatable object file (or an intermediate representation), not a final executable. Object files contain code and data in sections (commonly .text, .data, .rodata, etc.), plus symbol and relocation information. They may also carry debug information (e.g., DWARF on many ELF-based platforms).
This separation makes large builds modular: you can compile many translation units independently, then link them together into an executable or a library.
Note that the object file is not executable. It contains data and instructions, but the target addresses of many references are not yet resolved. An executable needs these addresses resolved, which is the job of the linker.
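As a rough illustration of what ends up in an object file, the sketch below marks where each kind of definition typically lands. The file name demo.cpp and all identifiers are made up for this example, and the exact section placement depends on the compiler, target, and flags.

```cpp
// demo.cpp (hypothetical file name)
// Compile only, without linking:  g++ -c demo.cpp -o demo.o
// List the symbols:               nm -C demo.o
// List the sections:              objdump -h demo.o

int initialized_counter = 42;  // initialized data: typically placed in .data
int zeroed_buffer[16];         // zero-initialized data: typically placed in .bss

// The characters of a string literal are read-only: typically placed in .rodata.
const char* banner_text() { return "hello from .rodata"; }

// Machine code for functions typically lands in .text.
int add_to_counter(int delta) {
    initialized_counter += delta;
    return initialized_counter;
}

// Declared but not defined in this translation unit. If code here called it,
// the object file would carry an undefined symbol ("U" in nm output) and a
// relocation that the linker has to fix up later.
int defined_elsewhere(int);
```

Since defined_elsewhere is never called, this file still links on its own; adding a call without providing a definition would typically produce an undefined reference at link time.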
ABI stands for Application Binary Interface: the contract at the binary level. It usually defines (among other things):
- memory layout
- calling convention
- name mangling
- RTTI/type information conventions
- exception handling and unwinding
- dynamic linking conventions (symbol visibility and relocation model)
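Two of these items, memory layout and name mangling, can be sketched in a few lines. The names below are hypothetical, and the mangled names in the comments assume the Itanium C++ ABI used by GCC and Clang on most Unix-like platforms; MSVC uses a different scheme.

```cpp
#include <cstddef>
#include <cstdint>

// Memory layout: the ABI fixes size, alignment, and padding.
// On targets where std::int32_t is 4-byte aligned, the compiler inserts
// 3 padding bytes after `tag`, so `value` sits at offset 4 and sizeof is 8.
struct Sample {
    char tag;
    std::int32_t value;
};

std::size_t value_offset() { return offsetof(Sample, value); }

// Name mangling: overloads share a source-level name, so the parameter
// types are encoded into the symbol name.
int scale(int x) { return 2 * x; }            // Itanium ABI symbol: _Z5scalei
double scale(double x) { return 2.0 * x; }    // Itanium ABI symbol: _Z5scaled

// extern "C" switches mangling off; the symbol is plain `scale_c`, which is
// why C-compatible interfaces cannot be overloaded.
extern "C" int scale_c(int x) { return scale(x); }
```

Running nm on the resulting object file (without -C) should show the mangled names; with -C they are demangled back to scale(int) and scale(double).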
For C++, templates can be a source of ABI issues because their definitions live in headers and are instantiated in each translation unit. ABI mismatches typically come from incompatible headers or ABI settings across components (for example, mixing different standard library ABIs/versions, or toggles that change the representation of standard types). This is usually not about optimization level changing “which template gets instantiated”, but about whether all components were built against the same ABI expectations.
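One concrete instance of such a toggle is the dual ABI in GNU libstdc++, controlled by the macro _GLIBCXX_USE_CXX11_ABI. The sketch below only reports which setting a translation unit was compiled with; the function name is made up for this example, and the macro is specific to libstdc++, so other standard libraries take the fallback branch.

```cpp
#include <string>  // including a standard header makes the library's config macros visible

// Reports which std::string ABI this translation unit was built against.
std::string string_abi_in_use() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
  #if _GLIBCXX_USE_CXX11_ABI
    return "libstdc++, new (C++11) std::string ABI";
  #else
    return "libstdc++, old (pre-C++11) std::string ABI";
  #endif
#else
    return "not libstdc++ (macro not defined)";
#endif
}
```

When two components disagree on this setting, the mismatch typically surfaces as undefined references that mention std::__cxx11 at link time.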
Inlining can also be a problem. If you change an inline (or template) function in a header, any code that used it must be recompiled, otherwise old call sites may still contain the old inlined machine code. Also, if a translation unit expects a non-inline out-of-line definition but the definition is missing (or not linked), you can get an undefined symbol at link time.
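A minimal sketch of the two cases, collapsed into one file so that it compiles on its own. The split into api.hpp and api.cpp is hypothetical and only marked in comments.

```cpp
// ---- would live in a header, e.g. a hypothetical api.hpp ----

// The body is visible to every includer and may be inlined into each caller.
// If this is later changed to return 3 and only some translation units are
// rebuilt, the stale ones keep computing with 2.
inline int scale_factor() { return 2; }

// Out-of-line function: the header carries the declaration only.
int apply_scale(int x);

// ---- would live in exactly one source file, e.g. a hypothetical api.cpp ----

// If this file is not compiled and linked, every caller of apply_scale
// ends up with an undefined symbol at link time.
int apply_scale(int x) { return x * scale_factor(); }
```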
Why do we need a stack?
It is a cheap and simple way to allocate per-call storage and save state. Recall the von Neumann architecture, which comprises a control unit, an arithmetic/logic unit, a memory unit, and input and output. The processor (CU/ALU) uses registers for fast computation and RAM to store and retrieve data.
When you call a function, the stack provides a stack frame: locals, saved registers, and bookkeeping (such as the return address). The compiler can access frame data efficiently using a base register (frame pointer or stack pointer) plus a fixed offset.
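To make the per-call frame visible, the sketch below records the address of a local variable at each recursion depth. The function names are made up for this example. Because all the calls are active at the same time, each local lives in its own frame and the addresses differ; the direction in which they grow depends on the platform.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Each active call gets its own frame, so each `local` is a distinct object.
void record_frames(int depth, std::vector<std::uintptr_t>& addresses) {
    int local = depth;  // lives in this call's stack frame
    addresses.push_back(reinterpret_cast<std::uintptr_t>(&local));
    if (depth > 0) {
        record_frames(depth - 1, addresses);  // pushes a new frame
    }
}

// True when no address appears twice.
bool all_distinct(const std::vector<std::uintptr_t>& addresses) {
    std::set<std::uintptr_t> unique(addresses.begin(), addresses.end());
    return unique.size() == addresses.size();
}
```

Calling record_frames(3, addresses) fills the vector with four addresses, one per active call; on most mainstream platforms they decrease, since the stack grows downwards there.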
Linker
The compiled object files need to be linked together to become an executable, or a shared library. Linking happens at build time; with dynamic linking, some symbol resolution/relocation is deferred to load time or runtime by the dynamic loader.
The primary mission for the linker is to combine object files, resolve symbols, and apply relocations. A static library is typically an archive of object files; at link time, the linker pulls in only the needed objects and then performs final address layout.
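Which symbols the linker gets to resolve is decided by linkage. The sketch below, with made-up names, contrasts internal and external linkage; the nm letters in the comments are what GNU nm typically prints for an unoptimized build, and an optimizer may inline the internal helper away entirely.

```cpp
// Internal linkage: visible only inside this translation unit. The linker
// never matches it against other object files (local symbol, "t" in nm).
static int helper(int x) { return x + 1; }

namespace {
// Anonymous namespace: also internal linkage.
int hidden_counter = 0;
}

// External linkage: a global symbol ("T" in nm) that other object files can
// reference, and that the linker resolves when combining them.
int public_entry(int x) {
    ++hidden_counter;
    return helper(x) * 2;
}

int calls_so_far() { return hidden_counter; }
```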
With dynamic linking, references to external shared libraries are left to be resolved by the platform’s dynamic loader at program startup (or lazily on first call). On ELF systems, calls often go through PLT/GOT stubs generated in the executable; on Windows, imports are typically resolved via the Import Address Table (IAT).
PIC/PIE and ASLR (and why it matters)
Don’t confuse relocatable objects with position-independent code.
- Relocatable object files (.o) are designed to be relocated by the linker at build time.
- Shared libraries (.so) are typically built as PIC so they can be mapped at (almost) any runtime address without requiring expensive text relocations.
- PIE (Position Independent Executable) is the same idea applied to the main executable: building the executable as PIE allows the loader to place it at a randomized base address.
PIE is important for enabling ASLR (Address Space Layout Randomization) for the main executable’s code segment. In practice, ASLR randomizes the base addresses of several memory mappings (commonly the main executable if it is PIE, shared libraries, heap, stack, and mmap regions), not necessarily every byte.
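A small sketch for observing this: it exposes the addresses of a function, a global, and a stack variable. All names are made up for this example. The __PIE__ and __PIC__ macros are defined by GCC and Clang under -fPIE and -fPIC; other compilers may not define them, in which case the check reports false.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

int anchor_global = 0;
void anchor_function() {}

std::uintptr_t address_of_global() {
    return reinterpret_cast<std::uintptr_t>(&anchor_global);
}

std::uintptr_t address_of_function() {
    return reinterpret_cast<std::uintptr_t>(&anchor_function);
}

// Compile-time check based on GCC/Clang predefined macros.
bool built_position_independent() {
#if defined(__PIE__) || defined(__PIC__)
    return true;
#else
    return false;
#endif
}

// Prints one address from the code segment, one from the data segment,
// and one from the stack.
void print_layout() {
    int stack_local = 0;
    std::printf("position independent: %s\n",
                built_position_independent() ? "yes" : "no");
    std::printf("function: 0x%" PRIxPTR "\n", address_of_function());
    std::printf("global:   0x%" PRIxPTR "\n", address_of_global());
    std::printf("stack:    0x%" PRIxPTR "\n",
                reinterpret_cast<std::uintptr_t>(&stack_local));
}
```

Calling print_layout from main and running the program several times should show the function and global addresses changing between runs when it is built as PIE on a system with ASLR enabled, and staying fixed when it is built with -no-pie.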
Further reading of PLT/GOT:
Build System
The most useful tool at scale is a build system. It manages configuration and dependencies, then drives the toolchain to compile/link in the right order. In practice there is often a generator (e.g., CMake) that produces build files, and a backend (e.g., Ninja/Make) that executes them. Key benefits include incremental builds (only rebuilding what changed) and automation.
Resources
Inspection Tools
- Compiler Explorer: Mainly for checking assembly code.
- Quick C++ Benchmark: Quickly benchmark performance.
- CPP Insights: See the de-sugared C++ code.
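As an idea of what de-sugaring means, the sketch below places a range-based for loop next to a hand-written expansion that follows the form the language standard describes. The function names are made up, and the actual output of CPP Insights differs in detail.

```cpp
#include <vector>

// The sugared form.
int sum_range_for(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

// Roughly what the compiler turns it into: explicit iterators.
int sum_desugared(const std::vector<int>& values) {
    int total = 0;
    {
        auto&& range = values;
        auto first = range.begin();
        auto last = range.end();
        for (; first != last; ++first) {
            int v = *first;
            total += v;
        }
    }
    return total;
}
```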
Online Resources









