syntax highlight

Tuesday 3 December 2013

Static initialization in C++

Let's analyze this seemingly simple code sample:

struct X {
    X();
};

void foo() {
    static X a;
}

X b;
void bar() {
    foo();
    X c;
}

Do you know what the order of initialization will be for a, b and c? b is rather easy: it's a plain global variable and it should be initialized first of all, even before main runs. c is also easy, it will be initialized only when the execution reaches the line where it is defined. How about a?

a is static, so just like b it should be initialized only once. Unlike b, though, it belongs to foo's scope, and it will only be initialized the first time foo is executed. Let's see how that happens in gcc by taking an even simpler example:

struct X {
    X() throw();
};

void foo() throw() {
    static X x;
}

Note: the throw()'s are in there only to tell the compiler we don't want any kind of exception handling code, that will make the assembly inspection a bit easier. Let's compile, disassemble and c++filt this. You should see something very interesting in the first few lines:

    .file    "foo.cpp"
    .local    guard variable for foo()::x
    .comm    guard variable for foo()::x,8,8
    .text

# Skipping the actual foo definition, we'll see that later

.LFE0:
    .size    foo(), .-foo()
    .local    foo()::x
    .comm    foo()::x,1,1

Inside the definition for foo gcc reserved some space for our static variable; interestingly, it also reserved 8 bytes for something called "Guard variable for foo()::x" (when demangled, of course). This means that there is a flag to determine whether foo()::x was already initialized, or not.

Let's analyze now the assembly for foo() to understand how the guard is used:

foo():
    movl    guard variable for foo()::x, %eax
    movzbl    (%rax), %eax
    testb    %al, %al
    jne    .L1
    movl    guard variable for foo()::x, %edi
    call    __cxa_guard_acquire
    testl    %eax, %eax
    setne    %al
    testb    %al, %al
    je    .L1
    movl    foo()::x, %edi
    call    X::X()
    movl    guard variable for foo()::x, %edi
    call    __cxa_guard_release
.L1:
    # Rest of the method (empty, in our example)

This is also interesting: initializing a static variable depends on libcpp (which is dependant on the compiler's ABI). We could translate the whole thing to, more or less, the following pseudocode:

void foo() {
    static X x;
    static guard x_is_initialized;
    if ( __cxa_guard_acquire(x_is_initialized) ) {
        X::X();
        x_is_initialized = true;
        __cxa_guard_release(x_is_initialized);
    }
}

(Note: exception safety ignored, which of course is not the case for a proper libcpp)

Eventually, __cxa_guard_acquire will check if this object was already initialized or if anyone else is trying to initialize this object, and then it will signal the calling method to run x's constructor if it's safe to do so.

There's another bit of information in here which is not immediately obvious: in case X's constructor fails (ie an exception is thrown within this method), x_is_initialized won't be set to true. Assuming the exception is caught somewhere else, if foo() is called again the initialization for foo()::x will be attempted to run once again.