syntax highlight

Thursday 30 May 2013

Vim tip: remember undos

Git makes this feature rather obsolete, but it's still a nifty trick: Vim can remember your undos even if you close it. To enable this feature just "set undofile" and now Vim will unforgivingly remember your mistakes. For ever.

Tuesday 28 May 2013

C++ exceptions under the hood 20: running destructors while unwinding

The mini ABI version 11 we have written last time was able to handle pretty much all the basics to handle an exception: we have an (almost working) ABI capable of throwing and catching exceptions, but it is still unable to properly run destructors. That's quite important if we want to write exception safe code. With what we know about .gcc_except_table running destructors is a piece of cake, we only need to see a bit of assembly:

# Call site table
.LLSDACSB2:
    # Call site 1
	.uleb128 ip_range_start
	.uleb128 ip_range_len
	.uleb128 landing_pad_ip
	.uleb128 (action_offset+1) => 0x3
    
    # Rest of call site table

# Action table start
.LLSDACSE2:
    # Action 1
	.byte	0
	.byte	0

    # Action 2
	.byte	0x1
	.byte	0x7d

	.align 4
	.long	_ZTI14Fake_Exception
.LLSDATT2:
# Types table start

On a regular landing pad, when an action has a type index greater than 0 it means we're seeing an index to a type tables, and we can use that to know which types the catch can handle; for a type index with a value of 0 it means we are instead seeing a cleanup block and we should run it anyway. Although the landing pad can't handle the exception it will still be able to perform the cleanup that's supposed to happen while unwinding. Of course the landing pad will call _Unwind_Resume when the cleanup is done and that will continue the regular stack unwinding process.

I've uploaded this latest and last version to my github repo, but there are some bad news: remember how we cheated by saying that a uleb128 == char? As soon as we start adding blocks to run destructors the .gcc_except_table starts to get quite big (where "big" means we have offsets over 127 bytes long) and that assumption no longer holds.

For the next version of this ABI we would have to rewrite our LSDA reading functions to read proper uleb128 code. Not a major change, but at this point we wouldn't gain much, we have already achieved our goal: a working minimal ABI capable of handling exceptions without the help of libcxxabi.

There are parts we haven't covered, like handling non-native exceptions, catching derived types or interoperability between compilers and linkers. Maybe some other time, in this rather long series of articles we already learned quite a bit about low level exception handling in C++.

Thursday 23 May 2013

C++ exceptions under the hood 19: getting the right catch in a landing pad

19th entry about C++ exception handling: we have written a personality function that can so far, by reading the LSDA, choose the right landing pad on the right stack frame to handle a thrown exception, but it was having some difficulties finding the right catch inside a landing pad. To finally get a decently working personality function we'll need to check all the types an exception can handle by going through all the actions table in the .gcc_except_table.

Remember the action table? Let's check it again but this time for a try with multiple catch blocks.

# Call site table
.LLSDACSB2:
    # Call site 1
	.uleb128 ip_range_start
	.uleb128 ip_range_len
	.uleb128 landing_pad_ip
	.uleb128 (action_offset+1) => 0x3
    
    # Rest of call site table

# Action table start
.LLSDACSE2:
    # Action 1
	.byte	0x2
	.byte	0

    # Action 2
	.byte	0x1
	.byte	0x7d

	.align 4
	.long	_ZTI9Exception
	.long	_ZTI14Fake_Exception
.LLSDATT2:
# Types table start

If we intend to read the exceptions supported by the landing pad 1 in the example above (that LSDA is for the catchit function, by the way) we need to do something like this:

  • Get the action offset from the call site table, 2: remember you'll actually read the offset plus 1, so 0 means no action.
  • Go to action offset 2, get type index 1. The types table is indexed in reverse order (ie we have a pointer to its end and we need to access each element by using -1 * index).
  • Go to types_table[-1]; you'll get a pointer to the type_info for Fake_Exception
  • Fake_Exception is not the current exception being thrown; get the next action offset for our current action (0x7d)
  • Reading 0x7d in uleb128 will actually yield -3; from the position where we read the offset move back 3 bytes to find the next action
  • Read type index 2
  • Get the type_info for Exception this time; it matches the current exception being thrown, so we can install the landing pad!

It sounds complicated because there's, again, a lot of indirection for each step but you can check the full sourcecode for this project in my github repo.

In the link above you will also see a bonus: a change to the personality function to correctly detect and use catch(...) blocks. That's an easy change once the personality functions knows how to read the types table: a type with a null pointer (ie a position in the table that instead of a valid pointer to an std::type_info holds null) represents a catch all block. This has an interesting side effect: a catch(T) will be able to handle only native (ie coming from C++) exceptions, whereas a catch(...) would catch also exceptions not thrown from within C++.

We finally know how exceptions are thrown, how the stack is unwinded, how a personality function selects the correct stack frame to handle an exception and how the right catch inside a landing pad is selected, but we still have on more problem to solve: running destructors. We'll change our personality function to support RAII objects next time.

Tuesday 21 May 2013

Vim tip: some cool moves

We all know the usual Vim moves, hjkl (though I admit I still use the arrows) et al, but there are some more obscure and very useful Vim moves, for the hipster in you. I only recently learned about L and H, for the beginning and the end of the screen.

I don't know how I lived so long without those. It's an incredibly useful shortcut.

Thursday 16 May 2013

C++ exceptions under the hood 18: getting the right stack frame

Our latest personality function knows whether it can handle an exception or not (assuming there is only one catch statement per try block and assuming no inheritance is used) but to make this knowledge useful, we have first to check if the exception we can handle matches the exception being thrown. Let's try to do this.

Of course, we need first to know the exception type. To do this we need to save the exception type when __cxa_throw is called (this is the chance the ABI gives us to set all our custom data):

Note: You can download the full sourcecode for this project in my github repo.

void __cxa_throw(void* thrown_exception,
                 std::type_info *tinfo,
                 void (*dest)(void*))
{
    __cxa_exception *header = ((__cxa_exception *) thrown_exception - 1);

    // We need to save the type info in the exception header _Unwind_ will
    // receive, otherwise we won't be able to know it when unwinding
    header->exceptionType = tinfo;

    _Unwind_RaiseException(&header->unwindHeader);
}

And now we can read the exception type in our personality function and easily check if the exception types match (the exception names are C++ strings, so doing a == is enough to check this:

// Get the type of the exception we can handle
const void* catch_type_info = lsda.types_table_start[ -1 * type_index ];
const std::type_info *catch_ti = (const std::type_info *) catch_type_info;

// Get the type of the original exception being thrown
__cxa_exception* exception_header = (__cxa_exception*)(unwind_exception+1) - 1;
std::type_info *org_ex_type = exception_header->exceptionType;

printf("%s thrown, catch handles %s\n",
            org_ex_type->name(),
            catch_ti->name());

// Check if the exception being thrown is of the same type
// than the exception we can handle
if (org_ex_type->name() != catch_ti->name())
    continue;

Check here for the full source with the new changes.

Of course there would be a problem if we add that (can you see it?). If the exception is thrown in two phases and we said in the first one we would handle it, then we can't say on the second one we don't want it anymore. I don't know if _Unwind_ handles this case according to any documentation but this is most likely calling upon undefined behavior, so just saying we'll handle everything is no longer enough.

Since we gave our personality function the ability to know if the landing pad can handle the exception being thrown we have been lying to _Unwind_ about which exceptions we can handle; even though we said we handle all of them on our ABI 9, the truth is that we didn't know whether we would be able to handle it. That's easy to change, we can do something like this:

_Unwind_Reason_Code __gxx_personality_v0 (...)
{
    printf("Personality function, searching for handler\n");

    // ...

    foreach (call site entry in lsda)
    {
        if (call site entry.not_good()) continue;

        // We found a landing pad for this exception; resume execution

        // If we are on search phase, tell _Unwind_ we can handle this one
        if (actions & _UA_SEARCH_PHASE) return _URC_HANDLER_FOUND;

        // If we are not on search phase then we are on _UA_CLEANUP_PHASE
        /* set everything so the landing pad can run */

        return _URC_INSTALL_CONTEXT;
    }

    return _URC_CONTINUE_UNWIND;
}

As usual, check the full sourcecode for this project in my github repo.

So, what would we get if we run the personality function with this change? Fail, that's what we'd get! Remember our throwing functions? This one should catch our exception:

void catchit() {
    try {
        try_but_dont_catch();
    } catch(Fake_Exception&) {
        printf("Caught a Fake_Exception!\n");
    } catch(Exception&) {
        printf("Caught an Exception!\n");
    }

    printf("catchit handled the exception\n");
}

Unfortunately, our personality function only checks for the first type the landing pad can handle. If we delete the Fake_Exception catch block and try it again, though, we'd get a different story: finally, success! Our personality function can now select the correct catch in the correct frame, provided there's no try block with multiple catches.

Next time we'll be further improving this.

Tuesday 14 May 2013

C++ exceptions under the hood 17: reflecting on an exception type and reading .gcc_except_table

By now we know that when an exception is thrown we can get a lot of reflexion information by reading the local data storage area AKA .gcc_except_table; reading this table we have been able to implement a personality function capable of deciding which landing pad to run when an exception is thrown. We also know how to read the action table part of the LSDA, so we should be able to modify our personality function to pick the correct catch statement inside a landing pad with multiple catches.

We left our ABI implementation last time, and dedicated some time to analyze the assembly for .gcc_except_table to discover how can we find the types a catch can handle. We found that indeed there is a part of this table that holds a list of types where this information can be found. Let's try to read it on the cleanup phase, but first let's remember the definition for our LSDA header:

struct LSDA_Header {
    uint8_t start_encoding;
    uint8_t type_encoding;

    // This is the offset, from the end of the header, to the types table
    uint8_t type_table_offset;
};

That last field is new (for us): it's giving us an offset into table of types. Let's also remember the definition of each call site:

struct LSDA_CS {
    // Offset into function from which we could handle a throw
    uint8_t start;
    // Length of the block that might throw
    uint8_t len;
    // Landing pad
    uint8_t lp;
    // Offset into action table + 1 (0 means no action)
    uint8_t action;
};

Check that last field, "action". That gives us an offset into the action table. That means we can find the action for a specific CS. The trick here is that for landing pads where a catch exists, the action will hold an offset for the types table; we can then use the offset into the types table pointer, which we can get from the header. Quite a mouthful: let's better talk code:

// Pointer to the beginning of the raw LSDA
LSDA_ptr lsda = (uint8_t*)_Unwind_GetLanguageSpecificData(context);

// Read LSDA headerfor the LSDA
LSDA_Header header(&lsda);

const LSDA_ptr types_table_start = lsda + header.type_table_offset;

// Read the LSDA CS header
LSDA_CS_Header cs_header(&lsda);

// Calculate where the end of the LSDA CS table is
const LSDA_ptr lsda_cs_table_end = lsda + cs_header.length;

// Get the start of action tables
const LSDA_ptr action_tbl_start = lsda_cs_table_end;

// Get the first call site
LSDA_CS cs(&lsda);

// cs.action is the offset + 1; that way cs.action == 0
// means there is no associated entry in the action table
const size_t action_offset = cs.action - 1;
const LSDA_ptr action = action_tbl_start + action_offset;

// For a landing pad with a catch the action table will
// hold an index to a list of types
int type_index = action[0];

// types_table_start actually points to the end of the table, so
// we need to invert the type_index. There we'll find a ptr to
// the std::type_info for the specification in our catch
const void* catch_type_info = types_table_start[ -1 * type_index ];
const std::type_info *catch_ti = (const std::type_info *) catch_type_info;

// If everything went OK, this should print something like Fake_Exception
printf("%s\n", catch_ti->name());

The code looks complicated because there are several layers of indirection before actually reaching the struct type_info, but it's not really doing anything complicated: it only reads the .gcc_except_table we found on the disassembly.

Printing the name of the type is already a big step in the right direction. Also, our personality function is getting a bit messy. Most of the complexity of reading the LSDA can be hidden under the rug for almost no cost at all. You can check my implementation here

Next time we'll see if we can match our newly found type to our original exception.

Thursday 9 May 2013

Vim tip: relative numbers

Knowing the line number in Vim is crucial: you might need to jump to a specific line to fix a compiler error, you might want to check your current line to tell someone else where they broke something, or you might need to know a line number, and the diff to your current line, so you can delete N lines.

You can "set number" to get the line number in a bar at the left, and that's fine for most things but also quite unnecessary:

  • You can jump to a line by typing ":"
  • You can see your current line by checking it on the lower right status box
  • ... but what if you need a delta from your current position?

That's even easier: just "set relativenumber" and the numbers on the left will turn into a relative position from the position of your cursor.

Now you won't have to count the lines you want to delete: you can instantly know the N on dNd!

Bonus: watch vim newies struggle with the changing line numbers

Tuesday 7 May 2013

C++ exceptions under the hood 16: finding the right catch in a landing pad

16th chapter on our quest to implement a mini-ABI capable of handling exceptions; last time we implemented our personality function so it would be able to handle functions with more than one landing pad. We are now trying to make it recognize whether a certain landing pad can handle a specific exception, so we can use the exception specification on the catch statement.

Of course, to know whether a landing pad can handle a throw is a difficult task. Would you expect anything else? The biggest problems to overcome right now will be:

  • First and foremost: how can we find the accepted types for a catch block?
  • Assuming we can find the types for a catch, how can we handle a catch(...)?
  • For a landing pad with multiple catch statements, how can we know all possibly catch types?
  • Take the following example. Not only we'll have to check whether the landing pad accepts the current exception, we'll have to check if it accepts any of the current exception's parents!

struct Base {};
struct Child : public Base {};
void foo() { throw Child; }

void bar()
{
    try { foo(); }
    catch(const Base&){ ... }
}

To make our work a bit easier let's say for now we work only with landing pads that have a single catch and no inheritance exists on our program. Still, how do we find out the accepted types for a landing pad?

Turns there is a place in .gcc_except_table we haven't analyzed yet: the action table. Let's dissasemble our throw.cpp object and see what's in there, right after the call site table is finished, for our "try but don't catch" function:

Note: You can download the full sourcecode for this project in my github repo.

.LLSDACSE1:
	.byte	0x1
	.byte	0
	.align 4
	.long	_ZTI14Fake_Exception
.LLSDATT1:

Doesn't look like much, but there's a promising pointer (both a proverbial and a real pointer) to something that has our exception's name. Let's go to the definition of _ZTI14Fake_Exception:

_ZTI14Fake_Exception:
	.long	_ZTVN10__cxxabiv117__class_type_infoE+8
	.long	_ZTS14Fake_Exception
	.weak	_ZTS9Exception
	.section	.rodata._ZTS9Exception,"aG",@progbits,_ZTS9Exception,comdat
	.type	_ZTS9Exception, @object
	.size	_ZTS9Exception, 11

And we reached something very interesting. Can you recognize it? This is the std::type_info for struct Fake_Exception!

Now we know there is indeed a way to get a pointer to some kind of reflexion information for our exception. Can we programmatically find it? We'll see next time.

Thursday 2 May 2013

C++ exceptions under the hood 15: finding the right landing pad

This is now the 15th installment in what's becoming the longest series I've written for this blog; we have so far learned how exceptions are thrown and we have written a personality function capable of, with some sort of reflexion, detecting where the catch-blocks are (landing pads, in exception speak). In the last article we wrote a personality function that can handle exceptions, but it does so only with the first landing pad of the first call frame in the stack. Let's improve that a little bit, let's make our personality function capable of choosing the right landing pad in a function with multiple landing pads.

In a TDD fashion we can first build a test for our ABI. Let's modify our test program, throw.cpp, to have two try/catch blocks:

#include <stdio.h>
#include "throw.h"

struct Fake_Exception {};

void raise() {
    throw Exception();
}

void try_but_dont_catch() {
    try {
        printf("Running a try which will never throw.\n");
    } catch(Fake_Exception&) {
        printf("Exception caught... with the wrong catch!\n");
    }

    try {
        raise();
    } catch(Fake_Exception&) {
        printf("Caught a Fake_Exception!\n");
    }

    printf("try_but_dont_catch handled the exception\n");
}

void catchit() {
    try {
        try_but_dont_catch();
    } catch(Fake_Exception&) {
        printf("Caught a Fake_Exception!\n");
    } catch(Exception&) {
        printf("Caught an Exception!\n");
    }

    printf("catchit handled the exception\n");
}

extern "C" {
    void seppuku() {
        catchit();
    }
}

Before you test it, try to think about what will happen upon running this test. Focus on the try_but_dont_catch function: the first try/catch block will not throw, then the second block will throw. Since our ABI is quite dumb the first block will handle the second block's exception. What will happen after the first block has finished handling the exception? The execution will resume from where the catch/try ends, thus entering again on the second try/catch block. Infinite loop! We have reinvented yet again a very complicated while(true).

Let's use our knowledge of the start/length fields in the call site table (LSDA) to properly choose our landing pad. For this we will need to know what the instruction pointer was when the exception was thrown, and we can do this with an _Unwind_ function we already know: _Unwind_GetIP. To understand what _Unwind_GetIP will return let's see an example:

void f1() {}
void f2() { throw 1; }
void f3() {}

void foo() {
L1:
    try{ f1(); } catch(...) {}
L2:
    try{ f2(); } catch(...) {}
L3:
    try{ f3(); } catch(...) {}
}

In this case our personality function would be called for the catch block for f2 and the stack would be like this:

+------------------------------+
|   IP: f2  stack frame: f2    |
+------------------------------+
|   IP: L3  stack frame: foo   |
+------------------------------+

Note that IP will be at L3 but the exception will be thrown in L2; this is because the IP will point to the next instruction to execute. This also means we need to substract one if we need to find the IP where an exception was thrown, otherwise the result from _Unwind_GetIP would not be in the range of our landing pad. Back to our personality function:

_Unwind_Reason_Code __gxx_personality_v0 (
                             int version,
                             _Unwind_Action actions,
                             uint64_t exceptionClass,
                             _Unwind_Exception* unwind_exception,
                             _Unwind_Context* context)
{
    if (actions & _UA_SEARCH_PHASE)
    {
        printf("Personality function, lookup phase\n");
        return _URC_HANDLER_FOUND;
    } else if (actions & _UA_CLEANUP_PHASE) {
        printf("Personality function, cleanup\n");

        // Calculate what the instruction pointer was just before the
        // exception was thrown for this stack frame
        uintptr_t throw_ip = _Unwind_GetIP(context) - 1;

        // Pointer to the beginning of the raw LSDA
        LSDA_ptr lsda = (uint8_t*)_Unwind_GetLanguageSpecificData(context);

        // Read LSDA headerfor the LSDA
        LSDA_Header header(&lsda);

        // Read the LSDA CS header
        LSDA_CS_Header cs_header(&lsda);

        // Calculate where the end of the LSDA CS table is
        const LSDA_ptr lsda_cs_table_end = lsda + cs_header.length;

        // Loop through each entry in the CS table
        while (lsda < lsda_cs_table_end)
        {
            LSDA_CS cs(&lsda);

            // If there's no LP we can't handle this exception; move on
            if (not cs.lp) continue;

            uintptr_t func_start = _Unwind_GetRegionStart(context);

            // Calculate the range of the instruction pointer valid for this
            // landing pad; if this LP can handle the current exception then
            // the IP for this stack frame must be in this range
            uintptr_t try_start = func_start + cs.start;
            uintptr_t try_end = func_start + cs.start + cs.len;

            // Check if this is the correct LP for the current try block
            if (throw_ip < try_start) continue;
            if (throw_ip > try_end) continue;

            // We found a landing pad for this exception; resume execution
            int r0 = __builtin_eh_return_data_regno(0);
            int r1 = __builtin_eh_return_data_regno(1);

            _Unwind_SetGR(context, r0, (uintptr_t)(unwind_exception));
            // Note the following code hardcodes the exception type;
            // we'll fix that later on
            _Unwind_SetGR(context, r1, (uintptr_t)(1));

            _Unwind_SetIP(context, func_start + cs.lp);
            break;
        }

        return _URC_INSTALL_CONTEXT;
    } else {
        printf("Personality function, error\n");
        return _URC_FATAL_PHASE1_ERROR;
    }
}

As usual: You can download the full sourcecode for this project in my github repo.

Try the example again and voila, no more infinite loop! That was a simple change and we can now choose the correct landing pad. Next time we'll try to make our personality function also pick the correct stack frame instead of choosing the first one.