
Tuesday 3 December 2013

Static initialization in C++

Let's analyze this seemingly simple code sample:

struct X {
    X();
};

void foo() {
    static X a;
}

X b;
void bar() {
    foo();
    X c;
}

Do you know what the order of initialization will be for a, b and c? b is rather easy: it's a plain global variable and it should be initialized first of all, even before main runs. c is also easy, it will be initialized only when the execution reaches the line where it is defined. How about a?

a is static, so just like b it should be initialized only once. Unlike b, though, it belongs to foo's scope, and it will only be initialized the first time foo is executed. Let's see how that happens in gcc by taking an even simpler example:

struct X {
    X() throw();
};

void foo() throw() {
    static X x;
}

Note: the throw()'s are in there only to tell the compiler we don't want any kind of exception handling code; that will make the assembly inspection a bit easier. Let's compile, disassemble and c++filt this. You should see something very interesting in the first few lines:

    .file    "foo.cpp"
    .local    guard variable for foo()::x
    .comm    guard variable for foo()::x,8,8
    .text

# Skipping the actual foo definition, we'll see that later

.LFE0:
    .size    foo(), .-foo()
    .local    foo()::x
    .comm    foo()::x,1,1

Alongside the definition of foo, gcc reserved some space for our static variable; interestingly, it also reserved 8 bytes for something called "guard variable for foo()::x" (when demangled, of course). This means there is a flag to determine whether foo()::x was already initialized or not.

Let's analyze now the assembly for foo() to understand how the guard is used:

foo():
    movl    guard variable for foo()::x, %eax
    movzbl    (%rax), %eax
    testb    %al, %al
    jne    .L1
    movl    guard variable for foo()::x, %edi
    call    __cxa_guard_acquire
    testl    %eax, %eax
    setne    %al
    testb    %al, %al
    je    .L1
    movl    foo()::x, %edi
    call    X::X()
    movl    guard variable for foo()::x, %edi
    call    __cxa_guard_release
.L1:
    # Rest of the method (empty, in our example)

This is also interesting: initializing a static variable depends on the C++ runtime library (which in turn depends on the compiler's ABI; that's where the __cxa_* functions come from). We could translate the whole thing to, more or less, the following pseudocode:

void foo() {
    static X x;
    static guard x_is_initialized;
    if ( __cxa_guard_acquire(x_is_initialized) ) {
        X::X();
        x_is_initialized = true;
        __cxa_guard_release(x_is_initialized);
    }
}

(Note: exception safety is ignored here, which of course is not the case in a real C++ runtime.)

In essence, __cxa_guard_acquire will check whether this object was already initialized, or whether anyone else is currently trying to initialize it, and will then signal the calling method to run x's constructor if it's safe to do so.

There's another bit of information in here which is not immediately obvious: in case X's constructor fails (i.e. an exception is thrown from it), x_is_initialized won't be set to true. Assuming the exception is caught somewhere else, if foo() is called again the initialization of foo()::x will be attempted once more.
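
We can see this retry behavior with a quick test. A minimal sketch of my own (the throwing constructor and the attempt counter are not part of the original example):

#include <iostream>

struct X {
    X() {
        static int attempts = 0;    // constant-initialized, needs no guard
        if (++attempts == 1) throw 42;
    }
};

void foo() { static X x; }          // guarded initialization happens here

int main() {
    try { foo(); } catch (int) { std::cout << "first try failed\n"; }
    foo();                          // the guard was never set: the ctor runs again
    std::cout << "second try succeeded\n";
}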

Tuesday 19 November 2013

Vim tip: quickly switch from header to impl

Switching from header to implementation in vim takes up precious milliseconds of typing and thinking, so we'd better delegate that to a computer. Instead of typing :tabnew FOO.cpp, just download A (for alternate) from this url: http://www.vim.org/scripts/script.php?script_id=31

Add it to your bundles in vim and, for extra magic, just map some key to :AT in your vimrc. I have added this one:

map <F4> :AT<CR>

I don't know how I lived without this for such a long time.

Thursday 14 November 2013

Setting up a Linux GW VIII: Proxy and content filtering

Now that we have a basic gateway we can do crazy stuff, like installing a proxy. You may want to manually configure a proxy for each client, but you can also choose to install a transparent proxy for all your users. This can be done with squid, let's see how.

Start by installing squid on your gateway. You can choose a different machine, but then you'll have to do some magic with iptables; it's easier to just use the same machine.

Once squid is installed head to /etc/squid/ to vim squid.conf. Yes, it's very scary to see such a long config file, but it's mostly just comments. Luckily squid has reasonable defaults, so you can just ignore most of this file. Just to test if your squid installation was successful, before changing anything, you can tail -f /var/log/squid/access.log and set your browser's proxy to your gateway's IP, port 3128 (squid's default port). If everything works you should be able to browse and also see the access logs scrolling by.

If you are getting a 'denied' page on every request, you may have to configure squid to allow HTTP access. Search for the 'http_access deny all' line and comment it out. You may also have to search for the local network definitions and set them up correctly (something like 'acl localnet src 192.168.0.0/24').

Once you have verified that your proxy is working, you can configure it to run in transparent mode. Search for the http_port directive and change it to something like 'http_port 8213 transparent' (notice I changed the default port). It is also good practice to specify both IP and port, so squid binds only to the local interface (you are probably not interested in serving as a proxy for the outside world, unless you plan to run a reverse proxy).
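
Putting both recommendations together, the directive would look something like this (using this series' LAN IP and the port chosen above):

http_port 192.168.10.1:8213 transparent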

Changing squid to run in transparent mode is not enough, though. You will also need to tell your router to redirect every packet arriving on port 80 to squid. Assuming your LAN is on the 192.168.10.0/24 network and squid is listening on port 1234, you can use this magic command to set up your iptables rule:

iptables -t nat -A PREROUTING -s 192.168.10.0/24 -p tcp --dport 80 -j DNAT --to :1234

If this doesn't work for you, or you want a more detailed explanation, you can check my post about this iptables rule.

Everything ready: you should be able to remove the proxy configuration from your browser and start using squid right away, no client configuration needed. tail -f /var/log/squid/access.log for hours of (thought-policing) fun.

Adding a content filter to squid

Now that you have a gateway and a transparent proxy, it's time to install a content filter too. It's not hard, just go to your squid's config file and search for the acl section. Over there, add the following two lines:

acl blocksites url_regex "/home/router/blocked_sites.acl"
http_access deny blocksites

This will include the blocked_sites.acl file and deny access to every URL on it. There are many blacklist services out there, from which you can download a nice filter to suit your needs.
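
For reference, the blocklist file is just a list of regular expressions, one per line. A made-up example:

# /home/router/blocked_sites.acl -- hypothetical entries
badsite\.example\.com
\.ads\.example\.net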

Of course, you probably don't want to restart squid each time a new site is added to your blocklist. For this you can use "squid -k reconfigure" to make squid reload its configuration.

Some random tips for squid:

  • If you think your squid is responding too slowly, you can manually setup your DNS servers. Considering squid will most likely be installed on the gateway, it might be easier to just use the gateway's gateway for the DNS, instead of the bind service running on the box. You can set this option with the directive "dns_nameservers xxx.xxx.xxx.xxx yyy.yyy.yyy.yyy" on squid.conf.
  • A TCP_MISS in the access.log means that a request was successfully served, but the content was not cached. If you get this message too often, review your cache limits; maybe you can increase them.
  • You don't need to restart squid each time you change the configuration. That would be awkward if you have a lot of users. Try "squid -k reconfigure" instead.

Tuesday 12 November 2013

Human friendly c declarations

An appropriate use of typedefs can transform 99% of C's gruesome type declarations into mostly maintainable, maybe even readable, code. For the remaining 1%, or if you inherited a legacy application from someone with a very twisted mind, you'll probably need a way to decode what "int (*(Foo::*foo)(void**))[3]" means.

To decipher weird C declarations go to http://cdecl.org/ and type your type. It works for most cases... good luck trying to figure out templates, though; for template metaprogramming you are on your own.
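
For what it's worth, here's my own decoding of the example above (double-checked against cdecl.org), untangled with typedefs; all the names are, of course, made up:

// foo is a pointer to a member function of Foo, taking void** and
// returning a pointer to an array of 3 int
struct Foo {
    int arr[3];
    int (*get(void**))[3] { return &arr; }
};

typedef int (*PtrToArray3)[3];               // pointer to an array of 3 int
typedef PtrToArray3 (Foo::*Member)(void**);  // pointer to a member function of Foo

int main() {
    Member foo = &Foo::get;  // same type as int (*(Foo::*foo)(void**))[3]
    Foo f = { {1, 2, 3} };
    void* p = 0;
    return (*(f.*foo)(&p))[0] - 1;  // returns 0
}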

Thursday 7 November 2013

Setting up a Linux GW VII: Fun with iptables, setting up port forwardings

In any LAN you'll probably want to expose some services to the outer world, be it for a bittorrent connection or because you have internal servers you need to access from outside your internal LAN. To do this, you'll have to tell your router to forward some external port to an internal one, like this:

iptables -t nat -A PREROUTING -i eth0 -p tcp \
	--dport PORT -j DNAT --to INTERNAL_IP:INTERNAL_PORT

# This rule may not be needed, depending on other chain configs
iptables -A FORWARD -i eth0 -p tcp -m state --state NEW \
	-d INTERNAL_IP --dport INTERNAL_PORT -j ACCEPT

This is enough to expose a private server to the world, but it will not be very useful when your dynamic IP changes, so you'll need to set INTERNAL_IP to be a static IP.

Of course, these commands are little less than black magic. iptables is rather complex and quite difficult to master, but as a short description we can say it applies a set of rules to incoming network packets. In iptables you have different tables of rules (in this case we use -t[able] nat) and specify that we want our rule applied in the PREROUTING phase. -i specifies that this rule applies only to packets arriving on eth0, and --dport means the rule applies only to packets destined to a certain port. Of course, if you are going to specify a port then you need to specify the protocol too (in this case, tcp).

Now we have replicated in our setup almost all the functionalities a small COTS router has. Next time we'll see how to improve that by adding a proxy.

Tuesday 5 November 2013

Automagically setup breakpoints with gdb

When you are trying to debug a project you don't know, you'll probably end up recompiling a few times and then restarting your debugging session. This can be quite frustrating when you have a gdb workset full of breakpoints, watch expressions and all that stuff.

Luckily you can easily restore your state if you just write all the gdb commands you need into a file, then start gdb with "--command=state.gdb". Magic! All your breakpoints are there.
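
For instance, a state.gdb file could look like this (the breakpoints and the watched variable are made up):

# state.gdb -- restore my debugging session
break main
break parser.cpp:42
watch request_count
run

Then start your session with "gdb --command=state.gdb ./your_binary" and everything is back.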

Alternatively, an even better solution: just don't exit gdb after recompiling. Simply "kill" the process currently under debug (i.e. type "kill" inside gdb; do not kill gdb itself!) and gdb will be smart enough to reload your binary if it changed.

Thursday 31 October 2013

Setting up a Linux GW VI: Configuring a console friendly router and setting up static DHCP IPs

We have so far setup a device capable of working as a router for a medium sized LAN, providing NAT, DHCP and DNS services. This is great if you have a dedicated network admin, but you may prefer something easier for casual console users. We'll see how to "refactor" your server configuration now to make it more console friendly.

Moving DHCP config files

Since I want to keep everything together for easy administration, I will move the DHCP configuration files to /home/router/dhcp. Changing the dhcpd.conf file itself is easy: just move the subnet declarations out and add these two lines:

include "/home/router/dhcp/subnets.conf";
include "/home/router/dhcp/static_hosts.conf";

Like we did before with bind, we need to configure apparmor. vim /etc/apparmor.d/usr.sbin.dhcpd and add these two lines:

/home/router/dhcp/ rw,
/home/router/dhcp/** rw,

Restart the apparmor service, then restart dhcpd. Everything should work just fine.

Setting up static IPs

Remember the static_hosts file we created before? We can use that to define a static IP. Add the following to set a static IP host:

host HostName {
	hardware ethernet 00:00:00:00:00:00;
	fixed-address 192.168.10.50;
}

After that, just restart the DHCP service and renew your client's IP. Done, it's static now!

Wait a minute: how do you find the MAC of your host? I'm too lazy to copy and type it, so I did the following:

cd /home/router/dhcp
ln -s /var/lib/dhcp leases

Then you can check the hardware address in the leases/dhcpd.leases file. I created a symlink to keep this directory at hand, since it gives you a status of the current leases.

Tuesday 29 October 2013

Have you checked your stack?

While getting bitten by running out of stack space is not a common thing, it sure is painful to debug. Unless it's caused by a (very obvious) stack overflow you will usually just get an unrelated segmentation fault in a seemingly random place, and not much help to troubleshoot the problem.

Luckily gcc has an option to verify that your functions do not use an unbounded amount of stack space: just compile with "-fstack-usage" and a .su file will be generated with stack usage information for each function.

You probably want to see only static or bounded stack usages; an unbounded stack usage might be a sign that you should be storing that object on the heap instead.
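
For illustration, the workflow looks more or less like this (the function names and byte counts here are invented):

$ g++ -fstack-usage -c foo.cpp
$ cat foo.su
foo.cpp:2:6:void foo()        48    static
foo.cpp:7:6:void bar(int)     64    dynamic,bounded

"static" means the frame size is fixed at compile time; "dynamic" means it can grow at run time, and "bounded" tells you whether that growth has a known limit.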

Thursday 24 October 2013

Setting up a Linux GW V: DHCP

Our custom Linux router now has DNS and NAT, but the client configuration has been absolutely manual so far. We can't manage many clients with this sort of setup, so let's automate the client config with a DHCP server. Begin by installing isc-dhcp-server.

Edit /etc/dhcp/dhcpd.conf, set the domain-name and the domain-name-servers, like this:

option domain-name "lan";
option domain-name-servers 192.168.10.1, 192.168.0.1;

default-lease-time 86400;
max-lease-time 172800;

authoritative;

I'm not sure if the first line is needed. The second one sets the DNS servers for your clients. Also, increasing your lease time is recommended; I used one day for default leases. Finally, I set this DHCP server as the authoritative server; if this is your router, that's probably what you want.

Now we need to define the network topology:

# This is the WAN network, and we won't provide a service here
subnet 192.168.0.0 netmask 255.255.255.0 {
}

# Define the service we provide for the LAN
subnet 192.168.10.0 netmask 255.255.255.0 {
	range 192.168.10.100 192.168.10.200;
	option routers 192.168.10.1;
}

Now we need to restart ISC:

sudo /./etc/init.d/isc-dhcp-server restart

And now we check on the client that everything worked. It's easy this time: we just ask for an IP:

sudo dhclient
ifconfig

If everything went fine, we should now have an IP in the 100-200 range, as well as the DNS server in /etc/resolv.conf. We have now set up a very basic router and should be able to serve several clients with basic browsing capabilities.

Next time we'll see how to tidy up everything, for easier administration.

Tuesday 22 October 2013

Some gratuitous MSVC bashing

Recently I found out Microsoft's Visual Studio doesn't support alternative tokens (i.e. "and" instead of "&&"). Even worse, apparently they don't think it's even necessary. And by the looks of this thread, the people working on MSVC need to take some time to actually READ the C++ standard. You know... it's kind of like the spec for your product, and it's always good to understand the spec for your product...

I can only imagine how incredibly ugly their lexer must be to say it's not a fixable problem.

Thursday 17 October 2013

Setting up a Linux GW IV: Setting up apparmor

Apparmor is a service that runs in the background, checking what other binaries can and can't do. For example, it will allow bind9 to open a listening socket on port 53 (DNS), but it will deny an attempt to open a listening socket on port 64. This is a security measure to limit the damage a compromised bind9 binary running as root might do. And since we are going to use a non-standard configuration, we need to tell apparmor that it's OK.

After installing bind9 we should get a new file in /etc/apparmor.d/usr.sbin.named. Add the following lines at the bottom:

  /home/router/named/** rw,
  /home/router/named/ rw,

And restart apparmor service:

/./etc/init.d/apparmor restart

Since we modified apparmor to allow a non-standard bind installation, now restart bind too. This time it should start without any errors, and you should be able to tail -f /home/router/named/dns.log to see the DNS queries in real time. If it doesn't work, check that /home/router/named is writable by the bind user (I did a chgrp -R bind named).

Tuesday 15 October 2013

A C++ template device to obtain an underlying type

What happens when you need to get the underlying data type of a pointer or reference? You can write some crazy metaprogram to do it for you. Like this:

template <typename T> struct get_real_type      { typedef T type; };
template <typename T> struct get_real_type<T*>  { typedef T type; };
template <typename T> struct get_real_type<T&>  { typedef T type; };

template <class T>
int foo() {
    return get_real_type<T>::type::N;
}

struct Bar {
    static const int N=24;
};

#include <iostream>
using namespace std;
int main() {
    cout << foo<Bar*>() << endl;
    cout << foo<Bar&>() << endl;
    cout << foo<Bar>() << endl;
}

Incidentally, this is also the basis for the implementation of std::remove_reference. Actually, you'd be better off using std::remove_reference, for your own sanity.
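
For reference, the standard (C++11) equivalents cover both cases from the code above: std::remove_reference for T& and std::remove_pointer for T*. A quick compile-time check:

#include <type_traits>

static_assert(std::is_same<std::remove_reference<int&>::type, int>::value,
              "strips the reference");
static_assert(std::is_same<std::remove_pointer<int*>::type, int>::value,
              "strips the pointer");

int main() {}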

Thursday 10 October 2013

Setting up a Linux GW III: Setting up DNS with bind9

If you have been following my series on how to install a Linux based router, you should now have a setup where a client is able to see the outside world via a router. We can try something more complex now, like pinging a domain instead of an IP. Something like this:

ping google.com

You should get a message saying the host is unknown. Can you guess why? Right, there's no DNS.

Setting up DNS

DNS will be necessary to resolve domains to IPs. bind9 is the default option for Debian based servers (are there others? no idea).

sudo apt-get install bind9

This will get your DNS server up and running, but you will still need to add this server manually to your client (again, because there's no DHCP running):

echo "nameserver 192.168.10.1" | sudo tee /etc/resolv.conf

And now:

ping google.com

Magic again: it (may) work. If it doesn't, you may need to open /etc/bind/named.conf and set up your router (192.168.0.1) as a forwarder, then restart the bind server.

Of course this is rather boring. If you are going to install a DNS you might as well create a custom TLD for your LAN.

Setting up a custom TLD with bind9 for your LAN

So far on the series about how to install a Linux based router, we set up a Linux router with NAT and a basic DNS. Now we'll setup a custom TLD, so you can have custom domains for your LAN. For example, if you want your router to have a nice user friendly name, instead of just an IP.

Let's start by adding a local zone to /etc/bind/named.conf.local, for a domain we'll call "lan":

zone "lan" {
        type master;
        file "/home/router/named/lan.db";
};

Now we need to add a reverse zone. Note how the name is the IP reversed:

zone "10.168.192.in-addr.arpa" {
	type master;
        file "/home/router/named/rev.10.168.192.in-addr.arpa";
};

We still need to create both files (lan.db and rev.10.168.192.in-addr.arpa), but we'll do that later. Let's set up a place to log all the DNS queries first (optional):

logging {
    channel query.log {
        file "/home/router/named/dns.log";
        severity debug 3;
        print-time yes;
    };

    category queries { query.log; };
};

For the log entry I have chosen /home/router/named as the log directory, just because for this project I'm keeping everything together (config and logs) so it's easy for people not used to administering a Linux box. Of course, this means apparmor must be configured to allow reads and writes for bind in this directory. We'll get to that in a second; first, let's create the needed zone files for our new TLD.

Remember our two zone files? I put them on /home/router/named, but usually they are on /etc/bind. Again, I did this so I can have all the config files together. These are my two files:

For lan.db

lan.      IN      SOA     ns1.lan. admin.lan. (
                                                        2006081401      ; serial
                                                        28800           ; refresh
                                                        3600            ; retry
                                                        604800          ; expire
                                                        38400 )         ; minimum

lan.      IN      NS              ns1.lan.

wiki             IN      A       192.168.10.66
ns1              IN      A       192.168.10.1
router           IN      A       192.168.10.1

For rev.10.168.192.in-addr.arpa

@ IN SOA ns1.lan. admin.lan. (
                        2006081401      ; serial
                        28800           ; refresh
                        604800          ; retry
                        604800          ; expire
                        86400 )         ; minimum

                     IN    NS     ns1.lan.
1                    IN    PTR    lan

Most of these lines are black magic, and since an explanation of both DNS and Bind is out of scope (feel free to read the RFC if you need more info) let's just say you can add new DNS entries by adding lines like this:

NICE_NAME           IN      A       REAL_IP

This will make bind translate NICE_NAME.lan to REAL_IP; of course, the suffix depends on the TLD you defined. Now restart bind to get a crapton of errors: it will complain about not being able to load a master file in /home/router/named. Remember that apparmor thing I mentioned?

Tuesday 8 October 2013

Gcc tip: better disassembly

Few things are more awesome than compiling with "g++ -S" and inspecting gcc's disassembly to learn how the compiler optimizes things you wouldn't even think about. Unfortunately, assembly might not be the most human friendly format for a program (though I've seen worse).

While you won't escape the need to learn some assembly to get any meaningful information out of gcc's assembly listing, there are some tips which might make your life much easier:

c++filt

c++filt ships with binutils (you'll already have it if you installed the build essentials) and will turn mangled names into proper C++ names. You won't need to remember that _Znwm is the mangled version of "operator new": just run "g++ -S foo.cpp -o /dev/stdout | c++filt" and you'll get an assembly listing with unmangled names.

-fverbose-asm

Some people have the ability to read assembly and quickly understand how the data flows between registers and variables. For mere mortals like us, gcc has a very helpful flag called "-fverbose-asm" which adds a comment to each line where a variable is referenced. This lets you keep track of the data flow inside a function.
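
For example (the extra flags are just a typical invocation):

# each asm line referencing a variable gets a trailing comment with its name
g++ -S -O2 -fverbose-asm foo.cpp -o foo.s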

Extra, unrelated, tip:

As far as I know, gcc has no option to write to stdout; just use "-o /dev/stdout" to let it write to a fake file which Linux will helpfully create for you, then you can pipe the hell out of gcc's output.

Friday 4 October 2013

Five years

It's been a while since I've used the meta-post category. I think it's a good opportunity to do so: exactly five years ago I typed "/./etc/init.d/blog start", on this same blog. Quite a long time.

I'm really surprised I've managed to keep on writing more or less regularly for five years on this blog. There are now about 360 articles on this site, which gives an average of about one per week. That's a nice metric, even if not very accurate. I spent almost a year without writing, while moving to a different country. Maybe I should start a blog about that too.

Here's for five years more!

Thursday 3 October 2013

Setting up a Linux GW II: NATting and forwarding

For our Linux GW, services like DNS and DHCP are nice-to-have, but having real connectivity is way more important. Let's set up the NAT and connection forwarding features of the new router, then we can test if our setup is working properly by pinging an IP of one LAN from the other.

We'll do this by setting up NAT with iptables. We'll also have to configure the OS to forward connections from one network card to the other:

echo 1 > /proc/sys/net/ipv4/ip_forward
iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
# Add a line like this for each eth* LAN
iptables --append FORWARD --in-interface eth1 -j ACCEPT

We will also need to set up the IP for eth1, since there won't be a DHCP server on our LAN (we ARE the server!). Open /etc/network/interfaces and add something like this:

# Configure the WAN port to get IP via DHCP
auto eth0
iface eth0 inet dhcp
# Configure the LAN port
auto eth1
iface eth1 inet static
	address 192.168.10.1	# (Or whatever IP you want)
	netmask	255.255.255.0	# Netmasks explanations not included

Once that's checked, restart networking services like this:

sudo /./etc/init.d/networking restart

Everything ready: now just plug your PC into the new router and test it. Remember to manually set your IP in the same network range as your router, since there's no DHCP at the moment. Going step by step like this also makes it easier to debug any problem.

In your client PC, set your IP address:

ifconfig eth0 192.168.10.10

Test if your IP is set:

ping 192.168.10.10

If you get a reply, your new IP is OK; if not, there's a problem with your client. Second step: see if you can reach the router:

ping 192.168.10.1

Note that you may need to renew everything (i.e. restart networking and manually assign your IP) after you connect the cable.

Again, if you get a reply then you have connectivity with the router. So far we haven't tested the iptables rules nor the forwarding, so any issue up to this point must be an IP configuration problem. If everything went well, it's time to test the NAT rules and the forwarding.

ping 192.168.0.1

That should give you an error. Of course, since there's no DHCP there's no route set. Let's manually set a route in the client:

sudo route add default gateway 192.168.10.1

Then again:

ping 192.168.0.1

Magic! It works! If it doesn't, you have a problem either in the NAT configuration or in the IP forwarding of the router. You can check which one with wireshark: if the pings reach the server but never get a reply back, then it's the NAT; i.e. the router can forward the IP packets from eth1 to eth0, but without NAT it doesn't know how to route the answer back. If the pings never even reach eth0, then you have an IP forwarding problem.

Persisting the forwarding rules

In order to have the forwarding rules persist after a reboot, we first need to change /etc/sysctl.conf to allow IP forwarding. It's just a matter of uncommenting this line:

net.ipv4.ip_forward = 1

We will also have a lot of iptables rules we need to setup during boot time. I have created a script at /home/router/set_forwarding.sh, which I also linked into /etc/init.d/rc.local so it's run whenever the system boots.
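
A minimal sketch of what that script could contain; this just collects the rules from this post, so adjust the interfaces to your setup:

#!/bin/sh
# /home/router/set_forwarding.sh -- run at boot from rc.local
iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
iptables --append FORWARD --in-interface eth1 -j ACCEPT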

Next time we'll move on to something more complex: installing a DNS server and using domains instead of IPs.

Tuesday 1 October 2013

C preprocessor VII: Recursive expansion on function macros

Last time we talked about the recursive expansion rules of C's preprocessor. To sum it up: each expansion creates a scope, which contains a list of all macros already expanded in said scope or in a parent scope. That gives us a very nice and easy to understand tree of already-expanded rules.

Clearly that's too easy for C. We need more complexity: we need to make the expansion rules interact with the argument substitution process and the preprocessor operators too!

How exactly? The whole process is specified by a very tiny paragraph, 16.3.1, on the standard, which despite being tiny contains a lot of information. Actually, it contains all the expansion and precedence rules for the preprocessor. And it's more or less like this:

  1. Argument scanning: the preprocessor binds a set of tokens to each argument name. If there are extra arguments and the token "..." is part of the macro's signature, a __VA_ARGS__ argument is created. (To put it simply: it will bind a set of tokens like "(a,b)" to an identifier like "ARG1".)
  2. Stringify and token pasting are applied ONLY to the arguments, not to the macro body.
  3. Each argument is recursively scanned for macro expansion, as if each argument was on a file on its own (imagine a new file is created with only preprocessor directives and the argument, then apply the expansion algorithm recursively to that file).
  4. After the arguments have been fully expanded, they are substituted on the macro's body.
  5. The resulting definition is then rescanned for macro expansions or token pasting operators.
  6. A side effect of this multi-phase macro expansion is that the nice expansion tree we used to have no longer works.

Let's take this example:

#define str(...) #__VA_ARGS__
#define foo(a, b) foo a bar str(b)
#define bar foo bar 1
foo(bar, (1, 2, 3))

How can we expand this macro call? Like this:

expand{ foo(bar, (1, 2, 3)) }
        Match foo with definition of macro: foo(a, b)
            Bind a to bar
            Macro expand argument a -> expand{ bar }
                    bar takes no arguments, no binding is done
                    Apply rule bar -> foo bar 1
                    Scan the result for new expansions
                            foo was already expanded, no further expansion

            Bind b to (1, 2, 3)
            Macro expand argument b -> nothing to expand

        Replace macro expanded arguments in body definition:
            -> foo foo bar 1 bar str((1, 2, 3))

        Rescan the body for further expansion:
                foo: Already expanded on current scope
                foo: Already expanded on current scope
                bar: Already expanded (the compiler will have to keep a map of expanded macros for each identifier in a definition!)
                bar: Needs expansion
                        Apply rule bar -> foo bar 1
                        Rescan for further expansion
                                foo: Already expanded on parent scope
                                bar: Already expanded on current scope
                str((1, 2, 3)): Expand macro call
                        Bind (1, 2, 3) to __VA_ARGS__
                            Analyze (1, 2, 3) for further expansion
                            Apply operator '#' to (1, 2, 3) -> "(1, 2, 3)"
                        Replace #__VA_ARGS__
                Replace the result of str((1,2,3)) -> "(1, 2, 3)"

        Replace the original call "foo(bar, (1, 2, 3))" for the result
            -> foo foo bar 1 foo bar 1 "(1, 2, 3)"

This last example should be a good representative of the complexities involved in a macro expansion; hopefully now you know more than you ever wanted to know about macros.

Thursday 26 September 2013

Setting up a Linux gateway/router, a guide for non network admins

Setting up a Linux GW or router is not as hard as it may seem, as long as you are reading a friendly enough guide. Yes, there are a lot of guides for this, but since I needed to document how I did it, I might as well write a post about it here. My addition to the usual "setting up a linux gw guide": I'll do it using Virtualbox first, so I can test my setup before actually deploying it.

I'm going to write about how you can set up a regular Linux distro to be the border router/gateway for your LAN, but for ease of use I'll base my examples on Ubuntu.

As expected, if we are going to replace a device, say, a router, we need to replace it with something that provides the same functionality. In this case we have chosen a Linux server, so we need to figure out which services the router provides and then emulate them somehow:

  • DHCP to manage leases
  • DNS to translate domains to IPs
  • NAT, to multiplex a single connection
  • Service forwarding, to expose internal services to an external network
Luckily Linux supports all of these:
  • ISC for DHCP
  • bind9 for DNS
  • iptables for NAT
  • iptables again, for service forwarding
We'll be setting up each of these services in the next posts, for now:

Preliminary work, the hardware setup

Before you set up any services, you are going to need two things: first, two network cards, one for the outgoing connection and another for the (switched) LAN; second, a way of telling your server that you want all traffic from network 1 forwarded to network 2. You may want to install more than two cards in case you need to route several LANs. We'll see that later.

You will also need an OS. I have chosen Ubuntu because it's very simple to install, and has all the software we need available in the repositories, but you can use any other distribution if it suits your needs.

Also, throughout this guide I will assume a setup like this:

  • WAN access through eth0, DHCP address
  • LAN routing in eth1, network 192.168.10.1/24

If you don't have all this hardware...

Not everyone may have two spare desktops with three NICs ready for testing. Even if you do, you may be too lazy to setup the physical part of your network. If this is your case, you can also setup a virtual machine to emulate your setup, and Virtualbox is great for the task:
  1. Begin by creating what will be your router VM.
  2. Enable the first network adapter. This one should be able to see your physical router (i.e. connect to a WAN).
  3. Enable a second network adapter. Use the 'Internal network' option in the 'Attached to' field. This will be your LAN interface.
  4. Create a second VM. This one will be your client.
  5. Enable a single network adapter, attached to an internal network as well. The name for this network should match that of the other VM.
You are all set now, with this virtual setup you can begin setting up your router. We'll see how next time.

Tuesday 24 September 2013

C preprocessor VI: Recursive macro expansion rules

What happens if you define a recursive macro? This might seem like a silly question, but by asking it we can gain some insight on the inner working of the preprocessor.

Let's start with a simple example:

#define foo bar 1
#define bar foo 2
foo

Luckily the preprocessor is smart enough not to trip up on this simple piece of code. When expanding foo on line three it will do something like this:

#define foo bar 1
#define bar foo 2
foo
// Applies foo -> bar 1
bar 1
// Applies bar -> foo 2
foo 2 1
// Scans foo again... but doesn't expand it

The second time the preprocessor scans foo it won't expand it: it "knows" foo was already expanded, so it won't do it again. But how does it know that foo was already expanded? Let's try something a bit more complicated:

#define foo foo a bar b baz c
#define bar foo 1
#define baz bar 2
foo

And then let's see how foo is expanded, step by step.

First the rule "foo -> foo a bar b baz c" will be applied and the results rescanned: let's call this scope 1. We'll end up with:

foo a bar b baz c

Now the results of this expansion will be scanned, in a new scope; let's call it scope 2. The first token the preprocessor sees is "foo", which was already expanded in scope 1: it will be ignored, and scanning continues with the next expandable token, "bar", expanding it to "foo 1"; the same then happens with "baz", leaving:

foo a foo 1 b bar 2 c

In the scope that baz's expansion creates (scope 4), the parent scope's expansion rules are "inherited": for scope 4, "foo" was already expanded but "bar" was not, because bar's expansion happened in scope 3 and scope 3 is not scope 4's parent. Not following me? Try this diagram:

foo -> foo a bar b baz c
    foo -> already expanded, ignore
    a   -> not a macro, ignore
    bar -> expand to "foo 1"
        foo -> expanded at parent scope, ignore
        1   -> not a macro, ignore
    b   -> not a macro, ignore
    baz -> expand to "bar 2"
        bar -> expand to "foo 1"
            foo -> already expanded at parent scope, ignore
            1   -> not a macro, ignore
        2   -> not a macro, ignore
    c   -> not a macro, ignore

Hopefully the preprocessor expansion rules are a bit clearer now: each expansion creates a scope, each scope inherits from its parent scopes whether a rule was applied or not, and if it was, said rule is ignored in the current scope.

Of course these rules get more complicated when dealing with token pasting and stringifying operators, because each phase (stringifying, token pasting, rescanning and expansion) will happen in a specific order. Things get even more complicated when you realize (by reading the standard) that said order is not the same when you deal with argument replacement.

Then again, it's probably a good idea if your macros don't rely on the recursive expansion rules of the preprocessor.

Thursday 19 September 2013

Vim tip: Jump to a tag definition

I admit it, hitting control enter to jump to a definition is a very useful feature of fancy GUIs. Fortunately, it's not exclusive to fancy GUIs, apparently it's been available using ctags too for a little while (if you consider the last 20 or 30 years to be little, that is).

Just add this magic spell to your .vimrc; next time you open a file in a project with a tags file generated just press ctrl-enter with your cursor over the definition you wish to find:

map <C-CR> :tab split<CR>:exec("tag ".expand("<cword>"))<CR>

Tuesday 17 September 2013

C preprocessor V: Conditionals

While wandering around the C preprocessor we came to know the stringify operator, the crazy token pasting operator and the __VA_ARGS__ macro. All very weird, but at least the #if's work in a sane way... or do they? They do, but there's some room for unexpected behavior if you don't know certain implementation details. Take this code, for example:

#if 0
#  if 0
#  else
#  elif true
#  endif
#endif

Clearly the inner #if is wrong, because the #else clause comes before the #elif; however, you might think it doesn't matter because it's surrounded by an #if 0. Surprise: it does matter, that's not valid preprocessor input. Even if the outer #if is not "taken", whatever preprocessing directives are inside it must still be valid (though anything that's not a preprocessing directive will indeed be ignored).

Even though at first it might seem weird for things inside an #if 0 to be important, it makes sense if you consider that, should an inner #if not respect the proper structure, the preprocessor wouldn't know where the outer #if 0 ends. Then again, if you find any real-world use for this bit of preprocessor implementation trivia, you are doing something horribly wrong!

Thursday 12 September 2013

Vim tip: open file under cursor

If you have a bunch of #include's you don't need to manually type their path when you need to edit one of them; just place your cursor on top of the written path and remember these helpful commands:
gf         open in the same window ("goto file")
Ctrl-w f   open in a new window
Ctrl-w gf  open in a new tab

Tuesday 10 September 2013

C preprocessor IV: VA Args

And things just got even more awesome in our preprocessor series: if passing a known number of parameters is not cool enough for you, you can use a variable number of arguments in a macro definition too. This is very useful to implement printf-style debug macros which get replaced by no tokens in a release build. Or to make debugging a bit more complicated; your choice.

#define DEBUG(fmt, ...) printf(fmt, __VA_ARGS__);
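
As a sketch of the "no tokens on a release build" trick (assuming an NDEBUG-style flag marks release builds):

#include <cstdio>

#ifdef NDEBUG
#  define DEBUG(fmt, ...)
#else
#  define DEBUG(fmt, ...) std::printf(fmt, __VA_ARGS__)
#endif

int main() {
    DEBUG("%s = %d\n", "answer", 42);  // vanishes when NDEBUG is defined
}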

Combining this with stringify will provide you hours of fun. Combining this with token pasting... well, that's just evil.

Thursday 5 September 2013

Stopping commits on git

Who hasn't committed debug code by mistake? It's only normal to forget to remove an #include we added just to test some stuff. Luckily it's easy to tell git that we don't want to commit any change containing a certain string.

In any git repo you'll find a .git/hooks folder; add this script as .git/hooks/pre-commit (and don't forget to chmod +x it):

#!/bin/sh

# Check the staged changes (--cached): that's what is actually being committed
if [ 0 != `git diff --cached | grep "STOPCOMMIT" | wc -l` ]; then
    echo "Error: STOPCOMMIT found, remove it before committing";
    git diff --cached
    exit 1
fi

Now git will check your commits and stop them if they contain the STOPCOMMIT string. You can add all the debug changes you want; as long as you add a //STOPCOMMIT after them, you'll never end up committing them by mistake.

Monday 2 September 2013

C preprocessor III: Token pasting

A stringify operator is good, but the token pasting operator goes off the awesomeness chart (if you're working on an IOCCC entry, that is). What token pasting does is conceptually simple: it pastes together two tokens to form a new one; so, for example, PASTE(foo, bar) would result in the "foobar" token. Looks simple enough, doesn't it? The token pasting operator is invoked via '##'. For example:

#define PASTE(x, y) x ## y
#define FOOBAR 42
int main() { return PASTE(FOO, BAR); }

The previous code would just return 42. So what's the usefulness of a paste operator? Other than obfuscating stuff, you can use it to create classes with similar interfaces but different method names (I'm not saying it's a good idea, I'm saying you can). For example:

#define MAKE_GET_SET(x, T) \
               void set_ ## x (T o) { this->x = o; } \
               T get_ ## x () { return this->x; }

class Foo {
    int foo;
public:
    MAKE_GET_SET(foo, int)
};

The token pasting operator doesn't have the limitation of being applicable only to a macro parameter, so code like "12 ## 34" is a perfectly valid operation which results in "1234". It does have a catch: if the resulting token is not valid, the behavior is undefined. This means that, for example, pasting "12" and "foo" together produces "12foo", which is not a valid token. The operation being undefined means that a compiler might reject it (I'm pretty sure gcc does) or might do something completely different (it could choose to ignore the token pasting operator and still be standard compliant).

Nasal demons FTW!

Thursday 29 August 2013

Vrapper: a real text editor for Eclipse

It's been a while since I had to use Eclipse, and last time I did I was really disappointed by the lack of a real text editor embedded in it. There were a few Vim plugins for Eclipse back then, but all of them were paid. Luckily now there seems to be a decent open alternative: http://vrapper.sourceforge.net/documentation/

Vrapper provides pretty decent text editing capabilities for Eclipse; I'd almost say it makes it usable. And, as a beneficial side effect, now your coworkers won't be able to touch your Eclipse anymore!

Tuesday 27 August 2013

C preprocessor II: stringify operator

We all more or less know the list of operators that C++ provides for "normal code", but not everyone is aware that the preprocessor also has special operators we can use. Small difference: an operator like '+' will usually operate on numbers, but the preprocessor operates on a single concept: source code tokens. What kind of operators could a preprocessor have, then? Two, actually. Let's start with the simpler one: stringify.

The '#' operator is the simplest operator of the preprocessor: it converts the next token to a string. Something like this, for example:

#define f(x) to_str(x) == #x
f(123)

would expand to:

to_str(123) == "123"

A restriction applies to the stringify operator: it can only be applied to a macro parameter, not just any token. So this, for example, is an illegal macro:

#define f(x) #123 == #x

There's another operator, which is a bit more "esoteric". We'll talk about token pasting next time.

Thursday 22 August 2013

Crazy git error

Have you ever run into this error message on git before?

fatal: example.com/repo.git/info/refs not found: did you run git update-server-info on the server?

It can be very baffling, because it may happen even if you changed absolutely nothing in your git configuration. Most people attribute this to a typo, and that seems to be the most common cause, but I found yet another thing that might trigger this error: if you have set a proxy server (for example for wget) using an environment variable like http_proxy, https_proxy or ftp_proxy, git might trip up on your proxy and produce this error message.

Tuesday 20 August 2013

C preprocessor: Just a simple replacer?

Lately, out of curiosity, I spent some time trying to better understand how the C preprocessor works. I admit it: I thought it was a very dumb copy-paste based replacement mechanism, only capable of the simplest keyword matching and replacement. Boy, was I wrong. Turns out the preprocessor is actually an organically grown pseudo-language (as opposed to a properly designed language feature) inside C, which later got standardized through an incredibly complex set of rules and definitions. Rules for recursion, expansion, pattern matching and crazy operators like # and ## are some of the things I never knew existed in the preprocessor.

During my time toying with the preprocessor I learned a few things about recursion, the different operators it supports and some crazy things about the order of conditional evaluation. I'll summarize some of what I learned in the next few posts; you might want to check section 16.3 of the C++ standard, since the next few articles will basically be explanations of different paragraphs in that section.
Disclaimer: if you find any real-world utility in these bits of preprocessor trivia, you are probably doing something horribly wrong or horribly evil!

Thursday 15 August 2013

Avoid compile warnings from 3rd party libs with gcc

So, your code is ferpect. It compiles cleanly with all warning options maxed out. You have already added -ansi, -pedantic, -Wall, -Wc++0x-compat, -Wextra and it all works. Even -Weffc++ emits no warning. And then, a wild third party library appears; your beautiful compile log is now littered with "initialization out of order" and "should declare a virtual destructor" warnings. What to do?

When including a third party library (like, for example, boost) you will almost never have the option to fix the diagnostics your compiler helpfully reports. If there's nothing you can do about them, there's no point in seeing the warnings either. Disabling -Weffc++ is not a good idea though: if you already took the effort of cleaning your code to such a high standard, you shouldn't relax it now.

There's a third option: when compiling, don't include those libs with "-I /path/to/lib"; do it with "-isystem /path/to/lib" instead. Gcc will then know those warnings are not your fault and will stop nagging you.
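
For example, assuming the offending headers live in /path/to/lib:

# warnings from the library headers litter the log:
g++ -Wall -Wextra -Weffc++ -I /path/to/lib -c foo.cpp
# same headers treated as system headers; the warnings disappear:
g++ -Wall -Wextra -Weffc++ -isystem /path/to/lib -c foo.cpp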

Tuesday 13 August 2013

Git tip: auto update your ctags

In your .git/hooks folder, add this script as .git/hooks/post-merge (and don't forget to chmod +x it):

#!/bin/sh
ctags -R -f .ctags .

Now every time you do a git pull, your ctags file will automagically update. You might also want to copy or ln -s this script as the post-commit hook, if you want to update your tags on each git commit. Be aware that this will make your commits slower if generating your tags file takes a long time.

Extra tip: "-f .ctags" will make ctags write into a hidden file, .ctags, which you can then add to .gitignore. Now ctags magically works in Vim and you won't even need to see your tags file (just don't forget to "set tags=./.ctags;/" on vim).

Thursday 1 August 2013

Pictag: finally a simple geotagging tool for Linux

TL;DR: Link to a mostly working hacked version of Pictag, on my Github repo.

Since Google decided not to support Picasa for Linux anymore (yes, a long time ago) I've been looking for a decent photo management alternative. Lately I've settled with Digikam, it does everything Picasa used to do (and much better, I may add) except for providing a way to geotag your pictures on a map.

Most geotagging solutions involve having a waypoint map already created by a GPS device, which then gets processed and magically added to the images' EXIF data. That didn't cut it for me: I don't have, nor want, a GPS to take on holidays; I really just want to drag and drop pictures onto a map. That's where pictag comes in.

At the moment pictag seems to be a bit abandoned, as there are no more packages for Ubuntu 13.04. Luckily with some hacking it's possible to get it running.

First, since there's no package for Pictag you'll need to take care of the dependencies yourself. On a more or less vanilla 13.04 install, this should do the trick:

sudo apt-get install python-setuptools \
                     python-distutils-extra \
                     geoclue-ubuntu-geoip \
                     liblaunchpad-integration-common \
                     libchamplain-0.12-0 \
                     libchamplain-0.12-dev \
                     libchamplain-gtk-0.12-0 \
                     libchamplain-gtk-0.12-dev \
                     python-pyexiv2 \
                     libclutter-gtk-1.0-0 \
                     libclutter-gtk-1.0-dev

After you've taken care of that you can download the latest version from Launchpad (while writing this article that should be 12.07.17) and run ./bin/pictag, only to watch it fail miserably.

Pictag seems to be using GSettings, a very annoying Gnome settings manager which won't work unless you actually install whatever program you're trying to run. Luckily we can hack it out of Pictag simply by commenting out all references to self.settings in PictagWindow.py and Window.py. Either that, or get my hacked version of Pictag from my Github repo.

With some luck, my hacked version of Pictag should run pretty much OK on Ubuntu 13.04 or newer. There seems to be a few issues with libchamplain (the mapping library) on earlier versions of Ubuntu that may cause the map to display only broken images. If you can't load any maps you'll have to get a newer Ubuntu. Or fork my repo and get hacking :)

Tuesday 30 July 2013

Force a program to output to stdout

Silly but handy CLI trick on Linux: some programs don't have an option to output to stdout; gcc comes to mind. In those cases the special file '/dev/stdout' comes in handy: for each process it points to that process' own standard output.

With this trick you could, for example, run "gcc -S foo.cpp -o /dev/stdout", to get the assembly listing for foo.cpp.

You probably shouldn't use this trick on anything other than CLI scripting stuff (keep in mind /dev/stdout might be closed or not accessible for some processes).

Thursday 25 July 2013

C++ exceptions under the hood appendix III: RTTI and exceptions orthogonality

Exception handling in C++ requires a lot of reflection. I don't mean the programmer should be reflecting on exception handling (though that's probably not a bad idea); I mean that a piece of C++ code should be able to understand things about itself. This looks a lot like run-time type information, RTTI. Are they the same? If they are, does exception handling work without RTTI?

We might be able to get a clue about the difference between RTTI and exception handling by using -fno-rtti on gcc when compiling our ABI project. Let's use the throw.cpp file:

g++ -fno-rtti -S throw.cpp -o throw.nortti.s
g++ -S throw.cpp -o throw.s
diff throw.s throw.nortti.s

If you try that yourself you should see there's no difference between the RTTI and the no-RTTI versions. Can we conclude, then, that gcc's exception handling is done with a mechanism different from RTTI? Not yet; let's see what happens if we try to use RTTI ourselves:

void raise() {
    Exception ex;
    typeid(ex);
    throw Exception();
}

If you try and compile that, gcc will complain: you can't use typeid with -fno-rtti specified. Which makes sense. Let's see what typeid does with a simple test:

#include <typeinfo>

class Bar {};
const std::type_info& foo()
{
        Bar bar;
        return typeid(bar);
}

If we compile this with "g++ -O0 -S", you will see foo compiled into something like this:

_Z3foov:
.LFB19:
    # Prologue stuff...

    subl    $16, %esp
    # Bar bar

    movl    $_ZTI3Bar, %eax
    # typeid(bar)

    leave
    # Epilogue stuff...

_ZTS3Bar:
    # Definition for _ZTS3Bar...

_ZTI3Bar:
    .long   _ZTVN10__cxxabiv117__class_type_infoE+8
    .long   _ZTS3Bar
    .ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
    .section    .note.GNU-stack,"",@progbits

Does that look familiar? If it doesn't, then try changing the sample code to this one:

class Bar {};
void foo() { throw Bar(); }

Compile it like "g++ -O0 -fno-rtti -S test.cpp" and see the resulting file. You should see something like this now:

_Z3foov:
    # Prologue stuff...

    # Initialize exception
    subl    $24, %esp
    movl    $1, (%esp)
    call    __cxa_allocate_exception
    movl    $0, 8(%esp)

    # Specify Bar as exception thrown
    movl    $_ZTI3Bar, 4(%esp)
    movl    %eax, (%esp)

    # Handle exception
    call    __cxa_throw

    # Epilogue stuff...

_ZTS3Bar:
    # Definition for _ZTS3Bar...

_ZTI3Bar:
    .long   _ZTVN10__cxxabiv117__class_type_infoE+8
    .long   _ZTS3Bar
    .ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
    .section    .note.GNU-stack,"",@progbits

That should indeed look familiar: the class being thrown is exactly the same as the class that was used for typeid!

We can now conclude what's going on: the implementation of exception throwing type information, which needs reflection and relies on RTTI info, is exactly the same as the underlying implementation of typeid and other RTTI friends. Specifying -fno-rtti on g++ only disables the "frontend" functions for RTTI: you won't be able to use typeid, and no RTTI classes will be generated... unless an exception is thrown, in which case the needed RTTI classes are generated regardless of -fno-rtti being present (you still won't be able to access this class' RTTI information via typeid, though).

Tuesday 23 July 2013

A random slideshow in Ubuntu

The other day I wanted to use my TV for a slideshow of my travel pictures. Something simple: just select a folder and have a program like Shotwell create a slideshow in random order on my TV. Of course, Ubuntu plus a second screen equals fail. For some reason all the programs I tried were either incapable of using the TV as the slideshow screen (even after cloning screens... now that's a wtf) or unable to recursively use all the pictures in a folder.

feh to the rescue. It's not pretty, but feh seems to be exactly what I was looking for. It's a CLI application for Linux and after some RTFM I came up with this script:

feh ~/Pictures \
     --scale-down \
     --geometry 1920x760 \
     --slideshow-delay 9 \
     --recursive \
     --randomize \
     --auto-zoom \
     --draw-filename \
     --image-bg black

You can probably figure out by yourself what each option means. If not, just man feh.

Thursday 18 July 2013

Mocking in C++: the virtual problem

Mocking objects is crucial for a good test suite. If you don't have a way to mock heavy objects you'll end up with slow and unreliable tests that depend on database state to work. On the other hand, C++ mocking tends to be a bit harder than in dynamic languages. A frequent problem people find when mocking is virtual methods.

What's the problem with virtual methods? C++ has the policy of "not paying for what you don't use". That means, not using virtual methods is "cheaper" than using them. Classes with no virtual methods don't require a virtual dispatch nor a vtable, which speeds things up. So, for a lot of critical objects people will try to avoid virtual methods.

How is this a problem for mocking? A mock is usually a class which inherits from the real class, as a way to get the proper interface and to be compatible with code that uses the real implementation. If a mock inherits from the real thing you'll need to define all of its methods as virtual, even if you don't need to, just so you can actually implement a mock.

A possible solution

The problem is clear: we need some methods to behave as virtual, without defining them as virtual.

A solution to this problem, the one I personally choose, is using a TEST_VIRTUAL macro in the definition of each mockable method of a class; for release builds I compile with -DTEST_VIRTUAL="", and for testing builds with -DTEST_VIRTUAL="virtual". This method is very simple and works fine, but has the (very severe) downside that the released code differs from the code you test; this might be acceptable, but it's a risk nonetheless.
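
A minimal sketch of the trick (the class names here are made up):

// Release builds: g++ -DTEST_VIRTUAL="" ...
// Testing builds: g++ -DTEST_VIRTUAL=virtual ...
#ifndef TEST_VIRTUAL
#  define TEST_VIRTUAL
#endif

class Database {
public:
    TEST_VIRTUAL int query(int id) { return id * 2; }  // the real implementation
};

class MockDatabase : public Database {
public:
    TEST_VIRTUAL int query(int) { return 42; }  // canned answer for tests
};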

Other possible solutions I've seen in the past are:

  • Making everything virtual, even if not strictly necessary. Quite an ugly solution, in my opinion: it can affect performance, and the code ends up stating that a method can be overridden even when that doesn't make sense.
  • Using some kind of CRTP for static dispatching: probably one of the cleanest solutions, but I think it adds too much overhead to the definition of each class.
  • Don't make the mock inherit from the real implementation, make the user code deduce the correct type (eg by using templates). It's also a clean solution, but you loose a lot of type information (which might or might not be important) and it might also severely impact the build time
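
Here's roughly what the CRTP option can look like (type names invented for this sketch): the base class dispatches statically to the concrete class, so there's no vtable, but every piece of user code has to be templated on the database type.

template <class Impl>
struct DatabaseBase {
    int query(int id) {
        // Static dispatch: resolved at compile time, no vtable needed
        return static_cast<Impl*>(this)->do_query(id);
    }
};

struct RealDatabase : DatabaseBase<RealDatabase> {
    int do_query(int id) { /* talk to the real database */ return id; }
};

struct MockDatabase : DatabaseBase<MockDatabase> {
    int do_query(int) { return 42; }  // canned answer for tests
};

// User code must be templated on the concrete database type
template <class DB>
int fetch_one(DB& db) { return db.query(1); }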

To conclude, I don't think there's a perfect solution to the virtual problem. Just choose what looks better and accept we live in an imperfect world.

Tuesday 16 July 2013

Counting lines per second with bash

The other day I wanted to quickly monitor the status of a production nginx after applying some iptables rules and changing some VPN stuff. It's easy to know if you completely screwed up the server: the number of requests per second will drop to zero, all requests will return an HTTP status other than 200, or some other dramatic and easy-to-measure side effect will show up.

What happens if you broke something in a slightly more subtle way? Say, you screwed up something in ipsec (now, I wonder how that could happen...) and now networking is slow. Or iptables now enforces some kind of throttling in a way you didn't expect. To detect this kind of error I wrote a quick bash script that outputs how many lines per second are added to a file. With it I was able to verify that the throughput of my nginx install didn't decrease after my config changes, without installing a full-fledged monitoring solution like zabbix.

I didn't find anything like this readily available, so I'm posting it here in case someone else finds it useful.

#!/bin/bash

# Time between checks
T=5

# argv[1] will be the file to check
LOG_FILE="$1"

while true; do
    tmp=$(mktemp)
    # tail the file into a temp file. -n0 means don't output anything at the
    # start, so we can sleep $T seconds without counting previous entries
    tail -n0 -f "$LOG_FILE" > "$tmp" 2>/dev/null & sleep "$T"
    kill $! > /dev/null 2>&1
    echo "Requests in $LOG_FILE in the last $T seconds: $(wc -l < "$tmp")"
    rm "$tmp"
done

Thursday 11 July 2013

Starting an EMR job with Boto

I've noticed there are not many articles about boto and Amazon web services. Although boto's documentation is quite good, it lacks practical examples. More specifically, I found a fair amount of RTFM was needed to get an Elastic MapReduce job started on Amazon using boto (and I did it from Google App Engine, just to go full cloud!). So here it goes: a very basic EMR job launcher using boto.

# Imports for boto 2.x
from boto.emr.connection import EmrConnection
from boto.emr.step import JarStep
from boto.regioninfo import RegionInfo

zone_name = 'eu-west-1'
access_id = ...
private_key = ...

# Connect to EMR
conn = EmrConnection(access_id, private_key,
                    region=RegionInfo(name=zone_name,
                    endpoint= zone_name + '.elasticmapreduce.amazonaws.com'))

# Create a step for the EC2 instance to install Hive
args = [u's3://'+zone_name+'.elasticmapreduce/libs/hive/hive-script',
            u'--base-path', u's3://'+zone_name+'.elasticmapreduce/libs/hive/',
            u'--install-hive', u'--hive-versions', u'0.7.1']
start_jar = 's3://'+zone_name+ \
            '.elasticmapreduce/libs/script-runner/script-runner.jar'
setup_step = JarStep('Hive setup', start_jar, step_args=args)

# Create a jobflow using the connection to EMR and specifying the
# Hive setup step (log_bucket and keep_alive must be defined elsewhere)
jobid = conn.run_jobflow(
                    "Hive job", log_bucket.get_bucket_url(),
                    steps=[setup_step],
                    keep_alive=keep_alive, action_on_failure='CANCEL_AND_WAIT',
                    master_instance_type='m1.medium',
                    slave_instance_type='m1.medium',
                    num_instances=2,
                    hadoop_version="0.20")

# Set termination protection, so the jobflow won't be killed after the
# script is finished (that way we can reuse the instances for something else).
# Don't forget to shut it down when you're done!
conn.set_termination_protection(jobid, True)

s3_url = 'Link to a Hive SQL file in S3'
args = ['s3://'+zone_name+'.elasticmapreduce/libs/hive/hive-script',
        '--base-path', 's3://'+zone_name+'.elasticmapreduce/libs/hive/',
        '--hive-versions', '0.7.1',
        '--run-hive-script', '--args',
        '-f', s3_url]

start_jar = 's3://'+zone_name+'.elasticmapreduce/libs/script-runner/script-runner.jar'
step = JarStep('Run SQL', start_jar, step_args=args)
conn.add_jobflow_steps(jobid, [step])

Tuesday 9 July 2013

A coverage report for C++ unit tests

A lot of tools and metrics that are pretty much a given for some dynamic languages are quite esoteric in C++ land. Unit testing is one of these tools, and code coverage metrics are even more obscure in C++. Turns out it's not impossible. I have uploaded an example C++ project with unit tests and code coverage report generation. It shouldn't be too hard to adapt this code to your own project.

Let's analyze some of the core concepts of this example.

Unit testing

A coverage report only makes sense if you have a suite of unit/integration tests. gtest and gmock have worked the best for me, but I guess anything that can run a suite of tests will do for getting a coverage report.

Getting some coverage

gcov is a simple utility you can find on Linux to generate coverage reports. gcc has support for it: you just need to compile with "-fprofile-arcs -ftest-coverage --coverage" and link with "--coverage -lgcov". If you look at line 10 of the makefile for the example project, you'll see I define a new build type especially for the coverage report.

Once the project is built with support for gcov, running the tests will generate a bunch of stats for lcov to pick up. The makefile includes a target that takes care of all these steps, compiling the program with gcov support, running the tests and then collecting the results into a nice html report.

Getting it running

Unfortunately, generating a coverage report has a lot of dependencies in C++. For the example on my github repository you'll have to install lcov, cppcheck, gtest, gmock and vera++ (a code style checker for C++ which is now discontinued... you should probably search for a replacement). Once you have it running, though, you can easily integrate this with your jenkins setup.

Thursday 4 July 2013

My own gdb cheatsheet, just because

Gdb is the de facto tool for debugging applications on GNU/Linux. At first sight it appears to be a very simple application with very limited capabilities, but the truth is that gdb is a very complex tool for a very difficult job, and becoming a proficient user can be a daunting task. To top it off, gdb's graphical interfaces don't help much, so you are better off learning how to use it in console mode.

There are a ton of guides covering the basics of gdb, so I'll just leave here a very quick list of the essentials you need to get started:

Running stuff

  • Start your debugging session with "gdb $path_to_app"
  • If you have a core dump you need to analyze, start it as "gdb $path_to_app $path_to_core"
  • Don't forget to 'ulimit -c unlimited' if you want to get core files
  • Don't forget to compile with debug symbols ("-g3")
  • Are you using gcc? Then instead of -g3 use -ggdb

Breaking stuff

  • Set breakpoints by typing "break"
  • Break on functions by typing "break 'Namespace::Class::InnerClass::function(overload_t)'"
  • When breaking on function names, use tab autocompletion. It's your best friend (don't forget the quotes around the function name, otherwise the double colon will break the autocompletion)
  • You can also "break filename.cpp:line_number"
  • Start the show by typing "run"

Viewing the source

  • "list" will show the source code for your current location
  • "list foo" will show the source code for function foo
  • "list *0x080483c7" will list the source code for whatever there is at address 0x080483c7
  • Replace list with disassemble for extra fun
  • "disassemble /r ..." will additionally print a hex dump
  • "disassemble /m ..." will also interleave the original source

While running

  • step will continue execution until next line
  • stepi will continue execution until next assembly instruction
  • next will continue execution until next line, skipping function calls (ie won't step into another function)
  • continue will run until the next breakpoint

Inspecting stuff

  • 'print x' will print an expression. You can print pretty much any valid c/c++ expression.
  • "print *0x080483b4" will print whatever there is at 0x080483b4
  • "info locals" will print local vars
  • "info registers" will print cool stuff
  • "backtrace", bt for his friends, will print the current calling stack.

This cheatsheet is far from being "advanced stuff" but it should be enough to get you started. The rest is practice.

Tuesday 2 July 2013

A tardis in gdb? Reverse a program's execution!

Have you ever been running a long debug session, only to find you missed the spot by overstepping? I sure have, and that can be one of the strongest motivators to invent a time machine. And it seems I'm not the only one who thinks so, given that gdb can now travel back in time. That's right, you can save a snapshot of a running program and then reverse the polarity to go back in time, just before you missed your breakpoint!

It's very simple to use too, you don't need six people to use this feature. Just type "checkpoint" in gdb to let it know you want to record the execution's state, then "restart N" to go back in time. I've recorded a sample debugging session:

(gdb) list 
1	int main()
2	{
3	    int a = 1;
4	    int b =2 ;
5	    a = b;
6	    b = 42;
7	    return 0;
8	}

(gdb) run
Breakpoint 1, main () at test.cpp:3
3	    int a = 1;
(gdb) n
4	    int b =2 ;
(gdb) p a
$1 = 1

Next, create a checkpoint:

(gdb) checkpoint 
checkpoint: fork returned pid 29192.

Interesting: a checkpoint is actually implemented as a fork. Moving on:

(gdb) n
5	    a = b;
(gdb) n
6	    b = 42;
(gdb) p a
$2 = 2

Ohnoes! We overstepped. Let's go back:

(gdb) restart 1
Switching to process 29192
#0  main () at test.cpp:4
4	    int b =2 ;
(gdb) p a
$3 = 1

And we're back in time.

How does it work

Reversing to a previous execution state is not an easy task. Gdb implements this feature by forking out a new process, a process we can later switch to. This means that reverting to a previous state might break things. The way forking is implemented in Linux, things like open files shouldn't be much of a problem. Sockets should still be connected but, of course, whatever you already sent won't be "unsent".

Gdb's internals docs have some useful information on the limitations of this feature.

Thursday 27 June 2013

Useless code: a template device to calculate e

Recently I needed to flex my template metaprogramming muscles a bit, so I decided to go back and review an old article I wrote about it (C++11 made some parts of those articles obsolete, but I'm surprised at how well it has aged). To practice a bit I decided to tackle a problem I'm sure no one ever had before: defining a mathematical constant at compile time. This is what I ended up with (the identifiers are in Spanish, by the way); do you have a better version? Shouldn't be too hard.

// A compile-time fraction N/D
template <int N, int D> struct Frak {
	static const long Num = N;
	static const long Den = D;
};

// Multiply the fraction X by the integer scalar N
template <class X, int N> struct MultEscalar {
	typedef Frak< N*X::Num, N*X::Den > result;
};

// Bring fractions X1 and Y1 to a common denominator
template <class X1, class Y1> struct IgualBase {
	typedef typename MultEscalar< X1, Y1::Den >::result X;
	typedef typename MultEscalar< Y1, X1::Den >::result Y;
};

// Greatest common divisor ("MCD" = GCD), Euclid's algorithm
template <int X, int Y> struct MCD {
	static const long result = MCD<Y, X % Y>::result;
};
template <int X> struct MCD<X, 0> {
	static const long result = X;
};

// Simplify fraction F by dividing both terms by their GCD
template <class F> struct Simpl {
	static const long mcd = MCD<F::Num, F::Den>::result;
	typedef Frak< F::Num / mcd, F::Den / mcd > result;
};

// Add fractions X and Y ("suma" = sum)
template <class X, class Y> struct Suma {
	typedef IgualBase<X, Y> B;
	static const long Num = B::X::Num + B::Y::Num;
	static const long Den = B::Y::Den; // == B::X::Den
	typedef typename Simpl< Frak<Num, Den> >::result result;
};

// N! as a compile-time constant
template <int N> struct Fact {
	static const long result = N * Fact<N-1>::result;
};
template <> struct Fact<0> {
	static const long result = 1;
};

template <int N> struct E {
	// e = S(1/n!) = 1/0! + 1/1! + 1/2! + ...
	static const long Den = Fact<N>::result;
	typedef Frak< 1, Den > term;
	typedef typename E<N-1>::result next_term;
	typedef typename Suma< term, next_term >::result result;
};
template <> struct E<0> {
	typedef Frak<1, 1> result;
};

#include <iostream>
int main() {
	typedef E<8>::result X;
	std::cout << "e = " << (1.0 * X::Num / X::Den) << "\n";
	std::cout << "e = " << X::Num <<"/"<< X::Den << "\n";
	return 0;
}

Tuesday 25 June 2013

Watchpoints in gdb: wake me up when foo changes

I've noticed a lot of people claim gdb is not a good debugger because it doesn't support feature X. Many times X is the ability to monitor changes to a memory location (ie to break when the value of a variable changes). Most of the time, though, people believe gdb doesn't implement X only because they haven't spent enough time reading its manual.

In gdb it's very easy to monitor variable changes using watchpoints. Here's a very simple example session:

(gdb) list 
1	int main()
2	{
3	    int a = 1;
4	    int b;
5	    a = b;
6	    b = 42;
7	    return 0;
8	}

Of course we need to be in the proper scope to set a watchpoint:

(gdb) run
Breakpoint 1, main () at test.cpp:3

Let's try to catch when b changes value:

(gdb) watch b
Hardware watchpoint 2: b

Interesting: a hardware watchpoint was set. What might that be?

(gdb) continue
Hardware watchpoint 2: b
  Old value = 0
  New value = 42
main () at test.cpp:7

Nice! gdb alerted us of the value change by breaking program execution. This can come in handy to fix race conditions.

Hardware and software watchpoints

Gdb will use hardware watchpoints if the underlying platform provides them; that means your architecture should provide some kind of hook for gdb to be alerted when a memory write at a certain address occurs. Hardware watchpoints are quite easy to use, relatively speaking, but not all platforms support them. In that case gdb will fall back to software watchpoints, which are quite expensive and slow. Did you ever try to run a program by pressing "step" continuously? Well, a software watchpoint is similar: gdb has to execute the program step by step and check whether the value has changed between steps.

As usual, gdb's manual has a lot more info.

PS: Once you find your bug with the aid of a watchpoint, please go and read some books about encapsulation!

Thursday 20 June 2013

Detecting and ignoring third party memory problems with Valgrind

Lots of people seem to give up on Valgrind after they see the dreaded "More than ### errors detected, go and fix your program". If the bulk of these errors is caused by crappy code in third party libraries there's very little you can do to fix them, other than creating a ticket for the library maintainer (and if the bulk of these errors is caused by your own code... well, please don't write a watchdog, do fix your program!). And that's assuming the reported errors aren't even false positives, since Valgrind can report problems for the aggressive optimizations -O3 might apply or for weird pointer arithmetic.

If these spurious memory errors stay around for too long, most people will start ignoring Valgrind's output. Luckily, ignoring the errors we can't fix is also possible, using Valgrind's suppression files.

  • Check if someone else has already found this issue. Many times libraries do have an "official" suppression file.
  • If you find no suppression file, make really, really sure the problem is not in your code. Preferably write a minimal unit test that triggers the warning in Valgrind, and make sure you're not misusing the library.
  • Add whatever warnings don't come from your application to a new suppression file (there's an example of one after this list).
  • Share your suppression file with the world! Other people will either find it useful or tell you that what you thought was a bug in a lib is actually a problem in your code. That happens more often than not.
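
In case you've never seen one, this is roughly what a suppression entry looks like (the entry name and library here are made up; run Valgrind with --gen-suppressions=all and it will print ready-to-paste entries for you):

{
   third_party_lib_leak
   Memcheck:Leak
   ...
   obj:*/libthirdparty.so*
}

Feed the file back to Valgrind with --suppressions=my.supp and those errors disappear from the report.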

Thursday 13 June 2013

C++ exceptions under the hood appendix II: metaclasses and RTTI on C++

A long time ago, when we were just starting to write our mini ABI to handle exceptions without libstdc++'s help, we had to add an empty class to appease the linker:

namespace __cxxabiv1 {
    struct __class_type_info {
        virtual void foo() {}
    } ti;
}

I mentioned this class is used to check whether a catch can handle a subtype of the exception thrown, but what exactly does that mean? Let's change our throwing functions a bit to see what happens when we start dealing with inheritance. You may want to check the source code for these examples.

struct Derived_Exception : public Exception {};

void raise() {
    throw Derived_Exception();
}

void catchit() {
    try {
        raise();
    } catch(Exception&) {
        printf("Caught an Exception!\n");
    } catch(Derived_Exception&) {
        printf("Caught a Derived_Exception!\n");
    }

    printf("catchit handled the exception\n");
}

What should happen in this example is crystal clear: it should print "Caught an Exception", because that catch block should be able to handle both types, Exception and Derived_Exception. Not only that, if we compile throw.cpp we'll get a warning to let us know that the second catch is dead code:

throw.cpp: In function void catchit():
throw.cpp:15:7: warning: exception of type Derived_Exception will be caught [enabled by default]
throw.cpp:13:7: warning:    by earlier handler for Exception [enabled by default]

Luckily a warning won't stop compilation; we can continue and try to link the resulting .o; we'll find a linker error:

throw.o:(.rodata._ZTI17Derived_Exception[typeinfo for Derived_Exception]+0x0): undefined reference to `vtable for __cxxabiv1::__si_class_type_info'

And again we start seeing __type_info errors. If we create a fake __si_class_type_info to work around this problem, we'll finally see our ABI break down when dealing with inheritance, in a rather funny way: the compiler warns us about dead code, and then we see that very same code being executed by our ABI!

g++ -c -o throw.o -O0 -ggdb throw.cpp
throw.cpp: In function void catchit():
throw.cpp:15:7: warning: exception of type Derived_Exception will be caught [enabled by default]
throw.cpp:13:7: warning:    by earlier handler for Exception [enabled by default]
gcc main.o throw.o mycppabi.o -O0 -ggdb -o app
./app
begin FTW
Caught a Derived_Exception!
end FTW
catchit handled the exception

Clearly there's something wrong with our ABI, and it's very easy to trace this problem back to the definition of "can_handle", the part of the code that checks whether an exception can be caught by a catch block:

bool can_handle(const std::type_info *thrown_exception,
                const std::type_info *catch_type)
{
    // If the catch has no type specifier we're dealing with a catch(...)
    // and we can handle this exception regardless of what it is
    if (not catch_type) return true;

    // Naive type comparison: only check if the type names are the same.
    // This won't work with any kind of inheritance
    if (thrown_exception->name() == catch_type->name())
        return true;

    // If types don't match just don't handle the exception
    return false;
}

Our ABI gets the std::type_info for the exception being thrown and for the type the catch can handle, and then checks whether the names of these types are the same. This is fine as long as no inheritance is involved, but in the example above we already found a case where an exception should be handled even though the names don't match.

The same problem will arise when trying to catch a pointer to an exception: the names won't match. Even more interesting, if you try and link throw.cpp but change the catch to receive a pointer instead, you'll get a new linker error. If you fix it you should end up with something like this:

namespace __cxxabiv1 {
    struct __class_type_info    { virtual void foo() {} } ti;
    struct __si_class_type_info { virtual void foo() {} } si;
    struct __pointer_type_info  { virtual void foo() {} } ptr;
}

A very interesting pattern is starting to emerge: there is a different *_type_info for each possible catch type used. In fact the compiler will generate a different structure for each throw style. For example, for these throws:

throw new Exception;
throw Exception();

the compiler would generate something like:

__cxa_throw(_Struct_Type_Info__Ptr__Exception);
__cxa_throw(_Struct_Type_Info__Class__Exception);

In fact, even for this simple example, the inheritance web (not tree, web) of these generated classes is quite complex; note that I'm kind of inventing the mangling here, it's not what gcc uses.

All these classes are generated by the compiler to specify precisely which type is being thrown, and how. For example, if an exception of type "Ptr__Type_Info__Derived_Exception" is thrown then a catch can handle it if:

  • The catch type equals the thrown type exactly (this is the only check our ABI does).
  • The catch type is a pointer (ie it inherits from cxxabi::pointer_type_info) and said pointer can be cast to the exception type.
  • The thrown type is a derived type, in which case we need to check whether the catch type is one of its parent types.

And this list is still missing lots of possibilities; for the full list it's better to check a real C++ ABI. LLVM has a very clear and easy to understand ABI implementation: you can check these details in the file "private_typeinfo.cpp". If you look at LLVM's implementation of run time type information, you'll soon realize why we didn't implement this in our ABI: the set of rules to determine whether two types match is incredibly complex.
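
Just to give a taste of it, here's a sketch (mine, far from complete and not the real algorithm) of how can_handle could at least follow single public inheritance chains, using the __si_class_type_info layout declared in GNU's <cxxabi.h>:

#include <cxxabi.h>
#include <typeinfo>

bool can_handle(const std::type_info *thrown_exception,
                const std::type_info *catch_type)
{
    // catch(...) handles anything
    if (not catch_type) return true;

    // Walk up the thrown type's single-inheritance chain
    for (const std::type_info *t = thrown_exception; t != nullptr; ) {
        // Same naive name comparison we used before
        if (t->name() == catch_type->name()) return true;

        // __si_class_type_info describes a class with a single public,
        // non-virtual base; anything more exotic ends the walk
        const auto *si = dynamic_cast<const abi::__si_class_type_info*>(t);
        t = si ? si->__base_type : nullptr;
    }
    return false;
}

Multiple inheritance (__vmi_class_type_info), pointer adjustments and access control are where the real complexity lives, and this sketch handles none of them.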

Tuesday 11 June 2013

C++ exceptions under the hood appendix I: the true cost of an exception

Remember how, a long way back when this series on exception handling was just starting, I mentioned these articles would only apply to gcc/x86? There is a reason for that: not all compilers implement exception handling the same way. In particular, there are two major ways of doing it:

  • With a lookup table and some metadata, like the Itanium ABI specifies; this is what we talked about.
  • Sj/Lj (ARM): Registering exception handling information upon entering or exiting a method.

The way gcc (and many other compilers) implements this ABI on x86 is by using metadata (the .gcc_except_table and the CFI). Although this metadata is rather difficult to parse, and parsing it at runtime when an exception is thrown can take a long time, it has a great upside: if no exception is thrown there's no setup cost to be paid. This is called "zero-cost exception handling" because a normal execution, where no exceptions are thrown, pays no penalty. The performance is exactly the same as if we had specified nothrow. That's right: leaving code locality and caching issues aside, using exceptions has no performance penalty unless an exception is actually thrown. This is a great advantage, and it is in line with the C++ philosophy of not paying for features you don't use.

In the setup used for these articles, declaring a method with the noexcept specifier (or an empty throw() specifier, pre C++11) makes the compiler omit the creation of the .gcc_except_table. This makes the code more compact and improves cache usage, but it's very unlikely to have a noticeable impact on the performance of the application.
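
For reference, these are the two spellings of that promise (the effect on the .gcc_except_table is what the paragraph above describes for this series' gcc setup; other compilers may behave differently):

void old_style() throw();   // pre-C++11 dynamic exception specification
void new_style() noexcept;  // the C++11 way of saying "nothing escapes here"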

If we talk about ARM, Sj/Lj seems to be the default option (I'm sure there's a good reason for that, but I don't have enough experience with ARM to know it). This exception handling method is based on registering exception handling information upon entering or exiting any method which either uses exceptions or requires a cleanup if an exception is thrown. This results in quicker exception handling, but the setup cost is paid whether an exception is thrown or not.

If you're interested in reading more about Sj/Lj and zero-cost exception handling, LLVM has great documentation.

Thursday 6 June 2013

Bash scripting and getopts

Did you ever write a bash script and think it looked too clean? Yeah, me neither. Anyway, now you can make it look even worse by using getopts. As an upside, you'll be able to read command line options from a user without resorting to nasty hacks, like hardcoding each switch's position in argv.

getopts is a bash builtin, so there's nothing to install (there's also a standalone getopt program in most Linux distros, which you can even run from the command line). It's quite easy to use in a bash script. For example, something like:


while getopts "bar" opt; do
    case "$opt" in
        b) echo "Option b is set"
           ;;
        a) echo "Option a is set"
           ;;
        r) echo "Option r is set"
           ;;
    esac
done

It won't look pretty but it does get the job done. If you need things like long options, take a look at the standalone getopt program ("man getopt"); and if you need something more complex than that, you should probably be using a proper language instead of a bash script.

Tuesday 4 June 2013

C++ exceptions under the hood 21: a summary and some final thoughts

After writing twenty some articles about C++ low level exception handling, it's time for a recap and some final thoughts. What did we learn, how is an exception thrown and how is it caught?

Leaving aside the ugly details of reading the .gcc_except_table, which were probably the biggest part of these articles, we could summarize the whole process like this:

  1. The C++ compiler actually does rather little to handle an exception, most of the magic actually happens in libstdc++.
  2. There are a few things the compiler does, though. Namely:
    • It creates the CFI information to unwind the stack.
    • It creates something called the .gcc_except_table, with information about landing pads (try/catch blocks). Kind of like reflection info.
    • When we write a throw statement, the compiler translates it into a pair of calls into libstdc++ functions that allocate the exception and then start the stack unwinding process by calling the unwinder in libgcc.
  3. When an exception is thrown at runtime, __cxa_throw is called, which delegates the stack unwinding to libgcc.
  4. As the unwinder goes through the stack it will call a special function provided by libstdc++ (called personality routine) that checks for each function in the stack which exceptions can be caught.
  5. If no matching catch is found for the exception, std::terminate is called.
  6. If a matching catch is found, the unwinder now starts again on the top of the stack.
  7. As the unwinder goes through the stack a second time it will ask the personality routine to perform a cleanup for this method.
  8. The personality routine will check the .gcc_except_table for the current method. If there are any cleanup actions to be run, it will "jump" into the current stack frame and run the cleanup code. This will run the destructor for each object allocated at the current scope.
  9. Once the unwinder reaches the frame in the stack that can handle the exception it will jump into the proper catch statement.
  10. Upon finishing the execution of the catch statement, a cleanup function will be called to release the memory held for the exception.

Having learned how exceptions work we are now in a position to better answer why it's hard to write exception safe code.

Exceptions, while conceptually clean, are pretty much "spooky action at a distance". Throwing and catching an exception involves a certain degree of reflection (in the sense that a program must analyze itself), which is not common for C++ applications.

Even in higher level languages, throwing an exception means we can no longer rely on our understanding of how a normal program execution flow works: we are used to a pretty much linear execution flow, with some conditional operators branching or calling other functions. With an exception this no longer holds: an entity which is not the code of our application is in control of the execution, and it goes around the program executing certain blocks of code here and there without following any of the normal rules. The instruction pointer gets changed by each landing pad, the stack is unwound in ways we can't control and, ultimately, a lot of magic happens under the hood.

To summarize it even more: exceptions are hard simply because they break the natural flow of a program as we understand it. This does not mean they are intrinsically bad as properly used exceptions can surely lead to cleaner code, but they should always be used with care.

Thursday 30 May 2013

Vim tip: remember undos

Git makes this feature rather obsolete, but it's still a nifty trick: Vim can remember your undos even if you close it. To enable this feature just "set undofile" and now Vim will unforgivingly remember your mistakes. For ever.

Tuesday 28 May 2013

C++ exceptions under the hood 20: running destructors while unwinding

The mini ABI version 11 we wrote last time covered pretty much all the basics of exception handling: we have an (almost working) ABI capable of throwing and catching exceptions, but it is still unable to properly run destructors. That's quite important if we want to write exception safe code. With what we know about the .gcc_except_table, running destructors is a piece of cake; we only need to look at a bit of assembly:

# Call site table
.LLSDACSB2:
    # Call site 1
	.uleb128 ip_range_start
	.uleb128 ip_range_len
	.uleb128 landing_pad_ip
	.uleb128 (action_offset+1) => 0x3
    
    # Rest of call site table

# Action table start
.LLSDACSE2:
    # Action 1
	.byte	0
	.byte	0

    # Action 2
	.byte	0x1
	.byte	0x7d

	.align 4
	.long	_ZTI14Fake_Exception
.LLSDATT2:
# Types table start

On a regular landing pad, an action with a type index greater than 0 means we're looking at an index into the types table, and we can use it to know which types the catch can handle; a type index of 0 means we are instead looking at a cleanup block, which we should run anyway. Although the landing pad can't handle the exception, it can still perform the cleanup that's supposed to happen while unwinding. Of course, the landing pad will call _Unwind_Resume when the cleanup is done, and that will continue the regular stack unwinding process.
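
In very rough pseudocode, and with helper names I just made up, the new branch in the personality function has this shape:

// Sketch only: more_actions and read_type_index are invented helpers
while (more_actions()) {
    int type_index = read_type_index();
    if (type_index > 0) {
        // A catch handler: check the types table to see whether it
        // matches the exception being thrown
    } else {  // type_index == 0
        // A cleanup action: install the landing pad even though it can't
        // handle the exception; it will run the destructors and then call
        // _Unwind_Resume to continue the unwinding
    }
}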

I've uploaded this latest and last version to my github repo, but there is some bad news: remember how we cheated by saying that a uleb128 == char? As soon as we start adding blocks to run destructors, the .gcc_except_table starts to get quite big (where "big" means offsets over 127, which no longer fit in the 7 payload bits of a single uleb128 byte) and that assumption no longer holds.

For the next version of this ABI we would have to rewrite our LSDA reading functions to read proper uleb128 values. Not a major change, but at this point we wouldn't gain much; we have already achieved our goal: a working minimal ABI capable of handling exceptions without the help of libcxxabi.
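
For completeness, a proper uleb128 reader is quite short anyway; this is a sketch of what those functions would have to do (each byte carries 7 bits of payload, and the high bit flags that more bytes follow):

#include <cstdint>

static uintptr_t read_uleb128(const uint8_t **data)
{
    uintptr_t result = 0;
    unsigned shift = 0;
    const uint8_t *ptr = *data;
    uint8_t byte;

    do {
        byte = *ptr++;
        result |= (uintptr_t)(byte & 0x7f) << shift;  // 7 payload bits
        shift += 7;
    } while (byte & 0x80);  // high bit set: more bytes follow

    *data = ptr;  // leave the cursor past the encoded value
    return result;
}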

There are parts we haven't covered, like handling non-native exceptions, catching derived types or interoperability between compilers and linkers. Maybe some other time; in this rather long series of articles we have already learned quite a bit about low level exception handling in C++.

Thursday 23 May 2013

C++ exceptions under the hood 19: getting the right catch in a landing pad

19th entry about C++ exception handling: we have written a personality function that can, by reading the LSDA, choose the right landing pad in the right stack frame to handle a thrown exception, but it was having some difficulties finding the right catch inside a landing pad. To finally get a decently working personality function we'll need to check all the types a landing pad can handle by going through the action table in the .gcc_except_table.

Remember the action table? Let's check it again but this time for a try with multiple catch blocks.

# Call site table
.LLSDACSB2:
    # Call site 1
	.uleb128 ip_range_start
	.uleb128 ip_range_len
	.uleb128 landing_pad_ip
	.uleb128 (action_offset+1) => 0x3
    
    # Rest of call site table

# Action table start
.LLSDACSE2:
    # Action 1
	.byte	0x2
	.byte	0

    # Action 2
	.byte	0x1
	.byte	0x7d

	.align 4
	.long	_ZTI9Exception
	.long	_ZTI14Fake_Exception
.LLSDATT2:
# Types table start

If we intend to read the exceptions supported by landing pad 1 in the example above (that LSDA is for the catchit function, by the way), we need to do something like this:

  • Get the action offset from the call site table, 2: remember you'll actually read the offset plus 1, so 0 means no action.
  • Go to action offset 2, get type index 1. The types table is indexed in reverse order (ie we have a pointer to its end and we need to access each element using -1 * index).
  • Go to types_table[-1]; you'll get a pointer to the type_info for Fake_Exception.
  • Fake_Exception is not the current exception being thrown; get the next action offset for our current action (0x7d).
  • 0x7d is a signed leb128 (sleb128) value that decodes to -3: from the position where we read the offset, move back 3 bytes to find the next action (there's a sleb128 decoding sketch after this list).
  • Read type index 2.
  • This time we get the type_info for Exception; it matches the current exception being thrown, so we can install the landing pad!
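
The signed variant works like the unsigned one plus a final sign extension; here's a sketch of a decoder that turns 0x7d into -3:

#include <cstdint>

static intptr_t read_sleb128(const uint8_t **data)
{
    intptr_t result = 0;
    unsigned shift = 0;
    const uint8_t *ptr = *data;
    uint8_t byte;

    do {
        byte = *ptr++;
        result |= (intptr_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);

    // If the last byte had bit 6 set the value is negative: sign-extend it
    if (shift < sizeof(intptr_t) * 8 && (byte & 0x40))
        result |= -((intptr_t)1 << shift);

    *data = ptr;
    return result;
}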

It sounds complicated because there is, again, a lot of indirection in each step, but you can check the full source code for this project in my github repo.

In the link above you will also see a bonus: a change to the personality function to correctly detect and use catch(...) blocks. That's an easy change once the personality function knows how to read the types table: a null pointer in the table (ie a position that holds null instead of a valid pointer to a std::type_info) represents a catch-all block. This has an interesting side effect: a catch(T) will only be able to handle native exceptions (ie those coming from C++), whereas a catch(...) will also catch exceptions not thrown from within C++.

We finally know how exceptions are thrown, how the stack is unwound, how a personality function selects the correct stack frame to handle an exception and how the right catch inside a landing pad is selected, but we still have one more problem to solve: running destructors. We'll change our personality function to support RAII objects next time.

Tuesday 21 May 2013

Vim tip: some cool moves

We all know the usual Vim moves, hjkl (though I admit I still use the arrows) et al, but there are some more obscure and very useful Vim moves, for the hipster in you. I only recently learned about H and L, which jump to the top and the bottom of the screen.

I don't know how I lived so long without those. They're incredibly useful shortcuts.

Thursday 16 May 2013

C++ exceptions under the hood 18: getting the right stack frame

Our latest personality function knows whether it can handle an exception or not (assuming there is only one catch statement per try block and assuming no inheritance is used), but to make this knowledge useful we first have to check whether the exception we can handle matches the exception being thrown. Let's try to do this.

Of course, we first need to know the exception type. To do this we save it when __cxa_throw is called (that is the chance the ABI gives us to set all our custom data):

Note: You can download the full source code for this project in my github repo.

void __cxa_throw(void* thrown_exception,
                 std::type_info *tinfo,
                 void (*dest)(void*))
{
    __cxa_exception *header = ((__cxa_exception *) thrown_exception - 1);

    // We need to save the type info in the exception header _Unwind_ will
    // receive, otherwise we won't be able to know it when unwinding
    header->exceptionType = tinfo;

    _Unwind_RaiseException(&header->unwindHeader);
}

And now we can read the exception type in our personality function and easily check whether the exception types match (name() returns a plain C string; comparing the pointers with == works in this setup because the compiler emits a single copy of each type name, though a real ABI has to be more careful, eg across shared libraries):

// Get the type of the exception we can handle
const void* catch_type_info = lsda.types_table_start[ -1 * type_index ];
const std::type_info *catch_ti = (const std::type_info *) catch_type_info;

// Get the type of the original exception being thrown
__cxa_exception* exception_header = (__cxa_exception*)(unwind_exception+1) - 1;
std::type_info *org_ex_type = exception_header->exceptionType;

printf("%s thrown, catch handles %s\n",
            org_ex_type->name(),
            catch_ti->name());

// Check if the exception being thrown is of the same type
// than the exception we can handle
if (org_ex_type->name() != catch_ti->name())
    continue;

Check here for the full source with the new changes.

Of course, that change introduces a problem (can you see it?). The exception is thrown in two phases, and if we said during the first one that we would handle it, we can't say during the second one that we don't want it anymore. I don't know whether _Unwind_ handles this case in any documented way, but it's most likely undefined behavior, so just claiming we'll handle everything is no longer enough.

Up to now we have been lying to _Unwind_ about which exceptions we can handle: even though we said in our ABI 9 that we handle all of them, the truth is we didn't know whether we would be able to. Now that our personality function can tell whether a landing pad handles the exception being thrown, that's easy to fix; we can do something like this:

_Unwind_Reason_Code __gxx_personality_v0 (...)
{
    printf("Personality function, searching for handler\n");

    // ...

    foreach (call site entry in lsda)
    {
        if (call site entry.not_good()) continue;

        // We found a landing pad for this exception; resume execution

        // If we are on search phase, tell _Unwind_ we can handle this one
        if (actions & _UA_SEARCH_PHASE) return _URC_HANDLER_FOUND;

        // If we are not on search phase then we are on _UA_CLEANUP_PHASE
        /* set everything so the landing pad can run */

        return _URC_INSTALL_CONTEXT;
    }

    return _URC_CONTINUE_UNWIND;
}

As usual, check the full source code for this project in my github repo.

So, what would we get if we ran the personality function with this change? Fail, that's what we'd get! Remember our throwing functions? This one should catch our exception:

void catchit() {
    try {
        try_but_dont_catch();
    } catch(Fake_Exception&) {
        printf("Caught a Fake_Exception!\n");
    } catch(Exception&) {
        printf("Caught an Exception!\n");
    }

    printf("catchit handled the exception\n");
}

Unfortunately, our personality function only checks the first type the landing pad can handle. If we delete the Fake_Exception catch block and try again, though, we get a different story: finally, success! Our personality function can now select the correct catch in the correct frame, provided there's no try block with multiple catches.

Next time we'll be further improving this.