
Thursday 27 December 2012

Printing broken in Ubuntu? Try lpr

Lately I've seen quite a few Ubuntu machines with broken printing (more broken than usual, that is). Sometimes printers are detected, sometimes not. When it doesn't work you usually get a warning saying "Getting printer information failed" even though the printer is connected and you can even ping it. Searching around a little bit I read a couple of threads and bug reports that make me think that it's actually a Gnome 3 migration problem, but I'm not sure. It doesn't matter much anyway: screw UIs, just use lpr < foo.pdf from a CLI and keep on killing forests!

On a cheerful closing note: how funny is it that when the regular printing applet in Gnome is broken we have to use a utility by "Apple Inc"?

Tuesday 25 December 2012

Setting up a Linux gateway/router

Yes, there are a lot of guides for this, but since I had to document how I set this up for my office a while ago, I might as well write a post about it here.

As expected, if we are going to replace a device, say, a router, we need to replace it with something that provides the same functionality. In this case we have chosen a Linux server, so we need to figure out which services the router provides and then replicate them somehow:

  • DHCP to manage leases
  • DNS to translate domains to IPs
  • NAT, to multiplex a single connection
  • Service forwarding, to expose internal services to an external network

Luckily Linux supports all of these:

  • ISC's dhcpd (isc-dhcp-server) for DHCP
  • bind9 for DNS
  • iptables for NAT
  • iptables again, for service forwarding
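
We'll install each one as we get to it; on Ubuntu both servers come straight from the repositories, and iptables is already part of the base system (package names below are the usual Debian/Ubuntu ones):

```shell
# bind9 provides the DNS server, isc-dhcp-server the DHCP server.
# iptables ships with the base install, so nothing to do there.
sudo apt-get install bind9 isc-dhcp-server
```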

    Let's set up each of these services.

    Preliminary work

    Before you set up any services, you are going to need two things: two network cards, one for the outgoing connection and one for the (switched) LAN, and a way of telling your server to forward traffic from one network to the other. You may want to install more than two cards if you need to route several LANs; we'll get to that later.

    You will also need an OS. I have chosen Ubuntu because it's very simple to install, and has all the software we need available in the repositories, but you can use any other distribution if it suits your needs.

    Also, throughout this guide I will assume a setup like this:

    • WAN access through eth0, DHCP address
    • LAN routing on eth1, network 192.168.10.1/24
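
As a sanity check on that /24: it's the prefix-length form of netmask 255.255.255.0, which we'll need in dotted form later. A quick bash snippet to convert between the two (pure arithmetic, nothing network-specific):

```shell
# Convert a CIDR prefix length (e.g. the 24 in 192.168.10.1/24)
# to a dotted-quad netmask using plain bash arithmetic.
prefix=24
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
printf '%d.%d.%d.%d\n' \
    $(( (mask >> 24) & 255 )) $(( (mask >> 16) & 255 )) \
    $(( (mask >> 8)  & 255 )) $((  mask        & 255 ))
# prints 255.255.255.0
```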

    Setting up NATting and forwarding

    Services like DNS and DHCP are nice-to-have, but having real connectivity is way more important. Let's set up the NAT and connection forwarding features of the new router, then we can test if our setup is working properly by pinging an IP of one LAN from the other.

    We'll do this by setting up NAT with iptables. We'll also have to configure the OS to forward connections from one network card to the other:

    echo 1 > /proc/sys/net/ipv4/ip_forward
    iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
    # Add a line like this for each eth* LAN
    iptables --append FORWARD --in-interface eth1 -j ACCEPT

    We will also need to set a static IP for eth1, since there won't be any DHCP server on the LAN to assign us one (we ARE the server!). Open /etc/network/interfaces and add something like this:

    # Configure the WAN port to get IP via DHCP
    auto eth0
    iface eth0 inet dhcp
    # Configure the LAN port
    auto eth1
    iface eth1 inet static
    	address 192.168.10.1	# (Or whatever IP you want)
    	netmask	255.255.255.0	# Netmasks explanations not included

    Once that's done, restart networking services like this:

    sudo /etc/init.d/networking restart

    Everything's ready: now just plug your PC into the new router and test it. Remember to manually set your IP in the same network range as the router's, since there's no DHCP at the moment (knowing how to do this comes in handy for debugging, too).

    In your client PC, set your IP address:

    ifconfig eth0 192.168.10.10

    Test that your IP is set:

    ping 192.168.10.10

    If you get a reply, your new IP is OK; if not, there's a problem with your client. Second step: see if you can reach the router:

    ping 192.168.10.1

    Note that you may need to renew everything (i.e. restart networking and manually assign your IP) after you connect the cable.

    Again, if you get a reply then you have connectivity with the router. So far we haven't tested the iptables rules nor the forwarding, so any issue at this point must be an IP configuration problem. If everything went well, it's time to test the NAT rules and the forwarding.

    ping 192.168.0.1

    That should fail. Of course: since there's no DHCP, the client has no default route set. Let's set one manually:

    sudo route add default gateway 192.168.10.1

    Then again:

    ping 192.168.0.1

    Magic! It works! If it doesn't, you have a problem in either the NAT configuration or the IP forwarding of the router. You can tell which with Wireshark: if the pings reach the server but never get a reply back, it's the NAT (i.e. the router can forward the IP packets from eth1 to eth0, but without NAT it doesn't know how to route the answers back). If the pings never even reach eth0, then you have an IP forwarding problem.

    We can try something more complex now, like pinging a domain instead of an IP. Something like this:

    ping google.com

    You should get a message saying the host is unknown. Can you guess why? Right, there's no DNS. Let's install one, but first we have to make these changes permanent.

    Persisting the forwarding rules

    In order to have the forwarding rules persist after a reboot, we first need to change /etc/sysctl.conf to allow IP forwarding. It's just a matter of uncommenting this line:

    net.ipv4.ip_forward = 1
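
After uncommenting it, you can apply the change without rebooting and verify it took effect (a quick sanity check; assumes a Linux box with sudo):

```shell
sudo sysctl -p                        # reload /etc/sysctl.conf
cat /proc/sys/net/ipv4/ip_forward     # prints 1 once forwarding is enabled
```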

    We also have a bunch of iptables rules that need to be set up at boot time. I created a script at /home/router/set_forwarding.sh and call it from /etc/init.d/rc.local, so it runs whenever the system boots.
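
I haven't shown set_forwarding.sh itself; here's a minimal sketch of what mine contains, based on the rules above (the script name and path are just my convention):

```shell
#!/bin/bash
# /home/router/set_forwarding.sh -- re-apply NAT and forwarding rules at boot.

# Enable IP forwarding (also set in /etc/sysctl.conf for persistence)
echo 1 > /proc/sys/net/ipv4/ip_forward

# Masquerade everything leaving through the WAN interface
iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE

# Accept forwarded traffic coming from the LAN; repeat for each LAN interface
iptables --append FORWARD --in-interface eth1 -j ACCEPT
```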

    Setting up DNS

    DNS will be necessary to resolve domains to IPs. bind9 is the default option for Debian-based servers (are there others? no idea).

    sudo apt-get install bind9

    This will get your DNS server up and running, but you will still need to add this server manually to your client (again, because there's no DHCP running):

    sudo echo "nameserver 192.168.10.1" > /etc/resolv.conf

    And now:

    ping google.com

    Magic again, it (may) work. If it doesn't, you may need to open /etc/bind/named.conf.options, set your upstream router (192.168.0.1) as a forwarder, and restart the bind server.
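
In case you need it, a forwarders block looks something like this (on Debian/Ubuntu it usually lives inside the options { } block of /etc/bind/named.conf.options; the upstream IP here is the one from my setup):

```shell
# Snippet for the options { } block of /etc/bind/named.conf.options:
#
#     forwarders {
#         192.168.0.1;    # the upstream router from this setup
#     };
#
# Then restart bind to pick up the change:
sudo /etc/init.d/bind9 restart
```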

    Setting up a custom TLD in your DNS

    Now that we have a DNS server running, we can set up local zones for the LAN; for example, to give your router a nice user-friendly name instead of a bare IP.

    Let's start by adding a local zone to /etc/bind/named.conf.local, for a domain we'll call "boc":

    zone "boc" {
            type master;
            file "/home/router/named/boc.db";
    };

    Now we need to add a reverse zone:

    zone "10.168.192.in-addr.arpa" {
    	type master;
            file "/home/router/named/rev.10.168.192.in-addr.arpa";
    };

    We still need to create both files (boc.db and rev.10.168.192.in-addr.arpa), but we'll do that later. Lastly, a place to log all the DNS queries (if you want one):

    logging {
        channel query.log {
            file "/home/router/named/dns.log";
            severity debug 3;
        };
    
        category queries { query.log; };
    };

    For the log entry I have chosen /home/router/named as the log directory, simply because for this project I'm keeping everything together (config and logs) so it's easy for people not used to administering a Linux box. Of course, this means apparmor must be configured to allow bind to read and write in this directory. We'll get to that in a second; first, let's create the zone files needed for our new TLD.

    Remember our two zone files? I put them in /home/router/named, but usually they live in /etc/bind. Again, I did this so I can keep all the config files together. These are my two files:

    For boc.db:

    boc.      IN      SOA     ns1.boc. admin.boc. (
                                                            2006081401
                                                            28800
                                                            3600
                                                            604800
                                                            38400
     )
    
    boc.      IN      NS              ns1.boc.
    
    wiki             IN      A       192.168.10.66
    ns1              IN      A       192.168.10.1
    router           IN      A       192.168.10.1

    For rev.10.168.192.in-addr.arpa:

    @ IN SOA ns1.boc. admin.boc. (
                            2006081401;
                            28800; 
                            604800;
                            604800;
                            86400
    )
    
                         IN    NS     ns1.boc.
    1                    IN    PTR    router.boc.

    Most of these lines are black magic, and since an explanation of both DNS and bind is out of scope, let's just say you can add new DNS entries with lines like this:

    NICE_NAME           IN      A       REAL_IP

    This will make bind translate NICE_NAME.boc to REAL_IP. Of course, this depends on the TLD you defined. Now restart bind to get a fuckton of errors: it will complain about not being able to load a master file in /home/router/named. Remember that apparmor thing I mentioned?

    Setting up apparmor to allow new directories

    Apparmor is a service that runs in the background, checking what other processes can and can't do. For example, it will allow bind9 to open a listening socket on port 53 (DNS), but it will deny an attempt to open a listening socket on port 64. This is a security measure to limit the damage a compromised bind9 binary running as root could do. And since we are using a non-standard configuration, we need to tell apparmor that it's OK.

    After installing bind9 we should get a new file in /etc/apparmor.d/usr.sbin.named. Add the following lines at the bottom:

      /home/router/named/** rw,
      /home/router/named/ rw,

    And restart the apparmor service:

    sudo /etc/init.d/apparmor restart

    Since we modified apparmor to allow a non-standard bind installation, now restart bind too. This time it should start without any errors, and you should be able to tail -f /home/router/named/dns.log to see the DNS queries in real time. If it doesn't work, check that /home/router/named is writable by the bind user (I did a chgrp -R bind named).
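
Once bind starts cleanly, you can also query the new names directly to check that the zone loaded (assuming the dnsutils package, which provides dig):

```shell
# Ask our own DNS server for the records we just defined:
dig @192.168.10.1 router.boc +short   # should print the router's IP
dig @192.168.10.1 wiki.boc +short
```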

    Setting up DHCP

    We have DNS and NAT so far, but the client configuration has been completely manual. We can't support many clients with this sort of setup, so let's automate the client config with a DHCP server. Begin by installing isc-dhcp-server.

    Edit /etc/dhcp/dhcpd.conf, set the domain-name and the domain-name-servers, like this:

    option domain-name "boc";
    option domain-name-servers ns1.boc, ns2.boc;
    
    default-lease-time 86400;
    max-lease-time 172800;
    
    authoritative;

    I'm not sure if the first line is needed. The second sets the DNS servers your clients will use; in my case, ns1 and ns2 resolve to 192.168.10.1 and 192.168.0.1 respectively. Increasing the lease time is also recommended; I used one day for default leases. Finally, I set this DHCP server as the authoritative one; if this is your router, that's probably what you want.

    Now we need to define the network topology:

    # This is the WAN network, and we won't provide a service here
    subnet 192.168.0.0 netmask 255.255.255.0 {
    }
    
    # Define the service we provide for the LAN
    subnet 192.168.10.0 netmask 255.255.255.0 {
    	range 192.168.10.100 192.168.10.200;
    	option routers 192.168.10.1;
    }

    Now we need to restart the DHCP server:

    sudo /etc/init.d/isc-dhcp-server restart

    And now we need to check if everything worked in the client. It's easy this time, we just ask for an IP:

    sudo dhclient
    ifconfig

    If everything went fine, we should now have an IP in the 100-200 range, as well as the DNS server set in /etc/resolv.conf. We have now set up a very basic router, and it should be able to serve several clients for basic browsing.

    Moving DHCP config files

    Since I want to keep everything together for easy administration, I will move the configuration files for DHCP to /home/router/dhcp. Changing dhcpd.conf itself is easy: just move the subnet declarations there and add these lines:

    include "/home/router/dhcp/subnets.conf";
    include "/home/router/dhcp/static_hosts.conf";

    As we did with bind, we need to configure apparmor. Edit /etc/apparmor.d/usr.sbin.dhcpd and add these two lines:

    /home/router/dhcp/ rw,
    /home/router/dhcp/** rw,

    Restart the apparmor service, then restart dhcpd. Everything should work just fine.

    Setting up port forwardings

    In any LAN you'll probably want to expose some services to the outer world, be it for a bittorrent connection or because you have internal servers you need to access from outside your internal LAN. To do this, you'll have to tell your router to forward some external port to an internal one, like this:

    iptables -t nat -A PREROUTING -i eth0 -p tcp \
    	--dport PORT -j DNAT --to INTERNAL_IP:INTERNAL_PORT
    # Allow the forwarded traffic through; this rule may not be needed,
    # depending on your FORWARD chain policy
    iptables -A FORWARD -i eth0 -p tcp -d INTERNAL_IP \
    	--dport INTERNAL_PORT -j ACCEPT

    This is enough to expose a private server to the world, but it will not be very useful when your dynamic IP changes. You need to set INTERNAL_IP to be a static IP.
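
As a concrete (made-up) example, exposing an internal web server at 192.168.10.50 on external port 8080:

```shell
# External port 8080 -> internal web server 192.168.10.50:80
iptables -t nat -A PREROUTING -i eth0 -p tcp \
    --dport 8080 -j DNAT --to 192.168.10.50:80
# Make sure the forwarded traffic is accepted:
iptables -A FORWARD -i eth0 -p tcp -d 192.168.10.50 --dport 80 -j ACCEPT
```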

    Setting up static IPs

    Remember the static_hosts file we created before? We can use that to define a static IP. Add the following to set a static IP host:

    host HostName {
    	hardware ethernet 00:00:00:00:00:00;
    	fixed-address 192.168.10.50;
    }

    After that, just restart the DHCP service and renew your client's IP. Done, it's static now!

    Wait a minute: how do you find the MAC for your host? I'm too lazy to copy and type it, so I did the following:

    # cd /home/router/dhcp
    # ln -s /var/lib/dhcp leases

    Then you can check the hardware address in the leases/dhcpd.leases file. I created a symlink to keep this directory at hand, since it gives you a status of the current leases.
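
If you'd rather not eyeball the file, the IP/MAC pairs can be pulled out with a bit of awk. The lease block below is a made-up sample, but the format matches what isc-dhcp-server writes:

```shell
# A minimal sample lease block, in the format dhcpd.leases uses:
cat > /tmp/dhcpd.leases.sample <<'EOF'
lease 192.168.10.103 {
  starts 4 2012/12/20 10:00:00;
  hardware ethernet 08:00:27:ab:cd:ef;
  client-hostname "laptop";
}
EOF

# Print IP / MAC pairs (stripping the trailing semicolon from the MAC):
awk '/^lease/ {ip = $2}
     /hardware ethernet/ {gsub(/;/, "", $3); print ip, $3}' /tmp/dhcpd.leases.sample
# prints 192.168.10.103 08:00:27:ab:cd:ef
```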

    Warm restart for the router and easy administration

    We have quite a few configuration files now, with different settings for iptables, the DHCP and the DNS. If we are aiming for an easy to administer setup, we should add a restart script like this:

    #!/bin/bash
    
    ./set_forwarding.sh
    /etc/init.d/bind9 restart
    /etc/init.d/isc-dhcp-server restart

    Now anyone who changes a config file can run this script to have their new rule applied.

    Thursday 20 December 2012

    Non standard UML: decomposing inheritance in sequence diagrams

    TL;DR: In sequence diagrams, splitting an object into (some of) its parent classes improves readability, at the cost of a less accurate structural description.

    A sequence diagram is supposed to display interactions between objects, emphasizing their ordering in time. The important part here is "between objects": as far as I know, the standard says you should display interactions between entities, or the interaction of an entity with itself through its public interface.

    Usually this works just fine, since a sequence diagram provides a way of understanding a program through the interaction of its entities. However, I found that sometimes you need to express the interaction of an object with its own public interface in a bit more detail, namely when there's inheritance involved.

    When there's an extension relationship between two classes, and a dependency (i.e. a call) between methods of the two, the code for each method usually "lives" far apart, most likely in two different files. Portraying these two methods as a reflexive call is syntactically accurate for a sequence diagram, but I find it very poor semantically. Representing the dependency as a call between two different objects might not be a syntactically correct representation, since you are treating a single entity as two different objects, but I have found it results in much clearer and cleaner diagrams.

    Of course, splitting an entity into multiple objects has the disadvantage of leading the reader to believe these are indeed different objects, but since the purpose of a sequence diagram is not to display static structure, I believe this is an acceptable tradeoff, and one that can be mitigated with a note and a reference to a class diagram, where the accurate structure of the classes can be shown.

    Tuesday 18 December 2012

    Using dnsmasq for custom subdomains

    Some time ago I had to work on a project with many dynamic subdomains: things like sub1.domain, sub2.domain, subN.domain. It turns out /etc/hosts doesn't support wildcards. That sucks, but it's understandable; that's not what the hosts file is for. That's the reason we have DNS, isn't it?

    Of course, installing a full DNS server just to resolve a couple of subdomains is way too much work for lazy people like me, so let's install dnsmasq instead. dnsmasq is a DNS and DHCP proxy, and it has a ton of configuration options I don't even know about. If you really want to learn how to use dnsmasq you should read its manpage; if you just want to solve your DNS issues, add "address=/.DOMAIN/IP" to your /etc/dnsmasq.conf (notice the dot before the domain: that's the wildcard).
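
For example, to make every subdomain of a made-up dev.local domain resolve to localhost (the domain and IP are placeholders for whatever your project uses):

```shell
# The leading dot makes it a wildcard: foo.dev.local, bar.dev.local, etc.
echo 'address=/.dev.local/127.0.0.1' | sudo tee -a /etc/dnsmasq.conf
sudo /etc/init.d/dnsmasq restart

# Check it by querying dnsmasq directly:
dig @127.0.0.1 anything.dev.local +short   # should print 127.0.0.1
```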

    Thursday 13 December 2012

    CLI music FTW!

    Does your music collection suck? Did you have to delete all your mp3 because you were facing a major lawsuit by the MPAA? Make your own console-music! Just open a terminal and type this for hours of endless fun:

    cat /dev/urandom | aplay

    Additional tip: if you ever get tired of the random static, you can have fun playing your boot images like this:

    cat /boot/initrd.img-3.0.0-12-generic | aplay

    I wonder if that has a copyright...

    Tuesday 11 December 2012

    Checking connectivity for port forwardings

    This is a little bit outside of what I normally post, but when I find such a terribly useful site I need to share it (so I won't forget next time I need it).

    Have you ever had to set up a forwarding to expose a LAN service to the outside world? It kind of sucks not knowing whether you set everything up correctly, and it's not easy to test, since from inside your LAN you can only check that the service itself works. Normally you would need to either bounce through a proxy outside, or ask a friend to nmap you. Alternatively, you can use canyouseeme.org, a website that will probe a specific port on the IP you specify and tell you whether it times out or is open.
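
If you do happen to have a shell on some machine outside your LAN, netcat can run the same probe; HOST and PORT below are placeholders for your external IP and forwarded port:

```shell
# -z: just scan, don't send data; -v: verbose; -w 5: five-second timeout
nc -zv -w 5 HOST PORT
```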

    A great time saver.

    Thursday 6 December 2012

    g++ has a (mean) personality

    Did you ever get a message like "undefined reference to `__gxx_personality_v0'"? It means you are trying to link C++ code with a C linker: just replace gcc with g++. But what is __gxx_personality_v0?

    Basically, __gxx_personality_v0 is a global symbol used for stack unwinding. You could make your code link (assuming you don't have other problems, with vtables and such) by defining it as a NULL pointer, and everything would work until an exception is thrown.

    Wednesday 5 December 2012

    Tailing Google app engine's logs on realtime

    Knowing in real time when an error occurs in your GAE app is not as easy as it sounds. You can create a custom error handler and mail yourself the exception log, but you have to develop that yourself (and if you do it wrong you may end up slowing down the whole app... yes, I know it first hand). You can also keep the GAE log page open and set your browser to refresh it every X seconds, but that's quite cumbersome.

    So, seeing there are a bunch of acceptable but not quite nice solutions out there, I decided to add yet another "solution" that works, but looks ugly. This script will probably break in a few months, or whenever any part of the auth protocol Google uses changes (since I think very little of it is supposed to be public), but until then you can use it to tail GAE's logs in real time.

    Note that it works by polling a logs webservice at Google (just like appcfg does); if you set the frequency too high you may see your daily cost increase, but if you set it too low the script will only get the last error in each interval (so you might miss errors in between). If too many errors go undetected, though, your problem is likely a too-high error rate and not a low update frequency.

    #!/usr/bin/python
    
    # Configure your account here
    GOOGLE_AUTH = ('foobar@gmail.com', 'gmailpass')
    # The app version should be in your GAE control panel (Main> Version)
    APP_VERSION = "Your_App_version"
    # Your app ID (You can see the ID in the "Application" list on every GAE
    # control panel page)
    APP_ID = "your app id"
    # Frequency of updates; if it's too often, you might get a noticeable increase
    # in your cost per day, if it's too sparse you might loose an error in between
    # updates (though this probably means your error rate is too high)
    UPDATE_FREQ_SECS = 60
    
    
    
    import time
    from datetime import datetime
    import urllib2, urllib
    from httplib2 import Http
    
    class Google_Authenticator(object):
    
        # We use this to keep urllib2 from following redirs
        class _RedirectHandler(urllib2.HTTPRedirectHandler):
            def handler(self, req, fp, code, msg, headers):
                infourl = urllib.addinfourl(fp, headers, req.get_full_url())
                infourl.status = code
                infourl.code = code
                return infourl
    
            http_error_301 = handler
            http_error_302 = handler
        opener = urllib2.build_opener(_RedirectHandler)
        urllib2.install_opener(opener)
    
    
        def __init__(self, email, passwd):
            self.auth = self.__class__._get_auth_token(email, passwd)
            self.cookie = self.__class__._convert_auth_cookie(self.auth)
    
    
        @classmethod
        def _get_auth_token(cls, email, passwd):
            GOOG_LOGIN_URL = "https://www.google.com/accounts/ClientLogin"
            data = {
                    'Email': email,
                    'Passwd': passwd,
                    'source': 'Google-appcfg-1.7.2',
                    'accountType': 'HOSTED_OR_GOOGLE',
                    'service': 'ah',
                }
    
            try :
                post_data = urllib.urlencode(data)   
                res = urllib2.urlopen(GOOG_LOGIN_URL, post_data)
            except Exception:
                return None
    
            for ln in res.readlines():
                if ln.startswith('Auth='):
                    pos = len('Auth=')
                    return ln[pos:].strip()
    
            return None
    
    
        @classmethod
        def _convert_auth_cookie(cls, auth):
            GOOG_COOKIE_URL = "https://appengine.google.com/_ah/login?"\
                              "continue={0}&auth={1}"
            redir = 'http%3A%2F%2Flocalhost%2F' # Anywhere (that's listening)...
            url = GOOG_COOKIE_URL.format(redir, auth)
    
            try:
                res = urllib2.urlopen(url)
                cookie = res.headers['set-cookie']
            except Exception:
                return None
    
            pos = cookie.find(' ') - 1
            return cookie[:pos]
    
    
    
    class GAE_Logs_Fetcher(object):
    
        GAE_LOGS_URL = "https://appengine.google.com/api/request_logs?" \
                       "include_vhost=False&version={0}&limit={1}&"\
                       "severity={2}&app_id={3}"
    
        DEBUG=0
        ERROR=3
        CRITICAL=4
    
        def __init__(self, auth_cookie, app_id, app_version, limit, severity):
            self.url = self.__class__.GAE_LOGS_URL.\
                            format(app_version, limit, severity, app_id)
            self.opener = urllib2.build_opener()
            self.opener.addheaders = [('cookie', auth_cookie),
                                      ('X-appcfg-api-version', 1)]
    
    
        def fetch(self):
            res = self.opener.open(self.url)
            msg = ""
            for ln in res.readlines():
                if not ln.startswith('# next_offset='):
                    msg += ln
    
            return msg
    
    
        def watch(self, freq, callback):
            try:
                last_msg = self.fetch()
                while True:
                    time.sleep(freq)
                    msg = self.fetch()
                    if msg != last_msg:
                        last_msg = msg
                        callback(msg)
    
            except KeyboardInterrupt:
                pass
    
    
    
    def main():
        def printmsg(msg):
            print msg
    
        auth = Google_Authenticator(*GOOGLE_AUTH)
        logs_fetcher = GAE_Logs_Fetcher(auth.cookie, app_version=APP_VERSION,
                                        limit=1, severity=GAE_Logs_Fetcher.ERROR, app_id=APP_ID)
    
        print "Auth OK, starting watch on {0} error log".format(APP_ID)
        logs_fetcher.watch(UPDATE_FREQ_SECS, printmsg)
    
    
    
    if __name__ == '__main__':
        main()
    

    Tuesday 4 December 2012

    Boot Linux in single user mode

    Sooner or later, you'll need a safe boot mode for Linux. Maybe you forgot your password, maybe you need to recover some files from a really, really broken system (shame on you for not using a separate partition for /home). Luckily, on almost any Linux box you can interrupt LILO or GRUB and add the parameter init=/bin/sh to the kernel line. This will give you a root command shell from which you can do all sorts of fun recovery tasks.
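
A sketch of a typical recovery session (the remount is needed because the kernel mounts the root filesystem read-only in this mode):

```shell
# At the boot loader, append to the kernel line:  init=/bin/sh
# Once you get the shell, remount root read-write before touching anything:
mount -o remount,rw /
passwd root        # e.g. reset that forgotten password
sync               # flush changes to disk before rebooting
```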

    Admin tip: Just don't forget to fill up on coffee before starting.

    Thursday 29 November 2012

    Cool C++0X features XIII: auto and ranged for, cleaner loops FTW

    Long time without updating this series. Last time we saw how the ugly

    for (FooContainer::const_iterator i = foobar.begin(); i != foobar.end(); ++i)

    could be transformed into the much cleaner

    for (auto i = foobar.begin(); i != foobar.end(); ++i)

    Yet we are not done, we can clean that a lot more using for range statements.

    Ranged for is basically syntactic sugar (no flamewar intended) for shorter for statements. It's nothing new and has been part of many languages for years already, so there will be no odes to the greatness of C++'s innovations here (flamewar intended), but it's still a nice improvement to have, considering how tedious writing nested loops can be. This certainly looks much cleaner:

    for (auto x : foobar)

    This last for-statement, even though it looks good enough to print and hang on a wall, raises a lot of questions. What's the type of x? What if I want to change its value? Let's try to answer that.

    The type of x will be the element type of the vector, so in this case x would be an int:

    std::vector<int> foobar;
    for (auto x : foobar) {
    	std::cout << (x+2);
    }

    And now, what happens if you want to alter the contents of a list and not only display them? That's easy too, just declare x as an auto reference:

    std::vector<int> foobar;
    for (auto& x : foobar) {
    	std::cout << (x+2);
    }

    This looks really nice, but it won't actually do anything, for two different reasons:

    • Ranged fors aren't supported until g++ 4.6
    • The list is empty!

    There are many ways to initialize that list; next time, we'll see how C++0X lets you do it in a new way.

    Tuesday 27 November 2012

    Hugin borked on Ubuntu?

    Hugin is pretty awesome for creating panoramic photos. I use it regularly; for example, for this awesome Prague old town square panorama I took some time ago:

    [BROKENIMAGE]

    At some point I realized that, after an upgrade, Hugin just stopped finding control points. Apparently there's a licensing issue somewhere. Just "sudo apt-get install autopano-sift" and continue hugging.

    Thursday 22 November 2012

    Vim tip: goto column

    You can quickly jump to a specified column in Vim by simply typing the column number followed by a pipe, like this:

    80| # go to the 80th column

    Tuesday 20 November 2012

    False sharing in multi threaded applications, part II

    In the last entry we learned how an apparently parallel algorithm can turn into a sequential nightmare via false sharing: what you may think are two independent variables may actually be spatially close, sharing a cache line that gets invalidated by each and every write across cores. But is this a real-world issue? And if so, how can we fix it?

    We'll work backwards: let's see first how can this be fixed, and then we'll check if this is actually a real world issue.

    Remember our code sample, from last time:

    void lots_of_sums(unsigned idx, int v[])
    {
    	const unsigned itrs = 2000*1000*1000;
    
    	for (unsigned i = 0; i < itrs; ++i)
    		v[idx] = i;
    }

    An easy way to avoid false sharing would be to accumulate into a local variable and assign the final result to v[idx] only once, after the loop; this way the intermediate results live in a register (or the thread's own stack), and the contention inside the loop disappears.

    The second strategy to solve this problem would be to use padding. Knowing that your cache line is made of 64 bytes will let you write something like this:

    struct Padded {
    	int num;
    	char pad[60];
    };

    Of course, this has another problem: what about the offset? We need not only padding but also spacing in front, for the alignment.

    struct Padded {
    	char space[60];
    	int num;
    	char pad[60];
    };

    Alternatively, you could use the alignment specifier (alignas) of C++0x, but since it's not implemented in g++ I have never tested it, so I have no idea how it's supposed to work. For more information you can check Herb Sutter's article, "Eliminate False Sharing".

    Monday 19 November 2012

    False sharing in multi threaded applications

    Concurrent programming is hard. Very hard. But every once in a while you find a textbook problem, an application for which a parallel algorithm is not only simple but also intuitive, so clear that it would be difficult to do it any other way. You put on your high-speed helmet, fasten your seatbelt, and get ready for an FTL (faster than linear) jump.

    Then make && make test only shows lower numbers. You actually decreased the performance of your application by making it multi-threaded! How is this possible?

    When writing concurrent algorithms there are many places where you can fail, miserably. Contention is only one of them, but it's a very subtle failure mode: it will turn your concurrent algorithm into a very complicated sequential algorithm run on many cores. How can a simple concurrent algorithm become such a mess?

    In many cases, false sharing is the culprit of the lost performance. How exactly? Let's write a simple example for this:

    void lots_of_sums(unsigned idx, int v[])
    {
    	const unsigned itrs = LOTS;
    
    	for (unsigned i = 0; i < itrs; ++i)
    		v[idx] = i;
    }

    Imagine this function running once on each core, with v defined as a simple array. We'd expect this to scale linearly, right? Turns out it's not that simple. Why? Let's see how v is laid out in memory:

    +------------+--+--+--+------+-----------------+
    |Random stuff|V0|V1|V2|...|VN|More random stuff|
    +------------+--+--+--+------+-----------------+

    So, v will (most likely) be made of adjacent chunks in memory. We can also assume that sizeof(v[i]) will be 4 or 8 bytes, or something small like that. Why is that important?

    Remember that nowadays a CPU core is amazingly fast, so much so that main memory looks like dial up in comparison. To avoid waiting for main memory to complete a request, several levels of cache exist. These caches know nothing about your application (well, mostly), so they can't know that v0 and v1 are actually different variables. Note that the same would apply if, instead of a vector, these were two unrelated variables that happened to be allocated close together in memory.

    Since the CPU cache (L1 to its friends) can't know that v0 and v1 are separate variables, it will cache a whole chunk of memory at a time; say, 64 or 128 bytes. Why should you care?

    Let's go back to our code. When you see this

    ...
    	v[idx].num = i;
    ...

    you just see independent accesses to different variables. What will the cache see? Let's go back to our memory layout, but this time let's make up an offset too:

    0xF00    0xF10  0xF14...                   0xF40
    +------------+--+--+--+------+-----------------+
    |Random stuff|V0|V1|V2|...|VN|More random stuff|
    +------------+--+--+--+------+-----------------+

    So, if we assume a 64 byte cache line, then how might the CPU see a read-write process for v[0]? Something like this, probably:

     +---------+-------------------+----------------------------+---------------------------+
     |  C Code |       CPU         |             L1             |        Main Memory        |
     |---------|-------------------|----------------------------|---------------------------|
     |         |                   |                            |                           |
     |Read v[0]|                   |                            |                           |
     | +------+|                   |                            |                           |
     |       +-> Hey, get me 0xF10 |                            |                           |
     |         | +----------------+|                            |                           |
     |         |                  +> Huh... I don't have 0xF10, |                           |
     |         |                   | let me get that for you... |                           |
     |         |                   |  +-----------------------+ |                           |
     |         |                   |                          +-> 0xF00... yup, here you go |
     |         |                   |                            |                           |
     |         |                   |  I have it now!            |                           |
     +---------+-------------------+----------------------------+---------------------------+
    

    Note an important bit there: the CPU requested 0xF10 but L1 cached the whole segment, from 0xF00 to 0xF40. Why is that? It's called spatial locality or locality of reference and since it's not the main goal of this article I'll just say that L1 will cache the whole segment just in case you want it.

    So far so good. What happens when you add more CPUs and write ops? Remember that write ops are cached as well, so the picture looks pretty similar, but note an interesting thing: CPU1 will read v[0], which translates to 0xF10, whereas CPU2 will read v[1], which translates to 0xF14. Even though these are different addresses, they both fall in the segment starting at 0xF00, thus both L1 caches will have actually read the same chunk of memory!

    The problem starts when a thread needs to write: a simple read from the L1 cache now fails, because the line has been marked as dirty by the L1 of another CPU. How crazy is that? (*)

    Can you see now why our simple and elegant parallel algorithm has turned into a messy linear... steaming pile of code? Each time a core tries to write to its own variable, it's inadvertently invalidating the variables the other cores need. This is called false sharing, and we'll see a fix next time.

    (*) Disclaimer: the diagrams in this post are simplifications of what actually happens. In reality there are several levels of cache, some of which may be shared between cores, so most likely the L1 won't need to go to main memory. Also, even when there is no shared cache between cores, the cache coherency protocols will most likely avoid a read from main memory.

    Tuesday 13 November 2012

    Easy bind dns log analyzer

    Any real-programmer (tm) should be able to memorize all the static IPs in his network, and should feel more comfortable using the IPs to access the LAN resources instead of those user friendly URL things. Of course, not everyone is a real-programmer (tm), and these people usually drive crazy the poor folks who do remember all the static IPs in the office. For them, the DNS was invented.

    Some time ago I set up a bind9 on a spare server I had. Easy job, a side project of a boring late night. Of course, there was no point in just setting up my own TLD; I now had the power to have fun with DNS poisoning, and I could also create nice reports of the queries to the DNS server.

    Having fun with a DNS might be the topic for another post, for this one I'll just focus on how to get query statistics from a bind server.

    After grepping the web a little bit, I found a rather disconcerting lack of bind log analyzers. I just wanted a list of the most popular queries, as well as the ability to group the log by its IP (and maybe even to get the computer's SMB name). Couldn't have asked for a better chance to flex my Ruby-fu a little bit, and here is the hackish result for whoever may want to do something similar:

    #!/usr/bin/ruby1.8

    class Hits
    	def initialize(n, v, k=nil)
    		@n = n
    		@v = v
    		@k = k
    	end
    	def n() @n; end
    	def v() @v; end
    	def k() @k; end
    	def <=>(o) @n <=> o.n; end
    end

    if ARGV.length == 0 then
    	puts "Usage: dns.rb DNS_LOG [--domains] [--ip [--no-samba]]"
    	puts "  --domains will list all queried domains"
    	puts "  --ip will list every query grouped by IP"
    	exit
    end

    @domains = @ip = @nosamba = false
    ARGV.each { |arg|
    	case arg
    	when '--domains' then @domains = true
    	when '--ip' then @ip = true
    	when '--no-samba' then @nosamba = true
    	end
    }

    fp = File.open ARGV[0]
    queries_by_ip = {}
    queries_by_domain = {}
    fp.each_line { |l|
    	v = l.split(' ')
    	if not (v.nil? or v.length < 4) then
    		ip = v[1].split('#')[0]
    		query = v[3]
    		if queries_by_domain[query].nil? then queries_by_domain[query] = 0 end
    		queries_by_domain[query] += 1
    		if queries_by_ip[ip].nil? then queries_by_ip[ip] = [] end
    		queries_by_ip[ip].push query
    	end
    }

    if @domains then
    	hits = []
    	queries_by_domain.each { |k,v| hits.push Hits.new(v, k) }
    	hits.sort!.reverse!.each { |h| puts h.v + " has " + h.n.to_s + " hits" }
    end

    if @ip then
    	lst = []
    	queries_by_ip.each { |ip,queries| lst.push Hits.new(queries.length, ip, queries) }
    	lst.sort!.reverse!.each { |h|
    		puts "Report for " + h.v + ", Samba addr:"
    		if not @nosamba then Kernel.system "nmblookup -A " + h.v end
    		puts "Requested " + h.n.to_s + " URLs:"
    		h.k.uniq.each { |url| puts "\t" + url }
    		puts "."
    		puts "."
    	}
    end

    Tuesday 6 November 2012

    Redirecting connections from one port to another with iptables

    People say iptables is incredibly useful. Mind you, I have gotten mostly headaches out of it, but it is easy to see it is a powerful tool. The other day (for reasons not relevant to this post) I needed to run a service on one port, yet have it accept connections on a different port. Say, the service should be listening on port 1234, but the server should accept any connection on port 4321 and "redirect" each packet to the correct port. Turns out iptables is the tool for the job.

    For the impatient ones, this is the command I used:

    iptables -t nat -A PREROUTING -s 192.168.10.0/24 \
             -p tcp --dport 4321 -j DNAT --to :1234

    Let's analyze this part by part. First I tried something like this:

    # Caution: This is wrong and won't work!
    iptables -A INPUT -s 192.168.1.0/24 --dport 4321 --to :1234

    Seems clear enough:

    • -A INPUT will add this rule to the INPUT chain (i.e. the list of rules that are run for every incoming packet).
    • -s 192.168.1.0/24 will filter the packets, so this rule will only be applied to the packets coming from the 192.168.1.0/24 network.
    • --dport 4321 will filter the packets again, so this rule applies only to the packets coming from the LAN which are also destined to port 4321.
    • --to :1234 is the rule to rewrite the destination port.

    Too bad this won't work. To begin with, the --dport option will fail, since it makes no sense to talk about ports without specifying a protocol (i.e. ports are a TCP layer concept, not an IP layer concept, so iptables doesn't know what to do with a --dport option!). Thus, let's rewrite our command like this:

    # Caution: This is wrong and won't work!
    iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 4321 --to :1234

    Now iptables will complain about the --to option. This is because iptables can't rewrite a packet (i.e. use the --to option) unless you tell it to jump to the DNAT target. Now our command should look like this:

    # Caution: This is wrong and won't work!
    iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 4321 -j DNAT --to :1234

    iptables will still complain: you can't use DNAT outside the nat table, so we have to add -t nat. If we do that, iptables will complain once more, about not being able to use DNAT in the INPUT chain: it is only available in the PREROUTING/OUTPUT chains. This is related to the order in which iptables processes its rules.

    With all this in mind, we can now write a command that will run:

    iptables -t nat -A PREROUTING -s 192.168.10.0/24 \
             -p tcp --dport 4321 -j DNAT --to :1234

    Thursday 1 November 2012

    stlfilt: read ugly tmpl errors

    There's nothing better for Monday mornings than the smell of hundreds of template errors after a make clean all. When using template metaprogramming, a tiny misplaced comma can generate enough error output that, if printed, would crush you under tons of paper. And don't even try to read it, it'll make your head explode.

    Luckily STLFilt can be quite a relief when dealing with this kind of error. Granted, it won't make a steaming pile of poo seem like a nice poem, but if you have something like the dog in the picture, to use a metaphor, at least it will put a blanket on its face.

    Tuesday 30 October 2012

    Fastgrep, a cache for grep

    Sooner or later, you'll find that you need to know where to find a certain piece of text that ctags does not index, and grep is just not fast enough. Say, you're trying to match that log line you see every once in a while to the specific printf("I'm here!\n") that produced it.

    Working on any reasonably sized project, searching for free-form text means you'll need some kind of indexing; grep will work, but you'll end up having to wait a couple of minutes between searches. Funny thing is, we can probably speed up grep quite easily. Long story short, you can find a grep cache here: Fastgrep.

    So, how does it work? If we reason a bit about where grep spends its time, we can probably assume the following:

    1. Re-positioning the disk head to find the next file to grep
    2. Reading file contents
    3. Opening & closing files
    4. Actually grepping

    I didn't actually check how closely this "benchmark" resembles reality, but it seems reasonable to assume that most of the time grep spends searching for a string in a big project is actually wasted on I/O, and more cores won't help.

    After a quick Google search I didn't come up with any already available grep cache, so I rolled a quick version myself, which you can find here: Fastgrep. The idea behind it is very simple: if most of the time is wasted accessing files, then just cat every file in the project together and grep that single file instead.

    Since the grepcache is actually a merged copy of all the files in the project, it can quickly get out of sync with the rest of the code. To mitigate this, the cache is only used to get the list of files where a string might be found; these files are then grepped for the real results. This only helps a little bit and eventually everything gets out of sync anyway, but I found that rebuilding the cache in a post-merge git hook (or a post-commit svn hook) is more than enough to keep fastgrep usable.

    Thursday 25 October 2012

    Fixing keyboard layouts in Ubuntu. Scarier than it seems.

    At the moment of writing this post there is an open bug in Ubuntu, still active in 11.04, that makes your keyboard layout revert to whatever GDM wants. Apparently this is caused by GDM failing to sync with the preferences of the session, so if you change your layout (even if you delete the previous one) the change will be reverted the next time you log in. There seems to be no fix coming soon, so this magic incantation might work if you have this problem:

    sudo dpkg-reconfigure keyboard-configuration

    This will ask you a lot of questions about your keyboard, good luck guessing. It kind of reminds me of the Windows 95 install process, in which getting the keyboard layout wrong meant it was probably easier to just format and reinstall everything all over again. With some luck, next time you reboot, your Ubuntu will actually remember your keyboard preference. If not, just take this as an opportunity to learn a foreign language.

    Having keyboard problems? You may also be interested in learning how to activate tildes and accents for a USA keyboard layout in Ubuntu.

    Tuesday 23 October 2012

    Getting a stacktrace on C/C++: Mapping function pointers to function names on runtime

    Last time we talked about mapping function addresses to names (albeit mangled ones) in object files; we can also get this information at runtime:

    Glibc to the aid

    Let's go one step back first: how do we get the current stacktrace? Turns out glibc already has a nice feature for exactly that. Going back to our original program, with some minor changes:

    struct Caller { Caller *addr; };
    void bt_by_hand() {
        // Get the stack base ptr from a register
        void *sp;
        asm("movl %%ebp,%0" : "=r"(sp));
    
        // Loop through every caller
        cout << "Hand made stack walker" << endl;
        Caller *caller = (Caller*)sp;
        while (caller) {
            cout << (((void**)caller)[1]) << endl;
            caller = caller->addr;
        }
    }
    
    #include <execinfo.h>
    void bt_glibc() {
        void* buffer[10];
        int frames = backtrace(buffer, sizeof buffer / sizeof *buffer);
    
        cout << "glibc stack walker" << endl;
        for (int i=0; i < frames; ++i) cout << buffer[i] << endl;
    }
    
    void bar(int, float) {
        bt_by_hand();
        bt_glibc();
    }

    Clearly using glibc's version is much cleaner, but will they yield the same results? Turns out not:

    Hand made stack walker
    0x80487b0
    0x80487d5
    0xb7659113
    glibc stack walker
    0x804873b
    0x80487b5
    0x80487d5
    0xb7659113
    0x8048611

    Similar, but not quite.

    • The first address in glibc's stack walker corresponds to bt_glibc itself: since glibc's backtrace is a function call of its own, it naturally ends up on the stack. We don't even consider that case in our hand made stack walker. First difference explained.
    • The second address should correspond to bar, but one is 0x80487b0 and the other is 0x80487b5. Again, it makes sense: since the void* is actually the saved return address (the EIP to resume at), if we check the disassembly we'll find that the 5 byte difference corresponds to the next instruction to be executed.
    • 0x80487d5 is the return address for main, which is the same for both.
    • The rest of the stack is easy: we stop at main, glibc keeps walking the stack inside glibc itself. Not so important for us, anyway.

    Name translations

    We have a bunch of pointers. How can we get real function names now? Well, the easiest way is to use glibc's backtrace_symbols_fd, like this:

        int frames = backtrace(buffer, sizeof buffer / sizeof *buffer);
        backtrace_symbols_fd(buffer, frames, 1);

    On my machine, when running "g++ -rdynamic foo.cpp && ./a.out | c++filt", I get something like this:

    ./a.out(bt_glibc()+0x19)[0x8048976]
    ./a.out(bar(int, float)+0x10)[0x8048a0a]
    ./a.out(main+0x1e)[0x8048a2a]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb759f113]
    ./a.out[0x8048851]

    Note that without -rdynamic the function name symbols won't be available. Anyway, what we get is much more interesting than raw pointers, and exactly what we were looking for. It's also very boring, unless we learn what's going on inside backtrace_symbols_fd. If we check its implementation (sysdeps/generic/elf/backtracesyms.c in glibc) we'll see that all the heavy work is done by libdl. A quick check of 'man dladdr' will show that we are on the right path. Let's add this to our program:

    #include <dlfcn.h>
    void get_sym_name(void *addr) {
        Dl_info info;
        // dladdr returns 0 on failure, and dli_sname may be null
        if (dladdr(addr, &info) && info.dli_sname)
            cout << info.dli_fname << ": " << info.dli_sname << endl;
    }

    Behold, we now have a nice backtrace in C++, not so different from the bt you'd get in any dynamic language:

    ./a.out: _Z3barif
    0x8048af9
    ./a.out: main
    0x8048b19
    /lib/i386-linux-gnu/libc.so.6: __libc_start_main
    0xb7612113

    Turtles all the way down

    Getting the function name using libdl feels a bit like cheating after we manually walked the call stack; they are not at the same level of abstraction. Can we check what lurks inside libdl's dladdr? It's absolutely possible. In theory. Now we are dealing not only with a specific architecture (x86), we are also dealing with a binary format (more specifically, elf). To understand what goes on inside libdl we need to know about the runtime linking process and elf internals. Feel free to peek at glibc/dlfcn/dlinfo.c, but beware: that's a daunting task, way beyond the original scope of this article.

    Epilogue / Disclaimer

    The whole series on getting a stacktrace in C++ is merely "educational", as in "never-ever do this in your program". As stated in the first part of the series, it's not portable, and it's also extremely frail. If you want something production ready use glibc's backtrace features. And if you want something portable, try libunwind. It works great, but where would the fun be if we skipped the whole learning process and went straight to this library?

    Thursday 18 October 2012

    Getting a stacktrace on C/C++: Mapping function pointers to function names in obj files

    Last time we saw how to get a stacktrace in C++, yet we only had access to the list of function pointers, not the function names. Raw pointers alone are not that useful, though. Can we get function names instead? Yes we can, but it's not easy. One option would be to read the elf specification. Boring. Let's tinker around with our test program instead, maybe we can find something interesting:

        Caller *caller = (Caller*)sp;
        while (caller) {
            cout << caller->addr << endl;
            void** foo = (void**)caller;
            cout << "\t" << foo[0] << "|" << foo[1] << endl;
            caller = caller->addr;
        }

    On my machine I got something like this:

    0xbfa25fd8|0x8048acb
    0xbfa25ff8|0x8048aeb
    0|0x40143113

    The first address is the memory address of the stack to which we should return after executing this function. The second address looks interesting. It looks like the addresses we see when dissasemblying an object. Let's try running 'objdump -Sd ./a.out':

    08048ac0 <_Z3barif>:
     8048ac0: 55                      push   %ebp
     8048ac1: 89 e5                   mov    %esp,%ebp
     8048ac3: 83 ec 08                sub    $0x8,%esp
     8048ac6: e8 e1 fe ff ff          call   80489ac <_Z3foov>
     8048acb: c9                      leave
     8048acc: c3                      ret
    
    08048acd <main>:
     8048acd:   55                      push   %ebp
     8048ace:   89 e5                   mov    %esp,%ebp
     8048ad0:   83 e4 f0                and    $0xfffffff0,%esp
     8048ad3:   83 ec 10                sub    $0x10,%esp
     8048ad6:   b8 00 00 00 40          mov    $0x40000000,%eax
     8048adb:   89 44 24 04             mov    %eax,0x4(%esp)
     8048adf:   c7 04 24 02 00 00 00    movl   $0x2,(%esp)
     8048ae6:   e8 d5 ff ff ff          call   8048ac0 <_Z3barif>
     8048aeb:   b8 00 00 00 00          mov    $0x0,%eax
     8048af0:   c9                      leave
     8048af1:   c3                      ret

    And, indeed, 0x8048acb and 0x8048aeb are both there: they are the addresses of the instructions right after each call, i.e. where EIP will point once the callee returns! Note that you may use c++filt if the mangled names are too confusing. Anyway, we can indeed get the function names, albeit in a rather contrived way. Is there a better way to get a backtrace and its symbols' names? Turns out there is, and we'll see how to get a function's name at runtime in the next article.

    Tuesday 16 October 2012

    Getting a stacktrace on C/C++: Some calling internals

    High level languages tend to have a lot of features for introspection and metaprogramming. One of those useful features is the possibility of getting a stacktrace of the current function. At first C++ would seem to lack that ability, but once you think about it, a high level language's internal workings are, at bottom, not that different from a lower level language's: they tend to be a virtual representation of the physical hardware. A function call, in the end, will most likely be implemented the same way in both C++ and Ruby. So, although it may not be as straightforward as it is in a dynamic language, we should be able to get a stacktrace just fine. Also, there's a big clue for us: gdb can get stacktraces just fine, so why couldn't we?

    Let's start by trying to figure out how we can get a stacktrace by ourselves, with no help of any other tools. Sounds like a daunting task? It isn't really. Let's write a simple program to figure out how gcc performs function calls:

    int foo() {
        return 42;
    }
    
    void bar() {
        foo();
    }
    
    int main() {
        bar();
        return 0;
    }

    If we compile this with gcc -S we'll get a .s file with the disassembly. Of course this depends a lot on the compiler, architecture, OS, etc, etc. For the moment we'll just assume x86 gcc Linux with no optimizations. A lot of the code from the disassembly will be part of the compiler's preamble and postamble. Cleaning things up a bit we should see something like this:

    _Z3barv:
    .LFB1:
    	pushl	%ebp
    	movl	%esp, %ebp
    	call	_Z3foov
    	popl	%ebp
    	ret

    Doesn't look so hard: it just pushes the current stack base pointer onto the stack, sets up a new stack base pointer and calls the function (you might want to play around with c++filt if name mangling confuses you). Once it returns, it just reads back the original stack base pointer and continues. Knowing that return addresses live on the stack makes it easy for us to retrieve this information; we just need a way to get the current stack pointer. Some assembler in C++ will be needed:

        void *sp;
        asm("movl %%esp,%0" : "=r"(sp));
        std::cout << sp << std::endl;

    That should print the current stack pointer. From our disassembly we can also see that the current function's return address (i.e. its caller) is stored in the first word of the current function's stack. Likewise, our caller's return address will be in the first word of its stack. Let's check if this holds up in the code:

        void *sp;
        asm("movl %%esp,%0" : "=r"(sp));
        void *caller_addr = *(void**)sp;
        void *caller_addr2 = *(void**)caller_addr;
        void *caller_addr3 = *(void**)caller_addr2;
    
        cout << sp << endl;
        cout << caller_addr << endl;
        cout << caller_addr2 << endl;
        cout << caller_addr3 << endl;

    Looks ugly, but remember we are fighting the type system here: we need to tell the compiler that the void* we're trying to access is actually a void**. We'll clean that up later; for the moment, if we run that on our sample we should see all the stack addresses that form our stack trace, ending with a null pointer for the main function. Pretty neat, huh? So far we only have function addresses, but we'll get some pretty names later. Let's make it a bit more generic before moving on.

        struct Caller {
                Caller *addr;
        };
    
        // Get the stack base ptr from a register
        void *sp;
        asm("movl %%ebp,%0" : "=r"(sp));
    
        // Loop through every caller
        Caller *caller = (Caller*)sp;
        while (caller) {
            cout << caller->addr << endl;
            caller = caller->addr;
        }

    Of course this is very naive, as it will only work on a 32 bit platform, and it will break as soon as we change calling conventions, but it's still useful to draw some conclusions:

    • Getting a stacktrace in C++ is indeed possible
    • Now we know why inlined functions and optimizations make debugging more difficult (hint: how can you get a stack frame for a function that doesn't really exist?)

    Next time we'll see how we can get a function name from its pointer.

    Thursday 11 October 2012

    Faking a server and testing networks with netcat

    Not long ago I wrote about having to use iptables to redirect packets from one port to another. Testing this with a real server may be complicated, or at least inconvenient. Luckily we have netcat to help us.

    If you use "nc -l 1234", netcat will create a listening socket on port 1234. You can check that it's working by running "telnet IP 1234"; nc should echo on the server whatever you type on the client. In the example from my article explaining an iptables rule, you would run "nc -l 1234", apply the iptables rule, and then issue "nc IP 4321". If everything went according to plan you should be seeing the echo on your nc server.

    Tuesday 9 October 2012

    gdb pretty printers

    If you have ever used gdb then you know printing an STL object is useless: you'll be flooded with stuff you don't care about, and if you want to find, say, the contents of an std::vector you'll have to dive through tons of junk. It turns out there's an easier way, called pretty printers, and I have no idea why they are not included by default with gdb.

    You'll need to download the pretty printers at svn co svn://gcc.gnu.org/svn/gcc/trunk/libstdc++-v3/python and then create a ~/.gdbinit like this one:

    python
    import sys, os
    # '~' is not expanded by Python, so expand it by hand
    sys.path.insert(0, os.path.expanduser('~/gdb_prettyprinters/python'))
    from libstdcxx.v6.printers import register_libstdcxx_printers
    register_libstdcxx_printers (None)
    end

    Have fun!