Infinite monkey - Nico Brailovsky's blog

Thursday, 21 April 2011

echo "Hola mundo" > /dev/full

I'd write something witty but there's not a lot to talk about /dev/full. Anyway, it is a cool tip, so I'll share it:

Everyone knows /dev/null, and most will know /dev/zero. But /dev/full was unknown to me until some time ago. This device will respond to any write request with ENOSPC, "No space left on device". Handy if you want to test whether your program catches "disk full": just let it write there.
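A minimal sketch of such a test, assuming a Linux system where /dev/full exists. The helper name try_write is made up for this example; it uses the plain POSIX open/write calls and hands back whatever errno they report.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: writes msg to path and returns 0 on success,
// or the errno value reported by open()/write() on failure.
int try_write(const char *path, const char *msg) {
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return errno;

    int err = 0;
    if (write(fd, msg, std::strlen(msg)) < 0)
        err = errno;   // expected to be ENOSPC when path is /dev/full

    close(fd);
    return err;
}
```

Calling try_write("/dev/full", "Hola mundo\n") should give back ENOSPC, while the same call against /dev/null should return 0. A program that ignores the return value of write would never notice the difference.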

From Myon's Blog

Tuesday, 19 April 2011

Cool C++0X features II, Variadic templates: What's wrong with varargs

Last time we explained what variadic templates are. We'll see what they can do now. We mentioned that solving the problem of having a type-safe varargs is one of the best ways of applying variadic templates, but what's varargs?

Varargs functions (from C world, not even from C++!) are functions which have a variable number of arguments, just like printf. These are usually very dangerous functions, since they are not typesafe. Let's see how they are implemented with an example:

#include <stdarg.h>
#include <iostream>

// My god, it's full of bugs
void va_println(int args_left, ...) {
   va_list arg_lst;
   va_start(arg_lst, args_left);

   while(args_left--) {
      const char *p = va_arg(arg_lst, const char*);
      std::cout << p;
   }

   va_end(arg_lst);
}

int main() {
   va_println(3, "Hola ", "mundo", "\n");
   return 0;
}

This implementation of a function with variable arguments is, more or less, the best C can give us, yet it is riddled with bugs and hidden problems. Let's go one by one:

Arg num will get out of sync: You need to specify the list of args as well as how many you have. That WILL get out of sync. Trust me, it's just a matter of time. And when it does, you'll have a coredump.
Type-unsafe: You just tell varargs "Hey, get me an int". And it will give you an int, no warranties included. If it was supposed to be a double instead, tough luck, you end up with a coredump.
No, really, coredump: Where are so many coredumps coming from, you may ask. Easy: varargs is just a way of walking the stack. Conceptually, calling va_arg just advances a pointer by the sizeof the datatype you requested. That means no compile-time checks are included.
POD types only: Remember POD types? Varargs can't handle anything else. Try running this code:

#include <stdarg.h>

struct X { virtual ~X(){} };

void va_println(int args_left, ...) {
   va_list arg_lst;
   va_start(arg_lst, args_left);

   while(args_left--) {
      X *p = va_arg(arg_lst, X*);
   }

   va_end(arg_lst);
}

int main() {
   X x, y, z;
   va_println(3, x, y, z);
   return 0;
}

And how do we fix it?

The fix is easy. Too easy. You just need C++0X. We will discuss why this is better next time, but just as a sneak peek:

#include <iostream>

void println() {}
template <typename H, typename... T> void println(H p, T... t) {
   std::cout << p;
   println(t...);
}

int main() {
   println("Hola", " mundo ", 42, '\n');
   return 0;
}

Remember to compile using -std=c++0x in gcc. (Thanks Hugo Arregui for correcting the POD example)
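A slightly more general sketch of the same recursion, taking the output stream as a parameter so the result can be captured and checked. The names print_to and to_text are assumptions made for this example; they are not from any library.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Base case: no arguments left, so there is nothing to print.
inline void print_to(std::ostream &) {}

// Recursive case: print the first argument, then recurse on the rest.
// Every argument keeps its real type, so operator<< is chosen at compile time.
template <typename H, typename... T>
void print_to(std::ostream &out, const H &head, const T &... tail) {
    out << head;
    print_to(out, tail...);
}

// Convenience wrapper (hypothetical): collects the output in a string.
template <typename... A>
std::string to_text(const A &... args) {
    std::ostringstream s;
    print_to(s, args...);
    return s.str();
}
```

With this, to_text("Hola", " mundo ", 42, '\n') builds "Hola mundo 42\n" with no argument count to keep in sync and no casts. Passing a type without an operator<< is a compile error instead of a coredump.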

Monday, 18 April 2011

Cool C++0X features I: Intro

C++0X brings some very cool changes, and I wanted to start a series of posts regarding some of these changes, with a small explanation of each new feature (that I currently understand, at least), an example of its usage and why I think it's a cool thing. Notice these two may be mutually exclusive: some of these may just be cool but I wouldn't recommend using them on a day-to-day basis. An example of a very cool feature which I wouldn't normally use in a project is the one I want to write about today: variadic templates.

What's not to love about variadic templates? Its name implies (correctly) that it uses templates, and it also has a "variadic" thingy, which you can use to look smart since no one really knows what it means.

Templates themselves can quickly get complicated if used by inexperienced padawans in the art of martial C++, yet their hypnotic beauty draws every programmer to use them just like flies are drawn to fire. When used correctly they can produce very elegant code; if not for the template programmer, at least for the end user. Yet for all their power, templates in C++ have been lacking a fundamental aspect: a variable number of arguments.

There are ways to work around this limitation, like using a list of types paired with a template-paramlist-object. Sound familiar? (I know it doesn't, don't worry). You could also generate N constructors, one overload for each parameter count. The drawback: exponential compile time (as in, say, TR1). These are all hacks, which are in place only because there wasn't a safe way of passing a list of types associated with a list of arguments. This is over now with variadic templates in C++0X.

So, what kind of problem would variadic templates solve? Let's name a few:

A typesafe varargs function (a function with a variable number of arguments)
Easily create a template object which acts as a tuple
An easier implementation of a reduce (inject) function

This entry is getting quite long so we'll start seeing these examples on the next post.
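As a small taste of the last item in the list above, here is a rough sketch of a reduce written with variadic templates. The name sum is an assumption for this example, and to keep it short the result simply takes the type of the first argument.

```cpp
// Base case: a single value is its own sum.
template <typename T>
T sum(T value) {
    return value;
}

// Recursive case: add the head to the sum of the remaining arguments.
// The result has the type of the first argument (a simplification).
template <typename T, typename... Rest>
T sum(T head, Rest... tail) {
    return head + sum(tail...);
}
```

So sum(1, 2, 3) adds three ints, and sum(1.5, 2, 3) yields a double because the first argument is one. A real reduce would also take the operation as a parameter; this one hardcodes operator+.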

Thursday, 7 April 2011

Hex dump in C++

If you need to work with low-level stuff (say communications protocols, compression algorithms, stuff like that) you'll be needing a hex dump function sooner or later. Alex, from Alex on Linux, has a great hex dump function for Python and C.

I added an =NULL for caption, I don't use it.

#include <cstdio>
#include <cstring>

void hex_dump(const char *data, int size, const char *caption=NULL)
{
	int i; // index in data...
	int j; // index in line...
	char temp[8];
	char buffer[128];
	char *ascii;

	memset(buffer, 0, 128);

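	// Passing a NULL pointer to %s is undefined, so fall back to an empty caption
	printf("---------> %s <--------- (%d bytes from %p)\n", caption ? caption : "", size, (const void *)data);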

	// Printing the ruler...
	printf("        +0          +4          +8          +c            0   4   8   c   \n");

	// Hex portion of the line is 8 (the padding) + 3 * 16 = 52 chars long
	// We add another four bytes padding and place the ASCII version...
	ascii = buffer + 58;
	memset(buffer, ' ', 58 + 16);
	buffer[58 + 16] = '\n';
	buffer[58 + 17] = '\0';
	buffer[0] = '+';
	buffer[1] = '0';
	buffer[2] = '0';
	buffer[3] = '0';
	buffer[4] = '0';
	for (i = 0, j = 0; i < size; i++, j++)
	{
		if (j == 16)
		{
			printf("%s", buffer);
			memset(buffer, ' ', 58 + 16);

			// snprintf keeps very large offsets from overflowing temp
			snprintf(temp, sizeof(temp), "+%04x", i);
			memcpy(buffer, temp, 5);

			j = 0;
		}

		sprintf(temp, "%02x", 0xff & data[i]);
		memcpy(buffer + 8 + (j * 3), temp, 2);
		if ((data[i] > 31) && (data[i] < 127))
			ascii[j] = data[i];
		else
			ascii[j] = '.';
	}

	if (j != 0)
		printf("%s", buffer);
}
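For comparison, a rough sketch of how a single row of the dump could be built with C++ streams instead of sprintf and memcpy. The helper hex_line is made up for this example and is not part of Alex's function; it formats up to 16 bytes as offset, hex bytes and printable ASCII.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one row of up to 16 bytes as
// "+offset  hex bytes  ascii", using '.' for non-printable bytes.
std::string hex_line(const unsigned char *data, std::size_t size, std::size_t offset) {
    std::ostringstream out;
    out << '+' << std::hex << std::setfill('0') << std::setw(4) << offset << "  ";

    // Hex column: always 16 slots wide so the ASCII column lines up.
    for (std::size_t j = 0; j < 16; ++j) {
        if (j < size)
            out << std::setw(2) << static_cast<unsigned>(data[j]) << ' ';
        else
            out << "   ";
    }

    out << ' ';
    // ASCII column: same printable range as the function above (32..126).
    for (std::size_t j = 0; j < size && j < 16; ++j)
        out << static_cast<char>((data[j] > 31 && data[j] < 127) ? data[j] : '.');

    return out.str();
}
```

Returning a string instead of printing makes the row easy to check in a unit test or to send somewhere other than stdout.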

Monday, 4 April 2011

Newsflash: C++ object commits seppuku

Check this out. Is it valid C++?

class X {
public:
  void dispose() {
    delete this;
  }
};

Strange pattern, isn't it? What happens if you try to dispose a stack object?

void f() {
   X x;
   x.dispose();
}

Indeed, nasal demons FTW, you're trying to free an invalid pointer. Yet if we change that a little bit...

void f() {
   (new X)->dispose();
}

Zomg now it works. It's weird, but it works. Why would anybody on earth do something like this? Can you guess when this would be useful?

Sometimes you launch a background job, and you don't really care when it's done. You may use a callback to be notified when the job is done, but if you don't really care then having an object which deletes itself is an option. You'll have to be very careful about it, though, because this is legal C++ too:

class X {
public:
  void dispose() {
    delete this;
    std::cout << "Hello world\n";
  }
};

Though "Hello world" will be printed, it will be running in a dead object. Which is fine, as far as the compiler cares, but if you do try to reference the this pointer, you'll be in a lot of trouble.
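One common way to rule out the stack-object accident is to make the destructor private and hand out objects through a factory function. The sketch below is an addition to the idea above, not something from the original snippets; the class name Job and the alive counter are made up for illustration.

```cpp
// Hypothetical self-deleting job. The private destructor means
// "Job j;" on the stack does not compile, so dispose() is always
// called on an object that came from new.
class Job {
public:
    static int alive;   // number of live Job objects, for illustration only

    static Job *create() { return new Job; }

    void dispose() {
        delete this;    // nothing may touch *this after this line
    }

private:
    Job() { ++alive; }
    ~Job() { --alive; }
};

int Job::alive = 0;
```

Usage stays close to the earlier example: Job::create()->dispose(). The compiler now rejects the nasal-demons version, since only members of Job are allowed to destroy one.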

Bonus reading: for a much more interesting note than mine, go and check "When does an object become available for garbage collection?" in The Old New Thing.

Thursday, 31 March 2011

CRTP for static dispatching

So, virtual dispatching is just too much overhead for you? I bet you do need every femtosecond from your CPU. Even if you don't, who doesn't like weird C++ constructs? Take CRTP, for example, a Curiously recurring template pattern:

template <class Derived> struct CRTP {
    const char* greeting() const {
        const Derived* self = static_cast<const Derived*>(this);
        return self->greeting();
    }
};

struct Hello : public CRTP<Hello> {
    const char* greeting() const { return "Hello world"; }
};

struct Bye : public CRTP<Bye> {
    const char* greeting() const { return "Bye world"; }
};

#include <iostream>
template <class T> void print(const CRTP<T> &x) {
    std::cout << x.greeting() << "\n";
}

int main() {
    print(Hello());
    print(Bye());
    return 0;
}

Using this weird-looking (aren't they all?) template device you can have static dispatching with most of the flexibility of dynamic dispatching. As a bonus, you'll drive all your cow-orkers insane! Bonus non-useful information: In C++0X you could use variadic templates and have a proxy object with static dispatching. How cool is that?
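For contrast, this is roughly what the same greeting example looks like with ordinary virtual dispatch, which is what the CRTP version avoids. The names Greeter, VHello, VBye and describe are assumptions made for this sketch.

```cpp
#include <string>

// Dynamic dispatch: the call goes through the vtable at run time.
struct Greeter {
    virtual ~Greeter() {}
    virtual const char *greeting() const = 0;
};

struct VHello : Greeter {
    const char *greeting() const { return "Hello world"; }
};

struct VBye : Greeter {
    const char *greeting() const { return "Bye world"; }
};

// Works with any Greeter, including ones chosen at run time.
std::string describe(const Greeter &g) {
    return g.greeting();
}
```

The trade-off: describe is a single function that accepts any Greeter chosen at run time, while the CRTP print is instantiated once per concrete type and resolved entirely at compile time.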

Tuesday, 29 March 2011

Time your time

"time" is a useful command line utility to measure how long it takes for your super optimized algorithm to run, but it's useful as a timer too: just write "time read" and press enter when you get tired of waiting. Instant timer on your console!

Thursday, 24 March 2011

Moving away from DB IPC

Last time I wrote about why DB IPC is bad. Now I intend to write about the way to start moving away from it, towards a better architecture.

As I mentioned, this pattern is deeply rooted across the whole enterprise platform, so removing it is not an easy task, and it can only be done in small steps. Small steps mean a compromise solution: you won't be going from IPC DB to a RESTful application in a week, so having an ugly-but-not-so-much-as-ipc-db solution is the way to go.

The first step to move from DB IPC to a services oriented architecture is moving from data driven applications to event driven applications. That means, instead of polling the database for changes, receive a notification that the data has changed and act upon the event.
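A minimal in-process sketch of what "act upon the event" means, assuming a hypothetical ChangeNotifier class standing in for whatever notification mechanism is available. It is not any real database API; it only shows the shape of the consumer once polling is gone.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a notification channel: consumers register
// a handler once instead of polling a table in a loop.
class ChangeNotifier {
public:
    typedef std::function<void(const std::string &)> Handler;

    void subscribe(Handler handler) { handlers_.push_back(handler); }

    // Called by the producer side when a row changes.
    void notify(const std::string &row_id) {
        for (std::size_t i = 0; i < handlers_.size(); ++i)
            handlers_[i](row_id);
    }

private:
    std::vector<Handler> handlers_;
};
```

The consumer code becomes a handler that receives the id of whatever changed, which is exactly the piece that can later be moved behind a real message queue without touching the processing logic.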

A way to implement notifications without polling is having the DB notify you of any changes that occurred. A way of doing this is using something like otl_subscriber, a wrapper for Oracle's notification features. Postgres has its own notification scheme (LISTEN/NOTIFY), MySQL AFAIK doesn't.

Once you have managed to separate the responsibility of processing the event and the data of the event itself, it's easy to go one step beyond and implement a messaging platform, like CORBA or something like AMQP.

Conclusion: the architecture may not be nice with DB notifications either, but you have taken the first step towards decoupling two different components. From this scheme to a real queue there's only one step, and once there you can finally begin to have a DB schema for each application.

Tuesday, 22 March 2011

DB IPC: Communicating processes the wrong way

A common pattern in enterprisey applications is DB IPC, probably one of the worst kinds of coupling you can have. If you tell me you never saw it I won't believe you, but in any case: DB IPC is an architecture antipattern, in which you have several semi-independent components which must share some kind of information, and do so through a database. The producer writes into a table, the consumer polls the table for changes.

For an otherwise perfectly designed application, DB IPC may seem like a bad thing but not the worst kind of architecture possible. Clearly a god object may look uglier than an IPC DB. Yet this kind of architecture leads to a tight coupling between the components' inner data structures, making any change in them very unlikely, if not impossible.

Inhouse applications tend to rely a lot on this pattern, albeit unknowingly, for historical reasons: components which are now different applications were once part of a single process, in which no IPC was needed. After these components start growing, instead of a careful and planned change IPC DB gets implemented. It is the path of least resistance, after all.

Steering away from this pattern is very difficult, as it requires a lot of changes to every single application on the enterprise platform, and the introduction of new technologies like CORBA or web services. Since this is a maintenance job and not productive (i.e. money making) development, it tends to get delayed.

Not everything is lost. An intermediate solution, not as ugly as IPC DB but not so nice as CORBA, is implementing a queue using the DB itself. We'll see a way of doing just that next time.

Thursday, 17 March 2011

Truth be told

I bet 90% of enterprisey architecture diagrams are more or less like this one.

From Geek and Poke

Tuesday, 22 February 2011

I thought we had deprecated regedit

Guys, I thought we had already agreed on this a long time ago. Windows registry sucks. It's a pain in the ass.

Why TF is regedit still used in Gnome? I'd switch to KDE, if only I wasn't so lazy.

Thursday, 17 February 2011

Using BoUML as a case tool

I don't really fancy CASE tools a lot, they are mostly fads, but I must admit that using BoUML to work on a design the other day was a nice surprise. Not only did the generated code not seem to be written by a trained monkey [1], it almost seemed to be usable with some tweaking. It even generated nice javadocs!

For this article I'll assume you have some experience with BoUML. If you don't, apt-get it (it's available for Linux and Windows) then come back later, I'll wait. BoUML's manual is quite good and includes a lot of screenshots, but if you're already experienced using it you may find this short checklist a quicker way to create a new project and use the code generation tools:

Languages > C++ [...]; this will tell BoUML which language it should generate. I haven't tried anything other than C++ but I've heard the others work fine.
Draw some nice UML to create a couple of classes (i.e. create a class view and a class diagram, then add some classes). Don't forget to add some relationships, so the generated code can be something more than "class Foobar{};"
Create a deployment view. Something like "Foobar_deploy"
Edit the class view, it should open a class view dialog. Select your new deployment view
Right click on each class and click "Create source artifact". This'll create a new artifact under your deployment view
Almost done. In project > Edit > Edit Generation Settings go to "Directory" tab and select a root directory
You're now ready to right click on your class view and select "Generate > C++". Congratulations.

Generating namespaces in C++ wasn't easy at first, and the manual may not be so clear about this one:

Create a package. Move everything there (your deployment view, class view, etc).
Edit the package. Under C++ complete each path and namespace name (?). For example, Foo, Foo Foo would generate your sources and headers under ./Foo, with namespace Foo.
Repeat for as many namespaces as you want. You can have nested namespaces but you'll have to specify the full path and namespace name, i.e. Foo/Bar, Foo/Bar_CPP, Foo::Bar

If you're going to use BoUML as a CASE tool, you'll want to name associations, use the multiplicity, create setters and getters and all the stuff you probably never did when documenting in UML to reach a minimum user-documentation weight. It seems these CASE tools haven't developed telepathy yet. Too bad.

Source: $ man BoUML

[1] Regardless of the fact that some may say that about my own code.

Tuesday, 15 February 2011

Vim Tip: Vigor

Oh man. Just do a search on Google Images for Vim + Vigor. There are so many WTF images to choose from, I just can't decide. Apparently, since Vim is the name of a sexual enhancement drug or something like that, combining Vim and Vigor is a formula for fun.

Well, Vim and Vigor have a different meaning in Linux. Just do an apt-get install vigor, then run it. You'll have lots of fun with Vim's evil cousin, I promise.

Thursday, 10 February 2011

BoUML: A usable UML editor for Linux

I bet you've heard it before, some people actually document their projects. Crazy, I know. UML is the tool of choice these days, and a lot of bad applications exist to make your life miserable with half implementations of a half standard language, and random crashes sprinkled here and there to keep you alert and saving at all times (I'm looking at you, Umbrello).

In Linux finding a decent UML application has been quite a difficult task, but after working with BoUML for quite some time I can say this tool, although not without its quirks, certainly meets my acceptance criteria [1].

BoUML, though a little bit unintuitive at first, is quite easy to use. Unlike most UML applications, this one works more like a CASE tool, so everything will have to be organized in packages. As a quickstart, create a class view and a class diagram inside that one, but you should really check the official manual. It'll save you a lot of grief, trust me.

On the downside, BoUML is not quite so good for free-style UML, so making a collaboration diagram with network symbols is impossible. I can live with that, Dia is a nice complement for free-style diagrams (it does work... though the result is uglier than the dog at the end of this post).

[1] WTF after so many years of really bad applications my "acceptance criteria" has fallen so low that "stable" almost equals "good"...

Tuesday, 28 December 2010

Oh sh....

Tuesday, 23 November 2010

A callgraph for C programs

I was playing with Egypt the other day. It's a very cool application. It'll generate a callgraph for your C programs, no more no less. Actually, doxygen (kind of) does that, but what I really liked about this application is its simplicity. From Egypt's page:

Egypt neither analyzes source code nor lays out graphs. Instead, it leaves the source code analysis to GCC and the graph layout to Graphviz, both of which are better at their respective jobs than egypt itself could ever hope to be. Egypt is simply a very small Perl script that glues these existing tools together.

Code reuse at its best. Give it a try.

Thursday, 18 November 2010

Brillant corporate inteligence

In the "I bet somebody got a really nice bonus" category, I found this one while trying to compile MySQL++ on Solaris: http://lists.mysql.com/plusplus/7811.

Don't worry though. Now it'll be renamed to #define ORACLE.

PS: Kind of related is the story about Oracle breaking everything after a %s/Sun/Oracle in the JVM [1], though in that case I'm more inclined to blame sloppy programmers.

[1] http://it.slashdot.org/story/10/07/28/2121259/Oracles-Java-Company-Change-Breaks-Eclipse

Tuesday, 12 October 2010

Easier inbox count with Gmail Favicon script

There's a very cool script to add to your browser but first:

Do you try to keep your inbox count at 0 (but usually fail miserably)?
Do you like (need!) to be notified when a new mail arrives?
Do you like Opera?

If you meet these conditions then you are very sick and need professional help. In the meantime, go and check Gmail Favicon Alerts 3, a cool script which changes Gmail's favicon to show your current email count. It works on Opera but it makes it crash. Most likely the script is not the one to blame here...

Thursday, 7 October 2010

Template Metaprogramming XVI: Appendix

I forgot but one thing about these posts: the titles. No one got the hint but they are all chapter names from one of my favorite series, Nowhere Man.

Monday, 4 October 2010

mv, mf, Ubuntu's "did you mean?"

Some time ago I found out about Ubuntu's "did you mean" thingy on the console. It's very cool. Of course, I found it after making a typo on the console: I wrote mf instead of mv. Ubuntu suggested I was trying to use "mv", but I could install "mf", if I wanted to. I was a little bored, so I researched a little bit what mf is.

Turns out mf is metafont, a programming language to define vector based fonts (!), kind of like postscript. It's written by Donald Knuth, and there are three bits of trivia which make this post somewhat meaningful:

The program's version number asymptotically approaches e; right now it's on version 2.718281
rtfm! The section on comments says: "Warning: Type design can be hazardous to your other interests. Once you get hooked, you will develop intense feelings about letterforms; the medium will intrude on the messages that you read. And you will perpetually be thinking of improvements to the fonts that you see everywhere, especially those of your own design."
And lastly, on the bugs section: "On January 4, 1986 the 'final' bug in Metafont was discovered and removed. If an error still lurks in the code, Donald E. Knuth promises to pay a finder's fee which doubles every year to the first person who finds it. Happy hunting."