Things I learned while writing Synema

As I pointed out in previous blog entries, I have been running short on time in the last few months. The reason is that I put quite a lot of time into a project that was assigned to me as part of my studies. The goal of the project was to create an application allowing graphical monitoring of various system and network security tools, such as SELinux, PIGA (currently under development in my school, ENSI of Bourges), Osiris, Snort, etc.

SYstem and NEtwork security Monitoring Application

This application, named Synema, was meant to be developed as a first project for first-year students in computer science and security, so there were no big expectations about it when it started. My coworker and I were put in charge of writing a plugin framework that would allow other students to turn text reports into images with the help of libploticus, and to display those images in a provided area using the Cairo library. As the other students were beginners, the API had to be simple and contain as few functions as possible, and it had to be available in C.

These constraints, and the short deadline (three months to write the API and an application that would use the plugins to display data about real machines), limited the range of possibilities, and I am not extremely satisfied with the result (the first reason being that only one group of students managed to write a plugin whose code I consider clean enough for inclusion). The project is still going on, and it may be pursued in a different form (focusing on certain security tools and doing deeper data visualisation and data correlation) in the next months, but this is not what I wanted to blog about; and even if I wanted to, I could not, since I’m not very brilliant at foreseeing the future. Instead, I am just going to speak a bit about the development of Synema, with the hope that it can be helpful to someone (especially the things that I could not find anywhere else on the Internet, though it is vast and I’ve not visited all of it yet!).

Let’s begin with screenshots so that you know what I’m speaking about:

From text to images

As you can see in the screenshots, Synema is a frame-based application, where each frame displays timestamped data coming from a specific security tool. Each plugin written by students takes care of one security tool, and is split into two parts:

a daemon that continuously parses logs coming from real machines, and creates light reports out of them
then, a library containing a function that parses these reports and generates an SVG or PNG image out of them with libploticus

Well, this is what was intended. Now, if you think about the fact that there are several frames to be displayed at the same time, it means several plugins can call libploticus concurrently. But, considering that libploticus is not thread-safe, we had to replace it with horrible system() calls to the ploticus binary. You think this is ugly? Well, me too, but I had no time to try to make libploticus thread-safe. Anyone willing to do it will get a free beer from me (once it’s done of course :p).
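To give an idea of what such a workaround can look like, here is a minimal sketch, not the actual Synema code. The function names are hypothetical, and the binary name (ploticus) and its -svg and -o options are assumptions that should be checked against the ploticus installed on your system. The point is only that each rendering job runs in its own process, so plugin threads never share any ploticus state.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format the command line for one rendering job.
 * Returns 0 on success, -1 on an empty path or a buffer that is too small.
 * Paths are assumed to be trusted and free of shell metacharacters, because
 * the resulting string is handed to system(). */
int build_ploticus_command(char *buf, size_t size,
                           const char *script, const char *output)
{
    assert(buf != NULL && script != NULL && output != NULL);
    if (strlen(script) == 0 || strlen(output) == 0)
        return -1;
    /* Assumed invocation: "ploticus -svg <script> -o <output>" */
    int n = snprintf(buf, size, "ploticus -svg %s -o %s", script, output);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: render one report by spawning the ploticus binary.
 * Each call gets its own child process, so it can be used from several
 * plugin threads at once. Returns 0 if the command exited with status 0. */
int render_report(const char *script, const char *output)
{
    char cmd[1024];
    if (build_ploticus_command(cmd, sizeof cmd, script, output) != 0)
        return -1;
    return system(cmd) == 0 ? 0 : -1;
}
```

A plugin's refresh thread would call something like render_report("snort.pl", "snort.svg") and then ask the frame to redraw. The price is one fork and exec per image, which is why it feels ugly compared to a library call.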

Each library is a set of predefined functions, among which:

one that uses librsvg to display the generated image, which is called automagically by catching the GTK+ expose event of the drawing area widget in the frame
the function that generates the image out of logs, called in a separate thread at regular intervals (either fixed, or calculated from the average time between changes in the reports’ content)
other functions that allow managing user events (e.g. when the user changes the machine whose data is displayed)
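The "calculated" refresh interval mentioned in the list above can be sketched in a few lines. This is an illustration under my own assumptions rather than the code Synema uses: the function name, the clamping bounds and the fallback to a fixed interval are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: given the times at which a report was seen to change
 * (oldest first), return the mean gap between changes in seconds, clamped to
 * [min_s, max_s]. With fewer than two samples there is no gap to average,
 * so the fixed interval is returned instead. */
double refresh_interval(const time_t *changes, size_t count,
                        double fixed_s, double min_s, double max_s)
{
    assert(min_s <= max_s);
    if (changes == NULL || count < 2)
        return fixed_s;

    /* The mean of consecutive gaps is (last - first) / (count - 1). */
    double mean = difftime(changes[count - 1], changes[0])
                  / (double)(count - 1);

    if (mean < min_s)
        return min_s;
    if (mean > max_s)
        return max_s;
    return mean;
}
```

For changes seen at seconds 100, 110, 130 and 160, the gaps are 10, 20 and 30 seconds, so the thread would sleep 20 seconds between two image generations. The clamp keeps a burst of log activity from turning the refresh thread into a busy loop.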

But the core of Synema is rather simple. Generate reports, generate images, display whatever you want in a tiny frame. All you need to do now is find out what data can be of interest to your users, and be able to write one or two hundred lines of proper C code (which was more difficult than expected for some of my classmates, but my first attempts at writing code were not very successful either, so I won’t blame them).

Dynamically loading code

We had to decide on a way to handle plugins. At first, a rough and very inelegant solution was to ask every student to prefix their functions with a unique id, so that all source files could be linked into the executable. Indeed, I did not know at the time that you can dynamically load code into a running executable. My friend Martin Peres (the master of this octopus den) told me about dlopen and dlsym, and these two functions suited me perfectly, since they respectively open a shared library object and search for symbols inside it.

With dlopen, it became dead simple to build Synema without any plugin-specific code, and to build plugins separately. This allows packaging plugins separately from the main application, and reduces the weight and number of dependencies of the Synema binary. Also, if a plugin’s code does not build (and this happens more often than you would believe when first-year students write the code), Synema is not affected, and a faulty plugin can easily be removed from the system without preventing our application from running.

I won’t go into the details of the dlopen API, because it is very well documented on your system. All you have to know is that not all systems use dlfcn.h to declare these functions. Solaris systems use another header, but I forgot which one (yeah, Synema doesn’t even build under Solaris, shameful). Of course, Windows-specific code will also be required for those of you who consider portable applications should also run on proprietary crap. I advise you to look at PPassKeeper’s source code if you’re interested in this.
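For readers who have never used these two functions, here is a minimal sketch of the dlopen and dlsym dance. It is not taken from Synema: load_unary_function is a hypothetical helper, and the usage note loads cos from libm.so.6 (a Linux-specific file name) as a stand-in for a plugin entry point. On older glibc versions you also need to link with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Type of the symbol we expect to find: double f(double). A real plugin
 * would export its own entry points with their own signatures. */
typedef double (*unary_fn)(double);

/* Hypothetical helper: open a shared object and resolve one symbol from it.
 * On success, returns the function and stores the library handle in
 * *handle_out (to be released with dlclose). On failure, returns NULL and
 * sets *handle_out to NULL. */
unary_fn load_unary_function(const char *library, const char *symbol,
                             void **handle_out)
{
    assert(library != NULL && symbol != NULL && handle_out != NULL);

    *handle_out = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (*handle_out == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    unary_fn fn;
    /* Assigning through void ** is the usual POSIX idiom for turning the
     * void * returned by dlsym into a function pointer. */
    *(void **)(&fn) = dlsym(*handle_out, symbol);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(*handle_out);
        *handle_out = NULL;
        return NULL;
    }
    return fn;
}
```

Calling load_unary_function("libm.so.6", "cos", &handle) gives back a pointer you can call like any other function, and a missing library or symbol is reported at run time instead of breaking the build. That is exactly the property that keeps a faulty plugin from taking the main application down with it.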

GTK+ is not thread-safe

One thing I lost way too much time with during Synema’s development is struggling with GTK+ and threads. I wanted a visual indicator that a frame’s displayed data (usually an image) was being recomputed, so I decided to use a GtkSpinner (available from GTK+ 2.20, but you can find it in gedit’s source code).

I had to control when to start and stop the spinner from a thread other than the one running the GLib main loop. Trying to use the GtkSpinner directly for that led to random crashes, so it clearly wasn’t the thing to do. A quick search on Google taught me this was due to the lack of thread safety in GTK+. I looked for ways to circumvent this problem, and several blogs and how-tos recommended simply making the GTK+ calls inside a function registered with gtk_idle_add (deprecated in favour of g_idle_add), so that they would be called from the main loop. Well, this was not enough, and crashes were still occurring.

I asked Google for some help again, and I was then recommended to use gdk_threads_add_idle instead, since this function is the one meant to allow adding idle functions in a thread-safe way. Guess what… I still had the same random crashes going on after some time, and I knew for sure they were caused by my calls to gtk_spinner_start and gtk_spinner_stop… This solution should theoretically work, according to the almighty Internet, but it didn’t, so I had to take things to the next level. In the end, I use gdk_threads_add_idle to call a function that emits signals on a custom GObject, whose handlers call the GTK+ functions I need. This is overkill, but at least it works.

If you are considering writing multi-threaded applications and need a decent GUI framework, I can only recommend that you use Qt (and it’s easy to learn)! I could not, because the code had to be in C for the other students to understand it, but this kind of detail makes me bitterly regret using GTK+.

Update

Jannis Pohlmann, an Xfce developer, provided me with a simpler solution, using gdk_threads_enter and gdk_threads_leave. Read his comment below if you want to know more about it. Thanks Jannis!

DBus’s generated introspection files

The log player is an external daemon, running a DBus server, that just reads a given set of files and writes them back, little by little, to the place where Synema plugins’ daemons expect to find real logs. This allows replaying archived logs, in order to analyse what’s happening on a honeypot, for instance (the whole point of Synema is actually to allow having a glance at the state of a honeypot). The student who was in charge of the player provided me with an XML file that could be used with dbus-binding-tool in order to generate the DBus interface used by his log playing daemon. After generating this interface, all I had to do was write functions calling the right DBus methods, and connect an appropriate handler to the DBus signals sent by the server.

Alas, when I tried to connect to the signal described in our XML file, it was just not working… My handler was never called and it took me several hours to find the culprit. In order to connect to a DBus signal, you first have to add it to your DBus proxy using dbus_g_proxy_add_signal, and then you need to connect a handler to it via dbus_g_proxy_connect_signal. The server will use the very same signal name as in the XML file, for instance, ‘foo_bar’. But, and don’t ask me why, because I have no clue, the client instead has to call the signal ‘FooBar’, because this is the name of the signal being emitted client-side… It took me a while to figure this out, and I could find no help on the Internet, so I’m writing it down here with the hope that it can be helpful to anyone.
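To make the two spellings concrete, here is a small sketch that turns an underscore-style name into the CamelCase form. It only illustrates the naming convention, assuming it is the usual underscore-to-CamelCase mapping (the only pair I can vouch for is foo_bar and FooBar). It is not code from dbus-glib, and uscore_to_wincaps is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert "foo_bar" (or "foo-bar") to "FooBar".
 * Returns 0 on success, -1 if out cannot hold the result. The size check is
 * conservative: the output is never longer than the input, so a buffer that
 * can hold the input is always large enough. */
int uscore_to_wincaps(const char *name, char *out, size_t size)
{
    assert(name != NULL && out != NULL);
    if (strlen(name) + 1 > size)
        return -1;

    size_t j = 0;
    int upper_next = 1; /* the first letter is always capitalised */
    for (size_t i = 0; name[i] != '\0'; i++) {
        if (name[i] == '_' || name[i] == '-') {
            upper_next = 1; /* drop the separator, capitalise what follows */
            continue;
        }
        out[j++] = upper_next ? (char)toupper((unsigned char)name[i])
                              : name[i];
        upper_next = 0;
    }
    out[j] = '\0';
    return 0;
}
```

With this convention, a signal that the server code knows as foo_bar is the one a client has to pass as "FooBar" to dbus_g_proxy_add_signal and dbus_g_proxy_connect_signal.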

Update

The problem I encountered came from a misunderstanding between a coworker and me. He used the annotation property on the methods of our XML file in order to change their names to ‘foo_bar’-like names, but I didn’t do so for the signals I added to the file. While a wrong name is easy to spot for a method (because the code won’t build), a wrong signal name on the client side goes unnoticed, since nothing fails at build time.

CMake and code generation

I will finish with a code snippet from my CMakeLists.txt, the CMake equivalent of Autotools’ autogen.sh, configure.ac and Makefile.am. It took me a while to choose between CMake, which I didn’t know at all but for which I had an experienced user at hand, and Autotools, which I knew better but whose syntax and file clutter I was getting fed up with. I finally decided to give CMake a go, and it was truly horrible to find a decent starting point. As with Autotools, the simplest method I found was to steal another CMakeLists.txt and adapt it to my needs, relying on the list of available CMake commands to find out how to do new things.

I am still unsatisfied with CMake, because having it actually do custom commands is just way too difficult (actually, it’s not hard to tell it to add custom commands, but hard to tell it to run the damn commands!), and because I still don’t know how to check for installed packages under my favourite Unix systems (and I couldn’t care less about Windows and MacOS, so please don’t tell me it’s hard to find a portable way to do this).

So, for several reasons I may need to generate code prior to compiling Synema (including generation of the previously mentioned DBus interface files, and the custom GObject generated with the awesome Gob 2 utility). Here is what I’m doing for the DBus file generation:

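# Regenerate both DBus binding headers whenever the interface XML changes.
ADD_CUSTOM_COMMAND(
    OUTPUT "${CMAKE_CURRENT_SOURCE_DIR}/player/dbus-methods.h"
           "${CMAKE_CURRENT_SOURCE_DIR}/src/log-player-dbus-methods.h"
    # Server-side glue, used by the log player daemon
    COMMAND dbus-binding-tool --mode=glib-server --prefix=object "${CMAKE_CURRENT_SOURCE_DIR}/data/log-player-demo.xml" > "${CMAKE_CURRENT_SOURCE_DIR}/player/dbus-methods.h"
    # Client-side bindings, used by Synema itself
    COMMAND dbus-binding-tool --mode=glib-client "${CMAKE_CURRENT_SOURCE_DIR}/data/log-player-demo.xml" > "${CMAKE_CURRENT_SOURCE_DIR}/src/log-player-dbus-methods.h"
    DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/data/log-player-demo.xml"
)
# The sources that include the generated headers must be listed explicitly.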
SET_SOURCE_FILES_PROPERTIES("${CMAKE_CURRENT_SOURCE_DIR}/src/log-player-dbus.c"
    PROPERTIES OBJECT_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/src/log-player-dbus-methods.h")
LIST(APPEND ls_headers "${CMAKE_CURRENT_SOURCE_DIR}/src/log-player-dbus-methods.h")

SET_SOURCE_FILES_PROPERTIES("${CMAKE_CURRENT_SOURCE_DIR}/player/server.c"
    PROPERTIES OBJECT_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/player/dbus-methods.h")
LIST(APPEND demo_player_headers "${CMAKE_CURRENT_SOURCE_DIR}/player/dbus-methods.h")

With this command, I can guarantee the following properties:

if the XML file changes, DBus bindings will be generated again. This is because the custom command depends on the XML file
if the output file log-player-dbus-methods.h changes, then log-player-dbus.c will require rebuilding
and especially, if log-player-dbus-methods.h does not exist, building the application will generate it because log-player-dbus.c depends on it

This solution is not ideal because it forces you to list all the files that depend on your generated files, but it is enough in most cases. There are not many examples of this online, so hopefully this one will be useful to someone else.

Voilà

That’s already the end of this blog post. I hope people will now hate me a little less for having been so unavailable lately. I’ve just been wasting some time writing C code and using APIs I hope to never have to use again. I hope to soon have the occasion to write about my first steps with C++ and Qt, because it’s very likely Synema (or whatever succeeds it) will be written in that language, so that I don’t run out of surprises!