Backing up Bazaar repos

In one of my previous posts I mentioned that my blog was brokenito. Actually my blog is just a bunch of Markdown files that I compile into a blog with ikiwiki, and I store it in a Bazaar repo.

A month ago I bought a new hard disk and installed it in my notebook, replacing the one that used to hold the sources of my blog. I reinstalled Debian Sid from scratch1 and just grabbed everything from the backups (yes, I have weekly backups; how many can say that? :)

The problem was this: the backups included the Bazaar working copies/branches, except for the (normally empty) directory .bzr/repository/upload/. This directory is populated with a temporary file each time you commit, but bzr doesn't try to create it if it doesn't exist; I think they assume it's always there because it's created when you do a bzr init.

So there are two bugs here: first, my backup system (just a bash script that runs rsync) should store empty directories (fixed); and second, I think bzr should create the dir if it's not there. I will try to make a patch and submit a wish in their bug tracker.
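
For what it's worth, here is the kind of thing that makes rsync drop empty directories; just a sketch with made-up paths, not necessarily what my script was doing:

# a file list built with 'find -type f' never contains directories, so empty ones silently disappear
( cd ~/src && find blog -type f ) | rsync -a --files-from=- ~/src /mnt/backup

# letting rsync walk the tree itself keeps them
rsync -a ~/src/blog /mnt/backup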

This one-liner should fix all my restored repos:

find . -name .bzr | while read repo; do mkdir -vp "$repo/repository/upload/"; done

  1. Somehow the new installation feels faster than the previous one, especially when installing new software or updates. I think this might be related to the high fragmentation the old system might have had. I should explore this. 

Is Fedora a rolling distro?

Today, looking at the output of my mirror script for my LUG's FTP repos, I found, as I do almost every two or three days, a very long list of updated packages in Fedora's updates repos. This wasn't new; I already knew that their updates repos are normally bigger than the distro's own. Here you can see what I mean:

$ du -sm fedora/12/Fedora/i386/os/Packages/ fedora/updates/12/i386/
2853    fedora/12/Fedora/i386/os/Packages/
7764    fedora/updates/12/i386/

More than twice the size! What's going on here? To answer that, one has to get the package names from both repos and compare them. The easiest way I found was to use the filenames, but in contrast to .deb based distros, which separate package name, version and architecture with _, .rpm based distros (such as, of course, Fedora) use -, which can also occur in the middle of the package name. So I hacked a one-liner that approximates what I wanted:

ls -1 fedora/12/Fedora/i386/os/Packages/*.rpm | python -c 'import sys; \
    lines= sys.stdin.readlines (); \
    lines.sort();\
    packs_ver= [ "-".join ( x.split (".")[0].split ("-")[:-1] ) \
        for x in lines ]; \
    print "\n".join (packs_ver)'
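
As an aside, a sturdier way to do this, assuming the rpm tool is installed on the mirror box (an assumption; it is also slower, since it opens every package), would be to ask rpm for the name instead of guessing it from the filename:

for p in fedora/12/Fedora/i386/os/Packages/*.rpm; do rpm -qp --queryformat '%{NAME}\n' "$p"; done | sort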

Putting both outputs in separate files and using diff -y --left-column, one can get some rough numbers:

# how many are on each side?
$ wc -l fc12*
  2399 fc12.list
  5299 fc12-up.list

# these packages have not been updated
$ diff -y --left-column fc12* | grep '<' | wc -l
485

# these lines are different, so they add one package on each side
$ diff -y --left-column fc12* | grep '|' | wc -l
887

# these are new packages
$ diff -y --left-column fc12* | grep '>' | wc -l
3385

# these are common, so they have been updated
$ diff -y --left-column fc12* | grep '(' | wc -l
1027

So the first clue that I'm right: Fedora adds packages (3385+887) to the original distro (2399) via the updates; that is, it grows over time. One notable remark is that this happens with every 'release', as if they started from scratch (or from a small base) each time. Also, half of the original packages (1027) have been updated, while the rest (485+887) have not. So in total, FC12 started with 2.4k packages last November, and in only 4 months 1k of its packages have been updated at least once (and I know the common case is more than just one update) and there have been 4.2k additions, adding up to roughly 6.6k packages so far. Also, and this is an important fact, there are between 20 and 170 updates per day (at least those are the numbers that came up looking at this week's logs).

I know that Fedora is mostly a development distro (in the sense that it is in constant development, not that it is aimed at developers, at least not at mere mortals-among-developers like me), so it only makes sense that there are so many updates, but then why bother making a release at all?

The answer lies, I think, in the notable remark I made above. Fedora starts almost from scratch every time; that's why they can introduce so many and such big system changes from one release to the next. SELinux and PulseAudio are the two I can remember without checking the interwebs. It's probably because of those fresh and sometimes radical changes that core developers use it. Maybe one could call it the ultimate (b)leading edge distro, shy of LFS and probably closely followed by Gentoo.

So my conclusion is that, right now, Fedora is not for me anymore. Maybe some 8 years ago, when I had time to fiddle with the system, but now I prefer to trust a more stable (but not completely, mind you) Debian Sid. With only 6k+ packages in their official repos one could also say that it is small compared with the ~28k packages in Debian Sid, but their packaging policies differ too much for a one-to-one comparison1 (nonetheless, I think there is more software in Debian than in the official Fedora repos). What does puzzle me is why a noticeable part of the sysadmins I know prefer to run it on their servers. Maybe I should ask them :)


  1. Debian, for instance, splits library packages in three: the library itself, the development files (headers, static versions) and optionally a debug version with symbol info for useful stack traces. There are more examples. 

Compiling native apps for Android explained

In my daily work (the one that pays the bills) I've been burdened with the responsibility of porting two projects to the Android platform: hop, a programming language for the web 2.0 / application server written in Scheme; to do that I would also need to port bigloo, a Scheme compiler. The goal is to be able to run hop apps on the phone itself.

I really didn't know much about the platform then, only that it would probably be a fun one to have on your phone, being GNU/Linux based, open source and all. There was also the promise of getting to play with these phones; being somewhat of a technophile, that was definitely a plus. With a smile on my face, I cracked my knuckles, sat down and started... reading.

The first thing one learns when reading about developing for the platform is that the language is Java; there's a complete SDK for developing apps in it. If you want to port some C/C++ library, you can do so with the NDK, and that's pretty much it. Now, from my point of view, this is very restricting: you cannot choose your own programming language. Even if I like the language a bit (but not its standard library or the way the language is driven), I prefer Python or whatever suits the problem better. But then, I understand Android/Google/OHA need to be control freaks about the apps running on the phone, because resources are limited and you cannot leave an application open forever. So they devised their component lifecycle, which reminds me of OSGi, made it in Java, made all the tools, and that was it.

But in my case I needed a native port, and that's for two reasons. First, even though bigloo can produce Java bytecode, and even though I've been told that Dalvik, their JVM implementation, can run Java bytecode1, the result is not fast enough, the fault lying both with bigloo and with the JVM. Certainly bigloo's Java output might not be the best, but the JVMs we tried were also too heavy2. The second and most important reason is that if we did a Java port, we would be tied to the component lifecycle I mentioned above. We need a hop daemon running in order to be able to run its apps. If this daemon is shut down at the platform's request3, there's no way to wake it up again when a (web) client tries to make a request4, so the request would fail and that would make hop useless. Hence, a native port5.

I have to say in everybody's defense that this is my first contact with toolchains and cross compiling.

Given that native porting is not supported, I went off searching. The first posts I found talked about static linking. This guy even pointed to a first wrapper for the precompiled toolchain and gave some details on the CFLAGS to set. But static linking was not an option for us: hop uses bigloo's runtime libs as dynlibs6.

Then I hit Rob Savoye's efforts to port gnash. There I started to fully understand the Android platform: «while the C compiler is ok, the C++ compiler is crippled. No iostream support in libstdc++, no locales, etc... Then to make it more fun they use their own libc called Bionic, which is trimmed down for embedded devices». He has more details on C++ compilation because he needed it (we don't... yet).

So I turned to what I knew, even if only by name: emdebian. I was still a little hazy about what cross compiling meant, so I started asking questions. Thing is, emdebian is coupled with Debian (it only makes sense), so the first step would have been creating a Bionic package. Of course, this was way beyond my requirements and abilities. In that channel I was also pointed to a post by Harald Welte, which in turn links to a talk called "Android Mythbusters", where we also learn that the linking is not standard [s.6], that there are hacks in the device support [s.7], hardcoded hotplug [s.8], no tslib support [s.12], hardcoded limits [s.13], and a lot more calamities, including bad community support [s.18] and, at its peak, the CyanogenMod-gate.

All that analysis is nice, but I was still stuck with no porting system. My boss suggested scratchbox, but just like emdebian it only has support for (e)glibc and uClibc. It is possible to add support for other platforms, but their script didn't work and my feeble attempts led me no closer to a solution.

Then, with a stroke of luck, I found Joel Reymont's Android Notes, which not only explain the platform a little more, but also include an example Makefile for compiling executables for Android. And then, just a few minutes later, Takuya Murakami's wrapper, which, together with the prebuilt toolchain in Android's source code, did most of the job. I finally got a simple Hello World! dynamic executable running in the emulator.
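
Out of curiosity, this is roughly what the wrapper ends up doing under the hood: compile against Bionic's headers, skip the regular startup files and libraries, and link against Android's own dynamic linker and crt objects. Take it as a sketch; the paths below are just how my tree is laid out, and the full invocation needs a few more include directories, which is precisely why the wrapper exists:

# paths are only illustrative; they depend on the source checkout
SRC=$HOME/android/source
BIN=$SRC/prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin
LIB=$SRC/out/target/product/generic/obj/lib

$BIN/arm-eabi-gcc -I$SRC/bionic/libc/include \
    -nostdlib -Wl,-dynamic-linker,/system/bin/linker \
    $LIB/crtbegin_dynamic.o hello.c -L$LIB -lc $LIB/crtend_android.o \
    -o hello

# run it in the emulator
adb push hello /data/local/tmp && adb shell /data/local/tmp/hello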

From here it was mostly downhill. The last stone in the path was a linking problem. bigloo uses Boehm's gc, a renowned garbage collector used by several projects, including mono and ocaml. gc needs a symbol called __stack_base__ defined, which, from what I could find, used to be defined in crt0.o, one of the pieces the compiler normally attaches to executables to make them so. The problem is that this file is in neither the SDK nor Android's source code/prebuilt toolchain. To be fair, that file is not to be found in a fairly new distro either. What I did find was a reverse engineered crt0.S 7, but it doesn't define the symbol I needed, and in any case I already had running executables in the emulator. On the other hand, naoya_t seems to have hit the same problem, but he writes in Japanese and I couldn't get hold of anyone to translate it for me8.

Long story short, I made a hack to make the linking happy. bigloo finished compiling, I pushed it to the emulator and... it worked!
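
I won't go into the hack's details here, but as an illustration of the kind of trick that keeps the linker quiet (not necessarily what we did, and with no guarantee that gc gets a sane stack base out of it), GNU ld can be asked to define the missing symbol itself:

# define the missing symbol just to satisfy the linker; whether gc then
# computes a correct stack base from it is a different story
LDFLAGS="$LDFLAGS -Wl,--defsym,__stack_base__=0"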

Some notes about this port: with a little bit of luck it will be officially announced next week, even if it works with a hack. Ivan Maidanski assures me that the hack should not be necessary, so probably there's an error on our side when configuring gc; but then again, I spent more than 2 weeks on that problem alone and I followed all the suggestions they gave us9, so I'm pretty sure it's something in the middle. Also, mono has been ported to Android, and even though I read references to the NDK, I've been assured that it's a native port. If they could port it without hacks, so should we10.

Lastly, some notes about the crippled GNU/Linux running behind the Android platform: besides all the low level stuff mentioned by the others cited above, there is no cp or tee; ls does not support the -R option, mkdir does not support the -p option, and when one of these tools doesn't recognize an option, instead of barking, it happily accepts it as another file to operate on. So mkdir -p bongs/bonga creates a -p dir and barfs on bongs/bonga. What is there, yes, are dd and strace! Not very surprisingly, there's no /etc/passwd, but then the system somehow knows several other users besides root and shell, so I guess they're hardcoded somewhere. The effort they put into crippling this thing is really impressive. Most probably they're using a tool like BusyBox11, and if so, they are only saving a few directory entries in the case of cp or tee and wasting time in the other cases.

All in all, I'm disappointed by the platform. I understand that what we're trying to do is not supported, and that alone tells me that I don't want an Android device until this has changed, if it ever does. Probably the fact that Android is (now) open source and that the Cyanogengate led to the formation of «the Open Android Alliance (not to be confused with the Open Handset Alliance), an organization whose stated goal is to distribute "a Flavor of Android that is fully customizable and does not rely on Google or other copyrights"» gives the platform some hope.

Meanwhile I will prefer Maemo (which at least looks more open), especially now that Qt will be the default toolkit. I'll also keep an eye on Bada, Samsung's new platform. There are not many details about this last one, except that they have their own toolkit in C++, and from the API documentation, it looks like a very thorough one.


  1. but then the Wikipedia page says you have to convert the .class file into a .dex file using a tool called dx, and that «It uses its own bytecode, not Java bytecode». 

  2. but then Dalvik is supposed to be fast enough for the platform. 

  3. «If an activity is paused [not having the focus] or stopped [run at some time but obscured completely], the system can drop it from memory either by asking it to finish (calling its finish() method), or simply killing its process». 

  4. because instead of communicating via Android's component system, the communication is via a simple TCP socket. 

  5. but probably we'll have to make at least a Java wrapper to start it. I'm even pondering making a "hop apps" app which would be a WebKit widget and a hop server running in the background. 

  6. or in any case I would need to compile bigloo dynamically first to get the libs needed to compile hop statically. 

  7. that blog has some more details about porting to Android. 

  8. a Chinese friend could read some of it, but his translation didn't make more sense than Google's. Babelfish just timed out on that page. 

  9. the ones that made sense, anyways. 

  10. when I was looking for anyone involved in mono's IRC channel, I was told to look in the code. A simple “no one involved is here, try reaching them by mail” would have sufficed. Way to go, mono community! 

  11. and if they're not, then I don't know what's going on in their heads. 

Going to FOSDEM 2010

I'm going to FOSDEM, the Free and Open Source Software Developers' European Meeting

Hell yeah I'm going! It's the first free software event I'll be attending since I moved to France and I'm quite excited about it. I really don't know what to expect, except drinking Belgian and Dutch beer (which I've been missing since last summer, which I spent in the Netherlands 20 m away from a bar... how many times I came back drunk during that time I don't even fathom to try to count). Here I can barely get my hands on some Leffe and that's it.

So, I can't wait for those hangovers, yay!

Went to FOSDEM 2010

Having an extraordinary hangover is, by far, not the best way to start attending a conference. But with FOSDEM there doesn't seem to be any alternative. The night before the conference a bar is kidnapped from the night circuit so FOSDEM-goers can get together and have a drink or two... or three, four, I-lost-count. You have to buy 'tokens' from the organizers (EUR 3, standard price) and then exchange them at the bar. I bought 4, which was probably a little too much for me (I weigh around 60 kg and my food intake during the day was not even at the 'enough' level), but that was not the reason I was so hungover the next day. The reason was that clever people left early and left their extra tokens behind, and that I didn't control myself. That's probably because the first one I had was a Delirium Tremens, which is above 8% instead of the normal 4-5%. Whoever is to blame, no matter how hard I tried, I couldn't get out of bed before 15h the next day.

So the first thing I did at FOSDEM was miss the talk about beernet1, something that sounds like a transactional, replicated DHT. I also missed the KDE group photo, which would have been a good way to introduce myself to the (rest of the2) KDE guys.

The first talk I saw was Will Stephenson's talk about OBS, which at first I wasn't really interested in seeing, but which ended up being really interesting. The OpenSuSE guys have a huge farm of machines for building packages. They can build packages for most of the mainstream distributions (I remember OpenSuSE itself, Fedora, Debian, Ubuntu, Mandriva and more), for all the versions that run KDE4, for several archs. All that's needed is the source code and the packaging instructions. And all this is available to us developers, just a registration away. The other half of the talk was given by Luboš Luňák, who also promised to release a tool to help generate the .spec and the debian/{control,rules,etc} files. I'm expectantly awaiting them!

That night I hung around with some Debian guys, some of whom I knew from DebConf8. We found a nice Italian restaurant, where we ate pizza and pasta, almost alone on the first floor. The waiter was telling jokes the whole time, and when we were asking for the bill someone asked for another beer, which led the rest to ask for dessert and coffee. Fun times, which continued in the Monk bar until 24h. One thing about bars in Belgium: they don't have a smoking ban, so by the end of the night I was reeking of smoke even though I don't smoke at all.

The second day surprised me with the fact that the KDE track had already finished (it ran only the previous afternoon), so I wandered around a little. I went to a talk called "apt-get for Android", which, given my current work, got all my attention. Unluckily it was just an Android app store (a free one, both in the beer and the freedom sense) and not an effort to port some Debian stuff to the platform (it wouldn't make any sense anyways). I also saw the oFono talk, but I didn't even get its reason for existing.

Sebastian Trueg gave a talk about the Nepomuk stack. It's impressively huge and incredibly useful, but it still needs more integration into apps. I got the opportunity to talk with Trueg after the talk and I promised to look into a Konqueror plugin that's sitting in playground. I'm also concerned about the UI for tagging/adding tags. Let's see if I can keep my promises.


  1. again, given the size and behaviour of my hangover and the project's name, even if I could have crawled out of bed in time to see it, it probably would not have been advisable anyways :) 

  2. being a user and a more-times-off-than-on developer makes me a KDE guy, but I'm still not confident enough. 

satyr 0.3.2

satyr-0.3.2 "I should install my own food" is out. The Changelog is not very impressive:

  • We can save the list of queued Songs on exit and load it at startup.
  • The queue position is not presented when editing the title of a Song that is queued.
  • Setting the track number works now (was a bug).
  • Fixed a bug in setup.py.

but the last one is somewhat important (Thanks Chipaca). Also, 2 months ago, I made the satyr-0.3.1 "freudian slip" release, when user 'dglent' from http://kde-apps.org/ found a packaging bug. It was not only a bugfixing revision, it also included new features:

  • nowPlaying(), exported via dbus.
  • 'now playing' plugin for irssi.
  • Changes in tags are copied to the selected cells (in the same column). This allows 'massive tag writing'.
  • Fixed "NoneType has no attribute 'row'" bug, but this is just a workaround.
  • Forgot to install complex.ui.

Now go get it!

satyr 0.2

I figured out several things after the last/first release. One of them is that you can't try to pull off a beta for your first releases. Betas are for well established pieces of code which are supposed to be rock solid; initial releases are not. Another thing I figured out (or actually remembered) is that old saying: release early, release often.

So instead of a 0.1 'official' release, where all the bugs are nailed down in their coffins and everything is as peachy and rock solid as a peachy huge rock (like Mount Fitz Roy1, for instance), and only 13 days after the initial release, we get another messy release: satyr-0.2, codenamed "I love when two branches come together", is out.

This time we got the pharaonic refactoring I mentioned in the last release, which means that skins are quite independent from the rest of the app, which is good for skin developers and core developers, even if those sets are equal and only contain me.

From the user's point of view, the complex skin is nicer to look at (column widths and headers, OMG!) and it also allows tag editing. Yes, because we have tag editing! Right now the only way to start editing is to start typing, which erases the previous data, but don't worry, I plan to nail that soon. At least it's useful for filling in new tags. I also fixed the bug which prevented this skin from highlighting what is being played. Last but not least, the complex skin has a seek bar, and the code got tons of cleanups.

So, that's it. It's out, go grab it!


  1. Right now I would consider satyr just a small pebble on a highway, only noticeable if some huge truck picks it up with its wheels and throws it at your windshield. But I plan to at least reach the size of a sizable rock, such as the one found near one of the Vikings on Mars. 

Oh yeah

Experience freedom!

Yes, a little bit late, but my blog was brokenito. Unluckily I missed the release party at FOSDEM and it seems there were none in France. I hope the guys back in my country will have fun tonight. Everybody else, just enjoy this incredible release!

Edit/Update: so brokenito that it took me a while to gather the will to fix it :| In the end it was a small thing which I'll write about later.

Our man in Toulon

A couple of days ago Marcelo Fernández wrote a simple image viewer in PyGTK. It's less than 200 lines long1, and I thought that it would be nice to compare how the same app would be written in PyKDE4. But then I thought that it would not be fair, as KDE is a whole desktop environment and GTK is 'only' a widget library, so I did it in PyQt4 instead.

To make this even fairer, I didn't take a good look at the code itself; I only ran it to see what it looks like: a window with only the image shown in it, both scrollbars, no menu or status bar, and no external UI file, so I assume he builds the UI 'by hand'. He mentions these features:

  • Pan the image with the mouse.
  • F1 to F5 handle the zoom from 'fit to window', 25%, 50%, 75% and 100%.
  • Zooming with the mouse wheel doesn't work.

Here's my take:

#! /usr/bin/python
# -*- coding: utf-8 -*-

# OurManInToulon - Example image viewer in PyQt4
# Marcos Dione <mdione@grulic.org.ar> - http://grulicueva.homelinux.net/~mdione/glob/

# TODO:
#     * add licence! (GPLv2 or later)

from PyQt4.QtGui import QApplication, QMainWindow, QGraphicsView, QGraphicsScene
from PyQt4.QtGui import QPixmap, QGraphicsPixmapItem, QAction, QKeySequence
import sys

class OMITGraphicsView (QGraphicsView):
    def __init__ (self, pixmap, scene, parent, *args):
        QGraphicsView.__init__ (self, scene)
        self.zoomLevel= 1.0
        self.win= parent
        self.img= pixmap
        self.setupActions ()

    def setupActions (self):
        # a factory to the right!
        zoomfit= QAction (self)
        zoomfit.setShortcuts ([QKeySequence.fromString ('F1')])
        zoomfit.triggered.connect (self.zoomfit)
        self.addAction (zoomfit)

        zoom25= QAction (self)
        zoom25.setShortcuts ([QKeySequence.fromString ('F2')])
        zoom25.triggered.connect (self.zoom25)
        self.addAction (zoom25)

        zoom50= QAction (self)
        zoom50.setShortcuts ([QKeySequence.fromString ('F3')])
        zoom50.triggered.connect (self.zoom50)
        self.addAction (zoom50)

        zoom75= QAction (self)
        zoom75.setShortcuts ([QKeySequence.fromString ('F4')])
        zoom75.triggered.connect (self.zoom75)
        self.addAction (zoom75)

        zoom100= QAction (self)
        zoom100.setShortcuts ([QKeySequence.fromString ('F5')])
        zoom100.triggered.connect (self.zoom100)
        self.addAction (zoom100)

    def zoomfit (self, *ignore):
        winSize= self.size ()
        imgSize= self.img.size ()
        print winSize, imgSize
        hZoom= 1.0*winSize.width  ()/imgSize.width  ()
        vZoom= 1.0*winSize.height ()/imgSize.height ()
        zoomLevel= min (hZoom, vZoom)
        print zoomLevel
        self.zoomTo (zoomLevel)

    def zoom25 (self, *ignore):
        self.zoomTo (0.25)

    def zoom50 (self, *ignore):
        self.zoomTo (0.5)

    def zoom75 (self, *ignore):
        self.zoomTo (0.75)

    def zoom100 (self, *ignore):
        self.zoomTo (1.0)

    def zoomTo (self, zoomLevel):
        scale= zoomLevel/self.zoomLevel
        print "scaling", scale
        self.scale (scale, scale)
        self.zoomLevel= zoomLevel

if __name__=='__main__':
    # this code is enough for loading an image and show it!
    app= QApplication (sys.argv)
    win= QMainWindow ()

    pixmap= QPixmap (sys.argv[1])
    qgpi= QGraphicsPixmapItem (pixmap)
    scene= QGraphicsScene ()
    scene.addItem (qgpi)

    view= OMITGraphicsView (pixmap, scene, win)
    view.setDragMode (QGraphicsView.ScrollHandDrag)
    view.show()

    app.exec_ ()
    # up to here!

# end

Things to note:

  • The code for loading, showing the image and pan support is only 13 lines of Python code, including 3 imports. The resulting app is also able to handle vector graphics, but of course I didn't exploit that, I just added a QPixmap/QGraphicsPixmapItem pair.
  • Zooming is implemented via QGraphicsView.scale(), which is cumulative (scaling twice to 0.5 actually scales to 0.25 of the original size), so I have to keep track of the zoom level all the time. There should be a zoom() interface!
  • The code for calculating the scale level is not very good: scaling between 75% and 50% or 25% produces scales of 0.666 and 0.333, which I think at the end of the day will accumulate a lot of error.
  • For the same reason, zoomToFit() has to do some magic. I also got hit by Python's integer division (I was getting zoom factors of 0), so I had to add 1.0* to the calculations. It's good that this is fixed in Python 2.6/3.0.
  • The size reported by the QMainWindow was complete nonsense (it said 640x480 when it actually was 960x600), so I used the QGraphicsView's instead. WTF?
  • For some strange reason zoomToFit() scales the image a little bigger than it should, so a scrollbar appears in the direction of the constraining dimension.
  • Less than 100 lines! Even though setupActions() could surely be improved.
  • In Marcelo's favor I should mention that he writes docstrings for most of his methods, both in English and Spanish (yes, of course I read his code after I finished mine). I barely put in a couple of comments, but doing the same would add 10 more lines, tops. Also, I don't want to turn this into a who-has-it-smaller contest (the code, I mean :).
  • It took me approx 3 hours, with no previous knowledge of how to do it and no internet connection, so no asking around. I just used the «Qt Reference Documentation», going to the «Grouped Classes» page and to the «Graphics View Classes» from there.
  • It doesn't zoom with the mouse wheel either.
  • The default colors of ikiwiki's format plugin are sucky at best, but better than nothing.

  1. Unluckily he didn't declare which license it has, so I'm not sure if I really can do this. I GPL'ed mine. 

IPSec, tcpdump and firewalls

The company's network is a mess. On one side we have the office network, with the desktops on a private network and the servers on a public one (part of the public network of the building where the office is). We have two CoLos, one in Rotterdam (Verizon; don't ask me for comments) and another in Amsterdam (xs4all; it seems they host half of .nl). In the first one we have three public ranges and at least two private ones, and in the other we have only one public range and some two or three private ones, including a cross-over between the two firewalls for fail-over. In short, 5 public networks and who-knows-how-many private ones scattered around.

To complicate things a little more, we set up (well, I had nothing to do with it, but I was already working here) a VPN between the two CoLos, with both endpoints on the firewalls (not counting that on one end we have two of them; one sleeps until the other goes down).

The last ingredient in this soup: we graph the traffic with MRTG (yes, I know about Munin, I already suggested it, but there are some 80 servers in total), getting the data via SNMP (which, as everybody says, is simple in name only). But for this case it is simple: a client runs on the graphing server, in this case in Rotterdam, and on the node to be graphed there is a small SNMP server; the traffic goes over UDP.

'OK, so where's the complication?', you may ask. Let me tell you.

These people (note how I detach myself when they screw something up :-P) have firewalls implemented by hand in bash. It's not the typical string of iptables rules, but something a bit more elaborate, with little functions, but it's still 1400 lines of bash plus another 630 of configuration (in bash too, of course; there's no easier way to configure bash scripts than bash itself). I already told them about shorewall too, but all in due time.

So I got the task of getting SNMP working against the Amsterdam firewalls. It sounds trivial, especially since MRTG has scripts that create the configuration automatically by reading what's available through the protocol itself (the designers had the decency to give it 'discoverability'). Simple as it seems, it wouldn't work at all. We went around it a zillion times and nothing.

The weirdest part came when we started using tcpdump on the external interfaces of the firewalls. Filtering on port 161 (snmp), on the Amsterdam firewall we saw the incoming traffic but not the outgoing. That led us to think the firewall was messing things up, but then we saw that the traffic generating the graph for the other firewall, the sleeping one, did the same: we saw the request but not the reply. Yet the graph worked. And to complicate things, with IPSec running we shouldn't be seeing the traffic at all, since tcpdump can't decrypt those packets, only barely recognize them.

So we got suspicious.

Trying tcpdump on the firewall of the Rotterdam network, we saw symmetric behaviour: we didn't see the request but we did see the reply. And we got even more suspicious.

Looking harder, I found that the firewall scripts had a couple of bugs: the function that set up the exceptions for the firewall itself was never called, and it also called iptables with wrong parameters.

The conclusion we reached, based only on what we could see, is that the tunnel works and that IPSec does strange things with the packets; we believe that when it finishes processing a packet it reinjects it into the interface it came in on, and that's what makes us see the traffic in one direction only. Some day when I'm really bored I'll check whether this theory is true.
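
If I ever do get that bored, the check itself is easy enough; a sketch of what I would compare on the Amsterdam firewall's external interface (the interface name is made up):

# the port filter should only match the decrypted, reinjected inbound leg
tcpdump -ni eth1 port 161

# ESP (IP protocol 50) should show the tunnel traffic in both directions
tcpdump -ni eth1 ip proto 50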