glob no es un blog. No en el sentido corriente de la palabra. Es un registro de mis proyectos y otras interacciones con el software libre.

glob is not a blog. Not in the common meaning of the word. It's a record of my projects and other interactions with libre software.

All texts CC BY Marcos Dione

This is the second time I spent hours looking for this, so this time I'm writing it down.

My 10 year old Dell Inspiron 1420N, which is now my home server where I keep several useful online tools, has two problems: The keyboard and the LCD do not work. Well, the LCD works erratically, most of the times a couple of seconds after boot. The first problem can be fixed by attaching a USB keyboard, and the second by attaching an external screen.

Except that the machine does not enable the VGA output by default; but no problem, you just press Fn+F8 and voilà, external screen works. Except that external keyboards do not have the Fn key; but no problem, you can emulate it with Scroll Lock by just telling the BIOS to do so.

But you can't do it if you can't see anything on the screen. To do it blindly, you have to either know you BOIS by heart or find any reference online. I don't know that BIOS by heart, mainly because it's been a loooong while since I had to use it for anything, but also because I barely touch that machine anymore. And online references, well, there are none for models so old.

One of the possible solutions it occured to me that could help was to try to run a BIOS image, which you can still download from Dell's site (!), under qemu, but this tool cannot run arbitrary BIOSes. A pity, but understandable.

So without further ado, a schematic of the BIOS contents and how to fix this blindly:

- System
| System Info         <-- the cursor starts here
| Processor Info
| Memory Info
| Device Info
| Battery Info
| Battery Health
| Date/Time
| Boot Sequence
+ Onboard Devices
+ Video
+ Security
+ Performance
+ Power Management
+ Maintenance
- POST Behaviour      <-- 14 * <Down> + <Enter> and the following menu opens
| Adapter Warnings
| Fn Key Emulation    <-- 2 * <Down> + <Enter> and the setup screen opens
| Fast Boot
| Virtualization
| Keypad (embedded)
| Numlock LED
| USB Emulation
+ Wireless

The setup screen is quite simple, it has two options, Off and Scroll Lock, and you move with <Left> and <Right>. I'm not sure if it's needed, but pressing <Enter> to choose your option does not hurt. Then you press <Esc>, which gives you the Exit screen. This screen has three options: Remain in Setup (which is selected), Save/Exit and Discard/Exit. Guess which one you want :^) Just press <Right>, <Enter> and you're done! The machine reboots and now you can use <Scroll Lock>+<F8> in your external keyboard to activate the external screen.


Posted dom 15 ene 2017 21:20:34 CET Tags:

Last night I realized the first point. Checking today I found the latter. Early, often, go!

  • ayrton-0.9 has debug on. It will leave lots of files laying around your file system.
  • Modify the release script to do not allow this never ever more.
  • make install was not running the tests.

Get it on github or pypi!

python ayrton

Posted mi� 07 dic 2016 14:10:40 CET Tags:

Another release, but this time not (only) a bugfix one. After playing with bool semantics I converted the file tests from a _X format, which, let's face it, was not pretty, into the more usual -X format. This alone merits a change in the minor version number. Also, _in, _out and _err also accept a tuple (path, flags), so you can specify things like os.O_APPEND.

In other news, I had to drop support for Pyhton-3.3, because otherwise I would have to complexify the import system a lot.

But in the end, yes, this also is a bugfix release. Lost of fd leaks where plugged, so I suggest you to upgrade if you can. Just remember the s/_X/-X/ change. I found all the leaks thanks to unitest's warnings, even if sometimes they were a little misleading:

testRemoteCommandStdout (tests.test_remote.RealRemoteTests) ... ayrton/parser/pyparser/ <span class="createlink">ResourceWarning</span>: unclosed <socket.socket fd=5, family=AddressFamily.AF_UNIX, type=SocketKind.SOCK_STREAM, proto=0, raddr=/tmp/ssh-XZxnYoIQxZX9/agent.7248>
  self.stack[-1] = (dfa, next_state, node)

The file and line cited in the warning have nothing to do with the warning itself (it was not the one who raised it) or the leaked fd, so it took me a while to find were those leaks were coming from. I hope I have some time to find why this is so. The most frustrating thing was that unitest closes the leaking fd, which is nice, but in one of the test cases it was closing it seemingly before the test finished, and the test failed because the socket was closed:

ERROR: testLocalVarToRemoteToLocal (tests.test_remote.RealRemoteTests)
Traceback (most recent call last):
File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/", line 225, in wrapper
    test (self)
File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/", line 235, in testLocalVarToRemoteToLocal
    self.runner.run_file ('ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay')
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 304, in run_file
    return self.run_script (script, file_name, argv, params)
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 323, in run_script
    return self.run_tree (tree, file_name, argv, params)
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 336, in run_tree
    return self.run_code (code, file_name, argv)
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 421, in run_code
    raise error
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 402, in run_code
    exec (code, self.globals, self.locals)
File "ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay", line 6, in <module>
    with remote ('', _test=True):
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 362, in __enter__
    i, o, e= self.prepare_connections (backchannel_port, command)
File "/home/mdione/src/projects/ayrton_clean/ayrton/", line 270, in prepare_connections
    self.client.connect (self.hostname, *self.args, **self.kwargs)
File "/usr/lib/python3/dist-packages/paramiko/", line 338, in connect
File "/usr/lib/python3/dist-packages/paramiko/", line 493, in start_client
    raise e
File "/usr/lib/python3/dist-packages/paramiko/", line 1757, in run
    self.kex_engine.parse_next(ptype, m)
File "/usr/lib/python3/dist-packages/paramiko/", line 75, in parse_next
    return self._parse_kexdh_reply(m)
File "/usr/lib/python3/dist-packages/paramiko/", line 112, in _parse_kexdh_reply
File "/usr/lib/python3/dist-packages/paramiko/", line 2079, in _activate_outbound
File "/usr/lib/python3/dist-packages/paramiko/", line 1566, in _send_message
File "/usr/lib/python3/dist-packages/paramiko/", line 364, in send_message
File "/usr/lib/python3/dist-packages/paramiko/", line 314, in write_all
    raise EOFError()

This probably has something to do with the fact that the test (a functional test, really) is using threads and real sockets. Again, I'll try to investigate this.

All in all, the release is an interesting one. I'll keep adding small features and releasing, let's see how it goes. Meanwhile, here's the changelog:

  • The 'No Government' release.
  • Test functions are no longer called _X but -X, which is more scripting friendly.
  • Some if those tests had to be fixed.
  • Dropped support for py3.3 because the importer does not work there.
  • tox support, but not yet part of the stable test suite.
  • Lots and lots of more tests.
  • Lots of improvements in the remote() tests; in particular, make sure they don't hang waiting for someone who's not gonna come.
  • Ignore ssh remote() tests if there's not password/phrase-less connection.
  • Fixed several fd leaks.
  • _in, _out and _err also accept a tuple (path, flags), so you can specify things like os.O_APPEND. Mostly used internally.

Get it on github or pypi!

python ayrton

Posted mar 06 dic 2016 19:46:11 CET Tags:

I'll keep this short. During the weekend I found a bug in ayrton. I fixed it in develop, and decided to make a release with it, because it was kind of a showstopper. It was the first time I decided to use ayrton for a oneliner. It was this one:

ayrton -c "rm(v=True, locate('.xvpics', _out=Capture))"

See, ayrton's native support for filenames with spaces makes it a perfect replacement for find and xargs and tools like that. That command simply finds all the files or directories called like .xvpics using locate and removes them. There is a little bit of magic where locate's output becomes rm's arguments, but probably not magic enough: _out=Capture has to be specified. We'll probably fix that in the near future.

So, enjoy the new release. It just fixes a couple of bugs, one of them directly related to this oneliner. Here's the changelog:

  • The 'Release From The Bus' release.
  • Bugfix release.
  • Argv should not be created with an empty list.
  • Missing dependencies.
  • Several typos.
  • Fix for _h().
  • Handle paramiko exceptions.
  • Calling ayrton -c <script> was failing because the file name properly was not properly (f|b)aked.
  • ayrton --version didn't work!

Get it on github or pypi!

Meanwhile, a little about its future. I have been working on ayrton on and off. Right now I'm gathering energy to modify pypy's Python parser so it supports py3.6's formatted string literals. With this I can later update ayrton's parser, which is based on pypy's. A part of it has been done, but then I run out of gas. I think FSLs are perfect for ayrton in its aim to replace shell script languages. In other news, there's a nasty remote() bug that I can't pin down. These two things might mean that there won't be a significant release for a while.

python ayrton

Posted lun 21 nov 2016 22:16:53 CET Tags:

I was trying to modify ayrton so we could really have sh[1]-style file tests. In sh, they're defined as unary operators in the -X form[2], where X is a letter. For instance, -f foo returns true (0 in sh-peak) if foo is some kind of file. In ayrton I defined them as functions you could use, but the names sucked a little. -f was called _f() and so on. Part of the reason is, I think, that both python-sh and ayrton already do some -/_ manipulations in executable names, and part because I thought that -True didn't make any sense.

A couple of days ago I came with the idea that I could symply call the function f() and (ab)use the fact that - is a unary operator. The only detail was to make sure that - didn't change the truthiness of bools. In fact, it doesn't, but this surprised me a little, although it shouldn't have:

In [1]: -True
Out[1]: -1

In [2]: -False
Out[2]: 0

In [3]: if -True: print ('yes!')

In [4]: if -False: print ('yes!')

You see, the bool type was introduced in Python-2.3 all the way back in 2003. Before that, the concept of true was represented by any 'true' object, and most of the time as the integer 1; false was mostly 0. In Python-2.2.1, True and False were added to the builtins, but only as other names for 1 and 0. According the that page and the PEP, bool is a subtype of int so you could still do arithmetic operations like True+1 (!!!), but I'm pretty sure deep down below the just wanted to be retro compatible.

I have to be honest, I don't like that, or the fact that applying - to bools convert them to ints, so I decided to subclass bool and implement __neg__() in such a way that it returns the original value. And that's when I got the real surprise:

In [5]: class FalseBool (bool):
   ...:     pass
TypeError: type 'bool' is not an acceptable base type

Probably you didn't know (I didn't), but Python has such a thing as a 'final class' flag. It can only be used while defining classes in a C extension. It's a strange flag, because most of the classes have to declare it just to be subclassable; it's not even part of the default flags. Even more surprising, is that there are a lot of classes that are not subclassable: around 124 in Python-3.6, and only 84 that are subclassable.

So there you go. You learn something new every day. If you're curious, here's the final implementation of FalseBool:

class FalseBool:
    def __init__ (self, value):
        if not isinstance (value, bool):
            raise ValueError

        self.value= value

    def __bool__ (self):
        return self.value

    def __neg__ (self):
        return self.value

This will go in ayrton's next release, which I hope will be soon. I'm also working in implementing all of the different styles of expansion found in bash. I even seem to have found some bugs in it.

python ayrton

[1] I'm talking about the shell, not to confuse with python-sh.

[2] Well, there are a couple of infix binary operands in the form -XY.

Posted vie 21 oct 2016 18:17:46 CEST Tags:

Today I had to setup 3 Firefox profiles, because I started a new job, and I realized I never documented which extensions I use or why, so I had to work a little from memory. Hence, this post, which I plan to keep up-to-date as much as possible.

A little bit of rationale first. I'm very privacy-conscious, but at the same time very pragmatic. I use several profiles to add an extra level of data isolation. That also allows me to have different sets of extensions, because some are some intrusive that they break some non-important sites' functionality.

Finally, the list, in no particular order:

  • FlashGot, by Giorgio Maone: Better downloads handling.

  • Go-Mobile, by 'Geek in Training': A lot of sites are actually more useful (read, with less crap on them) in their Mobile versions. This plugins lets you switch from one to the other.

  • HTTPS everywhere, by EFF: Don't navigate in the clear anymore.

  • No Script, also by Giorgio Maone: A broad spectrum antibiotic. Not loading JS makes pages less CPU intensive, plus sites cannot track you if you don't make requests, plus also blocks videos.

  • Privacy Badger, also by EFF: In their own words, “protects privacy by blocking spying ads and invisible trackers”.

  • Tab Auto Reload, by 'Schuzak': I use this to reload sites that constantly log you out, but only under certain circumstances.

  • Tab mix plus, by 'onemen': Once upon a time ffox didn't have session management/recovery. Now it does, but not very good; I still think TMP's ones are better. Also, duplicate tab.

  • Toggle animated GIFs, by Simon Lindholm: Stop annoying animations. Just make sure to tick 'Pause GIFs by default'.

  • uBlock Origin, by Raymond Hill: an (ad) blocker, goodbye-adiós 15s ad videos in youtube.[1]

So that's it. Unluckily there's nothing against browser fingerprinting yet (and my browser ranks as quite unique), and I don't know how much can be/has been implemented by [Mozilla]. If you have other suggestions about plugins, please do in the comments below. As I said, I'll try to keep this post up to date.


[1] I used to use ABP, but it seems it became a protection scam.

Posted vie 14 oct 2016 19:30:38 CEST Tags:

I just uploaded my first semi-automated change. This change was generated with my hack for generating centerlines for riverbank polygons. This time I expanded it to include a JOSM plugin which will take all the closed polygons from the selection and run the algorithm on them, creating new lines. It still needs some polishing, like making sure they're riverbanks and copying useful tags to the new line, and probably running a simplifying algo at some point. Also, even simple looking polygons might generate complex lines (in plural, and some of these lines could be spurious), so some manual editing might be required afterwards, specially connecting the new line to existing centerlines. Still, I think it's useful.

Like I mentioned last time, its setup is quite complex: The JOSM plugin calls a Python script that needs the Python module installed. That module, for lack of proper bindings for SFCGAL, depends on PostgreSQL+PostGIS (but we all have one, right? :-[ ), and connection strings are currently hardcoded. All that to say: it's quite hacky, not even alpha quality from the installation point of view.

Lastly, as imagico mentioned in the first post about this hack, the algorithms are not fast, and I already made my computer start thrashing the disk swapping like Hell because pg hit a big polygon and started using lots of RAM to calculate its centerline. At least this time I can see how complex the polygons are before handing them to the code. As an initial benchmark, the original data for that changeset (I later simplified it with JOSM's tool) took 0.063927s in pg+gis and 0.004737s in the Python code. More test will come later.

Okey, one last thing: Java is hard for a Pythonista. At some point it took me 2h40 to write 60 lines of code, ~2m40 per line!

openstreetmap gis python

Posted lun 29 ago 2016 20:06:39 CEST Tags:

A month ago I revived my old-laptop-as-server I have at home. I don't do much in it, just serve my photos, a map, provide a ssh trampoline for me and some friends and not much more. This time I decided to tackle one of the most annoying problems I had with it: That closing the lid led to the system to suspend.

Now, the setup in that computer has evolved through some years, so a lot of cruft was left on it. For instance, at some point I solved the problem by installing a desktop and telling it not to suspend the machine, mostly because that's how I configure my current laptop. That, of course, was a cannon-for-killing-flies solution, but it worked, so I could focus in other things. Also, a lot of power-related packages were installed, assuming the were really needed for supporting everything I might ever wanted to do about power. This is the story on how I removed them all, why, and how I solved the lid problem... twice.

First thing to go were the desktop packages, mostly because the screen in that laptop has been dead for more than a year now, and because its new space in the house is a small shelf in my wooden desktop. Then I reviewed the power-related packages one by one and decided whether I needed it or not. This is more or less what I found:

  • acpi-fakekey: This package has a tool for injecting fake ACPI keystrokes in the input system. Not really needed.
  • acpi-support: It has a lot of scripts that can be run when some ACPI events occur. For instance, lid closing, battery/AC status, but also things like responding to power and even 'multimedia' keys. Nice, but not needed in my case; the lid is going to be closed all the time anyways.
  • laptop-mode-tools: Tools for saving power in your laptop. Not needed either, the server is going to be running all the time on AC (its battery also died some time ago).
  • upower: D-Bus interface for power events. No desktop or anything else to listen to them. Gone.
  • pm-utils: Nice CLI scripts for suspending/hibernating your system. I always have them around in my laptop because sometimes the desktops don't work properly. No use in my server, but it's cruft left from when I used it as my laptop. Adieu.

Even then, closing the lid led to the system suspending. Who else could be there? Well, there is one project who's being everywhere: systemd. I'm not saying this is bad, but it is everywhere. Thing is, its login subsystem also handles ACPI events. In the /etc/systemd/logind.conf file you can read the following lines:


so I uncommented the 4th line and changed it so:


Here you can also configure how the inhibition of actions work:


Please check the config file's doc if you plan to modify it.

Not entirely unrelated, my main laptop also started suspending when I closed the lid. I have it configured, through the desktop environment, to only turn off the screen, because what use is the screen if it's facing the keyboard and touchpad :) Somehow, these settings only recently started to be in effect, but a quick search didn't gave any results on when things changed. Remembering what I did with the server, I just changed that config file to:


That is, “let me configure this through the desktop, please”, and now I have my old behavior back :)

PS: I should start reading more about systemd. A good starting point seems to be all the links in its home page.

sysadmin systemd acpi

Posted dom 28 ago 2016 17:34:56 CEST Tags:

Dear conference speakers:

Please test write your slides at 1024x768 resolution, on a projector.

Dear conference organizers:

Please remind your speakers to do so.

Dear conference attendants:

If you are at the back of the room and you can't see the text/code in the slides the speaker(s) is showing, please shout “I cant see shit!” in the appropriate language, and try to embarrass the speaker as much as possible.

Thanks in advance. Yours truly,

-- Marcos.

PS: I've been watching videos of talks in some conferences and I swear to $DEITY at in least 40% of the ones I was interested in, I couldn't read the code on the video. Sometimes the fonts are too small, sometimes the colors are not contrasting enough. Please, at least test your slides on a projector...

PSS: I know the resolution I'm suggesting is low. Be happy I'm not asking for 640x480 :-P

PSSS: Ok, attendants, don't embarrass/harass the speakers :)


Posted jue 25 ago 2016 14:19:38 CEST Tags:

Like I said in my last post[1], I'm looking at last PyCon's videos. Here are my selected videos, in the order I saw them:

Ned Batchelder - Machete-mode debugging: Hacking your way out of a tight spot. In fact, I saw this twice.

Larry Hastings - Removing Python's GIL: The Gilectomy

The Report Of Twisted’s Death or: Why Twisted and Tornado Are Relevant In The Asyncio Age

How I built a power debugger out of the standard library and things I found on the internet

Davey Shafik - HTTP/2 and Asynchronous APIs

Sumana Harihareswara - HTTP Can Do That?! Points for informative and funny.

Matthias Kramm - Python Typology Types are comming, so get used to them.

Scott Sanderson, Joe Jevnik - Playing with Python Bytecode Nice, very nice trick. I'm talking about the way the presentation is given.

Brian Warner - Magic Wormhole - Simple Secure File Transfer

And of course, the lighning talks. I always like these, because you can get exposed to any kind of things, some not even remotely connected to Python, but which can get your brain rolling down nice little bunny holes, or at least get a smile from you. So here:

LT#1. Please watch it at least between 20-25m.




And of course, check the other ones, don't stop at my own interests.

[1] Yes, I started writing this a month ago.


Posted vie 19 ago 2016 17:35:45 CEST Tags:

Long time for this release. A couple of hard bugs (which fix was just moving a line down a little), a big-ish new feature, and moving in a new city. Here's the ChangeLog:

  • You can import ayrton modules and packages!
  • Depends on Python3.5 now.
  • argv is not quite a list: for some operations (len(), iter(), pop()), argv[0] is left alone.
  • option() raises KeyError or ValueError if the option or its 'argument' is wrong.
  • makedirs() and stat() are available as functions.
  • -p|--pdb launches pdb when there is an unhandled exception.
  • Fix for line in foo(...): ... by automatically adding the _bg=True option.
  • Better Command() detection.
  • A lot of internal fixes.

Get it on github or pypi!

python ayrton

Posted mi� 17 ago 2016 13:17:22 CEST Tags:

My latest Europe import was quite eventful. First, I run out of space several times during the import itself, at indexing time. The good thing is that, if you manage to reclaim some space, and reading a little of source code[1], you can replay the missing queries by hand and stop cursing. To be fair, osm2pgsql currently uses a lot of space in slim+flat-nodes mode: three tables, planet_osm_node, planet_osm_way and planet_osm_relation; and one file, the flat nodes one. Those are not deleted until the whole process has finished, but they're actually not needed after the processing phase. I started working on fixing that.

But that was not the most difficult part. The most difficult part was that I forgot, somehow, to add a column to the file. Elevation, my own style, renders different icons for different types of castles (and forts too), just like the Historic Place map of the Hiking and Bridle map[2]. So today I sat down and tried to figure out how to reparse the OSM extract I used for the import to add this info.

The first step is to add the column to the tables. But first, which tables should be impacted? Well, the line I should have added to the import style is this:

node,way   castle_type  text         polygon

That says that this applies to nodes and ways. If the element is a way, polygon will try to convert it to a polygon and put it in the planet_osm_polygon table; if it's a node, it ends in the planet_osm_point table. So we just add the column to those tables:

ALTER TABLE planet_osm_point   ADD COLUMN castle_type text;
ALTER TABLE planet_osm_polygon ADD COLUMN castle_type text;

Now how to process the extract? Enter pyosmium. It's a Python binding for the osmium library with a stream-like type of processing à la expat for processing XML. The interface is quite simple: one subclasses osmium.SimpleHandler, defines the element type handlers (node(), way() and/or relation()) and that's it! Here's the full code of the simple Python script I did:

#! /usr/bin/python3

import osmium
import psycopg2

conn= psycopg2.connect ('dbname=gis')
cur= conn.cursor ()

class CastleTypes (osmium.SimpleHandler):

    def process (self, thing, table):
        if 'castle_type' in thing.tags:
                name= thing.tags['name']
            # osmium/boost do not raise a KeyError here!
            # SystemError: <Boost.Python.function object at 0x1329cd0> returned a result with an error set
            except (KeyError, SystemError):
                name= ''
            print (table,, name)

            cur.execute ('''UPDATE '''+table+
                         ''' SET castle_type = %s
                            WHERE osm_id = %s''',

    def node (self, n):
        self.process (n, 'planet_osm_point')

    def way (self, w):
        self.process (w, 'planet_osm_polygon')

    relation= way  # handle them the same way (*honk*)

ct= CastleTypes ()
ct.apply_file ('europe-latest.osm.pbf')

The only strange part of the API is that it doesn't seem to raise a KeyError when the tag does not exist, but a SystemError. I'll try to figure this out later. Also interesting is the big amount of unnamed elements with this tag that exist in the DB.

[1] I would love for GitHub to recognize something like and be directed to that method, because #Lxxx gets old pretty quick.

[2] I just noticed how much more complete those maps are. more ideas to use :)

openstreetmap gis python

Posted mar 09 ago 2016 17:45:56 CEST Tags:

Remember this?

For a few months now I've been trying to have a random slideshow of images. I used to do this either with kscreensaver, which for completely different reasons I can't use now, or xscreensavers' glslideshow, which, even when I compiled it by hand, I can't find the way to give it the root dir of the images. So, based on OMIT, I developed my own.

The differences with OMIT are minimal. It has to scan the whole tree for finding the appropriate files (its definition of "appropriate" could be improved, it's true); it goes into full screen mode with black background; and it (more) properly handles EXIF rotation[1]. All that in 176 LOCs, including proper licensing (GPLv3), and developed in one day and refined the next one.

One interesting thing I found out is that pyexiv2 is deprecated, with is last release in 2011 (!!!). What this new app uses is its recommended replacement, gexiv2.

So, there you are. Like OMIT, it's in PyQt4, but this time in Python3 (that's why I used gexiv2 instead). Its TODO includes porting it to PyQt5 and a few other things. You can grab it here. I plan to do a proper release soon, but for the moment just drop it in your PATH and be happy with it.


omit omim pykde python

Posted vie 05 ago 2016 14:51:35 CEST Tags:

In this last two days I've been expanding osm-centerlines. Now it not only supports ways more complex than a simple rectangle, but also ones that lead to 'branches' (unfortunately, most probably because the mapper either imported bad data or mapped it himself). Still, I tested it in very complex polygons and the result is not pretty. There is still lots of room for improvements.

Unluckily, it's not as stand alone as it could be. The problem is that, so far, the algos force you to provide now only the polygon you want to process, but also its skeleton and medial. The code extends the medial using info extracted from the skeleton in such a way that the resulting medial ends on a segment of the polygon, hopefully the one(s) that cross from one riverbank to another at down and upstream. Calculating the skeleton could be performed by CGAL, but the current Python binding doesn't include that function yet. As for the medial, SFCGAL (a C++ wrapper for CGAL) exports a function that calculates an approximative medial, but there seem to be no Python bindings for them yet.

So, a partial solution would be to use PostGIS-2.2's ST_StraightSkeleton() and ST_ApproximateMedialAxis(), so I added a function called skeleton_medial_from_postgis(). The parameters are a psycopg2 connection to a PostgreSQL+PostGIS database and the way you want to calculate, as a shapely.geometry, and it returns the skeleton and the medial ready to be fed into extend_medials(). The result of that should be ready for mapping.

So there's that. I'll be trying to improve it in the next days, and start looking into converting it into a JOSM plugin.

openstreetmap gis python

Posted vie 29 jul 2016 00:39:57 CEST Tags:

For a long time now I've been thinking on a problem: OSM data sometimes contains riverbanks that have no centerline. This means that someone mapped (part of) the coasts of a river (or stream!), but didn't care about adding a line that would mark its centerline.

But this should be computationally solvable, right? Well, it's not that easy. See, for given any riverbank polygon in OSM's database, you have 4 types of segments: those representing the right and left riverbanks (two types) and the flow-in and flow-out segments, which link the banks upstream and downstream. With a little bit of luck there will be only one flow-in and one flow-out segment, but there are no guarantees here.

One method could try and identify these segments, then draw a line starting in the middle of the flow-in segment, calculating the middle by traversing both banks at the same time, and finally connect to the middle for the flow-out segment. Identifying the segments by itself is hard, but it is also possible that the result is not optimal, leading to a jagged line. I didn't try anything on those lines, but I could try some examples by hand...

Enter topology, the section of maths that deals with this kind of problems. The skeleton of a polygon is a group of lines that are equidistant to the borders of the polygon. One of the properties this set of lines provides is direction, which can be exploited to find the banks and try to apply the previous algorithm. But a skeleton has a lot of 'branches' that might confuse the algo. Going a little further, there's the medial axis, which in most cases can be considered a simplified skeleton, without most of the skeleton branches.

Enter free software :) CGAL is a library that can compute a lot of topological properties. PostGIS is clever enough to leverage those algorithms and present, among others, the functions ST_StraightSkeleton() and ST_ApproximateMedialAxis(). With these two and the original polygon I plan to derive the centerline. But first an image that will help explaining it:

The green 'rectangle' is the original riverbank polygon. The thin black line is the skeleton for it; the medium red line is the medial. Notice how the medial and the center of the skeleton coincide. Then we have the 4 branches forming a V shape with its vertex at each end of the medial and its other two ends coincide with the ends of the flow in and flow out segments!

So the algorithm is simple: start with the medial; from its ends, find the branches in the skeleton that form that V; using the other two ends of those Vs, calculate the point right between them, and extend the medial to those points. This only calculates a centerline. The next step would be to give it a direction. For that I will need to see if there are any nearby lines that could be part of the river (that's what the centerline is for, to possibly extend existing rivers/centerlines), and use its direction to give it to the new centerline.

For the moment the algorithm only solves this simple case. A slightly more complex case is not that trivial, as skeletons and medials are returned as a MultiLineString with a line for each segment, so I will have to rebuild them into LineStrings before processing.

I put all the code online, of course :) Besides a preloaded PostgreSQL+PostGIS database with OSM data, you'll need python3-sqlalchemy, geoalchemy, python3-fiona and python3-shapely. The first two allows me to fetch the data from the db. Ah! by the way, you will need a couple of views:

CREATE VIEW planet_osm_riverbank_skel   AS SELECT osm_id, way, ST_StraightSkeleton (way)      AS skel   FROM planet_osm_polygon WHERE waterway = 'riverbank';
CREATE VIEW planet_osm_riverbank_medial AS SELECT osm_id, way, ST_ApproximateMedialAxis (way) AS medial FROM planet_osm_polygon WHERE waterway = 'riverbank';

Shapely allows me to manipulate the polygonal data, and fiona is used to save the results to a shapefile. This is the first time I ever use all of them (except SQLAlchemy), and it's nice that it's so easy to do all this in Python.

openstreetmap gis python

Posted mar 26 jul 2016 18:55:14 CEST Tags:

A few weeks ago an interesting PR for osm-carto landed in the project's GitHub page. It adds rendering for several natural relief features, adding ridges, valleys, aretes, dales, coulouirs and others to cliffs, peaks and mountain passes, which were already being rendered. I decided to try it in Elevation (offline for the moment).

I sync'ed the style first with the latest release, applied the patch and... not much. My current database is quite old (re-importing takes ages and I don't have space for updates), so I don't have much features like that in the region I'm interested in. In fact, I went checking and the closest mountain range around here was not in the database, so I added it.

By the way, the range is mostly concurrent with a part of an administrative boundary, but SomeoneElse and SK53 suggested to make a new line. Even when other features are nearby (there's a path close to the crest and it's also more or less the limit between a forest and a bare rock section), which already makes the region a little bit crowded with lines, it makes sense: boundaries, paths, forest borders and ridges change at different time scales, so having them as separate lines makes an update to any of those independent of the rest.

Now I wanted to export this feature and import it in my rendering database, so I can actually see the new part of the style. This is not an straightforward process, only because when I imported my data I used osm2pgsql --drop, which removes the much needed intermediate tables for when one wants to update with osm2pgsql --append. Here's a roundabout way to go.

First you download the full feature (thanks RichardF!). In this case:

This not only exports the line (which is a sequence of references to nodes) with its tags, but the nodes too (which are the ones storing the coords). The next step is to convert it to something more malleable, for instance, GeoJSON. For that I used ogr2ogr like this:

ogr2ogr -f GeoJSON 430573542.GeoJSON 430573542.xml lines

The last parameter is needed because, quoting Even Rouault (a.k.a. José GDAL): «you will always get "points", "lines", "multilinestrings", "multipolygons" and "other_relations" layers when reading a osm file, even if some are empty», and the GeoJSON driver refuses to create layers for you:

ERROR 1: Layer lines not found, and <span class="createlink">CreateLayer</span> not supported by driver.

But guess what, that not the easiest way :) At least we learned something. In fact postgis already has a tool called shp2pgsql that imports ESRIShapeFiles, and ogr2ogr produces by default this kind of file. It creates a .shp file for each layer as discussed before, but again, we're only interested in the line one. So:

ogr2ogr 430573542 430573542.xml lines
shp2pgsql -a -s 900913 -S 430573542/lines.shp > 430573542.sql

We can't use this SQL file directly, as it has a couple of problems. First, you can't tell shp2pgsql the names of the table where you want to insert the data or the geometry column. Second, it only recognizes some attributes (see below), and the rest it tries to add them as hstore tags. So we have to manually edit the file to go from:

INSERT INTO "lines" ("osm_id","name","highway","waterway","aerialway","barrier","man_made","z_order","other_tags",geom)
    VALUES ('430573542','Montagne Sainte-Victoire',NULL,NULL,NULL,NULL,NULL,'0','"natural"=>"ridge"','010500002031BF0D[...]');


INSERT INTO "planet_osm_line" ("osm_id","name","z_order","natural",way)
    VALUES ('430573542','Montagne Sainte-Victoire','0','ridge','010500002031BF0D[...]');

See? s/lines/planet_osm_line/, s/other_tags/"natural"/ (with double quotes, because natural is a keyword in SQL, as in natural join), s/geom/way/ and s/'"natural"=>"ridge"'/'ridge'/ (in single quotes, so it's a string; double quotes are for columns). And I also removed the superfluous values and the ANALIZE line, as I don't care that much. Easy peasy.

A comment on the options for shp2pgsql. -s 900913 declares the SRID of the database. I got that when I tried without and:

ERROR:  Geometry SRID (0) does not match column SRID (900913)

-S is needed because shp2pgsql by default generated MultiLineStrings, but that table in particular has a LineString way column. This is how I figure it out:

ERROR:  Geometry type (MultiLineString) does not match column type (LineString)

Incredibly, after this data massacre, it loads in the db:

$ psql gis < 430573542.sql


openstreetmap gdal postgis

Posted mar 12 jul 2016 17:37:22 CEST Tags:

Today I stumbled upon PyCon 2016's youtube channel and started watching some of the talks. The first one I really finished watching was Ned Batchelder's "Machete debugging", a very interesting talk about 4 strange bugs and the 4 strange techniques they used to find where those bugs were produced. It's a wonderful talk, full of ideas that, if you're a mere mortal developer like me, will probably blow your mind.

One of the techniques they use for one of the bugs is to actually write a trace function. A trace function in cpython context is a function that is called in several different points of execution of Python code. For more information see sys.settrace()'s documentation.

In my case I used tracing for something that I always liked about bash: that you can ask it to print every line that's being executed (even in functions and subprocesses!). I wanted something similar for ayrton, so I sat down to figure out how this would work.

The key to all this is the function I mention up there. The API seems simple enough at first sight, but it's a little more complicated. You give this function what is called the global trace function. This function will be called with three parameters: a frame, an event and a event-dependent arg. The event I'm interested in is line, which is called for each new line of code that is executed. The complication comes because what this global trace function should return is a local trace function. This function will be called with the same parameters as the global trace function. I would really like an explanation why this is so.

The job for this function, in ayrton's case, is simple: inspect the frame, extract the filename and line number and print that. At first this seems to mean that I should read the files by myself, but luckily there's another interesting standard module: linecache to the rescue. The only 'real complication' of ayrton's use is that it would not work if the script to run was passed with the -c|--script option, but (un)luckily the execution engine already has to read the hold the script in lines, so using that as the cache instead of linecache was easy.

Finally, if you're interested in the actual code, go take a look. Just take in account that ayrton has 3 levels of tracing: à la bash (script lines prepended by +), with line numbers, and tracing any Python line execution, including any modules you might use and their dependencies. And don't forget that it also has 3 levels of debug logging into files. See ayrton --help!

ayrton python

Posted jue 23 jun 2016 20:32:12 CEST Tags:

ayrton has always been able to use any Python module, package or extension as long as it is in a directory in sys.path, but trying to solve a bigger bug, I realized that there was no way to use ayrton modules or packages. Having only laterally heard about the new importlib module and the new mechanism, I sat down and read more about it.

The best source (or at least the easiest to find) is possibly what Python's reference says about the import system, but I have to be honest: it was not an easy read. Next week I'll sit down and see if I can improve it a little. So, for those out there who, like me, might be having some troubles understanding the mechanism, here's how I understand the system works (ignoring deprecated APIs and corner cases or even relative imports; I haven't used or tried those yet):

def import_single(full_path, parent=None, module=None):
    # try this cache first
    if full_path in sys.modules:
        return sys.modules[full_path]

    # if not, try all the finders
    for finder in sys.meta_path:
        if parent is not None:
            spec = finder.find_spec(full_path, parent.__path__, target)
            spec = finder.find_spec(full_path, None, target)

        # if the finder 'finds' ('knows how to handle') the full_path
        # it will return a loader
        if spec is not None:
            loader = spec.loader

            if module is None and hasattr(loader, 'create_module'):
                module = loader.create_module(spec)

            if module is None:
                module = ModuleType(  # let's assume this creates an empty module object
                module.__spec__ = spec

            # add it to the cache before loading so it can referenced from it
            sys.modules[] = module
                # if the module was passed as parameter,
                # this repopulates the module's namespace
                # by executing the module's (possibly new) code
                # clean up
                del sys.modules[]

            return module

    raise ImportError

def import (full_path, target=None):
    parent= None

    # this code iterates over ['foo', '', '']
    elems = full_path.split('.')
    for partial_path in [ '.'.join (elems[:i]) for i in range (len (elems)+1) ][1:]
        parent = import_single(partial_path, parent, target)

    # the module is loaded in parent
    return parent

A more complete version of the if spec is not None branch can be found in the Loading section of the reference. Notice that the algorithm uses all the finders in sys.meta_path. So which are the default finders?

In [9]: sys.meta_path

Of those finders, the latter one is the one that traverses sys.path, and also has a hook mechanism. I didn't use those, so for the moment I didn't untangle how they work.

Finally, this is how I implemented importing ayrton modules and packages:

from import MetaPathFinder, Loader
from importlib.machinery import ModuleSpec
import sys
import os
import os.path

from ayrton.file_test import _a, _d
from ayrton import Ayrton
import ayrton.utils

class AyrtonLoader (Loader):

    def exec_module (klass, module):
        # «the loader should execute the module’s code
        # in the module’s global name space (module.__dict__).»
        load_path= module.__spec__.origin
        loader= Ayrton (g=module.__dict__)
        loader.run_file (load_path)

        # set the __path__
        # TODO: read PEP 420
        init_file_name= '__init__.ay'
        if load_path.endswith (init_file_name):
            # also remove the '/'
            module.__path__= [ load_path[:-len (init_file_name)-1] ]

loader= AyrtonLoader ()

class AyrtonFinder (MetaPathFinder):

    def find_spec (klass, full_name, paths=None, target=None):
        # TODO: read PEP 420 :)
        last_mile= full_name.split ('.')[-1]

        if paths is not None:
            python_path= paths  # search only in the paths provided by the machinery
            python_path= sys.path

        for path in python_path:
            full_path= os.path.join (path, last_mile)
            init_full_path= os.path.join (full_path, '__init__.ay')
            module_full_path= full_path+'.ay'

            if _d (full_path) and _a (init_full_path):
                return ModuleSpec (full_name, loader, origin=init_full_path)

                if _a (module_full_path):
                    return ModuleSpec (full_name, loader, origin=module_full_path)

        return None

finder= AyrtonFinder ()

# I must insert it at the beginning so it goes before FileFinder
sys.meta_path.insert (0, finder)

Notice all the references to PEP 420. I'm pretty sure I must be breaking something, but for the moment this works.

ayrton python

Posted mi� 15 jun 2016 16:46:41 CEST Tags:

Remember this? Ok, maybe you never read that. The gist of the post is that I used strace -r -T to produce some logs that we «amassed[sic] [...] with a python script for generating[sic] a CSV file [...] and we got a very interesting graph». Mein Gott, sometimes the English I write is terrible... Here's that graph again:

This post is to announce that that Python script is now public. You can find it here. It's not as fancy as those flame graphs you see everywhere else, but it's a good first impression, specially if you have to wait until the SysAdmin installs perf or any other tool like that (ok, let's be fair, l/strace is not a standard tool, but probably your grumpy SysAdmin will be more willing to install those than something more intrusive; I know it happened to me, at least). It's written in Python3; I'll probably backport it to Python2 soon, so those stuck with it can still profit from it.

To produce a similar graph, use the --histogram option, then follow the suggestions spewed to stderr. I hope this helps you solve a problem like it did to me!

profiling python

Posted jue 09 jun 2016 15:27:24 CEST Tags:

A short story. For years I've not only accumulated thousands of pictures[1] (around 50, to be more precise) but also schemas on how to sort those photos. After a long internal debate, I settled for the following one (for the moment, that is):

  • Pictures are imported from the camera's SD via a script, which:
    • Renames them from DSCXXXXX.JPG to a date based file name, using the picture's exif data.
    • (Hard) links them to a year/month based dir structure.
  • Later, pictures are sorted by hand into thematic dirs for filtering.
  • The year/month tree is handled with digikam for tagging.
  • Pictures are filtered into their final destination sorted by category, year and event (for instance, trip/year/to_this_place).
  • Pictures are also (hard) linked into a tag based dir structure, using nested tags.

So, the whole workflow is:

SD card --> incoming/01-tmp -> incoming/02-new/theme -> incoming/03-cur --> final destination
        `-> year/month                                                  `-> tags/theme/parent/child

The reason for using hard links is the following: I rsync everything to my home server, both as backup and for feeding the gallery(s) there. Because pictures are moved from one location to another until they reach their final destination, rsync retransmits the picture in its new location and then deletes the old one (I'm using --delete-after, to make sure the backup does not loose any picture if the transfer is stopped). This leads to useless transfers, as the picture is in the remote.

I played with the idea of using git or even git-annex for working around this, but in the end I decided to (ab)use rsync's hard link support. Now moving any picture in the workflow or renaming a category, theme or directory just means creating new hardlinks to the links in the year/month tree and removing the old ones later, an almost immediate operation. This also helps saving space and time when implementing the tag based tree.

digikam is good enough to uniquely identify each picture, even when two hard links point to the same file. This still means the picture appears several times; the metadata (most importantly, tags) are shared, but each new link adds load to the digikam's database.

I bit the bullet and sat down to do a last move and be done. I moved the year/month tree to ByDate/, completely isolating it from the rest of the collection. Then pointed digikam to only read that, and here's how I did it:

  • Closed digikam, of course.
  • Backed up everything, including both the database, which was in the collection's root, and the .local/share/config/digikamrc file.
  • Modified the latter so it points to the new database location:
[Database Settings]
Database Name=/home/mdione/Pictures/ByDate/
Database Name Thumbnails=/home/mdione/Pictures/ByDate/
  • Moved the database.
  • Changed the collection's root dir in the database:
mdione@diablo:~/Pictures/ByDate$ sqlite3 digikam4.db
-- Notice the specificPath is relative to the volume. In my case, the volume is /home
sqlite> select * from AlbumRoots;
sqlite> update AlbumRoots set specificPath = '/mdione/Pictures/ByDate' where id = 1;
sqlite> select * from AlbumRoots;
  • Fired digikam and let it rescan the collection, which recognized the only link to the image and kept the tags.

This worked superbly!

[1] When I talk about pictures, I'm also including videos.

Posted lun 06 jun 2016 16:59:11 CEST

Your boss climbed the corporate ladder, wrong by wrong.