glob no es un blog. No en el sentido corriente de la palabra. Es un registro de mis proyectos y otras interacciones con el software libre.
glob is not a blog. Not in the common meaning of the word. It's a record of my projects and other interactions with libre software.
Last night I realized the first point. Checking today I found the second. Early, often, go!
ayrton-0.9 has debug on. It will leave lots of files lying around your file system.
- Modify the release script so this never, ever happens again.
make install was not running the tests.
Another release, but this time not (only) a bugfix one. I converted the file tests from a _X format, which, let's face it, was not pretty, into the more usual -X format. This alone merits a change in the minor version number. Also, _err now accepts a tuple (path, flags), so you can specify things like os.O_APPEND.
In other news, I had to drop support for Python-3.3, because otherwise I would have to make the import system a lot more complex.
But in the end, yes, this also is a bugfix release. Lots of fd leaks were plugged, so I suggest you upgrade if you can. Just remember the _X to -X change. I found all the leaks thanks to unittest's warnings, even if sometimes they were a little misleading:
testRemoteCommandStdout (tests.test_remote.RealRemoteTests) ...
ayrton/parser/pyparser/parser.py:175: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_UNIX, type=SocketKind.SOCK_STREAM, proto=0, raddr=/tmp/ssh-XZxnYoIQxZX9/agent.7248>
  self.stack[-1] = (dfa, next_state, node)
The file and line cited in the warning have nothing to do with the warning itself (it was not the line that raised it) or the leaked fd, so it took me a while to find where those leaks were coming from. I hope I have some time to find out why this is so. The most frustrating thing was that unittest closes the leaking fd, which is nice, but in one of the test cases it was closing it seemingly before the test finished, and the test failed because the socket was closed:
======================================================================
ERROR: testLocalVarToRemoteToLocal (tests.test_remote.RealRemoteTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/test_remote.py", line 225, in wrapper
    test (self)
  File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/test_remote.py", line 235, in testLocalVarToRemoteToLocal
    self.runner.run_file ('ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay')
  File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 304, in run_file
    return self.run_script (script, file_name, argv, params)
  File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 323, in run_script
    return self.run_tree (tree, file_name, argv, params)
  File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 336, in run_tree
    return self.run_code (code, file_name, argv)
  File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 421, in run_code
    raise error
  File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 402, in run_code
    exec (code, self.globals, self.locals)
  File "ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay", line 6, in <module>
    with remote ('127.0.0.1', _test=True):
  File "/home/mdione/src/projects/ayrton_clean/ayrton/remote.py", line 362, in __enter__
    i, o, e= self.prepare_connections (backchannel_port, command)
  File "/home/mdione/src/projects/ayrton_clean/ayrton/remote.py", line 270, in prepare_connections
    self.client.connect (self.hostname, *self.args, **self.kwargs)
  File "/usr/lib/python3/dist-packages/paramiko/client.py", line 338, in connect
    t.start_client()
  File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 493, in start_client
    raise e
  File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1757, in run
    self.kex_engine.parse_next(ptype, m)
  File "/usr/lib/python3/dist-packages/paramiko/kex_group1.py", line 75, in parse_next
    return self._parse_kexdh_reply(m)
  File "/usr/lib/python3/dist-packages/paramiko/kex_group1.py", line 112, in _parse_kexdh_reply
    self.transport._activate_outbound()
  File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 2079, in _activate_outbound
    self._send_message(m)
  File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1566, in _send_message
    self.packetizer.send_message(data)
  File "/usr/lib/python3/dist-packages/paramiko/packet.py", line 364, in send_message
    self.write_all(out)
  File "/usr/lib/python3/dist-packages/paramiko/packet.py", line 314, in write_all
    raise EOFError()
EOFError
This probably has something to do with the fact that the test (a functional test, really) is using threads and real sockets. Again, I'll try to investigate this.
All in all, the release is an interesting one. I'll keep adding small features and releasing; let's see how it goes. Meanwhile, here's the changelog:
- The 'No Government' release.
- Test functions are no longer called _X but -X, which is more scripting friendly.
- Some of those tests had to be fixed.
- Dropped support for py3.3 because the importer does not work there.
- tox support, but not yet part of the stable test suite.
- Lots and lots more tests.
- Lots of improvements in the remote() tests; in particular, making sure they don't hang waiting for someone who's not gonna come.
- Ignore ssh remote() tests if there's no password/passphrase-less connection.
- Fixed several fd leaks.
- _err also accepts a tuple (path, flags), so you can specify things like os.O_APPEND. Mostly used internally.
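To make the tuple form concrete, a hypothetical ayrton snippet (the command and file names are made up; os.O_APPEND is the flag from the changelog):

import os

# append the command's stderr to a log file instead of truncating it on every run
ls ('/var/log', _err=('ls.log', os.O_APPEND))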
I'll keep this short. During the weekend I found a bug in ayrton. I fixed it in develop and decided to make a release with it, because it was kind of a showstopper. It was the first time I decided to use ayrton for a one-liner. It was this one:
ayrton -c "rm(v=True, locate('.xvpics', _out=Capture))"
ayrton's native support for filenames with spaces makes it a perfect replacement for xargs and tools like that. That command simply finds all the files or directories matched by locate and removes them. There is a little bit of magic where locate's output becomes rm's arguments, but probably not magic enough: _out=Capture has to be specified. We'll probably fix that in the near future.
So, enjoy the new release. It just fixes a couple of bugs, one of them directly related to this oneliner. Here's the changelog:
- The 'Release From The Bus' release.
- Bugfix release.
- Argv should not be created with an empty list.
- Missing dependencies.
- Several typos.
- ayrton -c <script> was failing because the file name was not properly (f|b)aked.
- ayrton --version didn't work!
Meanwhile, a little about its future. I have been working on ayrton on and off. Right now I'm gathering energy to modify pypy's Python parser so it supports py3.6's formatted string literals. With this I can later update ayrton's parser, which is based on pypy's. A part of it has been done, but then I ran out of gas. I think FSLs are perfect for ayrton in its aim to replace shell script languages.
In other news, there's a nasty
remote() bug that I can't pin down. These two
things might mean that there won't be a significant release for a while.
I was trying to modify ayrton so we could really have sh-style file tests. In sh, they're defined as unary operators in the -X form, where X is a letter. For instance, -f foo returns true (0 in sh-speak) if foo is some kind of file. In ayrton I defined them as functions you could use, but the names sucked a little: -f was called _f() and so on. Part of the reason is, I think, that ayrton already does some juggling with - in executable names, and part because I thought that -True didn't make any sense.
A couple of days ago I came up with the idea that I could simply call the function f() and (ab)use the fact that - is a unary operator. The only detail was to make sure that - didn't change the truthiness of bools. In fact, it doesn't, but this surprised me a little, although it shouldn't have:
In : -True
Out: -1

In : -False
Out: 0

In : if -True: print ('yes!')
yes!

In : if -False: print ('yes!')
You see, the bool type was introduced in Python-2.3, all the way back in 2003. Before that, the concept of true was represented by any 'true' object, and most of the time as the integer 1; false was mostly the integer 0. Later, True and False were added to the builtins, but only as other names for 1 and 0. According to that page, bool is a subtype of int, so you could still do arithmetic operations like True+1 (!!!), but I'm pretty sure deep down they just wanted to be retro-compatible.
I have to be honest, I don't like that, or the fact that applying - to bools converts them to ints, so I decided to subclass bool and implement __neg__ in such a way that it returns the original value. And that's when I got the real surprise:
In : class FalseBool (bool):
...:     pass
...:
TypeError: type 'bool' is not an acceptable base type
Probably you didn't know (I didn't), but Python has such a thing as a 'final class' flag. It can only be used when defining classes in a C extension. It's a strange flag, because most classes have to declare it just to be subclassable; it's not even part of the default flags. Even more surprising is that there are a lot of classes that are not subclassable: around 124 in Python-3.6, and only 84 that are.
So there you go. You learn something new every day. If you're curious, here's
the final implementation of FalseBool:

class FalseBool:
    def __init__ (self, value):
        if not isinstance (value, bool):
            raise ValueError

        self.value= value

    def __bool__ (self):
        return self.value

    def __neg__ (self):
        return self.value
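If I got this right, negation now preserves the original truthiness instead of degrading to an int; a quick hypothetical session:

In : -FalseBool (True)
Out: True

In : -FalseBool (False)
Out: False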
This will go in ayrton's next release, which I hope will be soon. I'm also working on implementing all the different styles of expansion found in bash. I even seem to have found some bugs in it.
 I'm talking about the shell, not to be confused with
 Well, there are a couple of infix binary operators in the form
Today I had to setup 3 Firefox profiles, because I started a new job, and I realized I never documented which extensions I use or why, so I had to work a little from memory. Hence, this post, which I plan to keep up-to-date as much as possible.
A little bit of rationale first. I'm very privacy-conscious, but at the same time very pragmatic. I use several profiles to add an extra level of data isolation. That also allows me to have different sets of extensions, because some are so intrusive that they break some non-important sites' functionality.
Finally, the list, in no particular order:
FlashGot, by Giorgio Maone: Better downloads handling.
Go-Mobile, by 'Geek in Training': A lot of sites are actually more useful (read: with less crap on them) in their Mobile versions. This plugin lets you switch from one to the other.
HTTPS everywhere, by EFF: Don't navigate in the clear anymore.
No Script, also by Giorgio Maone: A broad spectrum antibiotic. Not loading JS makes pages less CPU intensive, plus sites cannot track you if you don't make requests, plus also blocks videos.
Privacy Badger, also by EFF: In their own words, “protects privacy by blocking spying ads and invisible trackers”.
Tab Auto Reload, by 'Schuzak': I use this to reload sites that constantly log you out, but only under certain circumstances.
Tab mix plus, by 'onemen': Once upon a time ffox didn't have session management/recovery. Now it does, but not very well; I still think TMP's is better. Also, duplicate tab.
Toggle animated GIFs, by Simon Lindholm: Stop annoying animations. Just make sure to tick 'Pause GIFs by default'.
uBlock Origin, by Raymond Hill: an (ad) blocker; goodbye-adiós, 15s ad videos on youtube.
So that's it. Unluckily there's nothing against browser fingerprinting yet (and my browser ranks as quite unique), and I don't know how much can be/has been implemented by Mozilla. If you have other suggestions about plugins, please leave them in the comments below. As I said, I'll try to keep this post up to date.
 I used to use ABP, but it seems it became a protection scam.
I just uploaded my first semi-automated change. This change was generated with my hack for generating centerlines for riverbank polygons. This time I expanded it to include a JOSM plugin which will take all the closed polygons from the selection and run the algorithm on them, creating new lines. It still needs some polishing, like making sure they're riverbanks and copying useful tags to the new line, and probably running a simplifying algo at some point. Also, even simple-looking polygons might generate complex lines (in plural, and some of these lines could be spurious), so some manual editing might be required afterwards, especially connecting the new line to existing centerlines. Still, I think it's useful.
Like I mentioned last time, its setup is quite complex: The JOSM plugin calls a Python script that needs the Python module installed. That module, for lack of proper bindings for SFCGAL, depends on PostgreSQL+PostGIS (but we all have one, right? :-[ ), and connection strings are currently hardcoded. All that to say: it's quite hacky, not even alpha quality from the installation point of view.
Lastly, as imagico mentioned in the first post about this hack, the algorithms are not fast, and I already made my computer start thrashing the disk, swapping like Hell, because pg hit a big polygon and started using lots of RAM to calculate its centerline. At least this time I can see how complex the polygons are before handing them to the code. As an initial benchmark, the original data for that changeset (I later simplified it with JOSM's tool) took 0.063927s in pg+gis and 0.004737s in the Python code. More tests will come later.
Okay, one last thing: Java is hard for a Pythonista. At some point it took me 2h40 to write 60 lines of code, that's ~2m40 per line!
A month ago I revived the old-laptop-as-server I have at home. I don't do much in it: just serve my photos, a map, provide a ssh trampoline for me and some friends, and not much more. This time I decided to tackle one of the most annoying problems I had with it: that closing the lid led to the system suspending.
Now, the setup in that computer has evolved over some years, so a lot of cruft was left on it. For instance, at some point I solved the problem by installing a desktop and telling it not to suspend the machine, mostly because that's how I configure my current laptop. That, of course, was a cannon-for-killing-flies solution, but it worked, so I could focus on other things. Also, a lot of power-related packages were installed, assuming they were really needed for supporting everything I might ever want to do about power. This is the story of how I removed them all, why, and how I solved the lid problem... twice.
First thing to go were the desktop packages, mostly because the screen in that laptop has been dead for more than a year now, and because its new place in the house is a small shelf in my wooden desk. Then I reviewed the power-related packages one by one and decided whether I needed them or not. This is more or less what I found:
acpi-fakekey: This package has a tool for injecting fake ACPI keystrokes in the input system. Not really needed.
acpi-support: It has a lot of scripts that can be run when some ACPI events occur. For instance, lid closing, battery/AC status, but also things like responding to power and even 'multimedia' keys. Nice, but not needed in my case; the lid is going to be closed all the time anyways.
laptop-mode-tools: Tools for saving power in your laptop. Not needed either, the server is going to be running all the time on AC (its battery also died some time ago).
upower: D-Bus interface for power events. No desktop or anything else to listen to them. Gone.
pm-utils: Nice CLI scripts for suspending/hibernating your system. I always have them around in my laptop because sometimes the desktops don't work properly. No use in my server, but it's cruft left from when I used it as my laptop. Adieu.
Even then, closing the lid led to the system suspending. Who else could be there?
Well, there is one project that's everywhere: systemd. I'm not saying this is bad, but it is everywhere. Thing is, its login subsystem also handles ACPI events. In the /etc/systemd/logind.conf file you can read the following:
#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchDocked=ignore
so I uncommented the 4th line and changed it to:

HandleLidSwitch=ignore
Here you can also configure how the inhibition of actions works:
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
Please check the config file's doc if you plan to modify it.
Not entirely unrelated, my main laptop also started suspending when I closed the lid. I have it configured, through the desktop environment, to only turn off the screen, because what use is the screen if it's facing the keyboard and touchpad :) Somehow, these settings only recently started to take effect, but a quick search didn't give any results on when things changed. Remembering what I did with the server, I just changed that config file to:
HandlePowerKey=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore
That is, “let me configure this through the desktop, please”, and now I have my old behavior back :)
PS: I should start reading more about
systemd. A good starting point seems to
be all the links in its home page.
Dear conference speakers:
Please write your slides at 1024x768 resolution, and test them on a projector.
Dear conference organizers:
Please remind your speakers to do so.
Dear conference attendants:
If you are at the back of the room and you can't see the text/code in the slides the speaker(s) is showing, please shout “I can't see shit!” in the appropriate language, and try to embarrass the speaker as much as possible.
Thanks in advance. Yours truly,
PS: I've been watching videos of talks from some conferences and I swear to $DEITY that in at least 40% of the ones I was interested in, I couldn't read the code on the video. Sometimes the fonts are too small, sometimes the colors are not contrasting enough. Please, at least test your slides on a projector...
PSS: I know the resolution I'm suggesting is low. Be happy I'm not asking for 640x480 :-P
PSSS: Ok, attendants, don't embarrass/harass the speakers :)
Like I said in my last post, I'm looking at last PyCon's videos. Here are my selected ones, in the order I saw them:
Ned Batchelder - Machete-mode debugging: Hacking your way out of a tight spot. In fact, I saw this twice.
Sumana Harihareswara - HTTP Can Do That?! Points for informative and funny.
Matthias Kramm - Python Typology. Types are coming, so get used to them.
Scott Sanderson, Joe Jevnik - Playing with Python Bytecode. Nice, very nice trick. I'm talking about the way the presentation is given.
And of course, the lightning talks. I always like these, because you can get exposed to all kinds of things, some not even remotely connected to Python, but which can get your brain rolling down nice little rabbit holes, or at least get a smile from you. So here:
LT#1. Please watch at least the segment between 20m and 25m.
And of course, check the other ones, don't stop at my own interests.
 Yes, I started writing this a month ago.
Long time for this release: a couple of hard bugs (whose fix was just moving a line down a little), a big-ish new feature, and moving to a new city. Here's the changelog:
- You can import ayrton modules and packages!
- Depends on Python3.5 now.
- argv is not quite a list: for some operations (...), argv is left alone.
- option() raises an exception if the option or its 'argument' is wrong.
- stat() and friends are available as functions.
- Launches pdb when there is an unhandled exception.
- Supports for line in foo(...): ... by automatically adding the _out=Capture option (example below).
- A lot of internal fixes.
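About the for line in foo(...) item: iterating over a command's output should now just work, with ayrton adding the capture behind the scenes. A hypothetical snippet:

# each iteration yields one line of the command's output;
# ayrton adds _out=Capture automatically
for line in ls ('-l'):
    print (line)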
My latest Europe import was quite eventful. First, I ran out of space several times during the import itself, at indexing time. The good thing is that, if you manage to reclaim some space and read a little of osm2pgsql's code, you can replay the missing queries by hand and stop cursing. To be fair, osm2pgsql currently uses a lot of space in slim+flat-nodes mode: three tables (planet_osm_relation among them) and one file, the flat nodes one. Those are not deleted until the whole process has finished, but they're actually not needed after the processing phase. I started working on fixing that.
But that was not the most difficult part. The most difficult part was that I forgot, somehow, to add a column to the import style. Elevation, my own style, renders different icons for different types of castles (and forts too), just like the Historic Place map or the Hiking and Bridle map. So today I sat down and tried to figure out how to reparse the OSM extract I used for the import to add this info.
The first step is to add the column to the tables. But first, which tables should be impacted? Well, the line I should have added to the import style is this:
node,way castle_type text polygon
That says that this applies to nodes and ways. If the element is a way, osm2pgsql will try to convert it to a polygon and put it in the planet_osm_polygon table; if it's a node, it ends up in the planet_osm_point table. So we just add the column to those tables:
ALTER TABLE planet_osm_point   ADD COLUMN castle_type text;
ALTER TABLE planet_osm_polygon ADD COLUMN castle_type text;
Now how to process the extract? Enter pyosmium. It's a Python binding for the osmium library, with a stream-like type of processing à la expat for processing XML. The interface is quite simple: one subclasses osmium.SimpleHandler, defines the element type handlers (node(), way() and/or relation()) and that's it! Here's the full code of the simple Python script I did:
#! /usr/bin/python3

import osmium
import psycopg2

conn= psycopg2.connect ('dbname=gis')
cur= conn.cursor ()

class CastleTypes (osmium.SimpleHandler):

    def process (self, thing, table):
        if 'castle_type' in thing.tags:
            try:
                name= thing.tags['name']
            # osmium/boost do not raise a KeyError here!
            # SystemError: <Boost.Python.function object at 0x1329cd0> returned a result with an error set
            except (KeyError, SystemError):
                name= ''

            print (table, thing.id, name)

            cur.execute ('''UPDATE '''+table+
                         ''' SET castle_type = %s WHERE osm_id = %s''',
                         (thing.tags['castle_type'], thing.id))

    def node (self, n):
        self.process (n, 'planet_osm_point')

    def way (self, w):
        self.process (w, 'planet_osm_polygon')

    relation= way  # handle them the same way (*honk*)

ct= CastleTypes ()
ct.apply_file ('europe-latest.osm.pbf')
The only strange part of the API is that it doesn't seem to raise a KeyError when the tag does not exist, but a SystemError. I'll try to figure this out later. Also interesting is the big amount of unnamed elements with this tag that exist in the DB.
 I would love for GitHub to recognize something like https://github.com/openstreetmap/osm2pgsql/blob/master/table.cpp#table_t::stop and be directed to that method, because #Lxxx gets old pretty quick.
 I just noticed how much more complete those maps are. More ideas to use :)
For a few months now I've been trying to have a random slideshow of images. I used to do this either with kscreensaver, which for completely different reasons I can't use now, or glslideshow, which, even when I compiled it by hand, I couldn't find a way to give it the root dir of the images. So, based on OMIT, I developed my own.
The differences with OMIT are minimal. It has to scan the whole tree for finding the appropriate files (its definition of "appropriate" could be improved, it's true); it goes into full screen mode with black background; and it (more) properly handles EXIF rotation. All that in 176 LOCs, including proper licensing (GPLv3), and developed in one day and refined the next one.
So, there you are. Like OMIT, it's in PyQt4, but this time in Python3. The TODO includes porting it to PyQt5 and a few other things. You can grab it here. I plan to do a proper release soon, but for the moment just drop it in your PATH and be happy with it.
These last two days I've been expanding osm-centerlines. Now it not only supports ways more complex than a simple rectangle, but also ones that lead to 'branches' (unfortunately, most probably because the mapper either imported bad data or mapped it himself). Still, I tested it on very complex polygons and the result is not pretty. There is still lots of room for improvement.
Unluckily, it's not as standalone as it could be. The problem is that, so far, the algos force you to provide not only the polygon you want to process, but also its medial. The code extends the medial using info extracted from the skeleton, in such a way that the resulting medial ends on a segment of the polygon, hopefully the one(s) that cross from one riverbank to the other down- and upstream. Calculating the skeleton could be performed by CGAL, but the current Python binding doesn't include that function yet. As for the medial, SFCGAL (a C++ wrapper for CGAL) exports a function that calculates an approximative medial, but there seem to be no Python bindings for it yet.
So, a partial solution would be to use PostGIS-2.2's ST_ApproximateMedialAxis(), so I added a function called skeleton_medial_from_postgis(). The parameters are a psycopg2 connection to a PostgreSQL+PostGIS database and the way you want to calculate, as a Shapely geometry; it returns the skeleton and the medial, ready to be fed into the main algorithm. The result of that should be ready for mapping.
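As a rough sketch of what that function does (not the actual code; I'm assuming the way comes in as a Shapely geometry and PostGIS hands the results back as hex WKB):

import psycopg2
from shapely import wkb

def skeleton_medial_from_postgis (conn, way):
    # let PostGIS compute both the skeleton and the approximate medial axis
    cur= conn.cursor ()
    cur.execute ('''SELECT ST_StraightSkeleton (%(way)s::geometry),
                           ST_ApproximateMedialAxis (%(way)s::geometry)''',
                 dict (way=way.wkb_hex))
    skel_wkb, medial_wkb= cur.fetchone ()

    return wkb.loads (skel_wkb, hex=True), wkb.loads (medial_wkb, hex=True)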
So there's that. I'll be trying to improve it in the next days, and start looking into converting it into a JOSM plugin.
For a long time now I've been thinking on a problem: OSM data sometimes contains riverbanks that have no centerline. This means that someone mapped (part of) the coasts of a river (or stream!), but didn't care about adding a line that would mark its centerline.
But this should be computationally solvable, right? Well, it's not that easy. See, for any given riverbank polygon in OSM's database, you have 4 types of segments: those representing the right and left riverbanks (two types) and the flow-in and flow-out segments, which link the banks upstream and downstream. With a little bit of luck there will be only one flow-in and one flow-out segment, but there are no guarantees here.
One method could try and identify these segments, then draw a line starting in the middle of the flow-in segment, calculating the middle by traversing both banks at the same time, and finally connect to the middle of the flow-out segment. Identifying the segments by itself is hard, but it is also possible that the result is not optimal, leading to a jagged line. I didn't try anything along those lines, but I could try some examples by hand...
Enter topology, the section of maths that deals with this kind of problem. The skeleton of a polygon is a group of lines that are equidistant from the borders of the polygon. One of the properties this set of lines provides is direction, which can be exploited to find the banks and try to apply the previous algorithm. But a skeleton has a lot of 'branches' that might confuse the algo. Going a little further, there's the medial axis, which in most cases can be considered a simplified skeleton, without most of the skeleton's branches.
Enter free software :) CGAL is a library that can compute a lot of topological properties. PostGIS is clever enough to leverage those algorithms and present, among others, the functions ST_StraightSkeleton() and ST_ApproximateMedialAxis(). With these two and the original polygon I plan to derive the centerline. But first, an image that will help explain it:
The green 'rectangle' is the original riverbank polygon. The thin black line is the skeleton for it; the medium red line is the medial. Notice how the medial and the center of the skeleton coincide. Then we have 4 branches forming V shapes, with their vertices at each end of the medial; their other ends coincide with the ends of the flow-in and flow-out segments!
So the algorithm is simple: start with the medial; from its ends, find the branches in the skeleton that form that V; using the other two ends of those Vs, calculate the point right between them, and extend the medial to those points. This only calculates a centerline. The next step would be to give it a direction. For that I will need to see if there are any nearby lines that could be part of the river (that's what the centerline is for, to possibly extend existing rivers/centerlines), and use its direction to give it to the new centerline.
For the moment the algorithm only solves this simple case. A slightly more complex case is not that trivial, as skeletons and medials are returned as a MultiLineString with a line for each segment, so I will have to rebuild them into LineStrings before processing.
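Shapely's linemerge should do that rebuilding; a minimal example:

from shapely.geometry import MultiLineString
from shapely.ops import linemerge

# two touching segments, as they would come back from the database
segments= MultiLineString ([ [(0, 0), (1, 0)], [(1, 0), (2, 0)] ])

print (linemerge (segments))  # LINESTRING (0 0, 1 0, 2 0)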
I put all the code online, of course :) Besides a preloaded PostgreSQL+PostGIS database with OSM data, you'll need a few Python packages; SQLAlchemy is the one that lets me fetch the data from the db. Ah! By the way, you will need a couple of views:

CREATE VIEW planet_osm_riverbank_skel AS
    SELECT osm_id, way, ST_StraightSkeleton (way) AS skel
      FROM planet_osm_polygon
     WHERE waterway = 'riverbank';

CREATE VIEW planet_osm_riverbank_medial AS
    SELECT osm_id, way, ST_ApproximateMedialAxis (way) AS medial
      FROM planet_osm_polygon
     WHERE waterway = 'riverbank';
Shapely allows me to manipulate the polygonal data, and fiona is used to save the results to a shapefile. This is the first time I ever use all of them (except SQLAlchemy), and it's nice that it's so easy to do all this in Python.
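For reference, the fiona side is about this small (a sketch, with a made-up schema and file name):

import fiona
from shapely.geometry import LineString, mapping

line= LineString ([(0, 0), (1, 1)])  # stand-in for a computed centerline

schema= dict (geometry='LineString', properties=dict (osm_id='int'))
with fiona.open ('centerlines.shp', 'w', driver='ESRI Shapefile',
                 schema=schema) as sink:
    sink.write (dict (geometry=mapping (line), properties=dict (osm_id=1)))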
A few weeks ago an interesting patch landed in the project's page. It adds rendering for several natural relief features, adding ridges, valleys, aretes, dales, couloirs and others to the cliffs, peaks and mountain passes that were already being rendered. I decided to try it in Elevation (offline for the moment).

I sync'ed the style first with the latest release, applied the patch and... not much. My current database is quite old (re-importing takes ages and I don't have space for updates), so I don't have many features like that in the region I'm interested in. In fact, I went checking, and the closest mountain range around here was not in the database, so I added it.
By the way, the range is mostly concurrent with a part of an administrative boundary, so SK53 suggested making a new line. Even when other features are nearby (there's a path close to the crest and it's also more or less the limit between a forest and a bare rock section), which already makes the region a little bit crowded with lines, it makes sense: boundaries, paths, forest borders and ridges change at different time scales, so having them as separate lines makes an update to any of those independent of the rest.
Now I wanted to export this feature and import it in my rendering database, so I could actually see the new part of the style. This is not a straightforward process, if only because when I imported my data I used osm2pgsql --drop, which removes the much-needed intermediate tables for when one wants to update with osm2pgsql --append. Here's a roundabout way to go.
First you download the full feature (thanks
RichardF!). In this case:
This not only exports the line (which is a sequence of references to nodes) with
its tags, but the nodes too (which are the ones storing the coords). The next
step is to convert it to something more malleable, for instance, GeoJSON. For
that I used
ogr2ogr like this:
ogr2ogr -f GeoJSON 430573542.GeoJSON 430573542.xml lines
The last parameter is needed because, quoting Even Rouault (a.k.a. José GDAL): «you will always get "points", "lines", "multilinestrings", "multipolygons" and "other_relations" layers when reading a osm file, even if some are empty», and the GeoJSON driver refuses to create layers for you:
ERROR 1: Layer lines not found, and CreateLayer not supported by driver.
But guess what, that's not the easiest way :) At least we learned something. In fact, postgis already has a tool called shp2pgsql that imports ESRI Shapefiles, and ogr2ogr produces by default this kind of file. It creates a file for each layer as discussed before, but again, we're only interested in the lines one:

ogr2ogr 430573542 430573542.xml lines
shp2pgsql -a -s 900913 -S 430573542/lines.shp > 430573542.sql
We can't use this SQL file directly, as it has a couple of problems. First, you can't tell shp2pgsql the name of the table where you want to insert the data, or of the geometry column. Second, it only recognizes some attributes (see below), and the rest it tries to add as hstore tags. So we have to manually edit the file to go from:
INSERT INTO "lines" ("osm_id","name","highway","waterway","aerialway","barrier","man_made","z_order","other_tags",geom) VALUES ('430573542','Montagne Sainte-Victoire',NULL,NULL,NULL,NULL,NULL,'0','"natural"=>"ridge"','010500002031BF0D[...]');
INSERT INTO "planet_osm_line" ("osm_id","name","z_order","natural",way) VALUES ('430573542','Montagne Sainte-Victoire','0','ridge','010500002031BF0D[...]');
s/other_tags/"natural"/ (with double quotes,
natural is a keyword in SQL, as in
s/'"natural"=>"ridge"'/'ridge'/ (in single quotes, so it's a string; double
quotes are for columns). And I also removed the superfluous values and the
ANALIZE line, as I don't care that much. Easy peasy.
A comment on the options for shp2pgsql: -s 900913 declares the SRID of the database. I got that when I tried without it and got:

ERROR: Geometry SRID (0) does not match column SRID (900913)

-S is needed because shp2pgsql by default generates MultiLineStrings, but that table in particular has a LineString way column. This is how I figured it out:

ERROR: Geometry type (MultiLineString) does not match column type (LineString)
Incredibly, after this data massacre, it loads in the db:
$ psql gis < 430573542.sql
SET
SET
BEGIN
INSERT 0 1
COMMIT
Today I stumbled upon PyCon 2016's youtube channel and started watching some of the talks. The first one I really finished watching was Ned Batchelder's "Machete debugging", a very interesting talk about 4 strange bugs and the 4 strange techniques they used to find where those bugs were produced. It's a wonderful talk, full of ideas that, if you're a mere mortal developer like me, will probably blow your mind.
One of the techniques they use for one of the bugs is to actually write a trace function. A trace function, in cpython context, is a function that is called at several different points of the execution of Python code. For more information, see the docs for sys.settrace(). In my case I used tracing for something that I always liked about bash: that you can ask it to print every line that's being executed (even in functions and subprocesses!). I wanted something similar for ayrton, so I sat down to figure out how this would work.
The key to all this is the function I mention up there. The API seems simple enough
at first sight, but it's a little more complicated. You give this function what is
called the global trace function. This function will be called with three parameters:
a frame, an event and an event-dependent arg. The event I'm interested in is
line, which is called for each new line of code that is executed. The complication
comes because what this global trace function should return is a
local trace function. This function will be called with the same parameters as
the global trace function. I would really like an explanation why this is so.
The job for this function, in
ayrton's case, is simple:
inspect the frame, extract the filename and line number and print that. At first this
seems to mean that I should read the files by myself, but luckily there's another
interesting standard module:
linecache to the rescue.
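Putting the pieces together, a bare-bones version of the mechanism in plain Python (this is not ayrton's actual code):

import sys
import linecache

def local_trace (frame, event, arg):
    if event == 'line':
        filename= frame.f_code.co_filename
        lineno= frame.f_lineno
        # linecache reads and caches the source file for us
        print ('+', linecache.getline (filename, lineno).rstrip ())

    # keep using this function for the rest of the scope
    return local_trace

def global_trace (frame, event, arg):
    # called when a new scope is entered; return the local trace function
    return local_trace

sys.settrace (global_trace)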
The only 'real complication' of ayrton's use is that it would not work if the script to run was passed with the -c|--script option, but (un)luckily the execution engine already has to read and hold the script's lines, so using those as the cache instead of linecache was easy.
Finally, if you're interested in the actual code,
go take a look.
Just take into account that ayrton has 3 levels of tracing: à la bash (lines prepended by +), with line numbers, and tracing any Python line execution, including any modules you might use and their dependencies. And don't forget that it also has 3 levels of debug logging into files. See the docs.
ayrton has always been able to use any Python module, package or extension as
long as it is in a directory in
sys.path, but trying to solve a bigger bug, I
realized that there was no way to use
ayrton modules or packages. Having only
laterally heard about the new
importlib module and the new mechanism, I sat down
and read more about it.
The best source (or at least the easiest to find) is possibly what Python's reference says about the import system, but I have to be honest: it was not an easy read. Next week I'll sit down and see if I can improve it a little. So, for those out there who, like me, might be having some troubles understanding the mechanism, here's how I understand the system works (ignoring deprecated APIs and corner cases or even relative imports; I haven't used or tried those yet):
import sys
from types import ModuleType

def import_single (full_path, parent=None, module=None):
    # try this cache first
    if full_path in sys.modules:
        return sys.modules[full_path]

    # if not, try all the finders
    for finder in sys.meta_path:
        # 'module' doubles as find_spec()'s target parameter
        if parent is not None:
            spec = finder.find_spec (full_path, parent.__path__, module)
        else:
            spec = finder.find_spec (full_path, None, module)

        # if the finder 'finds' ('knows how to handle') the full_path,
        # it will return a spec with a loader
        if spec is not None:
            loader = spec.loader

            if module is None and hasattr (loader, 'create_module'):
                module = loader.create_module (spec)

            if module is None:
                module = ModuleType (spec.name)  # this creates an empty module object

            module.__spec__ = spec

            # add it to the cache before loading so it can be referenced from it
            sys.modules[spec.name] = module

            try:
                # if the module was passed as a parameter,
                # this repopulates the module's namespace
                # by executing the module's (possibly new) code
                loader.exec_module (module)
            except:
                # clean up
                del sys.modules[spec.name]
                raise

            return module

    raise ImportError

def import_ (full_path, target=None):  # the real thing is the import statement, of course
    parent = None

    # this iterates over ['foo', 'foo.bar', 'foo.bar.baz']
    elems = full_path.split ('.')
    for partial_path in [ '.'.join (elems[:i]) for i in range (len (elems)+1) ][1:]:
        parent = import_single (partial_path, parent, target)

    # the module is loaded in parent
    return parent
A more complete version of the
if spec is not None branch can be found in
the Loading section
of the reference. Notice that the algorithm uses all the finders in sys.meta_path.
So which are the default finders?
In : sys.meta_path
Out:
[_frozen_importlib.BuiltinImporter,
 _frozen_importlib.FrozenImporter,
 _frozen_importlib_external.PathFinder]
Of those finders, the last one is the one that traverses sys.path, and it also has a hook mechanism. I didn't use those hooks, so for the moment I didn't untangle how they work.
Finally, this is how I implemented importing
ayrton modules and packages:
from importlib.abc import MetaPathFinder, Loader
from importlib.machinery import ModuleSpec
import sys
import os
import os.path

from ayrton.file_test import _a, _d
from ayrton import Ayrton
import ayrton.utils

class AyrtonLoader (Loader):

    @classmethod
    def exec_module (klass, module):
        # «the loader should execute the module's code
        # in the module's global name space (module.__dict__).»
        load_path= module.__spec__.origin
        loader= Ayrton (g=module.__dict__)
        loader.run_file (load_path)

        # set the __path__
        # TODO: read PEP 420
        init_file_name= '__init__.ay'
        if load_path.endswith (init_file_name):
            # also remove the '/'
            module.__path__= [ load_path[:-len (init_file_name)-1] ]

loader= AyrtonLoader ()

class AyrtonFinder (MetaPathFinder):

    @classmethod
    def find_spec (klass, full_name, paths=None, target=None):
        # TODO: read PEP 420 :)
        last_mile= full_name.split ('.')[-1]

        if paths is not None:
            python_path= paths  # search only in the paths provided by the machinery
        else:
            python_path= sys.path

        for path in python_path:
            full_path= os.path.join (path, last_mile)
            init_full_path= os.path.join (full_path, '__init__.ay')
            module_full_path= full_path+'.ay'

            if _d (full_path) and _a (init_full_path):
                return ModuleSpec (full_name, loader, origin=init_full_path)
            else:
                if _a (module_full_path):
                    return ModuleSpec (full_name, loader, origin=module_full_path)

        return None

finder= AyrtonFinder ()

# I must insert it at the beginning so it goes before the standard finders
sys.meta_path.insert (0, finder)
Notice all the references to PEP 420. I'm pretty sure I must be breaking something, but for the moment this works.
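To give an idea of what this enables, a hypothetical layout (names made up): with a file foo.ay somewhere in sys.path containing

def greet ():
    print ('hello from an ayrton module')

an ayrton script can now simply do

import foo
foo.greet ()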
Remember this? Ok,
maybe you never read that. The gist of the post is that I used
strace -r -T to
produce some logs that we «amassed[sic] [...] with a python script for generating[sic]
a CSV file [...] and we got a very interesting graph». Mein Gott, sometimes the
English I write is terrible... Here's that graph again:
This post is to announce that that Python script is now public. You can find it here. It's not as fancy as those flame graphs you see everywhere else, but it gives a good first impression, especially if you have to wait until the sysadmin installs perf or any other tool like that (ok, let's be fair, l/strace is not a standard tool either, but probably your grumpy sysadmin will be more willing to install those than something more intrusive; I know it happened to me, at least). It's written in Python3; I'll probably backport it to Python2 soon, so those stuck with it can still profit from it.
To produce a similar graph, use the --histogram option, then follow the suggestions spewed to stderr. I hope this helps you solve a problem like it did for me.
A short story. For years I've not only accumulated thousands of pictures (around 50 thousand, to be more precise) but also schemas on how to sort those photos. After a long internal debate, I settled for the following one (for the moment, that is):
- Pictures are imported from the camera's SD via a script, which:
- Renames them from DSCXXXXX.JPG to a date-based file name, using the picture's exif data.
- (Hard) links them to a year/month based dir structure.
- Later, pictures are sorted by hand into thematic dirs for filtering.
- The year/month tree is handled with digikam.
- Pictures are filtered into their final destination, sorted by category, year and event (for instance, ...).
- Pictures are also (hard) linked into a tag based dir structure, using nested tags.
So, the whole workflow is:
SD card --> incoming/01-tmp -> incoming/02-new/theme -> incoming/03-cur --> final destination
                            `-> year/month
                            `-> tags/theme/parent/child
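The import script's rename-and-link step could look something like this minimal sketch (hypothetical code; the real script takes the date from the exif data, here I just fall back to the file's mtime):

import os
from datetime import datetime

def import_picture (src, incoming, bydate):
    date= datetime.fromtimestamp (os.path.getmtime (src))
    name= date.strftime ('%Y%m%d%H%M%S') + '.jpg'

    # rename into the incoming tree...
    dst= os.path.join (incoming, name)
    os.rename (src, dst)

    # ... and hard link, not copy, into the year/month tree
    month_dir= os.path.join (bydate, date.strftime ('%Y'), date.strftime ('%m'))
    os.makedirs (month_dir, exist_ok=True)
    os.link (dst, os.path.join (month_dir, name))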
The reason for using hard links is the following: I rsync everything to my home server, both as backup and for feeding the gallery(s) there. Because pictures are moved from one location to another until they reach their final destination, rsync retransmits each picture in its new location and then deletes the old one (I'm using --delete-after, to make sure the backup does not lose any picture if the transfer is stopped). This leads to useless transfers, as the picture is already in the remote.
I played with the idea of using
git or even
git-annex for working around this,
but in the end I decided to (ab)use
rsync's hard link support. Now moving any picture
in the workflow or renaming a category, theme or directory just means creating new hardlinks to the links in the year/month tree
and removing the old ones later, an almost immediate operation. This also helps saving
space and time when implementing the tag based tree.
digikam is good enough to uniquely identify each picture, even when two hard links point to the same file. This still means the picture appears several times; the metadata (most importantly, tags) are shared, but each new link adds load to the database.
I bit the bullet and sat down to do one last move and be done with it. I moved the year/month tree into ByDate/, completely isolating it from the rest of the collection. Then I pointed digikam to only read that, and here's how I did it:
- Closed digikam, of course.
- Backed up everything, including both the database, which was in the collection's root, and the config file.
- Modified the latter so it points to the new database location:

[Database Settings]
Database Name=/home/mdione/Pictures/ByDate/
Database Name Thumbnails=/home/mdione/Pictures/ByDate/
- Moved the database.
- Changed the collection's root dir in the database:
mdione@diablo:~/Pictures/ByDate$ sqlite3 digikam4.db
-- Notice the specificPath is relative to the volume. In my case, the volume is /home
sqlite> select * from AlbumRoots;
1|Pictures|0|1|volumeid:?uuid=f5cadc44-6324-466c-8a99-4ede7457677e|/mdione/Pictures
sqlite> update AlbumRoots set specificPath = '/mdione/Pictures/ByDate' where id = 1;
sqlite> select * from AlbumRoots;
1|Pictures|0|1|volumeid:?uuid=f5cadc44-6324-466c-8a99-4ede7457677e|/mdione/Pictures/ByDate
- Started digikam and let it rescan the collection, which recognized the only link to the image and kept the tags.
This worked superbly!
 When I talk about pictures, I'm also including videos.
I've been improving a little Elevation's reproducibility. One of the steps of setting it up is to download an extract to both import in the database and fetch the DEM files that will be part of the background. The particular extract that I'm using, Europe, is more than 17GiB in size, which means that it takes a looong time to download. Thus, I would like to have the ability to continue the download if it has been interrupted.
The original script that was trying to do that is using curl. This version is not trying to continue the download, which can easily be achieved by adding the --continue-at - option. The version that has it never hit the repo because of the following:
The problem arises when the file we want to download is rolled over every day. This means that the contents of the file change from one day to the other, and we can't just continue from where we left off if that's the case; we must start all over.
One could think that curl has an option that looks like it handles that, --time-cond, which is what the script is trying to use. This option makes curl send the If-Modified-Since HTTP header, which allows the server to respond with a 304 (Not Modified) if the file is not newer than the provided date. The date curl provides is the one from the file referenced by that option, and I was giving the same file as the one where the output goes. I was using these options wrong; they work the other way around: continue if the file changed, or do nothing if not.
So I sat down to try and tackle the problem. I know one can use a HEAD request to check (at least) two things: the resource's date and size (bah, at least in the case of static files like this). So the original idea was to get the URL's date and size; if the date is newer than the local file's, I should restart the download from scratch; if not, and the size is bigger than the local file's, then continue; otherwise, assume the file is finished downloading and stop there.
The last twist of the problem is that the only useful dates from the local file were either its ctime or its mtime, but both change on every write to the file. This means that if I leave the script downloading the file, the file is rotated in the meanwhile, and the download is interrupted and I try again later, the file's mtime will be newer than the URL's, even when it is for a file that is older than the URL's. So I had to add a parallel timestamp file that is created only when starting a download and never updated (until the next full download; the file is actually re-created), and it is its mtime the one used for comparing with the URL's.
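Sketched in plain Python with requests (the real script is written in ayrton, as mentioned below; names are made up):

import os
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

import requests

def download_action (url, local, stamp):
    # a HEAD request gives us the remote date and size without downloading anything
    head= requests.head (url)
    remote_date= parsedate_to_datetime (head.headers['Last-Modified'])
    remote_size= int (head.headers['Content-Length'])

    if not os.path.exists (local) or not os.path.exists (stamp):
        return 'restart'

    # compare against the parallel timestamp file written when the download
    # started, not against the download's own mtime
    local_date= datetime.fromtimestamp (os.path.getmtime (stamp), timezone.utc)

    if remote_date > local_date:
        return 'restart'   # the file was rolled over, start from scratch
    if remote_size > os.path.getsize (local):
        return 'continue'  # same file, just incomplete
    return 'done'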
Long story short, the --time-cond and --continue-at options combined are not enough for this; a HEAD request helps a little bit, but rotation-while-downloading can further complicate things. One last feature one could ask of such a script would be to keep the old file while downloading a new one and rotate them at the end, but I will leave that for when/if I really need it.
The new script
is written in
ayrton because it's easier to
handle execution output and dates in it than in
bash. This also pushed me to make
minor improvements to it, so expect a release soon.
 In fact the other options are to not do anything (but then we're left with an incomplete, useless file) or to try and find the old file; in the case of geofabrik, they keep the last week of daily rotation, the first day of each previous month back to the beginning of the year, and then the first day of each year back to 2014. Good luck with that.
"Life, loathe it or ignore it, you can't like it." -- Marvin, "Hitchhiker's Guide to the Galaxy"