glob is not a blog. Not in the common meaning of the word. It's a record of my projects and other interactions with libre software.

All texts CC BY Marcos Dione


I started watching PyCon's videos. One of the first ones I saw was Amber Brown's "How we do identity wrong". I think she[1] is right not only in pointing out that you shouldn't assume things about names, addresses and ID numbers, but also that you shouldn't collect information you don't need; at some point, it becomes a liability.

In the same vein about assuming, I have more examples. One of them is deciding which language to show your site in based on the country the client connects from. I'm not a millennial (more like a transmillennial, if you push me to it), but I tend to go places. Every time I go to a new place, I get sites in new languages, but maps in US!

Today I wanted to book a hotel room. The hotel's site asked me where I live, so I chose France. Fact is, for them country and language are the same thing (I wonder what would happen if I answered Schweiz/Suisse/Svizzera/Svizra), so I couldn't say that I live in France but prefer English; I chose United Kingdom instead. Of course, this also meant that I got prices in GBP, not EUR, so I had to correct that one too. At least I could.

Later they asked me for my country of residence and nationality; when I chose italian, the country was set to Italia, even though I had chosen France first!

I leave you all with an anecdote. As I said, I like to go places, most of the time with friends. Imagine the puzzled expression of the police officer who stopped us and found a car licensed in France, driven by an italian, with argentinian, spanish and chilean passengers, crossing from Austria to Slovakia, listening to US music. I only forgot to set the GPS to japanese or something.

So, don't assume; if you assume, let the user change settings to their preferences, and don't ask for data you don't actually need. And please use the user's Accept-Language header; they send it for a reason.
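Honoring that header server-side doesn't take much. Here's a minimal sketch in Python (the function name is mine; a real app would use its framework's parser or a full RFC 7231 implementation):

```python
# Minimal Accept-Language parser; a sketch, not a full RFC 7231 implementation.
def preferred_languages(header):
    """Return language tags sorted by their q value, highest first."""
    languages = []
    for item in header.split(','):
        parts = item.strip().split(';')
        tag = parts[0].strip()
        q = 1.0  # a missing q parameter means q=1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition('=')
            if name.strip() == 'q':
                q = float(value)
        languages.append((q, tag))
    # sort by q, descending; Python's sort is stable, so header order breaks ties
    languages.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for q, tag in languages]

print(preferred_languages('fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7'))
# ['fr-CH', 'fr', 'en', 'de']
```

Pick the first tag you actually have a translation for, and only then fall back to guessing.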


[1] I think that's the pronoun she said she preferred. I'm sorry if I got that wrong.


python misc

Posted dom 17 jun 2018 18:06:01 CEST Tags:

Another aspect I've been looking into with respect to optimizing the rendering speed is data sources that are not in the target projection. Not being in the target projection forces Mapnik to reproject them on the fly, and this for each (meta)tile, whereas it would make more sense to have them already reprojected.

In my case I have 3 big datasets in EPSG:4258: elevation, hill shade and slope shade, based on EEA's DEM files. The source tiles amount to almost 29GiB, and the elevation layer, being RGB, takes more than that. So I set off to try to reproject the things.

My first, most obvious approach was to reproject every 5x5°, 18000x18000px tile, then derive the data I need, but I started to get gaps between tiles. Notice that the original data is cleanly cut (5° x 3600"/° x 1px/" == 18000px), without any overlapping.

The next approach was to merge them all into a .vrt file, then reproject chunks of it with gdalwarp. What I wanted as output was the same 5x5° tiles, reprojected, but with an extra pixel so they overlap. This last requirement was the problematic one. See, the final projection turns any square in the original projection into a tall rectangle, stretching more and more towards the poles. The closest I could get was to use the -ts option, but that meant I had no control over how many extra pixels I got in the vertical/latitude direction. My OCD started thrashing :) In fact, what happened was that I was not sure how GDAL would handle a possible partial pixel: whether rounding down (excluding it), up (completing it), or simply leaving the pixel with partial data and impacting the final rendering.

Even Rouault pointed out to me that gdalwarp can do something fantastic: it can also generate a .vrt file with all the parameters needed for the reprojection, so reading from it automatically reprojects the original data. The resulting dataset is 288,000x325,220px (the original is 288,000x180,000px), so I'm definitely going to cut it down into smaller tiles. After consulting an eight-ball, I decided to discard the idea of tiles with boundaries based on coordinates, which might not even make sense anymore, and settle for pixel based sizes, still with an extra pixel. The chosen size is 2**14+1, a.k.a. 16385. For this gdal_translate is perfect.

The final algorithm is like this:

gdalwarp -t_srs "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 \
    +x_0=0.0 +y_0=0.0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs +over" \
    -r lanczos -tr 30.92208077590933 -30.92208077590933 \
    -of VRT EU-DEM.vrt EU-DEM-corrected.vrt

The value for the -tr option is the pixel size in meters, which is the unit declared in the SRS. Notice that, as Mercator stretches towards the poles, this is the size at the origin of the projection; in this case, at 0° lat, 0° lon.

Then the reprojection (by reading from the reprojecting dataset) and cut, in a couple of loops:

tile_size=$((2**14))
for i in $(seq 0 17); do
    for j in $(seq 0 5); do
        for k in $(seq 0 3); do
            l=$((4*$j+$k));
            gdal_translate -co BIGTIFF=YES -co TILED=YES -co COMPRESS=LZMA \
                -co LZMA_PRESET=9 \
                -srcwin $(($tile_size*$i)) $(($tile_size*$l)) \
                    $(($tile_size+1)) $(($tile_size+1)) \
                -of GTiff EU-DEM-corrected.vrt \
                $(printf "%03dx%03d-corrected.tif" $i $l) &
        done;
        wait;
    done;
done

There's an extra loop to be able to launch 4 workers at the same time, because I have 4 cores. This doesn't keep the 4 cores busy 100% of the time (cores that already finished stay idle until the others finish), but expressing it in a Makefile was getting awkward, and this is run only once.
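The -srcwin arithmetic from the loop above can be sketched in Python (tile_windows is my name, not part of any tool):

```python
# Each window reads tile_size+1 pixels in both directions, so
# neighbouring tiles overlap by exactly one pixel.
tile_size = 2 ** 14  # 16384

def tile_windows(columns, rows, tile_size):
    """Yield (xoff, yoff, xsize, ysize) tuples as passed to -srcwin."""
    for i in range(columns):      # i, as in `seq 0 17`
        for l in range(rows):     # l = 4*j + k, as in the nested loops
            yield (tile_size * i, tile_size * l,
                   tile_size + 1, tile_size + 1)

windows = list(tile_windows(18, 24, tile_size))
print(len(windows))   # 432
print(windows[0])     # (0, 0, 16385, 16385)
```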

Before deriving the rest of the data there's an extra step: removing the generated tiles that have no data at all. I do a similar thing with empty sea tiles in the rendering process. Notice also that the original data is not tile-complete for the covered region (79 tiles instead of the 160 it should be).


openstreetmap gis gdal

Posted mar 23 ene 2018 14:08:42 CET Tags:

At work I'm writing an API using Django/DRF. Suddenly I had to write an application (just a few pages for calling a few endpoints), so I (ab)used DRF's Serializers to build them. One of the problems I faced while doing this is that DRF's ChoiceField accepts only a sequence with the values for the dropdown, unlike Django's, which also accepts callables. This means that once you give it a set of values, it never ever changes, at least until you restart the application.

Unless, of course, you cheat. Or hack. Aren't those synonyms?

class UpdatedSequence:
    def __init__(self, update_func):
        self.update_func = update_func
        self.restart = True

        self.data = None
        self.index = 0


    def __iter__(self):
        # we're our own iterator
        return self


    def __next__(self):
        # if we're iterating from the beginning, call the function
        # and cache the result
        if self.restart:
            self.data = self.update_func()
            self.index = 0

        try:
            datum = self.data[self.index]
        except IndexError:
            # we reached the limit, start all over
            self.restart = True
            raise StopIteration
        else:
            self.index += 1
            self.restart = False

        return datum

This simple class tracks when you start iterating over it and calls the function you passed to obtain the data. Then it iterates over the result. When you reach the end, it marks itself to start all over, so the next time you iterate over it, it will call the function again. The function you pass can be the all() method of a QuerySet or anything else that fetches data and returns an indexable sequence.
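To see the refresh in action, here's a toy run (the class is repeated verbatim so the snippet stands alone; the lambda stands in for a QuerySet):

```python
class UpdatedSequence:
    # same class as above, repeated so this snippet runs on its own
    def __init__(self, update_func):
        self.update_func = update_func
        self.restart = True
        self.data = None
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.restart:
            self.data = self.update_func()
            self.index = 0
        try:
            datum = self.data[self.index]
        except IndexError:
            self.restart = True
            raise StopIteration
        else:
            self.index += 1
            self.restart = False
        return datum

# a toy update function standing in for QuerySet.all()
values = ['a']
choices = UpdatedSequence(lambda: list(values))

print(list(choices))   # ['a']
values.append('b')     # the 'database' changed...
print(list(choices))   # ...and the next full iteration sees it: ['a', 'b']
```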

In my particular case, I also added a TimedCache so I don't hit the db twice to fill two dropdowns with the same info in the same form:

import time

class TimedCache:
    '''A function wrapper that caches the result for a while.'''
    def __init__(self, f, timeout):
        self.f = f
        self.timeout = timeout
        self.last_executed = None
        self.cache = None
        self.__name__ = f.__name__ + ' (Cached %ds)' % timeout


    def __call__(self):
        now = time.monotonic()

        if self.cache is None or (now - self.last_executed) > self.timeout:
            self.cache = self.f()
            self.last_executed = now

        return self.cache
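
A quick runnable check of the caching behaviour (the class is repeated so the snippet stands alone; the counter is just a stand-in for a db query):

```python
import time

class TimedCache:
    '''A function wrapper that caches the result for a while.
    (Same class as above, repeated so this snippet runs on its own.)'''
    def __init__(self, f, timeout):
        self.f = f
        self.timeout = timeout
        self.last_executed = None
        self.cache = None
        self.__name__ = f.__name__ + ' (Cached %ds)' % timeout

    def __call__(self):
        now = time.monotonic()
        if self.cache is None or (now - self.last_executed) > self.timeout:
            self.cache = self.f()
            self.last_executed = now
        return self.cache

calls = []
def fetch_choices():
    calls.append(1)           # count how often the 'db' is actually hit
    return ['x', 'y']

cached = TimedCache(fetch_choices, timeout=60)
cached()
cached()                      # second call within the timeout hits the cache
print(len(calls))             # 1
```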

I hope this helps someone.


python django drf

Posted vie 19 ene 2018 13:18:03 CET Tags:

I never wanted this. Whatsapp and its parent company are some of the things I most hate about what tech has become, showing the utter lack of ethics in an industry that has too much impact on the rest of the planet. Just go and read all the reports about groups abusing these platforms (and the platforms letting them) to change politics all over the world, or about all the shitty security a lot of IoT stuff has, and how those devices are used to attack services on the Internet.

But reality is more complex than that. In my home country, Facebook and Whatsapp specifically are very popular, to the point that, due to our lack of net-neutrality laws, phone companies offer cheap contracts where those two applications' data usage doesn't count as such, becoming completely free. This means people have almost stopped using calls or SMS, and instead send text, pictures and voice messages via these platforms. That includes my whole family, which has also almost stopped using email.

So for the moment I set my principles aside and installed the app. My last attempt at this had failed because you can't install it on a tablet; the device has to have phone capabilities. What's more, when you try to register it, it forces you to use a cellular phone number; Signal at least has the decency to let you register with a land line too (if it can't send you an SMS, it gives you the option of being called and the registration code is spelled out to you). Luckily I had a spare number from a throwaway line I bought on my last trip to the homeland, so I used that number instead of my real one. I know it's a useless step; it's equivalent to giving the finger to someone's back.

Once installed, I tried to send a message to my wife. The app won't let you do that unless you give it access to your contacts in exchange. Again luckily for me, this phone was mostly empty, but I still took steps to avoid giving it all my contacts. The few I had were already sync'ed to my owncloud instance back home.

First, I exported all my contacts locally and deleted them all. I reimported them after I got the app running. Then I created a new, empty owncloud account, so when Whatsapp asked me which 'account' to use to get/sync the contacts, I gave it that one. This way, when you add contacts, they go to this 'honeypot' and the app doesn't have access to your real Contacts. If you don't have an owncloud or similar service you control, you can simply create a bogus Google account and use that instead. The only downside is that you will get duplicated contacts, but once you've sent them a message, you can safely delete the contact and even completely disable sync'ing the account. You can also revoke the permission to access Contacts, but that means you're back to square one, except for the conversations you have already started.

I'm sorry I can't give you the exact steps I followed; I was on the bus, and with all the failed attempts I lost track. Of course, removing all the contacts means that you only see phone numbers and their photos, but after a while you can recognize people by that. Right now I only have my wife and my family's group, and I hope I can keep it like that for a long, long time.

One last thing: Whatsapp asks you for your contacts, but you can't nicely ask for them back: the phone numbers of new contacts are very difficult to extract. You either export them to the contacts account if you still have it around (I didn't) or you copy them by hand (which I did). Last but not least, I still have the nagging sensation that Whatsapp might have been able to read the contacts anyway; I really wish that Android would give us more fine grained firewall capabilities. Also, remember that Whatsapp has no option to store media on an SD card, only in the phone's internal storage (WTF, people, seriously!), and it's a pain in the ass to clean up the stuff you don't want. So for the moment I haven't given it access to Photos, Media and Files.


misc rant

Posted lun 20 nov 2017 19:41:11 CET Tags:

I have a love and hate relationship with regular expressions (regexps). On one side they're a very powerful tool for text processing, but on the other side of the coin, the most well known implementation is a language whose syntax is so dense it's hard to read beyond the most basic phrases. This clashes with my intention of trying to make programs as readable as possible[1]. It's true that you can add comments and make your regexps span several lines so you can digest them more slowly, but to me it feels like eating dried-up soup by the teaspoon directly from the package, without adding hot water.

So I started reading regexps aloud and writing down how I describe them in natural language. This way, [a-z]+ becomes one or more of any of the letters between lowercase a and lowercase z, but of course this is way too verbose.

Then I picked up these descriptions and tried to come up with a series of names (in the Python sense) that could be combined to build the same regexps. Even 'literate' programs are not really plain English, but a more condensed, still readable version. Otherwise you end up with Perl, and not many think that's a good idea. So that regexp becomes one_or_more(any_of('a-z')). As you can see, some regexp language is still recognizable, but it's the lesser part.
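The core idea fits in a few lines. This is not dinant's actual implementation, just a sketch of building regexps from named, composable functions:

```python
import re

# Each combinator takes and returns a plain regexp string.
def any_of(chars):
    return '[%s]' % chars

def one_or_more(expr):
    return '(?:%s)+' % expr        # non-capturing group, repeated

lowercase_word = one_or_more(any_of('a-z'))
print(lowercase_word)                              # (?:[a-z])+
print(bool(re.fullmatch(lowercase_word, 'hello')))  # True
```

Because every combinator returns a string, the pieces compose freely and the ugly syntax stays hidden inside them.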

So, dinant was born. It's a single source file module that implements that language and some variants (any_of(['a-z'], times=[1, ]), etc.). It also implements some prebuilt regexps for common constructs, like integer, a datetime() function that accepts strptime() patterns, and more complex things like IPv4 or IP_port. As I start using it in (more) real world examples (or as issues are filed!), the language will slowly grow.

Almost accidentally, its constructive form brought along a nice feature: you can debug() your expression to find the first sub-expression that fails to match:

# this is a real world example!
In [1]: import dinant as d
In [2]: line = '''36569.12ms (cpu 35251.71ms)\n'''
# can you spot the error?
In [3]: render_time_re = ( d.bol + d.capture(d.float, name='wall_time') + 'ms ' +
...:                       '(cpu' + d.capture(d.float, name='cpu_time') + 'ms)' + d.eol )

In [4]: print(render_time_re.match(line))
None

In [5]: print(render_time_re.debug(line))
# ok, this is too verbose (I hope next version will be more human readable)
# but it's clear it's the second capture
Out[5]: '^(?P<wall_time>(?:(?:\\-)?(?:(?:\\d)+)?\\.(?:\\d)+|(?:\\-)?(?:\\d)+\\.|(?:\\-)?(?:\\d)+))ms\\ \\(cpu(?P<cpu_time>(?:(?:\\-)?(?:(?:\\d)+)?\\.(?:\\d)+|(?:\\-)?(?:\\d)+\\.|(?:\\-)?(?:\\d)+))'
# the error is that the text '(cpu' needs a space at the end

Of course, the project is quite simple, so there is no regexp optimizer, which means that the resulting regexps are less readable than the ones you would have written by hand. The idea is that, besides debugging, you will never have to see them again.

Two features are on the back burner, and both are related. One is to make debugging easier by simply returning a representation of the original expression instead of the internal regexp used. That means, in the previous example, something like:

bol + capture(float, name='wall_time') + 'ms ' + '(cpu' + capture(float, name='cpu_time')

The second is being able to tell which types the different captured groups must be converted to. This way, capture(float) would not return the string representing the float, but the actual float. The same goes for datetime() and others.

At the time of writing the project only lives on GitHub, but it will also be available on PyPI Any Time Soon®. Go grab it!


python ayrton


[1] for someone that knows how to read English, that is.

Posted mié 18 oct 2017 19:42:37 CEST Tags:

TL;DR: How lazy can you be? This post should take you 5 minutes to read... :-P

So npm is out of Debian testing. This means that the hell that is node code handling is now even harder. node's installation instructions are a bloody video from which you can't copy and paste the commands (how useful), and as far as I can tell, that's the official way to install npm.

If you already have a good version of node provided by your trusted distribution, you most probably will cringe on the idea of installing a third party package like this, and probably you don't think containers are the solution, or you just want to install something locally so you can play with it.

If you look closer, at the bottom of that page you'll find the "advanced user's" guide to installing it yourself, but it's only a pattern URL for the distribution .tar.gz, with no further instructions. With a little bit of luck, the instructions will be included. The pattern has a placeholder for the version you want (putatively, the latest), but I can't find, for the life of me, any reference to which is the latest version.

On the GitHub project page you will find the terrible, unluckily classic curl https://site.com/unknown_script.sh | sh command that downloads this script. The script is written in the POSIX shell dialect and has strange constructions:

node=`which node 2>&1`
ret=$?
if [ $ret -eq 0 ] && [ -x "$node" ]; then
  (exit 0)

To me, that exit 0 in a subshell is equivalent to a NOOP, so I wonder why they decided to write the condition like that.

After checking the availability of a couple of tools (node, tar, make, but notably not curl), it uses the latter to download JSON from the registry, finding there the actual version (currently 4.5.0, if you're interested). It downloads the package, untars it, and executes:

"$node" cli.js rm npm -gf
"$node" cli.js install -gf

The first removes any old installation. More on that in a minute. The second, obviously, installs the new version. But the -gf options (I hate short options in scripts) are left for you to guess, as no help is provided about them. Let's go with --global and --force, which means it will install somewhere in your system, overwriting anything it finds. With the previous command it should have deleted all the old files (same options), so you're really nuking whatever was there before.

Nowhere in the instructions so far does it say anything about root, but obviously this needs to be run as such. There's also this detail:

As of version 0.3, it is recommended to run npm as root. This allows npm to
change the user identifier to the nobody user prior to running any package
build or test commands.

So there's no way to make a local installation of npm... is there? Well, not user wide, only system wide (already explained) and project wide. Here's how to do the latter:

$ wget https://registry.npmjs.org/npm/-/npm-4.5.0.tgz
$ tar xvf npm-4.5.0.tgz  # it's unpacked in a directory called 'package'
$ /usr/bin/node package/cli.js install npm
$ rm -rf package  # clean up after you!
$ ./node_modules/.bin/npm install carto

The third command uses the tarball's CLI interface to install the same version 'the right way'. To be honest, I had already used the old npm version that used to come with Debian to do exactly the same thing. Of course, this works as long as newer versions of npm can still be installed with such an old version of the same. Who knows when that's gonna break/be deprecated.

All in all, it's sad to see such a useful tool be dropped like that. I just hope someone can pick up the pieces.


debian nodejs npm

Posted jue 04 may 2017 21:58:58 CEST Tags:

Since I started playing with rendering maps locally I've been modifying and extending the original generate_tiles.py script from mapnik-stilesheets. I added option parsing and lots of features; here's the current usage:

$ ./generate_tiles.py --help
usage: generate_tiles.py [-h] [-b BBOX] [-B BBOX_NAME] [-n MIN_ZOOM]
                        [-x MAX_ZOOM] [--tiles Z,X,Y [Z,X,Y ...]]
                        [-i MAPFILE] [-f FORMAT] [-o TILE_DIR]
                        [-m METATILE_SIZE] [-t THREADS]
                        [-p {threads,fork,single}] [-X] [-N DAYS]
                        [-E {skip,link,render}] [-d] [--dry-run]

optional arguments:
-h, --help            show this help message and exit
-b BBOX, --bbox BBOX
-B BBOX_NAME, --bbox-name BBOX_NAME
-n MIN_ZOOM, --min-zoom MIN_ZOOM
-x MAX_ZOOM, --max-zoom MAX_ZOOM
--tiles Z,X,Y [Z,X,Y ...]
-i MAPFILE, --input-file MAPFILE
-f FORMAT, --format FORMAT
-o TILE_DIR, --output-dir TILE_DIR
-m METATILE_SIZE, --metatile-size METATILE_SIZE
-t THREADS, --threads THREADS
-p {threads,fork,single}, --parallel-method {threads,fork,single}
-X, --skip-existing
-N DAYS, --skip-newer DAYS
-E {skip,link,render}, --empty {skip,link,render}
-d, --debug
--dry-run

BBoxes are stored in a file called bboxes.ini, so I can say -B Europe instead of remembering the coords. The idea of --format is that I should be supporting the slippy map .png file structure or mbtiles, but the latter's support is lagging behind a little because I don't have a use for it yet. You can choose whether to use threads (broken because mapnik cannot handle the situation; I can't find a reference to the problem now), child processes (probably the only one working correctly) or a single main process (so no parallelism). It handles resuming a stopped render by skipping tiles that already exist or are too new. It can also skip writing empty sea tiles.

I use it to rerender my style every time I make a modification (or just update to the latest openstreetmap-carto, of which mine is a fork). I usually bulk render a great part of Europe up to ZL 11 or 14, and then some regions down to ZL 18 or 19 as needed for trips or other interests.

For Europe, it can take a long while, so I've been thinking of ways to optimize the rendering. Besides tuning the database, I first found that rendering big metatiles (8x8, for instance) gave a big boost in rendering time. The next idea is to reuse the disk cache. When you render a (meta)tile at ZL N, the same data used for rendering it is going to be used for the 4 sub(meta)tiles at ZL N+1 (except when features are removed, which is rare but happens; city labels come to mind). I don't think anything can be done at the mapnik level, but one can think of the tiles as a tree: a node at ZL N has 4 subtiles at level N+1 and the leaves are the last ZL rendered. The original algorithm did a breadth-first traversal of this tree, but with a depth-first traversal it could reuse the kernel's page/disk cache for the data collected by mapnik from the database or other files. Also, I can check whether the subtiles are render-worthy: if they're only sea, I don't need to render them or their subtiles; I can prune whole tile subtrees. The only point at which this stops being true is when we start rendering more things on sea, which currently amounts to ferry routes at ZL 7.
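The depth-first-with-pruning idea can be sketched like this (function names are mine, not the script's; the callbacks stand in for the real sea check and renderer):

```python
# A tile (z, x, y) has four children at zoom z+1: (z+1, 2x+dx, 2y+dy).
# Skipping a sea-only tile prunes its whole subtree.
def render_depth_first(z, x, y, max_zoom, is_sea_only, render):
    if is_sea_only(z, x, y):
        return  # prune: none of the subtiles need rendering either
    render(z, x, y)
    if z < max_zoom:
        for dx in (0, 1):
            for dy in (0, 1):
                render_depth_first(z + 1, 2 * x + dx, 2 * y + dy,
                                   max_zoom, is_sea_only, render)

rendered = []
render_depth_first(0, 0, 0, max_zoom=2,
                   is_sea_only=lambda z, x, y: (z, x, y) == (1, 1, 1),
                   render=lambda z, x, y: rendered.append((z, x, y)))
print(len(rendered))  # 16 instead of 21: pruning (1, 1, 1) also skips its 4 children
```

A parent is always rendered immediately before its first child, which is what lets the kernel's page cache help.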

I finished implementing all these ideas, but I don't have any numbers to prove they work. Not rendering sea tiles should definitely be a great improvement, but I don't really know whether the caching idea works. At least it was fun to implement.

So the rendering batch will be cut in 3: ZLs 0-6 in one run, then 7 and 8 with less threads (these ZLs of my style use so much memory the machine starts thrashing), then 9-18/19 with full threads.


elevation openstreetmap

Posted vie 10 mar 2017 20:44:23 CET Tags:

I'm writing a python module that allows me to 'drive' a site using Qt. This means that I can navigate the site, fill forms, submit them, and read and scrape the resulting pages, Selenium style. The reasons I'm using Qt are that it has enough support for the site I'm driving (it's the web frontend of the SIP telephony solution we're using, which has an incomplete API, and I have to automate several aspects not covered by it); there are python bindings; and I can do it headless: instead of using browser instances, I simply instantiate one QWebPage[1] per thread and that's it.

The first thing I learned today is that the JS objects representing DOM elements have two sets of value holders: attributes and properties. Properties are what in Python we call attributes: the object's members, accessible with the '.' operator, that hold instance values. Attributes are in fact the HTML element's attributes, which gave the properties their initial values. That is, given the following HTML element:

<input type="text" name="foo" id="bar" value="quux">

the initial JS object's attributes and properties will have those values. If you change the value with your browser, the value property of that element will be changed, but not the attribute. When you submit the form, the value properties of all the form elements are used, so if you 'only' change the value attribute, that won't be used. So forget attributes. Also, the DOM is the representation of the actual state of the page, but this state is never reflected in the HTML source that you can ask your browser to show; you do see those changes reflected in the browser's debugger, though. It's like they really wanted[3] to keep initial values apart from current state[2].

On the Qt side, QWebElement is only the DOM element representation, not the JS object[4], so you can't access the properties via its API, but by executing JS[5]:

e = DOMRoot.findFirst('[name="foo"]')
e.evaluateJavaScript("this.value = 'abracadabra'")

Tonight I finished fixing the most annoying bug I had with this site. To add a user I have to fill a form that is split in 7 'tabs' (meaning 7 <div>s with fields, where only one is shown at a time). One of the fields on the second tab has a complex JS interaction and I was cracking my skull trying to make it work. Because the JS reacts to key presses, setting the value property was not triggering it. Next I tried firing a KeyboardEvent in JS, but I didn't succeed. Maybe it was me, maybe it's the fact that the engine behind QWebPage is the original WebKit and for some reason its JS support is lacking there, who knows.

But the good guys from #qtwebkit gave me a third option: just send plain QKeyEvents to the input element. Luckily we can do that, the web engine is completely built in Qt and supports its event system and more. I only had to give focus to the widget.

Again, I tried with JS and failed[7], so I went back to cheating with Qt behind the curtains. QWebElement.geometry() returns the QRect of the QWidget that implements the input element; I just took its .center(), and generated a pair of mouse button press/release events at that point. One further detail: the .geometry() won't be right unless I force the second tab to be shown, forcing the field to be drawn. Still, for some reason, getting a reference to the input field on page load (when I'm trying to figure out which fields are available, which in the long run does not make sense, as fields could easily be created or destroyed on demand with JS) does not return an object that gets updated after the widget is repositioned, so asking for its geometry returns ((0, -1), (-1, 0)), which amounts to an invalid geometry. The solution is to get the reference to the input field only after forcing the div/tab to be shown.

Finally, I create a pair of key press/release events for each character of the string I wanted as value, and seasoned everything with a lot of QMainLoop.processEvents(). Another advantage of using the Qt stuff is that while I was testing I could plug a QWebView, sprinkle some time.sleep() of various lengths, and see how it behaved. Now I can simply remove that to be back to headlessness.

I'm not sure I'll publish the code; as you can see, it's quite hacky, and it would require a lot of cleanup to be publishable without a brown paper bag on my head.


[1] Yes, I'm using qt5.5 because that's what I will have available in the production server.

[2] Although as I said, you can change the attributes and so you lose the original values.

[3] I guess the answer is in in the spec.

[4] I think I got it: QWebElement is the C++ class used in WebKit to represent the HTML tree, the real DOM, while somewhere deeper in there are the classes representing the JS objects, which you just can't reach[6].

[5] This clearly shows that there is a connection between the DOM object and the JS one, you just can't access it via the API.

[6] This is the original footnote: Or something like that. Look, I'm an engineer and I usually want to know how things work, but since my first exposure to HTML, CSS and JS, back in the time when support was flaky and fragmented on purpose, I always wanted to stay as far away from them as possible. Things got much better, but as you can see the details are still somewhat obscure. I guess, I hope the answer is in the spec.

[7] With this I mean that I executed something and it didn't trigger the events it should, and there's no practical way to figure out why.


python pyqt

Posted mar 24 ene 2017 20:25:26 CET Tags:

This is the second time I spent hours looking for this, so this time I'm writing it down.

My 10 year old Dell Inspiron 1420N, which is now my home server where I keep several useful online tools, has two problems: the keyboard and the LCD do not work. Well, the LCD works erratically, most of the time for a couple of seconds after boot. The first problem can be fixed by attaching a USB keyboard, and the second by attaching an external screen.

Except that the machine does not enable the VGA output by default; but no problem, you just press Fn+F8 and voilà, the external screen works. Except that external keyboards do not have an Fn key; but no problem, you can emulate it with Scroll Lock just by telling the BIOS to do so.

But you can't do that if you can't see anything on the screen. To do it blindly, you have to either know your BIOS by heart or find a reference online. I don't know that BIOS by heart, mainly because it's been a loooong while since I had to use it for anything, but also because I barely touch that machine anymore. And online references, well, there are none for models this old.

One possible solution that occurred to me was to run a BIOS image, which you can still download from Dell's site (!), under qemu, but that tool cannot run arbitrary BIOSes. A pity, but understandable.

So without further ado, a schematic of the BIOS contents and how to fix this blindly:

- System
| System Info         <-- the cursor starts here
| Processor Info
| Memory Info
| Device Info
| Battery Info
| Battery Health
| Date/Time
| Boot Sequence
+ Onboard Devices
+ Video
+ Security
+ Performance
+ Power Management
+ Maintenance
- POST Behaviour      <-- 14 * <Down> + <Enter> and the following menu opens
| Adapter Warnings
| Fn Key Emulation    <-- 2 * <Down> + <Enter> and the setup screen opens
| Fast Boot
| Virtualization
| Keypad (embedded)
| Numlock LED
| USB Emulation
+ Wireless

The setup screen is quite simple, it has two options, Off and Scroll Lock, and you move with <Left> and <Right>. I'm not sure if it's needed, but pressing <Enter> to choose your option does not hurt. Then you press <Esc>, which gives you the Exit screen. This screen has three options: Remain in Setup (which is selected), Save/Exit and Discard/Exit. Guess which one you want :^) Just press <Right>, <Enter> and you're done! The machine reboots and now you can use <Scroll Lock>+<F8> in your external keyboard to activate the external screen.


misc

Posted dom 15 ene 2017 21:20:34 CET Tags:

Last night I realized the first point. Checking today I found the rest. Early, often, go!

  • ayrton-0.9 has debug on. It will leave lots of files laying around your file system.
  • Modify the release script so it never, ever allows this again.
  • make install was not running the tests.

Get it on github or pypi!


python ayrton

Posted mié 07 dic 2016 14:10:40 CET Tags:

Another release, but this time not (only) a bugfix one. After playing with bool semantics I converted the file tests from the _X format, which, let's face it, was not pretty, to the more usual -X format. This alone merits a bump in the minor version number. Also, _in, _out and _err now accept a tuple (path, flags), so you can specify things like os.O_APPEND.
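
A minimal sketch of what a (path, flags) redirection plausibly boils down to at the OS level; this is not ayrton's actual implementation, just the underlying mechanics with plain os calls:

```python
import os
import subprocess
import tempfile

# what _out=(path, os.O_APPEND) plausibly desugars to: open the file with the
# extra flags and hand the resulting fd to the child process as its stdout
path = os.path.join(tempfile.mkdtemp(), 'log.txt')
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
subprocess.run(['echo', 'first'], stdout=fd)
subprocess.run(['echo', 'second'], stdout=fd)  # O_APPEND: lands after the first line
os.close(fd)

content = open(path).read()
print(content)
```

Without O_APPEND both children would write at the fd's shared offset; with it, every write goes to the end of the file.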

In other news, I had to drop support for Python-3.3, because otherwise I would have had to make the import system a lot more complex.

But in the end, yes, this is also a bugfix release. Lots of fd leaks were plugged, so I suggest you upgrade if you can. Just remember the s/_X/-X/ change. I found all the leaks thanks to unittest's warnings, even if sometimes they were a little misleading:

testRemoteCommandStdout (tests.test_remote.RealRemoteTests) ... ayrton/parser/pyparser/parser.py:175: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_UNIX, type=SocketKind.SOCK_STREAM, proto=0, raddr=/tmp/ssh-XZxnYoIQxZX9/agent.7248>
  self.stack[-1] = (dfa, next_state, node)

The file and line cited in the warning have nothing to do with the warning itself (that line did not raise it) or the leaked fd, so it took me a while to find where those leaks were coming from. I hope I have some time to find out why this is so. The most frustrating thing was that unittest closes the leaking fd, which is nice, but in one of the test cases it was closing it seemingly before the test finished, and the test failed because the socket was closed:

======================================================================
ERROR: testLocalVarToRemoteToLocal (tests.test_remote.RealRemoteTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/test_remote.py", line 225, in wrapper
    test (self)
File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/test_remote.py", line 235, in testLocalVarToRemoteToLocal
    self.runner.run_file ('ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay')
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 304, in run_file
    return self.run_script (script, file_name, argv, params)
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 323, in run_script
    return self.run_tree (tree, file_name, argv, params)
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 336, in run_tree
    return self.run_code (code, file_name, argv)
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 421, in run_code
    raise error
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 402, in run_code
    exec (code, self.globals, self.locals)
File "ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay", line 6, in <module>
    with remote ('127.0.0.1', _test=True):
File "/home/mdione/src/projects/ayrton_clean/ayrton/remote.py", line 362, in __enter__
    i, o, e= self.prepare_connections (backchannel_port, command)
File "/home/mdione/src/projects/ayrton_clean/ayrton/remote.py", line 270, in prepare_connections
    self.client.connect (self.hostname, *self.args, **self.kwargs)
File "/usr/lib/python3/dist-packages/paramiko/client.py", line 338, in connect
    t.start_client()
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 493, in start_client
    raise e
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1757, in run
    self.kex_engine.parse_next(ptype, m)
File "/usr/lib/python3/dist-packages/paramiko/kex_group1.py", line 75, in parse_next
    return self._parse_kexdh_reply(m)
File "/usr/lib/python3/dist-packages/paramiko/kex_group1.py", line 112, in _parse_kexdh_reply
    self.transport._activate_outbound()
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 2079, in _activate_outbound
    self._send_message(m)
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1566, in _send_message
    self.packetizer.send_message(data)
File "/usr/lib/python3/dist-packages/paramiko/packet.py", line 364, in send_message
    self.write_all(out)
File "/usr/lib/python3/dist-packages/paramiko/packet.py", line 314, in write_all
    raise EOFError()
EOFError

This probably has something to do with the fact that the test (a functional test, really) is using threads and real sockets. Again, I'll try to investigate this.
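
The unclosed-socket warnings themselves are easy to reproduce outside a test suite; here's a sketch (unrelated to ayrton's code) that leaks a socket on purpose and catches the ResourceWarning CPython emits when the object is finalized:

```python
import gc
import socket
import warnings

# leak a socket on purpose and record the ResourceWarning CPython emits
# when the unclosed object is collected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always', ResourceWarning)
    s = socket.socket()
    del s          # never closed: this is the leak
    gc.collect()   # make sure the object is finalized now

leaks = [w for w in caught if issubclass(w.category, ResourceWarning)]
print(leaks[0].message)  # unclosed <socket.socket fd=..., ...>
```

unittest enables these warnings by default, which is how they showed up in the test run above.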

All in all, the release is an interesting one. I'll keep adding small features and releasing, let's see how it goes. Meanwhile, here's the changelog:

  • The 'No Government' release.
  • Test functions are no longer called _X but -X, which is more scripting friendly.
  • Some of those tests had to be fixed.
  • Dropped support for py3.3 because the importer does not work there.
  • tox support, but not yet part of the stable test suite.
  • Lots and lots of more tests.
  • Lots of improvements in the remote() tests; in particular, make sure they don't hang waiting for someone who's not gonna come.
  • Ignore ssh remote() tests if there's no password/phrase-less connection.
  • Fixed several fd leaks.
  • _in, _out and _err also accept a tuple (path, flags), so you can specify things like os.O_APPEND. Mostly used internally.

Get it on github or pypi!


python ayrton

Posted mar 06 dic 2016 19:46:11 CET Tags:

I'll keep this short. During the weekend I found a bug in ayrton. I fixed it in develop and decided to make a release with it, because it was kind of a showstopper. It was the first time I decided to use ayrton for a one-liner. It was this one:

ayrton -c "rm(v=True, locate('.xvpics', _out=Capture))"

See, ayrton's native support for filenames with spaces makes it a perfect replacement for find and xargs and tools like that. That command simply finds all the files or directories named .xvpics using locate and removes them. There is a little bit of magic where locate's output becomes rm's arguments, but probably not magic enough: _out=Capture has to be specified. We'll probably fix that in the near future.
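
In plain Python the same idea needs explicit plumbing. A rough equivalent, with a hypothetical helper, and shown with a stand-in command instead of locate so it runs anywhere:

```python
import subprocess
import sys

def output_lines(cmd):
    # one filename per line survives embedded spaces,
    # unlike shell word-splitting of $(...)
    return subprocess.run(cmd, capture_output=True, text=True).stdout.splitlines()

# the real thing would be: output_lines(['locate', '.xvpics']) fed to ['rm', '-rv', ...];
# here a stand-in command produces two paths with spaces in them
victims = output_lines([sys.executable, '-c',
                        "print('some dir/.xvpics'); print('other dir/.xvpics')"])
print(victims)
```

Each path stays intact as a single argument, which is exactly what the ayrton one-liner gets for free.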

So, enjoy the new release. It just fixes a couple of bugs, one of them directly related to this oneliner. Here's the changelog:

  • The 'Release From The Bus' release.
  • Bugfix release.
  • Argv should not be created with an empty list.
  • Missing dependencies.
  • Several typos.
  • Fix for _h().
  • Handle paramiko exceptions.
  • Calling ayrton -c <script> was failing because the file name was not properly (f|b)aked.
  • ayrton --version didn't work!

Get it on github or pypi!

Meanwhile, a little about its future. I have been working on ayrton on and off. Right now I'm gathering energy to modify pypy's Python parser so it supports py3.6's formatted string literals. With this I can later update ayrton's parser, which is based on pypy's. A part of it has been done, but then I ran out of gas. I think FSLs are perfect for ayrton in its aim to replace shell script languages. In other news, there's a nasty remote() bug that I can't pin down. These two things might mean that there won't be a significant release for a while.


python ayrton

Posted lun 21 nov 2016 22:16:53 CET Tags:

I was trying to modify ayrton so we could really have sh[1]-style file tests. In sh, they're defined as unary operators of the -X form[2], where X is a letter. For instance, -f foo returns true (0 in sh-speak) if foo is some kind of file. In ayrton I defined them as functions you could use, but the names sucked a little. -f was called _f() and so on. Part of the reason is, I think, that both python-sh and ayrton already do some -/_ manipulations in executable names, and part because I thought that -True didn't make any sense.

A couple of days ago I came up with the idea that I could simply call the function f() and (ab)use the fact that - is a unary operator. The only detail was to make sure that - didn't change the truthiness of bools. In fact, it doesn't, but this surprised me a little, although it shouldn't have:

In [1]: -True
Out[1]: -1

In [2]: -False
Out[2]: 0

In [3]: if -True: print ('yes!')
yes!

In [4]: if -False: print ('yes!')

You see, the bool type was introduced in Python-2.3 all the way back in 2003. Before that, the concept of true was represented by any 'true' object, and most of the time by the integer 1; false was mostly 0. In Python-2.2.1, True and False were added to the builtins, but only as other names for 1 and 0. According to that page and the PEP, bool is a subtype of int so you can still do arithmetic operations like True+1 (!!!), but I'm pretty sure deep down they just wanted to be backwards compatible.
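
The subtyping is easy to check for yourself:

```python
# bool really is an int subclass, so arithmetic on it is legal (if ugly)
assert issubclass(bool, int)
assert isinstance(True, int)
assert True + 1 == 2
assert True * 3 == 3
assert -True == -1 and -False == 0
```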

I have to be honest, I don't like that, or the fact that applying - to bools converts them to ints, so I decided to subclass bool and implement __neg__() in such a way that it returns the original value. And that's when I got the real surprise:

In [5]: class FalseBool (bool):
   ...:     pass
   ...:
TypeError: type 'bool' is not an acceptable base type

Probably you didn't know (I didn't), but Python has such a thing as a 'final class' flag. It can only be set while defining classes in a C extension. It's a strange flag, because most classes have to declare it just to be subclassable; it's not even part of the default flags. Even more surprising is that there are a lot of classes that are not subclassable: around 124 in Python-3.6, against only 84 that are.
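
You can probe this yourself by trying to subclass every type reachable from builtins (the exact counts above cover all of CPython's types, not just these, so the numbers here will be smaller; the snippet only shows the probing technique):

```python
import builtins

def subclassable(t):
    # Py_TPFLAGS_BASETYPE unset => TypeError: '...' is not an acceptable base type
    try:
        type('Probe', (t,), {})
        return True
    except TypeError:
        return False

builtin_types = [v for v in vars(builtins).values() if isinstance(v, type)]
final = sorted(t.__name__ for t in builtin_types if not subclassable(t))
print(final)  # includes 'bool'
```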

So there you go. You learn something new every day. If you're curious, here's the final implementation of FalseBool:

class FalseBool:
    def __init__ (self, value):
        if not isinstance (value, bool):
            raise ValueError

        self.value= value

    def __bool__ (self):
        return self.value

    def __neg__ (self):
        return self.value
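
In use, negation is now a no-op as far as truthiness goes (the class is repeated here so the example runs standalone):

```python
# FalseBool from the post, repeated so this snippet is self-contained
class FalseBool:
    def __init__(self, value):
        if not isinstance(value, bool):
            raise ValueError
        self.value = value

    def __bool__(self):
        return self.value

    def __neg__(self):
        return self.value

# - no longer converts to int: truthiness is preserved
assert -FalseBool(True) is True
assert -FalseBool(False) is False
assert bool(FalseBool(False)) is False
assert -(-True) == 1  # plain bool, for contrast: - goes through int
```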

This will go in ayrton's next release, which I hope will be soon. I'm also working on implementing all of the different styles of expansion found in bash. I even seem to have found some bugs in it.


python ayrton


[1] I'm talking about the shell, not to confuse with python-sh.

[2] Well, there are a couple of infix binary operators in the form -XY.

Posted vie 21 oct 2016 18:17:46 CEST Tags:

Today I had to set up 3 Firefox profiles, because I started a new job, and I realized I never documented which extensions I use or why, so I had to work a little from memory. Hence this post, which I plan to keep up-to-date as much as possible.

A little bit of rationale first. I'm very privacy-conscious, but at the same time very pragmatic. I use several profiles to add an extra level of data isolation. That also allows me to have different sets of extensions, because some are so intrusive that they break some non-important sites' functionality.

Finally, the list, in no particular order:

  • FlashGot, by Giorgio Maone: Better downloads handling.

  • Go-Mobile, by 'Geek in Training': A lot of sites are actually more useful (read: with less crap on them) in their mobile versions. This plugin lets you switch from one to the other.

  • HTTPS everywhere, by EFF: Don't navigate in the clear anymore.

  • No Script, also by Giorgio Maone: A broad-spectrum antibiotic. Not loading JS makes pages less CPU intensive, plus sites cannot track you if you don't make requests, plus it also blocks videos.

  • Privacy Badger, also by EFF: In their own words, “protects privacy by blocking spying ads and invisible trackers”.

  • Tab Auto Reload, by 'Schuzak': I use this to reload sites that constantly log you out, but only under certain circumstances.

  • Tab mix plus, by 'onemen': Once upon a time ffox didn't have session management/recovery. Now it does, but it's not very good; I still think TMP's is better. Also, duplicate tab.

  • Toggle animated GIFs, by Simon Lindholm: Stop annoying animations. Just make sure to tick 'Pause GIFs by default'.

  • uBlock Origin, by Raymond Hill: an (ad) blocker; goodbye-adiós 15s ad videos on YouTube.[1]

So that's it. Unluckily there's nothing against browser fingerprinting yet (and my browser ranks as quite unique), and I don't know how much can be/has been implemented by Mozilla. If you have other suggestions about plugins, please leave them in the comments below. As I said, I'll try to keep this post up to date.


misc


[1] I used to use ABP, but it seems it became a protection scam.

Posted vie 14 oct 2016 19:30:38 CEST Tags:

I just uploaded my first semi-automated change. This change was generated with my hack for generating centerlines for riverbank polygons. This time I expanded it to include a JOSM plugin which takes all the closed polygons from the selection and runs the algorithm on them, creating new lines. It still needs some polishing, like making sure they're riverbanks and copying useful tags to the new line, and probably running a simplifying algo at some point. Also, even simple-looking polygons might generate complex lines (in plural, and some of these lines could be spurious), so some manual editing might be required afterwards, especially connecting the new line to existing centerlines. Still, I think it's useful.

Like I mentioned last time, its setup is quite complex: The JOSM plugin calls a Python script that needs the Python module installed. That module, for lack of proper bindings for SFCGAL, depends on PostgreSQL+PostGIS (but we all have one, right? :-[ ), and connection strings are currently hardcoded. All that to say: it's quite hacky, not even alpha quality from the installation point of view.

Lastly, as imagico mentioned in the first post about this hack, the algorithms are not fast, and I already made my computer start thrashing the disk, swapping like Hell, because pg hit a big polygon and started using lots of RAM to calculate its centerline. At least this time I can see how complex the polygons are before handing them to the code. As an initial benchmark, the original data for that changeset (I later simplified it with JOSM's tool) took 0.063927s in pg+gis and 0.004737s in the Python code. More tests will come later.

Okay, one last thing: Java is hard for a Pythonista. At some point it took me 2h40 to write 60 lines of code; that's ~2m40 per line!


openstreetmap gis python

Posted lun 29 ago 2016 20:06:39 CEST Tags:

A month ago I revived my old-laptop-as-server I have at home. I don't do much in it, just serve my photos, a map, provide a ssh trampoline for me and some friends, and not much more. This time I decided to tackle one of the most annoying problems I had with it: closing the lid suspended the system.

Now, the setup in that computer has evolved over some years, so a lot of cruft was left on it. For instance, at some point I solved the problem by installing a desktop and telling it not to suspend the machine, mostly because that's how I configure my current laptop. That, of course, was a cannon-for-killing-flies solution, but it worked, so I could focus on other things. Also, a lot of power-related packages were installed, assuming they were really needed for supporting everything I might ever want to do about power. This is the story of how I removed them all, why, and how I solved the lid problem... twice.

First thing to go were the desktop packages, mostly because the screen in that laptop has been dead for more than a year now, and because its new spot in the house is a small shelf in my wooden desk. Then I reviewed the power-related packages one by one and decided whether I needed them or not. This is more or less what I found:

  • acpi-fakekey: This package has a tool for injecting fake ACPI keystrokes in the input system. Not really needed.
  • acpi-support: It has a lot of scripts that can be run when some ACPI events occur. For instance, lid closing, battery/AC status, but also things like responding to power and even 'multimedia' keys. Nice, but not needed in my case; the lid is going to be closed all the time anyways.
  • laptop-mode-tools: Tools for saving power in your laptop. Not needed either, the server is going to be running all the time on AC (its battery also died some time ago).
  • upower: D-Bus interface for power events. No desktop or anything else to listen to them. Gone.
  • pm-utils: Nice CLI scripts for suspending/hibernating your system. I always have them around in my laptop because sometimes the desktops don't work properly. No use in my server, but it's cruft left from when I used it as my laptop. Adieu.

Even then, closing the lid suspended the system. Who else could be there? Well, there is one project that's everywhere these days: systemd. I'm not saying this is bad, but it is everywhere. Thing is, its login subsystem also handles ACPI events. In the /etc/systemd/logind.conf file you can read the following lines:

#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchDocked=ignore

so I uncommented the 4th line and changed it to:

HandleLidSwitch=ignore

Here you can also configure how the inhibition of actions works:

#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes

Please check the config file's documentation (logind.conf(5)) if you plan to modify it.

Not entirely unrelated, my main laptop also started suspending when I closed the lid. I have it configured, through the desktop environment, to only turn off the screen, because what use is the screen if it's facing the keyboard and touchpad :) Somehow, these settings only recently started to take effect, but a quick search didn't give any results on when things changed. Remembering what I did with the server, I just changed that config file to:

HandlePowerKey=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore

That is, “let me configure this through the desktop, please”, and now I have my old behavior back :)


PS: I should start reading more about systemd. A good starting point seems to be all the links on its home page.


sysadmin systemd acpi

Posted dom 28 ago 2016 17:34:56 CEST Tags:

Dear conference speakers:

Please write and test your slides at 1024x768 resolution, on a projector.

Dear conference organizers:

Please remind your speakers to do so.

Dear conference attendants:

If you are at the back of the room and you can't see the text/code in the slides the speaker(s) is showing, please shout “I can't see shit!” in the appropriate language, and try to embarrass the speaker as much as possible.

Thanks in advance. Yours truly,

-- Marcos.

PS: I've been watching videos of talks from some conferences and I swear to $DEITY that in at least 40% of the ones I was interested in, I couldn't read the code on the video. Sometimes the fonts are too small, sometimes the colors are not contrasting enough. Please, at least test your slides on a projector...

PPS: I know the resolution I'm suggesting is low. Be happy I'm not asking for 640x480 :-P

PPPS: Ok, attendants, don't embarrass/harass the speakers :)


rants

Posted jue 25 ago 2016 14:19:38 CEST Tags:

Like I said in my last post[1], I'm looking at last PyCon's videos. Here are my selected videos, in the order I saw them:

Ned Batchelder - Machete-mode debugging: Hacking your way out of a tight spot. In fact, I saw this twice.

Larry Hastings - Removing Python's GIL: The Gilectomy

The Report Of Twisted’s Death or: Why Twisted and Tornado Are Relevant In The Asyncio Age

How I built a power debugger out of the standard library and things I found on the internet

Davey Shafik - HTTP/2 and Asynchronous APIs

Sumana Harihareswara - HTTP Can Do That?! Points for informative and funny.

Matthias Kramm - Python Typology. Types are coming, so get used to them.

Scott Sanderson, Joe Jevnik - Playing with Python Bytecode Nice, very nice trick. I'm talking about the way the presentation is given.

Brian Warner - Magic Wormhole - Simple Secure File Transfer

And of course, the lightning talks. I always like these, because you get exposed to all kinds of things, some not even remotely connected to Python, but which can send your brain rolling down nice little rabbit holes, or at least get a smile from you. So here:

LT#1. Please watch at least minutes 20-25.

LT#2

LT#3

LT#4

And of course, check the other ones, don't stop at my own interests.


[1] Yes, I started writing this a month ago.


python

Posted vie 19 ago 2016 17:35:45 CEST Tags:

It took a long time to get this release out. A couple of hard bugs (whose fix was just moving a line down a little), a big-ish new feature, and a move to a new city. Here's the ChangeLog:

  • You can import ayrton modules and packages!
  • Depends on Python3.5 now.
  • argv is not quite a list: for some operations (len(), iter(), pop()), argv[0] is left alone.
  • option() raises KeyError or ValueError if the option or its 'argument' is wrong.
  • makedirs() and stat() are available as functions.
  • -p|--pdb launches pdb when there is an unhandled exception.
  • Fix for line in foo(...): ... by automatically adding the _bg=True option.
  • Better Command() detection.
  • A lot of internal fixes.

Get it on github or pypi!


python ayrton

Posted mié 17 ago 2016 13:17:22 CEST Tags:

My latest Europe import was quite eventful. First, I ran out of space several times during the import itself, at indexing time. The good thing is that, if you manage to reclaim some space, and after reading a little of the source code[1], you can replay the missing queries by hand and stop cursing. To be fair, osm2pgsql currently uses a lot of space in slim+flat-nodes mode: three tables, planet_osm_node, planet_osm_way and planet_osm_relation; and one file, the flat nodes one. Those are not deleted until the whole process has finished, but they're actually not needed after the processing phase. I started working on fixing that.

But that was not the most difficult part. The most difficult part was that I forgot, somehow, to add a column to the import.style file. Elevation, my own style, renders different icons for different types of castles (and forts too), just like the Historic Place map of the Hiking and Bridle map[2]. So today I sat down and tried to figure out how to reparse the OSM extract I used for the import to add this info.

The first step is to add the column to the tables. But first, which tables should be impacted? Well, the line I should have added to the import style is this:

node,way   castle_type  text         polygon

That says that this applies to nodes and ways. If the element is a way, polygon will try to convert it to a polygon and put it in the planet_osm_polygon table; if it's a node, it ends up in the planet_osm_point table. So we just add the column to those tables:

ALTER TABLE planet_osm_point   ADD COLUMN castle_type text;
ALTER TABLE planet_osm_polygon ADD COLUMN castle_type text;

Now, how to process the extract? Enter pyosmium. It's a Python binding for the osmium library with stream-like processing à la expat for XML. The interface is quite simple: you subclass osmium.SimpleHandler, define the element-type handlers (node(), way() and/or relation()) and that's it! Here's the full code of the simple Python script I wrote:

#! /usr/bin/python3

import osmium
import psycopg2

conn= psycopg2.connect ('dbname=gis')
cur= conn.cursor ()

class CastleTypes (osmium.SimpleHandler):

    def process (self, thing, table):
        if 'castle_type' in thing.tags:
            try:
                name= thing.tags['name']
            # osmium/boost do not raise a KeyError here!
            # SystemError: <Boost.Python.function object at 0x1329cd0> returned a result with an error set
            except (KeyError, SystemError):
                name= ''
            print (table, thing.id, name)

            cur.execute ('''UPDATE '''+table+
                         ''' SET castle_type = %s
                            WHERE osm_id = %s''',
                         (thing.tags['castle_type'], thing.id))

    def node (self, n):
        self.process (n, 'planet_osm_point')

    def way (self, w):
        self.process (w, 'planet_osm_polygon')

    relation= way  # handle them the same way (*honk*)

ct= CastleTypes ()
ct.apply_file ('europe-latest.osm.pbf')

# psycopg2 opens a transaction implicitly; without this the UPDATEs are lost
conn.commit ()

The only strange part of the API is that it doesn't seem to raise a KeyError when the tag does not exist, but a SystemError. I'll try to figure this out later. Also interesting is the large number of unnamed elements with this tag that exist in the DB.
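
The defensive pattern from the script, factored out into a (hypothetical) helper; it works with plain dicts too, which makes it easy to test without an OSM file:

```python
def get_tag(tags, key, default=''):
    # some bindings (here, osmium via Boost.Python) raise SystemError
    # where a mapping would raise KeyError, so catch both
    try:
        return tags[key]
    except (KeyError, SystemError):
        return default

tags = {'castle_type': 'defensive'}
name = get_tag(tags, 'name')          # missing key -> ''
kind = get_tag(tags, 'castle_type')   # present key -> its value
```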


[1] I would love for GitHub to recognize something like https://github.com/openstreetmap/osm2pgsql/blob/master/table.cpp#table_t::stop and be directed to that method, because #Lxxx gets old pretty quick.

[2] I just noticed how much more complete those maps are. More ideas to use :)


openstreetmap gis python

Posted mar 09 ago 2016 17:45:56 CEST Tags:

You are capable of planning your future.