Sandboxing WhatsApp

I never wanted this. WhatsApp and its parent company represent some of the things I hate most about what tech has become, showing the utter lack of ethics in an industry that has too much impact on the rest of the planet. Just go and read all the reports about groups abusing these platforms (and the companies letting them) to change politics all over the world, or about all the shitty security a lot of IoT stuff has, and how those devices are used to attack services on the Internet.

But reality is more complex than that. In my home country, Facebook and WhatsApp specifically are very popular, to the point that, due to our lack of net-neutrality laws, phone companies offer cheap contracts where those two applications' data usage doesn't count as such, making them effectively free. As a result, people have almost stopped making phone calls or sending SMSs; instead they send text, pictures and voice messages via these platforms. That includes my whole family, which has also almost stopped using email.

So for the moment I set my principles aside and installed the app. My last attempt at this had failed because you can't install it on a tablet; the device has to have phone capabilities. What's more, when you try to register, it forces you to use a cellular phone number; Signal at least has the decency to let you register a landline too (if it can't send you an SMS, it gives you the option of being called, and the registration code is spelled out to you). Luckily I had a spare number from a throwaway line I bought on my last trip to my home country, so I used that instead of my real one. I know it's a useless step; it's the equivalent of giving someone the finger behind their back.

Once installed, I tried to send a message to my wife. The app won't let you do that unless you give it access to your contacts in exchange. Again luckily for me, this phone was mostly empty, but I still took steps to avoid giving it all my contacts. The few I had were already synced to my ownCloud instance back home.

First, I exported all my contacts locally and deleted them all; I reimported them after I got the app running. Then I created a new, empty ownCloud account, so when WhatsApp asked me which 'account' to use to get/sync the contacts, I gave it that one. This way, when you add contacts, they go to this 'honeypot' and the app doesn't have access to your real contacts. If you don't have an ownCloud or similar service you control, you can simply create a bogus Google account and use that instead. The only downside is that you will get duplicated contacts, but once you have sent them a message, you can safely delete the contact and even completely disable syncing for that account. You can also revoke the permission to access Contacts, but that means you're back to square one, except for the conversations you have already started.

I'm sorry I can't give you the exact steps I took; I was on the bus, and with all the failed attempts I lost track. Of course, removing all the contacts means that you only see phone numbers and their photos, but after a while you can recognize people by those. Right now I only have my wife and my family's group, and I hope I can keep it like that for a long, long time.

One last thing: WhatsApp asks you for your contacts, but you can't nicely ask for them back: the phone numbers of new contacts are very difficult to extract. You either export them to the contacts account, if you still have one around (I didn't), or you copy them by hand (which I did). Last but not least, I still have the nagging sensation that WhatsApp might have been able to read the contacts anyway; I really wish Android would give us more fine-grained firewall capabilities. Also, remember that WhatsApp has no option to store media on an SD card, only in the phone's internal storage (WTF, people, seriously!), and it's a pain in the ass to clean up the stuff you don't want. So for the moment I haven't given it access to Photos, Media and Files.

dinant 0.5

I have a love-and-hate relationship with regular expressions (regexps). On one hand they're a very powerful tool for text processing, but on the other, the most well known implementation is a language whose syntax is so dense it's hard to read beyond the most basic phrases. This clashes with my intention of trying to make programs as readable as possible1. It's true that you can add comments and make your regexps span several lines so you can digest them more slowly, but to me it feels like eating dried-up soup by the teaspoon directly from the package without adding hot water.

So I started reading regexps aloud and writing down how I describe them in natural language. This way, [a-z]+ becomes one or more of any of the letters between lowercase a and lowercase z, but of course this is way too verbose.

Then I picked up these descriptions and tried to come up with a series of names (in the Python sense) that could be combined to build the same regexps. Even 'literate' programs are not really plain English, but a more condensed version that is still readable; otherwise you end up with Perl, and not many people think that's a good idea. So that regexp becomes one_or_more(any_of('a-z')). As you can see, some of the regexp language is still recognizable, but it's the lesser part.

So dinant was born. It's a single-source-file module that implements that language and some other variants (any_of(['a-z'], times=[1, ]), etc.). It also implements some prebuilt regexps for common constructs, like integer, a datetime() function that accepts strptime() patterns, or more complex things like IPv4 or IP_port. As I start using it in (more) real world examples (or as issues are filed!), the language will slowly grow.
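
For a taste of the language, here's a small example composed from the pieces mentioned so far (a sketch; I'm inferring the composition rules from the examples in this post, so check the README for the exact API):

import dinant as d

# match lines like 'host 192.168.0.1:8000' using the prebuilt IP_port
expr = d.bol + 'host ' + d.capture(d.IP_port, name='addr') + d.eol

print(expr.match('host 192.168.0.1:8000'))  # a match if it works, None otherwise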

Almost accidentally, its constructive form brought along a nice feature: you can debug() your expression to find the first sub-expression that fails to match:

# this is a real world example!
In [1]: import dinant as d
In [2]: line = '''36569.12ms (cpu 35251.71ms)\n'''
# can you spot the error?
In [3]: render_time_re = ( d.bol + d.capture(d.float, name='wall_time') + 'ms ' +
   ...:                    '(cpu' + d.capture(d.float, name='cpu_time') + 'ms)' + d.eol )

In [4]: print(render_time_re.match(line))
None

In [5]: print(render_time_re.debug(line))
# ok, this is too verbose (I hope next version will be more human readable)
# but it's clear it's the second capture
Out[5]: '^(?P<wall_time>(?:(?:\\-)?(?:(?:\\d)+)?\\.(?:\\d)+|(?:\\-)?(?:\\d)+\\.|(?:\\-)?(?:\\d)+))ms\\ \\(cpu(?P<cpu_time>(?:(?:\\-)?(?:(?:\\d)+)?\\.(?:\\d)+|(?:\\-)?(?:\\d)+\\.|(?:\\-)?(?:\\d)+))'
# the error is that the text '(cpu' needs a space at the end

Of course, the project is quite simple, so there is no regexp optimizer, which means that the resulting regexps are less readable than the ones you would have written by hand. The idea is that, besides debugging, you should never have to see them again.

Two features are on the back burner, and both are related. One is to make debugging easier by simply returning a representation of the original expression instead of the internal regexp used. That means, for the previous example, something like:

bol + capture(float, name='wall_time') + 'ms ' + '(cpu' + capture(float, name='cpu_time')

The second is to let you declare which type each captured group should be converted to. This way, capture(float) would not return the string representing the float, but the actual float. The same goes for datetime() and others.
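
Here's a minimal sketch of how that conversion could work, assuming the expression keeps a registry of converters for its named groups (this is hypothetical, not dinant's current API):

import re

# hypothetical registry: group name -> type to convert to
converters = {'wall_time': float, 'cpu_time': float}

def typed_groups(match, converters):
    # run each named group through its declared converter, if any
    return {name: converters[name](value) if name in converters else value
            for name, value in match.groupdict().items()}

m = re.match(r'(?P<wall_time>[\d.]+)ms \(cpu (?P<cpu_time>[\d.]+)ms\)',
             '36569.12ms (cpu 35251.71ms)')
print(typed_groups(m, converters))  # {'wall_time': 36569.12, 'cpu_time': 35251.71}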

At the time of writing the project only lives on GitHub, but it will also be available on PyPI Any Time Soon®. Go grab it!


  1. For someone who knows how to read English, that is. 

Installing npm on Debian testing

TL;DR: How lazy can you be? This post should take you 5 minutes to read... :-P

So npm is out of Debian testing. This means that the hell that is node code handling is now even harder. node's installation instructions are a bloody video from which you can't even copy and paste the commands (how useful), and as far as I can tell, that's the official way to install npm.

If you already have a good version of node provided by your trusted distribution, you will most probably cringe at the idea of installing a third party package like this, and you probably don't think containers are the solution, or you just want to install something locally so you can play with it.

If you look close to the bottom of that page you'll find the 'advanced user's' guide to installing it yourself, but it's only a pattern URL for the distribution .tar.gz, with no further instructions; with a little bit of luck, some will be included in the tarball itself. The pattern has a placeholder for the version you want (putatively, the latest), but I can't find, for the life of me, any reference to which is the latest version.

On the GitHub project page you will find the terrible, unluckily classic curl https://site.com/unknown_script.sh | sh command that downloads this script. The script is in POSIX shell dialect, and has strange constructs:

node=`which node 2>&1`
ret=$?
if [ $ret -eq 0 ] && [ -x "$node" ]; then
  (exit 0)

To me, that exit 0 in a subshell is the equivalent of a NOOP, so I wonder why they decided to write the condition like that.

After checking the availability of a couple of tools (node, tar, make, but notably not curl), it uses curl to download JSON from the registry, finding there the actual latest version (currently 4.5.0, if you're interested; see below for a way of querying it yourself). It downloads the package, untars it, and executes:

"$node" cli.js rm npm -gf
"$node" cli.js install -gf

The first removes any old installation; more on that in a minute. The second, obviously, installs the new version. But the -gf options (I hate short options in scripts) are left for you to guess, as no help is provided about them. Let's go with --global and --force, which means it will install somewhere in your system and overwrite anything it finds. With the previous command it should have deleted all the old files (same options), so you're really nuking whatever was there before.
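
By the way, the registry JSON that the script parses is easy to query yourself; here's a quick sketch in Python (the dist-tags field is where the latest version lives):

import json
import urllib.request

# ask the npm registry for npm's own metadata, like the install script does with curl
with urllib.request.urlopen('https://registry.npmjs.org/npm') as response:
    metadata = json.loads(response.read().decode('utf-8'))

print(metadata['dist-tags']['latest'])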

Nowhere in the instructions so far does it say anything about root, but obviously this needs to be run as such. There's also this detail:

As of version 0.3, it is recommended to run npm as root. This allows npm to
change the user identifier to the nobody user prior to running any package
build or test commands.

So there's no way to make a local installation of npm... is there? Well, not user-wide; only system-wide (already explained) and project-wide. Here's how to do the latter:

$ wget https://registry.npmjs.org/npm/-/npm-4.5.0.tgz
$ tar xvf npm-4.5.0.tgz  # it's unpacked in a directory called 'package'
$ /usr/bin/node package/cli.js install npm
$ rm -rf package  # clean up after you!
$ ./node_modules/.bin/npm install carto

The third command uses the tarball's CLI interface to install the same version 'the right way'. To be honest, I had already used the old npm version that used to come with Debian to do exactly the same thing. Of course, this works as long as newer versions of npm can still be installed with such an old version of npm itself. Who knows when that's gonna break or be deprecated.

All in all, it's sad to see such a useful tool be dropped like that. I just hope someone can pick up the pieces.

Optimizing the render stack

Since I started playing with rendering maps locally I've been modifying and extending the original generate_tiles.py script from mapnik-stylesheets. I added option parsing and lots of features; here's the current usage:

$ ./generate_tiles.py --help
usage: generate_tiles.py [-h] [-b BBOX] [-B BBOX_NAME] [-n MIN_ZOOM]
                        [-x MAX_ZOOM] [--tiles Z,X,Y [Z,X,Y ...]]
                        [-i MAPFILE] [-f FORMAT] [-o TILE_DIR]
                        [-m METATILE_SIZE] [-t THREADS]
                        [-p {threads,fork,single}] [-X] [-N DAYS]
                        [-E {skip,link,render}] [-d] [--dry-run]

optional arguments:
-h, --help            show this help message and exit
-b BBOX, --bbox BBOX
-B BBOX_NAME, --bbox-name BBOX_NAME
-n MIN_ZOOM, --min-zoom MIN_ZOOM
-x MAX_ZOOM, --max-zoom MAX_ZOOM
--tiles Z,X,Y [Z,X,Y ...]
-i MAPFILE, --input-file MAPFILE
-f FORMAT, --format FORMAT
-o TILE_DIR, --output-dir TILE_DIR
-m METATILE_SIZE, --metatile-size METATILE_SIZE
-t THREADS, --threads THREADS
-p {threads,fork,single}, --parallel-method {threads,fork,single}
-X, --skip-existing
-N DAYS, --skip-newer DAYS
-E {skip,link,render}, --empty {skip,link,render}
-d, --debug
--dry-run

BBoxes are stored in a file called bboxes.ini (see the example below), so I can say -B Europe instead of remembering the coords. The idea of --format is that I should support both the slippy map .png file structure and mbtiles, but the latter's support is lagging behind a little because I don't have a use for it yet. You can choose whether to use threads (broken, because mapnik cannot handle the situation; I can't find a reference to the problem now), child processes (probably the only method working correctly) or a single main process (so, no parallelism). It handles resuming a stopped render by skipping tiles that already exist or are newer than a given age. It can also skip writing empty sea tiles.
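
For reference, bboxes.ini is just a names-to-coordinates mapping; something in this spirit (hypothetical contents, with rough coords, only for illustration; the real format is whatever the script actually parses):

[bboxes]
# west, south, east, north
Europe = -11.0,35.0,30.0,72.0
Marseille = 5.21,43.16,5.53,43.39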

I use it to rerender my style every time I make a modification (or just update to the latest openstreetmap-carto, of which mine is a fork). I usually bulk render a good part of Europe up to ZL 11 or 14, and then some regions down to ZL 18 or 19 as needed for trips or other interests.

For Europe it can take a long while, so I've been thinking of ways to optimize the rendering. Besides tuning the database, I first found that rendering big metatiles (8x8, for instance) gave a big speed boost. The next idea is to reuse the disk cache. When you render a (meta)tile at ZL N, the same data used for rendering it is going to be used for the 4 sub(meta)tiles at ZL N+1 (except when features are removed at higher zooms, which is rare but happens; city labels come to mind). I don't think anything can be done at the mapnik level, but one can think of the tiles as a tree: a node at ZL N has 4 subtiles at level N+1, and the leaves are the last ZL rendered. The original algorithm did a breadth-first traversal of this tree, but with a depth-first traversal the kernel's page/disk cache can be reused for the data collected by mapnik from the database or other files. Also, I can check whether the subtiles are worth rendering at all: if they're only sea, I don't need to render them or their subtiles; I can prune whole tile subtrees. The only point at which this stops being true is when we start rendering more things on sea, which currently amounts to ferry routes at ZL 7.
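
In pseudo-Python, the depth-first idea looks more or less like this (a sketch, not the actual generate_tiles.py code; render_metatile() and is_empty_sea() stand in for the real machinery):

def render_tree(z, x, y, max_zoom):
    metatile = render_metatile(z, x, y)

    # if the metatile is only sea, prune the whole subtree: the children
    # will be sea only too (modulo ferry routes at ZL 7)
    if is_empty_sea(metatile):
        return

    if z < max_zoom:
        for dx in (0, 1):
            for dy in (0, 1):
                # recursing right away means mapnik finds the data it needs
                # still hot in the kernel's page cache
                render_tree(z + 1, 2 * x + dx, 2 * y + dy, max_zoom)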

I finished implementing all these ideas, but I don't have any numbers to prove they work. Not rendering sea tiles should definitely be a great improvement, but I don't really know whether the caching idea works. At least it was fun to implement.

So the rendering batch will be cut in 3: ZLs 0-6 in one run; then 7 and 8 with fewer threads (at these ZLs my style uses so much memory the machine starts thrashing); then 9-18/19 with full threads.

Implementing Selenium with Python and Qt

I'm writing a Python module that allows me to 'drive' a site using Qt. This means that I can navigate the site, fill forms, submit them, and read and scrape the resulting pages, Selenium style. The reasons I'm using Qt are that it has enough support for the site I'm driving (it's the web frontend of the SIP telephony solution we're using, which has an incomplete API, and I have to automate several aspects not covered by it); that there are Python bindings; and that I can do it headless: instead of using browser instances, I simply instantiate one QWebPage1 per thread and that's it.

The first thing I learned today is that the JS objects representing the DOM elements have two sets of value holders: attributes and properties. Properties are what we in Python call attributes: the object's elements, accessible with the '.' operator, that hold instance values. The attributes are in fact the HTML element's attributes that gave the properties their initial values. That is, given the following HTML element:

<input type="text" name="foo" id="bar" value="quux">

the initial JS object's attributes and properties will have those values. If you change the value with your browser, the value property of that element will be changed, but not the value attribute. When you submit the form, the value properties of all the form elements are used, so if you 'only' change the value attribute, that won't be used. So, forget attributes. Also, the DOM is the representation of the actual state of the page, but this state is never reflected in the HTML source that you can ask your browser to show; you do see those changes reflected in the browser's debugger. It's like they really wanted3 to keep initial values apart from current state2.

On the Qt side, QWebElement is only the DOM element representation, not the JS object4, so you can't access the properties via its API, but you can by executing JS5:

e = DOMRoot.findFirst('[name="foo"]')
e.evaluateJavaScript("this.value = 'abracadabra'")
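
Reading values works the same way, and nicely shows the attribute/property split (assuming e is a valid QWebElement):

current = e.evaluateJavaScript('this.value')  # the property: the current state
initial = e.attribute('value')                # the attribute: what the HTML declared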

Tonight I finished fixing the most annoying bug I had with this site. To add a user I have to fill a form that is split into 7 'tabs' (which means 7 <div>s with fields, where only one is shown at a time). One of the fields on the second tab has a complex JS interaction and I was cracking my skull trying to make it work. Because the JS reacts to key presses, setting the value property was not triggering it. Next I tried firing a KeyboardEvent in JS, but I didn't succeed. Maybe it was me, maybe it's the fact that the engine behind QWebPage is the original WebKit and for some reason its JS support is lacking there, who knows.

But the good guys from #qtwebkit gave me a third option: just send plain QKeyEvents to the input element. Luckily we can do that: the web engine is completely built on Qt and supports its event system and more. I only had to give focus to the widget.

Again, I tried with JS and failed7, so I went back to cheating with Qt behind the curtains. QWebElement.geometry() returns the QRect of the QWidget that implements the input element; I just took its .center() and generated a pair of mouse button press/release events at that point. One further detail is that .geometry() won't be right unless I force the second tab to be shown, which forces the field to be drawn. Still, for some reason, getting a reference to the input field on page load (when I'm trying to figure out which fields are available, which in the long run does not make sense, as fields could easily be created or destroyed on demand with JS) does not return an object that is updated after the widget is repositioned, so asking for its geometry returns ((0, -1), (-1, 0)), which amounts to an invalid geometry. The solution is to get the reference to the input field only after forcing the div/tab to be shown.

Finally, I create a pair of key press/release events for each character of the string I want as the value, and season everything with a lot of QApplication.processEvents(). Another advantage of using the Qt stuff is that while I was testing I could plug in a QWebView, sprinkle some time.sleep() calls of various lengths, and see how it behaved. Now I can simply remove all that and be back to headlessness.
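
The whole dance looks more or less like this (a reconstruction under PyQt5/QtWebKit, not my exact code; page is the QWebPage and e the QWebElement for the input field):

from PyQt5.QtCore import Qt, QEvent, QPointF
from PyQt5.QtGui import QKeyEvent, QMouseEvent
from PyQt5.QtWidgets import QApplication

# click on the center of the input element to give it focus
point = QPointF(e.geometry().center())
for event_type in (QEvent.MouseButtonPress, QEvent.MouseButtonRelease):
    QApplication.sendEvent(page, QMouseEvent(event_type, point, Qt.LeftButton,
                                             Qt.LeftButton, Qt.NoModifier))

# then type the value one character at a time, so the JS key handlers fire
for char in 'abracadabra':
    for event_type in (QEvent.KeyPress, QEvent.KeyRelease):
        QApplication.sendEvent(page, QKeyEvent(event_type, Qt.Key_unknown,
                                               Qt.NoModifier, char))
    QApplication.processEvents()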

I'm not sure I'll publish the code; as you can see, it's quite hacky and it will require a lot of cleanup before I can publish it without a brown paper bag on my head.


  1. Yes, I'm using Qt 5.5 because that's what I will have available on the production server. 

  2. Although as I said, you can change the attributes and so you lose the original values. 

  3. I guess the answer is in the spec. 

  4. I think I got it: QWebElement is the C++ class that is used in WebKit to represent the HTML tree, the real DOM, while somewhere deeper in there are the classes representing the JS objects, which you just can't reach6. 

  5. This clearly shows that there is a connection between the DOM object and the JS one, you just can't access it via the API. 

  6. This is the original footnote: Or something like that. Look, I'm an engineer and I usually want to know how things work, but since my first exposure to HTML, CSS and JS, back in the time when support was flaky and fragmented on purpose, I've always wanted to stay as far away from them as possible. Things got much better, but as you can see the details are still somewhat obscure. I guess, and hope, the answer is in the spec. 

  7. By this I mean that I executed something and it didn't trigger the events it should have, and there's no practical way to figure out why. 

Activate external screen with external keyboard in Dell Inspiron 1420

This is the second time I spent hours looking for this, so this time I'm writing it down.

My 10 year old Dell Inspiron 1420N, which is now my home server where I keep several useful online tools, has two problems: the keyboard and the LCD do not work. Well, the LCD works erratically, most of the time only for a couple of seconds after boot. The first problem can be fixed by attaching a USB keyboard, and the second by attaching an external screen.

Except that the machine does not enable the VGA output by default; but no problem, you just press Fn+F8 and voilà, external screen works. Except that external keyboards do not have the Fn key; but no problem, you can emulate it with Scroll Lock by just telling the BIOS to do so.

But you can't do it if you can't see anything on the screen. To do it blindly, you have to either know your BIOS by heart or find a reference online. I don't know that BIOS by heart, mainly because it's been a loooong while since I had to use it for anything, but also because I barely touch that machine anymore. As for online references, well, there are none for models this old.

One possible solution that occurred to me was to run a BIOS image, which you can still download from Dell's site (!), under qemu, but that tool cannot run arbitrary BIOSes. A pity, but understandable.

So without further ado, a schematic of the BIOS contents and how to fix this blindly:

- System
| System Info         <-- the cursor starts here
| Processor Info
| Memory Info
| Device Info
| Battery Info
| Battery Health
| Date/Time
| Boot Sequence
+ Onboard Devices
+ Video
+ Security
+ Performance
+ Power Management
+ Maintenance
- POST Behaviour      <-- 14 * <Down> + <Enter> and the following menu opens
| Adapter Warnings
| Fn Key Emulation    <-- 2 * <Down> + <Enter> and the setup screen opens
| Fast Boot
| Virtualization
| Keypad (embedded)
| Numlock LED
| USB Emulation
+ Wireless

The setup screen is quite simple: it has two options, Off and Scroll Lock, and you move between them with <Left> and <Right>. I'm not sure if it's needed, but pressing <Enter> to choose your option does not hurt. Then you press <Esc>, which gives you the Exit screen. This screen has three options: Remain in Setup (which is selected), Save/Exit and Discard/Exit. Guess which one you want :^) Just press <Right>, <Enter> and you're done! The machine reboots and now you can use <Scroll Lock>+<F8> on your external keyboard to activate the external screen.

ayrton 0.9.1

Last night I realized the first point below. Checking today I found the last one. Early, often, go!

  • ayrton-0.9 has debug on. It will leave lots of files lying around your file system.
  • Modified the release script so it never, ever allows this again.
  • make install was not running the tests.

Get it on github or pypi!

ayrton 0.9

Another release, but this time not (only) a bugfix one. After playing with bool semantics I converted the file tests from the _X format, which, let's face it, was not pretty, into the more usual -X format. This alone merits a bump in the minor version number. Also, _in, _out and _err now accept a tuple (path, flags), so you can specify things like os.O_APPEND.
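
For instance, something like this should work now (a hedged sketch, not lifted from the test suite):

import os

# -f is the sh-like file test; the tuple makes _out open the log for appending
if -f('build.log'):
    echo('starting a new build', _out=('build.log', os.O_WRONLY|os.O_APPEND))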

In other news, I had to drop support for Python-3.3, because otherwise I would have had to complicate the import system a lot.

But in the end, yes, this is also a bugfix release. Lots of fd leaks were plugged, so I suggest you upgrade if you can. Just remember the s/_X/-X/ change. I found all the leaks thanks to unittest's warnings, even if sometimes they were a little misleading:

testRemoteCommandStdout (tests.test_remote.RealRemoteTests) ... ayrton/parser/pyparser/parser.py:175: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_UNIX, type=SocketKind.SOCK_STREAM, proto=0, raddr=/tmp/ssh-XZxnYoIQxZX9/agent.7248>
  self.stack[-1] = (dfa, next_state, node)

The file and line cited in the warning have nothing to do with the warning itself (that code is not what raised it) or the leaked fd, so it took me a while to find where those leaks were coming from. I hope I have some time to find out why this is so. The most frustrating thing was that unittest closes the leaking fd, which is nice, but in one of the test cases it was closing it seemingly before the test finished, and the test failed because the socket was closed:

======================================================================
ERROR: testLocalVarToRemoteToLocal (tests.test_remote.RealRemoteTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/test_remote.py", line 225, in wrapper
    test (self)
File "/home/mdione/src/projects/ayrton_clean/ayrton/tests/test_remote.py", line 235, in testLocalVarToRemoteToLocal
    self.runner.run_file ('ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay')
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 304, in run_file
    return self.run_script (script, file_name, argv, params)
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 323, in run_script
    return self.run_tree (tree, file_name, argv, params)
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 336, in run_tree
    return self.run_code (code, file_name, argv)
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 421, in run_code
    raise error
File "/home/mdione/src/projects/ayrton_clean/ayrton/__init__.py", line 402, in run_code
    exec (code, self.globals, self.locals)
File "ayrton/tests/scripts/testLocalVarToRealRemoteToLocal.ay", line 6, in <module>
    with remote ('127.0.0.1', _test=True):
File "/home/mdione/src/projects/ayrton_clean/ayrton/remote.py", line 362, in __enter__
    i, o, e= self.prepare_connections (backchannel_port, command)
File "/home/mdione/src/projects/ayrton_clean/ayrton/remote.py", line 270, in prepare_connections
    self.client.connect (self.hostname, *self.args, **self.kwargs)
File "/usr/lib/python3/dist-packages/paramiko/client.py", line 338, in connect
    t.start_client()
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 493, in start_client
    raise e
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1757, in run
    self.kex_engine.parse_next(ptype, m)
File "/usr/lib/python3/dist-packages/paramiko/kex_group1.py", line 75, in parse_next
    return self._parse_kexdh_reply(m)
File "/usr/lib/python3/dist-packages/paramiko/kex_group1.py", line 112, in _parse_kexdh_reply
    self.transport._activate_outbound()
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 2079, in _activate_outbound
    self._send_message(m)
File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1566, in _send_message
    self.packetizer.send_message(data)
File "/usr/lib/python3/dist-packages/paramiko/packet.py", line 364, in send_message
    self.write_all(out)
File "/usr/lib/python3/dist-packages/paramiko/packet.py", line 314, in write_all
    raise EOFError()
EOFError

This probably has something to do with the fact that the test (a functional test, really) is using threads and real sockets. Again, I'll try to investigate this.

All in all, the release is an interesting one. I'll keep adding small features and releasing; let's see how it goes. Meanwhile, here's the changelog:

  • The 'No Government' release.
  • Test functions are no longer called _X but -X, which is more scripting friendly.
  • Some of those tests had to be fixed.
  • Dropped support for py3.3 because the importer does not work there.
  • tox support, but not yet part of the stable test suite.
  • Lots and lots of more tests.
  • Lots of improvements in the remote() tests; in particular, make sure they don't hang waiting for someone who's not gonna come.
  • Ignore ssh remote() tests if there's no password/phrase-less connection.
  • Fixed several fd leaks.
  • _in, _out and _err also accept a tuple (path, flags), so you can specify things like os.O_APPEND. Mostly used internally.

Get it on github or pypi!

ayrton 0.8.1.0

I'll keep this short. During the weekend I found a bug in ayrton. I fixed it in develop and decided to make a release with it, because it was kind of a showstopper. It was the first time I decided to use ayrton for a one-liner. It was this one:

ayrton -c "rm(f=True, v=True, locate('.xvpics', _out=Capture))"

See, ayrton's native support for filenames with spaces makes it a perfect replacement for find, xargs and tools like that. That command simply finds all the files or directories named .xvpics using locate and removes them. There is a little bit of magic where locate's output becomes rm's arguments, but probably not magic enough: _out=Capture has to be specified. We'll probably fix that in the near future.

So, enjoy the new release. It just fixes a couple of bugs, one of them directly related to this one-liner. Here's the changelog:

  • The 'Release From The Bus' release.
  • Bugfix release.
  • Argv should not be created with an empty list.
  • Missing dependencies.
  • Several typos.
  • Fix for _h().
  • Handle paramiko exceptions.
  • Calling ayrton -c <script> was failing because the file name was not properly (f|b)aked.
  • ayrton --version didn't work!

Get it on github or pypi!

Meanwhile, a little about its future. I have been working on ayrton on and off. Right now I'm gathering energy to modify pypy's Python parser so it supports py3.6's formatted string literals. With that done I can later update ayrton's parser, which is based on pypy's. A part of it has been done, but then I ran out of gas. I think FSLs are perfect for ayrton in its aim to replace shell scripting languages. In other news, there's a nasty remote() bug that I can't pin down. These two things might mean that there won't be a significant release for a while.

The truth about bool in Python

I was trying to modify ayrton so we could really have sh1-style file tests. In sh they're defined as unary operators of the -X form2, where X is a letter. For instance, -f foo returns true (0 in sh-speak) if foo is some kind of file. In ayrton I had defined them as functions you could use, but the names sucked a little: -f was called _f(), and so on. Part of the reason is, I think, that both python-sh and ayrton already do some -/_ manipulations in executable names, and part that I thought -True didn't make any sense.

A couple of days ago I came up with the idea that I could simply call the function f() and (ab)use the fact that - is a unary operator. The only detail was to make sure that - didn't change the truthiness of bools. In fact it doesn't, but this surprised me a little, although it shouldn't have:

In [1]: -True
Out[1]: -1

In [2]: -False
Out[2]: 0

In [3]: if -True: print ('yes!')
yes!

In [4]: if -False: print ('yes!')
You see, the bool type was introduced in Python-2.3, all the way back in 2003. Before that, the concept of true was represented by any 'true' object, most of the time the integer 1; false was mostly 0. In Python-2.2.1, True and False were added to the builtins, but only as other names for 1 and 0. According to that page and the PEP, bool is a subtype of int so you can still do arithmetic operations like True+1 (!!!), but I'm pretty sure that, deep down below, they just wanted to stay backwards compatible.
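
Because bool is a subtype of int, that kind of arithmetic really does work; a quick check in any interpreter:

print(True + 1)               # bool is an int subclass, so this is 1 + 1: 2
print(isinstance(True, int))  # True
print(True == 1)              # True: they're literally equal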

I have to be honest, I don't like that, or the fact that applying - to a bool converts it to an int, so I decided to subclass bool and implement __neg__() in such a way that it returns the original value. And that's when I got the real surprise:

In [5]: class FalseBool (bool):
   ...:     pass
   ...:
TypeError: type 'bool' is not an acceptable base type

Probably you didn't know this (I didn't), but Python has such a thing as a 'final class' flag. It can only be used while defining classes in a C extension. It's a strange flag, because most classes have to declare it just to be subclassable; it's not even part of the default flags. Even more surprising is that there are a lot of classes that are not subclassable: around 124 in Python-3.6, against only 84 that are.

So there you go. You learn something new every day. If you're curious, here's the final implementation of FalseBool:

class FalseBool:
    def __init__ (self, value):
        if not isinstance (value, bool):
            raise ValueError

        self.value= value

    def __bool__ (self):
        return self.value

    def __neg__ (self):
        return self.value
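
With that, negation keeps the truth value intact:

print(-FalseBool(True))         # __neg__ returns the stored bool unchanged: True
print(-FalseBool(False))        # False
print(bool(FalseBool(False)))   # __bool__ keeps truthiness consistent: False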

This will go into ayrton's next release, which I hope will be soon. I'm also working on implementing all the different styles of expansion found in bash. I even seem to have found some bugs in it.


  1. I'm talking about the shell, not to be confused with python-sh. 

  2. Well, there are also a couple of infix binary operators of the form -XY. 