Shelling Python

Marcos Dione

2013-07-21 01:08

For a long time I've been toying with the idea of having a better programming language for (shell) scripts than bash. Put another way, a better shell language. Note that I'm not looking for a better shell per se; I just want a language that has better data manipulation than the rudimentary set of tools that normal shells give. I might well be overlooking more powerful shells than bash, like zsh, but so far they have seemed to me more pluggable than anything else. And if that language could be Python, all the better.

Enter sh. It's a nice Python module that allows you to call programs as if they were mere functions defined in it. Behind the curtains sh does some magic passes to make it so. It is fairly well documented, commented (in the code) and maintained (via GitHub issues).
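To make the "magic" a bit less mysterious, here is a toy sketch of the general idea, and emphatically not sh's actual implementation: attribute access on an object looks a program up and hands back a function that runs it. The `Commands` class, its `which` parameter and the `my-python` program name are all invented for this illustration; the underscore-to-dash fallback follows the trick described in the first footnote.

```python
import shutil
import subprocess
import sys


class Commands:
    """Toy namespace whose attributes are functions that run programs."""

    def __init__(self, which=shutil.which):
        # `which` maps a program name to a path, or None if it is not found.
        self._which = which

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. program names.
        if name.startswith("_"):
            raise AttributeError(name)
        # Try the name as written, then with every _ replaced by -.
        path = self._which(name) or self._which(name.replace("_", "-"))
        if path is None:
            raise AttributeError(f"no such command: {name}")

        def command(*args):
            done = subprocess.run(
                [path, *map(str, args)],
                stdout=subprocess.PIPE, text=True, check=True,
            )
            return done.stdout

        return command


# A fake lookup so the demo does not depend on what is installed:
# the hypothetical program "my-python" is really the running interpreter.
def fake_which(name):
    return sys.executable if name == "my-python" else None


cmds = Commands(which=fake_which)
answer = cmds.my_python("-c", "print(6 * 7)")  # resolved as "my-python"
```

With the default lookup (`Commands()`), something like `cmds.ls("-l")` would run whatever `ls` is found on the PATH.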

So I started using sh to replace some shell scripts I had with their Python equivalents. So far the experience has been more or less satisfactory, with some papercuts. I think it's easier to explain with a simple-ish example:

Exhibit A. Exhibit B. Try to view them side by side; I aligned them as much as I could. The Python version then diverged to using another set of data.

The most notable thing is that the data manipulation is so much better done in Python. This stems from the fact that bash has no float handling, much less concepts like floor or ceiling. So instead of a couple of ifs in the inner loop, in bash I have to define three arrays and fill them according to some cases that handle the 'signs' of two different 'floats' (they're strings, really). Also, setting the variables west, south, east and north is not only simpler, but it also has more error checking. We also save a loop: Python's version has two nested loops, bash's has three.
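To give an idea of the kind of code this means on the Python side, here is a minimal sketch, not the script itself. It assumes the extent arrives as a "west,south,east,north" string and that the work is split into 1x1 degree cells; both the input format and the names `parse_bbox` and `cells` are made up for the example.

```python
import math


def parse_bbox(text):
    """Turn 'west,south,east,north' into four floats, rejecting nonsense."""
    parts = text.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    # float() itself raises ValueError on anything that is not a number.
    west, south, east, north = (float(p) for p in parts)
    if not -180 <= west < east <= 180:
        raise ValueError("longitudes must satisfy -180 <= west < east <= 180")
    if not -90 <= south < north <= 90:
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    return west, south, east, north


def cells(west, south, east, north):
    """Yield the south-west corner of every 1x1 degree cell overlapping the box."""
    # floor/ceil take care of negative coordinates; no sign juggling needed.
    for lon in range(math.floor(west), math.ceil(east)):
        for lat in range(math.floor(south), math.ceil(north)):
            yield lon, lat


corners = list(cells(*parse_bbox("-1.5,42.2,0.3,43.0")))
# corners == [(-2, 42), (-1, 42), (0, 42)]
```

Two nested loops and a couple of comparisons; the sign handling that needs lookup arrays in bash is absorbed by `math.floor()` and `math.ceil()`.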

Now, if you squint a little, you'll see where Python starts to drag. One of the first things to do is to import a lot of modules. It's impressive how many: seven from the standard library, sh itself and one of my own (file_test). Then we try to figure out the extent of this PBF file by piping the output of one command into another. In bash this is just a matter of, you know, using a pipe. sh provides us functions and we can even nest them, making the output of the inner command go to the outer one. I can live with that, but someone coming from shell scripting might (just might) find it confusing.
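For comparison, this is roughly what a two-command pipe looks like with nothing but the standard library's subprocess module, which is the plumbing that nesting sh calls spares you. The two child commands are stand-ins (the running Python interpreter playing both producer and consumer), not the tools from the script.

```python
import subprocess
import sys

# Stand-ins for real tools: one child prints three lines, the other counts lines.
produce = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
count = [sys.executable, "-c", "import sys; print(len(sys.stdin.readlines()))"]

# Equivalent of the shell's `produce | count`.
first = subprocess.Popen(produce, stdout=subprocess.PIPE)
second = subprocess.Popen(
    count, stdin=first.stdout, stdout=subprocess.PIPE, text=True
)
first.stdout.close()  # our copy of the pipe is not needed; `second` holds its own
output, _ = second.communicate()
first.wait()

lines = int(output)  # 3
```

Compared to a single `|` in bash this is a lot of ceremony, so a nested function call is a reasonable price to pay.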

There's something I think will definitely confuse a shell scripter (a SysAdmin?): the fact that by default sh makes the commands think that their stdout is a TTY, while that is not the case. In my case that meant that osmpbf-outline¹ spat out colors for formatting the output, which meant that I had to explicitly say that the stdout should be a plain file (_tty_out=False). Also, at the beginning, the error handling of sh took me by surprise. That's why at first I said that an exit code of 1 is OK (_ok_code=1), while later I did proper error handling with a try: ... except: block.
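Both surprises are easy to reproduce without sh. The sketch below uses only the standard library's subprocess module; the `run` helper and its `ok_codes` parameter are made up here as a rough analogue of accepting extra exit codes, and say nothing about how sh implements `_ok_code` or `_tty_out`. The child is a stand-in program that colors its output only when its stdout is a terminal, and then exits with status 1.

```python
import subprocess
import sys

# Stand-in for a tool that colors its output on a terminal and exits with 1.
CHILD = r"""
import sys
text = "outline"
if sys.stdout.isatty():
    text = "\033[32m" + text + "\033[0m"
print(text)
sys.exit(1)
"""


def run(cmd, ok_codes=(0,)):
    """Run cmd with stdout on a pipe; raise unless the exit code is acceptable."""
    done = subprocess.run(cmd, stdout=subprocess.PIPE, text=True)
    if done.returncode not in ok_codes:
        raise subprocess.CalledProcessError(
            done.returncode, cmd, output=done.stdout
        )
    return done.stdout


cmd = [sys.executable, "-c", CHILD]

# First approach: declare exit code 1 acceptable.
plain = run(cmd, ok_codes=(0, 1))

# Second approach: keep the default and handle the failure explicitly.
try:
    run(cmd)
    status = 0
except subprocess.CalledProcessError as error:
    status = error.returncode
```

Because the child's stdout is a pipe here, `plain` contains no escape sequences; run the same child straight from a terminal and it comes out green.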

Notice that I also tried to use Python modules and functions where doing so made as much sense as using an external command, or more (or was the only thing that made sense), like when I use os.chdir()² or use os.unlink() instead of rm.
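A small demonstration of why the directory change has to happen in Python: a child process changing its own directory leaves the parent where it was, while os.chdir() moves the Python process itself. The scratch directory and the file name are throwaway values for the example.

```python
import os
import subprocess
import sys
import tempfile

start = os.getcwd()

with tempfile.TemporaryDirectory() as scratch:
    # A child that changes directory does not drag the parent along.
    subprocess.run(
        [sys.executable, "-c", "import os, sys; os.chdir(sys.argv[1])", scratch],
        check=True,
    )
    after_child = os.getcwd()

    # os.chdir() changes the directory of the Python process itself.
    os.chdir(scratch)
    after_chdir = os.path.realpath(os.getcwd())

    # os.unlink() removes a file without spawning rm.
    victim = "junk.tmp"
    open(victim, "w").close()
    os.unlink(victim)
    removed = not os.path.exists(victim)

    os.chdir(start)  # step out before the scratch directory is deleted
```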

Something I also find lacking in sh is more functions to do shell expansion. It only handles globbing, and only because Python's glob.glob() returns an empty list if the pattern does not match any file, while bash leaves the pattern as it is.
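A quick check of the difference, plus a small wrapper that mimics what bash does with its default options. `bash_like_glob` is a name invented here, not something sh or the standard library provides; it relies on glob.glob() handing back an empty list when nothing matches.

```python
import glob
import os
import tempfile


def bash_like_glob(pattern):
    """Expand pattern like an unquoted bash word: the matches, or the pattern itself."""
    matches = sorted(glob.glob(pattern))
    return matches if matches else [pattern]


with tempfile.TemporaryDirectory() as scratch:
    existing = os.path.join(scratch, "notes.txt")
    open(existing, "w").close()

    miss_pattern = os.path.join(scratch, "*.pbf")  # matches nothing
    hit_pattern = os.path.join(scratch, "*.txt")   # matches notes.txt

    raw_miss = glob.glob(miss_pattern)   # []
    hit = bash_like_glob(hit_pattern)    # [existing]
    miss = bash_like_glob(miss_pattern)  # [miss_pattern], as bash would pass it on
```

The standard library also has os.path.expanduser() and os.path.expandvars() for `~` and `$VAR`, but nothing that ties all the expansions together the way a shell does.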

So my conclusion is that sh is already a very good step forward towards what I would like to have in a shell scripting language, but I see room for some improvements. That's why I started hacking on another module, called ayrton, to try to tackle all this. Notice that at the beginning I tried to hack it in such a way that instead of having to say sh.ls or from sh import ls, you would simply use ls and I, backstage, would do all the juggling necessary to make it equivalent to those³. That is not possible without breaking Python itself, but now that I'm starting to convert my scripts, I see a place for it. I will also try to incorporate whatever I hack back into sh. I'll blog about its details soon.

In the meantime, before I start to really document it, you can take a look at the current version of the most advanced script in ayrton so far⁴.

¹ Notice how in the Python script this is written as osmpbf_outline. This is due to the fact that function names in Python cannot have -s in them (it's the 'minus' operator), so sh does a trick: if you put a _ in the function name and no such command exists, it will try replacing them all with -s and try again. Hacky, if you want, but it works for me.
² There's no cd command; cd is a bash builtin, and an external one wouldn't make any sense anyway, as the change would only affect the subcommand and not the Python process.
³ This includes messing with the resolution of builtins, which strangely works in Python 3, but only from the interactive interpreter. I tried to figure out why, but after a while I decided that, since it didn't work out of the box and what I wanted to do was a terribly ugly hack anyway, I would drop it.
⁴ I chose the .ay extension. I hope it doesn't become .ay! :)