<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>.:: Marcos Dione/StyXman's glob ::. (Posts about ayrton)</title><link>https://www.grulic.org.ar/~mdione/glob/</link><description></description><atom:link href="https://www.grulic.org.ar/~mdione/glob/categories/ayrton.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2025 &lt;a href="mailto:mdione@grulic.org.ar"&gt;Marcos Dione&lt;/a&gt; </copyright><lastBuildDate>Sat, 15 Nov 2025 20:52:05 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Customizing the Python language</title><link>https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/</link><dc:creator>Marcos Dione</dc:creator><description>&lt;p&gt;Programming languages can be viewed as three things: their syntax and data model,
their standard
library and the third party libraries you can use. All these define the
expressiveness of the language, and determine what you can write (which problems
you can solve) and how easily or not. This post/talk is about how expressive I think
Python is, and how easy it is or not to change it.&lt;/p&gt;
&lt;p&gt;I said that we solve problems by writing (programs), but in fact, Python can solve
several problems without really writing a program. You can use the interpreter
as a calculator, or use some of the modules as programs:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;http.server&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;8000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With that you can serve the current directory via HTTP. Or do this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;timeit&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'"-".join(str(n) for n in range(100))'&lt;/span&gt;
&lt;span class="m"&gt;10000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;loops,&lt;span class="w"&gt; &lt;/span&gt;best&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;30&lt;/span&gt;.2&lt;span class="w"&gt; &lt;/span&gt;usec&lt;span class="w"&gt; &lt;/span&gt;per&lt;span class="w"&gt; &lt;/span&gt;loop
$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;timeit&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'"-".join([str(n) for n in range(100)])'&lt;/span&gt;
&lt;span class="m"&gt;10000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;loops,&lt;span class="w"&gt; &lt;/span&gt;best&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;27&lt;/span&gt;.5&lt;span class="w"&gt; &lt;/span&gt;usec&lt;span class="w"&gt; &lt;/span&gt;per&lt;span class="w"&gt; &lt;/span&gt;loop
$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;timeit&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'"-".join(map(str, range(100)))'&lt;/span&gt;
&lt;span class="m"&gt;10000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;loops,&lt;span class="w"&gt; &lt;/span&gt;best&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;23&lt;/span&gt;.2&lt;span class="w"&gt; &lt;/span&gt;usec&lt;span class="w"&gt; &lt;/span&gt;per&lt;span class="w"&gt; &lt;/span&gt;loop
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;to check which method is faster. Notice that these are modules in the standard
library, so you get this functionality out of the box. Of course, you could also
install some third party module that has this kind of capability. I find this way
of using modules as programs
very useful, and I would like to encourage module writers to consider providing
such interfaces with their modules if they think it makes sense.&lt;/p&gt;
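&lt;p&gt;As a minimal sketch of what such an interface might look like (the module name
&lt;code&gt;slugify.py&lt;/code&gt; and its &lt;code&gt;slugify()&lt;/code&gt; function are made up for this example), a module
can offer both uses by putting the program part in a small &lt;code&gt;main()&lt;/code&gt; behind the usual
&lt;code&gt;__name__&lt;/code&gt; check:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;# slugify.py: a hypothetical module that doubles as a program
import re
import sys


def slugify(text):
    """Turn arbitrary text into a lowercase, dash-separated slug."""
    return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')


def main(argv=None):
    # with no explicit arguments, fall back to the command line
    args = sys.argv[1:] if argv is None else argv
    for arg in args:
        print(slugify(arg))
    return 0


if __name__ == '__main__':
    # only runs when executed as a program, not when imported
    main()
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Saved under that name, it could be run as &lt;code&gt;python3 -m slugify 'Some Title'&lt;/code&gt; or
used from other code with &lt;code&gt;import slugify&lt;/code&gt;.&lt;/p&gt;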
&lt;p&gt;Similarly, there
are even programs written in Python that can also be used as modules, which I
think should also be considered by all program writers. For instance, I would really
like &lt;code&gt;ssh&lt;/code&gt; to also be a library; of course, we have &lt;code&gt;paramiko&lt;/code&gt;, but I think
it's a waste of precious developer time to reinvent the wheel.&lt;/p&gt;
&lt;p&gt;The next approach I want to show is glue code. The idea is that you take
modules, functions and classes, use them as building blocks, and write a few
lines of code that combine them to provide something that didn't exist before:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;centerlines&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;psycopg2&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;shapely.geometry&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;shapely.wkt&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;shapely.wkb&lt;/span&gt;

&lt;span class="n"&gt;tolerance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.00001&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdin&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psycopg2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dbname&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'gis'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'FeatureCollection'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;feature&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'features'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shapely&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'geometry'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;simplify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;medials&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;centerlines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;skeleton_medials_from_postgis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;medials&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;centerlines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend_medials&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;medials&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;medials&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shapely&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MultiLineString&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt; &lt;span class="n"&gt;medial&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;simplify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                                 &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;medial&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;medials&lt;/span&gt; &lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;ans&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'features'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Feature'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                &lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shapely&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;medials&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ans&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This example does something quite complex: it takes a JSON representation of a
polygon from &lt;code&gt;stdin&lt;/code&gt;, calculates the centerline of that polygon, converts it back
to a JSON representation and outputs that to &lt;code&gt;stdout&lt;/code&gt;. You could say that I'm
cheating; most of the complexity is hidden in the &lt;code&gt;shapely&lt;/code&gt; and &lt;code&gt;centerlines&lt;/code&gt;
modules, and I'm using PostgreSQL to do the actual calculation, but this is what
we developers do, right?&lt;/p&gt;
&lt;p&gt;Once the building blocks are not enough, it's time to write our own. We can write
new functions or classes that solve or model part of the problem and we keep adding
glue until we're finished. In fact, in the previous example,
&lt;code&gt;centerlines.skeleton_medials_from_postgis()&lt;/code&gt; and &lt;code&gt;centerlines.extend_medials()&lt;/code&gt;
are functions that were written for solving this problem in particular.&lt;/p&gt;
&lt;p&gt;But the expressiveness of the language does not stop at function or method call
and parameter passing; there are also operators and other protocols. For instance,
instead of the pure OO call &lt;code&gt;2.add(3)&lt;/code&gt;, we can simply write &lt;code&gt;2 + 3&lt;/code&gt;, which makes
a lot of sense given our
background from 1st grade. Another example which I love is this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# [...]&lt;/span&gt;
    &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;versus&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# [...]&lt;/span&gt;
&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The second version is not only shorter, it's less error prone, as we can easily
forget to do the second &lt;code&gt;line = file.readline()&lt;/code&gt; and iterate forever on the same
line. All this is possible thanks to Python's
&lt;a href="https://docs.python.org/3/reference/datamodel.html#special-method-names"&gt;special methods&lt;/a&gt;,
which is a section of the Python reference that I definitely recommend reading.
This technique allowed me to implement things like this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;command1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;command2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;which makes a lot of sense if you have a shell scripting background; or this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;cd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# this is executed in path&lt;/span&gt;

&lt;span class="c1"&gt;# this is executed back on the original directory&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;which will also ring a bell for those of you who are used to &lt;code&gt;bash&lt;/code&gt; (for
those of you who aren't, it's written as &lt;code&gt;( cd path; ... )&lt;/code&gt;). I can now even
write this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;remote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# this body excecutes remotely in hostname via ssh&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Following this same pattern with the file example above, we can even simplify it
further like so:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# [...]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This not only relieves us from closing the file by hand; the file is also closed
even if an unhandled exception is raised within the &lt;code&gt;with&lt;/code&gt; block.&lt;/p&gt;
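&lt;p&gt;To make the role of special methods concrete, here is a much simplified sketch of
how a &lt;code&gt;cd()&lt;/code&gt; context manager and a &lt;code&gt;|&lt;/code&gt; operator could be built with
&lt;code&gt;__enter__()&lt;/code&gt;, &lt;code&gt;__exit__()&lt;/code&gt; and &lt;code&gt;__or__()&lt;/code&gt; alone. The &lt;code&gt;Filter&lt;/code&gt; class
and the names &lt;code&gt;upper&lt;/code&gt; and &lt;code&gt;first&lt;/code&gt; are invented for the illustration; this is not
the actual implementation behind the examples above:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;import os
import tempfile


class cd:
    """Context manager: change directory, go back on exit (simplified sketch)."""
    def __init__(self, path):
        self.path = path

    def __enter__(self):
        self.old = os.getcwd()
        os.chdir(self.path)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # runs even if the body raised; returning False lets the exception propagate
        os.chdir(self.old)
        return False


class Filter:
    """Hypothetical command-like object; `a | b` feeds a's output to b."""
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # called for `self | other`
        return Filter(lambda lines: other.func(self.func(lines)))

    def __call__(self, lines):
        return self.func(lines)


upper = Filter(lambda lines: [line.upper() for line in lines])
first = Filter(lambda lines: lines[:1])

start = os.getcwd()
with cd(tempfile.gettempdir()):
    inside = os.getcwd()  # the temporary directory
back = os.getcwd()        # the original directory again

result = (upper | first)(['foo', 'bar'])
&lt;/pre&gt;&lt;/div&gt;
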
&lt;p&gt;Special methods are one of my favorite features of Python. One could argue that
this is the ultimate language customization, that not much more can be done.
But I'm here to tell you that there is more, that you can still go further. But first let
me tell you that I lied to you: the pipe and &lt;code&gt;remote()&lt;/code&gt; examples I just gave you
are not (only) implemented with special methods. In fact, I'm using a more extreme
resource: AST meddling.&lt;/p&gt;
&lt;p&gt;As with any other programming language, Python execution goes through the steps of a
compiler: tokenizing, parsing, proper compilation and execution. Luckily Python
gives us access
to the intermediate representation between the parsing and compilation steps,
known as the Abstract Syntax Tree, using the &lt;code&gt;ast.parse()&lt;/code&gt; function. Then we can
modify this tree at will and use other functions and classes in the &lt;code&gt;ast&lt;/code&gt;
module to make sure these modifications are still a valid AST, and finally use
&lt;code&gt;compile()&lt;/code&gt; and &lt;code&gt;exec()&lt;/code&gt; to execute the modified tree.&lt;/p&gt;
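&lt;p&gt;A minimal sketch of that round trip, using a toy transformation (the
&lt;code&gt;MultToAdd&lt;/code&gt; class and the &lt;code&gt;'sketch'&lt;/code&gt; file name are made up for the example)
that turns every multiplication into an addition:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;import ast

source = "answer = 6 * 7"

tree = ast.parse(source)  # tokenizing and parsing


class MultToAdd(ast.NodeTransformer):
    """Toy transformation: turn every `*` into `+`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Mult):
            node.op = ast.Add()
        return node


tree = MultToAdd().visit(tree)
ast.fix_missing_locations(tree)  # fill in line/column info for new nodes

code = compile(tree, filename='sketch', mode='exec')
namespace = {}
exec(code, namespace)  # namespace['answer'] is now 6 + 7
&lt;/pre&gt;&lt;/div&gt;
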
&lt;p&gt;For instance, this is how I implemented &lt;code&gt;|&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CrazyASTTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NodeTransformer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;visit_BinOp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;BitOr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# BinOp( left=Call1(...), op=BitOr(), right=Call2(...) )&lt;/span&gt;
            &lt;span class="n"&gt;update_keyword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'_out'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Pipe'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt;
            &lt;span class="n"&gt;update_keyword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="n"&gt;keyword&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'_bg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'True'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt;
            &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fix_missing_locations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;update_keyword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'_in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;
            &lt;span class="c1"&gt;# Call2(_in=Call1(...), _out=Pipe, _bg=True)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I used &lt;code&gt;Call1&lt;/code&gt; and &lt;code&gt;Call2&lt;/code&gt; to show which is which; they're really &lt;code&gt;ast.Call&lt;/code&gt;
objects, which represent a function call. Of course, once I rewrote the tree,
most of the code for how the commands are called and how the pipe is set up
is in the class that implements commands, which is considerably more complex.&lt;/p&gt;
&lt;p&gt;For &lt;code&gt;remote()&lt;/code&gt; I did something even
more extreme: I took the AST of the body of the context manager, I &lt;code&gt;pickle()&lt;/code&gt;'d
it, added it as an extra parameter to &lt;code&gt;remote()&lt;/code&gt;, and replaced it with &lt;code&gt;pass&lt;/code&gt; as
the body of the context manager, so the AST becomes the equivalent of:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;remote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ast_of_body_pickled&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When the context manager really executes, I send the AST over the &lt;code&gt;ssh&lt;/code&gt; connection
together with the &lt;code&gt;locals()&lt;/code&gt; and &lt;code&gt;globals()&lt;/code&gt; (its execution context), unpickle it on the
other side, restore the context, continue with the &lt;code&gt;compile()/exec()&lt;/code&gt; dance, and
finally repickle the context and send it back. This way the body can see its
scope, and its modifications to it are seen on the original machine.&lt;/p&gt;
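&lt;p&gt;The round trip can be sketched in a single process, with no &lt;code&gt;ssh&lt;/code&gt; involved. This
is only an illustration under simplifying assumptions: the context here is a plain
&lt;code&gt;dict&lt;/code&gt; holding picklable values, while real &lt;code&gt;locals()&lt;/code&gt; and &lt;code&gt;globals()&lt;/code&gt; usually
contain modules and other objects that need extra care:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;import ast
import pickle

# "local" side: parse the body and pickle it together with its context
body = ast.parse("total = sum(values)\nvalues.append(total)")
context = {'values': [1, 2, 3]}
payload = pickle.dumps((body, context))

# ... here the payload would travel over the ssh connection ...

# "remote" side: unpickle, restore the context, compile and execute
remote_body, remote_context = pickle.loads(payload)
code = compile(remote_body, filename='remote', mode='exec')
exec(code, remote_context)

# exec() adds __builtins__ to the globals dict; drop it before pickling back
remote_context.pop('__builtins__', None)
reply = pickle.dumps(remote_context)

# back on the "local" side: the body's modifications become visible
context.update(pickle.loads(reply))
&lt;/pre&gt;&lt;/div&gt;
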
&lt;p&gt;And that should be it. We reached the final frontier of language customization,
while maintaining compatibility, through the AST, with the original interpreter...&lt;/p&gt;
&lt;p&gt;Or did we? What else could we do? We certainly can't&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt; modify the compiler
or the execution Virtual Machine, and we already modify the AST. Can we do something
with Python's tokenizer or parser? Well, like the compiler and the VM, they're written in
C, and modifying them would force us to fork the interpreter, with all the
drawbacks of maintaining it. But can we make another parser?&lt;/p&gt;
&lt;p&gt;On one hand, the Python standard library provides a couple of modules to implement
your own parsers: &lt;code&gt;tokenize&lt;/code&gt; and &lt;code&gt;parser&lt;/code&gt;. If we're inventing a new language,
this is one way to go, but if we just want a few minor changes to the original
Python language, we must implement the whole tokenizer/parser pair. Do we have
other options?&lt;/p&gt;
&lt;p&gt;There &lt;em&gt;is&lt;/em&gt; another, but not a simple one. &lt;code&gt;pypy&lt;/code&gt; is, among other things,
a Python implementation written entirely in (r)Python. This implementation runs
under Python legacy (2.x), but it can parse and run current Python (3.x)
syntax&lt;sup id="fnref:4"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fn:4"&gt;4&lt;/a&gt;&lt;/sup&gt;. This implementation includes the tokenizer, the parser, its own AST
implementation&lt;sup id="fnref:2"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fn:2"&gt;2&lt;/a&gt;&lt;/sup&gt;, and, of course, a compiler and the VM. This is all free software,
so we can&lt;sup id="fnref:3"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fn:3"&gt;3&lt;/a&gt;&lt;/sup&gt; take the tokenizer/parser combination, modify it
at will, and as long as we produce a valid (c)Python AST, we can still execute it
in the cPython compiler/VM combination.&lt;/p&gt;
&lt;p&gt;There are three main reasons to modify this code. First, to make it produce a
valid cPython AST,
we will need to modify it a lot; cPython's &lt;code&gt;compile()&lt;/code&gt; function accepts only ASTs
built with instances of the classes from the &lt;code&gt;ast&lt;/code&gt; module (or &lt;code&gt;str&lt;/code&gt; or &lt;code&gt;bytes&lt;/code&gt;&lt;sup id="fnref:5"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fn:5"&gt;5&lt;/a&gt;&lt;/sup&gt;),
it does not indulge in duck typing. &lt;code&gt;pypy&lt;/code&gt; produces ASTs with instances of its
own implementation of the &lt;code&gt;ast&lt;/code&gt; module; rewriting the code is tiresome but not
difficult.&lt;/p&gt;
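&lt;p&gt;The lack of duck typing is easy to check. In this small sketch, &lt;code&gt;FakeModule&lt;/code&gt; is
a made-up class that merely imitates the shape of &lt;code&gt;ast.Module&lt;/code&gt;; &lt;code&gt;compile()&lt;/code&gt; refuses
it with a &lt;code&gt;TypeError&lt;/code&gt;, while the same empty module built with the real class is accepted:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;import ast


class FakeModule:
    """Looks a bit like ast.Module, but is not an instance of it."""
    _fields = ('body', 'type_ignores')

    def __init__(self):
        self.body = []
        self.type_ignores = []


try:
    compile(FakeModule(), filename='fake', mode='exec')
    accepted = True
except TypeError:
    # compile() only takes str, bytes or real AST objects
    accepted = False

# the same (empty) module built with the real class compiles fine
real = ast.Module(body=[], type_ignores=[])
code = compile(real, filename='real', mode='exec')
&lt;/pre&gt;&lt;/div&gt;
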
&lt;p&gt;Second, on the receiving side, if we're trying to parse and execute a
particular version of Python, we must run it at least under the oldest Python
version that handles that syntax. For
instance, when I wanted to support f-strings in my language, I had no option but
to run the language on top of Python-3.6, because that's when they were
introduced. This means that a big part
of the modifications we have to make is converting the code to Py3.&lt;/p&gt;
&lt;p&gt;Finally, we must
modify it so it accepts the syntax we want; otherwise, why bother? :)&lt;/p&gt;
&lt;p&gt;So what do we get with all this fooling around? Now we can modify
the syntax so, for instance, we can accept expressions as keyword argument names,
or remove the restriction that keyword and positional arguments must be in a
particular order:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;grep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;quiet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'mdione'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'/etc/passwd'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;After we modify the parser, it's able to generate an AST, but this AST is
invalid because the compiler will reject it. So we still have to resort to
more AST meddling before passing it to the compiler. What I did for the parameter
meddling was to create an &lt;code&gt;o()&lt;/code&gt; function which accepts a key and a value, so
&lt;code&gt;--quiet=True&lt;/code&gt; becomes the AST equivalent of &lt;code&gt;o('--quiet', True)&lt;/code&gt;. Once we've
finished this meddling, the original, official, unmodified interpreter will
happily execute our monster.&lt;/p&gt;
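&lt;p&gt;A rough sketch of that last step, assuming the modified parser has already handed
us the option and the positional arguments. The &lt;code&gt;o()&lt;/code&gt; and &lt;code&gt;grep()&lt;/code&gt; functions below
are simple stand-ins written for this example, not the real ones; the point is only that
the resulting tree is built from plain &lt;code&gt;ast&lt;/code&gt; classes, so the stock compiler accepts it:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;import ast


def o(key, value):
    """Hypothetical stand-in: keep an option name and its value together."""
    return (key, value)


def grep(*args):
    """Stand-in for a command; just returns what it was called with."""
    return args


# what the modified parser saw: the option --quiet=True followed by two
# positional arguments; here it is simply written down by hand
option = ('--quiet', True)
positional = ['mdione', '/etc/passwd']

# build the AST equivalent of: grep(o('--quiet', True), 'mdione', '/etc/passwd')
o_call = ast.Call(func=ast.Name(id='o', ctx=ast.Load()),
                  args=[ast.Constant(value=v) for v in option],
                  keywords=[])
call = ast.Call(func=ast.Name(id='grep', ctx=ast.Load()),
                args=[o_call] + [ast.Constant(value=p) for p in positional],
                keywords=[])
tree = ast.fix_missing_locations(ast.Expression(body=call))

# the unmodified compiler and VM take it from here
result = eval(compile(tree, filename='sketch', mode='eval'),
              {'o': o, 'grep': grep})
&lt;/pre&gt;&lt;/div&gt;
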
&lt;p&gt;All of these techniques are used in &lt;code&gt;ayrton&lt;/code&gt; in some way or another, even the
first one: I use &lt;code&gt;python3 -m unittest discover ayrton&lt;/code&gt; to run the unit tests!&lt;/p&gt;
&lt;div class="footnote"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Well, technically we &lt;strong&gt;can&lt;/strong&gt;, it's free software, remember! &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fnref:1" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;The cPython AST, while being part of the standard library, is not guaranteed
  to be stable from version to version, so we can't really consider it as part
  of the API. I think this is the reason why other implementations took the liberty
  to do it their own way. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fnref:2" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;... as long as we respect the license. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fnref:3" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;In fact some of the work is implemented in the &lt;code&gt;py3.5&lt;/code&gt; branch, not yet merged
  into &lt;code&gt;default&lt;/code&gt;. I'm using the code from this branch. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fnref:4" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;This would also be another avenue: generate the final bytecode ourselves (note that
  &lt;code&gt;compile()&lt;/code&gt; treats &lt;code&gt;bytes&lt;/code&gt; as source code, not as bytecode),
  but that looks like a lot of effort, way more than what I explain here. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/#fnref:5" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description><category>ayrton</category><category>python</category><guid>https://www.grulic.org.ar/~mdione/glob/posts/customizing-the-python-language/</guid><pubDate>Sun, 10 Feb 2019 20:06:09 GMT</pubDate></item><item><title>ayrton 0.9.1</title><link>https://www.grulic.org.ar/~mdione/glob/posts/ayrton-0.9.1/</link><dc:creator>Marcos Dione</dc:creator><description>&lt;p&gt;Last night I realized the first point. Checking today I found the latter.
Early, often, go!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ayrton-0.9&lt;/code&gt; has debug on. It will leave lots of files lying around your file system.&lt;/li&gt;
&lt;li&gt;Modify the release script so that this is never allowed again.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;make install&lt;/code&gt; was not running the tests.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Get it on &lt;a href="https://github.com/StyXman/ayrton/releases/tag/release-0.9.1"&gt;github&lt;/a&gt; or
&lt;a href="https://pypi.python.org/pypi/ayrton/0.9.1"&gt;pypi&lt;/a&gt;!&lt;/p&gt;</description><category>ayrton</category><category>python</category><guid>https://www.grulic.org.ar/~mdione/glob/posts/ayrton-0.9.1/</guid><pubDate>Wed, 07 Dec 2016 13:10:40 GMT</pubDate></item></channel></rss>