Creating your own importer in Python

ayrton has always been able to use any Python module, package or extension as long as it is in a directory in sys.path, but trying to solve a bigger bug, I realized that there was no way to use ayrton modules or packages. Having only laterally heard about the new importlib module and the new mechanism, I sat down and read more about it.

The best source (or at least the easiest to find) is possibly what Python's reference says about the import system, but I have to be honest: it was not an easy read. Next week I'll sit down and see if I can improve it a little. So, for those out there who, like me, might be having some troubles understanding the mechanism, here's how I understand the system works (ignoring deprecated APIs and corner cases or even relative imports; I haven't used or tried those yet):

def import_single(full_path, parent=None, module=None):
    # try this cache first
    if full_path in sys.modules:
        return sys.modules[full_path]

    # if not, try all the finders
    for finder in sys.meta_path:
        if parent is not None:
            spec = finder.find_spec(full_path, parent.__path__, target)
        else:
            spec = finder.find_spec(full_path, None, target)

        # if the finder 'finds' ('knows how to handle') the full_path
        # it will return a loader
        if spec is not None:
            loader = spec.loader

            if module is None and hasattr(loader, 'create_module'):
                module = loader.create_module(spec)

            if module is None:
                module = ModuleType(spec.name)  # let's assume this creates an empty module object
                module.__spec__ = spec

            # add it to the cache before loading so it can referenced from it
            sys.modules[spec.name] = module
            try:
                # if the module was passed as parameter,
                # this repopulates the module's namespace
                # by executing the module's (possibly new) code
                loader.exec_module(module)
            except:
                # clean up
                del sys.modules[spec.name]
                raise

            return module

    raise ImportError


def import (full_path, target=None):
    parent= None

    # this code iterates over ['foo', 'foo.bar', 'foo.bar.baz']
    elems = full_path.split('.')
    for partial_path in [ '.'.join (elems[:i]) for i in range (len (elems)+1) ][1:]
        parent = import_single(partial_path, parent, target)

    # the module is loaded in parent
    return parent

A more complete version of the if spec is not None branch can be found in the Loading section of the reference. Notice that the algorithm uses all the finders in sys.meta_path. So which are the default finders?

In [^9]: sys.meta_path
Out[^9]:
[_frozen_importlib.BuiltinImporter,
 _frozen_importlib.FrozenImporter,
 _frozen_importlib_external.PathFinder]

Of those finders, the latter one is the one that traverses sys.path, and also has a hook mechanism. I didn't use those, so for the moment I didn't untangle how they work.

Finally, this is how I implemented importing ayrton modules and packages:

from importlib.abc import MetaPathFinder, Loader
from importlib.machinery import ModuleSpec
import sys
import os
import os.path

from ayrton.file_test import _a, _d
from ayrton import Ayrton
import ayrton.utils


class AyrtonLoader (Loader):

    @classmethod
    def exec_module (klass, module):
        # «the loader should execute the module’s code
        # in the module’s global name space (module.__dict__).»
        load_path= module.__spec__.origin
        loader= Ayrton (g=module.__dict__)
        loader.run_file (load_path)

        # set the __path__
        # TODO: read PEP 420
        init_file_name= '__init__.ay'
        if load_path.endswith (init_file_name):
            # also remove the '/'
            module.__path__= [ load_path[:-len (init_file_name)-1] ]

loader= AyrtonLoader ()


class AyrtonFinder (MetaPathFinder):

    @classmethod
    def find_spec (klass, full_name, paths=None, target=None):
        # TODO: read PEP 420 :)
        last_mile= full_name.split ('.')[-1]

        if paths is not None:
            python_path= paths  # search only in the paths provided by the machinery
        else:
            python_path= sys.path

        for path in python_path:
            full_path= os.path.join (path, last_mile)
            init_full_path= os.path.join (full_path, '__init__.ay')
            module_full_path= full_path+'.ay'

            if _d (full_path) and _a (init_full_path):
                return ModuleSpec (full_name, loader, origin=init_full_path)

            else:
                if _a (module_full_path):
                    return ModuleSpec (full_name, loader, origin=module_full_path)

        return None

finder= AyrtonFinder ()


# I must insert it at the beginning so it goes before FileFinder
sys.meta_path.insert (0, finder)

Notice all the references to PEP 420. I'm pretty sure I must be breaking something, but for the moment this works.