SQLAlchemy: failed to locate name could not determine join condition

There used to be a time where I used SQLAlchemy a lot. Then life took a turn and its use was left inside the bowels of barely maintained code. It kept on working, no matter how many times my distro updated it. That's good; good technology is the one you don't hear about anymore.

A few months back I wanted to add to my photo filtering app support for reading and writing the digikam database. I already use digikam for mass tagging and face recognition, and I wanted to add tagging to my filtering workflow. Remembering the good experience with SQLA in the past, I decided to use it again.

Initially I just loaded three tables: AlbumRoots, Albums and Images. The relationship between them is simple, because Albums are always relative to its (Album)Root, so there is no parent relationship there. Because I didn't like plurals or redundancy in my class names, I decided to call the classes Root, Album and Image. Then came the 'hardest' part, which is tags.

For tags I just needed two more tables: Tags and ImageTags. The former is obvious, except that this one has a parent relationship, so tags can be nested. The latter is just a join table with one column pointing to Tags and the other pointing to Images.

Having no primary key on that table means we can't use SQLA's ORM on it. And this is the first new thing1 I learned about SQLA. You can have Mapper classes that provide the full ORM experience, and you can have simpler Table instances (notice the difference here) that just declare tables that you might need to use otherwise, which is our case here. So I called my class Tag and the table instance is ImageTags. This is because I'm more interested in the Tags for an Image than the Images for a Tag.

I want my code to be as consistent as possible. If there are two or more ways to do one thing, I chose one and use it all over the code. So when SQLA can refer to tables and columns (the lowercase here is intentional) either by instance or as a string, I prefer instances. But I also wanted to have the declaration of the tags attribute for Image in the class' definition itself (see below), and because of a chicken-egg situation (Image needs ImageTags declared, and ImageTags needs Image declared), I decided to convert them all to strings.

Here's the resulting code:

ImageTags = Table(
    'ImageTags', metadata,
    Column('imageid', Integer, ForeignKey('Image.id'), nullable=False, index=True),
    Column('tagid', Integer, ForeignKey('Tag.id'), nullable=False, index=True),
    UniqueConstraint('imageid', 'tagid')
)


class Tag(Base):
    __tablename__ = 'Tags'
    __table_args__ = (
        UniqueConstraint('name', 'pid'),
    )

    id = Column(Integer, primary_key=True)
    pid = Column(Integer)
    # TODO: define parent
    name = Column(Text, nullable=False)
    icon = Column(Integer)
    iconkde = Column(Text)


class Image(Base):
    __tablename__ = 'Images'
    __table_args__ = (
        UniqueConstraint('album', 'name'),
    )

    id = Column(Integer, primary_key=True)
    album = Column(Integer, index=True)
    name = Column(Text, nullable=False, index=True)
    status = Column(Integer, nullable=False)
    category = Column(Integer, nullable=False)
    # modification_date = Column('modificationDate', DateTime)
    modification_date = Column('modificationDate', Text)
    file_size = Column('fileSize', Integer)
    hash = Column('uniqueHash', Text, index=True)
    tags = relationship('Tag', secondary='ImageTags')

This doesn't work. Can you figure out why? No? Let me give you a hint:

When initializing mapper mapped class Image->Images, expression 'ImageTags' failed to
locate a name ("name 'ImageTags' is not defined"). If this is a class name, consider
adding this relationship() to the <class 'digikam.Image'> class after both dependent
classes have been defined.

I have to say, this really baffled me. Of course, again, it's been a long time since I used SQLA in any serious setting, and the documentation is huge and sprawling. Reading it all over again looked like too much just for a simple situation that is well described. To me, failed to locate a name meant it couldn't find the name in the module. "But it's right there!", I shouted.

So this is the second thing I learned about SQLA. When you use a string for referencing a table or column (lower case intentional, again), you need to refer to the database names. And because few examples actually have different class/attribute vs table/fields names, it's hard to figure it out, and because I'm renaming heavily, I got stuck. So the simple fix is replacing 'ImageTags' by ImageTags.

And that's when I get the next error:

Could not determine join condition between parent/child tables on relationship
Image.tags - there are no foreign keys linking these tables via secondary table
'ImageTags'. Ensure that referencing columns are associated with a ForeignKey or
ForeignKeyConstraint, or specify 'primaryjoin' and 'secondaryjoin' expressions.

And again the same shout: "But it's right there!" But again, my ForeignKeys are using strings, so of course it has to be the database names and not the instances names. Instead, I declared ImageTags after Image and put instances instead of strings for consistency. The price to pay? Now I have to declare the relationship outside the class, and the same will happen for the parent attribute in Tag (see TODO :). The final code:

class Tag(Base):
    __tablename__ = 'Tags'
    __table_args__ = (
        UniqueConstraint('name', 'pid'),
    )

    id = Column(Integer, primary_key=True)
    pid = Column(Integer)
    # TODO: define parent
    name = Column(Text, nullable=False)
    icon = Column(Integer)
    iconkde = Column(Text)


class Image(Base):
    __tablename__ = 'Images'
    __table_args__ = (
        UniqueConstraint('album', 'name'),
    )

    id = Column(Integer, primary_key=True)
    album = Column(Integer, index=True)
    name = Column(Text, nullable=False, index=True)
    status = Column(Integer, nullable=False)
    category = Column(Integer, nullable=False)
    # modification_date = Column('modificationDate', DateTime)
    modification_date = Column('modificationDate', Text)
    file_size = Column('fileSize', Integer)
    hash = Column('uniqueHash', Text, index=True)
    # see below
    # tags = Relationship('Tag', secondary='t_ImageTags')


ImageTags = Table(
    'ImageTags', metadata,
    Column('imageid', Integer, ForeignKey(Image.id), nullable=False, index=True),
    Column('tagid', Integer, ForeignKey(Tag.id), nullable=False, index=True),
    UniqueConstraint('imageid', 'tagid')
)

Image.tags = Relationship('Tag', secondary=ImageTags)

  1. When I say "new", I mean "new to me". This could also mean "Once I knew, but I forgot, so this new to me now" :)