Bus routes with osm2pgsql flex backend

10 days ago Joachim Topf commited the new flex backend into osm2pgsql. This new backend gives you full control on what data goes where, and the opportunity to massage it a little bit before committing to the db. This is useful not only because we could move processing from the rendering time to the import time (which, let's be honest, we could already do some with the --tag-transform-script option), but also because we can move away from the 4 table setting of the original osm2pgsql mode (line, point, polygons, roads) and also ignore data we don't want in the db or even create new one.

This new backend works exactly like osm2pgsql. There are two stages; the first one goes through all the nodes, ways and relations (in that order), and the second one only through ways and relations. For each one of these types you can define a function, process_[type](). process_node() is only going to be called in stage 1, but process_way() and process_relation() are going to be called in both stages. The functions, of course, can figure out in which stage they are; my advise is to split the functions into process_node_stage1() and process_node_stage2() and call them from process_node() to make that more clear. An object is processed in stage 2 only if it was marked in stage 1. For more details, please read the docs, and definitely learn lua.

My first task for this is bus routes. Last time I put some of my few spare time on my fork I managed to display bus routes but it was ugly: overlapping dash based lines that were difficult to follow.

Compare that to a real map of a (quite small) bus network, and you get the idea that my take wasn't going to be useful.

Let's try to think how this can be achieved. What we need is a way to take each way and count how many bus lines are going through it. Bus lines are represented as relations, of which we can take the members. So the first step must be to find all the members of a relation. This can be done in process_relation_stage1(). We mark all the ways in a bus relation, and we also somehow associate the bus to the way.

In stage 2, we process all the marked ways (ways with buses), we count how many buses go through it, we create a new way for rendering each bus line, and we displace this new line based on which 'slot' it belongs to, so the lines are parallel.

I have a first version of such algorithm. It's based on one of the examples. Let's take it apart:

local tables = {}

tables.routes = osm2pgsql.define_way_table('routes', {
    { column = 'tags',   type = 'hstore' },
    { column = 'offset', type = 'float' },
    { column = 'ref',    type = 'text' },
    { column = 'colour', type = 'text' },
    { column = 'way',    type = 'linestring' },
})

We declare a table that will store ways, called routes. It will store the original tags of the way in hstore format, plus an offset column, that will tell the renderer how much to displace the line, plus a ref and a colour; and of course, the way itself.

pt_ways = {}

function osm2pgsql.process_relation(relation)
    -- Only interested in relations with type=route, route=bus and a ref
    -- TODO: other pt routes too
    if relation.tags.type == 'route' and relation.tags.route == 'bus' and relation.tags.ref then
        for _, member in ipairs(relation.members) do
            if member.type == 'w' then
                -- print(relation.id, member.ref, relation.tags.ref)
                if not pt_ways[member.ref] then
                    pt_ways[member.ref] = {}
                end

                -- insert() is the new append()
                table.insert(pt_ways[member.ref], relation)
                osm2pgsql.mark_way(member.ref)
            end
        end
    end
end

We process the relations. Like I said before, we take all the members that are ways, we store this bus line in an array indexed by the original way id, and we mark the way for processing in stage 2. Notice the the bus line's ref is in the tags.ref indexes (more on this later), while it's id is in the ref index. This last part was confusing to me.

function sort_by_ref(a, b)
    return a.tags.ref < b.tags.ref
end

function osm2pgsql.process_way(way)
    -- Do nothing for ways in stage1, it'll be in relations where the magic starts
    if osm2pgsql.stage == 1 then
        return
    end

    -- We are now in stage2
    local routes = pt_ways[way.id]
    table.sort(routes, sort_by_ref)

We do nothing for ways in stage 1, and in stage 2 we sort the routes to give them a consistent ordering when rendering consecutive ways.

    local line_width = 2.5
    local offset = 0
    local side = 1
    local base_offset
    local offset
    local index
    local ref
    local slot
    local shift

    if #routes % 2 == 0 then
        base_offset = line_width / 2
        shift = 1
    else
        base_offset = 0
        shift = 0
    end

    for index, route in ipairs(routes) do
        -- index is 1 based!
        slot = math.floor((index - shift) / 2)
        offset = (base_offset + slot * line_width) * side

        if side == 1 then
            side = -1
        else
            side = 1
        end

This is the part of the algo that calculates the offset. It was refined after a couple of iterations and it seems to work fine with odd and even amount of bus lines. line_width will be moved to the style later, so I can apply widths depending on ZL. In short, we're assigning slots from the center to the outside, alternating sides.

        -- TODO: group by colour
        -- TODO: generic line if no colour
        row = {
            tags = way.tags,
            ref = route.tags.ref,
            colour = route.tags.colour,
            offset = offset,
            geom = { create = 'line' }
        }

        tables.routes.add_row(tables.routes, row)
    end
end

And this is the end. We set the row values and add it to our table. It's that simple :) Now we run osm2pgsql with the flex backend:

$ osm2pgsql --cache 1024 --number-processes 4 --verbose --create --database mc --output=flex --style bus-routes.lua --slim --flat-nodes nodes.cache --hstore --multi-geometry --drop
osm2pgsql version 1.2.0 (1.2.0-248-gba17b0c) (64 bit id space)
Reading in file: monaco-latest.osm.pbf
Processing: Node(46k 0.4k/s) Way(3k 3.78k/s) Relation(10 10.00/s)  parse time: 125s
Node stats: total(46745), max(7199027992) in 124s
Way stats: total(3777), max(770935077) in 1s
Relation stats: total(215), max(10691624) in 0s
Entering stage 2...
Creating id indexes...
Creating id index on table 'routes'...
Creating id indexes took 0 seconds
Lua program uses 0 MBytes
Entering stage 2 processing of 458 ways...
Entering stage 2 processing of 0 relations...
Clustering table 'routes' by geometry...
Using native order for clustering
Creating geometry index on table 'routes'...
Analyzing table 'routes'...
All postprocessing on table 'routes' done in 0s.
Osm2pgsql took 129s overall

Now, for the render part, it's twofold; a layer definition...:

SELECT
    way,
    COALESCE(colour, 'purple') as color,
    ref as route,
    "offset"
FROM routes
ORDER BY ref

... and the style itself:

#routes {
  [zoom >= 14] {
    line-width: 1;
    line-color: @transportation-icon;
    line-join: round;
    line-offset: [offset];
    [zoom >= 17] {
      line-color: [color];
      line-width: 2;
    }
  }
}

Quite simple too! That's because all the data is already prepared for rendering. All the magic happens at import time. Here's the same region as before:

Now, two caveats: This last thing means that if you want to change the style, you most probably will need to reimport the data. It must have taken me some 20 iterations until I got the data in a way I could use for rendering, that's why I tested with an extract of Monaco :) I also used a separate db from the main render database, but maybe just another table would be enough.

Second, you have to specify all the data you want to save, there is no compatibility with the current rendering database, so you will also need to base your code on the compatible example. In my tests, I just imported the data the usual way:

$ osm2pgsql --cache 1024 --number-processes 4 --verbose --create --database mc --output=flex --style bus-routes.lua --slim --flat-nodes nodes.cache --hstore --multi-geometry --drop
osm2pgsql version 1.2.0 (1.2.0-248-gba17b0c) (64 bit id space)
Mid: loading persistent node cache from nodes.cache
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using lua based tag processing pipeline with script openstreetmap-carto.lua
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads
Reading in file: monaco-latest.osm.pbf
Processing: Node(46k 0.8k/s) Way(3k 3.78k/s) Relation(210 105.00/s)  parse time: 61s
Node stats: total(46745), max(7199027992) in 58s
Way stats: total(3777), max(770935077) in 1s
Relation stats: total(215), max(10691624) in 2s
Stopping table: planet_osm_nodes
Stopped table: planet_osm_nodes in 0s
Stopping table: planet_osm_ways
Stopped table: planet_osm_ways in 0s
Stopping table: planet_osm_rels
Stopped table: planet_osm_rels in 0s
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_line
Using native order for clustering
Using native order for clustering
Using native order for clustering
Using native order for clustering
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on planet_osm_point
Creating indexes on planet_osm_point finished
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating indexes on planet_osm_roads finished
All indexes on planet_osm_point created in 0s
Completed planet_osm_point
All indexes on planet_osm_roads created in 0s
Completed planet_osm_roads
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on planet_osm_polygon
Creating indexes on planet_osm_polygon finished
All indexes on planet_osm_polygon created in 0s
Completed planet_osm_polygon
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on planet_osm_line
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 0s
Completed planet_osm_line
Osm2pgsql took 64s overall

One thing to notice is that it took half of the time of the flex backend.