How did it start? Well, it actually started a long time ago, maybe a year ago, most probably more. I got myself tumbling down the beautiful rabbit hole of Blender-based mapping. The idea is very simple: if you have a DEM, you can build a 3D representation of the terrain and use a renderer to build your map. To me the most striking thing about those maps was not their 3D-ness (which some are starting to find tiresome, and I agree), but the shadows. I've been pursuing the best shadow for a while, and this seemed like the perfect fit.
So, like I said, one year ago or so I took "the" Blender relief tutorial and ran with it. I got to the point where I could reproduce it with a 1°x1°, 3600x3600px DEM from mapzen, but when I tried to automate it, I found out that Blender has a Python console where it prints the commands equivalent to the actions you make in the UI; the resulting script was too horrible to my eyes, though, and I ran out of breath (another of those cases of the perfect being the enemy of the good).
Then a few days ago I read that first link and built up some steam. In fact, two passages in it lit the fire:
Most of these use a tool called Blender, an extremely powerful open-source tool for all kinds of 3D modeling and rendering. A few cartographers use other tools, such as Aerialod, or R's Rayshader plugin.
R! I can easily automate this!
If we stick to a very zoomed-out map, or if we have a really huge computer running Blender, we could try to do a hillshade for the entire world, and then slice that up for our map tiles. But there’s no way we could do this at a high-enough resolution so you could zoom all the way in, as in the sample tiles above.
Challenge accepted! (Spoiler: I'm not there yet).
I tried Rayshader. I wanna be honest: it's easy and quick, but I didn't like the results. It seemed like no matter how high you put the sun, it always drew very long shadows. So despite its pragmatism, I left it aside.
So I picked up what I did in the past and tried to apply it to a map. I re-rendered everything and applied it to my style. The outcome was encouraging:
To start with, I probably did that render with too few render samples, so the output looks grainy. Second, the material color is too bright, so the height tints are washed out. Still, we can see the big shadow cast over the valley some 3200m below Mont Blanc/Monte Bianco.
This proved to be a good place to test the method, because of the great difference between the valley and the peak casting the shadow over it, and that led me to think: are there more extreme places in the world? An easy bet is yes, and the place to look for them was the Himalayas. The Aconcagua could have been a good contender, but the valley at its SE is already at some 4550m. Surprisingly, the way I searched for a good spot was to use existing maps with the usual hillshading technique, looking for big dark spots, especially those wide in the NW-SE direction. I first found my spot in Tukuche Peak, which looms some 4350m above the Gandaki River, and then the nearby Dhaulagiri, which is even some 1250m higher, clocking in at 8167m. Here's how they look (not the big structure in the upper left, but the bigger grey [8000m+] blob to the East of it; the river snakes on the right all the way down):
I had to add 3 more color bands to my style and reshuffle them because I had never rendered an 8000er before, so the colors were haphazardly concocted for this render and are not definitive. At least it lets you behold the grandiosity of that rock jutting through thousands and thousands of meters with very steep sides.
Time to get real. I usually render regions where I'll be going, and next time that's the Upper Segre Valley, so I rendered N42E001-2 in one go. That's 7201x4884px after reprojecting to WebMercator (and compensating as described in the second link!), so some 35Mpx. Blender took some 44m+ on a 4.5yo medium-high spec'ed laptop at 200 render samples, which means that I can continue to render small regions this way, but for the moment I won't be applying this technique to the whole of Europe.
Up to here I was just applying the same style in QGIS, which has been an indispensable tool for developing this style. But trying these TIFFs in mapnik for the first time needed an extra step. Well, two, in fact. Blender does not save the TIFFs georeferenced, so you have to copy that data from the original DEM. For that, use gdal_edit.py -a_srs ... -a_ullr ... with the right EPSG and the data from the output of gdalinfo.
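If you'd rather do that from Python, here's a small rasterio-based sketch of the same idea (file names are hypothetical, and it assumes the render covers exactly the same extent as the source DEM):

import rasterio
from rasterio.transform import from_bounds

# copy the CRS and the corner coordinates of the source DEM onto the
# (un-georeferenced) Blender output; file names are hypothetical
with rasterio.open('N42E001-blender.tif') as render, \
     rasterio.open('N42E001-dem.tif') as dem:
    data = render.read()
    profile = render.profile
    # same extent as the DEM, but scaled to the render's own width/height
    transform = from_bounds(*dem.bounds, render.width, render.height)
    profile.update(crs=dem.crs, transform=transform)

with rasterio.open('N42E001-blender-geo.tif', 'w', **profile) as out:
    out.write(data)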
Next, for some reason, Blender always uses 16-bit integers, even when explicitly told to use 8. This little snippet takes care of that:
import imageio

# Blender wrote 16-bit samples; scale them down to 8 bits
image = imageio.imread('pirinoak-blender-10x.tif')
image = image / 256
image = image.astype('uint8')
imageio.imwrite('pirinoak-blender-10x-8bits.tif', image)
Thank $DEITY (and developers!) for good libraries.
The first thing I noticed is that we have been lied to by maps (again!) for a long time. Most hillshading algorithms use a 45°-high sun (the direction does not matter much). But if you think about it, how many mountains have sides steeper than 45°? According to a (real, not like me!) cartographer friend, for continental Argentina it's less than 1% at 30 arcsec resolution (note that SRTM is 1 arcsec). Still, some shadows are there, and they help us recognize slope direction (and we got used to that). And now we're asking a raytracing program to calculate real shadows? The result I initially got was underwhelming, even when I was already asking Blender to exaggerate height by 5x!:
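To get a feel for that kind of number on your own data, here's a rough numpy/rasterio sketch (the file name, latitude and pixel sizes are assumptions, and it ignores reprojection subtleties):

import math

import numpy as np
import rasterio

# hypothetical 1 arcsec tile; pixel sizes are approximations for its latitude
with rasterio.open('N45E006.hgt') as src:
    dem = src.read(1).astype(float)

px = 30.92 * math.cos(math.radians(45.5))  # E-W pixel size in meters
py = 30.92                                 # N-S pixel size in meters

dzdy, dzdx = np.gradient(dem, py, px)
slope = np.degrees(np.arctan(np.hypot(dzdx, dzdy)))
print(f"{(slope > 45).mean() * 100:.2f}% of pixels are steeper than 45°")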
So, I bit the bullet and went all in with 10x:
Much better, but not definitive. I still have to render Dhaulagiri again, and also some region I already know well from having been there a lot.
Now some notes about "the" Blender relief tutorial. I followed it to the letter, but with my experience I had to make some changes. One you already know: using a displacement scale of 10x instead of 0.3. I have no exact idea why his initial renderings were so spiky, but I suspect that the DEM grid unit was not meters.
Second, since that first Mont Blanc/Monte Bianco render, we know the color is too bright. I lowered it to 0.6 (and later I found that that's what he actually suggests at the end of the plane section), then compared the grey in a plain (#b5b5b5) to what GDAL outputs and compensated with a simple coefficient. The final value is 0.402.
Third, I was having issues rendering: I was getting a lot of terracing. After a long chat with Viktor_smg from blender.chat/support, they figured out that the sRGB color space in the Image Texture is broken and that I should use XYZ instead. This meant installing Blender by hand instead of relying on the one in Debian Unstable, which is too old and does not have it. They also gave me pointers on how to automate it.
Last, you can't apply this technique DEM by DEM, because you want the shadows from the neighbouring tiles to spill over the current one. That link shows how to render the tile and its 8 neighbours, but I think you can optimize it in two ways. First, since shadows come from the NW, just add the tiles that lie in that general direction. Second, no shadow is cast over tens of kilometers, so you could even get away with just adding a smaller band around the tile. A sketch of the first idea follows.
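A tiny Python sketch of that first optimization (the tile-naming convention and the helper are my own assumptions):

def shadow_neighbours(lat, lon):
    """Given the SW corner of a 1x1 degree tile, return the SW corners of the
    neighbouring tiles that can cast shadows into it when the light comes
    from the NW: the tiles to the N, NW and W."""
    return [(lat + 1, lon - 1),  # NW
            (lat + 1, lon),      # N
            (lat,     lon - 1)]  # W

# for N42E001:
print(shadow_neighbours(42, 1))  # [(43, 0), (43, 1), (42, 0)]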
That's it for now. The next step is to automate all this and publish it. $DEITY knows when that will happen.
python openstreetmap gdal elevation hillshading imageio gis dem
It all started with the following statement: "my renders are too slow". There were three issues at the root: osm-carto has become more complex (nothing I can do there), my compute resources are too meagre (but upgrading them would cost money I don't want to spend on just one hobby) and the terrain rasters were being reprojected all the time.
My style uses three raster layers to provide a sense of relief: one providing height tints, one enhancing that color on slopes and desaturating it on valleys, and a last one providing hill shading. All three were derived from the same source DEM, which is projected in WGS 84 a.k.a. EPSG 4326. The style is of course in Pseudo/Web Mercator, a.k.a. EPSG 3857; hence the constant reprojection.
So my first reaction was: reproject first, then derive the raster layers. This single step made all hell break loose.
I usually render Europe. It's the region where I live and where I travel to the most. I'm using relief layers because I like mountains, so I'm constantly checking how they look. And with this change, the mountains up North (initially I noticed it around northern Scandinavia) looked all washed out:
The reason behind this is that GDAL's hill and slope shading algorithms don't take the projection into account; they just use the pixel values as present in the dataset and the declared pixel size. The DEMs I'm using have a resolution of 1 arc second per pixel. WGS84 assumes an ellipsoid of 6_378_137m of radius (plus a flattening parameter which we're going to ignore), which means a circumference at the Equator of:
In [1]: import math
In [2]: radius = 6_378_137 # in meters
In [3]: circunf = radius * 2 * math.pi # 2πr
In [4]: circunf
Out[4]: 40_075_016.68557849 # in meters
One arc second is hence this 'wide':
In [5]: circunf / 360 / 60 / 60
Out[5]: 30.922080775909325 # in meters
In fact, the DEMs advertise exactly that:
mdione@diablo:~/src/projects/osm/data/height/mapzen$ gdalinfo N79E014.hgt
Size is 3601, 3601
Coordinate System is:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Pixel Size = (0.000277777777778,-0.000277777777778)
Well, not exactly that, but we can derive it 'easily'. Deep down there, the declared unit is degree (ignore the metre in the datum; that's the unit for the numbers in the ELLIPSOID) and its size in radians (just in case you aren't satisfied with the amount of conversions involved, I guess). The projection declares degrees as the unit for both axes, the pixel size is declared in the units of the projection, and 1 arc second is:
In [6]: 1 / 60 / 60 # or 1/3600
Out[6]: 0.0002777777777777778 # in degrees
which is exactly the value declared in the pixel size[1][2]. So one pixel is one arc second and one arc second is around 30m at the Equator.
The 'flatness' problem arises because all these projections assume all latitudes are of the same width; for them, a degree at the Equator is as 'wide' as one at 60°N, which it is not: in reality it's twice as wide, because the width of a degree decreases with the cosine of the latitude, and math.cos(math.radians(60)) == 0.5[3]. The mountains we saw in the first image are close to 68°N; the cosine up there is:
In [7]: math.cos(math.radians(68))
Out[7]: 0.37460659341591196
So pixels up there are no longer around 30m wide, but only about 30 * 0.37 ≃ 11m!
How does this relate to the flatness? Well, like I said, GDAL's algorithms use the declared pixel size everywhere, so at f.i. 68°N it thinks the pixel is around 30m wide when it's only around 11m wide. They're miscalculating by a factor of almost 3! Also, the Mercator projection is famous for stretching not only in the East-West direction (longitude) but also in latitude, at the same pace, based on the inverse of the cosine of the latitude:
stretch_factor = 1 / math.cos(math.radians(lat))
So all the mountains up there 'look' to the algorithms as 3 times wider and 'taller' ('tall' here means in the latitude, N-S direction, not height AMSL), hence the washing out.
In that image we have an original mountain 10 units wide and 5 units tall; but gdal thinks it's 20 units wide, so it looks flat.
Then it hit me: it doesn't matter what projection you are using, unless you calculate slopes and hill shading on the sphere (or spheroid, but we all want to keep it simple, right? Otherwise we wouldn't be using WebMerc all over :), the numbers are going to be wrong.
The Wandering Cartographer mentions this and goes around the issue by doing two things: always using DEMs where the units are meters, and always using a local projection that doesn't deform as much. He can do that because he produces small maps. He also concedes he is using the 111_120 constant when the unit is degrees. I'm not really sure how that number is calculated, but I suspect it has something to do with the flattening parameter, and also some level of rounding. It seems to have originated from gdaldem's manpage (search for the -s option). Let's try something:
In [8]: m_per_degree = circunf / 360
In [9]: m_per_degree
Out[9]: 111_319.49079327358 # in meters
That's how wide a degree is at the Equator. Close enough, I guess.
But for world-wide, or at least continent-wide, webmappers like me, those suggestions are not enough; we can't choose a projection that doesn't deform, because at those scales they all inevitably do, and we can't use a single scaling factor either.
gdal's algorithm uses the 8 neighboring pixels of a pixel to determine slope and aspect. The slope is calculated from a sum of those pixels' values divided by the cell size. We can't change the cell size that gdal uses, but we can change the values :) That's the third triangle up there: the algorithm assumes a base that is X times bigger than the real one, so we just make the triangle as many times taller. X is just 1 / math.cos(math.radians(lat)).
If we apply this technique to each DEM file individually, we have to choose a latitude for it. One option is to take the latitude at the center of the file. It makes your code more complicated, especially if you're using bash or, worse, make, because you have to pass a different value for -s to each file, but you can always use a little bit of Python or any other high-level, interpreted language; see the sketch below.
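A minimal sketch of computing that per-file factor (assuming the DEM tile is already in EPSG:4326, so its bounds are in degrees); if I read gdaldem's -s handling right, dividing the usual scale value by this factor should be equivalent to multiplying all the heights by it:

import math
import sys

import rasterio

# per-file compensation: X = 1 / cos(latitude at the center of the file);
# assumes the DEM is in EPSG:4326 so the bounds are already in degrees
with rasterio.open(sys.argv[1]) as dem:
    center_lat = (dem.bounds.top + dem.bounds.bottom) / 2

coef = 1 / math.cos(math.radians(center_lat))
print(f"{sys.argv[1]}: center latitude {center_lat:.3f}, compensation factor {coef:.3f}")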
But what happens at the seams where one tile meets another? Well, since each file has its own latitude, two neighbouring values, one in one DEM file and the other in the next, will have different compensation values. So, when you jump from a ~0.39 factor to a ~0.37 one at 68° of latitude, you start to notice it, and those are not even the northernmost mountains there are.

The next step would be to start cutting DEM tiles into smaller ones, so the 'jumps' in between are less noticeable... but can we make this continuous? Yes, we can! We simply compensate each pixel based on its own latitude.
This image cheats a little; this one is based on EU-DEM and not mapzen like the original. You can see where the DEM file ends because the slope and hillshading effects stop at the bottom left. I still have to decide which dataset I will use; at some point I decided that EU-DEM is not global enough for me, and that it's probably better to use a single dataset (f.i. mapzen) than a more accurate but partial one, but since I usually render the regions I need, I might as well change my workflow to easily change the DEM dataset.
This approach works fine as shown, but it has at least one limitation. The data type used in those TIFFs is Int16, meaning signed ints 16 bits wide, so values can go from -32768 to 32767. We could only compensate Everest by a factor of a bit less than 4, but luckily we don't have to: you would have to be at around 75° of latitude before the compensated value became bigger than the maximum the type can represent. But at around 84° we get a compensation of 10x, and from there it quickly gets worse:
75: 3.863
76: 4.133
77: 4.445
78: 4.809
79: 5.240
80: 5.758
81: 6.392
82: 7.185
83: 8.205
84: 9.566
85: 11.474
86: 14.336
87: 19.107
88: 28.654
89: 57.299 # we don't draw this far, the cutout value is around 85.5 IIRC
90: ∞ # :)
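For reference, that table is just:

import math

for lat in range(75, 90):
    print(f"{lat}: {1 / math.cos(math.radians(lat)):.3f}")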
The solution could be to use a floating point type, but that would make all calculations more expensive (though definitely not as expensive as reprojecting on the fly). Or maybe we can use a bigger integer type. Both solutions would also use more disk space and require more memory. gdal currently supports band types of Byte, UInt16, Int16, UInt32, Int32, Float32, Float64, CInt16, CInt32, CFloat32 and CFloat64 for reading and writing.
Another thing is that this works fine and fast for datasets like SRTM and mapzen, because I can compensate whole raster lines in one go, as all their pixels have the same latitude; but for EU-DEM I have to compensate every single pixel, and it becomes slow. I could try to parallelize it (I bought a new computer which has 4x the CPUs I had before, and 4x the RAM too :), but I first have to figure out how to write TIFFs incrementally; currently I have to hold the whole output TIFF in memory before dumping it into a file.
Here's the code for pixel-by-pixel processing; converting it to do row-by-row processing for f.i. mapzen actually means removing some code :)
#! /usr/bin/env python3

import sys
import math

import rasterio
import pyproj
import numpy

in_file = rasterio.open(sys.argv[1])
band = in_file.read(1)  # yes, we assume only one band

out_file = rasterio.open(sys.argv[2], 'w', driver='GTiff',
                         height=in_file.height, width=in_file.width,
                         count=1, dtype=in_file.dtypes[0],
                         crs=in_file.crs, transform=in_file.transform)
out_data = []

transformer = pyproj.Transformer.from_crs(in_file.crs, 'epsg:4326')

# scan every line in the input
for row in range(in_file.height):
    if (row % 10) == 0:
        print(f"{row}/{in_file.height}")

    line = band[row]

    # EU-DEM does not provide 'nice' 1x1 rectangular tiles,
    # they use a specific projection that become 'arcs' in anything 'rectangular'
    # so, pixel by pixel
    for col in range(in_file.width):
        # rasterio does not say where do the coords returned fall in the pixel
        # but at 30m tops, we don't care
        y, x = in_file.xy(row, col)

        # convert back to latlon
        lat, _ = transformer.transform(y, x)

        # calculate a compensation value based on lat
        # real widths are pixel_size * cos(lat), we compensate by the inverse of that
        coef = 1 / math.cos(math.radians(lat))

        line[col] *= coef

    out_data.append(line)

# save in a new file.
out_file.write(numpy.asarray(out_data, in_file.dtypes[0]), 1)
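For regular 1°x1° lat/lon tiles like SRTM or mapzen, every pixel in a row shares the same latitude, so the inner loop collapses into a single multiplication per row. A sketch of that variant (it assumes the DEM is in EPSG:4326 and north-up):

#! /usr/bin/env python3
# row-by-row variant: only valid for DEMs in EPSG:4326 on a regular, north-up
# grid (SRTM, mapzen), where every pixel in a row shares the same latitude

import sys
import math

import rasterio

in_file = rasterio.open(sys.argv[1])
# work in float so the in-place multiplication doesn't choke on int16
band = in_file.read(1).astype('float64')

out_file = rasterio.open(sys.argv[2], 'w', driver='GTiff',
                         height=in_file.height, width=in_file.width,
                         count=1, dtype=in_file.dtypes[0],
                         crs=in_file.crs, transform=in_file.transform)

for row in range(in_file.height):
    # latitude of this row; xy() returns the coords of the pixel center
    _, lat = in_file.xy(row, 0)
    band[row] *= 1 / math.cos(math.radians(lat))

out_file.write(band.astype(in_file.dtypes[0]), 1)
out_file.close()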
That's it!
python openstreetmap gdal elevation hillshading rasterio pyproj gis dem crs
[1] Except for the negative value. That has to do with the fact that pixels count from top to bottom, but degrees from South (negative) to North.
[2] The other number, 0.0174532925199433, is how much a degree is in radians.
[3] All that extra math is converting degrees into radians so cos() is happy.
10 days ago Joachim Topf committed the new flex backend into osm2pgsql.
This new backend gives you full control over what data goes where, and the opportunity to massage it a little before committing it to the db. This is useful not only because we can move processing from rendering time to import time (which, let's be honest, we could already do to some extent with the --tag-transform-script option), but also because we can move away from the 4-table setup of the original osm2pgsql mode (line, point, polygons, roads), ignore data we don't want in the db, or even create new data.
This new backend works exactly like osm2pgsql does. There are two stages; the first one goes through all the nodes, ways and relations (in that order), and the second one only through ways and relations. For each of these types you can define a function, process_[type](). process_node() is only going to be called in stage 1, but process_way() and process_relation() are going to be called in both stages. The functions, of course, can figure out which stage they are in; my advice is to split them into process_[type]_stage1() and process_[type]_stage2() and call both from process_[type]() to make that clearer. An object is processed in stage 2 only if it was marked in stage 1.

For more details, please read the docs, and definitely learn lua.
My first task for this is bus routes. The last time I put some of my scarce spare time into my fork, I managed to display bus routes, but it was ugly: overlapping dash-based lines that were difficult to follow.
Compare that to a real map of a (quite small) bus network, and you get the idea that my take wasn't going to be useful.
Let's try to think how this can be achieved. What we need is a way to take each way and count how many bus lines go through it. Bus lines are represented as relations, from which we can take the members. So the first step must be to find all the members of a relation. This can be done in process_relation_stage1(). We mark all the ways in a bus relation, and we also somehow associate the bus to the way.

In stage 2, we process all the marked ways (ways with buses): we count how many buses go through each one, we create a new way for rendering each bus line, and we displace this new line based on which 'slot' it belongs to, so the lines end up parallel.
I have a first version of such an algorithm. It's based on one of the examples. Let's take it apart:
local tables = {}

tables.routes = osm2pgsql.define_way_table('routes', {
    { column = 'tags', type = 'hstore' },
    { column = 'offset', type = 'float' },
    { column = 'ref', type = 'text' },
    { column = 'colour', type = 'text' },
    { column = 'way', type = 'linestring' },
})
We declare a table that will store ways, called routes. It will store the original tags of the way in hstore format, plus an offset column that will tell the renderer how much to displace the line, plus a ref and a colour; and of course, the way itself.
pt_ways = {}

function osm2pgsql.process_relation(relation)
    -- Only interested in relations with type=route, route=bus and a ref
    -- TODO: other pt routes too
    if relation.tags.type == 'route' and relation.tags.route == 'bus' and relation.tags.ref then
        for _, member in ipairs(relation.members) do
            if member.type == 'w' then
                -- print(relation.id, member.ref, relation.tags.ref)
                if not pt_ways[member.ref] then
                    pt_ways[member.ref] = {}
                end

                -- insert() is the new append()
                table.insert(pt_ways[member.ref], relation)
                osm2pgsql.mark_way(member.ref)
            end
        end
    end
end
We process the relations. Like I said before, we take all the members that are ways, we store this bus line in an array indexed by the original way id, and we mark the way for processing in stage 2. Notice that the bus line's ref is in the tags.ref index (more on this later), while its id is in the ref index. This last part was confusing to me.
function sort_by_ref(a, b)
    return a.tags.ref < b.tags.ref
end

function osm2pgsql.process_way(way)
    -- Do nothing for ways in stage1, it'll be in relations where the magic starts
    if osm2pgsql.stage == 1 then
        return
    end

    -- We are now in stage2
    local routes = pt_ways[way.id]
    table.sort(routes, sort_by_ref)
We do nothing for ways in stage 1, and in stage 2 we sort the routes to give them a consistent ordering when rendering consecutive ways.
    local line_width = 2.5
    local offset = 0
    local side = 1
    local base_offset
    local offset
    local index
    local ref
    local slot
    local shift

    if #routes % 2 == 0 then
        base_offset = line_width / 2
        shift = 1
    else
        base_offset = 0
        shift = 0
    end

    for index, route in ipairs(routes) do
        -- index is 1 based!
        slot = math.floor((index - shift) / 2)
        offset = (base_offset + slot * line_width) * side

        if side == 1 then
            side = -1
        else
            side = 1
        end
This is the part of the algorithm that calculates the offset. It was refined after a couple of iterations and it seems to work fine with both odd and even numbers of bus lines. line_width will be moved to the style later, so I can apply widths depending on the zoom level. In short, we're assigning slots from the center to the outside, alternating sides.
        -- TODO: group by colour
        -- TODO: generic line if no colour
        row = {
            tags = way.tags,
            ref = route.tags.ref,
            colour = route.tags.colour,
            offset = offset,
            geom = { create = 'line' }
        }

        tables.routes.add_row(tables.routes, row)
    end
end
And this is the end. We set the row values and add it to our table. It's that simple :) Now we run osm2pgsql with the flex backend:
$ osm2pgsql --cache 1024 --number-processes 4 --verbose --create --database mc --output=flex --style bus-routes.lua --slim --flat-nodes nodes.cache --hstore --multi-geometry --drop
osm2pgsql version 1.2.0 (1.2.0-248-gba17b0c) (64 bit id space)
Reading in file: monaco-latest.osm.pbf
Processing: Node(46k 0.4k/s) Way(3k 3.78k/s) Relation(10 10.00/s) parse time: 125s
Node stats: total(46745), max(7199027992) in 124s
Way stats: total(3777), max(770935077) in 1s
Relation stats: total(215), max(10691624) in 0s
Entering stage 2...
Creating id indexes...
Creating id index on table 'routes'...
Creating id indexes took 0 seconds
Lua program uses 0 MBytes
Entering stage 2 processing of 458 ways...
Entering stage 2 processing of 0 relations...
Clustering table 'routes' by geometry...
Using native order for clustering
Creating geometry index on table 'routes'...
Analyzing table 'routes'...
All postprocessing on table 'routes' done in 0s.
Osm2pgsql took 129s overall
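As a sanity check of the slot/offset logic above, here's a small Python sketch (illustration only, not part of the import pipeline) that reproduces the same assignment:

import math

def offsets(count, line_width=2.5):
    base_offset = line_width / 2 if count % 2 == 0 else 0
    shift = 1 if count % 2 == 0 else 0
    side = 1
    result = []
    for index in range(1, count + 1):  # Lua's ipairs() is 1-based
        slot = math.floor((index - shift) / 2)
        result.append((base_offset + slot * line_width) * side)
        side = -side
    return result

print(offsets(3))  # [0, -2.5, 2.5]: one centered line, one on each side
print(offsets(4))  # [1.25, -1.25, 3.75, -3.75]: no centered line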
Now, for the render part, it's twofold; a layer definition...:
SELECT
way,
COALESCE(colour, 'purple') as color,
ref as route,
"offset"
FROM routes
ORDER BY ref
... and the style itself:
#routes {
    [zoom >= 14] {
        line-width: 1;
        line-color: @transportation-icon;
        line-join: round;
        line-offset: [offset];

        [zoom >= 17] {
            line-color: [color];
            line-width: 2;
        }
    }
}
Quite simple too! That's because all the data is already prepared for rendering. All the magic happens at import time. Here's the same region as before:
Now, two caveats. First, the fact that all the magic happens at import time means that if you want to change the style, you will most probably need to reimport the data. It must have taken me some 20 iterations until I got the data in a shape I could use for rendering; that's why I tested with an extract of Monaco :) I also used a separate db from the main render database, but maybe just another table would be enough.
Second, you have to specify all the data you want to save, there is no compatibility with the current rendering database, so you will also need to base your code on the compatible example. In my tests, I just imported the data the usual way:
$ osm2pgsql --cache 1024 --number-processes 4 --verbose --create --database mc --output=flex --style bus-routes.lua --slim --flat-nodes nodes.cache --hstore --multi-geometry --drop
osm2pgsql version 1.2.0 (1.2.0-248-gba17b0c) (64 bit id space)
Mid: loading persistent node cache from nodes.cache
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using lua based tag processing pipeline with script openstreetmap-carto.lua
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads
Reading in file: monaco-latest.osm.pbf
Processing: Node(46k 0.8k/s) Way(3k 3.78k/s) Relation(210 105.00/s) parse time: 61s
Node stats: total(46745), max(7199027992) in 58s
Way stats: total(3777), max(770935077) in 1s
Relation stats: total(215), max(10691624) in 2s
Stopping table: planet_osm_nodes
Stopped table: planet_osm_nodes in 0s
Stopping table: planet_osm_ways
Stopped table: planet_osm_ways in 0s
Stopping table: planet_osm_rels
Stopped table: planet_osm_rels in 0s
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_line
Using native order for clustering
Using native order for clustering
Using native order for clustering
Using native order for clustering
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on planet_osm_point
Creating indexes on planet_osm_point finished
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating indexes on planet_osm_roads finished
All indexes on planet_osm_point created in 0s
Completed planet_osm_point
All indexes on planet_osm_roads created in 0s
Completed planet_osm_roads
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on planet_osm_polygon
Creating indexes on planet_osm_polygon finished
All indexes on planet_osm_polygon created in 0s
Completed planet_osm_polygon
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on planet_osm_line
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 0s
Completed planet_osm_line
Osm2pgsql took 64s overall
One thing to notice is that it took half of the time of the flex backend.
openstreetmap elevation
Since I started playing with rendering maps locally, I've been modifying and extending the original generate_tiles.py script from mapnik-stylesheets.
I added option parsing and lots of features; here's the current usage:
$ ./generate_tiles.py --help
usage: generate_tiles.py [-h] [-b BBOX] [-B BBOX_NAME] [-n MIN_ZOOM]
[-x MAX_ZOOM] [--tiles Z,X,Y [Z,X,Y ...]]
[-i MAPFILE] [-f FORMAT] [-o TILE_DIR]
[-m METATILE_SIZE] [-t THREADS]
[-p {threads,fork,single}] [-X] [-N DAYS]
[-E {skip,link,render}] [-d] [--dry-run]
optional arguments:
-h, --help show this help message and exit
-b BBOX, --bbox BBOX
-B BBOX_NAME, --bbox-name BBOX_NAME
-n MIN_ZOOM, --min-zoom MIN_ZOOM
-x MAX_ZOOM, --max-zoom MAX_ZOOM
--tiles Z,X,Y [Z,X,Y ...]
-i MAPFILE, --input-file MAPFILE
-f FORMAT, --format FORMAT
-o TILE_DIR, --output-dir TILE_DIR
-m METATILE_SIZE, --metatile-size METATILE_SIZE
-t THREADS, --threads THREADS
-p {threads,fork,single}, --parallel-method {threads,fork,single}
-X, --skip-existing
-N DAYS, --skip-newer DAYS
-E {skip,link,render}, --empty {skip,link,render}
-d, --debug
--dry-run
BBoxes are stored in a file called bboxes.ini, so I can say -B Europe instead of remembering the coords. The idea of --format is that I should be supporting both the slippy map .png file structure and mbtiles, but support for the latter is lagging behind a little because I don't have a use for it yet. You can choose whether to use threads (broken, because mapnik cannot handle the situation; I can't find a reference to the problem now), child processes (probably the only one working correctly) or a single main process (so no parallelism). It handles resuming a stopped render by not rendering a tile if it already exists or is too new. It can also skip writing empty sea tiles.
I use it to rerender my style every time I make a modification (or just update to the latest openstreetmap-carto, of which it is a fork). I usually bulk render a great part of Europe up to ZL 11 or 14, and then some regions down to ZL 18 or 19 as needed for trips or other interests.
For Europe, it can take a long while, so I've been thinking of ways to optimize the rendering. Besides tuning the database, I first found that rendering big metatiles (8x8, for instance) gives a big boost in rendering time. The next idea is to reuse the disk cache. When you render a (meta)tile at ZL N, the same data used for rendering it is going to be used for the 4 sub(meta)tiles at ZL N+1 (except when you remove features, which is rare but happens; city labels come to mind). I don't think anything can be done at the mapnik level, but one can think of the tiles as a tree: a node at ZL N has 4 subtiles at level N+1 and the leaves are the last ZL rendered. The original algorithm did a breadth-first traversal of this tree, but with a depth-first traversal you can reuse the kernel's page/disk cache for the data collected by mapnik from the database or other files. Also, I can check whether the subtiles are worth rendering at all: if they're only sea, I don't need to render them or their subtiles; I can cut down whole subtrees. The only point at which this stops being true is when we start rendering more things on sea, which currently amounts to ferry routes at ZL 7.
I finished implementing all these ideas, but I don't have any numbers to prove they work. Not rendering sea tiles should definitely be a great improvement, but I don't really know whether the caching idea works. At least it was fun to implement.
So the rendering batch will be cut in 3: ZLs 0-6 in one run, then 7 and 8 with fewer threads (these ZLs of my style use so much memory that the machine starts thrashing), then 9-18/19 with full threads.
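Here's a minimal sketch of the depth-first idea with sea-tile pruning (heavily simplified; both helpers are stand-ins for what generate_tiles.py actually does):

def is_only_sea(z, x, y):
    return False  # stub: the real check looks at coastline data

def render_metatile(z, x, y):
    print(f"rendering {z}/{x}/{y}")  # stub: the real one calls mapnik

def render_subtree(z, x, y, max_zoom):
    if is_only_sea(z, x, y):
        return  # neither this tile nor any of its subtiles get rendered

    render_metatile(z, x, y)

    if z < max_zoom:
        # descend immediately so the data mapnik just pulled from the db
        # is still hot in the kernel's page cache
        for dx in (0, 1):
            for dy in (0, 1):
                render_subtree(z + 1, 2 * x + dx, 2 * y + dy, max_zoom)

render_subtree(0, 0, 0, 3)  # renders the whole tree down to ZL 3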
elevation openstreetmap
I've been improving Elevation's reproducibility a little. One of the steps of setting it up is to download an extract, both to import it into the database and to fetch the DEM files that will be part of the background. The particular extract that I'm using, Europe, is more than 17GiB in size, which means that it takes a looong time to download. Thus, I would like to have the ability to continue the download if it has been interrupted.
The original script that was trying to do that uses curl. That version does not try to continue the download, which can easily be achieved by adding the --continue-at - option. The version that has it never hit the repo, because of the following:
The problem arises when the file we want to download is rolled over every day. This means that the contents of the file change from one day to the next, and in that case we can't just continue from where we left off; we must start all over[1].
One could think that curl has an option that looks like it handles that, --time-cond, which is what the script was trying to use. This option makes curl send the If-Modified-Since HTTP header, which allows the server to respond with a 304 (Not Modified) if the file is not newer than the provided date. The date curl provides is the one from the file referenced by that option, and I was giving it the same file the output goes to. I was using these options wrong; they were doing it the other way around: continue if the file changed, or do nothing if not.
So I sat down to try and tackle the problem. I know one can use a HEAD request to check (at least) two things: the resource's date and size (well, at least in the case of static files like this). So the original idea was to get the URL's date and size; if the date is newer than the local file's, I should restart the download from scratch; if not, and the size is bigger than the local file's, then continue; otherwise, assume the file has finished downloading and stop there.
The last twist of the problem is that the only useful dates from the file were either its ctime or its mtime, but both change on every write to the file. This means that if I leave the script downloading the file, the file is rotated in the meanwhile, the download is interrupted and I try again later, the file's c/mtime is newer than the URL's, even though it belongs to a file that is older than the URL's. So I had to add a parallel timestamp file that is created only when starting a download and never updated (until the next full download; the file is actually touch'ed), and it is its mtime that is used for comparison with the URL's.
Long story short, curl's --time-cond and --continue-at options combined are not meant for this; a HEAD request helps a little, but rotation-while-downloading can complicate things further. One last feature one could ask of such a script would be to keep the old file while downloading the new one and rotate at the end, but I will leave that for when/if I really need it.
The new script is written in ayrton because handling execution output and dates is easier in it than in bash. This also pushed me to make minor improvements to ayrton, so expect a release soon.
[1] In fact, the other options are to not do anything (but then we're left with an incomplete, useless file) or to try to find the old file; in the case of Geofabrik, they keep the last week of daily rotations, the first day of each previous month back to the beginning of the year, and then the first day of each year back to 2014. Good luck with that.
elevation ayrton
For a long time I've been searching for a program that would allow me to plan (car) trips with my friends. Yes, I know of the existence of Google Maps, but the service has several characteristics that don't make it appealing to me, and it lacks a couple of features I expect. This is more or less the list of things I want:
- Define the list of points I want to go to. No-brainer.
- Define the specific route I want to take. This is normally implemented by adding more control points, but normally they're of the same category as the waypoints of the places you want to visit. I think they shouldn't be.
- Define stages; for instance, one stage per day.
- Get the distance and time of each stage; this is important when visiting several cities, for having an idea of how much time during the day you'll spend going to the next one.
- Define alternative routes, just in case you don't really have/make the time to visit some points.
- Store the trips in cookies, or share them via a URL or a central site, but one that anybody can easily install on their own server.
- Manage several trips at the same time.
So I sat down to try and create such a thing. Currently it's just a mashup of several GIS things: my own OSM data rendering, my own waypoints-in-cookies idea (in fact, this is the expansion of what fired that post) and OSRM for the routing. As for the backend, I decided to try flask and flask-restful for creating a small REST API for storing all this. So far some basics work (points #1 and #6, partially), and I had some fun during the last week learning RESTful, some more Javascript (including Leaflet and some jQuery) and putting all this together. Here are some interesting things I found out:
- RESTful is properly defined, but not for all URL/method pairs. In particular, given that I decided that trip ids are their names, I defined a POST to trips/ as the UPSERT for that name. I hope SQLAlchemy implements it soon.
- Most of the magic of RESTful APIs happens in the model of your service.
- Creating APIs with flask-restful could not be more obvious (see the sketch after this list).
- I still have to get my head around Javascript's prototypes.
- Mouse/finger events are a nightmare in browsers. In particular, with current Leaflet you get clicked events on double clicks, unless you use the appropriate singleclick plugin from here.
- Given XSS attacks, same-origin policy is enforced for AJAX requests. If you control the web service, the easiest way to go around it is CORS.
- The only way to do such calls with jQuery is using the low-level function $.ajax().
- jQuery provides a function to parse JSON but not one to serialize to it; use window.JSON.stringify().
- Javascript's default parameters were not recognized by my browser :(.
- OSRM's viaroute returns the coordinates multiplied by 10 for precision reasons, so you have to scale them down.
- Nominatim and OSRM rock!
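A minimal flask-restful sketch of that trips/<name> UPSERT idea (not the actual Elevation code; a dict stands in for the real database):

from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

trips = {}  # name -> waypoints; a stand-in for the real storage


class Trip(Resource):
    def get(self, name):
        if name not in trips:
            return {'message': 'not found'}, 404
        return trips[name]

    def post(self, name):
        # UPSERT: create or replace the trip with this name
        trips[name] = request.get_json()
        return trips[name], 201


api.add_resource(Trip, '/trips/<string:name>')

if __name__ == '__main__':
    app.run(debug=True)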
I still have lots of things to learn and finish, so stay tuned for updates. Currently the code resides in Elevation's code, but I'll split it out in the future.
Update: I have it running here. You can add waypoints by clicking on the map, delete them by double-clicking them, save to cookies or to the server (for the moment it overwrites what's there, as you can't name the trips or manage several yet) and ask for the routing.
trip-planner elevation openstreetmap osrm python flask leaflet javascript jquery
Ever since I started working on Elevation I have faced the problem that it's mostly an openstreetmap-carto fork. This means that each time osm-carto has new changes, I have to adapt mine. We all know this is not an easy task.
My first approach was to turn to osm-carto's VCS, namely git. The idea was to keep a branch with my version, then pull from time to time and merge the latest released version into my branch, never merging mine into master, just in case I decided to make some modifications that could benefit osm-carto too. In that case, I would work on my branch, make a commit there, then cherry-pick it into master, push to my fork on GitHub, make a Pull Request and voilà.
All this theory is nice, but in practice it doesn't quite work. Every time I tried to merge the release into my local branch, I got several conflicts, not to mention modifications that made some of my changes obsolete or at least forced me to refactor them against the new code (this is the developer in me talking, you see...). While I resolved these conflicts and refactorings, the working copy's state was a complete mess, forcing me to fix it all just to be able to render again.
As this was not a very smooth workflow, I tried another approach: keeping my local modifications in a big patch. This of course had the same problems as the previous approach, plus others, so I gained nothing but more headaches.
Then I thought: who else manages forks, and at a massive scale? Linux distributions. See, distros have to patch the packaged software to compile in their environments. They also keep security patches, which are also sent upstream for inclusion. Once a patch is accepted upstream, they can drop their local copy. This sounds almost exactly like the workflow I want for Elevation.
And what do they use for managing the patches? quilt. This tool is heavily used in the Debian distro and is maintained by someone working at SuSE. Its documentation is a little sparse, but the tool is very simple to use. You start by doing quilt init just to create the directories and files it will use to keep track of your changes.
The modification workflow is a little more complex than with git:
- You mark the beginning of a new patch with quilt new <patch_name>;
- Then you either tell quilt to track the files you will modify for this patch with quilt add <file> ... (in fact it just needs to be told before you save the new version of each file, because it will save the current state of the file for producing the diff later), or use quilt edit <file> to do both things at the same time (it will use $EDITOR for editing the file);
- Then you do your changes;
- Check everything is ok (it compiles, passes tests, renders, whatever);
- And finally record the changes (in the form of a patch) with quilt refresh.
In fact, the last 4 items (add, edit, test, refresh) can be done multiple times, and they will all affect the current patch.
Why do I say current? Because quilt keeps a stack of patches, in what it calls a series. Every time you do quilt new, a new patch is put on top of the stack and into the series just behind the current patch; all the other commands affect the patch that is currently on top. You can 'move' through the series with quilt pop [<patch_name>] and quilt push [<patch_name>]; if a patch name is provided, they will pop/push all the intermediate patches in the series.
How does this help with my fork? It does not actually save me from conflicts and refactorings, it just makes these problems much easier to handle. My update workflow is the following (which, not incidentally, mimics the one Debian Developers and Maintainers follow every time they update their packages):
- I quilt pop -a so I go back to the pristine version, with no modifications;
- I git pull to get the newest version, then git tag and git checkout <last_tag> (I just want to keep in sync with releases);
- quilt push -a will try to apply all the patches. If one fails, quilt stops and lets me check the situation. quilt push -f will try harder to apply the troubling patch; sometimes it's just a matter of too many offset lines or too much fuzziness needed.
- If it doesn't apply, a .rej file will be generated and you should pick up from there.
- In any case, once everything is up to date, I need to run quilt refresh and the patch will be updated[1].
- Then I try quilt push -a again.
- If a patch is no longer useful (because it was fixed upstream or because it doesn't make sense anymore), then I can simply quilt delete <patch>; this will remove it from the series, but the patch file will still exist (unless I use the -r option[2]).
As long as I keep my patches minimal, there's a good chance they will be easy to integrate into new releases. I can even keep track of the patches in my own branch without fearing conflicts, while independently providing fixes upstream.
Let's see how this works in the future; at least in a couple of months I will be rendering again. Meanwhile, I will be moving to a database update workflow that includes pulling diffs from geofabrik.
[1] Being an old-timer VCS user (since cvs times), I wish they would call this command update.

[2] Why not --long-options?
elevation openstreetmap utils
Since I started playing with rendering maps I have included some kind of elevation info for highlighting mountains. At the beginning it was just hillshading provided by some German guy (I don't have the reference on me right now), but after reading Tilemill's terrain data guide, I started using DEMs to generate 4 different layers: elevation coloring, slope shading, hillshading and contour lines.
When I started, I could find only three DEM sources: SRTM 3arc and ViewFinderPanoramas (1arc and 3arc). The second one tries to flatten plains (for instance, the Po plain near where I live), which generates some ugly-looking terracing. As for the third one, when I downloaded the corresponding tile (they're supposed to be 1x1 degrees), its metadata reported an extent between 7 and 48 degrees east and between 36 and 54 degrees north, and a size of 147602x64801 pixels. I also remember stitching all the tiles covering Europe, just to get a nice 1x1 degree hole in the North Adriatic Sea. Not having much time to pre- or post-process the data, I decided to stick to SRTM.
Things changed at the end of last year. The US government decided to release 1arc, 30m global coverage (previously that resolution covered only the US). I started playing with the data in mid January, only to find that it is not void-filled: these DEMs are derived from the homonymous Shuttle mission, which used radar to get the data. Radar gets very confused when water is involved; this is no problem on rivers, lakes or the sea, where elevation is constant relative to the coast, but it is a problem on snow-covered mountains, glaciers and even clouds. This means that the data has NODATA holes. The SRTM 3arc v4.1 I was using had these 'voids' filled; deFerrantis has been painstakingly filling these voids by hand too, using topographic maps as reference.
So I set out to fill these voids too. But first let's see what the original data looks like. All the images are for the area near Isola 2000, a ski station I go to often. The first image shows how this looks with SRTM 3arc v4.1:
This is a 4x4 grid of 256x256 pixel tiles (1024x1024 in total) at zoom level 13. The heights range from ~700m up to ~2600m, and the image combines all 4 layers. It already starts showing some roundness in the terrain, especially in ridges and mountain tops, even at the bottom of the deep valleys.
For contrast, this is deFerrantis data:
This is the first time I really take a look at the result; it doesn't seem to be much better than 3arc. Here's the terracing I mentioned:
For contrast, check what 1arc means:
From my point of view, the quality is definitely better. Peaks, crests and valleys are quite sharp. As for the mountain sides, they look rugged. My take is that this reflects the nature of the terrain in question better, but Christoph Hormann of Imagico.de views it as sample noise. He has worked a lot on DEMs to generate very beautiful maps.
But then we have those nice blue lagoons courtesy of voids (the blue we can see is the water color I use in my maps). So, how to proceed?
The simplest way to fix this is to cover the voids with averages calculated from the data at the seams of the voids. GDAL has a tool for that called, of course, gdal_fillnodata.py. This is the outcome:
At first this looks quite good, but once we start to zoom in (remember there are at least 5 more zoom levels), we start to see some regular patterns:
Another option is to use deFerrantis' data to fill the voids. For this we need to merge both datasets. One way to do it is using GDAL's gdalwarp tool. We create a file by piling up layers of data: first the most complete one, then the layers with holes:
gdalwarp deFerrantis/N44E007.hgt mixed.tif
gdalwarp SRTM_1arc_v3/n44_e007_1arc_v3.tif mixed.tif
This looks like this:
I have to be honest, it doesn't look good. Both files declare the same extents and resolution (their metadata is similar, but the second file has more), but if you compare the renders for SRTM_1arc_v3 and deFerrantis, you will notice that they don't seem to align properly.
The last simple option would be to upsample SRTM_3arc_v4.1 and then merge like before, but it took me a while to figure out the right parameters:
gdalwarp -te 6.9998611 43.9998611 8.0001389 45.0001389 -tr 0.000277777777778 -0.000277777777778 -rb SRTM_3as_v4.1/srtm_38_04.tif srtm_1as_v3-3as_v4.1.tif
Creating output file that is 3601P x 3601L.
Processing input file SRTM_3as_v4.1/srtm_38_04.tif.
Using internal nodata values (eg. -32768) for image SRTM_3as_v4.1/srtm_38_04.tif.
0...10...20...30...40...50...60...70...80...90...100 - done.
gdalwarp SRTM_1as_v3/n44_e007_1arc_v3.tif srtm_1as_v3-3as_v4.1.tif
Processing input file SRTM_1as_v3/n44_e007_1arc_v3.tif.
Using internal nodata values (eg. -32767) for image SRTM_1as_v3/n44_e007_1arc_v3.tif.
0...10...20...30...40...50...60...70...80...90...100 - done.
The complex part was the -te and -tr parameters. The 3 arc file covers a 5x5 degree zone at 40-45°N, 5-10°E. The 1 arc file only covers 44-45°N, 7-8°E, so I need to cut the former out. I use the -te option for that, but according to the doc it's «[to] set georeferenced extents of [the] output file to be created». This means the degrees of the W, S, E and N limits of the output file. Note that the actual parameters are 1.5" off those limits; I took them from the original 1 arc file.
The second one is even more cryptic: «[to] set [the] output file resolution (in target georeferenced units)». That last word is the key: units. According to gdalinfo, both files have a certain pixel size in units declared in the projection. Both declare UNIT["degree",0.0174532925199433]; "degree" has a clear meaning, and the float beside it is the size of the unit in radians (π/180). So the parameters for -tr are how many units (degrees) a pixel represents (or, more accurately, how many degrees there are between one pixel center and the next). Notice that the vertical value is negative; that's because raster rows go from North to South, but degrees go the other way around (North and East degrees are positive, South and West negative). In any case, I also just copied the pixel size declared in the 1 arc file.
After all this dissertation about GeoTIFF metadata, finally the resulting DEM:
As I said earlier, I don't have much time to invest in this; my map is mostly for personal consumption and I can't put much time into it. So my conclusion is this: if I can manage the 9x data size, I think I'm going this way.
Last night I got an idea: start implementing some stuff on top of Elevation. The first thing I thought of was the ability to save markers in cookies, and to recover them every time the map was accessed. This would allow anybody to safely add personal markers on top of it without having to trust me (except for the map service). This might lead to something else entirely later, but for now that was the goal.
So the first step was to find out how to access cookies in js. It seems there is no native way to do it, and you have to parse them yourself from the document.cookie attribute. Luckily someone already wrote some code for it. There was no license, but I think it's ok. Then I added another function to get a list of all the cookies whose names start with some prefix, based on the readCookie() function:
function readCookies(prefix) {
    var nameEQ = prefix + '_';
    var ca = document.cookie.split(';');
    var ans = [];
    var j = 0;

    for (var i = 0; i < ca.length; i++) {
        var index = 0;
        var c = ca[i];

        while (c.charAt(index) == ' ') {
            index = index + 1;
        }

        if (c.indexOf(nameEQ) == index) {
            // keep reading, find the =
            while (c.charAt(index) != '=') {
                index = index + 1;
            }

            ans[j] = c.substring(index + 1, c.length);
            j = j + 1;
        }
    }

    return ans;
}
The next step was to encode and decode markers into strings. The format I decided on is simple: CSV, lat,lon,text,URL. So, here's the function that converts cookies to markers:
function markersFromCookies () {
    var cookies = readCookies ("marker");

    for (var i = 0; i < cookies.length; i++) {
        var cookie = cookies[i];
        var data = cookie.split (',');

        // it's lat,lon,text,url
        var marker = L.marker ([data[0], data[1]]).addTo (map);

        // TODO: reconstruct the url in case it got split
        if (data[3].length > 0) {
            marker.bindPopup ('<a href="' + data[3] + '">' + data[2] + '</a>').openPopup ();
        } else {
            marker.bindPopup (data[2]).openPopup ();
        }
    }
}
20 lines of code and there's already a TODO comment :) That's because URLs can have commas in them, but for the moment I'm thinking of short URLs from sites like bitly.
All this was working perfectly in Firefox' scratchpad. Then I decided to put it in "production". For that I took all the js from the original page and put it in a file, along with all these functions, I removed the permanent marker from my map, converting it into a cookie, pushed the code, reloaded and... it broke.
This is not the first time I've seen js fall apart. Last year I helped PyCon.ar's organization with the site, specifically the map showing the city with markers for the venue and the bus station. I even had to map the venue because it was not in OSM's data :) If you follow that link, you'll see that the popups are completely broken. These things were working perfectly in a vacuum, but when I integrated them into the page they just fell apart. I never found out why.
In my current situation, if I try to run markersFromCookies() in scratchpad, this is what I get:
Exception: t.addLayer is not a function
o.Marker<.addTo@http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.js:7
markersFromCookies@http://grulicueva.homenet.org/~mdione/Elevation/local.js:59
@Scratchpad/1:1
Basically, that's the call to L.marker().addTo(). Maybe the constructor is not working anymore, maybe something else entirely. At least this time a dim light in the back of my head told me that maybe the map variable is not as global as it seems to be from scratchpad, so I simply passed the map from setup_map() to markersFromCookies() and now it works. Notice that the error message never mentioned this fact but something else entirely. I'm just glad I didn't have to follow that hint and debug Leaflet's code to find this out. All I hope now is that I don't go insane with this small project.
Next steps: adding new markers and sharing!
openstreetmap javascript elevation
Another long stretch without mentioning any advancements in my map-making efforts. While not much has changed, what has changed is a big step towards easier customization.
In the last post in this series I gave a quick list on how I make my own maps:
- Use SRTM elevation data to generate a hypsometric base map, hill and slope shading, and finally hypsometric curves. For the latter I'm using gdal_contour and shp2pgsql.
- Take any extract you want and import it with osm2pgsql. Small extracts import quite quickly, but so far I have never succeeded in importing a big part of Europe (from Portugal to southern Finland, so it covers UK/Ireland and Istanbul) on a machine with 8GiB of RAM and hard disks. The recommendation is to use a machine with lots of RAM (16+, I heard up to 256 for the whole planet) and SSDs.
- Use TileMill for the initial design.
- Do a lot of xml manipulation to try to customize OSM's mapnik style based on your design.
- Use [generate_tiles.py](https://github.com/openstreetmap/mapnik-stylesheets/generate_tiles.py) to, well, generate the tiles.
But since August 2013, things have changed for the 3rd and 4th items in that list. Andy Allan had finished a first big iteration of redoing OSM's mapnik style in CartoCSS. The latter is a CSS-like language that is the native way to style things in TileMill. Since then, customizing is much easier, not only because of the Cascading part of CSS, but also because Andy took the time to turn a lot of things (widths, colors) into something similar to C/C++'s #defines, which you can override to impact everywhere the definition is used.
So now, steps 3 and 4 are more like:
- Use TileMill to change OSM's initial design[1].
- Export the design to a mapnik XML file.
In fact, that last step could be avoided, given that TileMill can also render everything by itself.
The last step in having your own tiles is being able to use them. I use Marble on my phone, but I also set up a slippy map so I can brag about it. Unluckily I can't embed the map here (I should fix this).
The tiles being served actually come from different rendering passes using different, incremental designs. The previous-to-last one can be seen in the Deutschland region; I will be rendering some parts of London with a newer design before the end of the month.
[1] You can use any text editor to change the CartoCSS files; TileMill will pick the changes up via inotify and rerender accordingly. The only problem is when you're editing multiple files to impact a zoom level that takes a long time to render (for me that's between 6 and 8).
openstreetmap elevation gdal gis srtm