Another aspect I've been looking into with respect to optimizing the rendering speed is data sources that are not in the target projection. Not being in the target projection forces Mapnik to reproject them on the fly, and this for each (meta)tile, whereas it would make more sense to have them already reprojected.

In my case I have 3 datasets, big ones, in EPSG:4258. The three datasets are elevation, hill and slope shade based on EEA's DEM files. The source tiles amount to almost 29GiB, and the elevation layer, being RGB, takes more than that. So I set off to try to reproject the things.

My first, more obvious approach was to reproject every 5x5°, 18000x18000px tile, then derive the data I need, but I started to get gaps between tiles. Notice that the original data is cleanly cut (5° x 3600"/° x 1px/" == 18000px), without any overlapping.

The next approach was to merge them all in a .vrt file, then reproject chunks of it with gdalwarp. What I wanted as output was the same 5x5° tiles, reprojected, but with an extra pixel, so they overlap. This last requirement was the problematic one. See, the final projection makes any square in the original projection a tall rectangle, stretching more and more towards the poles. The closest I could get was to use the -ts option, but that meant that I didn't get any control about how many extra pixels I got in the vertical/latitude direction. My OCD started thrashing :) In fact what happened was that I was not sure how GDAL would handle the possible partial pixel, whether rounding down (meaning excluding it), up (finishing it), or simply leaving the pixel with partial data and impacting the final rendering.

Even Rouault pointed to me that gdalwarp can do something fantastic: it can generate a .vrt file too with all the parameters needed for the reprojection, so reading from there was automatically reprojecting the original data. The resulting dataset is 288,000x325,220px (the original is 288,000x180,000px), so I'm definitely going to cut it down in small tiles. After consulting with a eight-ball, I decided to discard the idea of tiles with boundaries based on coordinates, which might not even make sense anymore, but settle for pixel based sizes, still with an extra pixel. The chosen size is 2**14+1 a.k.a. 16385. For this gdal_translate is perfect.

The final algorithm is like this:

gdalwarp -t_srs "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 \
    +x_0=0.0 +y_0=0.0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs +over" \
    -r lanczos -tr 30.92208077590933 -30.92208077590933 \
    -of VRT EU-DEM.vrt EU-DEM-corrected.vrt

The values for the -tr option is the pixel size in meters, which is the unit declared in the SRS. Notice that as Mercator stretches towards the poles, this is the size at the origin of the projection; in this case, at 0° lat, 0° lon.

Then the reprojection (by reading from the reprojecting dataset) and cut, in a couple of loops:

tile_size=$((2**14)); \
for i in $(seq 0 17); do
    for j in $(seq 0 5); do
        for k in $(seq 0 3); do
            l=$((4*$j+$k));
            gdal_translate -co BIGTIFF=YES -co TILED=YES -co COMPRESS=LZMA \
                -co LZMA_PRESET=9 \
                -srcwin $(($tile_size*$i)) $(($tile_size*$l)) \
                    $(($tile_size+1)) $(($tile_size+1)) \
                -of GTiff EU-DEM-corrected.vrt \
                $(printf "%03dx%03d-corrected.tif" $i $l) &
        done;
        wait;
    done;
done

There's an extra loop to be able to launch 4 workers at the same time, because I have 4 cores. This doesn't occupy the 4 cores a 100% of the time (cores that already finished stay idle until the other finished), but it was getting akward to express in a Makefile, and this is run only once.

Before deriving the rest of the data there's an extra step: removing those generated tiles that actually have no data. I do a similar thing with empty sea tiles in the rendering process. Notice also that the original data is not tile complete for the covered region (79 tiles instead of the 160 they should be).


openstreetmap gis gdal

Posted mar 23 ene 2018 14:08:42 CET Tags: gdal

A few weeks ago an interesting PR for osm-carto landed in the project's GitHub page. It adds rendering for several natural relief features, adding ridges, valleys, aretes, dales, coulouirs and others to cliffs, peaks and mountain passes, which were already being rendered. I decided to try it in Elevation (offline for the moment).

I sync'ed the style first with the latest release, applied the patch and... not much. My current database is quite old (re-importing takes ages and I don't have space for updates), so I don't have much features like that in the region I'm interested in. In fact, I went checking and the closest mountain range around here was not in the database, so I added it.

By the way, the range is mostly concurrent with a part of an administrative boundary, but SomeoneElse and SK53 suggested to make a new line. Even when other features are nearby (there's a path close to the crest and it's also more or less the limit between a forest and a bare rock section), which already makes the region a little bit crowded with lines, it makes sense: boundaries, paths, forest borders and ridges change at different time scales, so having them as separate lines makes an update to any of those independent of the rest.

Now I wanted to export this feature and import it in my rendering database, so I can actually see the new part of the style. This is not an straightforward process, only because when I imported my data I used osm2pgsql --drop, which removes the much needed intermediate tables for when one wants to update with osm2pgsql --append. Here's a roundabout way to go.

First you download the full feature (thanks RichardF!). In this case:

http://www.openstreetmap.org/api/0.6/way/430573542/full

This not only exports the line (which is a sequence of references to nodes) with its tags, but the nodes too (which are the ones storing the coords). The next step is to convert it to something more malleable, for instance, GeoJSON. For that I used ogr2ogr like this:

ogr2ogr -f GeoJSON 430573542.GeoJSON 430573542.xml lines

The last parameter is needed because, quoting Even Rouault (a.k.a. José GDAL): «you will always get "points", "lines", "multilinestrings", "multipolygons" and "other_relations" layers when reading a osm file, even if some are empty», and the GeoJSON driver refuses to create layers for you:

ERROR 1: Layer lines not found, and <span class="createlink">CreateLayer</span> not supported by driver.

But guess what, that not the easiest way :) At least we learned something. In fact postgis already has a tool called shp2pgsql that imports ESRIShapeFiles, and ogr2ogr produces by default this kind of file. It creates a .shp file for each layer as discussed before, but again, we're only interested in the line one. So:

ogr2ogr 430573542 430573542.xml lines
shp2pgsql -a -s 900913 -S 430573542/lines.shp > 430573542.sql

We can't use this SQL file directly, as it has a couple of problems. First, you can't tell shp2pgsql the names of the table where you want to insert the data or the geometry column. Second, it only recognizes some attributes (see below), and the rest it tries to add them as hstore tags. So we have to manually edit the file to go from:

INSERT INTO "lines" ("osm_id","name","highway","waterway","aerialway","barrier","man_made","z_order","other_tags",geom)
    VALUES ('430573542','Montagne Sainte-Victoire',NULL,NULL,NULL,NULL,NULL,'0','"natural"=>"ridge"','010500002031BF0D[...]');

into:

INSERT INTO "planet_osm_line" ("osm_id","name","z_order","natural",way)
    VALUES ('430573542','Montagne Sainte-Victoire','0','ridge','010500002031BF0D[...]');

See? s/lines/planet_osm_line/, s/other_tags/"natural"/ (with double quotes, because natural is a keyword in SQL, as in natural join), s/geom/way/ and s/'"natural"=>"ridge"'/'ridge'/ (in single quotes, so it's a string; double quotes are for columns). And I also removed the superfluous values and the ANALIZE line, as I don't care that much. Easy peasy.

A comment on the options for shp2pgsql. -s 900913 declares the SRID of the database. I got that when I tried without and:

ERROR:  Geometry SRID (0) does not match column SRID (900913)

-S is needed because shp2pgsql by default generated MultiLineStrings, but that table in particular has a LineString way column. This is how I figure it out:

ERROR:  Geometry type (MultiLineString) does not match column type (LineString)

Incredibly, after this data massacre, it loads in the db:

$ psql gis < 430573542.sql
SET
SET
BEGIN
INSERT 0 1
COMMIT

Enjoy!


openstreetmap gdal postgis

Posted mar 12 jul 2016 17:37:22 CEST Tags: gdal

As I mentioned in one of my lasts posts about it, I started using SRTM 1 arc data for rendering my maps. So far, with the 3 arc dataset, I was able to generate a huge image by stitching the tiles with gdal_merge.py, then generating the 3 final images (height, slope and hill shade) plus one intermediary for the slope, all with gdaldem. Now this is no longer possible, as the new dataset is almost 10x the old one, so instead of going that way, I decided to try another one.

With gdalbuildvrt it is possible to generate an XML file that 'virtually' stitches images. This means that any attempt to access data through this file will actually make the library (libgdal) to find the proper tile(s) and access them directly.

So now the problem becomes processing each tile individually and then virtual stitching them. The first part is easy, as I just need to do the same I was doing to the huge stitched image before. I also took the opportunity to use tiled files, which instead of storing the image one scan line at a time (being 1 arc second resolution, each scan line has 3601 pixels; the extra one is for overlapping with the neighbors), it stores the file in 256x256 sub-tiles, possibly (that is, no tested) making rendering faster by clustering related data closer. The second step, with gdalbuildvrt, should also be easy.

The first block on the way is the fact that SRTM tiles above 50°N are only 1801 pixels wide, most posibly because it makes no sense anyways. This meant that I had to preprocess the original tiles so libgdal didn't have to do the interpolation at render time (in fact, it already has to do it once while rendering, using the lanczos scaling algorithm). This was done with gdalwarp.

The second one came from slope and hill shading tiles. As the algortihm goes, it generates some 'fade out' values in the edges, and when libgdal was stitching them, I could see it as a line in the seam. This was fixed by passing -compute_edges to gdaldem.

Finally, for some reason gdalbuildvrt was generating some very strange .vrt files. The format of these files is more or less the following:

  • For each band in the source tiles it creates a band in the result.
    • For each source tile, it describes:
      • The source file
      • The source band
      • The size and tiling of the source (3601², 256²)
      • The rectangle we want from the source (0², 3601²)
      • The rectangle in the result (x, y, 3601²)

The problem I saw was some weird declararions of the rectangles in the result, where the coordinates or the sizes didn't match what I expected. I will try to figure this out with the GDAL poeple in the following weeks, but first I want to make sure that the source tiles are easily downloadable (so far I have only found download options through USGS' EarthExplorer, which requires you to be logged in in order to download tiles, which means that it is not very scriptable, so not very reproducible).

So for the moment I'm using my own .vrt file generator, completely not generic enough for release, but soon. I also took the opportunity to make the rectangles in the result non-overlapping, being just 3600² in size. I know that the generated file works because I'm also generating smaller samples of the resulting layers (again, height, slope and hill shading) for rendering smaller zoom levels.

The only remaining question about huge DEM datasets is contour generation. So far I had just generated contour lines for each tile and lived with the fact that they too look ugly at the seams.


gdal gis srtm

Posted jue 30 abr 2015 15:03:22 CEST Tags: gdal

Since I started playing with rendering maps I included some kind of elevation info for highlighting mountains. At the beginning it was just hillshading provided by some german guy (I don't have the reference on me right now), but after reading Tilemill's terrain data guide, I started using DEMs to generate 4 different layers: elevation coloring, slope shading, hillshading and contour lines.

When I started I could find only three DEM sources: SRTM 3arc and ViewFinderPanoramas (1arc and 3arc). The second one tries to flatten plains (for instance, the Po's plain nearby to where I live), where it generates some ugly looking terracing. The third one, when I downloaded the corresponding tile (they're supposed to be 1x1 degrees), its medatada reported an extension between 7 and 48 degrees east, and between 36 and 54 degrees north, and that its size is 147602x64801 pixels. I also remember stitching all the tiles covering Europe, just to get a nice 1x1 degrees hole in the North Adriatic sea. Not having much time to pre or post process the data, I decided to stick to SRTM.

Things changed at the end of last year. The US government decided to release 1arc, 30m global coverage (previously that resolution covered only the US). I started playing with the data mid January, only to find that it is not void-filled: this DEMs are derived from the homonymous Shuttle mission, which used radar to get the data. Radar gets very confused when water is involved; this is no problem on rivers, lakes or sea, where elevation is constant relative to the coast, but it's a problem on snow covered mountains, glaciers and even clouds. This means that the data has NODATA holes. The SRTM 3arc v4.1 I was using had these 'voids' filled; deFerrantis has painstakingly been filling these voids too by hand, using topographic maps as reference.

So I set up to fill these voids too. But first let's see how the original data looks like. All the images are for the area near Isola 2000, a ski station I go often. The first image is how this looks on the SRTM 3arc v4.1:

This is a 4x4 grid of 256x256 pixel tiles (1024x1024 in total) at zoom level 13. The heights range from ~700m up to ~2600m, and the image combines all the 4 layers. It already starts showing some roundness in the terrain, specially in ridges and mountain tops; even at the bottom of the deep valleys.

For contrast, this is deFerrantis data:

This is the first time I really take a look on the result; it doesn't seem to be much better that 3arc. Here's the terracing I mentioned:

For contrast, check what 1arc means:

From my point of view, the quality is definitely better. Peaks, crests and valleys are quite sharp. As for the mountain sides, they look rugged. My appreciation is this reflects better the nature of the terrain in question, but Christoph Hormann of Imagigo.de views it as sample noise. He has worked a lot on DEMs to generate very beautiful maps.

But then we have those nice blue lagoons courtesy of voids (the blue we can see is the water color I use in my maps). So, how to proceed?

The simplest way to fix this is covering the voids with averages calculated from the data at the seams of the voids. GDAL has a tool for that called, of course, gdal_fillnodata.py. This is the outcome:

At first this looks quite good, but once we start to zoom in (remember there are at least 5 more zoom levels), we start to see some regular patterns:

Another option is to use deFerrantis' data to fill the voids. For this we need to merge both datasets. One way to do it is using GDAL's gdalwarp tool. We create a file piling up layers of data; first the most complete one, then the layers with holes:

gdalwarp deFerrantis/N44E007.hgt mixed.tif
gdalwarp SRTM_1arc_v3/n44_e007_1arc_v3.tif mixed.tif

This looks like this:

I have to be honest, it doesn't look good. Both files declare the same extents and resolution (their metadata is similar, but the second file has more), but if you compare the renders for SRTM_1arc_v3 and deFerrantis, you will notice that they don't seem to align properly.

The last simple option would be to upsample SRTM_3arc_v4.1 and then merge like before, but it took me a while to figure out the right parameters:

gdalwarp -te 6.9998611 43.9998611 8.0001389 45.0001389 -tr 0.000277777777778 -0.000277777777778 -rb SRTM_3as_v4.1/srtm_38_04.tif srtm_1as_v3-3as_v4.1.tif
Creating output file that is 3601P x 3601L.
Processing input file SRTM_3as_v4.1/srtm_38_04.tif.
Using internal nodata values (eg. -32768) for image SRTM_3as_v4.1/srtm_38_04.tif.
0...10...20...30...40...50...60...70...80...90...100 - done.
gdalwarp SRTM_1as_v3/n44_e007_1arc_v3.tif srtm_1as_v3-3as_v4.1.tif
Processing input file SRTM_1as_v3/n44_e007_1arc_v3.tif.
Using internal nodata values (eg. -32767) for image SRTM_1as_v3/n44_e007_1arc_v3.tif.
0...10...20...30...40...50...60...70...80...90...100 - done.

The complex part was the -te and -tr parameters. The 3 arc file covers a 5x5 degrees zone at 40-45°N, 5-10°E. The 1 arc file only covers 44-45°N, 7-8°E, so I need to cut it out. I use the -te option for that, but according the doc it's «[to] set georeferenced extents of [the] output file to be created». This means the degrees of the W, S, E and N limits of the output file. Note that the actual parameters are 1.5" off those limits; I took those limits from the original 1 arc file.

The second one is even more cryptic: «[to] set [the] output file resolution (in target georeferenced units)». This last word is the key, units. According to gdalinfo, both files have a certain pixel size in units declared in the projection. Both declare UNIT["degree",0.0174532925199433]; “degree“ has a clear meaning; the float besides it is the size of the unit in radians (π/180). So the parameters for -tr is how many units (degrees) does a pixel represent (or, more accurately, how many degrees are between a pixel center and the next one). Notice that the vertical value is negative; that's because raster images go from North to South, but degrees go the other way around (North and East degrees are positive, South and West negative). In any case, I also just copied the pixel size declared in the 1 arc file.

After all this dissertation about GeoTIFF metadata, finally the resulting DEM:

As I said earlier, I don't have much time to invest in this; my map is mostly for personal consumption and I can't put much time on it. so my conclusion is this: if I can manage the 9x data size, I think I'm going this way.


elevation gdal gis srtm

Posted sáb 07 feb 2015 02:25:17 CET Tags: gdal

Another long time without mentioning any advancements on my map making efforts. While not much has changed, what has changed is a big step to easier customization.

In the last post in this series I gave a quick list on how I make my own maps:

  • Use SRTM elevation to generate a hypsometric base map, hill and slope shading, and finally hypsometric curves. For the latter I'm using gdal-contour and shp2pgsql.
  • Use any extract you want and import them with osm2pgsql. Small extracts import quite quick, but so far, I have never succeeded to import a big part of Europe (from Portugal to south Finland, so it covers UK/Ireland and Istambul) in a machine with 8GiB of RAM and hard disks. The recommendation is to use a machine with lots of RAM (16+, I heard up to 256 for the whole planet) and SSDs.
  • Use TileMill for the initial design.
  • Do a lot of xml manipulation to try to customize OSM's mapnik style based on your design.
  • Use [generate_tiles.py](https://github.com/openstreetmap/mapnik-stylesheets/ generate_tiles.py) to, well, generate the tiles.

But since August 2013 things have changed in the 3rd and 4th items in that list. Andy Allan had finished a first big iteration of redoing OSM's mapnik style in CartoCSS. The latter is a CSS-like language that is the native way to style things in TileMill. Since then, customizing is way much easier, not only because of the Cascading part of CSS, but also because Andy took the time to make a lot of things (widths, colors) to things similar to C/C++'s #defines, which you can override and impact anywhere where the definition is used.

So now, steps 3 and 4 are more like:

  • Use TileMill to change OSM's initial design[1].
  • Export the design to a mapnik XML file.

In fact, that last step could be avoided, given that TileMill can also render everything by himself.

The last step in having your own tiles to be able to use them. I use Marble in my phone, but I also setup a slippy map so I can brag about it. Unluckily I can't embed the map here (I should fix this).

The tiles served actually come from different rendering passes using different, incremental designs. The previous-to-last one can be seen in Deutschland's region; I will be rendering some parts of London with a newer design before the end of the month.


[1] You can use any text editor to change the CartoCSS files; TileMill will pick them up via inotify and rerender accordingly. The only problem is when you're editing multiple files to impact a zoom level that takes a lot to render (for me is between 6 and 8).


openstreetmap elevation gdal gis srtm

Posted mié 05 feb 2014 01:59:00 CET Tags: gdal