<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>.:: Marcos Dione/StyXman's glob ::. (Posts about shapefiles)</title><link>https://www.grulic.org.ar/~mdione/glob/</link><description></description><atom:link href="https://www.grulic.org.ar/~mdione/glob/categories/shapefiles.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2025 &lt;a href="mailto:mdione@grulic.org.ar"&gt;Marcos Dione&lt;/a&gt; </copyright><lastBuildDate>Sat, 15 Nov 2025 20:52:05 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Measure your optimizations</title><link>https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/</link><dc:creator>Marcos Dione</dc:creator><description>&lt;p&gt;One of the parts of having my own map style with hypsometric contour lines is that I
have to generate those contour lines. There's a tool in GDAL, particularly the one that
actually does everything based on DEM files, called &lt;code&gt;gdaldem&lt;/code&gt; that
can generate shapefiles with contour lines that &lt;code&gt;mapnik&lt;/code&gt; can read.
But since my source files are 1x1° tiles, I would have to define one layer for
each shapefile, and that doesn't scale very well, especially
at planet size.&lt;/p&gt;
&lt;p&gt;So I convert those shapefiles to SQL files and load
them into my database one by one; then I can rely on &lt;code&gt;mapnik&lt;/code&gt;'s own support for filtering
by bbox when it's rendering, so that should be faster&lt;sup id="fnref:4"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fn:4"&gt;4&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;I keep the SQL files on my file system
and import them by hand as I need them, and I'm running out of space again. A few years
ago I had a 1TB disk and that was enough; now I am on a 2TB disk
and it's getting small. I have the impression that the new DEMs I am using are bigger,
even though I streamlined every layer so it uses as little space as possible.&lt;/p&gt;
&lt;p&gt;One of the things I'm doing
is converting my processing script into a &lt;code&gt;Makefile&lt;/code&gt;, so I can remove intermediary files. My process starts
from the original DEM files, which are in LatLon, and reprojects them to Web Mercator.
The reprojected file becomes the source for the terrain files, which give the hypsometric
tints; I generate the contours from there, and then I do a
&lt;a href="https://www.grulic.org.ar/~mdione/glob/posts/trying-to-calculate-proper-shading/"&gt;compensation for slope shade and hill shade&lt;/a&gt;.
That leaves
two intermediary files that I can easily remove: the reprojected file, which I no longer need
once I have the terrain and contour files;
and the compensated file, which I no longer need once I have the shade files.
The &lt;code&gt;Makefile&lt;/code&gt; covers that part:
once the final files are generated, the intermediary files are gone.&lt;/p&gt;
&lt;p&gt;Going back to the SQL files: I don't inject all the SQL data directly into my database, because I don't have space for
that. Instead I generate each SQL file and compress it so it takes less space,
because SQL is really a lot of text. I've been using &lt;code&gt;xz&lt;/code&gt; as the compressor,
and I have been blindly using its highest compression level, CL 9.
What do I mean by
blindly? I never checked what it costs, and it actually takes a lot of time. I just measured it with one tile,
and it took 451 seconds. That's 7.5 minutes per 1x1° tile,
which is a lot. So I asked myself: how much compression do I get for the time
spent?&lt;/p&gt;
&lt;p&gt;I took a single file and compressed it with every compression level
from 1 to 9, recording the time taken and the size of the resulting file. I made a scatter graph,
and it looks like this pretty weird Z figure&lt;sup id="fnref:2"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fn:2"&gt;2&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="" src="https://www.grulic.org.ar/~mdione/glob/images/xz_compression_levels-time_and_sizes.png"&gt;&lt;/p&gt;
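&lt;p&gt;A measurement like this could be sketched in Python as follows. This is not the script I used: it assumes Python's standard &lt;code&gt;lzma&lt;/code&gt; module as a stand-in for the &lt;code&gt;xz&lt;/code&gt; command line tool, so absolute timings will differ, and the function name &lt;code&gt;benchmark_xz_levels&lt;/code&gt; and the sample data are made up for illustration.&lt;/p&gt;

```python
import lzma
import time


def benchmark_xz_levels(data, levels=range(1, 10)):
    """Compress data once per preset; return a list of (level, seconds, size)."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=level)
        elapsed = time.perf_counter() - start
        results.append((level, elapsed, len(compressed)))
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for one tile's SQL dump; replace with real data.
    sample = b"INSERT INTO contours VALUES (100, 'LINESTRING(0 0, 1 1)');\n" * 20_000
    for level, seconds, size in benchmark_xz_levels(sample):
        ratio = size / len(sample)
        print(f"CL {level}: {seconds:.2f}s, {size} bytes, {ratio:.2%} of original")
```

&lt;p&gt;Highly repetitive sample data like the one above compresses far better than real contour data, so only numbers taken on a real tile are worth comparing.&lt;/p&gt;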
&lt;p&gt;Here's the raw data&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: center;"&gt;level&lt;/th&gt;
&lt;th style="text-align: center;"&gt;time_in_seconds&lt;/th&gt;
&lt;th style="text-align: center;"&gt;readable_time&lt;/th&gt;
&lt;th style="text-align: right;"&gt;size_in_bytes&lt;/th&gt;
&lt;th style="text-align: center;"&gt;comp_ratio&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;1&lt;/td&gt;
&lt;td style="text-align: center;"&gt;57.84&lt;/td&gt;
&lt;td style="text-align: center;"&gt;57s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;129_486_376&lt;/td&gt;
&lt;td style="text-align: center;"&gt;29.21%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;2&lt;/td&gt;
&lt;td style="text-align: center;"&gt;117.40&lt;/td&gt;
&lt;td style="text-align: center;"&gt;1m57s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;129_993_440&lt;/td&gt;
&lt;td style="text-align: center;"&gt;29.33%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;3&lt;/td&gt;
&lt;td style="text-align: center;"&gt;252.28&lt;/td&gt;
&lt;td style="text-align: center;"&gt;4m12s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;130_306_780&lt;/td&gt;
&lt;td style="text-align: center;"&gt;29.40%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;4&lt;/td&gt;
&lt;td style="text-align: center;"&gt;212.26&lt;/td&gt;
&lt;td style="text-align: center;"&gt;3m32s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;102_359_596&lt;/td&gt;
&lt;td style="text-align: center;"&gt;23.09%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;5&lt;/td&gt;
&lt;td style="text-align: center;"&gt;347.51&lt;/td&gt;
&lt;td style="text-align: center;"&gt;5m47s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;98_992_464&lt;/td&gt;
&lt;td style="text-align: center;"&gt;22.33%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;6&lt;/td&gt;
&lt;td style="text-align: center;"&gt;344.58&lt;/td&gt;
&lt;td style="text-align: center;"&gt;5m44s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;99_114_560&lt;/td&gt;
&lt;td style="text-align: center;"&gt;22.36%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;7&lt;/td&gt;
&lt;td style="text-align: center;"&gt;370.20&lt;/td&gt;
&lt;td style="text-align: center;"&gt;6m10s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;99_043_096&lt;/td&gt;
&lt;td style="text-align: center;"&gt;22.34%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;8&lt;/td&gt;
&lt;td style="text-align: center;"&gt;416.48&lt;/td&gt;
&lt;td style="text-align: center;"&gt;6m56s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;99_005_352&lt;/td&gt;
&lt;td style="text-align: center;"&gt;22.33%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: center;"&gt;9&lt;/td&gt;
&lt;td style="text-align: center;"&gt;451.85&lt;/td&gt;
&lt;td style="text-align: center;"&gt;7m31s&lt;/td&gt;
&lt;td style="text-align: right;"&gt;99_055_552&lt;/td&gt;
&lt;td style="text-align: center;"&gt;22.35%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;I'm not going to explain the graph or table, except to point out the two obvious parts: the jump from CL 3 to 4,
which is not only the first and only noticeable space gain, but also takes less time; and the fact that within compression
levels 1-3 and within 4-9 the size barely changes. So I either use CL 1 or 4. I'll go for 1, until I run out of
space again.&lt;/p&gt;
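&lt;p&gt;To put numbers on that choice, here is a small sketch that compares two levels using the measurements from the table. The helper name &lt;code&gt;compare&lt;/code&gt; is hypothetical; the data is copied from the table as is.&lt;/p&gt;

```python
# (level, seconds, bytes), copied from the table above.
MEASUREMENTS = [
    (1, 57.84, 129_486_376),
    (2, 117.40, 129_993_440),
    (3, 252.28, 130_306_780),
    (4, 212.26, 102_359_596),
    (5, 347.51, 98_992_464),
    (6, 344.58, 99_114_560),
    (7, 370.20, 99_043_096),
    (8, 416.48, 99_005_352),
    (9, 451.85, 99_055_552),
]


def compare(base_level, other_level, measurements=MEASUREMENTS):
    """Return (time_factor, size_saving) of other_level relative to base_level."""
    by_level = {level: (seconds, size) for level, seconds, size in measurements}
    base_seconds, base_size = by_level[base_level]
    other_seconds, other_size = by_level[other_level]
    time_factor = other_seconds / base_seconds
    size_saving = 1 - other_size / base_size
    return time_factor, size_saving


if __name__ == "__main__":
    for base, other in [(1, 4), (4, 9)]:
        factor, saving = compare(base, other)
        print(f"CL {base} to CL {other}: {factor:.2f}x the time, {saving:.1%} smaller")
```

&lt;p&gt;Going from CL 1 to CL 4 costs about 3.7 times the time for a file about 21% smaller, while going from CL 4 to CL 9 roughly doubles the time again for about 3% more.&lt;/p&gt;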
&lt;p&gt;All this to say: whenever you make an
optimization, measure all the dimensions: time, space, memory consumption,
and any other constraints you may have, like, I don't know, heat produced. Measure and compare.&lt;/p&gt;
&lt;div class="footnote"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Sorry for the ugly table style. I still don't know how to style it better. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fnref:1" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Sorry for the horrible scales. Either I don't know it well enough, or LibreOffice is quite limited in how it can format the
  axises&lt;sup id="fnref:3"&gt;&lt;a class="footnote-ref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fn:3"&gt;3&lt;/a&gt;&lt;/sup&gt;. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fnref:2" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;No, I won't bother to see how the plural is made, this is taking me long enough already :-P &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fnref:3" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;This claim has not been proven and it's not in the scope of this post. &lt;a class="footnote-backref" href="https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/#fnref:4" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description><category>compression</category><category>contours</category><category>dem</category><category>gdal</category><category>mapnik</category><category>optimization</category><category>shapefiles</category><category>shp2pgsql</category><category>xz</category><guid>https://www.grulic.org.ar/~mdione/glob/posts/measure-your-optimnizations/</guid><pubDate>Tue, 20 Aug 2024 13:53:57 GMT</pubDate></item></channel></rss>