Collating, processing, managing, backing up and serving a gallery of a 350GiB, 60k picture collection

In the last two days I have commented a little bit on how I process and manage my photos. I'm not a very avid photographer; I have some 350 gigabytes of photos, around 60,000 of them, most of them not yet processed. So I will comment a little bit more on how I manage all that.

I start with the camera, a 24Mpx camera, just a couple of lenses, nothing fancy. Go out, take some pictures, come back home.

I put the SD card in my computer and use my own software to import it. The import process is not fancy: it just empties the SD card, checks every file for its EXIF information, uses the date and time to build the filename (plus a sequence number if needed), and puts them all in a single incoming directory where all the current unprocessed images live1.
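
The naming part boils down to something like this. This is just a sketch, assuming the exifread library and made-up paths; my actual importer does a bit more bookkeeping:

#! /usr/bin/env python3
# sketch: move photos from the SD card into a single incoming directory,
# naming them after the EXIF date/time plus a sequence number on collisions.
# exifread and the paths here are assumptions, not what my importer actually uses.

from pathlib import Path
import shutil

import exifread

SD_DIR = Path('/media/sdcard/DCIM')
INCOMING = Path.home() / 'Pictures' / 'incoming'

def exif_stamp(path):
    with open(path, 'rb') as f:
        tags = exifread.process_file(f, details=False)
    # 'EXIF DateTimeOriginal' looks like '2023:11:29 21:12:35'
    value = str(tags['EXIF DateTimeOriginal'])
    return value.replace(':', '').replace(' ', '-')  # -> 20231129-211235

def import_card():
    for src in sorted(SD_DIR.rglob('*.JPG')):
        stamp = exif_stamp(src)
        dst = INCOMING / f"{stamp}.jpg"
        seq = 0
        while dst.exists():  # burst shots can share the same second
            seq += 1
            dst = INCOMING / f"{stamp}-{seq:02d}.jpg"
        shutil.move(str(src), str(dst))  # this is what 'empties' the card

if __name__ == '__main__':
    import_card()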

Then I use a piece of software I developed in PyQt5. It's very, very basic, but it's really quick and mostly keyboard driven. It reads the EXIF information and presents some of the tags at the left of the screen: things like date, time, size and orientation, and then focal length, aperture, ISO and various other data I can get from the images. It's mostly focused on my current camera and the previous one, both Nikons2. The previous one was an N90; right now it's an N7200. The image occupies most of the window, and the program is always in full screen. At the bottom there's the filename and a couple of toggles.

I can do several things with this:

  • Go forwards and backwards, by one, by ten, by a hundred and by a thousand, because that incoming directory right now holds almost seven years of history, probably ten thousand pictures.

  • Move randomly, which lets me pick a new thing to collate when I get bored with the current one but still want to keep chipping away at the backlog.

  • Mark the images in different ways. The main ones are about selecting for storage, with two modes: one keeps the image at its original size, which I usually use for my best landscape or astro photos; the other resizes it down to roughly 13 megapixels3, from 6000x4000 pixels to 4500x3000 pixels, 75% on each dimension.

  • Rotate the images, just in case the camera did not guess the orientation correctly, usually when I'm shooting pointing straight up or straight down.

  • Select several pictures for stitching, which is done with hugin. It's not 100% automatic, but at least it puts the pictures in a stitch directory and points hugin there.

  • Select a picture for cropping or editing; I'm not going to develop a whole image editor, so I just delegate to an existing program, gwenview.

  • Select images for deleting and delete them permanently.

  • Select several images for comparison and enter/exit comparison mode, which means that going backwards and forwards applies only to this set. This is good for comparing shots that are not necessarily consecutive in the original sequence, which for me makes culling images faster.

  • It has two zoom levels, fit to screen and full size. I don't have much need for other options.

  • 99% of the pictures I take are freehand, so in a sequence there's always some movement between images. In full size I can give every image its own pan position, aligning the whole sequence, which allows culling based on blurriness or other factors.

  • Also in full size, I can lock the view, so when I pan one of the images and switch to another one, that second image pans to the same position. This also helps when I'm checking details between two different shots of the same thing.

  • Move all the selected images, resizing them if needed, and put them in a folder. It also hardlinks my folder categorization into a collection of images by date: one folder per year and month with all the pictures of that month inside. Because they're hardlinks, the image files are not duplicated, saving space (see the sketch after this list).

  • It also has a read-only mode, so I can hand the computer to my kids to look at the photos.
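
The by-date hardlinking mentioned above boils down to something like this. Again, a sketch with made-up paths, not the actual program:

import os
from pathlib import Path

CATEGORIES = Path.home() / 'Pictures' / 'categories'  # my own tree
BY_DATE = Path.home() / 'Pictures' / 'by-date'        # 2024-03/, 2024-04/, ...

def link_by_date(image):
    # filenames start with the EXIF date, e.g. 20240310-134233.jpg
    year, month = image.name[:4], image.name[4:6]
    month_dir = BY_DATE / f"{year}-{month}"
    month_dir.mkdir(parents=True, exist_ok=True)
    target = month_dir / image.name
    if not target.exists():
        os.link(image, target)  # hardlink: same inode, no extra space used

for image in CATEGORIES.rglob('*.jpg'):
    link_by_date(image)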

When culling, I use the comparison mode and individual position and lock view features a lot, going back and forth between images, discarding until only one is left.

That's the first part, the one I must spend my time on: basic culling, selection and storage. The main tree is simply organized by my own way of categorizing the images.

My program doesn't have a directory view; instead, I just use gwenview again.

Notice there's no photo editing in this workflow. I rarely shoot in RAW for two reasons: a) I'm really bad at postprocessing; and b) even if I were good, I don't have the time to do it; my free time is shared among several hobbies. I only do it for astrophotography and on very few, rare occasions.

The third tool I use is digikam. I use it for two things, which are related: semi-automatic and manual tagging. The semi-automatic part is face detection; digikam can find and guess faces, but it requires manual confirmation4. The fully manual part is plain tagging, mostly with location5 and sometimes some other info. I also sometimes rate my pictures; I mostly use four and five, sometimes three, and only for my best pictures.

Then there's another script that reads the digikam database and uses the tags to create another directory tree, one folder per tag, again using hardlinks. It still doesn't do anything with the ratings, but I could easily add that.
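
Roughly like this. The table names are what I remember of the digikam4.db schema, so treat them as assumptions and check them against your own database:

import os
import sqlite3
from pathlib import Path

DB = Path.home() / 'Pictures' / 'digikam4.db'
ROOT = Path.home() / 'Pictures' / 'categories'   # the album root digikam scans
BY_TAG = Path.home() / 'Pictures' / 'by-tag'

# Albums/Images give the file location, Tags/ImageTags the tagging
# (table and column names assumed from digikam's SQLite schema)
QUERY = """
SELECT Tags.name, Albums.relativePath, Images.name
  FROM ImageTags
  JOIN Tags   ON Tags.id = ImageTags.tagid
  JOIN Images ON Images.id = ImageTags.imageid
  JOIN Albums ON Albums.id = Images.album
"""

for tag, album, name in sqlite3.connect(DB).execute(QUERY):
    source = ROOT / album.lstrip('/') / name
    tag_dir = BY_TAG / tag
    tag_dir.mkdir(parents=True, exist_ok=True)
    target = tag_dir / name
    if source.exists() and not target.exists():
        os.link(source, target)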

That's all on my personal computer. I use rsync to make a copy on my home server, which has two purposes. One, it's a backup, which includes all the original 24Mpx images I haven't culled yet, which I think is the biggest part of my collection.

The second one: it feeds a gallery program developed in PHP by a guy named Karl. It's probably the only paid software I use. It's a single PHP file that you put at the root of your gallery; you enable PHP processing in your web server (in my case, Apache), and it generates the gallery on the fly, just reading the directories and creating all the necessary thumbnails and so on. I did make a small change to this program. The original algorithm keys thumbnails on each file's path (and 4 or 5 other attributes, I think), but because I have all these hardlinks, it was creating duplicate thumbnail files. So I changed it to use the filename instead of the filepath6.

I don't have any kind of synchronization with my phone. Most of the pictures I take with it are not the kind I would put in my own gallery, except for the times I go out without my camera and end up taking pictures anyway. I still don't have a workflow for that; it's mostly manual. So if I ever lose my phone I'm fscked, because I definitely have no backups of it.

That lack of synchronization also means that the only way to see the pictures on my phone is to open the gallery in the browser. It's not the best, but I don't do that very often. I have tried alternatives like NextCloud, which I also have installed on my home server, but I have some issues with permissions: again, this is a backup directory, so the files are owned by my user instead of the web server's, which means NextCloud doesn't have the permissions it needs to manage those files. Luckily files.gallery just needs a subdirectory.

Another reason is that before this I was using static gallery generators: sigal, gallerpy or even nikola, which drives this glob. All of those generate the gallery statically, so serving them is much easier. My old home server died at some point and I had to come up with something; I had a spare old laptop lying around and I used that. It turns out to be enough to generate the gallery on the fly. I have plans for something bigger, but that's for another time.


  1. In fact I have another directory for all the unprocessed photos from another era, and I'm thinking of starting a new era. 

  2. Even if EXIF is a standard for storing tags, there's no standard for the tag names, so every manufacturer has its own set, which even changes between camera lines. For a better idea of what I'm talking about, just peruse Image::ExifTool's source code.

  3. I currently own no screen that is 4500 pixels wide, let alone 6000. Maybe my kids will, but by then Mpx counts will be so different that it won't make any sense to accommodate for that. Right now storage is expensive for me, so I'll keep it this way.

  4. Or rejection: the false positive rate is bigger than I would like, and there's no way to say 'yes, this is that person, but don't train on this image'. This is the case for pictures where the face is semi-occluded, sometimes painted, sometimes badly lit, and mostly just blurry.

  5. Most of my pictures don't have GPS info, not even the ones from the phone; there I only enable it when I really need the info later, mostly for mapping. Afterwards I either discard the photo or strip the info.

  6. For a while now I've even been making this distinction in my own code: filename vs filepath.

Yuba Kombi + Virvolt 900 - a trip review

Editor's note: another dictated1

Last September or October we bought a Yuba Kombi and took it to a bicycle repairman to install a Virvolt 900, which is a 691Wh battery and an 80Nm, 36V motor. This is slightly above the usual sizes you can find around, but the bicycle is quite heavy: it has the footrests for the kids behind, plus the monkey bars, plus the seat for the youngest and a quite big basket in the front. That's 24kg for the bike, then 3kg for the motor and another 3kg for the battery, excluding all the accessories.

Today I decided to take it for a long spin. I've been using it for a while, but not for very long distances. Last weekend I took it almost to Callelongue, which is not that far: 20km there and back, with a pit stop at the supermarket for two full bags of groceries. It started with less than a full battery and I had to bring it back with some juice left for the next day, and it passed that non-test with good marks.

Today the trip was different: I wanted to get an idea of how long the battery lasts, and for that I selected a mixed trip: a mostly flat section, 34.5km long with only 240m of climb, and a quite steep one at the end, 9.2km long with 408m of climb. Technically I didn't choose that; what I chose was to go to Sainte-Baume, and then I started looking at the routes proposed by every router I had available. The first thing I noticed is that many of these routes had a lot of uphill and downhill sections, ups and downs over and over, which meant there were several stretches where every meter I gained I had to give back and then climb again. That made no sense, especially since electric bikes don't have any kind of energy recovery: if you spend energy twice or thrice to climb the same elevation meters, you are basically wasting it.
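
That wasted climb is easy to quantify from an elevation profile; a toy example, not any router's actual cost model:

def ascent_stats(elevations):
    """Return (plain_ascent, total_ascent, extra_ascent) in meters
    from a list of elevation samples along the route."""
    plain = elevations[-1] - elevations[0]
    total = sum(max(b - a, 0) for a, b in zip(elevations, elevations[1:]))
    # extra ascent: climbing you only do to give it back later
    return plain, total, total - plain

# a route with two pointless 50m bumps on the way to a 400m pass
print(ascent_stats([0, 50, 0, 50, 0, 400]))  # (400, 500, 100)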

I remembered BRouter, which is supposed to be a bicycle-centered router, and started playing with it. I also watched a talk from five years ago where the author, Arndt Brenschede, explains that the only energy you really have to think about is what you lose to friction against the road and the wind. I don't think he ever explains it really well, though.

I chose the trekking bike profile, which was still showing me some ups and downs, so I started picking it apart, and I discovered that the profile said going up had no penalty, which I thought was wrong. In fact going down had a penalty of 60, meaning every 100m of descent is equivalent to doing 6km on the flat. I said "well, if I want to penalize going up, then I'll just put a similar value on going up", and that gave me a route that was mostly flat and climbed almost monotonically.

The trip2

Of course all this is according to the data these routers use, which is mostly based on SRTM. This dataset has several issues, the first one being the resolution: it's only 30 meters, and that's 30 meters of longitude at the Equator and 30 meters of latitude everywhere. I already discussed that the actual longitudinal resolution changes with latitude: every degree becomes smaller and smaller the closer you get to the poles, but I guess the routers take that into account. Even if it's a DEM, meaning it doesn't include trees and buildings, the level of detail for the road network can be really bad, and for ebikes and acoustic bikes every meter counts.

On the way up I stopped several times: to buy some food; to eat said food, get some water and rest a little bit; to take some pictures, because to me that's part of going around. Everything was fine, but my legs were getting really tired when I arrived at Saint-Zacharie, which is only 270 meters above sea level, right before the steep climb. So far I had mostly used level 2 out of five assistance levels, and gears 7-9 out of nine, the same as what I use in the city when I'm alone. This is an assistive motor, not a pulling motor, so it only works when you work. That means I had pedaled almost constantly for 35km when I got to this part. I switched to the lighter gears, between 6 and 3, and to assistance levels 4 and 5.

When I reached Plan d'Aups I was not really tired; it was a fun experience on a really nice day. Then I decided to go down, and I made one mistake: not taking the same route back home, but a shorter route that not only climbed 55m higher, it also had 80m of extra ascent6, for a total of 135m more than the original plan. I had only two bars out of five left and, like many other electric vehicles, when this system gets down to one bar it stops giving much assistance to keep the battery from fully draining. My plug-in hybrid car does the same: when you reach the last 25% of the battery, it disables full electric mode; at first it starts the ICE motor to charge the battery, then to drive the car, and there's no way to turn that off, even if you are close to your destination. I guess other cars would have this option if you used the internal routing system, but those maps are usually really crappy and outdated; or, if they're not out of date, it probably means the car has a 4G chip that also sends a lot of telemetry home. That's a can of worms that I'll keep in the rack for a while.

I took the route down, which was really fun, even if the bike is not the best for that. In particular, the wheels don't have road tires but something more like gravel tires with lugs. I think I lost a lot of energy to that friction, the same way you can feel the difference between good asphalt and bad or old asphalt. Despite the steep descent, I could not make it go above 50km/h :)

When I reached Gémenos I had only one bar left3 and the system started to skimp on energy, and that's when I got to the flats. I decided to save the battery, so I turned the assistance off and started pedaling back home. It was still more than 20 kilometers away and I was really tired. I'm definitely out of shape for this kind of thing; the whole trip was around 80km and 785m of climb, something I definitely could not have done on an acoustic bike; I barely could with this one. Let's see tomorrow what my body says4.

From there the way back home was really slow. I stopped several times, which let my legs rest a little bit, took a few pictures and checked directions, making sure I was on the right road. I hit another issue: one of the roads the router asked me to take was closed, and the only sensible option was another route that went up 15m in some 300m. I was definitely very, very, very tired by then and my left knee had started to hurt, so at this point I was miserable and I still had more than 15 kilometers to go.

I made a few other stops, bought a coffee and sweets at a bakery, and just rode back being really, really careful about what I was doing, taking advantage of every downhill section and going at a pace that wasn't painful on every uphill one. It's a shame that my recording phone didn't work; the GPX would have shown how bad this was compared to being able to use the assistance, even when I was technically going down.

I'm now in the phase of preparing another test trip: Clermont-Ferrand - Mont-Dore, some 45km, +680m/+-380m6. My idea is to stop somewhere in the middle and try to recharge the battery, but the region is so sparsely populated that even finding a bakery, let alone a restaurant, means extending the trip by a few km and a few more +- meters. One route, passing by the single restaurant I could find in the general direction, adds 1.4km and +-120m; another, which goes by a charging station, adds 4.2km and only +-30m, but that means having a snack, waiting in a parking lot, and leaving the scenic road I wanted.

I have several rabbit holes I would like to go down and those include:

  • Figure out whether I want a big bicycle like this or a smaller one with smaller tires (I still want to carry two young children for a while). Smaller tires mean a worse riding experience, especially since they don't have any suspension at all. But a lighter bike with the same system would have more range! One thing I know is that I'm not going to do this kind of trip with my kids. They're still young (almost 5 and 7yo) and they would get bored in the uninteresting or monotonous sections of the trip. It would also be a transitional bike; once they grow old enough, they'll get their own bikes, but that's a few years from now.

  • Could we collect GPS data to improve the data these routers use for ascent/descent/slope calculations?

  • What router parameters make sense for ebikes5? I went against what the original developer decided purely on intuition, and they seem way more informed than me.

  • What about buying an extra battery to extend the range and also as backup?

  • Would businesses like restaurants, bars and bakeries allow or even offer recharging, free or for a fee? I thought of asking to recharge the battery at the bakery, but it was one of those chain things and it could have put the single employee in trouble. Recharging a smartphone is one thing; a 700Wh ebike battery is another.

I'll probably only check the first one, since I do want to buy a bicycle.


  1. So dictated that I couldn't even finish this sentence while editing :) 

  2. This trip is not the one I parametrized but the same one I got just using a better profile. 

  3. My mistake here was to use assistance on the way down, so the motor was helping me reaccelerate after each hairpin, all 15 of them. 

  4. My left knee complained for a couple of days, but that was all. 

  5. And EVs in general. 

  6. My notation here is: distance in km, plain ascent in m, extra ascent in m. The extra ascent is meters that are both climbed and descended along the way. Routers usually speak in terms of plain ascent (the difference in elevation between start and destination) and (total) ascent, but I prefer to highlight the extra elevation.

Monitoring the maximum latency for a scrape period with prometheus

Ever since I watched (several times) Gil Tene's talk about how we are measuring latency wrong, I have wanted to get such values in a graph. His thesis is that we use histograms and other similar tools to measure percentiles of our response times, and that this is wrong because we never see the maximum time; our tools hide it behind averages. What can we do about it with our current tools, in particular Prometheus and Grafana? Let's first define exactly what we want:

On one side we have a stream of log lines. These lines will be parsed and some values extracted from them. On the other side we have Prometheus scraping a representative of those values. The representative we want for every scrape is the max of all the values collected in the period between scrapes. For the next scraping period we must forget this max and calculate a new one, and so on for every scrape interval.

Prometheus offers three different metric types that could get us such a value. The first and obvious one would be a gauge, but there is no existing tool that does what I described in the previous paragraph.

The second one would be a histogram. For a histogram we define a set of values that become boundaries. For each interval defined by those boundaries, the histogram gives us the amount of values that fall into it, plus those below the lowest boundary and above the highest one. This is not enough because we can never get even a ballpark of what the maximum value is, much less its actual value. The max is lost among all the values that fall into the interval above the last boundary we defined; we only know which interval contains it. To at least get a ballpark you need to figure out an upper boundary, otherwise the max could be anywhere between the highest boundary and infinity. Maybe you're lucky and you have something like a timeout, but it's not often that a timeout fires during active data transmission. On the other hand, if you're measuring transfer times for a service that returns files, you don't have an upper bound on how long it can take, because it depends on the whole network path and the size of the file.

The last one is summaries. Summaries work slightly differently: instead of partitioning in terms of values, they partition in terms of percentiles. One catch with summaries is that you cannot aggregate them. Not only does Prometheus' documentation mention this, it also cites one of Gil Tene's articles about it1. But one of the peculiarities of percentiles is that the 0th percentile is the minimum and the 100th percentile is the maximum. And as Gil Tene mentions in that same article, those are the only two values you can aggregate, because if you have all the minimums, the global minimum is just the minimum of those values, and similarly for the maximum.

Now, how do we actually implement it? As I mentioned before, we have two tools for converting log lines into metrics for Prometheus: one is mtail, the other is the Grok exporter. mtail doesn't seem to have any support for summaries, so we're left with the Grok exporter.

I tried to set up a Grok exporter configuration to do exactly that, just the minimum and the maximum of a given value, which in my case is the time to first byte (TTFB), because I'm trying to apply it to a service that serves files. The total latency of such a service depends on the size of the file, but the TTFB should not, so that's what I'm interested in.

I want the minimum and the maximum, so I set up a metric for that: I define it as a summary and configure it for the 0th and 100th percentiles. And this is where the problem starts getting slightly more complicated.

First, the Grok exporter uses a library for computing the percentiles. That library does not accept just an array of the percentiles you are interested in; quoting the docs:

grok_exporter does not provide exact values for the quantiles, but only estimations. For each quantile, you also specify an uncertainty that is tolerated for the estimation

I don't know how much that impacts the two values I'm looking for, minimum and maximum, but I started with a 1% tolerance just to try:

- type: summary
  name: elevation_tile_ttfb_extremes
  help: Minimum and Maximum for TTFB
  match: '%{ELEVATION_LOG}'
  value: '{{multiply .first_byte 0.001}}'  # µs to ms
  quantiles: { 0: 0.01, 100: 0.01 }  # min, max
  max_age: 15s

And it didn't work.

First, it started complaining that, when calculating some values, it was trying to convert the string it got from the log line to a number, but some of those strings were empty, so it failed:

grok-exporter[1402204]: WARNING: skipping log line: error processing metric elevation_tile_ttfb:
    unexpected error while evaluating template for label __value__: template: __value__:1:2:
    executing "__value__" at <multiply .first_byte 0.001>: error calling multiply: error executing multiply function:
    cannot convert  to floating point number: strconv.ParseFloat: parsing "": invalid syntax

(I had to split it into several lines so you don't have to scroll horizontally [so much]. I'm not going to comment on whether Golang's tendency to wrap errors to the point shown above, where the chain of error messages becomes a kind of horizontal stack trace, is a good thing or not.)

Once I fixed that, I instead got a traceback pointing to an index error inside that library:

2024/03/10 13:42:33 http: panic serving 127.0.0.1:49912: runtime error: index out of range [99] with length 1
goroutine 39 [running]:
net/http.(*conn).serve.func1(0xc000420000)
        /usr/local/go/src/net/http/server.go:1801 +0x147
panic(0xb7d0e0, 0xc000038060)
        /usr/local/go/src/runtime/panic.go:975 +0x3e9
github.com/fstab/grok_exporter/vendor/github.com/beorn7/perks/quantile.(*Stream).Query(0xc00045cd20, 0x4059000000000000, 0x4051e3d70a3d70a4)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/beorn7/perks/quantile/stream.go:165 +0x16f
github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus.(*summary).Write(0xc000144c00, 0xc000175b20, 0x3, 0xc00023f412)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/summary.go:310 +0x3c7
github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus.processMetric(0xca4840, 0xc000144c00, 0xc00023f5c0, 0xc00023f5f0, 0x0, 0x0, 0x0)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/registry.go:596 +0xa2
github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather(0xc000100410, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/registry.go:490 +0x8ca
github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/promhttp.HandlerFor.func1(0x7fe9cd2298f8, 0xc00050e280, 0xc00022e300)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go:126 +0x99
net/http.HandlerFunc.ServeHTTP(0xc00017a230, 0x7fe9cd2298f8, 0xc00050e280, 0xc00022e300)
        /usr/local/go/src/net/http/server.go:2042 +0x44
github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1(0x7fe9cd2298f8, 0xc00050e280, 0xc00022e300)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:40 +0xab
net/http.HandlerFunc.ServeHTTP(0xc00050d170, 0x7fe9cd2298f8, 0xc00050e280, 0xc00022e300)
        /usr/local/go/src/net/http/server.go:2042 +0x44
github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0xca9840, 0xc00024a000, 0xc00022e300)
        /go/src/github.com/fstab/grok_exporter/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:100 +0xda
net/http.HandlerFunc.ServeHTTP(0xc00050d260, 0xca9840, 0xc00024a000, 0xc00022e300)
        /usr/local/go/src/net/http/server.go:2042 +0x44
net/http.(*ServeMux).ServeHTTP(0x131bd20, 0xca9840, 0xc00024a000, 0xc00022e300)
        /usr/local/go/src/net/http/server.go:2417 +0x1ad
net/http.serverHandler.ServeHTTP(0xc000196000, 0xca9840, 0xc00024a000, 0xc00022e300)
        /usr/local/go/src/net/http/server.go:2843 +0xa3
net/http.(*conn).serve(0xc000420000, 0xcac140, 0xc000244000)
        /usr/local/go/src/net/http/server.go:1925 +0x8ad
created by net/http.(*Server).Serve
        /usr/local/go/src/net/http/server.go:2969 +0x36c

In my last post about this, I mentioned that the Grok exporter has been unmaintained for four years: nobody has written a single line of code since, and the many open issues and even the several open pull requests have seen nothing happen in those four years. And this library, the one that calculates the percentiles, is even worse: it's been five years since it was last updated, which means there's no chance that, even if I report the bug, I'll get releases of both that library and then the Grok exporter to fix it. And I don't feel like reading a whole statistics paper at this point of my life just to solve this problem2.

So what options do we have? Well, I think there's only one, and it's writing your own tool that fulfills the description in the second paragraph. I'm not really sure I have the energy to do it, but I think this will keep bouncing around my head until I succumb and do it, mostly because I think it's important.
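
Just to make the idea concrete, the core of such a tool could look something like this. It's only a sketch using prometheus_client; the log tailing, the labels and the metric name are all made up:

#! /usr/bin/env python3
# sketch: expose the max value observed between scrapes as a gauge,
# resetting it after every scrape. Feed observe() from your log tailer.

import math
import threading

from prometheus_client import start_http_server
from prometheus_client.core import GaugeMetricFamily, REGISTRY

class MaxSinceLastScrape:
    def __init__(self):
        self.lock = threading.Lock()
        self.current_max = -math.inf

    def observe(self, value):
        with self.lock:
            if value > self.current_max:
                self.current_max = value

    def collect(self):  # called by the client library on every scrape
        with self.lock:
            value, self.current_max = self.current_max, -math.inf
        gauge = GaugeMetricFamily('ttfb_max_since_last_scrape_ms',
                                  'Max TTFB seen since the previous scrape')
        if value != -math.inf:
            gauge.add_metric([], value)
        yield gauge

collector = MaxSinceLastScrape()
REGISTRY.register(collector)
start_http_server(9999)
# ... tail the log here and call collector.observe(ttfb_ms) for every line ...

The obvious wrinkle is that if two Prometheus servers scrape the same endpoint, each scrape steals the other's max, so this only works with a single scraper.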

But until that moment occurs, I have no solution for this problem, and that means that in the meantime I will have to live with histograms.



  1. It's actually interesting that Prometheus people are aware of Gil Tene's ideas and yet don't have a solution for them. 

  2. I remember being a fervent free software advocate for reasons exactly like this. But then, I don't have the energy, nor do I think I can convince my manager to pay someone to fix it.

Sending AWS CloudWatch alarms through SNS to MSTeams

I'm new to AWS, so please take the following statements with a grain of salt. Also, I'm tired, but I want to get this off my chest before the weekend begins (although, technically, it has already begun), so it might not be very coherent.

AWS provides some minimum monitoring of your resources with a tool called CloudWatch. Think of prometheus + grafana, but more limited. Still, it's good enough that it makes sense to set up some Alarms on it. Many of AWS's resources are not processes running on a computer you have access to, so you can't always install some exporters and do the monitoring yourself.

If you're like me, you want CloudWatch Alarms sent to the outside world so you can receive them and react. One way to do this1 is to channel them through SNS. SNS supports many protocols, most of them internal to AWS, but also HTTP/S. SNS is a pub-sub system, and it requires a little bit of protocol before it works.

On the other end we2 have MSTeams3. MSTeams has many ways of communicating. One is Chat, which is a crappy chat67, and another is some kind of mix between a blog and twitter, confusingly called Teams. The idea in a Team is that you can post... posts? articles? and from them you can have an unthreaded conversation. Only Teams have webhooks; Chats do not, so you can't point SNS there.

If you have read other articles about integrating CloudWatch Alarms or SNS with MSTeams, they all tell you that you need not only SNS but also a Lambda function. Since we already handle a gazillion servers, not all of them in AWS (one in particular is dedicated HW we pay quite cheaply for), and since we're also trying to slim down our AWS bill (who isn't?), I decided to see if I could build my own bridge between SNS and Teams.

I already said that SNS has a little protocol. The idea is that when you create an HTTP/S Subscription in SNS, it POSTs a first message to the URL you define. This message has a JSON payload. We're interested in two fields:

{
    "Type": "SubscriptionConfirmation",
    "SubscribeURL": "..."
}

What you have to do is take this URL and call it. That way SNS knows the endpoint exists and associates an ARN to the Subscription. Otherwise, the Subscription stays unconfirmed and no messages are sent to it. Interestingly, you can neither edit nor remove Subscriptions (at least not from the web interface), and I read that unconfirmed Subscriptions disappear after 3 days or so4.

SNS messages are also a JSON payload POST'ed to the URL. They look like this:

{
  "Type" : "Notification",
  "MessageId" : "<uuid1>",
  "TopicArn" : "<arn>",
  "Subject" : "...",
  "Message" : "...",
  "Timestamp" : "2024-01-19T14:29:54.147Z",
  "SignatureVersion" : "1",
  "Signature" : "cTQUWntlQW5evk/bZ5lkhSdWj2+4oa/4eApdgkcdebegX3Dvwpq786Zi6lZbxGsjof2C+XMt4rV9xM1DBlsVq6tsBQvkfzGBzOvwerZZ7j4Sfy/GTJvtS4L2x/OVUCLleY3ULSCRYX2H1TTTanK44tOU5f8W+8AUz1DKRT+qL+T2fWqmUrPYSK452j/rPZcZaVwZnNaYkroPmJmI4gxjr/37Q6gA8sK+WyC0U91/MDKHpuAmCAXrhgrJIpEX/1t2mNlnlbJpcsR9h05tHJNkQEkPwFY0HFTnyGvTM2DP6Ep7C2z83/OHeVJ6pa7Sn3txVWR5AQC1PF8UbT7zdGJL9Q==",
  "SigningCertURL" : "https://sns.eu-west-1.amazonaws.com/SimpleNotificationService-01d088a6f77103d0fe307c0069e40ed6.pem",
  "UnsubscribeURL" : "https://sns.eu-west-1.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=<arn>:<uuid2>"
}

Now, CloudWatch Alarms sent via SNS arrive in the Message field. And since Message's value is a string and the Alarm is encoded as JSON, yes, you guessed it, it's double encoded:

{
  "Message" : "{\"AlarmName\":\"foo\",...}"
}

Sigh. After unwrapping it, it looks like this:

{
  "AlarmName": "...",
  "AlarmDescription": "...",
  "AWSAccountId": "...",
  "AlarmConfigurationUpdatedTimestamp": "2024-01-18T14:32:17.244+0000",
  "NewStateValue": "ALARM",
  "NewStateReason": "Threshold Crossed: 1 out of the last 1 datapoints [10.337853107344637 (18/01/24 14:28:00)] was greater than the threshold (10.0) (minimum 1 datapoint for OK -> ALARM transition).",
  "StateChangeTime": "2024-01-18T14:34:54.103+0000",
  "Region": "EU (Ireland)",
  "AlarmArn": "<alarm_arn>",
  "OldStateValue": "INSUFFICIENT_DATA",
  "OKActions": [],
  "AlarmActions": [
    "<sns_arn>"
  ],
  "InsufficientDataActions": [],
  "Trigger": {
    "MetricName": "CPUUtilization",
    "Namespace": "AWS/EC2",
    "StatisticType": "Statistic",
    "Statistic": "AVERAGE",
    "Unit": null,
    "Dimensions": [
      {
        "value": "<aws_id>",
        "name": "InstanceId"
      }
    ],
    "Period": 60,
    "EvaluationPeriods": 1,
    "DatapointsToAlarm": 1,
    "ComparisonOperator": "GreaterThanThreshold",
    "Threshold": 10.0,
    "TreatMissingData": "missing",
    "EvaluateLowSampleCountPercentile": ""
  }
}

The name and description are arbitrary texts you wrote when setting up the Alarm and the Subscription. Notice that the region is not the codename, as in eu-west-1, but a supposedly more human-readable text. The rest is mostly info about the Alarm itself. Also notice the Dimensions field. I don't know what other data comes there (probably the arbitrary fields and values you can set up in the Alarm); all I can say is that that format (a list of dicts with only two fields, one called name and the other value) is possibly the most annoying implementation of a simple dict. I hope they have a reason for it, besides overengineering.
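
If it annoys you too, flattening it back into a plain dict is a one-liner (alarm being the decoded Message from above):

# turn [{"name": "InstanceId", "value": "<aws_id>"}, ...] into a plain dict
dimensions = {d['name']: d['value'] for d in alarm['Trigger']['Dimensions']}
instance_id = dimensions.get('InstanceId')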

Notice also that the only info we get here about the source of the alarm is the InstanceId. As those are random strings, to me they don't mean anything. Maybe I can set up the Alarm so it also includes the instance's name5, and maybe even the URL pointing to the metric's graph.

Finally, Teams' webhooks also expect a JSON payload. I didn't delve much into what you can give them; I just used the title, text and themeColor fields. At least text can be written in MarkDown. You get such a webhook by going to the Team, clicking the "vertical ellipsis" icon, then "Connectors", adding a webhook and getting the URL from there. The @type and @context values I copied from an SNS-to-Lambda-to-Teams post.

So, to bridge CloudWatch Alarms through SNS to an MSTeams Team, we just need a quite straightforward script. I decided to write it with Flask, but I'm pretty sure writing it with plain http.server and urllib.request to avoid the dependencies wouldn't be much more work; I just didn't want to do it. Maybe I should have tried FastAPI instead; I simply forgot about it.

Without further ado, here's the script. I'm running Python 3.8, so I don't have case/match yet.

#! /usr/bin/env python3

from flask import Flask, request
import json
import requests

app = Flask(__name__)

@app.route('/', methods=[ 'POST' ])
def root():
    print(f"{request.data=}")

    request_data = json.loads(request.data)

    # python3.8, not case/match yet
    message_type = request_data['Type']

    if message_type == 'SubscriptionConfirmation':
        response = requests.get(request_data['SubscribeURL'])
        print(response.text)

        return f"hello {request_data['TopicArn']}!"

    message = {
        '@type': 'MessageCard',
        '@context': 'http://schema.org/extensions',
        'themeColor': '4200c5',
    }

    if message_type == 'Notification':
        try:
            alarm = json.loads(request_data['Message'])
        except json.JSONDecodeError:
            message['title'] = request_data['Subject']
            message['text']  = request_data['Message']
        else:
            instance_id = alarm['Trigger']['Dimensions'][0]['value']
            state = alarm['NewStateValue']

            if state == 'ALARM':
                color = 'FF0000'
            else:
                color = '00FF00'

            message['title'] = f"{instance_id}: {alarm['Trigger']['MetricName']} {state}"
            message['text']  = f"""{alarm['AlarmName']}

{alarm['Trigger']['MetricName']} {alarm['Trigger']['ComparisonOperator']} {alarm['Trigger']['Threshold']}
for {int(alarm['Trigger']['Period']) // 60} minutes.

{alarm['AlarmDescription']}

{alarm['NewStateReason']}

for {instance_id} passed to {state} at {alarm['StateChangeTime']}."""
            message['themeColor'] = color

        response = requests.post('https://<company>.webhook.office.com/webhookb2/<uuid1>@<uuid2>/IncomingWebhook/<id>/<uuid3>', json=message)
        print(response.text)

        return f"OK"

  1. Again, I'm new to AWS. This is how it's set up at $NEW_JOB, but there might be better ways. If there are, I'm happy to hear about them.

  2. 'we' as in me and my colleagues. 

  3. Don't get me started... 

  4. I know all this because right now I have like 5-8 unconfirmed Subscriptions because I had to figure all this out, mostly because I couldn't find sample data or, preferably, a tool that already does this. They're 5-8 because you can't create a second Subscription to the same URL, so I changed the port for every failed attempt to confirm the Subscription. 

  5. We don't have pets, but don't quite have cattle either. We have cows we name, and we get a little bit sad when we sell them, but we're happy when they invite us to the barbecue. 

  6. OK, I already started... 

  7. I added this footnote (I mean, the previous one... but this one too) while reviewing the post before publishing. Putting the correct number means editing the whole post, changing each number twice, which is error prone. In theory nikola and/or MarkDown support auto-numbered footnotes, but I never managed to make them work. I used to have the same issue with the previous static blog/site compiler, ikiwiki, so this is not the first time I have out-of-order footnotes. In any case, I feel like they're a quirk that I find cute and somehow defining.

Debugging docker containers with the host's tools

Update:

cntr seems to be less hacky and much better maintained, so go use that instead.

At $NEW_JOB we rely heavily on Docker containers. The two biggest reasons I don't like them are that I think they're a nightmare to keep up to date security-wise without also pulling in new versions with potential changes in behavior; and that they're usually built with as few tools inside the image as possible. I understand the reasons are probably twofold: smaller image size, and maybe intentionally reducing the attack surface and/or the tools available to any attacker that breaks into a shell through your service. I would rather have better defenses than have no tools to help me debug a break-in.

For a while I've been thinking that it should be possible to mount the host's filesystem inside the container and use the host's tools to debug. I was attacking the problem from kind of the wrong angle, reading about how filesystem namespaces are implemented and how containers use them, until I finally changed my search query and found "Mount volumes into a running container" by Kynan Rilee.

The idea is really simple: find the host device that holds the filesystem for the root partition (what? you have a separate /usr?), create that device node in the container, and then use nsenter without the --user namespace to mount it on /opt/host inside the container (otherwise you get a 'permission denied' error).

But that's still not enough. We have a few envvars to set before we can use the tools in /opt/host. The first one is obviously PATH:

export PATH="$PATH":/opt/host/bin:/opt/host/sbin:/opt/host/usr/bin:/opt/host/usr/sbin

Still not enough; you also need to be able to load libraries from the new tree:

root@3e282deec242:/# mtr
mtr: error while loading shared libraries: libgtk-3.so.0: cannot open shared object file: No such file or directory

Here we have a dichotomy: we have to prioritize one of the two trees, either the container's or the host's. I think it's best to use the container's, but YMMV:

export LD_LIBRARY_PATH=/lib:/usr/lib:/opt/host/lib:/opt/host/usr/lib:/opt/host/usr/lib/x86_64-linux-gnu

Perl tools will also complain:

root@3e282deec242:/# ack
Can't locate File/Next.pm in @INC (you may need to install the File::Next module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0 /usr/lib/x86_64-linux-gnu/perl5/5.36 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.36 /usr/share/perl/5.36 /usr/local/lib/site_perl) at /opt/host/bin/ack line 11.
BEGIN failed--compilation aborted at /opt/host/bin/ack line 11.

So you need another one:

export PERL5LIB=/opt/host/etc/perl:/opt/host/usr/local/lib/x86_64-linux-gnu/perl/5.36.0:/opt/host/usr/local/share/perl/5.36.0:/opt/host/usr/lib/x86_64-linux-gnu/perl5/5.36:/opt/host/usr/share/perl5:/opt/host/usr/lib/x86_64-linux-gnu/perl-base:/opt/host/usr/lib/x86_64-linux-gnu/perl/5.36:/opt/host/usr/share/perl/5.36:/opt/host/usr/local/lib/site_perl

Incredibly python3 works OOTB.

I think that's all. I'll update this post if I find more envvars to set.

Here's a scripted version, except for all the exports up there. This omission has two or three reasons:

  • bash does not have a way to accept commands to run before showing the prompt. More below.
  • Some of those values are hard to guess; you will have to adapt them to your particular host's system.
  • I guess that's all :)

You can put them in your container's .bashrc and it will be read when bash starts.

Finally, the promised script:

#! /bin/bash

set -euo pipefail

container=$1

root_device=$(findmnt --noheadings --mountpoint / | awk '{ print $2 }')
container_pid=$(docker inspect --format {{.State.Pid}} "$container")

# create device and mount point
# the lack of double quotes around this -v----------------------------------v is intentional
docker exec "$container" mknod --mode 0660 "$root_device" b $(stat --format '%Hr %Lr' "$root_device")
docker exec "$container" mkdir -p /opt/host

# mount with host's root perms; that's why --user is not there
nsenter --target "$container_pid" --mount --uts --ipc --net --pid -- mount "$root_device" /opt/host

echo "go debug; don't forget to set envvars!"
docker exec "$container" /bin/bash

# cleanup
nsenter --target "$container_pid" --mount --uts --ipc --net --pid -- umount /opt/host
docker exec "$container" rm "$root_device"

You will probably need to run this as root, even if you can run docker without sudo, if only because of nsenter.

Maybe I should also use nsenter for the debug session; that way I would be full root there too. I'll update this post if I ever run into a situation where I need that.

ikiwiki to nikola: the script

People asked for it:

#! /usr/bin/python3

import argparse
from datetime import datetime
from glob import glob
from os import stat
from os.path import basename, splitext
import re
import sys
import time


footnote_re = re.compile(r'\[(?P<foot_number>\d+)\]')
taglink_re = re.compile(r'\[\[!taglink (?P<tag_name>[^\]]*)\]\]')
image_re = re.compile(r'\[\[!img (?P<path>.*)\]\]')
format_start_re = re.compile(r'^\[\[!format (?P<language>.*) """$')
format_end_re = re.compile(r'^"""\]\]$')


def rewrite_footnotes_line(line, text_block, footnote_block, taglink_block, foot_number):
    new_line = line
    changed = False

    while footnote := footnote_re.search(new_line):
        # remove the []s
        start = footnote.start('foot_number') - 1
        end = footnote.end('foot_number') + 1

        prefix = new_line[:start]
        postfix = new_line[end:]
        foot_number = footnote.group('foot_number')

        if text_block:
            new_line = f"{prefix}[^{foot_number}]{postfix}"
        elif footnote_block:
            new_line = f"{prefix}[^{foot_number}]:{postfix}"
        else:
            raise ValueError('found a footnote in the taglink_block!')

        changed = True
    else:
        if not changed and footnote_block and len(line) > 0:
            # '[^]: ' <-- 5 extra chars
            new_line = f"{' ' * (len(foot_number) + 5)}{line.strip()}"

    return new_line, foot_number


def rewrite_footnotes(src):
    lines = src.splitlines()
    hr_count = len([ line for line in lines if line.startswith('---') ])

    new_lines = []
    text_block = True
    footnote_block = False
    taglink_block = False
    hr_seen = 0
    foot_number = ''

    for line in lines:
        line_length = len(line)

        if line_length > 4 and line[:4] == '    ':
            # it's an inline code block, leave alone
            new_lines.append(line)
            continue

        if line.startswith('---'):
            hr_seen += 1

            # if there is only one hr, then we have text + taglink blocks
            # if there are two or more, it's text + footnote + taglink blocks
            if text_block and hr_count >= 2 and hr_seen == hr_count - 1:
                text_block = False
                footnote_block = True
                # don't keep it
                continue
            elif hr_seen == hr_count:
                text_block = False
                footnote_block = False
                taglink_block = True
                # we'll need it later
                new_lines.append(line)
                continue

        try:
            new_line, foot_number = rewrite_footnotes_line(line, text_block, footnote_block, taglink_block, foot_number)
        except Exception as e:
            print(f"got `{e}´ for `{line}´.")
            raise

        new_lines.append(new_line)

    return '\n'.join(new_lines) + '\n'


def rewrite_taglinks(src):
    new_lines = []
    new_tags = []

    for line in src.splitlines():
        if len(line) > 0 and line == '-' * len(line):
            # don't keep it
            continue

        tags = taglink_re.findall(line)
        if len(tags) > 0:
            new_tags.extend(tags)
        else:
            new_lines.append(line)

    return '\n'.join(new_lines) + '\n', new_tags


def rewrite_images(src):
    new_lines = []

    for line in src.splitlines():
        image = image_re.search(line)
        if image is not None:
            # get the text before and after the whole directive
            start = image.start(0)
            end = image.end(0)

            prefix = line[:start]
            postfix = line[end:]
            path = image.group('path')

            # the root to which this 'absolute' path points is the website's root
            new_line = f"{prefix}![](/{path}){postfix}"

            new_lines.append(new_line)
        else:
            new_lines.append(line)

    return '\n'.join(new_lines) + '\n'


lang_map = dict(
    py='python',
    sh='bash',
)


def rewrite_format(src):
    new_lines = []

    for line in src.splitlines():
        start = format_start_re.match(line)
        if start is not None:
            lang = start.group('language')
            # if there's no mapping return the same lang
            new_line = f"```{lang_map.get(lang, lang)}"
            new_lines.append(new_line)
            continue

        if format_end_re.match(line):
            new_lines.append('```')
            continue

        new_lines.append(line)

    return '\n'.join(new_lines) + '\n'


def titlify(src):
    words = src.split('-')
    words[0] = words[0].title()

    return ' '.join(words)


def test_offsetify():
    src = -3600
    dst = '+0100'

    assert offsetify(src) == dst


def offsetify(src):
    hours, seconds = divmod(src, 3600)
    # "offsets are always in minutes" sounds like one item in 'things dveloper believe about timezones'
    minutes, _ = divmod(seconds, 60)

    # NOTE: time.timezone returns seconds west of UTC, which is opposite of what usual offsets go
    if src > 0:
        sign = '-'
    else:
        sign = '+'

    return f"{sign}{-hours:02d}{minutes:02d}"


def datify(src):
    '''1701288755.377908 -> 2023-11-29 21:12:35 +0100'''

    # BUG: I'm gonna assume current timezone.
    # thanks SirDonNick#python@libera.chat
    # dto=DT(2023,11,29, 12,13,59, tzinfo=UTC_TZ);  DT.astimezone( dto , getTZ('Europe/Brussels') ) #==> 2023-11-29 13:13:59+01:00
    offset = time.timezone
    dt = datetime.fromtimestamp(src)

    return f"{dt.strftime('%Y-%m-%d %H:%M:%S')} {offsetify(offset)}"


# zoneinfo for some reason doesn't know about CEST, so I'll just hack a mapping here
tzname_to_utc_offset = dict(
    CEST='+0200',
     CET='+0100',
)

month_name_to_number = dict(
    jan= 1,
    ene= 1,
    feb= 2,
    mar= 3,
    apr= 4,
    abr= 4,
    may= 5,
    jun= 6,
    jul= 7,
    aug= 8,
    ago= 8,
    sep= 9,
    oct=10,
    nov=11,
    dec=12,
    dic=12,
)


def dedatify(src):
    #       0         1      2     3      4       5           6     7
    # src=['Posted', 'Sun', '26', 'Aug', '2012', '11:27:16', 'PM', 'CEST']
    month = month_name_to_number[src[3].lower()]
    utc_offset = tzname_to_utc_offset[src[7]]
    h, m, s = [ int(x) for x in src[5].split(':') ]

    if src[6].upper() == 'PM':
        h += 12  # TODO: support 12PM

    return f"{src[4]}-{month:02d}-{int(src[2]):02d} {h:02d}:{m:02d}:{s:02d} {utc_offset}"


def build_meta(filepath, tags, date=None):
    filename = splitext(basename(filepath))[0]
    if date is None:
        mtime = stat(filepath).st_mtime
        date_string = datify(mtime)
    else:
        date_string = dedatify(date)

    meta = f""".. title: {titlify(filename)}
.. slug: {filename}
.. date: {date_string}
.. tags: {', '.join(tags)}
.. type: text
"""

    return filename, meta


def import_post(opts):
    src = open(opts.filepath).read()
    mid, tags = rewrite_taglinks(rewrite_footnotes(src))
    dst = rewrite_format(rewrite_images(mid))
    if opts.date is None:
        filename, meta = build_meta(opts.filepath, tags)
    else:
        filename, meta = build_meta(opts.filepath, tags, date=opts.date)

    open(f"posts/{filename}.md", 'w+').write(dst)
    open(f"posts/{filename}.meta", 'w+').write(meta)


def parse_args():
    parser = argparse.ArgumentParser()

    parser.add_argument('filepath', metavar='FILE')
    parser.add_argument('-d', '--date', nargs=8,
                        help='Just pass something like "Posted Wed 12 Sep 2012 08:19:23 PM CEST".')

    return parser.parse_args()


if __name__ == '__main__':
    opts = parse_args()

    import_post(opts)

I removed all the tests, but they all looked like this:

def test_dedatify():
    src = 'Posted Wed 12 Sep 2012 08:19:23 PM CEST'.split()
    dst = '2012-09-12 20:19:23 +0200'
    assert dedatify(src) == dst

Enjoy.

Logging shell scripts' executions: a better way

Some 11 years ago I wrote:

wanna debug it by at least tracing its execution? Well,
simply surround it by:

    (
    set -x
    ...
    ) > /tmp/$0-$(date +%F.%T).log

Well, a couple of years back at my $OLD_JOB we needed to do exactly that, with the added option of turning it on and off at will. For the latter, we decided to use an envvar. Here's how we did it; it's much more robust:

if [ -n "$DEBUG" ]; then
    export PS4='> $(date +"%Y-%m-%dT%H:%M:%S") > $(basename "${BASH_SOURCE}") > ${FUNCNAME[0]:=__main__}():${LINENO} > '
    exec 5>> /var/log/foo.log
    BASH_XTRACEFD=5
    set -x
fi

There's a lot there, so let's try to unpack it.

  • The last line, set -x, makes bash print every line executed
  • PS4 defines the "prompt" for such lines. We're doing a lot here, so:
  • We start by printing a date. Maybe the Y-m-d is too much, but it looks familiar from other log files. This is a command substitution that is executed every time a line is printed, the same way you can do fun stuff with your shell prompt.
  • We print the filename of the source file.
  • Then the function at the top of the stack, or __main__ in the case of the main body of the script. Guess who wrote this and what their favorite language is :)
  • Then the line number
  • Then bash prints the executed line
  • bash has grown organically, which is a nice way to say 'hacky'. This incantation opens a file for appending on file descriptor 5. Remember 0 is stdin, 1 is stdout and 2 is stderr; we're playing it safe with 5.
  • We tell bash, via BASH_XTRACEFD, to use fd 5 for printing all this, sending it to the file. By default stderr is used, which you probably want to keep for your own script.
  • All this only if DEBUG has any contents. We usually do export DEBUG=1.

Enjoy.

Update

Forgot to mention: this will slow down your execution time noticeably, especially if it's a 65k LoC beast. Ask me how I know, but only after buying me a beer :)

Migrating from ikiwiki to nikola

As I mentioned several times already, my ikiwiki setup for this glob is falling apart on my machine. Since it's written in perl, a language I haven't touched in many years, and its community seems to have dwindled and almost disappeared, I've been thinking of migrating to something else. As a pythonista, one obvious option is nikola. Also because I know the original developer :)

But what would it take to do this? Well, my ikiwiki posts are written in Markdown, and nikola also reads that format. At the beginning I thought of converting to reStructuredText because I have an issue: due to a bad command (probably a cp instead of rsync or tar), I lost the original file times. With reStructuredText I can provide the date as a directive, and I can recover the original dates from archive.org's snapshots of my glob. But then I read that the same data can be put in a sidecar .meta file, so I can keep the original file format. Also, many things I wanted work best with Markdown, most notably footnotes, which, I don't know if you noticed, never worked on this glob :) Thanks +ChrisWarrick#nikola@libera.chat for all the help!

Still, ikiwiki handles a few things in a not very Markdown'y way, including images, code snippets and tags. To be honest, the last two are not really part of Markdown, but it still means I have to convert one markup into another.

I had used pytest in the past, but not much, really. I usually write a test() function where I test everything with assert, and once all tests pass, I call main() at script start instead. This was another quick hack, but I wanted to give pytest a spin. I started with some pure TDD, writing inputs and outputs in test functions and just asserting f(input) == output, and pytest did everything else for me, including showing me a diff that pointed out the small errors I was making. The iteration pace was feverish.
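
A made-up example in that spirit, for the function that rewrites ikiwiki's image directives into Markdown (not one of the actual tests):

def test_rewrite_images():
    src = 'see [[!img posts/foo.png]] here'
    dst = 'see ![](/posts/foo.png) here\n'

    assert rewrite_images(src) == dst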

All in all, it took me three 21-23h hackathons to mostly finish it. I wrote one function for each step (footnotes, tags, images and code snippets), each of them looping over all the input lines again, but it doesn't really matter, as I have to import many files by hand anyway to specify the original publishing date. I also tested each regexp1 individually, like I was discussing the other day2. They were short enough not to need my first tip, but by $GOD I used the other two a lot. There are another four helper functions (convert slugs to titles; convert time.timezone values to UTC offsets (for instance, +0100); convert timestamps to a certain date format; and convert another date format to the same one), all also well tested. Then one short function to write the sidecar file, one that glues everything together, and one for parsing command line parameters. All that, tests and their data included, in 538 lines of very hacky Python :) I'll try to post the code some other day, but frankly I ran out of steam and I still have lots of posts to import by hand.

And that's it! Hopefully this will be the first post on the new version of the glob. I've already imported a few old posts and it's working just fine. I expect a few tweaks in the future, as we're talking about ~300 posts and I can't promise the very old ones follow the same format. I set the feed size to one and I'll grow it over the next nine posts so I don't break planets and feed readers. I hope I got that right :)


  1. Yes, of course this includes regexps! But thanks to testing them, it didn't become the proverbial extra problem. 

  2. Are you also on mastodon? Come say hi! 

Monitoring apache with prometheus and grafana

NE: this one is not going to be that coherent either, even though I'm actually writing it this time. Again, sorry for the mess.

Since nobody seems to have ever written a blog post about this beyond the basic 'explain + install + graph one metric in grafana', here's my take.

Analyzing Apache activity with tools like prometheus and grafana is not an easy task, mostly because the kinds of things Apache can do are quite varied: it could be serving static files of any size, receiving uploads, answering API calls, thin REST ones or big fat heavy ones, or even offloading SSL or just proxying other services. So the first thing you have to ask yourself is: what does my Apache do?

In my case it's two things:

  • Serving static files, mostly map tiles and photos. These vary in size a lot: tiles are usually small, but I also have big 12MiB photos and at least 2 scaled-down versions of each, the thumbnail and the gallery's main image, both generated from the originals and cached.

  • Serving a few services: webmail, RSS reader and 'cloud' suite (file sharing, calendars, etc).

Second, you have to define what you actually want to know about it. There are many options: What's my error rate? How long does it take to serve requests? What about queries whose payload can vary wildly in size? What about uploads?

You can also ask yourself how you want those values sliced: Do I want to know them in general, per individual URL, by certain prefixes...?

These questions should lead you to define certain things, including changes you might need to make to Apache. For instance, Apache's 'combined' log format includes only the status and the written size (including headers), but neither the read size, the response time nor the time to first byte. TTFB is an important value to know for requests that either upload or download lots of data; for instance, in my case, the gallery's images. TTFB tells you how much time you spend before starting to answer. Even then, this value can be misleading, since there's nothing preventing a framework or application from sending the headers as soon as their values are calculated, while still having a lot of computation to do before actually sending the payload of the response.

# enable time to first byte tracking (LogIOTrackTTFB and %^FB come with mod_logio)
LogIOTrackTTFB ON

# %I: bytes read (request plus headers), %O: bytes written (response plus headers),
# %^FB: time to first byte, %D: total time taken, both in microseconds
LogFormat "%h %l %u %t \"%r\" %>s %I %O %^FB %D \"%{Referer}i\" \"%{User-Agent}i\"" combined_io

So, where do we start with this? A good starting point is Brian Brazil's oldish post, where he mentions the Grok Exporter, but there is no Debian package for it, which means I need a binary release. Unfortunately, the GE seems to have been abandoned by its original developer. The people at SysDigLabs maintain a fork, but they seem to have mostly focused on maintenance around CI and dependencies, with no major code changes. None of the 11 pull requests or the 70 issues in the original repo seem to be addressed. I don't blame them, I'm just pointing it out. Besides, they're not providing binaries, unless you count a docker image as such.

So I decided to pick another fork. Of them all, a couple do provide binary releases, so I just picked one.

Meanwhile, I also thought that since I was already modifying Apache's log format, I could make it write CSV instead and use the CSV exporter. But since the GE is already there, some people seem to be using it, and the other one for some reason mentions SSH, I decided to stick with GE.

Ok, we install the GE and enable and start the systemd service. This involves extracting the zip, putting the binary and the systemd service unit file somewhere, calling systemctl link so systemd sees the new service, writing the configuration file, and placing the pattern files you want in one of the pattern directories declared in the config file. Now what? Well, you could say this is the fun part.
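For reference, the config file is what ties all of this together. This sketch follows the config_version 2 layout as I remember it from the README; the paths and log file location are made up, so check the docs of whatever fork you picked:

global:
  config_version: 2
input:
  type: file
  path: /var/log/apache2/access.log
  readall: false                              # only tail new lines
grok:
  patterns_dir: /etc/grok_exporter/patterns   # where the pattern files below live
metrics:
  # one entry per metric; the histogram and counter definitions shown later go here
server:
  port: 9144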

I decided to cut my monitoring by service. The first to monitor was my raster tile based map. I butchered the original patterns for Apache, splitting them into a prefix, a middle part that matches the path in the request, and a postfix that also adds the three new fields. This way I can write a middle section for each service. Things would be different if I were serving each one on a different vhost. It also allows me to capture parts of the path, for instance to count served map tiles per zoom level. This was easy once I figured out I had to do the last step of the previous paragraph, otherwise you get errors like Pattern %{IP:client_ip} not defined.

HTTPD_LOG_PREFIX %{IP:client_ip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\]
HTTPD_LOG_POSTFIX %{NUMBER:status} (?:%{NUMBER:read}|-) (?:%{NUMBER:wrote}|-) (?:%{NUMBER:first_byte}|-) (?:%{NUMBER:total_time}|-) %{QS:referrer} %{QS:agent}

ELEVATION_LOG %{HTTPD_LOG_PREFIX} "(?:%{WORD:verb} /~mdione/Elevation/%{NUMBER:zoom_level}/%{NOTSPACE}(?: HTTP/%{NUMBER:http_version})?|%{DATA:rawrequest})" %{HTTPD_LOG_POSTFIX}

Have you noticed Apache's logs have this ugly format for timestamps? 18/Nov/2023:23:44:13 +0100. Luckily we don't really need them, otherwise this would be a nightmare just to parse and convert into something we can use.

Now it's time to define the metrics. I wrote one metrics file per service. Since this is a static site, I built histograms for response time and response size. Just for fun, I also wrote one for TTFB. This is quite straightforward, except for one thing: defining the buckets. What I did was use awk to get the field I wanted, then sort -n and less -N. I was looking at around 1000 requests, so I made buckets of around 100 requests each. This value will not be so easy to pick for more wildly varying services. For instance, at $WORK we receive and send documents or bundles of documents to be signed. The size of these PDFs and bundles of them is virtually unlimited, and in fact we're starting to experience issues where the servers freeze for up to 10m just chewing and digesting big uploads.
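If you'd rather not remember the awk incantation, a quick Python stand-in for that pipeline could be something like this; the field index is hypothetical, it depends on how your LogFormat splits on whitespace:

#!/usr/bin/env python3
# print one numeric field of the access log, sorted and numbered,
# so bucket limits can be eyeballed; rough equivalent of awk | sort -n | less -N
import sys

FIELD = 11  # hypothetical: whichever column %^FB ends up in after splitting

values = []
for line in sys.stdin:
    fields = line.split()
    try:
        values.append(int(fields[FIELD]))
    except (IndexError, ValueError):
        continue  # short or non-matching lines, or '-' values

for index, value in enumerate(sorted(values), start=1):
    print(index, value)

With the buckets picked, the metric definitions end up looking like this: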

- type: histogram
  name: elevation_tile_ttfb
  help: Elevation tile Time-To-First-Byte
  match: '%{ELEVATION_LOG}'
  value: '{{multiply .first_byte 0.001}}'  # µs to ms
  buckets: [ 1, 10, 20, 30, 40, 50, 75, 100, 150 ]

- type: counter
  name: elevation_tile_zoom_level_count
  help: Counts tiles per zoom level
  match: '%{ELEVATION_LOG}'
  labels:
    zoom_level: '{{.zoom_level}}'

The last step of the chain, and quite frankly, the one that took me the most time: having a nice graph in grafana. My grafana version is somewhat old (6.7.2 vs 10.2.1!1), so my only option is a heatmap panel. One thing to know about the grok exporter's metrics is that they're counters, so the first thing you have to do is use increase() to get per-interval increments (roughly, the derivative). It took me quite a long time to find this out (no, I can't say I figured it out, I read it here). Second, Grafana by default assumes it has to massage the data to build the heat map, so you have to tell it that the data is already a histogram. In the same vein, you also have to tell it a second time that you want a heatmap, this time as the query's format. I also hid zero values and made the tooltip a histogram. Play around with the rest of the settings.

{
  "type": "heatmap",
  "targets": [
    {
      "expr": "increase(gallery_ttfb_bucket[1m])",
      "legendFormat": "{{le}}",
      "format": "heatmap"
    }
  ],
  "heatmap": {},
  "dataFormat": "tsbuckets",
  "yAxis": {
    "format": "ms"
  },
  "tooltip": {
    "show": true,
    "showHistogram": true
  },
  "hideZeroBuckets": true
}

Here's the Gallery's dashboard:

It's me traversing my gallery. The first part is me getting into a directory that had never been seen, so the gallery had to generate all the thumbnails. That's why we're seeing TTFBs and RTs so high while the sizes stay low, because they're all thumbnails. Then it's me poking around here and there, most of the time seeing the thumbnail table, but sometimes navigating the photos one by one in their big version. No original version was downloaded. I think I'll have to let these run for a while and revisit the buckets. The one for the response size already needs an adjustment. Also notice that my home server is quite boring; I have to do synthetic usage just to have values to graph.

One thing to notice here: not once in this post do I mention the word latency, even though it's one of the buzzwords around web services. About that, I strongly suggest you watch Gil Tene's How NOT to measure latency talk. It's an oldie, but man, it packs a lot of info, including demystifying quantiles and nines.

Update

Today I noticed that an alternative to the Grok Exporter is mtail. It has the advantage that it's well maintained and already available in Debian, but on the other hand you have to program it. I'll stick to GE until a reason arises to switch to it.

Also:

  • Slight additions and clarifications.
  • Fixed typos and removed dupes.

  1. This was because I was using grafana from Debian and not their repository

Is dinant dead; or: A tip for writing regular expressions

NE: Another dictated and quickly revised post. Sorry for the mess.

Last night I was trying to develop a Prometheus exporter for Apache logs. There's only one already written, but it doesn't provide much information, and I just wanted to try it myself (yes, a little NIH).

So I decided to start with the usual thing; that is, parsing the log lines. What better way to do that than regular expressions? And since I needed to capture a lot of stuff, and then be able to reference it, I thought "Oh yeah, now I remember my project dinant. What happened with it?"

I opened the last version of the source file and found out that the code is incomplete and not in good shape. So I said "look, it's too late, I'm not going to put this back in shape now, because even if I'm doing this for a hobby, eventually I will need it for work, so I'll try to get something working quickly, and then when I have the time I'll see if I can revive dinant". So the answer to the title question is "maybe".

One of the ideas of dinant was that you would build your regular expressions piece by piece. It provides blocks that you can easily combine, which makes building the regular expression easy, but that doesn't mean you can't already do the same with plain regexps. For instance, the first thing I have to parse is an IP address. What's an IP address? Four octets joined by three dots. So we just define a regular expression that matches an octet and then one that matches the whole IP. Then for the rest of the fields of the line I kept using the same idea.

Another tip: for defining regular expressions I like to use r-strings, raw strings, so backslashes escape regular expression elements like . or * and not string elements like \n or \t; and given that they are prefixed by r, to me it's not only a raw string but also a regular expression string :)

Finally, building your regular expressions block by block and then combining them into a final regular expression should make them easier to test, because you can build test code that tests each block individually, and then test bigger and bigger expressions, exactly like I did for dinant.

Here are the regexps, quite well tested:

import re

capture = lambda name, regexp: f"(?P<{name}>{regexp})"

octect = r'([0-9]|[1-9][0-9]|1[0-9]{1,2}|2[0-4][0-9]|25[0-5])'
assert re.fullmatch(octect, '0') is not None
assert re.fullmatch(octect, '9') is not None
assert re.fullmatch(octect, '10') is not None
assert re.fullmatch(octect, '99') is not None
assert re.fullmatch(octect, '100') is not None
assert re.fullmatch(octect, '255') is not None
assert re.fullmatch(octect, '-1') is None
assert re.fullmatch(octect, '256') is None

IPv4 = r'\.'.join([octect] * 4)  # thanks to r'', the \ is a regexp escape symbol, not a string escape symbol
assert re.fullmatch(IPv4, '0.0.0.0') is not None
assert re.fullmatch(IPv4, '255.255.255.255') is not None
assert re.fullmatch(IPv4, '255.255.255') is None
assert re.fullmatch(IPv4, '255.255') is None
assert re.fullmatch(IPv4, '255') is None
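The capture() helper isn't used in the snippet above; presumably it comes into play when naming the fields of the full log line regexp, something like this (the field name and the sample line are just examples):

log_start = capture('client_ip', IPv4) + ' '
m = re.match(log_start, '192.168.0.1 - - [18/Nov/2023:23:44:13 +0100] "GET / HTTP/1.1" ...')
assert m is not None
assert m.group('client_ip') == '192.168.0.1'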

Meanwhile, after reading this, I decided to just use the grok exporter. More on that soon.

Update

Talking about it with a friend this morning, I realized that the IPv4 regex is more complex than it needs to be: Apache logs will never contain a malformed IP, unless Apache is badly misbehaving, at which point you should have better ways to detect that.