Remember this? Ok, maybe you never read that. The gist of the post is that I used strace -r -T to produce some logs that we «amassed[sic] [...] with a python script for generating[sic] a CSV file [...] and we got a very interesting graph». Mein Gott, sometimes the English I write is terrible... Here's that graph again:

This post is to announce that that Python script is now public. You can find it here. It's not as fancy as those flame graphs you see everywhere else, but it's a good first impression, specially if you have to wait until the SysAdmin installs perf or any other tool like that (ok, let's be fair, l/strace is not a standard tool, but probably your grumpy SysAdmin will be more willing to install those than something more intrusive; I know it happened to me, at least). It's written in Python3; I'll probably backport it to Python2 soon, so those stuck with it can still profit from it.

To produce a similar graph, use the --histogram option, then follow the suggestions spewed to stderr. I hope this helps you solve a problem like it did to me!

profiling python

Posted Thu 09 Jun 2016 03:27:24 PM CEST Tags: profiling

We have a system that is not having very good performance. This system is a series of processes constantly polling a directory to find new files and notify other processes[1]. Currently there are 9 instances, but tests have proven that 3 or 4 has way more performance than 9, which actually is as performant as having only one. But all this in a test machine, that most probably is a virtual machine competing for the spinning disk and seeking disk head with other machines.

So we decided to do a little profiling to see if the processes are actually waiting for the disk. A priory they weren't, as top didn't show them as so, but as top is too passive, we decided to look better. The tool of choice was, obviously, strace. Having found the PID of the working thread of one of the instances, we simply launched this command:

strace -r -T -p <pid> -o foo.log & sleep 20; kill %1

This attaches strace to the thread's PID (-p) in the background for 20 seconds and then kills it. A log is written in foo.log (-f) with all the system calls, the time spent on each (-T), and the time spent between them (-r). Then we amassed foo.log with a python script for generating a CSV file with a summary which on each line has the syscall name, the amount of calls, the minimum, average, maximum and total call time. This is similar to strace's -c switch, but it only presents times and percentages. As we're interested in small response time, having the maximum time is important for us. Then we amassed a little more the data on a spreadsheet and we got a very interesting graph:

My God, does LibreOffice save some pretty awful images. But I digress...

The Count series (blue) is driven by the scale at the right, and the scale at the left, which drives the rest of the series, it's in logarithmic scale.

What we can immediately see is that we spend most of the time on stat(). This is not surprising given the workload of the system. But still, more than 6 seconds[2] spent on almost 100000 calls is a lot. If you squint a little, you will see that in these 20 seconds, only 800 files are processed (noticeable by the amount of open()s[3]).

We checked the logs and we found that more than a half of the calls to stat() were done with /etc/localtime, which is insane. It seems like the culprit is strftime() stat()ing the file on each call[4]. It seems that the way to avoid that is setting the TZ envvar.

... Or avoid calling strftime() so many times, of course.

[1] In my neighborhood we called this inotify. In a big enterprise it seems to be called MID. The pleasantries of working in MegaCorp...

[2] Remember that the scale is logarithmic, so almost between 1 and 10 is more or less 6. Seeing that reading logarithmic scales is not so easy led me to think of an logarithmic/linear scale, which is logarithmic for the orders of magnitude, but linear between them.

[3] I tried putting the scale on the right logarithmic too, but that makes the graph confusing... more confusing.

[4] You know me, I tried to give a link to an online browsable version of glibc's source code showing the culprit line, but really after 10 minutes of navigating it, I got lost in weird macro definitions and gave up. I wish I never ever have to deal with that code.


Posted Sat 01 Jun 2013 07:05:14 PM CEST Tags: profiling