In an application such as a database server, instrumentation is
like sex: it’s not enough to know how often things happen. You
also care about how long they took, and in many cases you want to
know how big they were.
“Things” are the things you want to optimize. Want to optimize
queries? Then you need to know what activities each query causes
to happen. Most systems have at least some of this kind of
instrumentation. Look around at… let’s not pick on the
usual targets… oh, say Sphinx, Redis, and memcached. What metrics
do they provide? They provide counters that say how often various
things happened. (Most of these systems provide only a few,
coarse-grained counters.) That’s not very helpful. So I read from
disk N times, and I read from memory N times, and I compared rows
N times… so what? I still don’t know anything relevant to
execution time.
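
To make the difference concrete, here is a minimal sketch in Python of instrumentation that records all three things per activity: how often, how long, and how big. Everything in it is an assumption for illustration: the `Metrics` class, the `measure` helper, and the `disk_read` activity name are hypothetical, and `time.sleep` is only a stand-in for real work. It is not taken from any of the systems named above.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Hypothetical per-activity totals: how often, how long, how big."""

    def __init__(self):
        self.stats = defaultdict(
            lambda: {"count": 0, "seconds": 0.0, "bytes": 0}
        )

    @contextmanager
    def measure(self, activity, size_bytes=0):
        # Time the wrapped block with a monotonic, high-resolution clock.
        start = time.perf_counter()
        try:
            yield
        finally:
            entry = self.stats[activity]
            entry["count"] += 1                               # how often
            entry["seconds"] += time.perf_counter() - start   # how long
            entry["bytes"] += size_bytes                      # how big


metrics = Metrics()

# time.sleep stands in for a real disk read (illustrative only).
with metrics.measure("disk_read", size_bytes=4096):
    time.sleep(0.01)
with metrics.measure("disk_read", size_bytes=8192):
    time.sleep(0.01)

for activity, entry in metrics.stats.items():
    print(activity, entry["count"], round(entry["seconds"], 3), entry["bytes"])
```

A bare counter would only tell me that `disk_read` happened twice. With the elapsed time and byte totals next to it, I can at least start to relate the activity to execution time.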
That’s why we need to measure how long things took. It’d be …