Often I run into problems when trying to use mk-query-digest with tcpdump on “very” busy hosts. You might be thinking, “very busy is a relative and unquantifiable term,” and you’d be right, so I’ll phrase this differently. Let me give a little background to the problem first. Mk-query-digest tries to handle dropped or missing packets gracefully, but even so they can skew results dramatically. Imagine a situation where a single connection sends two queries and gets two responses, with a response time of R for each request, and a wait time of W between the requests. If the first response and second request are dropped by the kernel, the result – from mk-query-digest’s perspective – is that the database took 2R + W time to respond to the first request.
Back to the question of, “what is a very busy host?” In my
experience, if you are getting even 5% of tcpdump packets dropped
by the kernel, the results can be skewed enough to cause
confusion about which queries are really slow. Recently, I got
more than 60% dropped packets on a server with roughly 50MB/s of
traffic on port 3306, system load of about 10, and 8 CPU cores.
The resulting mk-query-digest output was obviously bogus when
compared to the host’s slow query log (for example, none of the
top 5 slow queries reported by mk-query-digest appeared in the
actual slow log file). After a little brainstorming, we came up
with a few possible solutions:
- use mk-query-digest --filter '$event->{time} && $event->{time} > 1' to report only the queries which it believes took longer than the server’s long_query_time of 1 second
- tcpdump traffic only from a small and representative subset of clients (an example filter follows this list)
- tcpdump only those connections whose client port matches a modulo condition (while still matching port 3306 itself, because we must capture the responses from mysqld)
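For #2, the capture filter might look something like this; the two client addresses are hypothetical stand-ins for whatever representative subset you pick:

tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and (host 10.0.0.21 or host 10.0.0.22)'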
#1 has an obvious flaw: if your long_query_time is 1 second, and mk-query-digest believes that a query which actually took 10ms instead took, say, 1.5 seconds, that phantom slow query still lands in the report. That is to say, this doesn’t actually solve the real problem of dropped packets, it just applies a mask to the output.

#2 seems like the simplest good solution, but when I tested it, I still got a very high percentage of dropped packets (around 30% when filtering on only 4 out of hundreds of active clients). While this is lower than without the filter, it is still unusable.

#3 actually worked very well and resulted in about 0.2% packet loss on this host, which is acceptable: at that level the variances are statistically smoothed out and don’t noticeably affect the results. Here is the tcpdump command used.
tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2'
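To unpack that capture filter (this is my reading of the pcap expression):

- port 3306 matches MySQL traffic in either direction
- tcp[1] & 7 == 2 : tcp[1] is the low-order byte of the TCP source port, so this keeps only source ports congruent to 2 mod 8
- tcp[3] & 7 == 2 : the same test against the low-order byte of the destination port

Since 3306 mod 8 is 2, the mysqld side of every connection always passes; the client side passes for roughly 1 in 8 ephemeral ports. The net effect is a roughly 12.5% sample of connections, captured in both directions, so mk-query-digest still sees matching requests and responses.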
I also had to forward the tcpdump output to another host for processing because the database host couldn’t handle the additional IO or CPU pressure of either writing it to a file or piping it to mk-query-digest. Here is a draft of a script to automate this; use at your own risk and update to suit your needs.
#!/bin/bash

# Set some defaults
d=$(date +%F-%T | tr :- _)  # timestamp (unused below; handy for unique file names)
self=$(basename $0)
REMOTEHOST='some.other.hostname' # CHANGEME
LIMIT=50        # limit # of queries in report
SLEEPTIME=1200  # duration to capture tcpdump data in seconds (1200 = 20 min)
TMP='/tmp'
LOCKFILE="$self.lockfile"
TMPfile="$self.temp"
RESfile="$self.result"

# check lock file
if [ -e "$TMP/$LOCKFILE" ]; then
    echo "$self: lock file $LOCKFILE already exists, aborting"
    exit 1
fi

# set trap to be sure tcpdump doesn't run for ever
# and clean up the temp file too
trap 'rm -f $TMP/$LOCKFILE; kill $PID; ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile"; exit' INT TERM EXIT
touch $TMP/$LOCKFILE

# run the tcpdump & write to remote file and sleep for a bit
tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' 2>/dev/null \
    | ssh $REMOTEHOST -- "cat - > $TMP/$TMPfile" &
PID=$!
sleep $SLEEPTIME
kill $PID

# set trap to be sure both remote files are removed
trap 'ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"; rm -f $TMP/$LOCKFILE $RESfile; exit' INT TERM EXIT

# digest the result, copy to localhost, then email it
ssh $REMOTEHOST -- "mk-query-digest --type tcpdump --limit $LIMIT < $TMP/$TMPfile > $TMP/$RESfile 2>&1"
scp -q $REMOTEHOST:$TMP/$RESfile $RESfile
# email $RESfile using your preferred transport

# clean up remote and local files
ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"
rm -f $RESfile $TMP/$LOCKFILE
trap - INT TERM EXIT
exit 0
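If you want reports on a schedule, and assuming the script is saved at /usr/local/bin/mkqd-capture.sh (a hypothetical path; note that tcpdump generally needs root), an /etc/cron.d entry along these lines would produce a fresh report every four hours:

0 */4 * * * root /usr/local/bin/mkqd-capture.sh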