In computer performance, we’re especially concerned about
latency outliers: very slow database queries, application
requests, disk I/O, etc. The term “outlier” is subjective: there
is no rigid mathematical definition. From [Grubbs 69]:
An outlying observation, or “outlier,” is one that appears to
deviate markedly from other members of the sample in which it
occurs.
Outliers are commonly detected by comparing the maximum value in
a data set to a custom threshold, such as 50 or 100 ms for disk
I/O. This requires the metric to be well understood beforehand,
as is usually the case for application latency and other key
metrics. However, we are also often faced with a large number of
unfamiliar metrics, where we don’t know the thresholds in
advance.
There are a number of …
[Read more]