MySQL - The Slotted Counter Pattern

It is a common database pattern to increment an INT column when an event happens, such as a download. Early last year at GitHub we started encountering issues with slow UPDATES to these types of counters.

You can go a long way with this pattern, until bursts of these events happen in parallel and you hit contention on a single row. When multiple transactions try to update the same counter, each one has to wait for the previous transaction's row lock, so the updates are effectively forced to run serially, which is bad for concurrency.
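
For context, the naive version looks something like the sketch below (the download_counts table and column names are purely illustrative, not our actual schema). Every event issues an UPDATE against the same row, so concurrent transactions queue up behind that row's lock:

UPDATE download_counts
SET count = count + 1     -- every event touches this one row,
WHERE record_id = 456;    -- so concurrent updates serialize on its lock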

You can see here a dramatic increase in query time when a burst like this occurred:

<img src="/images/burst.png" height="200px" width="725px" />

To avoid problems like this we had to approach this kind of counting differently. We decided to use a separate table with a schema similar to this:

CREATE TABLE `slotted_counters` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `record_type` int(11) NOT NULL,
  `record_id` int(11) NOT NULL,
  `slot` int(11) NOT NULL DEFAULT '0',
  `count` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `records_and_slots` (`record_type`,`record_id`,`slot`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

  • record_type - The type of counter (allows us to keep the table generic)
  • record_id - Identifies whatever we are counting; it could map to a repository id, for example
  • slot - The slot we are going to increment
  • count - The count for each slot

A typical increment query would look like:

INSERT INTO slotted_counters(record_type, record_id, slot, count)
VALUES (123, 456, FLOOR(RAND() * 100), 1)
ON DUPLICATE KEY UPDATE count = count + 1;

The idea here is that instead of incrementing a single row for a counter, we now pick a slot and increment the count in that slot. This means that instead of hammering a single row, we spread the updates across 100 rows, reducing the potential for contention.

Once we have run the above INSERT a few times we can see the counter rows:

mysql> select * from slotted_counters;

+----+-------------+-----------+------+-------+
| id | record_type | record_id | slot | count |
+----+-------------+-----------+------+-------+
|  1 | 123         |       456 |    2 |     1 |
|  2 | 123         |       456 |   52 |     1 |
|  3 | 123         |       456 |   55 |     1 |
|  4 | 123         |       456 |    0 |     3 |
|  7 | 123         |       456 |   48 |     1 |
|  8 | 123         |       456 |   20 |     1 |
|  9 | 123         |       456 |   56 |     1 |
| 10 | 123         |       456 |   18 |     1 |
| 11 | 123         |       456 |   22 |     1 |
| 12 | 123         |       456 |   58 |     1 |
| 13 | 123         |       456 |   23 |     1 |
+----+-------------+-----------+------+-------+
11 rows in set (0.00 sec)
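
The slot values above come from RAND(), but any reasonably uniform way of picking a slot works. As an illustration (not what we actually run), you could derive the slot from the connection id instead, assuming you still want 100 slots:

INSERT INTO slotted_counters(record_type, record_id, slot, count)
VALUES (123, 456, CONNECTION_ID() % 100, 1)
ON DUPLICATE KEY UPDATE count = count + 1;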

Getting the count for record_id 456 is as simple as this SELECT query:

SELECT SUM(count) as count FROM slotted_counters
WHERE (record_type = 123 AND record_id = 456);

Now requests can execute counter increments in parallel without causing contention or affecting concurrency.

There are a few ways you could implement this pattern, depending on the architecture of your app. One option is to periodically query the slotted_counters table to roll up the data and write the total back to a column stored alongside the rest of the record's data.
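
As a sketch of that roll-up (the repositories table and download_count column are hypothetical stand-ins for wherever you keep the denormalized total):

UPDATE repositories
SET download_count = (
  SELECT SUM(count) FROM slotted_counters
  WHERE record_type = 123 AND record_id = 456
)
WHERE id = 456;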

I’d be interested to hear how other people have solved this problem, so get in touch if you’ve come up with something different.

If you love solving problems, GitHub is hiring Application and Infrastructure Engineers!