Answering questions asked on the site.
Aryé asks:
Thanks for your article about late row lookups in MySQL.
I have two questions for you please:
- Is this workaround specific to MyISAM engine?
- How does PostgreSQL handle this?
The questions concerns a certain workaround for
MySQL LIMIT … OFFSET
queries like
this:
SELECT * FROM mytable ORDER BY id LIMIT 10 OFFSET 10000
which can be improved using a little rewrite:
SELECT m.* FROM ( SELECT id FROM mytable ORDER BY id LIMIT 10 OFFSET 10000 ) q JOIN mytable m ON m.id = q.id ORDER BY m.id
For the rationale behind this improvement, please read the original article.
Now, to the questions.
The second questions is easy: PostgreSQL won't
pull the fields from the table until it really needs them. If a
query involving an ORDER BY
along with
LIMIT
and OFFSET
is optimized to use
the index for the ORDER BY
part, the table lookups
won't happen for the records skipped.
Though PostgreSQL does not reflect the table
lookups in the EXPLAIN
output, a simple test would
show us that they are done only LIMIT
times, not
OFFSET + LIMIT
(like MySQL does).
Now, let's try to answer the first question: will this trick improve the queries against an InnoDB table?
To do this, we will create a sample table:
CREATE TABLE filler ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT ) ENGINE=Memory; CREATE TABLE lookup ( id INT NOT NULL PRIMARY KEY, value INT NOT NULL, shorttxt TEXT NOT NULL, longtxt TEXT NOT NULL ) ENGINE=InnoDB ROW_FORMAT=COMPACT; CREATE INDEX ix_lookup_value ON lookup (value); DELIMITER $$ CREATE PROCEDURE prc_filler(cnt INT) BEGIN DECLARE _cnt INT; SET _cnt = 1; WHILE _cnt <= cnt DO INSERT INTO filler SELECT _cnt; SET _cnt = _cnt + 1; END WHILE; END $$ DELIMITER ; START TRANSACTION; CALL prc_filler(100000); COMMIT; INSERT INTO lookup SELECT id, CEILING(RAND(20110211) * 1000000), RPAD('', CEILING(RAND(20110211 << 1) * 100), '*'), RPAD('', CEILING(8192 + RAND(20110211 << 1) * 100), '*') FROM filler;
There is one indexed INT
column and two
TEXT
columns, shorttxt
storing short
strings (1 to 100 characters),
longtxt
storing long strings (8193
to 8293 characters).
Let's run some queries against this table.
PRIMARY KEY and the INT column
No rewrite:
SELECT value FROM lookup ORDER BY id LIMIT 10 OFFSET 90000
value |
---|
12336 |
314476 |
535374 |
733443 |
61089 |
105117 |
342318 |
396237 |
954232 |
582449 |
10 rows fetched in 0.0002s (0.0415s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | lookup | index | PRIMARY | 4 | 90010 | 622.36 |
select `20110211_late`.`lookup`.`value` AS `value` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`id` limit 90000,10
Rewrite:
SELECT l.value FROM ( SELECT id FROM lookup ORDER BY id LIMIT 10 OFFSET 90000 ) q JOIN lookup l ON l.id = q.id ORDER BY q.id
value |
---|
12336 |
314476 |
535374 |
733443 |
61089 |
105117 |
342318 |
396237 |
954232 |
582449 |
10 rows fetched in 0.0002s (0.0407s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | <derived2> | ALL | 10 | 100.00 | Using filesort | ||||
1 | PRIMARY | l | eq_ref | PRIMARY | PRIMARY | 4 | q.id | 1 | 100.00 | |
2 | DERIVED | lookup | index | PRIMARY | 4 | 90010 | 622.36 | Using index |
select `20110211_late`.`l`.`value` AS `value` from (select `20110211_late`.`lookup`.`id` AS `id` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`id` limit 90000,10) `q` join `20110211_late`.`lookup` `l` where (`20110211_late`.`l`.`id` = `q`.`id`) order by `q`.`id`
As you can see, there is almost no difference (41 ms vs 40 ms).
InnoDB tables are clustered on their
PRIMARY KEY
columns, which means the the index on
id
(used to serve the ORDER BY
condition) contains all the data the query needs. There is a
negligible benefit from not lookup up the value
columns at the index pages because the column is tiny and the
index pages need to be read anyway.
PRIMARY KEY and the short TEXT column
Rewrite:
SELECT LENGTH(l.shorttxt) FROM ( SELECT id FROM lookup ORDER BY id LIMIT 10 OFFSET 90000 ) q JOIN lookup l ON l.id = q.id ORDER BY q.id
LENGTH(l.shorttxt) |
---|
17 |
41 |
52 |
39 |
36 |
65 |
15 |
78 |
44 |
85 |
10 rows fetched in 0.0003s (0.0401s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | <derived2> | ALL | 10 | 100.00 | Using filesort | ||||
1 | PRIMARY | l | eq_ref | PRIMARY | PRIMARY | 4 | q.id | 1 | 100.00 | |
2 | DERIVED | lookup | index | PRIMARY | 4 | 90010 | 622.36 | Using index |
select length(`20110211_late`.`l`.`shorttxt`) AS `LENGTH(l.shorttxt)` from (select `20110211_late`.`lookup`.`id` AS `id` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`id` limit 90000,10) `q` join `20110211_late`.`lookup` `l` where (`20110211_late`.`l`.`id` = `q`.`id`) order by `q`.`id`
No rewrite:
SELECT LENGTH(shorttxt) FROM lookup ORDER BY id LIMIT 10 OFFSET 90000
LENGTH(shorttxt) |
---|
17 |
41 |
52 |
39 |
36 |
65 |
15 |
78 |
44 |
85 |
10 rows fetched in 0.0002s (0.0925s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | lookup | index | PRIMARY | 4 | 90010 | 622.36 |
select length(`20110211_late`.`lookup`.`shorttxt`) AS `LENGTH(shorttxt)` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`id` limit 90000,10
There is quite a significant difference (92 ms vs 40 ms) the reasons for which we will discuss a little bit later, after we see the results of the third query.
PRIMARY KEY and the long TEXT column
Rewrite:
SELECT LENGTH(l.longtxt) FROM ( SELECT id FROM lookup ORDER BY id LIMIT 10 OFFSET 90000 ) q JOIN lookup l ON l.id = q.id ORDER BY q.id
LENGTH(l.longtxt) |
---|
8209 |
8233 |
8244 |
8231 |
8228 |
8257 |
8207 |
8270 |
8236 |
8277 |
10 rows fetched in 0.0002s (0.0396s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | <derived2> | ALL | 10 | 100.00 | Using filesort | ||||
1 | PRIMARY | l | eq_ref | PRIMARY | PRIMARY | 4 | q.id | 1 | 100.00 | |
2 | DERIVED | lookup | index | PRIMARY | 4 | 90010 | 622.36 | Using index |
select length(`20110211_late`.`l`.`longtxt`) AS `LENGTH(l.longtxt)` from (select `20110211_late`.`lookup`.`id` AS `id` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`id` limit 90000,10) `q` join `20110211_late`.`lookup` `l` where (`20110211_late`.`l`.`id` = `q`.`id`) order by `q`.`id`
No rewrite:
SELECT LENGTH(longtxt) FROM lookup ORDER BY id LIMIT 10 OFFSET 90000
LENGTH(longtxt) |
---|
8209 |
8233 |
8244 |
8231 |
8228 |
8257 |
8207 |
8270 |
8236 |
8277 |
10 rows fetched in 0.0000s (30.3594s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | lookup | index | PRIMARY | 4 | 90010 | 622.36 |
select length(`20110211_late`.`lookup`.`longtxt`) AS `LENGTH(longtxt)` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`id` limit 90000,10
The query employing the trick runs for same 40 ms, while the straightforward query takes as much as 30 seconds (30,359 ms, to be exact).
Why such a difference?
The reason is that InnoDB
, despite the fact it
stores the data in the clustered index, is still able to move
some data out of the index. This is called external storage.
With COMPACT
row format I used to create the tables,
InnoDB
, when trying to insert a record with two
TEXT
columns on the page, will try to fit both of
them on the page. If this is not possible, then it will split the
longest of the records in two parts: the first
768 bytes will be stored on the page and the
remaining data will be stored on a separate page (or pages), with
a pointer to these data stored in the original clustered index.
This will be repeated until all TEXT
columns fit on
the page of there is no more space there (in which case an error
would be thrown).
This means that all TEXT
columns shorter than
768 bytes will be stored completely on the page,
while those longer can be either split of stored as a whole (with
at least first 768 bytes still being on the
page).
With the column lengths chosen (1 to
100 and 8K to 8K +
100, accordingly), it can be easily seen that
shorttxt
will always be stored on-page,
while longtxt
will always be split (since
InnoDB
allows at most 8K
per record
(minus some overhead) to be stored on one page).
Now, this becomes more clear. As with MyISAM,
the straightforward query involving longtxt
should
perform two page lookups per each record scanned: the first one
on the clustered index, the second one on the external storage.
This takes lots of time and may even spoil the
InnoDB cache with unneeded data (which would
lead to increased cache miss ratio).
The query on shorttxt
is not that bad, since it only
requires come extra CPU cycles per record to calculate the string
length.
Now, let's check one more query which orders by a secondary
indexed field rather than id
:
Secondary index and the short text column
Rewrite:
SELECT LENGTH(l.shorttxt) FROM ( SELECT id, value FROM lookup ORDER BY value LIMIT 10 OFFSET 90000 ) q JOIN lookup l ON l.id = q.id ORDER BY q.value
LENGTH(l.shorttxt) |
---|
14 |
85 |
16 |
34 |
77 |
78 |
3 |
49 |
53 |
60 |
10 rows fetched in 0.0002s (0.0262s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | <derived2> | ALL | 10 | 100.00 | Using filesort | ||||
1 | PRIMARY | l | eq_ref | PRIMARY | PRIMARY | 4 | q.id | 1 | 100.00 | |
2 | DERIVED | lookup | index | ix_lookup_value | 4 | 90010 | 622.36 | Using index |
select length(`20110211_late`.`l`.`shorttxt`) AS `LENGTH(l.shorttxt)` from (select `20110211_late`.`lookup`.`id` AS `id`,`20110211_late`.`lookup`.`value` AS `value` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`value` limit 90000,10) `q` join `20110211_late`.`lookup` `l` where (`20110211_late`.`l`.`id` = `q`.`id`) order by `q`.`value`
No rewrite:
SELECT LENGTH(shorttxt) FROM lookup ORDER BY value LIMIT 10 OFFSET 90000
LENGTH(shorttxt) |
---|
14 |
85 |
16 |
34 |
77 |
78 |
3 |
49 |
53 |
60 |
10 rows fetched in 0.0002s (0.2663s) |
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | lookup | index | ix_lookup_value | 4 | 90010 | 622.36 |
select length(`20110211_late`.`lookup`.`shorttxt`) AS `LENGTH(shorttxt)` from `20110211_late`.`lookup` order by `20110211_late`.`lookup`.`value` limit 90000,10
The execution times for these queries vary tenfold: 26 ms vs 266 ms.
As with MyISAM, the secondary index requires an
extra lookup to retrieve the actual values from the table (the
only difference is that any index is secondary in
MyISAM, including that used to police the
PRIMARY KEY
).
The first query does not perform these row lookups on the skipped
records and hence is ten times as fast. It is even faster than
queries ordering on the PRIMARY KEY
, because the
secondary index contains significantly less data than the
PRIMARY KEY
, holds much more records per page, is
hence more shallow and can be traversed faster.
The second query does perform the early row lookups, as usual.
Conclusion
A trick used to avoid early row lookups for the LIMIT …
OFFSET
queries is useful on InnoDB tables
too, though to different extent, depending on the ORDER
BY
condition and the columns involved:
- It's very useful on queries involving columns stored off-page
(long
TEXT
,BLOB
andVARCHAR
columns) - It's very useful on
ORDER BY
conditions served by secondary indexes - It's quite useful on moderate sized columns (still stored on page) or CPU-intensive expressions
- It's almost useless on short columns without complex CPU-intensive processing
Hope that helps.
I'm always glad to answer the questions regarding database queries.
The post Late row lookups: InnoDB appeared first on EXPLAIN EXTENDED.