We both agreed the dataset I was using was too small, but I got similar results even at larger scale. Then some time passed, and 5.6.9 was announced.
So for the 5.6.9 test I took one of my real production tables. It is not extremely large: a ~300MB .ibd file, in the following format:
```sql
mysql> show create table tbl \G
CREATE TABLE `tbl` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `a` varchar(255) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `w` smallint(11) NOT NULL DEFAULT '0',
  `d` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `icount` smallint(5) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`) KEY_BLOCK_SIZE=8,
  UNIQUE KEY `u_idx` (`a`,`w`,`d`) KEY_BLOCK_SIZE=8,
  KEY `d` (`d`) KEY_BLOCK_SIZE=8
) ENGINE=InnoDB AUTO_INCREMENT=16960441 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=16
```
The table has some 2.5M rows. Testing was done on a desktop machine, 64-bit Linux, using MySQL Sandbox.
I crossed several DDL statements with several DML statements. The DDL statements in this test are (ALTER TABLE...):
- ADD INDEX (w)
- DROP INDEX w
- ADD COLUMN c CHAR(1) NOT NULL
- DROP COLUMN c
The DML statements are:
- select max(id) from test.tbl; -- this queries the AUTO_INCREMENT value, which is served by the PRIMARY KEY
- select min(d) from test.tbl; -- there is an index on d, and normal execution plan is to optimize table away and just use the index
- select min(icount) from test.tbl; -- there is no index on icount, and full table scan is required
- update test.tbl set d = d + interval 1 second where id = 8057370; -- the UPDATE uses the PRIMARY KEY
- update test.tbl set d = d + interval 1 second where icount = 200; -- will affect 4 rows, but requires full scan.
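For illustration, a single test round looks roughly like this: the ALTER runs in one session, and DML is issued repeatedly from a second session while it is in flight. This is a sketch of the crossing, not a verbatim transcript; statements are taken from the lists above.

```sql
-- Session 1: start the online DDL (blocks this session until done)
ALTER TABLE test.tbl ADD INDEX (w);

-- Session 2: meanwhile, issue DML against the same table
SELECT MAX(id) FROM test.tbl;
SELECT MIN(d) FROM test.tbl;
UPDATE test.tbl SET d = d + INTERVAL 1 SECOND WHERE id = 8057370;
```

If the DML in session 2 returns while session 1 is still working, the operation is "online" in the sense used in the table below.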
| ALTER TABLE... | Time (sec) | General comments | select max PK | select min by index | select min by full scan | update by PK | update by full scan |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ADD INDEX (w) | 21 | | online | online | online | online | online |
| DROP INDEX w | 0.1 | Instant operation | n/a | n/a | n/a | n/a | n/a |
| ADD COLUMN c CHAR(1) NOT NULL | 103 | | online | online | online | online | online |
| DROP COLUMN c | 110 | | online | online | online | online | online |
- All DML operations were online: they did not wait for the ALTER to complete.
- I executed all operations multiple times during each ALTER.
- In addition, I executed operations from another client.
- Some operations were fast; others took as long as 7.34 seconds to complete. This is no small matter: the time each DML statement took was nondeterministic, and longer than it would usually take. That is understandable, but note that some operations took exceedingly long to complete. My understanding is that the ALTER operation happens in chunks, and DML statements are allowed in between these chunks. This would explain why, on smaller tables, no statement seemed to run "online": the chunks were simply too large relative to the table size. And so, still by my own understanding, a query may get lucky or unlucky depending on the exact moment it is issued.
- I did not try it with FOREIGN KEYs. I previously concluded that foreign keys were a no-go for online DDL. I'm not sure if this is still the case. That test will have to wait for another time - but it must take place.
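MySQL 5.6 also lets you assert the online behavior explicitly, via the new ALGORITHM and LOCK clauses on ALTER TABLE: if the requested combination is not supported for the operation, the statement fails immediately rather than silently falling back to a blocking table copy. A minimal sketch against the table above:

```sql
-- Fails up front if the operation cannot run in-place
-- with concurrent DML, instead of blocking writers
ALTER TABLE test.tbl ADD INDEX (w), ALGORITHM=INPLACE, LOCK=NONE;
```

This removes the guesswork: you find out at parse time, not mid-migration, whether an operation will actually be online.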
5.6.9 is still an RC - but for the first time, online DDL seems to deliver what is promised. I'm very happy to see this.
I have yet to understand how the ALTER behaves under replication. With single-threaded replication I would assume it's back to "wait till I'm done" on the slave, in which case the "online" promise does not hold there. Even with multi-threaded replication, DML on the same schema would hang. I'm happy to be corrected on this by an authority.
My prediction is that oak-online-alter-table and pt-online-schema-change are here to stay for the next couple of years at least. Some operations, like partitioning, are not supported by the current online InnoDB DDL. Also, these scripts allow you some control over the speed at which the ALTER process works, allowing for a pre-defined sleep time in between chunks, so as to let the server - and its slaves - catch their breath.
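As an example of that throttling control, a pt-online-schema-change invocation might look like the following. Host, credentials, and the specific values are placeholders; the flags (`--chunk-size`, `--sleep`, `--max-lag`) are real Percona Toolkit options for pacing the copy and backing off when slaves lag.

```
pt-online-schema-change \
  --alter "ADD COLUMN c CHAR(1) NOT NULL" \
  --chunk-size 1000 \
  --sleep 0.5 \
  --max-lag 1 \
  --execute \
  D=test,t=tbl,h=localhost,u=user,p=pass
```

Native online DDL offers no equivalent knob: once the ALTER starts, it proceeds at whatever pace the server sustains.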
Nonetheless, big kudos to the InnoDB team at Oracle for pulling this one off!