Planet MySQL Planet MySQL: Meta Deutsch Español Français Italiano 日本語 Русский Português 中文
Schema changes – what’s new in MySQL 5.6?
+3 Vote Up -0 Vote Down

Among many of the improvements you can enjoy in MySQL 5.6, there is one that addresses a huge operational problem that most DBAs and System Administrators encounter in their life: schema changes.

While it is usually not a problem for small tables or those in early stages of product life cycle, schema changes become a huge pain once your tables get a significant amount of data. Planning for maintenance is becoming more and more difficult, and your worldwide users want the service to be up and running 24/7, while on the other hand, your developers desire to introduce schema changes every week.

PITA

But what is the real problem here? Let me illustrate very typical case:

Session1> ALTER TABLE revision ADD COLUMN mycol tinyint;
Query OK, 1611193 rows affected (1 min 5.74 sec)
Records: 1611193  Duplicates: 0  Warnings: 0
Session2> INSERT INTO revision SET rev_page=3,rev_comment="test",rev_text_id=1;
Query OK, 1 row affected (48.30 sec)
Session3 > show processlist;
+----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                                                                 |
+----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+
|  1 | root | localhost | test | Query   |   47 | copy to tmp table               | ALTER TABLE revision ADD COLUMN mycol tinyint                        |
|  2 | root | localhost | test | Query   |   30 | Waiting for table metadata lock | INSERT INTO revision SET rev_page=3,rev_comment="test",rev_text_id=1 |
|  3 | root | localhost | NULL | Query   |    0 | init                            | show processlist                                                     |
+----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+
3 rows in set (0.01 sec)

What you see above is how changing table’s structure works in MySQL in all pre-5.6 versions. Normally the example INSERT statement is done in matter of microseconds. But once a DBA runs this ALTER TABLE (session1), the application (session2) has to wait for the ALTER to complete before INSERT and other DML statements can succeed. As a result, application will stall on writes to this table until ALTER TABLE completes.

WORKAROUNDS

There are many less or more complicated approaches to the ALTER TABLE problem in MySQL, starting from just well planned downtime window, through master/slave switching techniques, ending on using advanced tools that let you do the change in less possible intrusive way. These tools are out there for years, yet it appears that still many MySQL users are not aware of them. Let me just name here pt-online-schema-change from Percona Toolkit, oak-online-alter-table from Openark Kit or Facebook OSC.

LONG STORY

Historically, MySQL had to perform full table copy for all DDL (Data Definition Language) operations, while you could not write to a table being altered during the process. This problem is even more painful when awaiting writes can also block following reads.

An important, though small step forward was made for InnoDB storage engine since new InnoDB version (aka InnoDB plugin) was introduced for MySQL 5.1. Since then, you can create and drop indexes without copying the whole table contents. So at least the operation that is practised very often for query optimizations, can be done much much quicker.

Next significant improvement (or perhaps I should say fix) in DDL operations area was introduced in MySQL 5.5 as Metadata Locking. Basically from now on schema changes are working properly and consistent with transactions. You will find this explained in details here.

ONLINE(!) DDL in MySQL 5.6

Fortunately, this is not where so much desired evolution in this area has stopped! With MySQL 5.6, this time a huge step forward was made: from now on most of the ALTER types won’t block writes to a table that is being changed!

Another improvement is that in addition to existing instant ALTER implementations (like change default value for a column), now you can perform also following operations without the need of full table copy:

  • Change auto-increment value for a column
  • Rename a column (if data type remains the same)*
  • Add/Drop a foreign key constraint

As said before though, the main improvement in MySQL 5.6 is that during the ALTER operation a table being changed can still be fully accessible by clients, so both reads and writes (with few exceptions) are allowed! Complete statement matrix can be found here.

Let’s see some examples in practice

Example 1 – reset auto-increment value for a column

It can happen that wrong explicit insert sets the value for an auto-increment column higher then necessary. It may be even close to the data type limit and we want to make the table using lower values back again. We realize that and delete that high value row, but…

Session1> SELECT t.TABLE_NAME,c.COLUMN_NAME,c.COLUMN_KEY,t.AUTO_INCREMENT FROM information_schema.tables t JOIN information_schema.columns c USING (TABLE_SCHEMA,TABLE_NAME) WHERE t.TABLE_SCHEMA="test" AND t.TABLE_NAME="revision" AND c.EXTRA="auto_increment";
+------------+-------------+------------+----------------+
| TABLE_NAME | COLUMN_NAME | COLUMN_KEY | AUTO_INCREMENT |
+------------+-------------+------------+----------------+
| revision   | rev_id      | PRI        |        2000002 |
+------------+-------------+------------+----------------+
1 row in set (0.03 sec)
Session1> select max(rev_id) from revision;
+-------------+
| max(rev_id) |
+-------------+
| 1700000     |
+-------------+
1 row in set (0.00 sec)
Session1> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1;
Query OK, 1 row affected (0.02 sec)
Session1> select max(rev_id) from revision;
+-------------+
| max(rev_id) |
+-------------+
| 2000002     |
+-------------+
1 row in set (0.00 sec)

Not possible since the table has next AUTO_INCREMENT=2000002. So this alter is our only help:

Session1> ALTER TABLE revision AUTO_INCREMENT=1700001;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Session1> insert into revision set rev_page=4,rev_comment="test",rev_text_id=1;
Query OK, 1 row affected (0.01 sec)
Session1> select max(rev_id) from revision;
+-------------+
| max(rev_id) |
+-------------+
|     1700001 |
+-------------+

Finally, this operation in MySQL 5.6 is instant! In previous MySQL versions such ALTER causes full table rewrite and blocks the table for writes for the whole process time. In version 5.5.31 the same ALTER on the same hardware looks like that:

Session1> ALTER TABLE revision AUTO_INCREMENT=1700001;
Query OK, 1611226 rows affected (1 min 3.42 sec)
Records: 1611226  Duplicates: 0  Warnings: 0

Example 2 – DROP COLUMN

Session1> ALTER TABLE revision DROP COLUMN rev_sha1;
Query OK, 0 rows affected (1 min 39.24 sec)
Records: 0 Duplicates: 0 Warnings: 0

During the ALTER is in progress:

Session2> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1,rev_sha1=2;
Query OK, 1 row affected (0.01 sec)

After the ALTER finished:

Session2> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1,rev_sha1=2;
ERROR 1054 (42S22): Unknown column 'rev_sha1' in 'field list'

Great! Drop table was non-blocking, we can use the table without interruption.

Example 3 – RENAME COLUMN

Original column definition was:
rev_deleted tinyint(1) unsigned NOT NULL DEFAULT '0',

Session1> ALTER TABLE revision CHANGE COLUMN rev_deleted rev_deleted1 tinyint(1) unsigned NOT NULL DEFAULT '0';
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

(The same column rename in MySQL 5.5 or earlier copies the whole table.)

But let’s try another column:
rev_timestamp char(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',

Session1> ALTER TABLE revision CHANGE COLUMN rev_timestamp rev_date char(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '';
Query OK, 1611226 rows affected (1 min 43.34 sec)
Records: 1611226  Duplicates: 0  Warnings: 0

What the heck?

Hmm, let’s try another one:
rev_comment tinyblob NOT NULL,

Session1> ALTER TABLE revision CHANGE COLUMN rev_comment rev_comment1 tinyblob NOT NULL;
Query OK, 1611226 rows affected (2 min 7.91 sec)
Records: 1611226  Duplicates: 0  Warnings: 0

So, by the way of writing this blog post I identified a new bug and reported it here: http://bugs.mysql.com/bug.php?id=69580
In short, Online DDL does not work as expected when you rename a column of binary data type, but also for a char type with binary collation. The bug not only leads to full table copy but also blocks a table for writes.

Example 4 – NEW ALTER TABLE OPTIONS

In case you are not sure if an ALTER TABLE will copy and/or block the table, and you want to make sure your DDL statement won’t cause such problems, you can add new alter specifications to the statement: ALGORITHM and LOCK.

Session1> ALTER TABLE revision ENGINE=InnoDB, LOCK=NONE;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

As we can see full table rebuild still needs locking.

Session1> ALTER TABLE revision ADD COLUMN mycol3 tinyint, ALGORITHM=COPY, LOCK=NONE;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

So, ALGORITHM=COPY and LOCK=NONE are mutually exclusive.

Session1> ALTER TABLE revision MODIFY COLUMN rev_id bigint(8) unsigned NOT NULL AUTO_INCREMENT, ALGORITHM=INPLACE;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.

Data type change of a column is both locking and makes full table copy.

Session1> ALTER TABLE revision ADD COLUMN mycol3 tinyint, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (1 min 38.84 sec)
Records: 0  Duplicates: 0  Warnings: 0

Above example is pretty interesting as the new method is used, so no full table copy takes place, however whole operation is still as much expensive as the old way. This is because significant data reorganization is needed inside the existing ibd tablespace.

ALGORITHM and LOCK options allow you also to force using old method with full table copy by adding ALGORITHM=COPY or protect a table from writes with LOCK=SHARED. The same effect can be achieved with SET old_alter_table=1 before the ALTER.

NEW DIAGNOSTICS

I would like to mention that MySQL 5.6 offers new monitoring tools that we can use to see more details also about ongoing ALTERs.

This is how it looks like during the ALTER with adding a column is in progress:

Session3> SELECT * FROM information_schema.innodb_metrics WHERE name LIKE 'ddl%'\G
*************************** 1. row ***************************
NAME: ddl_background_drop_indexes
SUBSYSTEM: ddl
COUNT: 0
MAX_COUNT: NULL
MIN_COUNT: NULL
AVG_COUNT: 0
COUNT_RESET: 0
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: NULL
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of indexes waiting to be dropped after failed index creation
*************************** 2. row ***************************
NAME: ddl_background_drop_tables
SUBSYSTEM: ddl
COUNT: 0
MAX_COUNT: NULL
MIN_COUNT: NULL
AVG_COUNT: 0
COUNT_RESET: 0
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: NULL
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of tables in background drop table list
*************************** 3. row ***************************
NAME: ddl_online_create_index
SUBSYSTEM: ddl
COUNT: 1
MAX_COUNT: NULL
MIN_COUNT: 0
AVG_COUNT: 0.000005713795960346256
COUNT_RESET: 1
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: 0
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of indexes being created online
*************************** 4. row ***************************
NAME: ddl_pending_alter_table
SUBSYSTEM: ddl
COUNT: 1
MAX_COUNT: NULL
MIN_COUNT: 0
AVG_COUNT: 0.000005713795960346256
COUNT_RESET: 1
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: 0
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of ALTER TABLE, CREATE INDEX, DROP INDEX in progress
4 rows in set (0.01 sec)

To enable those counters, simply do:

Session3> set global innodb_monitor_enable = module_ddl;
Query OK, 0 rows affected (0.00 sec)

IS ONLINE DDL GOOD ENOUGH IN MySQL 5.6?

I would not say that. Although schema changes are now a lot more friendly, but there seems to be still a room for improvement in terms of becoming even more “online”. And there is another aspect – we can’t really control the way online DDL is done in terms of server I/O load and replication. It is often the case when full table rewrite would generate enough high disk utilization to cause performance problems. Yet we cannot do anything about this in MySQL’s Online DDL – it will copy a table with full available speed no matter if it can saturate I/O throughput or not. Here the tools like pt-online-schema-change have this big advantage where it divides table copy into chunks and checks automatically a given status variable if there is not too high load on the server, and pauses data copy process if so.

When we speak about replication – with Online DDL each ALTER is a single operation, so after the master finishes it, it gets replicated to slaves. This will cause replication lag for the time slave has to apply long ALTER, as well as increased I/O activity on all slaves at the same time. Also in this regard pt-online-schema-change chunk copy model has huge advantage – schema change is replicated to the slaves and the tool will check how much slaves are lagged hence pause the copy process in case the lag is too big. Also the data chunks are automatically adjusted to fit within given time limit.

I was curious to compare time required for the same ALTER with ONLINE DDL versus pt-online-schema-change:

[root@centos6-2 ~]# time pt-online-schema-change D=test,t=test --alter "ADD COLUMN mycol4 bigint" --execute
Operation, tries, wait:
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `test`.`test`...
Creating new table...
Created new table test._test_new OK.
Altering new table...
Altered `test`.`_test_new` OK.
Creating triggers...
Created triggers OK.
Copying approximately 952180 rows...
Copied rows OK.
Swapping tables...
Swapped original and new tables OK.
Dropping old table...
Dropped old table `test`.`_test_old` OK.
Dropping triggers...
Dropped triggers OK.
Successfully altered `test`.`test`.
real    0m22.531s
user    0m0.342s
sys     0m0.095s

versus:

Session1> ALTER TABLE test ADD COLUMN mycol4 bigint, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (30.64 sec)
Records: 0  Duplicates: 0  Warnings: 0

pt-online-schema-change wins here clearly, but it’s just one simple example out of many possible scenarios!

OVERHEAD

The fact that a table is ready for writes during Online DDL in progress, does not mean it will be performed at the same speed.

I tried a simple sysbench test to compare overall throughput of the MySQL server when the server is idle versus when ONLINE ALTER is in progress. Before the test I warmed up InnoDB buffers and ran sysbench test several times. All data for sysbench table fit in memory.

Then I started ALTER statement (for an unrelated table):

Session1> ALTER TABLE revision ADD COLUMN mycol4 tinyint, ALGORITHM=INPLACE, LOCK=SHARED;                                                                                                             Query OK, 0 rows affected (1 min 54.73 sec)
Records: 0  Duplicates: 0  Warnings: 0

And just after that, the same sysbench command:

[root@centos6-2 ~]# sysbench --num-threads=3 --max-requests=5000 --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=test --mysql-user=root run
sysbench 0.4.12:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 3
Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 5000
Threads started!
...

Below we can see concurrent threads:

Session3 > show processlist;
+----+------+-----------+------+---------+------+----------------+--------------------------------------------------------------------------------+
| Id | User | Host      | db   | Command | Time | State          | Info                                                                           |
+----+------+-----------+------+---------+------+----------------+--------------------------------------------------------------------------------+
|  1 | root | localhost | test | Sleep   |  282 |                | NULL                                                                           |
|  2 | root | localhost | test | Query   |   29 | altering table | ALTER TABLE revision ADD COLUMN mycol4 tinyint, ALGORITHM=INPLACE, LOCK=SHARED |
|  5 | root | localhost | NULL | Query   |    0 | init           | show processlist                                                               |
| 39 | root | localhost | test | Execute |    0 | Writing to net | SELECT c from sbtest where id=?                                                |
| 40 | root | localhost | test | Execute |    0 | Writing to net | DELETE from sbtest where id=?                                                  |
| 41 | root | localhost | test | Execute |    0 | Writing to net | SELECT c from sbtest where id=?                                                |
+----+------+-----------+------+---------+------+----------------+--------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

To be clear, sysbench was started after ALTER, and finished before ALTER was done.

Result? On idle MySQL instance sysbench test score is around 270 transactions per second with minimal variation between many the same tests.
While Online DDL of unrelated table is in progress, the same test scored with average 110 transactions per second. Also minimal variation between many the same tests.

In next test, I altered the sbtest table – the same which is used by sysbench test, hence additional overhead of recording all the changes to that table.

Session1> ALTER TABLE sbtest ADD COLUMN mycol4 tinyint, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (1 min 5.88 sec)
Records: 0  Duplicates: 0  Warnings: 0

Again, sysbench finished before the ALTER, and now the score was 90 transactions per second.
Note though this was only a very simple test of one example use case.

BUGS

At the end I’d like to mention bugs regarding online DDL that in my opinion are important to be aware of. Fortunately there are not many of them. Following were active as of version 5.6.12:

http://bugs.mysql.com/bug.php?id=69151 – in some circumstances statements needing table copy (like add column) create temporary table in tmpdir instead of the database directory

http://bugs.mysql.com/bug.php?id=69444 – Replication is not crash safe with DDL statements

http://bugs.mysql.com/bug.php?id=69580 – the bug around column rename, I just reported and mentioned above

CONCLUSION

I think schema changes in MySQL 5.6 are really less painful out of the box, plus we can still choose external tools if needed. Having said that, MySQL operators are in much better shape now. It is really good to see MySQL evolution is continued in right direction.

The post Schema changes – what’s new in MySQL 5.6? appeared first on MySQL Performance Blog.

Votes:

You must be logged in with a MySQL account to vote on Planet MySQL entries. More information on PlanetMySQL voting.

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.