Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский Português 中文
Employee The future of ALTER IGNORE TABLE syntax
+3 Vote Up -0 Vote Down

As part of the ongoing effort to refactor MySQL and make it better, we are defining the behaviour of certain features and in some cases reimplimenting them in a more maintainable fashion.

I have previously blogged on:

Today what I want to write about, is the IGNORE option in ALTER TABLE. From the MySQL manual:

“IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.”

This creates several issues for the MySQL server team:

  • IGNORE could remove rows from a parent table when using a foreign key relationship.
  • IGNORE makes it impossible to use InnoDB Online DDL for several operations, for example adding a PRIMARY KEY or UNIQUE INDEX.
  • IGNORE has some strange side-effects for replication. For example: DDL is always replicated via statement-based replication, and since SQL does not imply ordering, it’s not clear which rows will be deleted as part of the ignore step. I also see cross-version replication problematic if future MySQL versions were to introduce more strictness, since a slave may de-duplicate more rows.
  • The most common case

    We believe that the most common use case for IGNORE is to be able to add a UNIQUE INDEX on a table which currently has duplicate values present. i.e.

    ALTER IGNORE TABLE users ADD UNIQUE INDEX (emailaddress);

    In this scenario, a novice user manages to avoid auditing each entry in the users table, and simply lets MySQL pick a row to be kept, with all duplicates automatically removed.

    There are two other ways to be able to do that:

    Hand removal

    Using the same an example as above, return a list of email addresses and PRIMARY KEY values for records that conflict:

    SELECT GROUP_CONCAT(id), emailaddress, count(*) as count FROM users 
    GROUP BY emailaddress HAVING count >= 2;
    
    /* delete or merge duplicate from above query */
    
    ALTER TABLE users ADD UNIQUE INDEX (emailaddress);

    Note: This method will be the fastest way, since when not using IGNORE, MySQL is able to use InnoDB’s Online DDL.

    New table + INSERT IGNORE

    While this method looks very similar, internally it’s semantics are quite different:

    CREATE TABLE users_new LIKE users;
    ALTER TABLE users ADD UNIQUE INDEX (emailaddress);
    INSERT IGNORE INTO users_new SELECT * FROM users;
    DROP TABLE users;
    RENAME TABLE users_new TO users;

    By creating a table first, the MySQL server will not have to manage rows in a foreign key relationship. The rows will also be re-sent to the slave using row-based replication, so issue (3) I mentioned above does not come into play.

    Conclusion

    We are looking for feedback on how you use this feature.

    Is there another common use-case we are not accounting for?

    Would it be acceptable to add a feature to MySQL Workbench to assist in preparing a table for adding a unique index (similar to hand removal)?

    Help shape the future of MySQL. Please leave a comment or contact me.

    Votes:

    You must be logged in with a MySQL account to vote on Planet MySQL entries. More information on PlanetMySQL voting.

    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.