Any Galera documentation about limitations will state that tables must have primary keys. They state that DELETEs are unsupported and other DMLs could have unwanted side-effects such as inconsistent ordering: rows can appear in different order on different nodes in your cluster.If you are not actively relying on row orders, this could seem acceptable. Deletes […]
What's a good natural key? This is a very difficult question for most entities when you design your schema. In some rare cases, there seems to be an "obvious" candidate, such as a variety of ISO standards, including: ISO 639 language codesISO 3166 country codesISO 4217 currency codes But even in those cases, there might … Continue reading The Cost of Useless Surrogate Keys in Relationship Tables →
There are a number of options for generating ID values for your tables. In this post, Alexey Mikotkin of Devart explores your choices for generating identifiers with a look at auto_increment, triggers, UUID and sequences.
AUTO_INCREMENT
Frequently, we happen to need to fill tables with unique identifiers. Naturally, the first example of such identifiers is PRIMARY KEY data. These are usually integer values hidden from the user since their specific values are unimportant.
When adding a row to a table, you need to take this new key value from somewhere. You can set up your own process of generating a new identifier, but MySQL comes to the aid of the user with the AUTO_INCREMENT column setting. It is set as a column attribute and allows you to generate unique integer identifiers. As an example, consider the …
[Read more]The release of MySQL 8.0 has brought a lot of bold implementations that touched on things that have been avoided before, such as added support for common table expressions and window functions. Another example is the change in how AUTO_INCREMENT (autoinc) sequences are persisted, and thus replicated.
This new implementation carries the fix for bug #73563 (Replace result in auto_increment value less or equal than max value in row-based), which we’ve only found about recently. The surprising part is that the use case we were analyzing is a somewhat common one; this must be affecting a good number of people out there.
Understanding the bug
The business logic of the use case is such the UNIQUE column found in a table whose id is managed by an AUTO_INCREMENT sequence needs to be updated, and this is done with a …
[Read more]This morning I received an update on a MySQL bug, someone added a comment on an issue I filed in November 2003, originally for MySQL 4.0.12 and 4.1.0. It’s MySQL bug#1956 (note the very low number!), “Parser allows naming of PRIMARY KEY”:
[25 Nov 2003 16:23] Arjen Lentz Description: When specifying a PRIMARY KEY, the parser still accepts a name for the index even though the MySQL server will always name it PRIMARY. So, the parser should NOT accept this, otherwise a user can get into a confusing situation as his input is ignored/changed. How to repeat: CREATE TABLE pk (i INT NOT NULL); ALTER TABLE pk ADD PRIMARY KEY bla (i); 'bla' after PRIMARY KEY should not be accepted. Suggested fix: Fix grammar in parser.
Most likely we found it during a MySQL training session, training days have always been a great bug catching …
[Read more]Data inconsistencies in replication environments are a pretty common. There are lots of posts that explain how to fix those using pt-table-checksum and pt-table-sync. Usually we only care about the data but from time to time we receive this question in support:
How can I check the table definition consistency between servers?
Replication also allow us to have different table definition between master and slaves. For example, there are some cases that you need some indexes on slaves for querying purposes but are not really needed on the master. There are some other cases where those differences are just a mistake that needs to be fixed.
…
[Read more]On a recent engagement I worked with a customer who makes extensive use of UUID() values for their Primary Key and stores it as char(36), and their row count on this example table has grown to over 1 billion rows.
The table is INSERT-only (no UPDATEs or DELETEs), and the bulk of their retrieval are PK lookups. Lookups by PK were performing acceptably, but they were concerned with the space usage by the table as we were approaching 1TB (running with innodb_file_per_table=1 and Percona Server 5.5).
This schema model presents an increasing burden for backups since they use Percona XtraBackup, and so the question was asked: does their choice of an effectively random Primary Key based on UUID() impact their on-disk storage, and to what extent? And as a neat trick I show towards the end of this post how you can calculate the …
[Read more]I ran into an interesting issue today, while examining some EXPLAIN outputs, and wanted to share the findings, as some of this is undocumented.
Basically, you can start with a very simple InnoDB table - 2 INT columns, Primary Key (PK) on the 1st column, regular index on the 2nd:
CREATE TABLE `t1` ( `id1` int(10) unsigned NOT NULL, `id2` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id1`), KEY `id2` (`id2`) ) ENGINE=InnoDB;
The query is:
SELECT id1 FROM t1;
This is a straight-forward query with no WHERE clause.
I ran into an interesting issue today, while examining some EXPLAIN outputs, and wanted to share the findings, as some of this is undocumented.
Basically, you can start with a very simple InnoDB table – 2 INT columns, Primary Key (PK) on the 1st column, regular index on the 2nd:
CREATE TABLE `t1` ( `id1` int(10) unsigned NOT NULL, `id2` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id1`), KEY `id2` (`id2`) ) ENGINE=InnoDB;
The query is:
SELECT id1 FROM t1;
This is a straight-forward query with no WHERE clause.
Given no WHERE clause, we know there will be a full table or index scan. Let’s look at EXPLAIN:
mysql> EXPLAIN SELECT id1 FROM t1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t1 type: index possible_keys: NULL key: id2 key_len: 5 ref: NULL rows: 1 …[Read more]
Each day there is probably work done to improve performance of the InnoDB storage engine and remove bottlenecks and scalability issues. Hence there was another one I wanted to highlight: Scalability issues due to tables without primary keys. This scalability issue is caused by the usage of tables without primary keys. This issue typically shows itself as contention on the InnoDB dict_sys mutex. Now the dict_sys mutex controls access to the data dictionary. This mutex is used at various important places throughout the InnoDB code and as such any contention on the dict_sys mutex is going to have a InnoDB system-wide negative affect.
The post InnoDB scalability issues due to tables without primary keys appeared first on ovais.tariq.