Some time ago, we announced a new release model for MySQL. Like all new things, it had some initial hiccups (with MySQL 5.4 we were still getting acquainted with the new model), but now it seems to be in full swing. By the time you read these lines, MySQL 5.5 will be available. If the mirrors aren't seeded yet, the impatient can compile and use the new version from the Launchpad source tree.
Overview
What's this new release anyway? I'll leave it to Kaj's blog to give you the full description.
Suffice it to say that this release is the second milestone of
the current installment. It is of beta quality, and it will
mature to RC quality. There will be yet another milestone before
we release a GA in mid 2010.
One thing this milestone shows is that there is no dead weight. If a feature doesn't make the deadline, i.e. it doesn't reach beta quality by the scheduled date, it is dropped, and may be rescued in a later milestone.
With the introduction of the milestone model, we have also
increased our internal QA, especially thanks to the Random Query
Generator, which finds bugs in early stages of the code
faster than any other method. (1)
Built-in InnoDB plugin
The InnoDB plugin 1.0.5 is included in the distribution, and, unlike MySQL 5.1, it's built in. There is no need to load and register the plugin. The performance enhancements developed for MySQL 5.4 are now available together with the other enhancements that come with the InnoDB plugin. This was already the case in the previous milestone, but it's worth mentioning now, because not many people are aware of it.
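If you want to double-check what you are running, a quick look at the server should be enough. This is just a sanity-check sketch; the exact version string depends on your build.
SHOW PLUGINS;
SHOW VARIABLES LIKE 'innodb_version';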
Semi-synchronous replication
Of all the new features, this one is probably the most relevant. It is based on a patch that Google made to the InnoDB engine, adapted by MySQL developers to make it engine-independent.
In short, it's a safety device that establishes some internal
communication between master and slaves, and makes sure that at
least one slave has received the changes being committed. That
is, before committing, the master waits until at least one slave
has acknowledged that it has received the portion of binary log
necessary to reproduce the transaction.
UPDATE As Harrison Fisk rightly notes, there is a mistake in my description. The acknowledgment that the transaction has reached the slave only happens after the master has committed it to the binary log.
Some caveats apply:
- It's called semi-synchronous replication, because it doesn't necessarily apply to all the slaves. Although you can manually check if the replication has worked for all the slaves, it's enough for the master to make sure that at least one slave has got the goods.
- Received by a slave doesn't mean "executed". The slave acknowledges when it has got the binary log, even if the SQL thread is busy or stopped.
- If there is no slave that can acknowledge receipt (e.g. slaves are all down or stopped), then the master reverts to the normal asynchronous operation.
To use this feature, you need to install two plugins: one for the
master and one for each slave. No need to compile anything,
though. They are provided with the binaries. All you need to do
is load the appropriate plugin for each server.
master > INSTALL PLUGIN rpl_semi_sync_master SONAME 'libsemisync_master.so';
slave1 > INSTALL PLUGIN rpl_semi_sync_slave SONAME 'libsemisync_slave.so';
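If you want to make sure the plugins were registered correctly (a quick optional check), you can query the plugins table on each server:
master > SELECT PLUGIN_NAME, PLUGIN_STATUS
         FROM INFORMATION_SCHEMA.PLUGINS
         WHERE PLUGIN_NAME LIKE '%semi%';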
Additionally, there are a few variables that you must set, either
in the options file or online.
master > SET GLOBAL rpl_semi_sync_master_enabled=1;
slave1 > SET GLOBAL rpl_semi_sync_slave_enabled=1;
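A few more knobs and checks can be useful, although they are not strictly required. The timeout (in milliseconds) tells the master how long to wait for an acknowledgment before falling back to asynchronous mode, and the status variables tell you whether semi-synchronous mode is active and how many semi-sync slaves are connected. If the slave was already replicating when you loaded the plugin, restart its I/O thread so that it registers as a semi-synchronous slave. The timeout value below is only an example.
master > SET GLOBAL rpl_semi_sync_master_timeout = 1000; -- milliseconds; the default is 10000
master > SHOW STATUS LIKE 'Rpl_semi_sync_master_status';
master > SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';
slave1 > STOP SLAVE IO_THREAD;
slave1 > START SLAVE IO_THREAD;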
Now that the system is ready, let's see how to use it.
Before doing anything, we ask for the value of two important
status variables:
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 0     |
+-----------------------------+-------+
The first one is the number of failed synchronized transactions,
the second one is the number of successful ones. Since nothing
has happened so far, they are both zero.
create table t1 (i int not null primary key) engine=innodb;
Query OK, 0 rows affected (0.13 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 1     |
+-----------------------------+-------+
The first operation (a table creation) was successfully
transferred to a slave. Let's do one more.
set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
insert into t1 values (1);
Query OK, 1 row affected (0.00 sec)
COMMIT;
Query OK, 0 rows affected (0.00 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 2     |
+-----------------------------+-------+
This one, too, was successful.
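If you want to double-check on the slave side (the SQL thread is running at this point, so the row should already have been applied), a plain query will do:
slave1 > SELECT * FROM t1;
-- should return the row with i = 1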
Now, let's try something sneaky. On each slave, we execute "STOP SLAVE SQL_THREAD". The slaves stop applying events, but semi-synchronous replication will go on, because the I/O thread keeps receiving the binary log.
insert into t1 values (2);
Query OK, 1 row affected (0.01 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 3     |
+-----------------------------+-------+
2 rows in set (0.00 sec)
The semi-synchronous replication has worked. However, if we query both master and slaves, only the master has the new record. The slaves have it only in their relay logs, which you can easily ascertain with mysqlbinlog.
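For example, something along these lines on a slave shows the event sitting in the relay log. The relay log file name here is hypothetical; SHOW SLAVE STATUS reports the real one.
slave1 $ mysqlbinlog slave1-relay-bin.000002 | grep -i 'insert into t1'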
Enhanced partitioning syntax
About one year ago, I briefly announced that this feature was in the making. With some interface improvements, it is now part of regular partitioning. It's an extension of partitioning BY RANGE. As you know, until now you could only partition on one column value, and only on INTEGER columns. Both restrictions have been lifted in 5.5, with a syntax change that makes the code more readable and the overall feature more usable.
You can now partition by DATE, DATETIME, VARCHAR, and CHAR columns, not just integers, and you can use more than one column in your list. The most immediate use of this extension is the ability to partition by date without resorting to functions that convert dates into integers. For example:
CREATE TABLE t2
(dt date, a int, b int, c int)
PARTITION BY RANGE COLUMNS (dt)
(
PARTITION p0 VALUES LESS THAN ('2007-01-01'),
PARTITION p1 VALUES LESS THAN ('2008-01-01'),
PARTITION p2 VALUES LESS THAN ('2009-01-01'),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
The COLUMNS keyword does the trick. The manual has more examples.
The partition helper has been updated to handle
this new feature and generate partitions accordingly.
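Since more than one column is allowed, here is a minimal sketch of a multi-column scheme (the table and boundary values are made up for illustration). Rows are compared against the boundary tuples column by column:
CREATE TABLE t3
(a int, b int)
PARTITION BY RANGE COLUMNS (a, b)
(
PARTITION p0 VALUES LESS THAN (10, 100),
PARTITION p1 VALUES LESS THAN (20, 200),
PARTITION p2 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);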
SIGNAL and RESIGNAL
If you have used stored routines extensively, you will certainly
have asked yourself "why isn't there any way of raising an
exception?" In the SQL standard, exception handling is
implemented using the SIGNAL and RESIGNAL keywords, which were
notably missing in MySQL 5.0 and 5.1 stored routines.
There have been many clever hacks by several community members to emulate the missing SIGNAL, but none were quite satisfactory. After a long wait, here we have SIGNAL and RESIGNAL, which make stored routine programming much more robust and easier to debug. An authoritative example of how to use the new syntax is available in Roland Bouman's blog.
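Just to give a taste of it, here is a minimal sketch (the procedure name and message are made up for illustration): a routine that raises its own error with SIGNAL when it gets an invalid argument.
delimiter //
CREATE PROCEDURE insert_positive (IN val INT)
BEGIN
    IF val <= 0 THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'val must be a positive integer';
    END IF;
    INSERT INTO t1 VALUES (val);
END //
delimiter ;
CALL insert_positive(-1);
-- fails with SQLSTATE 45000 and the message above
CALL insert_positive(10);
-- inserts the row normally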
There is more. For the complete list of features, have a look at
the official manual.
Happy hacking!
UPDATE Added more partitions to the example, as suggested
by Jon.
(1) For the more technologically savvy, here's how Philip Stoev,
one of my distinguished QA colleagues, describes the
enhancements:
Historically, most of the MySQL tests have been created manually; however, a modern database is so complex that it is impossible to test manually even a tiny percentage of the available functionality. Therefore, for Betony [codename for MySQL 5.5] and the upcoming Celosia [5.6], the majority of our testing effort was concentrated on stochastic testing, using random data and millions of random queries to validate the behavior of the server across a wide range of scenarios and workloads.
For each new feature, and some existing ones, we automatically
generated tests that attempt to cover all relevant SQL
constructs, including the interaction between the feature being
tested and existing code within the server. For features that
have concurrency implications, we ran the random queries as a
stress test or along with concurrent DDL statements. For areas
such as partitioning, we used the random queries to
functionally validate the new code, by comparing the result from
each query to a reference source, such as a previous version of
the server.