“The Colonoscopy of Database Software”
- Jim Starkey
If your project does not have something that you can adapt that quote to, odds are your testing is inadequate.
|Showing entries 1 to 30 of 180||Next 30 Older Entries|
About 1.5 months ago I blogged on MySQL 5.6 on POWER andtalked about what I had to poke at to make modern MySQL versions run and run well on shiny POWER8 systems.
One of those bugs, MySQL bug 47213 (InnoDB mutex/rw_lock should be conscious of memory ordering other than Intel) was recently marked as CLOSED by the Oracle MySQL team and the upcoming 5.6.20 and 5.7.5 releases should have the fix!
This is excellent news for those wanting to run MySQL on SMP systems that don’t have an Intel-like memory model (e.g. POWER and MIPS64).
This was the most major and invasive patch in the patchset for MySQL on POWER. It’s[Read more...]
We’ve known for over six years (since before we started Drizzle) that the query cache hurt performance. It was for that reason that the query cache was one of the early things to be removed from Drizzle, it just didn’t scale on multi[Read more...]
Of course, The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions. Also, these numbers should be considered preliminary, but trust me – I did get them and it’s not April 1st.
From my[Read more...]
In a previous post, I covered porting MySQL 5.6 to POWER and subsequently, some new record performance numbers with MySQL 5.6.17 on POWER8.
Well, those following at home will be aware that not only is the next sentence sponsored by IBM Legal, but that MySQL 5.7 alleviates a bunch of the mutex contention that we saw with MySQL 5.6. The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.
In looking at MySQL performance on POWER, it’s inevitable that I should look at MySQL 5.7 and what’s coming up in the next stable release of MySQL.[Read more...]
The following sentence is brought to you by IBM Legal. The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.
Okay, now that is out of the way….
If you’re the kind of person who follows the MySQL bugs database closely or subscribes to the MySQL Internals mailing list, you may have worked out that I’ve spent a small amount of time poking at MySQL on modern POWER systems.
Unlike Intel CPUs, POWER CPUs require explicit memory barriers to synchronize memory state between[Read more...]
Recently, I’ve had reason to poke at MySQL performance on some pretty cool hardware. Comparing MySQL 5.6 to MySQL 5.7 is a pretty interesting thing to do when you have many CPU cores.
The improvements to creating read views in InnoDB is absolutely huge for small statements with large concurrency – MySQL 5.7 completely removes this as a bottleneck – as much as doubling maximum SQL queries per second, which is a pretty impressive improvement.
I haven’t poked at the similar improvements in Percona Server on this hardware setup – so I can only really guess as to the performance characteristics of it… If comparing to older MySQL versions, Percona Server 5.5 is likely to outperform MySQL 5.5 thanks to[Read more...]
There’s a pattern I keep seeing in threaded programs (or indeed multiple processes) writing to a common log file. This is more of an antipattern than a pattern, and is often found in code that has existed for years.
Basically, it’s having a mutex to control concurrent writing to the log file. This is something you completely do not need.
The write system call takes care of it all for you. All you have to do is construct a buffer with your log entry in it (in C, malloc a char or have one per thread, in C++ std::string may do), open the log file with O_APPEND and then make a single write() syscall with the log entry.
This works for just about all situations you care about. If[Read more...]
It may not be surprising that there’s been a few projects over the years that I’ve worked on where we’ve had to care about stack usage (to varying degrees).
For threaded userspace applications (e.g. MySQL, Drizzle) you get a certain amount of stack per thread – and you really don’t want to bust that. For a great many years now, there’s been both a configuration parameter in MySQL to set how much stack each thread (connection) gets as well as various checks in the source code to ensure there’s enough free stack to do a particular operation (IIRC open_table is the most hairy one of this in MySQL).
For the Linux Kernel, stack usage is a relatively (in)famous problem… although by now just about every real problem has been fixed and merely[Read more...]
As many of you know, I’ve been working in the MySQL (http://www.mysql.com) world for quite a while now. IN fact, it was nearly 10 years ago when I first started hacking on MySQL Cluster (https://www.mysql.com/products/cluster/) at MySQL AB.
Most recently, I was at Percona which was a wonderful journey where over my nearly three years there the company at least doubled in size, launched several new software products and greatly improved the quality and frequency of releases.
However the time has come for something[Read more...]
I have put up a set of scripts on github: https://github.com/stewartsmith/bzr-to-git-conversion-scripts. Why do I need these? Well… if only bzr fast-export|git fast-import worked flawlessly for large, complex and old trees. It doesn’t.
Basically, when you clone this repo you can run “./sync-BLAH.sh” and it’ll pull BZR trees for the project, convert to git and clean things up a bit. You will likely have to edit the sync-BLAH.sh scripts as I have them pointed at branches on my own machine (to speed up the process, not having to do fresh BZR branches of MySQL trees over the network is a feature - it’s never been fast.). You’ll also want to edit the[Read more...]
Over a year ago now, I announced the first Percona Server 5.6 alpha on the Percona MySQL Performance Blog (Announcing Percona Server 5.6 Alpha). That was way back on August 14th, 2012 and it was based on MySQL 5.6.5 released in April.
I’m really happy now to point to the release of the first GA release of Percona Server 5.6 along with[Read more...]
I’ve used the Bazaar (bzr) version control system since roughly 2005. The focus on usability was fantastic and the team at Canonical managed to get the entire MySQL BitKeeper history into Bazaar – facilitating the switch from BitKeeper to Bazaar.
There were some things that weren’t so great. Early on when we were looking at Bazaar for MySQL it was certainly not the fastest thing when it came to dealing with a repository as large as MySQL. Doing an initial branch over the internet was painful and a much worse experience than BitKeeper was. The work-around that we all ended up using was downloading a tarball of a recent Bazaar repository and then “bzr pull” to get the latest. This was much quicker than[Read more...]
For MySQL 5.1, 5.5 and 5.6 in the same repository, after repacking:
bzr: 269MB (217MB pack, 52MB indicies)
git: 177MB repo (152MB pack)
One thing I’ll say is that BZR is always more chatty over the network and is substantially slower than GIT in pulling a fresh copy.
First I find out the first commit that is in 5.7 that isn’t in 5.6 (using bzr missing) and then look at the authors of all of those commits. Measuring the number of commits is a really poor metric as it does not measure the complexity of the code committed, and if your workflow is to revise a patchset before committing, you get much fewer commits than if you commit 10 times a day.
There are a good number of people who are committing a lot of code to the latest MySQL development tree. (Sorry for the annoying layout of “count. number-of-commits name”)
There was some suggestion after my previous post (Who works on MariaDB and MySQL?) that I look at MariaDB 10.0 – so I have. My working was very simple, in a current MariaDB 10.0 BZR tree (somewhat beyond 10.0.3), I ran the following command:
bzr log -n0 -rtag:mariadb-10.0.0..|egrep '(author|committer): '| \ sed -e 's/^\s*//; s/committer: //; s/author: //'| \ sort -u|grep -iv oracle
I recently got pointed towards https://github.com/shodanium/nanomysql/ which is a tiny (less than 400 lines of C++) MySQL client library which is GPL licensed.
If you need to link into non-GPL compatible code, there is the (slightly larger and full featured) libdrizzle library. But if you want something *tiny* and are okay with GPL, then nanomysql may be something to look at.
Looking at the committers/authors of patches in the bzr tree for MariaDB 5.5.31.
Non Oracle Contributors:
Oracle (as they pull Oracle changes):
Whenever I stick my head into the MySQL storage engine API, I’m reminded of a MySQL User Conference from several years ago now.
Specifically, I’m reminded of a slide from an early talk at the MySQL User Conference by Paul McCullagh describing developing PBXT. For “How to write a Storage Engine for MySQL”, it went something like this:
A lot of people stop at step 3. It’s a really good place to stop too. It avoids most of the tricky parts that are unexpected, undocumented and unlogical (yes, I’m inventing words here).
There’s a big difference in how plugins are treated in MySQL and how they are treated in Drizzle. The MySQL way has been to create a C API in front of the C++-like (I call it C- as it manages to take the worst of both worlds) internal “API”. The Drizzle way is to have plugins be first class citizens and use exactly the same API as if they were inside the server.
This means that MySQL attempts to maintain API stability. This isn’t something worth trying for. Any plugin that isn’t trivial quickly surpasses what is exposed via the C API and has to work around it, or, it’s a storage engine and instead you have this horrible mash of C and C++. The byproduct of this is that no core server features are being re-implemented as plugins. This means[Read more...]
I had reason to look into the extended secondary index code in MariaDB and MySQL recently, and there was one bit that I really didn’t like.
share->set_use_ext_keys_flag(legacy_db_type == DB_TYPE_INNODB);
use_extended_sk= (legacy_db_type == DB_TYPE_INNODB);
In case you were wondering what “legacy_db_type” actually does, let me tell you: it’s not legacy at all, it’s kind of key to how the whole “metadata” system in MySQL works. For example, to drop a table, this magic number is used to work out what storage engine to call to drop the table.
Now, these code snippets[Read more...]
The Example storage engine is meant to serve mainly as a code example of the stub of a storage engine for example purposes only (or so the code comment at the start of ha_example.cc reads). In reality however, it’s not very useful. It likely was back in 2004 when it could be used as a starting point for starting some simple new engines (my guess would be that more than a few of the simpler engines started from ha_example.cc).
The sad reality is the complexity of the non-obviousness of the bits o the storage engine API you actually care about are documented in ha_ndbcluster.cc, ha_myisam.cc and ha_innodb.cc. If you’re doing something that isn’t already done by one of those three engines: good luck.
Whenever I looked at ha_example.cc I always wished there was something[Read more...]
I wonder how much longer the ARCHIVE storage engine is going to ship with MySQL…. I think I’m the last person to actually fix a bug in it, and that was, well, a good number of years ago now. It was created to solve a simple problem: write once read hardly ever. Useful for logs and the like. A zlib stream of rows in a file.
You can actually easily beat ARCHIVE for INSERT speed with a non-indexed MyISAM table, and with things like TokuDB around you can probably get pretty close to compression while at the same time having these things known as “indexes”.
ARCHIVE for a long time held this niche though and was widely and quietly used (and likely still is). It has the great benefit of being fairly lightweight – it’s only about 2500 lines of code (1130 if[Read more...]
This is one close to my heart. I’ve recently written on other storage engines: Where are they now: MySQL Storage Engines, The MERGE storage engine: not dead, just resting…. or forgotten and The MEMORY storage engine. Today, it’s the turn of MySQL Cluster.
Like InnoDB, MySQL Cluster started outside of MySQL. Those of you paying attention at home may notice a correlation between storage engines not written exclusively for MySQL and being at all successful.
In this case, I really don’t mind. It’s rather exciting that they’ve gone ahead and done this – and it’s not just a code drop: https://github.com/Tokutek/ft-engine is where things are at, and recent commits were 2hrs and 18hrs ago which means that this is being maintained. This is certainly a good way to grow a developer community.
While being a MySQL engine is really interesting, the MongoDB[Read more...]
I’ve started poking around the MySQL 5.7.1 source tree (although just from tarball as I don’t see a BZR tree yet). I thought I’d share a few thoughts:
While Domas may have rather effictively trolled the discussion with his post on howto configure table/user statistics (which gave me a good chuckle I do have to say), it’s at least incorrect for Percona Server as you have to enable the “userstat” server option :)
That being said, once enabled there are no extra configuration variables to think about. This is a huge advantage over configuring PERFORMANCE_SCHEMA - which has a total of THIRTY configuration options (31 if you include the global enable/disable option).
Some of these thirty odd configuration variables are only going to matter if[Read more...]
A few weekends ago, I started to again look at the code in Drizzle for producing internal temporary tables. Basically, we have a few type of tables:
If you’re lucky enough to be creating one of the first three types, you go through an increasingly lovely pile of code that constructs a nice protobuf message about what the table should look like and hands all responsibility over to the storage engine as to how to do that. The basic idea is that Drizzle gets the heck out of the way and lets the storage engine do its thing. This code path looks rather different[Read more...]
A naive wc based “lines of code” for MySQL 5.6 sql/ directory is ~490kLOC which contasts with MySQL 5.5 being ~375kLOC by the same measure. If we diffstat the sql/ directory like I did for MariaDB 5.5 we get:
357 files changed, 172871 insertions(+), 67922 deletions(-)
Versus, as you remember from yesterday for MariaDB 5.5 over MySQL 5.5:
250 files changed, 83639[Read more...]
|Showing entries 1 to 30 of 180||Next 30 Older Entries|