The details on this issue are here:
https://github.com/facebook/mysql-5.6/issues/369
This test is very simple. I loaded the SSB (star schema
benchmark) data for scale factor 20 (12GB raw data), added
indexes, and tried to count the rows in the table.
After loading data and creating indexes, the .rocksdb data
directory is 17GB in size.
A full table scan "count(*)" query takes less than four minutes,
sometimes reading over 1M rows per second, but when scanning the
index to accomplish the same count, the database can only scan
around 2000 rows per second. The four minute query would take an
estimated 1000 minutes, a 250x difference.
I have eliminated the type of CRC32 function (SSE vs non-SSE) by
forcing the hardware SSE function by patching the code.
There seem to be problems with any queries …
The details on this issue are here:
https://github.com/facebook/mysql-5.6/issues/369
This test is very simple. I loaded the SSB (star schema
benchmark) data for scale factor 20 (12GB raw data), added
indexes, and tried to count the rows in the table.
After loading data and creating indexes, the .rocksdb data
directory is 17GB in size.
A full table scan "count(*)" query takes less than four minutes,
sometimes reading over 1M rows per second, but when scanning the
index to accomplish the same count, the database can only scan
around 2000 rows per second. The four minute query would take an
estimated 1000 minutes, a 250x difference.
I have eliminated the type of CRC32 function (SSE vs non-SSE) by
forcing the hardware SSE function by patching the code.
There seem to be problems with any queries …
When your car doesn’t start, you don’t just blindly change the battery, starter, fuel pump, spark plugs or all of the above. Instead, you go to your mechanic and ask him to check what is wrong (or you check it yourself if you are the mechanic) and then fix whatever is broken.
Yet very often I see DBAs doing exactly the opposite with their MySQL servers. Rather than assessing what is the server so busy with, they keep changing configuration options until the problem “goes away”. Alternatively, they add more RAM, more CPUs or faster disks, depending on which resources seems to be the most busy at a time. Or they switch to a new server altogether.
MySQL (with a help of some tools) has a really convenient way to analyse the workload and see clearly what exactly is MySQL so busy doing. And even how much improvement you can expect by, say, fixing a specific MySQL query.
…
[Read more]
PS_history is a tool which collects historical snapshots of the
PERFORMANCE_SCHEMA (P_S). This allows you to trend P_S values
over time, for example, it is possible to look at the 95 th
percentile response time for a query over time.
PS_history is stored procedure and event based, and thus it
resides entirely inside of the database with no external
dependencies. It uses a clever technique to capture all of the
P_S data in one consistent snapshot. This ensures that all of the
sys_history views (bundled now with PS_history) have a consistent
set of data.
By default, as long as the event_schedule is enabled, PS_history
will collect data every 30 seconds. If a snapshot takes 30
seconds, there will be a 30 second delay before the next snapshot
starts. This value can be changed by calling the
`ps_history`.`set_collection_interval`(N) where N is the number
of seconds between samples.
The `sys_history` schema is …
PS_history is a tool which collects historical snapshots of the
PERFORMANCE_SCHEMA (P_S). This allows you to trend P_S values
over time, for example, it is possible to look at the 95 th
percentile response time for a query over time.
PS_history is stored procedure and event based, and thus it
resides entirely inside of the database with no external
dependencies. It uses a clever technique to capture all of the
P_S data in one consistent snapshot. This ensures that all of the
sys_history views (bundled now with PS_history) have a consistent
set of data.
By default, as long as the event_schedule is enabled, PS_history
will collect data every 30 seconds. If a snapshot takes 30
seconds, there will be a 30 second delay before the next snapshot
starts. This value can be changed by calling the
`ps_history`.`set_collection_interval`(N) where N is the number
of seconds between samples.
The `sys_history` schema is …
General query logs and slow query logs remain one of the most
popular sources of auditing and diagnostic information in MySQL
databases. Customers often ask about the cost of general and slow
query logging so I went googling for existing research I could
point them to. The data I found was not quite what I hoped for,
which is why I decided to do some more testing.
Introduction Logging overhead is an obvious thing and it has been
the topic of many blog posts before this one. Sure enough, my
Google search returned multiple articles, however I wasn't fully
satisfied the findings.
Perhaps the topic is so obvious that DBAs don't revisit it very
often. In any case, most of the articles I found were several
years old and that presents a few issues:
- Most of the tests were run on MySQL 5.0 or 5.1.
- Database hardware isn't what it used to be, which changes the perspective on database performance bottlenecks. …
If there is something that I love about information technology is
the incredible amount of unpredictable ways that bugs can bite
you where and when you less expect it.
This one I'm about to describe is a serious one, yet there has
been a very quiet response from Percona bug team at Launchpad,
where I first submitted it (because we use Percona server here).
If you have read my other posts you know by now that I am a
laz... err, busy guy so bear with me if only today I have
verified that this is, in fact, an upstream bug and have raised a
bug to Oracle too.
It affects 5.6.33, latest version at the time of this
article.
Anyways, the bug started to manifest itself in the form of
complete stalls of our main cluster, which was happening at
different times and apparently in a way that was unrelated to the
amount of traffic hitting the master. When stalling, system
CPU time was topping 100% of total available …
FRM-less, transactional data dictionary is arguably the most
significant feature change announced MySQL 8.0 development
release. The server still has two separate dictionaries (MySQL,
InnoDB) but the infamous FRM files are finally being replaced
with transactional, InnoDB-based storage.
While this is interesting for various reasons, this particular
post will focus on the impact this change has on data dictionary
performance.
Test configuration Server configuration:
- Hardware: Amazon EC2 m4.4xlarge, 500GB, 15K IOPS
- OS: Debian Jessie
- MySQL versions tested: 5.7.15, 8.0.0-dmr
- Notable customer MySQL configuration values:
- innodb_buffer_pool_size = 4GB
- innodb_log_file_size = 256MB
- innodb_buffer_pool_dump_at_shutdown = 0
- innodb_buffer_pool_load_at_startup = 0
- …
In this blog post we will present a first look at the performance of Group Replication (GR), now that the RC is out. The goal is to provide some insight on the throughput, latency and scalability one can expect in a modern computing infrastructure, using GR in single- and multi-master configurations, but mostly focused on single-master.…
I'm not a huge fan of the InnoDB FULLTEXT feature and I admit I
wasn't too keen to play with it in the past. Apparently, the
feeling is mutual and so FULLTEXT issues haven't popped up in too
many projects I worked on... until last week.
This post describes the troubleshooting process of a FULLTEXT
cache performance issue. Quite inconspicuous at the beginning, it
proved to be a lot of fun in the end.
Ready, Set, GDB!
Background It all started with a generic performance issue report
unrelated to FULLTEXT indexing. The issue, as described in the
report, was:
- The server performs acceptably for a few days.
- After a certain amount of time, DML performance suddenly drops and remains low until the server is rebooted.
- Reboot restores original performance, which again only lasts for a few days.
The situation was resulting in DML latency degradation during normal OLTP …
[Read more]