And the results are in:
InnoDB (no compression/original): 660 GB
RocksDB: 209 GB
TokuDB (snappy): 144 GB
TokuDB (LZMA): 67 GB
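For context, the relative space savings versus uncompressed InnoDB can be computed directly from the sizes above:

```python
# On-disk sizes from the comparison above, in GB.
sizes_gb = {
    "InnoDB (uncompressed)": 660,
    "RocksDB": 209,
    "TokuDB (snappy)": 144,
    "TokuDB (LZMA)": 67,
}

baseline = sizes_gb["InnoDB (uncompressed)"]
for engine, size in sizes_gb.items():
    saving = 100 * (1 - size / baseline)
    print(f"{engine}: {size} GB ({saving:.0f}% smaller than InnoDB)")
```

TokuDB with LZMA works out to roughly a 90% reduction, at the cost of heavier compression work on writes.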
Benchmark performance with mysqlslap on production sample queries (8-9 very quick SELECTs + 1-2 medium SELECTs):
InnoDB (original): Avg 0.100 s, Min 0.091 s, Max 0.182 s, Total 5.101 s
TokuDB (snappy): Avg 0.100 s, Min 0.089 s, Max 0.183 s, Total 5.106 s
RocksDB: Avg 0.113 s, Min 0.104 s, Max 0.164 s, Total 5.730 s
TokuDB (LZMA): Avg 0.099 s, Min 0.090 s, Max 0.155 s, Total: …
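To put the averages side by side, the relative slowdown versus the InnoDB baseline can be computed directly from the numbers above (a quick sketch; TokuDB LZMA's total is elided in the source, so only averages are compared):

```python
# Average latencies (seconds) from the mysqlslap runs above.
avg_seconds = {
    "InnoDB (original)": 0.100,
    "TokuDB (snappy)": 0.100,
    "RocksDB": 0.113,
    "TokuDB (LZMA)": 0.099,
}

baseline = avg_seconds["InnoDB (original)"]
for engine, avg in avg_seconds.items():
    delta = 100 * (avg - baseline) / baseline
    print(f"{engine}: {delta:+.0f}% vs InnoDB")
```

In short: only RocksDB shows a measurable latency penalty (about +13%), while both TokuDB variants track InnoDB closely despite the large space savings.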
I am happy that the MySQL team has, over the last few years, been blogging about each major feature that MySQL Server gets; for example, the series on Recursive Common Table Expressions. Being extremely busy myself, …
[Read more]
Things I look for when optimising or debugging a Linux OS:
- I/O scheduler (noop or deadline)
- Linux kernel > 3.18 (multi-queue block layer)
- irqbalance > 1.0.8
- File system: noatime, nobarrier
  - ext4: data=ordered
  - xfs: 64k
- Log files in a different partition (if possible)
- Swappiness (0 or 1, depending)
- jemalloc (if needed)
- Transparent hugepages: disabled
- ulimit (open files) > 1000
- Security
  - IPtables
  - PAM security
- Raid Controller/Smart HBA
  - write-cache enabled
  - battery backed
  - For HP servers: hpssacli controller all show (detail|status)
  - Tweak cache-ratio to 50/50 or 75/25 (and test)
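Several of the items above can be verified quickly from procfs/sysfs. A minimal Python sketch (the paths are the standard Linux ones; it falls back to "n/a" where a file is missing, e.g. on non-Linux systems or when the block device is not sda):

```python
from pathlib import Path

def read_tunable(path, default="n/a"):
    """Return the trimmed contents of a procfs/sysfs file, or a default."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return default

def os_tuning_report():
    """Collect a few of the tunables from the checklist above."""
    return {
        "swappiness": read_tunable("/proc/sys/vm/swappiness"),
        "transparent_hugepage": read_tunable("/sys/kernel/mm/transparent_hugepage/enabled"),
        "io_scheduler_sda": read_tunable("/sys/block/sda/queue/scheduler"),
    }

if __name__ == "__main__":
    for name, value in os_tuning_report().items():
        print(f"{name}: {value}")
```

The scheduler file shows the active choice in brackets (e.g. `noop [deadline] cfq`), so it is easy to eyeball against the checklist.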
Update: I included the results for when PCID is disabled, for comparison, as a worst-case scenario.
After learning about Meltdown and Spectre, I waited patiently for a fix from my OS vendor. However, there were several reports of performance impact due to the kernel mitigations: for example, on the PostgreSQL developers mailing list there were reports of up to 23% throughput loss; Red Hat engineers reported a regression range of 1-20%, flagging OLTP systems as the worst-affected type of workload. As the impact will be highly dependent on the hardware and workload, I decided to do some tests myself for the …
[Read more]
When I consult at a company, I aim to identify issues with their
database and give options on how to solve them.
However, implementing those solutions can sometimes take longer
than it needs to, and sometimes they are not implemented at all.
During my career, I have observed some reasons why that happens
within organizations.
Obviously, the following observations will never happen at
your company. I am just writing about them so that you might
notice them in other places.
1. Legacy code
People don't like to have anything to do with legacy code. It’s
painful. It’s difficult. It’s risky to change. It runs
business-critical functions. Worst of all, they didn’t write it.
This is a problem because, often, the most crippling database
issues require changes to legacy code.
2. New Technologies or Methods
People don’t like you to introduce any …
Setting up databases in development environments can be
challenging.
What I usually see is some automated process for setting up empty
databases with up-to-date data structures. This is helpful for
integration testing, but tricky for actual development as well as
performance testing.
For example:
- It is difficult to conceptually get your head around writing a query when you cannot see any data in your tables
- You cannot possibly know if your query is slow before deploying it to production without running it against 'some' data.
Relevant Post: How to Not be the One that Deploys that Slow Query
to Production
In addition, there can be a strict requirement to not let
sensitive customer data be available outside certain secure
environments and …
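One pragmatic middle ground is generating synthetic rows that mimic the shape of production data without containing any real customer data. A minimal, hypothetical sketch (the column layout is made up for illustration; a seeded generator keeps the data reproducible):

```python
import random
import string

def fake_rows(n, seed=42):
    """Generate n synthetic 'customer' rows: (id, name, country, spend).

    Purely illustrative schema; seeding makes the data set reproducible
    across development machines.
    """
    rng = random.Random(seed)
    countries = ["DE", "US", "GB", "FR", "NL"]
    rows = []
    for i in range(1, n + 1):
        name = "".join(rng.choices(string.ascii_lowercase, k=8))
        rows.append((i, name, rng.choice(countries), round(rng.uniform(0, 500), 2)))
    return rows

for row in fake_rows(3):
    print(row)
```

The same idea scales up: bulk-insert a few million such rows and your development queries start to behave much more like they will in production.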
The following query gets data from performance_schema in MySQL
and attempts to match each digest, via regex, against the list of
tables in the same schema.
SELECT d.*,
       (SELECT GROUP_CONCAT(DISTINCT table_name)
          FROM information_schema.TABLES
         WHERE table_schema = d.schema_name
           AND d.digest_text REGEXP table_name) AS table_name
  FROM performance_schema.events_statements_summary_by_digest d
 WHERE d.digest_text REGEXP '^(SELECT|UPDATE|DELETE|REPLACE|INSERT|CREATE)'
   AND d.last_seen >= CURDATE() - INTERVAL 7 DAY
 ORDER BY d.sum_timer_wait DESC
 LIMIT 10\G
Who are the top individual authors of influential recent posts to
Planet MySQL? The Planet MySQL page includes a list of the
top 20 authors as well as a list of the top 10 vendor blogs.
However, since posts to the vendor blogs make up at least
1/4 of all of the posts, and the authors of vendor blog posts are
not included in the top author list, I decided to compute my own
top author list. I include the hidden authors from the
vendor blogs when computing my top author list.
The first problem is to identify the hidden authors of the posts
from each vendor blog. This requires extracting the author
information from the individual posts, which means writing a
specialized parser for each vendor blog to pull the author name
out of the document.
The second problem is to rank the authors using some criteria
such as the number of posts in a given recent time range.
I could run a page rank algorithm if I …
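Once (author, post) pairs have been extracted, ranking by post count alone is straightforward; a minimal sketch with made-up author names:

```python
from collections import Counter

# Hypothetical extraction result: one author name per post, including
# the "hidden" authors recovered from vendor blog posts.
post_authors = [
    "alice", "bob", "alice", "carol", "alice", "bob",
]

# Count posts per author and list the most prolific first.
top_authors = Counter(post_authors).most_common(2)
print(top_authors)  # [('alice', 3), ('bob', 2)]
```

A page-rank-style criterion would instead weight each post by the links pointing at it, but simple post counts already give a first-cut ranking.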
When optimizing your MySQL setup, you need to spot slow or bad queries at a glance and get deep insights into them with ease.
The latest update of MONyog brings new, easier ways to find problem SQL in Real-Time, Wayback Machine, and the sniffer-based Query Analyser using Performance Schema. Here is what’s new:
Query Execution Status and Full Table Scan Count
View the success/failure status of every query executed, as well
as the number of ‘full table scans’ for queries, at a single
glance.
The latest update also has an option to switch between Performance Schema and Processlist in Real-Time, making it easy for you to enable/disable Performance Schema or Processlist.
Get the latest update now to enjoy these features. Existing customers can download MONyog 6.52 from the Customer Area. To evaluate MONyog 6.52, please …
[Read more]
The "mysql_real_connect" function is used by MySQL clients, including multi-threaded clients, to connect to a MySQL server. The "mysql_real_connect" function is allowed to be called by concurrent threads without application-level serialization. Unfortunately, the thread sanitizer found a data race in the "mysql_real_connect" function when processing the …
[Read more]