Courtesy of Intel, we had access to a very large cluster of Intel servers for a few weeks. We took the opportunity to evaluate the improvements the new Haswell generation brings to the Intel Xeon chips. We also took the opportunity to see how far we can now scale flexAsynch, the NoSQL benchmark we developed for testing MySQL Cluster.
The last time we tested, we were using MySQL Cluster 7.2. The main bottleneck then was that the API nodes could not push through more than around 300k reads per second each, and there is a limit of up to 255 nodes in total. This meant that we were able to reach a bit more than 70M reads per second using MySQL Cluster 7.2.
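The ~70M figure is consistent with those two limits. A quick sketch of the arithmetic (the node split is an assumption for illustration, since some of the 255 slots must be data and management nodes rather than API nodes):

```python
# Back-of-the-envelope check of the MySQL Cluster 7.2 ceiling described above.
API_NODE_CAP = 300_000   # approximate reads/second per API node in 7.2
MAX_TOTAL_NODES = 255    # hard limit on nodes in one cluster

# Even if nearly every node slot were an API node, cluster-wide
# throughput would be bounded by:
ceiling = MAX_TOTAL_NODES * API_NODE_CAP
print(f"{ceiling / 1e6:.1f}M reads/second upper bound")  # 76.5M
```

With data nodes taking some of the slots, reaching "a bit more than 70M" is close to that theoretical bound.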
In MySQL Cluster 7.3 we improved the handling of thread contention in the NDB API, which means that we are now able to process much more traffic per API node.
In MySQL Cluster 7.4 we also improved …
Some weeks back I helped a customer lower the time to bulk-load data into MySQL. At the time they were using a MySQL dump file (containing SQL statements) to populate their tables during nightly jobs.
By using the LOAD DATA INFILE command and creating secondary indexes after the bulk load, load time went down by a factor of almost 2x.
My test environment:
DB: MySQL 5.6.23
OS: Ubuntu 14.04
HW: My Toshiba Portege laptop with 2 cores and SSD disk
Commands/tables used in tests:
CREATE TABLE t1PK (i BIGINT UNSIGNED PRIMARY KEY, art VARCHAR(32)) ENGINE=InnoDB;
CREATE TABLE t1 (i BIGINT UNSIGNED PRIMARY KEY, art VARCHAR(32) UNIQUE KEY) ENGINE=InnoDB;
LOAD DATA INFILE '/home/ted/labb/load-data/1000000' INTO TABLE test.t1 FIELDS TERMINATED BY ',';
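The effect of deferring secondary-index creation can be sketched in miniature with Python's built-in sqlite3 module (a stand-in purely for illustration; the post itself uses MySQL and LOAD DATA INFILE, and the timings there will differ):

```python
import sqlite3
import time

def load(rows, index_first):
    """Bulk-insert rows, creating the secondary index before or after the load."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t1 (i INTEGER PRIMARY KEY, art TEXT)")
    if index_first:
        con.execute("CREATE UNIQUE INDEX art_idx ON t1 (art)")
    t0 = time.perf_counter()
    con.executemany("INSERT INTO t1 VALUES (?, ?)", rows)
    if not index_first:
        # Building the index once over the loaded data is generally cheaper
        # than maintaining it row by row during the load.
        con.execute("CREATE UNIQUE INDEX art_idx ON t1 (art)")
    con.commit()
    return time.perf_counter() - t0

rows = [(i, f"article-{i}") for i in range(200_000)]
print(f"index first: {load(rows, True):.2f}s")
print(f"index after: {load(rows, False):.2f}s")
```

The same principle is what makes "load first, index later" pay off in MySQL.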
I created a file …
The other day we were doing some peer review on a schema change at VividCortex and the topic of character set, column length, and the potential downsides of using utf8 came up. I knew from experience that there are some hidden gotchas with this, and usually I’ll just say a couple words and send a link with further reading. But Google turned up nothing, so I am writing this post to try to help fix that.
TL;DR version is that when MySQL can’t use an index for operations such as ORDER BY, it may allocate fixed-length memory buffers large enough to hold the worst-case values, and the same might apply to temporary tables on disk. This can be seen in EXPLAIN as “Using temporary; Using filesort” and similar. And when this happens, you might end up making MySQL do gigabytes’ worth of work to finish a query on a table that’s only a fraction of that size.
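To get a feel for the magnitudes involved: MySQL reserves space per character according to the charset's maximum encoded width, so a worst-case fixed-length buffer grows with both the column length and the charset. A quick sketch (illustrative numbers, not measurements from a server):

```python
# Worst-case bytes per value when MySQL sizes a fixed-length buffer:
# column length in characters times the charset's maximum bytes per
# character (MySQL's utf8 = 3, utf8mb4 = 4, latin1 = 1).
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}

def worst_case_row_bytes(char_length, charset):
    return char_length * MAX_BYTES_PER_CHAR[charset]

# A single VARCHAR(255) column passing through a filesort buffer,
# scaled up to one million rows:
for cs in ("latin1", "utf8", "utf8mb4"):
    per_row = worst_case_row_bytes(255, cs)
    total_gb = per_row * 1_000_000 / 1e9
    print(f"{cs:8s} {per_row:4d} bytes/row -> {total_gb:.2f} GB per 1M rows")
```

One oversized utf8 column can thus multiply the memory and disk traffic of a filesort several times over.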
Let’s see why this can happen.
The Theory of UTF8
In …
Packt Publishing started a Free Learning Campaign, providing a free ebook every day until 6th March 2015. How? Follow this link: http://bit.ly/1DoRno5
This Log Buffer Edition makes its way through the realms of Oracle, SQL Server and MySQL and brings you a selection of blog posts.
Oracle:
Introducing Oracle Big Data Discovery Part 3: Data Exploration and Visualization
FULL and NO_INDEX Hints
Base64 Encode / Decode with Python (or WebLogic Scripting Tool) by Frank Munz
Why I’m Excited About Oracle Integration Cloud Service – New Video
Reminder: …
For years, MySQL replication used to be based on binary log events - all a slave knew was the exact event and the exact position it just read from the master. Any single transaction from a master may have ended up in different binary logs, and in different positions in these logs. It was a simple solution that came with limitations - more complex topology changes could require an admin to stop replication on the hosts involved. Or these changes could cause some other issues, e.g., a slave couldn’t be moved down the replication chain without a time-consuming rebuild process (we couldn’t easily change replication from A -> B -> C to A -> C -> B without stopping replication on both B and C). We’ve all had to work around these limitations while dreaming about a global transaction identifier.
GTID was introduced along with MySQL 5.6, and brought along some major changes in the way MySQL operates. First of all, every transaction has …
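A GTID is written as the originating server's UUID plus a transaction sequence number, e.g. 3E11FA47-71CA-11E1-9E33-C80AA9429562:23. A minimal parser sketch of that format (the helper name is ours for illustration, not a MySQL API):

```python
def parse_gtid(gtid):
    """Split a GTID of the form 'source_uuid:transaction_id'."""
    # rpartition keeps the full UUID intact even though it contains dashes.
    source_uuid, _, txn = gtid.rpartition(":")
    return source_uuid, int(txn)

uuid, txn = parse_gtid("3E11FA47-71CA-11E1-9E33-C80AA9429562:23")
print(uuid, txn)
```

Because the identifier travels with the transaction rather than with a file offset, any server in the topology can tell exactly which transactions it has already applied.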
DBAs often encounter situations where they need to kill queries to ensure there are no long-running queries on a MySQL server that would impact performance. Long-running queries can be the result of many factors. Fortunately, Percona Server contains some handy tools to remove problematic MySQL processes. I will highlight all of the tools via some examples in this post.
pt-kill:
There have been some good posts on this blog about the pt-kill
tool, like this one by Arunjith Aravindan titled “How a set of queries can be killed in MySQL using
Percona Toolkit’s pt-kill.” Let’s dive into pt-kill a bit further with a few more …
This past weekend we had the pleasure of attending Scale13X in Los Angeles. It was full of great conversations with solid people, and it was nice to escape the winter cold for a couple days.
Baron Schwartz presented on VividCortex’s approach to building a time-series database in MySQL. The talk is titled Scaling VividCortex’s Big Data Systems on MySQL, and the recording can be seen here. If you are not automatically directed, Baron’s presentation begins at 5:49:00.
Below are the slides and a brief description to pique your interest.
In this talk, Baron Schwartz discusses VividCortex’s unique time-series data requirements and implementation, including:
- How we built a solution using MySQL and additional components such as Redis …