In MySQL Cluster, a bottleneck in the receive threads limits
our update performance to about 0.5M update transactions per
node group per second (usually 2 nodes per node group). In
MySQL Cluster 7.2.7 we have removed most of this bottleneck and
can now achieve 3x as many update transactions. We are reaching
about 1.5M updates per node group per second. On a 30-node
configuration we achieved 19.5M update transactions per second,
which corresponds to 1.17BN updates per minute. This means we
achieve an almost linear increase in update performance all the
way up to 30 data nodes.
The benchmarks were executed using the dbt2-0.37.50 benchmark
scripts available at dev.mysql.com; the benchmark program is
the flexAsynch program mentioned in some of my earlier blogs.
We used 8 LQH threads per data node.
Read the original article at Consulting essentials: Managing & Completing Engagements
This is the second in a series of three articles on
Consulting Essentials.
Read the previous post, Consulting essentials: Getting the
business
Communicating well and knowing when to step in or stand back is
the linchpin of successful consulting.
Some people have natural charm. If you’re one of these people
you’ll find consulting is definitely for you. You’ll use that
skill all the time as each new client brings a half dozen or a
dozen new people to interact with.
If it doesn’t come easily, practice practice practice. Try to get out of your own head space, and hear what troubles …
[Read more]
The EXPLAIN statement can be an important tool for understanding how a query is being executed and what you can do to make it run better. Although the output of EXPLAIN is relatively straightforward, it can be confusing to inexperienced users or can be mangled by terminal wrapping.
To help with these problems, as well as to provide a pastebin for MariaDB developers to share EXPLAINs during development, we created the MariaDB/MySQL Explain Analyzer. This tool:
- Helps unmangle explains (both vertical and tabular formats).
- Displays explains in an easy-to-read format.
- Highlights and provides explanations for some terms.
- Links to KB articles for different optimization techniques.
- (Optionally) Allows you to save the explain for sharing.
This …
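As an illustration of the unmangling step such a tool has to perform, here is a minimal Python sketch that parses the vertical (\G-style) EXPLAIN output into one dict per row. The function name and parsing details are my own, not the analyzer's actual code:

```python
# Hypothetical helper: parse the vertical (\G) EXPLAIN format into dicts,
# similar in spirit to what an explain "unmangler" must do first.

def parse_vertical_explain(text):
    rows, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("*" * 5):          # row separator like "*** 1. row ***"
            if current:
                rows.append(current)
            current = {}
        elif ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    if current:
        rows.append(current)
    return rows

sample = """\
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: ALL
         rows: 1024
"""
print(parse_vertical_explain(sample))
```

A real unmangler also has to rejoin lines broken by terminal wrapping before this step, which is the harder part.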
[Read more]
Users of MySQL Replication sometimes throttle client requests to give slaves time to catch up to the master. PECL/mysqlnd_ms 1.4, the current development version, offers some throttling through the quality-of-service filter and global transaction identifiers (GTIDs). Both the plugin's client-side GTID emulation and the MySQL 5.6 built-in GTID feature can be used to slow down PHP MySQL requests, if desired.
How it's done
The replication plugin has a neat feature called quality-of-service filter. If, for example,
the quality of service you need from a MySQL Replication cluster
is "read your writes", you call …
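Conceptually, the GTID-based throttling works like this: after a write you note the master's GTID, and a read from a slave is delayed until the slave has applied at least that GTID. A rough Python simulation of that wait loop follows; the function and variable names are illustrative, not the mysqlnd_ms API:

```python
import time

# Hypothetical sketch of GTID-based "read your writes": after a write, remember
# the master's GTID; before reading from a slave, poll until the slave reports
# it has applied at least that GTID (or give up after a timeout).

def wait_for_gtid(get_slave_gtid, target, timeout=1.0, poll=0.01):
    deadline = time.time() + timeout
    while time.time() < deadline:
        if get_slave_gtid() >= target:
            return True
        time.sleep(poll)
    return False

# Simulated slave that catches up by one GTID per poll:
state = {"gtid": 7}
def slave_gtid():
    state["gtid"] += 1
    return state["gtid"]

print(wait_for_gtid(slave_gtid, 10))  # True once the slave reaches GTID 10
```

The throttling effect is exactly this wait: the client slows itself down instead of reading stale data from a lagging slave.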
I have always found it strange that we do not have a MySQL user
group in Sweden - this is the country where the MySQL saga
started.
Therefore I am delighted to announce that as of today we have a
user group in Sweden, and I hope it will be an active one!
If you want to be part of our user group in Sweden, join the
group here!
Often I see a SQL problem solved incorrectly, and I do not mean inefficiently, but simply incorrectly. In many cases the developer remains unaware that they are not getting the results they expected, or, even if a result is correct, it is correct only by chance, for example because the database engine was smart enough to figure out some nonsense in a query. In a few posts I will try to disclose some of the more common problems.
Aggregate with GROUP BY
Unlike many other database systems, MySQL actually permits an aggregate query to return columns that are not used in the aggregation (i.e. not listed in the GROUP BY clause). This could be considered flexibility, but in practice it can easily lead to mistakes if the person designing queries does not understand how they will be executed. For example, what value does an aggregate query return for a column that was not part of the grouping key?
mysql> SELECT user_id, id, …[Read more]
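To see the pitfall concretely, here is a small demonstration using Python's sqlite3 module (SQLite, like MySQL in its default pre-ONLY_FULL_GROUP_BY behavior, permits non-aggregated columns in a GROUP BY query); the table and column names are made up for the example:

```python
import sqlite3

# Demonstration: `id` is neither aggregated nor in GROUP BY, yet the query is
# accepted. The engine picks the value from *some* row in each group, with no
# guarantee which one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INT, id INT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10), (1, 11), (2, 20)])
rows = conn.execute(
    "SELECT user_id, id, COUNT(*) FROM orders GROUP BY user_id"
).fetchall()
print(rows)  # for user_id 1, `id` may be 10 or 11 -- the choice is arbitrary
```

The counts are deterministic; the per-group `id` is not, which is exactly how such queries silently return misleading data.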
In my previous post, http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html,
I reviewed the differences between OLTP and Analytics
databases.
Scale challenges differ between these two database worlds.
Scale challenges in the Analytics world come with the growing
amounts of data. Most solutions have been leveraging three main
aspects: columnar storage, RAM, and parallelism.
Columnar storage makes scans and data filtering more precise and
focused. After that, it all comes down to the I/O: the faster
the I/O, the faster the query finishes and brings results.
Faster disks and SSDs can play a good role, but above all: RAM! …
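A toy Python illustration of why columnar layout helps analytic filtering; the data and field names are invented for the example:

```python
# Toy illustration of why columnar layout helps analytic scans: filtering on
# one column touches only that column's array, not every field of every row.

rows = [  # row-oriented: each record stored together
    {"id": i, "amount": i * 10, "country": "SE" if i % 2 else "US"}
    for i in range(6)
]

columns = {  # column-oriented: one contiguous list per field
    "id": [r["id"] for r in rows],
    "amount": [r["amount"] for r in rows],
    "country": [r["country"] for r in rows],
}

# Row store: scan whole records to filter on one field.
row_result = [r["amount"] for r in rows if r["country"] == "SE"]

# Column store: scan only the `country` column, then fetch matching amounts.
match = [i for i, c in enumerate(columns["country"]) if c == "SE"]
col_result = [columns["amount"][i] for i in match]

print(row_result == col_result, col_result)  # True [10, 30, 50]
```

Both layouts give the same answer; the columnar one reads far fewer bytes per filtered column, which is where the I/O savings come from.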
Have you ever tried to kill a query, but rather than just going away, it remained among the running ones for an extended period of time? Or perhaps you have noticed threads marked as killed showing up from time to time and never actually dying. What are these zombies? Why does MySQL sometimes seem to fail to terminate queries quickly? Is there any way to force the KILL command to actually work instantaneously? This article sheds some light on it.
Threads and connections
MySQL uses a separate thread for each client connection. A query sent to MySQL is handled by a thread that was previously associated with the connection over which the query arrived. Anyone with sufficient privileges can see the list of currently active threads, along with some additional details, by running the SHOW PROCESSLIST command, which returns a table-like view where each connection becomes a separate row:
…[Read more]
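The underlying reason KILL is not instant can be sketched in a few lines: like a MySQL thread, the worker below only checks a kill flag at safe points between units of work, so a kill request takes effect at the next check rather than immediately. This is a conceptual Python sketch, not MySQL's actual code:

```python
import threading
import time

# Sketch of cooperative cancellation: KILL sets a flag; the thread dies only
# when it next reaches a safe point and checks that flag. A long
# non-interruptible unit of work delays the actual death.

killed = threading.Event()
progress = []

def worker():
    for step in range(100):
        if killed.is_set():      # the "safe point" check
            return
        progress.append(step)    # one unit of work
        time.sleep(0.01)

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
killed.set()                     # issue the "KILL"
t.join()
print(0 < len(progress) < 100)   # True: it stopped early, but not instantly
```

A query stuck inside one long unit of work (a large sort, a blocked I/O call) simply never reaches the next check, which is why killed threads can linger.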
My old post (http://www.jroller.com/mmatthews/entry/speeding_up_batch_inserts_for) on the performance gains from batch rewritten statements gets repeated in various forums and conference sessions, and people often miss its nuances.
Under the hood, what is happening with this feature is the following:
(1) The driver attempts to detect that the SQL being prepared is
an INSERT. We (on purpose) don't ship a full-fledged parser in
the driver, so this works 95% of the time. For the other 5%,
you're out of luck unless you can simplify your query text.
(2) If the statement is an INSERT, the driver attempts to
determine if it can be rewritten as a multi-value INSERT. From
the code itself, the conditions are:
// Needs to be INSERT, can't have INSERT … SELECT or
// INSERT … ON DUPLICATE KEY UPDATE …
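A rough Python sketch of the rewrite idea follows. It is greatly simplified compared to the real Connector/J logic, which also handles the parsing edge cases above and packet-size limits; the function name and table are my own:

```python
# Hypothetical sketch of the batch rewrite: collapse a batch of identical
# single-row INSERTs into one multi-value INSERT, flattening the parameters.

def rewrite_batch(sql, batches):
    prefix, values = sql.split("VALUES", 1)
    # repeat the VALUES clause once per batch entry
    stmt = prefix + "VALUES " + ", ".join(values.strip() for _ in batches)
    params = [p for batch in batches for p in batch]
    return stmt, params

sql = "INSERT INTO t (a, b) VALUES (?, ?)"
stmt, params = rewrite_batch(sql, [(1, 2), (3, 4), (5, 6)])
print(stmt)    # INSERT INTO t (a, b) VALUES (?, ?), (?, ?), (?, ?)
print(params)  # [1, 2, 3, 4, 5, 6]
```

One round trip instead of three is where the speedup comes from; the fragility is in step (1), since a naive split on "VALUES" breaks for anything the simple detection cannot handle.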
Believe me, managing EC2 instances is not as simple or magical
as marketers would like you to believe. The cloud gets
complicated as it gets bigger. EC2 really only lets a person
ignore power and network layout (which is bad), while quickly
provisioning more servers that end up costing more than actual
servers.
Things that EC2 is missing to make life easier for the
developer:
Ability to update all servers with packages and code. Natively,
EC2 does not support pushing files or installing new software
packages to server groups, so you have to install cluster-it
and Puppet and write your own deploy program.
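What such a "push to all servers" helper boils down to is something like the following Python sketch; the host names and command are placeholders, and a fake runner is injected here so nothing actually connects:

```python
import subprocess

# Hypothetical sketch of a minimal "run on every server" helper, the gap the
# post complains about: execute the same command on each host over ssh.

def run_everywhere(hosts, command, runner=subprocess.run):
    results = {}
    for host in hosts:
        results[host] = runner(["ssh", host, command])
    return results

# Dry run with a fake runner that just records what would be executed:
calls = []
run_everywhere(["web1", "web2"], "apt-get -y upgrade",
               runner=lambda argv: calls.append(argv))
print(calls)  # [['ssh', 'web1', 'apt-get -y upgrade'], ['ssh', 'web2', 'apt-get -y upgrade']]
```

Tools like pdsh, cluster-it, and Puppet are essentially hardened versions of this loop, with parallelism, retries, and state tracking added.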
Server names and the EC2 AWS management console do not match.
Everything is referenced by instance IDs. The interface does not
allow one to launch many instances in a named pattern, so you
have to go back and sync up the instances with …