In MySQL, even though the name read_buffer_size implies that the variable controls only read buffering, it actually serves a dual purpose, providing sequential IO buffering for both reads and[...]
A couple of questions I get a lot from MySQL customers are “how will this hardware upgrade improve my transactions per second (TPS)?” and “what level of TPS will MySQL deliver on this hardware if I’m running ACID settings?” Running sysbench against MySQL with different values for per-thread and global memory buffer sizes, ACID settings, and other settings gives me concrete numbers to bring to the customer, showing the impact that more RAM, faster CPUs, faster disks, or cnf changes have on the server. Here are some examples for a common question: “If I’m using full ACID settings vs. non-ACID settings, what performance am I going to get from this server?”
Let’s find out by running sysbench with the following settings (most are self-explanatory – if not, the man page can explain them):
- sysbench --test=oltp --db-driver=mysql --oltp-table-size=1000000 --mysql-engine-trx=yes --oltp-test-mode=complex …
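For context, “full ACID” vs. “non-ACID” in this kind of comparison usually comes down to the durability settings below. This is a hedged sketch of the typical toggles, not the exact cnf used for the benchmark:

-- Full durability: flush and sync the InnoDB log on every commit,
-- and sync the binlog on every commit.
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
SET GLOBAL sync_binlog = 1;

-- Relaxed (non-ACID) settings: flush the log roughly once per second and
-- let the OS decide when to sync the binlog. Much faster on slow disks,
-- but a crash can lose roughly the last second of transactions.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
SET GLOBAL sync_binlog = 0;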
The need: Often there is a requirement that data in a particular table has to be processed, and the processing might be slow, while the table might be one that your application uses extensively. For example, a logging table that logs page hits. Or there might be an archiving operation that has to be performed on a particular table. Archiving, processing, and aggregating records are all slow operations that can really bog down a website; combine that with the added overhead if the table that needs these operations performed on it is one that...
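One common way to handle this (not necessarily the approach this post goes on to describe) is to atomically swap in an empty table, so the slow archiving or aggregation work runs against a table the application is no longer writing to. A minimal sketch, assuming a hypothetical page_hits logging table:

-- Create an empty copy and swap it into place in one atomic step.
CREATE TABLE page_hits_new LIKE page_hits;
RENAME TABLE page_hits TO page_hits_old, page_hits_new TO page_hits;
-- The application keeps inserting into page_hits; the archiving or
-- aggregation job can now work through page_hits_old at its own pace.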
Does having small data-sets really help? Of course it does! Are memory lookups faster than disk lookups? Of course! So many times I have seen people complain about queries taking too long now, when they were not taking that long earlier. There is one big reason for this: earlier the data-set was small, so it could fit into memory. Now that the data-set has grown too large to fit entirely into memory, the disk seeks have really slowed down the queries. What to do now? Vertical partitioning. Divide the data-set into separate data-sets vertically....
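For example, vertical partitioning can mean splitting a wide table so the small, hot columns stay in a table that fits in memory, while the bulky, rarely-read columns move elsewhere. A minimal sketch with a hypothetical users table (the column split is illustrative only):

-- Hot columns: small rows, read on almost every request, easy to cache.
CREATE TABLE users_hot (
  user_id INT UNSIGNED NOT NULL PRIMARY KEY,
  username VARCHAR(64) NOT NULL,
  last_login DATETIME
) ENGINE=InnoDB;

-- Cold columns: large, rarely read, kept out of the hot working set.
CREATE TABLE users_cold (
  user_id INT UNSIGNED NOT NULL PRIMARY KEY,
  profile_text TEXT,
  avatar BLOB
) ENGINE=InnoDB;

-- Most queries only ever touch the small table:
SELECT username, last_login FROM users_hot WHERE user_id = 42;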
We had one of our slave servers frequently stop replicating with the “Innodb Lock Wait Timeout” error. The slave IO thread would continue to fetch the binlogs, while the slave SQL thread kept stopping with the above-mentioned error. The team’s initial inclination was to change the innodb_lock_wait_timeout variable from 50 seconds to a higher value. It was a read-only slave, so our expectation was that there would be no competing writes. Then we started listing the possible next steps and what could be wrong.
- There could be a user with “super” privilege in the system that was running updates directly on the slave
- A backup script that could be locking the tables for the backup
- Increase the innodb_lock_wait_timeout variable
- Enable the InnoDB lock monitor (a sketch of these checks follows this list)
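A few hedged examples of how these checks could look (standard statements, but whether they apply depends on the server version):

-- Any accounts with SUPER that might be writing directly to the slave?
SELECT user, host FROM mysql.user WHERE Super_priv = 'Y';

-- Raise the timeout (the value is arbitrary; with the built-in InnoDB in
-- 5.0/5.1 this variable is not dynamic and must be set in my.cnf plus a restart):
SET GLOBAL innodb_lock_wait_timeout = 120;

-- On 5.0/5.1, creating this specially named table enables the InnoDB lock
-- monitor, which adds lock details to SHOW ENGINE INNODB STATUS output:
CREATE TABLE innodb_lock_monitor (a INT) ENGINE=InnoDB;
SHOW ENGINE INNODB STATUS\G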
While we were working on the system, we noticed that there were a few select queries that …
You know already that InnoDB in MySQL 5.5 has great
improvements in performance and scalability. You will have to
wait a few months for that, though, because MySQL 5.5 is not
GA yet. But if you need some extra performance in MySQL 5.1, you may want to use the InnoDB Plugin instead of the built-in one. As of version 5.1.47, the InnoDB Plugin is of GA quality, and it comes with a good out-of-the-box improvement compared to the built-in engine.
To test my assumptions, I used one of my test Linux servers to
perform a sysbench on 5.0.91, 5.1.47 built-in and plugin, and
5.5.4. The MySQL servers were all configured with …
(No three-part series is complete without a part 4, right?)
Here is an analogy that describes well what group commit does. We
have a bus driving back and forth transporting people from A to B
(corresponding to fsync()
"transporting" commits to
durable storage on disk). The group commit optimisation is to
have the bus pick up everyone that is waiting at A before driving
to B, not drive people one by one. Makes sense, huh? :-)
It is pretty obvious that this optimisation of having more than one person in the bus can dramatically improve throughput, and it is the same for the group commit optimisation. Here is a graph from a benchmark comparing stock MariaDB 5.1 vs. MariaDB patched …
*edit* I want to point out that this test was done on a single
database server which used MySQL partitioning. This is a
demonstration of how Shard-Query can improve performance in
non-sharded databases too. *edit*
Over the weekend I spent a lot of time improving my new
Shard-Query tool (code.google.com/p/shard-query) and the
improvements can equate to big performance gains on partitioned
data sets versus executing the query directly on MySQL.
I'll explain this graph below, but lower is better (response
time) and Shard-Query is the red line.
MySQL understands that queries which access data in only certain
partitions don't have to read the rest of the table. This
partition elimination works well, but MySQL left a big
optimization out of partitioning: …
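As an aside, the pruning behaviour itself is easy to see with EXPLAIN PARTITIONS. A minimal sketch against a hypothetical hits table (not the schema used for the benchmark above):

-- Range-partition the table by year.
CREATE TABLE hits (
  hit_date DATE NOT NULL,
  page_id INT NOT NULL
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(hit_date)) (
  PARTITION p2008 VALUES LESS THAN (2009),
  PARTITION p2009 VALUES LESS THAN (2010),
  PARTITION p2010 VALUES LESS THAN (2011)
);

-- The partitions column of the plan should show only p2010 being read:
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM hits WHERE hit_date >= '2010-01-01';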
I created a new tool this week:
http://code.google.com/p/shard-query
As the name Shard-Query suggests, the goal of the tool is to run
a query over multiple shards, and to return the combined results
together as a unified result set. It uses Gearman to ask each server
for a set of rows and then runs the query over the combined set.
This isn't a new idea; however, Shard-Query is different from
other Gearman examples I've seen because it supports
aggregation.
It does this by doing some basic query rewriting based on the
input query.
Take this query for example:
select c2, sum(s0.c1), max(c1) from t1 as s0 join t1 using (c1,c2) where c2 = 98818 group by c2;
The tool will split this up into two queries.
This first query will be sent to each shard. Notice that …
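To give a rough idea of the kind of rewrite involved (the exact SQL Shard-Query emits may differ), distributed aggregation typically pushes the aggregates down to each shard and then re-aggregates the partial results, relying on SUM and MAX being safe to combine:

-- Per-shard query (runs on every shard):
SELECT c2, SUM(s0.c1) AS part_sum, MAX(c1) AS part_max
FROM t1 AS s0 JOIN t1 USING (c1, c2)
WHERE c2 = 98818
GROUP BY c2;

-- Combining step over the collected per-shard rows (collected_rows is a
-- hypothetical name for wherever the partial results are gathered):
SELECT c2, SUM(part_sum), MAX(part_max) FROM collected_rows GROUP BY c2;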
MySQL supports two different algorithms for views: the MERGE algorithm and the TEMPTABLE algorithm. These two algorithms differ greatly. A view which uses the MERGE algorithm can merge filter conditions into the view query itself. This has significant performance advantages over TEMPTABLE views. A view which uses the TEMPTABLE algorithm has to compute the entire contents of the view before any filter is applied. Computing the entire contents requires a temporary table, and many more rows may be accessed than would otherwise be the case had the filter been merged into the view query.
As long as a view avoids aggregation, DISTINCT, GROUP BY, ORDER BY, etc., it can use the MERGE algorithm. Unfortunately, this means that views of any significant complexity will almost always use the TEMPTABLE algorithm.
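To make the difference concrete, here is a hedged sketch using hypothetical view definitions over a table t1(c1, c2) (not taken from the post):

-- No aggregation: the MERGE algorithm applies, so a WHERE on the view is
-- pushed into the underlying query and indexes on t1 can be used.
CREATE ALGORITHM=MERGE VIEW v_filtered AS
SELECT c1, c2 FROM t1 WHERE c1 > 0;

-- GROUP BY forces the TEMPTABLE algorithm: the whole aggregate is
-- materialized first, and only then is the outer filter applied.
CREATE ALGORITHM=TEMPTABLE VIEW v_totals AS
SELECT c2, SUM(c1) AS total FROM t1 GROUP BY c2;

SELECT * FROM v_totals WHERE c2 = 98818;  -- filter applied after materialization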
This blog post demonstrates a workaround that allows the convenience of complex views, including those with aggregation (that is views …