Showing entries 11 to 20 of 56
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: production (reset)
Getting History of Table Sizes in MySQL

One data point which is very helpful but surprisingly few people have is the history of the table sizes. Projection of data growth is very important component for capacity planning and simply watching the growth of space used on partition is not very helpful.

Now as MySQL 5.0+ has information schema collecting and keeping this data is very easy:


  2. USE stats;
  3. CREATE TABLE `tables` (
  4. `DAY` date NOT NULL,
  5. `TABLE_SCHEMA` varchar(64) NOT NULL DEFAULT '',
  6. `TABLE_NAME` varchar(64) NOT NULL DEFAULT '',
  7. `ENGINE` varchar(64) DEFAULT NULL,
  11. `DATA_FREE` bigint(21) …
[Read more]
Replication of MEMORY (HEAP) Tables

Some Applications need to store some transient data which is frequently regenerated and MEMORY table look like a very good match for this sort of tasks. Unfortunately this will bite when you will be looking to add Replication to your environment as MEMORY tables do not play well with replication.

The reason is very simple - both STATEMENT and ROW replication contain the changes to the data in binary logs. This requires the data to be same on Master and Slave. When you restart the slave you will lose contents of your MEMORY tables and replication will break. STATEMENT replication will often continue to run, with contents of the table just being
different as there is a little checks whenever statements produce the same results on the slave. ROW replication will
complain about ROW not exist for UPDATE or DELETE operation.

So what you can do ?

Use Innodb Table Instead Innodb is quite fast when it …

[Read more]
The story of one MySQL Upgrade

I recently worked on upgrading MySQL from one of very early MySQL 5.0 versions to Percona Server 5.1. This was a classical upgrade scenario which can cause surprises. Master and few slaves need to be upgraded. It is a shared database used by tons of applications written by many people over more than 5 years timeframe. It did not have any extensive test suite we could use for validation. As you might guess in such cases some of the original authors have moved on and nobody is exactly sure what application does or does not do with the database. Database is production critical with serious role in serious company so we can't just do a reckless upgrade

First we needed to do a sanity check on existing replication setup. As we're checking replication consistency down the road we need to make sure replication is in sync to begin with to avoid false …

[Read more]
Cache Miss Storm

I worked on the problem recently which showed itself as rather low MySQL load (probably 5% CPU usage and close to zero IO) would spike to have hundreds instances of threads running at the same time, causing intense utilization spike and server very unresponsive for anywhere from half a minute to ten minutes until everything would go back to normal. What was interesting is Same query was taking large portion of slots in PROCESSLIST. I do not just mean query with same fingerprint but literally the same query with same constants.

What we observed was a cache miss storm – situation which can happen with memcache (as in this case) as well as with query cache. If you have the item which is expensive to generate but which has a lot of hits in the cache you can get into situation when many clients at once will have miss in the cache and will attempt to re-create the item pushing server to overload. Now because a lot of requests …

[Read more]
Estimating Replication Capacity

It is easy for MySQL replication to become bottleneck when Master server is not seriously loaded and the more cores and hard drives the get the larger the difference becomes, as long as replication
remains single thread process. At the same time it is a lot easier to optimize your system when your replication runs normally - if you need to add/remove indexes and do other schema changes you probably would be looking at some methods involving replication if you can't take your system down. So here comes the catch in many systems - we find system is in need for optimization when replication can't catch up but yet optimization process we're going to use relays on replication being functional and being able to catch up quickly.

So the question becomes how can we estimate replication capacity, so we can deal with replication load before slave is unable to catch up.

Need to replication capacity is not only needed in case you're …

[Read more]
Scaling: Consider both Size and Load

So lets imagine you have the server handling 100.000 user accounts. You can see the CPU,IO and Network usage is below 10% of capacity - does it mean you can count on server being able to
handle 1.000.000 of accounts ? Not really, and there are few reasons why, I'll name most important of them:

Contention - This is probably the most obvious one. MySQL (and systems in general) do not scale perfectly with numbers of CPUs and number of concurrent requests. Reduced efficiency of CPU cache, Mutex contention and database lock contention all come here. Some of them are preventable and can be reduced by code changes, such as there have been a lot of advanced in scalability of MySQL by improving locking code design, others, such as row level locks would require application changes to allow more concurrent process. The scalability numbers depend a lot on the system scale, software and workload.

Data size …

[Read more]
Color code your performance numbers

When analyzing how good or bad response time is it is not handy to look at the averages, min or max times - something what is easily computed using built in aggregate functions. We most likely would like to see some percentile numbers - 95 percentile or 99 percentile. The problem is computing these in SQL is ugly and resource consuming. There is however a simple way to get similar data, looking at it from the different point of view.

When we're speaking about application we may not always care about exact value of response time but rather we want to see response time to be within certain range. For example if we define page feels good if response time is below 50ms it is not as important if response was 40ms or 44ms - it is much more important how frequently this goal was reached.

In fact I prefer to define 2 performance level. One what users would consider good performance, another what is acceptable, and basically the third level …

[Read more]
On Good Instrumentation

In so many cases troubleshooting applications I keep thinking how much more efficient things could be going if only there would be a good instrumentation available. Most of applications out there have very little code to help understand what is going on and if it is there it is frequently looking at some metrics which are not very helpful.

If you look at the system from bird eye view - system needs to process transactions and you want it to successfully complete large number of transactions it gets (this is what called availability) and we want it to serve them with certain response time, which is what is called performance. There could be many variables in environment which change - load, number of concurrent users, database, the way users use the system but in the nutshell all what you really care is having predictable response time within certain range. So if we care about response time - this is exactly what our instrumentation should …

[Read more]
Should I buy a Fast SSD or more memory?

While a scale-out solution has traditionally been popular for MySQL, it's interesting to see what room we now have to scale up - cheap memory, fast storage, better power efficiency.  There certainly are a lot of options now - I've been meeting about a customer/week using Fusion-IO cards.  One interesting choice I've seen people make however, is buying an SSD when they still have a lot of pages read/second - I would have preferred to buy memory instead, and use the storage device for writes.

Here's the benchmark I came up with to confirm if this is the case:

  • Percona-XtraDB-9.1 release
  • Sysbench OLTP workload with 80 million rows (about 18GB worth of data+indexes)
  • XFS Filesystem mounted with nobarrier option.
  • Tests run with:
    • RAID10 with BBU over 8 disks
    • Intel SSD X25-E 32GB
    • FusionIO 320GB MLC
[Read more]
Missleading Innodb message on recovery

As I wrote about 2 years ago the feature of Innodb to store copy of master's position in Slave's Innodb tablespace got broken. There is a lot of discussions at the corresponding bug report while outcome of the fix remained uncertain for me (the bug is market duplicate while the bugs it seems to be duplicate for describe different issues).
Anyway. The customer came to me today having the following message in the error log after Slave crash while running MySQL 5.1.41 (extra virgin version without XtraDB or Percona Patches)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 10000000, file name mysql-bin.005000
InnoDB: Last MySQL binlog file position 0 20000000, file name ./mysql-bin.003000

He …

[Read more]
Showing entries 11 to 20 of 56
« 10 Newer Entries | 10 Older Entries »