Showing entries 121 to 130 of 365
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: Benchmarks (reset)
Apache Spark with Air ontime performance data

There is a growing interest in Apache Spark, so I wanted to play with it (especially after Alexander Rubin’s Using Apache Spark post).

To start, I used the recently released Apache Spark 1.6.0 for this experiment, and I will play with “Airlines On-Time Performance” database from
http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time. You can find the scripts I used here https://github.com/Percona-Lab/ontime-airline-performance. The uncompressed dataset …

[Read more]
State of Percona Server 5.6, MySQL 5.6 and MySQL 5.7 RC

This week Oracle will release MySQL 5.7 GA, so it’s a perfect time to do a quick review of the current state of Percona Server 5.6.26, MySQL 5.6.26 and MySQL-5.7.8 RC. We used two boxes from our benchmark lab for this:
– Box 1: 16 Cores+HT (32 virt cores)/fast PCIe ssd card/RAM: 192GB
– Box 2: 24 Cores+HT(48 virt cores)/fast PCIe ssd card/RAM: 128GB

Dataset: sysbench/uniform, 32 tables with 12M rows each, ~95GB
Tests: sysbench – point select, oltp read only, oltp read/write
Test sequence: start server, warmup, series of the tests (each lasts 5 minutes) from 1 to 4096 threads

Tests were run for two setups:
– CPU bound (in memory) – innodb_buffer_pool_size=100GB
– IO bound – innodb_buffer_pool_size=25GB

[Read more]
Testing MySQL partitioning with pt-online-schema-change

There are cases where we need to implement MySQL partitioning, or at least test if it is an option to solve an issue. However, how are we able to test it when the table on our production environment has hundreds of millions of rows, several composite indexes and/or the size on disk is hundreds of gigabytes?

Testing environments usually don’t have all the data that production has and if they have, probably you are not testing all the use-cases at a database level. Therefore, is it possible to test MySQL Partitioning on production impacting as less as possible?

When we execute pt-online-schema-change, it creates a table, triggers, and then copies the data. As we are going to test partitioning we are going to need both tables – with and without partitioning – and we are going to use triggers to keep both tables consistent. A good thing about changing a table to use partitioning is that, usually, …

[Read more]
The MySQL query cache: Worst enemy or best friend?

During the last couple of months I have been involved in an unusually high amount of performance audits for e-commerce applications running with Magento. And although the systems were quite different, they also had one thing in common: the MySQL query cache was very useful. That was counter-intuitive for me as I’ve always expected the query cache to be such a bottleneck that response time is better when the query cache is turned off no matter what. That lead me to run a few experiments to better understand when the query cache can be helpful.

Some context

The query cache is well known for its contentions: a global mutex has to be acquired for any read or write operation, which means that any access is serialized. This was not an issue 15 years ago, but with today’s multi-core servers, such serialization is the best way to kill performance.

However from a performance …

[Read more]
Multi-source replication in MySQL 5.7 vs Tungsten Replicator

MySQL 5.7 comes with a new set of features and multi-source replication is one of them. In few words this means that one slave can replicate from different masters simultaneously.

During the last couple of months I’ve been playing a lot with this trying to analyze its potential in a real case that I’ve been facing while working with a customer.

This was motivated because my customer is already using multi-sourced slaves with Tungsten Replicator and I wanted to do a side-by-side comparison between Tungsten Replicator and Multi-source Replication in MySQL 5.7

Consider the following scenario:


DB1 is our main master attending mostly writes from several applications, it also needs to serve read …

[Read more]
InnoDB vs TokuDB in LinkBench benchmark

Previously I tested Tokutek’s Fractal Trees (TokuMX & TokuMXse) as MongoDB storage engines – today let’s look into the MySQL area.

I am going to use modified LinkBench in a heavy IO-load.

I compared InnoDB without compression, InnoDB with 8k compression, TokuDB with quicklz compression.
Uncompressed datasize is 115GiB, and cachesize is 12GiB for InnoDB and 8GiB + 4GiB OS cache for TokuDB.

Important to note is that I used tokudb_fanout=128, which is only available in our latest Percona Server release.
I will …

[Read more]
Using Cgroups to Limit MySQL and MongoDB memory usage

Quite often, especially for benchmarks, I am trying to limit available memory for a database server (usually for MySQL, but recently for MongoDB also). This is usually needed to test database performance in scenarios with different memory limits. I have physical servers with the usually high amount of memory (128GB or more), but I am interested to see how a database server will perform, say if only 16GB of memory is available.

And while InnoDB usually respects the setting of innodb_buffer_pool_size in O_DIRECT mode (OS cache is not being used in this case), more engines (TokuDB for MySQL, MMAP, WiredTiger, RocksDB for MongoDB) usually get benefits from OS cache, and Linux kernel by default is generous enough to allocate as much memory as available. There I should note that while TokuDB (and TokuMX for MongoDB) supports DIRECT mode (that is bypass OS cache), we found there is a performance gain if OS cache is used for compressed pages.

[Read more]
Update on the InnoDB double-write buffer and EXT4 transactions

In a post, written a few months ago, I found that using EXT4 transactions with the “data=journal” mount option, improves the write performance significantly, by 55%, without putting data at risk. Many people commented on the post mentioning they were not able to reproduce the results and thus, I decided to further investigate in order to find out why my results were different.

So, I ran sysbench benchmarks on a few servers and found when the InnoDB double-write buffer limitations occur and when they don’t. I also made sure some of my colleagues were able to reproduce the results. Basically, in order to reproduce the results you need the following conditions:

  • Spinning disk (no SSD)
  • Enough CPU power
  • A dataset that fits in the InnoDB buffer pool
  • A continuous high write …
[Read more]
LinkBenchX: benchmark based on arrival request rate

An idea for a benchmark based on the “arrival request” rate that I wrote about in a post headlined “Introducing new type of benchmark” back in 2012 was implemented in Sysbench. However, Sysbench provides only a simple workload, so to be able to compare InnoDB with TokuDB, and later MongoDB with Percona TokuMX, I wanted to use more complicated scenarios. (Both TokuDB and TokuMX are part of Percona’s product line, in the case you missed Tokutek now part of the Percona family.)

Thanks to Facebook – they provide LinkBench, a benchmark that emulates the social graph …

[Read more]
Measuring the impact of tcpdump on Very Busy Hosts

A few years back Deva wrote about how to use tcpdump on very busy hosts. That post sparked my interest about exploring how to measure the impact of tcpdump on very busy hosts. In this post, I wanted to highlight how much of an impact there really is and what options you have to make the query collection much more effective.

Some things you need to know:

  • The test is a sysbench read-only workload, 8 tables, 8 threads, 1000000 rows each with 16G of buffer pool. Dataset fully in memory.
  • sysbench is ran on the same host, on 1Gbps connection, sysbench can saturate the network and therefore affect my network test with netcat so I decided to run locally.
  • There are 13 tests, 5 minutes each with 1 minute interval, varying on how the dump file is captured.
    • First one as …
[Read more]
Showing entries 121 to 130 of 365
« 10 Newer Entries | 10 Older Entries »