This is not an industry benchmark; it is a real client
story.
The requirement: 200 billion records in a month, and the workload
should be transactional but not durable.
For a regular workload we use LOAD DATA INFILE into partitioned
InnoDB, but here we have estimated 15TB of RAID storage. That is
a lot of disks, and it can no longer fit inside a single server's
internal storage.
MariaDB 5.5 comes with the TokuDB storage engine for compression,
but is it possible in the time frame imposed by the workload?
We start by benchmarking 380G of raw input data files, 6
billion rows.
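To put those numbers in perspective, here is a quick back-of-the-envelope sketch. It only uses the figures quoted above; the 30-day month, reading "380G" as 380 x 10^9 bytes, and the function names are my assumptions, not part of the original post.

```python
SECONDS_PER_DAY = 86_400


def required_rows_per_second(rows_per_month, days=30):
    """Sustained insert rate needed to load rows_per_month in `days` days."""
    return rows_per_month / (days * SECONDS_PER_DAY)


def bytes_per_row(total_bytes, rows):
    """Average raw size of one row in the sample input files."""
    return total_bytes / rows


# Figures from the post: 200 billion rows per month, 380G for 6 billion rows.
rows_per_month = 200 * 10**9
sample_bytes = 380 * 10**9   # assumption: G means 10^9 bytes
sample_rows = 6 * 10**9

rate = required_rows_per_second(rows_per_month)
row_size = bytes_per_row(sample_bytes, sample_rows)
# Raw input volume for a full month, before any index or engine overhead.
raw_month_tb = row_size * rows_per_month / 10**12

print(f"required rate : {rate:,.0f} rows/s")
print(f"avg row size  : {row_size:.1f} bytes")
print(f"raw month data: {raw_month_tb:.1f} TB")
```

Under these assumptions the load has to sustain roughly 77,000 rows per second around the clock, and a month of raw input extrapolates to about 12.7 TB.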
First let's check the compression with the dataset.
Great job, TokuDB: 1/5, without tuning a single parameter other
than durability! …
Work is probably being done every day to improve the performance of the InnoDB storage engine and to remove bottlenecks and scalability issues. Here is another one that I wanted to highlight:
Scalability issues due to tables without primary keys
This scalability issue is caused by the use of tables without primary keys. It typically shows up as contention on the InnoDB dict_sys mutex. The dict_sys mutex controls access to the data dictionary and is used in various places. I will mention only a few of them:
- During operations such as opening and closing table handles, or
- When accessing I_S tables, or
- During undo of a freshly inserted row, or
- During other data dictionary modification operations such as CREATE TABLE, or
- Within the “Persistent Stats” subsystem, among other things.
Of course this list is not exhaustive but should …
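Since the issue comes from tables without primary keys, a natural first step is to find them. Below is a minimal sketch, assuming a MySQL or MariaDB server and any Python DB-API 2.0 cursor; the helper name `tables_without_pk` is hypothetical and not from the original post. Note that it reads information_schema, which, as listed above, itself touches the dict_sys mutex, so it is better run at a quiet time.

```python
# Lists InnoDB base tables that have no PRIMARY KEY constraint.
FIND_TABLES_WITHOUT_PK = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.engine = 'InnoDB'
  AND c.constraint_name IS NULL
ORDER BY t.table_schema, t.table_name
"""


def tables_without_pk(cursor):
    """Run the query on a DB-API 2.0 cursor; return (schema, table) pairs."""
    cursor.execute(FIND_TABLES_WITHOUT_PK)
    return [(schema, table) for schema, table in cursor.fetchall()]
```

With a connection from whichever driver you use, something like `tables_without_pk(conn.cursor())` gives the list of candidates to review.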
I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool, iiBench, which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, the tool still serves well for benchmarking purposes.
OK, let’s start off with the configuration details.
Configuration
First of all let me describe the EC2 instance type that I used.
EC2 Configuration
I chose the m2.4xlarge instance, as that’s the instance type with the highest memory available, and memory is what really matters.
High-Memory Quadruple Extra Large Instance: 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute …