Displaying posts with tag: RocksDB
Exposing MyRocks Internals via System Variables: Part 7, Use Case Considerations

(In the previous post, Part 6, we covered Replication.)

In this final blog post, we conclude our series exploring MyRocks by taking a look at use case considerations. After all, knowledge of how an engine works is really only useful if you feel you’re in a good position to use it.

Advantages of MyRocks

Let’s start by talking about some of the advantages of MyRocks.

Compression

MyRocks will typically do a good job of reducing the physical footprint of your data. As I mentioned in my previous post in this series about compression, you have the ability to configure compression down to the individual compaction layers for each column family. You also get the advantage of the fact that data isn’t updated once it’s written to disk. Compaction, which was …
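As a concrete illustration of that per-level flexibility, compression is usually expressed through the column family options string. A minimal my.cnf sketch (the variable and option names are standard MyRocks/RocksDB; the level-to-codec mapping is only an example to adapt, and the codecs must be compiled into your build):

    [mysqld]
    # No compression on the small, hot upper levels; LZ4 in the middle;
    # ZSTD on the large bottommost levels where most of the data lives.
    rocksdb_default_cf_options=compression_per_level=kNoCompression:kNoCompression:kLZ4Compression:kLZ4Compression:kLZ4Compression:kZSTD:kZSTD
    # Individual column families can diverge via rocksdb_override_cf_options.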

[Read more]
Exposing MyRocks Internals via System Variables: Part 6, Replication

(In the previous post, Part 5, we covered Data Reads.)

In this blog post, we continue our series exploring MyRocks mechanics by looking at the configurable server variables and column family options. In our last post, I explained at a high level how reads occur in MyRocks, concluding the arc of covering how data moves into and out of MyRocks. In this post, we’re going to explore replication with MyRocks, more specifically, read-free replication.

Some of you may already be familiar with the concepts of read-free replication as it was a key feature of the TokuDB engine, which leveraged fractal tree indexing. TokuDB was similar to MyRocks in the sense that it had a pseudo log-based storage …
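For reference on the replica-side switch being introduced here, recent Percona Server MyRocks builds expose read-free replication through a single variable. A hedged my.cnf sketch (check your distribution’s documentation for the exact name and permitted values):

    [mysqld]
    # Apply replicated row events without reading the old row image first.
    # PK_SK also covers tables with secondary keys; PK_ONLY limits the
    # optimization to tables that have only a primary key; OFF disables it.
    rocksdb_read_free_rpl = PK_SK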

[Read more]
Exposing MyRocks Internals via System Variables: Part 5, Data Reads

(In the previous post, Part 4, we covered Compression and Bloom Filters.)

In this blog post, we continue our series exploring MyRocks mechanics by looking at the configurable server variables and column family options. In our last post, I explained at a high level how compression and bloom filtering are applied to data files as they are initially flushed from immutable memtables and subsequently passed through the compaction process. With that covered, we now have a clear understanding of how data writing works in MyRocks and can start reviewing how data read requests are handled.

The Read Process

Let’s start off by talking about how read processes are handled at the file level. When a read request comes in, the first thing it needs to do is pull the …
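Whatever the exact lookup order, data blocks read from SST files are served through RocksDB’s block cache before disk is touched again, and that cache is sized with a single server variable. A small my.cnf sketch (the sizes are placeholders):

    [mysqld]
    # Shared LRU cache for uncompressed data blocks read from SST files,
    # roughly the MyRocks counterpart of the InnoDB buffer pool for reads.
    rocksdb_block_cache_size = 16G
    # Optionally keep index and bloom filter blocks in the same cache so
    # they are accounted for alongside data blocks.
    rocksdb_cache_index_and_filter_blocks = ON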

[Read more]
Hot Backup for MyRocks (RocksDB) Using Percona XtraBackup

Percona XtraBackup now supports hot backup for MyRocks! Yes, you heard that right: this is one of the most awaited XtraBackup features. It is enabled as of the latest release, Percona XtraBackup 8.0.6, and is supported only with Percona Server version 8.0.15-6 or higher; you can see the detailed release notes here.

MyRocks is getting a lot of attention now because of its much improved write capabilities and compression. We also plan to publish a detailed blog post on MyRocks features and limitations.

We shall proceed to test the backup and restore of MyRocks.

Environment:

OS: Debian GNU/Linux 9 …
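The test itself follows the usual XtraBackup backup/prepare/restore cycle; a minimal sketch with placeholder paths and credentials (the flags are standard XtraBackup 8.0 options, and MyRocks data is picked up by the supported versions noted above):

    # Take a hot backup of the running server.
    xtrabackup --backup --user=backup_user --password='***' --target-dir=/backups/full
    # Prepare the backup so it is consistent and ready to restore.
    xtrabackup --prepare --target-dir=/backups/full
    # Restore into an empty datadir on the target server, then fix ownership
    # (adjust the datadir path to match your configuration).
    xtrabackup --copy-back --target-dir=/backups/full
    chown -R mysql:mysql /var/lib/mysql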
[Read more]
CRUM conjecture - read, write, space and cache amplification

The RUM Conjecture asserts that an index structure can't be optimal for all of read, write and space. I will ignore whether optimal is about performance or efficiency (faster is better vs efficient-er is better). I want to use CRUM in place of RUM where C stands for database cache.

The C in CRUM is the amount of memory per key-value pair (or row) the DBMS needs so that either a point query or the first row from a range query can be retrieved with at most X storage reads. The C can also be reported as the minimal database : memory ratio to achieve at most X storage reads per point query.
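To make C concrete, here is a rough worked example (my own numbers, not from this post), for a clustered B-tree where all non-leaf levels are kept in memory:

    C ≈ (bytes per internal key+pointer entry) / (rows per leaf page)
      ≈ 32 / 100 ≈ 0.32 bytes of cache per row for at most 1 storage read per point query

An LSM that meets a similar target by caching bloom filters (about 10 bits per key) plus index blocks instead pays on the order of a couple of bytes per key, which is the kind of difference the C in CRUM is meant to expose.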

My points here are:

  • There are …
[Read more]
Exposing MyRocks Internals via System Variables: Part 3, Compaction

(In the previous post, Part 2, we covered Initial Data Flushing.)

In this blog post, we continue our series exploring MyRocks mechanics by looking at the configurable server variables and column family options. In our last post, I explained at a high level how data moves from immutable memtables to disk. In this post, we’re going to talk about what happens to that data as it moves through the compaction process.

What is Compaction?

One of the philosophies of MyRocks is “write the data quickly and sort out data organization later”, which is pretty far removed from engines like InnoDB that take the approach of “continuously organize data on disk so it’s optimal as soon as possible”. MyRocks implements its philosophy in a way that is heavily reliant on a process …
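The pace and shape of that background work are governed by column family options; a small my.cnf sketch with placeholder values (the option names are standard RocksDB, the numbers are only examples to tune):

    [mysqld]
    # Compact L0 once it accumulates 4 files, size L1 at 256MB, grow each
    # lower level by 10x, and aim for roughly 64MB SST files.
    rocksdb_default_cf_options=level0_file_num_compaction_trigger=4;max_bytes_for_level_base=256m;max_bytes_for_level_multiplier=10;target_file_size_base=64m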

[Read more]
Exposing MyRocks Internals via System Variables: Part 2, Initial Data Flushing

(In the previous post, Part 1, we covered Data Writing.)

In this blog post, we continue our series exploring MyRocks mechanics by looking at configurable server variables and column family options. In our last post, I explained at a high level how data first enters memory space, and in this post we’re going to pick up where we left off and talk about how the flush from immutable memtable to disk occurs. We’re also going to talk about how newly created secondary indexes on existing tables get written to disk.

We already know from our previous post in the series that a flush can be prompted by one of several events, the most common of which would be when an active memtable is filled to its maximum capacity and is rotated into immutable status.

When your immutable memtable(s) is ready …
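The size and count of the memtables that drive this rotation are themselves column family options; a brief my.cnf sketch with placeholder sizes (the option names are standard RocksDB):

    [mysqld]
    # Rotate the active memtable to immutable once it reaches 128MB, allow up
    # to 4 memtables per column family (1 active plus immutables awaiting
    # flush), and flush as soon as a single immutable memtable is ready.
    rocksdb_default_cf_options=write_buffer_size=128m;max_write_buffer_number=4;min_write_buffer_number_to_merge=1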

[Read more]
Exposing MyRocks Internals via System Variables: Part 1, Data Writing

Series Introduction

Back in 2016 I decided to write a blog series on InnoDB in hopes that it would help give a good description of the high level mechanics of the storage engine. The main motivating factor at that time was that I knew there was a lot of information out there about InnoDB, but a lot of it was ambiguous or even contradictory and I wanted to help make things a bit clearer if I could.

Now there’s a new storage engine that’s rising in popularity that I feel needs similar attention. Specifically MyRocks, the log-structured merge-driven RocksDB engine in MySQL. Given the amount of discussion in the community about MyRocks, I’m sure most of you already have some degree of familiarity, or at the very least have heard the name.

Now we’ve arrived at a point where this is no longer just a Facebook integration project and major players in the community like Maria and Percona have their own implemented …

[Read more]
Less "mark" in MySQL benchmarking

My goal for the year is more time learning math and less time running MySQL benchmarks. I haven't done serious benchmarks for more than 12 months. It was a great experience but I want to learn new things. MySQL 8.0.14 has been released with fixes for a serious bug I found via the insert benchmark. I won't confirm whether it has been fixed. I hope someone else does.

My tests and methodology are described in posts for sysbench, linkbench and the insert benchmark.  I hope the upstream distros (MySQL, MariaDB, Percona) repeat my tests and methodology and I am happy to answer questions about that. I even have inscrutable shell scripts that …

[Read more]
Optimal configurations for an LSM and more

I have been trying to solve the problem of finding an optimal LSM configuration for a given workload. The real problem is larger than that: finding the right index structure and the right configuration for a given workload. But my focus is RocksDB, so I will start by solving for an LSM.

This link is to slides that summarize my effort. I have expressed the cost to be minimized using differentiable functions. The cost functions have a mix of real- and integer-valued parameters whose values must be determined to minimize the cost. I have yet to solve the …
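For a flavor of the closed-form sub-results such a model can produce, here is a sketch of one well-known special case (not taken from the linked slides): for a leveled LSM with L levels and per-level fanouts f_1 … f_L whose product is fixed by the database-to-memtable size ratio T, write amplification is roughly proportional to the sum of the fanouts, which is minimized when the fanouts are equal:

    minimize  f_1 + f_2 + … + f_L   subject to  f_1 · f_2 · … · f_L = T
    optimum:  f_1 = … = f_L = T^(1/L),  giving write-amp ≈ L · T^(1/L)

Choosing the integer number of levels L then trades read and space amplification against write amplification, which is exactly the mix of real- and integer-valued parameters described above.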

[Read more]