Comparing MongoDB, MySQL, and TokuMX Data Layout

A lot is said about the differences in the data between MySQL and MongoDB. Things such as “MongoDB is document based”, “MySQL is relational”, “InnoDB has a clustering key”, etc.. Some may wonder how TokuDB, our MySQL storage engine, and TokuMX, our MongoDB product, fit in with these data layouts. I could not find anything describing the differences with a simple google search, so I figured I’d write a post explaining how things compare.

So who are the players here? With MySQL, users are likely familiar with two storage engines: MyISAM, the original default up until MySQL 5.5, and InnoDB, the current default since MySQL 5.5. MongoDB has only one storage engine, and we’ll refer to it as “vanilla Mongo storage”. And of course, there is TokuDB for MySQL, and TokuMX.

First, let’s get some quick terminology out of the way. Documents and collections in MongoDB can be thought of as rows and tables in MySQL, respectively. And while not identical, fields in MongoDB are similar to columns in MySQL. A full SQL to MongoDB mapping can be found here. When I refer to MySQL, what I say applies to TokuDB, InnoDB, and MyISAM. When I say MongoDB, what I say applies to TokuMX and vanilla Mongo storage.

What do the collections and tables look like?

Here we have the first major difference between MySQL and MongoDB. MySQL has schemas, whereas MongoDB is schemaless. Let’s look at an example. In MongoDB, we can do the following:

> db.createCollection("exampleColl")
{ "ok" : 1 }
> db.exampleColl.insert({ _id : 1 , a : 10 , aa : 100})
> db.exampleColl.insert({ _id : 2 , b : 20 , bb : 200 , bbb : 2000})
> db.exampleColl.find()
{ "_id" : 1, "a" : 10, "aa" : 100 }
{ "_id" : 2, "b" : 20, "bb" : 200, "bbb" : 2000 }

We created a collection “exampleColl” and inserted two documents into it. Note that the two documents have no fields in common except “_id” (we’ll get to why soon). This is how MongoDB is schema-less. Each document can have its own set of defined fields, and those fields may be any type. In practice, many documents in a collection have common fields, but that is an application choice as opposed to a strict rule of the collection.

This cannot be done with MySQL. In MySQL, one would have something that looks more like this:

mysql> create table foo (a int, aa int, aaa int) engine=TokuDB;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into foo values (1,10,100);
Query OK, 1 row affected (0.01 sec)

mysql> insert into foo values (2,20,200);
Query OK, 1 row affected (0.00 sec)

mysql> select * from foo;
+------+------+------+
| a    | aa   | aaa  |
+------+------+------+
|    1 |   10 |  100 |
|    2 |   20 |  200 |
+------+------+------+
2 rows in set (0.00 sec)

In MySQL, each row inserted must have a value for each column, or be NULL. So, if I want one row to have only the column “a”, I would do:

insert into foo values (10, NULL, NULL);

whereas in MongoDB, I would simply insert a document that does not have the field “aaa”:

db.exampleColl.insert({ _id : 3 , a : 10 })

I’m sure users notice an “_id” column in the MongoDB examples that don’t exist in the MySQL examples. I’ll get to that soon.

What is a row identifier?

All the storage engines discussed here save the entire document or row SOMEWHERE. Whether that location is in a Fractal Tree® index, B-tree, or flat file does not matter at this point. To find the entire row, the database must have some identifier to use. We call this a row identifier.

So, a row identifier is logically some information that the storage engine can use to retrieve the entire row. All the storage engines have one. We’ll dig into the specifics below, but they can be a (primary) key in a B-Tree or Fractal Tree, or an offset into a flat file, or theoretically something else.

How do secondary indexes work (note, not talking primary keys yet)?

Secondary indexes in MongoDB and MySQL are very similar. Secondary indexes declare fields or columns to be sorted separate from the rest of the data, and use row identifiers to reference the rest of the row for a query.

Let’s take a look at quick examples, with MySQL. Suppose we did the following:

mysql> create table foo (a int, b int, c int, key (a)) engine=TokuDB;
mysql> insert into foo values (1, 10, 100), (5, 50, 500), (3, 30, 300), (NULL, 40, 400);

Suppose abstractly, the row identifier for these four rows respectively are “id1”, “id2”, “id3″, and “id4”. This table will have two logical data structures associated with it: one for the entire data, and one for the secondary index. The main data store will have data like:

+--------+-------+------+------+
| row_id | a     | b    | c    |
+--------+-------+------+------+
|    id1 |     1 |   10 |  100 |
|    id2 |     5 |   50 |  500 |
|    id3 |     3 |   30 |  300 |
|    id4 |  NULL |   40 |  400 |
+--------+-------+------+------+

and the secondary index will look something like:

+-------+--------+
| a     | row_id |
+-------+--------+
|  NULL |    id4 |
|     1 |    id1 |
|     3 |    id3 |
|     5 |    id2 |
+-------+--------+

So, any query on “a” can use the row_id to retrieve the entire row.

A similar example in MongoDB. Suppose we had a collection “foo”, and did the following:

db.foo.ensureIndex({ a : 1 })
db.foo.insert([{ a : 1, b : 10, c : 100}, { a : 5, b : 50, c : 500},
               { a : 3, b : 30, c : 300 }, { b : 40, c : 400 }])

Similarly, the secondary will have the same structure:

+-------+--------+
| a     | row_id |
+-------+--------+
|  null |    id4 |
|     1 |    id1 |
|     3 |    id3 |
|     5 |    id2 |
+-------+--------+

The “null” signifies that the field “a” did not exist in the document associated with “id4”.

There are more differences between secondary indexes in MongoDB v. MySQL that I will not dive into here (such as declaring fields in an index to be in reverse order, or indexing arrays that produce multiple entries per document), but for the purposes of understanding how the data is laid out, this is sufficient.

How is data laid out?

Now that we understand these basic concepts, we can dig into the storage decisions made by the various engines.

What do TokuDB and InnoDB do?

Let’s look at TokuDB and InnoDB, which are very similar with respect to data layout. There are two cases: when the table has a primary key declared, and when the table has no primary key declared.

Let’s take the first case. Suppose we define a table as follows:

mysql> create table foo (a int, b int, c int, primary key (a), key (b)) engine=TokuDB;
mysql> insert into foo values (1,100,100),(2,2000,200),(3,30,300);

TokuDB and InnoDB both cluster the primary key. That is, the InnoDB B-Tree or TokuDB Fractal Tree index that stores the primary key also stores the entire row with the key. The row identifier used by secondary indexes is the primary key. So, in the example above, TokuDB and InnoDB will have a dictionary for the primary key with data sorted like this:

(1, 100, 100)
(2, 2000, 200)
(3, 30, 300)

and a secondary key for (b) with data sorted like this:

(30, 3)
(100, 1)
(2000, 2)

Note the secondary index stores the columns declared in the index followed by the primary key.

If the primary key does not exist, InnoDB will try to use a unique index as its clustered key and row identifier in other secondary indexes. If no such unique index exists, InnoDB will auto-generate a 6 byte primary key in monotonically increasing order. Some more details are here.

If the primary key does not exist in a TokuDB table, TokuDB will auto-generate an 8 byte primary key in monotonically increasing order. TokuDB never tries to use an existing unique index as its clustered key and row identifier.

What does MyISAM do?

MyISAM differs with TokuDB and InnoDB in one giant way here. MyISAM does not cluster its primary key. Instead, MyISAM stores its rows into a flat file, with the extension .MYD. Indexes are stored in B-Trees in a .MYI file. The offset into the .MYD file becomes the logical row identifier. The primary key, should it exist, can be thought of as just another unique secondary index for the table. Indexes (primary and secondary), use the offset to identify the entire row.

What does vanilla Mongo storage do?

Let’s insert some data into a collection:

> db.foo.insert([{ a : 1 }, { a : 10, b : 20 }])
> db.foo.insert({ _id : 1 , a : 100 })
> db.foo.ensureIndex({ a : 1 })
> db.foo.find()
{ "_id" : 1, "a" : 100 }
{ "_id" : ObjectId("51ec4d43c54a36b1fe085265"), "a" : 1 }
{ "_id" : ObjectId("51ec4d43c54a36b1fe085266"), "a" : 10, "b" : 20 }

The first thing to notice is that each MongoDB document MUST have an _id field (there are corner cases, such as capped collections where this is not true, but let’s disregard that for now). If the document inserted does not have an _id field, then an _id field is auto-generated, as described here.

Now let’s look at the indexes defined:

> db.system.indexes.find()
...
{ "v" : 1, "key" : { "_id" : 1 }, "unique" : true, "ns" : "test.foo", "name" : "_id_" }
{ "v" : 1,  "key" : { "a" : 1 }, "ns" : "test.foo", "name" : "a_1" }
...

Note that in addition to the defined index on “a”, MongoDB also auto-generates a unique index on the _id field. Therefore, the _id field MUST be unique.

To store the data, vanilla Mongo storage places all the documents into flat files. The offset and file becomes the logical row identifier. Secondary indexes are stored using B-Trees. Secondary indexes use the logical row identifier to identify the entire row. The “_id” index can be thought of as an auto-defined unique secondary index.

As one can see, there are similarities to MyISAM. One can think of the “_id” index in vanilla Mongo storage as the primary key in a MyISAM table. Additionally, note in the link above that the auto-generation of the key ensures that the “_id” field is “generally” increasing, as its first four bytes are the timestamp. So, there is a similarity with InnoDB’s and TokuDB’s auto-generated hidden primary key (or a user defined auto-increment key) in that newly generated “_id” fields are placed near the end of the dictionary, ensuring a right-most pattern of insertion.

What does TokuMX do?

TokuMX, following the MongoDB protocol, forces documents to have a unique “_id” field, and auto-generates the _id if the user did not specify one. An index on the “_id” exists. However, like TokuDB and InnoDB, TokuMX makes the “_id” index be a clustered index that contains the entire document. The _id field is used as the row identifier in secondary indexes.

Basically, with TokuMX, the “_id” index can be thought of as the primary key of a TokuDB table.

Summary

So, at a high level, here are the similarities and differences between MongoDB and MySQL.

MySQL forces rows to fit a schema, with some value for each column (including NULL), whereas MongoDB documents only forces a document to have an _id field. The rest of the document can be whatever the user pleases. TokuMX, TokuDB, and InnoDB cluster an index that can be thought of as a primary key, and use that primary key as a row identifier for secondary indexes. MyISAM and vanilla Mongo store the data in a file (or files), and use an offset into the file as the row identifier.