How MariaDB ColumnStore’s filenames work

Unlike most storage engines, MariaDB ColumnStore does not store its data files in the datadir. Instead these are stored in the Performance Modules in what appears to be a strange numbering system. In this post I will walk you through deciphering the number system.

If you are still using InfiniDB with MySQL, the system is exactly the same as outlined in this post, but the default path that the data is stored in will be a little different.

The default path for the data to be stored is /usr/local/mariadb/columnstore/data[dbRoot] where “dbRoot” is the DB root number selected when the ColumnStore system was configured.

From here onwards we are looking at directories with three-digit names ending in “.dir”. Every file is nested in a path similar to 000.dir/000.dir/003.dir/233.dir/000.dir/FILE000.cdf.
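As a minimal sketch, the shape of such a path can be pulled apart in Python. The pattern below is inferred only from the single example path above (five three-digit “.dir” levels followed by a FILEnnn.cdf file); the function name is hypothetical, and the code deliberately does not interpret what each number means.

```python
import re

# Assumption: layout inferred from the one example path in the text,
# i.e. five "NNN.dir" levels followed by "FILENNN.cdf".
PATH_RE = re.compile(r"^((?:\d{3}\.dir/){5})FILE(\d{3})\.cdf$")


def split_columnstore_path(relative_path):
    """Return (directory_numbers, file_number) for a path shaped like the example.

    Hypothetical helper: it only extracts the numbers, it does not decode them.
    """
    match = PATH_RE.match(relative_path)
    if match is None:
        raise ValueError(f"unexpected path layout: {relative_path!r}")
    # "000.dir/000.dir/..." -> ["000.dir", "000.dir", ...] -> [0, 0, ...]
    levels = match.group(1).rstrip("/").split("/")
    dir_numbers = [int(level[:3]) for level in levels]
    return dir_numbers, int(match.group(2))


example = "000.dir/000.dir/003.dir/233.dir/000.dir/FILE000.cdf"
dir_numbers, file_number = split_columnstore_path(example)
print(dir_numbers, file_number)
```

Paths that do not match the assumed layout raise a ValueError rather than being guessed at.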

Now, to understand this you first need to understand how ColumnStore’s storage works. …

Part of my history inside InfiniDB/ColumnStore

Several years ago there was a fork of the unreleased MySQL 6.0 called Drizzle. It was designed to be a lightweight, cloud/web/UTF8 first database server with a microkernel style core. I worked for a while as one of the core developers of Drizzle until the corporate sponsor I worked for ceased funding its development.

Fast-forward to 2016: I started working on MariaDB ColumnStore, and one of the biggest surprises to me was that it incorporated part of Drizzle! Specifically, it used the BSD-licensed MySQL/MariaDB-compatible client library called libdrizzle.

ColumnStore’s MariaDB plugin gets the entire query plan tree for a query and passes it on to its internal processes to break it up into parts that can be worked on in parallel. Since this doesn’t happen inside MariaDB server it needs a way to get at data that is not part of ColumnStore (such as …

Building BLOBs in MariaDB ColumnStore

My team and I are working on finalizing the feature set for MariaDB ColumnStore 1.1 right now and I wanted to take a bit of time to talk about one of the features I created for ColumnStore 1.1: BLOB/TEXT support.

For those who don’t know, MariaDB ColumnStore is a fork of InfiniDB which has been brought up to date by making it work with MariaDB 10.1 instead of MySQL 5.1, and it has many new features and bug fixes.

ColumnStore’s storage works by having columns of a fixed size of 1, 2, 4 or 8 bytes. These are then stored in 8KB blocks (everything in ColumnStore is accessed using logical block IDs) inside extents of ~8M rows. This is fine until you want to store some data that is longer than 8 bytes such as CHAR/VARCHAR.
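The arithmetic implied by these numbers can be sketched in Python. This is only a back-of-the-envelope illustration: it assumes “~8M rows” means exactly 8 × 2^20 rows and that a block holds nothing but column values (no header overhead), neither of which is stated above. The function names are hypothetical.

```python
BLOCK_SIZE_BYTES = 8 * 1024        # "8KB blocks" from the text
ROWS_PER_EXTENT = 8 * 1024 * 1024  # assumption: "~8M rows" taken as exactly 8 * 2**20
FIXED_WIDTHS = (1, 2, 4, 8)        # fixed column sizes in bytes, from the text


def rows_per_block(column_width):
    """Number of fixed-width values that fit in one block (assumes no block header)."""
    if column_width not in FIXED_WIDTHS:
        raise ValueError(f"not a fixed column width: {column_width}")
    return BLOCK_SIZE_BYTES // column_width


def blocks_per_extent(column_width):
    """Number of blocks needed to hold one extent of a column of this width."""
    return ROWS_PER_EXTENT // rows_per_block(column_width)


for width in FIXED_WIDTHS:
    print(width, rows_per_block(width), blocks_per_extent(width))
```

Under these assumptions a 1-byte column needs 1024 blocks per extent and an 8-byte column needs 8192, which shows why the column width matters for how much data a scan has to read.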

To solve this for columns greater than VARCHAR(7) and CHAR(8) …

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala.

This time I’ve tried InfiniDB for Hadoop (open-source version) on modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InfiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the MySQL server to execute queries. This allows for an easy “migration” of existing analytical tools. The results are quite interesting and promising.

Quick How-To

The InfiniDB documentation is not very clear on step-by-step instructions so I’ve created this quick guide:

  1. Install Hadoop cluster (minimum …

New MySQL & MariaDB Instructional Videos from SkySQL

Are you looking to expand your knowledge about MySQL and MariaDB database solutions?

Well, you’re in luck! SkySQL is introducing an exclusive collection of educational videos featuring some of the industry’s leading experts on the MySQL database and related technologies. View informative, technical talks on a variety of topics, from the experts at SkySQL, MariaDB, Calpont InfiniDB, Continuent, ScaleDB, Severalnines, Sphinx, Webyog, and others.

Vote for MySQL[plus] awards 2011!

First of all, I wish you a happy new year.
Many things happened last year, and it was really exciting to be involved in the MySQL ecosystem.
I hope this enthusiasm will grow this year; it's up to you!

To start the year, I propose the MySQL[plus] Awards 2011.
It will only take 5 minutes to fill out these polls.
Answer with your heart first and then with your experience with some of these tools or services.

Polls will be closed January 31, so vote now!
For “other” answers, please leave me a comment with details.

Don’t hesitate to submit proposals for tools or services in the comments.
And please share these polls!


Musing on NoSQL, damned! Can't get rid of InfiniDB

NoSQL has frequently been used for building analytic solutions. The big picture is to use scalable client code with map reduce to distribute a full data scan.

This approach is nothing new to the RDBMS world and can be considered an extension of Kimball ROLAP normalization, just allowing more CPU power on a single query.

NoSQL or sharding takes advantage of:

  • Distributed processing

NoSQL or sharding loses advantages such as:

  • Fast memory communication
  • Per-column data type optimization and deserialization cost (NoSQL)
  • C processing when reducing with a slower language (NoSQL)

There are more non-technical advantages in classic ROLAP normalization, like using the same well-known OLTP tools and a bug-free storage engine, all coming with GPL licences for reducing the …

451 CAOS Links 2010.11.05

Oracle increases MySQL pricing. Jono Bacon wants some respect. And more.

Follow 451 CAOS Links live @caostheory on Twitter and, and daily at
“Tracking the open source news wires, so you don’t have to.”

# Oracle increased the prices for MySQL and rejigged its editions.

# A good overview of the resulting MySQL pricing hubbub from @tiensoon

# SkySQL named first customers in open letter to Oracle MySQL customers.

# Actuate reported over $5.1m in BIRT-related business for Q3, up …

Calpont InfiniDB 2.0 and BI QuickStarts

The 2.0 release of Calpont InfiniDB is ready for download. New features for the columnar database storage engine for MySQL include data compression and fully parallelized & scalable UDFs, and partition drop has been added to the automatic vertical & horizontal data partitioning.

Benefits of InfiniDB Enterprise 2.0:

  • 20-50% query performance improvement when reading from disk
  • Distributed in-database calculations provide greater flexibility to the data analyst, and enable faster performance for deep analytics
  • Removing obsolete data from the database quickly frees up disk storage and improves query response

And for those of you new to data warehousing and business intelligence, there are QuickStarts for data …

Open Source BI -- Pentaho and Jaspersoft Part I

Hey DBAs! Are you seeking more efficient ways of sifting through your data to aid your business operations? Two popular Business Intelligence products with community Open Source software are Pentaho and JasperSoft. And both work with MySQL.

Both are easy to download and install. Both will use a JDBC connector to connect to MySQL. But how easy are the two to configure and run a simple report against a running instance of MySQL?

Setting up a JDBC connection with JasperSoft or Pentaho is pretty much like using any other JDBC connection.
The …

