Showing entries 1 to 10 of 63
10 Older Entries »
Displaying posts with tag: indexing
MySQL: a few observations on the JSON type

MySQL 5.7 comes with built-in JSON support, comprising two major features:

Despite being added rather recently (in MySQL 5.7.8, to be precise, one point release before the 5.7.9 GA version), I feel the JSON support so far looks rather useful. Improvements are certainly possible, but compared to, for example, XML support (added in 5.1 and 5.5), the JSON feature set added in 5.7.8 is reasonably complete, coherent and standards-compliant.

(We can of course also phrase …

[Read more]
Log Buffer #426: A Carnival of the Vanities for DBAs

This Log Buffer edition goes beyond the ordinary and loops in a few very good blog posts from Oracle, SQL Server and MySQL.


  • Variable selection, also known as feature or attribute selection, is an important technique for data mining and predictive analytics.
  • The Oracle Utilities SDK V4. has been released and is available from My Oracle Support for download.
  • This article provides a high level list of the new features that exist in HFM and details the changes/differences between HFM and previous releases.
  • In recent …
[Read more]
7 quick MySQL performance tips for the small business

We’ve heard lots in recent years about Big Data and the alternative models of data management and processing, like Hadoop and NoSQL. But truth be told, relational databases are still the workhorses of most of today’s small and medium sized businesses. Relational databases date back over 40 years, SQL skills are fairly common, and relational systems are known to be highly secure.


MySQL is the world’s second most popular relational database management system (RDBMS) and the most popular open-source one. It’s easily accessible and is widely known as part of the LAMP web development stack, where it is the ‘M’ alongside Linux, Apache, and PHP/Perl/Python. The fact that MySQL is free, easy to set up and quick to scale are some of the main reasons why it’s the best match for many SMBs.


[Read more]
Advanced JSON for MySQL: indexing and aggregation for highly complex JSON documents

What is JSON
JSON is a text-based, human-readable format for transmitting data between systems, for serializing objects, and for storing documents whose attributes or schema differ from one document to the next. Popular document store databases use JSON (and the related BSON) for storing and transmitting data.

Problems with JSON in MySQL
It is difficult to inter-operate between MySQL and MongoDB (or other document databases) because JSON has traditionally been very difficult to work with. Up until recently, JSON was just a TEXT document. I said up until recently, so what has changed? The biggest thing is that there are new JSON UDFs by Sveta Smirnova, which are part of the MySQL 5.7 Labs releases. Currently the JSON UDFs are up to version 0.0.4. While these new UDFs are a welcome addition to the MySQL database, they don't solve the really tough …

[Read more]
Introducing ‘MySQL 101,’ a 2-day intensive educational track at Percona Live this April 15-16

Talking with Percona Live attendees last year I heard a couple of common themes. First, people told me that there is a lot of great advanced content at Percona Live but there is not much for people just starting to learn the ropes with MySQL. Second, they would like us to find a way to make such basic content less expensive.

I’m pleased to say we’re able to accommodate both of these wishes this year at Percona Live! We have created a two-day intensive track called “MySQL 101” that runs April 15-16. MySQL 101 is designed for developers, system administrators and DBAs familiar with other databases but not with MySQL. And of course it’s ideal for anyone else who would like to expand their professional experience to include MySQL. The sessions are designed to lay a solid foundation on many aspects of MySQL development, design and …

[Read more]
Why Unique Indexes are Bad

Before creating a unique index in TokuMX or TokuDB, ask yourself, “does my application really depend on the database enforcing uniqueness of this key?” If the answer is ANYTHING other than yes, do not declare the index to be unique. Why? Because unique indexes may kill your write performance. In this post, I’ll explain why.

Unique indexes are a strange beast: they have no impact on standard databases that use B-Trees, such as MongoDB and MySQL, but may be horribly painful for databases that use write optimized data structures, like TokuMX’s Fractal Tree(R) indexes. How? …

[Read more]
Purging old rows with QueryScript: three use cases

Problem: you need to purge old rows from a table. This may be your weekly/monthly cleanup task. The table is large, the number of rows to be deleted is large, and doing so in one big DELETE is too heavy.

You can use oak-chunk-update or pt-archiver to accomplish the task. You can also use server side scripting with QueryScript, offering a very simple syntax with no external scripting, dependencies or command line options.
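
To make the idea of avoiding one big DELETE concrete, here is a minimal sketch of a chunked purge. It is not QueryScript, oak-chunk-update or pt-archiver; it only illustrates the general technique of deleting in small batches. Everything in it is an assumption made for illustration: the table name `events`, the column `ts`, the function `purge_in_chunks`, and the use of an in-memory SQLite database through Python's `sqlite3` module (chosen so the example runs without a MySQL server).

```python
import sqlite3


def purge_in_chunks(conn, cutoff, chunk_size=1000):
    """Delete rows older than `cutoff` in small batches; return rows deleted.

    `events` and `ts` are hypothetical names used only for this sketch.
    """
    total = 0
    while True:
        # Pick at most `chunk_size` old rows and delete just those.
        cur = conn.execute(
            "DELETE FROM events WHERE rowid IN ("
            "  SELECT rowid FROM events WHERE ts < ? LIMIT ?)",
            (cutoff, chunk_size),
        )
        conn.commit()  # commit per chunk so each transaction stays short
        if cur.rowcount == 0:
            break  # nothing left to purge
        total += cur.rowcount
    return total


# Demo on an in-memory table with 30 rows, one per day of January 2012.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts TEXT)")
conn.executemany(
    "INSERT INTO events (ts) VALUES (?)",
    [("2012-01-%02d 00:00:00" % day,) for day in range(1, 31)],
)
conn.commit()

deleted = purge_in_chunks(conn, "2012-01-21 00:00:00", chunk_size=7)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(deleted, remaining)
```

Committing after each batch is the design choice that matters here: no single transaction has to cover the whole purge, at the cost of the cleanup no longer being atomic.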

I wish to present three cases of row deletion, with three different solutions. In all cases we assume some TIMESTAMP column exists in the table, by which we …

[Read more]
532x Multikey Index Insertion Performance Increase for MongoDB with Fractal Tree Indexes

In my three previous MongoDB blogs I wrote about our implementation of Fractal Tree(R) indexes on MongoDB, showing a 10x insertion performance increase, a 268x query performance increase, and a comparison of covered indexes and clustered indexes. These benchmarks show the difference that rich and efficient indexing can make to your MongoDB workload.

Given the high performance of Fractal Tree Indexes, we’ve created a new benchmark to test our ability to handle indexing large …

[Read more]
My Talk Next Week at HighLoad++

Next week I’ll be visiting Moscow to talk at Highload++. The conference will take place on Monday 22nd and Tuesday 23rd at the Radisson hotel. I will be giving my personal version of an indexing talk that my colleagues have given in meetups and conferences in the US.

The Highload++ conference is aimed at the problems of complex, high-traffic web properties. Most of these sites depend on databases to deliver their content, record the traffic and report the application activities in real time. As I learned early in my career at MySQL, the database schema, and in particular the indexing strategy, is critical to achieving the highest possible performance from the database. I’ll be reviewing the basic strategies for defining the right indexes. I will also cover TokuDB’s Fractal Tree® and Cluster …

[Read more]
Looking for MongoDB users to test Fractal Tree Indexing

In my three previous blogs I wrote about our implementation of Fractal Tree Indexes on MongoDB, showing a 10x insertion performance increase, a 268x query performance increase, and a comparison of covered indexes and clustered indexes. The benchmarks show the difference that rich and efficient indexing can make to your MongoDB workload.

It’s one thing for us to benchmark MongoDB + TokuDB and another to measure real world performance. If you are looking for a way to improve the performance or scalability of your MongoDB deployment, we can help …

[Read more]