Showing entries 1 to 5
Displaying posts with tag: JOIN performance
A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 introduced a new join type, “Hash Joins”, which is an implementation of the classic block-based hash join algorithm. In this post we will see what the Hash Join is, how it works, and for which types of queries it is the right choice. I will show the results of benchmarks for different queries and explain them, so that you have a better understanding of when the Hash Join is the best choice and when it is not. Although Hash Joins have been available since MariaDB 5.3, I will be running my benchmarks on the newer MariaDB 5.5.

Overview

Hash Join is a new algorithm introduced in MariaDB 5.3/5.5 that can be used for joining tables that have equijoin conditions of the form tbl1.col1 = tbl2.col1, etc. As I mentioned above, what is actually implemented is the Classic Hash Join, but it is known as Block Nested Loop Hash (BNLH) Join in MariaDB.
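To make the idea of a hash join on an equijoin condition concrete, here is a minimal Python sketch. It is not MariaDB's implementation: it assumes both inputs fit in memory and does not model the join buffer blocks that BNLH works with. The function name `hash_join` and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equijoin two lists of dict rows on build_key = probe_key."""
    # Build phase: hash one input (ideally the smaller one) on its join column.
    buckets = defaultdict(list)
    for row in build_rows:
        if row[build_key] is None:
            continue  # like SQL NULL, a missing key never matches anything
        buckets[row[build_key]].append(row)

    # Probe phase: one hash lookup per row of the other input.
    joined = []
    for row in probe_rows:
        for match in buckets.get(row[probe_key], []):
            joined.append((match, row))
    return joined


# Hypothetical sample data for the condition tbl1.col1 = tbl2.col1
tbl1 = [{"col1": 1, "a": "x"}, {"col1": 2, "a": "y"}]
tbl2 = [{"col1": 2, "b": "p"}, {"col1": 3, "b": "q"}, {"col1": 2, "b": "r"}]
result = hash_join(tbl1, tbl2, "col1", "col1")
```

In this sketch each input is scanned once, which is why the technique only applies to equality conditions: a hash lookup cannot answer a range comparison.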
The Classic Hash Join Algorithm …

[Read more]
Join Optimizations in MySQL 5.6 and MariaDB 5.5

This is the third post in the series leading up to the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. This post covers the join-related optimizations introduced in the optimizer. These optimizations are available in both MySQL 5.6 and MariaDB 5.5; MariaDB 5.5 has also introduced some additional optimizations, which we will look at in this post as well.

Now let me briefly explain these optimizations.

Batched Key Access

Traditionally, MySQL always uses Nested Loop Join to join two or more tables. This means that the selected rows from the first table participating in the join are read, and then for each of these rows an index lookup is performed on the second table. This means many point queries; for example, if table1 yields 1000 …
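The nested loop behaviour described above can be sketched in Python as follows. This is an illustration only, not MySQL's code: a plain dict stands in for the index on the second table, and the names `nested_loop_join`, `inner_index` and the sample rows are hypothetical.

```python
def nested_loop_join(outer_rows, inner_index, outer_key):
    """Join outer_rows to an indexed inner table, one point lookup per outer row."""
    lookups = 0
    joined = []
    for row in outer_rows:
        lookups += 1  # each outer row triggers its own index lookup
        for match in inner_index.get(row[outer_key], []):
            joined.append((row, match))
    return joined, lookups


# Hypothetical data: the dict maps a key value to the matching inner rows.
outer = [{"id": 10}, {"id": 20}, {"id": 30}]
inner_index = {10: [{"id": 10, "v": "a"}], 30: [{"id": 30, "v": "c"}]}
joined_rows, lookup_count = nested_loop_join(outer, inner_index, "id")
```

The lookup counter grows with the number of rows read from the first table, which is the cost that batching key lookups aims to reduce.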

[Read more]
Joining on range? Wrong!

The problem I am going to describe has likely been around since the very beginning of MySQL; however, unless you carefully analyse and profile your queries, it might easily go unnoticed. I used it as one of the examples in our talk at the phpDay.it conference last week to demonstrate some pitfalls one may hit when designing schemas and queries, and then I thought it would be a good idea to publish it on the blog as well.

To demonstrate the issue let’s use a typical example – a sales query. Our data is a tiny store directory consisting of three very simple tables:

SQL:

  CREATE TABLE `products` (
    `prd_id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `prd_name` varchar(32) NOT NULL,
    PRIMARY KEY (`prd_id`),
    KEY …
[Read more]
JOIN Performance & Charsets

We have written before about the importance of using numeric types as keys, but maybe you've inherited a schema that you can't change or have chosen string types as keys for a specific reason. Either way, the character sets used on joined columns can have a significant impact on the performance of your queries.

Take the following example, using the InnoDB storage engine:

SQL:

  CREATE TABLE `t1` (
    `char_id` char(6) NOT NULL,
    `v` varchar(128) NOT NULL,
    PRIMARY KEY (`char_id`)
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

  CREATE TABLE `t2` (
    `id` int UNSIGNED NOT NULL AUTO_INCREMENT,
[Read more]
Performance of JOIN using columns of different numeric types

I know it's been a very long time since I posted anything, but I felt the itch today. A question came up earlier this week at work: How much is JOIN performance affected when using columns that are not the exact same data type?

This is important for me because entity-attribute-value tables require a lot of self-joins. Let me start by saying that we mitigate one of the common drawbacks to EAVs - mashing diverse data types into a single column - by separating numeric, date, and character data into different tables. However, we mashed a lot of integer data into a DECIMAL(13,4) column right alongside our financial data. I recently noticed that most of the data in this EAV table has no fractional part, and to determine whether it would be worth moving it all into another table - as well as determine what column type to use - I spent an afternoon running …

[Read more]