Showing entries 1 to 3
Displaying posts with tag: parallel query (reset)
Now available in swanhart-tools: NATIVE asynchronous query execution for any MySQL client!

There is often a need to run queries in the background in MySQL. This generally is accomplished using a message queue (like gearman), or by using extensions to a client (PHP has such extensions) or by using MariaDB's native async query interface (C client only I think).

While such solutions work well, they don't work for say a GO client, or for the mysql command line client itself.

I present "async"; part of the Swanhart Toolkit ( Async is a stored procedure and event based solution for asynchronous queries.

It consists of:

  • A queue table to hold the SQL to run, the state of execution, error messages, etc
  • A settings table that controls the number of parallel threads to use for executing queries
  • A stored routine …
[Read more]
Big Data: InfiniDB vs Spider: What else ?

Many of my recent engagements have been all around strategy to implement Real Time Big Data Analytics: Computing hardware cost of extending a single table collection with MariaDB and Parallel Query found in the Spider storage engine to offload columnar MPP storage like InfiniDB or Vertica.

As of today Parallel Query is only available from releases of MariaDB Spider supported by spiral arms. The more efficient way to use parallel query with Spider can be done on group by, and count queries that use a single spider table. In such case Spider Engine will execute query push down AKA map reduce.

Spider gets multiple levels of parallel execution for a single partitioned tables.

First level is per backend server:
The way to actually tell spider to scan different backends in concurrency is to set  spider_sts_bg_mode=1

Other level is per …

[Read more]
Parallel Query for MySQL with Shard-Query

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the partition hint, Shard-Query can add parallelism to any partitioning method (even subpartioning) on 5.6 but it is limited to RANGE/LIST partitioning methods on early versions.

The output from Shard-Query is from the commandline client, but you can use MySQL proxy to communicate with Shard-Query too.

In the examples I am going to use the schema from the Star Schema Benchmark.  I generated data for scale factor 10, which means about 6GB of data in the largest table. I am going to show a few different queries, and …

[Read more]
Showing entries 1 to 3