Planet MySQL Planet MySQL: Meta Deutsch Español Français Italiano 日本語 Русский Português 中文
Showing entries 1 to 5

Displaying posts with tag: shard-query (reset)

Parallel Query for MySQL with Shard-Query
+0 Vote Up -0Vote Down

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the partition hint, Shard-Query can add parallelism to any partitioning method (even subpartioning) on 5.6 but it is limited to RANGE/LIST partitioning methods on early versions.

The output from Shard-Query is from the commandline client, but you can use MySQL proxy to communicate with Shard-Query too.

In the examples I am going to use the schema from the Star Schema Benchmark.  I generated data for scale factor 10, which means about 6GB of data in the largest

  [Read more...]
Shard-Query supports background jobs, query parallelism, and all SELECT syntax
+0 Vote Up -0Vote Down

SkySQL just blogged about a tool to schedule long running MySQL jobs, prevent too many queries from running simultaneously, and stores the results in tables.  It even uses Gearman.  You will note that the article says that it uses PAQU, which uses Shard-Query.

I think PAQU was created for two reasons.  A) Shard-Query lacked support for fast aggregation of STDDEV and VARIANCE (this has been fixed), and B) their data set requires “cross-shard queries”.  From what I can see though, their type of cross-shard queries can be solved using subqueries in the FROM clause using Shard-Query, instead of using a customized (forked) version of Shard-Query.  It is unfortunate, because my recent improvements to Shard-Query have to be ported into PAQU by the PAQU authors.

I’d like to encourage you to look at Shard-Query

  [Read more...]
Shard-Query 2.0 Beta 1 released
+4 Vote Up -0Vote Down

It is finally here.  After three years of development, the new version of Shard-Query is finally available for broad testing.

This new version of Shard-Query is vastly improved over previous versions in many ways.  This is in large part due to the fact that the previous version of Shard-Query (version 1.1) entered into production at a large company.  Their feedback during implementation was invaluable in building the new Shard-Query features.   The great thing is that this means that many of the new 2.0 features have already been tested in at least one production environment.

This post is intended to highlight the new features in Shard-Query 2.0.  I will be making posts about individual features as well as posting benchmark results.

  [Read more...]
Shard-Query EC2 images available
+1 Vote Up -0Vote Down

Infobright and InnoDB AMI images are now available

There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here on out. Shard-Query is very flexible in it’s configuration, so you can use this sample database to spread processing over up to 20 nodes.

The Infobright Community Edition (ICE) images are available in 32 and 64 bit varieties. Due to memory requirements, the InnoDB versions are only available on 64 bit instances. MySQL will fail to start on a micro instance, simply decrease the values in the /etc/my.cnf file if you really want to try micro instances.

  [Read more...]
Intra-query parallelism for MySQL queries without an appliance or closed source database
+2 Vote Up -0Vote Down
*edit* I want to point out that this test was done on a single database server which used MySQL partitioning. This is a demonstration of how Shard-Query can improve performance in non-sharded databases too.*edit*.

Over the weekend I spent a lot of time improving my new Shard-Query tool (code.google.com/p/shard-query) and the improvements can equate to big performance gains on partitioned data sets versus executing the query directly on MySQL.


I'll explain this graph below, but lower is better (response time) and Shard-Query is the red line.

MySQL understands that queries which access data in only certain partitions don't have to read the rest of the table. This partition






  [Read more...]
Showing entries 1 to 5

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.