I created a new tool this week:
As the name Shard-Query suggests, the goal of the tool is to run a query over multiple shards and return the combined results as a single, unified result set. It uses Gearman to ask each server for a set of rows and then runs the query over the combined set. This isn't a new idea; however, Shard-Query differs from other Gearman examples I've seen because it supports aggregation.
It does this by performing some basic rewriting of the input query.
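To make the general idea concrete, here is a minimal sketch of two-step aggregation across shards. It is not Shard-Query's code and does not use Gearman: the "shards" are hypothetical in-memory SQLite databases, the helper names (make_shard, sharded_aggregate) and the partials table are made up for illustration, and the two SQL statements are my assumption of what such a rewrite could look like for a simplified single-table query, not the tool's actual output.

```python
import sqlite3

# Hypothetical stand-ins for shards: each one is an in-memory SQLite database.
# (The real tool talks to separate servers through Gearman; none of that is modeled here.)
def make_shard(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER, c2 INTEGER)")
    conn.executemany("INSERT INTO t1 (c1, c2) VALUES (?, ?)", rows)
    return conn

shards = [
    make_shard([(1, 98818), (5, 98818), (7, 11111)]),
    make_shard([(10, 98818)]),
    make_shard([(3, 98818), (4, 22222)]),
]

# Step 1 (assumed form): each shard computes partial aggregates over its own rows.
SHARD_SQL = """
    SELECT c2, SUM(c1) AS sum_c1, MAX(c1) AS max_c1
    FROM t1
    WHERE c2 = ?
    GROUP BY c2
"""

# Step 2 (assumed form): the partial results are aggregated again.
# A sum of per-shard sums is the overall sum; a max of per-shard maxes is the overall max.
FINAL_SQL = """
    SELECT c2, SUM(sum_c1), MAX(max_c1)
    FROM partials
    GROUP BY c2
"""

def sharded_aggregate(shards, c2_value):
    # The coordinator collects every shard's partial rows into one scratch table.
    coordinator = sqlite3.connect(":memory:")
    coordinator.execute(
        "CREATE TABLE partials (c2 INTEGER, sum_c1 INTEGER, max_c1 INTEGER)"
    )
    for shard in shards:
        rows = shard.execute(SHARD_SQL, (c2_value,)).fetchall()
        coordinator.executemany("INSERT INTO partials VALUES (?, ?, ?)", rows)
    return coordinator.execute(FINAL_SQL).fetchall()

result = sharded_aggregate(shards, 98818)
print(result)  # [(98818, 19, 10)]
```

The point of the sketch is that SUM and MAX can be computed per shard and then combined without changing the answer. Other aggregates need more care; an average, for example, cannot be obtained by averaging per-shard averages.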
Take this query for example:
select c2, sum(s0.c1), max(c1) from t1 as s0 join t1 using (c1,c2) where c2 = 98818 group by c2;
The tool will split this up into two queries.
This first query will be sent to each shard. Notice that …