Showing entries 1 to 5
Displaying posts with tag: Hints and tips
SQL injection in the MySQL server (of the proxy kind!)

As work on WarpSQL (Shard-Query 3) progresses, it has outgrown MySQL Proxy.  MySQL Proxy is a very useful tool, but it requires Lua scripting, and it is an external daemon that needs to be maintained.  The MySQL Proxy module for Shard-Query works well, but to make WarpSQL into a real distributed transaction coordinator, moving the proxy logic inside the server makes more sense.

The main benefit of MySQL Proxy is that it allows a script to “inject” queries between the client and server, intercepting the results and possibly sending back new results to the client.  I would like similar functionality, but inside the server.

For example, I would like to implement new SHOW commands, and these commands do not need to be implemented as actual MySQL SHOW commands under the covers.

For example, for this blog post I made a new example command called “SHOW PASSWORD

[Read more]
Access Shard-Query with the MySQL client without using MySQL proxy

One of the great features of Shard-Query is the ability to use MySQL Proxy to access resultsets transparently. While this is a great tool, many people have expressed reservations about using MySQL Proxy, an alpha component, in their production environment.

I recognize that this is a valid concern, and have implemented an alternate method of retrieving resultsets directly in the MySQL client, without using a proxy. This means that any node can easily act as the “head” node without any extra daemon, instead of having to run many proxies.

The sq_helper() routine has been checked into the git repository and is available now.

The function takes a few parameters:

  • sql to run
  • shard-query schema name (empty string or null for default schema)
  • schema to store temp table in
  • temp table name (where results are sent to)
  • return result (boolean, 1 returns …
[Read more]
Shard-Query loader gets a facelift and now Amazon S3 support too

Shard-Query (source) now supports the MySQL “LOAD DATA INFILE” command.

When you use LOAD DATA LOCAL INFILE, a single-threaded load from the current process is performed.  You can specify a path to a file anywhere readable by the PHP script.  This allows loading without using the Gearman workers and without using a shared filesystem.

If you do not specify LOCAL, then the Gearman-based loader is used.  You must not specify a path to the file when you omit the LOCAL keyword, because the shared path will be prepended to the filename automatically.  The shared path must be on a shared or network filesystem (NFS, CIFS, etc.), and the files to be loaded must be placed there for the Gearman-based loader to work.  This is because workers may run on multiple nodes and …

[Read more]
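
The path rule described in this excerpt can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not Shard-Query's actual loader code (which is PHP); the function name `resolve_load_path` and the `shared_path` value are hypothetical.

```python
import posixpath


def resolve_load_path(filename: str, local: bool, shared_path: str) -> str:
    """Return the file path a loader following the rules above would read.

    Hypothetical helper for illustration only; not part of Shard-Query.
    """
    if local:
        # LOCAL: the path is used as given and only needs to be
        # readable by the process performing the single-threaded load.
        return filename
    # Without LOCAL: only a bare filename is allowed, because the
    # shared path is prepended automatically.
    if posixpath.basename(filename) != filename:
        raise ValueError("do not specify a path when LOCAL is omitted")
    return posixpath.join(shared_path, filename)
```

With an assumed shared path of `/mnt/shared`, a non-LOCAL load of `data.csv` resolves to a file that every worker node can see, while a LOCAL load keeps whatever path the caller supplied.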
Shard-Query supports background jobs, query parallelism, and all SELECT syntax

SkySQL just blogged about a tool that schedules long-running MySQL jobs, prevents too many queries from running simultaneously, and stores the results in tables.  It even uses Gearman.  You will note that the article says that it uses PAQU, which uses Shard-Query.

I think PAQU was created for two reasons.  A) Shard-Query lacked support for fast aggregation of STDDEV and VARIANCE (this has been fixed), and B) their data set requires “cross-shard queries”.  From what I can see though, their type of cross-shard queries can be solved using subqueries in the FROM clause using Shard-Query, instead of using a customized (forked) version of Shard-Query.  It is unfortunate, because my recent improvements to Shard-Query have to be ported into PAQU by the PAQU authors.
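
To show why STDDEV and VARIANCE can be aggregated quickly across shards, here is a minimal Python sketch of one common approach: each shard returns a count, a sum, and a sum of squares, and the coordinator combines them. This is an assumption about the general technique, not a description of how Shard-Query implements it; the function names are hypothetical. It computes the population variance, which is what MySQL's VARIANCE() and STDDEV() return.

```python
import math


def shard_partial(values):
    """Per-shard partial aggregate: (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))


def combine_partials(partials):
    """Combine per-shard partials into population variance and stddev."""
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    # E[x^2] - (E[x])^2; clamp tiny negative values from float rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return variance, math.sqrt(variance)


# Three hypothetical shards holding parts of the same column.
shards = [[1, 2, 3], [4, 5], [6]]
variance, stddev = combine_partials([shard_partial(s) for s in shards])
```

Only three numbers per shard cross the network, regardless of how many rows each shard holds. The sum-of-squares form can lose precision on large values with small spread, so a production implementation may prefer a more numerically stable merge.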

I’d like to encourage you to look at Shard-Query if you need to run complex jobs in the background and get the results later.  As a bonus, you …

[Read more]
Tips for working with append-only databases using sharding and log structured tables

This post is structured like a series of questions and answers in a conversation.  I recently had a number of conversations that all pretty much went this same way.  If you, like others, have many basic questions about how to proceed when faced with an append-only store for the first time, well then hopefully this post will help provide some answers for you.  The post focuses on column stores, the most common append-only store, but there are others.

Why do I want to use a column store? Column stores are optimal for OLAP analysis

Column stores offer substantial performance increases for OLAP compared to row stores.  Row stores are optimized for OLTP workloads.  While a row store can be used for OLAP, it may not perform well because a row store has to retrieve every column for a row (unless there is a covering index).  This is one of the reasons that I’ve said that covering index allows you …

[Read more]
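
The row-store versus column-store point in this excerpt can be illustrated with a toy Python model. The table, column names, and "values touched" counter are made up for the example; real storage engines work on pages and compressed blocks, so treat this as a sketch of the idea only.

```python
# A tiny hypothetical table: (id, region, amount).
rows = [
    (1, "east", 10.0),
    (2, "west", 12.5),
    (3, "east", 7.5),
]

# The same data laid out column by column.
columns = {
    "id": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_amount_row_store(table_rows):
    """Sum one column from a row layout; every column of each row is read."""
    touched = 0
    total = 0.0
    for row in table_rows:
        touched += len(row)  # the whole row is retrieved
        total += row[2]
    return total, touched


def sum_amount_column_store(table_columns):
    """Sum one column from a column layout; only that column is read."""
    amounts = table_columns["amount"]
    return sum(amounts), len(amounts)
```

Both functions return the same total, but the row layout touches three values per row while the column layout touches one, and the gap grows with the number of columns in the table.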