In one of the last commits I added a SQL Tokenizer which understands the basic tokens of (My)SQL:
- Strings
- Literals
- Numbers
- Comments
- Operators
With this basic understanding we can normalize queries and build statistics over similar queries.
The idea is simple and already implemented in mysqldumpslow:
/* login.php:37 */SELECT * FROM tbl WHERE id = 1243 AND name = "jan"
is turned into
SELECT * FROM `tbl` WHERE `id` = ? AND `name` = ?
The queries now look like prepared statements and can be used to characterize queries of the same kind.
- comments are removed
- whitespace is stripped
- literals (identifiers such as table and column names) are quoted with backticks
- constants are replaced with ?
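As a rough illustration of these four steps, here is a minimal Python sketch. It is not the tokenizer from the commit: the token patterns, the small KEYWORDS set, and the function names normalize and query_stats are all assumptions made for this example, and uppercasing keywords is an extra choice of the sketch so that differently cased queries land in the same group.

```python
import re
from collections import Counter

# Hypothetical, deliberately incomplete keyword list (assumption of this sketch).
KEYWORDS = {
    "SELECT", "FROM", "WHERE", "AND", "OR", "NOT", "IN", "IS", "NULL", "LIKE",
    "ORDER", "BY", "GROUP", "LIMIT", "INSERT", "INTO", "VALUES", "UPDATE",
    "SET", "DELETE", "JOIN", "ON", "AS",
}

# One named group per token class; alternatives are tried left to right.
TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/|--[ \t][^\n]*|\#[^\n]*)
  | (?P<string>"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<quoted>`[^`]*`)
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<space>\s+)
  | (?P<op><=>|<>|<=|>=|!=|:=|\|\||&&|<<|>>|.)
""", re.VERBOSE | re.DOTALL)


def normalize(sql):
    """Return a normalized form of sql for grouping similar queries."""
    parts = []
    for match in TOKEN_RE.finditer(sql):
        kind, text = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue                      # comments and whitespace are dropped
        if kind in ("string", "number"):
            parts.append("?")             # constants become placeholders
        elif kind == "word":
            upper = text.upper()
            # Keywords are uppercased, every other word is treated as an
            # identifier and quoted with backticks.
            parts.append(upper if upper in KEYWORDS else "`" + text + "`")
        else:
            parts.append(text)            # operators and quoted identifiers
    # Joining with a single space makes the result independent of the
    # original spacing, e.g. "id=1" and "id = 1" normalize identically.
    return " ".join(parts)


def query_stats(queries):
    """Count how often each normalized query shape occurs."""
    return Counter(normalize(q) for q in queries)


if __name__ == "__main__":
    print(normalize('/* login.php:37 */SELECT * FROM tbl '
                    'WHERE id = 1243 AND name = "jan"'))
    # prints: SELECT * FROM `tbl` WHERE `id` = ? AND `name` = ?
```

Because every token is separated by exactly one space, function calls come out as `count` ( * ) rather than count(*). That is ugly but consistent, which is all the grouping needs; a fuller version would need a complete keyword and function list.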
Taking the famous world-db and executing some simple queries like:
root@127.0.0.1:4040 …