Abdel-Mawla Gharieb: Impact of General Query Log on MySQL Performance

Sometimes, it is required to enable the general query log (which is disabled by default). If the general query log is enabled the server writes to this log information when clients connect or disconnect, and each SQL statement received from the client.

The question is, does enabling the general query log affects the MySQL performance ?
Also, it is possible to record the output of this log into either file or table in the mysql database (mysql.general_log), what is the performance impact of each one?

Let's do some simple benchmark for those scenarios to measure the actual impact on the mysql performance.

System Information:

HW configurations:

Software configurations:

  • OS: Ubuntu 12.04
  • MySQL Server: 5.6.17
  • Sysbench: 0.4.12

Test Information:

  • Sysbench command: sysbench --num-threads=1 --max-requests=1000 --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-user=root run .
  • Table structure which was used by sysbench tests:

    mysql> show create table sbtest.sbtest\G
    
    CREATE TABLE `sbtest` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `k` int(10) unsigned NOT NULL DEFAULT '0',
      `c` char(120) NOT NULL DEFAULT '',
      `pad` char(60) NOT NULL DEFAULT '',
      PRIMARY KEY (`id`),
      KEY `k` (`k`)
    ) ENGINE=InnoDB AUTO_INCREMENT=8574 DEFAULT CHARSET=latin1


Note:

The test had been made against 1,2,4,8,16 and 32 threads, each throughput/response time value for each thread's number for each test case is generated by the average of ten (10) times execution.

General log Disabled:

To make sure that the general query log is disabled:

mysql> show global variables like'general_log';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| general_log   | OFF   |
+---------------+-------+


Testing results:

General log Enabled:

The general query log is a dynamic variable which means that it could be enabled or disabled ONLINE without restarting MySQL (since MySQL 5.1):

mysql> SET GLOBAL general_log=ON;


We can choose the output for this log to be either in a log file (by default) or in a MySQL table (mysql.general_log).

What are the benefits that we might get if we have the log output to be stored in a table not in a file?

  • We can use the normal SQL statements to access the log contents to get information about specific criteria (e.g. using WHERE condition) and this is a little harder when dealing with files.
  • The log contents could be accessed remotely if someone can connect to the MySQL server.
  • Standard format for the log entries.
  • If the CSV engine is used for the log table, it will be easy to import the CSV file into spreadsheets.
  • It is easy to expire the logs by simply TRUNCATE the log table.
  • Log rotation is possible by using RENAME TABLE statement.
  • Log entries are not replicated to the slave because they are not written to the binary logs.
  • The mysqldump does not include the log tables contents (general_log or slow_log) in the backup even if --all-databases backup option is used.

So, let's check the performance impact then of each log output.

Output is FILE:

To check the output destination of the general log, the following command should be used:

mysql> show global variables like'log_output';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_output    | FILE  |
+---------------+-------+


Testing results:

Output is TABLE (CSV table):

To change the output destination of the general log from file to table (CSV by default), the following command should be used:

mysql> SET GLOBAL log_output='TABLE';
mysql> show global variables like'log_output';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_output    | TABLE |
+---------------+-------+


Testing results:

Output is TABLE (MyISAM table):

Maybe due to the nature of the CSV storage ENGINE we faced performance issues in the previous case. Is it possible to change the table engine of the general_log table?

The answer is yes, but unfortunately, we are restricted to use only MyISAM storage engine and no engines other than CSV or MyISAM are allowed. Check this link for more information.

To alter the log table, you must first disable the logging:

mysql> alter table mysql.general_log engine=MYISAM;
ERROR 1580 (HY000): You cannot 'ALTER' a log table if logging is enabled
mysql> SET GLOBAL general_log=OFF;

mysql> alter table mysql.general_log engine=MYISAM;

mysql> SET GLOBAL general_log=ON;


Testing results:

Output is TABLE (MyISAM table with some structures changes):

In general, to make any SQL query work faster, we might need to optimize the table structure, add indexes, re-write the query, .. etc.

The following is the general log table structure:

mysql> show create table mysql.general_log\G

CREATE TABLE `general_log` (
  `event_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `user_host` mediumtext NOT NULL,
  `thread_id` bigint(21) unsigned NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `command_type` varchar(64) NOT NULL,
  `argument` mediumtext NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='General log'


Let's check what can we do to optimize the general_log table structure (other suggestions are welcomed):

  • Can we create partitions on that table which might boost our search?
    Although this is a normal MyISAM table but partitioning IS NOT allowed for the log tables (by the way, partitioning is not permitted also for CSV tables).
  • Can we change the datatype of the `user_host` column from mediumtext to e.g. varchar(100)? (The max length of this column's data on my machine didn't exceed 50 characters)
    While it is - syntax wise - accepted but no logs will be stored in the table after that and the following error will be printed in the error log file:
        2014-03-06 18:44:21 6987 [ERROR] Failed to write to mysql.general_log:
    2014-03-06 18:44:23 6987 [ERROR] Incorrect definition of table mysql.general_log: expected column 'user_host' at position 1 to have type mediumtext, found type varchar(100).

  • What about creating indexes on the columns we will use for most of our searches (`event_time` and `argument`)? lets try that!
    • Add index on `event_time` column:
      mysql> SET GLOBAL general_log=OFF;
      
      mysql> alter table mysql.general_log add index ev_tm_idx(`event_time`);
      
      mysql> show create table mysql.general_log\G
      
      CREATE TABLE `general_log` (
        `event_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        `user_host` mediumtext NOT NULL,
        `thread_id` bigint(21) unsigned NOT NULL,
        `server_id` int(10) unsigned NOT NULL,
        `command_type` varchar(64) NOT NULL,
        `argument` mediumtext NOT NULL,
        KEY `ev_tm_idx` (`event_time`)
      ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='General log'
      
      mysql> SET GLOBAL general_log=ON;

      Testing results:

    • Add FULLTEXT index on `argument` column:
      mysql> SET GLOBAL general_log=OFF;
      
      mysql> alter table mysql.general_log add fulltext index (`argument`);
      
      mysql> show create table mysql.general_log\G
      
      CREATE TABLE `general_log` (
        `event_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        `user_host` mediumtext NOT NULL,
        `thread_id` bigint(21) unsigned NOT NULL,
        `server_id` int(10) unsigned NOT NULL,
        `command_type` varchar(64) NOT NULL,
        `argument` mediumtext NOT NULL,
        KEY `ev_tm_idx` (`event_time`),
        FULLTEXT KEY `argument` (`argument`)
      ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='General log'
      
      mysql> SET GLOBAL general_log=ON;

      Testing results:


To make it more clear, the following is the combination of all results in one graph followed by response time comparison:





The raw results in Transactions / Sec might be useful:


Threads 1 2 4 8 16 32
General log disabled 383.996 814.759 1421.288 1674.733 1414.985 1071.189
General log enabled (File) 281.642 521.39 1230.743 1406.127 1095.896 923.986
General log enabled (CSV Table) 231.659 447.173 787.578 507.846 426.324 439.992
General log enabled (MyISAM Table) 249.47 536.379 933.304 532.912 476.454 454.015
General log enabled (MyISAM Table + index) 238.508 430.05 875.209 465.464 465.464 395.063
General log enabled (MyISAM Table + Fulltext index) 157.436 236.156 210.968 212.273 218.617 220.701


Conclusion:

  • The best MySQL performance - among all above test cases - could be achieved by disabling the general query log, e.g. if we compared the above results for the case 4 concurrent threads (the highest value for most scenarios) we would find that:
    • Using the general log enabled (general_log = ON) and the log destination is file (log_output = FILE) decreased the throughput by 13.4% and increased the response time by 17.5%.
    • Using the general log enabled and the log destination is CSV table decreased the throughput by 44.6% and increased the response time by 90%.
    • Using the general log enabled and the log destination is MyISAM table decreased the throughput by 34.3% and increased the response time by 59%.
    • Using the general log enabled and the log destination is MyISAM having an index added on column `event_time` decreased the throughput by 38.4% and increased the response time by 73%.
    • Using the general log enabled and the log destination is MyISAM having an index added on column `event_time` and FULLTEXT index added on column `argument` decreased the throughput by 85% and increased the response time by 542%.
  • Although using table as a log output destination has many benefits - as described above - but it has more negative impact on the MySQL performance as compared to log file.
  • Increasing the number of concurrently running threads - in the case of log_output=TABLE - will increase the general_log table contention which is controlled by the table locking level for MyISAM or CSV ENGINES.
  • Like any other MySQL table - as many rows inserted in the log table as more negative performance impact.
  • Finally, it is preferred to only enable the general query log when it is really needed and it is not recommended to enable it in a production system. It could be enabled (dynamically) for a while then should be disabled again once we have got what we are searching for.