Innotop: A real-time, advanced investigation tool for MySQL

GUI monitoring tools for MySQL are not always suitable for all our needs or situations. Most of them are designed to provide historical views into what happens to our database over time rather then real-time insight into current MySQL server status. Excellent free tools for this include Cacti, Zabbix, Ganglia, Nagios, etc. But each of them needs to be properly configured to provide details on what is going on in our MySQL instances. And setting up one of these monitoring solutions is neither quick nor trivial (well, maybe with the exception of Ganglia).

MySQL Workbench provides a substitute for real-time MySQL status views, though it is very limited. It also needs to be installed on a workstation with a graphical environment.

The fact is that you won’t find many text-mode tools of this kind for MySQL. The most widely known are Innotop and mytop. The latter unfortunately seems to be an abandon project as the last significant update happened in 2007. Additionally, mytop offers way less functionality then Innotop.

INNOTOP

Innotop can be really handy when you need a quick and easy tool that can provide a lot of details on what MySQL is doing – without using difficult queries to get those details manually. For our Percona Support customers, using Innotop will often be much easier then running complex SHOW statements and filtering their output in order to get some simple facts like slave replication lag, number of busy threads or InnoDB history list length. Also for us, Support engineers, it’s sometimes quicker to get a fast system overview with Innotop before we do deeper investigation using SHOW statements, Information_schema or more complex ways.

But Innotop is even more then that! In this article I will show what are my favorite features of this little tool, that you probably were not aware of. It is not my intention to show all the functionality here, so to get quick idea of the base tool capabilities, check the sample screenshots.

Configuration

Let’s start from quick configuration examples to demonstrate how you can get the tool up and running in 5 minutes. Btw. I think configuration examples can be really useful as documentation is bit unclear about some of it’s aspects.

After we are successful with installing latest Innotop, simply type ‘innotop’ command in our terminal and this is what most likely you will see:

[RO] Dashboard (? for help)                   localhost, 0.00, 0 QPS, 0/0/0 con/run/cac thds, !localhost
Uptime  MaxSQL  ReplLag  QPS  Cxns  Run  Miss  Lock  Tbls  Repl  SQL
                                         0.00
localhost: Access denied for user 'root'@'localhost' (using password: NO)

There are actually at least three ways to establish first working connection to MySQL server(s) for Innotop.

The simplest way to just start working is to run it by using similar host/user/password parameters as you do for basic mysql client:

innotop -h 127.0.0.1 -u root -p mysecret

Another way is to run innotop and configure connection(s) from inside. Quick help screen (invoked with ‘?’ key) tells us: “@ Select/create server connections”, so let’s press ‘@’:

You can enter the name of a new connection to create it.
____________________________ Choose from ____________________________
localhost  DBI:mysql:;host=localhost;mysql_read_default_group=client
Choose connections for this mode:

As we can see there is already ‘localhost’ one created automatically. We can now select this one, or create a new one. However, after you enter ‘localhost’ – it just continues to use current connection and gets you back to previous screen. I have no idea if you can actually edit this existing ‘localhost’ connection, so let’s just create a new one:

Choose connections for this mode: local
There is no connection called 'local'. Create it?: y

Next step:

Typical DSN strings look like
  DBI:mysql:;host=hostname;port=port
The db and port are optional and can usually be omitted.
If you specify 'mysql_read_default_group=mysql' many options can be read
from your mysql options files (~/.my.cnf, /etc/my.cnf).
Enter a DSN string: DBI:mysql:;host=localhost

so the only thing I put in this ‘connection wizard’s’ step was “DBI:mysql:;host=localhost”, but if you have some custom TCP port or socket, you must specify it here.
Next step:

Optional: enter a table (must not exist) to use when resetting InnoDB deadlock information: test.deadarch
Do you want to specify a username for local?: y
Do you want to specify a password for local?: y
Enter username for local: root
Enter password for 'root' on local:
Save password in plain text in the config file?: y

And after that step, we are immediately back in our Dashboard view, but now it is actually working:

[RO] Dashboard (? for help)                    local, 1h59m, 0.40 QPS, 2/1/0 con/run/cac thds, 5.5.33-31.1-log
Uptime  MaxSQL  ReplLag  QPS   Cxns  Run  Miss  Lock  Tbls  Repl  SQL
 1h59m                   0.40     2       0.00     0    26  Off

This way is for sure much more complicated and longer then just running with -h -u -p parameters, so why bother?
Also, if we quit the tool, and run again – the set up connection is lost!
OK, I’ll show you later why it is actually good to set up connection this way.
First, we make sure the set up connection will be saved! To make so, run it like this:

innotop --write

Then configure the connection again and quit the tool. Now Innotop has created configuration files for us:

[root@centos6-2 ~]# ls .innotop/
innotop.conf plugins

We edit the .innotop/innotop.conf file and… it’s huge! All options inside… Check documentation for all of their meaning, I will only focus on my favourite ones.
For us the most important will be this section:

[connections]
local=user=root have_user=1 pass=mypaSS have_pass=1 dsn=DBI:mysql:;host=localhost savepass=1
localhost=user= dsn=DBI:mysql:;host=localhost;mysql_read_default_group=client dl_table=test.innotop_dl
[/connections]

Actually, using the connection wizard was useful for me only to create initial first connection definition, which we can use as an example to create connections to more MySQL instances. It is just hard to create it manually just by reading the documentation.
We can see both the default ‘localhost’ connection which was there already, and our new ‘local’ where we did specify connection credentials. Now feel free to remove the line with ‘localhost’ one, we don’t need it.

Another important section is [active_connections] where each tool’s view can have assigned connection name. For example:
A=local
means that ‘local’ is default connection for Dashboard view. If we only have one connection configured in Innotop, after we enter another mode, it will get automatically assigned in configuration.

Monitoring Many Hosts

OK, since we now have a new configuration template ready, let’s manually configure more connections, which will be much faster then using the ‘wizard’.
For my tests, I used example replication sandbox (set up in 1 minute with mysqlsandbox), so connections will look like this:

[connections]
master56=user=msandbox have_user=1 pass=msandbox have_pass=1 dsn=DBI:mysql:;host=127.0.0.1;port=19479; savepass=1
slave156=user=msandbox have_user=1 pass=msandbox have_pass=1 dsn=DBI:mysql:;host=127.0.0.1;port=19480; savepass=1
slave256=user=msandbox have_user=1 pass=msandbox have_pass=1 dsn=DBI:mysql:;host=127.0.0.1;port=19481; savepass=1
[/connections]

Cool, now I can switch between each server for a given view mode using ‘n’ key, or choose a connection from the list after pressing ‘@’.
This is how an example [Q]uery view for slave256 looks like:

[RO] Query List (? for help)                       slave256, 23h, 10.93k QPS, 8/5/0 con/run/cac thds, 5.6.14-log
When   Load  Cxns  QPS     Slow  Se/In/Up/De%  QCacheHit  KCacheHit  BpsIn    BpsOut
Now    0.00     6  10.93k     0  99/ 0/ 0/ 0       0.00%    100.00%    1.05M  787.03k
Total  0.00   151   8.72      0  92/ 0/ 0/ 0       0.00%    100.00%  977.10     3.14k
Cmd      ID      State               User      Host           DB      Time      Query
Connect       2  Slave has read all  system u                         23:24:35
Execute      23  Sending data        msandbox  localhost      sbtest     00:00  SELECT count(k) FROM sbtest WHERE k BETWEEN ? AND ?
Execute      24  Sending data        msandbox  localhost      sbtest     00:00  SELECT count(k) FROM sbtest WHERE k BETWEEN ? AND ?
Execute      25  preparing           msandbox  localhost      sbtest     00:00  SELECT count(k) FROM sbtest WHERE k BETWEEN ? AND ?
Execute      26  Sending data        msandbox  localhost      sbtest     00:00  SELECT count(k) FROM sbtest WHERE k BETWEEN ? AND ?

Server Groups

But that’s not enough! Maybe you have 100 servers, in 10 master->slave(s) groups – you can watch them all using single Innotop instance in a sensible way! All thanks to server groups functionality. From the defined connections, you can easily configure groups and for almost each view mode – you can either choose a single connection or a group. Then switching between groups is as easy as pressing TAB. There is information in the top right corner of the screen telling us what server group or what single server we are watching at.

So let’s create our first group:

[server_groups]
56=master56 slave156 slave256
[/server_groups]

Now we can see all the servers in the group in the same screen, which for some view modes makes more in other less sense. I am sure you will like the group view for replication mode.
Below is an example when one of the slaves is lagging. We can see 52s lag on slave1, I/O threads are keeping up on both slaves with master (pos 596814971), but what I am missing here is the Exec_Master_Log_Pos info for slaves. But the information is enough to see it’s the SQL thread on slave2 not keeping up.

[RO] Replication Status (? for help)                                                   Servers: 56
__________________________ Slave SQL Status ___________________________
CXN       Master     On?  TimeLag  Catchup  Temp  Relay Pos  Last Error
slave156  127.0.0.1  Yes    00:52     0.00     0  510811815
slave256  127.0.0.1  Yes    00:00     0.00     0  596815181
________________________________________ Slave I/O Status _________________________________________
CXN       Master     On?  File              Relay Size  Pos        State
slave156  127.0.0.1  Yes  mysql-bin.000003     569.17M  596814971  Waiting for master to send event
slave256  127.0.0.1  Yes  mysql-bin.000003     569.17M  596814971  Waiting for master to send event
__________________ Master Status __________________
CXN       File              Position   Binlog Cache
master56  mysql-bin.000003  596814971        99.82%
slave156  mysql-bin.000002  510811605        99.80%
slave256  mysql-bin.000002  596814971        99.82%

One press key (I) and we are in the InnoDB IO threads view, where there we can see one important detail difference about slave1 – it’s doing much more fsyncs and write operations then two other servers.

[RO] InnoDB I/O Info (? for help)                                Servers: 56
______________________________ I/O Threads _______________________________
CXN       Thread  Purpose               Thread Status
master56       0  insert buffer thread  waiting for completed aio requests
master56       1  log thread            waiting for completed aio requests
master56       2  read thread           waiting for completed aio requests
master56       3  read thread           waiting for completed aio requests
master56       4  read thread           waiting for completed aio requests
master56       5  read thread           waiting for completed aio requests
master56       6  write thread          waiting for completed aio requests
master56       7  write thread          waiting for completed aio requests
master56       8  write thread          waiting for completed aio requests
master56       9  write thread          waiting for completed aio requests
slave156       0  insert buffer thread  waiting for completed aio requests
slave156       1  log thread            waiting for completed aio requests
slave156       2  read thread           waiting for completed aio requests
slave156       3  read thread           waiting for completed aio requests
slave156       4  read thread           waiting for completed aio requests
slave156       5  read thread           waiting for completed aio requests
slave156       6  write thread          waiting for completed aio requests
slave156       7  write thread          waiting for completed aio requests
slave156       8  write thread          waiting for completed aio requests
slave156       9  write thread          waiting for completed aio requests
slave256       0  insert buffer thread  waiting for completed aio requests
slave256       1  log thread            waiting for completed aio requests
slave256       2  read thread           waiting for completed aio requests
slave256       3  read thread           waiting for completed aio requests
slave256       4  read thread           waiting for completed aio requests
slave256       5  read thread           waiting for completed aio requests
slave256       6  write thread          waiting for completed aio requests
slave256       7  write thread          waiting for completed aio requests
slave256       8  write thread          waiting for completed aio requests
slave256       9  write thread          waiting for completed aio requests
_________________________________ Pending I/O __________________________________
CXN       Async Rds  Async Wrt  IBuf Async Rds  Sync I/Os  Log Flushes  Log I/Os
master56                                     0          0            0         0
slave156                                     0          0            0         0
slave256                                     0          0            0         0
_____________________________ File I/O Misc ______________________________
CXN       OS Reads  OS Writes  OS fsyncs  Reads/Sec  Writes/Sec  Bytes/Sec
master56      1024     118562       4737       0.00        0.00          0
slave156       565     120586      20333       0.00      144.86          0
slave256       532     119388       4644       0.00        0.00          0
__________________________ Log Statistics __________________________
CXN       Sequence No.  Flushed To  Last Checkpoint  IO Done  IO/Sec
master56  929977732     929977732   929977732          22777    0.00
slave156  929560312     929560312   927035927          17145   95.00
slave256  935238232     935238232   935238232          22767    0.00

If you still did not guess – it’s the sync_relay_log=1 and sync_binlog=1 set on slave1 only resulting in many more write operations.

These kind of views for multiple servers on the same screen are really nice and allow to very fast compare many servers or find the ones having problems.

More Advanced Options

One of the modes, which many of you probably did not know Innotop has, is one where you can view any status variables how changing in real time.
Below is view of Variables & Status mode (S), which btw. you can view as standard (vmstat like) view, pivoted or graph (tload). The one below is “pivoted”.

[RO] Variables & Status (? for help)    master56, 2h33m, InnoDB 1s :-), 5.09k QPS, 5/4/0 con/run/cac thds, 5.6.14-log
name               set_0    set_1    set_2    set_3    set_4    set_5    set_6    set_7    set_8    set_9
QPS                5211.69  4478.65  5457.82  5057.26  5704.39  5522.03  5737.59  5800.76  5188.59  58.52
Commit_PS          260.91   223.28   272.94   252.86   285.74   275.35   286.97   290.46   258.44   2.92
Rollback_Commit    0        0        0        0        0        0        0        0        0        0
Write_Commit       17.97    18.05    17.99    17.99    17.96    18.05    17.98    17.96    18.07    18
R_W_Ratio          0.78     0.78     0.78     0.78     0.78     0.78     0.78     0.78     0.78     0.78
Opens_PS           0        0        0        0        0        0        0        0        0        0.01
Table_Cache_Used   0        0        0        0        0        0        0        0        0        0
Threads_PS         0        0        0        0        0        0        0        0        0        0
Thread_Cache_Used  0        0        0        0        0        0        0        0        0        0
CXN_Used_Ever      3.31     3.31     3.31     3.31     3.31     3.31     3.31     3.31     3.31     3.31
CXN_Used_Now       3.31     3.31     3.31     3.31     3.31     3.31     3.31     3.31     3.31     3.31

What is really cool about this view is that you can select many kinds of variables to display here. With pressing ‘c’ key in this mode you will see the list (truncated as it’s very wide):

________________________________________________________ Choose from _________________________________________________________
          commands   Uptime, Questions, Com_delete, Com_delete_multi, Com_insert, Com_insert_select, Com_replace, Com_replace_
cxns_files_threads   Uptime,Aborted_clients,Aborted_connects,Bytes_received,Bytes_sent,Compression,Connections,Created_tmp_dis
           general   set_precision(Questions/Uptime_hires) as QPS, set_precision(Com_commit/Uptime_hires) as Commit_PS, set_pr
           handler   Uptime,Handler_read_key,Handler_read_first,Handler_read_next,Handler_read_prev,Handler_read_rnd,Handler_r
            innodb   Uptime,Innodb_row_lock_current_waits,Innodb_row_lock_time,Innodb_row_lock_time_avg,Innodb_row_lock_time_m
     innodb_health   dulint_to_int(IB_tx_trx_id_counter) - dulint_to_int(IB_tx_purge_done_for) as OldVersions,IB_sm_mutex_spin
    innodb_health2   percent(1-((Innodb_buffer_pool_pages_free||0)/($cur->{Innodb_buffer_pool_pages_total}||1))) as BP_page_ca
         key_cache   Uptime,Key_blocks_not_flushed,Key_blocks_unused,Key_blocks_used,Key_read_requests,Key_reads,Key_write_req
         prep_stmt   Uptime,Com_dealloc_sql,Com_execute_sql,Com_prepare_sql,Com_reset,Com_stmt_close,Com_stmt_execute,Com_stmt
       query_cache   percent((Qcache_hits||0)/(((Com_select||0)+(Qcache_hits||0))||1)) as Hit_Pct,set_precision((Qcache_hits||
      query_status   Uptime,Select_full_join,Select_full_range_join,Select_range,Select_range_check,Select_scan,Slow_queries,S
      slow_queries   set_precision(Slow_queries/Uptime_hires) as Slow_PS, set_precision(Select_full_join/Uptime_hires) as Full
               txn   Uptime,Com_begin,Com_commit,Com_rollback,Com_savepoint,Com_xa_commit,Com_xa_end,Com_xa_prepare,Com_xa_rec
Choose a set of values to display, or enter the name of a new one:

What is really cool about this mode, is that you can define your own set of values or compose functions making some calculations from many of them!

Not only a passive watch-tool!

Another thing worth mentioning is that Innotop is not only meant for monitoring! You can manage your servers there too (in some limited way).
For example, let’s see the replication mode example:
press ‘O’ key:

[RO] Replication Status (? for help)                                                   Servers: 56
__________________________ Slave SQL Status ___________________________
CXN       Master     On?  TimeLag  Catchup  Temp  Relay Pos  Last Error
slave156  127.0.0.1  Yes    00:00     0.00     0   57271325
slave256  127.0.0.1  Yes    00:00     0.00     0   57271325
________________________________________ Slave I/O Status ________________________________________
CXN       Master     On?  File              Relay Size  Pos       State
slave156  127.0.0.1  Yes  mysql-bin.000005      54.62M  57271115  Waiting for master to send event
slave256  127.0.0.1  Yes  mysql-bin.000005      54.62M  57271115  Waiting for master to send event
_________________ Master Status __________________
CXN       File              Position  Binlog Cache
master56  mysql-bin.000005  57271115         0.00%
slave156  mysql-bin.000004  57271115         0.00%
slave256  mysql-bin.000005  57271155         0.00%
_________________________________ Choose from _________________________________
master56   DBI:mysql:;host=127.0.0.1;port=19479;
slave156   DBI:mysql:;host=127.0.0.1;port=19480;
slave256   DBI:mysql:;host=127.0.0.1;port=19481;mysql_read_default_group=client
Which servers?: slave156
Command to send: stop slave SQL_thread;
Success on slave156
Press any key to continue

But… this option is actually a hidden gate to just run ANY command that is allowed to the user Innotop uses to connect:

(...)
Which servers?: slave156
Command to send: DROP TABLE test.t1;
Success on slave156
Press any key to continue

So beware of using too powerful users for your Innotop client. In the case above I was able to drop table on slave even though read_only=1 since I am connecting with super user.

Innotop and MySQL 5.6

Thanks to mysqlsandbox it is super fast and easy to test various MySQL versions and configurations. So, after we enable GTID replication in our replication sandbox (as easy as starting ‘enable_gtid’ script) – the replication view mode in Innotop looks unfortunately the same, no additional information. There is already a feature request on this, just needs some love: https://code.google.com/p/innotop/issues/detail?id=85
Note that if you have Innotop version older then 1.9.1 – the InnoDB status related modes will crash with MySQL 5.6, so update to 1.9.1.

Innotop and Galera replication

This would be really cool to see some wsrep status variables in Innotop. Currently, monitoring Percona XtraDB Cluster (PXC) nodes works well, you can configure each cluster as a group, just the replication/node status in cluster part is not seen.
I’ve made a new feature request for this, with a hope that some Perl master takes care of: http://code.google.com/p/innotop/issues/detail?id=90

However, that does not mean that you cannot monitor Galera status with Innotop at all! Remember the custom set of values for (S) mode?
Quick tutorial:
Press ‘S’ to enter the mode, choose connection to any PXC node you want, then press ‘c’ key:

Choose a set of values to display, or enter the name of a new one: wsrep1
Enter variables for wsrep1: wsrep_cluster_conf_id,wsrep_cluster_size,wsrep_local_recv_queue

Voila!

[RO] Variables & Status (? for help)      pxc1, 3h, InnoDB 1s :-), 1.90 QPS, 202/1/0 con/run/cac thds, 5.5.29-log
name                    set_0  set_1  set_2  set_3  set_4  set_5  set_6  set_7  set_8  set_9
wsrep_cluster_conf_id   3      3      3      3      3      3      3      3      3      3
wsrep_cluster_size      3      3      3      3      3      3      3      3      3      3
wsrep_local_recv_queue  0      0      0      0      0      0      0      0      0      0

If we have readonly option disabled in innotop.conf, after we exit the tool, our new variable set is saved like this:

[varsets]
wsrep=wsrep_cluster_conf_id,wsrep_cluster_size,wsrep_local_recv_queue
[/varsets]

Unfortunately, only numeric values are allowed here, also the S mode does not work well with server groups yet

There are many more bit hidden features inside Innotop you are probably not aware of, and which can be quite handy, so I encourage you to take a deep look into internal help (press ?) in each of the mode views.

Bugs – help needed!

Unfortunately, my favorite feature – ability of monitoring server groups on single screen is seriously affected by those two bugs:
https://code.google.com/p/innotop/issues/detail?id=87
https://code.google.com/p/innotop/issues/detail?id=89

List of all active bugs and feature request: https://code.google.com/p/innotop/issues/list

Links

http://www.mysqlperformanceblog.com/2013/07/12/innotop-1-9-1-released/
Complete manual: http://innotop.googlecode.com/svn/html/manual.html

The post Innotop: A real-time, advanced investigation tool for MySQL appeared first on MySQL Performance Blog.