Practical P_S: Extending PROCESSLIST

MySQL 5.6 introduced major advances to monitoring made via PERFORMANCE_SCHEMA, but also made a change in how it binds to the network by default.  In MySQL 5.5, the –bind-address configuration option defaulted to “0.0.0.0″, meaning only IPv4.  This changed to “*” in MySQL 5.6, accepting connections on both IPv6 and IPv4 interfaces.  Somehow (I’ve not looked into it yet), my (unsupported) WindowsXP installation now refuses to bind to IPv4, which caused surprising problems for certain tools that seem to internally map “localhost” to IPv4-specific 127.0.0.1, where connections fail.  In working through this problem, I found myself wishing that PROCESSLIST output included information about which mechanism or interface was being used by each connection.  Fortunately, we can leverage PERFORMANCE_SCHEMA to extend PROCESSLIST in meaningful ways – this post aims to demonstrate how to do this by adding information about the interface as an example.

Here’s output from a basic PROCESSLIST:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
Id: 6
User: root
Host: localhost:2873
db: performance_schema
Command: Query
Time: 0
State: init
Info: SHOW PROCESSLIST
1 row in set (0.00 sec)

Notice that we get “localhost” here, rather than in IP address – there’s no way to tell from this information whether the connection leverages IPv4 or IPv6.  There’s another way to get that same information:

mysql> SELECT * FROM information_schema.processlist\G
*************************** 1. row ***************************
ID: 6
USER: root
HOST: localhost:2873
DB: performance_schema
COMMAND: Query
TIME: 0
STATE: executing
INFO: SELECT * FROM information_schema.processlist
1 row in set (0.06 sec)

That’s useful, because now we’re doing a SELECT on a table which we can use with JOINs.  So, what table should we JOIN with?  PERFORMANCE_SCHEMA.THREADS is where you want to look:

mysql> SELECT * FROM performance_schema.threads
-> WHERE processlist_id = 6\G
*************************** 1. row ***************************
THREAD_ID: 27
NAME: thread/sql/one_connection
TYPE: FOREGROUND
PROCESSLIST_ID: 6
PROCESSLIST_USER: root
PROCESSLIST_HOST: localhost
PROCESSLIST_DB: performance_schema
PROCESSLIST_COMMAND: Query
PROCESSLIST_TIME: 0
PROCESSLIST_STATE: Sending data
PROCESSLIST_INFO: SELECT * FROM performance_schema.threads
WHERE processlist_id = 6
PARENT_THREAD_ID: NULL
ROLE: NULL
INSTRUMENTED: YES
1 row in set (0.02 sec)

The PROCESSLIST_ID column is obviously what we want to use to do the JOIN:

mysql> SELECT thread_id, p.*
-> FROM performance_schema.threads t
-> JOIN information_schema.processlist p
->  ON (p.id = t.processlist_id)\G
*************************** 1. row ***************************
thread_id: 27
ID: 6
USER: root
HOST: localhost:2873
DB: performance_schema
COMMAND: Query
TIME: 0
STATE: executing
INFO: SELECT thread_id, p.*
FROM performance_schema.threads t
JOIN information_schema.processlist p
ON (p.id = t.processlist_id)
1 row in set (0.06 sec)

The THREAD_ID column is what’s most useful to join to other PERFORMANCE_SCHEMA tables, so we’ll use the THREAD table as a bridge.  In looking at the connection information, the table we want to look at is PERFORMANCE_SCHEMA.SOCKET_INSTANCES:

mysql> SELECT * FROM performance_schema.socket_instances\G
Empty set (0.00 sec)

If you get an empty resultset like above, it’s because the socket_instances table relies on PERFORMANCE_SCHEMA instrumentation which is not on by default.  So, taking a step back, if you want to see this information, you need to enable it in PERFORMANCE_SCHEMA.  The documentation explains two ways you can enable this instrumentation:  You can do this at startup with –performance_schema_instrument=wait/io/socket/sql/client_connection=counted (beware of this bug report), or you can issue the following UPDATE statement at runtime:

UPDATE performance_schema.setup_instruments
SET enabled = 'YES'
WHERE name LIKE 'wait/io/socket/client_connection'

Once enabled, new (and only new) connections will be counted, and you’ll start to see this type of information in the SOCKET_INSTANCES table:

mysql> SELECT * FROM socket_instances\G
*************************** 1. row ***************************
EVENT_NAME: wait/io/socket/sql/client_connection
OBJECT_INSTANCE_BEGIN: 35914240
THREAD_ID: 27
SOCKET_ID: 105332
IP: ::1
PORT: 2873
STATE: ACTIVE
1 row in set (0.00 sec)

Figuring out which interface is in use takes a little manipulation, but not much.  I use an ugly REGEX to isolate out IPv4 (or IPv6-mapped IPv4) connections: ‘^(::ffff:)?[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$’ (suggestions for improvements welcomed, though it worked for me).  Unix socket connections also show here, with a PORT value of “0″ and an IP value of “”.  Right now, shared memory or named pipe connections on Windows don’t appear in this table, so anything which doesn’t meet the IPv4 REGEX and has a PORT value greater than 0 should be considered IPv6.  Putting this all together, here’s the final query:

mysql> SELECT
->   p.*,
->   CASE
->     WHEN PORT = 0 AND IP = '' THEN 'Unix Socket'
->     WHEN IP REGEXP '^(::ffff:)?[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' THEN 'IPv4'
->     WHEN PORT > 0 THEN 'IPv6'
->     ELSE 'Undetermined'
->     END AS interface
-> FROM performance_schema.socket_instances si
->   RIGHT JOIN performance_schema.threads t
->     ON (t.thread_id = si.thread_id)
->   JOIN information_schema.processlist p
->     ON (t.processlist_id = p.id)\G
*************************** 1. row ***************************
ID: 6
USER: root
HOST: localhost:2873
DB: performance_schema
COMMAND: Query
TIME: 0
STATE: executing
INFO: SELECT
p.*,
CASE
WHEN PORT = 0 AND IP = '' THEN 'Unix Socket'
WHEN IP REGEXP '^(::ffff:)?[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$'
THEN 'IPv4'
WHEN PORT > 0 THEN 'IPv6'
ELSE 'Undetermined'
END AS interface
FROM performance_schema.socket_instances si
RIGHT JOIN performance_schema.threads t
ON (t.thread_id = si.thread_id)
JOIN information_schema.processlist p
ON (t.processlist_id = p.id)
interface: IPv6
1 row in set (0.13 sec)

Leveraging PERFORMANCE_SCHEMA in MySQL 5.6 allows you to get meaningful information about connections beyond what’s pre-packaged in PROCESSLIST.  Hopefully this post gives you ideas on how you can leverage this capability to meet your diagnostic needs.