Troubleshooting MySQL Crashes Related to Metadata Locking

I just wrote an article about “Troubleshooting ‘Waiting for table metadata lock’ Errors for both MyISAM and InnoDB Tables”, and right after I posted it I ran into a new, different metadata locking issue. I thought I’d share that one too, in case anyone encounters a similar situation.

In this case, mysqld kept crashing on restart, reporting “out of memory” errors:

/opt/app/mysql/product/mysql/bin//mysqld: Out of memory (Needed 840 bytes)
stack_bottom = 7fb4ebaeae58 thread_stack 0x40000
mysqld: /mysql/mysys/ int __cxa_pure_virtual():
  Assertion `! "Aborted: pure virtual method called."' failed.
mysqld: /mysql/mysys/ int __cxa_pure_virtual():
  Assertion `! "Aborted: pure virtual method called."' failed.
Fatal signal 6 while backtracing

There is a recently fixed bug (fixed in 5.5.29) that mentions the same assert as above, but it is not the problem here, as that bug always reports “InnoDB: Failing assertion: page_get_n_recs(page) > 1”.

Looking further into these repeated crashes, I noticed that the stack traces always showed the following:

stack_bottom = 7f2cd227ce58 thread_stack 0x40000

At first glance, it’s kind of hard to read. However, if you focus in, you can extract some useful information.

Looking just before the crash (handle_fatal_signal), we see these 3 lines:


The key parts of that, in reverse order, are:


These 2 functions and 1 variable, respectively, are found only in the metadata lock source files (mdl.h and its companion implementation file).

After examining the MySQL source code, I could see that this stack trace points to an issue with the metadata locks. Specifically, mysqld is trying to acquire a metadata lock (try_acquire_lock_impl) but failing: enum_mdl_type is set when try_acquire_lock_impl() fails, an assert is thrown, and mysqld crashes.
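If you want to check your own error log for the same signature, a small script can pull out the lines that mention the metadata lock symbols. This is only a minimal sketch: the marker list contains just the two names discussed above, and the function name mdl_frames is my own invention for illustration, not anything that ships with MySQL.

```python
# Symbols taken from the discussion above; add more if your traces show others.
MDL_MARKERS = ("try_acquire_lock_impl", "enum_mdl_type")


def mdl_frames(log_lines, markers=MDL_MARKERS):
    """Return (line_number, text) for every log line naming an MDL marker.

    log_lines can be any iterable of strings, such as an open error log file.
    Line numbers start at 1 so they match what a text editor shows.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open your error log (wherever your installation writes it) and pass the file object straight in, for example `with open(path_to_error_log) as f: print(mdl_frames(f))`. An empty result suggests the crash has a different cause than the one described here.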

I searched the bugs database for anything similar, but didn’t find any existing reports specific to try_acquire_lock_impl. That’s not to say there isn’t some yet-to-be-detected bug here, but in this case the machine was testing a custom application, so I strongly suspected that the custom app was causing the contention, which turned out to be the case.
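To confirm that an application is piling up on metadata locks while the server is still up, you can look for sessions in the “Waiting for table metadata lock” state. Here is a rough sketch that filters rows you have already fetched from SHOW FULL PROCESSLIST. It assumes each row is a dict keyed by the processlist column names (Id, User, db, Time, State, Info); how you fetch the rows depends on your client library, so that part is left out. The function name is hypothetical.

```python
# The thread state MySQL reports for a session blocked on a metadata lock.
MDL_WAIT_STATE = "Waiting for table metadata lock"


def sessions_waiting_on_mdl(processlist_rows):
    """Return the rows stuck on a metadata lock, longest wait first.

    Each row is assumed to be a dict using SHOW PROCESSLIST column names.
    Rows with a missing or NULL State are ignored.
    """
    waiting = [
        row for row in processlist_rows
        if (row.get("State") or "") == MDL_WAIT_STATE
    ]
    # Time is the number of seconds the session has been in its current state.
    return sorted(waiting, key=lambda row: row.get("Time") or 0, reverse=True)
```

The User, db, and Info columns of the rows it returns should point at the statements involved. Many waiters coming from the same application account is a strong hint about where the contention originates.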

So if you do happen to encounter this crash, hopefully this will help you track down the problem.

Happy troubleshooting.