A recovery trivia or how to recover from a lost ibdata1 file

A few day ago, a customer came to Percona needing to recover data. Basically, while doing a transfer from one SAN to another, something went wrong and they lost the ibdata1 file, where all the table meta-data is stored. Fortunately, they were running with innodb_file_per_table so the data itself was available. What they could provide us was:

  • all the tables ibd files (Nearly 200 of them, 40 GB of data)
  • the schema in sql format
  • No remote access

Their first question was “Is it possible to recover the data?”

The answer was yes since I was sure to at least be able to recover the data with the Innodb recovery tool of Aleksandr Kuzminsky a colleague on the Percona European team of consultants. In the past, I used this tool a few times and, although it is awesome to dig for data, it is time consuming and the perspective of recovering by hand nearly 200 tables was not really exciting. I needed something faster and more automatic so, after reading this post from Chris Calendar, I decided to use the following strategy.

  1. Start with a clean MySQL install so that Innodb tablespace id is set to 1
  2. Import the schema in MyISAM format to avoid screwing up the tablespace id
  3. Record the tablespace id of all the tables provided by the customer and but the info in a MyISAM table
  4. Advance the Innodb tablespace id by creating fake Innodb tables
  5. Once the Innodb tablespace id is minus one of a customer table
    1. alter the table definition from MyISAM to Innodb
    2. Discard the tablespace
    3. Replace the table ibd file with the one provided by the customer
    4. Import the tablespace
    5. At this point I should have access to the table but, although MySQL was not complaining, the data was not available. An “alter table tablename engine=InnoDB;” would solve the problem but then, the tablespace id would increase by one, annoying. I decided, also for portability, to do “alter table tablename engine=MyISAM;”
  6. Script all that

In order to get a decent disk space and good internet connectivity, I fired up an Ubuntu 10.04 EC2 instance, installed MySQL and provided the customer the credentials to upload their files. Then, I proceed.

Loading the schema

root@domU:/mnt/tables# perl -p -i -e 's/ENGINE=InnoDB/ENGINE=MyISAM/g' schema.sql
root@domU:/mnt/tables# cat schema.sql | mysql -u root test

The MyISAM table recording the space ids of the tables

mysql> show create table test.tables\G
*************************** 1. row ***************************
       Table: tables
Create Table: CREATE TABLE `tables` (
  `name` varchar(100) DEFAULT NULL,
  `spaceid` int(11) NOT NULL,
  PRIMARY KEY (`spaceid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Filling up the table with table names and space ids
The challenge was now to fill up the table. It is fairly easy to find the space id in a .ibd file.

#root@domU:/mnt/tables# hexdump profile_data.ibd -C | head -4
#00000000  62 bd c3 19 00 00 00 00  ff ff ff ff ff ff ff ff  |b...............|
#00000010  00 00 00 54 86 9a 91 93  45 bf 00 00 00 00 00 00  |...T....E.......|
#00000020  00 00 00 00 00 11 00 00  00 11 80 39 00 00 00 02  |...........9....|
#00000030  4f 00 00 02 47 c0 00 00  00 00 00 00 00 13 00 00  |O...G...........|

In the above example, the table space id is “00 11″ as stored in bytes 0×24 and 0×25 (actually, 0×22 and 0×23 are also for tablespace id but always 00 in this case). To load the data was just some Bash scripting:

# for t in `ls *.ibd`; do tname=`echo $t | cut -d'.' -f1`; spaceid=`hexdump -C $t | head -3 | tail -1 | awk '{print $6$7}'`; mysql -u root -e "insert into tables (name,spaceid) values ('$tname',conv('$spaceid',16,10))" test; done

Recovering the tables
With all this in place, recovery the data was just another Bash script:

#!/bin/bash

CURRENT_ID=1

mysql -u root "create database if not exists filler;"

while [ 1 ]
do
        CURRENT_SPACEID=`mysql -u root -B -N -e "select spaceid from test.tables where spaceid >= $CURRENT_ID order by spaceid limit 1;"`

        if [ "a$CURRENT_SPACEID" == "a" ]; then
                exit
        fi

        CURRENT_TABLENAME=`mysql -u root -B -N -e "select name from test.tables where spaceid =  $CURRENT_SPACEID;"`

        echo "doing $CURRENT_TABLENAME $CURRENT_SPACEID"

        # DO WE NEED TO SKIP SOME ID
        while [ "$CURRENT_ID" -lt "$CURRENT_SPACEID" ]
        do
                echo "skipping spaceID $CURRENT_ID"
                mysql -u root -e "create table space_$CURRENT_ID (id int) engine=Innodb;" filler
                let CURRENT_ID+=1
        done

        exit
        echo "creating the table"
        mysql -u root -e "create table recover.$CURRENT_TABLENAME like test.$CURRENT_TABLENAME;"
        sleep 1

        echo "switching to Innodb "
        mysql -u root -e "alter table recover.$CURRENT_TABLENAME engine=Innodb;"
        sleep 1

        echo "discarding tablespace "
        mysql -u root -e "alter table recover.$CURRENT_TABLENAME discard tablespace;"
        sleep 1

        echo "moving the datafile in place"
        cp /mnt/tables/$CURRENT_TABLENAME.ibd /var/lib/mysql/recover/
        chown mysql.mysql /var/lib/mysql/recover/$CURRENT_TABLENAME.ibd
        sleep 1

        echo "importing tablespace"
        mysql -u root -e "alter table recover.$CURRENT_TABLENAME import tablespace;"
        sleep 1

        echo "switching to MyISAM"
        mysql -u root -e "alter table recover.$CURRENT_TABLENAME engine=MyISAM;flush tables;"
        sleep 1

        let CURRENT_ID+=1
done

I did a backup of /var/lib/mysql before the first run to make sure I could restart if needed, which of course, I did a few times. And that, recovered all except one table. For that one, I had to use the Innodb recovery tools but even then, I had a hard time. I pulled in Aleksandr and he basically reversed engineer the table structure and found that the schema the customer provided (likely from dev or staging) contained an extra column. With a modified schema, the recovery completed. The customer was able to download the tables in MyISAM format and then he just convert them back to InnoDB.