A few day ago, a customer came to Percona needing to recover data. Basically, while doing a transfer from one SAN to another, something went wrong and they lost the ibdata1 file, where all the table meta-data is stored. Fortunately, they were running with innodb_file_per_table so the data itself was available. What they could provide us was:
- all the tables ibd files (Nearly 200 of them, 40 GB of data)
- the schema in sql format
- No remote access
Their first question was “Is it possible to recover the data?”
The answer was yes since I was sure to at least be able to recover the data with the Innodb recovery tool of Aleksandr Kuzminsky a colleague on the Percona European team of consultants. In the past, I used this tool a few times and, although it is awesome to dig for data, it is time consuming and the perspective of recovering by hand nearly 200 tables was not really exciting. I needed something faster and more automatic so, after reading this post from Chris Calendar, I decided to use the following strategy.
- Start with a clean MySQL install so that Innodb tablespace id is set to 1
- Import the schema in MyISAM format to avoid screwing up the tablespace id
- Record the tablespace id of all the tables provided by the customer and but the info in a MyISAM table
- Advance the Innodb tablespace id by creating fake Innodb tables
- Once the Innodb tablespace id is minus one of a customer
table
- alter the table definition from MyISAM to Innodb
- Discard the tablespace
- Replace the table ibd file with the one provided by the customer
- Import the tablespace
- At this point I should have access to the table but, although MySQL was not complaining, the data was not available. An “alter table tablename engine=InnoDB;” would solve the problem but then, the tablespace id would increase by one, annoying. I decided, also for portability, to do “alter table tablename engine=MyISAM;”
- Script all that
In order to get a decent disk space and good internet connectivity, I fired up an Ubuntu 10.04 EC2 instance, installed MySQL and provided the customer the credentials to upload their files. Then, I proceed.
Loading the schema
root@domU:/mnt/tables# perl -p -i -e 's/ENGINE=InnoDB/ENGINE=MyISAM/g' schema.sql root@domU:/mnt/tables# cat schema.sql | mysql -u root test
The MyISAM table recording the space ids of the tables
mysql> show create table test.tables\G *************************** 1. row *************************** Table: tables Create Table: CREATE TABLE `tables` ( `name` varchar(100) DEFAULT NULL, `spaceid` int(11) NOT NULL, PRIMARY KEY (`spaceid`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
Filling up the table with table names and space
ids
The challenge was now to fill up the table. It is fairly easy to
find the space id in a .ibd file.
#root@domU:/mnt/tables# hexdump profile_data.ibd -C | head -4 #00000000 62 bd c3 19 00 00 00 00 ff ff ff ff ff ff ff ff |b...............| #00000010 00 00 00 54 86 9a 91 93 45 bf 00 00 00 00 00 00 |...T....E.......| #00000020 00 00 00 00 00 11 00 00 00 11 80 39 00 00 00 02 |...........9....| #00000030 4f 00 00 02 47 c0 00 00 00 00 00 00 00 13 00 00 |O...G...........|
In the above example, the table space id is “00 11″ as stored in bytes 0×24 and 0×25 (actually, 0×22 and 0×23 are also for tablespace id but always 00 in this case). To load the data was just some Bash scripting:
# for t in `ls *.ibd`; do tname=`echo $t | cut -d'.' -f1`; spaceid=`hexdump -C $t | head -3 | tail -1 | awk '{print $6$7}'`; mysql -u root -e "insert into tables (name,spaceid) values ('$tname',conv('$spaceid',16,10))" test; done
Recovering the tables
With all this in place, recovery the data was just another Bash
script:
#!/bin/bash CURRENT_ID=1 mysql -u root "create database if not exists filler;" while [ 1 ] do CURRENT_SPACEID=`mysql -u root -B -N -e "select spaceid from test.tables where spaceid >= $CURRENT_ID order by spaceid limit 1;"` if [ "a$CURRENT_SPACEID" == "a" ]; then exit fi CURRENT_TABLENAME=`mysql -u root -B -N -e "select name from test.tables where spaceid = $CURRENT_SPACEID;"` echo "doing $CURRENT_TABLENAME $CURRENT_SPACEID" # DO WE NEED TO SKIP SOME ID while [ "$CURRENT_ID" -lt "$CURRENT_SPACEID" ] do echo "skipping spaceID $CURRENT_ID" mysql -u root -e "create table space_$CURRENT_ID (id int) engine=Innodb;" filler let CURRENT_ID+=1 done exit echo "creating the table" mysql -u root -e "create table recover.$CURRENT_TABLENAME like test.$CURRENT_TABLENAME;" sleep 1 echo "switching to Innodb " mysql -u root -e "alter table recover.$CURRENT_TABLENAME engine=Innodb;" sleep 1 echo "discarding tablespace " mysql -u root -e "alter table recover.$CURRENT_TABLENAME discard tablespace;" sleep 1 echo "moving the datafile in place" cp /mnt/tables/$CURRENT_TABLENAME.ibd /var/lib/mysql/recover/ chown mysql.mysql /var/lib/mysql/recover/$CURRENT_TABLENAME.ibd sleep 1 echo "importing tablespace" mysql -u root -e "alter table recover.$CURRENT_TABLENAME import tablespace;" sleep 1 echo "switching to MyISAM" mysql -u root -e "alter table recover.$CURRENT_TABLENAME engine=MyISAM;flush tables;" sleep 1 let CURRENT_ID+=1 done
I did a backup of /var/lib/mysql before the first run to make sure I could restart if needed, which of course, I did a few times. And that, recovered all except one table. For that one, I had to use the Innodb recovery tools but even then, I had a hard time. I pulled in Aleksandr and he basically reversed engineer the table structure and found that the schema the customer provided (likely from dev or staging) contained an extra column. With a modified schema, the recovery completed. The customer was able to download the tables in MyISAM format and then he just convert them back to InnoDB.