I’d like to think that I’m pretty decent at backup and recovery. I’ve done hundreds of restores in my career in a vast array of scenarios for real-world recovery, data movement, and fixing developers’ mistakes. Today I saw an error I’ve never seen before, so I thought I’d share.
This restore was grabbing an older backup image from one system(staging) and restoring into a different database in another system(dev). There was some data in the backup image that needed to be recovered, but not the whole database. This is a small implementation – the client does not have Recovery Expert. Developers would then be able to query to get data out of the old database and manually transform it into the formats used in the newer database. Let’s call the existing database SAMPLE and the copy from another server OLD_SAM.
To accomplish this, I have to restore the database into a different database name, and depending on the database layout, may have to do a redirected restore.
I checked the space utilization, and the database filesystems were less than 50% full, indicating I had space for a second copy of the database. (Does anyone see the mistake I made at this point?)
I used the awesome redirected script generation option to write the redirected restore script for me. (I remember manually writing a script that did this before this option was available because redirected restores of PeopleSoft databases were ridiculous without it.) Here are the steps it took and the output received:
RESTORE DATABASE SAMPLE FROM '/restore' TAKEN AT 20170117000005 ON '/db2data1/db2inst1/sample/old_sam' INTO OLD_SAM LOGTARGET '/db2logs/db2inst1/sample/old_sam/' NEWLOGPATH '/db2logs/db2inst1/sample/old_sam/' REDIRECT SQL1277W A redirected restore operation is being performed. During a table space restore, only table spaces being restored can have their paths reconfigured. During a database restore, storage group storage paths and DMS table space containers can be reconfigured. DB20000I The RESTORE DATABASE command completed successfully. SET STOGROUP PATHS FOR IBMSTOGROUP ON '/db2data1/db2inst1/sample/old_sam' DB20000I The SET STOGROUP PATHS command completed successfully. RESTORE DATABASE SAMPLE CONTINUE SQL1277W A redirected restore operation is being performed. During a table space restore, only table spaces being restored can have their paths reconfigured. During a database restore, storage group storage paths and DMS table space containers can be reconfigured. DB20000I The RESTORE DATABASE command completed successfully.
I noticed at this point that the output on that last command was a bit odd. It returned very quickly, and normally, you don’t get SQL1277W at that point. But hey, I got the DB20000I, so I thought it succeeded. It was clear when I tried to rollforward that this restore did not complete successfully:
$ db2 rollforward db old_sam to end of backup and complete SQL0752N Connecting to a database is not permitted within a logical unit of work when the CONNECT type 1 setting is in use. SQLSTATE=0A001 $ db2 rollforward db old_sam SQL0752N Connecting to a database is not permitted within a logical unit of work when the CONNECT type 1 setting is in use. SQLSTATE=0A001
If the restore had indeed completed successfully, both of the above commands should have worked.
After a lot of googling, hair-pulling, and trying the same thing again and again in hopes that it would just work, I finally realized that since the database had only one storage path, I did not, in fact, need to do a redirected restore. This is a small and simple database with a small and simple storage layout. I used this command instead:
$ db2 "RESTORE DATABASE SAMPLE FROM '/restore' TAKEN AT 20170117000005 ON '/db2data1/db2inst1/sample/old_sam' INTO OLD_SAM NEWLOGPATH '/db2logs/db2inst1/sample/old_sam/'" SQL2563W The restore process completed successfully. However, one or more table spaces from the backup image were not restored.
Finally, here’s an error message that I know and respect. Doh! The filesystem is filling up. A quick peek in the db2 diagnostic log confirmed this.
The mistake I made was assuming that 2X the database size on my dev database sever would cover restoring in a copy of the database from the staging system. The old backup from my staging system was somewhat larger, and therefore I ran out of room. I would have also discovered this had I looked at the DB2 diagnostic log sooner.
I am a little frustrated with DB2 for not giving me a more informative error message on redirected restore, though. I’m hoping IBM will correct this scenario to make it a little easier to figure out what’s going on earlier. I won’t soon forget this issue, and hope that this article will help others who run into the same thing.