As I mentioned in a recent post, I do a lot of restoring databases to get them from one server to another. On rollforward after a recent restore, I saw this:
$ db2 force applications all; db2 deactivate db WC037P01; db2 "restore db WC037D01 from /app/IBM/db2/db_bkup taken at 20120723030327 into WC037P01 logtarget /app/IBM/db2/db_bkup/logs replace existing without prompting" DB20000I The FORCE APPLICATION command completed successfully. DB21024I This command is asynchronous and may not be effective immediately. SQL1493N The application is already connected to an active database. SQL2540W Restore is successful, however a warning "2528" was encountered during Database Restore while processing in No Interrupt mode. $ db2 get db cfg for wc037p01 |grep -i log Log retain for recovery status = RECOVERY User exit for logging status = YES Catalog cache size (4KB) (CATALOGCACHE_SZ) = 4096 Log buffer size (4KB) (LOGBUFSZ) = 512 Log file size (4KB) (LOGFILSIZ) = 10000 Number of primary log files (LOGPRIMARY) = 12 Number of secondary log files (LOGSECOND) = 50 Changed path to log files (NEWLOGPATH) = Path to log files = /app/IBM/db2/db_logs/NODE0000/ Overflow log path (OVERFLOWLOGPATH) = Mirror log path (MIRRORLOGPATH) = First active log file = S0000078.LOG Block log on disk full (BLK_LOG_DSK_FUL) = NO Block non logged operations (BLOCKNONLOGGED) = NO Percent max primary log space by transaction (MAX_LOG) = 0 Num. of active log files for 1 active UOW(NUM_LOG_SPAN) = 0 Percent log file reclaimed before soft chckpt (SOFTMAX) = 520 Log retain for recovery enabled (LOGRETAIN) = RECOVERY User exit for logging enabled (USEREXIT) = OFF HADR log write synchronization mode (HADR_SYNCMODE) = NEARSYNC First log archive method (LOGARCHMETH1) = DISK:/app/IBM/db2/db_arch_logs/WC037P01/ Options for logarchmeth1 (LOGARCHOPT1) = Second log archive method (LOGARCHMETH2) = OFF Options for logarchmeth2 (LOGARCHOPT2) = Failover log archive path (FAILARCHPATH) = Number of log archive retries on error (NUMARCHRETRY) = 5 Log archive retry Delay (secs) (ARCHRETRYDELAY) = 20 Log pages during index build (LOGINDEXBUILD) = OFF $ cp ./logs/*.LOG /app/IBM/db2/db_logs/NODE0000/ $ db2 rollforward db wc037p01 to end of backup and stop SQL1265N The archive log file "S0000078.LOG" is not associated with the current log sequence for database "WC037P01" on node "0".
On getting this, I did some further investigation to see where it was getting this log file:
$ ls /app/IBM/db2/db_logs/NODE0000/ S0000078.LOG SQLLPATH.TAG $ ls -latr logs total 4972 drwxr-xr-x 5 db2inst1 db2inst1 4096 Jul 23 03:09 .. -rw------- 1 db2inst1 db2iadm1 5070848 Jul 23 03:12 S0000078.LOG drwxr-xr-x 2 db2inst1 db2iadm1 4096 Jul 23 03:12 . $ cd /app/IBM/db2/db_arch_logs/WC037P01/ $ ls db2inst1 $ cd d* $ ls WC037P01 $ cd W* $ ls NODE0000 $ cd N* $ ls C0000000 C0000001 $ cd C0000001 $ pwd /app/IBM/db2/db_arch_logs/WC037P01/db2inst1/WC037P01/NODE0000/C0000001 $ ls -latr total 379872 drwxr-x--- 4 db2inst1 db2iadm1 4096 Jul 20 23:20 .. -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:01 S0000067.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:02 S0000068.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:03 S0000069.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:04 S0000070.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:05 S0000071.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:06 S0000072.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:07 S0000073.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:20 S0000074.LOG -rw-r----- 1 db2inst1 db2iadm1 40968192 Jul 21 00:21 S0000075.LOG -rw-r----- 1 db2inst1 db2iadm1 10706944 Jul 21 03:01 S0000076.LOG -rw-r----- 1 db2inst1 db2iadm1 4710400 Jul 22 03:01 S0000077.LOG -rw-r----- 1 db2inst1 db2iadm1 4386816 Jul 23 03:00 S0000078.LOG -rw-r----- 1 db2inst1 db2iadm1 16384 Jul 23 03:04 S0000079.LOG drwxr-x--- 2 db2inst1 db2iadm1 4096 Jul 23 03:04 .
What I discovered here is that the log file it is complaining about is the one that I restored from the backup image (into db_bkup/logs). But it looks like the database I’m restoring into also had a S0000078.LOG, and it’s in the archive path. So, for some reason, even thought the log file exists in the active path (because I manually copied it there after the restore), the rollforward is also attempting to retrieve it from the archive log path. When I delete or rename the log file in the archive path, the problem is resolved – In the example below, I’m doing this by simply compressing it, which makes it so that DB2 cannot retrieve it.
$ pwd /app/IBM/db2/db_arch_logs/WC037P01/db2inst1/WC037P01/NODE0000/C0000001 $ gzip *.LOG $ db2 rollforward db wc037p01 to end of backup and stop Rollforward Status Input database alias = wc037p01 Number of nodes have returned status = 1 Node number = 0 Rollforward status = not pending Next log file to be read = Log files processed = S0000078.LOG - S0000078.LOG Last committed transaction = 2012-07-23-01.04.12.000000 UTC DB20000I The ROLLFORWARD command completed successfully.
I still have a question in mind as to why I’m seeing this behavior. It seems to me that if the log file is already in the active log path, db2 should not be retrieving it from the archive log path. But I’ve seen this error at least three times on two different servers in the last two weeks on a set of 9.7 FixPack 4 servers.