Logging method
DB2 supports two methods of logging: Circular and Archive. I believe Oracle has similar modes.
Circular
To my extreme disgust, the default that Commerce uses if you don’t change anything is Circular logging. Circular logging is more often appropriate for databases where you don’t care about the data (I’ve seen it for databases supporting Tivoli and other vendors’ products) or for Data Warehousing and Decision Support databases where you have extremely well-defined data loading processes that can easily be re-done on demand. You must also be willing to take regular outages for database backups, because Circular logging does not allow you to take online backups. Circular logging also does not allow you to roll forward or back through transaction logs to reach a point in time – any restores are ONLY to the time a backup was taken.
On every new build, I move away from circular logging. I just don’t find it appropriate for an OLTP database, where your requirements often include very high availability and the ability to recover from all kinds of disasters with no data loss.
Archive
So why, then, isn’t archive logging the default? Well, it requires proper management of transaction log files, which can really get you in trouble if you don’t know what you’re doing. If you compress or delete an active transaction log, you will crash your database and have to restore from a backup. I’ve seen it happen, and it’s not fun. And the directories holding transaction log files should get OS-level backups at the highest frequency you’re willing to do.
I ensure that my archive logs are always on a separate path from the active ones, so that I (and whoever gets paged when a filesystem is filling up) can easily see which is which.
Personally, I use scripts to manage my transaction log files. I actually do most of it with my backup script. How long you keep them depends on your restore requirements and your overall backup/restore strategy. I also use a simple cron job to find files in the archive log path older than a certain time frame (1 day or 3 days is most common) and compress them. I hear that a nice safe way to delete logs is the prune logs command, but I never got used to it.
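To give a rough idea, the compress-and-delete piece of such a cron job can be a couple of find commands. The path, file pattern, and retention periods here are made up for illustration – adjust them to your own layout and recovery requirements:

find /db2/archlogs -name "S*.LOG" -mtime +3 -exec gzip {} \;
find /db2/archlogs -name "S*.LOG.gz" -mtime +14 -exec rm {} \;

The first line compresses archive logs older than 3 days; the second deletes compressed logs older than 14 days. And again: never point anything like this at the active log path.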
This is one of the areas where it is critical for DBAs to have an excruciatingly high level of attention to detail.
Logging settings
Ok, ready for the most complicated part?
All the settings discussed are in the db cfg.
LOGRETAIN
So the most important setting here is LOGRETAIN. If it is set to ‘NO’, you have circular logging. If it is set to ‘RECOVERY’, you have archive logging. To enable archive logging, you simply update this parameter.
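For example, something like this (MYDB is a placeholder database name):

db2 update db cfg for MYDB using LOGRETAIN RECOVERY

Keep in mind that switching from circular to archive logging puts the database in backup pending state, so plan on taking an offline backup immediately afterward.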
LOGARCHMETH1
Second in my mind is LOGARCHMETH1. This parameter specifies the separate path your archive logs are sent to. It can be a location on DISK or in TSM. Do not leave it set to ‘LOGRETAIN’.
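A sketch of each option (the database name and path are placeholders):

db2 update db cfg for MYDB using LOGARCHMETH1 DISK:/db2/archlogs
db2 update db cfg for MYDB using LOGARCHMETH1 TSM

Either way, DB2 then handles moving closed log files out of the active log path for you.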
WTH is this USEREXIT thing?
I undoubtedly have some newer DBAs wondering about this. The LOGARCHMETH1 parameter and the others that dictate the location of archive logs were only introduced in DB2 8 (or was it 7?). Before that, we had these nasty things called userexit programs: we had to locate C compilers to compile them, keep track of where the uncompiled versions lived in case changes were needed, and make sure the compiled file ended up in the right place with the right permissions. Really, I hated working with them. The functionality to use them is still in DB2, and I imagine they could do things you can’t do natively, but the parameters are so good that it would be a rare situation where you need them.
LOGFILSIZ
This is the size of each transaction log file. Generally my default for Commerce is 10,000 (which I think Commerce itself actually sets on instance creation), but I’ve gone higher – it’s not unusual to go up to 40,000 while data is being loaded or for stagingcopies.
LOGPRIMARY
This determines the number of log files of size LOGFILSIZ that compose the database’s active log files. These are all created on database activation (which happens on first connection), so you don’t want to go too large. But you do generally want to have enough space here to handle your active logs. Most Commerce databases do well at around 12.
LOGSECOND
This determines the number of additional active logs that can be allocated as needed. LOGPRIMARY + LOGSECOND cannot exceed 255. The nice thing about LOGSECOND is that these are not allocated on database activation, but only as needed. The other awesome thing here is that they can be increased online – one of the few logging parameters that can be. I usually start with 50, but increase if there’s a specific need for more. Remember, these should not be used on an ongoing basis – just to handle spikes.
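Putting the last three parameters together, a starting point might look like this (MYDB is a placeholder, and the values are just the ones I describe above):

db2 update db cfg for MYDB using LOGFILSIZ 10000
db2 update db cfg for MYDB using LOGPRIMARY 12
db2 update db cfg for MYDB using LOGSECOND 50

The LOGFILSIZ and LOGPRIMARY changes only take effect on the next database activation; LOGSECOND is the one you can bump online.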
All the others
There are all kinds of nifty things you can do with logging: infinite logging, mirrored logging, logging to a raw device, etc. So I’m not going to cover every logging parameter there is in this post.
Potential issues
Deleting or compressing an active log file
The best case, if you delete or compress an active log file, is that DB2 is able to recreate it. Even then, this may affect your ability to take online backups. The worst (and more likely) case is that your database ceases functioning and you have to restore from backup. Keep your active and archive logs in separate directories to help prevent this, and educate anyone who might try to alleviate a full filesystem. If you do get an error on an online backup referencing the inability to include a log file, take an offline backup just as soon as you can – you will be unable to take online backups until you do.
Filling up a filesystem due to not managing log files
If your archive log filesystem is separate and fills up, nothing breaks immediately (though if archiving keeps failing, logs will accumulate in the active log path). If the filesystem your active log path is on fills up, your database will be inaccessible until you clear up the full filesystem. The moment the filesystem is no longer full, the database will function again, so there is no need to restore. I recommend monitoring any filesystems involved in transaction logging.
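Even a bare-bones cron check is better than nothing. A sketch, with made-up paths and a made-up 90% threshold:

df -P /db2/activelogs /db2/archlogs | awk 'NR>1 && $5+0 > 90 {print $6 " is " $5 " full"}'

Pipe the output into whatever pages your on-call; proper monitoring tools do this more gracefully.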
Deleting too many log files and impacting recovery
If you’re on anything before DB2 9.5, make absolutely sure that you use the “include logs” keywords on the backup command. If you don’t, you may end up with a backup that is completely useless, because you MUST have at least one log file to restore from an online backup. When you delete log files, keep your backup/recovery strategy in mind. There’s very little worse than really needing to restore but being unable to do so because you’re missing a file. I recommend backing up your transaction logs to tape or through other OS-level methods as frequently as you can.
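On those older versions, the backup command looks something like this (the database name and path are placeholders):

db2 backup db MYDB online to /db2/backups include logs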
Deleting recent files and impacting HADR
Sometimes HADR needs to access archive log files – especially if HADR is behind and needs to catch up. If it needs a log file that is no longer there, you have to set HADR back up using a database restore. If you’re using HADR, it is important to monitor it so you can catch failures as soon as possible and reduce the need for archive logs.
Log files too small
Tuning the size of your log files may be a topic for another post, but I’ll cover the highlights. Large deletes are the most likely to chew through everything you’ve got. The best solution is to break up large units of work into smaller pieces, especially deletes. Where that’s not possible (ahem, stagingcopy), you’ll need to increase one or more of LOGFILSIZ, LOGPRIMARY, or LOGSECOND. Only LOGSECOND can be changed without recycling the database.
Log file saturation
This one confuses the heck out of new DBAs. You get what looks like a log-file-full error, yet the disk is not full and a snapshot says there’s plenty of log space available. The problem here is that with archive logging, log files (and each spot in those log files) must be used sequentially – even if they contain transactions that have already been committed. Normally the database is rolling through the logs, with the same number of files active at once but constantly changing which files those are.
Sometimes an old connection is sitting out there hanging on to a page in a log file with an uncommitted unit of work. Then the connection becomes idle and stays that way, sometimes for days. Eventually DB2 gets to the point where it has to open another log file, and it can’t, because that would be more than it is allowed to allocate. So it throws an error that looks pretty similar to log file full. In that case, you must force off the old idle connection. Details are written to the diagnostic log, and you can also use a database snapshot to get the id of the connection holding the oldest log file.
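A sketch of hunting down the culprit (MYDB and the application handle 1234 are made up; the exact snapshot wording varies by version):

db2 get snapshot for database on MYDB | grep -i "oldest transaction"
db2 "force application (1234)"

The snapshot output includes a line identifying the application id holding the oldest transaction, and that id is what you feed to FORCE APPLICATION.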
This never happens to Commerce’s own connections, in my experience – it is usually a developer’s connection, from what I’ve seen in Commerce databases. Commerce, when functioning normally, rarely has a connection with more than 5 minutes of idle time. So I like to have a DB2 governor running that forces off connections that have been idle for more than 4 hours.
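For reference, an untested sketch of a governor rule for that – check the syntax against the documentation for your version, since the governor is deprecated in recent releases (the file names are made up):

interval 300;
desc "force long-idle connections" setlimit idle 14400 action force;

started with something like: db2gov start MYDB /db2/cfg/gov.cfg /db2/logs/gov.log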
Hi Ember,
We have a DB2 warehouse database in a Tivoli environment. Lately the database has been generating heavy logs. The transaction log directory is the same as the database’s, and I am not able to find a proper way to archive/delete the logs.
We have configured online backup for the database, and it runs 3 times a week. Does the backup include the transaction logs as well?
Also, if you could help on how to archive/delete the logs, that would be great.
Any help in this regard is appreciated 🙂
Thanks in advance.
Regards,
Priyank Arora
Backups will only include the log files used during the backup itself, assuming you’re on DB2 9 or higher OR you specify INCLUDE LOGS when you take the backup.
Do you have TSM? If so, it may be best to archive log files to TSM using the LOGARCHMETH1 parameter. The other option is to set LOGARCHMETH1 to a directory on the server. Then you know it is safe to compress anything in that archive directory – and you can also back up that archive log directory to TSM or TAPE.
Once the logs are in a separate archive log directory, you have to decide how long you would like to keep them. You can delete them from that archive log directory whenever you want to, keeping in mind that you want to retain log files as long as you need to be able to restore the database to a point in time. So in my case, I have to be able to restore some databases to any given time in the last two weeks. So I compress the logs in my archive log directory when they’re older than 3 days, and delete them when they’re older than 14 days – just using normal OS-level compress and delete commands. I never touch a log file in the active log directory.
You could also use the AUTO_DEL_REC_OBJ database configuration parameter along with REC_HIS_RETENTN and NUM_DB_BACKUPS – if you want to manage log files (and database backup files) through DB2, and you don’t care to keep anything in your history file longer than it is on disk.
With some Tivoli databases, you don’t really care about the data, just about getting the structure of the database back, and you only retain log files so you can take online backups. If that’s the case here, then you could set LOGARCHMETH1 and then delete log files older than a day or so, relying on the INCLUDE LOGS functionality of the backup to ensure each backup can be restored by itself – but you would be unable to roll forward past the end of the backup.
Backups will only include the log files used during the backup itself, assuming you’re on DB2 9 or higher OR you specify INCLUDE LOGS when you take the backup.
From the above, do you mean that only the active logs will be backed up, or will the older log files also be backed up?
We have LOGARCHMETH1 set to LOGRETAIN. Can you please explain how we can set it to a directory on the server and then archive/compress the logs? And which log files will it archive to that directory? (Are there any conditions on which files get archived, e.g. older than 3 days?)
Unfortunately we don’t have TSM. We are on DB2 9.7 FP2.
/opt/IBM/ITM/db2/db2inst1/NODE0000/SQL00001/SQLOGDIR – this is the log directory in our environment.
Can we have a separate directory for the active logs and the old ones?
How do we decide how long the log files should be retained? The database data we need goes back at least 3 months.
Also, I am not sure whether the online backup we are taking uses the INCLUDE LOGS option.
Thanks Again.
Priyank
Query: How can I analyze a DB2 database archive log file? How can I verify that a log file is related to a particular database? How can I view the DB2 database id?
I ask because I am facing the following issue:
RETCODE : ZRC=0x071000D6=118489302=SQLP_EXT_DBID_INCORR
“Database ID does not match Extent probably for another database.”
Here I had recreated the database with the same name, and during restore an older log file somehow came into the picture.
I can’t find a way to do that – either to query or compare the database seed or to get the database seed from the log file. I’m asking around and will let you know if I find a way to do it.
Thanks. It will be helpful to get this answer so we can find out whether it’s an issue on our vendor’s side or on the DB2 side.
Hi Ember,
We are a data warehouse team using DB2 running on an ISAS box.
As part of the daily data load for some of our facts, we are deleting data from 3 core tables and reloading the latest data.
The number of rows marked for daily delete is approximately 400,000. We are using a stored procedure to perform this delete, and the commit point is set at 160,000 rows.
Of late we are witnessing a considerable delay in the delete operations for 350,000 or more rows. When we abort the query and restart it, the delete completes in normal time.
This is happening more frequently of late, and we are trying to identify ways to manage these delete operations.
We have started with fine-tuning the delete query, but we are also wondering if the commit size of 160,000 rows is a bit too many, which might clog up the log.
What are your thoughts on this?
PS: At the time of the delete operations, there are multiple sessions running in parallel, reading/inserting data into the same database schema.
How is the stored procedure coded to limit the number of rows in a delete? Does it run multiple deletes in parallel?
The SP has logic to pick up rows for the last 3 calendar days from the table and delete the data for those 3 days. No, multiple deletes do not run at the same time, but yes, there are multiple sessions (Informatica sessions) inserting into and reading from the same database.
I just wonder how it’s setting commit points because there are slow and fast ways of doing that. See the comments on this post: https://datageek.blog/2012/07/12/how-to-delete-data-from-a-db2-table-and-release-disk-space/
You may want to give the table a REORGCHK to see if it’s fragmented.
thx.
Hi, Ember. We have DB2 V9.7.7 installed, with a separate filesystem for archive logs, and it’s getting filled up. We take daily online backups. Right now the current active log number is 1556, so my question is: given the filesystem space issue, can I delete my old archived logs starting from log number 1 since we take daily online backups, or do we need a full offline backup before deleting the archive logs? Please advise.
You need to keep whatever logs your recovery requirements call for. For most systems, I like to be able to restore to any time in the last two weeks, which requires that I keep two weeks’ worth of log files. This obviously depends on your recovery requirements and on how/if the log files are copied to system backups or to tape. Each backup lists the earliest log file it needs for rollforward recovery – you can also use that to make sure you keep what you need. There’s also an automated methodology to delete both logs and backups, using these three parameters:
num_db_backups – the number of database backups to retain
rec_his_retentn – the number of days the recovery history will store information on recovery objects, including transaction logs and backups
auto_del_rec_obj – if set to “yes” or “on”, recovery objects will actually be deleted when they expire per REC_HIS_RETENTN and NUM_DB_BACKUPS
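A sketch of enabling that methodology (MYDB and the retention values are placeholders):

db2 update db cfg for MYDB using NUM_DB_BACKUPS 14
db2 update db cfg for MYDB using REC_HIS_RETENTN 14
db2 update db cfg for MYDB using AUTO_DEL_REC_OBJ ON

With those set, DB2 prunes the history file and deletes the expired backup images and log files after each successful backup.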
Thanks Ember, valuable information.
Hi Ember, thanks for the site/blog.
Anyways, are you doing any log shipping for offsite backup? I see you mention some scripts for handling log files. I was wondering, er, if you had already done the work. I was thinking of a cron job script that would:
1) Close active log files
2) FTP closed log files to the DR server that weren’t already over there.
and of course
3) A cron job on the DR server to roll forward to the end of logs every X hours or so.
I am searching away today, learning about log files etc. I will check some of the links you have here also.
Thanks again
Phil
Why would you use this form of log shipping instead of HADR?
Actually, I am just starting to get a handle on HADR, and looking into that. It’s more a case of discovering what options are out there.
I would use HADR long before I would use such a method. I haven’t seen that kind of manual log shipping in years. I’ve seen mirroring at the disk level, but not that.
Hi Ember,
Is there a way to prune archive logs from the DISK:/Archive_log_path up to a certain number of days, and at the same time have that pruning of just the archive logs reflected in the history/metadata, with the related archive log entries removed – all without removing the other backup, recovery, and load entries from the history file for that range of days or time…
Looks like the db2 prune history command with the DELETE parm can clean up the metadata in the recovery history file without touching physical log or backup files.
Have you or anyone else tested this, and can you show a small example of something similar being done…
I don’t have an example, but if you set the parameter AUTO_DEL_REC_OBJ (see http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.config.doc/doc/r0051456.html?cp=SSEPGG_9.7.0%2F2-2-6-7-6) and then issue the prune history command, it should do what you are asking. Keep in mind it will ALSO prune backup images if they are purged from history with this parameter. I tend to be a fan of scripting it myself – deleting files older than a certain number of days as long as their number is not after the first log file I want to keep based on my recovery history.
I will try what you mentioned, but I would need to prune the archive log history alone out of the recovery history file, along with the archive logs in the DISK path, and in the process leave the backup, restore, and load related history entries intact…
Similarly, I need to know if it’s possible to purge the backup entries or load entries from the history file individually as well, through db2 prune history or any other similar command…
That setting does log files and backups as one – though it respects NUM_DB_BACKUPS. If you want to do them individually, you’ll have to do it manually with a script.
Nice post, this one and all your others – very informative. I don’t fully agree with your comment on LOGRETAIN and circular/archive logging, but that is not my question.
You provided a link under LOGARCHMETH1 that I’d like to take a look at, but IBM changed things again, and the link no longer works. Where/how might I find the info from that link?
Thanks,
Mario
I’ll correct it in the post, too: http://www.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.config.doc/doc/r0011448.html
At what point will transaction logs be moved to the archive log path? My understanding is that after a commit, transaction logs are moved to the archive log path. Am I right? If not, please clear this up for me. Thanks in advance.
After a commit, transaction logs CAN be moved to the archive path. But remember that there are many transactions of varying durations happening at once, each using only pieces of a transaction log file. The only times that all transaction logs will be archived are on backup, on database deactivation/activation (I think), and on the ARCHIVE LOG command. You can see attempts to archive transaction logs in the DB2 diagnostic log, and also in the database history.
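If you want to force that last case, the command is simply (MYDB is a placeholder):

db2 archive log for db MYDB

You can then confirm the archive attempt in the database history with something like db2 list history archive log all for MYDB.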
I have run the below prune command successfully, but the logs are not pruned.
Is any configuration needed to run this command?
db2 prune logfile prior to