(Edited 8/12/2014 to add links to the old tutorials from IBM)
There are a number of ways to cover error logging. I have covered some specific elements in previous posts, so I’m going for a more comprehensive approach in this post.
There used to be this great “Problem Determination Mastery” certification available. The study material and the test were only online. But the Tutorials associated with it were pretty good. They’re a bit on the old side now, but there’s still good information in them. I’ve included a copy of them here:
Part 2: Installation
Part 3: Connectivity
Part 4: Tools
Part 5: Engine
Part 6: Performance
Part 7: Multi-Node
Part 8: Application
Part 9: OS
These were clearly written by IBM and not me. If I could, I’d link to IBM’s site for them, but I can’t find them on any IBM site anymore. I do wish they’d release an updated version – I found them some of the most useful material.
I also cover a bit of problem determination in my post on What to do when DB2 won’t work
Understanding the basics of DB2’s error logging is really critical for any DB2 DBA.
DB2 Error Log Locations
DB2’s default location for error logging is to $INSTHOME/sqllib/db2dump. This is true also for clients – which do have a DB2 diagnostic log. You can change or find the location that db2 is using for the various error logging using the DIAGPATH dbm cfg parameter. Check the value using:
db2 get dbm cfg |grep -i DIAG Diagnostic error capture level (DIAGLEVEL) = 3 Diagnostic data directory path (DIAGPATH) = /db2diag/ Current member resolved DIAGPATH = /db2diag/ Alternate diagnostic data directory path (ALT_DIAGPATH) = Current member resolved ALT_DIAGPATH = Size of rotating db2diag & notify logs (MB) (DIAGSIZE) = 0
There are two major error logging files. The
db2diag.log is called the DB2 diagnostic log, and it is written to based on the the value of DIAGLEVEL. The other main file is called the administration notification log, and is named
<instance_name>.nfy. Other files can be written to the DIAGPATH as well.
There have been some recent changes in this area. Because the db2 instance is largely unavailable if the location where DIAGPATH is fills up, ALT_DIAGPATH has been added to allow you to specify a secondary location for error logging to ensure that the database instance can continue to function. You can also now specify a DIAGSIZE value. When you do specify a number in MB, this size is the total size you want your DB2 diagnostic log and administration log (on Linux or UNIX – on Windows it does not include the administration log). DB2 will allocate 90% of this size to the DB2 diagnostic log and 10% to the administration notification log. It will then manage the rotation of 10 log files each 1/10th the size specified – when the 10th file is full, it deletes the 1st file and creates the 11th file.
Since those are relatively new additions, I tend to just go with my old management strategy – which is to archive and compress the DB2 diagnostic log file and administration log file at the beginning of each month, and clear out some of the accumulated dump and misc files in the DIAGPATH at the same time. I retain the compressed files for 3-9 months depending on space.
If either the DB2 diagnostic log or the administration notification log are removed or renamed, db2 will re-create them so it has something to write to.
DB2 Diagnostic Log
The DB2 diagnostic log (db2diag.log) has been around forever. It contains detailed information about certain activities in the database.
The level of information captured in db2diag.log depends on the value of DIAGLEVEL. The possible values of DIAGLEVEL from the IBM DB2 Knowledge Center are:
0 – No diagnostic data captured
1 – Severe errors only
2 – All errors
3 – All errors and warnings
4 – All errors, warnings and informational messages
The default DIAGLEVEL is 3. 4 is usually too much information and may slow down processing – particularly LOADs. 1 or 2 may be too little information. In reality, I’ve only seen 4 used when troubleshooting a problem.
I have recently been subjected to a Tivoli monitoring implementation that emails me for EVERY error. And let me tell you, there are some things that I really wouldn’t consider error worthy. Examples include: a user specifies a password incorrectly, a database name is spelled incorrectly in some commands, some SQL syntax errors, a db2 command is piped to head or tail, and certain ways of executing db2 commands in perl scripts. I now fat-finger some things and I get multiple emails from Tivoli telling me so. It is a bit frustrating. I’m remembering why I don’t usually go in for db2diag.log parsing and alerting on the results.
Parsing the DB2 Diagnostic Log
I still tend to default to just taking a quick look at the db2diagnostic log for simple looks, and then accessing the db2 diagnostic log with sql for more complicated analysis. I only use the db2diag tool for archiving really – SQL access is easier. Click on either of the links to get details on using the other methods for accessing the data.
In my opinion, it is still important to be able to read a text db2diag.log and glean information from it. There is a wealth of data in there. The IBM DB2 Knowledge Center has an excellent level of detail on this:
If you go to that link, you’ll see that the general form of a db2diag.log entry is:
A couple of the most important things to note there are #3, which is the “level” indicator – it can help you eliminate some things as informational or warning rather than errors. #9 is the database name, which is useful if you have more than one database in an instance. Finally, a lot of error messages or other data are at the end, in #16 or after. Those are merely the highlights – I highly recommend reading the IBM DB2 Knowledge Center page on it to really understand reading (and therefore querying) entries in the db2diagnostic log.
DB2 Administration Notification Log
The DB2 Administration Notification Log has been added since I first became a DBA. It is billed as a more human readable place to look for errors, but I’m still an addict of the DB2 diagnostic log – once you learn to read it, there’s so much more information in the DB2 diagnostic log.
I recently ran a highly unscientific poll on the blog, asking the question “Do you use the DB2 administration notification log (instance_name.nfy)?”. This is what the results looked like:
I should really force voters to login so I could see how many of the responses were just Ian messing with me.
Twenty percent of the responses indicated didn’t know what the DB2 administration notification log was. A larger number than I expected admitted to using it. It is up to you whether or not you use the DB2 administration notification log. Just know that there is often more information in db2diag.log, for better or worse. I’d love to hear comments of how people work with it.
OS Error logging
Never forget your OS level logging when investigating db2 problems. A disk issue, for example, can explain a lot of oddness that may be harder to explain just from database level error logging.
Other Error Related Output
This entry has focused on the DB2 diagnostic log and the DB2 adminstration notification log. But there are other things that land in this directory, including:
- dump files
- trap files
- first occurrence data collection (FODC) packages
- STMM logs
- locktimeout files
These files can be useful, either for your own purposes or for providing to IBM DB2 support in case you have to open a PMR about an issue. They should be cleaned up on a regular basis, though, too, to prevent them from filling up the disk. I usually keep them for a minimum of 30 days.
Hi Ember, thank you for your blog. It is good place to find something useful. Just only one clarification to this article from me. It would be better if administrator of database will know how to use the command options for search information instead of using “grep” util. Sometime the size of log can be very huge and it can take a lot of time to find information by mentioned way. There is set of options for db2diag util which can help you with filtering of the messages in required period of time. Some example of searching errors during last hour you can find below:
db2diag -gi “Level:=error” -H 1h
Agree completely – I have an entry on db2diag: https://datageek.blog/2011/04/20/db2diag-tool-for-parsing-the-diagnostic-log/
If in case the alternative diag path is not set and if the current diag path is an external path which goes unavailable, what would happen to the instance? will the diag log goto the default /db2home in the meantime until it is available and resume it to go the external path after it is up?
My guess is that the instance will become unavailable and not write to some default location, but I don’t know for sure.
Thanks for your response. I did the activity after I posted this question . I set the alternative log path and did the exercise.