This probably falls more in the DB2 newbie category, but I suspect I may have some readers who are DB2 newbies or at least not veterans.
Several years ago (ok, probably more than 5), DB2 seemed to be considering a “Problem Determination Mastery” certification. I jumped on board and went through a series of free tutorials from IBM (which are still linked to from the 8.2 Info center, but aren’t there any more), and took an online test. But IBM never moved forward with it – I got some certificate, but it doesn’t show up alongside my regular DB2 certifications.
The tutorials, while now a bit outdated are still excellent and I think they should be required for new DBAs – it’s such good information that isn’t presented that way anywhere else. I have no right to redistribute them, and would link to them if I could, but I can’t – they’re not out there anymore that I can find. So I’ve attached the PDF’s to this post. Their rights are fully owned by IBM.
The Error message is relevant
So in DB2 the error message and the whole error message returned when an error is received is very relevant. First, of course, is the SQL error code. This is the letters SQL followed by 4 or 5 digits. Commerce also sometimes reports these as negative numbers in output, such as:
> 2011-03-04 11:37:18.076, , com.ibm.commerce.context.content.ant.tasks.UpdateWorkspacesTableTask::performTask(), S1 CWXBR0001E: A generic runtime system exception occured. The exception is: "com.ibm.db2.jcc.a.SqlException: DB2 SQL Error: SQLCODE=-569, SQLSTATE=56092, SQLERRMC=WSCOMUSR, DRIVER=4.3.111
In this case, the error being reported is SQL0569. The return codes may also be relevant based on which error it is, and often if you have the full text of the error it will give you an object or buffer name or other critical information.
Based on that information, you can look up the error in the DB2 info center and /or at the command line:
$ db2 ? SQL0569 SQL0569N Authorization ID "" does not uniquely identify a user, a group or a role in the system. Explanation: The authorization ID specified by the GRANT or REVOKE statement does not uniquely identify a user, a role, or a group in the security namespace. The reference to "" is ambiguous. Note that when using DCE security, the USER, GROUP or ROLE keyword is always required. User response: Change the statement to explicitly specify the USER, GROUP or ROLE keyword to uniquely identify the specified authorization id. sqlcode: -569 sqlstate: 56092
You can also google the error and get quite good info, and search the DB2 support site to see if your issue might be a common one or one that has an associated APAR. Also the letter on the end can be useful “W” is clearly for warning, and normally means that the statement finished, but there’s an issue that may or may not affect the results. “N” is usually a full failure for that connection and processing completely stops with the statement receiving the error failing. “C” is rarer, but is usually associated with a system level failure where not only is the current application failing, but likely others for every connection as well.
I’ve personally never found SQLSTATE to be of any relevance.
Most SQL error messages that you see will be due to a user or application error of some sort, so some dbas just kind of pass them off. My take on it is that even though it might not be an error that I can correct at the system level, I can help the reporter to better understand the problem and help them find a way to resolve it.
Learn to read the DB2 diagnostic log
The notify log was designed to be more readable, but I’ve never been a fan of it. There are plenty of things in the DB2 diagnostic log that are not errors, so it can be cluttered. And even when you’re an expert at reading it, you’re not going to understand every bit. But it is excellent for seeing what problems might be and giving you pieces of information that might not be returned as part of the SQL error message for system-level errors.
You do have to manage the retention of the diagnostic log and associated trap files – I like to archive it off (db2diag -a) every month, and keep the old stuff for 9o days.
Don’t be afraid to open a PMR
So it’s not always fun dealing with IBM on PMRs. I frequently know more about the area I’m calling about than the initial people I’m talking to. But I have never had DB2 Support tell me to go read a manual. If I’m making a dumb mistake, they patiently help me find it. Most often someone has seen a similar problem and they can get me on the right track. Very rarely I’ve found a truly new bug and they’ll do a special build to get it fixed for me. All of this is surrounded by hours of wasted time and trying things that don’t work, and gathering sometimes GB of information. And sometimes they can’t tell you why it happened. They do have tricks with db2dart or hidden parameters or hidden commands that can be a silver bullet. They serve a valuable role, and especially if you haven’t been a DB2 DBA for 10 years, they can be a good source.
whoah this blog is great i love reading your articles. Keep up the great work! You know, lots of people are hunting around for this information, you could help them greatly.