When I do work in other domains – even something as connected as scripting – I find the error messages confusing and unhelpful. Most of the time, this is mostly because I don’t know much about the language or the the area. I’d like to think that my opinion that Db2 error messages are really quite helpful and useful is unbiased, but having built a career on focusing on IBM Db2, I’m sure I am biased.
I ran into this error message recently and thought I’d describe what the issue was.
I’d like to think that I’m pretty decent at backup and recovery. I’ve done hundreds of restores in my career in a vast array of scenarios for real-world recovery, data movement, and fixing developers’ mistakes. Today I saw an error I’ve never seen before, so I thought I’d share.
I can’t say that I work with federation on a daily basis. I have, however, set up federation for an Oracle data source from top to bottom, and support at least one system on a daily basis that uses federation to access Oracle, Informix, and DB2 for AS/400. This database also happens to be a BLU database, and as such, we want to stay really current on the fix pack. Currently with BLU being so new, there can be some drastic differences with just a fix pack.
Loading data into a DB2 database using the LOAD utility has a lot of ins and outs. If a DBA has spent a lot of time working in a database without referential integrity or check constraints, then they may forget to check for tables in a check-pending state after loading data. To illustrate the points here, I’ll be using the SAMPLE database and examples you can work through to learn the concepts in detail.
I recently ran into a couple of db2look problems for different clients, and thought I’d share them in case someone else runs into them.
When connecting between different versions of DB2, you generally need to bind packages from the different versions against the DB2 database. Just because you’ve bound db2ubind and db2cli on the server does not mean that all possible clients connecting in are covered. Sometimes binds happen implicitly, but sometimes they don’t. And you don’t necessarily want to grant a user rights to bind packages just because they’re going to need to do it once every 6 months when you apply a Fixpack or upgrade. It used to be that we didn’t have much choice in the matter – we had to get a client of the right version to bind from that client. But, luckily, in recent years, IBM has provided us with the files we need to perform binds for any version and Fixpack we may need.
Environment and ComplicationsThere were a couple of complications due to the client being on 9.5 FixPack 1. It was a small data warehouse (1.6 TB on 1 catalog and 4 data nodes). LOCKTIMEOUT is set to -1. The troubleshooting and resolution would have been significantly easier on a newer version of DB2. There were two complicating factors with the old environment:
- The db2top utility had not yet been incorporated into DB2 on this version, so I had manually installed it, and occasionally it doesn’t function quite right. At the time of this issue, it refused to show me the sessions screen without crashing.
- A host of table functions I use to administer DB2 were introduced in db2 9.7, and are not available in 9.5, and there is no way to back-port them. I normally would have made use of either MONREPORT or MON_GET functions or both.
Clearing up Lock ChainThe first thing I did was to clear up the reorg and other connections. As the instance owner, I made my putty session as large on my screen as possible. Next, I looked at the connections in db2top by issuing:
db2top -d sample(where sample is the database name)and then pressing U to get the locking screen. I don’t have a screen shot of how that screen looked, but it showed me multiple connections in a red “lock waiting” state. By pressing L, I got the lock chain. While not from this specific issue, the lock chain screen looks something like this: The main thing I’m getting from that screen are the three applicaiton ids that were involved in lock chains. Next, I use one of two methods to determine which application was the reorg and force off the other two. Method 1: From the locks screen in db2top, press A. It will prompt you for an agent id. Enter one of the agent ids and see if you can find where it lists an sql statement or former action. It may or may not give you this info in an easy to read format. This screen is the reason I make my putty window as large as possible – there is a lot of data to display Method 2: Exit out of db2top by pressing q or ctrl+c. At the command line, issue:
db2 get snapshot for applications on sample > appsnap.out
eral screens of text for each one to see if there is a reference to a reorg or to a script being run. Pay attention to the authorization IDs being used and the idle time. Alternately, search the file on “eorg” to find applications that have recently done reorgs. I copy about the first screen of this output when giving a client details about a connection I’d like to force, as it includes things like where the connection is coming from, the authorization id, and the idle time.
Identifying and Clearing Table Problem
The reorg then completed or failed immediately. I attempted a simple select on the table and noted the exact error message and return code. This is what I got:
Backups are so ingrained into DBAs. They should really be the first and the last thing we think of and ensure we do properly. We do regular backups so we can get data back in case of some failure, be it human, software, or hardware. We do ad-hoc backups before and after upgrades or fixpacks, before and after major application or database structure changes. Frequently, backups are used to move data between servers.
There are a couple of errors that you can get on database connection that simply mean you typed something wrong, but I figure there are people who will search on these errors, so I thought I would share. If you do not already have a database connection, you can get:
A story of fail and recover.