It happens. We joke that it doesn’t. Make sure not to tell our Oracle or SQL Server counterparts, but occasionally DB2 becomes … how do I put this eloquently? Hosed.
Zombied (think the “Walking Dead” TV show).
Sometimes there is a memory leak, an instance-level crash, a trap, or an OS issue. DB2 just goes into a bad state, and you have to bring it down as gently as you can.
And if not, hit it with a bat. Whatever works.
Even when DB2 comes down properly, you need to be diligent to make sure EVERYTHING has stopped before some more serious work happens, like a DB2 upgrade. You need ZERO DB2 processes in memory.
[Side Note: Ember has a really good article on stopping DB2 in preparation for an upgrade. Check out the article here – Link]
There is a level of escalation when DB2 is snarled. You start by deactivating the database, which allows everything to come to a natural end. If that fails, move to step two; if step two fails, move to step three; if step three fails, move to the big guns in step four. If you end up at step five, you know things are not good.
- Deactivate Database
- DB2STOP
- DB2STOP force
- DB2_KILL
- Server Reboot
There is always a real risk of instance or database corruption with a DB2_KILL. It is always a last resort.
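The escalation above can be sketched as a command sequence. This is illustrative only: run it as the instance owner, SAMPLE is a placeholder database name, and each later step should only be attempted if the previous one fails.

```shell
db2 deactivate database SAMPLE   # 1. let activity come to a natural end
db2stop                          # 2. normal instance stop
db2stop force                    # 3. force remaining applications off
db2_kill                         # 4. big guns: kills engine processes outright
# 5. server reboot -- coordinate with your SA
```

Because db2_kill skips the graceful cleanup, the memory checks described below matter most after step four.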
Let’s say DB2 is down one way or another, and we want to be sure all DB2 processes are out of memory and the server isn’t chewing on something that could be a zombie. How do we accomplish that in an AIX/UNIX/Linux environment?
From the DB2 Info Center: “The ipcs command writes to the standard output information about active interprocess communication facilities. If you do not specify any flags, the ipcs command writes information in a short form about currently active message queues, shared memory segments, semaphores, remote queues, and local queue headers”.
To use this in our environment you can issue the following.
Command: ipcs | grep db2

This returns something similar to:

q 154140691 0xffffffff -Rrw------- INSTID GROUPNM
q 611319866 0xffffffff -Rrw------- INSTID GROUPNM
m 708837404 0x7e074e61 --rw------- INSTID GROUPNM
m 577765405 0xffffffff --rw------- INSTID GROUPNM
s 16777236 0x7e074e74 --ra-ra-ra- FENCEID GROUPNM
(Where ‘q’ is a Message Queue, ‘m’ is a Shared Memory Segment, and ‘s’ is a Semaphore.)
In short, every item returned is a DB2 IPC resource still held in memory. If you brought DB2 down cleanly there may be very few, or none at all. If you hit DB2 with a bat (db2_kill), you may have many.
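Before the serious work (like that upgrade) begins, the check amounts to two quick commands, run as the instance owner. This is an illustrative fragment; db2inst1 is a placeholder for your instance owner’s user name.

```shell
ps -ef | grep -v grep | grep db2   # engine processes; should return nothing
ipcs | grep db2inst1               # IPC resources; should return nothing
```

If either command returns rows, DB2 is not fully out of memory yet.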
To clean your memory of DB2 processes you can issue two different commands.
Issuing “ipclean” by itself will act as a catch-all. Essentially any processes in memory for that ID are terminated, or a terminate is attempted. If you came down cleanly with a deactivate and DB2STOP, IPCLEAN should be sufficient. DB2_KILL issues an ipclean as a part of its processing.
From the DB2 Info Center: “Removes message queue, semaphore set, or shared memory identifiers”.
Command:
ipcrm -q ID
ipcrm -m ID
ipcrm -s ID
Where q, m, or s matches the type shown in the first column of the IPCS output, and the ID is listed in the second column.
ipcrm -q 154140691
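Putting ipcs and ipcrm together, here is a small sketch that generates the cleanup commands for review before you run any of them. It assumes the instance owner is the current user and that ipcs prints the type in column one and the ID in column two, as in the output shown above.

```shell
# Generate (but do not execute) an ipcrm command for every IPC resource
# owned by the current user. Column 1 of ipcs is the type (q/m/s) and
# column 2 is the numeric ID.
ipcs | grep "$USER" | awk '{ printf "ipcrm -%s %s\n", $1, $2 }'
```

Review the generated list carefully, and only paste the commands back in (or pipe them to sh) once you are certain DB2 is fully down; ipcrm on a live resource is how you turn a bad day into a worse one.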
For more details on the IPCS and IPCRM commands, head over to the DB2 Info Center at http://tinyurl.com/dyra4m2 .
Let’s admit it; we are not System Administrators (SAs). Use these commands with caution, as they can cause more harm than good. Run them as the DB2 instance owner. Use common sense, and if you are in over your head, reach out to your SA. They will be happy to help, and getting them involved could prevent a reboot.
I don’t know about you, but I bribe my SA for help with coffee. Don’t make a newbie mistake and try to bribe with that weak stuff in the break room. If the SA grumbles and snarls, you used cheap coffee. Proceed to Starbucks quickly.
Michael Krafick is an occasional contributor to db2commerce.com. He has been a production support DBA for over 12 years in data warehousing and highly transactional OLTP environments. He was acknowledged as a top ten session speaker for “10 Minute Triage” at the 2012 IDUG Technical Conference. Michael also has extensive experience in setting up monitoring configurations for DB2 databases as well as preparing for high availability failover, backup, and recovery. He can be reached at “Michael.Krafick (at) icloud (dot) com”. LinkedIn profile: http://www.linkedin.com/in/michaelkrafick. Twitter: mkrafick
Mike’s blog posts include:
10 Minute Triage: Assessing Problems Quickly (Part I)
10 Minute Triage: Assessing Problems Quickly (Part II)
Now, now you two play nice … DB2 and HACMP failover
Technical Conference – It’s a skill builder, not a trip to Vegas.
Attack of the Blob: Blobs in a Transaction Processing Environment
First of all, let me thank you for your wonderful blogs, which are so informative, resourceful, and easy to understand. Humble recommendations for future blog topics –
a) Can you write an article about db2top utility and the various practical ways in which it could be used to diagnose or investigate different types of issues? I don’t think there are many informative articles about db2top available on the internet.
b) I know this may be a tough one. Decoding access plans is a challenge for any DBA. Query tuning is an art in my opinion, and it spans way beyond converting your table scans to index scans, which is where most DBAs declare success. However, I have seen a number of cases in my own project where “ugly” access plans (with table scans) run much faster than ones with only index scans. Sometimes DB2 ignores indexes even though you think you have created the best one for that particular query, and in other cases DB2 chooses a “dirty” index which you wouldn’t want it to. Apart from index scans and index statistics, the type of “join” operators also holds a key to the efficiency of the operation being performed.

As a result of all this, I have reached a state where the more queries I tune, the more uncertainties keep cropping up, which ultimately leaves me less confident. And there aren’t many tutorials that teach you what to look for in an access plan (apart from the scans): how to determine whether a plan is efficient, what parameters or data items in an access plan you should look at, and so on. So I thought it would be good if you could write an article on “How to decode access plans” which could help thousands of budding DBAs like myself.

Once again, thanks a lot for your time and effort in writing these blogs. I can’t tell you how useful they are to me.
Crisp and clean, and only a little caffeine. Good Article.