This post ties in directly to my Friday presentation on the DB2Night Show’s DB2’s Got Talent Competition. Listen to the replay here: http://www.dbisoftware.com/blog/db2nightshow.php?id=341. My presentation is at the 52-minute mark. Most importantly, please vote for me at https://www.surveymonkey.com/s/DB2TALENT16MAR
I have just recently reached the point where 50% of my work is on DB2 9.7. Because WebSphere Commerce 6 still supports DB2 8.2, I still have some clients on DB2 8.2. That means I’m sometimes a bit late to the nifty new 9.7 features. But within the last 6 months I’ve reached the tipping point, and I’m enjoying discovering new features and methods.
This post is about db2caem (the Capture Activity Event Monitor data tool).
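For context, here is a minimal sketch of how db2caem is invoked. The database name, statement, tablespace, and output directory are placeholders of mine, and flag spellings can vary by fix pack, so check `db2caem -h` on your own system before relying on this:

```shell
# Hypothetical invocation -- database, statement, tablespace, and output
# directory are all placeholders; verify the flags with "db2caem -h".
db2caem -d SAMPLE \
        -st "SELECT COUNT(*) FROM SYSCAT.TABLES" \
        -tbspname MY_TBSP \
        -o /tmp/db2caem_out
```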
Why look into this?
The first time I saw an explain plan with actuals, my immediate thought was something along the lines of “OMG, I NEED that.” Yeah, sometimes I’m a Valley Girl in my head.
Anyway, the second time I saw it, several months later, my thoughts were more critical. I wondered how it gathered the information. Sure, it seems simple enough, but the picture that immediately came to mind was a statement event monitor gathering many MB of data and then pulling the data for just one statement out of it. I’m a bit shy of statement event monitors. I’ve seen an e-commerce database generate over a GB of data in just 15 minutes. In fact, when I ran a statement event monitor as my last resort in figuring out just what db2caem was doing, it generated over 150 MB of data on a non-production database, late in the evening. Maybe this is specific to WebSphere Commerce, then, rather than all e-commerce databases. In any case, in my mind, a statement event monitor is a performance-impacting thing that generates a lot of data and should only be turned on with caution and for the shortest possible period of time. I sure didn’t want some utility doing it without my knowledge.
And thus begins the Odyssey of investigating exactly what the tool does.
How I did this investigation
This was a bit of a challenge for me. I started, like most, at the Info Center. I read up on activity event monitors, and was not surprised to see that, like the locking event monitors I used in Analyzing Deadlocks – the new way, activity event monitors require two things to capture data – the existence of the event monitor and some other factor that tells DB2 to capture and send the data. This alone is different from the methodology I’m used to, where a statement event monitor captures all statements that meet its where clause – without me having to separately tell DB2 to capture the data. It seems to be a trend in the way things are going.
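A sketch of that two-part setup, with object names of my own invention – the first statement alone captures nothing until something like the second tells DB2 to actually collect:

```sql
-- Part 1: the activity event monitor itself (names are placeholders).
CREATE EVENT MONITOR my_act_evmon FOR ACTIVITIES WRITE TO TABLE;
SET EVENT MONITOR my_act_evmon STATE 1;

-- Part 2: nothing flows to it until collection is also enabled,
-- for example at the default service class level:
ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS
   COLLECT ACTIVITY DATA ON COORDINATOR WITH DETAILS;
```

Verify the exact COLLECT clause against the Info Center for your fix pack; the point is simply that both pieces are required.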
Once I had figured out it was an event monitor, I decided to see what tables it was creating. I knew it was an event monitor that wrote to tables because one of the syntax options is to specify a tablespace where such tables are created. I wrote a very simple shell script that listed all tables in my sparsely used tablespace, and from the tables I saw db2caem creating, I could tell which logical data groups the activity event monitor was capturing.
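The script was roughly along these lines – a loop polling SYSCAT.TABLES while db2caem ran in another session (database and tablespace names here are placeholders):

```shell
# Watch a tablespace for tables appearing/disappearing while db2caem runs.
# SAMPLE and MY_TBSP are placeholders for my database and tablespace.
db2 connect to SAMPLE >/dev/null
while true; do
    date
    db2 -x "SELECT RTRIM(tabschema) || '.' || RTRIM(tabname)
            FROM syscat.tables
            WHERE tbspace = 'MY_TBSP'"
    sleep 2
done
```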
The toughest nut for me to crack was what method it could be using to capture that event monitor activity. I was pretty sure by now that it wasn’t just capturing all activity and then pulling out the data it needed. At one point, I worried that it was creating a workload, which would put me out of licensing compliance, since I’m not licensed for the Workload Manager (WLM). I then guessed that it was using WLM_CAPTURE_ACTIVITY_IN_PROGRESS, but needed to verify this. It’s not exactly cake to come up with things like the last time a particular procedure was executed, so after looking at the dynamic SQL in the faint hope it would capture a procedure call, I settled on using a statement event monitor, knowing it would capture both static and dynamic calls. So I created a statement event monitor late at night on a development system, ran db2caem, and parsed the output.
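The tracing setup looked something like this sketch – names and paths are mine, and as discussed above, this kind of monitor belongs only on a quiet development system for the shortest possible window:

```sql
-- Statement event monitor used to trace db2caem itself.
-- trace_caem and the file path are placeholders.
CREATE EVENT MONITOR trace_caem FOR STATEMENTS
   WRITE TO FILE '/tmp/stmt_evmon';
SET EVENT MONITOR trace_caem STATE 1;

-- ... run db2caem from another session ...

SET EVENT MONITOR trace_caem STATE 0;
-- db2evmon then formats the binary event files into readable text:
-- db2evmon -path /tmp/stmt_evmon > trace_caem.txt
```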
What it really does
I’m going to go through it step by step here, in order. db2caem:
- Creates an independent set of explain tables (15 tables, 11 indexes). They include a unique ID in the schema name so no other process, including another db2caem execution, can use them. These tables are not created in the tablespace you specify for the event monitor tables; they’re created in the SYSTOOLS tablespace. This bugs me a bit – I wish they were created in the specified tablespace.
- Creates the event monitor for activities and the 4 associated tables in the tablespace that I specified at execution time.
- Uses the stored procedure WLM_SET_CONN_ENV to tell the activity event monitor to gather information on anything done from this connection. Note that this is different from my guess – I was wrong there – good thing I decided to be thorough in my research.
- Runs the statement that you gave it
- Uses the stored procedure WLM_SET_CONN_ENV to stop gathering activity for the connection.
- Turns off the event monitor
- Rolls back the statement that you gave it
- Uses the EXPLAIN_FROM_ACTIVITY stored procedure to capture explain information
- Runs dozens and dozens of SQL statements on the explain tables, the statement event monitor tables, and the system catalog tables. Obviously I can’t see what it does with every piece of information – I can only see that it’s capturing the data.
- Writes the explain plan with actuals to the output directory
- Exports data from the event monitor tables to a subdirectory of the output directory
- Drops the event monitor, the event monitor tables, and the explain tables
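The core of the sequence above can be sketched by hand, roughly like this. The event monitor name and explain schema are placeholders of mine, and the literal appl_id/uow_id/activity_id arguments stand in for the identifiers of the captured statement – check both routines in the Info Center before using them:

```sql
-- Start collecting activity data, with details and section actuals,
-- for the current connection (NULL application handle = this connection):
CALL WLM_SET_CONN_ENV(NULL,
   '<collectactdata>WITH DETAILS, SECTION</collectactdata>
    <collectactpartition>ALL</collectactpartition>');

-- ... run the statement you want actuals for ...

-- Stop collecting for this connection:
CALL WLM_SET_CONN_ENV(NULL, '<collectactdata>NONE</collectactdata>');

-- Populate the explain tables from the captured activity.
-- 'appl.id.placeholder', 1, 1 stand in for the real appl_id, uow_id,
-- and activity_id; the five ?s are output parameters.
CALL EXPLAIN_FROM_ACTIVITY('appl.id.placeholder', 1, 1,
   'MY_ACT_EVMON', 'MYSCHEMA', ?, ?, ?, ?, ?);
```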
Overall, it’s not as bad as I had feared. However, it’s also not as good as I could imagine. It creates a large number of objects (yes, I consider 19 tables a large number). It also does a lot of SQL that I can only hope is optimized. I don’t particularly like that the explain tables are created in the SYSTOOLS tablespace when I specified a different tablespace. I do love having the actuals, though. I’ve been in at least two tuning situations in my career where having those would have saved me a lot of time.
Please go vote for me! https://www.surveymonkey.com/s/DB2TALENT16MAR
DB2 Info Center entry on db2caem:
DB2 Info Center entry on activity event monitors:
Parameters included in the various logical data groups:
Details on WLM_SET_CONN_ENV:
Details on EXPLAIN_FROM_ACTIVITY: