Edit 2019-04-02: corrected typo in the explanation of the -tbspname option. Also, a colleague notes that a 16K page size table space also works for this.
db2caem is a tool in 9.7 that reports actual row counts in your explain plan alongside your estimated row counts.
What does this look like?
Access Plan: ----------- Total Cost: 570.128 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 1.0123 NLJOIN ( 2) 570.128 551.026 /-------+-------\ 1.01244 0.999855 TBSCAN FETCH ( 3) ( 4) 554.792 15.1474 549 2.0013 | /---+----\ 6918 0.999855 6917 TABLE: WSCOMUSR IXSCAN TABLE: WSCOMUSR CATENTRY ( 5) CATENTDESC Q1 7.5741 Q2 1 | 6917 INDEX: WSCOMUSR I0001248 Q2
Access Plan: ----------- Total Cost: 570.128 Query Degree: 1 Rows Rows Actual RETURN ( 1) Cost I/O | 1.0123 0 NLJOIN ( 2) 570.128 NA /-------+-------\ 1.01244 0.999855 0 0 TBSCAN FETCH ( 3) ( 4) 554.792 15.1474 NA NA | /---+----\ 6918 0.999855 6917 NA 0 NA TABLE: WSCOMUSR IXSCAN TABLE: WSCOMUSR CATENTRY ( 5) CATENTDESC Q1 7.5741 Q2 NA | 6917 NA INDEX: WSCOMUSR I0001248 Q2
So it may look like a minor difference to the untrained eye. In this case you can see that the actual row counts are actually 0. That would be a problem in many scenarios. In this case, I’m actually using mimiced statistics for another project, so I expect that, and it’s fine. If you’re using mimiced statisctics, you’ll just about always see a huge disperity.
When is This Information Useful?
Sometimes the optimizer gets it wrong or estimates incorrectly. If that’s the case, it can cause performance problems, and seeing estimates and actuals that are different can be important. One thing this might tell you is that your runstats are old. It might also tell you that you need to increase NUM_QUANTILES or NUM_FREQUVAULES because your data is unusually skewed. Sometimes you may have to go as far as using an optimization profile to nudge DB2 in a different direction.
How do I use db2caem?
This is really simple. db2caem is easily executed from the command line. Here’s my preferred syntax:
> db2caem -d SAMPLE -o query1.caem -tbspname dba32k -sf query1.sql ____________________________________________________________________ _____ D B 2 C A E M _____ DB2 Capture Activity Event Monitor data tool I B M The DB2CAEM Tool is a utility for capturing the activity event monitor data with details, section and values, as well as actuals. which could be used when analyze the SQL statement performance. ____________________________________________________________________ ________________________________________________________________________________ DBT7041I "db2caem": The db2caem utility is connecting to the following database: "SAMPLE". DBT7038I "db2caem": The db2caem utility successfully connected to the following database: "SAMPLE". DBT7042I "db2caem": The SQL statement "SELECT CD.NAME FROM wscomusr.CATENTDESC CD , wscomusr.CATENTRY C WHERE C.CATENTRY_ID = CD.CATENTRY_ID AND C.MFPARTNUMBER = ?" is being issued. DBT7043I "db2caem": The db2caem utility is disconnecting from the following database: "SAMPLE". DBT7039I "db2caem": The db2caem utility successfully disconnected from the following database: "SAMPLE". DBT7000I db2caem completed. Output path: "/redacted/ember/query1.caem/DB2CAEM_2012-03-08-23.08.08.376708".
Let’s look at that one element at a time.
This is simply the database name. It may have been changed in the output in this example to protect the innocent.
This is a directory to place the output in. You have to create the directory before running db2caem – it will not create it for you. You don’t have to specify this, but I like to keep similar names on files related to a specific query so I can easily look back at it later. The output is actually put in a subdirectory of this directory.
Again, you don’t have to specify this. This utility temporarily creates an event monitor that writes to an unformatted event monitor table, and this tablespace will hold the table. I have a 32k tablespace that I keep around just for my own use that could never possibly impact my running app if it were to fill up. I wouldn’t expect this process to use a whole lot of space, but I’m a paranoid control freak, so prefer to specify it.
This is the name of the file where I’ve stored the query I want to do the explain on. I’ve made sure the schema qualifiers are in place and that there’s a semi-colon at the end.
Friday, I’ll be presenting on the DB2Night Show’s DB2’s Got Talent competition about what db2caem is really doing behind the scenes. Sign up to attend here: https://www2.gotomeeting.com/register/663768226
Thank you for all the info.
I did try the CLP you wrote above and didn’t work
The error message say.
DBT7020E “db2caem”: The db2caem command failed because the specified output path does not e qualified. Specified output path: “/home/db2inst/f21monitor/query2.caem”.
I did specify the path of output.
Could you please tell me what I did wrong?
Thank you in advance.
I would try specifying an output path with the -o option. I did not encounter that particular error. The full syntax in the info center is available here: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.admin.cmd.doc/doc/r0057282.html
You would also have to specify a tablespace that actually exists – the DBA32K is the name of a tablespace in my database, in case I did not make that clear.
Excellent articles. I like all your posts. Thanks for sharing your knowledge
I did try db2caem. The statements must be declared with real values on the where clause. If we try using parameter markers, the analyser tool returns CLI0100E (Wrong number of parameters.).
Agreed, that is a problem with db2caem.