db2caem – Get Actual Row Counts in Your Explain Plans

Edit 2019-04-02: corrected typo in the explanation of the -tbspname option. Also, a colleague notes that a 16K page size table space also works for this.

db2caem is a tool in 9.7 that reports actual row counts in your explain plan alongside your estimated row counts.

What does this look like?

With db2exfmt:

Access Plan:
-----------
        Total Cost:             570.128
        Query Degree:           1

                Rows
               RETURN
               (   1)
                Cost
                 I/O
                 |
               1.0123
               NLJOIN
               (   2)
               570.128
               551.026
         /-------+-------\
     1.01244            0.999855
     TBSCAN              FETCH
     (   3)              (   4)
     554.792             15.1474
       549               2.0013
       |               /---+----\
      6918        0.999855       6917
 TABLE: WSCOMUSR   IXSCAN   TABLE: WSCOMUSR
    CATENTRY       (   5)     CATENTDESC
       Q1          7.5741         Q2
                      1
                     |
                    6917
               INDEX: WSCOMUSR
                  I0001248
                     Q2

With db2caem:

Access Plan:
-----------
        Total Cost:             570.128
        Query Degree:           1

                Rows
             Rows Actual
               RETURN
               (   1)
                Cost
                 I/O
                 |
               1.0123
                  0
               NLJOIN
               (   2)
               570.128
                 NA
         /-------+-------\
     1.01244            0.999855
        0                   0
     TBSCAN              FETCH
     (   3)              (   4)
     554.792             15.1474
       NA                  NA
       |               /---+----\
      6918        0.999855       6917
       NA             0           NA
 TABLE: WSCOMUSR   IXSCAN   TABLE: WSCOMUSR
    CATENTRY       (   5)     CATENTDESC
       Q1          7.5741         Q2
                     NA
                     |
                    6917
                     NA
               INDEX: WSCOMUSR
                  I0001248
                     Q2

So it may look like a minor difference to the untrained eye. In this case you can see that the actual row counts are actually 0. That would be a problem in many scenarios. In this case, I’m actually using mimiced statistics for another project, so I expect that, and it’s fine. If you’re using mimiced statisctics, you’ll just about always see a huge disperity.

When is This Information Useful?

Sometimes the optimizer gets it wrong or estimates incorrectly. If that’s the case, it can cause performance problems, and seeing estimates and actuals that are different can be important. One thing this might tell you is that your runstats are old. It might also tell you that you need to increase NUM_QUANTILES or NUM_FREQUVAULES because your data is unusually skewed. Sometimes you may have to go as far as using an optimization profile to nudge DB2 in a different direction.

How do I use db2caem?

This is really simple. db2caem is easily executed from the command line. Here’s my preferred syntax:

> db2caem -d SAMPLE -o query1.caem -tbspname dba32k -sf query1.sql

      ____________________________________________________________________

                      _____     D B 2 C A E M     _____

                DB2 Capture Activity Event Monitor data tool
                              I      B      M

          The DB2CAEM Tool is a utility for capturing the activity event
         monitor data with details, section and values, as well as actuals.
          which could be used when analyze the SQL statement performance.
      ____________________________________________________________________

________________________________________________________________________________

DBT7041I  "db2caem": The db2caem utility is connecting to the following database: "SAMPLE".

DBT7038I  "db2caem": The db2caem utility successfully connected to the following database: "SAMPLE".

DBT7042I  "db2caem": The SQL statement "SELECT CD.NAME FROM wscomusr.CATENTDESC CD , wscomusr.CATENTRY C WHERE C.CATENTRY_ID = CD.CATENTRY_ID AND C.MFPARTNUMBER = ?" is being issued.

DBT7043I  "db2caem": The db2caem utility is disconnecting from the following database: "SAMPLE".

DBT7039I  "db2caem": The db2caem utility successfully disconnected from the following database: "SAMPLE".

DBT7000I  db2caem completed. Output path: "/redacted/ember/query1.caem/DB2CAEM_2012-03-08-23.08.08.376708".

Let’s look at that one element at a time.

-d SAMPLE

This is simply the database name. It may have been changed in the output in this example to protect the innocent.

-o query1.caem

This is a directory to place the output in. You have to create the directory before running db2caem – it will not create it for you. You don’t have to specify this, but I like to keep similar names on files related to a specific query so I can easily look back at it later. The output is actually put in a subdirectory of this directory.

-tbspname dba32k

Again, you don’t have to specify this. This utility temporarily creates an event monitor that writes to an unformatted event monitor table, and this tablespace will hold the table. I have a 32k tablespace that I keep around just for my own use that could never possibly impact my running app if it were to fill up. I wouldn’t expect this process to use a whole lot of space, but I’m a paranoid control freak, so prefer to specify it.

-sf query1.sql

This is the name of the file where I’ve stored the query I want to do the explain on. I’ve made sure the schema qualifiers are in place and that there’s a semi-colon at the end.

 

Friday, I’ll be presenting on the DB2Night Show’s DB2’s Got Talent competition about what db2caem is really doing behind the scenes. Sign up to attend here: https://www2.gotomeeting.com/register/663768226

Ember Crooks
Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

Articles: 544

5 Comments

  1. Thank you for all the info.
    I did try the CLP you wrote above and didn’t work

    The error message say.
    DBT7020E “db2caem”: The db2caem command failed because the specified output path does not e qualified. Specified output path: “/home/db2inst/f21monitor/query2.caem”.

    I did specify the path of output.
    Could you please tell me what I did wrong?

    Thank you in advance.

  2. Hi Ember,
    I did try db2caem. The statements must be declared with real values on the where clause. If we try using parameter markers, the analyser tool returns CLI0100E (Wrong number of parameters.).

    Regards.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.