I’m always keeping an eye on Twitter to catch interesting bits of news, and that includes watching cross platform information. I find it fascinating to compare DB2 to other RDMBSes, and saw a great article from Use the Index, Luke (fabulous name!) – Seven Surprising Findings About DB2. I’d recommend reading it.
One of the things that stood out most was a view he created/used to display explain information in a way he says is similar to what other RDBMSes do. I’ve developed my skills with reading a text explain graph over the years, but it does take some learning.
I thought I’d try it out and see how I like it. You can get the full syntax for the view from the Use the Index Luke site here: http://use-the-index-luke.com/sql/explain-plan/db2/getting-an-execution-plan, or on github here: https://github.com/fatalmind/DB2-last-explained
I tested it on a fairly simple query from a list aggregation blog article that I’m working on. The query explained is:
select substr(i.indschema,1,12) as indschema, substr(i.indname,1,25) as indname, uniquerule, (select cast(listagg(( case when ic.colorder = 'A' then '+' || ic.colname when ic.colorder = 'D' then '-' || ic.colname when ic.colorder = 'I' then '|' || ic.colname end ), '') within group (order by ic.colseq) as varchar(100)) from syscat.indexcoluse ic where ic.indschema = i.indschema and ic.indname = i.indname ) as colnames from syscat.indexes as i where tabschema = 'DB2INST1' and tabname = 'PROJACT' order by i.tabschema, i.tabname, i.indname;
And the explain graph I get out of that using db2exfmt is:
Total Cost: 27.512 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 2.64198 NLJOIN ( 2) 27.5117 4.03704 /-------+-------\ 2.64198 1 TBSCAN GRPBY ( 3) ( 7) 13.8736 13.624 2.03704 2 | | 2.64198 2.29439 SORT TBSCAN ( 4) ( 8) 13.8735 13.6219 2.03704 2 | | 2.64198 2.29439 FETCH SORT ( 5) ( 9) 13.873 13.6216 2.03704 2 /---+----\ | 2.64198 428 2.29439 IXSCAN TABLE: SYSIBM FETCH ( 6) SYSINDEXES ( 10) 6.81363 Q4 13.621 1 2 | /---+----\ 428 2.29439 982 INDEX: SYSIBM IXSCAN TABLE: SYSIBM INDINDEXES02 ( 11) SYSINDEXCOLUSE Q4 6.8137 Q1 1 | 982 INDEX: SYSIBM INDINDEXCOLUSE01 Q1
Using the view from Use the Index, Luke – I get:
$ db2 "select * from last_explained" Explain Plan ---------------------------------------------------------------------------------------------------- ID | Operation | Rows | Cost 1 | RETURN | | 27 2 | NLJOIN | 3 of 1 | 27 3 | TBSCAN | 3 of 3 (100.00%) | 13 4 | SORT | 3 of 3 (100.00%) | 13 5 | FETCH SYSINDEXES | 3 of 3 (100.00%) | 13 6 | IXSCAN INDINDEXES02 | 3 of 428 ( .70%) | 6 7 | GRPBY (COMPLETE) | 1 of 2 ( 50.00%) | 13 8 | TBSCAN | 2 of 2 (100.00%) | 13 9 | SORT | 2 of 2 (100.00%) | 13 10 | FETCH SYSINDEXCOLUSE | 2 of 2 (100.00%) | 13 11 | IXSCAN INDINDEXCOLUSE01 | 2 of 982 ( .20%) | 6 Predicate Information 6 - START (Q4.TBCREATOR = 'DB2INST1') START (Q4.TBNAME = 'PROJACT') STOP (Q4.TBCREATOR = 'DB2INST1') STOP (Q4.TBNAME = 'PROJACT') 11 - START (Q1.INDSCHEMA = Q4.CREATOR) START (Q1.INDNAME = Q4.NAME) STOP (Q1.INDSCHEMA = Q4.CREATOR) STOP (Q1.INDNAME = Q4.NAME) Explain plan by Markus Winand - NO WARRANTY http://use-the-index-luke.com/s/last_explained 25 record(s) selected.
Adding actuals
I wondered if I could figure out a way to pull in actuals using an activity event monitor. It requires a bit more work to get the data and an updated version of the view.
First, we have to create an activity event monitor. Remember that with activity event monitors, even creating them does not cause anything to be collected – you also have to enable collection one or more ways.
To create and activate the event monitor:
$ db2 create event monitor act_stmt for activities write to table manualstart DB20000I The SQL command completed successfully. $ db2 "set event monitor act_stmt state = 1" DB20000I The SQL command completed successfully.
Then to cause data to actually be sent to the event monitor, we’ll use the WLM_SET_CONN_ENV procedure. This will allow us to enable data collection only for the very short period of time while we run the statement we want values for. NULL as the first parameter tells DB2 that I only want to collect data on my current connection. See the IBM DB2 Knowledge Center for details on this procedure.
$ db2 "CALL WLM_SET_CONN_ENV(NULL, 'WITH DETAILS, SECTION ALL ')" Return Status = 0
In addition, we either have to set the SECTION_ACTUALS database configuration parameter to BASE or call WLM_SET_CONN_ENV slightly differently. I’ve chosen to set the db cfg parameter because this is something I’ll want to make use of again.
$ db2 update db cfg for SAMPLE using section_actuals BASE immediate DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
Now, we must execute the statement we would like to explain and get actuals for. This process does not work on statements with parameter markers, and the way I’m showing it here, I actually execute the statement. If you have a statement that cannot be executed, you can set your command options to disable autocommit, then execute the statement, then roll it back. We cannot do statements with parameter markers, because there is no such thing as actuals on parameter markers.
$ db2 -tvf listagg_query2.sql select substr(i.indschema,1,12) as indschema, substr(i.indname,1,25) as indname, uniquerule, (select cast(listagg(( case when ic.colorder = 'A' then '+' || ic.colname when ic.colorder = 'D' then '-' || ic.colname when ic.colorder = 'I' then '|' || ic.colname end ), '') within group (order by ic.colseq) as varchar(100)) from syscat.indexcoluse ic where ic.indschema = i.indschema and ic.indname = i.indname ) as colnames from syscat.indexes as i where tabschema = 'DB2INST1' and tabname = 'PROJACT' order by i.tabschema, i.tabname, i.indname INDSCHEMA INDNAME UNIQUERULE COLNAMES ------------ ------------------------- ---------- ---------------------------------------------------------------------------------------------------- DB2INST1 IX_PROJACT_INCLUDE U +PROJNO+ACTNO+ACSTDATE|ACSTAFF DB2INST1 PK_PROJACT P +PROJNO+ACTNO+ACSTDATE 2 record(s) selected.
After executing the statement, I want to disable collection of the activity data using this:
$ db2 "CALL WLM_SET_CONN_ENV(NULL, 'NONE ')"
Then I need to flush and deactivate my event monitor.
$ db2 flush event monitor act_stmt DB20000I The SQL command completed successfully. $ db2 "set event monitor act_stmt state = 0" DB20000I The SQL command completed successfully.
Now you’ll notice that at no point during this process have I actually collected explain information on my statement. I can do that now, using the EXPLAIN_FROM_ACTIVITY procedure – which is also what populates the EXPLAIN_ACTUALS table that I can use to get the actual row counts from. Some of the details must be fed into this procedure, and you can use this SQL to get those details from the activity data collected. First, query ACTIVITYSTMT_ACT_STMT to find the EXECUTABLE_ID for your statement. If you ran anything else from the same connection between your activate and deactivate, there may be more than one statement int ther.
db2 "select executable_id, stmt_text from ACTIVITYSTMT_ACT_STMT"
After you have identified the EXECUTABLE_ID, then get the other information using this SQL:
db2 "select application_id, uow_id, activity_id from ACTIVITYSTMT_ACT_STMT where executable_id=x'01000000000000005E0000000000000000000000020020141112095336537558'" 1 UOW_ID ACTIVITY_ID -------------------------------------------------------------------------------------------------------------------------------- ----------- -------------------- *LOCAL.db2inst1.141112163941 32 1 1 record(s) selected.
Then, fill in the information from the above into the call to EXPLAIN_FROM_ACTIVITY. You’ll note that I specify the name of the activity event monitor here. Scroll right to see the full call statement.
$ db2 "CALL EXPLAIN_FROM_ACTIVITY('*LOCAL.db2inst1.141112163941', 32,1, 'ACT_STMT', Null, ?, ?, ?, ?, ? )" Value of output parameters -------------------------- Parameter Name : EXPLAIN_SCHEMA Parameter Value : DB2INST1 Parameter Name : EXPLAIN_REQUESTER Parameter Value : DB2INST1 Parameter Name : EXPLAIN_TIME Parameter Value : 2014-11-12-10.11.33.687937 Parameter Name : SOURCE_NAME Parameter Value : SQLC2K26 Parameter Name : SOURCE_SCHEMA Parameter Value : NULLID Parameter Name : SOURCE_VERSION Parameter Value : Return Status = 0 $
This will populate all of the explain tables, including EXPLAIN_ACTUALS with the data on the access plan and on the actual row counts. Once these tables are populated you can run a select on the modified last_explained view against them. The output looks something like this:
$ db2 "select * from last_explained" Explain Plan ---------------------------------------------------------------------------------------------------- ID | Operation | Rows | ActualRows | Cost 1 | RETURN | | 2 of 3 ( 66.67%) | 27 2 | NLJOIN | 3 of 1 | 2 of 1 (200.00%) | 27 3 | TBSCAN | 3 of 3 (100.00%) | 2 of 3 ( 66.67%) | 13 4 | SORT | 3 of 3 (100.00%) | 2 of 3 ( 66.67%) | 13 5 | FETCH SYSINDEXES | 3 of 3 (100.00%) | 2 of 3 ( 66.67%) | 13 6 | IXSCAN INDINDEXES02 | 3 of 446 ( .67%) | 2 of 446 ( .45%) | 6 7 | GRPBY (COMPLETE) | 1 of 2 ( 50.00%) | 1 of 2 ( 50.00%) | 13 8 | TBSCAN | 2 of 2 (100.00%) | 4 of 2 (200.00%) | 13 9 | SORT | 2 of 2 (100.00%) | 4 of 2 (200.00%) | 13 10 | FETCH SYSINDEXCOLUSE | 2 of 2 (100.00%) | 4 of 2 (200.00%) | 13 11 | IXSCAN INDINDEXCOLUSE01 | 2 of 1060 ( .19%) | 4 of 1060 ( .38%) | 6 Predicate Information 6 - START (Q4.TBCREATOR = 'DB2INST1') START (Q4.TBNAME = 'PROJACT') STOP (Q4.TBCREATOR = 'DB2INST1') STOP (Q4.TBNAME = 'PROJACT') 11 - START (Q1.INDSCHEMA = Q4.CREATOR) START (Q1.INDNAME = Q4.NAME) STOP (Q1.INDSCHEMA = Q4.CREATOR) STOP (Q1.INDNAME = Q4.NAME) Explain plan by Markus Winand - NO WARRANTY Modifications by Ember Crooks - NO WARRANTY http://use-the-index-luke.com/s/last_explained 26 record(s) selected.
One of the nice things is that with the view defined this way, you simply get an empty column when running it for a query that did not have actuals captured:
$ db2 "select * from last_explained" Explain Plan ---------------------------------------------------------------------------------------------------- ID | Operation | Rows | ActualRows | Cost 1 | RETURN | | | 27 2 | NLJOIN | 3 of 1 | | 27 3 | TBSCAN | 3 of 3 (100.00%) | | 13 4 | SORT | 3 of 3 (100.00%) | | 13 5 | FETCH SYSINDEXES | 3 of 3 (100.00%) | | 13 6 | IXSCAN INDINDEXES02 | 3 of 446 ( .67%) | | 6 7 | GRPBY (COMPLETE) | 1 of 2 ( 50.00%) | | 13 8 | TBSCAN | 2 of 2 (100.00%) | | 13 9 | SORT | 2 of 2 (100.00%) | | 13 10 | FETCH SYSINDEXCOLUSE | 2 of 2 (100.00%) | | 13 11 | IXSCAN INDINDEXCOLUSE01 | 2 of 1060 ( .19%) | | 6 Predicate Information 6 - START (Q4.TBCREATOR = 'DB2INST1') START (Q4.TBNAME = 'PROJACT') STOP (Q4.TBCREATOR = 'DB2INST1') STOP (Q4.TBNAME = 'PROJACT') 11 - START (Q1.INDSCHEMA = Q4.CREATOR) START (Q1.INDNAME = Q4.NAME) STOP (Q1.INDSCHEMA = Q4.CREATOR) STOP (Q1.INDNAME = Q4.NAME) Explain plan by Markus Winand - NO WARRANTY Modifications by Ember Crooks - NO WARRANTY http://use-the-index-luke.com/s/last_explained 26 record(s) selected.
The modified syntax for this is available here: https://github.com/fatalmind/DB2-last-explained/
This was quite an exercise to get working. The first issue was if it was even possible – @ARSDB2 asked John Hornibrook at IDUG EMEA for me and got the information that it was possible and a few keywords that pushed me in the right direction – amazing how much you can learn from a 140-character tweet.
Someone out there is going “Hey, Ember, why didn’t you use db2caem?”. I tried. Geesh, how I tried. First, it wouldn’t handle my sample query for some annoying reason. Then when I went to a simpler query to explain, it doesn’t seem to populate or save the data in EXPLAIN_ACTUALS, or any exported file, or anywhere I can find outside of the explain plan graph it modifies – which is really quite annoying. A lot of the instructions above are things that db2caem does.
But honestly, most of my time was spent figuring out where the heck stuff is stored, how to get it stored there, and how to join it back to the rest of the data. I don’t find it difficult now that I know how to do it – the process above should work farily well.
Anyhow, all of my challenges were in adding in the actuals. The original view to get the explain plan output is interesting, and I’ll have to play with it more. The one situation I’ve since used it in, I needed to see how DB2 had re-written the query, so wasn’t able to use this approach. I will certainly have to read the Use the Index, Luke blog more often.
So , Its finally out , I have gone through the INDEX LUKE post when i found his tweet , but the way you handled section actuals stealed the show .
Great Post Ember
When a table contains more than 1000 rows and you have collected the actual rows, you get an overflow exception.
SOURCE_ID 2 3 ACTUALROWS
———– ——- ———————— —————-
SQL0413N Overflow occurred during numeric data type conversion.
SQLSTATE=22003
To resolve this I’ve changed the syntax from
“LPAD(CAST (ROUND(ROUND(act.actual_value)/total_rows*100,2) AS NUMERIC(5,2)), 6, ‘ ‘)”
to
LPAD(CAST (ROUND(ROUND(act.actual_value)/total_rows*100,2) AS NUMERIC(15,2)), 6, ‘ ‘)