Multiple Index Regression Analysis

I actually had a blog entry started on this topic before IDUG. I knew this was possible, but not exactly how to do it. Then I sat in Scott Hayes’ session – D04 – More Sage Advice: Invaluable DB2 LUW Performance Insights from Around the Globe, and he had all the details I needed to really get rolling on trying this out – saving me valuable time in research.

Indexing is relatively easy with a simple data model, or even a complicated one that doesn’t join a lot of tables, once you get used to it. But get to a point where you’ve got views and tables and dozens of objects involved, where the time is not primarily spent on reading data from the tables and indexes, but in the joining and filtering, and it starts to get difficult.

I’m calling this multiple index regression analysis because we’re looking for which recommended indexes most impact our query performance. The DB2 design advisor can recommend many indexes to improve performance, and some of them are more impactful than others. In some analysis I was doing last week, one of the indexes turned out to only reduce query execution time by 0.0001 timerons – not worth the overhead and space for that particular index. This does not use actual statistical formulas like those used in Multiple Linear Regression, but I could see a whole-workload approach where that would be interesting.

This is a highly advanced DB2 topic. It is assumed that the reader understands terms like Timeron and gets the concepts and can read an explain and db2advis output.

Scenario

In this example, I’m working with this query:

select * from order_trace  where  c_state = ? and dl_ap_date2 >= ? and  reference not in (select reference from edi_204  where edi_204.reference = reference and edi_204.order_id = order_id  and CUST_s_location_id = ? ) and (status in (? ,? ) OR ORDER_TYPE = ? )  order by dl_ap_date2, dl_ap_time2, act_state, act_city, actual, order_id;

Doesn’t look so bad, right? except order_trace is a view that is quite complicated. So my explain plan looks like this:

Access Plan:
-----------
    Total Cost:         35208.6
    Query Degree:       1

                                                                                                                                                                                       Rows 
                                                                                                                                                                                      RETURN
                                                                                                                                                                                      (   1)
                                                                                                                                                                                       Cost 
                                                                                                                                                                                        I/O 
                                                                                                                                                                                        |
                                                                                                                                                                                      2.92646 
                                                                                                                                                                                     >^NLJOIN
                                                                                                                                                                                     (   2)
                                                                                                                                                                                      35208.6 
                                                                                                                                                                                      11324.4 
                                                                                                                                                                          /-------------+--------------\
                                                                                                                                                                      2.92646                             1 
                                                                                                                                                                     >^NLJOIN                          IXSCAN
                                                                                                                                                                     (   3)                            (  66)
                                                                                                                                                                      35062.3                          50.0091 
                                                                                                                                                                      11318.5                             2 
                                                                                                                                                      /-----------------+-----------------\              |
                                                                                                                                                  2.92646                                    1         595580 
                                                                                                                                                 >^NLJOIN                                 IXSCAN   INDEX: DB2     
                                                                                                                                                 (   4)                                   (  65)      ORDER_ID
                                                                                                                                                  34989.1                                 25.0083        Q1
                                                                                                                                                  11315.6                                    1 
                                                                                                                              /---------------------+----------------------\                |
                                                                                                                          2.92646                                             1            13398 
                                                                                                                         >^NLJOIN                                          IXSCAN     INDEX: SYSIBM  
                                                                                                                         (   5)                                            (  64)   SQL991011111520810
                                                                                                                          34915.9                                          25.0074          Q2
                                                                                                                          11312.7                                             1 
                                                                                             /------------------------------+-------------------------------\                |
                                                                                         2.92646                                                               1            2976 
                                                                                         NLJOIN                                                             IXSCAN     INDEX: SYSIBM  
                                                                                         (   6)                                                             (  63)   SQL991011113441210
                                                                                         34842.7                                                            25.0112          Q3
                                                                                         11309.8                                                               1 
                                        /--------------------------------------------------+---------------------------------------------------\              |
                                    2.92646                                                                                                       1          13398 
                                    TBSCAN                                                                                                     NLJOIN   INDEX: DB2     
                                    (   7)                                                                                                     (  32)      DRIVER_ID
                                    33816.6                                                                                                    268.586        Q4
                                    11237.2                                                                                                      19 
                                      |                                                                                             /------------+-------------\
                                    2.92646                                                                                        1                              1 
                                    SORT                                                                                        NLJOIN                         TBSCAN
                                    (   8)                                                                                      (  33)                         (  59)
                                    33816.6                                                                                     243.01                         12.7956 
                                    11237.2                                                                                       17                              1 
                                      |                                                                                /----------+----------\                   |
                                    2.92646                                                                           1                         1                 1 
                                    HSJOIN                                                                         NLJOIN                    TBSCAN            TEMP  
                                    (   9)                                                                         (  34)                    (  56)            (  60)
                                    33816.6                                                                        191.871                   38.3582           12.7911 
                                    11237.2                                                                          13                         3                 1 
                  /-------------------+--------------------\                                                /--------+--------\                |                 |
              139659                                       2.92646                                         1                     1              1                 1 
              ^HSJOIN                                      HSJOIN                                       ^NLJOIN               TBSCAN         TEMP              GRPBY 
              (  10)                                       (  13)                                       (  35)                (  38)         (  57)            (  61)
              9380.69                                      24432.7                                      37.7969               140.072        38.3536           12.7892 
              3516.17                                       7721                                           2                    10              3                 1 
         /------+-------\                        /-----------+-----------\                         /------+------\              |              |                 |
     139659              97515               2685.75                     152.176                  1                 1            1              1             0.364844 
     IXSCAN             IXSCAN               NLJOIN                      ^NLJOIN               IXSCAN            IXSCAN       TEMP           IXSCAN            IXSCAN
     (  11)             (  12)               (  14)                      (  19)                (  36)            (  37)       (  39)         (  58)            (  62)
     5805.14            3567.73              6632.09                     17800.5               25.0075           25.5699      140.067        38.3518           12.7892 
     2657.04            859.125              2970.8                      4750.2                   1                 2           10              3                 1 
       |                  |               /----+-----\                 /---+----\                |                 |            |              |                 |
     139659              97515        1875.29        1.43218       152.176         1            2976             139659          1           440024             19946 
 INDEX: DB2         INDEX: DB2        TBSCAN         IXSCAN        TBSCAN       IXSCAN    INDEX: BFAIRCHILD  INDEX: DB2       NLJOIN     INDEX: DB2        INDEX: DB2     
      WIZ29       IDX201171957580000  (  15)         (  18)        (  20)       (  31)   IDX201162148270000      WIZ1171      (  40)   IDX201161942390000    IDX_STOP_1
       Q23                Q25         209.649        38.3519       15258.7      25.5699          Q10               Q11        140.066          Q8                Q5
                                      17.3389           3          3605.98         2                                            10 
                                        |              |             |            |                                /------------+-------------\
                                      1875.29        139659        152.176      139659                            1                              1 
                                      SORT       INDEX: DB2        SORT     INDEX: DB2                         TBSCAN                         TBSCAN
                                      (  16)   IDX201171959320000  (  21)       WIZ1171                        (  41)                         (  50)
                                      209.649          Q24         15258.7        Q22                          76.1435                        51.1416 
                                      17.3389                      3605.98                                     5.00001                           4 
                                        |                            |                                           |                              |
                                      1875.29                      152.176                                   1.3729e-005                         1 
                                      IXSCAN                       NLJOIN                                      SORT                           TEMP  
                                      (  17)                       (  22)                                      (  42)                         (  51)
                                      209.323                      15258.6                                     76.1433                        51.137 
                                      17.3389                      3605.98                                     5.00001                           4 
                                        |                     /------+-------\                                   |                              |
                                       97515                56               2.71743                         1.3729e-005                         1 
                                  INDEX: DB2              TBSCAN             IXSCAN                            NLJOIN                         TBSCAN
                                IDX201171957580000        (  23)             (  26)                            (  43)                         (  52)
                                        Q26               25.0247            332.82                            76.1429                        51.1352 
                                                             1               35.2989                           5.00001                           4 
                                                            |              /---+----\                    /-------+-------\                      |
                                                            56         803.566       54670              1              1.3729e-005               1 
                                                          SORT         TBSCAN   INDEX: DB2           TBSCAN              FETCH                SORT  
                                                          (  24)       (  27)    IDX_ORDER_01        (  44)              (  48)               (  53)
                                                          25.0246      78.3923        Q21            51.1352             25.0077              51.135 
                                                             1         6.08317                          4                1.00001                 4 
                                                            |            |                             |              /----+-----\              |
                                                            56         803.566                          1       1.3729e-005        519        1.00058 
                                                          IXSCAN       TEMP                          SORT         IXSCAN     TABLE: DB2       FETCH 
                                                          (  25)       (  28)                        (  45)       (  49)          CODE        (  54)
                                                          25.0201      78.1369                       51.135       25.0072          Q15        51.1347 
                                                             1         6.08317                          4            1                           4 
                                                            |            |                             |            |                    /------+------\
                                                            519        803.566                       1.00058        519              1.00058         1.9199e+007 
                                                      INDEX: DB2       UNIQUE                        FETCH    INDEX: DB2             IXSCAN        TABLE: EDI     
                                                        IDX_CODE_01    (  29)                        (  46)     IDX_CODE_01          (  55)   EDI_204_ADDITIONAL_FIELDS
                                                            Q27        78.0063                       51.1347        Q15              38.3527             Q18
                                                                       6.08317                          4                               3 
                                                                         |                      /------+------\                        |
                                                                       803.566              1.00058         1.9199e+007            1.9199e+007 
                                                                       IXSCAN               IXSCAN        TABLE: EDI             INDEX: EDI     
                                                                       (  30)               (  47)   EDI_204_ADDITIONAL_FIELDS   EDI_204_A_F_NDX
                                                                       77.9929              38.3527             Q13                    Q18
                                                                       6.08317                 3 
                                                                         |                    |
                                                                    2.23029e+007          1.9199e+007 
                                                                   INDEX: DB2           INDEX: EDI     
                                                                 IDX201171956050000     EDI_204_A_F_NDX
                                                                         Q28                  Q13

A bit more horrendous, that, and looking at it in detail, there’s no obvious gigantic table scan or even fetch from a table after reading an index that provides a significant amount of the time.

Running db2advis on that, we get:

  19  indexes in current solution
 [34706.0000] timerons  (without recommendations)
 [19035.0000] timerons  (with current solution)
 [45.15%] improvement


--
--
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- index[1],    0.548MB
   CREATE UNIQUE INDEX "ECROOKS "."IDX1505210120080"
   ON "DB2     "."ORDER" ("TRAILER_SIZE" ASC, "DL_AP_DATE2"
   ASC, "CUSTOMER_RATE" ASC, "NOTIFIED_DATE" ASC, "DATE_ENTERED"
   ASC, "COMMODITY" ASC, "NEEDS_PAPERWORK" ASC, "LINE_OF_BUSINESS"
   ASC, "TOTAL_CHARGE" ASC, "MILES" ASC, "TERMINAL" ASC,
   "SCAC" ASC, "SEAL" ASC, "PO_NUM" ASC, "PU_NUM" ASC,
   "BL_NUM" ASC, "REFERENCE" ASC, "WHO_ENTERED" ASC,
   "ORDER_TYPE" ASC, "TRACTOR_ID" ASC, "DRIVER_ID" ASC,
   "DL_AT_TIME2" ASC, "DL_AT_DATE2" ASC, "DL_AT_TIME1"
   ASC, "DL_AT_DATE1" ASC, "DL_AP_TIME2" ASC, "DL_AP_TIME1"
   ASC, "DL_AP_DATE1" ASC, "PU_AT_TIME2" ASC, "PU_AT_DATE2"
   ASC, "PU_AT_TIME1" ASC, "PU_AT_DATE1" ASC, "PU_AP_TIME2"
   ASC, "PU_AP_DATE2" ASC, "PU_AP_TIME1" ASC, "PU_AP_DATE1"
   ASC, "S_LOCATION_ID" ASC, "CHASSIS" ASC, "CHECK_DIGIT"
   ASC, "TRAILER" ASC, "STATUS" ASC, "ORDER_ID" ASC)
   INCLUDE ("C_LOCATION_ID", "ACT_LOCATION_ID", "BTO_LOCATION_ID")
   ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[2],   20.087MB
   CREATE INDEX "ECROOKS "."IDX1505210116380" ON "DB2     "."EDI_204"
   ("CUST_S_LOCATION_ID" ASC, "ORDER_ID" ASC, "REFERENCE"
   ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[3],    0.048MB
   CREATE UNIQUE INDEX "ECROOKS "."IDX1505210115300"
   ON "DB2     "."VENDOR_UNIT" ("VENDOR_UNIT_ID" ASC)
   INCLUDE ("VENDOR_ID") ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[4],    5.778MB
   CREATE UNIQUE INDEX "ECROOKS "."IDX1505210115350"
   ON "DB2     "."LOCATION" ("LOCATION_ID" ASC) INCLUDE
   ("NAME") ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[5], 1673.419MB
   CREATE INDEX "ECROOKS "."IDX1505210115360" ON "EDI     "."EDI_204_ADDITIONAL_FIELDS"
   ("REFERENCE" ASC, "FIELD_NAME" ASC, "INSERT_DT" ASC,
   "FIELD_VALUE" ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[6],    0.013MB
   CREATE INDEX "ECROOKS "."IDX1505210116350" ON "DB2     "."CODE"
   ("CODE_TEXT" ASC, "TYPE" ASC, "CODE_ID" ASC, "CODE"
   ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[7],    9.739MB
   CREATE UNIQUE INDEX "ECROOKS "."IDX1505210121040"
   ON "DB2     "."RAIL_ETA" ("ORDER_ID" ASC) INCLUDE
   ("STATE") ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;

Some of those indexes I’m not likely to try. index[1] for example is awfully wide. index[2] is nearly identical to an existing index, with the difference being in the order of the last two columns. And 19 total indexes that DB2 wants to use for this one query? Wow, that’s a lot.

Also, I don’t have a full-size test or QA environment – the data size in my lower environments is significantly smaller, so it is hard for me to scale down to see what happens with different indexes in those environments. One option here is to mimic statistics in my lower environments. I’ve done that before, but I want to try mimicking the new indexes instead.

Setup steps

First, my explain tables are all currently shared tables. With multiple DBAs potentially working in this environment, I have two real choices to isolate my data – #1 is to create my own set of explain tables and make use of them. #2 is to note the time before I collect data and then the time after I collect explain and db2advis data and look only within that period to verify that only the data I want is included. I’m chosing the second option. With only 3 potential people running explains and none of us dedicated to that full time, this seems like a good option – we’re not likely to conflict.

Running the Explain and db2advis

I need to capture the explain and db2advis data before I work on anything. Note that I happen to be working in a Windows environment from a powershell prompt for this work – everything here would be the same on Linux/UNIX.

To capture my initial data, I’ll use:

PS D:\xtivia\queries> db2 "select current timestamp from sysibm.sysdummy1"

1
--------------------------
2015-05-21-10.00.07.793000

  1 record(s) selected.

PS D:\xtivia\queries> db2 set current explain mode explain
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 -tvf query13.sql
select * from db2.order_trace  where  c_state = ? and dl_ap_date2 >= ? and  reference not in (select reference from db2.edi_204 edi_204 where edi_204.reference = reference and edi_204.order_id = order
_id  and CUST_s_location_id = ? ) and (status in (? ,? ) OR ORDER_TYPE = ? )  order by dl_ap_date2, dl_ap_time2, act_state, act_city, actual, order_id
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

PS D:\xtivia\queries> db2exfmt -d comtrak2 -e db2admin -1 -o query13_exfmt.txt
DB2 Universal Database Version 9.7, 5622-044 (c) Copyright IBM Corp. 1991, 2009
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Output is in query13_exfmt.txt.
Executing Connect Reset -- Connect Reset was Successful.
PS D:\xtivia\queries> db2advis -d comtrak2 -i query13.sql -p |tee query13_advis.txt

Using user id as default schema name. Use -n option to specify schema
execution started at timestamp 2015-05-21-10.01.12.315000
found [1] SQL statements from the input file
Recommending indexes...
total disk space needed for initial set [1715.827] MB
total disk space constrained to         [35280.399] MB
Trying variations of the solution set.
Optimization finished.
  19  indexes in current solution
 [38772.0000] timerons  (without recommendations)
 [18742.0000] timerons  (with current solution)
 [51.66%] improvement


--
--
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- index[1],    0.532MB
   CREATE UNIQUE INDEX "DB2ADMIN"."IDX1505231651490"
   ON "DB2     "."ORDER" ("TRAILER_SIZE" ASC, "DL_AP_DATE2"
   ASC, "CUSTOMER_RATE" ASC, "NOTIFIED_DATE" ASC, "DATE_ENTERED"
   ASC, "COMMODITY" ASC, "NEEDS_PAPERWORK" ASC, "LINE_OF_BUSINESS"
   ASC, "TOTAL_CHARGE" ASC, "MILES" ASC, "TERMINAL" ASC,
   "SCAC" ASC, "SEAL" ASC, "PO_NUM" ASC, "PU_NUM" ASC,
   "BL_NUM" ASC, "REFERENCE" ASC, "WHO_ENTERED" ASC,
   "ORDER_TYPE" ASC, "TRACTOR_ID" ASC, "DRIVER_ID" ASC,
   "DL_AT_TIME2" ASC, "DL_AT_DATE2" ASC, "DL_AT_TIME1"
   ASC, "DL_AT_DATE1" ASC, "DL_AP_TIME2" ASC, "DL_AP_TIME1"
   ASC, "DL_AP_DATE1" ASC, "PU_AT_TIME2" ASC, "PU_AT_DATE2"
   ASC, "PU_AT_TIME1" ASC, "PU_AT_DATE1" ASC, "PU_AP_TIME2"
   ASC, "PU_AP_DATE2" ASC, "PU_AP_TIME1" ASC, "PU_AP_DATE1"
   ASC, "S_LOCATION_ID" ASC, "CHASSIS" ASC, "CHECK_DIGIT"
   ASC, "TRAILER" ASC, "STATUS" ASC, "ORDER_ID" ASC)
   INCLUDE ("C_LOCATION_ID", "ACT_LOCATION_ID", "BTO_LOCATION_ID")
   ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[2],   21.126MB
   CREATE INDEX "DB2ADMIN"."IDX1505231648190" ON "DB2     "."EDI_204"
   ("CUST_S_LOCATION_ID" ASC, "ORDER_ID" ASC, "REFERENCE"
   ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[3],    0.048MB
   CREATE UNIQUE INDEX "DB2ADMIN"."IDX1505231647110"
   ON "DB2     "."VENDOR_UNIT" ("VENDOR_UNIT_ID" ASC)
   INCLUDE ("VENDOR_ID") ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[4],    5.782MB
   CREATE UNIQUE INDEX "DB2ADMIN"."IDX1505231647160"
   ON "DB2     "."LOCATION" ("LOCATION_ID" ASC) INCLUDE
   ("NAME") ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[5], 1678.521MB
   CREATE INDEX "DB2ADMIN"."IDX1505231647170" ON "EDI     "."EDI_204_ADDITIONAL_FIELDS"
   ("REFERENCE" ASC, "FIELD_NAME" ASC, "INSERT_DT" ASC,
   "FIELD_VALUE" ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[6],    0.013MB
   CREATE INDEX "DB2ADMIN"."IDX1505231648160" ON "DB2     "."CODE"
   ("CODE_TEXT" ASC, "TYPE" ASC, "CODE_ID" ASC, "CODE"
   ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[7],    9.806MB
   CREATE UNIQUE INDEX "DB2ADMIN"."IDX1505231652450"
   ON "DB2     "."RAIL_ETA" ("ORDER_ID" ASC) INCLUDE
   ("STATE") ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;


snip

1449 solutions were evaluated by the advisor
DB2 Workload Performance Advisor tool is finished.
PS D:\xtivia\queries> db2 set current explain mode no
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 "select current timestamp from sysibm.sysdummy1"

1
--------------------------
2015-05-21-10.03.06.369000

  1 record(s) selected.

What this gets me is all of the details about this query without any changes, and the recommended indexes – in this case 7 new indexes. I don’t really like all of those indexes, but I need to figure out which ones will give me the most impact and analyze other characteristics about them. Note the use of the -p option on the db2advis. This causes DB2 to retain information about the explain plan and recommended indexes.

We also need to note from this process the starting time and the ending time for the analysis – in this case:
Start: 2015-05-21-10.00.07
End: 2015-05-21-10.03.06

Note Costs

The next step is to gather the costs of this query before and after the indexes recommended by the db2 design advisor. Note that we have to fill in the timestamps from above to limit the data returned.

select 
    dec(total_cost,20,4) as before_total_cost
    , dec(io_cost,20,4) as before_io_cost
    , dec(CPU_cost,20,4) as before_cpu_cost
    , dec(Comm_cost,20,4) as before_comm_cost
from Explain_Operator
    ,(select min(explain_time) as mintime
        from Explain_Operator 
        where operator_type = 'RETURN' 
          and explain_time between timestamp('2015-05-21-10.00.07') and timestamp('2015-05-21-10.03.06')) as b
where 
    explain_time = b.mintime
    and operator_type = 'RETURN' 
with UR ;
BEFORE_TOTAL_COST      BEFORE_IO_COST         BEFORE_CPU_COST        BEFORE_COMM_COST
---------------------- ---------------------- ---------------------- ----------------------
            38772.4492             11552.6152         705355200.0000                 0.0000

  1 record(s) selected.

And:

select 
    dec(total_cost,20,4) as after_total_cost
    , dec(io_cost,20,4) as after_io_cost
    , dec(CPU_cost,20,4) as after_cpu_cost
    , dec(Comm_cost,20,4) as after_comm_cost
from Explain_Operator
    ,(select max(explain_time) as maxtime
        from Explain_Operator 
        where operator_type = 'RETURN' 
          and explain_time between timestamp('2015-05-21-10.00.07') and timestamp('2015-05-21-10.03.06')) as b
where 
    explain_time = b.maxtime
    and operator_type = 'RETURN' 
with UR ;
AFTER_TOTAL_COST       AFTER_IO_COST          AFTER_CPU_COST         AFTER_COMM_COST
---------------------- ---------------------- ---------------------- ----------------------
            18741.7597              5339.1113         131031752.0000                 0.0000

  1 record(s) selected.

What we have in the explain tables at this point is really two sets of data – the data from the original explain, and the data from an explain as if all of the recommended indexes existed.

Now here’s a query I prefer to work with of the existing and recommended indexes for this query. It uses both data on existing indexes from syscat.indexes and syscat.tables, along with data on the proposed new indexes from advise_index. It’s also one I run over and over again to help me keep track of what indexes I’m trying things with:

with ts as ( select max(explain_time) as maxtime
        from Explain_Operator 
        where operator_type = 'RETURN' 
          and explain_time between timestamp('2015-05-21-10.00.07') and timestamp('2015-05-21-10.03.06'))
select substr(name,1,25) as indname
    , substr(tbcreator,1,13) as tabschema
    , substr(tbname,1,18) as tabname
    , coalesce(si.fullkeycard, ai.fullkeycard) as fullkeycard
    , st.card
    , coalesce(si.uniquerule, ai.uniquerule) as uniquerule
    , use_index
    , exists
    , substr(coalesce(si.colnames, ai.colnames),1,60) as colnames 
from advise_index ai 
    left outer join syscat.indexes si on ai.tbcreator=si.tabschema and ai.tbname=si.tabname and ai.name=si.indname 
    left outer join syscat.tables st on st.tabschema=ai.tbcreator and st.tabname=ai.tbname 
    join ts on explain_time = ts.maxtime
order by exists, use_index, uniquerule desc, indname with ur;

I’m actually masaging that a bit with clpplus to get the output I want because substr isn’t doing so well at the powershell prompt for me at the moment:

                                                                              Index                Table                                                                                         
Index Name           Table Schema    Table Name                         Cardinality          Cardinality UNIQUERULE USE_INDEX EXISTS Column Names                                                
-------------------- --------------- ------------------------- -------------------- -------------------- ---------- --------- ------ ------------------------------------------------------------
IDX1505231647110     DB2             VENDOR_UNIT                               2976                 2976 U          Y         N      +VENDOR_UNIT_ID-VENDOR_ID                                   
IDX1505231647160     DB2             LOCATION                                140445               140445 U          Y         N      +LOCATION_ID-NAME                                           
IDX1505231651490     DB2             ORDER                                      642                52497 U          Y         N      +TRAILER_SIZE+DL_AP_DATE2+CUSTOMER_RATE+NOTIFIED_DATE+DATE_E
                                                                                                                                     NTERED+COMMODITY+NEEDS_PAPERWORK+LINE_OF_BUSINESS+TOTAL_CHAR
                                                                                                                                     GE+MILES+TERMINAL+SCAC+SEAL+PO_NUM+PU_NUM+BL_NUM+REFERE...  

IDX1505231652450     DB2             RAIL_ETA                                649793               649793 U          Y         N      +ORDER_ID-STATE                                             
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          Y         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          Y         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231648160     DB2             CODE                                         5                  520 D          Y         N      +CODE_TEXT+TYPE+CODE_ID+CODE                                
IDX1505231648190     DB2             EDI_204                                 295885              1014841 D          Y         N      +CUST_S_LOCATION_ID+ORDER_ID+REFERENCE                      
DRIVER_ID            DB2             DRIVER                                   13519                13519 U          Y         Y      +DRIVER_ID+FIRST_NAME+LAST_NAME+PREVIOUS_NUM+DISPATCHER     
IDX201162151160000   DB2             CITY                                     97516                97516 U          Y         Y      +CITY_ID+STATE+CITY+TIME_ZONE                               
WIZ1166              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+ZIP+LONGITUDE+LATITUDE+NAME+CITY_ID            
WIZ1171              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+NAME+NUMBER                                    
SQL991011111520810   DB2             DRIVER                                   13519                13519 P          Y         Y      +DRIVER_ID                                                  
SQL991011113441210   DB2             VENDOR_UNIT                               2976                 2976 P          Y         Y      +VENDOR_UNIT_ID                                             
IDX_CODE_01          DB2             CODE                                       518                  520 D          Y         Y      +TYPE+CODE_TEXT+CODE                                        
IDX_STOP_1           DB2             STOP                                     15883                21568 D          Y         Y      +ORDER_ID                                                   
IDX201161942390000   EDI             EDI_204_HUB_CUSTOMER                    451173               451173 D          Y         Y      +REFERENCE+DESCRIPTION                                      
IDX201171957580000   DB2             CITY                                     97516                97516 D          Y         Y      +STATE+CITY+CITY_ID                                         
IDX201171959320000   DB2             LOCATION                                140445               140445 D          Y         Y      +CITY_ID+DROP+ZIP+LONGITUDE+LATITUDE+NAME+LOCATION_ID       

I pull that output into Excel so I can note details as I work through each index and easily play with the numbers. It looks something like this:
Screenshot_052115_094911_AM

I hide and add some columns, but I’m running this same query between every change I make so I keep track of what I’ve done and not done.

Analysis Techniques

At this point, there are two different methods that Scott laid out in his presentation on this topic – Index Addition and Index Subtraction.

Index Addition

The point of index addition is to run an explain on the query with each proposed index individually, to see the impact of that index alone. This works particuarly well when a great impact is seen from a few indexes and the combination of indexes does not make much difference. To use this method, we update ADVISE_INDEX to have USE_INDEX set to N for all indexes that do not already exist:

PS > db2 "update advise_index set use_index = 'N' where exists= 'N'"
DB20000I  The SQL command completed successfully.
                                                                              Index                Table                                                                                         
Index Name           Table Schema    Table Name                         Cardinality          Cardinality UNIQUERULE USE_INDEX EXISTS Column Names                                                
-------------------- --------------- ------------------------- -------------------- -------------------- ---------- --------- ------ ------------------------------------------------------------
IDX1505231647110     DB2             VENDOR_UNIT                               2976                 2976 U          N         N      +VENDOR_UNIT_ID-VENDOR_ID                                   
IDX1505231647160     DB2             LOCATION                                140445               140445 U          N         N      +LOCATION_ID-NAME                                           
IDX1505231651490     DB2             ORDER                                      642                52497 U          N         N      +TRAILER_SIZE+DL_AP_DATE2+CUSTOMER_RATE+NOTIFIED_DATE+DATE_E
                                                                                                                                     NTERED+COMMODITY+NEEDS_PAPERWORK+LINE_OF_BUSINESS+TOTAL_CHAR
                                                                                                                                     GE+MILES+TERMINAL+SCAC+SEAL+PO_NUM+PU_NUM+BL_NUM+REFERE...  

IDX1505231652450     DB2             RAIL_ETA                                649793               649793 U          N         N      +ORDER_ID-STATE                                             
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          N         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          N         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231648160     DB2             CODE                                         5                  520 D          N         N      +CODE_TEXT+TYPE+CODE_ID+CODE                                
IDX1505231648190     DB2             EDI_204                                 295885              1014841 D          N         N      +CUST_S_LOCATION_ID+ORDER_ID+REFERENCE                      
DRIVER_ID            DB2             DRIVER                                   13519                13519 U          Y         Y      +DRIVER_ID+FIRST_NAME+LAST_NAME+PREVIOUS_NUM+DISPATCHER     
IDX201162151160000   DB2             CITY                                     97516                97516 U          Y         Y      +CITY_ID+STATE+CITY+TIME_ZONE                               
WIZ1166              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+ZIP+LONGITUDE+LATITUDE+NAME+CITY_ID            
WIZ1171              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+NAME+NUMBER                                    
SQL991011111520810   DB2             DRIVER                                   13519                13519 P          Y         Y      +DRIVER_ID                                                  
SQL991011113441210   DB2             VENDOR_UNIT                               2976                 2976 P          Y         Y      +VENDOR_UNIT_ID                                             
IDX_CODE_01          DB2             CODE                                       518                  520 D          Y         Y      +TYPE+CODE_TEXT+CODE                                        
IDX_STOP_1           DB2             STOP                                     15883                21568 D          Y         Y      +ORDER_ID                                                   
IDX201161942390000   EDI             EDI_204_HUB_CUSTOMER                    451173               451173 D          Y         Y      +REFERENCE+DESCRIPTION                                      
IDX201171957580000   DB2             CITY                                     97516                97516 D          Y         Y      +STATE+CITY+CITY_ID                                         
IDX201171959320000   DB2             LOCATION                                140445               140445 D          Y         Y      +CITY_ID+DROP+ZIP+LONGITUDE+LATITUDE+NAME+LOCATION_ID       

Now one by one we set use_index equal to ‘Y’ for each index and run an explain using the ‘evaluate indexes’ option.

PS D:\xtivia\queries> db2 "update advise_index set use_index = 'Y' where name = 'IDX1505231647110'"
DB20000I  The SQL command completed successfully.

                                                                              Index                Table                                                                                         
Index Name           Table Schema    Table Name                         Cardinality          Cardinality UNIQUERULE USE_INDEX EXISTS Column Names                                                
-------------------- --------------- ------------------------- -------------------- -------------------- ---------- --------- ------ ------------------------------------------------------------
IDX1505231647160     DB2             LOCATION                                140445               140445 U          N         N      +LOCATION_ID-NAME                                           
IDX1505231651490     DB2             ORDER                                      642                52497 U          N         N      +TRAILER_SIZE+DL_AP_DATE2+CUSTOMER_RATE+NOTIFIED_DATE+DATE_E
                                                                                                                                     NTERED+COMMODITY+NEEDS_PAPERWORK+LINE_OF_BUSINESS+TOTAL_CHAR
                                                                                                                                     GE+MILES+TERMINAL+SCAC+SEAL+PO_NUM+PU_NUM+BL_NUM+REFERE...  

IDX1505231652450     DB2             RAIL_ETA                                649793               649793 U          N         N      +ORDER_ID-STATE                                             
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          N         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          N         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231648160     DB2             CODE                                         5                  520 D          N         N      +CODE_TEXT+TYPE+CODE_ID+CODE                                
IDX1505231648190     DB2             EDI_204                                 295885              1014841 D          N         N      +CUST_S_LOCATION_ID+ORDER_ID+REFERENCE                      
IDX1505231647110     DB2             VENDOR_UNIT                               2976                 2976 U          Y         N      +VENDOR_UNIT_ID-VENDOR_ID                                   
DRIVER_ID            DB2             DRIVER                                   13519                13519 U          Y         Y      +DRIVER_ID+FIRST_NAME+LAST_NAME+PREVIOUS_NUM+DISPATCHER     
IDX201162151160000   DB2             CITY                                     97516                97516 U          Y         Y      +CITY_ID+STATE+CITY+TIME_ZONE                               
WIZ1166              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+ZIP+LONGITUDE+LATITUDE+NAME+CITY_ID            
WIZ1171              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+NAME+NUMBER                                    
SQL991011111520810   DB2             DRIVER                                   13519                13519 P          Y         Y      +DRIVER_ID                                                  
SQL991011113441210   DB2             VENDOR_UNIT                               2976                 2976 P          Y         Y      +VENDOR_UNIT_ID                                             
IDX_CODE_01          DB2             CODE                                       518                  520 D          Y         Y      +TYPE+CODE_TEXT+CODE                                        
IDX_STOP_1           DB2             STOP                                     15883                21568 D          Y         Y      +ORDER_ID                                                   
IDX201161942390000   EDI             EDI_204_HUB_CUSTOMER                    451173               451173 D          Y         Y      +REFERENCE+DESCRIPTION                                      
IDX201171957580000   DB2             CITY                                     97516                97516 D          Y         Y      +STATE+CITY+CITY_ID                                         
IDX201171959320000   DB2             LOCATION                                140445               140445 D          Y         Y      +CITY_ID+DROP+ZIP+LONGITUDE+LATITUDE+NAME+LOCATION_ID

With that set, we set the current explain mode, and explain the query:

PS D:\xtivia\queries> db2 set current explain mode evaluate indexes
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 -tvf query13.sql
select * from db2.order_trace  where  c_state = ? and dl_ap_date2 >= ? and  reference not in (select reference from db2.edi_204 edi_204 where edi_204.reference = reference and edi_204.order_id = order
_id  and CUST_s_location_id = ? ) and (status in (? ,? ) OR ORDER_TYPE = ? )  order by dl_ap_date2, dl_ap_time2, act_state, act_city, actual, order_id
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

PS D:\xtivia\queries> db2 set current explain mode no
DB20000I  The SQL command completed successfully.

Note that the explain mode used is not yes or explain, but evaluate indexes.

Next, we must query the cost to record in our spreadsheet:

select 
    dec(total_cost,20,4) as this_index_total_cost
    , dec(io_cost,20,4) as this_index_io_cost
    , dec(CPU_cost,20,4) as this_index_cpu_cost
    , dec(Comm_cost,20,4) as this_index_comm_cost
from Explain_Operator
    ,(select max(explain_time) as maxtime
        from Explain_Operator 
        where operator_type = 'RETURN') as b
where 
    explain_time = b.maxtime
    and operator_type = 'RETURN' 
with UR ;

select 
THIS_INDEX_TOTAL_COST  THIS_INDEX_IO_COST     THIS_INDEX_CPU_COST    THIS_INDEX_COMM_COST
---------------------- ---------------------- ---------------------- ----------------------
            38772.4492             11552.6152         705353856.0000                 0.0000

  1 record(s) selected.

If you’re keeping track, that’s ZERO impact on timerons for that particular index. Now we have to move through each index in turn. I’m only showing the second one so you understand that each time we mark the other indexes as not being used, and mark only one index as being used:

PS D:\xtivia\queries> db2 "update advise_index set use_index = 'N' where exists= 'N'"
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 "update advise_index set use_index = 'Y' where name = 'IDX1505231647160'"
DB20000I  The SQL command completed successfully.

                                                                              Index                Table                                                                                         
Index Name           Table Schema    Table Name                         Cardinality          Cardinality UNIQUERULE USE_INDEX EXISTS Column Names                                                
-------------------- --------------- ------------------------- -------------------- -------------------- ---------- --------- ------ ------------------------------------------------------------
IDX1505231647110     DB2             VENDOR_UNIT                               2976                 2976 U          N         N      +VENDOR_UNIT_ID-VENDOR_ID                                   
IDX1505231651490     DB2             ORDER                                      642                52497 U          N         N      +TRAILER_SIZE+DL_AP_DATE2+CUSTOMER_RATE+NOTIFIED_DATE+DATE_E
                                                                                                                                     NTERED+COMMODITY+NEEDS_PAPERWORK+LINE_OF_BUSINESS+TOTAL_CHAR
                                                                                                                                     GE+MILES+TERMINAL+SCAC+SEAL+PO_NUM+PU_NUM+BL_NUM+REFERE...  

IDX1505231652450     DB2             RAIL_ETA                                649793               649793 U          N         N      +ORDER_ID-STATE                                             
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          N         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          N         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231648160     DB2             CODE                                         5                  520 D          N         N      +CODE_TEXT+TYPE+CODE_ID+CODE                                
IDX1505231648190     DB2             EDI_204                                 295885              1014841 D          N         N      +CUST_S_LOCATION_ID+ORDER_ID+REFERENCE                      
IDX1505231647160     DB2             LOCATION                                140445               140445 U          Y         N      +LOCATION_ID-NAME                                           
DRIVER_ID            DB2             DRIVER                                   13519                13519 U          Y         Y      +DRIVER_ID+FIRST_NAME+LAST_NAME+PREVIOUS_NUM+DISPATCHER     
IDX201162151160000   DB2             CITY                                     97516                97516 U          Y         Y      +CITY_ID+STATE+CITY+TIME_ZONE                               
WIZ1166              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+ZIP+LONGITUDE+LATITUDE+NAME+CITY_ID            
WIZ1171              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+NAME+NUMBER                                    
SQL991011111520810   DB2             DRIVER                                   13519                13519 P          Y         Y      +DRIVER_ID                                                  
SQL991011113441210   DB2             VENDOR_UNIT                               2976                 2976 P          Y         Y      +VENDOR_UNIT_ID                                             
IDX_CODE_01          DB2             CODE                                       518                  520 D          Y         Y      +TYPE+CODE_TEXT+CODE                                        
IDX_STOP_1           DB2             STOP                                     15883                21568 D          Y         Y      +ORDER_ID                                                   
IDX201161942390000   EDI             EDI_204_HUB_CUSTOMER                    451173               451173 D          Y         Y      +REFERENCE+DESCRIPTION                                      
IDX201171957580000   DB2             CITY                                     97516                97516 D          Y         Y      +STATE+CITY+CITY_ID                                         
IDX201171959320000   DB2             LOCATION                                140445               140445 D          Y         Y      +CITY_ID+DROP+ZIP+LONGITUDE+LATITUDE+NAME+LOCATION_ID       
PS D:\xtivia\queries> db2 set current explain mode evaluate indexes
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 -tvf query13.sql
select * from db2.order_trace  where  c_state = ? and dl_ap_date2 >= ? and  reference not in (select reference from db2.edi_204 edi_204 where edi_204.reference = reference and edi_204.order_id = order
_id  and CUST_s_location_id = ? ) and (status in (? ,? ) OR ORDER_TYPE = ? )  order by dl_ap_date2, dl_ap_time2, act_state, act_city, actual, order_id
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

PS D:\xtivia\queries> db2exfmt -d comtrak2 -1 -o query13_ind2_exfmt.txt
DB2 Universal Database Version 9.7, 5622-044 (c) Copyright IBM Corp. 1991, 2009
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Base table information incomplete
Output is in query13_ind2_exfmt.txt.
Executing Connect Reset -- Connect Reset was Successful.
PS D:\xtivia\queries> db2 set current explain mode no
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 -tvf this_index_cost_query.sql
select dec(total_cost,20,4) as this_index_total_cost , dec(io_cost,20,4) as this_index_io_cost , dec(CPU_cost,20,4) as this_index_cpu_cost , dec(Comm_cost,20,4) as this_index_comm_cost from Explain_Op
erator ,(select max(explain_time) as maxtime from Explain_Operator where operator_type = 'RETURN') as b where explain_time = b.maxtime and operator_type = 'RETURN' with UR

THIS_INDEX_TOTAL_COST  THIS_INDEX_IO_COST     THIS_INDEX_CPU_COST    THIS_INDEX_COMM_COST
---------------------- ---------------------- ---------------------- ----------------------
            38772.4492             11552.6152         705354496.0000                 0.0000

  1 record(s) selected.

Note that I do tend to run a db2exfmt each time. I’m not technically required to, but I find them useful to have to go back and look at. Again, this index appears to have ZERO impact by itself.

When I’ve gone through each one, this is what my spreadsheet looks like:
Screenshot_052115_105225_AM

Note that by far, the biggest single impact is that wide index that I really don’t like the idea of. Only one other of the 7 recommended new indexes shows any impact by itself. This leads me to wonder why DB2 is even recommending some of the others. I’m hoping that it is because a combination of them makes a real difference. Let’s try Index Subtraction to see if that is true.

Index Subtraction

For Index Subtraction, we tell DB2 to give us the numbers as if all of the proposed indexes exist, and we take away only one at a time to see the impact of not having that particular index. The steps are very similar, just with slightly different updates to the USE_INDEX column.

To start with, we want all of the recommended indexes for this query to be marked with USE_INDEX of Y.

PS D:\xtivia\queries> db2 "update advise_index set use_index='Y' where name in ('IDX1505231647110','IDX1505231647160','IDX1505231652450','IDX1505231647170','IDX1505231648160','IDX1505231648190','IDX15
05231651490')"
DB20000I  The SQL command completed successfully.
                                                                              Index                Table                                                                                         
Index Name           Table Schema    Table Name                         Cardinality          Cardinality UNIQUERULE USE_INDEX EXISTS Column Names                                                
-------------------- --------------- ------------------------- -------------------- -------------------- ---------- --------- ------ ------------------------------------------------------------
IDX1505231647110     DB2             VENDOR_UNIT                               2976                 2976 U          Y         N      +VENDOR_UNIT_ID-VENDOR_ID                                   
IDX1505231647160     DB2             LOCATION                                140445               140445 U          Y         N      +LOCATION_ID-NAME                                           
IDX1505231651490     DB2             ORDER                                      642                52497 U          Y         N      +TRAILER_SIZE+DL_AP_DATE2+CUSTOMER_RATE+NOTIFIED_DATE+DATE_E
                                                                                                                                     NTERED+COMMODITY+NEEDS_PAPERWORK+LINE_OF_BUSINESS+TOTAL_CHAR
                                                                                                                                     GE+MILES+TERMINAL+SCAC+SEAL+PO_NUM+PU_NUM+BL_NUM+REFERE...  

IDX1505231652450     DB2             RAIL_ETA                                649793               649793 U          Y         N      +ORDER_ID-STATE                                             
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          Y         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231647170     EDI             EDI_204_ADDITIONAL_FIELDS             19897638             19897638 D          Y         N      +REFERENCE+FIELD_NAME+INSERT_DT+FIELD_VALUE                 
IDX1505231648160     DB2             CODE                                         5                  520 D          Y         N      +CODE_TEXT+TYPE+CODE_ID+CODE                                
IDX1505231648190     DB2             EDI_204                                 295885              1014841 D          Y         N      +CUST_S_LOCATION_ID+ORDER_ID+REFERENCE                      
DRIVER_ID            DB2             DRIVER                                   13519                13519 U          Y         Y      +DRIVER_ID+FIRST_NAME+LAST_NAME+PREVIOUS_NUM+DISPATCHER     
IDX201162151160000   DB2             CITY                                     97516                97516 U          Y         Y      +CITY_ID+STATE+CITY+TIME_ZONE                               
WIZ1166              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+ZIP+LONGITUDE+LATITUDE+NAME+CITY_ID            
WIZ1171              DB2             LOCATION                                140445               140445 U          Y         Y      +LOCATION_ID+NAME+NUMBER                                    
SQL991011111520810   DB2             DRIVER                                   13519                13519 P          Y         Y      +DRIVER_ID                                                  
SQL991011113441210   DB2             VENDOR_UNIT                               2976                 2976 P          Y         Y      +VENDOR_UNIT_ID                                             
IDX_CODE_01          DB2             CODE                                       518                  520 D          Y         Y      +TYPE+CODE_TEXT+CODE                                        
IDX_STOP_1           DB2             STOP                                     15883                21568 D          Y         Y      +ORDER_ID                                                   
IDX201161942390000   EDI             EDI_204_HUB_CUSTOMER                    451173               451173 D          Y         Y      +REFERENCE+DESCRIPTION                                      
IDX201171957580000   DB2             CITY                                     97516                97516 D          Y         Y      +STATE+CITY+CITY_ID                                         
IDX201171959320000   DB2             LOCATION                                140445               140445 D          Y         Y      +CITY_ID+DROP+ZIP+LONGITUDE+LATITUDE+NAME+LOCATION_ID       

Now one at a time, we will mark the indexes to not be used, and run the explain using evaluate indexes.

PS D:\xtivia\queries> db2 "update advise_index set use_index='N' where name = 'IDX1505231647110'"
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 set current explain mode evaluate indexes
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 -tvf query13.sql
select * from db2.order_trace  where  c_state = ? and dl_ap_date2 >= ? and  reference not in (select reference from db2.edi_204 edi_204 where edi_204.reference = reference and edi_204.order_id = order
_id  and CUST_s_location_id = ? ) and (status in (? ,? ) OR ORDER_TYPE = ? )  order by dl_ap_date2, dl_ap_time2, act_state, act_city, actual, order_id
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

PS D:\xtivia\queries> db2exfmt -d comtrak2 -1 -o query13_excl_ind1_exfmt.txt
DB2 Universal Database Version 9.7, 5622-044 (c) Copyright IBM Corp. 1991, 2009
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Base table information incomplete
Base table information incomplete
Base table information incomplete
Base table information incomplete
Base table information incomplete
Base table information incomplete
Output is in query13_excl_ind1_exfmt.txt.
Executing Connect Reset -- Connect Reset was Successful.
PS D:\xtivia\queries> db2 set current explain mode no
DB20000I  The SQL command completed successfully.
PS D:\xtivia\queries> db2 -tvf this_index_cost_query.sql
select dec(total_cost,20,4) as this_index_total_cost , dec(io_cost,20,4) as this_index_io_cost , dec(CPU_cost,20,4) as this_index_cpu_cost , dec(Comm_cost,20,4) as this_index_comm_cost from Explain_Op
erator ,(select max(explain_time) as maxtime from Explain_Operator where operator_type = 'RETURN') as b where explain_time = b.maxtime and operator_type = 'RETURN' with UR

THIS_INDEX_TOTAL_COST  THIS_INDEX_IO_COST     THIS_INDEX_CPU_COST    THIS_INDEX_COMM_COST
---------------------- ---------------------- ---------------------- ----------------------
            18741.7597              5339.1113         131033120.0000                 0.0000

  1 record(s) selected.

When I cycle through each index that way, ensuring that in each round I eliminate only one, this is what my spreadsheet looks like:
Screenshot_052115_111216_AM

Results

With this information, I can immediately eliminate indexes: IDX1505231647110, IDX1505231647160, IDX1505231652450, IDX1505231648160, and IDX1505231647170. This leaves me with two indexes to consider. Given the fact that EDI_204 is one of the largest and most active tables in my database, I won’t be adding that index for the small additional impact that it gives for this query. I’m left with just one major index:

   CREATE UNIQUE INDEX "DB2ADMIN"."IDX1505231651490"
   ON "DB2     "."ORDER" ("TRAILER_SIZE" ASC, "DL_AP_DATE2"
   ASC, "CUSTOMER_RATE" ASC, "NOTIFIED_DATE" ASC, "DATE_ENTERED"
   ASC, "COMMODITY" ASC, "NEEDS_PAPERWORK" ASC, "LINE_OF_BUSINESS"
   ASC, "TOTAL_CHARGE" ASC, "MILES" ASC, "TERMINAL" ASC,
   "SCAC" ASC, "SEAL" ASC, "PO_NUM" ASC, "PU_NUM" ASC,
   "BL_NUM" ASC, "REFERENCE" ASC, "WHO_ENTERED" ASC,
   "ORDER_TYPE" ASC, "TRACTOR_ID" ASC, "DRIVER_ID" ASC,
   "DL_AT_TIME2" ASC, "DL_AT_DATE2" ASC, "DL_AT_TIME1"
   ASC, "DL_AT_DATE1" ASC, "DL_AP_TIME2" ASC, "DL_AP_TIME1"
   ASC, "DL_AP_DATE1" ASC, "PU_AT_TIME2" ASC, "PU_AT_DATE2"
   ASC, "PU_AT_TIME1" ASC, "PU_AT_DATE1" ASC, "PU_AP_TIME2"
   ASC, "PU_AP_DATE2" ASC, "PU_AP_TIME1" ASC, "PU_AP_DATE1"
   ASC, "S_LOCATION_ID" ASC, "CHASSIS" ASC, "CHECK_DIGIT"
   ASC, "TRAILER" ASC, "STATUS" ASC, "ORDER_ID" ASC)
   INCLUDE ("C_LOCATION_ID", "ACT_LOCATION_ID", "BTO_LOCATION_ID")
   ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;

Given how wide this index is, I’m really concerned about adding it to a table that is already a bit over-indexed, so I haven’t decided yet if I’m going to add it. But at least I now understand that it’s the only index really worth considering for this query.

I find that for now, I like to use both the index addition and subtraction methods to really see what difference there might be in the results.

A big thanks to Scott Hayes for covering this topic in depth at his IDUG North America presentation this year. This process has become a regular tool in my arsenal.

Ember Crooks
Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

Articles: 544

12 Comments

  1. Hi Ember this is a very nice article, thanks for sharing it !!.

    After you run db2advis you can also get the explain plan of the query with the recommended/new indexes. In this case you didn’t copy/paste it but I wonder; most likely the big new index (IDX1505210120080) is used in a join with another table. Don’t you think maybe by creating an index using only the column(s) joined with that other table could provide a good result (timerons) without including the rest of the columns?. Cause normally DB2 will only suggest such huge indexes to do an index-scan…. but I guess the most important thing is to optimize the join.

    Regards

    • Absolutely, there is additional analysis I could do as to which columns in that index are truly necessary. Would probably make a good separate blog entry.

  2. You can do this from Data Studio and the Query Tuner. After the Index Advisor is run, you Test Candidate indexes by running explain with different combinations of indexes added or removed. You can then do a graph compares to see which indexes helped the most.

      • Even Im not but i recently tried my hands on the Query Tuner in IBM DS, believe me its generating an awesome report. Yeah i agree its basically drawing the data from explain information but html report its generating on a consolidated basis can be used for further discussion with the developers and RCA.

  3. Bit of a late response:-) I’ve been thinking about investigating this since I saw Scotts presentation at IDUG a couple of years ago, but I somehow forgot about it. The other day I stumbled on your article whilst looking for something else, so I thought I give it a shot. The article is great and easy to follow, so it was easy to get things going.
    I soon discovered that even with a moderate number of indexes, the process of evaluating the cost for different index combinations was a bit time-consuming. I also managed to screw up which indexes to enable/disable, so I had to start over to get things right. Here’s a sketch on how to automate this process, the idea is to generate a SQL script that saves the cost for each combination to a table. In the example, I only use total cost, but that easy is easy to extend. I also generate code for all possible index combinations. The SQL-generator is written in python2, so I guess it should work in a Jupyter notebook (haven’t tried though):

    {code}
    #!/usr/bin/python

    import sys
    from itertools import chain, combinations

    # table to keep track of cost for combination of indexes
    costtbl = “create table costtbl (nr int not null primary key, total_cost decimal(20,4) not null);”

    # indexes proposed by db2advis
    indexes = [ “IDX1904090821410”, “IDX1904090821340” ]

    # the query under investigation
    query = “select count(1) from nya.application a join nya.admission_round ar on a.admissionround_id = ar.admissionround_id where ar.process_state_id ./myscript.py | tee myscript.sql
    #> db2 connect to mydb
    #> db2 +c -s -tf myscript.sql
    #> db2 commit
    #> db2 “select * from costtbl”

    NR TOTAL_COST
    ———– ———————-
    0 40635.9140
    1 248.7855
    2 40635.9140
    3 248.7855

    NR is a bit-mask for the array of indexes (3 means first and second index). It’s just a sketch, but once the data is in a table we can do all sort of analyses on it. It should also be easy to add more dimensions to the cost-table.

    Thanks for a great article.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.