Example of Indexing for a Problem Delete

Posted by

Go-lives bring all kinds of opportunities to find problem SQL. A new site going live or a new site design going live can put a completely different load of SQL on a database. Depending on the level of load testing done, there can be some problems still to find. This post is an example of just one problem SQL statement (in this case, a delete) found and mitigated through database level analysis.

Finding the Problem

After a recent go-live for a client that did not do much load testing, I found the following delete statement showing up as a potential problem:

STATEMENT                        ROWS_READ            PCT_TOT_RR ROWS_RETURNED        READ_EFFICIENCY NUM_EXECUTIONS
-------------------------------- -------------------- ---------- -------------------- --------------- --------------------
delete from EDPPAYINST where ...             47771786      32.00                    0           -1.00                   11

That’s in an hour. And a good example why I don’t look only for statements with poor read efficiency – notice that since this query doesn’t return any rows, read efficiency is not applicable here. And sure, it’s only executed 11 times an hour, but it is doing that every hour, and managing to be one of the top two queries in terms of rows read in my database. There is no need to read those 40 million plus rows if we don’t have to.

Note that there were other problem queries I found here – I’m just pulling this one out as an example.

Analyzing the Statement

First, I put the statement in a file – go-live_query1.sql – and made sure the tables were fully qualified with the schema name – something WebSphere Commerce does not do.

Then I captured explain information and generated the explain plan:

db2 connect to SAMPLE

   Database Connection Information

 Database server        = DB2/LINUXX8664 9.7.6
 SQL authorization ID   = DB2INST1
 Local database alias   = SAMPLE

$ db2 set current explain mode explain
DB20000I  The SQL command completed successfully.
$ db2 -tvf go-live_query1.sql
delete from wscomusr.EDPPAYINST where EDPORDER_ID in (select EDPORDER_ID from wscomusr.EDPORDER where ORDER_ID in (select ORDERS_ID from wscomusr.ORDERS where  STATUS = 'P' and MEMBER_ID = ? ))
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

$ db2exfmt -d SAMPLE -1 -o go-live_query1.sql.exfmt
DB2 Universal Database Version 9.7, 5622-044 (c) Copyright IBM Corp. 1991, 2009
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Output is in go-live_query1.sql.exfmt.
Executing Connect Reset -- Connect Reset was Successful.

Now, I have the explain information in the file that looks like this (the top portion anyway):

DB2 Universal Database Version 9.7, 5622-044 (c) Copyright IBM Corp. 1991, 2009
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool



******************** EXPLAIN INSTANCE ********************

DB2_VERSION:       09.07.6
SOURCE_NAME:       SQLC2H23
SOURCE_SCHEMA:     NULLID
SOURCE_VERSION:
EXPLAIN_TIME:      2013-04-05-08.29.39.064276
EXPLAIN_REQUESTER: DB2INST1

Database Context:
----------------
        Parallelism:          None
        CPU Speed:            2.401083e-07
        Comm Speed:           100
        Buffer Pool size:     5985820
        Sort Heap size:       2500
        Database Heap size:   2467
        Lock List size:       778829
        Maximum Lock List:    98
        Average Applications: 1
        Locks Available:      24424078

Package Context:
---------------
        SQL Type:           Dynamic
        Optimization Level: 5
        Blocking:           Block All Cursors
        Isolation Level:    Cursor Stability



---------------- STATEMENT 1  SECTION 203 ----------------
        QUERYNO:       2
        QUERYTAG:      CLP
        Statement Type:   Searched Delete
        Updatable:        Not Applicable
        Deletable:        Not Applicable
        Query Degree:     1

Original Statement:
------------------
delete
from wscomusr.EDPPAYINST
where EDPORDER_ID in
   (select EDPORDER_ID
   from wscomusr.EDPORDER
   where ORDER_ID in
      (select ORDERS_ID
      from wscomusr.ORDERS
      where STATUS = 'P' and MEMBER_ID = ?))


Optimized Statement:
-------------------
$WITH CONTEXT$($CONSTRAINT$(WSCOMUSR.EDPATMPAY.F_1097),
        $CONSTRAINT$(WSCOMUSR.EDPRELHIST.F_1101),
        $CONSTRAINT$(WSCOMUSR.EDPPAYHIST.F_1159))
DELETE
FROM WSCOMUSR.EDPPAYINST AS Q5
WHERE $RID$ IN
   (SELECT Q8.EDPPAYINST_ID
   FROM WSCOMUSR.ORDERS AS Q6, WSCOMUSR.EDPORDER AS Q7, WSCOMUSR.EDPPAYINST
           AS Q8
   WHERE (Q6.STATUS = 'P') AND (Q6.MEMBER_ID = ?) AND (Q7.ORDER_ID =
           Q6.ORDERS_ID) AND (Q8.EDPORDER_ID = Q7.EDPORDER_ID))

Access Plan:
-----------
        Total Cost:             424147
        Query Degree:           1

                                                            Rows
                                                           RETURN
                                                           (   1)
                                                            Cost
                                                             I/O
                                                             |
                                                          0.333333
                                                           FILTER
                                                           (   2)
                                                           424147
                                                           415578
        +----------------------------------------------------+----------------+-----------------------------------+
         1                                                                     0                               0.0717332
      TBSCAN                                                                DELETE                              DELETE
      (   3)                                                                (   4)                              (  21)
    2.8813e-05                                                              212078                              212070
         0                                                                  207789                              207788
        |                                                                 /---+---\                          /----+----\
         1                                                               0           0                  0.0717332      684953
 TABFNC: SYSIBM                                                       NLJOIN  TABLE: WSCOMUSR            NLJOIN    TABLE: WSCOMUSR
      GENROW                                                          (   5)    EDPRELHIST               (  22)       EDPATMPAY
                                                                      212078        Q3                   212069          Q17
                                                                      207789                             207788
                                                             /----------+----------\                  /----+----\
                                                        0.145504                      0          0.0724482     0.990131
                                                         DELETE                    IXSCAN         TBSCAN        IXSCAN
                                                         (   6)                    (  20)         (  23)        (  24)
                                                         212078                  0.00967348       212061        15.1398
                                                         207789                       0           207787           2
                                                       /---+----\                    |              |             |
                                                  0.145504    1.38936e+06             0          0.0724482      684953
                                                   NLJOIN   TABLE: WSCOMUSR    INDEX: SYSIBM      TEMP      INDEX: WSCOMUSR
                                                   (   7)     EDPPAYHIST     SQL121004092950890   (   9)       I0000846
                                                   212077         Q4                 Q14          212061          Q19
                                                   207789                                         207787
                                             /-------+-------\
                                        0.0724482            2.00838
                                         TBSCAN              FETCH
                                         (   8)              (  18)
                                         212061              22.7027
                                         207787                 3
                                           |               /---+----\
                                        0.0724482      2.00838    1.38936e+06
                                         TEMP          IXSCAN   TABLE: WSCOMUSR
                                         (   9)        (  19)     EDPPAYHIST
                                         212061        15.139         Q11
                                         207787           2
                                           |             |
                                        0.0724482    1.38936e+06
                                         DELETE    INDEX: WSCOMUSR
                                         (  10)       I0000910
                                         212061          Q11
                                         207787
                                      /----+----\
                                 0.0724482      691780
                                  NLJOIN    TABLE: WSCOMUSR
                                  (  11)      EDPPAYINST
                                  212061          Q5
                                  207787
                  /-----------------+------------------\
              1.00049                                 0.0724127
              ^HSJOIN                                  FETCH
              (  12)                                   (  16)
              212045                                   15.6861
              207785                                   2.07241
         /------+-------\                           /----+----\
   9.55329e+06          1.00049                0.0724127      691780
     TBSCAN             FETCH                   IXSCAN    TABLE: WSCOMUSR
     (  13)             (  14)                  (  17)      EDPPAYINST
     211522             22.7026                 15.1381         Q8
     207782             3.00017                    2
       |              /---+----\                  |
   9.55329e+06    1.00049       83849           691780
 TABLE: WSCOMUSR  IXSCAN   TABLE: WSCOMUSR  INDEX: WSCOMUSR
    EDPORDER      (  15)       ORDERS          I0000844
       Q7         15.1377        Q6               Q8
                     2
                    |
                   83849
              INDEX: WSCOMUSR
                 I0000176
                    Q6

For a split second, my first thought was “why is it involving EDPPAYINST, EDPPAYHIST, and other tables not directly referenced in the statement?”, but then it hit me – cascading deletes. WebSphere Commerce databases depend on cascading deletes, so most of the time when you explain a delete, you’ll see the tables it cascades to.

If you have a WebSphere Commerce database, try explaining a delete on MEMBER or USER – they hit the most tables of any of them. The explain will take a while and will be hard to read, but it is interesting.

Now I have that basic information, I also like to have the DB2 design advisor recommendations. That looks like:

$ db2advis -d SAMPLE -i go-live_query1.sql |tee go-live_query1.sql.advis
execution started at timestamp 2013-04-05-08.32.03.476088
found [1] SQL statements from the input file
Recommending indexes...
total disk space needed for initial set [ 337.919] MB
total disk space constrained to         [6260.996] MB
Trying variations of the solution set.
  6  indexes in current solution
 [424147.2812] timerons  (without recommendations)
 [123.9237] timerons  (with current solution)
 [99.97%] improvement


--
--
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- index[1],    2.993MB
   CREATE INDEX "DB2INST1"."IDX1304051332060" ON "WSCOMUSR"."ORDERS"
   ("MEMBER_ID" ASC, "STATUS" ASC, "ORDERS_ID" ASC) ALLOW
   REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[2],  275.005MB
   CREATE INDEX "DB2INST1"."IDX1304051332230" ON "WSCOMUSR"."EDPORDER"
   ("ORDER_ID" ASC, "EDPORDER_ID" ASC) ALLOW REVERSE
   SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[3],   19.915MB
   CREATE INDEX "DB2INST1"."IDX1304051332270" ON "WSCOMUSR"."EDPPAYINST"
   ("EDPORDER_ID" ASC, "EDPPAYINST_ID" ASC) ALLOW REVERSE
   SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[4],   39.997MB
   CREATE INDEX "DB2INST1"."IDX1304051332320" ON "WSCOMUSR"."EDPPAYHIST"
   ("EDPPAYINST_ID" ASC, "EDPPAYHIST_ID" ASC) ALLOW REVERSE
   SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;
-- index[5],    0.009MB
   CREATE INDEX "DB2INST1"."IDX1304051332340" ON "WSCOMUSR"."EDPRELHIST"
   ("EDPPAYHIST_ID" ASC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;


--
--
-- RECOMMENDED EXISTING INDEXES
-- ============================
-- RUNSTATS ON TABLE "WSCOMUSR"."ORDERS" FOR SAMPLED DETAILED INDEX "WSCOMUSR"."I0000176" ;
-- COMMIT WORK ;
-- RUNSTATS ON TABLE "WSCOMUSR"."EDPPAYINST" FOR SAMPLED DETAILED INDEX "WSCOMUSR"."I0000844" ;
-- COMMIT WORK ;
-- RUNSTATS ON TABLE "WSCOMUSR"."EDPATMPAY" FOR SAMPLED DETAILED INDEX "WSCOMUSR"."I0000846" ;
-- COMMIT WORK ;
-- RUNSTATS ON TABLE "WSCOMUSR"."EDPPAYHIST" FOR SAMPLED DETAILED INDEX "WSCOMUSR"."I0000910" ;
-- COMMIT WORK ;
-- RUNSTATS ON TABLE "WSCOMUSR"."EDPRELHIST" FOR SAMPLED DETAILED INDEX "SYSIBM  "."SQL121004092950890" ;
-- COMMIT WORK ;


--
--
-- UNUSED EXISTING INDEXES
-- ============================
-- DROP INDEX "WSCOMUSR"."I0000843";
-- DROP INDEX "WSCOMUSR"."I0000850";
-- DROP INDEX "WSCOMUSR"."I0000845";
-- DROP INDEX "WSCOMUSR"."I0000852";
-- DROP INDEX "WSCOMUSR"."I0000652";
-- DROP INDEX "WSCOMUSR"."I0000653";
-- DROP INDEX "WSCOMUSR"."I0000654";
-- DROP INDEX "WSCOMUSR"."I0000892";
-- DROP INDEX "WSCOMUSR"."I0000933";
-- DROP INDEX "WSCOMUSR"."I173124";
-- DROP INDEX "WSCOMUSR"."I0001267";
-- ===========================
--

-- ====ADVISOR DETAILED XML OUTPUT=============
-- ==(Benefits do not include clustering recommendations)==

...

Making Decisions

Never blindly take the recommendations of the design advisor – especially the deletes. The design advisor’s recommendations are as if this workload you’ve fed it (in this case, a single delete statement) are the only workloads on the database. It also aims for index-only access wherever possible, which can be appropriate for a query that is a large problem, with a measurable impact on an end-user process like checkout. But in this case, we have no evidence of end-user impact.

The tables we’re looking at here are very directly involved in user operations such as checkout where insert performance is critical. I hate to add an index to ORDERS or any order or payment related tables like these, because it can slow down those processes. So from the moment I saw the recommendations, I knew that I wasn’t about to do all of these indexes. I needed to figure out if there was just one that wouldn’t involve too much duplication that could help this delete, or if I was going to have to go back to the developers first and ask how critical this activity was.

If you look at the explain plan, note these two sections:

   9.55329e+06  
     TBSCAN     
     (  13)     
     211522     
     207782     
       |        
   9.55329e+06  
 TABLE: WSCOMUSR
    EDPORDER    
       Q7       
 

AND

     /----+----\
0.0724482     0.990131
 TBSCAN        IXSCAN
 (  23)        (  24)
 212061        15.1398
 207787           2
   |             |
0.0724482      684953
 TEMP      INDEX: WSCOMUSR
 (   9)       I0000846
 212061          Q19
 207787

If you dig through the detailed explain plan, you would see that the first is table scanning EDPORDER, and then feeding the output through several other transformations and filters, and feeding most of that tablescan into a temp table scan in operator #23. So if I can eliminate that table scan, I can eliminate the two most expensive parts of my explain plan and get the vast majority of the reduction in timerons and in rows read as well. Looking at the design advisor output, I see this recommended index for that table:

-- index[2],  275.005MB
   CREATE INDEX "DB2INST1"."IDX1304051332230" ON "WSCOMUSR"."EDPORDER"
   ("ORDER_ID" ASC, "EDPORDER_ID" ASC) ALLOW REVERSE
   SCANS COLLECT SAMPLED DETAILED STATISTICS;
   COMMIT WORK ;

To fully analyze that, I need to look at the other indexes on the table:

INDNAME              UNIQUERULE COLNAMES                  LASTUSED
-------------------- ---------- ------------------------- ----------
SQL121004092949970   P          +EDPORDER_ID              04/13/2013
I0000843             D          +STORE_ID+ORDER_ID        04/13/2013

There, we see that the new index would duplicate columns that are already in other indexes. However, I don’t yet have an index with ORDER_ID as the first column, and for the way this table is used, that seems like a decent idea anyway. The index to be created is also a bit on the large side – 275 MB, but I still think it is worth it for the kind of improvement we’re talking about. I also know that the cardinality of the new index will be very high since it happens to include the primary key. Maybe in version 10, a combination of jump scans and index anding would make this index obsolete, but I’m on 9.7 and since WebSphere Commerce does not yet support DB2 10, I think it unlikely that I’ll get there any time this year.

Adding my own naming standards and clearing up the syntax, this is what I end up adding:

db2 "CREATE INDEX WSCOMUSR.I_EDPORDER01 ON WSCOMUSR.EDPORDER (ORDER_ID ASC, EDPORDER_ID ASC) ALLOW REVERSE SCANS COLLECT DETAILED STATISTICS"

When creating a new object, you should create it in your test, QA or any other lower environments first, before moving it to stage. After creating a new index, it is always important to do another explain to ensure that your new index is used. Here’s what that explain looked like for me:

Access Plan:
-----------
        Total Cost:             147.709
        Query Degree:           1

                                                               Rows
                                                              RETURN
                                                              (   1)
                                                               Cost
                                                                I/O
                                                                |
                                                             0.333333
                                                              FILTER
                                                              (   2)
                                                              147.709
                                                              19.5123
        +-------------------------------------------------------+--------------------+-----------------------------------+
         1                                                                            0                               0.0717332
      TBSCAN                                                                       DELETE                              DELETE
      (   3)                                                                       (   4)                              (  21)
    2.8813e-05                                                                      77.92                              69.7886
         0                                                                         10.293                              9.21924
        |                                                                        /---+---\                          /----+----\
         1                                                                      0           0                  0.0717332      684953
 TABFNC: SYSIBM                                                              NLJOIN  TABLE: WSCOMUSR            NLJOIN    TABLE: WSCOMUSR
      GENROW                                                                 (   5)    EDPRELHIST               (  22)       EDPATMPAY
                                                                              77.92        Q3                   69.2461         Q17
                                                                             10.293                             9.14751
                                                                    /----------+----------\                  /----+----\
                                                               0.145504                      0          0.0724482     0.990131
                                                                DELETE                    IXSCAN         TBSCAN        IXSCAN
                                                                (   6)                    (  20)         (  23)        (  24)
                                                                77.9095                 0.00967348       61.6671       15.1398
                                                                10.293                       0           8.14751          2
                                                              /---+----\                    |              |             |
                                                         0.145504    1.38936e+06             0          0.0724482      684953
                                                          NLJOIN   TABLE: WSCOMUSR    INDEX: SYSIBM      TEMP      INDEX: WSCOMUSR
                                                          (   7)     EDPPAYHIST     SQL121004092950890   (   9)       I0000846
                                                          76.8091        Q4                 Q14          61.6595         Q19
                                                          10.1475                                        8.14751
                                                    /-------+-------\
                                               0.0724482            2.00838
                                                TBSCAN              FETCH
                                                (   8)              (  18)
                                                61.6671             22.7027
                                                8.14751                3
                                                  |               /---+----\
                                               0.0724482      2.00838    1.38936e+06
                                                TEMP          IXSCAN   TABLE: WSCOMUSR
                                                (   9)        (  19)     EDPPAYHIST
                                                61.6595       15.139         Q11
                                                8.14751          2
                                                  |             |
                                               0.0724482    1.38936e+06
                                                DELETE    INDEX: WSCOMUSR
                                                (  10)       I0000910
                                                61.6567         Q11
                                                8.14751
                                             /----+----\
                                        0.0724482      691780
                                         NLJOIN    TABLE: WSCOMUSR
                                         (  11)      EDPPAYINST
                                         61.1088         Q5
                                         8.07506
                           /---------------+---------------\
                       1.00049                            0.0724127
                       NLJOIN                              FETCH
                       (  12)                              (  16)
                       45.4151                             15.6861
                       6.00163                             2.07241
               /---------+---------\                    /----+----\
           1.00049                    1            0.0724127      691780
           FETCH                   IXSCAN           IXSCAN    TABLE: WSCOMUSR
           (  13)                  (  15)           (  17)      EDPPAYINST
           22.7026                 22.7014          15.1381         Q8
           3.00017                    3                2
         /---+----\                  |                |
     1.00049       83849         9.55329e+06        691780
     IXSCAN   TABLE: WSCOMUSR  INDEX: WSCOMUSR  INDEX: WSCOMUSR
     (  14)       ORDERS        I_EDPORDER01       I0000844
     15.1377        Q6               Q7               Q8
        2
       |
      83849
 INDEX: WSCOMUSR
    I0000176
       Q6

Whee, we went from over 420,000 timerons in the first explain to just 148 timerons with just one index addition. Not only is it using the index, but where DB2 used to have to scan over 200,000 rows to execute this delete, it is now handling just three.

I also like to check several days later to make sure the index is still used. In this case, the LASTUSED column in syscat.indexes shows that it is, indeed, being used.

The design advisor said that we could get this query down to about 124 timerons by adding all of the suggested indexes. But given the importance of inserts for these tables, I’m not willing to try the other indexes unless someone else notices that this query is a problem or that a process that this query is involved in is notably slow for end-users. They would be high-risk as far as indexes go, even before I analyze them further.

A possible follow up here is to also run the query and index by the development team to see if they have any thoughts or objections. In some organizations, you’ll have to do that first anyway to get approval to add it in production.

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, and is now pivoting to focus on learning MySQL. Ember shares both posts about her core skill set and her journey learning MySQL. Ember lives in Denver and work from home

9 comments

  1. > cascading deletes
    Thanks for the detailed blog entry. Note that simple foreign keys can cause exactly the same problems. Problems that you’ll never notice when just inserting.

  2. Can you post the the query you used to find the delete statement in the first entry of this post ……………

    1. It is actually querying my own set of custom history tables based on MON_GET_PKG_CACHE_STMT, so it wouldn’t nesecarily work against other tables.

  3. Its from the MON_GET_PKG_CACHE_STMT 9.7 , is there a way we can get this from 9.1 , i mean throught Admin VIews or some how……………….

      1. the interesting column i found is ROWS RETURNED which i cannot find in any of these view 9.1 , so i thought have i missed anything and thought of confirming with you …………Anyways how’s your trip to IDUG Orlando???????? Hope you enjoyed it a lot ……can sense it in your tweets…..

  4. Hi Ember,

    Thanks a lot for all your posts which is very helpful in learning’s and resolving problems. In the example, we created here an index suggested by db2 adviser only. Sometime, we face issue like where indexes recommended by db2 also do not help in complex queries. As you said in post, don’t relay on db2 adviser output. Is it possible to explain with an example what to do and how to choose and decide which index we should create and what are other points we should consider while looking at explain plan.

    The other question – is it advisable always to include primary key columns combination while creating index to increase cardinality.

    1. Only include primary key columns if they will also be queried along with the columns you are indexing on. Artificially inflating cardinality does not improve things – in fact, it may make performance worse.

      If you have a low-cardinality index that you wish to create, and there are no high-cardinality columns that also make sense to include based on the queries run against the table, then you should consider making it a clustering index or a dimension for MDC. Or even if higher cardinality is introduced by join with another table, consider a statistical view.

      Beware of any low-cardinality indexes of 3-5% or less of the table cardinality, especially if the data is unevenly distributed over the values. I do my best not to add such indexes as they may be detrimental to performance.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.