Fun with IN-lists and GENROW

IN-lists are poor performers. Here's what they look like in explain plans and why they perform poorly.

I’ve been having some fun with explains lately. I’ve been preparing a presentation for some developers on how to analyze queries – I’ll be presenting next week when I’m on-site. So I started digging through some explain plans I had sitting around from various times when I’ve need to do analyses, and I started going through them. One reccomendation on my list of “DB2 SQL Best Practices” is to avoid in-lists whenever possible. Just one of those tips that I’ve always known and never really thought about. So I start digging into this excerpt from explain plan:

                                                     (   6)                                         
                                                  1          55804                            
                                               RIDSCN   TABLE: WSCOMUSR                       
                                               (   7)      CATENTRY                                
        1                1                1                1                1                1   
     SORT             SORT             SORT             SORT             SORT             SORT   
     (   8)           (  10)           (  12)           (  14)           (  16)           (  18)
     12.8719          12.8719          12.8719          12.8719          12.8719          12.8719
        1                1                1                1                1                1
       |                |                |                |                |                |
        1                1                1                1                1                1
     IXSCAN           IXSCAN           IXSCAN           IXSCAN           IXSCAN           IXSCAN
     (   9)           (  11)           (  13)           (  15)           (  17)           (  19)
     12.8711          12.8711          12.8711          12.8711          12.8711          12.8711
        1                1                1                1                1                1
       |                |                |                |                |                |
      55804            55804            55804            55804            55804            55804
    I0000518         I0000518         I0000518         I0000518         I0000518         I0000518

Having not done much in-depth analysis recently and most of my recent analysis being of the popcorn and bubble-gum variety using the index advisor, I wonder why it is scanning the same index 6 times (and that is taking up just less than 1/2 of the total timerons). Looking into the query itself, I find an IN list with exactly 6 items – whaddya know. So in this case, DB2 had to scan the SAME index 6 times -once for each item in the list, and then it did a RID scan of that list and fetched the rest of the data from the base table. I now have proof of how IN lists can work and a nifty example for the developers.

So digging into more queries, I find this access plan:

Access Plan:
  Total Cost:   102.842
  Query Degree:  1

             (   1)
             (   2)
         4             1
      TBSCAN        IXSCAN
      (   3)        (   6)
    0.00286127      25.7188
         0             2
        |             |
         4           55804
      SORT      INDEX: DB2INST1
      (   4)        IBM_007
      (   5)

Nothing too expensive or complicated here – single table access. But why the heck is it doing a JOIN when the query only accesses a single table with no subqueries? And furthermore, what’s this GENROW thing that it’s doing a table scan on and getting 4 rows. The documentation is real helpful on GENROW (from the Info Center):

Generates a table of rows.

Digging a bit further, I find:

This operator is used by the optimizer to generate rows of data.

Operator name: GENROW

Represents: A built-in function that generates a table of rows, using no input from tables, indexes, or operators.

GENROW can be used by the optimizer to generate rows of data (for example, for an INSERT statement or for some IN-lists that are transformed into joins).

Ok, so burried in there is a reference to IN-lists and commerce tends to be a bit IN-list happy. Looking at the query, I find an IN-list with 4 values. So DB2 is essentially treating that 4 rows as a (non-indexed) table, and then doing an NLJOIN between it and the table itself. So on a single-table query, I get a sort, a table scan, and a join, even though I’m getting index-only access to the table directly.

Much of this I’ve learned before when digging into the details of explains, but I haven’t focused so much on how IN-lists are handled, and thought it was interesting.

=) yes, this is my idea of fun on a Thursday night.

Ember Crooks
Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

Articles: 555


  1. Hello, Great Post !. I’m facing the same situation. I have 2 equal databases in 2 different servers, but when I execute the explain plan.. one is using the same index several times and the other is using genrow. What’s the difference? Could be statistics or anything else? Thanks. Regards.

    • Do they have the same statistics and exactly the same data? My guess would be that they have different data or different statistics. I would look also at if one is much faster than the other.

  2. Thanks, for the information on genrow. I tried using “or” clause instead of in-list for example “where (pred1=”value1″ or pred1=”value2″)” but it still produces genrow section.

  3. may be by now, you would have found the way to avoid multiple index scan with in-list.

    I have tried something like this and was able to avoid multiple index scan.

    Original Query :
    select pm.id_value,pm.token
    where pm.ENTITY_STATUS_CREATING_DATE between ‘01.03.2017 00:00:00.000000’ and ‘01.03.2017 23:59:59.999999’
    and pm.type in (‘virtualCreditCardToken’, ‘creditCardToken’)
    for read only with ur

    Modified query :

    select pm.id_value,pm.token
    (values(‘virtualCreditCardToken’) union values(‘creditCardToken’) ) as v1(c1)
    on v1.c1=pm.type
    where pm.ENTITY_STATUS_CREATING_DATE between ‘2017-03-01-’ and ‘2017-03-01-’
    for read only with ur

    create temp table in the query. So far modified query didn’t give incorrect results :).

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.