SQL Tip: COALESCE

Posted by

I haven’t generally been known for my SQL tips. I can find my way around SQL decently enough, but for years, I didn’t do much of it. I’ve used it more and more as the years have gone by as I have moved out of a stictly physical/system DBA role and into a more mixed role where I interact and collaborate with our development team. I’ve also developed some of my own tools, and that has taught me worlds about more complicated SQL.

Some of my SQL tips are going to seem pretty darn basic, but the truth is that the most basic topics are my most popular articles, and I might have a tidbit here and there that some of my readers aren’t familiar with.

This first tip is centered on the COALESCE. I actually first became aware of COALESCE when I took an Oracle administration class. We had to use a COALESCE function in class, and I thought there must be something like it in DB2.

COALESCE vs. CASE

The whole point of COALESCE is to help you deal with a situation where you may want to return something different if a value is NULL. This is pretty common, and while you can do the same functionality with a CASE statement, COALESCE is shorter and simpler.

Basically, the whole point of COALESCE is that it is a function that returns the first value that is not NULL. You can pass in as many (compatible) arguments as you like – it can handle any data types. Many times, the last argument passed in is a string literal or constant (number).

How can this help you? Let’s take a look at an example. In my case, I needed to return a value from a table, but if that value happened to be NULL, I needed instead to return a default value of -20. In my case, the CASE statement looked like this:

case 
    when stgfilter is null 
        then -20 
    else stgfilter 
end

Not complicated, but still several lines to get what I need. When using COALESCE to do the same logic, I can use:

coalesce(stgfilter,-20)

Performance

Logically, it feels like COALESCE has less work to do, so should perform faster. But I don’t know if the DB2 optimizer really knows the difference. When I did a little experiment with my two methods on tiny little <200 row tables, I got the following two explain plans:

Case statement

Access Plan:
-----------
        Total Cost:             15.1685
        Query Degree:           1
 
                             Rows
                            RETURN
                            (   1)
                             Cost
                              I/O
                              |
                               1
                           >^NLJOIN
                           (   2)
                            15.1683
                               2
                     /--------+---------\
                    1                      1
                 >NLJOIN                IXSCAN
                 (   3)                 (   7)
                 15.1579               0.0103311
                    2                      0
           /-------+-------\              |
          1                   1           18
       IXSCAN              FETCH    INDEX: WSCOMUSR
       (   4)              (   5)   X_STR_MAP_IX01
      0.0145029            15.1434        Q5
          0                   2
         |               /---+----\
         110            1          26359
   INDEX: SYSIBM     IXSCAN   TABLE: WSCOMUSR
SQL120207213008270  (   6)      STORECENT
         Q1          7.57994        Q3
                        1
                       |
                      26359
                 INDEX: WSCOMUSR
                    I0000345
                       Q3

COALESCE

Access Plan:
-----------
        Total Cost:             15.1685
        Query Degree:           1
 
                             Rows
                            RETURN
                            (   1)
                             Cost
                              I/O
                              |
                               1
                           >^NLJOIN
                           (   2)
                            15.1683
                               2
                     /--------+---------\
                    1                      1
                 >NLJOIN                IXSCAN
                 (   3)                 (   7)
                 15.1579               0.0103311
                    2                      0
           /-------+-------\              |
          1                   1           18
       IXSCAN              FETCH    INDEX: WSCOMUSR
       (   4)              (   5)   X_STR_MAP_IX01
      0.0145029            15.1434        Q5
          0                   2
         |               /---+----\
         110            1          26359
   INDEX: SYSIBM     IXSCAN   TABLE: WSCOMUSR
SQL120207213008270  (   6)      STORECENT
         Q1          7.57994        Q3
                        1
                       |
                      26359
                 INDEX: WSCOMUSR
                    I0000345

No, I did not copy and paste wrong – those are two exactly identical explain plans. And when I checked, the re-written SQL did not show one method had been converted to the other.

I can’t exactly exrapolate these results to more complex situations, of course, but I still thought it was interesting.

I would love it if any readers have any knowledge about the performance differences of CASE vs. COALESCE that they’d be willing to share in the comments.

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, and is now pivoting to focus on learning MySQL. Ember shares both posts about her core skill set and her journey learning MySQL. Ember lives in Denver and work from home

2 comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.