Db2 SQL Basics: The Column List

Posted by

Learning SQL can feel intimidating, even to some experienced DB2 DBAs. For years, you could administer Db2 without using SQL. There are a lot of developers, too who want to learn SQL. The set-thinking of SQL can be confusing. The complete detachment from the physical layer can confound others.

SQL, like many other things is something you can learn the basics of in a few hours, but to be a master, you have to spend hundreds or thousands of hours in real-life scenarios, and include an understanding of performance to really master. I spent a couple of years focusing my learning time on SQL several years ago, when I realized it was a weakness. I’ve come a long way, but there are still masters who make me look like a grasshopper.

The last entry in this series, Db2 SQL Basics: The Most Basic Select Statement, started with an SQL statement that is as simple as it gets – selecting all rows and all columns from a table. In this post we take just one step forward to select only specific columns. The methodology shown in this week’s article covers selecting specific columns.

What is a Column, and Why Limit the Columns Returned?

In a relational table, each row of the table represents an entity or occurrence, while each column represents one and only one attribute of the entities or occurrences. When writing SQL, the vast majority of the time, we need only a small subset of the columns or attributes. It should be rare that we’re returning everything. It is actually a bad coding practice to use “select *” for two reasons:

  1. Even if we need every column, SQL written for applications should generally specify the column names to future-proof the statement. Specifying the column names means that the result set cannot suddenly change shape when a column is added later. While some programming languages can handle unexpected columns, others cannot.
  2. The less data returned, the better performance. While this statement is not always true, it is true in the vast majority of situations. In some applications, particularly when using BLU (column-organized tables), the performance can be exponentially better.

Even if the tables are not fully normalized as we would expect in a relational database, the above still holds true. Many relational databases are not fully relational in their implementation.

Returning only Certain Columns/Attributes

The syntax used for returning only certain columns is quite simple. Insted of *, we simply specify the names of the columns that we want to return. In the last blog entry in this series we used this example:

$ db2 "SELECT * FROM SYSCAT.BUFFERPOOLS"

BPNAME                                                                                                                           BUFFERPOOLID DBPGNAME                                                                                                                         NPAGES      PAGESIZE    ESTORE NUMBLOCKPAGES BLOCKSIZE   NGNAME                                                                                                                          
-------------------------------------------------------------------------------------------------------------------------------- ------------ -------------------------------------------------------------------------------------------------------------------------------- ----------- ----------- ------ ------------- ----------- --------------------------------------------------------------------------------------------------------------------------------
IBMDEFAULTBP                                                                                                                                1 -                                                                                                                                         -2        8192 N                  0           0 -                                                                                                                               

  1 record(s) selected.

More useful than that is to return only select column names. Try this instead:

$ db2 "select bufferpoolid, bpname, pagesize from syscat.bufferpools"

BUFFERPOOLID BPNAME                                                                                                                           PAGESIZE   
------------ -------------------------------------------------------------------------------------------------------------------------------- -----------
           1 IBMDEFAULTBP                                                                                                                            8192

  1 record(s) selected.

The above information may be far more useful if all I need to know is the name of each buffer pool and its page size.

What are the Column Names for a Table?

Sometimes you don’t know the data structure for a table you’re querying. In this case, you can find that information by using the DESCRIBE command. It is used at the command line like this:

$ db2 describe table syscat.bufferpools

                                Data type                     Column
Column name                     schema    Data type name      Length     Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
BPNAME                          SYSIBM    VARCHAR                    128     0 No    
BUFFERPOOLID                    SYSIBM    INTEGER                      4     0 No    
DBPGNAME                        SYSIBM    VARCHAR                    128     0 Yes   
NPAGES                          SYSIBM    INTEGER                      4     0 No    
PAGESIZE                        SYSIBM    INTEGER                      4     0 No    
ESTORE                          SYSIBM    CHARACTER                    1     0 No    
NUMBLOCKPAGES                   SYSIBM    INTEGER                      4     0 No    
BLOCKSIZE                       SYSIBM    INTEGER                      4     0 No    
NGNAME                          SYSIBM    VARCHAR                    128     0 Yes   

  9 record(s) selected.

This gives me an easy formatted list of every column in the table or view, and what data type each column uses. Many GUIs also provide easy access to this information.

More complex entries in this series are coming, so stay tuned!

Lead Db2 Database Engineer and Service Delivery Manager , XTIVIA
Ember is always curious and thrives on change. Working in IT provides a lot of that change, but after 17 years developing a top-level expertise on Db2 for mid-range servers and more than 7 years blogging about it, Ember is hungry for new challenges and looks to expand her skill set to the Data Engineering role for Data Science. With in-depth SQL and RDBMS knowledge, Ember shares both posts about her core skill set and her journey into Data Science. Ember lives in Denver and work from home for XTIVIA, leading a team of Db2 DBAs.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.