Db2 SQL Basics: The Most Basic Select Statement

Posted by

Learning SQL can feel intimidating, even to some experienced DB2 DBAs. For years, you could administer Db2 without using SQL. There are a lot of developers, too who want to learn SQL. The set-thinking of SQL can be confusing. The complete detachment from the physical layer can confound others.

SQL, like many other things is something you can learn the basics of in a few hours, but to be a master, you have to spend hundreds or thousands of hours in real-life scenarios, and include an understanding of performance to really master. I spent a couple of years focusing my learning time on SQL several years ago, when I realized it was a weakness. I’ve come a long way, but there are still masters who make me look like a grasshopper.

The Simplest Statement

This post is a starting place. The first statement you’ll likely ever run is as simple as it gets. Your goal here is to just return all the data from a single table. As an example, we’ll pick a likely small system table that will exist in any Db2 (LUW) database, including the SAMPLE Database.

The statement to start with is:

SELECT *
FROM SYSCAT.BUFFERPOOLS;

Let’s go over that in extreme detail. The first word, SELECT is the normal starting word of a query. It identifies for Db2 that you wish to return something. The next word is just the asterisk character(*). In SQL, this reads as “everything”.

The FROM keyword is the only other required word in every Db2 query. It tells Db2 that you have stopped specifying what to return, and are now specifying where to get it from. The SYSCAT above is a schema name. Every table and nearly every object exists in a logical grouping called a schema. Your default schema is your user name, but for Db2 the tie to the user name is not strong. There can easily be a table in your schema that you don’t have access to, and removing a user does not remove any of her objects. The SYSCAT schema is one that stores views on the Db2 system catalog that describes objects and activities relating to the database. You can tell this is a schema by the single-dot notation.

After the dot is the table name. In this case, BUFFERPOOLS. The bufferpools table lists out each buffer pool (a type of memory area) along with some basic details about each buffer pool.

Generally, SQL gets much more complicated than this and does much more. Returning the entire contents of a table is rarely what you want to do, and is generally inefficient. It is possible you might do this to get the data into a data frame in Python, but it is generally much more efficient to select only specific columns or to filter the rows using a where clause, or perform a million different types of transformations and joins on the data being returned in SQL before it has to be sent over the network or stored in memory by a program.

How to Execute SQL

There are a vast array of different ways to execute SQL against a Db2 database. My favorite lately is using the SQL Magic in Jupyter Notebooks. If you’re on Windows, you might want to use a PowerShell command line or a PowerShell script. If you have a Linux/UNIX command line with the db2profile sourced, all you have to do to execute this is:

$ db2 connect to sample

   Database Connection Information

 Database server        = DB2/LINUXX8664 11.1.3.3
 SQL authorization ID   = DB2INST1
 Local database alias   = SAMPLE

$ db2 "SELECT * FROM SYSCAT.BUFFERPOOLS"

BPNAME                                                                                                                           BUFFERPOOLID DBPGNAME                                                                                                                         NPAGES      PAGESIZE    ESTORE NUMBLOCKPAGES BLOCKSIZE   NGNAME                                                                                                                          
-------------------------------------------------------------------------------------------------------------------------------- ------------ -------------------------------------------------------------------------------------------------------------------------------- ----------- ----------- ------ ------------- ----------- --------------------------------------------------------------------------------------------------------------------------------
IBMDEFAULTBP                                                                                                                                1 -                                                                                                                                         -2        8192 N                  0           0 -                                                                                                                               

  1 record(s) selected.

If you don’t know what that connect statement is all about, check out my blog entry onhow to connect to a Db2 LUW database

The syntax used above can generally be used in KSH, Bash, or other shell scripts.

Other options include using a cross-platform GUI, IBM’s Data Studio (free, geared towards developers), or IBM Data Server Manager(DSM) (free, geared towards DBAs).

While this post focuses on SQL for Db2 on Linux, UNIX, and Windows, many of the concepts apply to other RDBMSes as well. The table used as an example here will not exist on other platforms. Some platforms diverge from the SQL standard to allow SELECT without FROM, but Db2 is not one of them.

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, and is now pivoting to focus on learning MySQL. Ember shares both posts about her core skill set and her journey learning MySQL. Ember lives in Denver and work from home

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.