A Few Scripting Basics with DB2 and PowerShell

Not long ago, I posted some of the details of using PowerShell as your command line for DB2 on Windows. I am definitely addicted to PowerShell as my command line when I have to work on Windows servers.

I also have to write some scripts for some Windows systems. My first is a dynamic reorg/runstats/rbind script. I’m not done writing it yet, but I’m learning a lot and want to share what I’m learning with my readers.

Goals for this script

This is my fourth time writing this script. For one employer, I wrote this script first in KSH and then in Perl. Then I wrote it again for another employer in Perl. I seem to have always been writing scripts in situations where my employers end up owning them, and I’m a bit of a square about respecting that. Now I need a version that works on Windows, and I’m not allowed to use Perl. My choices are Batch, PowerShell, or VB. VB doesn’t appeal to me because it is so different from the scripting languages I am used to. Using batch and having to call db2cmd.exe is just annoying. And my husband happens to know PowerShell and has helped me with some of the basics, so that has been a plus.

I was a bit disappointed to learn that O’Reilly doesn’t have a “Learning PowerShell” book – their “Learning Perl” book is the single best technical book I have ever read. So I decided to just dig in and see what I could work out.

This script minimally needs to:

  1. Connect to a local database without specifying a password
  2. Run runstats on all tables, using a dynamically generated list of tables
  3. Generate/read reorgchk information in some format
  4. Determine what kind of table reorgs to do on which tables based on the above output
  5. Determine what kind of index reorgs to do on which tables base on the above output
  6. Perform the reorgs
  7. Not reorg the same table more than once for the table and once for the indexes
  8. Not die because one table action failed
  9. Perform runstats on reorged tables
  10. Perform reorg
  11. Write a log file of everything done

The following features might not be in the initial version but are high on my priority list to add

  • Take a list of tables to handle or except from the actions of the script
  • Manage concurrency – limiting the number of online table reorgs
  • Manage concurrency – Run multiple index reorgs at one time
  • Handle some error situations – including retrying reorgs in some situations
  • Potentially and optionally force off blocking connections?

I’m testing my script and playing with it on a local Windows VM with the SAMPLE database. I’ll have to add/remove data to get some more interesting results for the Reorg formulas in the near future.

I am a novice with PowerShell, so please use any code snippets here with extreme caution and only after thorough testing.

Connecting Locally Without Specifying a Password

This was an early challenge. I have decided to attempt to use a database interface rather than simply executing command line commands. This is different than the approach that I’ve taken in the past. It was relatively easy to find help on how to connect and to execute a query, but to connect to a local database without specifying a password actually took looking up blog and forum entries for the OLEDB provider that were not specific to PowerShell.

I detailed my efforts in this area in this quick blog entry: https://datageek.blog/2015/03/19/how-to-connect-to-a-local-db2-database-without-specifying-a-password-in-powershell/

Calling Stored Procedures

The next thing I needed to be able to do is to run a stored procedure. Specifically the reorgchk_tb_stats and reorgchk_ix_stats stored procedures. This actually turned out to be fairly easy, especially since the output they send to screen is just not something I’m going to use. I just treat the call statement like it’s a query. This still involves:

  1. Defining a connection object, if that has not already been done
  2. Defining a data set for the data returned to go into. Not that I’m actually going to do anything with that data in this case.
  3. Defining the query string for the call statement
  4. Defining a data object by passing it the connection object and the query string
  5. Executing the statement by using the Fill property of the data object to populate the data set

Note that not all that terminology may be 100% correct. I definitely feel like this is a blog entry I will look back on in the future and wonder that I managed to do anything at all with so little understanding. But it’s working so far.

Here’s the code I used to accomplish those steps.

#Define connection string for the database
$cn = new-object system.data.OleDb.OleDbConnection("Provider=IBMDADB2;DSN=$db_name;User Id=;Password=;");
#Define data set for first query
$ds = new-object "System.Data.DataSet" "dsReorgCHK"
#Define query to populate reorgchk information
$q = "call reorgchk_tb_stats('T', 'ALL')"
# Define data object given the specific query and connection string
$da = new-object "System.Data.OleDb.OleDbDataAdapter" ($q, $cn)
# Fill the data set - essentially run the query. 

Error Checking

Being thourough, I also wanted to make sure I was catching any SQL error messages with two goals in mind. First, for many of the statements I run in this script, I know I won’t want the whole script to die just because a statement fails – which SQL errors seem to cause in some situations. Second, I know that particularly once I get to the reorg statements (and maybe the runstats), there are some errors that I want to look for and either take specific action based on (maybe re-issue the reorg if it fails due to deadlock?), or write notes out to the output based on the error.

I had initially hoped I could grab the SQLCA somehow and pull from it. I’m still not convinced that’s not possible – I just have had a bear of a time figuring out how to do it. If anyone knows, please comment and help me out.

Barring that, I know I can parse the output for an SQL code like I always have in Perl. My first challenge in this area was to capture the error at all. My initial attempts at an error just led to the script stopping executing at that point. After some time away and some research, I found try/catch/finally. This is a way of executing a line of code, and if it produces an error to do something with the error information before proceeding on.

The syntax when I wrap that around my Fill statement above looks like this:

try {
catch {
  $err >> $output_file
  $q >> $output_file
  $SQL_err = ParseSQLError $err

Note that ParseSQLError – that’s my own function to parse through the output. It is pretty rudimentary at this point, and will likely evolve as I test various failure scenarios. Right now it looks like this:

# Function to parse out an SQL error code
function ParseSQLError ($err_text) {
  $err_code = [regex]::matches($err_text, "SQL\d{4}\D{1}") | %{$_.value}
  if (!$err_code) { $err_code = [regex]::matches($err_text, "SQL\d{5}\D{1}") | %{$_.value} }
  if ($debug -eq 1) {Write-Host "Error code: $err_code"}

I fall back to beloved regexes that Perl made so easy. The ones above capture the matching text and return it. I used different ones for 4 and 5 digit SQL codes, it would be easy to merge those together. I don’t currently capture return codes or SQLSTATEs or anything.

Basics of a Function

This parsing function was my first function in PowerShell. It’s a piece of code I expect to use over and over again within a script, and I don’t want to have to maintain it in more than one location. The function format seems easy and non difficult. I find here and a lot of places where $_ is a similar concept as it was in Perl. I’m not yet to the point of worrying about the scope of a lot of my variables – it doesn’t seem as strict as it was in Perl. I’ve seen a few references to a strict mode that I will probably enable at some point and tighten up my code. I’ve been so focused so far on just getting through the basics. Things like, when you call a function, the parameters are not in parentheses and they are separated by spaces and not commas.

Working wiht Query Output

Queries are run the same way I executed my call statement above, but after you’ve done the Fill, you continue to work with the result sets much as if they were tables. This is actually fantastic and easy. After the Fill for my query of session.tb_stats, I used this code (with $q set to the SQL):

  # Parse the output into a table format where we can refer to each column name and iterate through
  $reorgchk_table = new-object "System.Data.DataTable" "reorgchk_table"
  $reorgchk_table = $dq.Tables[0]
  # for each row of our data table, perform these actions
  $reorgchk_table | FOREACH-OBJECT { 
  if ( $_.CLUSTERING_INDEX -ne "-" ) { 
  } else {
    ReorgTable $_.TABSCHEMA $_.TABNAME

Now this is a preliminary version of the code where I’m generating reorg statements for every table reguardless of whether they are flagged for reorg or my own thresholds exceeded for reorg. I code one baby step at a time. The ReorgTable is a function that currently just writes out the reorg syntax so I can look at it. later it will actually handle the reorging of things. Notice that within the FOREACH-OBJECT on $reorgck_table, I refer to each value with $_.COLUMN_NAME. These are mapped for me by the Tables action on $dq (my data set). The COLUMN_NAMEs are read in from the query output, and I believe are case sensitive (not much is with PowerShell). Ask me how I discovered they’re case sensitive. Go on, I dare you. Let’s just say I was banging my head against that brick wall for at least 20 minutes.

What I really like here is that I don’t have to parse the output into variables and do splits to define what is in which column. I would guess that I would have also gotten something like this with the Perl DBI that I never used. These references to the values can be used in comparisons, can be written out to file or whatever you might want to do with them.

Just for a bit of balance, my currently very simple ReorgTable function:

function ReorgTable ($tabschema,$tabname,$qual_index) {
  $reorg_stmt = "reorg table " + $tabschema + "." + $tabname
  if ( $qual_index ) { 
    $reorg_stmt += " index " + $qual_index  
  $reorg_stmt += " inplace allow write access"
  $reorg_stmt >> $output_file

I’ll add to this using ADMIN_CMD to execute the reorgs. I also haven’t yet coded in runstats functionality either on the front end or the back. rbind, and basic concurrency – there’s a lot left to do here and a lot left to share in a future blog entry. I also need to make sure I’m closing connections and clearing variables once I’m done with them. If I end up needing to run any queries multiple times, I’d like to learn how to do a separate prepare/execute. I’m remembering how fun scripting is!

Ember Crooks
Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

Articles: 556


  1. Ember –
    It might be “better” to use the ADO.NET IBM.Data.DB2 provider rather than the OleDb provider in PowerShell. It seems to me that it may provide a better integration with DB2 and would be a more direct way of getting to DB2 API as opposed to the OleDb provider. It is what IBM suggests (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.swg.im.dbclient.adonet.doc/doc/c0024474.html) when coding in .NET, and since PowerShell’s nickname could just as well be “.NET shell”…

    The example included in the KnowledgeCenter article is specific to C#; the syntax is a little different for PowerShell to get an instance of the DB provider factory.
    Instead of:
    DbProviderFactory factory = DbProviderFactories.GetFactory(“IBM.Data.DB2”) /* C# syntax */
    $factory = [System.Data.Common.DbProviderFactories]::GetFactory(“IBM.Data.DB2”) /* Powershell Syntax */

  2. I have spent some more time with the ADO.NET IBM.Data.DB2 provider, and there are some things I’ve discovered with respect to error handling that might be share-worthy. All code below is using the ADO.NET IBM.Data.DB2 provider; however, I believe that the approach may be extendable to the “OleDb provider way” and may give you similar results.

    For example, given the statement “SELECT BLA, TABSCHEMA, TABNAME FROM SYSCAT.TABLES WHERE COMPRESSION IN (‘R’, ‘B’)”, and the code to execute it:


    What is assigned to $err after the error is thrown is an instance of a System.Management.Automation.ErrorRecord.

    This class has a property Exception which returns the actual Exception object that was thrown when the error was detected. When using the ADO.NET provider, the exception is of type IBM.Data.DB2.DB2Exception (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.swg.im.dbclient.adonet.ref.doc/doc/DB2ExceptionClass.html), and has some interesting properties:

    $_.Exception.Message – “ERROR [42703] [IBM][DB2/NT64] SQL0206N “BLA” is not valid in the context where it is used.”
    $_.Exception.Errors – this is a collection of objects of type IBM.Data.DB2.DB2Error, with only one element in this example. This is what can be seen in $_.Exception.Errors[0]:

    Message : [IBM][DB2/NT64] SQL0206N “BLA” is not valid in the context where it is used.
    SQLState : 42703
    NativeError : -206
    Source : IBM.Data.DB2
    RowNumber : -1

    So the native DB2 error code can be captured either from the error description via regex like you have shown, or it can be obtained from the NativeError property of the IBM.Data.DB2.DB2Exception object. Unfortunately, it *still* does not provide the SQLCA… The documentation suggests that SQLCommunicationsArea should really be a property of a statement (DB2Command) or the connection itself maybe, not necessarily of an error or an exception; however, it is missing there as well.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.