I thought I’d share an overview of the real-time monitoring options for Db2 (LUW). This article focuses on comparing the options, with links to deeper information on each tool.
Types of Monitoring
First a quick overview of the types of monitoring that every DBA should be doing. The way I look at it, there are three different, though sometimes overlapping, types of monitoring for Db2:
- Historical Performance Monitoring
- Real-Time Monitoring
- Monitoring to Generate Alerts
Historical Performance Monitoring
Historical performance monitoring gathers data on metrics over time. These metrics are stored in a format that can be easily queried or explored. The main questions we’re trying to answer with them are things like:
- Why was the database slow at 2PM on Thursday?
- What trends do we see in database performance?
- What red flags do we see indicating potential performance problem areas?
- How do things now look different from our normal baseline?
These are important questions for any DBA to be able to answer. The tools in this space are generally not included with Db2. IBM provides a free option in the Database Management Console. Vendor solutions include ITGain’s SpeedGain, DBI’s Brother-Panther, and Quest’s Foglight.
Problem-Detection Monitoring and Alerting
Alerting is critical to keeping a database up and running and resolving problems before they become outages. There is just one main question we’re trying to answer here:
- What condition exists that risks imminent outage?
This monitoring area is key to proactively catching problems before they become outages, and to responding quickly when a serious problem does occur.
Real-Time monitoring is focused on helping us understand what is happening right now within the database. The questions we’re trying to answer here are things like:
- What is happening right now?
- Why is the database slow right now?
- What is causing this effect that someone is seeing?
Being able to answer these questions is critical to troubleshooting and being able to immediately resolve issues. Most of the tools in this space are part of Db2. This is the area I’ll focus on in this article.
Real-Time Monitoring Options
There are a wide array of options in the real-time monitoring space, and different options are better for different purposeses. I’ve created this summary chart of the most popular options and where they shine:
For example, if I’m looking for a monitoring method where I can use SQL to get data to pass on to another tool, I would look at the column titled “Use SQL”, and it would tell me that the MON_GET* Functions or the SYSIBMADM.SNAP* views are my best choices. If I were instead looking for a method where I’m staring at the screen watching a database, I’d look at the “Update Info” column, which would tell me that db2top, dmctop, and dsmtop will all give me on-screen updates, while I could use the -repeat option with db2pd to get regular updates.
For each column in the table above, I’m addressing a particular feature that may cause you to choose one tool over another.
Some tools print out a one-time report, while others offer options for the data on the screen to update in real-time. db2pd offers a
-repeat X option that can be used to re-run the same command every X seconds, and display the new run on the screen, though the update is displayed sequentially rather than dynamically.
Depending on the use case, it can be very useful to access data using SQL. If one is writing a script, or a dashboard, or a Jupyter Notebook, the predictable output and easy manipulation of SQL is a significant advantage. I learned early in my career not to base scripts off of db2pd because the output is likely to change with only a fixpack update, breaking any scripting based on it.
Lightweight In-Memory Metrics
There are two major underlying interfaces that all of these methods depend on. Either the snapshot interface or the lightweight in-memory metrics. The snapshot interface has a greater impact on the database being monitored, while the in-memory metrics are the strategic direction for IBM moving forward, and are less impactful.
Some tools like db2pd can only be used on the server where the database is running. Frequently db2pd may only work as the instance owner, even. Other tools can be run on a central server or on your laptop and connect to the database remotely. This choice matters if you’re running scripts, depending on where you’re running them from and also depending on your access choices.
Both monitoring interfaces report numbers since the last time the instance or database was restarted by default. The snapshot interface has a reset option to allow you to see counters only over a specific time period. Some tools on top of the lightweight in-memory monitoring emulate this functionality.
If a tool runs locally on a server, it may or may not require a database connection. For example, db2pd does not require a database connection, which could be an advantage if you are unable to connect due to some problem or due to workload. With the exception of db2pd, generally the snapshot interfaces do not require a connection while the lightweight in-memory monitoring interface does. db2top does not require a connection, while dmctop does.
In the Db2 world, “Deprecated” means that a feature is no longer improved in any way, and may be completely removed (“Discontinued”) in a future release. The snapshot interface has been Deprecated. New solutions should generally not be built on deprecated functionality, as you will quickly regret not having new functionality in the area of monitoring.
Below are links on each of the tools mentioned for more information.
IBM Db2 KC page on Monitoring Procedures and Functions
Using DB2 LUW’s Monitoring Table Functions
My New Best Friend – mon_ Part 1: Table Functions
My New Best Friend – mon_ Part 2: Views
DB2 Administrative SQL Cookbook: Generating a Comma Separated List of Columns in a MON_GET Table Function