db2diag.log) for log archive messages, but I thought I could throw together a quick example of how to do this using SQL.
As I mentioned above, log archiving activity is written to db2diag.log, but it is also written to the database history. Looking at the database history used to be moderately difficult (using db2 list history ...), because you had to parse the text output. However, in DB2 9.7, IBM added the view SYSIBMADM.DB_HISTORY, which provides a much nicer SQL interface to the database history.

Looking at the documentation for SYSIBMADM.DB_HISTORY, you can see that we can find log archive events by filtering on the OPERATION column.
Here is a simple query you can run to see all log files that were archived in the last 3 hours:
select dbpartitionnum
      ,start_time
      ,firstlog
from sysibmadm.db_history
where operation = 'X'
  and start_time > current timestamp - 3 hours
Experienced DBAs might note that the start_time column is defined as a varchar(14), but conveniently we can take advantage of the implicit casting feature (also introduced with DB2 9.7) to use the column as though it were a timestamp.

The start_time column has data in the format YYYYMMDDhhmmss, so to get a histogram of how many log files are archived every hour, we need to see the column as YYYYMMDDhh. It is easy to do this by using the expression SUBSTR(start_time, 1, 10). Adding a GROUP BY to the previous statement gives us the histogram with the number of log files archived every hour (per database partition):
SELECT dbpartitionnum
      ,substr(start_time, 1, 10) as YYYYMMDDhh
      ,count(*) as archive_count
FROM sysibmadm.db_history
WHERE operation = 'X'
  AND start_time > current timestamp - 3 hours
GROUP BY dbpartitionnum
        ,substr(start_time, 1, 10)

DBPARTITIONNUM YYYYMMDDHH ARCHIVE_COUNT
-------------- ---------- -------------
             0 2015050819             2
             0 2015050821           185
             0 2015050822           328
             0 2015050823           239
             0 2015050900            67
             0 2015050902             6
             0 2015050905            34

  7 record(s) selected.
This query may already satisfy your requirements; however, notice that there are gaps in the YYYYMMDDhh column: no transaction log files were archived during the 2015050820, 2015050901, 2015050903, and 2015050904 hours.
To resolve this, the most common solution is to use a calendar table. For those of you familiar with data warehouses, this kind of table appears in the time dimension. This table will have entries for every hour of every day for a given time period.

It is easy to imagine creating a table and pre-populating it with all of the values you might need for a given month, year or even decade. However, since we will be using this data for (presumably) infrequent administration purposes, we can generate the table on the fly, using a recursive query and a common table expression.
WITH gen_ts (ts) AS (
    VALUES current timestamp - 2 days
    UNION ALL
    SELECT ts + 1 hour
    FROM gen_ts
    WHERE ts <= current timestamp
)
SELECT ts FROM gen_ts;

TS
--------------------------
SQL0347W  The recursive common table expression "IBJORHOV.GEN_TS" may
contain an infinite loop.  SQLSTATE=01605

2015-05-07-07.18.27.247374
2015-05-07-08.18.27.247374
2015-05-07-09.18.27.247374
2015-05-07-10.18.27.247374
2015-05-07-11.18.27.247374
2015-05-07-12.18.27.247374
2015-05-07-13.18.27.247374
2015-05-07-14.18.27.247374
...
2015-05-09-00.18.27.247374
2015-05-09-01.18.27.247374
2015-05-09-02.18.27.247374
2015-05-09-03.18.27.247374
2015-05-09-04.18.27.247374
2015-05-09-05.18.27.247374
2015-05-09-06.18.27.247374
2015-05-09-07.18.27.247374
2015-05-09-08.18.27.247374

  50 record(s) selected with 1 warning messages printed.
As with nearly all recursive queries, you may get a warning (shown above) when you execute this query, but it’s safe to ignore. We are limiting the results to only the last 2 days, but by modifying the expression current timestamp - 2 days you can change how much data you will see[1].
We can now perform an outer join with the SYSIBMADM.DB_HISTORY view, to ensure that we get entries for all hours, even when there were no logs archived.

For simplicity and readability, we will build a second common table expression, format_ts, which will truncate the timestamp values to show only the desired YYYYMMDDhh format[2], and a third common table expression, log_archives, that queries sysibmadm.db_history:
WITH gen_ts (ts) AS (
    VALUES current timestamp - 2 days
    UNION ALL
    SELECT ts + 1 hour
    FROM gen_ts
    WHERE ts <= current timestamp
),
format_ts (yyyymmddhh) AS (
    SELECT bigint(ts)/10000 FROM gen_ts
),
log_archives (yyyymmddhh, archive_count) AS (
    SELECT substr(start_time, 1, 10) as YYYYMMDDhh, count(*)
    FROM sysibmadm.db_history
    WHERE operation = 'X'
    GROUP BY substr(start_time, 1, 10)
)
SELECT f.yyyymmddhh AS hour
      ,coalesce(a.archive_count, 0) AS logs_archived
FROM format_ts f
LEFT OUTER JOIN log_archives a
    ON f.yyyymmddhh = a.yyyymmddhh;

HOUR                 LOGS_ARCHIVED
-------------------- -------------
          2015050707            34
          2015050708            51
          2015050709            22
          2015050710            23
          2015050711             0
          2015050712             1
          2015050713           194
          2015050714           551
          2015050715            33
          2015050716           237
          2015050717             2
          2015050718           410
          2015050719            36
          2015050720            59
          2015050721           246
          2015050722           226
          2015050723           342
          2015050800             4
          2015050801             0
          2015050802             5
          2015050803             0
          2015050804            41
          2015050805            34
          2015050806           403
          2015050807            34
          2015050808            52
          2015050809            19
          2015050810            23
          2015050811             0
          2015050812            52
          2015050813           531
          2015050814           181
          2015050815           296
          2015050816           316
          2015050817           160
          2015050818            76
          2015050819             3
          2015050820             0
          2015050821           185
          2015050822           328
          2015050823           239
          2015050900            67
          2015050901             0
          2015050902             6
          2015050903             0
          2015050904             0
          2015050905            34
          2015050906             0
          2015050907             0
          2015050908             0

  50 record(s) selected with 1 warning messages printed.
This looks very good, but the YYYYMMDDhh format for the truncated timestamp could be a little easier to read. We can use the TRANSLATE() function in a non-obvious fashion[3] to modify the format of the YYYYMMDDhh column:
translate('ABCD-EF-GH IJh', cast(f.yyyymmddhh as char(12)), 'ABCDEFGHIJ') as hour
This will translate the column from a value like 2015050410 to a much more understandable 2015-05-04 10h.
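DB2's TRANSLATE(expr, to-string, from-string) replaces each character of expr that appears in from-string with the character at the same position in to-string; everything else passes through unchanged. A small plain-Python sketch of that behavior (illustrative only, not DB2 code):

```python
def db2_translate(expr, to_string, from_string):
    # Mimic DB2's TRANSLATE: each character of expr found in from_string
    # is replaced by the character at the same position in to_string;
    # characters not in from_string are copied through as-is.
    mapping = dict(zip(from_string, to_string))
    return ''.join(mapping.get(c, c) for c in expr)

# 'A' maps to the 1st digit of the timestamp, 'B' to the 2nd, and so on;
# the literal '-', ' ' and 'h' characters pass through untouched.
print(db2_translate('ABCD-EF-GH IJh', '2015050410', 'ABCDEFGHIJ'))
# 2015-05-04 10h
```

Note that the template string supplies the punctuation, while the timestamp digits are injected through the placeholder letters A through J.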
Putting it all together, we get the final query:
WITH gen_ts (ts) AS (
    VALUES current timestamp - 2 days
    UNION ALL
    SELECT ts + 1 hour
    FROM gen_ts
    WHERE ts <= current timestamp
),
format_ts (yyyymmddhh) AS (
    SELECT bigint(ts)/10000 FROM gen_ts
),
log_archives (yyyymmddhh, archive_count) AS (
    SELECT substr(start_time, 1, 10) as YYYYMMDDhh, count(*)
    FROM sysibmadm.db_history
    WHERE operation = 'X'
    GROUP BY substr(start_time, 1, 10)
)
SELECT translate('ABCD-EF-GH IJh', cast(f.yyyymmddhh as char(12)), 'ABCDEFGHIJ') as hour
      ,coalesce(a.archive_count, 0) AS logs_archived
FROM format_ts f
LEFT OUTER JOIN log_archives a
    ON f.yyyymmddhh = a.yyyymmddhh;

HOUR           LOGS_ARCHIVED
-------------- -------------
2015-05-07 07h            34
2015-05-07 08h            51
2015-05-07 09h            22
2015-05-07 10h            23
2015-05-07 11h             0
2015-05-07 12h             1
2015-05-07 13h           194
2015-05-07 14h           551
2015-05-07 15h            33
2015-05-07 16h           237
2015-05-07 17h             2
2015-05-07 18h           410
2015-05-07 19h            36
2015-05-07 20h            59
2015-05-07 21h           246
2015-05-07 22h           226
2015-05-07 23h           342
2015-05-08 00h             4
2015-05-08 01h             0
2015-05-08 02h             5
2015-05-08 03h             0
2015-05-08 04h            41
2015-05-08 05h            34
2015-05-08 06h           403
2015-05-08 07h            34
2015-05-08 08h            52
2015-05-08 09h            19
2015-05-08 10h            23
2015-05-08 11h             0
2015-05-08 12h            52
2015-05-08 13h           531
2015-05-08 14h           181
2015-05-08 15h           296
2015-05-08 16h           316
2015-05-08 17h           160
2015-05-08 18h            76
2015-05-08 19h             3
2015-05-08 20h             0
2015-05-08 21h           185
2015-05-08 22h           328
2015-05-08 23h           239
2015-05-09 00h            67
2015-05-09 01h             0
2015-05-09 02h             6
2015-05-09 03h             0
2015-05-09 04h             0
2015-05-09 05h            34
2015-05-09 06h             0
2015-05-09 07h             0
2015-05-09 08h             0

  50 record(s) selected with 1 warning messages printed.
- Be careful about making the gen_ts common table expression return too much data; this can make the final query run with extremely poor performance. We are using 2 “virtual” tables (the common table expression and the SYSIBMADM.DB_HISTORY view, which returns data from a table function). If you want to report on longer periods of time, I recommend you consider building a physical table to hold the database history. I did a 4-minute presentation about this for the 2014 DB2’s Got Talent competition, which you can watch here (skip to 27:17 in the video to see my presentation). ↩
- The obvious way to do this is to use the SUBSTR() function, but to avoid having to remove the - and . characters in the timestamp, you can use the expression BIGINT(ts)/10000. BIGINT(ts) returns the timestamp as YYYYMMDDhhmmss, and dividing the resulting number by 10000 strips the minutes and seconds, resulting in YYYYMMDDhh. ↩
- The documentation for TRANSLATE would never hint at using the function in this manner. I learned this awesome trick from either Serge Rielau or Knut Stolze on the comp.databases.ibm-db2 usenet newsgroup a number of years ago. ↩
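The integer-division trick from footnote 2 is easy to sanity-check outside the database. A plain-Python sketch of the same arithmetic (Python needs `//` for integer division, where DB2's `/` on BIGINT values is already integral):

```python
# BIGINT(ts) renders a DB2 timestamp as the number YYYYMMDDhhmmss;
# integer division by 10000 drops the minutes and seconds.
ts_as_bigint = 20150504101522  # 2015-05-04-10.15.22 as YYYYMMDDhhmmss
yyyymmddhh = ts_as_bigint // 10000
print(yyyymmddhh)
# 2015050410
```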
Ian Bjorhovde is the principal consultant for DataProxy System Solutions in Phoenix, Arizona. He has worked in the IT industry for over 18 years and has spent the last 15 years as a consultant developing and supporting large DB2 LUW databases with clients in many different industries. He is an IBM Champion for Information Management, produces a DB2-focused podcast called The Whole Package Cache and has presented at RUG, IDUG and IBM conferences.
Useful script, I’ve done this the hard way when I’ve previously needed it!
The TRANSLATE is clever, but you don’t need it – you can use TO_CHAR to format the output and also get rid of the CTE for ‘format_ts’:
WITH gen_ts (ts, ymdh) AS (
VALUES (CURRENT TIMESTAMP - 7 DAYS, TO_CHAR(CURRENT TIMESTAMP - 7 DAYS, 'YYYYMMDD HH24')||'h')
UNION ALL
SELECT ts + 1 hour, TO_CHAR(ts + 1 hour, 'YYYYMMDD HH24')||'h'
FROM gen_ts
WHERE ts <= CURRENT TIMESTAMP
),
log_archives (yyyymmddhh, archive_count) AS (
SELECT TO_CHAR(TIMESTAMP(start_time), 'YYYYMMDD HH24')||'h' as YYYYMMDDhh, count(*)
FROM sysibmadm.db_history
WHERE operation = 'X'
GROUP BY TO_CHAR(TIMESTAMP(start_time), 'YYYYMMDD HH24')||'h'
)
SELECT
g.ymdh as hour
,coalesce(a.archive_count,0) AS logs_archived
FROM
gen_ts g
LEFT OUTER JOIN log_archives a
ON g.ymdh = a.yyyymmddhh;
I prefer the `format_ts` CTE if only because it separates the logic of formatting the timestamp from the recursive query – I think it’s perhaps a little easier to understand – but using `TO_CHAR` in the final query would certainly be a better solution than `TRANSLATE`. Thanks!
Hi Ian,
Looks like my query, but much smarter with your calendar and translate. But sometimes you have to know the amount of logs and the log size…
SELECT H.DATE AS HOUR, H.DBPARTITIONNUM, H.NUMBER_OF_LOGS_PER_HOUR, H.NUMBER_OF_LOGS_PER_HOUR * L.LOGSIZE_MB AS AMOUNT_LOGS_HOUR_MB, LOGSIZE_MB
FROM (SELECT (VALUE * 4096) / 1024 / 1024 AS LOGSIZE_MB, DBPARTITIONNUM
FROM SYSIBMADM.DBCFG AS DBCFG
WHERE NAME IN ('logfilsiz')) AS L, (
SELECT SUBSTR(END_TIME, 1, 10) AS DATE, DBPARTITIONNUM,
COUNT(*) AS NUMBER_OF_LOGS_PER_HOUR
FROM SYSIBMADM.DB_HISTORY
WHERE OPERATION = 'X' AND OPERATIONTYPE = '1'
AND TIMESTAMP(END_TIME) > CURRENT_DATE - 14 DAYS
GROUP BY SUBSTR(END_TIME, 1, 10), DBPARTITIONNUM) AS H
WHERE L.DBPARTITIONNUM = H.DBPARTITIONNUM
ORDER BY 1, 2;
Best regards
Joachim
Adding log space used as you suggest would be a very useful addition to the query. Thanks!
Hi.
I have been using the amount of logs archived as a proxy for activity too. However, due to some constraints of our archiving solution we have to increase the log file sizes to keep things running. This has thrown my statistics out the window.
Is there a table that contains the actual size of every log file?
Regards
Philip.
Hello,
Great select, but I’m wondering how to advance the script to see logs generated per day, not per hour.
Can anybody advise?
Thanks a lot!
Marian
Marian,
You can use substr(start_time, 1, 8) as YYYYMMDD in the query against sysibmadm.db_history (i.e., shortening the substring by 2 characters) to give you a summary by date. If you have at least 1 log file archived per day, you will probably not even need to use a calendar table.
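The effect of shortening the substring can be sketched outside the database with plain Python (the sample start_time values below are invented for illustration):

```python
from collections import Counter

# Hypothetical start_time values in DB_HISTORY's VARCHAR(14)
# YYYYMMDDhhmmss format.
start_times = ['20150508193001', '20150508210215',
               '20150509001122', '20150509003344']

per_hour = Counter(t[:10] for t in start_times)  # like substr(start_time, 1, 10)
per_day = Counter(t[:8] for t in start_times)    # like substr(start_time, 1, 8)

print(per_hour['2015050900'])  # 2 -- both 00h archives on 2015-05-09
print(per_day['20150509'])     # 2 -- the same two rows, grouped by date
```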