Recently, I needed to do some SQL profiling – discovering exactly what SQL a web application was executing against the database. I learned a lot, and when I learn a lot, I try to share with my readers.
I went into this exercise with two major misconceptions:
- A statement event monitor would collect parameter marker values
- Event monitor table output would be well documented and easy to figure out
I had the first misconception proven wrong pretty immediately. After running an “old fashioned” statement event monitor during some well-defined activities, I discovered that nowhere were the parameter marker values documented. That misconception was really the impetus for this blog entry – I wanted to blog about the process that I finally found that worked.
I also had the second misconception shoved in my face as I tried to get data that was collected out of the tables. I found the tables were there and populated, but finding the columns that fully uniquely identified each execution of a statement was difficult. The elements themselves were well documented, but not the combination and the table they’re in. If you’re working on a similar project, or just want to understand the process better, then this article is for you.
“Old” vs. “New” Event monitors
Despite having worked some with the newer event monitoring interfaces (see db2caem: The Untold Story and Analyzing Deadlocks – the new way), I found that even once I knew that this is what I needed to get detailed statement information including values for parameter markers, I had some trouble getting this up and running. What you need in order to deal with the new activity event monitors is two things – both enablement of the collection of activity data and the actual event monitor. When using db2caem, it does this for you. When talking locking, you have to manage some database configuration parameters to get what you want. In my opinion, this is more confusing than the old way where you simply created the event monitor and activated it and got data. So be sure to dig around in the info center to find what you might need in this area for any specific kind of event monitor.
Enabling Activity Data Collection
There are several levels at which you can enable data collection. You can do it at the session level. That didn’t work for me, because I’m collecting data from a web-based application (WebSphere Commerce), and I need to just capture whatever it does in a controlled scenario. That means my option was to capture this at the Workload Manager Level. Now I don’t and probably never will have a license for WLM (while in my current role). But even if you’re not licensed, all of your database activity goes into one of two default classes (more if you’re using BLU, I think). The classes in my 9.7 implementation are:
- SYSDEFAULTADMWORKLOAD
- SYSDEFAULTUSERWORKLOAD
All user queries are automatically in that latter “SYSDEFAULTUSERWORKLOAD” if you have no other properties defined.
As always, verify any licensing statements with IBM before counting on them, but my understanding is that you can enable activity data collection on these default classes with out having licensed the WLM feature. To enable the activity data collection I want for my activity statement event monitor, I had to alter that service class like this:
db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data on all database partitions with details and values"
If you’re actually licensed for and using WLM, you would have to alter the appropriate service class to collect the data you need.
Presumably, there is overhead involved here, so you would not want to leave this enabled beyond the short period that you need to use it.
Event Monitor
Once you have enabled the collection of activity data, you need to create an event monitor to actually receive that data. Here, many of the options that you may be familiar with on event monitors are available. My preference for later data analysis is an event monitor that writes to a table, but if you were dealing with a fragile performance situation, that does add additional overhead.
Let’s stop for a minute and talk about when you might want to capture statements with an event monitor and this methodology. In my particular situation, we’re concerned about the frequency of execution of some statements. We suspect that the application might be running a lot more statements than needed or even the exact same statement with identical values more than once – it should be caching values or at least not querying the tables in question 1500 times for a single add-to-cart or checkout. I ran this event monitor during specifically defined activity in our dev and stage environments – with other people asked to stay out of the environment. In a production environment, you are likely not just to impact performance of the main workload with an event monitor with this level of detail, but you’re also likely to generate an incredible amount of data – which may either fill up your available disk space or just be that much harder to get a reasonable analysis out of because there is so much data.
Here’s the syntax I used for this event monitor:
db2 create event monitor act_stmt for activities write to table manualstart
That part actually seemed pretty simple to me.
Not being familiar with this event monitor, I wanted to see what tables it actually created. In my case, I didn’t really care which tablespace the tables ended up in for these test/staging environments, but you could specify that as well. So looking at the tables created, I see:
db2 list tables for all |grep DB2INST1 ACTIVITYMETRICS_ACT_STMT DB2INST1 T 2013-11-15-13.05.39.368590 ACTIVITYSTMT_ACT_STMT DB2INST1 T 2013-11-15-13.05.38.306542 ACTIVITYVALS_ACT_STMT DB2INST1 T 2013-11-15-13.05.38.655985 ACTIVITY_ACT_STMT DB2INST1 T 2013-11-15-13.05.37.208514
Of course to actually activate it when ready to capture the data, I used:
db2 "set event monitor act_stmt state = 1"
Running the End-user Actions
Being new to this, I did check to make sure the tables were being populated during the process. While testing was being done, I noted the start and end times of each test so I could use those in queries to limit the data I was working with and properly identify what belonged in what test. To do that, I simply ran this before and after each test:
db2 select current timestamp from sysibm.sysdummy1
When you’re done generating the needed activity, you must deactivate the event monitor using this syntax:
db2 flush event monitor act_stmt db2 "set event monitor act_stmt state = 0"
I’m not sure how necessary the flush is with the newer event monitoring, but I was taught to always flush event monitors before deactivating them. Seems to me that IBM should have included the flush in the deactivate by now.
If you don’t deactivate it, it will continue generating much data and impacting your performance.
Data Parsing
My next major challenge came in interpreting the data collected. Look in detail at these tables, and you’ll find there are no indexes, no primary or foreign keys – what DBA created this structure?!? After a bunch of experimentation, I was able to write the following query that uniquely lists each SQL statement. Unlike the old statement event monitors, I see only one entry for each execution of each statement – I don’t see separate entries for OPEN, EXECUTE, etc like the old statement event monitor had.
db2 "select substr(as.APPL_ID,1,35) appl_id, as.uow_id, as.activity_id, EXECUTABLE_ID, STMT_FIRST_USE_TIME, STMT_LAST_USE_TIME from db2inst1.ACTIVITYSTMT_ACT_STMT as as order by appl_id, as.uow_id, as.ACTIVITY_ID, as.executable_id" APPL_ID UOW_ID ACTIVITY_ID EXECUTABLE_ID STMT_FIRST_USE_TIME STMT_LAST_USE_TIME ----------------------------------- ----------- -------------------- ------------------------------------------------------------------- -------------------------- -------------------------- *LOCAL.db2inst1.140109212654 10 1 x'0100000000000000BE5309000000000000000000020020131115110406070749' 2013-11-15-11.04.06.070917 2013-11-15-11.04.06.070917 *LOCAL.db2inst1.140109212654 11 1 x'0100000000000000BF5309000000000000000000020020131115110432726132' 2013-11-15-11.04.32.726160 2013-11-15-11.04.32.726160 *LOCAL.db2inst1.140109212654 12 1 x'0100000000000000C75309000000000000000000020020131115110450468150' 2013-11-15-11.04.50.468177 2013-11-15-11.04.50.468177 *LOCAL.db2inst1.140109212654 13 1 x'0100000000000000EF5309000000000000000000020020131115110527915776' 2013-11-15-11.05.27.915808 2013-11-15-11.05.27.915808 *LOCAL.db2inst1.140109212654 14 1 x'0100000000000000935409000000000000000000020020131115110628044007' 2013-11-15-11.06.28.044263 2013-11-15-11.06.28.044899 *LOCAL.db2inst1.140109212654 15 1 x'0100000000000000935409000000000000000000020020131115110628044007' 2013-11-15-11.20.08.879255 2013-11-15-11.20.08.879902 *LOCAL.db2inst1.140109212654 16 1 x'0100000000000000935809000000000000000000020020131115112307073717' 2013-11-15-11.23.07.073751 2013-11-15-11.23.07.073751 *LOCAL.db2inst1.140109212654 17 1 x'01000000000000000C5A09000000000000000000020020131115112406683520' 2013-11-15-11.24.06.683684 2013-11-15-11.24.06.837215 *LOCAL.db2inst1.140109212654 18 1 x'0100000000000000865B09000000000000000000020020131115112716783591' 2013-11-15-11.27.16.783789 2013-11-15-11.27.16.783789 *LOCAL.db2inst1.140109212654 19 1 x'0100000000000000F65B09000000000000000000020020131115112820694139' 2013-11-15-11.28.20.694275 2013-11-15-11.28.20.694886 ...
Each statement execution is uniquely identified by the combination of APPL_ID, UOW_ID, and ACTIVITY_ID. In this case, I’ve only returned the EXECUTABLE_ID for each statement and not the statement text. That was actually because I knew I was going to have to group on something later in my SQL journey. If you are working with trying to group these and summarize, you cannot group on a large field like the statement text, so the EXECUTABLE_ID is a unique representation of each statement where the text is the same. You can also feed this EXECUTABLE_ID into things like the EXPLAIN_FROM_SECTION to directly explain the statement. It’s a quite useful identifier.
After figuring out the columns that uniquely identify a statement, I then moved on to finding the values used in the parameter markers for those statements. To make things difficult, they are stored in a different table. In my examples, I found statements that had anywhere between 0 and 1,001(!) parameter markers. So here’s the SQL that I use to get the full statement and the parameter marker values:
with vars as ( select substr(av.APPL_ID,1,35) appl_id , av.uow_id , av.activity_id , as.EXECUTABLE_ID , listagg(substr(av.stmt_value_data,1,15),', ') vals from db2inst1.ACTIVITYSTMT_ACT_STMT as as left outer join db2inst1.ACTIVITYVALS_ACT_STMT av on as.appl_id=av.appl_id and as.uow_id=av.uow_id and as.activity_id=av.activity_id where as.UOW_ID=27 group by av.appl_id , av.uow_id , av.ACTIVITY_ID , as.executable_id order by 5, av.appl_id, av.uow_id, av.ACTIVITY_ID, as.executable_id ) select as1.STMT_FIRST_USE_TIME, substr(as1.stmt_text,1,2000), vars.vals from db2inst1.ACTIVITYSTMT_ACT_STMT as as1 join vars on as1.uow_id=vars.uow_id and as1.appl_id=vars.appl_id and as1.activity_id=vars.activity_id fetch first 5 rows only with ur;
and the output looks like this:
STMT_FIRST_USE_TIME 2 VALS -------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2013-11-15-11.22.15.862560 select q1."ACTIVITY_ID", q1."CALLER_ID", q1."STARTTIME", q1."ENDTIME", q1."STATUS", q1."STORE_ID", q1."RUNAS_ID", q1."LASTACCESSTIME", q1."OPTCOUNTER" from CTXMGMT q1 where ( q1."CALLER_ID" = ?) and ( q1."RUNAS_ID" = ?) and ( q1."STORE_ID" = ?) and ( q1."STATUS" = ?) 159042 , 159042 , 10051 , A 2013-11-15-11.22.15.869792 SELECT CATENTRY_ID_1 FROM ( SELECT CATENTRY.CATENTRY_ID CATENTRY_ID_1 FROM CATENTRY WHERE CATENTRY.CATENTRY_ID IN (?) AND CATENTRY.MARKFORDELETE = 0 ) T1 WHERE EXISTS ( SELECT 1 FROM STORECENT WHERE CATENTRY_ID_1 = STORECENT.CATENTRY_ID AND STORECENT.STOREENT_ID IN ( ? ) ) 121519 , 10051 2013-11-15-11.22.15.867868 SELECT CATENTRY_ID_1 FROM ( SELECT CATENTRY.CATENTRY_ID CATENTRY_ID_1 FROM CATENTRY WHERE CATENTRY.CATENTRY_ID IN (?) AND CATENTRY.MARKFORDELETE = 0 ) T1 WHERE EXISTS ( SELECT 1 FROM STORECENT WHERE CATENTRY_ID_1 = STORECENT.CATENTRY_ID AND STORECENT.STOREENT_ID IN ( ? ) ) 121519 , 10051 2013-11-15-11.22.15.871828 SELECT CATENTRY.CATENTRY_ID, CATENTRY.MEMBER_ID, CATENTRY.ITEMSPC_ID, CATENTRY.CATENTTYPE_ID, CATENTRY.PARTNUMBER, CATENTRY.MFPARTNUMBER, CATENTRY.MFNAME, CATENTRY.MARKFORDELETE, CATENTRY.URL, CATENTRY.FIELD1, CATENTRY.FIELD2, CATENTRY.LASTUPDATE, CATENTRY.FIELD3, CATENTRY.ONSPECIAL, CATENTRY.ONAUCTION, CATENTRY.FIELD4, CATENTRY.FIELD5, CATENTRY.BUYABLE, CATENTRY.OID, CATENTRY.BASEITEM_ID, CATENTRY.STATE, CATENTRY.STARTDATE, CATENTRY.ENDDATE, CATENTRY.RANK, CATENTRY.OPTCOUNTER, CATENTRY.AVAILABILITYDATE, CATENTRY.LASTORDERDATE, CATENTRY.ENDOFSERVICEDATE, CATENTRY.DISCONTINUEDATE, CATENTSUBS.CATENTRY_ID, CATENTSUBS.SUBSCPTYPE_ID, CATENTSUBS.DISALLOW_REC_ORDER, CATENTSUBS.OPTCOUNTER, CATCONFINF.CATENTRY_ID, CATCONFINF.URL, CATCONFINF.REFERENCE, CATCONFINF.FIELD1, CATCONFINF.FIELD2, CATCONFINF.FIELD3, CATCONFINF.FIELD4, CATCONFINF.FIELD5, CATCONFINF.OPTCOUNTER, CATCONFINF.CONFIGURATION FROM CATENTRY LEFT OUTER JOIN CATENTSUBS ON (CATENTRY.CATENTRY_ID = CATENTSUBS.CATENTRY_ID) LEFT OUTER JOIN CATCONFINF ON (CATENTRY.CATENTRY_ID = CATCONFINF.CATENTRY_ID) WHERE CATENTRY.CATENTRY_ID IN (?) 121519 2013-11-15-11.22.15.883294 SELECT CATENTRY.CATENTRY_ID, CATENTRY.MEMBER_ID, CATENTRY.ITEMSPC_ID, CATENTRY.CATENTTYPE_ID, CATENTRY.PARTNUMBER, CATENTRY.MFPARTNUMBER, CATENTRY.MFNAME, CATENTRY.MARKFORDELETE, CATENTRY.URL, CATENTRY.FIELD1, CATENTRY.FIELD2, CATENTRY.LASTUPDATE, CATENTRY.FIELD3, CATENTRY.ONSPECIAL, CATENTRY.ONAUCTION, CATENTRY.FIELD4, CATENTRY.FIELD5, CATENTRY.BUYABLE, CATENTRY.OID, CATENTRY.BASEITEM_ID, CATENTRY.STATE, CATENTRY.STARTDATE, CATENTRY.ENDDATE, CATENTRY.RANK, CATENTRY.OPTCOUNTER, CATENTRY.AVAILABILITYDATE, CATENTRY.LASTORDERDATE, CATENTRY.ENDOFSERVICEDATE, CATENTRY.DISCONTINUEDATE, CATENCALCD.STORE_ID, CATENCALCD.CATENCALCD_ID, CATENCALCD.TRADING_ID, CATENCALCD.CATENTRY_ID, CATENCALCD.CALCODE_ID, CATENCALCD.OPTCOUNTER, CATENCALCD.CALFLAGS, CALCODE.CALCODE_ID, CALCODE.CODE, CALCODE.CALUSAGE_ID, CALCODE.STOREENT_ID, CALCODE.GROUPBY, CALCODE.PUBLISHED, CALCODE.SEQUENCE, CALCODE.COMBINATION, CALCODE.LASTUPDATE, CALCODE.FIELD1, CALCODE.DESCRIPTION, CALCODE.DISPLAYLEVEL, CALCODE.STARTDATE, CALCODE.ENDDATE, CALCODE.FLAGS, CALCODE.PRECEDENCE, CALCODE.OPTCOUNTER FROM CATENTRY, CATENCALCD, CALCODE WHERE CATENTRY.CATENTRY_ID = CATENCALCD.CATENTRY_ID AND CALCODE.CALCODE_ID = CATENCALCD.CALCODE_ID AND CATENCALCD.STORE_ID = ? AND CATENCALCD.CATENTRY_ID IN (?) 10051 , 121519
Now, I’ve put some limits in there that won’t work for all of my data. For example, I know that my input data can be up to 4,000 bytes in length – but I limited it here to just 15 bytes to make it look pretty. I also know my statements can be longer than 2,000 bytes. But I wanted it to be at least vaguely readable for this blog entry.
I can use this query with time limitations to provide all of the data on statements within the short time periods of our specific tests to provide SQL with parameter markers to my developers so they know all of the sql used.
Note that there is some potential for error in some of my conclusions in this post (as with any post) – I have only this one experiment to pull from. I welcome reader comments and corrections.
Performance Information
In my sample situation, we’re really not looking at the performance of these statements. I’ve always thought of using an event monitor to look at performance as difficult because of having to aggregate data over many executions. But I see a table called ACTIVITYMETRICS_ACT_STMT – I think it would be relatively easy to aggregate these values grouping by that EXECUTION_ID. I haven’t worked on the SQL for it, but I think it would work more easily than I had it in my head.
READER SQL CHALLENGE!!
Ok, so I’m issuing a challenge to my readers. Given the output from activity event monitors and statements with parameter markers, can you write SQL that instead of the results from above:
SELECT CATENTRY_ID_1 FROM ( SELECT CATENTRY.CATENTRY_ID CATENTRY_ID_1 FROM CATENTRY WHERE CATENTRY.CATENTRY_ID IN (?) AND CATENTRY.MARKFORDELETE = 0 ) T1 WHERE EXISTS ( SELECT 1 FROM STORECENT WHERE CATENTRY_ID_1 = STORECENT.CATENTRY_ID AND STORECENT.STOREENT_ID IN ( ? ) ) 121519 , 10051
would combine those into one result that would integrate the values into their proper places in the query? So with results something like this?
SELECT CATENTRY_ID_1 FROM ( SELECT CATENTRY.CATENTRY_ID CATENTRY_ID_1 FROM CATENTRY WHERE CATENTRY.CATENTRY_ID IN (?121519?) AND CATENTRY.MARKFORDELETE = 0 ) T1 WHERE EXISTS ( SELECT 1 FROM STORECENT WHERE CATENTRY_ID_1 = STORECENT.CATENTRY_ID AND STORECENT.STOREENT_ID IN ( ?10051? ) )
I wouldn’t care if the value delimiters are something other than ?’s but I need some indicator to tell between parameter marker values and hard-coded values. It would have to handle a variable number of parameter markers from 0 to 1001(or more).
And a note here – I do not have the answer to this challenge, but I bet some of my readers could do it. I’ll write a blog entry on any working solutions, and tweet your name and link of your choice (related to db2/business/work – no porn or political links) to all of my followers.
Enter your solution in the comments below, and make sure I have a way to contact you for further information or followup – I’ll strip out any personal contact information you don’t want shared before approving/posting any comments.
This article was exactly what I was looking for to capture statements for replaying back during regression testing. I do have one issue though… if the sql statement was an insert or an update to a CLOB or BLOB field I don’t see anything in the ACTIVITYVALS_ACT_STMT table. Do you have any idea of where, or even if, that data is captured?
Sorry, I don’t know where. I had to do quite a bit of investigative work just to figure this much out.
Overall, this post was extremely helpful. I was able to get activity monitoring set up for all activity using this post. So thank you for that!
In answer to this question, the reason LOB fields don’t show up is because they aren’t collected. See the following:
https://www.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.mon.doc/doc/c0006121.html
https://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.mon.doc/doc/r0012165.html
That’s extremely frustrating… I wish I had found those links 4 days ago, but alas I did not.
I’m leaving this comment so that anyone else that comes along and finds this wonderful post will also hopefully see Steve’s question and my answer so they know why all the LOB values in ACTIVITYVALS_ACT_STMT are empty strings.
Hi,
Sorry to see it was so painful for you, Ember… our docs don’t do a good enough job tying all the pieces together unfortunately. If you get stuck in the future, feel free to ask and I will try to help out. Getting this information out there is critical and your blog really helps. Thanks!
As for Steve’s question, LOB data values are not captured as per the documentation on the stmt_value_date element:
stmt_value_data – Value data monitor element
This element contains a string representation of a data value to an SQL statement. LOB, LONG, and structured type parameters appear as empty strings. Date, time, and timestamp fields are recorded in ISO format.
How to i get the actual application id ?
how can i revert
db2 “alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data on all database partitions with details and values”
to default?
db2 “alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data none”
Thanks 😉