This post is specific to DB2 9.7. In DB2 10.1 and above, Usage Lists would be a better way to handle this.
In the course of writing my recent developerWorks article on low-cardinality indexes and an associated presentation, I found that I have dozens of indexes with a cardinality of 1 that have also been used recently. I cannot come up with a way that these indexes could help performance, so I want to understand the SQL that is using them. That means finding the SQL that uses a given index, and I would like to do it on a production system without event monitors and with minimal impact. Here’s my journey to do that.
I suspect the reason the low-cardinality indexes are being used is that the SQL using them relies on parameter markers, so the optimizer cannot use distribution statistics to recognize that these indexes are harmful rather than helpful to performance. I also want to know whether the SQL using these indexes is frequently executed or is part of the reporting workload that this e-commerce database is cursed with.
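To illustrate with one of the tables from the list below (the predicate value here is hypothetical): with a literal, the optimizer can compare the value against distribution statistics and see that the cardinality-1 index cannot filter anything, but with a parameter marker it must pick one plan for all possible values:

-- With a literal, distribution statistics can show the optimizer that
-- every row qualifies, making the cardinality-1 index useless:
select * from wscomusr.acactgrp where member_id = -2000

-- With a parameter marker, the plan must work for any value, and the
-- optimizer may still choose the cardinality-1 index:
select * from wscomusr.acactgrp where member_id = ?

The REOPT ONCE or REOPT ALWAYS bind options are one way to let the optimizer see actual values for statements like this, at the cost of extra compilation.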
I also have absolutely no vendor or IBM tools other than what comes with DB2 ESE. This in itself is both a blessing and a curse. It takes me longer to do a lot of things than it would with tools, but without tools I learn so much more about how things work.
Where to Find Index Usage Information
Let’s start with how I got the list of indexes that have a cardinality of 1 and a LASTUSED date within the last 30 days (this was run on 9/30/2013):
select i.lastused, substr(t.tabname,1,15) as tabname, substr(indname,1,10) as indname,
    substr(colnames,1,16) as colnames, smallint(fullkeycard) as fullkeycard,
    integer(card) as table_card, volatile
  from syscat.indexes i, syscat.tables t
  where i.tabname=t.tabname
    and i.tabschema=t.tabschema
    and t.tabschema not like 'SYS%'
    and card > 10
    and fullkeycard=1
    and uniquerule='D'
    and i.lastused > current date - 30 days
  with ur;

LASTUSED   TABNAME         INDNAME    COLNAMES         FULLKEYCARD TABLE_CARD  VOLATILE
---------- --------------- ---------- ---------------- ----------- ----------- --------
09/30/2013 ACACTGRP        I0000427   +MEMBER_ID                 1         412
09/29/2013 ACRELGRP        I0000442   +MEMBER_ID                 1          19
09/30/2013 ACRESGRP        I0000446   +MEMBER_ID                 1         216
09/30/2013 ATCHTGT         I0000864   +ATTACHUSG_ID              1        3961
09/30/2013 CALCODE         I0001247   +TXCDCLASS_ID              1        1019
09/30/2013 CATENCALCD      I0000516   +TRADING_ID                1         208
09/30/2013 CATENTRY        I0000375   +RANK                      1      279635
09/30/2013 CATGPCALCD      I0000522   +TRADING_ID                1          32
09/30/2013 CATGRPTPC       I0000531   +CATGROUP_ID               1          22
09/30/2013 CATTOGRP        I0000373   +CATALOG_ID_LINK           1         627
09/30/2013 CPENDORDER      I0000542   +STOREENT_ID               1          23
09/30/2013 DISPCGPREL      I0000556   +MBRGRP_ID                 1          13
09/30/2013 FFMCENTDS       I0000570   +STADDRESS_ID              1         785
09/30/2013 LANGPAIR        I0000603   +STOREENT_ID               1          13
09/30/2013 NOTIFY          I0000625   +MSGTYPE_ID                1       11818
09/29/2013 OICOMPLIST      I0001263   +OICOMPREL_ID              1       14951
09/30/2013 ORDERS          I0001267   +BUYERPO_ID                1     1184754
09/30/2013 PARTICIPNT      I0000680   +TERMCOND_ID               1          73
09/30/2013 PATTRVALUE      I0001270   +ATTACHMENT_ID             1       44644
09/30/2013 PPCEXTDATA      I0001106   +PPCBATCH_ID               1     3633179
09/30/2013 PPCPAYINST      I0000919   +RMA_ID+MARKFORD           1      241572
09/30/2013 PPCPAYTRAN      I0001103   +PPCBATCH_ID               1      209810
09/30/2013 SCHCMD          I0000758   +STOREENT_ID               1          21
09/30/2013 TRADEPOSCN      I0000814   +PRODUCTSET_ID             1          43
09/30/2013 PRELETEMPLATE   I0001344   +STOREENT_ID               1          13
09/30/2013 PRICERULE       I0001342   +TYPE                      1          15
09/30/2013 CATENTSUBS      I0001358   +SUBSCPTYPE_ID             1         162
09/30/2013 SEOPAGEDEFSUBPA I0001415   +STOREENT_ID               1          14
09/20/2013 CATGROUP        I0000376   +RANK                      1        1400
09/30/2013 BUSEVENT        I0001104   +CHECKED                   1     6805318
09/30/2013 STLOC           I0001211   +GEONODE_ID                1        4913
09/29/2013 PPCEXTDATA      I0000927   +PPCPAYTRAN_ID             1     3633179

  32 record(s) selected.
After digging around a bit, I determined there’s no good way to match statements in mon_get_pkg_cache_stmt to indexes using the system catalog alone. So I concluded I’m going to have to explain basically everything in my package cache to accomplish this. Explain must track the objects used, and indeed, I found the EXPLAIN_OBJECT table, which lists the indexes used and ties them back to the statement they were used for.
EXPLAINing Everything
My first thought for explaining everything was to use db2advis, since I knew you can apply db2advis to the entire contents of your package cache and specify an option to capture explain information as well. Luckily, I suspected this might be resource-intensive and tried it on a non-production system first. I DO NOT RECOMMEND USING THIS METHOD on a production database. I used this syntax:
db2advis -d wc005s01 -q WSCOMUSR -g -p |tee db2advis.out
This ran for over 30 minutes. It did end up recommending some interesting indexes, but sifting through them would be time-consuming, and if I added all 192 it recommended, they would take over 5 GB of space – quite a bit for this ~100 GB database. It wasn’t about the indexes anyway – I ran this to get the explain information. The big problem was that it spiked my CPU. CPU on a box that mostly runs below 20% was at 70-80% for that half hour, with spikes all the way to 100%. That would never work for my production server. And anyway, it’s doing more work than I need – I don’t need all that db2advis output, just the explain tables populated.
Theoretically, I could just query mon_get_pkg_cache_stmt to get each of the statements and run explains on them. But I wanted to see if there was a more direct way of getting what I wanted. Digging around the IBM DB2 Info Center some more (some days I just love the Info Center), I found EXPLAIN_FROM_SECTION. It lets me pass in an executable_id (which is available in mon_get_pkg_cache_stmt), and not much else. Armed with that, I came up with this instead:
db2 -x "select executable_id from table(mon_get_pkg_cache_stmt(NULL, NULL, NULL, -2)) as t" >exe_id.list date; cat exe_id.list |while read e; do db2 connect to wc005q01; db2 "CALL EXPLAIN_FROM_SECTION ($e, 'M', NULL, 0, NULL, ?, ?, ?, ?, ?)"; date; db2 connect reset; done >>explain_all.out
When I ran this on a nearly unused test box, it used between 5% and 15% of CPU, where CPU normally sits near 0%. There were about 13,000 statements involved, and it’s a 12-CPU box, but obviously results will vary based on hardware, package cache size, database purpose and design, etc. These are the results of my case study only.
It also ran for about half an hour. Part of that is my methodology. Running it in a stored procedure would certainly perform better – especially since my bash loop requires a new connection on every iteration, which is significant overhead. But creating a stored procedure on the stage and prod databases would require me to go through a change management process. I would be willing to do that once I understand whether this process is useful, but not when I’m just trying it out.
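For reference, a server-side version might look something like this minimal, untested sketch – an anonymous SQL PL block (assuming your fix pack supports anonymous blocks; all variable names are mine), saved to a file and run with an alternate statement terminator, e.g. db2 -td@ -f explain_all.sql:

BEGIN
  DECLARE v_requester  VARCHAR(128);
  DECLARE v_time       TIMESTAMP;
  DECLARE v_srcname    VARCHAR(128);
  DECLARE v_srcschema  VARCHAR(128);
  DECLARE v_srcversion VARCHAR(64);
  -- Skip any section that can no longer be explained (e.g. already evicted)
  DECLARE CONTINUE HANDLER FOR SQLEXCEPTION BEGIN END;
  FOR v AS
    SELECT executable_id
    FROM TABLE(mon_get_pkg_cache_stmt(NULL, NULL, NULL, -2)) AS t
  DO
    -- Same call as the shell loop above, but on a single connection
    CALL EXPLAIN_FROM_SECTION(v.executable_id, 'M', NULL, 0, NULL,
         v_requester, v_time, v_srcname, v_srcschema, v_srcversion);
  END FOR;
END@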
After testing to make sure I wasn’t going to tank the system, I moved on to an actually busy test system (ongoing CPU utilization still under 10%), where it spiked CPU utilization up to 35%. Granted, this is a smaller box, not sized for load tests, with only 3 CPUs, and there were about 18,000 statements to explain. Running explains this way really seemed to peg one CPU and leave the others alone.
Production has a smaller package cache (I have since altered the test environments to match prod) and has 12 CPUs. It normally runs around 20% CPU utilization off hours, so I felt comfortable running the same process there. With only about 3,000 statements, it finished on prod in less than 10 minutes, and while it was running, CPU was no more than 10% higher than it was without it.
Summary of the last 3 paragraphs: don’t run this on prod without testing the impact on your system first – it could be the straw that breaks the camel’s back.
Analyzing Explain Info for Indexes
After I had collected data as outlined above, I used this query to find explained statements that used indexes matching the same criteria as above:
select o.explain_time,
    substr(o.object_schema,1,16) as schema,
    substr(o.object_name,1,32) as name,
    i.uniquerule, i.fullkeycard, t.card,
    substr(s.statement_text,1,50) as stmt
  from db2inst1.explain_object o,
    db2inst1.explain_statement s,
    syscat.tables t,
    syscat.indexes i
  where s.explain_time = o.explain_time
    and s.explain_requester=o.explain_requester
    and s.source_name=o.source_name
    and s.source_schema=o.source_schema
    and s.explain_time > timestamp('2013-09-30-20.00.00.996909')
    and o.object_type='IX'
    and o.object_schema = i.indschema
    and o.object_name = i.indname
    and i.tabname=t.tabname
    and i.tabschema=t.tabschema
    and card > 10
    and i.fullkeycard=1
    and uniquerule='D'
  with ur;
Note the conditions here – I’m looking for SQL that used indexes with a cardinality of 1 where the table has a cardinality greater than 10. In one execution of this process, I got output like this:
EXPLAIN_TIME               SCHEMA           NAME                             UNIQUERULE FULLKEYCARD          CARD                 STMT
-------------------------- ---------------- -------------------------------- ---------- -------------------- -------------------- --------------------------------------------------
2013-10-01-09.41.44.460410 WSCOMUSR         I0001263                         D                             1                17742 DELETE FROM ORDERITEMS WHERE ORDERITEMS_ID = ? AN
2013-10-01-09.41.44.460410 WSCOMUSR         I0001263                         D                             1                17742 $WITH CONTEXT$($CONSTRAINT$(WSCOMUSR.BKORDALLOC.F_
2013-10-01-09.41.51.030702 WSCOMUSR         I0000927                         D                             1              3769661 SELECT T1.PPCEXTDATA_ID, T1.ENCRYPTED, T1.MARKFORD
2013-10-01-09.41.51.030702 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q1.PPCEXTDATA_ID AS "PPCEXTDATA_ID", Q1.ENC
2013-10-01-10.01.28.520154 WSCOMUSR         I0000927                         D                             1              3769661 SELECT T1.PPCEXTDATA_ID, T1.ENCRYPTED, T1.MARKFORD
2013-10-01-10.01.28.520154 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q1.PPCEXTDATA_ID AS "PPCEXTDATA_ID", Q1.ENC
2013-10-01-10.04.31.072162 WSCOMUSR         I0000927                         D                             1              3769661 select ppcpaytran_id, count(*) from wscomusr.PPCEX
2013-10-01-10.04.31.072162 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q3.PPCPAYTRAN_ID AS "PPCPAYTRAN_ID", Q3.$C1
2013-10-01-15.21.23.483638 WSCOMUSR         I0001263                         D                             1                17742 DELETE FROM ORDERITEMS WHERE ORDERITEMS_ID = ? AN
2013-10-01-15.21.23.483638 WSCOMUSR         I0001263                         D                             1                17742 $WITH CONTEXT$($CONSTRAINT$(WSCOMUSR.BKORDALLOC.F_
2013-10-01-15.21.31.235352 WSCOMUSR         I0000927                         D                             1              3769661 SELECT T1.PPCEXTDATA_ID, T1.ENCRYPTED, T1.MARKFORD
2013-10-01-15.21.31.235352 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q1.PPCEXTDATA_ID AS "PPCEXTDATA_ID", Q1.ENC

  12 record(s) selected.
One thing I noticed here – many times there are two entries for the same query: one for the original syntax, and one for the optimized syntax that you see in the explain plan when you do an explain (which is not directly executable).
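If the duplicates get in the way, the EXPLAIN_STATEMENT table distinguishes the two versions – as far as I can tell, EXPLAIN_LEVEL is 'O' for the original statement text and 'P' for the optimized (plan selection) version – so a predicate like this added to the query above should keep just the original text:

    and s.explain_level = 'O'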
While this got some interesting results, it still did not get me as far as I wanted to go. To understand HOW the SQL is using these indexes, I need the values for the parameter markers in the queries. So while this was an interesting exercise, in the end I think I still need an event monitor to fully understand how the SQL is using the indexes, because only an event monitor will capture those values. I’m hoping DB2 is using the indexes when the value of the column for the low-cardinality indexes is not represented in the index at all – for elimination – because otherwise I don’t understand how an index with a cardinality of 1 could ever be useful.
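When I do get to the event monitor, I expect it to look something like this sketch – an activities event monitor with data values enabled (verify the exact syntax for your level, and note that collecting values adds overhead, so it should be scoped and timed carefully on a busy system):

-- Write activity data, including input data values, to tables
CREATE EVENT MONITOR act_evmon FOR ACTIVITIES WRITE TO TABLE;

-- Capture parameter marker values for activities in the default user workload
ALTER WORKLOAD sysdefaultuserworkload COLLECT ACTIVITY DATA WITH DETAILS AND VALUES;

SET EVENT MONITOR act_evmon STATE 1;

-- The captured values should then land in the ACTIVITYVALS_ACT_EVMON table,
-- joinable back to the statements in ACTIVITYSTMT_ACT_EVMON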
But I can run explains and run the design advisor against this SQL to see what I can do as far as these indexes go.
Hi Amber, I liked your blog post – I tried the same thing before, and it feels nice that somebody else is trying to get at information DB2 does not make easy to get.
My experiences are:
A) I’ve found SQL statements where DB2 uses indexes that have nothing at all to do with the WHERE predicate. For example, this particular query uses the PK index (on the SINGLETRANSID column), which doesn’t appear in the WHERE predicate at all, even though there’s an index on the ARCHFLAG+ARCHDATE+TMSTAMP columns:

"SELECT Min(singletransID) FROM isaac.table WHERE ARCHFLAG = ? and Tmstamp < timestamp(current_date) - ? DAY"

Only if you add 'and ARCHDATE=?' to the predicate does DB2 use the right index. So I think if we know indexes with fullkeycard=1 are not useful, we should drop them anyway.
Hi,
Thanks for sharing the information. It was great to see the way we can pull the metadata statistics and use them to improve our application and database performance. I was also wondering if there is a way to do the following:
I need to know which columns of a certain index a SELECT query uses, if the query uses that index at all. For example, let’s say I have an index ix1 on columns c1, c2, c3, and c4. Now my select query is of the form select * from table where c1 = …. and c3 = …. If I look at the explain plan I can see that DB2 has indeed used index ix1, but how can I know which columns of that index DB2 used, viz. c1 only, or both c1 and c3? Do we have any third-party tool that pretty much serves this purpose?
DB2 has to read all of the columns from the index, so it doesn’t really matter which columns it “uses” – unless we’re talking index-only access, or unless the table has to then be scanned/accessed to get more information that’s not in the index. If the data is in the index and there isn’t a function in the query that prevents its use, then the data in the index will generally be preferred over the table. By reading the explain, you can generally tell which predicates are being applied at which step, but you have to dig down into the details to do so.
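For anyone who wants to do that digging, here is roughly what it looks like (table, column, and database names below are made up): explain the statement, format the plan with db2exfmt, and in the IXSCAN operator details compare the start/stop key predicates – which show the leading index columns that actually bounded the scan – with the remaining predicates applied as the index entries are read:

# Explain (compile only, don't execute) a statement, then format the plan.
# Table, column, and database names here are hypothetical.
db2 "set current explain mode explain"
db2 "select * from mytable where c1 = 5 and c3 = 7"
db2 "set current explain mode no"
db2exfmt -d mydb -1 -o explain.txt
# In explain.txt, find the IXSCAN operator and compare its Start Key /
# Stop Key predicates to the other predicates listed for that operator.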
The bigger question to ask with indexes is usually how they are being accessed. There are two ways to access an index – following the b-tree structure to retrieve certain leaf pages, or scanning through all the leaf pages. The latter is obviously more resource-intensive and worse for performance.