Please start with the first two articles in this series:
Explain Part 1 – Explain and the DB2 Optimizer
Explain Part 2 – Command Line Explain Plans Using db2exfmt
Explain is one of the few areas where I can see the argument for using a GUI. There are some neat features in Visual Explain and I haven’t seen many bugs with it. In this post, I’m going to cover using Data Studio(3.2) for generating visual explain plans. Why not Control Center? Well, the Control Center doesn’t exist any more in DB2 10.1. It has been not just deprecated, but fully discontinued. If you like GUIs and have not yet gotten to 10.1, it’s a good idea to start getting used to Data Studio. Data Studio has always felt more like a developer’s tool than a DBA tool to me.
Generating the Explain Plan
Much like the command line, the first step is to get our SQL statement into a file. In this case, we’ll right-click on the datatabase and select “New SQL Script”:
Then paste your SQL into the box and hit the “Open Visual Explain” button, which looks like this:
That should pop up a pane (it could be off to the far right) that looks like this:
If you want to read an Explain Plan in the order that DB2 does things, you’ll want to read it from right to left and from bottom to top.
Notice in this case that the “return” operator is highlighted, and we can see some cumulative numbers about it displayed at the lower left. Return is the final operator in every Explain Plan, and since Explain numbers are cumulative, it gives us the total cost of the query in terms of timerons here. For more about timerons, see Explain Part 1 – Explain and the DB2 Optimizer.
Timerons are the one number also reported directly on the graph. This helps us find the most expensive step by looking for the biggest jump or jumps in timeron values from one step to another.
We can also get even more information by selecting “all” in the middle left:
In the upper left pane there are other things we can click on to get the predicates applied at each step or the inputs/outputs for the operator. This gives us details like the ones in the bottom sections of an Explain Plan generated with db2exfmt.
All of this is just exactly the same information we got through db2exfmt (though I ran them against different databases, so the numbers may not be identical in the two posts). So a lot of the same information that I covered in Explain Part 2 – Command Line Explain Plans Using db2exfmt still applies.
In Visual Explain, the operator identifiers are still there – they’re called “Operator Identifiers” and show up as attributes in the lower left panel. The highest Operator Identifier is done first, working back to the lowest. Here’s a neat trick – there’s a button in the upper middle with arrows pointing up and down called “Reverse Diagram Vertically” – it might help you see it more in the order DB2 does things:
Kind of interesting, but I still tend to look at things in the traditional way.
Other Info
I’m not going to talk in depth in this post about what to Explain. Possible sources for SQL that you may be examining include:
- Your package cache – see my developerWorks article: Mining your package cache for problem SQL in DB2 for Linux, UNIX, and Windows
- Code reviews or SQL that developers ask you to review
- The deadlock event monitor or locking event monitor
- Stored procedures or triggers
- Statement event monitors
Dealing with Parameter Markers
Many of the queries I explain have parameter markers. This is common in an e-commerce database with well defined SQL. Some queries use hard values, or rather do not use parameter markers. For those unfamiliar with parameter markers, they are represented in SQL (including what you get from the package cache) as question marks – ‘?’. Parameter markers represent values that are supplied at execution time – the DB2 Optimizer does not know about the actual values when generating the access plan. Most methods of generating Explain Plans can handle parameter markers just fine.
One thing to understand, though, is that it is not the same to fill in values for parameter markers or to artificially place parameter marker question marks in the place of values. The Explain Plan for:
select co1, col2 from table_a where col3=100
may be very significantly different than the Explain Plan for:
select co1, col2 from table_a where col3=?
This is especially true if you use distribution statistics and have an uneven distribution of data. Just be aware of this and make sure the SQL you are using to generate Explain Plans does the same things with parameter markers as the actual SQL running against your database.
Operators Which Appear in Access Plans
The list of operators which can appear in access plans is actually really long. Below are the ones DBAs supporting single partition datbases without much need to deal with XML should be at least generally familiar with. The complete list is available in the DB2 Information Center.
- DELETE – represents the deletion of rows from a table.
- FETCH – represents the fetching of columns/rows from a table.
- FILTER – represents how data is filtered.
- GENROW – used by the optimizer to generate rows of data. See my post on Fun with IN-lists and GENROW
- HSJOIN represents hash joins for which the qualified rows from tables are hashed.
- INSERT – represents the insertion of rows into a table.
- IXAND represents the ANDing of the results of multiple index scans.
- IXSCAN – represents just about any kind of access to an index.
- MSJOIN – represents a merge join.
- NLJOIN – represents a nested loop join.
- RETURN – represents the return of data from a query.
- RIDSCN – represents the scan of a list of row identifiers (RIDs).
- SORT – represents the sorting of rows in a table.
- TBSCAN – represents table scans.
- TEMP – represents the storage of data in a temporary table.
- UPDATE – represents the updating of data in the rows of a table.
One quick note on operators – every access to an Index is reported as an IXSCAN, whether it is truly a scan or not. This can be a bit frustrating when trying to optimize index access, but is good to know when you’re digging down to that level.
Summary
Visual explain is just another way of getting the same data you get with db2exfmt. If you like GUIs and clicking around, it probably makes more sense for you.
Keep your eyes open for two additional entries in this series – one describing join types and one describing doing explains from identifiers that you get from querying the package cache (thanks to reader suggestion!).
Aren’t access plans read left-to-right, top-to-bottom? Your article says right-to-left. Can you please verify?
Well, the most important thing is bottom to top. The first thing DB2 does is the lowest thing. Then if there is more than one action on that level, we must go right to left if we want to understand the order DB2 is doing things in. The true way to tell what DB2 is doing in what order is always to look at the operator identifiers. In the example in this post, DB2 is doing the ixscan/fetch on SCHSTATUS on the right first, before it does the ixscan/fetch on SCHCONFIG. It may not hurt to read it left to right, as nothing that is dependent on something else will be on on the same level. But to read it in the order that db2 is taking actions, right to left is the way to go.
Sorry I meant bottom to top which is what you have written too. Question is about left to right.
How can I clean up explain tables?
https://datageek.blog/2015/03/03/keeping-explain-under-control/
SYSINSTALLOBJECTS with the ‘D’ action is the easiest way to actually delete the tables. Truncate is nice if you just want to remove all data in them.