The goal of this article is to to take one query and use the explain facilities of Db2 to walk through it, and then do the same with MySQL, and then show some of the optimizations I might make for the query and why.
A few disclaimers. First, I’m using MySQL 5.7. I understand there may be some more interesting options in MySQL 8, but I continue to use 5.7 because that’s what is in production at work. Second, I have 20 years of experience with Db2 Explains, and only 6 months with MySQL. I welcome comments and suggestions below.
Optimizer
Moving from Db2 to MySQL, I expected the optimizer to be less sophisticated. And that is, indeed the case. For me, that’s one of the known differences between an open-source and a proprietary system. IBM has put decades of paid development into the Db2 optimizer. The way Db2 collects and uses statistics is also much more in-depth than MySQL 5.7.
Query and Data Model
I’ve chosen a data model and a query that is pretty MySQL-centric. I’m using the Sakila database and a the “find overdue DVDs” query from this page in the MySQL documentation.
Db2
For Db2, I pulled the Sakila database from this repo. I do have to change the date syntax, as every database platform somehow has to come up with its own date syntax. <sigh> I’m translating it as literally as possible. Note that the data wasn’t all there from the source, I list, either, and I had to tweak it. This included some kind of confusion between NULL and ”, and missing data in the phone field of the address table. I need to see if they’ll accept PRs to make it better.
Query and Result Set
Here’s the query and result set to level set before digging in to the explain:
SELECT customer.last_name || ', ' || customer.first_name AS customer, address.phone, film.title
FROM rental
INNER JOIN customer ON rental.customer_id = customer.customer_id
INNER JOIN address ON customer.address_id = address.address_id
INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id
INNER JOIN film ON inventory.film_id = film.film_id
WHERE rental.return_date IS NULL
AND rental_date + film.rental_duration DAYS < CURRENT DATE
ORDER BY title
LIMIT 5
CUSTOMER PHONE TITLE
-------------------------------------------------------------------------------------------- -------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OLVERA, DWAYNE 62127829280 ACADEMY DINOSAUR
HUEY, BRANDON 99883471275 ACE GOLDFINGER
OWENS, CARMEN 272234298332 AFFAIR PREJUDICE
HANNON, SETH 864392582257 AFRICAN EGG
COLE, TRACY 371490777743 ALI FOREVER
5 record(s) selected.
The output is messy here because that's how Db2 formats it by default using my favorite CLI, and I'm not changing the query to make it prettier. Note that the data here is the same as using MySQL below.
Basic Explain
This is my favorite format for doing explain using Db2. I'll share each step here, and then attach the full output, as it's often several pages.
$ db2 set current explain mode explain
DB20000I The SQL command completed successfully.
$ db2 -tvf query4.sql
SELECT customer.last_name || ', ' || customer.first_name AS customer, address.phone, film.title FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id INNER JOIN address ON customer.address_id = address.address_id INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id INNER JOIN film ON inventory.film_id = film.film_id WHERE rental.return_date IS NULL AND rental_date + film.rental_duration DAYS < CURRENT DATE ORDER BY title LIMIT 5
SQL0217W The statement was not executed as only Explain information requests
are being processed. SQLSTATE=01604
$ db2exfmt -d sakila -1 -o query4.exfmt
DB2 Universal Database Version 11.5, 5622-044 (c) Copyright IBM Corp. 1991, 2019
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool
Connecting to the Database.
Connect to Database Successful.
Using SYSTOOLS schema for Explain tables.
Output is in query4.exfmt.
Executing Connect Reset -- Connect Reset was Successful.
Note there is no information given there - it's all in the output file query4.exfmt.
Explain Highlights
I could probably teach an hour-long course on everything in this explain output, so let's stick to a few highlights and what they tell us. First of all, I nearly always start with the query graph:
Access Plan:
-----------
Total Cost: 450.146
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
5
^NLJOIN
( 2)
450.146
262.976
/-----------------+-----------------\
5 1
^NLJOIN FETCH
( 3) ( 14)
389.413 13.5336
254 2
/------+-------\ /---+----\
5 1 1 603
TBSCAN FETCH IXSCAN TABLE: DB2INST1
( 4) ( 12) ( 15) ADDRESS
328.517 13.5336 6.76789 Q3
245 2 1
| /---+----\ |
5 1 599 603
SORT IXSCAN TABLE: DB2INST1 INDEX: DB2INST1
( 5) ( 13) CUSTOMER PK_ADDRESS
328.517 6.76789 Q1 Q3
245 1
| |
61 599
HSJOIN^ INDEX: DB2INST1
( 6) PK_CUSTOMER
328.515 Q1
245
/---------+----------\
1000 183
TBSCAN HSJOIN^
( 7) ( 8)
49.3022 279.192
53 192
| /---------+---------\
1000 4581 183
TABLE: DB2INST1 FETCH TBSCAN
FILM ( 9) ( 11)
Q5 143.584 135.564
43 149
/---+----\ |
4581 4581 16044
IXSCAN TABLE: DB2INST1 TABLE: DB2INST1
( 10) INVENTORY RENTAL
81.5935 Q4 Q2
12
|
4581
INDEX: DB2INST1
IDX_FK_FILM_ID
Q4
This graph is read bottom to top (the things at the bottom happen first, and the things at the top happen last). It's also read right to left (things at the right generally happen before things on the left).
From this graph, I can see that this query takes 450 timerons. Timerons are an abstract measurement of time that cannot be converted to actual units of time, but can be used to compare access plans to one another. It also expects to return 5 rows, which is absolutely great, because it does return 5 rows. This graph shows me every object (table or index) that is accessed and what join methodology is used to join it to the rest of the query. Looking at this, I can immediately see that there are two table scans, and two index accesses with fetching data from the tables themselves. There are several levels of opportunities to improve performance there.
Each node of the graph is called an operator and has an identifier in parentheses. In this graph, the operators are numbered 1 through 15. Each operator also tells us the cost in timerons, which is cumulative of all operators below that one in the graph. This is the number directly below the operator identifier. The top number in each operator tells us the number of rows Db2 expects to be dealing with at this point. These numbers can be off, particularly if statistics are not correct or are not detailed. There are tools for determining that like db2caem.
To get more information on any of the operators, I can scroll down to their section. For example, the details on the table scan of RENTAL (11) look like this:
11) TBSCAN: (Table Scan)
Cumulative Total Cost: 135.564
Cumulative CPU Cost: 3.57339e+07
Cumulative I/O Cost: 149
Cumulative Re-Total Cost: 1.77916
Cumulative Re-CPU Cost: 3.47691e+07
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 7.46347
Estimated Bufferpool Buffers: 149
Arguments:
---------
CUR_COMM: (Currently Committed)
TRUE
JN INPUT: (Join input leg)
INNER
LCKAVOID: (Lock Avoidance)
TRUE
MAXPAGES: (Maximum pages for prefetch)
ALL
PREFETCH: (Type of Prefetch)
SEQUENTIAL
ROWLOCK : (Row Lock intent)
SHARE (CS/RS)
SCANDIR : (Scan Direction)
FORWARD
SKIP_INS: (Skip Inserted Rows)
TRUE
SPEED : (Assumed speed of scan, in sharing structures)
FAST
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY
THROTTLE: (Scan may be throttled, for scan sharing)
TRUE
VISIBLE : (May be included in scan sharing structures)
TRUE
WRAPPING: (Scan may start anywhere and wrap)
TRUE
Predicates:
----------
6) Sargable Predicate,
Comparison Operator: Is Null
Subquery Input Required: No
Filter Factor: 0.0114061
Predicate Text:
--------------
Q2.RETURN_DATE IS NULL
Input Streams:
-------------
7) From Object DB2INST1.RENTAL
Estimated number of rows: 16044
Number of columns: 5
Subquery predicate ID: Not Applicable
Column Names:
------------
+Q2.$RID$+Q2.RENTAL_DATE+Q2.RETURN_DATE
+Q2.INVENTORY_ID+Q2.CUSTOMER_ID
Output Streams:
--------------
8) To Operator #8
Estimated number of rows: 183
Number of columns: 3
Subquery predicate ID: Not Applicable
Column Names:
------------
+Q2.RENTAL_DATE+Q2.INVENTORY_ID+Q2.CUSTOMER_ID
Each operator has similar sections, indicating where it is getting input from and where it is sending output to, along with any predicates being applied, and other details about the cost at this node.
Using Explain
The first reason to use explain is always to understand what choices the optimizer is making. Often we will use those choices and the costs to determine where to focus to improve query performance. While there are some red flags, like table scans, to look out for that we may want to address, we also use those cost estimates to figure out where to spend time optimizing. In the case of this query, the costs tell us to focus on the operators below the Hash Scan Join - operator 6. Spending time on the Customer or address branches would bring us limited improvements as the costs there are already quite low. If we could transform the table scan on RENTAL (operator 11) to something else, we might make a significant improvement.
Explain Summary for Db2
I've shared only a very small amount of information here, but the information is there to really dig into different parts of the query. In addition, I have advanced techniques. I can use db2caem to get not just an explain plan, but the actual counts of things like rows if I suspect that Db2 is estimating incorrectly. This could allow me to improve my statistics gathering in a way that gives Db2 the information needed to select a better plan.
I could also do an explain from section of an actual executed instance of this query from the package cache to validate that what explain expects to happen is actually what is happening.
MySQL
I keep feeling like I'm missing something with explain in MySQL, and maybe I am, but the information seems far less dense and useful.
Query and Result Set
This part is nice. Using the MySQL command line, I get nicer output.
mysql> SELECT CONCAT(customer.last_name, ', ', customer.first_name) AS customer,
-> address.phone, film.title
-> FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id
-> INNER JOIN address ON customer.address_id = address.address_id
-> INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id
-> INNER JOIN film ON inventory.film_id = film.film_id
-> WHERE rental.return_date IS NULL
-> AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
-> ORDER BY title
-> LIMIT 5;
+----------------+--------------+------------------+
| customer | phone | title |
+----------------+--------------+------------------+
| OLVERA, DWAYNE | 62127829280 | ACADEMY DINOSAUR |
| HUEY, BRANDON | 99883471275 | ACE GOLDFINGER |
| OWENS, CARMEN | 272234298332 | AFFAIR PREJUDICE |
| HANNON, SETH | 864392582257 | AFRICAN EGG |
| COLE, TRACY | 371490777743 | ALI FOREVER |
+----------------+--------------+------------------+
5 rows in set (0.11 sec)
This is the same output we got from Db2. I do like how it tells me how long the query took to execute as well. In Db2, I'd have to wrap it in another command or use db2batch to get that. db2batch isn't a bad idea anyway, as one execution of a query isn't the definitive answer on how long it takes.
Explain
There are actually two forms of explain I would run here to look at things. First the basic explain, then a json format that gives us a few more details.
Basic Explain
The basic output of explain looks like this:
mysql> EXPLAIN SELECT CONCAT(customer.last_name, ', ', customer.first_name) AS customer,
-> address.phone, film.title
-> FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id
-> INNER JOIN address ON customer.address_id = address.address_id
-> INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id
-> INNER JOIN film ON inventory.film_id = film.film_id
-> WHERE rental.return_date IS NULL
-> AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
-> ORDER BY title
-> LIMIT 5;
+----+-------------+-----------+------------+--------+----------------------------------------+---------+---------+----------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+--------+----------------------------------------+---------+---------+----------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | rental | NULL | ALL | idx_fk_inventory_id,idx_fk_customer_id | NULL | NULL | NULL | 1 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | customer | NULL | eq_ref | PRIMARY,idx_fk_address_id | PRIMARY | 2 | sakila.rental.customer_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | address | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.customer.address_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | inventory | NULL | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY | 3 | sakila.rental.inventory_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | film | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.inventory.film_id | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+--------+----------------------------------------+---------+---------+----------------------------+------+----------+----------------------------------------------+
5 rows in set, 1 warning (0.02 sec)
Just to confuse me, this output is read top to bottom, and not bottom to top like the Db2 explain output.
The nice part about this output is that it gives me one and only one line for each table being accessed, and describes which index is being used to access it. It tells me for each table how many rows it expects to be using, but these numbers are based on much more simple statistics than I'm used to in Db2, so may be less accurate. "filtered" does not mean what we might think it to mean in Db2. It indicates what percentage of the rows it thinks it is working with will meet the other conditions for that table. Notice here that it tells us here that it expects to only be working with one row all the way along, when we end up with 5 rows.
The "type" column is sometimes called join type, but is really access type. The access types are well-documented in the MySQL documentation - go to the section titled "EXPLAIN Join Types". One of the big things we're looking for in the explain output here is things we know are not great like table scans (join type: ALL). Here the table scan on the rental table once again stands out.
While the MySQL optimizer is at least partially cost-based, this format of explain has no mention of costs.
JSON Explain
mysql> EXPLAIN SELECT CONCAT(customer.last_name, ', ', customer.first_name) AS customer, address.phone, film.title FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id INNER JOIN address ON customer.address_id = address.address_id INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id INNER JOIN film ON inventory.film_id = film.film_id WHERE rental.return_date IS NULL AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE() ORDER BY title LIMIT 5;^C
mysql> EXPLAIN FORMAT=JSON SELECT CONCAT(customer.last_name, ', ', customer.first_name) AS customer,
-> address.phone, film.title
-> FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id
-> INNER JOIN address ON customer.address_id = address.address_id
-> INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id
-> INNER JOIN film ON inventory.film_id = film.film_id
-> WHERE rental.return_date IS NULL
-> AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
-> ORDER BY title
-> LIMIT 5;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "2.75"
},
"ordering_operation": {
"using_temporary_table": true,
"using_filesort": true,
"cost_info": {
"sort_cost": "1.00"
},
"nested_loop": [
{
"table": {
"table_name": "rental",
"access_type": "ALL",
"possible_keys": [
"idx_fk_inventory_id",
"idx_fk_customer_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.25",
"eval_cost": "0.10",
"prefix_cost": "0.35",
"data_read_per_join": "32"
},
"used_columns": [
"rental_id",
"rental_date",
"inventory_id",
"customer_id",
"return_date"
],
"attached_condition": "(`sakila`.`rental`.`return_date` is null)"
}
},
{
"table": {
"table_name": "customer",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"idx_fk_address_id"
],
"key": "PRIMARY",
"used_key_parts": [
"customer_id"
],
"key_length": "2",
"ref": [
"sakila.rental.customer_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.25",
"eval_cost": "0.10",
"prefix_cost": "0.70",
"data_read_per_join": "584"
},
"used_columns": [
"customer_id",
"first_name",
"last_name",
"address_id"
]
}
},
{
"table": {
"table_name": "address",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"address_id"
],
"key_length": "2",
"ref": [
"sakila.customer.address_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.25",
"eval_cost": "0.10",
"prefix_cost": "1.05",
"data_read_per_join": "632"
},
"used_columns": [
"address_id",
"phone"
]
}
},
{
"table": {
"table_name": "inventory",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"idx_fk_film_id"
],
"key": "PRIMARY",
"used_key_parts": [
"inventory_id"
],
"key_length": "3",
"ref": [
"sakila.rental.inventory_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.25",
"eval_cost": "0.10",
"prefix_cost": "1.40",
"data_read_per_join": "16"
},
"used_columns": [
"inventory_id",
"film_id"
]
}
},
{
"table": {
"table_name": "film",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"film_id"
],
"key_length": "2",
"ref": [
"sakila.inventory.film_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.25",
"eval_cost": "0.10",
"prefix_cost": "1.75",
"data_read_per_join": "552"
},
"used_columns": [
"film_id",
"title",
"rental_duration"
],
"attached_condition": "((`sakila`.`rental`.`rental_date` + interval `sakila`.`film`.`rental_duration` day) < (curdate()))"
}
}
]
}
}
} |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.02 sec)
This has a bit more information, including some cost metrics. I'm still not sure how useful they are. I've seen the 'data_read_per_join' be wildly inaccurate - to the tune of TB, when there's no way it was dealing with that much data. But it can still be used to find parts of the access plan that are more expensive than others, probably.
I haven't yet seen a solid number like Db2's timerons to use to compare two access plans. I've found that particularly frustrating, as with Db2, I can say that an access plan with some tweaks will be more or less expensive and by approximately how much.
Tweaking the query and adding an index
Ok, so this query happens to display one of my favorite performance problems. This query is performing date math on columnar data, pretty much guaranteeing a table scan (or maybe an index leaf page scan). I'm curious to see what happens to the access plan on both platforms if I correct that math to apply to the comparison value and add an index on the columns in question.
Db2
For Db2, I add the index, and change the syntax, and redo the explain:
db2 "create index idx_rental_query4 on rental (return_date, rental_date, inventory_id, customer_id) allow reverse scans collect detailed statistics"
DB20000I The SQL command completed successfully.
[db2inst1@d8ab9ae65554 ~]$ db2 set current explain mode explain
DB20000I The SQL command completed successfully.
[db2inst1@d8ab9ae65554 ~]$ db2 -tvf query4a.sql
SELECT customer.last_name || ', ' || customer.first_name AS customer, address.phone, film.title
FROM rental
INNER JOIN customer ON rental.customer_id = customer.customer_id
INNER JOIN address ON customer.address_id = address.address_id
INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id I
NNER JOIN film ON inventory.film_id = film.film_id
WHERE rental.return_date IS NULL
AND rental_date < CURRENT DATE - film.rental_duration DAYS
ORDER BY title
LIMIT 5
SQL0217W The statement was not executed as only Explain information requests
are being processed. SQLSTATE=01604
The access plan I see is clearly about 25% less in cost than before:
Access Plan:
-----------
Total Cost: 331.09
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
5
^NLJOIN
( 2)
331.09
116.413
/-----------------+-----------------\
5 1
^NLJOIN FETCH
( 3) ( 14)
270.357 13.5336
107.437 2
/------+-------\ /---+----\
5 1 1 603
TBSCAN FETCH IXSCAN TABLE: DB2INST1
( 4) ( 12) ( 15) ADDRESS
209.461 13.5336 6.76789 Q3
98.4372 2 1
| /---+----\ |
5 1 599 603
SORT IXSCAN TABLE: DB2INST1 INDEX: DB2INST1
( 5) ( 13) CUSTOMER PK_ADDRESS
209.461 6.76789 Q1 Q3
98.4372 1
| |
50.8046 599
HSJOIN^ INDEX: DB2INST1
( 6) PK_CUSTOMER
209.459 Q1
98.4372
/---------+----------\
1000 183
TBSCAN HSJOIN^
( 7) ( 8)
49.3022 160.136
53 45.4372
| /---------+----------\
1000 4581 183
TABLE: DB2INST1 FETCH IXSCAN
FILM ( 9) ( 11)
Q5 143.584 16.5075
43 2.43717
/---+----\ |
4581 4581 16044
IXSCAN TABLE: DB2INST1 INDEX: DB2INST1
( 10) INVENTORY IDX_RENTAL_QUERY4
81.5935 Q4 Q2
12
|
4581
INDEX: DB2INST1
IDX_FK_FILM_ID
Q4
Looking at this explain, I can see I might get further optimizations by getting a better index on inventory, if index-only access is appropriate for this query, and I might also spend some further time seeing if I can optimize access to film.
Note also, that using db2advis would have given me some of this, but might have also sent me on a goose chase adding indexes on customer and address that would only be appropriate if performance of this particular query really was critical enough to aim for index-only access across the board.
MySQL
Performing the same actions in MySQL - changing the syntax around the date, creating the index, and rerunning the explain looks like this:
mysql> EXPLAIN SELECT CONCAT(customer.last_name, ', ', customer.first_name) AS customer,
-> address.phone, film.title
-> FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id
-> INNER JOIN address ON customer.address_id = address.address_id
-> INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id
-> INNER JOIN film ON inventory.film_id = film.film_id
-> WHERE rental.return_date IS NULL
-> AND rental_date < CURRENT_DATE() - INTERVAL film.rental_duration DAY
-> ORDER BY title
-> LIMIT 5;
+----+-------------+-----------+------------+--------+----------------------------------------------------------------------+-------------------+---------+----------------------------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+--------+----------------------------------------------------------------------+-------------------+---------+----------------------------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | rental | NULL | ref | rental_date,idx_fk_inventory_id,idx_fk_customer_id,idx_rental_query4 | idx_rental_query4 | 6 | const | 183 | 100.00 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | customer | NULL | eq_ref | PRIMARY,idx_fk_address_id | PRIMARY | 2 | sakila.rental.customer_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | address | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.customer.address_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | inventory | NULL | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY | 3 | sakila.rental.inventory_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | film | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.inventory.film_id | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+--------+----------------------------------------------------------------------+-------------------+---------+----------------------------+------+----------+-----------------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)
We can still see that things are clearly better, but other than eliminating a table scan, it is hard to say by how much. Additionally, there's not much to point me into which direction further optimization makes sense. Perhaps further optimization isn't really needed, since the PRIMARY index includes all table data to begin with.
Summary
Writing this blog article is actually a bit cathartic for me, and helps me name why I'm finding using explain in MySQL frustrating. It just doesn't have some of the details I'm used to. I'm sure in time, I'll see the patterns in MySQL explain more quickly, without the greater detail that I'm used to.
MySQL has a really simple executor, at least for most cases — it always does nested loop joins. That also makes it pretty bad at DWH queries.
With that and no subqueries in the statement, the cost of the query in the tabular simple explain is the product of the values shown in the rows column, the “badness” of the query. Badness is not comparable between different results, but with the same result set produced, the query with the lower badness is usually better. That seems to be matching the concept of timerons for DB2?
I hadn’t heard the product of rows thing before. I’m still a bit mystified how the first explain in MySQL only shows one row for the table scan, while the second with the added index and altered date syntax shows 183. 183 is the correct number of rows where return_date is NULL.
Definitively weird, or even wrong. The explains says “ALL” in type, so it is a full table scan. It also says “NULL” in ref, so no index can be used. This is confirmed with key and key_len also being NULL. The rows should show the estimated table size for “rental” from “INFORMATION_SCHEMA.TABLES.TABLE_ROWS”.
It further says “where used”, so additional conditions are being applied to the table scan while emitting found rows into the inner loop of the loop join. The result is materialied in a temporary table, and is sorted (that is applied at the end).
All other operations are “eq_ref” in type, and to the Unique Not-Null Index named PRIMARY (the primary key), so they are bound to be 1 in cost.
All of that makes sense for the (limited) way MySQL works, except for the wrong number in rows. If you can reproduce that, it would be interesting to see if that goes away after an ANALYZE TABLE on the table “rental”.