It Matters Where Your Join Clauses Go

Posted by

Explicit Join vs. Implicit Join

When writing SQL there are two places you can define how tables are joined. The first is explicit, and is specified after the the keyword ON in the join clause like this:


SELECT * 
FROM table_a as a
    JOIN table_b as b ON a.id=b.id 

The same join can also be written this way with implicit join syntax:


SELECT * 
FROM table_a as a, table_b as b 
WHERE a.id=b.id 

Both of these work perfectly fine. Generally the implicit syntax is considered older while the explicit is considered newer. Personally, I vastly prefer the explicit syntax. The reason is simply code maintainablity. SQL often survives many application updates and is reused in many ways in many places. The where clause will often change over time, and be changed by different people at different times. More than once in my career, I've seen one of the changes accidentally remove the where clause, leading to significant performance problems and incorrect results.

What Happened

Last week, someone was troubleshooting a very real performance problem and came to me with information from a monitoring tool, Dynatrace. The information showed that within a narrow time window being examined, a query expected to run in 3 seconds or less instead ran in 50 seconds. Considering the issue being worked on had to do with a web request timing out, this could be relevant. The monitoring tool presented the query like this:

select distinct USERS.USERS_ID,USERREG.LOGONID 
from USERS,USERREG 
where USERS.REGISTERTYPE in (?, ?, ..., ?) 
and USERS.PROFILETYPE = ? 
and (LCASE(USERREG.LOGONID) like ? escape '\'

I ran an explain on this query and found it was indeed exceptionally expensive, and in fact 50 seconds seemed rather speedy given that it was clearly creating a Cartesian Product:

Upon further inspection, we found the following representation of execution and response times for this query:

The green arrow points to the problem time identified - when this query ran for 50 seconds. The purple line indicates median execution duration. All other execution times for this query were much more reasonable. This graph alone ruled out our issue, as we were trying to help a vendor identify why a patch had started causing a particular function to time out on the website, and this query was occurring both before and after the patch, and generally was returning in a reasonable amount of time. However, I was still worried as not only would an abundance of these at once cause a potential performance problem, but it also wouldn't return meaningful data since it would pair each logonid with each user_id, which is not meaningful.

After I reached out to our developers to discuss, someone realized that the analysis tool was TRUNCATING the query for some reason. After some research, it looks like the real query was:

SELECT DISTINCT USERS.USERS_ID, USERREG.LOGONID
FROM USERS, USERREG
WHERE USERS.REGISTERTYPE
IN (?, ?, ?) AND USERS.PROFILETYPE = ? AND (LCASE(USERREG.LOGONID) LIKE ? escape '\' ) AND (USERREG.USERS_ID = USERS.USERS_ID)
ORDER BY USERREG.LOGONID

Because the join condition was placed at the END of the WHERE clause, the part that got truncated was the where clause, leading the query to look much worse than it was. The actual explain plan looked like this:

While probably not the ideal, this is at least acceptable for a query that runs sporadically and less than once per minute.

Lessons Learned

First, there's a reason why explicit join syntax is preferred. I would not have missed the tool truncating the query if the join condition were explicit.
Second, always check the data from a reporting tool to verify it means what you think it means. I would have spent less time on this if I had understood we were talking about a single execution and if I can correctly realized that we were not seeing a real-world cart

Lead Database Administrator
Ember is always curious and thrives on change. Working in IT provides a lot of that change, but after 18 years developing a top-level expertise on Db2 for mid-range servers and more than 7 years blogging about it, Ember is hungry for new challenges and looks to expand her skill set to the Data Engineering role for Data Science. With in-depth SQL and RDBMS knowledge, Ember shares both posts about her core skill set and her journey into Data Science. Ember lives in Denver and work from home

One comment

  1. Good points, Ember.

    Personally, I would be worried about any tool which truncates queries anywhere.

    I have always maintained that there are 2 types of predicates – structure and business. When joining employee table to employee history, explicit/implicit referential integrity should dictate that a.empid = b.empid is needed for equijoin. This is a structure predicate and should be in the ON clause. If looking for a business condition of salary > 100,000 this should be in the WHERE clause. Structure predicates should rarely need to be changed. Since earlier versions of SQL did not have the ON clause we are left with a potential mess.

    A related topic to this is Oracle’s Natural join (it will join on all common columns by name!!), which is an even bigger ticking time-bomb!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.