A Join A Day – The Sort Merge Join

2012-12-22 - A Join A Day, General, Performance, Series, SQL Server Internals

Introduction

This is the twenty-second post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

The Sort Merge Join algorithm is the fastest of them all. However, there is a caveat. The algorithm is actually a two-step process. The first step is to sort both inputs in the same order. The second step is the Merge Join step. Here the rows from both inputs get matched together. The Merge part is the blazingly fast part. So if both inputs are sorted already because of an index or because of another sort requirement in the same query, the Sort Merge Join algorithm is the first choice. But if the inputs are not sorted, it rarely makes sense for SQL Server to first sort them.

The Sort Merge Join Algorithm

merge join icon

SQL Server actually implements only half of this algorithm directly. For the actual join the Merge Join operator is used. However, that operator cannot sort the inputs itself. It instead requires the inputs to be presorted. The sort can happen either by using dedicated sort operators in the execution plan or by utilizing existing indexes. The optimizer makes sure to only use the Merge Join operator if the inputs are indeed sorted.

The Merge Join algorithm works like this: The first row of each input is read. This primes the algorithm. After that a loop is executed:

  1. Check if the rows match. If they do produce output row
  2. If the value in input A is smaller than the value in input B, attempt to read next row from input A and skip next step
  3. Attempt to read next row from Input B
  4. If read attempt was successful, loop

Again, this is a little simplified. This version requires the values in the join column(s) to be unique. If that is not the case, special handling of multi-row matches has to be introduced, but even then the process is still very similar to the above.

The following GIF animation shows the process step by step:

merge join algorithm

This algorithm is so powerful, because SQL Server has to read and step through both inputs only once – in lockstep.

Let's look at the numbers by reusing the tables we created in yesterday's post about the Nested Loops Join algorithm. Here are the page and row counts for both tables again:

create example tables

We are going to use the same query again, just replacing the LOOP hint with a MERGE hint:

[sql] SET STATISTICS IO ON;
GO
SELECT *
FROM dbo.Tbl100 A
INNER MERGE JOIN dbo.Tbl10 B
ON A.Val = B.Val;
GO
SET STATISTICS IO OFF;
[/sql]

The execution plan for this query shows the required two Sort operators, one for each input to the Merge Join operator:

merge join execution plan with two heaps

The tool tip for the two tables shows that each one got scanned once as we expected. Looking at the SET STATISTICS IO ON; output however unveils a surprise:

[sourcecode] Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl10'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

The two table's read counts show that they each got scanned once only as we expected. But what is the story with that Worktable?

If SQL Server does not know for sure that at least one of the two tables is unique in the join column(s) it has to prepare for the possibility of a many to many relationship. Because the merge algorithm is designed to touch every record in each table only once, it cannot handle a many to many relationship. To get around that SQL Server stores all rows of the second input that had a match in a worktable in tempdb until they are not needed anymore. A record from the second input does not need to be stored any longer, when a different value on the first input has been read. Every time the same value on the first input is repeated, SQL Server "rewinds" the stream by using this work table.

Because the data in our two example tables is actually unique, there was no data read from the work table, hence the 0 reads in the statistics. But keep this in mind when you have a query that is using a merge join. For each section of identical values, SQL Server actually executes a loop join, first writing and then multiple times reading from the additional storage in tempdb. If you have large islands of identical values in your data this can potentially be very expensive.

The Merge Join operator has a many-to-many property that you can see in the execution plan tooltip window. If this property is set to "true" it means that SQL Server is expecting duplicate rows on both inputs.

Sort Merge Join Strengths and Weaknesses

As said before, the Sort Merge Join has two parts. The first part is the Sort which can be quite expensive. However, if the data is sorted already, the second part which is handled by the Merge Join operator has the following properties:

  • + There is no setup cost.
  • + It does not require memory.
  • + The first rows are returned immediately, as soon as a match is found.
  • - At least one column in the join condition needs to use an equality comparison.
  • - If there is a many to many relationship in the data, a tempdb work table is created and used to provide a rewind-ability of the data potentially hurting performance severely.

The equality comparison is required to be able to sort both inputs the same way. However, there is a special case. SQL Server can use the Merge Join operator when a nonequi full outer join is required. In that case the entire first input is copied into the work table. Then a nested loops algorithm is executed between the second input and the work table. Each row in the worktable that had a match is marked. In the end an additional scan of the worktable returns all non-matched rows. This is a very expensive process hidden under a seemingly fast operator. Be aware of this.

Summary

The Merge Join algorithm is certainly the most efficient algorithm available to SQL Server. However, it requires both inputs to be sorted the same way. The cost of sorting is usually too high to make the use of this algorithm worthwhile, unless the data is sorted already or has to be sorted anyway for example because of an ORDER BY clause in the query.

You also need to be aware of the fact that even the potential of a many to many relationship in the data causes a worktable to be created and filled. If there are actually any duplicate values, the data in the worktable is reread as often as necessary which can be detrimental to the performance of the query.

A Join A Day

This post is part of my December 2012 "A Join A Day" blog post series. You can find the table of contents with all posts published so far in the introductory post: A Join A Day – Introduction. Check back there frequently throughout the month.

Categories: A Join A Day, General, Performance, Series, SQL Server Internals

Leave a Reply