Home » Blog » A Join A Day – The Nested Loops Join

A Join A Day – The Nested Loops Join

2012-12-21 - A Join A Day, General, Performance, Series, SQL Server Internals

Introduction

This is the twenty-first post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

Over the next few days we are going to look at the join algorithms in a little more detail. There are three join algorithms that are commonly implemented by modern database management systems. They are: Nested Loops Join, Hash Join and Sort Merge Join. Unsurprisingly, SQL Server also implements those three algorithms.

Today's topic is the Loop Join – probably the simplest of them all.

The Nested Loops Join

The Nested Loop Algorithm gets its name from the fact that it literally executes two nested loops. The outer loop steps through the rows of the left side. For each row the inner loop then steps through all rows of the right side to find all matches. That means the entire right side input is accessed as many times as there are rows on the left. To demonstrate that, let's create these two tables:

[sql] IF OBJECT_ID('dbo.Tbl10') IS NOT NULL DROP TABLE dbo.Tbl10;
CREATE TABLE dbo.Tbl10(
Id INT IDENTITY(1,1),
Val INT,
Fill CHAR(7000) NOT NULL DEFAULT REPLICATE('Fill',1750)
);

IF OBJECT_ID('dbo.Tbl100') IS NOT NULL DROP TABLE dbo.Tbl100;
CREATE TABLE dbo.Tbl100(
Id INT IDENTITY(1,1),
Val INT,
Fill CHAR(7000) NOT NULL DEFAULT REPLICATE('Fill',1750)
);

INSERT INTO dbo.Tbl10(Val)
SELECT TOP(10) 1+ROW_NUMBER()OVER(ORDER BY (SELECT 1))%100
FROM sys.all_columns A, sys.all_columns B, sys.all_columns C;

SELECT index_type_desc, alloc_unit_type_desc, index_depth, page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('dbo.Tbl10'),NULL,NULL,'SAMPLED');

INSERT INTO dbo.Tbl100(Val)
SELECT TOP(100) ROW_NUMBER()OVER(ORDER BY (SELECT 1))
FROM sys.all_columns A, sys.all_columns B, sys.all_columns C;

SELECT index_type_desc, alloc_unit_type_desc, index_depth, page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('dbo.Tbl100'),NULL,NULL,'SAMPLED');
[/sql]

After executing this, the dbo.Tbl100 table contains 100 records. The dbo.Tbl10 contains 10 records. Both tables are designed so that each record takes up an entire storage page of 8192 bytes. The two selects against the sys.dm_db_index_physical_stats show the amount of pages used by and the amount of records stored in each table:

The query also shows that both tables do not have a clustered index. Now let's run the following query:

[sql] SET STATISTICS IO ON;
GO
SELECT *
FROM dbo.Tbl100 A
INNER LOOP JOIN dbo.Tbl10 B
ON A.Val = B.Val;
GO
SET STATISTICS IO OFF;
[/sql]

The query returns ten records. Because we used the LOOP hint, the Nested Loops Join algorithm was used:

You can see in the tooltip that the table scan of dbo.Tbl10 was executed 100 times, once for each row in dbo.Tbl100. Looking at the output of SET STATISTICS IO ON; confirms that:

[sourcecode] Table 'Tbl10'. Scan count 1, logical reads 1000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

The query executed 1,000 reads against the 10 page table, so the entire table was read 100 times.

This sounds pretty bad. So why would you ever want to use the Nested Loops Join algorithm? There are a quite a few advantages. See the Strengths and Weaknesses section at the end of this article for a detailed list.

The biggest advantage is probably that there is no setup work required. If the tables are small enough, the nested join can be done processing, before the other algorithms would have even started returning results.

You also need to keep in mind that the above example was the worst case scenario of joining two heaps with no usable indexes. So, let's see what happens if we add a clustered index:

[sql] CREATE UNIQUE CLUSTERED INDEX [dbo.Tbl10(Val, Id) CL] ON dbo.Tbl10(Val, Id);
[/sql]

Now sys.dm_db_index_physical_stats shows an index depth of two:

The index depth is what determines how many reads are required to do a seek in that index. If we rerun the above join query again the execution plan will, as expected, contain a seek of dbo.Tbl10:

This execution plan now is not quite a nested loop anymore. SQL Server still loops through the left input one row at a time. But it does not loop through the second input anymore. Instead it executes a direct seek to read just the row requested. However, the seek operation is still executed 100 times. So we can expect 200 reads to happen against that table. Let's check the SET STATISTICS IO ON; output:

[sourcecode] Table 'Tbl10'. Scan count 100, logical reads 227, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

It shows 227 reads. There are just a few reads more than we expected. Those additional reads happen because of additional meta-data pages like the IAM page that need to be accessed.

With an appropriate index the Nested Loops Join algorithm is not that bad anymore. While the savings in this case weren't great, you need to keep in mind, that the index depth growth very slowly as new rows are added to the table. Even in this badly designed example table you can store several million rows before a seek takes more than four reads.

This example showed another advantage of the Nested Loops Join operator: It can make use of an index on the right side input. With the exception of a few edge cases, the other two algorithms cannot utilize such an index.

If you are following along with the examples, use this statement to remove the index when you are done:

[sql] DROP INDEX dbo.Tbl10.[dbo.Tbl10(Val, Id) CL];
[/sql]

Loop Join Strengths and Weaknesses

The Loop Join operator has the following strength and weaknesses:

+ There is no setup cost.
+ It does not require memory.
+ The first rows are returned immediately, as soon as a match is found.
+ It is the only algorithm that can handle a plain nonequi-join situation.
+ A matching index on the right input can be utilized, potentially saving a lot of reads.
- If there is no usable index available the right input has to be read completely for each row of the left input – making this algorithm very expensive, at least on large data sets.

Summary

The Nested Loops Join algorithm is SQL Servers default join operator. This is due to its simplicity. There are no setup costs and its execution does not require memory. Because of that, it produces the first rows fairly quickly which can be an advantage in certain cases. This algorithm is also the only one that can truly handle a plain nonequi-join. Finally, it can make use of an index on the right input side, which sets it apart from the other algorithms.

A Join A Day

This post is part of my December 2012 "A Join A Day" blog post series. You can find the table of contents with all posts published so far in the introductory post: A Join A Day – Introduction. Check back there frequently throughout the month.