This is the fifteenth post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.
As you might have expected, the Right Anti Semi Join is logically equivalent to the Left Anti Semi Join, just the roles of the two input row sources have switched. So, a logical right anti semi join query is one that returns all the rows from the right side for which no match on the left side exists.
As with the Right Semi Join, there is no direct way to write a logical right anti semi join query. However, we again can take the above description literally:
SELECT th.* FROM Production.Product p RIGHT OUTER JOIN Production.TransactionHistory th ON p.ProductID = th.ProductID WHERE p.ProductID IS NULL;
The query returns all TransactionHistory records for which the Product table entry does not exist (anymore). Executing it produces an empty result:
This however is expected: There is a foreign key relationship defined between these two tables so the above query will always return zero rows. But it still gives a good logical right anti semi join example.
Let's also look at the TblA – TblB example:
SELECT b.* FROM dbo.TblA AS a RIGHT OUTER JOIN dbo.TblB AS b ON a.TblA_Val = b.TblB_Val WHERE a.TblA_Val IS NULL;
This query follows the same pattern as the previous query. It first finds for all right side rows all potential left side matches and then returns only the right side rows that did not have a left side match. Before we look at the result here is the content of the two tables again:
Running this query returns only a single row from TblB:
The row with TblB_Val = 6 is the only TblB row, that does not have a match in TblA.
As with the Right Semi Join, there is no single statement that is equivalent to a logical right anti semi join. Any combination out of JOIN and WHERE that leads to only rows (and columns) from the right side being returned that do not have a left side match is a logical right anti semi join query and might get optimized using an Anti Semi Join operator.
Today's first example query, while being a logical anti semi join query does not result in an Anti Semi Join operator:
It instead goes through a two-step process again, first using a Right Outer Join operator to find matches and then a Filter operator to remove rows with matches. My experience has been that queries written this way end up with an execution plan like this in most cases. Instead of trying to force the optimizer in one direction, just write the query the most logical way using a NOT EXISTS() clause:
SELECT * FROM Production.TransactionHistory th WHERE NOT EXISTS ( SELECT 1 FROM Production.Product p WHERE p.ProductID = th.ProductID );
This query is switching the order of the two tables to be able to use the NOT EXISTS() clause. However, the SQL Server optimizer recognizes based on the row count and row size estimates that for this query it is actually better to us a Right Anti Semi Join operator. So it switches the order of the tables back during compilation:
There is no single statement that represents a logical right anti semi join. There are ways to write a query that is equivalent to a right anti semi join however, by using the other join types we have discussed before. If a query meets the anti semi join pattern, SQL Server might decide to use a Right Anti Semi Join operator. For that it might even change the order of the two row sources if it estimates the result to be more efficient.
This post is part of my December 2012 "A Join A Day" blog post series. You can find the table of contents with all posts published so far in the introductory post: A Join A Day – Introduction. Check back there frequently throughout the month.