A Join A Day – The Right Semi Join

2012-12-13 - A Join A Day, Fundamentals, General, Series

Introduction

This is the thirteenth post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

The Right Semi Join is the exact mirror image of the Left Semi Join: For every row from the left side, matching rows from the right side are returned. No columns from the left side are returned and rows from the right are returned each at most once.

RIGHT SEMI JOIN Example

To write a query with a logical left semi join we could use the EXISTS() clause. However, there is not really a concept or statement that would result in a logical right semi join. But we can take the description above literally. That will lead us to a query like this:

[sql] SELECT DISTINCT be.*
FROM Person.BusinessEntityAddress AS bea
JOIN Person.BusinessEntity AS be
ON bea.BusinessEntityID = be.BusinessEntityID;
[/sql]

The above query produces the following result. It is the same result that we got from yesterday's EXISTS() query.

Logical Right Semi Join Example

For this query pattern to be a true right semi join the right side columns must include a primary or unique key. Otherwise the DISTINCT might combine multiple separate right side rows into one.

The tables in the TblA – TblB example I have been using do not have a primary key. To make this example work we have to select only distinct values from the left join column before we join to the right side:

[sql] SELECT b.*
FROM (SELECT DISTINCT TblA_Val FROM dbo.TblA) AS a
JOIN dbo.TblB AS b
ON a.TblA_Val = b.TblB_Val;
[/sql]

Remember, the two tables in this example contain the following rows:

setup for all-cases join example

Running the above query leads to this result:

Example for All-Cases for logical Right Semi JOin

Only rows from TblB for which a match in TblA exists are included. Each TblB row is included at most once, even if it has multiple matches in TblA.

Like the Left Semi Join, the Right Semi Join does not eliminate duplicates. So if a row on the right exists more than once, and if a match on the left exists, all of those rows will be included in the final result.

RIGHT SEMI JOIN Syntax

As the previous section suggested there is no statement that is logically equivalent to a right semi join, so there is no syntax to discuss here. There are many ways to write a logical right semi join using the different join commands we have discussed before. However, SQL Servers optimizer might decide to not use a Right Semi Join operator for any of those queries.

RIGHT SEMI JOIN Operator

Any join query that fits the logical requirements by returning only rows from one of the two sides, each at most once might end up with a Semi Join operator. This operator can be either a Left Semi Join or a Right Semi Join operator. Even in an EXISTS() query SQL Server might decide that it is better to flip the order of the tables and use a Right Semi Join operator. However, as with the Left Semi Join, there are many possible execution plans for a logical right semi join query that do not use a Right Semi Join operator and SQL Server might pick any of those for a given logical right semi join query.

The execution plan of the address example above actually does not contain a Right Semi Join operator:

Logical Right Semi Join with Left Semi Join Execution Plan

Instead our left dominant SQL Server optimizer again decided to flip the order of the tables on us and use a Left Semi Join operator instead. To get SQL Server to use a Right Semi Join operator we have to "pull a fast one":

[sql] SELECT DISTINCT be.*
FROM Person.BusinessEntityAddress AS bea
JOIN Person.BusinessEntity AS be
ON bea.BusinessEntityID = be.BusinessEntityID
OPTION ( HASH JOIN );
[/sql]

The OPTION(HASH JOIN) is a "hint" that instructs SQL Server to use the hash join algorithm. This hint clearly has nothing to do with semi joins, it just restricts the optimizer's options enough for it to pick the Right Semi Join operator as new winner for this query:

Forced Right Semi Join Execution Plan

I am not going to explain this hint further here. Join hints are topic of day 18 of this series.

Summary

The Right Semi Join and the Left Semi Join are the same logical operation. They just switch the roles of the left side row source and the right side row source. The Right Semi Join returns all rows from the right side that have a match on the left side, but each one of them is returned at most once.

A Join A Day

This post is part of my December 2012 "A Join A Day" blog post series. You can find the table of contents with all posts published so far in the introductory post: A Join A Day – Introduction. Check back there frequently throughout the month.

Categories: A Join A Day, Fundamentals, General, Series

One Response to A Join A Day – The Right Semi Join

  1. Pingback: A Join A Day – The Left Anti Semi Join - sqlity.net | sqlity.net

Leave a Reply