This is the seventeenth post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.
Like the INTERSECT key word, the EXCEPT key word was added to the T-SQL language with the SQL Server 2005 release. It does not look like a join at first, but gets handled by SQL Server like one. Also – like INTERSECT – EXCEPT comes directly from the mathematical set theory (pdf).
Let's look at the same example sets again that we used during the INTERSEC post yesterday: : {A, B, M, T, Y, Z} and {G, K, L, M, O, Y}
The EXCEPT key word represents the difference in set theory. The difference of two sets is defined as the set of the elements that are in the first set but not in the second set. It is denoted with the \ sign:
{A, B, M, T, Y, Z} \ {G, K, L, M, O, Y} = {A, B, T, Z}
The EXCEPT keyword therefore returns all rows from the first query expression that are not part of the second query expression. Let's look at the TblA – TblB example again. The tables look like this:
Let's run the following query:
This query results in only a single row:
The value 1 is the only one that is contained in TblA but not contained in TblB.
This looks exactly like the result we got back from the NOT EXISTS() query in the Left Anti Semi Join post. But again, there is one important difference. To show that difference we need to add two more rows to TblA:
Now the result of above query looks like this:
Now let's recall the NOT EXISTS() query, modified to also only return the TblA_Val column:
The result of this query looks like this:
The difference between EXCEPT and NOT EXISTS() is the same as the difference between INTERSECT and EXISTS(): The EXCEPT keyword causes duplicate rows to be eliminated from the final result. This is intentional as the action is borrowed from set theory and a set cannot contain the same element more than once.
If the column list contains the primary key (or another unique column) from TblA, the result of an EXCEPT query and a NOT EXISTS() query are the same.
If you are following along with the examples use the following statement to clean up the additional rows.
Like INTERSECT, the use of the EXCEPT key word tends to lead to cleaner and easier to maintain queries.
The syntax is similar to the UNION and the INTERSECT statements. You can connect any two queries with the EXCEPT statement. The only restriction is here too that the column count and data types must match. Again, implicit conversions are allowed.
The full EXCEPT syntax you can find here: http://msdn.microsoft.com/en-us/library/ms188055(v=sql.105).aspx
As you might have expected by now, the operator of choice for EXCEPT queries is the Left Anti Semi Join:
A Stream Aggregate operator is added here too, to remove the duplicate rows.
EXCEPT is a T-SQL statement that was introduced in SQL Server 2005. It returns the set-theoretical difference of two query results. Rows from the left are returned as long as they are not part of the right also. Because the result is meant to be a set of rows, duplicate rows are automatically removed.
This post is part of my December 2012 "A Join A Day" blog post series. You can find the table of contents with all posts published so far in the introductory post: A Join A Day – Introduction. Check back there frequently throughout the month.