Performance Archives - sqlity.net

The Deceiving Seek Operator [TSQL Tuesday #043 – Hello, Operator?]

Sebastian Meine — Tue, 11 Jun 2013 14:00:45 +0000

T-SQL Tuesday #43 is hosted by Robert Farley.
This month's topic is "Hello, Operator?".

Introduction

This T-SQL Tuesday is too great a topic to pass by. There are a lot of operators SQL Server uses to compile an execution plan that will return the data your query asked for. Some of them you keep running into regularly, like the Table-Scan Operator. Other ones are rarely seen, like the Assert Operator. (Both of them potentially point to a problem in your query.)

An Execution Plan Operator is actually a C++ object that implements three methods: Open, GetRow and Close. These objects get linked together in a tree structure to build the execution plan. Each operator gets rows from its direct child operator (or operators) by first calling the child's Open() method. Then the GetRow() method is called until no more rows are available or needed. Finally, the Close() method cleans up, for example by releasing memory that was required by the child operator. Because every operator implements this same simple interface, SQL Server can place operators anywhere and in any order in the execution plan: every operator can talk to every other operator as its child. That makes the whole system very flexible.
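To make that interface a little more concrete, here is a minimal Python sketch of the Open/GetRow/Close pattern. It is a toy model and not SQL Server's C++ implementation: the classes Scan, Filter and Top and the run helper are names made up for this illustration.

```python
class Scan:
    """Leaf operator: hands out the rows of an in-memory list one at a time."""
    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self.pos = 0
        self.rows_read = 0

    def get_row(self):
        if self.pos >= len(self.rows):
            return None                  # no more rows available
        row = self.rows[self.pos]
        self.pos += 1
        self.rows_read += 1
        return row

    def close(self):
        self.pos = None


class Filter:
    """Passes on only the child rows that satisfy the predicate."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_row(self):
        while True:
            row = self.child.get_row()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()


class Top:
    """Stops asking its child for rows once n rows have been returned."""
    def __init__(self, child, n):
        self.child = child
        self.n = n

    def open(self):
        self.child.open()
        self.returned = 0

    def get_row(self):
        if self.returned >= self.n:
            return None                  # no more rows needed
        row = self.child.get_row()
        if row is not None:
            self.returned += 1
        return row

    def close(self):
        self.child.close()


def run(plan):
    """Drive the root operator: Open, GetRow until exhausted, Close."""
    plan.open()
    result = []
    row = plan.get_row()
    while row is not None:
        result.append(row)
        row = plan.get_row()
    plan.close()
    return result


scan = Scan(list(range(1, 11)))
plan = Top(Filter(scan, lambda r: r % 2 == 0), 3)
print(run(plan), scan.rows_read)  # [2, 4, 6] 6
```

Because all three classes expose the same three methods, they can be stacked in any order. The Top operator also shows the "or needed" part: the scan is asked for only six of its ten rows.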

Today, rather than talking about how execution plans are constructed in general, I am going to take a look at one operator that is often thought to be the Holy Grail of query optimization: the Seek Operator. The common belief is that if your query gets its data with a Scan Operator you are in a bad place, while if it gets the data using a Seek Operator it is going to be blazing fast.

As with most things in SQL Server, this one is not actually that straightforward. For example, if the index seek has to be followed by a key lookup, things quickly look very different, at least if you are returning more than just a few rows. (Do a search on SQL Server Tipping Point for more info on this.)

What I am going to show you is that the Seek can actually be quite deceiving by executing a full Index Scan under the covers. But before we go there, let's do a quick refresher on Seeks and Scans.

Seek and Scan Refresher

For this exercise I am going to create a large table first:

[sql] CREATE TABLE dbo.tst(
    id INT IDENTITY(1,1),
    -- wide key column, so only a few index rows fit on each index page
    key_fill CHAR(792) DEFAULT REPLICATE('key_fill',99),
    d1 INT DEFAULT CHECKSUM(NEWID()),
    -- padding, so that a single row fills a data page
    page_fill CHAR(7200) DEFAULT REPLICATE('PageFill',900),
    CONSTRAINT [PK:dbo.tst] PRIMARY KEY CLUSTERED (id, key_fill)
);
GO
-- The ON clause is never true, so every source row inserts one row of default values.
MERGE dbo.tst t
USING(
    SELECT TOP(100000) 1 X
    FROM sys.system_internals_partition_columns A,
         sys.system_internals_partition_columns B,
         sys.system_internals_partition_columns C,
         sys.system_internals_partition_columns D
)X
ON 1=0
WHEN NOT MATCHED THEN
    INSERT DEFAULT VALUES;
[/sql]

The code above creates an admittedly slightly crazy table dbo.tst and inserts 100,000 rows into it. The table is designed to use as many data pages as possible (actually one page per row) and as many supporting index pages as possible. If you execute this code you will end up with a table that is about one GB in size, has 100,000 data pages and about another 25,000 supporting clustered index pages making the clustered index B+Tree eight levels deep. You can run this query to confirm that:

[sql] SELECT QUOTENAME(OBJECT_SCHEMA_NAME(i.object_id)) + '.'
+ QUOTENAME(OBJECT_NAME(i.object_id)) table_name,
i.name index_name,
ips.index_type_desc,
p.partition_number,
au.type_desc,
au.total_pages,
au.used_pages,
au.data_pages,
p.rows,
ips.index_depth
FROM sys.allocation_units au
JOIN sys.partitions p
ON au.container_id = p.partition_id
JOIN sys.indexes i
ON p.object_id = i.object_id
AND p.index_id = i.index_id
JOIN sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.tst'), NULL,
NULL, NULL) ips
ON p.object_id = ips.object_id
AND p.index_id = ips.index_id
AND p.partition_number = ips.partition_number;
[/sql]

The result should look like this:

The Scan

Now let's run a very simple query against that table:

[sql] SELECT * FROM dbo.tst;
[/sql]

This query is going to use a Scan Operator:

It makes sense for this query to use a clustered index scan, as we are asking it to return the entire table with all rows and all columns. If you look at the above screenshot you can tell that this query ran for about 87 seconds and read 120,006 pages. That is all 100,000 data pages and 20,006 of the roughly 25,000 supporting index pages. That is a little bit surprising, as SQL Server knows what the first data page is and all data pages are connected with each other in a doubly linked list. So you would expect SQL Server to read only the 100,000 data pages during a clustered index scan. The reason for that over-read is, however, beyond the scope of this post. What is important here is that a scan reads almost all of the pages in an index.

The Seek

To get a seek query we are just going to add a simple WHERE clause to the query:

[sql] SELECT * FROM dbo.tst WHERE id = 42;
[/sql]

As expected, this query is using a Seek Operator:

This query finishes in about 2 milliseconds and executes 11 reads. That is one read for each of the 8 index levels and then 3 more, because SQL Server likes to do additional work when dealing with clustered indexes as we have seen before. As you can tell, an index seek is significantly faster because it has to do a lot less work to get to the data.

Now, the above two queries cannot really be compared with each other as the one is returning the entire table whereas the other one returns only a single row. But if you were executing a query that is also returning a single row but can't make use of the index, SQL Server would have to execute the same expensive full clustered index scan to find that row.

The Scanning Seek

So far everything was pretty straightforward and probably no surprise to you. Now let's look at this query:

[sql] SELECT * FROM dbo.tst WHERE id > 0;
[/sql]

Like the first query we looked at, this query is going to return all 100,000 rows in the table: the id column is defined as an IDENTITY(1,1) column, so all id values are going to be larger than 0. Therefore we would expect to see a Clustered Index Scan Operator in the execution plan.
However, when you look at the execution plan, there is a Clustered Index Seek Operator grinning back at you:

If you look at the execution time and the number of reads SQL Server executed you will find that it took about as long as the clustered index scan we looked at before. SQL Server also used the exact same number of logical reads: 120,006. So, everything is screaming that this was in fact a full clustered index scan operation. Only the plan seems to tell us otherwise.

Deception

The reason for the behavior you saw above is that the operator is only telling us how SQL Server is going to find the first row of the result. After that first row is located, SQL Server might scan in either direction of the index to retrieve further rows. This scan can happen, as we have seen, under either data access operator.
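The "seek to the first row, then scan" behavior can be imitated with a few lines of Python. This is only a toy model under stated assumptions: a sorted list stands in for the leaf level of the clustered index, a binary search stands in for the seek, and the names leaf_level and seek_then_scan are made up for this illustration.

```python
from bisect import bisect_left, bisect_right

# Stand-in for the leaf level of the clustered index: ids in key order.
leaf_level = list(range(1, 100001))


def seek_then_scan(keys, start_pos, still_qualifies):
    """Start at the position the seek found, then scan forward row by row."""
    rows = []
    pos = start_pos
    while pos < len(keys) and still_qualifies(keys[pos]):
        rows.append(keys[pos])
        pos += 1
    return rows


# WHERE id = 42: seek to the first row with id >= 42, stop once id <> 42.
equal_rows = seek_then_scan(leaf_level, bisect_left(leaf_level, 42),
                            lambda k: k == 42)

# WHERE id > 0: seek to the first row with id > 0, then scan to the end.
range_rows = seek_then_scan(leaf_level, bisect_right(leaf_level, 0),
                            lambda k: True)

print(len(equal_rows), len(range_rows))  # 1 100000
```

Both calls begin with the same cheap lookup. The difference in cost comes entirely from how far the scan that follows has to run.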

So the takeaway for you is that a Seek Operator in an execution plan does not necessarily mean that everything is in great shape. You need to look deeper to see what SQL Server is actually doing with that operator.

There are a few more examples where the Seek Operator is doing more work than the execution plan is telling you at first glance, and I might write a follow-up article later to show more of them. Until then, be alert and don't trust that Deceiving Seek Operator.

The post The Deceiving Seek Operator [TSQL Tuesday #043 – Hello, Operator?] appeared first on sqlity.net.

Merge Wonders – Insert or Use

Sebastian Meine — Tue, 05 Feb 2013 15:00:57 +0000

Introduction

When working with clients I often see code that follows the insert-or-use pattern. Let's assume we have a dbo.Product table:

[sql] CREATE TABLE dbo.Product
(
Id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
Name NVARCHAR(50) NOT NULL,
ProductNumber NVARCHAR(25)
);
[/sql]

Often the pattern is implemented like this:

[sql] CREATE PROCEDURE dbo.GetOrCreateProduct
    @ProductName NVARCHAR(50),
    @ProductNumber NVARCHAR(25),
    @Id INT OUTPUT
AS
BEGIN
    -- First access: check whether the product exists.
    IF EXISTS(SELECT 1 FROM dbo.Product WHERE ProductNumber = @ProductNumber)
    BEGIN
        -- Second access: fetch the Id of the existing row.
        SELECT @Id = Id FROM dbo.Product WHERE ProductNumber = @ProductNumber;
    END
    ELSE
    BEGIN
        INSERT INTO dbo.Product(Name, ProductNumber)VALUES(@ProductName, @ProductNumber);
        SELECT @Id = @@IDENTITY;
    END;
END;
[/sql]

There are several issues with this implementation. To start, the table is accessed twice in all cases. If there is no index on ProductNumber, that will cause disastrous performance, but even with an index we should avoid double work whenever possible. Even more important, in the time between the first and the second access, someone else could have inserted a row already. Depending on whether there is a unique constraint on ProductNumber, this will then either cause a duplicate row to be created or it will cause this code to fail. Also, there is a whole slew of issues with @@IDENTITY itself that I am not going to cover today. Refer back to my Identity Crisis post for more information.

The Merge Statement to the rescue

There are several options to deal with the problems described above. One of the more elegant ones is the use of the MERGE statement.

The MERGE statement, which was introduced in SQL Server 2008, is intended to be used in an INSERT or UPDATE situation. So how does it help in this situation, where no UPDATE is involved?

Let's address the issues of the solution above one at a time to see how the MERGE statement can help. Because of the issues described in the Identity Crisis post, it is recommended to use the output clause to get the value of the identity column of a newly inserted row like this:

[sql] CREATE PROCEDURE dbo.GetOrCreateProduct
@ProductName NVARCHAR(50),
@ProductNumber NVARCHAR(25),
@Id INT OUTPUT
AS
BEGIN
DECLARE @Ids TABLE(Id INT);

INSERT INTO dbo.Product(Name, ProductNumber)
OUTPUT(INSERTED.IDENTITYCOL)INTO @Ids
VALUES(@ProductName, @ProductNumber);

SELECT @Id = Id FROM @Ids;
END;
[/sql]

The double step of storing the identity value in a table variable and then reading it back out to use it is necessary because SQL Server's OUTPUT clause cannot directly be used to set the value of a variable.

This takes care of the insert portion, but we don't want to insert a new row if one exists already. That is where MERGE gets introduced:

[sql] CREATE PROCEDURE dbo.GetOrCreateProduct
@ProductName NVARCHAR(50),
@ProductNumber NVARCHAR(25),
@Id INT OUTPUT
AS
BEGIN
DECLARE @Ids TABLE(Id INT);

MERGE dbo.Product AS p
USING (VALUES(@ProductName, @ProductNumber))n(Name,ProductNumber)
ON p.ProductNumber = n.ProductNumber
WHEN NOT MATCHED THEN
INSERT(Name, ProductNumber)
VALUES(n.Name, n.ProductNumber)
OUTPUT(INSERTED.IDENTITYCOL)INTO @Ids
;

SELECT @Id = Id FROM @Ids;
END;
[/sql]

Now a new row is only inserted if no matching row exists. If a matching row exists, the MERGE statement does nothing. That however means that in that case the OUTPUT clause does not return any rows, so we won't be able to retrieve the correct Id. To get the OUTPUT clause to return the required information, we just need to add an UPDATE into the mix. If you have a column that you would like to be updated on every access (like a column with the time of the last access), this is easily done. But what can we do if we do not actually want to change a column value? We can just assign a column to itself as in SET p.ProductNumber = p.ProductNumber. SQL Server is clever enough to not actually execute the write if no changes happened. But this is a little dangerous, because the next person looking at this code might think that this was an oversight and just take out the UPDATE that clearly isn't doing anything. So I like to be a little more obvious about this and instead use the ability of the UPDATE statement to set variables like this:

[sql] CREATE PROCEDURE dbo.GetOrCreateProduct
@ProductName NVARCHAR(50),
@ProductNumber NVARCHAR(25),
@Id INT OUTPUT
AS
BEGIN
DECLARE @Ids TABLE(Id INT);
DECLARE @dummy INT;

MERGE dbo.Product AS p
USING (VALUES(@ProductName, @ProductNumber))n(Name,ProductNumber)
ON p.ProductNumber = n.ProductNumber
WHEN MATCHED THEN
UPDATE SET @dummy = @dummy
WHEN NOT MATCHED THEN
INSERT(Name, ProductNumber)
VALUES(n.Name, n.ProductNumber)
OUTPUT(INSERTED.IDENTITYCOL)INTO @Ids
;

SELECT @Id = Id FROM @Ids;
END;
[/sql]

While it is still a little obfuscated what is happening here, at least it should be clear that the line of code with the UPDATE was not an accident. You could even be more verbose and write something like this: UPDATE SET @this_is_a_required_update = 1

Danger, Concurrency!

The biggest problem with the original code is that it did not prevent concurrent inserts of the same value. If two connections at the same time execute the procedure for the same product, they both will not find the product at about the same time and then insert the product also at the same time. If there is a unique constraint on the column this will cause unexpected errors. If there is no unique constraint this will cause row duplicates, which usually is even worse.
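The interleaving described above can be replayed deterministically in a small Python model. This is only an illustration of the timing problem and says nothing about how SQL Server implements locking: the table is a plain list, and the helpers find_id, insert, interleaved_sessions and serialized_sessions as well as the product number "P-1" are made up for this sketch.

```python
def find_id(table, product_number):
    """Return the Id of the matching row, or None if there is none."""
    for row in table:
        if row["ProductNumber"] == product_number:
            return row["Id"]
    return None


def insert(table, name, product_number):
    """Append a row with the next Id and return that Id."""
    table.append({"Id": len(table) + 1, "Name": name,
                  "ProductNumber": product_number})
    return table[-1]["Id"]


def interleaved_sessions():
    """Both sessions check for the row before either of them inserts."""
    table = []                           # no unique constraint in this model
    a_found = find_id(table, "P-1")      # session A: not found
    b_found = find_id(table, "P-1")      # session B: not found either
    if a_found is None:
        insert(table, "Widget", "P-1")
    if b_found is None:
        insert(table, "Widget", "P-1")
    return len(table)


def serialized_sessions():
    """Each session finishes its check and insert before the next one starts."""
    table = []
    for _session in ("A", "B"):
        if find_id(table, "P-1") is None:
            insert(table, "Widget", "P-1")
    return len(table)


print(interleaved_sessions(), serialized_sessions())  # 2 1
```

The only difference between the two functions is the order of the steps. The duplicate row appears as soon as the second check runs before the first insert.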

The MERGE statement was written for cases where a row might be inserted if it does not yet exist. So you would expect that it could handle this case automatically. However, under the covers MERGE is just a lookup potentially followed by an insert. This is probably due to the MERGE statement being targeted at whole table synchronizations. The name itself hints at that. In that case high frequency calls with the same values are highly unlikely and any measures to prevent the side-effects of such will unnecessarily take additional resources. That means that in our case we need to take one additional step:

[sql] CREATE PROCEDURE dbo.GetOrCreateProduct
@ProductName NVARCHAR(50),
@ProductNumber NVARCHAR(25),
@Id INT OUTPUT
AS
BEGIN
DECLARE @Ids TABLE(Id INT);
DECLARE @dummy INT;

MERGE dbo.Product WITH(HOLDLOCK) AS p
USING (VALUES(@ProductName, @ProductNumber))n(Name,ProductNumber)
ON p.ProductNumber = n.ProductNumber
WHEN MATCHED THEN
UPDATE SET @dummy = @dummy
WHEN NOT MATCHED THEN
INSERT(Name, ProductNumber)
VALUES(n.Name, n.ProductNumber)
OUTPUT(INSERTED.IDENTITYCOL)INTO @Ids
;

SELECT @Id = Id FROM @Ids;
END;
[/sql]

The only difference to the previous version is the WITH(HOLDLOCK) table hint right after the table name. The HOLDLOCK table hint causes SQL Server to access that one table as if the current isolation level was set to serializable. That does prevent any insert collisions effectively. However, HOLDLOCK makes use of range locks. Range locks require the search argument to be indexed, otherwise SQL Server will effectively have to take a table lock, as there is nothing of which a range could be taken. If you set the transaction isolation level to serializable and run a query that requires a table scan, SQL Server will actually take a table lock instead of many range locks. However, this is not the case when using the MERGE statement together with the HOLDLOCK table hint. SQL Server will instead take a range lock on every row. That has the same effect as a table lock but at a much higher cost. So it is imperative that you have appropriate indexes in place when using this solution. But that index is one that you should have anyway if you are executing this query often.

Summary

While not directly designed for this use case, the SQL Server MERGE statement gives us an easy and – if used correctly – safe way to deal with an Insert or Use scenario. In this scenario a row's primary key needs to be retrieved if the row exists, otherwise the row needs to be created and the newly generated identity value needs to be returned.

While there are many ways to solve this problem and to circumvent all the pitfalls associated with it, MERGE offers one of the most elegant solutions.

A Join A Day – The Hash Join

Sebastian Meine — Sun, 23 Dec 2012 15:00:08 +0000

Introduction

This is the twenty-third post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

The Hash Join algorithm is a good choice if the tables are large and there is no usable index. Like the Sort Merge Join algorithm, it is a two-step process. The first step is to create an in-memory hash index on the left side input. This step is called the build phase. The second step is to go through the right side input one row at a time and find the matches using the index created in step one. This step is called the probe phase.

The Hash Join Algorithm

Usually a hash join execution plan contains only one icon for the entire process of building the hash table and using it to do the actual join. However, in SQL Server 2012 you might in some special cases see a separate icon for the build phase:

The first step in the Hash Join algorithm is always to create a hash index (or hash table) for the left side input. A hash index gets created by distributing the rows into several buckets. Each bucket is a memory area with a distinct address that can hold a group of rows. All the columns from the left side that are needed later in the query are included and stored in those buckets. Therefore the memory requirement is proportional to the size of the left side input.

The address of the bucket to store a given row is calculated using a hash function over the columns that are part of the join condition. A hash function is a deterministic function that transforms an input of one or more columns into a single number. Every time the same input is used, the same number will result. However, it is possible and even likely that two different inputs will yield the same number.

In an equi-join all columns of the join condition are included in the calculation of the hash value. In a nonequi-join condition only the columns that use an equality comparison are included in that calculation. That means that the Hash Join algorithm does not work if there is not at least one equality column included in the join condition.

Once all rows of the first input are distributed amongst the buckets, SQL Server loops through the rows of the second input one at a time. For each row it takes the equality columns of the join condition and calculates the hash value using the same hash function that was used in the creation of the hash table. Any matching row must be in the bucket with the same address. However, because the hash function might result in the same value for differing inputs, we can't tell at this point which of the rows in that bucket – if any – join to the current row. The only way to find out is to loop through all of them, basically executing a nested loop. So the fewer rows there are in each bucket, the more efficiently this algorithm works. A good hash function causes the input data to be evenly distributed amongst all available buckets.

The Hash Join algorithm is able to handle any of the logical join types. If the join is a Left Outer Join, a Full Outer Join or a Left Anti Semi Join, a marker is added to each row in the hash index to keep track of rows that had a match. At the end of the join process a single scan of the hash index will return all unmatched rows.
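The build and probe phases, including the match marker used for outer joins, can be sketched in Python as follows. This is a simplified in-memory model and not SQL Server's implementation: the function name hash_join, the fixed bucket_count of 8 and the use of Python's built-in hash() are assumptions made for this illustration, and spilling is not modeled at all.

```python
def hash_join(build_rows, probe_rows, key, bucket_count=8, left_outer=False):
    # Build phase: distribute the left (build) input into buckets.
    # Each entry is [row, matched], where matched is the outer join marker.
    buckets = [[] for _ in range(bucket_count)]
    for row in build_rows:
        buckets[hash(key(row)) % bucket_count].append([row, False])

    # Probe phase: for each right (probe) row, look only at one bucket.
    result = []
    for probe in probe_rows:
        bucket = buckets[hash(key(probe)) % bucket_count]
        for entry in bucket:
            # Different keys can share a bucket, so compare the real values.
            if key(entry[0]) == key(probe):
                entry[1] = True
                result.append((entry[0], probe))

    # Left outer join: one final pass returns the build rows without a match.
    if left_outer:
        for bucket in buckets:
            for row, matched in bucket:
                if not matched:
                    result.append((row, None))
    return result


build_rows = [(1, "a"), (9, "b"), (3, "c")]
probe_rows = [(9, "x"), (1, "y"), (4, "z"), (17, "w")]

inner = hash_join(build_rows, probe_rows, key=lambda r: r[0])
outer = hash_join(build_rows, probe_rows, key=lambda r: r[0], left_outer=True)
print(len(inner), len(outer))  # 2 3
```

In this example the keys 1, 9 and 17 all land in the same bucket, so the probe for 17 has to compare against two rows before concluding that there is no match. That is the small nested loop inside a bucket that the text describes.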

In the beginning of this article I stated that the hash index is an in-memory construct. So what happens if there is not enough memory? In that case SQL Server uses a complicated algorithm paging sections of the hash table in and out of memory. See http://msdn.microsoft.com/en-us/library/aa178403(v=sql.80).aspx for a little more detail on this. This paging of data between memory and disk can be quite a drain on the available resources. The hash table components that need to be moved out of memory are stored in tempdb. This process is called "spilling".

Let's look at an example. We are going to reuse the same two tables we have used for the Nested Loop Join and the Merge Join. Here is what they look like again:

Let's run the following query, this time with a hash join hint:

[sql] SET STATISTICS IO ON;
GO
SELECT *
FROM dbo.Tbl100 A
INNER HASH JOIN dbo.Tbl10 B
ON A.Val = B.Val;
GO
SET STATISTICS IO OFF;
[/sql]

It will produce these statistics:

[sourcecode] Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl10'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 94, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

Again, like with the Merge Join algorithm, both tables get scanned completely once each. There is also again a Worktable mentioned that was not read from. The worktable has been created to provide a place to store the pieces of the hash index that had to be paged out due to memory constraints. It has 0 logical reads in this example which means its use was not necessary.

Hash Join Strengths and Weaknesses

Because the Hash Join, like the Sort Merge Join, has a setup phase, there is a setup cost involved. This setup cost is due to the creation of the in-memory hash table. While that process is not complicated, it takes time and a lot of memory. Below are the characteristics of the Hash Join algorithm.

+ All logical join types can be handled.
- There is a substantial setup cost.
- It does require a sizable amount of memory, big enough to fit the entire left side input. (Spilling to tempdb is possible)
- During the entire time it takes to create the hash index, no row will be returned.
- At least one column in the join condition needs to use an equality comparison.

Overall, this is the most resource-intensive join algorithm because of the expensive build phase. However, once the hash table is built, this join algorithm can be quite fast. For large tables with no usable index, the time savings during the probe phase will more than outweigh the additional cost of the build phase. Keep in mind, however, that because of the large memory requirements only very few of these can run at the same time.

Summary

A Join A Day

This post is part of my December 2012 "A Join A Day" blog post series. You can find the table of contents with all posts published so far in the introductory post: A Join A Day – Introduction. Check back there frequently throughout the month.

A Join A Day – The Sort Merge Join

Sebastian Meine — Sat, 22 Dec 2012 15:00:29 +0000

Introduction

This is the twenty-second post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

The Sort Merge Join algorithm is the fastest of them all. However, there is a caveat. The algorithm is actually a two-step process. The first step is to sort both inputs in the same order. The second step is the Merge Join step. Here the rows from both inputs get matched together. The Merge part is the blazingly fast part. So if both inputs are sorted already because of an index or because of another sort requirement in the same query, the Sort Merge Join algorithm is the first choice. But if the inputs are not sorted, it rarely makes sense for SQL Server to first sort them.

The Sort Merge Join Algorithm

SQL Server actually implements only half of this algorithm directly. For the actual join the Merge Join operator is used. However, that operator cannot sort the inputs itself. It instead requires the inputs to be presorted. The sort can happen either by using dedicated sort operators in the execution plan or by utilizing existing indexes. The optimizer makes sure to only use the Merge Join operator if the inputs are indeed sorted.

The Merge Join algorithm works like this: The first row of each input is read. This primes the algorithm. After that a loop is executed:

1. Check if the rows match. If they do, produce an output row.
2. If the value in input A is smaller than the value in input B, attempt to read the next row from input A and skip the next step.
3. Attempt to read the next row from input B.
4. If the read attempt was successful, loop.

Again, this is a little simplified. This version requires the values in the join column(s) to be unique. If that is not the case, special handling of multi-row matches has to be introduced, but even then the process is still very similar to the above.
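The four steps above translate almost line by line into Python. The sketch below is the simplified version for unique join values only; the function name merge_join and the sample rows are made up for this illustration, and both inputs are assumed to be sorted by the join key already.

```python
def merge_join(input_a, input_b, key):
    """Merge join for two inputs that are sorted and unique on the join key."""
    iter_a, iter_b = iter(input_a), iter(input_b)

    # Prime the algorithm: read the first row of each input.
    row_a = next(iter_a, None)
    row_b = next(iter_b, None)

    result = []
    # Loop as long as the last read attempt on each side was successful.
    while row_a is not None and row_b is not None:
        # Step 1: check if the rows match.
        if key(row_a) == key(row_b):
            result.append((row_a, row_b))
        # Step 2: A is behind, so advance A and skip step 3.
        if key(row_a) < key(row_b):
            row_a = next(iter_a, None)
        # Step 3: otherwise advance B.
        else:
            row_b = next(iter_b, None)
    return result


input_a = [(1, "a1"), (3, "a3"), (5, "a5"), (7, "a7")]
input_b = [(3, "b3"), (4, "b4"), (7, "b7"), (8, "b8")]

print(merge_join(input_a, input_b, key=lambda r: r[0]))
# [((3, 'a3'), (3, 'b3')), ((7, 'a7'), (7, 'b7'))]
```

Each input is consumed through a single forward-only iterator, which is exactly why no row is ever read twice.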

The following GIF animation shows the process step by step:

This algorithm is so powerful because SQL Server has to read and step through both inputs only once – in lockstep.

Let's look at the numbers by reusing the tables we created in yesterday's post about the Nested Loops Join algorithm. Here are the page and row counts for both tables again:

We are going to use the same query again, just replacing the LOOP hint with a MERGE hint:

[sql] SET STATISTICS IO ON;
GO
SELECT *
FROM dbo.Tbl100 A
INNER MERGE JOIN dbo.Tbl10 B
ON A.Val = B.Val;
GO
SET STATISTICS IO OFF;
[/sql]

The execution plan for this query shows the required two Sort operators, one for each input to the Merge Join operator:

The tool tip for the two tables shows that each one got scanned once as we expected. Looking at the SET STATISTICS IO ON; output however unveils a surprise:

[sourcecode] Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl10'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

The two tables' read counts show that they each got scanned only once, as we expected. But what is the story with that Worktable?

If SQL Server does not know for sure that at least one of the two tables is unique in the join column(s), it has to prepare for the possibility of a many-to-many relationship. Because the merge algorithm is designed to touch every record in each table only once, it cannot handle a many-to-many relationship. To get around that, SQL Server stores all rows of the second input that had a match in a worktable in tempdb until they are not needed anymore. A record from the second input does not need to be stored any longer once a different value on the first input has been read. Every time the same value on the first input is repeated, SQL Server "rewinds" the stream by using this work table.

Because the data in our two example tables is actually unique, there was no data read from the work table, hence the 0 reads in the statistics. But keep this in mind when you have a query that is using a merge join. For each section of identical values, SQL Server actually executes a loop join, first writing and then multiple times reading from the additional storage in tempdb. If you have large islands of identical values in your data this can potentially be very expensive.
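The rewind behavior can be modeled in Python with a small buffer that plays the role of the work table. This is a rough sketch of the idea and certainly not how SQL Server implements it internally: the function merge_join_many_to_many, the worktable list and the worktable_reads counter are all names invented for this illustration.

```python
def merge_join_many_to_many(input_a, input_b, key):
    """Merge join on sorted inputs that may contain duplicate join values."""
    iter_a, iter_b = iter(input_a), iter(input_b)
    row_a = next(iter_a, None)
    row_b = next(iter_b, None)

    result = []
    worktable = []        # matched rows of input B for the current value
    worktable_key = None
    worktable_reads = 0   # counts rows that had to be reread from the buffer

    while row_a is not None:
        current = key(row_a)
        if worktable and worktable_key == current:
            # The same value repeats on input A: rewind via the work table.
            for saved in worktable:
                result.append((row_a, saved))
                worktable_reads += 1
        else:
            # A different value was read on A: the old rows are not needed.
            worktable = []
            worktable_key = current
            while row_b is not None and key(row_b) < current:
                row_b = next(iter_b, None)
            while row_b is not None and key(row_b) == current:
                result.append((row_a, row_b))
                worktable.append(row_b)      # keep for a possible rewind
                row_b = next(iter_b, None)
        row_a = next(iter_a, None)
    return result, worktable_reads


input_a = [(1, "a1"), (2, "a2"), (2, "a3"), (3, "a4")]
input_b = [(2, "b1"), (2, "b2"), (3, "b3"), (4, "b4")]

rows, reads = merge_join_many_to_many(input_a, input_b, key=lambda r: r[0])
print(len(rows), reads)  # 5 2
```

With unique values on input A the rewind branch is never taken and worktable_reads stays at 0, which mirrors the 0 logical reads on the Worktable in the statistics output above.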

The Merge Join operator has a many-to-many property that you can see in the execution plan tooltip window. If this property is set to "true" it means that SQL Server is expecting duplicate rows on both inputs.

Sort Merge Join Strengths and Weaknesses

As said before, the Sort Merge Join has two parts. The first part is the Sort which can be quite expensive. However, if the data is sorted already, the second part which is handled by the Merge Join operator has the following properties:

+ There is no setup cost.
+ It does not require memory.
+ The first rows are returned immediately, as soon as a match is found.
- At least one column in the join condition needs to use an equality comparison.
- If there is a many to many relationship in the data, a tempdb work table is created and used to provide a rewind-ability of the data potentially hurting performance severely.

The equality comparison is required to be able to sort both inputs the same way. However, there is a special case. SQL Server can use the Merge Join operator when a nonequi full outer join is required. In that case the entire first input is copied into the work table. Then a nested loops algorithm is executed between the second input and the work table. Each row in the worktable that had a match is marked. In the end an additional scan of the worktable returns all non-matched rows. This is a very expensive process hidden under a seemingly fast operator. Be aware of this.

Summary

The Merge Join algorithm is certainly the most efficient algorithm available to SQL Server. However, it requires both inputs to be sorted the same way. The cost of sorting is usually too high to make the use of this algorithm worthwhile, unless the data is sorted already or has to be sorted anyway, for example because of an ORDER BY clause in the query.

You also need to be aware of the fact that even the potential of a many to many relationship in the data causes a worktable to be created and filled. If there are actually any duplicate values, the data in the worktable is reread as often as necessary which can be detrimental to the performance of the query.

A Join A Day

A Join A Day – The Nested Loops Join

Sebastian Meine — Fri, 21 Dec 2012 15:00:02 +0000

Introduction

This is the twenty-first post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

Over the next few days we are going to look at the join algorithms in a little more detail. There are three join algorithms that are commonly implemented by modern database management systems. They are: Nested Loops Join, Hash Join and Sort Merge Join. Unsurprisingly, SQL Server also implements those three algorithms.

Today's topic is the Loop Join – probably the simplest of them all.

The Nested Loops Join

The Nested Loop Algorithm gets its name from the fact that it literally executes two nested loops. The outer loop steps through the rows of the left side. For each row the inner loop then steps through all rows of the right side to find all matches. That means the entire right side input is accessed as many times as there are rows on the left. To demonstrate that, let's create these two tables:

[sql] IF OBJECT_ID('dbo.Tbl10') IS NOT NULL DROP TABLE dbo.Tbl10;
CREATE TABLE dbo.Tbl10(
Id INT IDENTITY(1,1),
Val INT,
Fill CHAR(7000) NOT NULL DEFAULT REPLICATE('Fill',1750)
);

IF OBJECT_ID('dbo.Tbl100') IS NOT NULL DROP TABLE dbo.Tbl100;
CREATE TABLE dbo.Tbl100(
Id INT IDENTITY(1,1),
Val INT,
Fill CHAR(7000) NOT NULL DEFAULT REPLICATE('Fill',1750)
);

INSERT INTO dbo.Tbl10(Val)
SELECT TOP(10) 1+ROW_NUMBER()OVER(ORDER BY (SELECT 1))%100
FROM sys.all_columns A, sys.all_columns B, sys.all_columns C;

SELECT index_type_desc, alloc_unit_type_desc, index_depth, page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('dbo.Tbl10'),NULL,NULL,'SAMPLED');

INSERT INTO dbo.Tbl100(Val)
SELECT TOP(100) ROW_NUMBER()OVER(ORDER BY (SELECT 1))
FROM sys.all_columns A, sys.all_columns B, sys.all_columns C;

SELECT index_type_desc, alloc_unit_type_desc, index_depth, page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('dbo.Tbl100'),NULL,NULL,'SAMPLED');
[/sql]

After executing this, the dbo.Tbl100 table contains 100 records. The dbo.Tbl10 table contains 10 records. Both tables are designed so that each record takes up an entire storage page of 8192 bytes. The two selects against sys.dm_db_index_physical_stats show the number of pages used by and the number of records stored in each table:

The query also shows that neither table has a clustered index. Now let's run the following query:

[sql] SET STATISTICS IO ON;
GO
SELECT *
FROM dbo.Tbl100 A
INNER LOOP JOIN dbo.Tbl10 B
ON A.Val = B.Val;
GO
SET STATISTICS IO OFF;
[/sql]

The query returns ten records. Because we used the LOOP hint, the Nested Loops Join algorithm was used:

You can see in the tooltip that the table scan of dbo.Tbl10 was executed 100 times, once for each row in dbo.Tbl100. Looking at the output of SET STATISTICS IO ON; confirms that:

[sourcecode] Table 'Tbl10'. Scan count 1, logical reads 1000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

The query executed 1,000 reads against the 10 page table, so the entire table was read 100 times.

This sounds pretty bad. So why would you ever want to use the Nested Loops Join algorithm? There are quite a few advantages. See the Strengths and Weaknesses section at the end of this article for a detailed list.

The biggest advantage is probably that there is no setup work required. If the tables are small enough, the Nested Loops Join can finish processing before the other algorithms would have even started returning results.

You also need to keep in mind that the above example was the worst case scenario of joining two heaps with no usable indexes. So, let's see what happens if we add a clustered index:

[sql] CREATE UNIQUE CLUSTERED INDEX [dbo.Tbl10(Val, Id) CL] ON dbo.Tbl10(Val, Id);
[/sql]

Now sys.dm_db_index_physical_stats shows an index depth of two:

The index depth is what determines how many reads are required to do a seek in that index. If we rerun the above join query, the execution plan will, as expected, contain a seek of dbo.Tbl10:

This execution plan now is not quite a nested loop anymore. SQL Server still loops through the left input one row at a time, but it no longer loops through the second input. Instead it executes a direct seek to read just the row requested. However, the seek operation is still executed 100 times. With an index depth of two, each seek reads two pages, so we can expect about 200 reads to happen against that table. Let's check the SET STATISTICS IO ON; output:

[sourcecode] Table 'Tbl10'. Scan count 100, logical reads 227, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tbl100'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[/sourcecode]

It shows 227 reads. There are just a few reads more than we expected. Those additional reads happen because of additional meta-data pages like the IAM page that need to be accessed.
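
The expected figure can also be derived from the metadata instead of by hand. The following is a minimal sketch, assuming the clustered index on dbo.Tbl10 from above still exists (index_id 1) and that every row of dbo.Tbl100 triggers exactly one seek; the column aliases are made up for illustration:

[sql] -- Estimate the seek reads as: index depth * number of outer rows.
SELECT ips.index_depth,
(SELECT COUNT(*) FROM dbo.Tbl100) AS outer_rows,
ips.index_depth * (SELECT COUNT(*) FROM dbo.Tbl100) AS expected_seek_reads
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Tbl10'), 1, NULL, 'SAMPLED') AS ips
WHERE ips.alloc_unit_type_desc = 'IN_ROW_DATA';
[/sql]

With the values reported in this article, a depth of two and 100 outer rows, the estimate works out to 200 reads.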

With an appropriate index the Nested Loops Join algorithm is not that bad anymore. While the savings in this case weren't great, you need to keep in mind that the index depth grows very slowly as new rows are added to the table. Even in this badly designed example table you can store several million rows before a seek takes more than four reads.

This example showed another advantage of the Nested Loops Join operator: It can make use of an index on the right side input. With the exception of a few edge cases, the other two algorithms cannot utilize such an index.

If you are following along with the examples, use this statement to remove the index when you are done:

[sql] DROP INDEX dbo.Tbl10.[dbo.Tbl10(Val, Id) CL];
[/sql]

Loop Join Strengths and Weaknesses

The Loop Join operator has the following strengths and weaknesses:

+ There is no setup cost.
+ It does not require memory.
+ The first rows are returned immediately, as soon as a match is found.
+ It is the only algorithm that can handle a plain nonequi-join situation.
+ A matching index on the right input can be utilized, potentially saving a lot of reads.
- If there is no usable index available the right input has to be read completely for each row of the left input – making this algorithm very expensive, at least on large data sets.
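
To illustrate the nonequi-join point from the list above, here is a small sketch that reuses the two demo tables. The join condition is an inequality chosen only for illustration; without an equality predicate, the plan for this inner join should show a Nested Loops operator even though no hint is given:

[sql] -- Inner join on an inequality only (a "plain" nonequi-join).
SELECT A.Id, A.Val, B.Id AS BId, B.Val AS BVal
FROM dbo.Tbl100 A
INNER JOIN dbo.Tbl10 B
ON A.Val < B.Val;
[/sql]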

Summary

The Nested Loops Join algorithm is SQL Server's default join operator. This is due to its simplicity. There are no setup costs and its execution does not require memory. Because of that, it produces the first rows fairly quickly, which can be an advantage in certain cases. This algorithm is also the only one that can truly handle a plain nonequi-join. Finally, it can make use of an index on the right input side, which sets it apart from the other algorithms.

A Join A Day

The post A Join A Day – The Nested Loops Join appeared first on sqlity.net.

A Join A Day – Join Hints

Sebastian Meine — Thu, 20 Dec 2012 15:00:43 +0000

Introduction

This is the twentieth post in my A Join A Day series about SQL Server Joins. Make sure to let me know how I am doing or ask your burning join related questions by leaving a comment below.

Before SQL Server can execute a query it needs to compile it into an execution plan. The compilation is done by the optimizer. The optimizer is looking at possible execution plans for the query and is trying to identify the best one. Yesterday we learned that the order of tables in a join does not change the result. Reordering the tables in a join construct can already lead to very many plans. There are six possible orders when three tables are involved, 24 with four tables and 120 with five tables. There are also many other options that the optimizer has to choose from, for example the three join algorithms. There are 81 possibilities to select one of the three algorithms for four different join operators. So, in the case of a five table join there are 120 * 81 = 9720 different plans to consider. This does not even yet include things like index selection. You can see, the number of possible plans quickly gets too big for the optimizer to look at all of them.
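
The arithmetic above can be reproduced with a small recursive query. This is only a sketch of the counting argument (n! table orders times 3^(n-1) algorithm choices for n tables); the CTE and column names are made up for illustration:

[sql] -- Count table orders and join algorithm combinations for 3 to 5 tables.
WITH Counts AS (
SELECT 1 AS TableCount,
CAST(1 AS BIGINT) AS JoinOrders, -- 1! orders for a single table
CAST(1 AS BIGINT) AS AlgorithmChoices -- 3^0, as there is no join yet
UNION ALL
SELECT TableCount + 1,
JoinOrders * (TableCount + 1), -- n! = (n-1)! * n
AlgorithmChoices * 3 -- one more join, three algorithms each
FROM Counts
WHERE TableCount < 5
)
SELECT TableCount, JoinOrders, AlgorithmChoices,
JoinOrders * AlgorithmChoices AS PlanCount
FROM Counts
WHERE TableCount >= 3;
[/sql]

For five tables this returns 120 orders, 81 algorithm combinations and 9720 plans, matching the numbers in the text.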

Because of the high number of possible plans, the optimizer does not actually try to find the best plan. Instead it tries to find a good enough plan. To determine if it found a good enough plan, the optimizer looks at the cost estimate of the plans found so far and based on those estimates determines how long it is going to spend to find a better plan. After that time it takes the best plan it found up to then.

This is a very simplistic view of the things that happen during optimization. The point I am trying to make is that it is indeed very common for the optimizer to not find the best plan. However, most of the time the plan it comes up with is pretty close to the best. But every once in a while the execution plan that the optimizer came up with is really bad. In a case like that we can use query hints to help the optimizer find a better plan.

Join Hints

Let's look at this simple query:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice
FROM Sales.SalesOrderDetail AS sod
INNER JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID;
[/sql]

For this query the optimizer decides to use a Merge Join operator:

The Merge Join algorithm is actually the best choice in this context. But let's assume for this article that we know that another algorithm is better and we want to influence the optimizer's decision.

There are two ways to sway the optimizer to use a different join algorithm. The first one is a direct join hint and the second one is a query join hint. Both have in common that they are not really hints; they rather force the optimizer to use the specified algorithm.

To specify a direct join hint, we just need to mention the desired algorithm in between the word INNER and the word JOIN like this:
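
The listing that belongs here is not preserved in this copy. As a sketch of what the text describes, this is the simple query from above with the LOOP key word placed between INNER and JOIN:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice
FROM Sales.SalesOrderDetail AS sod
INNER LOOP JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID;
[/sql]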

The key word LOOP causes the optimizer to use the Loop Join algorithm. If you want to specify the algorithm you have to type the INNER key word as well. Just writing TblA LOOP JOIN TblB will result in an error. However, when using an outer join, the hint can be placed right before the key word JOIN and the key word OUTER stays optional:
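
The original listing is missing here as well. A sketch of a hinted left outer join might look like the following; which table the original example placed on the left is an assumption:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice
FROM Sales.SalesOrderHeader AS soh
LEFT HASH JOIN Sales.SalesOrderDetail AS sod -- OUTER is optional
ON soh.SalesOrderID = sod.SalesOrderID;
[/sql]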

This forces the Hash Join algorithm to be used in this Left Outer Join context.

The third algorithm is the Merge Join that the query used without a hint. You can force this algorithm with the key word MERGE:
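
Again the listing is not preserved; a sketch based on the simple query from above, with the table order left exactly as written there:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice
FROM Sales.SalesOrderDetail AS sod
INNER MERGE JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID;
[/sql]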

So we would expect this to be the same execution plan as the one we got without the hint. But if you look closely, that is not the case. The order of the two inputs has been switched. The hint-free execution plan uses the Sales.SalesOrderHeader table as first input and the Sales.SalesOrderDetail table as second input. Because Sales.SalesOrderHeader has fewer rows, that is the preferred order for almost all join situations. However, when we tell the optimizer which algorithm to use, we also force the order of the tables to stay the same as specified in the query.

A hint specified like this affects only the join operator that it was specified at. You can specify a different algorithm for each join in the query. However, even a single join hint forces the order of the entire query.

Query Hints

The second way to specify the desired join algorithm is a query hint. A query join hint is specified at the end of the query like this:
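
The listing for this step is missing in this copy. The sketch below shows the general shape of a query join hint using the OPTION clause; which of the three algorithms the original example used is not known, so MERGE JOIN is an assumption:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice
FROM Sales.SalesOrderDetail AS sod
INNER JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID
OPTION (MERGE JOIN);
[/sql]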

This syntax also allows you to specify any of the three algorithms LOOP, HASH and MERGE. There are two important differences between a query hint and a join hint. The first one is that a query hint does not force the table order, as you can see in the above example. The second difference is that the query hint forces all joins to use the same algorithm. Let's look at this query for an example:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice, prod.Name
FROM Sales.SalesOrderDetail AS sod
INNER JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Production.Product AS prod
ON sod.ProductID = prod.ProductID;
[/sql]

If un-hinted like above, it results in this execution plan:

Now let's specify just a single join hint:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice, prod.Name
FROM Sales.SalesOrderDetail AS sod
INNER JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID
INNER LOOP JOIN Production.Product AS prod
ON sod.ProductID = prod.ProductID;
[/sql]

This forces only the second join to be a loop join. The first join is now replaced by a hash join.

You can also see that the order in which the tables are accessed now matches the order in which they are mentioned in the query.

Now let's look at this query with a query hint:
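
The listing is not preserved here. Since the text goes on to say that both joins end up as loop joins, the query presumably was the three table query from above with a LOOP JOIN query hint, along these lines:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice, prod.Name
FROM Sales.SalesOrderDetail AS sod
INNER JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Production.Product AS prod
ON sod.ProductID = prod.ProductID
OPTION (LOOP JOIN);
[/sql]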

Now both join operators are using the Loop Join algorithm:

However, the table order does not match the one specified in the query.

The two types of hints cannot be mixed. If you attempt to do so error 1042 will be raised:

FORCE ORDER

If you just want to force the order of the tables without specifying the algorithm(s) to be used, there is a hint for that too:
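
The listing is missing in this copy. A sketch of the hint, applied to the three table query from the previous section (which query the original example used is an assumption):

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice, prod.Name
FROM Sales.SalesOrderDetail AS sod
INNER JOIN Sales.SalesOrderHeader AS soh
ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Production.Product AS prod
ON sod.ProductID = prod.ProductID
OPTION (FORCE ORDER);
[/sql]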

The FORCE ORDER query hint gives us just that functionality. It forces the order but still allows the optimizer to choose the algorithms that it finds most appropriate.

FORCE ORDER actually does not only consider the order of tables in the query, but also the placement of the ON clauses. For an example let's look at this slightly modified query:

[sql] SELECT soh.AccountNumber, soh.OrderDate, sod.OrderQty, sod.UnitPrice,pers.FirstName, pers.LastName
FROM (
Sales.SalesOrderHeader AS soh
INNER JOIN Sales.SalesOrderDetail AS sod
ON soh.SalesOrderID = sod.SalesOrderID
)
INNER JOIN (
Sales.Customer AS cust
INNER JOIN Person.Person AS pers
ON cust.PersonID = pers.BusinessEntityID
)
ON soh.CustomerID = cust.CustomerID;
[/sql]

Without a join or query hint we get this right deep execution plan:

However, if we add the FORCE ORDER hint the optimizer builds a bushy execution plan in which first the Sales.SalesOrderHeader and Sales.SalesOrderDetail tables are joined, then the Sales.Customer and Person.Person tables, and finally the two results with each other:

There are a few additional options available for join and query hints that we can't discuss today. Check out these two Books Online articles for more information:

http://msdn.microsoft.com/en-us/library/ms173815.aspx

http://msdn.microsoft.com/en-us/library/ms181714.aspx

Cautionary Hint

Working with hints in T-SQL is always a double-edged sword. While you might find a better table order or set of algorithms for a particular query based on the current data, you are taking away SQL Server's ability to adapt to changes in the data. Every time enough data in one of the tables has changed, SQL Server (with default options enabled) will revisit every query accessing that table to see if it can come up with a better plan. It cannot do that with queries on which you have forced its way. For that reason it is an accepted best practice to use any type of query hint only after all other options have failed.

If you have a join query that needs some performance improvements, the first thing to check is if appropriate indexes exist and if the query is written in a way that those indexes can actually be used. If that does not help, make sure that all statistics are up to date. Finally you could check if adding additional statistics or filtered statistics can improve the query.
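
The statistics steps mentioned above could look like the following sketch. The statistics name, the chosen column and the filter predicate are hypothetical and only meant to show the syntax; which columns are worth the effort depends on the query at hand:

[sql] -- Bring the existing statistics on both join tables up to date.
UPDATE STATISTICS Sales.SalesOrderDetail WITH FULLSCAN;
UPDATE STATISTICS Sales.SalesOrderHeader WITH FULLSCAN;

-- Add a filtered statistic for a subset of rows (hypothetical name and predicate).
CREATE STATISTICS [st_SalesOrderDetail_UnitPrice_HighQty]
ON Sales.SalesOrderDetail(UnitPrice)
WHERE OrderQty >= 10;
[/sql]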

This list is not meant to be exhaustive. It just gives you a starting point for things to look at when trying to improve the performance of a join query.

Summary

Join hints are a very powerful way to steer the decisions of the optimizer that affect join algorithm selection and table access order. We have seen several options to influence those decisions. However, every time we hint to the optimizer, we take some of its flexibility to adapt to changes in the data away. While a hinted query might be smooth sailing today, be aware of what comes after the next wave of updates to your data.

A Join A Day

The post A Join A Day – Join Hints appeared first on sqlity.net.

The Mysterious “sp_” System Procedure Prefix

Sebastian Meine — Sun, 13 May 2012 21:04:46 +0000

Introduction

If you have been working with SQL Server for a while you probably know that you should not select names for your stored procedures that start with the three characters "sp_".

Today I would like to take a closer look at all the myths and facts around this prefix. This will include a series of examples, all of which have been tested on SQL 2008R2 and on SQL 2012.

Special Objects

The first myth I would like to bust is that the "sp_" prefix stands for "system procedure".

While the BOL entry for Creating Stored Procedures uses language that could easily be interpreted like that, it actually never says anything about the etymology of this prefix. In fact, the "sp_" prefix causes a special path to be taken in the resolution of the object name. This works for stored procedures as well as for other object types. Later on this article contains an example with a table.

As "sp_" designates more objects than just stored procedures as special, it clearly does not mean "System Procedure", even though it currently is only used for system procedures. Instead it just stands for "special".

Resource Database

The idea for this article came from a tweet in a discussion about this prefix started by @ericstephani. In the tweet in question @SirSQL suggested checking whether master was indeed the database checked first, as everyone in the discussion had assumed before, or whether it is rather the resource database. This makes a lot of sense, as most of the system objects live in the resource database. So I decided to check this out by running a series of tests.

If you want to follow along with these tests you will need two databases: One with the name test and one that is called OtherDb.

For the first test I created a procedure with the name sp_executesql in master as well as in test and then executed it from both places:

[sql] USE master;
GO
CREATE PROCEDURE dbo.sp_executesql
@statement NVARCHAR(MAX)
AS
BEGIN
SELECT DB_NAME() [called from], 'master' [sp_executesql];
END
GO
EXEC dbo.sp_executesql @statement = N'SELECT DB_NAME() [called from], ''system'' [sp_executesql];';
GO
DROP PROCEDURE dbo.sp_executesql;
GO
------------------------------------------------
GO
USE test;
GO
CREATE PROCEDURE dbo.sp_executesql
@statement NVARCHAR(MAX)
AS
BEGIN
SELECT DB_NAME() [called from], 'test' [sp_executesql];
END
GO
EXEC dbo.sp_executesql @statement = N'SELECT DB_NAME() [called from], ''system'' [sp_executesql];';
GO
DROP PROCEDURE dbo.sp_executesql;
GO
[/sql]

The sp_executesql version that gets shipped with SQL Server lives in the resource database and so we would expect this version to get executed in all cases. This is indeed correct, as you can see in the result:

The above test shows that the resource database is checked first when an object name with the "sp_" prefix is encountered. That means that if you create any object with that name prefix you are running the risk that it will not be accessible anymore after the next service pack install if Microsoft decided to create an object with the same name in the resource database. This is even true when the resource database object is of a different type than your object. This you can quickly confirm by creating a table with the name sp_executesql in any database and trying to insert a row into it.
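
A minimal sketch of that table experiment, assuming the test database from above; the column name is made up. According to the text the INSERT is expected to fail, because the name resolves to the system procedure instead of the table. The exact error message is not shown here, as it was not part of the original article:

[sql] USE test;
GO
CREATE TABLE dbo.sp_executesql(
Id INT
);
GO
-- Expected to fail: the name is resolved against the resource database first.
INSERT INTO dbo.sp_executesql(Id) VALUES(1);
GO
-- Clean up.
DROP TABLE dbo.sp_executesql;
GO
[/sql]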

Master Piece

The fact that the resource database is the one checked first to resolve "sp_" object names raises the question where the master database fits in. Read on, as the results might surprise you.

The second test involves a procedure that is not (yet) part of the resource database: sp_MyOwnProc

I again created this procedure in master and in test:

[sql] GO
USE master;
GO
CREATE PROCEDURE sp_MyOwnProc
AS
BEGIN
SELECT DB_NAME() [called from], 'master' [sp_MyOwnProc];
END
GO
USE test;
GO
CREATE PROCEDURE sp_MyOwnProc
AS
BEGIN
SELECT DB_NAME() [called from], 'test' [sp_MyOwnProc];
END
GO
EXEC sp_MyOwnProc;
GO
USE OtherDb;
GO
EXEC sp_MyOwnProc;
GO
USE test;
GO
DROP PROCEDURE sp_MyOwnProc;
GO
USE master;
GO
DROP PROCEDURE sp_MyOwnProc;
GO
[/sql]

The above code creates the procedure in both places. It then executes the EXEC sp_MyOwnProc; statement, first from the test database and afterwards from the OtherDb database. The result is shown below:

As you can see, master is not checked first for an "sp_" object. Instead the object in the current database is used. Only if the current database does not contain the object in question is master consulted to see if the object exists there.

The same behavior can be observed when all references to the sp_MyOwnProc in the above script are schema-qualified with dbo.

Other Object Types

You also get the same behavior if you go through this exercise with a table:

[sql] USE master;
GO
CREATE TABLE sp_MyOwnTable(
sp_MyOwnTable NVARCHAR(MAX)
);
INSERT INTO sp_MyOwnTable SELECT DB_NAME();
GO
USE test;
GO
CREATE TABLE sp_MyOwnTable(
sp_MyOwnTable NVARCHAR(MAX)
);
INSERT INTO sp_MyOwnTable SELECT DB_NAME();
GO
SELECT DB_NAME() [called from], * FROM sp_MyOwnTable;
GO
USE OtherDb;
GO
SELECT DB_NAME() [called from], * FROM sp_MyOwnTable;
GO
USE test;
GO
DROP TABLE sp_MyOwnTable;
GO
USE master;
GO
DROP TABLE sp_MyOwnTable;
GO
[/sql]

This script creates a table with the name sp_MyOwnTable in the master as well as in the test database and then executes a select against this name executing in test as well as in OtherDb. The result is shown here:

This shows that the name resolution works the same for tables as it does for stored procedures. It also works for views, but not for functions or user defined types.

Performance Impact

Now we know that SQL Server tries to find an "sp_" object in the resource database first, and only if it does not exist there is the search continued in the current database. So there should be a measurable performance impact showing this extra work.

To measure the impact I used this script:

[sql] USE test;
GO
IF OBJECT_ID('dbo.sp_MyOwnProc2') IS NOT NULL DROP PROCEDURE dbo.sp_MyOwnProc2;
GO
CREATE PROCEDURE dbo.sp_MyOwnProc2
AS
RETURN 0;
GO
IF OBJECT_ID('dbo.MyOwnProc2') IS NOT NULL DROP PROCEDURE dbo.MyOwnProc2;
GO
CREATE PROCEDURE dbo.MyOwnProc2
AS
RETURN 0;
GO
------------------------------------
GO
DECLARE @StartTime DATETIME2 = SYSDATETIME();
DECLARE @EndTime DATETIME2 = SYSDATETIME();
DECLARE @Counter INT = 0;
DECLARE @CmdA NVARCHAR(100) = 'EXEC dbo.MyOwnProc2;--';
DECLARE @CmdB NVARCHAR(100) = 'EXEC dbo.sp_MyOwnProc2;--';
DECLARE @TimeA BIGINT=0;
DECLARE @TimeB BIGINT=0;
DECLARE @Cmd2 NVARCHAR(100);
WHILE(@Counter<10000000)
BEGIN
SET @Cmd2 = @CmdA + CAST(@Counter AS NVARCHAR(20));
SET @StartTime = SYSDATETIME();
EXEC(@Cmd2);
SET @EndTime = SYSDATETIME();
SET @TimeA += DATEDIFF(microsecond,@StartTime,@EndTime)
SET @Cmd2 = @CmdB + CAST(@Counter AS NVARCHAR(20));
SET @StartTime = SYSDATETIME();
EXEC(@Cmd2);
SET @EndTime = SYSDATETIME();
SET @TimeB += DATEDIFF(microsecond,@StartTime,@EndTime)
SET @Counter += 1;
END
SELECT @Counter Counter,@TimeA [MyOwnProc2], @TimeB [sp_MyOwnProc2];
GO
[/sql]

It first creates two identical stored procedures that do nothing but return a 0. The first one is called sp_MyOwnProc2, the second one carries the name MyOwnProc2 without the "sp_" prefix. To measure the performance impact, the script calls both procedures alternating in a loop and records their execution times. Each call gets executed as dynamic sql with the current loop count being part of the sql string. This prevents any possible attempts to cache the plan for the batch. It does however not prevent caching the plan for the procedure itself. That is okay, as we are after the name resolution piece of the execution. Because both procedures get called alternatingly, any background noise caused by other processes on the test system should affect both evenly.

On my system I got these results:

After 10 million executions of each of the two procedures you can see that there is a performance impact. At less than 2 percent, however, it is very small compared to the time it takes to just call the procedure. Remember that the procedures in this test did not actually do any work, so all the time recorded by this test was spent on identifying the procedure and the overhead of calling it.

Conclusion

The "sp_" object name prefix causes SQL Server to take a special route when resolving the name of this object: First SQL Server checks if the object exists in the hidden resource database. Second it tries to find the object in the current database and if it does not exist there SQL Server goes on to check if the object exists in the master database.

If you name your own objects using this prefix, you have to be aware of two possible consequences:
First there is a small but measurable impact on performance. Second it can cause your application to suddenly break, if Microsoft decides to add an object with the same name to the resource database.

Because of that it is a best practice to not use the "sp_" name prefix anywhere in your code.

There is one exception however: If you are creating an object that you want to be accessible from all databases, you can use this prefix and place the object in the master database. However, because of the database precedence there are now two possibilities for those objects to get eclipsed by another object. So, if you go this route, make sure to regularly check that the object you are trying to execute is actually the one executing.

The post The Mysterious “sp_” System Procedure Prefix appeared first on sqlity.net.

Tracing the Actual Execution Plan for a single Query

Sebastian Meine — Sat, 05 May 2012 21:18:30 +0000

Introduction

SQL Server offers several ways to get to the execution plan for a particular query. Most of them however only provide the estimated execution plan without actual counts and statistics. In this article we are going to look at a way to get to the actual execution plan of a particular query.

Tracing

Certainly the easiest way to get the actual plan of a query is to actually execute it in SSMS while the "collect actual execution plan" option is turned on. However, sometimes you want to collect the actual execution plan for a query every time it gets executed in your production environment. The only way to do this in SQL Server 2008R2 and earlier was to run a trace and collect the "Showplan XML Statistics Profile" event.

The problem with this approach is that it is not trivial to restrict the collected information to execution plans of one query. That means that on a busy system thousands of these events will fire in a very short time. With the included XML execution plan, the amount of data to be collected will quickly overload most production systems.

Filtering Options

If the statement you are interested in is inside of a stored procedure you are somewhat in luck, as it is simple to filter by the name of the procedure which is returned by the event in the ObjectName column.

However, if the statement is an ad hoc or a prepared statement, this simple option does not exist anymore. There are a few other columns you can filter by like the login name, but most of the time you won't have the option to single out a specific query to use different connection settings.

The next option would be to filter on the text of the query. However, the query text is unfortunately not included in this event – neither as separate column nor within the execution plan XML.
That leaves only one column that we can try to filter on, the execution plan itself.

While the execution plan is stripped of most of the query text, there are two types of names that are included in the execution plan: Parameter names and table alias names. Parameter names are only included if the statement is a prepared statement. Table alias names will be included in all types of statements that access a table. Be aware, that column alias names are not included in the execution plan.

Random Alias

While most often the table alias names are not unique between all the queries in a system, it is usually not too difficult to change a particular query to use some distinct character string as an alias name for one table. A change like that will not modify the query behavior, nor will it influence the plan choice of the optimizer, but it will give us something to filter by.

The easiest way to come up with such an identifying name is to just use a random character string like "ir83n476s9d". Make sure however that the name of your choice begins with a letter. It also should not contain any special characters.

Once your query is prepared like this, you can easily restrict a "Showplan XML Statistics Profile" trace to only include this query by using a "Like" filter on the TextData column and setting it to "%ir83n476s9d%".
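
As an illustration, a query prepared this way could look like the sketch below. The table and columns are taken from the AdventureWorks sample database, which is an assumption; any query that accesses a table works the same way:

[sql] -- The random alias "ir83n476s9d" ends up in the plan XML and can be filtered on.
SELECT ir83n476s9d.SalesOrderID, ir83n476s9d.OrderDate
FROM Sales.SalesOrderHeader AS ir83n476s9d
WHERE ir83n476s9d.CustomerID = 11000;
[/sql]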

Conclusion

This simple trick allows you to filter the "Showplan XML Statistics Profile" event in a trace to fire only for a particular query. While it is not always possible to change a query to accommodate this, in many cases it is easy to do.

One final word of caution: When running traces in a production environment you should never use the SQL Profiler, especially when dealing with high-volume events like the "Showplan XML Statistics Profile" event. Instead set up your trace as a server-side trace to write the collected data to a fast, preferably dedicated drive. That way the performance impact on the system will be kept as small as possible.
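
A server-side trace with this filter could be set up roughly as in the following sketch. The file path, the file size and the selection of additional columns are assumptions for illustration; the event id 146 stands for "Showplan XML Statistics Profile" and column id 1 for TextData:

[sql] DECLARE @TraceId INT;
DECLARE @MaxFileSize BIGINT = 100; -- size in MB per rollover file
DECLARE @On BIT = 1;

-- Hypothetical target path on a dedicated drive; the .trc extension is added automatically.
EXEC sp_trace_create @TraceId OUTPUT, 2, N'X:\Traces\ActualPlan_ir83n476s9d', @MaxFileSize;

-- Collect TextData (1), SPID (12) and StartTime (14) for event 146.
EXEC sp_trace_setevent @TraceId, 146, 1, @On;
EXEC sp_trace_setevent @TraceId, 146, 12, @On;
EXEC sp_trace_setevent @TraceId, 146, 14, @On;

-- Filter on TextData (column 1) with AND (0) and LIKE (6).
EXEC sp_trace_setfilter @TraceId, 1, 0, 6, N'%ir83n476s9d%';

-- Start the trace and return its id for later use.
EXEC sp_trace_setstatus @TraceId, 1;
SELECT @TraceId AS TraceId;
[/sql]

To end the trace later, call sp_trace_setstatus with status 0 to stop it and then with status 2 to close it and remove its definition from the server.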

The post Tracing the Actual Execution Plan for a single Query appeared first on sqlity.net.

The Unloved Backward Scan

Sebastian Meine — Sun, 29 Apr 2012 20:31:29 +0000

Introduction

When a query requires rows to be sorted, either directly requested with an ORDER BY clause or because one of the iterators requires it, SQL Server has two options to guarantee that order. The obvious one is to utilize a sort iterator. If the data comes from an index (clustered or covering), SQL Server can also use an "Ordered Scan" of the data. Depending on the requested sort direction, such an ordered scan can be either forward or backward.

This would hardly be worth an article, if there wasn't the peculiarity that SQL Server obviously does not like the idea of having to execute an ordered scan that is directed backwards.

Backward Scan Dislike

Let's look at an example. First we need some tables with data:

[sql] IF OBJECT_ID('dbo.T1') IS NOT NULL DROP TABLE dbo.T1;
IF OBJECT_ID('dbo.T2') IS NOT NULL DROP TABLE dbo.T2;

CREATE TABLE dbo.T1(
Id INT CONSTRAINT T1_PK PRIMARY KEY CLUSTERED,
v1 INT,
c1 VARCHAR(8000)
);

CREATE TABLE dbo.T2(
Id INT CONSTRAINT T2_PK PRIMARY KEY CLUSTERED,
v1 INT,
c1 VARCHAR(8000)
);

INSERT INTO dbo.T1(Id,v1,c1)
SELECT n,CHECKSUM(NEWID()),'*'
FROM dbo.GetNums(1000000);

INSERT INTO dbo.T2(Id,v1,c1)
SELECT n,CHECKSUM(NEWID()),'*'
FROM dbo.GetNums(1000000);
--Get dbo.GetNums here: http://www.sqlmag.com/article/sql-server/virtual-auxiliary-table-of-numbers
[/sql]

This script is creating two tables containing one million rows each. The v1 column is filled with random values; the c1 column just contains a single constant character. The Id column is the clustered primary key and it is providing the ordering on disk that we are going to use.

The query is a little made up. It joins T2 to itself bringing back only one set of columns. It also restricts the result of this join to the first one million rows, sorted by T2.Id. The one million values in T2.v1 are random from a set of 4 billion making them "mostly" unique. That means the self-join is expected to return about one million rows anyway. Those rows are then joined to T1. In the end the ORDER BY requests that all rows be returned sorted by the Id column of the T1 table:

[sql] SELECT *
FROM dbo.T1 A
INNER JOIN ( SELECT TOP ( 1000000 )
B.*
FROM dbo.T2 B
INNER JOIN dbo.T2 C ON B.v1 = C.v1
ORDER BY B.Id
) BC ON A.Id = BC.Id
ORDER BY A.Id;
[/sql]

The execution plan of this query looks like this:

There is nothing unexpected in this plan. The T2 self-join is done with a hash join operator, as there is no index on the v1 column. The data is then sorted by a sort operator and passed through a serial zone for the top operator. After that there is no additional sort operator for the data to pass through. As the merge join requires both streams to be sorted the same way on the join column, this means that the data must be produced sorted by the scan of the T1 table. A quick look at its properties confirms that:

This scan is a forward scan, as both ORDER BY clauses are asking to sort by Id ascending. To make this scan go backwards we should just have to change the query to sort by Id descending in both places:

[sql] SELECT *
FROM dbo.T1 A
INNER JOIN ( SELECT TOP ( 1000000 )
B.*
FROM dbo.T2 B
INNER JOIN dbo.T2 C ON B.v1 = C.v1
ORDER BY B.Id DESC
) BC ON A.Id = BC.Id
ORDER BY A.Id DESC;
[/sql]

However, this query produces this rather unexpected plan:

A loop join with an index seek against the T1 table to retrieve 1,000,000 single rows – that cannot be good.

Before we look at execution statistics let's force the merge join back by specifying a join hint:

[sql] SELECT *
FROM dbo.T1 A
INNER MERGE JOIN ( SELECT TOP ( 1000000 )
B.*
FROM dbo.T2 B
INNER JOIN dbo.T2 C ON B.v1 = C.v1
ORDER BY B.Id DESC
) BC ON A.Id = BC.Id
ORDER BY A.Id DESC;
[/sql]

Now we get the expected plan again:

If you compare the properties for the iterators in the forward scan and the backward scan plan you will notice that the cost of the two scan directions is not that much different:

The estimated cost of the scan iterator is 0% in both cases. The bulk of the work is done in the hash self-join and the following sort of the T2 table.

So, let us take a look at the execution statistics of all three queries:

The execution statistics clearly show that the decision to go with a loop join was not necessarily the best. While the estimated cost for the backward scan is about 150% of the estimated cost for the loop join, the actual reads for the loop join version are about 3 million, which is more than 250 times higher than the 8000 reads for either scan direction.

Disk Access

To understand where this dislike is coming from we need to look at the access pattern necessary to retrieve the data from disk. A forward scan of a table (or index) that is not heavily fragmented looks pretty much like a single contiguous read of a big blob of data. SQL Server has a lot of performance optimizations built in that make this type of access as fast as possible. The Read-Ahead mechanism is a good example of that.

A backward scan on the other hand looks quite the opposite: After reading a page, the disk has to do almost a complete turn to get to the previous page, which is the next one in line to be read. This is about the worst kind of random access you can come by.

As SQL Server always assumes that none of the requested pages are in cache, the decision to not get into that backward spinning game seems quite understandable.

Also, if you closely compare the two "scanning" execution plans you will notice that while the forward scan iterator is parallelized, the backward scan operator is executed single-threaded. This is a general rule: SQL Server cannot execute a backward scan in parallel. That makes sense: with all that waiting for the disk to do another spin to get to the "next" page, it is unlikely for the data to come in quickly enough to keep multiple threads busy.
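The scan direction is also recorded in the plan XML, as the ScanDirection attribute of the IndexScan element (which is used for both scans and seeks). The following is a sketch of how cached plans that contain a backward scan might be located. It is not part of the original investigation, and because it shreds every cached plan it can be expensive on a busy server:

[sql]
-- Find cached plans that contain at least one backward index scan or seek.
-- The TOP is only there to keep the result manageable.
WITH XMLNAMESPACES ( DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan' )
SELECT TOP ( 20 )
       st.text ,
       qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE qp.query_plan.exist('//IndexScan[@ScanDirection="BACKWARD"]') = 1;
[/sql]

Opening one of the returned plans and looking at the properties of the backward scan operator shows whether it sits in a parallel or a serial zone.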

Estimates?

But wait, there is one more thing: If you look at the image above and compare the properties of all iterators in the two scanning plans, you will notice something odd. The estimated CPU cost of the "hash match inner join" iterator goes from 11 in the "forward" case to 1357 in the "backward" case; the estimated IO cost makes a similar jump from 0 to 1654. That does not seem to make a lot of sense, as the next iterator, the sort, is a blocking iterator. "Blocking" means that the iterator reads all input rows into a holding area before producing any output rows. This implies that anything on the left side of a sort should not be able to influence the cost of operators on the right side of it. That is, unless it affects the number of rows significantly, as this is a TOP N Sort. See TOP N Sort – A Little Bit of Sorting for an explanation. But as all columns in this example are "mostly" unique, the optimizer guesses correctly that about one million rows get passed into the hash join on each input and that about one million rows come out the other end. So row count estimates should not matter here.

If you look at the estimates when executing just that inner query sorting backwards, you will see that they exactly match the ones in the embedded forward sorting case:

That means that the scan direction change of the T1 table causes the estimates on the other side of the merge join to get completely thrown off.
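The post does not list the standalone inner query. Presumably it is just the derived table from the earlier examples executed on its own, so the following is a reconstruction under that assumption:

[sql]
-- The inner query by itself, sorting backwards
SELECT TOP ( 1000000 )
       B.*
FROM dbo.T2 B
INNER JOIN dbo.T2 C ON B.v1 = C.v1
ORDER BY B.Id DESC;
[/sql]

Comparing the estimated CPU and IO cost of the hash match operator in this plan with the one in the full backward query is what exposes the discrepancy.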

Conclusion

SQL Server's dislike of backward scan operations is understandable when looking at the work necessary to retrieve the records from disk in opposite index order. Additionally, a backward scan cannot be executed in parallel. That adds to the list of reasons why the backward scan seems unloved.

However, the main reason why SQL Server avoids backward scans seems to be that they throw its cost estimations off, making following iterators appear a lot more expensive than they really are.

The post The Unloved Backward Scan appeared first on sqlity.net.

TOP N Sort – A Little Bit of Sorting

Sebastian Meine — Mon, 23 Apr 2012 00:39:06 +0000

Introduction

Sorting is one of the most often used functions in SQL Server. It is used when creating indexes. It allows us to retrieve data in a particular order. It also is used to filter out repeated rows in the context of a DISTINCT request. The SQL Server Team spent a lot of effort optimizing the sort algorithms to give the best possible performance in each context. They even implemented sort avoidance where the expensive sort gets removed from the query plan if it is known that the input is sorted already or will contain only a single distinct value.

One of those optimizations is the TOP N Sort that I would like to look at in a little more detail today.

Memory Requirements for Sorting

Sorting requires memory – potentially a lot of memory. SQL Server estimates the size of the data using statistics and data size estimates and then requests about 150 percent of that in memory for the sort operation.

To see how much memory a query is using, you can use the sys.dm_exec_query_memory_grants DMV. It shows the current memory usage for each SPID that requested additional memory for one (or more) of the memory consuming operators.
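As a minimal sketch, the DMV could be queried from a second connection while the query of interest is running. The column selection here is just one reasonable choice, not something prescribed by the post:

[sql]
-- Run from a separate session while the query under test is executing.
-- Only requests that hold or wait for a memory grant are listed.
SELECT session_id ,
       requested_memory_kb ,
       granted_memory_kb ,
       used_memory_kb ,
       max_used_memory_kb ,
       ideal_memory_kb
FROM sys.dm_exec_query_memory_grants
WHERE session_id <> @@SPID;
[/sql]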

The problem with this DMV is that its values change quickly, so it is difficult to get meaningful information for a particular query. There is however a trick you can use: Every memory consuming iterator has a build phase where most of the memory is used, and a delivery phase in which the rows get sent to the parent operator. The memory consumption during the build phase is usually a lot higher. SQL Server recognizes this and reuses the additional memory during the delivery phase for other memory consuming iterators.

The sys.dm_exec_query_memory_grants DMV contains the columns used_memory_kb and max_used_memory_kb that report current and maximum memory usage. If we can ensure that we get the data from this DMV during the delivery phase of the operator in question, the max_used_memory_kb column will tell us exactly how much memory was used during the build phase. This however only works for queries with a single memory consuming operator that also needs to be blocking.

Sort is a blocking operator. That means that during the build phase no rows get transmitted. If we include the request for the memory data in the query and place it before ("left of" in the graphical plan) the sort, we know that the sort will have finished its build phase when the data gets retrieved from the DMV.

Let us look at an example:

[sql]
IF OBJECT_ID('dbo.FixedWidth') IS NOT NULL DROP TABLE dbo.FixedWidth;

CREATE TABLE dbo.FixedWidth(
    Id INT PRIMARY KEY CLUSTERED,
    c1 INT NOT NULL,
    fill VARCHAR(1000) NOT NULL
);

INSERT INTO dbo.FixedWidth(Id,c1,fill)
SELECT n,CHECKSUM(NEWID()),REPLICATE('X',192)
FROM dbo.GetNums(10000);

SELECT SUM(DATALENGTH(Id)+DATALENGTH(c1)+DATALENGTH(fill))
FROM dbo.FixedWidth;
[/sql]

This snippet creates the table dbo.FixedWidth and inserts 10,000 rows into it. It then sums up the DATALENGTH of all columns over all rows. Each row holds 4 + 4 + 192 = 200 bytes of data, so the size of all the data totals 2,000,000 bytes. The insert uses Itzik Ben-Gan's GetNums function.
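The function itself is not included in this post. If it is not available in your database, a stand-in along the lines of the widely circulated single-parameter version could look like the sketch below. It is an assumption that this matches the version used here; all the snippet above needs is a table-valued function that returns the numbers 1 through @n in a column named n, and it has to exist before the snippet is run:

[sql]
IF OBJECT_ID('dbo.GetNums') IS NOT NULL DROP FUNCTION dbo.GetNums;
GO
-- Stand-in numbers function: returns the integers 1..@n in column n.
-- Each CTE level squares the row count of the previous one (2, 4, 16, 256, ...).
CREATE FUNCTION dbo.GetNums(@n AS BIGINT) RETURNS TABLE
AS
RETURN
    WITH L0 AS (SELECT 1 AS c UNION ALL SELECT 1),
         L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
         L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
         L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
         L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
         L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
         Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
    SELECT TOP (@n) n
    FROM Nums
    ORDER BY n;
GO
[/sql]

GO is the batch separator of SSMS and sqlcmd; it is needed because CREATE FUNCTION has to be the first statement in its batch.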

The following query will sort the data in this table and report its memory usage:

[sql]
SELECT ( SELECT *
         FROM sys.dm_exec_query_memory_grants
         WHERE session_id = @@SPID
         FOR XML PATH('') , TYPE
       ) Mem ,
       *
FROM dbo.FixedWidth
ORDER BY c1;
[/sql]

This query has the following execution plan:

The access to the sys.dm_exec_query_memory_grants DMV happens after the sort has finished. Therefore the max_used_memory_kb will tell us the amount actually used for the sort of the entire 2MB table. The information from the memory DMV is returned as a single XML value. As the DMV result is spooled in this query, all rows will show the same XML value:

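[xml]
<session_id>66</session_id>
<request_id>0</request_id>
<scheduler_id>1</scheduler_id>
<dop>1</dop>
<request_time>2012-04-22T19:09:40.303</request_time>
<grant_time>2012-04-22T19:09:40.307</grant_time>
<requested_memory_kb>7352</requested_memory_kb>
<granted_memory_kb>7352</granted_memory_kb>
<required_memory_kb>512</required_memory_kb>
<used_memory_kb>2456</used_memory_kb>
<max_used_memory_kb>2456</max_used_memory_kb>
<query_cost>3.460044653270174e+000</query_cost>
<timeout_sec>86</timeout_sec>
<resource_semaphore_id>0</resource_semaphore_id>
<plan_handle>BgA6AB2ceA5AIZDfAAAAAAAAAAAAAAAA</plan_handle>
<sql_handle>AgAAAB2ceA7MMroMHW//TQYN1cSbf5KW</sql_handle>
<group_id>2</group_id>
<pool_id>2</pool_id>
<is_small>0</is_small>
<ideal_memory_kb>7352</ideal_memory_kb>
[/xml]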

It shows that 7,352KB is the ideal memory grant size the optimizer came up with. This was also the amount that was granted. However, only 2,456KB were actually used during the sort.

TOP N Sort

If you are interested only in the first few rows based on a sort order, you do not need to completely sort the rest of the rows.
To find the topmost row it is enough to take the first row of the data set and compare it to each remaining row, each time keeping the one row that comes first according to the sort order. If you need more than one row you can follow the same principle; you just have to keep the first n rows after each step.

SQL Server implements this with the TOP N Sort operator:

[sql]
SELECT TOP ( 10 )
       *
FROM dbo.FixedWidth
ORDER BY c1;
[/sql]

This simple query has the following execution plan:

As you can see, the sort operator is of type "TOP N Sort".

You can use the same trick as shown above to get its memory consumption, but you have to add an outer layer, otherwise SQL Server removes the TOP N Sort optimization:

[sql]
SELECT ( SELECT *
         FROM sys.dm_exec_query_memory_grants
         WHERE session_id = @@SPID
         FOR XML PATH('') , TYPE
       ) Mem ,
       *
FROM ( SELECT TOP ( 10 )
              *
       FROM dbo.FixedWidth
       ORDER BY c1
     ) X;
[/sql]

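[xml]
<session_id>66</session_id>
<request_id>0</request_id>
<scheduler_id>1</scheduler_id>
<dop>1</dop>
<request_time>2012-04-22T19:25:18.023</request_time>
<grant_time>2012-04-22T19:25:18.023</grant_time>
<requested_memory_kb>1024</requested_memory_kb>
<granted_memory_kb>1024</granted_memory_kb>
<required_memory_kb>24</required_memory_kb>
<used_memory_kb>24</used_memory_kb>
<max_used_memory_kb>24</max_used_memory_kb>
<query_cost>8.388694532701750e-001</query_cost>
<timeout_sec>25</timeout_sec>
<resource_semaphore_id>1</resource_semaphore_id>
<plan_handle>BgA6AM04yitAYc7jAAAAAAAAAAAAAAAA</plan_handle>
<sql_handle>AgAAAM04yivnP5lrZVMtEy5tLK/6dHjX</sql_handle>
<group_id>2</group_id>
<pool_id>2</pool_id>
<is_small>1</is_small>
<ideal_memory_kb>32</ideal_memory_kb>
[/xml]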

This query used only 24KB, clearly not enough to hold the entire table, so the TOP N Sort optimization was used.

The algorithms SQL Server uses for sorting are not documented. However, it is clear that an algorithm to find the TOP N rows cannot be as efficient as a normal sort when sorting the same number of rows. For that reason there is a maximum number of rows for which SQL Server uses the TOP N Sort optimization. That number was set to 100.

If you run the above query with TOP(100) you will get a memory usage of 128KB:

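[xml]
<session_id>66</session_id>
<request_id>0</request_id>
<scheduler_id>1</scheduler_id>
<dop>1</dop>
<request_time>2012-04-22T19:32:38.127</request_time>
<grant_time>2012-04-22T19:32:38.127</grant_time>
<requested_memory_kb>1024</requested_memory_kb>
<granted_memory_kb>1024</granted_memory_kb>
<required_memory_kb>128</required_memory_kb>
<used_memory_kb>128</used_memory_kb>
<max_used_memory_kb>128</max_used_memory_kb>
<query_cost>8.624926532701749e-001</query_cost>
<timeout_sec>25</timeout_sec>
<resource_semaphore_id>1</resource_semaphore_id>
<plan_handle>BgA6AKCsNDBAIVXkAAAAAAAAAAAAAAAA</plan_handle>
<sql_handle>AgAAAKCsNDB6vVhQtTSg6/3xIflWWU1M</sql_handle>
<group_id>2</group_id>
<pool_id>2</pool_id>
<is_small>1</is_small>
<ideal_memory_kb>136</ideal_memory_kb>
[/xml]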

If you use TOP(101) the memory usage goes up to 2,456KB:

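[xml]
<session_id>66</session_id>
<request_id>0</request_id>
<scheduler_id>1</scheduler_id>
<dop>1</dop>
<request_time>2012-04-22T19:32:11.993</request_time>
<grant_time>2012-04-22T19:32:11.993</grant_time>
<requested_memory_kb>7352</requested_memory_kb>
<granted_memory_kb>7352</granted_memory_kb>
<required_memory_kb>512</required_memory_kb>
<used_memory_kb>2456</used_memory_kb>
<max_used_memory_kb>2456</max_used_memory_kb>
<query_cost>8.627551332701751e-001</query_cost>
<timeout_sec>25</timeout_sec>
<resource_semaphore_id>0</resource_semaphore_id>
<plan_handle>BgA6AMCfYwlA4WnkAAAAAAAAAAAAAAAA</plan_handle>
<sql_handle>AgAAAMCfYwlb3oB4gW5K1oLplwLKxsA+</sql_handle>
<group_id>2</group_id>
<pool_id>2</pool_id>
<is_small>0</is_small>
<ideal_memory_kb>7352</ideal_memory_kb>
[/xml]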

All the memory numbers match the ones from the whole table sort, so SQL Server decided to execute a standard sort followed by a standard TOP operation instead of using the TOP N Sort optimization.

This boundary is fixed and cannot be configured. There is also no hint to force this optimization. That is rather unfortunate, as 100 seems to be a quite arbitrary boundary value and a TOP N Sort would often perform a lot better than a full table sort followed by a TOP.

Conclusion

We all know that while the optimizer in general is very good at finding a (close to) optimal plan, sometimes it does not.

The TOP N Sort optimization is a great enhancement if you are dealing with 100 or fewer rows. However, at 101 rows you will see a sudden drop in performance. To deal with this, you can either restrict the number of rows to return to 100, if the business allows, or you can help SQL Server deal with the "all rows" sort. One of the most obvious options here is to add an index to the table.
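As a sketch of the index option, assuming the dbo.FixedWidth table from above: an index keyed on the sort column lets SQL Server read the rows already in c1 order, so no sort is needed at all. The index name is made up for this example, and the INCLUDE column is there only so that the index covers the SELECT *; whether the optimizer actually picks the index depends on the data and the query.

[sql]
-- Hypothetical index name. Id is the clustering key and is therefore
-- part of the nonclustered index automatically.
CREATE NONCLUSTERED INDEX IX_FixedWidth_c1
ON dbo.FixedWidth ( c1 )
INCLUDE ( fill );

-- With the index in place this query can be answered by an ordered
-- index scan followed by a TOP, without a sort operator.
SELECT TOP ( 101 )
       *
FROM dbo.FixedWidth
ORDER BY c1;
[/sql]

Checking the execution plan before and after creating the index shows whether the sort operator is gone.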

The post TOP N Sort – A Little Bit of Sorting appeared first on sqlity.net.