Partition boundaries can be set to be a left range or a right range. While this is a seemingly simple decision, it starts getting complex if you are really trying to understand what those two options mean. That is not necessarily helped by the fact that different terminologies seem to contradict each other. For example, a left range builds intervals that are right side inclusive.
SQL Server partitioning uses two constructs, the partition scheme and the partition function. An un-partitioned table is placed in a filegroup during creation: CREATE TABLE … ON PRIMARY. If you are creating a table that is partitioned, instead of placing it into a filegroup, you place it on a partition scheme. The partition scheme is basically a collection of storage units called partitions, each being assigned to a filegroup. (However, multiple partitions can live in the same filegroup.)
The partition function decides which row in the table ends up in which partition, based on a single column. Let us assume that this column is an INT column, say the id column. (I am using INT as an example. However, the concept is the same for other data types.) The partition function does not assign blocks of values to partitions directly; instead, it splits the entirety of possible values into sections. An INT can store values from -2147483648 to 2147483647.
To split that into two sections, we have to define just one boundary. If we chose the value 3 for that first boundary, we would create two partitions: One that accepts values from -2147483648 to 3 and one for values from 3 to 2147483647. If we now add a second boundary at 6, we end up with three partitions: The first one still accepting values from -2147483648 to 3, the second for values from 3 to 6, and the last one finally for values from 6 to 2147483647.
You might have noticed an ambiguity in the preceding paragraph. The English language is particularly fuzzy on boundary conditions. The text above does not specify which partition the boundary values themselves belong to. Do they go to the left or to the right?
Let me use a diagram to clarify:
For each boundary, you have to decide if the boundary value is part of the preceding partition or part of the following. As we commonly write from left to right, you can ask that same question this way: Is the boundary value part of the partition to its left or part of the partition to its right?
SQL Server does not allow us to make that decision individually for each boundary. Instead, you decide for all boundaries in a partition function at once.
To create a partition function, you have to use the CREATE PARTITION FUNCTION statement. In this statement you decide whether the boundary values belong to their left or their right partition by specifying either RANGE LEFT or RANGE RIGHT, followed by a comma-separated list of the actual boundary values:
[sql]
CREATE PARTITION FUNCTION ThreeStepLeft(INT)
AS RANGE LEFT FOR VALUES(3,6);
CREATE PARTITION FUNCTION ThreeStepRight(INT)
AS RANGE RIGHT FOR VALUES(3,6);
[/sql]
Now that we have those two partition functions (using the same two boundary values) in place, we can actually compare their behavior.
First, let us finish the setup by creating two tables that are partitioned based on these functions:
[sql]
-- Scheme and table for the RANGE LEFT function
CREATE PARTITION SCHEME ThreeStepLeftScheme AS PARTITION ThreeStepLeft ALL TO([PRIMARY]);
CREATE TABLE dbo.LeftTable
(
    id INT PRIMARY KEY CLUSTERED,
    OtherValue INT DEFAULT CHECKSUM(NEWID())
) ON ThreeStepLeftScheme(id);
INSERT INTO dbo.LeftTable(id) VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9);
--
-- Scheme and table for the RANGE RIGHT function
CREATE PARTITION SCHEME ThreeStepRightScheme AS PARTITION ThreeStepRight ALL TO([PRIMARY]);
CREATE TABLE dbo.RightTable
(
    id INT PRIMARY KEY CLUSTERED,
    OtherValue INT DEFAULT CHECKSUM(NEWID())
) ON ThreeStepRightScheme(id);
INSERT INTO dbo.RightTable(id) VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9);
[/sql]
The above SQL snippet defines a partition scheme for each of the two partition functions. The schemes are of the simplest type, assigning all partitions to the primary filegroup. The snippet also creates a table on each partition scheme and inserts rows with id values ranging from 1 to 9.
Two years ago, I published a query that lists all partitions in a database together with their boundaries. Below you can find an improved version of that query:
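A minimal sketch of such a query, built only from the documented catalog views, could look like the following. It is not necessarily identical to the original listing: the aliases, the hard-coded 'id' label inside range_desc, and the filter on the two demo tables are assumptions made for illustration.

```sql
-- Sketch: one row per partition of the two demo tables, with its value range.
-- Partition N lies between boundary N-1 (lower) and boundary N (upper).
SELECT  OBJECT_NAME(P.object_id) AS table_name,
        P.partition_number,
        PF.boundary_value_on_right,
        CONCAT(
            -- lower bound; NULL (no lower boundary) collapses to an empty string
            CONVERT(VARCHAR(50), LO.value)
                + CASE WHEN PF.boundary_value_on_right = 1 THEN ' <= ' ELSE ' < ' END,
            'id',
            -- upper bound; NULL (no upper boundary) collapses to an empty string
            CASE WHEN PF.boundary_value_on_right = 1 THEN ' < ' ELSE ' <= ' END
                + CONVERT(VARCHAR(50), HI.value)
        ) AS range_desc,
        P.rows
FROM sys.partitions AS P
JOIN sys.indexes AS I
  ON I.object_id = P.object_id AND I.index_id = P.index_id
JOIN sys.partition_schemes AS PS
  ON PS.data_space_id = I.data_space_id
JOIN sys.partition_functions AS PF
  ON PF.function_id = PS.function_id
LEFT JOIN sys.partition_range_values AS LO
  ON LO.function_id = PF.function_id AND LO.boundary_id = P.partition_number - 1
LEFT JOIN sys.partition_range_values AS HI
  ON HI.function_id = PF.function_id AND HI.boundary_id = P.partition_number
WHERE P.object_id IN (OBJECT_ID('dbo.LeftTable'), OBJECT_ID('dbo.RightTable'))
  AND I.index_id IN (0, 1)   -- heap or clustered index only
ORDER BY table_name, P.partition_number;
```

The sketch relies on CONCAT treating NULL as an empty string, so the first and last partitions show only one side of the range. CONCAT requires SQL Server 2012 or later.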
Filtered to include only those two tables we created before, the query returns this result:
There are a few more columns in there that did not fit on the screen, but the more important ones are included. I would like you to look at partition_number, range_desc and boundary_value_on_right for a second. The latter is set to 1 if the partition function was created with the RANGE RIGHT clause, which means that each boundary belongs to the partition on its right. The range_desc column explains what values will be stored in each partition.
I would ask you to trust me on that query, but I know you would rather see proof that the range description matches what actually happens under the covers.
There is no documented way to directly determine the partition a particular table row is stored in. There is a way to pass any value to a partition function and receive a (partition) number in return. However, that does not really tell us that the row actually was stored in that partition.
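The value-to-partition lookup mentioned above is the documented $PARTITION function. A short sketch, assuming the two partition functions are named ThreeStepLeft and ThreeStepRight (the former name is inferred from the scheme name used earlier) and using a handful of hand-picked test values:

```sql
-- Ask each partition function which partition number a value maps to.
-- The test values are arbitrary; 3 and 6 are the boundary values themselves.
SELECT  V.id,
        $PARTITION.ThreeStepLeft(V.id)  AS left_partition,
        $PARTITION.ThreeStepRight(V.id) AS right_partition
FROM (VALUES (2), (3), (4), (6), (7)) AS V(id);
```

Only the boundary values differ between the two functions: 3 maps to partition 1 under RANGE LEFT but to partition 2 under RANGE RIGHT, and 6 maps to partition 2 and partition 3 respectively. All other values land in the same partition number either way.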
Luckily, if we combine two pieces of undocumented functionality, we can still see which partition a particular row is stored in. Those two pieces are the %%physloc%% virtual column, which returns the file_id and page_id of the page the row is stored in, and the sys.dm_db_database_page_allocations DMF, which lists all the database pages that belong to a partition.
Together these two provide enough information to determine the actual partition of a row in a partitioned table. However, before we can use them we need to take two more steps. First, we need to translate the output of the %%physloc%% column into separate values for the file and page ids. For that we can use the sys.fn_PhysLocCracker function. Second, we do not know the partition of our row up front, so we need to get the full list of pages to compare with. That can be achieved by passing NULL to all parameters but the first of the sys.dm_db_database_page_allocations DMF. With that, it returns all pages in the specified database.
Putting this all together, we can write a query like the following:
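A stripped-down sketch of that combination follows. It is not the original listing and leaves out the range_desc column for brevity. Both %%physloc%% and sys.dm_db_database_page_allocations are undocumented, so their behavior may differ between versions; passing 'LIMITED' as the mode argument (rather than NULL) and the aliases other than DDDPA are assumptions.

```sql
-- Sketch: find the partition each row of dbo.LeftTable physically lives in.
SELECT  T.id,
        PLC.file_id,
        PLC.page_id,
        DDDPA.partition_id AS partition_number
FROM dbo.LeftTable AS T
-- split the binary physical locator into file id, page id and slot id
CROSS APPLY sys.fn_PhysLocCracker(T.%%physloc%%) AS PLC
-- list every page in the current database (table, index and partition left as NULL)
JOIN sys.dm_db_database_page_allocations(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS DDDPA
  ON DDDPA.allocated_page_file_id = PLC.file_id
 AND DDDPA.allocated_page_page_id = PLC.page_id
ORDER BY T.id;
```

Because the DMF walks every page in the database, a query like this is suited to small demo databases rather than production systems.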
This query, in addition to determining the actual partition of a row, also uses part of the previous query to create the range_desc column of the identified partition for easier comparison. The important column, however, is DDDPA.partition_id, aliased as partition_number. Despite its name, that column returns the partition number for each row.
If you want to use this query on a different table, all you have to do is replace the single occurrence of dbo.LeftTable with the new table name. The two screenshots below show the output of this query for both the dbo.LeftTable and the dbo.RightTable.
As you can now clearly see, when looking at the id and the partition_number columns, the output in the range_desc column is accurate.
Now that you, I hope, understand partition boundaries a little better, let us recap what we found out.