It is T-SQL Tuesday again, the 22nd incarnation and this time the topic is the presentation of data, hosted by Robert Pearl (Blog|Twitter).
Every so often you run into the problem of having to display a list of items to the user. This could be a list of item numbers or a list of names.
Assume for example if you are trying to write a view that displays information about all the indexes in the database.
You want this view to return one row per index. You also want to include the columns of the index. To make this happen you need to concatenate the information for the columns into one string.
This post is going to look at how to implement string concatenation in T-SQL.
In many other database management systems you have access to a function GROUP_CONCAT()
that provides just this functionality. In T-SQL it is not that easy yet. While Denali offers a new CONCAT()
function, a GROUP_CONCAT()
is not available in SQL Server up to version Denali CTP3.
There are several ways to go about this problem. For example you could write a UDF or you could rely on undocumented behavior1.
Most of those solutions seem complicated and also do not really address a real concatenation in a GROUP BY
context.
There is one solution that I have found to be the easiest to use: FOR XML.
Using the FOR XML
clause, SQL Server allows the output of any select to be a single XML Document:
returns something like
[xml] <row> Every row gets wrapped into a <row>
tag and every column in a tag that matches the columns name.
To eliminate the row tag we can pass in an empty string to the PATH()
function. The column tags are eliminated by naming the column [text()]
.
So our example would look like this:
[sql] SELECT name AS [[text()]] FROM master.sys.tablesIt's output looks like this:
[xml] spt_fallback_dbspt_fallback_devspt_fallback_usgspt_monitorspt_valuesMSreplication_options
That looks already very promising. You probably would like to add separators in between the strings. How you can do that we will get to a little later.
There are a few things that I would like to address first.
Let's look at characters that have a special meaning in XML like ‘<’:
[sql] SELECT '<&>' AS [[text()]] FOR XML PATH('');Output:
[xml] <&>That is not what we wanted. To get those characters unescaped, we need to use one of the XML datatype functions:
[sql] SELECT (SELECT '<&>'
This has the expected output: ‘<&>’. You might have noticed the missing column name. If the column name is not specified at all (explicitly or otherwise), it has the same effect as if the [text()]
name is given. The additional TYPE
keyword causes the XML to be returned as an XML
datatype result instead of as an NVARCHAR(MAX)
datatype result. The .value()
function is defined on the XML
datatype and takes two parameters. The first one is using the XPath syntax and describes what we want to get back. The '.' here means everything. The second one specifies the datatype to which we want the result to be converted to.
The next problem is not that easy to solve. The SQL Server XML
datatype cannot handle specific characters at all:

however, it falls apart once we add the necessary TYPE
keyword back in:
[sql]
SELECT CHAR(1)Msg 6841, Level 16, State 1, Line 1 FOR XML could not serialize the data for node 'NoName' because it contains a character (0x0001) which is not allowed in XML. To retrieve this data using FOR XML, convert it to binary, varbinary or image data type and use the BINARY BASE64 directive.
There is a total of 2079 single character values that will cause this error. How to figure out which ones exactly will have to wait for another blog post. For now, if you think you might run ito any of them, you can use the following code to replace the most common of them with a question mark:
[sql] SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(This replacement has to happen before the text is converted to XML. The following examples are going to skip this step for readability.
Now we know how to concatenate string values together. To make our initial example of a select statement (or view) that returns one row per index and includes a comma separated column list work, two more steps are missing. First, the separator needs to be inserted between the values but not in front of the first one. This is not possible. But it is fairly simple to remove the additional separator after adding one in front of every item:
[sql] SELECT STUFF((SELECT ', ' + name
The STUFF
function helps us with this problem. The first three parameters work just like the ones in the SUBSTRING
function. But instead of returning that substring, the STUFF
function replaces it with the value passed in as fourth parameter and then returns the complete string.
The last step is to do this all inside a GROUP BY
query. For that we have to use the CROSS APPLY
functionality to create the column list for each group value in a correlated sub-query. Putting this all together looks like this:
This query returns the table name, the index name and a comma separated list of all columns for each index in the database. It also marks descending columns with the DESC
keyword. What it does not do is separate out included columns. I'll leave that as an exercise for the reader.
1) One method that relies on undocumented behavior would be to continually add to a single variable within a single multirow select.
You must be logged in to post a comment.
Pingback: T-SQL Tuesday #22 Round-Up - Data Presentation - Robert Pearl's Blog