Friday, May 18, 2007

T-SQL New features in SQL Server 2005 - Part 3

Often times, during my career, I faced a situation where I had to recursively get data out of a table starting from a root. The classic example of such a situation is Employee-Manager relationship table, which I am sure everyone, having even a basic idea of SQL, is familiar with. In this case, we have a table containing the details like Name, Department ID, Salary etc of Employees. There is also another column called Manager ID, which is nothing but the Employee ID from same table ie. case of a self-join. In this table, the employee having highest designation(lets assume CEO) has it Manager ID as null making him the root. Another such example is a table having the details/labels for a TreeView kind of structure (in other words, source for hierarchical data display). Now, prior to SQL Server 2005, the only way to get the data from such a table was by writing a stored procedure. However, introduction of CTEs has made it possible to get data recursively out of a table.

As mentioned in the previous article, CTEs are nothing but an inline representation of a table whose source is a query written in the CTE. By having the join with itself , the CTE can fetch the data recursively. To take it more clearer, lets look at an example.

Lets assume we have a table named Employees containing following data:


In the above table, Jim is the CEO since his manager ID is null. Now, we want to get all the people in the hierarchy starting from Jim. We can do that using the following query:

WITH AllEmployees AS -- (1)

(

SELECT Employees.EmployeeID, Employees.Name, Employees.ManagerID

FROM Employees WHERE ManagerID IS NULL -- (2)

UNION ALL -- (3)

SELECT Employees.EmployeeID, Employees.Name, Employees.ManagerID

FROM AllEmployees INNER JOIN Employees ON -- (4)

AllEmployees.EmployeeID = Employees.ManagerID -- (5)

)

SELECT * FROM AllEmployees -- (6)


This query will result in the following data:


Following is the explanation of the steps:

1) WITH defines a name for the CTE.

2) This query defines the anchor point. Notice we have the condition "ManagerID IS NULL" in WHERE clause. This defines/fetches the starting point. All the data will be fetched from this point onwards.

3) It is mandatory to use UNION ALL in case you are writing a recursive query. Failing this will give you an error mentioning that "it does not contain a top-level UNION ALL operator".

4) Makes a join with the data returned from the anchor query and in turn the CTE as a whole.

5) We make the join on condition where the manager ID from Employees table is in EmployeeID of AllEmployees CTE.

6) Selects all the records from the CTE as source. It is important that the select is the immediate next statement after the CTE closing parentheses.

We can also restrict the recursion level to a desired value. In the above example, we have nesting upto 3 levels starting from Jim. Lets assume that we wish to get the data only till the second level i.e. Jim->Thomas, Helena->Robert, Julian. We can do so by adding the following option to the above mentioned query:
OPTION (MaxRecursion 2)

This will result in the termination of the query as soon as the supplied nesting level is reached. If you look at the messages after executing the query, you will get:

Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 2 has been exhausted before statement completion.

But in the results tab, you will still get the expected data.

Also, if you wish to get the nesting or recursion level along with the data, you can modify the query in following manner:

WITH AllEmployees AS

(

SELECT Employees.EmployeeID, Employees.Name, Employees.ManagerID, 0 as Level

FROM Employees WHERE ManagerID IS NULL

UNION ALL

SELECT Employees.EmployeeID, Employees.Name, Employees.ManagerID, Level + 1 as Level

FROM AllEmployees INNER JOIN Employees ON

AllEmployees.EmployeeID = Employees.ManagerID

)

SELECT * FROM AllEmployees


Notice that we have added another column named Level to display the nesting level of the given record. The dissection is as follows: the anchor query assigns the value of "Level" as 0(zero). In the second query (after Union All), we set the "Level" as "Level + 1". What it does is that takes the value of Level from previous recursion and adds 1 to it. The value of "Level" is not incremented on each record, but on each recursion. Now the data is displayed as shown in the following figure:



I hope that this tutorial will come in handy when you are in a situation where you need to fetch the data out of a table recursively.

kick it on DotNetKicks.com