SQL: How to parse Microsoft Transact-SQL Statements in C# and to match the column aliases of a view with its related (dependent) tables.

Published by Marian Galie-Andriescu on

This is a example on how to use the Microsoft T-SQL parser in C# which is part of the Feature Pack.

Microsoft® SQL Server® 2014 Feature Pack:
https://www.microsoft.com/en-us/download/details.aspx?id=42295

Microsoft® SQL Server® 2014 Transact-SQL Language Service:
The SQL Server Transact-SQL Language Service is a component based on the .NET
Framework which provides parsing validation and IntelliSense services for Transact-SQL for SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008, and SQL Server 2005. Filename: X86 and x64 Package(tsqllanguageservice.msi)

Add the following reference for 32 bit in your C# project:
C:\Program Files (x86)\Microsoft SQL Server\120\SDK\Assemblies\Microsoft.SqlServer.TransactSql.ScriptDom.dll

If you have a previous version of the language service, then the directory name will be slightly different; replace 120 with the installed version as below:

80 = SQL Server 2000
90 = SQL Server 2005
100 = SQL Server 2008
110 = SQL Server 2012
120 = SQL Server 2014

Documentation on MSDN:
https://msdn.microsoft.com/en-us/library/microsoft.sqlserver.transactsql.scriptdom.tsql120parser(v=sql.120).aspx

For each SQL server version there is a corresponding parser class following the same mapping as above: TSql80Parser, TSql90Parser,TSql100Parser, TSql110Parser,  TSql120Parser.

In our example we will use the TSQL parser to parse a View statement in order to match its column aliases with the columns in the tables used by the view. This information is not available in the meta data of the sql server. A column from a view can be dependent on multiple columns and tables, but we only focus on the case where we have a one to one match.

Using the metadata available in sql database we can get information about the columns of a view and information about on what tables and columns the view depends on, but not a mapping between the two pieces of information.

This is our view:

  
CREATE VIEW [dbo].[View_1]
AS

SELECT  AreaId       = A.mcw_areaId,
        SurrogateKey = A.AreaKey,
        Code         = S.statecode,
        Name         = S.statename 

FROM    CRM.dim_Area as A INNER JOIN
        CRM.dim_AreaState as S ON A.statecode = S.statecode

Using the following query we can get the dependencies for our view:

SELECT
t.object_id as Table_ObjectID,
c.column_id as Column_ObjectID,
c.name as Column_Name,
t.Name as Table_Name
FROM sys.views v
JOIN sys.sql_dependencies d ON d.object_id = v.object_id
JOIN sys.objects t ON t.object_id = d.referenced_major_id
JOIN sys.columns c ON c.object_id = d.referenced_major_id AND
     c.column_id = d.referenced_minor_id
WHERE
d.class < 2 AND
v.name = 'View_1' AND
v.schema_id = SCHEMA_ID('Dbo')
ORDER BY COLUMN_ObjectID;

Andriescu Sql Parse View Alias Dependencies

 As alternative to the query above we can use Information schema as follows:
select * from INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
where VIEW_NAME='View_1' and
      VIEW_SCHEMA='Dbo'

Andriescu Sql Parse View Alias Dependencies Information Schema View Column Usage

However, a word of warning on using the Information_Schema from Microsoft:
“Do not use INFORMATION_SCHEMA views to determine the schema of an object. The only reliable way to find the schema of a object is to query the sys.objects catalog view.”

More information on this topic can be found here:
http://sqlblog.com/blogs/aaron_bertrand/archive/2011/11/03/the-case-against-information-schema-views.aspx

Using the following query we can get the list of column aliases of the view:

select object_id, column_id, name
from sys.columns
where object_id = object_id('Dbo.View_1');

Andriescu Sql Parse View Alias Column List

 

As mentioned before, there is no way to link and match these two results using the metadata found in the sql database.

Therefore the only solution is to parse the sql statements of the view and construct ourselves the mapping:

Andriescu Sql Parse View Alias Column Bind

 

The microsoft sql parser can be used as follows:

 

using Microsoft.SqlServer.TransactSql.ScriptDom;
....

TSql120Parser SqlParser = new TSql120Parser(false);

IList<ParseError> parseErrors;
TSqlFragment result = SqlParser.Parse(new StringReader(SqlTextBox.Text),
                                      out parseErrors);

TSqlScript SqlScript = result as TSqlScript;

foreach (TSqlBatch sqlBatch in SqlScript.Batches)
{
   foreach (TSqlStatement sqlStatement in sqlBatch.Statements)
   {
      ProcessViewStatementBody(sqlStatement);
   }
}

When using the parser be prepared to use recursion a lot. Also when testing for a type, use an exact match as follows:

(sqlStatement.GetType() == typeof (AlterViewStatement)

Using * as placeholder for columns in the view is not supported in this parsing example, because this will require querying the sql database for metadata and this is outside the scope for this project. Another limitation of this example is that if a view is using another view, we do not recursively process that view to get to the base table and column.

The project source code can be downloaded here:
Andriescu SQL Wpf Parse Sql View