Part2: Creating a Custom SSMA Report
This is part 2 of the article for creating custom SSMA report using SSIS and SSRS. In the previous post, we discussed how to use SSIS package to extract XML files containing SSMA assessment report information into SQL Server table. This week, we will discuss how to parse the XML data.
First, let's create tables to hold the parsed data:
CREATE TABLE tblSSMAReport_Object (
Project VARCHAR(255),
Session VARCHAR(26),
ObjectID CHAR(36),
Path VARCHAR(255),
Category Varchar(32),
Name Varchar(32),
ConvertedWithError TINYINT,
SourceStatement Varchar(max),
PRIMARY KEY (Project, Session, ObjectID)
)
CREATE TABLE tblSSMAReport_MessageDetails (
Project VARCHAR(255),
Session VARCHAR(26),
ObjectID CHAR(36),
Code CHAR(8),
SourceCodeLink VARCHAR(800),
Description VARCHAR(MAX),
PRIMARY KEY (Project, Session, ObjectID, Code)
)
tblSSMAReport_Object table contains the information about the database object and its overall status. tblSSMAReport_MessageDetails table stores conversion message .
We will need to parse text in order to extract information based on relative location. I created a function which we will use several times in this example:
CREATE FUNCTION [dbo].[UDF_ParsePath]
(
@string VARCHAR(800),
@delimiter VARCHAR(16),
@mode VARCHAR(6) = 'last',
@keyword VARCHAR(16) = null
)
RETURNS VARCHAR(100)
AS
BEGIN
IF CHARINDEX(@delimiter, @string) > 0
BEGIN
DECLARE
@first_position INT,
@last_position INT
SET @first_position =
CASE
WHEN LOWER(@mode) = 'last' THEN
LEN(REPLACE(@string,' ','|')) -
(CHARINDEX(REVERSE(ISNULL(@keyword,@delimiter)), REVERSE(@string)) +
LEN(REPLACE(ISNULL(@keyword,@delimiter),' ','|')) - 1) +
LEN(REPLACE(@delimiter,' ','|'))
WHEN LOWER(@mode) = 'after' AND CHARINDEX(@keyword, @string) > 0 THEN
CASE
WHEN CHARINDEX(@delimiter, @string, CHARINDEX(@keyword, @string) + LEN(REPLACE(@keyword,' ','|'))) = 0
THEN 0
ELSE
CHARINDEX(@delimiter, @string, CHARINDEX(@keyword, @string) +
LEN(REPLACE(@keyword,' ','|')))+ LEN(REPLACE(@delimiter,' ','|'))
END
WHEN LOWER(@mode) = 'before' AND CHARINDEX(@keyword, @string) > 0 THEN
LEN(REPLACE(@string,' ','|')) -
(CHARINDEX(REVERSE(@delimiter), REVERSE(@string),
CHARINDEX(REVERSE(@keyword), REVERSE(@string)) +
LEN(REPLACE(@keyword,' ','|')) +
LEN(REPLACE(@delimiter,' ','|')))
) + 2
ELSE 0
END
SET @last_position = CASE
WHEN CHARINDEX(@delimiter, @string, @first_position) = 0
THEN LEN(REPLACE(@string,' ','|')) + 1
ELSE CHARINDEX(@delimiter, @string, @first_position)
END
IF @first_position > 0 AND @last_position - @first_position > 0
RETURN SUBSTRING(@string, @first_position, @last_position - @first_position)
END
RETURN NULL
END
The UDF returns segment of the string based on the following parameter value:
- Last: return an entire value of last segment (as identified by delimiter character) containing the keyword
- After: return the segment after the segment containing the keyword
- Before: return the segment before the segment containing the keyword
For example:
DECLARE @string VARCHAR(255) = 'C:\SSMA\Projects\ProjectName\report\report_2010_09_22T09_34_43\GUID\path.xml'
SELECT
dbo.UDF_ParsePath(@string, '\','before','report\') as Project,
dbo.UDF_ParsePath(@string, '\','last','report') as Session,
dbo.UDF_ParsePath(@string, '\','before','path.xml') as ObjectID
results in the following:
Project | Session | ObjectID |
ProjectName | report_2010_09_22T09_34_43 |
0027b853-988c-4be4-b937-9859503d014d |
We can then use the user defined function to parse the XML and populate tblSSMAReport_Object:
INSERT INTO tblSSMAReport_Object (ObjectID, Project, Session, Path)
SELECT ObjectID, Project, SESSION,
MAX(CASE WHEN rn=6 THEN path + ' > ' ELSE '' END)
+ MAX(CASE WHEN rn=5 THEN path + ' > ' ELSE '' END)
+ MAX(CASE WHEN rn=4 THEN path + ' > ' ELSE '' END)
+ MAX(CASE WHEN rn=3 THEN path + ' > ' ELSE '' END)
+ MAX(CASE WHEN rn=2 THEN path + ' > ' ELSE '' END)
+ MAX(CASE WHEN rn=1 THEN path ELSE '' END)
As PATH
FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY FileName ORDER BY FileName ) as rn,
dbo.UDF_ParsePath(rpt.FileName, '\','before','report\') as Project,
dbo.UDF_ParsePath(rpt.FileName, '\','last','report') as Session,
dbo.UDF_ParsePath(rpt.FileName, '\','before','path.xml') as ObjectID,
col.value('./@name','VARCHAR(100)') As Path
FROM tblRawData rpt
CROSS APPLY rpt.ObjectPath.nodes('//path/node') Tab(col) ) AS Data
GROUP BY ObjectID, Project, Session
The inner statement parse object path and XML values while the outer statement perform string aggregation to form the object path across multiple rows. The tblSSMAReport_Object table needs to be updated with additional attribute such as object category, name and conversion status. The object category can be parsed from the object path. I use the following UDF to help with the parsing:
CREATE FUNCTION [dbo].[UDF_IdentifyObjectType]
(
@string VARCHAR(800)
)
RETURNS VARCHAR(32)
AS
BEGIN
DECLARE @ObjectType VARCHAR(32)
DECLARE @ObjectTypes TABLE(Seq INT IDENTITY(1,1), Name VARCHAR(32), Match VARCHAR(32))
INSERT INTO @ObjectTypes (Name, Match)
VALUES ('packaged function', 'Packaged Functions'),('packaged type', 'Packaged Types'),('private packaged function', 'Private Packaged Functions'),('private packaged procedure', 'Private Packaged Procedures'),('private packaged type','Private Packaged Types'),('index','Indexes'),('trigger','Triggers'),('function','Functions'),('package','Packages'),('procedure','Procedures'),('sequence','Sequences'),('statement','Statements'),('synonym','Synonyms'),('table','Tables'),('user defined type','User Defined Types'),('view','Views')
SELECT TOP 1 @ObjectType= Name from @ObjectTypes
WHERE CHARINDEX(Match,@string) > 0
ORDER BY Seq
RETURN @ObjectType
END
GO
The function locates the specified keyword from the path and returns the first value found based on the sequence specified in the function.
For example:
SELECT [dbo].[UDF_IdentifyObjectType]('Schemas > HR > Tables > LOCATIONS > Indexes > LOC_COUNTRY_IX')
returns 'Index' even though both 'Tables' and 'Indexes' keyword exists, but since index appears first in the sequence, the index keyword is returned. With UDF_IdentifyObjectType created, I can now update the object table and cross apply with the value from cat.xml values in the tblRawData to get the conversion status.
UPDATE obj
SET
obj.Category = dbo.UDF_IdentifyObjectType(Path),
obj.Name = dbo.UDF_ParsePath(Path, ' > ','after',dbo.UDF_IdentifyObjectType(Path)),
obj.ConvertedWithError = CASE WHEN NotCvt.NotConvertedCount > 0 THEN 1 ELSE 0 END
FROM tblSSMAReport_Object obj
LEFT JOIN tblRawData rawdata ON dbo.UDF_ParsePath(rawdata.FileName, '\','before','src.sql.txt') = obj.ObjectID
AND dbo.UDF_ParsePath(rawdata.FileName, '\','before','report\') = obj.Project
AND dbo.UDF_ParsePath(rawdata.FileName, '\','last','report') = obj.Session
LEFT JOIN (
SELECT dbo.UDF_ParsePath(rpt.FileName, '\','before','cat.xml') as ObjectID,
dbo.UDF_ParsePath(rpt.FileName, '\','before','report\') as Project,
dbo.UDF_ParsePath(rpt.FileName, '\','last','report') as Session,
col.value('./@category','VARCHAR(32)') As Category,
col.value('./@not-converted-count','INT') As NotConvertedCount
FROM tblRawData rpt
CROSS APPLY rpt.ObjectCategory.nodes('//statistics/statistic') Tab(col)) NotCvt ON NotCvt.ObjectID = obj.ObjectID
AND NotCvt.Session = obj.Session and NotCvt.Project = obj.Project
and dbo.UDF_IdentifyObjectType(Path) = NotCvt.Category
where obj.Category is null
INSERT INTO tblSSMAReport_MessageDetails (Project, Session, ObjectID, Code, Description)
SELECT
dbo.UDF_ParsePath(rpt.FileName, '\','before','report\') as Project,
dbo.UDF_ParsePath(rpt.FileName, '\','last','report') as Session,
dbo.UDF_ParsePath(rpt.FileName, '\','before','messages.xml') as ObjectID,
LEFT(col.value('./@name','VARCHAR(max)'),8) as Code,
col.value('./@name','VARCHAR(max)') As Description
FROM tblRawData rpt
CROSS APPLY rpt.ObjectMessage.nodes('//categories/category/node/record') Tab(col)
In the part 3 of this article, I will show an example of SSRS report which consume from the tables we just created.
Comments
- Anonymous
December 02, 2012
The comment has been removed - Anonymous
October 31, 2013
The comment has been removed - Anonymous
November 04, 2013
So I was able to workaround the truncation error by altering the table tblSSMAReport_Object 'Name' field from Varchar(32) to Varchar(40). However the Primary Key Constraint violation is still an issue. Anyone have any ideas?