Share via


T-SQL Tips : Search and Replace String From Multiple Nodes Within a XML Document

 

Impetus

The impetus for this article comes from a recent question in the forums asking for find and replace on a string from multiple nodes within a xml document. This article suggests a solution which you can use to fulfil the requirement.

Business Case

One of the business scenario where I have come across this requirement is when I was passing metadata information through XML. The XML will have details of the tables with column information. This was supposed to be used for filling the details of a form.


Solution

SQL Server introduced xpath support for XML from 2005 onwards through set of functions like nodes,query,modify etc.
The function modify() can be used for performing data manipulation over XML node elements. This can be utilized in scenarios like above to replace contents within XML.
Now lets see an illustration using modify function which replaces node value within a XML document

Illustration

Consider the below simple table for illustration

declare @t table
(
x xml
)
 
 
INSERT @t
(
x 
)
VALUES('<Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.InvoiceID">1</Column>
          </Columns>
        </GridLayout>
      </Grid>
   </Form>')

The above table stores a single xml value. Now requirement is to replace instances of Schema1 by Schema2 in the above xml. In the above case there is only a single instance of Schema1 existing within the XML. Hence this can be achieved by applying the xpath function modify and the solution would be as below

declare @t table
(
x xml
)
 
 
INSERT @t
(
x 
)
VALUES('<Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.InvoiceID">1</Column>
          </Columns>
        </GridLayout>
      </Grid>
   </Form>')
   
   
    declare @SearchString varchar(100),@ReplaceString varchar(100)
 
    SELECT @SearchString = 'Schema1',@ReplaceString = 'Schema2'
    
   UPDATE r
   SET x.modify('replace value of (/Form/Grid/GridLayout/Columns/Column[contains(@key,sql:variable("@SearchString"))]/@key)[1] with sql:column("y")')
   FROM (SELECT x,REPLACE(t.u.value('@key','varchar(100)'),@SearchString,@ReplaceString) as  y
   FROM @t
   CROSS APPLY x.nodes('/Form/Grid/GridLayout/Columns/Column[contains(./@key,sql:variable("@SearchString"))]')t(u)
   )r
   

Now check the output and you can see the below

<Form name="InvoiceList">
  <Grid name="dbgInvoices">
    <GridLayout>
      <Columns>
        <Column key="field1" />
        <Column key="Schema2.TableName.InvoiceID">1</Column>
      </Columns>
    </GridLayout>
  </Grid>
</Form>

The above solution shreds the searched node from the XML, applies the replace logic over it and then replaces it back into the main XML value using the modify function. As you see the Schema1 instance got successfully replaced as Schema2 in the above illustration.
Though it works well for the scenario above you can see that this can't be applied to the cases where you have multiple nodes within the XML, which requires the value to be replaced. Things become a little more tricky when you have multiple instances of the node repeating within the same XML.
To illustrate that consider this modified XML structure

DECLARE @UserTable TABLE
    (
      userId INT  IDENTITY(1, 1) ,
      sUserName NVARCHAR(150) ,
      xmlSettings XML
    )
 
INSERT  INTO  @usertable
        SELECT  'user1' ,
                '<Forms>
   <Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.InvoiceID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.InvoiceAmount" />
            <Column key="field3" />
            <Column key="Schema1.TableName.InvoiceDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
   </Form>
   <Form name="OrderList"><Grid name="dbgOrders">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.OrderID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.OrderDescription" />
            <Column key="field3" />
            <Column key="Schema1.TableName.OrderDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
    </Form>
</Forms>'
        UNION ALL
        
  SELECT  'user2' ,
                '<Forms>
   <Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="TableName.InvoiceID"/>
            <Column key="field2" />
            <Column key="TableName.InvoiceAmount" />
            <Column key="field3" />
            <Column key="TableName.InvoiceDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
   </Form>
   <Form name="SomeOtherForm">...</Form>
</Forms>'

Now lets try applying the earlier solution for the above scenario and see the result

DECLARE @UserTable TABLE
    (
      userId INT  IDENTITY(1, 1) ,
      sUserName NVARCHAR(150) ,
      xmlSettings XML
    )
 
INSERT  INTO  @usertable
        SELECT  'user1' ,
                '<Forms>
   <Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.InvoiceID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.InvoiceAmount" />
            <Column key="field3" />
            <Column key="Schema1.TableName.InvoiceDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
   </Form>
   <Form name="OrderList"><Grid name="dbgOrders">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.OrderID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.OrderDescription" />
            <Column key="field3" />
            <Column key="Schema1.TableName.OrderDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
    </Form>
</Forms>'
        UNION ALL
        
  SELECT  'user2' ,
                '<Forms>
   <Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="TableName.InvoiceID"/>
            <Column key="field2" />
            <Column key="TableName.InvoiceAmount" />
            <Column key="field3" />
            <Column key="TableName.InvoiceDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
   </Form>
   <Form name="SomeOtherForm">...</Form>
</Forms>'
 
UPDATE m
SET xmlSettings.modify('replace value of (//Grid[@name="dbgInvoices"]/GridLayout/Columns/Column[ contains(@key , "TableName")]/@key) with sql:column("replaceval") ')
FROM
(
   SELECT  xmlSettings,REPLACE(t.u.value('./@key[1]','varchar(100)'),'TableName.','SomeOthertable.') AS  replaceval
   FROM    @UserTable p
   CROSS APPLY p.xmlSettings.nodes('/Forms/Form/Grid[@name="dbgInvoices"]/GridLayout/Columns/Column[ contains(@key , "TableName")]')t(u)
)m

 
Executing the above code will give you an error message as below

Msg 2337, Level 16, State 1, Line 64
XQuery [m.xmlSettings.modify()]: The target of 'replace' must be at most one node, found 'attribute(key,xdt:untypedAtomic) *'

The error message is straightforward. Modify can be applied on at most a single node at a time inside XML. 

Since we require values from multiple nodes to be replaced in the above case we need to implement some kind of an iterating logic using the modify function.

For this purpose we can add a WHILE loop to the code above. The iterating condition should be up and until the last occurrence of the search string is replaced within the XML.
The existence of a node can be checked by using the xpath function exist(). exist() will return a boolean result based on whether the searched node exists within XML document. This can be utilized in our WHILE loop to make sure the find and replace logic gets repeated until we replace all the occurrences.

Keeping this in mind we can rewrite the query as below

DECLARE @UserTable TABLE
    (
      userId INT  IDENTITY(1, 1) ,
      sUserName NVARCHAR(150) ,
      xmlSettings XML
    )
 
INSERT  INTO  @usertable
        SELECT  'user1' ,
                '<Forms>
   <Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.InvoiceID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.InvoiceAmount" />
            <Column key="field3" />
            <Column key="Schema1.TableName.InvoiceDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
   </Form>
   <Form name="OrderList"><Grid name="dbgOrders">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.TableName.OrderID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.OrderDescription" />
            <Column key="field3" />
            <Column key="Schema1.TableName.OrderDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
    </Form>
</Forms>'
        UNION ALL
        
  SELECT  'user2' ,
                '<Forms>
   <Form name="InvoiceList">
      <Grid name="dbgInvoices">
        <GridLayout>
          <Columns>
            <Column key="field1" />
            <Column key="Schema1.ableName.InvoiceID"/>
            <Column key="field2" />
            <Column key="Schema1.TableName.InvoiceAmount" />
            <Column key="field3" />
            <Column key="Schema1.TableName.InvoiceDate" />
            <Column key="field3" />
          </Columns>
        </GridLayout>
      </Grid>
   </Form>
</Forms>'

The iterating logic looks like this

WHILE EXISTS ( SELECT   1
                        FROM    @UserTable
                        WHERE   xmlSettings.exist('//Grid[@name="dbgInvoices"]/GridLayout/Columns/Column[ contains(@key , "TableName")]') = 1 )
BEGIN           
                         
     UPDATE m
     SET xmlSettings.modify('replace value of (//Grid[@name="dbgInvoices"]/GridLayout/Columns/Column[ contains(@key , "TableName")]/@key)[1] with sql:column("replaceval") ')
      FROM
      (
          SELECT  xmlSettings,REPLACE(t.u.value('./@key[1]','varchar(100)'),'TableName.','SomeOthertable.') AS  replaceval
          FROM    @UserTable p
          CROSS APPLY p.xmlSettings.nodes('/Forms/Form/Grid[@name="dbgInvoices"]/GridLayout/Columns/Column[ contains(@key , "TableName")]')t(u)
      )m
                         
END

As you see from the above we check for the presence of nodes with search string value coming as attribute value using a WHILE loop which utilizes the exist function. Inside the loop we apply modify function to replace the first occurrence of the Node element containing the pattern with the replaced string. This process will continue until there are no more nodes with the pattern being present within the XML. At this point loop exits.

Now if you query the XML it would look like below

<Forms>
  <Form name="InvoiceList">
    <Grid name="dbgInvoices">
      <GridLayout>
        <Columns>
          <Column key="field1" />
          <Column key="Schema2.TableName.InvoiceID" />
          <Column key="field2" />
          <Column key="Schema2.TableName.InvoiceAmount" />
          <Column key="field3" />
          <Column key="Schema2.TableName.InvoiceDate" />
          <Column key="field3" />
        </Columns>
      </GridLayout>
    </Grid>
  </Form>
  <Form name="OrderList">
    <Grid name="dbgOrders">
      <GridLayout>
        <Columns>
          <Column key="field1" />
          <Column key="Schema1.TableName.OrderID" />
          <Column key="field2" />
          <Column key="Schema1.TableName.OrderDescription" />
          <Column key="field3" />
          <Column key="Schema1.TableName.OrderDate" />
          <Column key="field3" />
        </Columns>
      </GridLayout>
    </Grid>
  </Form>
</Forms>

This indicates that all instances of the nodes containing pattern got replaced as per our search criteria ie for the grid dbgInvoices within the Form.
Now if we want to replace all instances of nodes with searched pattern remove the filter criteria on Grid name attribute and it will be as below

WHILE EXISTS
 ( 
        SELECT  1
        FROM    @UserTable
        WHERE   xmlSettings.exist('//Grid/GridLayout/Columns/Column[ contains(@key , "Schema1.")]') = 1
 )
BEGIN           
                         
         UPDATE m
         SET xmlSettings.modify('replace value of (//Grid/GridLayout/Columns/Column[ contains(@key , "Schema1.")]/@key)[1] with sql:column("replaceval") ')
         FROM
         (
              SELECT  xmlSettings,
                               REPLACE(t.u.value('./@key[1]','varchar(100)'),'Schema1.','Schema2.') AS  replaceval
              FROM    @UserTable p
              CROSS APPLY p.xmlSettings.nodes('/Forms/Form/Grid/GridLayout/Columns/Column[ contains(@key , "Schema1.")]')t(u)
         )m
                       
END

On checking the output you will see the below

<Forms>
  <Form name="InvoiceList">
    <Grid name="dbgInvoices">
      <GridLayout>
        <Columns>
          <Column key="field1" />
          <Column key="Schema2.TableName.InvoiceID" />
          <Column key="field2" />
          <Column key="Schema2.TableName.InvoiceAmount" />
          <Column key="field3" />
          <Column key="Schema2.TableName.InvoiceDate" />
          <Column key="field3" />
        </Columns>
      </GridLayout>
    </Grid>
  </Form>
  <Form name="OrderList">
    <Grid name="dbgOrders">
      <GridLayout>
        <Columns>
          <Column key="field1" />
          <Column key="Schema2.TableName.OrderID" />
          <Column key="field2" />
          <Column key="Schema2.TableName.OrderDescription" />
          <Column key="field3" />
          <Column key="Schema2.TableName.OrderDate" />
          <Column key="field3" />
        </Columns>
      </GridLayout>
    </Grid>
  </Form>
</Forms>

Summary

As you see from the above illustration we can search and replace a string from multiple nodes within a XML document by using a WHILE loop implemented using xpath function exist() to check for the existence of patterns and using modify() function along with replace value of to replace those instances.

See Also

Exist function

Modify function

replace value