Merging Comments from Multiple Open XML Documents into a Single Document

Microsoft Word 2007 lets you lock a document so that users cannot change its content but can still add comments.  If you have multiple documents with the same content but different comments, you can merge those comments into a single document.  One possible use is a specification review system.  After a specification writer finishes a specification, she can send it to other members of her team for review.  As each reviewer returns the specification, she can merge the comments into a single document, which makes them simpler to integrate.  (This example was inspired by an email thread with Sergey Solyanik, who needed a comment merger for his very cool code (and soon-to-be spec) review system.  Thanks also to Sergey for doing a code review; the code is better for it.)  The code that merges comments is available in a zip file named CommentMerger.zip on the downloads tab at www.codeplex.com/powertools.

Note: The CommentMerger class is part of the Power Tools for Open XML project.  In the future, I’m going to build a new PowerShell cmdlet to do comment merging.  Power Tools for Open XML is an open source project on CodePlex that makes it easy to create and modify Open XML documents using PowerShell scripts.  Note that Power Tools for Open XML is not a supported Microsoft product and doesn’t necessarily represent future product direction.  We think it will serve as inspiration for customers who need to create and modify Open XML documents programmatically.  Power Tools for Open XML is published under the Microsoft Public License (Ms-PL), which gives you wide latitude in how you use the code.

The CommentMerger.MergeComments method uses the Open XML SDK.  Using the class is simple: call the method, passing two open WordprocessingDocument objects, the destination (opened for editing) and the source (opened read-only):

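// Open the destination for editing (true) and the source read-only (false).
using (WordprocessingDocument destinationDocument =
    WordprocessingDocument.Open("Test1a.docx", true))
using (WordprocessingDocument sourceDocument =
    WordprocessingDocument.Open("Test1b.docx", false))
{
    // Copies the comments from sourceDocument into destinationDocument.
    CommentMerger.MergeComments(destinationDocument, sourceDocument);
}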

Upon return, the comments in the source document have been merged into the destination document.  To merge comments from multiple documents, call the method once for each source document.
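
As a minimal sketch of merging several reviewers' copies, the loop below repeats the same call once per source document.  The file names Test1c.docx and Test1d.docx and the MergeAllReviews class are hypothetical, and the sketch assumes that every reviewer copy has the same content as Test1a.docx.  It reopens the destination for each merge so that each call looks exactly like the single-document example.

```csharp
using DocumentFormat.OpenXml.Packaging;

static class MergeAllReviews
{
    // Merges the comments from each reviewer copy into Test1a.docx, one at a time.
    public static void Run()
    {
        // Hypothetical reviewer copies; Test1c.docx and Test1d.docx are made up for illustration.
        string[] reviewerCopies = { "Test1b.docx", "Test1c.docx", "Test1d.docx" };

        foreach (string reviewerCopy in reviewerCopies)
        {
            using (WordprocessingDocument destinationDocument =
                WordprocessingDocument.Open("Test1a.docx", true))
            using (WordprocessingDocument sourceDocument =
                WordprocessingDocument.Open(reviewerCopy, false))
            {
                CommentMerger.MergeComments(destinationDocument, sourceDocument);
            }
        }
    }
}
```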

I wrote the CommentMerger.MergeComments method in the pure functional style.  All methods are written without side effects.  After initialization, no variable is mutated.  For a detailed explanation of this approach, see Recursive Approach to Pure Functional Transformations of XML.
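
To give a flavor of that style, here is a minimal sketch of a recursive, side-effect-free transform in LINQ to XML.  It is not the CommentMerger code itself.  The element names doc, p, and b, and the rule that renames b to strong, are made up for illustration; the point is that the method builds a new tree and never modifies its input.

```csharp
using System.Linq;
using System.Xml.Linq;

static class PureTransformSketch
{
    // Returns a transformed copy of the node; the input tree is never modified.
    public static XNode Transform(XNode node)
    {
        XElement element = node as XElement;
        if (element == null)
            return node; // Text and other nodes pass through; LINQ to XML clones them when they are added to the new tree.

        // Illustrative rule: rename <b> to <strong>, keep every other name.
        XName name = element.Name == "b" ? "strong" : element.Name;

        // Build a new element from the (cloned) attributes and the transformed children.
        return new XElement(name,
            element.Attributes(),
            element.Nodes().Select(n => Transform(n)));
    }
}
```

Because the only way to get a result is through the return value, the caller decides what to do with the new tree, and the original stays available for comparison.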

The CommentMerger.MergeComments method is an example of a ‘Common-vocabulary document-centric transform’.  For an overview of these types of transforms, see Document-Centric Transforms using LINQ to XML.

Before merging comments, the CommentMerger.MergeComments method validates that the two documents contain the same content.  For more info, see Comparing Two Open XML Documents using the Zip Extension Method.
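
The sketch below shows the general idea of a pairwise comparison with Zip; it is not the validation code from CommentMerger.  It assumes that "same content" means that corresponding paragraphs contain the same text, and it uses the standard Enumerable.Zip method from .NET 4 rather than a custom extension method.  The ContentComparerSketch class and its method names are hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

static class ContentComparerSketch
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Concatenates the text of all w:t elements in a paragraph.
    static string ParagraphText(XElement paragraph)
    {
        return string.Concat(paragraph.Descendants(W + "t").Select(t => t.Value));
    }

    // True when both bodies have the same number of paragraphs
    // and each pair of paragraphs has the same text.
    public static bool HaveSameContent(XElement body1, XElement body2)
    {
        var paragraphs1 = body1.Descendants(W + "p").ToList();
        var paragraphs2 = body2.Descendants(W + "p").ToList();

        // Zip stops at the end of the shorter sequence, so compare the counts separately.
        if (paragraphs1.Count != paragraphs2.Count)
            return false;

        return paragraphs1
            .Zip(paragraphs2, (p1, p2) => ParagraphText(p1) == ParagraphText(p2))
            .All(same => same);
    }
}
```

Comparing paragraph text rather than raw markup means that two documents still count as equal when the same text is divided into runs differently.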

This code is based on the code I presented in Splitting Runs in Open XML Word Processing Document Paragraphs.

This code pre-atomizes XName objects.  See A More Robust Approach for Handling XName Objects in LINQ to XML.
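
As a small illustration of what pre-atomizing means: the XNamespace and XName objects are created once, in static fields, and queries then reuse those fields instead of building each name from strings on every call.  The class and field names below are shorthand for this sketch and are not necessarily the ones used in the CommentMerger source.

```csharp
using System.Linq;
using System.Xml.Linq;

// Names are created once, when the class is first used.
static class W
{
    public static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    public static readonly XName p = w + "p";
    public static readonly XName commentRangeStart = w + "commentRangeStart";
    public static readonly XName commentRangeEnd = w + "commentRangeEnd";
    public static readonly XName commentReference = w + "commentReference";
}

static class AtomizedNamesSketch
{
    // Counts the comment ranges in a body element using a pre-atomized name.
    public static int CountCommentRanges(XElement body)
    {
        return body.Descendants(W.commentRangeStart).Count();
    }
}
```

Besides saving repeated name lookups, this puts every element name in one place, so a misspelled name becomes a compile-time error instead of a query that silently returns nothing.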

The code uses the Open XML SDK.

Comments

  • Anonymous
    August 21, 2009
    Much obliged. I read this with great interest, and your blog is very useful in general.

  • Anonymous
    July 23, 2012
    Hi Eric, this post is excellent. This works fine when the toDocument already has a comment, but if it does not have a comment, the generated document shows an error and removes comments. I am working on merging two documents from a SharePoint document library. Can you please help me with this? Thanks in advance.

  • Anonymous
    July 23, 2012
    Hi Venkata, I'll take a look at this. -Eric

  • Anonymous
    July 23, 2012
    Thanks Eric. I have tried fixing this by adding a dummy comment to the toDocument, but for some reason the relationship between the main document part and the comments part is missing when the comments part is added in code. When I look at the XML structure and open the rels file, the relationship for comments.xml is missing. Something is not working correctly. The same approach works when I try it with two documents that are on the hard disk rather than in memory. It would be really helpful if you could give some suggestions or help on this. Thanks.