Delete comments by all or a specific author in a word processing document
This topic shows how to use the classes in the Open XML SDK for
Office to programmatically delete comments by all or a specific author
in a word processing document, without having to load the document into
Microsoft Word. It contains an example DeleteComments
method to illustrate this task.
DeleteComments Method
You can use the DeleteComments method to delete all of the comments
from a word processing document, or only those written by a specific
author. As shown in the following code, the method accepts two
parameters: the name of the document to modify (string) and,
optionally, the name of the author whose comments you want to delete
(string). If you supply an author name, the code deletes only the
comments written by that author. If you do not supply an author name,
the code deletes all comments.
// Delete comments by a specific author. Pass an empty string for the
// author to delete all comments, by all authors.
static void DeleteComments(string fileName, string author = "")
Calling the DeleteComments Method
To call the DeleteComments method, provide the required parameters as
shown in the following code.
if (args is [{ } fileName, { } author])
{
    DeleteComments(fileName, author);
}
else if (args is [{ } fileName2])
{
    DeleteComments(fileName2);
}
How the Code Works
The following code starts by opening the document, using the
WordprocessingDocument.Open method and indicating that the document
should be opened for read/write access (the final true parameter
value). Next, the code retrieves a reference to the main document part
from the MainDocumentPart property of the word processing document, and
then retrieves the comments part from the WordprocessingCommentsPart
property of the main document part. If the comments part is missing,
there is no point in proceeding, as there cannot be any comments to
delete.
// Get an existing Wordprocessing document.
using (WordprocessingDocument document = WordprocessingDocument.Open(fileName, true))
{
    if (document.MainDocumentPart is null)
    {
        // The file name argument is valid, but the package has no main part.
        throw new InvalidOperationException("MainDocumentPart is null.");
    }

    // Set commentPart to the document WordprocessingCommentsPart,
    // if it exists.
    WordprocessingCommentsPart? commentPart = document.MainDocumentPart.WordprocessingCommentsPart;

    // If no WordprocessingCommentsPart exists, there can be no
    // comments. Stop execution and return from the method.
    if (commentPart is null)
    {
        return;
    }
Creating the List of Comments
The code next performs two tasks: creating a list of all the comments to
delete, and creating a list of comment IDs that correspond to the
comments to delete. Given these lists, the code can both delete the
comments from the comments part that contains them, and delete the
references to the comments from the document part. The following code
starts by retrieving a list of Comment elements. To retrieve the list,
it converts the Elements<Comment>() collection exposed by the Comments
property of the commentPart variable into a list of Comment objects.
List<Comment> commentsToDelete = commentPart.Comments.Elements<Comment>().ToList();
So far, the list of comments contains all of the comments. If the author parameter is not an empty string, the following code limits the list to only those comments where the Author property matches the parameter you supplied.
if (!String.IsNullOrEmpty(author))
{
    commentsToDelete = commentsToDelete.Where(c => c.Author == author).ToList();
}
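Because the filter looks for an exact match on the author name, it can help to check which names a document actually contains before calling DeleteComments. The following is a minimal sketch, not part of the original sample: CountCommentsByAuthor is a hypothetical helper name, and the code assumes the document has already been opened by the caller (read-only access is enough).

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the original sample): counts the
// comments stored for each author name in an already opened document.
static Dictionary<string, int> CountCommentsByAuthor(WordprocessingDocument document)
{
    // A missing main document part or comments part means there are no comments.
    var comments = document.MainDocumentPart?.WordprocessingCommentsPart?.Comments;

    if (comments is null)
    {
        return new Dictionary<string, int>();
    }

    return comments.Elements<Comment>()
        // Comments without an author attribute are grouped under an empty name.
        .GroupBy(c => c.Author?.Value ?? string.Empty)
        .ToDictionary(g => g.Key, g => g.Count());
}
```

To use the helper, open the file with WordprocessingDocument.Open(fileName, false) inside a using statement and pass the resulting document to it. The keys of the returned dictionary are the author names that DeleteComments can match.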
Before deleting any comments, the code retrieves a list of comment ID values, so that it can later delete matching elements from the document part. The call to the Where method skips any comment that has no ID, and the call to the Select method projects the remaining comments, retrieving an IEnumerable<T> of strings that contains the comment ID values.
IEnumerable<string?> commentIds = commentsToDelete.Where(r => r.Id is not null && r.Id.HasValue).Select(r => r.Id?.Value);
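The commentIds sequence is a deferred LINQ query, so every later call to Contains enumerates the list of comments again. For documents with many comments, one possible variation is to evaluate the query once and keep the IDs in a set. The sketch below is an assumption-labeled alternative, not the method used by the sample; CollectCommentIds is a hypothetical helper name.

```csharp
using DocumentFormat.OpenXml.Wordprocessing;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the original sample): evaluates the
// ID projection once and stores the result for fast lookups.
static HashSet<string> CollectCommentIds(IEnumerable<Comment> comments)
{
    return new HashSet<string>(
        comments
            .Select(c => c.Id?.Value)   // null when the comment has no ID
            .OfType<string>());         // drops the null entries
}
```

A set built this way could replace commentIds in the later Contains checks, assuming the rest of the method is left unchanged.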
Deleting Comments and Saving the Part
Given the commentsToDelete collection, the following code loops through
all the comments that require deleting and performs the deletion.
// Delete each comment in commentsToDelete from the
// Comments collection.
foreach (Comment c in commentsToDelete)
{
    if (c is not null)
    {
        c.Remove();
    }
}
Deleting Comment References in the Document
Although the code has successfully removed all the comments by this point, that is not enough. The code must also remove references to the comments from the document part. This action requires three steps because the comment reference includes the CommentRangeStart, CommentRangeEnd, and CommentReference elements, and the code must remove all three for each comment. Before performing any deletions, the code first retrieves a reference to the root element of the main document part, as shown in the following code.
Document doc = document.MainDocumentPart.Document;
Given a reference to the document element, the following code performs
its deletion loop three times, once for each of the different elements
it must delete. In each case, the code looks for all descendants of the
correct type (CommentRangeStart, CommentRangeEnd, or CommentReference)
and limits the list to those whose Id property value is contained in
the list of comment IDs to be deleted. Given the list of elements to be
deleted, the code removes each element in turn. The code does not call
a save method explicitly: because the document was opened for editing
inside a using statement, the changes are written by default when the
block ends and the document is disposed.
// Delete CommentRangeStart for each
// deleted comment in the main document.
List<CommentRangeStart> commentRangeStartToDelete = doc.Descendants<CommentRangeStart>()
    .Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value))
    .ToList();

foreach (CommentRangeStart c in commentRangeStartToDelete)
{
    c.Remove();
}

// Delete CommentRangeEnd for each deleted comment in the main document.
List<CommentRangeEnd> commentRangeEndToDelete = doc.Descendants<CommentRangeEnd>()
    .Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value))
    .ToList();

foreach (CommentRangeEnd c in commentRangeEndToDelete)
{
    c.Remove();
}

// Delete CommentReference for each deleted comment in the main document.
List<CommentReference> commentRangeReferenceToDelete = doc.Descendants<CommentReference>()
    .Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value))
    .ToList();

foreach (CommentReference c in commentRangeReferenceToDelete)
{
    c.Remove();
}
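The three loops differ only in the element type they target, so they could be folded into one generic helper. The following is a sketch under stated assumptions rather than the sample's own approach: RemoveMarkers is a hypothetical name, the ID is read through a caller-supplied delegate because the three element types are treated here as sharing only the OpenXmlElement base class, and nullable reference types are assumed to be enabled as in the sample.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the original sample): removes every
// descendant of type T whose ID is in the set of deleted comment IDs,
// and returns the number of elements removed.
static int RemoveMarkers<T>(OpenXmlElement root, Func<T, StringValue?> getId, ISet<string> ids)
    where T : OpenXmlElement
{
    // ToList() takes a snapshot of the matches, so the tree is not
    // modified while it is still being enumerated.
    List<T> matches = root.Descendants<T>()
        .Where(e => getId(e)?.Value is string id && ids.Contains(id))
        .ToList();

    foreach (T element in matches)
    {
        element.Remove();
    }

    return matches.Count;
}
```

With a set of deleted IDs in hand, the helper would be called once per element type, for example RemoveMarkers<CommentRangeStart>(doc, e => e.Id, ids), and likewise for CommentRangeEnd and CommentReference.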
Sample Code
The following is the complete code sample in C#.
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using System;
using System.Collections.Generic;
using System.Linq;

// Delete comments by a specific author. Pass an empty string for the
// author to delete all comments, by all authors.
static void DeleteComments(string fileName, string author = "")
{
    // Get an existing Wordprocessing document.
    using (WordprocessingDocument document = WordprocessingDocument.Open(fileName, true))
    {
        if (document.MainDocumentPart is null)
        {
            // The file name argument is valid, but the package has no main part.
            throw new InvalidOperationException("MainDocumentPart is null.");
        }

        // Set commentPart to the document WordprocessingCommentsPart,
        // if it exists.
        WordprocessingCommentsPart? commentPart = document.MainDocumentPart.WordprocessingCommentsPart;

        // If no WordprocessingCommentsPart exists, there can be no
        // comments. Stop execution and return from the method.
        if (commentPart is null)
        {
            return;
        }

        // Create a list of comments by the specified author, or
        // if the author name is empty, all authors.
        List<Comment> commentsToDelete = commentPart.Comments.Elements<Comment>().ToList();

        if (!String.IsNullOrEmpty(author))
        {
            commentsToDelete = commentsToDelete.Where(c => c.Author == author).ToList();
        }

        // Collect the IDs of the comments to delete, skipping comments without an ID.
        IEnumerable<string?> commentIds = commentsToDelete.Where(r => r.Id is not null && r.Id.HasValue).Select(r => r.Id?.Value);

        // Delete each comment in commentsToDelete from the
        // Comments collection.
        foreach (Comment c in commentsToDelete)
        {
            if (c is not null)
            {
                c.Remove();
            }
        }

        Document doc = document.MainDocumentPart.Document;

        // Delete CommentRangeStart for each
        // deleted comment in the main document.
        List<CommentRangeStart> commentRangeStartToDelete = doc.Descendants<CommentRangeStart>()
            .Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value))
            .ToList();

        foreach (CommentRangeStart c in commentRangeStartToDelete)
        {
            c.Remove();
        }

        // Delete CommentRangeEnd for each deleted comment in the main document.
        List<CommentRangeEnd> commentRangeEndToDelete = doc.Descendants<CommentRangeEnd>()
            .Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value))
            .ToList();

        foreach (CommentRangeEnd c in commentRangeEndToDelete)
        {
            c.Remove();
        }

        // Delete CommentReference for each deleted comment in the main document.
        List<CommentReference> commentRangeReferenceToDelete = doc.Descendants<CommentReference>()
            .Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value))
            .ToList();

        foreach (CommentReference c in commentRangeReferenceToDelete)
        {
            c.Remove();
        }
    }
}