How to fix an OpenXml issue in Word interop (C#) concerning bullet lists

Gael 0 Reputation points
2025-02-17T15:55:25.07+00:00

Here is a truncated Open XML standalone package (Flat OPC). It is the string returned by range.WordOpenXml, which can be loaded by the Open XML SDK with WordprocessingDocument.FromFlatOpcString.

Below is a very simple bullet list (the xml has been truncated to make it easier to read):

<?xml version="1.0" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
	<pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
		<pkg:xmlData>
			<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
				<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
			</Relationships>
		</pkg:xmlData>
	</pkg:part>
	<pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
		<pkg:xmlData>
			<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
				<w:body>
					<w:p w:rsidR="00F80704" w:rsidRPr="00AF0002" w:rsidRDefault="00F80704" w:rsidP="00F80704">
						<w:pPr>
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
						</w:pPr>
						<w:bookmarkStart w:id="0" w:name="REQ_WORL3_0003"/>
						<w:r w:rsidRPr="00AF0002">
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
							<w:t>Key points:</w:t>
						</w:r>
					</w:p>
					<w:p w:rsidR="00F80704" w:rsidRPr="00AF0002" w:rsidRDefault="00F80704" w:rsidP="00F80704">
						<w:pPr>
							<w:pStyle w:val="Paragraphedeliste"/>
							<w:numPr>
								<w:ilvl w:val="0"/>
								<w:numId w:val="1"/>
							</w:numPr>
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
						</w:pPr>
						<w:r w:rsidRPr="00AF0002">
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
							<w:t>point 1</w:t>
						</w:r>
					</w:p>
					<w:p w:rsidR="00F80704" w:rsidRPr="00AF0002" w:rsidRDefault="00F80704" w:rsidP="00F80704">
						<w:pPr>
							<w:pStyle w:val="Paragraphedeliste"/>
							<w:numPr>
								<w:ilvl w:val="0"/>
								<w:numId w:val="1"/>
							</w:numPr>
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
						</w:pPr>
						<w:r w:rsidRPr="00AF0002">
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
							<w:t>point 2</w:t>
						</w:r>
					</w:p>
					<w:p w:rsidR="00F80704" w:rsidRDefault="00F80704" w:rsidP="00F80704">
						<w:r w:rsidRPr="00AF0002">
							<w:rPr>
								<w:highlight w:val="cyan"/>
								<w:lang w:val="en-US"/>
							</w:rPr>
							<w:t>point 3</w:t>
						</w:r>
						<w:bookmarkEnd w:id="0"/>
					</w:p>
					<w:sectPr w:rsidR="00000000">
						<w:pgSz w:w="12240" w:h="15840"/>
						<w:pgMar w:top="1417" w:right="1417" w:bottom="1417" w:left="1417" w:header="720" w:footer="720" w:gutter="0"/>
						<w:cols w:space="720"/>
					</w:sectPr>
				</w:body>
			</w:document>
		</pkg:xmlData>
	</pkg:part>
</pkg:package>

As you can see, the last bullet list item ("point 3") does not have a paragraph id or paragraph properties (no <w:pPr>), which makes it impossible to identify as a list item.

I have another document with exactly the same issue, except that there the last item is not part of the bullet list.
I compared the two and there is no way to tell the difference.

The common point in both documents is the bookmarkEnd (with a very suspicious id, but fortunately I do not need it). Removing it is not an option.
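
To make the ambiguity concrete, here is a minimal sketch that sorts paragraphs into three kinds instead of two, so that a paragraph with no <w:pPr> at all is reported as undetermined rather than guessed. The names ParagraphKind and ParagraphClassifier are hypothetical and exist only for this illustration. Numbering inherited through a paragraph style is not checked.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical names, introduced only for this sketch.
public enum ParagraphKind { ListItem, Plain, Undetermined }

public static class ParagraphClassifier
{
    public static List<(string Text, ParagraphKind Kind, bool HasBookmarkEnd)> Classify(string flatOpcXml)
    {
        var result = new List<(string Text, ParagraphKind Kind, bool HasBookmarkEnd)>();

        using (var doc = WordprocessingDocument.FromFlatOpcString(flatOpcXml))
        {
            var body = doc.MainDocumentPart?.Document?.Body;
            if (body == null)
            {
                return result;
            }

            foreach (var para in body.Elements<Paragraph>())
            {
                var pPr = para.ParagraphProperties;

                ParagraphKind kind;
                if (pPr == null)
                {
                    // No <w:pPr> at all: the XML does not say whether this is a list item.
                    kind = ParagraphKind.Undetermined;
                }
                else if (pPr.NumberingProperties != null)
                {
                    // Direct <w:numPr>: explicitly part of a list.
                    kind = ParagraphKind.ListItem;
                }
                else
                {
                    // Has <w:pPr> but no direct <w:numPr> (style-based numbering is not checked here).
                    kind = ParagraphKind.Plain;
                }

                // Record whether the paragraph holds the end of a bookmark,
                // which is the common point observed in both documents.
                bool hasBookmarkEnd = para.Descendants<BookmarkEnd>().Any();

                result.Add((para.InnerText, kind, hasBookmarkEnd));
            }
        }

        return result;
    }
}
```

Provided the w namespace is declared on <w:document>, the "point 3" paragraph in the sample should come back as Undetermined with HasBookmarkEnd set, which is exactly the case that cannot be resolved from the XML alone.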

Is there a plan to fix those issues?

Thx.
Regards.


3 answers

  1. Jiale Xue - MSFT 48,966 Reputation points Microsoft Vendor
    2025-02-18T03:36:43.8266667+00:00

    Hi @Gael, welcome to Microsoft Q&A.

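    Use the WordprocessingDocument.FromFlatOpcString method provided by the Open XML SDK to parse the Flat OPC XML. If you simply want to read the <w:t> text of each paragraph and label it as a list item or a normal paragraph, here is an example for reference. When a paragraph has no explicit numbering, it may fall back on the previous paragraph to guess, so treat that part as a heuristic: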

    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
    using System;
    using Paragraph = DocumentFormat.OpenXml.Wordprocessing.Paragraph;
    
    namespace _2_18_1
    {
        class Program
        {
            static void ParseFlatOpcXml(string flatOpcXml)
            {
                // Create a WordprocessingDocument directly using the FromFlatOpcString method
                using (var wordDoc = WordprocessingDocument.FromFlatOpcString(flatOpcXml))
                {
                    var body = wordDoc.MainDocumentPart.Document.Body;
                    Paragraph previousParagraph = null;
    
                    foreach (var para in body.Elements<Paragraph>())
                    {
                        // Read the numbering properties, if any
                        // (null when <w:pPr> or <w:numPr> is absent)
                        var pPr = para.ParagraphProperties;
                        NumberingProperties numPr = pPr?.NumberingProperties;
    
                        bool isListItem = false;
                        if (numPr != null)
                        {
                            // There is an explicit list numbering property
                            isListItem = true;
                        }
                        else if (pPr == null && previousParagraph != null)
                        {
                            // Heuristic: the current paragraph has no paragraph properties at all,
                            // so assume it continues the list if the previous paragraph is a list item.
                            // This is only a guess; the XML itself does not say either way.
                            isListItem = previousParagraph.ParagraphProperties?.NumberingProperties != null;
                        }
    
                        // Perform different processing depending on whether it is a list item
                        if (isListItem)
                        {
                            Console.WriteLine("List item: " + para.InnerText);
                        }
                        else
                        {
                            Console.WriteLine("Normal paragraph: " + para.InnerText);
                        }
    
                        previousParagraph = para;
                    }
                }
            }
    
            static void Main(string[] args)
            {
                string flatOpcXml = @"<?xml version=""1.0"" standalone=""yes""?>
    <?mso-application progid=""Word.Document""?>
    <pkg:package xmlns:pkg=""http://schemas.microsoft.com/office/2006/xmlPackage"">
    	<pkg:part pkg:name=""/_rels/.rels"" pkg:contentType=""application/vnd.openxmlformats-package.relationships+xml"" pkg:padding=""512"">
    		<pkg:xmlData>
    			<Relationships xmlns=""http://schemas.openxmlformats.org/package/2006/relationships"">
    				<Relationship Id=""rId1"" Type=""http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"" Target=""word/document.xml""/>
    			</Relationships>
    		</pkg:xmlData>
    	</pkg:part>
    	<pkg:part pkg:name=""/word/document.xml"" pkg:contentType=""application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"">
    		<pkg:xmlData>
    			<w:document xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main"">
    				<w:body>
    					<w:p w:rsidR=""00F80704"" w:rsidRPr=""00AF0002"" w:rsidRDefault=""00F80704"" w:rsidP=""00F80704"">
    						<w:pPr>
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    						</w:pPr>
    						<w:bookmarkStart w:id=""0"" w:name=""REQ_WORL3_0003""/>
    						<w:r w:rsidRPr=""00AF0002"">
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    							<w:t>Key points:</w:t>
    						</w:r>
    					</w:p>
    					<w:p w:rsidR=""00F80704"" w:rsidRPr=""00AF0002"" w:rsidRDefault=""00F80704"" w:rsidP=""00F80704"">
    						<w:pPr>
    							<w:pStyle w:val=""Paragraphedeliste""/>
    							<w:numPr>
    								<w:ilvl w:val=""0""/>
    								<w:numId w:val=""1""/>
    							</w:numPr>
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    						</w:pPr>
    						<w:r w:rsidRPr=""00AF0002"">
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    							<w:t>point 1</w:t>
    						</w:r>
    					</w:p>
    					<w:p w:rsidR=""00F80704"" w:rsidRPr=""00AF0002"" w:rsidRDefault=""00F80704"" w:rsidP=""00F80704"">
    						<w:pPr>
    							<w:pStyle w:val=""Paragraphedeliste""/>
    							<w:numPr>
    								<w:ilvl w:val=""0""/>
    								<w:numId w:val=""1""/>
    							</w:numPr>
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    						</w:pPr>
    						<w:r w:rsidRPr=""00AF0002"">
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    							<w:t>point 2</w:t>
    						</w:r>
    					</w:p>
    					<w:p w:rsidR=""00F80704"" w:rsidRDefault=""00F80704"" w:rsidP=""00F80704"">
    						<w:r w:rsidRPr=""00AF0002"">
    							<w:rPr>
    								<w:highlight w:val=""cyan""/>
    								<w:lang w:val=""en-US""/>
    							</w:rPr>
    							<w:t>point 3</w:t>
    						</w:r>
    						<w:bookmarkEnd w:id=""0""/>
    					</w:p>
    					<w:sectPr w:rsidR=""00000000"">
    						<w:pgSz w:w=""12240"" w:h=""15840""/>
    						<w:pgMar w:top=""1417"" w:right=""1417"" w:bottom=""1417"" w:left=""1417"" w:header=""720"" w:footer=""720"" w:gutter=""0""/>
    						<w:cols w:space=""720""/>
    					</w:sectPr>
    				</w:body>
    			</w:document>
    		</pkg:xmlData>
    	</pkg:part>
    </pkg:package>";
    
                ParseFlatOpcXml(flatOpcXml);
                Console.ReadLine();
            }
        }
    }
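
Since the heuristic above cannot tell a real list item from a plain paragraph that merely follows a list, another option is to ask Word itself through interop before exporting the XML. The sketch below rests on an assumption that is not confirmed in this thread: that the <w:pPr> is missing because the range (here, the bookmark) ends before the last paragraph mark, which is where Word keeps paragraph formatting. The class and method names are hypothetical.

```csharp
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, introduced only for this sketch.
public static class ListItemProbe
{
    // Asks Word whether the last paragraph touched by the range is formatted as a list item.
    public static bool LastParagraphIsListItem(Word.Range range)
    {
        Word.Paragraph last = range.Paragraphs.Last;
        return last.Range.ListFormat.ListType != Word.WdListType.wdListNoNumbering;
    }

    // Returns the Flat OPC XML of a copy of the range extended to the end of its
    // last paragraph, so that the paragraph mark is part of the exported range.
    // Assumption: including the paragraph mark makes Word emit <w:pPr> for that paragraph.
    public static string GetXmlIncludingLastParagraphMark(Word.Range range)
    {
        Word.Range expanded = range.Duplicate;   // leave the caller's range untouched
        expanded.End = range.Paragraphs.Last.Range.End;
        return expanded.WordOpenXml;
    }
}
```

LastParagraphIsListItem does not depend on the exported XML, so it can be used to label the last paragraph even if the second method turns out not to change what WordOpenXml returns.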
    
    

    Best Regards,

    Jiale



