Removing Page and Section Breaks from a Word Document

In today's post I am going to show you how to remove page and section breaks within a Word document using the Open XML SDK. Removing these two types of breaks is similar, but requires two different approaches. Let's start off by jumping into removing page breaks.

My post will talk about using version 2 of the SDK.

If you just want to jump straight into the code, feel free to download this solution here.

Solution to Remove Page Breaks

To remove page breaks in a document we need to take the following actions:

  1. Open the Word document via the Open XML SDK
  2. Get access to the main document part
  3. Find all page breaks within the main document part
  4. For every page break found, remove it from the document
  5. Save changes

For the sake of this example, let's say I am starting with the following Word document:

This document has a page break (shown outlined in red) on the first page.

The Code

The code is pretty straightforward and follows the solution steps described above:

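using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static void RemovePageBreaks(string filename)
{
    // Open the document for editing (second argument = isEditable)
    using (WordprocessingDocument myDoc = WordprocessingDocument.Open(filename, true))
    {
        MainDocumentPart mainPart = myDoc.MainDocumentPart;

        // Only a Break whose type is "page" is a page break; a Break with
        // no type is a plain line break and is left alone.
        // ToList() takes a snapshot so we can remove elements while looping.
        List<Break> breaks = mainPart.Document.Descendants<Break>()
            .Where(b => b.Type != null && b.Type.Value == BreakValues.Page)
            .ToList();

        foreach (Break b in breaks)
        {
            b.Remove();
        }

        mainPart.Document.Save();
    }
}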

Pretty easy stuff!
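One detail worth keeping in mind: in WordprocessingML a w:br element is a page break only when its type attribute is "page". With no type it is an ordinary line break, and with type "column" it is a column break. If you want to see what a document contains before removing anything, a sketch along these lines may help. It works on any element tree rather than on a file, and the names BreakInspector and FindPageBreaks are my own for illustration, not part of the SDK.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class BreakInspector
{
    // Hypothetical helper: returns only the breaks whose type is "page".
    // Line breaks (no type) and column breaks are skipped.
    public static List<Break> FindPageBreaks(OpenXmlElement root)
    {
        return root.Descendants<Break>()
            .Where(b => b.Type != null && b.Type.Value == BreakValues.Page)
            .ToList();
    }
}
```

For example, calling BreakInspector.FindPageBreaks(mainPart.Document) and looking at the Count of the result tells you how many page breaks a removal pass would touch.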

End Result

Running this code I should end up with a document that looks like the following:

Now let's see how to remove section breaks within a document. Before I actually jump into the solution of removing sections, I want to talk a bit about section breaks within a Word document.

Section Breaks in a Word Document

WordprocessingML does not natively store the concept of pages, since it is based on paragraphs and runs. Instead it uses sections to specify groups of paragraphs that have a specific set of page properties.

Every Word document has at least one section. Each section specifies page properties (such as page size, orientation, and margins), header/footer references, column information, and so on. Given this information, there are really two high-level types of sections:

  1. A section as a paragraph property – A document may have zero or more of these types of sections
  2. A document final section property – A document will have only one of this type of section
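To make the two types concrete: a paragraph-level section lives at w:p/w:pPr/w:sectPr, while the final section is a w:sectPr that is a direct child of w:body. The following is a minimal sketch of how you might tell them apart with the SDK. SectionInspector and its two methods are hypothetical names I picked for this example.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

static class SectionInspector
{
    // Counts sections stored as a paragraph property (w:p/w:pPr/w:sectPr).
    public static int CountParagraphSections(Body body)
    {
        return body.Descendants<SectionProperties>()
            .Count(sectPr => sectPr.Parent is ParagraphProperties);
    }

    // Returns the final section (a w:sectPr directly under w:body),
    // or null if the body does not contain one.
    public static SectionProperties GetFinalSection(Body body)
    {
        return body.Elements<SectionProperties>().LastOrDefault();
    }
}
```

You would pass mainPart.Document.Body to either method. Only the sections counted by CountParagraphSections are the ones this post removes.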

In today's post I am going to show you how to remove all sections that are a paragraph property.

Solution to Remove Section Breaks

To remove section breaks in a document we need to take the following actions:

  1. Open the Word document via the Open XML SDK
  2. Get access to the main document part
  3. Find all paragraph properties that contain section breaks
  4. For every paragraph property found, remove its section property child
  5. Save changes

For the sake of this example, let's say I am starting with the following Word document:

This document has a section break (shown outlined in red) on the first page, which separates a one column section from a two column section.

The Code

This code is also pretty straightforward and follows the solution steps described above:

static void RemoveSectionBreaks(string filename)
{
    using (WordprocessingDocument myDoc = WordprocessingDocument.Open(filename, true))
    {
        MainDocumentPart mainPart = myDoc.MainDocumentPart;

        // Find every paragraph property that carries a section property
        List<ParagraphProperties> paraProps = mainPart.Document.Descendants<ParagraphProperties>()
            .Where(pPr => IsSectionProps(pPr))
            .ToList();

        foreach (ParagraphProperties pPr in paraProps)
        {
            pPr.RemoveChild<SectionProperties>(pPr.GetFirstChild<SectionProperties>());
        }

        mainPart.Document.Save();
    }
}

// True when the paragraph properties contain a section property
static bool IsSectionProps(ParagraphProperties pPr)
{
    return pPr.GetFirstChild<SectionProperties>() != null;
}

End Result

Running this code I should end up with a document that looks like the following:

Notice how the document now has two columns. This solution removed the first section property, which specified a one column section.
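The reason is that a paragraph-level section property describes the section that ends at that paragraph. Once it is gone, those paragraphs fall into the next section, which here is the two column one. If you want to check this on your own document, a sketch like the following reads the column count from a section. GetColumnCount is a name I made up, and treating a missing w:cols element or a missing num attribute as one column is my assumption about the default.

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

static class ColumnInspector
{
    // Hypothetical helper: reads the number of columns from a section.
    public static int GetColumnCount(SectionProperties sectPr)
    {
        Columns cols = sectPr.GetFirstChild<Columns>();

        // Assumption: no w:cols element or no num attribute means one column.
        if (cols == null || cols.ColumnCount == null)
        {
            return 1;
        }

        return cols.ColumnCount.Value;
    }
}
```

Calling this on the section properties found before and after the removal shows which column layout each part of the document ends up with.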

Zeyad Rajabi

Comments

  1. A section break at the end of an sdtContent (i.e. w:p/w:pPr/w:sectPr/w:type immediately before the </w:sdtContent>) can't be deleted in the Word UI.
  2. A section break that is of type "continuous" in the Open XML is displayed in the Word UI as a section break of type "Next Page". Any insight into this? Is there a bug tracking system somewhere I can access in which this could be reported? Thanks, Jason
  • Anonymous
    June 24, 2009
    Jason, For issue #1 try using Shift-Delete. That should delete the section break. For issue #2 I am unable to repro what you are seeing. I see Word correctly displaying the breaks as continuous.