How to Use altChunk for Document Assembly

Merging multiple word processing documents into a single document is something that many people want to do.  An application built for attorneys might assemble selected standard clauses into a contract.  An application built for book publishers can assemble chapters of a book into a single document.  This post explains the semantics of the altChunk element, and provides some code using the Open XML SDK that shows how to use altChunk.

This blog is inactive.
New blog: EricWhite.com/blog

Instead of using altChunk, you could write a program to merge the Open XML markup for documents.  You would need to deal with a number of issues, including merging style sheets and resolving conflicting styles, merging the comments from all of the documents, merging bookmarks, and more.  This is doable, but it’s a lot of work.  You can use altChunk to let Word 2007 do the heavy lifting for you.

altChunk is a powerful technique.  It’s a tool that should be in every Open XML developer’s toolbox.  In an upcoming post, I’ll show an example of the use of altChunk in a SharePoint application.  You can create compelling document assembly solutions in SharePoint using altChunk.

Overview of the altChunk Markup

The altChunk markup tells the consuming application to import content into the document.  This behavior is not required for a conforming application – a conforming application is free to ignore the altChunk markup.  However, the standard recommends that if the application ignores the altChunk markup, it should notify the user.  Word 2007 supports altChunk.

To use altChunk, you do the following:

  • You create a new part in the package.  The part can have a number of content types, listed below.  When you create the part, you assign a unique ID to the part.
  • You store the content that you want to import into the part.  You can import a variety of types of content, including another Open XML word processing document, HTML, or text.
  • The main document part has a relationship to the alternative format part.
  • You add a w:altChunk element at the location where you want to import the alternative format content.  The r:id attribute of the w:altChunk element identifies the chunk to import.  The w:altChunk element is a sibling to paragraph elements (w:p).  You can add an altChunk element at any point in the markup that can contain a paragraph element.
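As a rough illustration of the last step, the sketch below uses LINQ to XML to swap a placeholder paragraph for a w:altChunk element, so the imported content lands at a chosen spot rather than at the end of the document.  The placeholder convention (a paragraph whose text is exactly a marker such as "[[CHUNK]]") and the names AltChunkPlacement and ReplacePlaceholder are assumptions made for this example, not part of the Open XML SDK.  The alternative format part with the matching ID still has to be created separately.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class AltChunkPlacement
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Replaces the first paragraph whose concatenated run text equals the
    // placeholder with a w:altChunk element.  Returns false if none is found.
    // The placeholder marker is a hypothetical convention for this sketch.
    public static bool ReplacePlaceholder(
        XDocument mainDocument, string placeholder, string altChunkId)
    {
        XElement target = mainDocument
            .Descendants(W + "p")
            .FirstOrDefault(p =>
                string.Concat(p.Descendants(W + "t").Select(t => t.Value))
                    == placeholder);
        if (target == null)
            return false;

        // altChunk is a sibling of w:p, so it can take the paragraph's place.
        target.ReplaceWith(
            new XElement(W + "altChunk", new XAttribute(R + "id", altChunkId)));
        return true;
    }
}
```

Concatenating the text of all runs matters because Word often splits what looks like one string across several w:r elements.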

A few options for content types that can be imported into a document are:

  • application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml

The alternative format content part contains an Open XML document in binary form.

  • application/xhtml+xml

The alternative format content part contains an XHTML document.

  • text/plain

The alternative format content part contains text.

There are more than these three options; the code presented in this post shows how to implement altChunk for these three types of content.
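When assembling documents from files on disk, it can help to pick the content type from the file extension.  The helper below is only a sketch: the class and method names are made up for this example, and mapping .html and .htm to application/xhtml+xml is an assumption that mirrors the HTML sample later in this post.

```csharp
using System;
using System.IO;

static class AltChunkContentTypes
{
    // Returns one of the three content types discussed in this post,
    // chosen by file extension.  Other extensions are rejected.
    public static string ForFile(string path)
    {
        switch (Path.GetExtension(path).ToLowerInvariant())
        {
            case ".docx":
                return "application/vnd.openxmlformats-officedocument." +
                       "wordprocessingml.document.main+xml";
            case ".xhtml":
            case ".html":   // assumption: treated like the post's HTML sample
            case ".htm":
                return "application/xhtml+xml";
            case ".txt":
                return "text/plain";
            default:
                throw new NotSupportedException(
                    "No altChunk content type known for: " + path);
        }
    }
}
```

The returned string can be passed as the content type argument of AddAlternativeFormatImportPart, as in the V1 examples in this post.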

The altChunk markup in the document looks like this:

<w:p>
  <w:r>
    <w:t>Paragraph before.</w:t>
  </w:r>
</w:p>
<w:altChunk r:id="AltChunkId1" />
<w:p>
  <w:r>
    <w:t>Paragraph after.</w:t>
  </w:r>
</w:p>
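For reference, here is a minimal sketch of producing that same shape of markup with LINQ to XML.  The class and method names are invented for the example, and a real document body would of course carry more than two paragraphs.

```csharp
using System;
using System.Xml.Linq;

static class AltChunkMarkup
{
    // WordprocessingML and relationship namespaces (http://, not https://).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    static XElement Paragraph(string text)
    {
        return new XElement(W + "p",
            new XElement(W + "r",
                new XElement(W + "t", text)));
    }

    // Builds a w:body containing: paragraph, altChunk, paragraph.
    public static XElement BuildBody(string altChunkId)
    {
        return new XElement(W + "body",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            new XAttribute(XNamespace.Xmlns + "r", R.NamespaceName),
            Paragraph("Paragraph before."),
            new XElement(W + "altChunk",
                new XAttribute(R + "id", altChunkId)),
            Paragraph("Paragraph after."));
    }
}
```

Calling AltChunkMarkup.BuildBody("AltChunkId1").ToString() yields the three sibling elements shown above, wrapped in a w:body element with the two namespace declarations.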

altChunk: Import Only

One important note about altChunk – it is used only for importing content.  If you open the document using Word 2007 and save it, the newly saved document will not contain the alternative format content part, nor the altChunk markup that references it.  Word saves all imported content as paragraph (w:p) elements.  The standard requires this behavior from a conforming application.
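One way to see whether a given document has been through that open-and-save cycle is to look for w:altChunk elements that are still present in the main document part.  The sketch below assumes the main document XML has already been loaded into an XDocument (for example with a helper like GetXDocument in the sample further down); the class and method names are made up for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class AltChunkCheck
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Counts the w:altChunk elements still present in the main document XML.
    // Zero suggests the content was already converted to ordinary markup,
    // or that the document never used altChunk.
    public static int CountAltChunks(XDocument mainDocument)
    {
        return mainDocument.Descendants(W + "altChunk").Count();
    }
}
```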

Using altChunk

The following screen-clipping shows a simple word processing document.  It has a heading, a paragraph styled as Normal, and a comment:

The following screen-clipping shows another word processing document, with content that we want to insert into the first document.

After running the example program included with this post, the resulting document looks like the following.  Notice that the resulting document has comments from both of the source documents:

The following example shows how to merge two Open XML documents using altChunk.  It uses V1 of the Open XML SDK, and LINQ to XML:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using System.Xml;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        // Namespace names use the http:// scheme and must match exactly.
        XNamespace w =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
        XNamespace r =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

        using (WordprocessingDocument myDoc =
            WordprocessingDocument.Open("Test.docx", true))
        {
            string altChunkId = "AltChunkId1";
            MainDocumentPart mainPart = myDoc.MainDocumentPart;
            // Create the alternative format part and copy the document to import into it.
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
                altChunkId);
            using (FileStream fileStream =
                File.Open("TestInsertedContent.docx", FileMode.Open))
                chunk.FeedData(fileStream);
            XElement altChunk = new XElement(w + "altChunk",
                new XAttribute(r + "id", altChunkId)
            );
            XDocument mainDocumentXDoc = GetXDocument(myDoc);
            // Add the altChunk element after the last paragraph.
            mainDocumentXDoc.Root
                .Element(w + "body")
                .Elements(w + "p")
                .Last()
                .AddAfterSelf(altChunk);
            SaveXDocument(myDoc, mainDocumentXDoc);
        }
    }

    private static void SaveXDocument(WordprocessingDocument myDoc,
        XDocument mainDocumentXDoc)
    {
        // Serialize the XDocument back into the part
        using (Stream str = myDoc.MainDocumentPart.GetStream(
            FileMode.Create, FileAccess.Write))
        using (XmlWriter xw = XmlWriter.Create(str))
            mainDocumentXDoc.Save(xw);
    }

    private static XDocument GetXDocument(WordprocessingDocument myDoc)
    {
        // Load the main document part into an XDocument
        XDocument mainDocumentXDoc;
        using (Stream str = myDoc.MainDocumentPart.GetStream())
        using (XmlReader xr = XmlReader.Create(str))
            mainDocumentXDoc = XDocument.Load(xr);
        return mainDocumentXDoc;
    }
}

To use altChunk with HTML, the code looks like this:

// w and r are the XNamespace values declared in the previous example;
// GetXDocument and SaveXDocument are the helper methods from that example.
using (WordprocessingDocument myDoc =
    WordprocessingDocument.Open("Test3.docx", true))
{
    string html =
@"<html>
<head/>
<body>
<h1>Html Heading</h1>
<p>This is an html document in a string literal.</p>
</body>
</html>";
    string altChunkId = "AltChunkId1";
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
        "application/xhtml+xml", altChunkId);
    // Write the HTML string into the alternative format part.
    using (Stream chunkStream = chunk.GetStream(FileMode.Create, FileAccess.Write))
    using (StreamWriter stringStream = new StreamWriter(chunkStream))
        stringStream.Write(html);
    XElement altChunk = new XElement(w + "altChunk",
        new XAttribute(r + "id", altChunkId)
    );
    XDocument mainDocumentXDoc = GetXDocument(myDoc);
    // Add the altChunk element after the last paragraph.
    mainDocumentXDoc.Root
        .Element(w + "body")
        .Elements(w + "p")
        .Last()
        .AddAfterSelf(altChunk);
    SaveXDocument(myDoc, mainDocumentXDoc);
}

Using V2 of the Open XML SDK:

// Requires using directives for System.IO, System.Linq (for Last),
// DocumentFormat.OpenXml.Packaging, and DocumentFormat.OpenXml.Wordprocessing.
using (WordprocessingDocument myDoc =
    WordprocessingDocument.Open("Test1.docx", true))
{
    string altChunkId = "AltChunkId1";
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
        AlternativeFormatImportPartType.WordprocessingML, altChunkId);
    using (FileStream fileStream = File.Open("TestInsertedContent.docx", FileMode.Open))
        chunk.FeedData(fileStream);
    AltChunk altChunk = new AltChunk();
    altChunk.Id = altChunkId;
    // Insert the altChunk element after the last paragraph in the body.
    mainPart.Document
        .Body
        .InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
    mainPart.Document.Save();
}
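The samples above hard-code "AltChunkId1", which is fine for a single chunk.  When several chunks are added, or when the target document may already contain relationship IDs, each new ID has to be unique.  The helper below is a small sketch of one way to pick an unused ID; the class and method names are hypothetical, and collecting the IDs already in use by the main document part is left to the caller.

```csharp
using System;
using System.Collections.Generic;

static class AltChunkIds
{
    // Returns the first ID of the form "AltChunkIdN" (N = 1, 2, 3, ...)
    // that does not appear in the collection of IDs already in use.
    public static string NextUnusedId(ICollection<string> existingIds)
    {
        int n = 1;
        while (existingIds.Contains("AltChunkId" + n))
            n++;
        return "AltChunkId" + n;
    }
}
```

After adding a part with the returned ID, add that ID to the collection before asking for the next one, so that a loop over many source documents never reuses an ID.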

The attached code shows examples of placing an Open XML document, HTML, and text into an alternative format content part.  I’ve provided two versions of the example – one using V1 of the Open XML SDK (and LINQ to XML), and another using V2 of the Open XML SDK.

altChunk.zip

Comments

  • Anonymous
    October 28, 2008
    Hi Eric, Interesting post. I have just been reading up on OpenXML and it looks like a great solution to my document assembly problem. Is it possible to combine Excel tables/charts and PowerPoint slides into a Word document using OpenXML? Clearly altChunk wouldn't be the method, as it only works with Word/XML/XHTML files, but would it work for Excel/PowerPoint elements embedded into Word? Ed

  • Anonymous
    October 28, 2008
    Hi Eric,  Great bit of code - nearly exactly what I was looking for.  I seem to have a problem though if each sub-document has a different header - the headers seem to get lost.  Any ideas? Terry

  • Anonymous
    October 29, 2008
    Hi Eric, Thanks for the code sample. I am facing a problem with the bullets & numbering when using altChunk to merge two word documents (office 2003 .doc documents converted to .docx using OFC.exe). The code I am using is given below.

    string oriDoc = @"C:Final.docx";
    string mergedDocPath= @"C:A.docx";
    using (WordprocessingDocument doc = WordprocessingDocument.Open(oriDoc, true))
    {
        IEnumerator<Locked> enumerator = doc.MainDocumentPart.StyleDefinitionsPart.Styles.Descendants<Locked>().GetEnumerator();
        while (enumerator.MoveNext() == true)
            enumerator.Current.Val = BooleanValues.True; //Tried using False as well, but it doesnt make sense here.
        doc.MainDocumentPart.Document.Save();
        Paragraph paragraph = doc.MainDocumentPart.Document.Descendants<Paragraph>().Last();
        AlternativeFormatImportPart importPart = doc.MainDocumentPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML);
        using (StreamReader reader = new StreamReader(mergedDocPath, true))
            importPart.FeedData(reader.BaseStream);
        AltChunk altChunk = new AltChunk();
        altChunk.AltChunkProperties = new AltChunkProperties();
        altChunk.AltChunkProperties.MatchSource = new MatchSource();
        altChunk.AltChunkProperties.MatchSource.Val = BooleanValues.True;//Tried using False as well
        altChunk.Id = doc.MainDocumentPart.GetIdOfPart(importPart);
        paragraph.InsertAfterSelf(altChunk);
        doc.MainDocumentPart.Document.Save();
    }

    A.docx originally looks like this,


Diggity dog ffdgfgfdg first time dfvidjgldgdgm dfsfsdfgdgdfgghfgghfh:

  1.       Zoom vroom
  2.       doom boom
  3.       dhgfhfghgfhfghgfhgfhgfh dsfsfsddfsfdsfgdsfdgffg fgffdgdghfdh fsfgf fdsfsfsdfdsf:
  4.       Sweeetdfvggdggf
           a.       dfsfvcff
                    i.      why Go
                    ii.      jeremy
                    iii.      black

After merging the formatting becomes like this,

Diggity dog ffdgfgfdg first time dfvidjgldgdgm dfsfsdfgdgdfgghfgghfh:

      • Zoom vroom
      • doom boom
      • dhgfhfghgfhfghgfhgfhgfh dsfsfsddfsfdsfgdsfdgffg fgffdgdghfdh fsfgf fdsfsfsdfdsf:
      • Sweeetdfvggdggf
      • dfsfvcff
      • why Go
      • jeremy
      • black

Something similar happens to bullets too. The bullets style changes to the bullets styling of "Final.Docx". On checking the afchunk the bullet & numbering were correct, which indicates that the parent document superimposes its bulleting & numbering on the chunk. I thought about setting DocumentProtection.Enforcement and DocumentProtection.Formatting to false. Also I tried setting AutoFormatOverride.Val to false. But I couldn't find a way to do that. Also will setting these help? Also does setting AltChunk.Id manually rather than by using MainDocumentPart.GetIdOfPart cause a difference? If the above method does not work, should I instead take all the Styles from the second document and merge them into the first document? Although this does not look like the right way to go about doing things. Thanks, Anand.

  • Anonymous
    October 31, 2008
    Stephen McGibbon has screenshots of the Open XML and ODF support coming in Windows 7 Wordpad , as announced

  • Anonymous
    November 02, 2008
    Hi, Ed, Terry, and Anand, Thanks for the great questions.  I'll be responding to these, but it may be as late as the end of next week, due to schedule constraints.  Thanks for your patience. -Eric

  • Anonymous
    November 03, 2008
    Following PDC 2008 and the Open XML workshop given by Microsoft in Redmond ( Doug , a thousand apologies once again

  • Anonymous
    November 05, 2008
    I received this message privately, but the question and the response are relevant to many, so I am including it here.

    Question: I'm attempting to merge multiple documents (which contain rows of a table) into a single document.  When the merge process happens, I get what looks to be a paragraph marker between my table rows (so there's visual separation between the rows of the table, which isn't what I want). Any thoughts on how to modify altChunk's behavior to not include the document delimiter between the documents that it merges?

    My response: I've seen this same behavior, and as far as I know, this is behavior that is not configurable in Word.  I'll check, but would guess that this can't be changed. The solution to this is to write some utility that can move content between docs (not using altChunk).  I'm starting on the prep work for this.  See this post: http://blogs.msdn.com/ericwhite/archive/2008/11/03/inserting-deleting-moving-paragraphs-in-open-xml-wordprocessing-documents.aspx -Eric

  • Anonymous
    November 14, 2008
    Eric, I am having the exact same problem as Anand. Maybe you have a good solution for this. It is rather strange that it is not possible to specify the numbering type inline in document.xml itself.

  • Anonymous
    November 17, 2008
    Hi, I have a few questions:

  1. Is it possible to view an altChunk from Word 2007, or can it be viewed only in XML format?
  2. Can we insert the content in the middle of a document?

  • Anonymous
    December 08, 2008
    One of the most common requests we hear related to word processing documents is the ability to merge

  • Anonymous
    January 05, 2009
    Hi, I am trying to merge documents and then do some string replacement using this code: http://www.codeproject.com/KB/office/OfficeTokenReplacement.aspx It runs a regex over the whole XML, but when I use a chunk, the embedded content is stored under AltChunk1.docx and I have to unzip it afterwards. I first tried with the PDC source code:

    //Find all content controls in document
    List<SdtBlock> sdtList = mainPart.Document
        .Descendants<SdtBlock>().Where(s => sourceFile
            .Contains(s.SdtProperties
                .GetFirstChild<Alias>().Val.Value)).ToList();
    //Go through all the content controls
    if (sdtList.Count != 0)
    {
        string altChunkId = "AltChunkId" + id;
        id++;
        //Add altchunk into document
        AlternativeFormatImportPart chunk =
            mainPart.AddAlternativeFormatImportPart(
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
            altChunkId);
        //stream data from source file into altchunk
        chunk.FeedData(File.Open(sourceFile, FileMode.Open));
        //Create new altchunk element
        AltChunk altChunk = new AltChunk();
        altChunk.Id = altChunkId;
        //Swap out content control for altchunk
        foreach (SdtBlock sdt in sdtList)
        {
            OpenXmlElement parent = sdt.Parent;
            parent.InsertAfter(altChunk, sdt);
            sdt.Remove();
        }
        //Save
        mainPart.Document.Save();
    }

    but I only have paragraphs and no SdtBlock. Could you please help me?

  • Anonymous
    January 06, 2009
    Hi Eric ...Can this be used with word 2003?

  • Anonymous
    January 11, 2009
    How can I insert an AltChunk at a special place ?

  • Anonymous
    February 10, 2009
    Hi Eric, Nice sample of code. I am using some HTML as an altChunk. It works for plain HTML, but if the HTML contains images, the images do not appear. I understand the problem: the images are not in the scope of the document. As you mentioned, when the document with the altChunk is saved by MS Word 2007, it converts all of the altChunk content to WordML. My question is whether we can do the same (converting HTML to WordML) ourselves. It would be a great help for my project.

  • Anonymous
    April 13, 2009
    Resolution ================ Step 1: Open a new Microsoft Word 2007 document and type A B C Save the document

  • Anonymous
    April 16, 2009
    The comment has been removed

  • Anonymous
    April 17, 2009
    Hi Rama, Somehow you are getting duplicate rIDs for the altChunk that you are adding.  rIDs need to be unique - there are a variety of ways to enforce this.  It isn't a problem when creating a document from scratch, but when modifying an existing document, you need to take care that you only add new parts with unique rIDs.  Does this help you with your issue? -Eric

  • Anonymous
    April 18, 2009
    Hi Eric, I've really appreciated your article and I have one question: regarding the "altChunk: Import Only" section, is there any way to avoid this peculiar behaviour? In other words, is there an altChunk property or another markup that can be used to embed external sources (i.e. html files) avoiding them to be totally erased from the archive after the first saving? Many thanks, Kulio.

  • Anonymous
    April 19, 2009
    Hi Kulio, Unfortunately, the behavior can't be changed.  When you open the document in Word, the embedded external source is removed from the package. -Eric

  • Anonymous
    April 19, 2009
    There are two ways to assemble multiple Open XML word processing documents into a single document: altChunk,

  • Anonymous
    April 21, 2009
    DocumentBuilder is an example class that’s part of the PowerTools for Open XML project that enables you

  • Anonymous
    April 29, 2009
    The comment has been removed

  • Anonymous
    April 29, 2009
    Hi Ramesh, you need to include a "using System.Linq;" using statement. -Eric

  • Anonymous
    April 29, 2009
    hi Eric, Thanks for your response. but i used System.Xml.Linq namespace. I used System.Xml.Linq  and DocumentFormat.OpenXml dll to merge the office documents, which is working fine in 3.5 framework. When i bind my page with sharepoint site iam getting an exception saying

    'System.Collections.Generic.IEnumerable<DocumentFormat.OpenXml.Wordprocessing.Paragraph>' does not contain a definition for 'Last'   at System.Web.Compilation.AssemblyBuilder.Compile()

    Code snippet:

    using (WordprocessingDocument myDoc =
        WordprocessingDocument.Open("Desc.docx", true))
    {
        string altChunkId = "AltChunkId" + i;
        MainDocumentPart mainPart = myDoc.MainDocumentPart;
        AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
            AlternativeFormatImportPartType.WordprocessingML, altChunkId);
        using (FileStream fileStream = File.Open("Temp.docx", FileMode.Open))
            chunk.FeedData(fileStream);
        AltChunk altChunk = new AltChunk();
        altChunk.Id = altChunkId;
        mainPart.Document
            .Body.InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
        mainPart.Document.Save();
    }

    Note: when i change my applicaiton framework version to 3.0 also i am getting the same exception in my local, which i got in sharepoint. Is it mean that sharepoint doens't support 3.5 framework DLL.. Please advise.

  • Anonymous
    April 30, 2009
    Hi Ramesh, the Enumerable.Last extension method is in the System.Linq namespace, not System.Xml.Linq.  By default, the SharePoint project doesn't include a using for System.Linq.  So to use the Last extension method, you need to add that using statement. In general, when you get build errors like this, take a look at the MSDN docs on the class/method/type.  The docs always tell you which assembly the class is in, and what namespace the class is in.  Then, you can add appropriate references and using statements.  Make sense? -Eric

  • Anonymous
    May 06, 2009
    Good stuff, very helpful.   I have a slight twist to this I am working on, maybe someone can help.   Instead of opening existing files and merging them, I am programmatically creating WordProcessingDocuments using C# in .NET.  Then based on various conditions I may or may not want to combine them and then stream them out as a single document.     So instead of adding data to the stream in the form of:

    Stream fileStream = System.IO.File.Open(fileName, FileMode.Open);
    chunk.FeedData(fileStream);

    I tried to do this:

    Stream stream = wordDoc.MainDocumentPart.GetStream();
    chunk.FeedData(stream);

    Which compiles, but then when you try to open the final document it gives me a message that the docx can't be opened because of problems with the contents.    Any ideas?

  • Anonymous
    May 07, 2009
    Hi rmagill,  quick question - are you properly disposing of all of your streams?  That could very well cause this problem.  Another debugging technique for a situation like this - read streams to byte arrays - as necessary, you can create a non-resizable memory stream from a byte array using one of the MemoryStream constructors (I believe that the memory stream uses the passed in byte array as its backing store).  You can then examine this byte array to see what's different. It's best to always use a 'using' block for every object that implements IDisposable:

    private static void SaveXDocument(WordprocessingDocument myDoc,
        XDocument mainDocumentXDoc)
    {
        // Serialize the XDocument back into the part
        using (Stream str = myDoc.MainDocumentPart.GetStream(FileMode.Create, FileAccess.Write))
        using (XmlWriter xw = XmlWriter.Create(str))
            mainDocumentXDoc.Save(xw);
    }

    -Eric

  • Anonymous
    May 12, 2009
    Hi Eric, Thank you for your last answer. I've solved the problem by creating the HTML files 'on the fly' and using a kind of 'custom marker' in the document that is replaced at runtime with the proper altChunk reference tag. Now I am wondering if there is a way to also embed a CSS stylesheet for the HTML files. The stylesheet file is placed in a directory "word/html". I've found out that if I insert "<Default Extension="css" ContentType="text/css" />" into [Content_Types].xml I get no error message when opening the docx. However, the CSS is ignored in the docx file. On the contrary, if I integrate the styles in a <style> tag inside the HTML file, the proper style is displayed correctly inside the docx file. Many thanks, Kulio.

  • Anonymous
    May 12, 2009
    Hi Kulio, From the dev team: Word only supports a few content types for altChunks. Word does support HTML and MHT, which is why putting them in a <style> tag worked. For HTML, Word only reads the HTML file itself and not any supporting files in the package. So if you have any external stylesheets, images, etc. MHT might be the best route. -Eric

  • Anonymous
    May 21, 2009
    I've had no trouble getting this working.  However, the one difficulty I'm having is this: If I have a hyperlink in my source HTML that looks like this: <a href="myimage.jpeg">, and I have added myimage.jpeg, is there any way that I can get my hyperlink to refer to that image?  Currently the URL is resolved to "directoryTheDocumentIsIn/myimage.jpeg". I'm not sure whether the HyperlinkBase extended property could be used for this...I can't figure out a way. Also, I'm a little confused. I was under the impression that with altChunk, Word does a one-time conversion of the content and does away with the source altChunk file.  However, I find that even after opening the docx file several times, the altChunk file remains, and document.xml still contains the <altChunk> tag, rather than any imported HTML.

  • Anonymous
    May 26, 2009
    The comment has been removed

  • Anonymous
    June 11, 2009
    Beautiful work Eric.  Worked like a charm. I have captured content using InfoPath forms in Moss and wished to export the content to a word document.  the content control do not allow you to map the content directly by including the customxml parts.  This option sure worked. Cheers

  • Anonymous
    June 16, 2009
    Thanks for the awesome post! I'm also trying to assemble different types of office documents (excel, word, powerpoint) into a single document.  Your post really helped me with combining word documents, but I'm not sure how to proceed with the other types (excel, powerpoint).  Any suggestions?

  • Anonymous
    July 05, 2009
    What is best practice for assembling a document from a database source? Is it to use content controls? To use AltChunks? To use content controls replaced at runtime with AltChunks? To use custom markup replaced at runtime with database content (e.g. http://geekswithblogs.net/DanBedassa/archive/2009/01/16/dynamically-generating-word-2007-.docx-documents-using-.net.aspx or http://msdn.microsoft.com/en-us/library/cc850835(office.14).aspx)?

  • Anonymous
    July 29, 2009
    I need to create a .docx file from an HTML document that includes image tags with a src attribute pointing to a URL. Is there any way to make sure that images contained in the HTML document are put into the Word package, so that Word will render the images without internet connectivity? When I read your "altChunk: Import Only" section, I was hopeful that Word might actually accomplish this for me. I can't get that to happen, however. Am I missing something? Also, you mention in that section that you must "open the file and save it" for the altChunk content to be removed. Is there any way to do that programmatically (rather than requiring the user to open and save)? In fact, I've had to actually EDIT & save before the "auto-pruning" would take place. Any suggestions? Thanks!

  • Anonymous
    July 30, 2009
    hi Kmote, I don't know of any way to accomplish what you're trying to do.  As far as I know, you are correct - you must make a small change in the file and save it. Ultimately, the solution to this is to have an html => open xml converter in code that you can modify for your specific needs.  I have this on my todo list - but it will be some time before I can get to this. I wish I had a better answer for you, but I don't. -Eric

  • Anonymous
    September 04, 2009
    Eric, Thanks for this post! Any advice on merging Excel documents into a Word document? I have been searching the internet up and down and your blog is by far the best. Thanks, Matt

  • Anonymous
    September 17, 2009
    Hi, really good post, I tried to import a mht file, but it doesn't work. Do you know the content type for this?

  • Anonymous
    October 27, 2009
    Fantastic Eric. Just what I needed. I can't tell you how much time and grief you probably saved me... Thanks!

  • Anonymous
    January 21, 2010
    Hi Eric, I was planning to use altchunk to insert html text to word template that have custom xml tags (pink tags from schema). My requirement is that user will create template with xml tag. I will read the tag name using xpath or linq to xml and replace the node with altchunk. But since we cannot use Custom XMl tags (pink tags) as per Gray's blog what is the alternative solution? How can i map the content control with my xml schema tag so that i can insert altchunk. There would be several such tags and each will have different html text. Here is the link from Gray's blog http://blogs.technet.com/gray_knowlton/archive/2009/12/23/what-is-custom-xml-and-the-impact-of-the-i4i-judgment-on-word.aspx?CommentPosted=true#commentmessage Any help is appreciated.

  • Anonymous
    April 21, 2010
    Hi Eric, Looks like its been a while since your last post. I am trying like many others to merge the headers in as well. I can merge the documents no problems but only the first documents header and footer get saved to the final document. Is there a way around this? Chris

  • Anonymous
    April 21, 2010
    Hi Chris, If you want to control sections and headers, have you taken a look at using DocumentBuilder instead of altChunk? http://blogs.msdn.com/ericwhite/archive/2010/01/08/how-to-control-sections-when-using-openxml-powertools-documentbuilder.aspx For a comparison of the two approaches: http://blogs.msdn.com/ericwhite/archive/2009/04/19/comparison-of-altchunk-to-the-documentbuilder-class.aspx -Eric

  • Anonymous
    April 22, 2010
    Hi Eric, Yes we have looked at DocumentBuilder but as we are using xml documents, streams and office documents it is not really suitable. I have decided to take the longer route of a copying everything manually into to each document( we loop through them, depending on how many are selected), making sure that all style-references, header-references, footer-references, ... are preserved. I have been using the reflector tool to see how this is created but cannot seem to find the rsid values for each paragraph to add to the properties. Below is what i have so far,

    Dim paraRef = mainPart.GetIdOfPart(mainPart.Document.MainDocumentPart)
    Dim para As Paragraph = New Paragraph With {.RsidParagraphAddition = paraRef}
    Dim parid As String = mainPart.Document.MainDocumentPart.GetIdOfPart(mainPart.Document.MainDocumentPart)
    Dim headid As String = mainPart.Document.MainDocumentPart.GetIdOfPart(mainPart.Document.MainDocumentPart.HeaderParts).ToString
    Dim footid As String = mainPart.Document.MainDocumentPart.GetIdOfPart(mainPart.Document.MainDocumentPart.FooterParts).ToString
    Dim headRef As HeaderReference = New HeaderReference With {.Id = headid, .Type = HeaderFooterValues.First}
    Dim footRef As FooterReference = New FooterReference With {.Id = footid, .Type = HeaderFooterValues.First}
    Dim title As TitlePage = New TitlePage()
    Dim paraProp As ParagraphProperties = New ParagraphProperties
    Dim sectionProperty As SectionProperties = New SectionProperties
    sectionProperty.Append(headRef)
    sectionProperty.Append(footRef)
    sectionProperty.Append(title)
    paraProp.Append(sectionProperty)
    para.Append(paraProp)

    What am i missing?

  • Anonymous
    April 22, 2010
    Hi Chris, First thing - I modified DocumentBuilder a while ago so that it works just fine with streams and in-memory documents.  I'm not quite clear why you can't use it. Regarding Rsid elements and attributes, you really don't need to add those.  Those are only used for a fairly obscure scenario where I pass a single document to two people, who separately edit it, and then the results are merged back into a single document.  If you are programmatically assembling a document, then almost by definition, you don't care about Rsid elements and attributes.  You can discard those in the generated document. Regarding your example, it is not clear to me what is missing.  In general, I take the approach of creating the resulting document exactly as I want it using Word, and then looking at the resulting markup. -Eric

  • Anonymous
    April 22, 2010
    The comment has been removed

  • Anonymous
    April 25, 2010
    Hi Eric, Just noticed i haven't stated the exact error: Object reference not set to an instance of an object. In addition it is the: Imports DocumentFormat.OpenXml.Wordprocessing statement that is not available with the previous dll, which means docx.MainDocumentPart.Document cannot be found and type SimpleField is not declared. We use these as we have mergefields in each document which get populated upon creation. Chris

  • Anonymous
    August 08, 2010
    Hi Eric, After merging multiple docx into single document , how can I update the source docx files if the user modify the content of the assembled document?  

  • Anonymous
    January 11, 2011
    Hi Eric, I'm merging HTML documents into Word documents via altChunk, but I'd like the styles from the Word documents to be applied to the HTML content. I've tried putting the altChunk inside a paragraph and a run, and have even tried it without the surrounding sdt tags, but I still can't get it to work. Do you have any suggestions? Here is some sample markup:

        <w:sdt>
          <w:sdtPr>
            <w:alias w:val="description" />
            <w:tag w:val="description" />
          </w:sdtPr>
          <w:sdtContent>
            <w:p w:rsidR="00275992" w:rsidRPr="00275992" w:rsidRDefault="00275992" w:rsidP="00EC68D3">
              <w:pPr>
                <w:pStyle w:val="LineItemTable" />
              </w:pPr>
              <w:r>
                <w:altChunk r:id="raac4be36-f977-4735-9ffc-a5cbf35dd6d5">
                  <w:altChunkPr>
                    <w:matchSrc w:val="false" />
                  </w:altChunkPr>
                </w:altChunk>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>

  • Anonymous
    January 23, 2011
    Hi, I am trying to merge Word documents in a SharePoint document library. Some pages in the documents are in portrait and some are in landscape. After merging, all the pages are displayed in portrait mode. How can I retain the page orientation programmatically? I think we can do it by inserting section properties after each page or each document. Here is my code. I appreciate your help.

        foreach (SPFile item in listitem.Folder.Files)
        {
            //  SPFile inputFile = item.File;
            SPFile inputFile = item;
            string altChunkId = "AltChunkId" + id;
            id++;
            byte[] byteArray = inputFile.OpenBinary();
            AlternativeFormatImportPart chunk = outputDoc.MainDocumentPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.WordprocessingML, altChunkId);
            using (MemoryStream mem = new MemoryStream())
            {
                mem.Write(byteArray, 0, (int)byteArray.Length);
                mem.Seek(0, SeekOrigin.Begin);
                chunk.FeedData(mem);
            }
            AltChunk altChunk = new AltChunk();
            altChunk.Id = altChunkId;
            outputDoc.MainDocumentPart.Document.Body.InsertAfter(altChunk,
                outputDoc.MainDocumentPart.Document.Body.Elements<Paragraph>().Last());
            outputDoc.MainDocumentPart.Document.Save();
        }
        outputDoc.Close();
        memOut.Seek(0, SeekOrigin.Begin);
        ClientContext clientContext = new ClientContext(SPContext.Current.Site.Url);
        ClientOM.File.SaveBinaryDirect(clientContext, outputPath, memOut, true);
        // Conversion

  • Anonymous
    March 26, 2013
    Hi Eric, What about support for altChunk in Office 2003 with the Compatibility Pack installed? I created a very simple Word document using the Open XML SDK and added an altChunk to the body with a stream of simple HTML content. It fails to open in Office 2003. Any thoughts? John

  • Anonymous
    March 26, 2013
    Hi John, Yes, you are right, altChunk is not supported in Office 2003.  There are other features of Open XML not supported in 2003, such as content controls.  This has to do with the actual functionality in that version of Office.  There is no code to do the conversion and import for altChunk, nor to handle content controls, therefore those are not supported. -Eric

  • Anonymous
    April 11, 2013
    Hi Eric, I am converting HTML content to Word using altChunk. The problem is that spacing is being added between the lines in a paragraph. In the HTML there is no space between the lines, but in Word the space is added automatically between each line. I used the HTML below.

        <html>
        <head/>
        <body>
        <div >
        <div>sdsdsd</div>
        <div><strong>sdsdsdsdsd sdsdsdsdsd sdsdsdsdsd</strong></div>
        <div><strong>erterterttr</strong></div>
        <div>Sample <em>Text</em></div>
        <div><font color="#ff0000">ACCCC</font></div>
        <div><font color="#ff0000">sdsdsd</font></div>
        <div><font color="#ff9900">Test Doc</font></div>
        <div><a href="http://sgehmoss01:9005//sites/Conversion/default.aspx">Default</a></div>
        <div> </div>
        <div><a href="www.google.com/">All Items</a></div>
        <div><font color="#ff0000"></font></div>
        <div>AAA</div>
        <div><img alt="Home Page" src="http://sgehmoss01:9005/Sites/Conversion/_layouts/images/homepage.gif"></div>
        <div> </div>
        <div><img alt="Second One" src="http://sgehmoss01:9005/Sites/Conversion/_layouts/images/homepage.gif"></div>
        <div> </div>
        <div>Saasasas</div>
        <div>asas</div>
        <div>as</div>
        <div> </div>
        <div>asas</div></div>
        </body>
        </html>

  • Anonymous
    January 27, 2014
    The comment has been removed

  • Anonymous
    January 28, 2014
    @Matt, Yes, you are correct, there are certain places where you can put altChunk, and other places where you can't.  altChunk imports block content (i.e. siblings of paragraphs and tables) so the altChunk element needs to go there, not within a paragraph.  Beyond that, I'm not sure. I'd be happy to take a look at one of your corrupted docs and I can probably tell you what is wrong.  If you would be good enough to submit the question on the forums at OpenXmlDeveloper.org, it would be super easy for me to respond.  Also, by answering the question there, others can take advantage of the answer. Cheers, Eric
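
The placement rule in the reply above (altChunk is block content, a sibling of paragraphs and tables, not a child of a paragraph or run) can be illustrated with a small LINQ to XML sketch. The names `AltChunkPlacement` and `HoistAltChunks` are hypothetical, and the approach of moving a misplaced altChunk to just after its enclosing paragraph is an assumption made for illustration, not something the post prescribes. It leaves the now-empty run behind and makes no claim about how Word will style the imported content.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of the Open XML SDK).
public static class AltChunkPlacement
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Moves every w:altChunk that is nested inside a w:p so that it becomes
    // the next sibling of that paragraph. Returns the number of elements moved.
    public static int HoistAltChunks(XElement root)
    {
        var nested = root.Descendants(W + "altChunk")
            .Where(c => c.Ancestors(W + "p").Any())
            .ToList();                       // snapshot before mutating the tree

        // Work backwards so several chunks in one paragraph keep their order.
        nested.Reverse();
        foreach (var chunk in nested)
        {
            var para = chunk.Ancestors(W + "p").First(); // nearest enclosing paragraph
            chunk.Remove();
            para.AddAfterSelf(chunk);
        }
        return nested.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        // A body where the altChunk was wrongly placed inside a run.
        var body = XElement.Parse(
            "<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' " +
            "xmlns:r='http://schemas.openxmlformats.org/officeDocument/2006/relationships'>" +
            "<w:p><w:r><w:altChunk r:id='AltChunkId1'/></w:r></w:p></w:body>");

        int moved = AltChunkPlacement.HoistAltChunks(body);

        Console.WriteLine("Moved: " + moved);
        Console.WriteLine(body);   // the altChunk now follows the paragraph
    }
}
```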