XAML FlowDocument to HTML Conversion Prototype

XAML FlowDocuments and HTML have some things in common, but they also have distinct differences that make writing a conversion utility tricky. A well-written XSLT stylesheet could potentially process XHTML input and generate FlowDocument content, but this presupposes well-formed HTML in the first place. I've tried to go down this road on a few occasions with limited success.
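
To make the XSLT idea concrete, here is a minimal sketch of what such a transform might look like when driven from C#. It is not part of the attached project: the class name, the stylesheet, and the three element mappings (p, b, i) are assumptions chosen for illustration. It only works on well-formed XHTML in the XHTML namespace with no DOCTYPE, which is exactly the limitation described above.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class XhtmlToFlowDocumentSketch
{
    // Hypothetical stylesheet: maps only <p>, <b>, and <i>.
    // Any other element falls through to XSLT's built-in rules, which keep its text
    // and drop its tags, so real content would need many more templates.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:h='http://www.w3.org/1999/xhtml'
    xmlns='http://schemas.microsoft.com/winfx/2006/xaml/presentation'
    exclude-result-prefixes='h'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='/'>
    <FlowDocument><xsl:apply-templates select='//h:body/*'/></FlowDocument>
  </xsl:template>
  <xsl:template match='h:p'><Paragraph><xsl:apply-templates/></Paragraph></xsl:template>
  <xsl:template match='h:b'><Bold><xsl:apply-templates/></Bold></xsl:template>
  <xsl:template match='h:i'><Italic><xsl:apply-templates/></Italic></xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xhtml)
    {
        // Compile the stylesheet from the string above.
        var xslt = new XslCompiledTransform();
        using (var sheet = XmlReader.Create(new StringReader(Stylesheet)))
            xslt.Load(sheet);

        // XmlReader throws on malformed markup, so tag-soup HTML never gets this far.
        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xhtml)))
        using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
            xslt.Transform(input, writer);
        return output.ToString();
    }

    static void Main()
    {
        string xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'>" +
                       "<body><p>Hello <b>world</b></p></body></html>";
        Console.WriteLine(Transform(xhtml));
    }
}
```

Feeding this sketch typical browser-generated HTML (unclosed tags, unquoted attributes) makes the XmlReader throw before the stylesheet ever runs, which is why the XSLT route only pays off when the input is already clean XHTML.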

Since most HTML isn't well formed, a more flexible solution was to build a conversion library. The attached application contains class libraries capable of converting from HTML to FlowDocument, or from FlowDocument to HTML. I can't emphasize enough that this is simply a prototype -- true fidelity of content is not promised nor is it expected. However, if you're interested in playing with (and potentially improving upon) a conversion prototype, you'll find the attached project very useful. The user interface is basic -- simply a TextBox into which you can paste content for conversion. The converted content appears in the same TextBox after the "Convert!" button is pressed.

These classes can also be used to process the entire contents of a folder and turn all of the HTML contained therein to XAML. We're using a similar technique for our updated version of the SDKViewer demo that will ship with the RC1 SDK (so content will be up to date in future versions of the application, unlike the present circumstance). You could do something similar, using a foreach loop, like the following:

C#

// Requires: using System.IO; using System.Text;
// filepath is the folder that contains the HTML files to convert.
Directory.CreateDirectory("test");
string[] htmlFiles = Directory.GetFiles(filepath);
foreach (string htmlPath in htmlFiles)
{
    // GetFiles returns paths that include the folder, so keep only the file name for the output path.
    string xamlPath = Path.Combine("test", Path.GetFileName(htmlPath) + ".xaml");
    string html = File.ReadAllText(htmlPath);
    File.WriteAllText(xamlPath, HtmlToXamlConverter.ConvertHtmlToXaml(html, true), Encoding.UTF8);
}

This conversion library isn't perfect -- but it gives you a big head start if your enterprise is considering conversion from HTML to FlowDocuments. Converting your HTML content would allow you to take advantage of the enhanced reading capabilities of WPF, including paginated content, magnification, and annotations.

For more information on the WPF document platform, see this SDK topic:

https://windowssdk.msdn.microsoft.com/library/en-us/wpf_conceptual/html/6e8db7bc-050a-4070-aa72-bb8c46e87ff8.asp?frame=true

Good luck.

-Keith

About Us


We are the Windows Presentation Foundation SDK writers and editors.

html2flow.zip

Comments

  • Anonymous
    May 25, 2006
    Thanks!
    I've been trying to come up with a good solution for this problem for a while (I've got as far as playing with some regular expressions but it's fragile).

  • Anonymous
    March 08, 2007
    You might want to have a look at Chris Lovett's SGMLReader class, as this is a slot-in replacement for an XMLReader AND will read an HTML file.  This then can be piped into an XSL stylesheet and transformed directly into XAML.

  • Anonymous
    May 27, 2007
    Hello, great, but no images?

  • Anonymous
    February 25, 2008
    Nice but doesn't seem to do table borders any way I can work out

  • Anonymous
    August 16, 2010
    Great tool, theoretically. It is just too rough. It does not support lots of elements. Something more complete would be great. Thanks

  • Anonymous
    April 24, 2011
    I am using this solution to convert my XAML to HTML. When I paste my XAML I get the equivalent HTML, but the actual text (or the Text property in the XAML) is not returned between the HTML tags. Please suggest a solution to this, or please tell me where I am going wrong. Thank you

  • Anonymous
    April 26, 2011
    Hi Madhur, The best place to get your question answered is the MSDN forums at social.msdn.microsoft.com/.../threads.  This sample has been removed from our sample set and is no longer supported.  Perhaps someone in the forums will be able to assist you. Thanks.

  • Anonymous
    May 31, 2011
    What license is this code released under? The files only say "Copyright (C) Microsoft Corporation.  All rights reserved.", but don't contain any trace of a license statement. As it stands, I'm not even sure if I am allowed to extend it or even use it in my (commercial) application to help with getting some earlier documents imported.

  • Anonymous
    September 03, 2012
    While converting with 'HtmlToXaml' I get a 'system out of memory exception' when trying to get: string xaml = xamlFlowDocumentElement.OuterXml; I also tried using StringBuilder() but got the same error. Below is the code:

        StringBuilder stringBuilder = new StringBuilder();
        XmlWriter xmlWriter = new XmlTextWriter(new StringWriter(stringBuilder));
        xamlFlowDocumentElement.WriteTo(xmlWriter);
        //return stringBuilder.ToString();
        var stringReader = new StringReader(stringBuilder.ToString());
        XmlReader xmlReader = XmlReader.Create(stringReader);
        object o = XamlReader.Load(xmlReader);
        if (o == null)
        {
            throw new Exception("Error parsing XAML. XamlReader.Load returns null.");
        }
        xamlFlowDocumentElement = null;
        stringBuilder = null;
        return o as FlowDocument;

    Any help?