Converting RTF to HTML

Have you ever needed to convert RTF text into HTML? Probably not, but if you do, you are in luck. I recently needed this conversion and, after some searching, found a way to do it by extending a sample distributed in the MSDN library. The sample is called: XAML to HTML Conversion Demo

The sample contains code that converts HTML to and from a XAML flow document. That does not help on its own until you realize that RTF can easily be converted to XAML. The key is System.Windows.Controls.RichTextBox: a TextRange over its document can load RTF from a stream and save the same content as XAML. This conversion is shown below:

        // Requires: using System.IO; using System.Windows;
        //           using System.Windows.Controls; using System.Windows.Documents;
        private static string ConvertRtfToXaml(string rtfText)
        {
            if (string.IsNullOrEmpty(rtfText)) return "";

            // RichTextBox is a WPF control, so this must run on an STA thread.
            var richTextBox = new RichTextBox();
            var textRange = new TextRange(richTextBox.Document.ContentStart, richTextBox.Document.ContentEnd);

            // Load the RTF text into the control's document.
            using (var rtfMemoryStream = new MemoryStream())
            {
                using (var rtfStreamWriter = new StreamWriter(rtfMemoryStream))
                {
                    rtfStreamWriter.Write(rtfText);
                    rtfStreamWriter.Flush();
                    rtfMemoryStream.Seek(0, SeekOrigin.Begin);
                    textRange.Load(rtfMemoryStream, DataFormats.Rtf);
                }
            }

            // Save the same document back out as XAML.
            using (var xamlMemoryStream = new MemoryStream())
            {
                textRange = new TextRange(richTextBox.Document.ContentStart, richTextBox.Document.ContentEnd);
                textRange.Save(xamlMemoryStream, DataFormats.Xaml);
                xamlMemoryStream.Seek(0, SeekOrigin.Begin);
                using (var xamlStreamReader = new StreamReader(xamlMemoryStream))
                {
                    return xamlStreamReader.ReadToEnd();
                }
            }
        }

With this code we have everything needed to convert RTF to HTML. I modified the sample to add this RTF-to-XAML conversion and then ran the resulting XAML through the sample's HTML converter, which produces the HTML text. I also added an interface to these conversion utilities and turned the sample into a library so that I could use it from other projects. Here is the interface and its implementation:

    public interface IMarkupConverter
    {
        string ConvertXamlToHtml(string xamlText);
        string ConvertHtmlToXaml(string htmlText);
        string ConvertRtfToHtml(string rtfText);
    }

    public class MarkupConverter : IMarkupConverter
    {
        public string ConvertXamlToHtml(string xamlText)
        {
            return HtmlFromXamlConverter.ConvertXamlToHtml(xamlText, false);
        }

        public string ConvertHtmlToXaml(string htmlText)
        {
            return HtmlToXamlConverter.ConvertHtmlToXaml(htmlText, true);
        }

        public string ConvertRtfToHtml(string rtfText)
        {
            return RtfToHtmlConverter.ConvertRtfToHtml(rtfText);
        }
    }

With this I am now able to convert RTF to HTML. There is one catch, however: the conversion uses the WPF RichTextBox control, which requires a single-threaded apartment (STA). Any code that calls the ConvertRtfToHtml function must therefore also run in an STA. If your program cannot run in an STA, you must create a new STA thread to run the conversion, like this:

// Requires: using System.Threading;
private readonly MarkupConverter markupConverter = new MarkupConverter();

private string ConvertRtfToHtml(string rtfText)
{
   var thread = new Thread(ConvertRtfInSTAThread);
   var threadData = new ConvertRtfThreadData { RtfText = rtfText };
   thread.SetApartmentState(ApartmentState.STA); // must be set before the thread starts
   thread.Start(threadData);
   thread.Join(); // block until the conversion has finished
   return threadData.HtmlText;
}

private void ConvertRtfInSTAThread(object rtf)
{
   // A direct cast fails immediately if the wrong argument type is passed in.
   var threadData = (ConvertRtfThreadData)rtf;
   threadData.HtmlText = markupConverter.ConvertRtfToHtml(threadData.RtfText);
}

// Carries the input to the STA thread and the result back to the caller.
private class ConvertRtfThreadData
{
   public string RtfText { get; set; }
   public string HtmlText { get; set; }
}
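
The thread-per-call pattern above can be generalized. What follows is only a sketch, not part of the MarkupConverter library: the names StaRunner and Run are hypothetical, and it assumes Windows, since STA apartments are not supported elsewhere. It runs any delegate on a dedicated STA thread and, unlike the version above, passes an exception thrown during the conversion back to the caller instead of leaving it unhandled on the worker thread.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of the MarkupConverter library):
// runs a delegate on a dedicated STA thread and returns its result.
public static class StaRunner
{
    public static T Run<T>(Func<T> func)
    {
        if (func == null) throw new ArgumentNullException("func");

        T result = default(T);
        Exception error = null;

        var thread = new Thread(() =>
        {
            try
            {
                result = func();
            }
            catch (Exception ex)
            {
                // Capture the failure so it can be reported on the calling thread.
                error = ex;
            }
        });

        // The apartment state must be set before the thread starts.
        // Assumption: Windows; other platforms do not support STA.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join(); // block until the delegate has finished

        if (error != null)
        {
            // Wrap the original exception so its stack trace is preserved as InnerException.
            throw new InvalidOperationException("The operation on the STA thread failed.", error);
        }
        return result;
    }
}
```

With a helper like this, the call site might shrink to a single line such as: string html = StaRunner.Run(() => markupConverter.ConvertRtfToHtml(rtfText));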

Here is the zip file containing the code for the markup converter: MarkupConverter.zip

Comments

  • Anonymous
    June 06, 2010
    Hi. First, thanks, you have done a good job. I am using the DLL to convert RTF to HTML in an SSRS 2008 project (RDL files). My problem is that RDL doesn't support the style text-decoration:underline. What changes need to be made in the DLL to use the U tag instead? Thanks

  • Anonymous
    August 10, 2010
    Can it work the other way around? I mean, can I convert from HTML to RTF? Thank you very much