How to: Access the HTML Source in the Managed HTML Document Object Model

The DocumentStream and DocumentText properties of the WebBrowser control return the HTML of the current document as it existed when it was first displayed. If you modify the page through methods and properties such as AppendChild and InnerHtml, those changes do not appear in DocumentStream or DocumentText. To obtain the up-to-date HTML source for the DOM, read the OuterHtml property of the HTML element.
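
As a rough illustration of this difference, the following sketch changes the page through the DOM and then reads both sources. It assumes a form that hosts a WebBrowser named webBrowser1 whose page has already finished loading; the method name ShowSourceDifference and the marker text are hypothetical and are not part of the procedure in this topic.

```csharp
using System;
using System.Windows.Forms;

public partial class Form1 : Form
{
    // Hypothetical helper: compares the original source with the live DOM source.
    private void ShowSourceDifference()
    {
        HtmlDocument doc = webBrowser1.Document;
        if (doc == null || doc.Body == null)
        {
            return; // No page is loaded yet.
        }

        // Change the live DOM by appending a new paragraph.
        const string marker = "Added at run time";
        HtmlElement para = doc.CreateElement("P");
        para.InnerText = marker;
        doc.Body.AppendChild(para);

        // DocumentText holds the HTML as it was first displayed.
        string originalSource = webBrowser1.DocumentText;

        // OuterHtml on the HTML element holds the current state of the DOM.
        HtmlElementCollection elems = doc.GetElementsByTagName("HTML");
        if (elems.Count == 1)
        {
            string currentSource = elems[0].OuterHtml;

            MessageBox.Show(
                "Marker in DocumentText: " + originalSource.Contains(marker) +
                Environment.NewLine +
                "Marker in OuterHtml: " + currentSource.Contains(marker));
        }
    }
}
```

Based on the behavior described above, the marker should be reported only for OuterHtml, provided the original page did not already contain that text.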

The following procedure shows how to retrieve the dynamic source and display it in a separate form.

Retrieving the dynamic source with the OuterHtml property

  1. Create a new Windows Forms application. Start with a single Form, and call it Form1.

  2. Host the WebBrowser control in your Windows Forms application, and name it WebBrowser1. For more information, see How to: Add Web Browser Capabilities to a Windows Forms Application.

  3. Create a second Form in your application called CodeForm.

  4. Add a RichTextBox control to CodeForm and set its Dock property to Fill.

  5. Create a public property on CodeForm called Code.

    // C#
    // Exposes the RichTextBox text so that other forms can get or set the displayed source.
    public string Code
    {
        get
        {
            // Fall back to an empty string if no text is available.
            return richTextBox1.Text ?? "";
        }
        set
        {
            richTextBox1.Text = value;
        }
    }
    
    ' Visual Basic
    ' Exposes the RichTextBox text so that other forms can get or set the displayed source.
    Public Property Code() As String
        Get
            If (RichTextBox1.Text IsNot Nothing) Then
                Return RichTextBox1.Text
            Else
                Return ""
            End If
        End Get
    
        Set(ByVal value As String)
            RichTextBox1.Text = value
        End Set
    End Property
    
  6. Add a Button control named Button1 to Form1, and handle its Click event. For details on handling events, see Events.

  7. Add the following code to the Click event handler.

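    // C#
    private void button1_Click(object sender, EventArgs e)
    {
        // Document is null until a page has been loaded.
        if (webBrowser1.Document != null)
        {
            // The root HTML element contains the entire current DOM.
            HtmlElementCollection elems = webBrowser1.Document.GetElementsByTagName("HTML");
            if (elems.Count == 1)
            {
                HtmlElement elem = elems[0];

                // Create the form only when there is source to show.
                // OuterHtml reflects changes made after the page was first displayed.
                CodeForm cf = new CodeForm();
                cf.Code = elem.OuterHtml;
                cf.Show();
            }
        }
    }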
    
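    ' Visual Basic
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Document is Nothing until a page has been loaded.
        If (WebBrowser1.Document IsNot Nothing) Then
            ' The root HTML element contains the entire current DOM.
            Dim elems As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("HTML")
            If (elems.Count = 1) Then
                Dim elem As HtmlElement = elems(0)

                ' Create the form only when there is source to show.
                ' OuterHtml reflects changes made after the page was first displayed.
                Dim cf As New CodeForm()
                cf.Code = elem.OuterHtml
                cf.Show()
            End If
        End If
    End Sub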
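
If the current source is needed in more than one place, the lookup in the Click handler could be factored into a small helper. The following is a minimal sketch under that assumption; the class name WebBrowserSourceHelper and the method name GetCurrentHtml are hypothetical and do not belong to the Windows Forms API.

```csharp
using System.Windows.Forms;

// Hypothetical helper class; not part of the procedure above.
public static class WebBrowserSourceHelper
{
    // Returns the current HTML of the DOM, or an empty string
    // if no document is available.
    public static string GetCurrentHtml(WebBrowser browser)
    {
        if (browser == null || browser.Document == null)
        {
            return string.Empty;
        }

        // Look up the root HTML element, as the Click handler does.
        HtmlElementCollection elems = browser.Document.GetElementsByTagName("HTML");
        if (elems.Count != 1)
        {
            return string.Empty;
        }

        return elems[0].OuterHtml ?? string.Empty;
    }
}
```

With such a helper in place, the handler body could be reduced to creating the CodeForm and assigning `cf.Code = WebBrowserSourceHelper.GetCurrentHtml(webBrowser1);` before calling `cf.Show()`.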
    

Robust Programming

Always test the value of Document for null before attempting to use it. If the current page has not finished loading, Document or one or more of its child objects may not be initialized.
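
One possible way to reduce the chance of reading a partially loaded document is to keep the button disabled until the control reports that loading is complete. The sketch below assumes the control names from the procedure (webBrowser1, button1) and assumes that Form1_Load is wired to the form's Load event; it does not replace the null check in the Click handler.

```csharp
using System;
using System.Windows.Forms;

public partial class Form1 : Form
{
    // Assumed to be attached to the form's Load event.
    private void Form1_Load(object sender, EventArgs e)
    {
        // Keep the button disabled until a page has finished loading.
        button1.Enabled = false;
        webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted may be raised more than once for pages with frames,
        // so also check the overall ready state of the control.
        if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
        {
            button1.Enabled = true;
        }
    }
}
```

This sketch does not disable the button again when a new navigation starts; an application that navigates repeatedly would likely also need to handle that case.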
