DOMParser and XMLSerializer in IE9 Beta

We've talked a lot about UI and browser features lately. Today I want to get back to web development by discussing some additions to the platform in IE9 Beta: DOMParser and XMLSerializer.

What do they do?

DOMParser enables building a document from an XML string and XMLSerializer allows you to serialize it back again. Together they make XML to DOM conversions as simple as using JSON, making it easier to use XML as a data-transfer format. More importantly, the nodes created by DOMParser are native, meaning they can be inserted and rendered within any web page. Plus XMLSerializer can serialize any native DOM node to an XML string, even nodes from HTML documents. This native-ness makes it easier to render your data directly, without having to transform it to HTML first.

How do they work?

Below is a basic example of how these APIs can be used. Check out the DOMParser and XMLSerializer demo on our Test Drive site for more details and a live sample.

// Parse a string into an XML document
var parser = new DOMParser();
var doc = parser.parseFromString("<myXML/>", "text/xml");
// Serialize any native DOM node to an XML string
// (including nodes from HTML documents)
var serializer = new XMLSerializer();
var xmlString = serializer.serializeToString(doc);

The second parameter to parseFromString should be "text/xml" or "application/xml" for best cross-browser compatibility. For the full list of supported strings in IE9, see the MSDN documentation for parseFromString.

Why did we add these APIs?

Although these APIs are non-standard, they are supported in the latest versions of Firefox, Chrome, Safari, and Opera. They are also used by a number of existing sites and frameworks. Given this real-world usage and cross-browser support, we chose to implement these APIs in IE9 as part of our commitment to enabling the "Same Markup" on the web. Having these APIs helps more sites run the same code in the same way cross-browser. They also make working with XML from script easier.

How is this different from MSXML?

MSXML provides an XML structure that is separate from IE's native DOM. This means MSXML objects cannot be inserted and rendered within a web page. MSXML objects also do not get the interoperability and performance benefits of native JavaScript integration. The performance difference is particularly noticeable when copying elements from MSXML to HTML in order to render them.

How does this work with XMLHttpRequest (XHR)?

The responseXML property of XMLHttpRequest still returns an MSXML object in IE9, but you can use DOMParser with responseText to get a native XML object instead.

// Using DOMParser with XMLHttpRequest
var parser = new DOMParser();
var xhr = new XMLHttpRequest();
...
var doc = parser.parseFromString(xhr.responseText, "text/xml");
...
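
For readers who want a fuller picture, the following is a minimal sketch of the same idea as a complete asynchronous request. The function name loadNativeXml, the callback(error, doc) convention, and the deps parameter are all assumptions made for illustration; deps exists only so stand-in constructors can be supplied when the code runs outside a browser.

```javascript
// Hypothetical helper (not a browser API): fetch `url` and pass a natively
// parsed XML document to `callback(error, doc)`.
// `deps` is an assumption of this sketch; it allows the XMLHttpRequest and
// DOMParser constructors to be replaced, for example in tests.
function loadNativeXml(url, callback, deps) {
    var Xhr = (deps && deps.XMLHttpRequest) || XMLHttpRequest;
    var Parser = (deps && deps.DOMParser) || DOMParser;
    var xhr = new Xhr();

    xhr.open("GET", url, true);
    xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) {
            return; // 4 means the request has finished
        }
        if (xhr.status !== 200) {
            callback(new Error("Request failed with status " + xhr.status), null);
            return;
        }
        var doc;
        try {
            // Parse responseText instead of reading responseXML,
            // which is still an MSXML object in IE9
            doc = new Parser().parseFromString(xhr.responseText, "text/xml");
        } catch (e) {
            callback(e, null);
            return;
        }
        callback(null, doc);
    };
    xhr.send(null);
}
```

In a page this could be called as loadNativeXml("data.xml", function (error, doc) { ... }), where "data.xml" is a placeholder URL.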

What about XML parsing errors?

When the provided XML is not well-formed, the parseFromString API will throw a SYNTAX_ERR DOM Exception. This was chosen to align with the error handling behavior of innerHTML in XML documents as per the HTML5 spec.
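
Because some other browsers are reported to return a document containing a <parsererror> element rather than throwing, cross-browser code may want to handle both outcomes. The sketch below is one possible way to do that, not a prescribed pattern: parseXml is a hypothetical helper name, and its optional ParserCtor parameter is an assumption added only so a stand-in parser can be supplied outside a browser.

```javascript
// Hypothetical helper (not a browser API): parse XML and report failures
// the same way regardless of how the browser signals them.
// Returns { doc: Document or null, error: Error or null }.
function parseXml(xmlString, ParserCtor) {
    var Ctor = ParserCtor || (typeof DOMParser !== "undefined" ? DOMParser : null);
    if (!Ctor) {
        return { doc: null, error: new Error("DOMParser is not available") };
    }
    try {
        var doc = new Ctor().parseFromString(xmlString, "text/xml");
        // Some browsers return an error document instead of throwing
        var errors = doc.getElementsByTagName("parsererror");
        if (errors.length > 0) {
            return { doc: null, error: new Error(errors[0].textContent) };
        }
        return { doc: doc, error: null };
    } catch (e) {
        // IE9 Beta throws SYNTAX_ERR when the XML is not well-formed
        return { doc: null, error: e };
    }
}
```

A caller can then check result.error once instead of mixing try/catch with error-document inspection.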

Next Steps

As I mentioned before, many sites already use the DOMParser and XMLSerializer APIs today. Make sure your pages use feature detection to properly identify support for these APIs when using them:

if (window.DOMParser) {
    // Code relying on DOMParser support
} else {
    // Fallback code
}
if (window.XMLSerializer) {
    // Code relying on XMLSerializer support
} else {
    // Fallback code
}
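
As an illustration of what the fallback branches might contain, here is a minimal sketch that falls back to MSXML in older versions of IE. The helper names parseXmlString and serializeXmlNode are hypothetical, as is the env parameter, which stands in for window so the logic can be exercised outside a browser. The fallback assumes the "Microsoft.XMLDOM" ProgID is available; documents it returns are MSXML objects, not native DOM nodes.

```javascript
// Hypothetical helpers (not browser APIs). `env` stands in for `window`.
function parseXmlString(xmlString, env) {
    if (env.DOMParser) {
        // Native path: IE9 and other modern browsers
        return new env.DOMParser().parseFromString(xmlString, "text/xml");
    }
    if (env.ActiveXObject) {
        // MSXML fallback for older IE versions (assumes this ProgID is installed)
        var msxmlDoc = new env.ActiveXObject("Microsoft.XMLDOM");
        msxmlDoc.async = false; // load synchronously
        msxmlDoc.loadXML(xmlString);
        return msxmlDoc;
    }
    throw new Error("No XML parser available");
}

function serializeXmlNode(node, env) {
    if (env.XMLSerializer) {
        return new env.XMLSerializer().serializeToString(node);
    }
    if (typeof node.xml === "string") {
        // MSXML nodes expose their markup through the `xml` property
        return node.xml;
    }
    throw new Error("No XML serializer available");
}
```

In a page, both helpers would be called with window as the env argument, for example parseXmlString("<myXML/>", window).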

Tony Ross
Program Manager

Comments

  • Anonymous
    October 15, 2010
    What document/browser modes are these usable in? Are they usable outside HTML5 and XHTML documents? I have a tool that uses DOMParser to insert inline SVG in old HTML documents -- it works in non-HTML5 documents in other browsers using feature detection, curious if it will in IE9.  (When it can't insert SVG, it inserts raster images instead.) Don't know if this is answerable, but as a rule, is switching browser/document modes more like setting a quirks flag in IE9 or more like loading a copy of the old rendering engine?  Like, are new features normally present in other modes?  I guess new features must be missing in compatibility modes, or else old feature-detecting JavaScript targeting an old version of IE might fail.  But would be interesting to hear from you guys about it.

  • Anonymous
    October 15, 2010
    "When the provided XML is not well-formed, the parseFromString API will throw a SYNTAX_ERR DOM Exception." Mozilla returns a <parsererror></parsererror> document (IMO very broken and unexpected behavior), and Opera previously threw the exception but was forced to change because websites relied on the Gecko behavior. I think WebKit has always done what Mozilla did. So if you don't want sites breaking all of a sudden, you might want to do the same thing (although I really don't like that behavior).

  • Anonymous
    October 15, 2010
    When I bring XML content to my page that includes rows to insert in a table or options to drop into a select list will IE9 handle this without barfing? e.g. var domParser = new DOMParser(); mySelectObject.innerHTML = domParser.parseFromString( optionGroupAndOptionsFromXMLHTTPRequestResponse ); This was always one of the biggest issues with IE when trying to use it as an end user platform for serious web applications.  90% of the time I want to fetch data to stuff into a table or a drop down list and IE is the only browser that can't handle this. I don't think the development community can take IE9 seriously until this works across all elements without fail... but my current tests show that this is still broken in IE9 beta 1. Also, since IE9 now supports SVG can IE handle SVGElement.appendChild(XMLNodeOfNodesOfNodesOfNodes);??? or are we going to have bugs in IE9's SVG from day 1 to make development a pain in IE for the next 15 years too. adam

  • Anonymous
    October 16, 2010
    :) great job

  • Anonymous
    October 16, 2010
    Thanks! I lost hope as you closed my feedback as "by design" in March: connect.microsoft.com/.../please-implement-domparser-and-xmlserializer best regards Holger

  • Anonymous
    October 16, 2010
    You enable Same Markup by implementing error handling that's incompatible with all existing user agents? Really? That's so sad it isn't even funny.

  • Anonymous
    October 16, 2010
    Cool! Can I ask why XMLHttpRequest.responseXML still returns an MSXML object in IE9, not native XML?

  • Anonymous
    October 16, 2010
    Nice work guys, I’m glad to see that I no longer have to deal with the MSXML quirks. However, does using DOMParser with responseText respect the XML encoding declaration? In other browsers it doesn’t, which is why I normally try to avoid parsing responseText. I would suggest to instead either 1. let responseXML return a document in the new DOM, or 2. add a boolean setting to XMLHttpRequest to direct it to use the new DOM.

  • Anonymous
    October 16, 2010
    To explain further what I mean, this testcase www.grauw.nl/.../test-responsetext.html currently throws error: XML5111: WC_E_XMLCHARACTER: illegal xml character in IE9. With responseText already being decoded to a native JS string, you cannot really properly interpret responseText as another encoding anymore. (Although the latest versions of the other browsers seem to have worked their way around it; it looks like they are not just doing encoding sniffing but even content sniffing, as when I enter ISO-8859-5 in the encoding (test file 3), responseText will show a different character.)

  • Anonymous
    October 17, 2010
    IE9beta is horrendously slow on this page: www.whatwg.org/.../current-work I hope it will be better in next beta.

  • Anonymous
    October 17, 2010
    Great Job Microsoft! And please do not forget to support Webworkers and Websockets. Chrome and already support it and FireFox 4.0 will support it.

  • Anonymous
    October 17, 2010
    It's very interesting.

  • Anonymous
    October 17, 2010
    @xslt 2.0 support: yes please! @Aseem: i too am not happy about IE XMLHttpRequest returning msxml objects. g

  • Anonymous
    October 18, 2010
    @Adam, guiseppe: Shorter would be:

        var domParser = new DOMParser();
        var node = domParser.parseFromString(optionGroupAndOptionsFromXMLHTTPRequestResponse, "text/xml");
        var rightDocNode = document.importNode(node, true);
        mySelectObject.appendChild(rightDocNode);

    Add some object detection for old IEs :-)

  • Anonymous
    October 18, 2010
    @Holger - are you just suggesting this out loud as an idea???... In most cases we don't want an optgroup (IE has issues rendering them anyway) We just want to add [x] options which could be 25 or 250 options which means that you would still need to make 25 to 250 calls to appendChild() for each option which completely misses the point of importing a DOM structure. Plain and simple, MSFT should hurry up and fix .innerHTML on Selects and Tables before they go to market with IE9 touting it as an HTML5 standards based browser when it most certainly fails the most utterly basic of DOM manipulations.

  • Anonymous
    October 18, 2010
    @holger i thought so too, tried that in firefox before posting, and it did not work. note that in your code, the 'node' variable holds a DOMDocument node, which is not what you want to append to the select element. DOMParser does not return DOMDocumentFragment, nodelist, nodeset or the like.

    this is what i tried and did not work in firefox 3.6:

        var child = doc.documentElement.firstChild;
        do {
            yourSelectElement.appendChild(document.importNode(child, true));
        } while (child = child.nextSibling);

    result: no error message; nodes are inserted into DOM; nodes have correct ownerDocument, but apparently are not recognized as HTMLOptionElement. i double checked now with latest final opera and latest dev channel chrome, with similar results (no error msg, same resulting DOM on quick look, but different rendering of DOM).

    this is what i also tried:

        var child = doc.documentElement.firstChild;
        do {
            yourSelectElement.add(document.importNode(child, true), null);
        } while (child = child.nextSibling);

    result: no nodes are inserted into DOM.
    firefox: "Could not convert JavaScript argument arg 0 [nsIDOMHTMLSelectElement.add]"  nsresult: "0x80570009 (NS_ERROR_XPC_BAD_CONVERT_JS)"
    Opera: "Uncaught exception: Error: WRONG_ARGUMENTS_ERR"
    chrome: no error message

    the doctype was html5 and the document rendered in standard mode.

    @Mona: "you would still need to make 25 to 250 calls to appendChild() for each option" while i am certain you wanted to write "[...] for each select", it should be noted that you do not have to call appendChild manually for each option, hence the loop. and 250 calls, using the HTMLSelectElement.add method as described in my previous post, take about 7ms (firefox 3.6), 4.5ms (opera) and 3.5ms (chrome) on a 2 year old notebook (core2duo T9400 @2.53GHz), so this should be negligible, especially considering the time it takes to complete the http request for fetching the data.

    additionally, the select could be populated while the transfer is in progress, to potentially save some fraction of the given time. last but not least, writing to innerHTML is not free either, as, you guessed it, besides parsing, the nodes also have to be inserted into the DOM.

    i am starting to get the feeling that to some, complaining is more important than having a working solution. every ieblog entry has comments with innerHTML complaints. it is annoying. my automatic adaptive in-brain content filter is starting to classify such whiners as trolls and/or hobbyists with lack of knowledge (about their lack of knowledge). g

  • Anonymous
    October 19, 2010
    @Craig It freezes firefox4beta for about 15 sec on my core2duo but then it works though firefox seems kinda little laggy. On IE9, it freezes for a long time every time you want to scroll. On Chrome, it's fine.

  • Anonymous
    October 19, 2010
    @eiras Thanks for the heads up about <parsererror></parsererror>. So far I haven't heard any reports of sites breaking due to the thrown exception, but please share any if you know about them. Overall we chose to throw the exception based on the following considerations:

      • Exact error document format was inconsistent cross-browser
      • Alignment with the HTML5 behavior for innerHTML in XHTML
      • Easier for developers to identify during coding
      • Reduced compatibility risk since most live XML doesn't trigger parser errors

  • Anonymous
    October 19, 2010
    @Laurens Holst Thanks for pointing out the encoding issue; I'll look into it. As for responseXML, we kept it returning an MSXML object for compatibility since there are a couple of XML features our native DOM doesn't have yet, namely XPath. I do think your suggestion of using a boolean setting to opt-in to a native DOM may provide a workable alternative in the interim. I'll investigate further and see what I can do.

  • Anonymous
    October 19, 2010
    @Tony Ross [MSFT] - re: "I'll investigate further and see what I can do" - apparently you weren't briefed on the MSFT feedback protocol.  The correct standard reply is: "Closed - By Design" for all features, bugs, implementations (working or broken). However seeing as you haven't been assimilated yet - any chance you can jump into the .innerHTML code and fix a few of the long-standing major bugs that IE has that currently make IE9 incapable of claiming proper HTML5 support? - thanks!

  • Anonymous
    October 24, 2010
    Preliminary testing seemed to show that XMLSerializer is smarter than innerHTML and provides better code. Does that mean you should change innerHTML to use the same code as XMLSerializer?
