IE and the Accept Header

RFC 2616 describes the Accept request header as follows:

The Accept request-header field can be used to specify certain media types which are acceptable for the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image.

While I’ll spare you a debate about the merits and pitfalls of server-driven content negotiation, suffice it to say that I think that such negotiation is impractical (and/or suboptimal) for most scenarios encountered by general-purpose web browsers. The primary reason to spare the debate (and inevitable flame war) is that it’s mostly a moot point, because all versions of Internet Explorer are seriously limited when it comes to support of the Accept header. Note: other browsers actually suffer from similar limitations in certain cases; I’m focused on IE in this post because it’s the browser I work on.

Let’s take a look at what IE sends in the Accept header.

Install a recent version of Fiddler, run it, and click Rules > Customize Rules.  Scroll to the static function Main() block, and add the following line within:

FiddlerObject.UI.lvSessions.AddBoundColumn("Accept", 50, "@request.Accept");

Save the file, and you’ll see that a new column titled Accept appears within the Fiddler UI, showing the value of the Accept request header.

Navigate to websites, and watch the value of the Accept header sent in each request.  You’ll quickly notice that in almost all cases, the Accept header contains */* , meaning that IE is willing to accept documents of any MIME type. This is technically accurate, insofar as IE will offer to download and save MIME-types it doesn’t know how to render.

However, in some navigations, you’ll see that IE sends a more complete string, containing a wide variety of MIME-types.  For instance:

image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/msword, application/vnd.ms-excel, application/x-shockwave-flash, */*

Hit F5 to refresh that page, and IE will probably go back to sending */* again.  Clearly, IE is inconsistent in what it chooses to send in the Accept header, but by now you’re probably curious where these MIME types even come from.

IE generates this list by enumerating the values listed in the following registry key:

HKLM\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Accepted Documents

Any application can, upon install, attempt to advertise the MIME-types it supports using this registry key.  However, I strongly recommend that developers not list MIME types here. 
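
To see what is registered on a particular machine, a minimal sketch along these lines may help. It assumes that each value under the key stores one MIME type as its string data, and that the header is simply those types joined with commas and followed by */*. That format matches the example header above, but it is an assumption rather than a documented algorithm. The function names are hypothetical, and the registry read only works on Windows.

```python
ACCEPTED_DOCUMENTS = (
    r"Software\Microsoft\Windows\CurrentVersion"
    r"\Internet Settings\Accepted Documents"
)


def read_accepted_documents():
    """Return the string data of every value under the key (Windows only)."""
    import winreg  # imported here so the rest of the sketch loads anywhere

    types = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, ACCEPTED_DOCUMENTS) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # (subkeys, values, mtime)
        for index in range(value_count):
            _name, data, _kind = winreg.EnumValue(key, index)
            if isinstance(data, str) and data:
                types.append(data)
    return types


def build_accept(types):
    """Assumed format: registered types, comma-separated, then the wildcard."""
    return ", ".join(list(types) + ["*/*"])
```

On a Windows machine, build_accept(read_accepted_documents()) gives a rough idea of how long the full header could get as more applications register types.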

Why not?  Well, first of all, the list isn’t typically sent, so you cannot write server code which checks for your MIME-type and conditionally serves content of that type only if the client asks for it.  In almost all cases, servers will end up getting requests for content of type */* and return content in your format anyway.
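
A rough sketch of the kind of server-side check this rules out is shown here. The MIME type application/x-example and the function names are hypothetical, and the parsing is deliberately simplified: it ignores q-values and other parameters.

```python
def parse_accept(header):
    """Split an Accept value into lowercase media ranges, dropping parameters."""
    ranges = []
    for item in header.split(","):
        media_range = item.split(";", 1)[0].strip().lower()
        if media_range:
            ranges.append(media_range)
    return ranges


def client_accepts(header, mime_type):
    """True if any media range in the header covers mime_type."""
    mime_type = mime_type.lower()
    main_type = mime_type.split("/", 1)[0]
    for media_range in parse_accept(header):
        if media_range in ("*/*", mime_type, main_type + "/*"):
            return True
    return False


def explicitly_listed(header, mime_type):
    """True only if the exact type appears, ignoring wildcards."""
    return mime_type.lower() in parse_accept(header)
```

When the header is just */*, client_accepts is true for every type and explicitly_listed is false for every type, so neither check tells the server whether the client really has a handler for the custom format.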

More importantly, it turns out that Accept headers containing custom types cause two serious problems:

  1. Slower performance due to bloated requests
  2. Server errors on sites expecting headers of fixed maximum length

The first problem is pretty obvious: any time the full MIME-type list is sent, significant request bandwidth is wasted.  The Accept header above, for instance, is 191 characters long, and due to the asymmetrical nature of bandwidth for most users (upload bandwidth is usually a small fraction of download bandwidth) such waste can quickly add up. 
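
The character count is easy to check, and a small sketch can put a number on the overhead. The figure of 100 requests is purely illustrative, since the full list is only sent on some requests, and the helper name is hypothetical.

```python
ACCEPT_VALUE = (
    "image/jpeg, application/x-ms-application, image/gif, "
    "application/xaml+xml, image/pjpeg, application/x-ms-xbap, "
    "application/msword, application/vnd.ms-excel, "
    "application/x-shockwave-flash, */*"
)


def extra_bytes(value, requests, baseline="*/*"):
    """Bytes sent beyond what the baseline value would cost over N requests."""
    per_request = len(value.encode("ascii")) - len(baseline.encode("ascii"))
    return per_request * requests


print(len(ACCEPT_VALUE))               # 191
print(extra_bytes(ACCEPT_VALUE, 100))  # 18800
```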

The second problem is less obvious but more serious: Many web server devices and frameworks expect HTTP headers to be shorter than a certain length and will return HTTP error codes (HTTP/400 and HTTP/406 are popular) if overlong headers are received. Beyond the immediate annoyance of such errors, there’s almost never any indication to the user what has gone wrong and how to fix it. Users cannot be expected to know that the problem is an overlong header and find the above-mentioned registry key to start deleting entries.
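
As an illustration of the failure mode, here is a sketch of a server-side length check. The 255-byte limit, the choice of HTTP/400, and the function names are all assumptions made for the example; real devices and frameworks vary in both the limit and the error they return.

```python
def header_line(name, value):
    """Serialize one header as it appears on the wire, CRLF included."""
    return f"{name}: {value}\r\n".encode("latin-1")


def status_for_request(headers, max_header_bytes=255):
    """Return 400 if any single header line exceeds the limit, else 200."""
    for name, value in headers.items():
        if len(header_line(name, value)) > max_header_bytes:
            return 400
    return 200


# Twenty hypothetical registered types push the Accept line past the limit.
bloated = ", ".join(["application/x-example"] * 20)
print(status_for_request({"Accept": "*/*"}))    # 200
print(status_for_request({"Accept": bloated}))  # 400
```

Nothing in the 400 response tells the user that the Accept header was the cause, which is why this kind of failure is so hard to diagnose from the browser.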

Often, an IE user encountering this problem will try another browser and find that it works fine, because other browsers generate their Accept headers from other lists that are less likely to be updated by installed applications. For instance, Firefox sends the value of its network.http.accept.default preference as the content of the Accept header.

Alert readers will notice that Microsoft applications are culprits in Accept header bloat, and this is something that IE will be working with other teams around the company to help mitigate.  In a future IE version, we may remove or substantially change Accept-header generation logic to help eliminate this problem.

User-Agent string extensibility causes a similar problem, and that issue is so prevalent that it will be the subject of a future post: Internet Explorer User-Agent: Use and Abuse.

Thanks for reading!

-Eric

Update: The IE9 Release Candidate significantly changes IE's use of the Accept header in IE9 Browser Mode. IE9 deprecates registry-based extensibility of the Accept header, and rather than sending */* for most downloads, now sends a more-specific Accept header based on which HTML element initiated the request. I wrote about the details of IE9 Accept Headers on the Fiddler Blog.

Comments

  • Anonymous
    July 15, 2009
    Thx, this was straight to the point. I have a question though: "I strongly recommend that developers not list MIME types here". Is there another way to instruct browser to send a particular accept header? I believe it is possible to write an addon for IE, but this could be a heavy solution. Radu

  • Anonymous
    July 15, 2009
    @Ruxi: There's no way to do so that would work reliably. I don't believe adding the header inside BeforeNavigate would work, because Trident will override it. You could use an APP-wrapper around the HTTP/HTTPS protocols, but this suffers from tons of performance, reliability, and maintainability issues.  Wrapping intrinsic protocols with APPs is strongly discouraged and will be the subject of a future post.

  • Anonymous
    July 30, 2009
    The comment has been removed

  • Anonymous
    July 30, 2009
    @Brianary: I think you're confused. There are two points I make in this post: 1> Adding tokens to the Accept header simply doesn't work properly in IE, and thus it should be avoided to avoid introducing problems. 2> "Spamming" is a significant problem for the Accept header, and it's a significant problem for the User-Agent header as well. The User-Agent length issue is severe enough that it will be the topic of a future post. <<"re-evaluating whether every random IE Addon or Windows install should be able to add tokens to the UA.">> Indeed; as with the Accept header, we'll be looking at whether or not we could change the UA code in the future to not send custom tokens. However, you must keep in mind that other browsers also allow such tokens (e.g. see general.useragent in Firefox), and that some deployed servers and services rely on such tokens (even if it is a bad practice).

  • Anonymous
    July 30, 2009
    No, I understand and mostly agree. I was mostly addressing your concerns about the size of the headers overall, and individual header size in particular. However, to eliminate spamming from both Accept and UA will likely simply result in an X-* header (or many) in order to communicate what the browser supports. X-* usage is already exploding for response headers to support IE, and we've already seen a UA-CPU header because the standard UA is too spammed to use. This doesn't do anything to reduce header bloat overall. Breaking everything into separate specific, well-defined HTTP headers will address the size and format problems with UA, but also balloon the header size overall. Perhaps the coolest thing about REST, which seems to have gained great acceptance at Microsoft, is that it uses as much existing HTTP infrastructure as possible for simplicity and efficiency. We're really talking about future IE versions, since it isn't likely that Microsoft will release a patch to cull UA tokens anytime soon, so working around a buggy IE Accept implementation, rather than fixing it, seems a bit short-sighted. I'm hoping that many tokens will simply disappear, but realistically there's going to be some signalling of support that happens. Long term, it would just be nice if that were MIME types in the Accept header, which is much better defined and constrained, than long English freeform text detail or repetitive enumeration of CLR versions in a new X-CLR header. Thanks for indulging this pedantic discussion. :)

  • Anonymous
    July 31, 2009
    <<will likely simply result in an X-* header>> Ah, but it's extremely difficult to add such headers in the browser, even if your code is already running. Nearly none of the folks spamming the UA would be willing to go to the trouble of adding a custom header. <<seen a UA-CPU header because the standard UA is too spammed to use.>> I don't know what led you to that conclusion, but I don't find it likely. Among other things, IE communicates bitness information in the UA-string already anyway. <<there's going to be some signalling of support that happens>> I believe that standardizing JavaScript-accessible Capabilities APIs on the Navigator object are the right approach here. Time will tell.

  • Anonymous
    August 03, 2009
    <blockquote> User-Agent string extensibility causes a similar problem, and that issue is so prevalent that it will be the subject of an upcoming post. </blockquote> Any idea when? I'd like a UA string as follows: "MSIE/8.0 (Windows NT 7.0; 64bit; Trident/4.0)" The following parts are redundant since nobody checks for them any more, and any websites that did check for them (but haven't been updated since) probably won't work for other reasons anyway: claiming to be NN4 (Mozilla), the word "compatible" which is rather meaningless, encryption strength identifier (which only meant anything for US-made browsers anyway), and all that ".NET CLR" crp that you put in there which doesn't mean anything and often appears three or four times with different numbers.

  • Anonymous
    August 03, 2009
    The comment has been removed

  • Anonymous
    August 03, 2009
    The comment has been removed

  • Anonymous
    August 03, 2009
    Some notes: Opera dropped the "Mozilla/4.0 (compatible;" years ago, and works fine for every page I've tried it with. I've used the Firefox User Agent Switcher extension to change my Firefox UA to "Firefox/3.5.1 (Windows NT 5.2; en-US; rv:1.9.1.1) Gecko/20090715". Google's UI does seem to change, but nothing else so far. I've created a quick VBScript that changes my IE UA to "MSIE/7.0 (Windows NT 5.2; Win64; x64)":

        Set sh = WScript.CreateObject("WScript.Shell")
        sh.RegWrite "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\User Agent\", _
            "MSIE/7.0 (Windows NT 5.2; Win64; x64)" & vbCrLf & _
            "X-User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; Win64; x64; ", "REG_SZ"
        WScript.Echo "New User-Agent: " & vbCrLf & _
            sh.RegRead("HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\User Agent\")

    According to http://www.enhanceie.com/ua.aspx , this does seem to confound ASP.NET's HTTPBrowserCapabilities.

  • Anonymous
    August 03, 2009
    @Brianary: I intend no offense by noting this, but I absolutely guarantee that we've spent more time thinking about user-agents over the last few years than you have. A number of people on the IE team have spent dozens of hours researching here, and investigating options for making changes. 1> Opera isn't compatible with every site that IE is. 2> Opera uses a service to "lie" about its UA string to incompatible sites. 3> Even if Opera works, there are sites hard-coded to expect specific IE user-agents that are also hard-coded to expect certain Opera UAs.  Sites break when these change. Your registry script to change the UA string does not do what you think it does; it mangles your text into the actual UA string. I encourage you to use my user-agent switcher (www.enhanceie.com/dl/uapicksetup.exe) to try out sites, with the caveat that you'll be visiting well under 1% of the Internet sites in use today, so even if you don't see broken sites, you can rest assured that they're out there.

  • Anonymous
    August 03, 2009
    Oh, I completely agree that Microsoft has more data about this than I'll ever see. My concern, I guess, is just that sites that still do browser sniffing may be overrepresented in terms of active feedback, and should mature a bit to a more modern approach, which they are unlikely to ever do until it becomes a practical consideration. Point taken with Opera. I haven't used it as a primary browser even on my tiny corner of the web. Not that it really matters, but are you sure about the script? It seems to work for me when I echo request headers back. Again, I appreciate your time. I'd sure like to see the legacy "Mozilla/4.0 (compatible;" stuff retired, but getting rid of much of the UA spamming would certainly reduce the constantly swelling UA string length. Thanks for your continued advocacy.

  • Anonymous
    August 03, 2009
    <<may be overrepresented in terms of active feedback>> To be fair, we don't get any feedback from sites that aren't actively maintained, which make up the bulk of the compatibility risk. We actively seek out broken sites. <<are you sure about the script>> I took a closer look at what your registry script does. It cleverly relies on a WinINET bug, failure to remove 0D 0A sequences from the registry value. So, while the key in question only is intended to contain the replacement for "Mozilla/4.0", you've injected a full UA and relied on the WinINET bug to shove the rest of the UA into the next header. This works in IE8, although it's a bug which should be closed in a future version.

  • Anonymous
    August 04, 2009
    <<you call him out for not providing supporting data, but then you do the same thing>> Well, for one thing, I happen to be right, and he's not. :-) Seriously though, he's asserting that billions and billions of pages don't do a particular thing, and I'm asserting that of those billions and billions of pages, some of them do such a thing. Who do you think is more likely to be correct? As for the specifics, I posted a few examples quite some time ago the last time such ridiculous claims were made, and as expected the reaction was: "Oh, you only listed a few major sites, which means that every other site must work properly."   It's utterly tiresome to conduct endless arguments with essentially anonymous folks who have no "skin in the game."

  • Anonymous
    August 04, 2009
    An appeal to authority is certainly a fair play. It would be pretty impressive, though, if Microsoft would open up the list of sites and their status on Microsoft Connect. There would be much better transparency, and we smaller players would perhaps understand the scale of the situation better, without having to tie up an IE team member (over and over each time this comes up). :) I'd say I have skin in the game, just on the other side, I guess.

  • Anonymous
    December 03, 2009
    The comment has been removed

  • Anonymous
    December 03, 2009
    @Fireblaze: Actually, no... you missed a section which covers this:

    Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence. For example,

        Accept: text/*, text/html, text/html;level=1, */*

    have the following precedence:

        1) text/html;level=1
        2) text/html
        3) text/*
        4) */*

  • Anonymous
    December 16, 2009
    Eric,

    Excellent post! I'm sure you might not have realized how popular such a topic might be until after you continue to receive feedback about it several months later. Personally I have found this article very helpful as it helped me to determine how to get IE to properly handshake with a proprietary HTTP server. It couldn't handle any Accept header greater than 255 characters ("Accept: " and "\r\n" included). Now the trick is to get the rest of the organization to buy into the obvious fix of removing some of these MIME types from the registry. Alternate browsers have been used up until now as a workaround to allow us to interface with the site. With today's discovery we can hopefully move away from that and also reduce our Accept-header bloat a bit.

    The interesting thing to note is that my only concern is an organizational one. I suspect that this is also one of the many things your team would face when attempting to make changes to the Accept-header compilation mechanism. While it might make sense from your perspective (and many people in the world, including myself), that's only 1% of the battle. You have the task of working with other teams inside a huge corporation to try and change the way they've been working for a while. Regardless of the fact that Microsoft is a tech company with a lot of good employees I'm sure, they are still people and people don't like change. I can also see that not everyone in MS has the same level of understanding of the HTTP RFC you have. This further complicates the issue of change.

    This might be showing my ignorance a bit, but I'm OK with that. How many times does the MIME type actually affect what format the server uses for the target data? I could see this happen more in the case of a web service where you could specify, say, JSON vs. XML. However, IE isn't really used to interface with web services that often. It's meant for people to use, not machines. So, the question remains as to whether or not "*/*" wouldn't be sufficient in many, many cases...if not all. I'm sure you've got a bit more knowledge here and I'd appreciate your insight. Have you guys run across a website that changes its content's format based on different Accept-header MIME types?

    Sincerely,

    -Archimedes