File Download and Filenames

Several months ago, I blogged about IE’s support for International Filenames on Downloads. Today’s post is a bit simpler and describes two cases when IE may rename downloaded files.

Filename Extension and QueryString Parameters

If a file download HTTP response does not contain a Content-Disposition header, Internet Explorer will determine the filename from the URL. This may behave unexpectedly if the response's Content-Type is an ambiguous MIME type (e.g. application/octet-stream, typically sent for executable files) and there are querystring parameters in the URL. For example, downloading www.fiddlercap.com/dl/FiddlerCapSetup.exe?id=123 will result in a file named “FiddlerCapSetup” without a file extension. This file will show the Windows Open With dialog if you attempt to run the file:

[Image: the Windows Open With dialog]

The IE team ran into this issue ourselves when launching an IE8 Beta. The website team had decorated every link on the install page with a querystring parameter, including the installer URL, for analytics purposes. For a short period of time, users had to manually rename the installer in order to install the beta.

To work around the issue, you can:

  1. Send a Content-Disposition header with the desired filename specified
  2. Remove all querystring parameters from the URL
  3. Add a bogus querystring parameter at the end with the desired extension: www.fiddlercap.com/dl/FiddlerCapSetup.exe?id=123&x=.exe

Any of these three approaches will restore the lost extension and allow the downloaded file to work properly.
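
As a rough illustration of the three workarounds, here is a minimal Python sketch. The helper names (content_disposition, strip_query, append_extension_hint) and the default parameter name "x" are assumptions made up for this example, and the quoting only covers plain ASCII filenames.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def content_disposition(filename: str) -> str:
    """Workaround 1: build an attachment header value for an ASCII filename."""
    # Escape backslashes and double quotes so the quoted string stays well formed.
    safe = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'attachment; filename="{safe}"'


def strip_query(url: str) -> str:
    """Workaround 2: drop every querystring parameter from the URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


def append_extension_hint(url: str, param: str = "x") -> str:
    """Workaround 3: append a bogus parameter that ends with the path's extension."""
    parts = urlsplit(url)
    ext = posixpath.splitext(parts.path)[1]  # e.g. ".exe"
    if not ext:
        return url  # nothing to hint at
    query = f"{parts.query}&{param}={ext}" if parts.query else f"{param}={ext}"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
```

For the example URL above (with an http:// scheme added for illustration), append_extension_hint produces the same shape as workaround 3: the original parameters followed by x=.exe.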

Now, the obvious questions are: “Why does IE do that? Can’t you just change IE?” As it turns out, some server-side “CGI” programs have a .EXE file extension and use a querystring parameter to convey a download’s filename to IE. So, unfortunately, changing IE to ignore the querystring would have a compatibility impact, even though the sites in question are relying on non-standard behavior.

The Content-Disposition Header and Run-From-Cache

Recently, a product team contacted us to report that their website’s downloads didn’t work properly when the site was in the Internet Zone, but worked fine from the Intranet and Trusted Sites zones. Intrigued, I asked for further details, and the team explained their architecture.

They host a webpage that allows the user to download an “online installer” that installs one or more individual products. The installer itself is simply a “stub” that downloads and installs the selected products. Interestingly, there’s only a single installer, and the team relies upon the filename of the stub installer to decide what products to install. The filename is set via a Content-Disposition header, and the stub installer itself is downloaded via navigation to an ASHX URL.

For instance, https://example.com/getinstaller.ashx?InstallProducts=1,2,3 will return the installer with the header Content-Disposition: attachment; filename="Install-1-2-3.exe" while https://example.com/getinstaller.ashx?InstallProducts=4,7 will return the installer with the header Content-Disposition: attachment; filename="Install-4-7.exe". Upon starting, the stub installer examines its own filename and downloads and installs the requested products.
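
To make the mapping concrete, here is a hedged Python sketch of what the server side might do. The team's real handler was an ASHX page whose code is not shown in this post, so the function names and the validation rule (comma-separated ASCII digits only) are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def installer_filename(url: str) -> str:
    """Turn ?InstallProducts=1,2,3 into the filename Install-1-2-3.exe."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("InstallProducts", [""])[0]
    ids = [part for part in raw.split(",") if part]
    # Assumption: product IDs are plain ASCII digits; reject anything else.
    if not ids or not all(p.isascii() and p.isdigit() for p in ids):
        raise ValueError("InstallProducts must be a comma-separated list of numbers")
    return "Install-" + "-".join(ids) + ".exe"


def installer_headers(url: str) -> dict:
    """Response headers that give the single stub installer its 'magic' name."""
    name = installer_filename(url)
    return {"Content-Disposition": f'attachment; filename="{name}"'}
```

Validating the IDs before echoing them into a header matters in a design like this, because the querystring is attacker-controlled input.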

Now, at first glance, this appears to be a clever architecture, because it solves a number of problems. For instance, to otherwise use a single installer for all of these combinations, they would have to find a way to interrogate the user’s browser for which products to install, e.g. by checking cookies or some other approach that is browser specific. Other strategies would either require that they keep multiple versions of the installer on the server (one for every permutation) or dynamically build and re-sign the installer for each individual user (requiring that they keep a code-signing certificate on the server).

However, there are two major problems with this approach: one obvious, and one not so obvious. The obvious problem is that the user might not immediately run the installer and might instead save it under a different name. If the user renames the file (e.g. “CoolStuffToInstall.exe”), the installer will fail. The more subtle problem is that you cannot rely upon the filename if the file is run directly from the Temporary Internet Files cache. The product team correctly noted that a numeric disambiguator can be appended to the filename when the file is run from the cache (e.g. “Install-1-2-3[2].exe”), and they simply changed their parser to ignore this number. However, they ran into a much more subtle problem later, when they moved their service from the Local Intranet developer test site to the live Internet site.
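
A parser that tolerates the cache's numeric suffix might look something like the following Python sketch. The team's actual parser is not shown in this post, so the regular expression and the function name are assumptions.

```python
import re
from typing import List, Optional

# "Install-", dash-separated product IDs, an optional "[n]" cache suffix, ".exe".
_INSTALLER_NAME = re.compile(
    r"^Install-(\d+(?:-\d+)*)(?:\[\d+\])?\.exe$",
    re.ASCII | re.IGNORECASE,
)


def products_from_filename(filename: str) -> Optional[List[int]]:
    """Return the product IDs encoded in the filename, or None if it does not match."""
    match = _INSTALLER_NAME.match(filename)
    if match is None:
        return None  # renamed by the user, or renamed by something else
    return [int(part) for part in match.group(1).split("-")]
```

With this pattern, both “Install-1-2-3.exe” and “Install-1-2-3[2].exe” yield the same product list, while a user-chosen name such as “CoolStuffToInstall.exe” yields None, which is exactly the failure case that the filename approach cannot recover from.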

The problem is that when an executable file is downloaded from an Internet server in Protected Mode, it must be copied from the Protected Mode (low integrity) Temporary Internet Files (TIF) folder to the non-Protected Mode (medium integrity) TIF folder. When this copy occurs, the file is renamed using a filename generated from the original download URL. So, the stub installer was always being copied and executed with the filename getinstaller[1].exe, causing the command-line parsing logic to fail. The team had never encountered this problem in their internal testing because they were always testing the service using a server in the Local Intranet zone. The Local Intranet Zone runs outside of Protected Mode, so the renaming issue was never encountered.

Update: IE9 Release Candidate has changed this behavior and files now keep their name when copied between caches.

We suggested a variety of workarounds to the team, including using an HTTP Handler to allow any URL within a given path to return the installer, like so: https://example.com/getinstaller/Install-1-2-3.exe, such that the filename is preserved upon copy. Of course, this approach still suffers from the “renamed by user” problem. We also suggested that the installer itself offer checkboxes to enable the user to select which products to install.
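
The path-based variant could be sketched in Python as below. This is not the team's handler: the /getinstaller/ prefix comes from the example URL above, while the function names and the strict name check are assumptions added for illustration.

```python
import re
from typing import Iterable, Optional
from urllib.parse import unquote, urlsplit

PREFIX = "/getinstaller/"
_NAME = re.compile(r"Install-\d+(?:-\d+)*\.exe", re.ASCII)


def installer_url(product_ids: Iterable[int]) -> str:
    """Build a download link whose last path segment is the desired filename."""
    name = "Install-" + "-".join(str(int(p)) for p in product_ids) + ".exe"
    return "https://example.com" + PREFIX + name


def requested_installer_name(url: str) -> Optional[str]:
    """Return the filename the handler should serve for this URL, or None to reject."""
    path = unquote(urlsplit(url).path)
    if not path.startswith(PREFIX):
        return None
    name = path[len(PREFIX):]
    # Accept only the exact expected shape; this also rejects extra path segments.
    return name if _NAME.fullmatch(name) else None
```

Because the desired filename is the last segment of the URL path itself, a name generated from the download URL matches the name the server intended.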

The stub installer itself also installed a URL Protocol Handler, which allowed websites to later invoke the installer using a special URL (e.g. installer://site/params/etc). While URL Protocol Handlers expose a significant attack surface and must be carefully reviewed for security issues, their major advantages are that they are supported by all browsers on Windows and that they do not suffer from the file-renaming issues inherent in the original “magic filename” architecture.

Until next time,

-Eric

Comments

  • Anonymous
    November 22, 2010
    Useful and interesting, thank you! (didn't know about the 'bogus GET parameter', I definitely have a use case for that) The URL handler is IMHO a rather brittle egg-and-chicken approach - as in "but the [custom] URL works on my machine, what is wrong with you(rs)?" Doesn't IE store some metadata about the downloads (which enable the dialog "this file was downloaded from Internet, are you sure")? Does that include the original URL, or is it just "this file downloaded from Internet Zone"?

  • Anonymous
    December 06, 2010
    Would using the syntax //example.com/getinstaller.ashx/Install-1-2-3.exe solve their problems? (ericlaw: The commenter observes that ASP.NET will ignore the trailing path after the ashx filename, running the .ASHX file and exposing the URL data to the page via Request.PathInfo.) It doesn't require a special HTTP handler and it's easy to implement... I have done this a few times :)