IDNA - Is it fussball or fußball?

[Update - Note: A coworker pointed out that it's really Fussball or Fußball, since it's a noun.  I knew that, I'm just using lower case here because domain names get lowercased, which is a separate issue and I didn't want to complicate the example.]

IDNA2008 (which is something of a misnomer, since it's actually a 2010 standard, so it's "new", not 2 years old!) introduced some interesting breaking changes relative to IDNA2003.  UTS#46 addresses some of those, including a "transitional period."  IDNA == Internationalized Domain Names for Applications.

In IDNA2003, fußball would be mapped to fussball, but IDNA2008 allows the eszett as its own character, so it allows fußball.  Unfortunately that means that in a mixed environment, some users typing in fußball end up at a server named fussball, and others go someplace called fußball.  Imagine if that was paypal...
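To make the split concrete, here's a rough sketch in Python.  It leans on the standard library's "idna" codec, which implements the IDNA2003 rules, and on the raw "punycode" codec to approximate what an IDNA2008 client would put on the wire.  The function names are mine, and the IDNA2008 half is deliberately simplified: it skips all of the validation a real IDNA2008 implementation does, so treat it as an illustration only.

```python
def idna2003_lookup(domain: str) -> str:
    """Form an IDNA2003 client looks up (the stdlib 'idna' codec is IDNA2003)."""
    # Nameprep maps the eszett to "ss" before any Punycode step happens.
    return domain.lower().encode("idna").decode("ascii")


def idna2008_lookup_sketch(domain: str) -> str:
    """Approximate form an IDNA2008 client looks up.

    Assumption: lowercase + raw Punycode per label, with none of the
    validation rules a real IDNA2008 implementation applies.
    """
    labels = []
    for label in domain.lower().split("."):
        if label.isascii():
            labels.append(label)
        else:
            # The eszett survives, so the label becomes an "xn--" A-label.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


print(idna2003_lookup("fußball.de"))         # fussball.de
print(idna2008_lookup_sketch("fußball.de"))  # xn--fuball-cta.de
```

Same thing typed by the user, two different names asked of DNS.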

UTS#46 addresses this by requiring that browsers map names to the old form for lookup.  That way everyone ends up at the same place.  UTS#46 calls this "transitional" behavior, which is expected to happen for some unspecified transitional period.  For display, browsers would hopefully use the new form.

So how long is this transitional period?  Well, let's think about how fast people apply updates to all those machines out there.  The number of "zombies" should be a clue: even if there were updates for every configuration available today (not just Windows, but Macs and everything else), it would take a LONG time to get them all installed.  Some systems quote a support period basically the same as a Vulcan's Pon Farr cycle.  (I wonder if they upgrade afterward?)  And lots of machines out there are outside their supported cycle.  Think about XP.  My guess is that this transitional period is 5-10 years.

So, if you're getting a domain or running a server, I would recommend making sure the IDNA2003 forms work for you for the foreseeable future.

Now, technical details aside, the German eszett has some other interesting characteristics.  I picked fußball as an example, but I have no clue what the folks at fußball.de think about this subject, so don't assume I'm putting words in their mouths :)  In Germany, fußball is supposed to be spelled with an eszett (ß).  In Switzerland, they don't use the eszett, they use ss.  Additionally, fussball is used even within Germany by fans, or by others in casual use.  So if I owned a domain like that, I'd make sure I had both the ss and the ß forms registered, not only to ensure my Swiss customers could find my site, but also so that I don't lose my IDNA2003 customers, and so that a competitor can't try to take advantage of my trademark.  There are a few cases in German where the ss and ß spellings are actually different words (Masse and Maße, for example), not just alternate spellings, but even if I owned such a domain I'd register both forms.
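If you want to check whether a name you care about is affected, something like the following sketch lists the distinct lookup forms you'd want to own.  The helper name is made up for this post, and the IDNA2008 side is again just lowercase plus raw Punycode with no validation, so it's a quick check under those assumptions rather than a registrar-grade tool.

```python
def names_to_register(domain: str) -> list:
    """Distinct ASCII lookup forms under IDNA2003 and (roughly) IDNA2008."""
    domain = domain.lower()

    # IDNA2003 form: the stdlib 'idna' codec maps the eszett to "ss".
    old_form = domain.encode("idna").decode("ascii")

    # Simplified IDNA2008 form (assumption: no validation, raw Punycode).
    new_labels = []
    for label in domain.split("."):
        if label.isascii():
            new_labels.append(label)
        else:
            new_labels.append("xn--" + label.encode("punycode").decode("ascii"))
    new_form = ".".join(new_labels)

    # A set collapses the two forms when the standards agree.
    return sorted({old_form, new_form})


print(names_to_register("fußball.de"))   # two names: register both
print(names_to_register("example.com"))  # one name: nothing extra to do
```

When the list has more than one entry, the two standards disagree about where your users end up, and that's the case where I'd register everything on the list.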

Which sort of brings me to another idea, which, unfortunately, isn't a standard.  A big reason IDNA2008 ended up this way is that people, reasonably, wanted their names spelled correctly.  (There's also a need for truly unique unmapped identifiers.)  Unfortunately there's no standard differentiation between "display" and "lookup".  I would have much preferred a system that continued the IDNA2003 compatibility, allowing lookup to find the names, but provided a display record to show the site's preferred display form.  There are lots of sites that have odd display forms.  Were I the AAA auto club, I'd probably much rather be shown as AAA.com than aaa.com (again my opinion, no clue what AAA really thinks).  NamesLikeThis.com also gets mangled by the current system.  DNS is rather nice about handling those kinds of alternate spellings for us in ASCII (unless you're Turkish).  They match fine, but have display problems.
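The "match fine" part can be sketched in a few lines.  DNS compares ASCII letters without regard to case, and only ASCII letters; the helper below is my own illustration of that rule, not anything lifted from a resolver.  It also shows why Turkish is the odd one out: Turkish pairs I with the dotless ı and i with the dotted İ, and neither of those extra letters is ASCII.

```python
import string

# Fold only A-Z to a-z, the way DNS name comparison treats ASCII letters.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def dns_ascii_equal(a: str, b: str) -> bool:
    """Compare two names the way DNS matches ASCII labels (illustrative helper)."""
    return a.translate(_ASCII_FOLD) == b.translate(_ASCII_FOLD)


# Display forms differ, but lookups land in the same place.
print(dns_ascii_equal("AAA.com", "aaa.com"))                      # True
print(dns_ascii_equal("NamesLikeThis.com", "nameslikethis.com"))  # True

# DNS insists that I and i are the same letter, which is not the Turkish rule,
# and the dotted capital İ (U+0130) is outside ASCII, so it never folds to i.
print(dns_ascii_equal("I", "i"))       # True
print(dns_ascii_equal("\u0130", "i"))  # False
```

Note that nothing here remembers which capitalization the owner preferred, which is exactly the display problem.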

Anyway, I digressed.  Expect to keep seeing those IDNA2003-form domain names for a long time to come.

-Shawn