Team Foundation Internationalization (Part 2)
Last time I shared some considerations about the UI Language of Team Foundation (both Server and Client). This time I am going to talk about Locale-related aspects, again with respect to TFS.
Online you will find many definitions of Locale (including here), so I won’t repeat them. In this context, I’ll consider the User Locale, the System Locale and the Input Locale.
“So what’s the big deal – everything works, right?”
Right. That does not mean that we didn’t have a number of issues, some very visible, which were found and fixed along the way. And some (hopefully minor) limitations still persist. I’ll try to summarize some of those issues and limitations here.
User Locale: this affects how numbers, dates, currencies, etc. are entered and displayed by your application. In the .NET world, this is exposed through the CultureInfo.CurrentCulture property, and most, if not all, methods that deal with numbers, dates, sort order, etc. default to culture-sensitive behavior.
“So that’s good because there’s nothing to do, right?”
Right … for the most part. Because of this default behavior, most of the code will work as expected. I believe most of the bugs in this area come from interop and multi-tier scenarios.
In the first bucket (interop), think about how heavily Visual Studio and the Team Foundation Client interoperate with MS Office (Excel, Outlook, Project). We found a few non-trivial issues when sending numbers and dates/times to and from Office, especially when dealing with dates in non-Gregorian calendars. And some bugs only showed up in, say, MS Project and not in Excel, because of the different object model or because those products handle and interpret dates/times differently.
In the multi-tier bucket, you can imagine clients in different locales invoking web services or remote procedures and passing numbers in a culture-sensitive way. The server could have a different culture than the client, so it’s (obviously) better for the client to parse culture-sensitive data locally (on the client, where the data was entered) and pass the server either strongly-typed data or data formatted in an “invariant” way (if you need to convert e.g. a date to a string), so that the server can interpret/parse it unambiguously.
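To make the multi-tier point concrete, here is a minimal sketch in Python (not the language TFS is written in). The CULTURES table, the field names "due" and "estimate", and the helper names are all made up for illustration; a real client would take its conventions from the OS user locale. The idea is simply that each client parses what its user typed, and only culture-neutral strings go over the wire.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical per-culture conventions, hard-coded only for this sketch.
CULTURES = {
    "en-US": {"date": "%m/%d/%Y", "decimal": ".", "group": ","},
    "de-DE": {"date": "%d.%m.%Y", "decimal": ",", "group": "."},
}

def parse_date(text, culture):
    """Parse a date typed by the user, using the client's own culture."""
    return datetime.strptime(text, CULTURES[culture]["date"]).date()

def parse_number(text, culture):
    """Parse a number typed by the user, using the client's own culture."""
    c = CULTURES[culture]
    return Decimal(text.replace(c["group"], "").replace(c["decimal"], "."))

def to_wire(due, estimate):
    """Serialize with culture-neutral formats: ISO 8601 date, plain decimal."""
    return {"due": due.isoformat(), "estimate": str(estimate)}

# Two clients in different cultures enter the same values...
german = to_wire(parse_date("04.03.2024", "de-DE"), parse_number("1.234,5", "de-DE"))
american = to_wire(parse_date("03/04/2024", "en-US"), parse_number("1,234.5", "en-US"))
# ...and the server receives identical, unambiguous payloads.
```

Had the clients sent the raw strings instead, a server running under a different culture could easily have read "03/04/2024" as April 3rd.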
Another series of issues comes from Time Zone handling: some are implementation issues and some are design issues. Imagine clients from different time zones storing data in a central repository. To make dates/times comparable, you need to normalize them, and several strategies are possible. The one we picked was to store all dates/times in UTC.
“Then everything is fine, right?”
Well yes, with some caveats.
First, how do you visualize the content on the client? “My team is all located in Mumbai, and I don’t want to see dates/times in UTC!” Easy: by using the User Locale of the client, I can display dates, times and numbers in the format that the user expects, and I can also convert the UTC dates coming from the server to the local (client) time zone. That works fine.
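Here is a small Python sketch of that round trip, under the assumption that the client is in Mumbai. I use a fixed UTC+5:30 offset because India does not observe daylight saving time; for most other places you would need a real time zone database. The names to_storage and to_display are mine, just for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for India Standard Time (no daylight saving to worry about).
IST = timezone(timedelta(hours=5, minutes=30), "IST")

def to_storage(local_dt):
    """Normalize a time-zone-aware client timestamp to UTC before saving."""
    return local_dt.astimezone(timezone.utc)

def to_display(utc_dt, client_tz):
    """Convert a stored UTC timestamp to the client's time zone for display."""
    return utc_dt.astimezone(client_tz)

# A user in Mumbai saves something at 09:15 local time.
saved = to_storage(datetime(2024, 3, 4, 9, 15, tzinfo=IST))   # 03:45 UTC
shown = to_display(saved, IST)                                 # 09:15 IST again
```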
But what if you have a SQL Cube on the server that preprocesses data you stored in the relational DB, and creates views sliced by date/time? Does it mean data gets sliced based on the UTC time zone? That could be funny and could generate inconsistent results.
And if you have a web page/report that shows that data, what Time Zone do you use to display it? UTC?
Not so easy anymore, is it?
For the Cube and the Web Reports, we decided to use the Time Zone of the server as our reference. We thought that if the whole team was in the same Time Zone, all data would be consistent (client, Cube and Reports). If teams were spread across the world, then it’s a guess, but hopefully the servers are located where the majority of the team is. We also thought that it would be good for teams worldwide to see the same reports exactly the same way. We imagined how confusing it would be for people in different time zones to look at the same report and see different (or differently sliced) data.
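To see why the reference time zone matters for slicing, consider this Python sketch. The three events and the server offset (UTC-8, ignoring daylight saving) are invented for the example; it only shows that the same UTC data lands in different daily buckets depending on which time zone defines a "day".

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Hypothetical server time zone: a fixed UTC-8 offset, daylight saving ignored.
SERVER_TZ = timezone(timedelta(hours=-8), "server")

# Three events, stored in UTC as described above.
events_utc = [
    datetime(2024, 3, 4, 2, 0, tzinfo=UTC),
    datetime(2024, 3, 4, 7, 30, tzinfo=UTC),
    datetime(2024, 3, 4, 9, 0, tzinfo=UTC),
]

def count_per_day(events, tz):
    """Bucket events by calendar day as seen in the given time zone."""
    return dict(Counter(e.astimezone(tz).date().isoformat() for e in events))

by_utc_day = count_per_day(events_utc, UTC)           # all three fall on March 4
by_server_day = count_per_day(events_utc, SERVER_TZ)  # two on March 3, one on March 4
```

Picking one reference (the server’s, in our case) at least guarantees that everybody looking at the report sees the same buckets.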
A slightly easier choice, though one that required some specific design/implementation, was on the Web Access side. You probably know that most formatting for Web Apps is based on the browser’s HTTP_ACCEPT_LANGUAGE value (the Accept-Language request header). While that can be used for formatting purposes (with some level of assumption), it certainly cannot be used to determine the time zone of the machine that is browsing your page. We looked at how things were implemented in Outlook Web Access, and we adopted a similar approach. You can define your own user settings at the App level, and those affect the cultural formatting and the Time Zone settings. While you can set the Culture setting to “Auto” (which will match the HTTP_ACCEPT_LANGUAGE value mentioned above) or set it to your own liking, you will have to select a specific Time Zone.
One caveat with this is that the Culture setting currently also affects the UI language of the Web Access tool. Just so you know …
Another tricky source of bugs is whether or not to apply culture-sensitive formatting based on the data you are processing, but I think this will be a good topic for another post …
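As a rough illustration of the “Auto” idea, here is a Python sketch. It is not how Web Access is implemented; the function names, the "en-US" fallback and the settings shape are assumptions of mine. It picks the highest-weighted language tag from the Accept-Language header when the culture is “Auto”, and insists on an explicit time zone because the header says nothing about it.

```python
def preferred_language(accept_language, default="en-US"):
    """Return the language tag with the highest q-value, or the default."""
    best_tag, best_q = default, 0.0
    for item in accept_language.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag or tag == "*":
            continue
        q = 1.0  # per the HTTP spec, a missing q-value means 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if q > best_q:  # on ties, the first tag listed wins
            best_tag, best_q = tag, q
    return best_tag

def resolve_settings(user_culture, user_time_zone, accept_language):
    """Combine per-user settings with the browser's header (hypothetical shape)."""
    if user_time_zone is None:
        # The header carries no time zone information, so the user must choose.
        raise ValueError("a time zone must be selected explicitly")
    if user_culture == "Auto":
        return preferred_language(accept_language), user_time_zone
    return user_culture, user_time_zone
```

For example, resolve_settings("Auto", "India Standard Time", "it-IT,it;q=0.8") picks "it-IT" for formatting while keeping the time zone the user chose.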
System Locale: this affects non-Unicode applications. “But TFS is fully Unicode, right?” Right. But we have some command-line tools, and while the CMD shell is Unicode-capable, it is not fully Unicode-enabled (e.g. you can’t type/display text in Complex Scripts, like Arabic). So in this case we have a limitation imposed by the platform. Are there any workarounds? Sure, and we implemented one, but only in the most commonly used command-line apps. We allowed a file containing other commands to be passed as a parameter, and that file can be saved as Unicode (see Specialized Options in this article).
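Just to give an idea of what reading such a file involves, here is a generic Python sketch. It is not the actual TFS implementation: decode_command_file is a hypothetical helper, and the fallback to the system’s preferred encoding is my assumption. It looks for a byte order mark and only falls back to the system code page when there is none.

```python
import codecs
import locale

def decode_command_file(raw):
    """Decode the bytes of a command file, honoring a Unicode byte order mark."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig")   # strips the UTF-8 BOM
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")      # the BOM selects the byte order
    # No BOM: assume the system's preferred (non-Unicode) encoding.
    return raw.decode(locale.getpreferredencoding(False), errors="replace")

# A hypothetical command line containing Arabic text, saved as UTF-16.
line = "echo مرحبا"
round_tripped = decode_command_file(line.encode("utf-16"))
```

The point is that the Arabic text survives the trip through the file even though it could not be typed at the CMD prompt.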
There are other considerations related to the System Locale but I’ll leave them for yet another post.
Input Locale: I’ll keep it short, but this has caused some of the most painful bugs. Certainly among the hardest to investigate.
It’s not about the Locale per se, but when you are building an application you want it to be efficient, easy to use, etc. So sometimes you try to add smart features, like automatic filtering of a List, so that each time you type a character, the list gets filtered. Unfortunately, not all languages can benefit from this feature. If you are typing e.g. Japanese text via the Japanese IME, the text does not get flushed to the input buffer until you are “done” typing, so your application won’t even be aware that someone is typing Japanese text until after the fact.
It can even get a bit more complex with Korean – Korean is a language that allows direct input (hence in-place editing) but, depending on your filtering logic you may run into unexpected behaviors. You really have to test it. I am planning to develop a small sample of this issue, just to give you an idea of what I mean – I’ll either edit this post or create a new one for it.
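One way such a surprise can arise, sketched in Python: this is only a possible illustration, assuming the filter does a plain prefix match on every change of the text box, and the list items and function names are made up. While the user is building the syllable 한, the text box briefly contains 하, which is not a prefix of 한국 as far as code points go, so the item disappears and then comes back. Comparing the decomposed (NFD) forms avoids that.

```python
import unicodedata

def naive_filter(items, typed):
    """Prefix match on code points, as a simple filter might do."""
    return [s for s in items if s.startswith(typed)]

def jamo_filter(items, typed):
    """Prefix match on decomposed Hangul letters (NFD) instead of whole syllables."""
    key = unicodedata.normalize("NFD", typed)
    return [s for s in items if unicodedata.normalize("NFD", s).startswith(key)]

items = ["한국", "학교", "서울"]

# The user is halfway through typing the syllable 한; the box shows 하.
naive_midway = naive_filter(items, "하")   # nothing matches
jamo_midway = jamo_filter(items, "하")     # 한국 and 학교 both still match
naive_done = naive_filter(items, "한")     # only 한국 once the syllable is complete
```

Whether the second behavior is what your users expect is a separate question, which is exactly why you have to test it.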
I really thought the Locale topic was going to be a short one, but I was obviously wrong: not only did I write a long post, but I also ended up with more topics for future posts. Anyhow, I hope it gave you an idea of the type of issues one can run into when building an (I have to say, fairly complex) application. And that was only on Locales! Stay tuned for more and, as usual, please feel free to let me know what you think.
Aldo