WinForms localization - VS or WinRes?
Let’s look at Windows Forms localization for a minute. MSDN has a good section about Developing World Ready Applications, including how to avoid globalization & localizability issues, how to get your code localizable, what tools are provided etc. I'm not trying to repeat what's in there - if you haven't read it yet, spend a day or two in there before you come back.
As MSDN states, there are two ways you can localize Windows Forms using the tools included in Visual Studio/.NET Framework SDK. Common to both is that you start by opening the form you want to localize in Visual Studio and setting the Localizable property to True. After that, one method to localize the form is to simply set the form's Language property to the language you want to localize into and start changing properties. The other way is to open the resx file for the form in winres.exe. (There’s a third way too – “pre-size” all dialogs so any language will fit, and then only allow the translators to translate text. This is a horrible idea though, don’t do this if you’re serious about delivering international software.)
What’s really important here is that, once you pick one way, you can’t easily switch to the other. The reason for this is that the resx schema used by Visual Studio for localized forms differs from the schema used in winres.
So which one do you pick? In my oh-so-personal opinion, winres is the way to go, for several reasons - ranging from how you might not want to spread your source code around, to how much harder loc kit maintenance becomes with higher file count & more complex setup, to the fact that Visual Studio costs money but winres is included in the .NET Framework SDK which anyone can download.
Whichever tool you choose though, remember that this is a case where you really want to make the right decision, and early. You don’t want to get stuck half-way trying to switch from one tool to another.
So, now you’ve picked one tool or the other. That's it, you're ready to go? Well... your localizers can translate and resize dialog boxes, but there are still a lot of things left to solve, such as:
- How will you make sure that your application is localizable and that all globalization issues have been resolved? And how will the localizers communicate back any issues they come across?
- When do you start localization? Too soon, and you’ll waste resources translating stuff that’s still changing; too late, and you're bound to encounter loads of localizability issues that you won’t have time to fix. Will all languages start at the same time, will everyone localize all features? Do you try to document and translate new terminology up front? Will help files be localized separately, and if so how do you make sure they’re consistent with the software?
- How will the localizers translate other resources, such as error messages, bitmaps and icons? Translating resx files in Notepad isn't ideal.
- If you have more than one localizer per language, how will they manage glossaries and terminology? Getting consistency in style and terminology in a large project is a challenge (even for a single localizer).
- How will translations be recycled? Very few products come without repeated strings and there’s no reason to translate “Browse”, “Cancel” and “Click next to continue” more than once. Remember that Microsoft provides glossaries that you can use for reference.
- How can you and your localizers know when they’re done? You probably want a way to measure what’s left to translate and which dialogs are yet to be resized, but you also want to know what types of quality checks (linguistic and others) need to be performed, how they'll be performed and how results will be tracked. If you don't, you can accidentally release text bits that haven’t been translated or spellchecked. Embarrassing.
- How will localizers get context information from developers? If I don’t know how a string is used, I can’t know that I’m coming up with the right translation. Is “Display” a verb or a noun? Is “Second” number two, or is it a 60th of a minute? Is “Enabled” plural or singular? And once the meaning of a string is clarified, how is this communicated back to all languages?
- If localization starts while the product is still under development, how will the localizers' work be merged when updated resources are made available? If you’re using a mix of in-house localization and outsourced work, how will you sort out file management?
- How will the localized products be built and tested, and how will builds be made available to the localizers?
…and more… None of these are easily answered, but hopefully I can give some tips in upcoming posts...
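As a small illustration of the "how do you know when you're done" question above, here is a minimal sketch of a progress report. It assumes the resources have already been read into plain key-to-string dictionaries; the names measure_progress and Progress are made up for this example, and treating a string that is identical to the source as "not yet translated" is only a heuristic (a string like "OK" may legitimately stay the same).

```python
from dataclasses import dataclass


@dataclass
class Progress:
    total: int            # number of strings in the source language
    translated: int       # present in the localized table and different from the source
    missing: list[str]    # keys with no localized entry at all
    unchanged: list[str]  # keys whose localized text equals the source text

    @property
    def percent_done(self) -> float:
        return 100.0 * self.translated / self.total if self.total else 100.0


def measure_progress(source: dict[str, str], localized: dict[str, str]) -> Progress:
    """Compare one localized string table against the source-language table."""
    missing = sorted(k for k in source if k not in localized)
    unchanged = sorted(
        k for k in source if k in localized and localized[k] == source[k]
    )
    translated = len(source) - len(missing) - len(unchanged)
    return Progress(len(source), translated, missing, unchanged)
```

The unchanged list is the one to review by hand: it mixes genuinely untranslated strings with strings that are meant to stay the same in every language.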
Comments
- Anonymous
August 11, 2004
My biggest gripe with Visual Studio's WinForms localization support is that it's not easy at all to add your own string to the mix. If I want a localized string to appear in a MessageBox, I have to create my own .resx file for each language and manually call resgen/al to compile this .resx with those generated by the WinForms designer. Talk about a mess.
Does WinRes.exe handle this problem better? - Anonymous
August 11, 2004
"This is a horrible idea though, don’t do this if you’re serious about delivering international software"
Unless I have got it wrong, the idea that only strings get translated is a huge gain in overall cost, since you keep using the same dialog boxes, oops, forms regardless of the language.
If you localize the form resources, then you'd better not have 40 of them, and 6 languages to support.
Could you shed some light on this? Not on the technical side, but on the cost of it? - Anonymous
August 11, 2004
Andy, I know where you're coming from. What you have here is the effect of starting localization before the product is done, and the pain of getting incremental updates to the product localized.
I'll talk more about this in an upcoming post, but my recommendation is to not create the resx files per language manually. Put a process in place where you can build the app with whatever resx files the localizers hand off -- if a string is missing for a language (due to mismatched update levels), let it fall back to the neutral culture string. If you have an extra string in the localized resx file, leave it in or wash it out as part of the build process (on a copy of the resx file). If a resource has been updated, let the "old" resource in the latest localized resx file make it into the build.
The result is that localization always plays catch-up to the source language, and you'll see weird localization bugs just because of this. But you also avoid the hassle of manually propagating new resources to all languages, and you can always make a localized build regardless of the state of the translations.
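To make the build rule above concrete, here is a rough sketch in Python. It is not the actual build tooling; it assumes the neutral and localized resources have been loaded into key-to-string dictionaries, and merge_for_build is a hypothetical name. Missing keys fall back to the neutral string, extra keys are washed out of the result (the input is left alone, like working on a copy), and an out-of-date translation is used as-is.

```python
def merge_for_build(
    neutral: dict[str, str], localized: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Return the string table to build for one language, plus the keys that fell back."""
    merged: dict[str, str] = {}
    fell_back: list[str] = []
    for key, neutral_text in neutral.items():
        if key in localized:
            # Use whatever translation exists, even if the source text
            # has changed since it was translated.
            merged[key] = localized[key]
        else:
            # No translation yet: fall back to the neutral-culture string.
            merged[key] = neutral_text
            fell_back.append(key)
    # Keys that exist only in the localized table are never visited,
    # so they are dropped from the result; the input is not modified.
    return merged, sorted(fell_back)
```

The list of fallen-back keys doubles as a to-do list for the localizers on the next hand-off.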
What I describe above is pretty much what we do for Windows localization, be it .NET or Win32. And I think to achieve this, you need to look more at your process than at the tools used. - Anonymous
August 11, 2004
The comment has been removed - Anonymous
August 11, 2004
The comment has been removed - Anonymous
August 11, 2004
Stephane, maybe I'm spoilt working here. I can understand if time or budget constraints force you to make trade-offs. As long as you make an informed decision.
When I localize, I strive to make it invisible that it's even a localized product. Ideally, the end users won't even notice that it has been translated. If I can't see or change dialog layout, this is very hard to achieve - either text will be truncated and out of context or the text won't flow naturally, and either way you're likely to increase the cost of testing/bug fixing.
"localizers won't dare make any change to dialogs themselves, fearing to cause wreak havoc somewhere" - this can be handled through education and testing. If your localizers are worried about breaking the app by sizing a label, my bet is that they make a lot worse mistakes without knowing... (Hey, there's another idea for a topic, how to avoid breaking applications when localizing.) - Anonymous
August 11, 2004
This really depends on your environment. We are using an unmanaged environment, with all the usual MFC dependencies. You know what I'm talking about: your dialog boxes are properly translated, but not the OK, Yes, Apply buttons that happen to be part of the mfc42xxloc.dll resources. Well, I'd like to know your point of view regarding MUI, and the ability for the OS to re-route the DLL being loaded. A basic scenario has proven that this can cause much more harm than without. I know that MUI has not been given much attention so far, but it's still interesting to hear a point of view from the inside.
I have the same question for the common controls DLLs being automatically loaded from a .NET folder install rather than the system32 folder as soon as you have deployed the .NET SDK. How do you still manage to know what you are localizing when you are not even sure what DLL is loaded? Especially those dependencies that are part of the OS? - Anonymous
August 12, 2004
For the common dialogs & the MFC strings, we don't worry about them. If users decide to install Swedish Office on a French machine, they'll have to expect to see some mixed language. The only option is for each application to re-implement File->Open etc, and then we're back at Windows 3.11 again.
That's easy for me to say though, since there is a Swedish platform out there... But what if you're trying to release an application in a language for which there is no localized OS, like Tagalog?
Well... I'd still say that this is Microsoft's problem, not yours. I wouldn't recommend re-implementing the common dialogs, just like you wouldn’t implement an encryption algorithm on your own or rebuild ADO from the ground up. You'd just spend time & money on something that doesn’t have anything to do with what your application is really supposed to do.
The good news is that this is a situation we're actively working on improving. I can't really talk about specifics yet, it's a bit too early for that, but I hope that we can resolve this for the ISVs within the not too distant future.
As for MUI, the MUI technology used in the OS is intended just for the OS. The recommendation is that if you want similar behaviour for your application, create this behaviour yourself. Houman has written a good piece on how this can be done, available at http://www.microsoft.com/globaldev/handson/dev/muiapp.mspx. I believe that Office is using model #3 in this paper.
If I was designing a Win32 application today, I'd try to mimic the way it works in managed code, having a satellite resource DLL for each language and falling back on the base language as needed. One of the really big benefits of doing so is simply maintenance -- if you get stuck in a situation where you have to release an urgent fix, you can do so without having to wait to have it localized for all your languages. (We do this more and more -- an example is in http://www.microsoft.com/technet/security/bulletin/MS04-014.mspx. As you see under "Revisions", the patch was released quickly to secure customers, then it was updated four weeks later when the new error messages had been localized.)
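The lookup side of that satellite-plus-fallback idea could look roughly like the sketch below. It is an illustration only: the per-language tables stand in for satellite resource DLLs, the empty string stands for the base language, and stripping the culture name at each hyphen is a simplifying assumption that does not reproduce the exact parent-culture rules of the .NET Framework. The names fallback_chain and lookup are invented for the example.

```python
def fallback_chain(culture: str) -> list[str]:
    """'sv-SE' -> ['sv-SE', 'sv', '']; '' stands for the base language."""
    chain: list[str] = []
    parts = culture.split("-") if culture else []
    while parts:
        chain.append("-".join(parts))
        parts.pop()  # drop the most specific part and try again
    chain.append("")
    return chain


def lookup(tables: dict[str, dict[str, str]], culture: str, key: str) -> str:
    """Return the most specific string available for the requested culture."""
    for name in fallback_chain(culture):
        table = tables.get(name, {})
        if key in table:
            return table[key]
    raise KeyError(key)
```

With this shape, a new error message added to the base table for an urgent fix shows up in every language right away, in the base language, until each translation catches up.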
“A basic scenario has proven that this can cause much more harm than without” – Now I’m curious :) Could you elaborate on this? - Anonymous
August 13, 2004
I totally agree that string-only translation is a mess. A slightly off-topic note (webforms instead of winforms, maybe there's an article on that coming soon? :) ): for webforms it really seems to be the "recommended" way (the application I'm working on is webforms-based and I had a couple of conversations about this when I attended the GDDC). The good thing is that webforms can be adjusted for flow layout reasonably well. The bad side is that it is necessary to take screencaps of every page and send them to the translators so that they can have an idea of what the context is. It is a pain but the advantages of the fallback mechanism in the resource manager are just too good to pass up.
Regarding Stephane's comment on the cost savings, the reason why reusing strings is generally not a good idea is that you need to be absolutely sure that whatever you intend to reuse really is the same and the translation won't depend on context. There are many words that don't need to be conjugated in English but they do for other languages. E.g., if you have a string "white" in English that is reused in several places in your UI, it may require different translations for languages where adjectives change according to gender (e.g., Spanish, French) or according to time (e.g., Japanese). - Anonymous
August 13, 2004
Yeah, I've been thinking about writing something about web forms localization. I started doing research on this, because I don't have all that much experience here. The only web based project I've localized so far is UDDI in Windows Server 2003, and there we had one big resource dll, and one big headache trying to figure out what goes where. I'll look into this though, hopefully I can come up with something useful to post...
Maybe I misunderstood Stephane's post. I thought it was about keeping localizers from resizing to save money, not reusing strings.
You're absolutely right about your comment here though, about context dependent strings. I've done some research on what the story is in Windows. You'd be surprised at the amount of these intentional inconsistencies, either for grammatical reasons or because of semantics. One of the most important points here is that the shorter the string, the more likely it is that you'll have intentional inconsistencies. Handily, short strings are, well, short, so keeping a few extra of them in the resources doesn't add that much to the cost of translating. Reducing duplicated longer strings is both much safer and gives greater savings.
If you have a large localized project, you might want to run a "reverse inconsistency check" to see cases where two or more different English strings have been given the same translation. Sometimes you find mistakes (maybe a missing negation), but often you'll also see how your translators have increased consistency in the product. I mean, why do you need fifteen different ways to say "Access denied"? - Anonymous
August 13, 2004
Eusebio: we don't reuse strings. For one, the scenario you describe would make us reuse "words", rather than "strings". What we did most aggressively was keeping a single .rc file regardless of the language and have a string translator do translation of all dialog strings at load time. Strings would be stored in a map in a different file, on a language by language basis.
Although this is a huge time saver, there are two main limitations:
- encoding: if you use SBCS upfront, then you are in bad shape when it comes to adding DBCS.
- German: keeping a single .rc file doesn't do well when you are adding a language like German to your portfolio, given the average string size. Arguably, we don't have a German version, if that isn't clear enough. :-)
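For readers who want to picture the load-time scheme and its German problem, here is a loose sketch in Python. It is not our actual code: the control tuples, the max_chars capacity and the name translate_dialog are all invented for the illustration, and counting characters is only a crude stand-in for measuring rendered text width.

```python
def translate_dialog(
    controls: list[tuple[str, str, int]], string_map: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Translate dialog strings at load time using a per-language map.

    controls holds (control_id, source_text, max_chars) for a fixed layout;
    string_map maps source text to translated text for one language.
    """
    translated: dict[str, str] = {}
    overflow: list[str] = []
    for control_id, source_text, max_chars in controls:
        # Strings without a map entry are shown in the source language.
        text = string_map.get(source_text, source_text)
        translated[control_id] = text
        # The layout never changes, so longer translations may not fit.
        if len(text) > max_chars:
            overflow.append(control_id)
    return translated, overflow
```

Run against a German map, a check like this tends to flag a large share of the controls, which is the second limitation in a nutshell.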
Jesper: regarding MUI, we had a DLL loading scheme that showed us the lack of control over which DLL was really being loaded in the end, and that did indeed cause us trouble.