

Working with the Bing Translator API

Online translators really aren’t all that new. They’ve been around for at least
8 years or so. I remember the days when I would use Babelfish for
all of my fun translations. It was a great way to get an immediate translation
for something non-critical. The problem in a lot of cases was grammatical correctness.
Translating word for word isn’t particularly difficult, but context and grammar vary
so much between languages that it was always challenging to translate entire sentences,
paragraphs, passages, etc. from one language to another.

Fortunately the technology has improved a lot over the years. Now you can somewhat
reliably translate entire web pages from one language to another. I’m not saying
it’s without fault, but I am saying that it’s gotten a lot better over time. These
days there are a few big players in this space, notably Google Translate, Babelfish
and the Bing Translator. The interesting thing I’ve found is that only Bing
actually has a supported API into its translation service.

There are 3 primary ways to interact with the service: HTTP, SOAP and AJAX.

They all seem to expose the same methods; it’s just the way you call them that
differs. For example, the sample code published for the HTTP method looks like:

    // Requires: using System; using System.IO; using System.Net;
    string appId = "myAppId";
    string text = "Translate this for me";
    string from = "en";
    string to = "fr";

    // URL-encode the values so spaces and characters like '&' survive the query string.
    string translateUri = "https://api.microsofttranslator.com/v2/Http.svc/Translate?appId=" + Uri.EscapeDataString(appId) +
        "&text=" + Uri.EscapeDataString(text) + "&from=" + from + "&to=" + to;
    HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(translateUri);
    WebResponse resp = httpWebRequest.GetResponse();
    Stream strm = resp.GetResponseStream();
    StreamReader reader = new System.IO.StreamReader(strm);
    string translation = reader.ReadToEnd();

    Response.Write("The translated text is: '" + translation + "'.");

Then, for the SOAP method:

    string result;
    TranslatorService.LanguageServiceClient client =
        new TranslatorService.LanguageServiceClient();
    result = client.Translate("myAppId",
        "Translate this text into German",
        "en", "de");
    Console.WriteLine(result);

And lastly for the AJAX method:

    var languageFrom = "en";
    var languageTo = "es";
    var text = "translate this.";

    function translate() {
        // The service calls this function back with the translated text.
        window.mycallback = function(response) { alert(response); };

        var s = document.createElement("script");
        s.src = "https://api.microsofttranslator.com/V2/Ajax.svc/Translate?oncomplete=mycallback&appId=myAppId&from="
            + languageFrom + "&to=" + languageTo + "&text=" + encodeURIComponent(text);
        document.getElementsByTagName("head")[0].appendChild(s);
    }

Fortunately, it all works as you’d expect: cleanly and simply. The really nice
thing about this (and Google Translate)
is that when faced with straight-up HTML like:

    <p class="style">Hello World!</p>

They will both return the following:

    <p class="style">¡Hola mundo!</p>

Both translators will keep the HTML tags intact and only translate the actual text. 
This undoubtedly comes in handy if you do any large bulk translations.  For example,
I’m working with another couple of guys here on an internal (one day external) tool
that has a lot of data in XML files with markup.  Essentially we need to translate
something like the following:

    <Article Id="this does not get translated"
        Title="Title of the article"
        Category="Category for the article"
    >
        <Content><![CDATA[<P>description for the article<BR/>another line </p>]]></Content>
    </Article>

The cool thing is that if I just deserialize the above into an object and send the
value of the Content member to the service like:

    string value = client.Translate(APPID_TOKEN,
        content, "en", "es");

I get only the content of the HTML translated:

    <p>Descripción del artículo<br>otra línea</p>
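For reference, the deserialization step could look something like the sketch below. The `Article` class and the `ArticleReader.Parse` helper are hypothetical names of mine, not part of the translator API; they simply mirror the sample XML above using the standard `XmlSerializer`.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type mirroring the <Article> element shown above.
public class Article
{
    [XmlAttribute] public string Id { get; set; }
    [XmlAttribute] public string Title { get; set; }
    [XmlAttribute] public string Category { get; set; }

    // The CDATA section is read as plain text, so this holds the raw HTML.
    [XmlElement] public string Content { get; set; }
}

public static class ArticleReader
{
    // Deserializes a single <Article> element from an XML string.
    public static Article Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Article));
        using (var reader = new StringReader(xml))
        {
            return (Article)serializer.Deserialize(reader);
        }
    }
}
```

The `Content` property of the returned object is then the string you hand to `client.Translate`.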

Pretty nice and easy. One thing all of the translator services have trouble
with is translating the entire XML element above in one shot. Bing returns:

    <article id="this does not get translated"
    title="Title of the article"
    category="Category for the article">
    </article>
    <content><![CDATA[<P>Descripción del artículo<br>otra línea]]</content> >

And Google returns:

    <=Id artículo "esto no se traduce"
    Título= "Título del artículo"
    Categoría= "Categoría para el artículo">

    <Content> <! [CDATA[descripción <P> para el artículo <BR/> otra línea </ p >]]>
    </ contenido>
    </> Artículo

Oh well, I guess no one’s perfect, and for now we’ll be forced to deserialize and
translate one element at a time.
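Here’s a rough sketch of that element-by-element approach. To keep it short it walks the XML with LINQ to XML rather than a typed object, and the `translate` delegate is a stand-in for whatever call you actually use; `ArticleTranslator` and `TranslateArticle` are names I made up for illustration, and the choice of which attributes to translate is an assumption based on the sample above.

```csharp
using System;
using System.Xml.Linq;

public static class ArticleTranslator
{
    // Returns a translated copy of an <Article> element.
    // 'translate' is a placeholder for the real service call.
    public static XElement TranslateArticle(XElement article, Func<string, string> translate)
    {
        // Deep copy so the original element is left untouched.
        var result = new XElement(article);

        // Translate only the human-readable attributes; Id is left as-is.
        foreach (var name in new[] { "Title", "Category" })
        {
            var attr = result.Attribute(name);
            if (attr != null)
            {
                attr.Value = translate(attr.Value);
            }
        }

        // Send just the HTML inside <Content> and wrap the result in CDATA again.
        var content = result.Element("Content");
        if (content != null)
        {
            content.ReplaceNodes(new XCData(translate(content.Value)));
        }

        return result;
    }
}
```

With the SOAP client from earlier, the call would be along the lines of `ArticleTranslator.TranslateArticle(article, s => client.Translate(APPID_TOKEN, s, "en", "es"))`. Each value goes to the service on its own, so the surrounding XML never gets a chance to be mangled.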

Enjoy!