Online translators really aren’t all that new. They’ve been around for at least 8 years or so. I remember the days when I would use Babelfish for all of my fun translations. It was a great way to get an immediate translation for something non-critical. The problem in a lot of cases was grammatical correctness. Translating word for word isn’t particularly difficult, but context and grammar vary so much between languages that it was always challenging to translate entire sentences, paragraphs or passages from one language to another.
Fortunately the technology has improved a lot over the years. Now, you can somewhat reliably translate entire web pages from one language to another. I’m not saying it’s without fault, but I am saying that it’s gotten a lot better over time. These days there are a few big players in this space, notably Google Translate, Babelfish and the Bing Translator. The interesting thing I’ve found is that only Bing actually has a supported API into its translation service.
There are 3 primary ways to interact with the service: HTTP, SOAP and AJAX.
They all seem to expose the same methods; only the way you call them differs. For example, the sample code published for the HTTP method looks like:
string appId = "myAppId";
string text = "Translate this for me";
string from = "en";
string to = "fr";

string detectUri = "http://api.microsofttranslator.com/v2/Http.svc/Translate?appId=" + appId +
    "&text=" + text + "&from=" + from + "&to=" + to;
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(detectUri);
WebResponse resp = httpWebRequest.GetResponse();
Stream strm = resp.GetResponseStream();
StreamReader reader = new System.IO.StreamReader(strm);
string translation = reader.ReadToEnd();

Response.Write("The translated text is: '" + translation + "'.");
Then, for the SOAP method:
string result;
TranslatorService.LanguageServiceClient client =
    new TranslatorService.LanguageServiceClient();
result = client.Translate("myAppId",
    "Translate this text into German",
    "en", "de");
Console.WriteLine(result);
And lastly for the AJAX method:
var languageFrom = "en";
var languageTo = "es";
var text = "translate this.";

function translate() {
    window.mycallback = function(response) { alert(response); }

    var s = document.createElement("script");
    s.src = "http://api.microsofttranslator.com/V2/Ajax.svc/Translate?oncomplete=mycallback&appId=myAppId&from="
        + languageFrom + "&to=" + languageTo + "&text=" + text;
    document.getElementsByTagName("head")[0].appendChild(s);
}
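One thing worth noting about the HTTP and AJAX samples above: they concatenate the text straight into the query string, which is fine for simple phrases but can break once the text contains characters such as `&`, `#` or `+`. Here is a minimal sketch of building the same kind of URL with encoded parameters. The `buildTranslateUrl` helper is my own invention rather than part of the service; the endpoint and parameter names are just the ones from the samples above.

```javascript
// Hypothetical helper: builds a Translate URL with every parameter percent-encoded.
// The base URL and parameter names are copied from the samples above.
function buildTranslateUrl(baseUrl, params) {
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(params[name]));
    }
  }
  return baseUrl + "?" + pairs.join("&");
}

// The "&" and "#" in the text would otherwise be read as URL syntax.
var url = buildTranslateUrl(
  "http://api.microsofttranslator.com/V2/Ajax.svc/Translate",
  { oncomplete: "mycallback", appId: "myAppId", from: "en", to: "es", text: "Fish & chips #1" }
);
```

The resulting `url` can be assigned to `s.src` in place of the hand-built string.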
Fortunately, it all works as you’d expect – cleanly and simply. The really nice thing about this (and the Google Translator) is that when faced with straight-up HTML like:
<p class="style">Hello World!</p>
They will both return the following:
<p class="style">¡Hola mundo!</p>
Both translators will keep the HTML tags intact and only translate the actual text. This undoubtedly comes in handy if you do any large bulk translations. For example, I’m working with another couple of guys here on an internal (one day external) tool that has a lot of data in XML files with markup. Essentially we need to translate something like the following:
<Article Id="this does not get translated"
    Title="Title of the article"
    Category="Category for the article"
    >
    <Content><![CDATA[<P>description for the article<BR/>another line </p>]]></Content>
</Article>
The cool thing is that if I just deserialize the above into an object and send the value of the Content member to the service like:
string value = client.Translate(APPID_TOKEN,
    content, "en", "es");
I get only the content of the HTML translated:
<p>Descripción del artículo<br>otra línea</p>
Pretty nice and easy. One thing all of the translator services have trouble with is translating the entire XML element above in one shot. Bing returns:
<article id="this does not get translated"
title="Title of the article"
category="Category for the article">
</article>
<content><![CDATA[<P>Descripción del artículo<br>otra línea]]</content> >
And Google returns:
<= Id artículo "esto no se traduce"
Título = "Título del artículo"
Categoría = "Categoría para el artículo">
<Content> <! [CDATA [descripción <P> para el artículo <BR/> otra línea </ p >]]>
</ contenido>
</> Artículo
Oh well, I guess no one’s perfect, and for now we’ll be forced to deserialize and translate one element at a time.
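The per-element approach can be sketched roughly as below. This is only an illustration in JavaScript, not the code from our tool: the `translateArticle` function, the shape of the `article` object and the `translate` callback are all assumptions. The callback stands in for whichever service call you use (for example the SOAP `Translate` method shown earlier) and is assumed to be synchronous to keep the sketch short.

```javascript
// Hypothetical sketch: translate an already-deserialized article one field at a time.
// Only the fields listed here are sent to the translator; Id is copied as-is.
var TRANSLATABLE_FIELDS = ["Title", "Category", "Content"];

function translateArticle(article, from, to, translate) {
  var result = {};
  Object.keys(article).forEach(function (key) {
    var shouldTranslate = TRANSLATABLE_FIELDS.indexOf(key) !== -1;
    result[key] = shouldTranslate ? translate(article[key], from, to) : article[key];
  });
  return result;
}

// Stand-in translator so the sketch runs without a network call or an app ID.
function fakeTranslate(text, from, to) {
  return "[" + from + "->" + to + "] " + text;
}

var translated = translateArticle(
  {
    Id: "this does not get translated",
    Title: "Title of the article",
    Category: "Category for the article",
    Content: "<P>description for the article<BR/>another line </p>"
  },
  "en", "es", fakeTranslate
);
```

Because each field goes to the service on its own, the XML structure never reaches the translator, so there is nothing for it to mangle.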
Enjoy!