September 9, 2007

Microsoft Launches Translation Service


Microsoft launched an automatic translation service called Windows Live Translator. The site lets you translate a text of up to 500 words, or a web page, from English to German, Dutch, French, Spanish, Portuguese, Italian, Korean, Chinese, Japanese and Russian.

Microsoft uses Systran to produce most of the translations, but also offers an option to translate computer-related texts using a machine translation system developed in-house. Microsoft's translation technology has been used to translate technical materials, including the MSDN Library.

"Recent research in Machine Translation (MT) has focused on data-driven systems. Such systems are self-customizing in the sense that they can learn the translations of terminology and even stylistic phrasing from already translated materials. Microsoft Research MT (MSR-MT) system is such a data-driven system, and it has been customized to translate Microsoft technical materials through the automatic processing of hundreds of thousands of sentences from Microsoft product documentation and support articles, together with their corresponding translations."
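To make the idea of learning terminology "from already translated materials" more concrete, here is a minimal sketch. It is not how MSR-MT or any production system works: it only counts which words co-occur in aligned sentence pairs and scores candidates with the Dice coefficient. The tiny English/German corpus and the function name `learn_lexicon` are invented for illustration.

```python
from collections import Counter
from itertools import product

# Toy parallel corpus, invented for this example (English, German).
PARALLEL = [
    ("copy the file", "kopiere die datei"),
    ("print the file", "drucke die datei"),
    ("copy the image", "kopiere das bild"),
    ("remove the image", "entferne das bild"),
    ("remove the list", "entferne die liste"),
]


def learn_lexicon(pairs):
    """Guess a translation for each source word from sentence-level co-occurrence."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)                 # sentences containing each source word
        tgt_count.update(t_words)                 # sentences containing each target word
        joint.update(product(s_words, t_words))   # sentence pairs containing both words

    def dice(s, t):
        # Dice coefficient: 1.0 means the two words always appear together.
        return 2 * joint[(s, t)] / (src_count[s] + tgt_count[t])

    # For every source word, keep the target word with the highest score.
    return {s: max(tgt_count, key=lambda t: dice(s, t)) for s in src_count}


lexicon = learn_lexicon(PARALLEL)
print(lexicon["file"])  # datei
```

With so little data the sketch cannot separate every pair: "image" scores equally well against "das" and "bild", because those two words always appear together in this corpus. More aligned sentences would break such ties.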

Microsoft intends to integrate this service into Live Search, adding a feature that other search engines have offered for a long time. Windows Live Translator's presentation is extremely interesting: the default view shows the original page and the translation side by side in two vertical frames. If you hover over a sentence in one of the pages, the sentence is highlighted in both pages. If you scroll in one of the pages, the other page scrolls along with it. This is an interesting approach, especially for those who speak both languages fairly well or want to learn a new language. Unfortunately, it's difficult to read a page that requires horizontal scrolling.


Google also has a translation service powered by Systran. The translations are identical to the ones returned by Babel Fish, but they're different from Windows Live's translations, so Microsoft might use an updated version of Systran's software.

Google has also developed its own machine translation system, but it's available to the public for only three languages: Arabic, Chinese and Russian. To expand this system to other languages, it's important to have a lot of parallel texts. "Rather than argue about whether this algorithm is better than that algorithm, all you have to do is get ten times more training data. And now all of a sudden, the worst algorithm is performing better than the best algorithm on less training data," explained Peter Norvig, Director of Research at Google.

While machine translation is not yet a replacement for human translation in most cases, it's a great way to get the approximate gist of a text in a foreign language. One of the most important problems is that machine translation doesn't always produce coherent phrases and doesn't understand the subtleties of language, so don't use it to translate poetry or to send important emails.

20 comments:

  1. This is an interesting approach especially for those who speak both languages fairly good or want to learn a new language.

    Good joke.

    I've translated texts from French and Dutch, and I've edited machine translations, and on the whole I'd rather translate from scratch than repair the mess made by machines. Perhaps technical texts are easier, but anything that involves any subtlety of language or use of idiom is just beyond these systems at the current state of things.

    IMHO, of course.

  2. "an interesting approach" = displaying the original page and the translation in two frames. Google only shows the translation.

  3. Google's Arabic translation is not useful... it doesn't understand Arabic.

  4. Is it that bad? Google got some awards for the Arabic-to-English and Chinese-to-English translations.

  5. Seems like I got this story out earlier at startuplay.com, but it's still nice to see Google doing all the PR work for MS.

  6. Just possibly some readers may be interested in contributing to the (crude) comparison between "rules-based" and "statistically-based" machine translations, linked to from here:
    http://fm.schmoller.net/2007/09/machine-transla.html.

  7. It is great fun to translate from German to English and back to German.

  8. What's really fun is to see Systran in full action, translating back and forth between 5 different languages, as done by Lost in Translation.

  9. It's a lot better than nothing. It never ceases to amaze me that as soon as people start attempting to solve some really hard problems, all they want to do is tear the solutions apart instead of contributing useful information or even helping. The more of this kind of thing that goes on, the closer we are going to get to a universal translator. We already have software that can take written text and read it back (the first ones were very crude). Voice recognition software was extremely crude 10 years ago as well. The translations are a lot better than before and I am in favor of ANYTHING that moves this field of computer science forward.

  10. Cool, but it's wrong. When you try to translate "File" from English to German, you get "Akte", which is "Document" in English, when it should be "Datei".

  11. Nice improvements to Windows Live!

  12. Referring to the MSDN articles is not a good advertisement for this kind of software. Their German translations are always utterly bad. They do not grasp the meaning of the words and mix up the word order completely. They make for really funny reading, but if you want to know what all the mess is about, you always have to read the English original. The translation does not serve any purpose other than being a joke.

    I have not encountered automatic translation software as bad as the MSDN knowledge base articles. By contrast, the @PROMPT translation software at least produces understandable texts.

  13. My site offers the same service from Systran with the advantage that you can add translation to your own site with a bit of HTML code.

    You can download it here http://www.appliedlanguage.com/trans/free_quick.aspx

  14. If you NEED a translation, then you need an accurate translation. In which case you need a professional translator. The traditional model is high minimum charge and long turnaround times (days).

    www.LiveTranslation.com has online HUMAN translators providing real-time translation from only $1.99.

    It is a REAL alternative to inaccurate computer translation.

  15. The Italian translation is still less than a joke...

  16. I visited a website which used Systran, and about half of its translation from English to Japanese was either wrong or incomprehensible. I cannot believe they actually sell this product!

  17. "This is an interesting approach especially for those who speak both languages fairly good or want to learn a new language."

    This a joke? Correct ENGLISH would be "fairly well".

    Ha. Guess this guy should conquer English before moving onto another language :p

  18. Some good news and some very bad news. I have used the Google French translator back and forth quite a lot. It's good, especially for mathematical papers and the like. The German translator is a bit shaky but still quite helpful. Surprisingly, Italian is simply unusable even for the most basic tasks. It just doesn't speak Italian at all (e.g. Italian to English: "Cosi fan tutte" -> "So fan tutte")! Google should remove it.
    The Japanese translator is a puzzle. It basically generates random nonsense when translating Japanese pages to English. It appears to have no idiomatic capabilities at all.

  19. I've attempted to use translators before and they seem OK, but none of them are perfect! I decided to learn a language recently, and decided that learning a language online would be easier than classes, as I could do it in my spare time.

  20. I use Systran for some sites and I really dislike it. I've actually converted some of my intranets to use Google's AJAX API and it is really pretty cool (http://developer.morrisdev.com/translate/translate3.htm). It's all JavaScript, so you can rip that page apart. If you look at that main website (which is unfinished by a long shot), you'll see that I cycle through the HTML objects on the page and translate them to the selected language once the page loads.

    Still, the translation works pretty well, getting the idea across, but I speak several other languages and I can tell you, it isn't that great.

    Systran has the benefit of exclusion lists, where you can replace certain words on the fly. That's nice, but I think I'm going to be able to do that with the Google one if I change things a bit.

    Neither one of them helps much with SEO: the Systran translation is actually located on the Systran server, so you're not hosting the foreign-language text yourself, and the Google results are inserted post-load, so they are never seen by a webbot.

    My next step is to see if I can run the AJAX calls server-side and serve the pages with the foreign language already there. Maybe the search engines will like that.

    We'll see.

    By the way, learning a language from these translators is a BAD IDEA. Go get Rosetta Stone or a book, or date a girl/guy that doesn't speak English. :)
