January 10, 2010

Google's Sensitive Translation Service

Eric Baković from Language Log noticed a subtle feature of Google Translate. Google's machine translation system shows radically different results when you change the punctuation or the case of a text.

Here's an example where a small change in the text, capitalizing the first word, improves Google's translation:

[Spanish] tu hija que te quiso tanto y no supo demostrarlo - perdoname.
[English] your daughter that you loved so much and she could not prove it - pardon me.

[Spanish] Tu hija que te quiso tanto y no supo demostrarlo - perdoname.
[English] Your daughter who loved you so much and failed to prove it - pardon me.

"I don't pretend to know anything about Google's translation algorithm(s), but I do find it interesting that what seem like very minor manipulations like those shown above can lead to both bizarrely different results as well as to subtle improvements," notes Eric.

Jim Regan offers a possible explanation: "Google uses statistical machine translation, so algorithms have little to do with it - the translation is created by matching all the translations available for the different parts of the sentence, and then ranked against an n-gram language model of the target language to see how likely it is that those particular phrases go together, to assemble the translation. As case can be significant - acronyms are usually all upper case, proper names use an initial capital, etc. - it makes sense that it affects the translation."
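
To make Jim's description a bit more concrete, here is a toy sketch in Python of the idea he outlines: look up candidate translations for each part of the sentence, then rank the combinations with an n-gram language model of the target language. Everything in it is an assumption made for illustration: the phrase table, the trigram probabilities, the `UNSEEN_PROB` floor and the function names are invented, and a real system would also weigh how likely each phrase translation is. It is not Google's actual code or data; it only shows how a case-sensitive lookup can tip the ranking.

```python
import math
from itertools import product

# Hypothetical, case-sensitive phrase table: source phrase -> candidate translations.
PHRASE_TABLE = {
    "tu hija": ["your daughter"],
    "Tu hija": ["Your daughter"],
    "que te quiso tanto": ["that you loved so much", "who loved you so much"],
}

# Invented trigram probabilities for the target language (also case-sensitive).
TRIGRAM_PROB = {
    ("your", "daughter", "that"): 0.3,
    ("your", "daughter", "who"): 0.1,
    ("Your", "daughter", "that"): 0.1,
    ("Your", "daughter", "who"): 0.3,
}
UNSEEN_PROB = 0.01  # crude floor for trigrams missing from the toy model


def lm_score(sentence):
    """Sum the log-probabilities of every word trigram in the sentence."""
    words = sentence.split()
    trigrams = zip(words, words[1:], words[2:])
    return sum(math.log(TRIGRAM_PROB.get(t, UNSEEN_PROB)) for t in trigrams)


def translate(source_phrases):
    """Combine one candidate per source phrase and keep the best-scoring sentence."""
    options = [PHRASE_TABLE[p] for p in source_phrases]
    candidates = [" ".join(choice) for choice in product(*options)]
    return max(candidates, key=lm_score)


print(translate(["tu hija", "que te quiso tanto"]))
print(translate(["Tu hija", "que te quiso tanto"]))
```

With these made-up numbers, the lowercase input ends up as "your daughter that you loved so much" and the capitalized one as "Your daughter who loved you so much". The only thing that changed is which statistics the first word matched.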

5 comments:

  1. If you use a contracted form, it also changes the translation: http://tinyurl.com/yl8arw9

  2. Actually, both phrases should translate to what the second one translates to, with the only difference being the capital letter at the beginning. The first translation is incorrect.

  3. I'm surprised that changing punctuation would drastically alter a sentence like that.

    Although the software still needs some tweaks, the translation service gives people who don't speak English, for example, the ability to get the information they need. Plus, it's free.

  4. This is why I really prefer to use the Spanish translation from SpanishDict.com. It gives you the translations from 3 different translators so you can compare the results and find the one that sounds best. If I am still not sure, I just put it into the forum and the super friendly users correct it for me for free.

  5. Either wait for the Variations Create Hierarchies Job Definition and Variations Propagate Sites and Lists Timer Job to run as scheduled, or run them manually. Houston translation Service

