An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

August 12, 2010

How Google Translate Works

Google uploaded a video that explains how Google's machine translation service works. It's fascinating to see how much Google Translate has improved in the past 4 years and how many Google services use it.


Here's the full text of the video:
"Google Translate is a free tool that enables you to translate sentences, documents and even whole websites instantly. But how exactly does it work? While it may seem like we have a room full of bilingual elves working for us, in fact all of our translations come from computers. These computers use a process called 'statistical machine translation' -- which is just a fancy way to say that our computers generate translations based on patterns found in large amounts of text.

But let's take a step back. If you want to teach someone a new language you might start by teaching them vocabulary words and grammatical rules that explain how to construct sentences. A computer can learn a foreign language the same way - by referring to vocabulary and a set of rules. But languages are complicated and, as any language learner can tell you, there are exceptions to almost any rule. When you try to capture all of these exceptions, and exceptions to the exceptions, in a computer program, the translation quality begins to break down. Google Translate takes a different approach.

Instead of trying to teach our computers all the rules of a language, we let our computers discover the rules for themselves. They do this by analyzing millions and millions of documents that have already been translated by human translators. These translated texts come from books, organizations like the UN and websites from all around the world. Our computers scan these texts looking for statistically significant patterns -- that is to say, patterns between the translation and the original text that are unlikely to occur by chance. Once the computer finds a pattern, it can use this pattern to translate similar texts in the future. When you repeat this process billions of times you end up with billions of patterns and one very smart computer program. For some languages however we have fewer translated documents available and therefore fewer patterns that our software has detected. This is why our translation quality will vary by language and language pair. We know our translations aren't always perfect but by constantly providing new translated texts we can make our computers smarter and our translations better. So next time you translate a sentence or webpage with Google Translate, think about those millions of documents and billions of patterns that ultimately led to your translation - and all of it happening in the blink of an eye."
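To make the "patterns in translated texts" idea concrete, here is a toy sketch in Python. It is not Google's actual system, which is far more sophisticated. The six-sentence English/Spanish corpus, the function names and the choice of the Dice coefficient as the scoring rule are all assumptions made for illustration. The sketch counts how often a source word and a target word appear in the same sentence pair, then picks the target word with the strongest association.

```python
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus (English, Spanish), invented for this example.
PARALLEL = [
    ("the house", "la casa"),
    ("the green house", "la casa verde"),
    ("the book", "el libro"),
    ("the green book", "el libro verde"),
    ("a house", "una casa"),
    ("a book", "un libro"),
]


def learn_patterns(pairs):
    """Count word occurrences and source/target co-occurrences per sentence pair."""
    cooc = defaultdict(Counter)  # cooc[source_word][target_word] -> shared pairs
    src_count = Counter()        # sentence pairs containing each source word
    tgt_count = Counter()        # sentence pairs containing each target word
    for src, tgt in pairs:
        src_words = set(src.split())
        tgt_words = set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        for s in src_words:
            for t in tgt_words:
                cooc[s][t] += 1
    return cooc, src_count, tgt_count


def best_translation(word, cooc, src_count, tgt_count):
    """Return the target word with the highest Dice score for `word`."""
    scores = {
        t: 2 * c / (src_count[word] + tgt_count[t])
        for t, c in cooc[word].items()
    }
    return max(scores, key=scores.get)


model = learn_patterns(PARALLEL)
for english in ("house", "book", "green"):
    print(english, "->", best_translation(english, *model))
```

With only six sentence pairs the program can already tell that "house" goes with "casa" rather than "la", because "la" also shows up in sentences without "house". With fewer sentence pairs the counts get noisier, which mirrors the video's point about language pairs that have less translated text available.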

22 comments:

  1. Because there are so many languages, Google Translate can't handle all of them easily. Collaborators, feedback, and other factors are the key to managing the language databases and their quality. Still, will it ever be better than a popular, trusted dictionary?

  2. If it's SO smart, how come it sucks SO bad? I mean, it's almost an exception to the rule when the translations actually make sense.

  3. Which language pairs do you use? What's great about Google Translate is that it's constantly improving. Depending on the number of parallel texts that are available and the difficulty of a language, the quality might improve at a slower pace.

  4. When I get from my students anything translated by any program, the mark is always "not tested". Translator tools are just tools, first one needs to use one's head - so, don't try using any language unless you make an effort to study it.

  5. I read you every day, and I translate it with Google Translate. Greetings.

  6. I use Google Translate every day to translate short phrases, and in my opinion it is a great tool that has improved over time. It's still not perfect, but it is better than any paid option.

  7. Of course it is a great tool, and sometimes it has even helped me understand a poem. Sí, señor. Of course, in the end I had to put two and two together, but Google did the rough part. Instead of 20 minutes, it took 1 second. What percentage of a saving is that?

  8. Google Translate is by no means a perfect translation service - you're still going to have to invest in those language classes if you want to be able to communicate fluently with speakers of foreign tongues. This video proved that. Thanks for sharing it.

  9. the Welsh translation (to and from) is getting better and better - like watching a kid progress through school or something. I put in a fairly complex technical document the other day and it was about 90% accurate, which is amazing when you come to think about it.

    However, the one place Translate falls down is that it doesn't know if its translation is correct or not - how could it? It might spit out text that is algorithmically correct, but doesn't really make sense in a real-world context ("that's not how we would say it"). So there needs to be human input here. Maybe a "yes, that is correct", as well as "no, here is a better translation" buttons.

    And native speakers - correct Translate if it is wrong! The more you correct it, the more accurate it will be. Seems a bit pointless translating into a language you're already fluent in, but the more people do that the better it will be for everyone. Maybe Google could introduce incentives for that - like ratings or something "Top translator for March was xxx".

  10. Google Translate is indeed a great tool for personal use, but when it comes to professional use, it might cost a pretty penny, as the Canadian police discovered recently: http://www.articlesbase.com/international-business-articles/canadian-police-pays-3000-per-day-for-relying-on-google-translate-2994095.html
    So be careful what you use it for.

  11. I honestly don't get comments like "..how come it sucks SO bad..". And why so many people seem to have such huge complaints about it. I mean yes it's not a hundred percent accurate. Even people that have two native languages still have trouble translating things.
    And without Google Translate, who knows where I'd be now. Go to http://www.sanakirja.org/ and try to translate some article from Finnish to English, and see how that goes for you. You can't even translate the whole thing at once. You'd have to go word for word. And each word has to be in its base form, so no conjugations. What with sixteen or eighteen conjugations to remember, I'd have a field day trying to get at the base word. Not to mention that a lot of words mesh together into a single thirty- or forty-letter word that would require further breakdown. So yeah, it might not be accurate, but at least once you're done you have some sort of idea of what is going on, and it is generally easy to figure out what is out of place and understand what has happened. If I can figure it out, I don't think it should be a problem for most people.

    I'd just like to say thank you for all the hard work on Google Translate. Don't know where I'd be without it.

  12. The video uses the wrong translation for Danish - "DANSK" is the correct word, but the video uses "DANSKE".

  13. Google Translator is very useful when it comes to informal documents and short texts. But it's important to note two things:

    1. Google Translator needs a very large quantity of computers working at the same time

    2. Google Translator is not worth it for professional translators simply because it translates everything, even in the wrong way. Other translation systems like apertium have learned that some words shouldn't be translated in some cases because there is no clear translation. When translators have to correct the translation, if those words have been translated, it'll be harder to find out what's wrong. Apertium in this case is much different and translators appreciate that.

    There are also many limitations with Google Translate: we cannot use it too much and there is no way to implement it on our computer (IT IS NOT FREE SOFTWARE).

    I just wanted to point this out. I love Google Translator anyway :)

  14. Well I use it to translate parts in Japanese I don't understand. Every day. It does its job how it's supposed to...

  15. Does Google Translate have some geeky link to a lite version, without the menu at the top with Google apps and other extras?

    I am looking for something like this: https://mail.google.com/tasks/ig but for Translate, not for Tasks.

  16. It rocks for some languages, for some you barely get the idea, and for the rest it's almost complete nonsense. But, as they said, that's all because there isn't enough material in those languages, and I don't think we can blame them for it. To all who say it sucks... it's free and you don't have to use it :D

  17. I never use Google Translate because it always completely butchers what I say in Japanese. I posted something in Japanese and my mom used Google Translate to translate it, and the translation was so wrong that she almost had a heart attack over what she thought I posted.

    I don't get why Google is so bad with Japanese? I have to use Babel Fish and it translates Japanese pretty well for the most part, but Google just makes it so incredibly wrong I don't understand how.

  18. I am taking Spanish as my foreign language right now. And while this is somewhat helpful in checking my work, there are common mistakes it makes, such as wrong fem/masc articles. I wouldn't always rely on Google Translate for corrections.

  19. If you think it makes a mess of Japanese, PLEASE do not send ANYTHING in Chinese. It turned my note into a death threat!!!

    Must be someone watching and retyping???????

    Google needs to debug the people that have a bug on their program!

  20. Google translate doesn't speak Japanese very well for me either;~;

  21. The quality of machine translation is influenced by multiple factors. Firstly, there are the algorithms the systems use, and the weighting they give to different kinds of evidence. Secondly there are the properties of the language pairs themselves. For example, I just obtained a very credible Google translation of a 98 page document in French about linguistics. French and English share a great deal linguistically, and of course there is also a huge volume of professionally translated French-English content for the machine to develop its parameters from. On the other hand an English-Korean Google translation is usually gibberish without even the gist being guessable. Korean and English are radically different languages, conceptual categories in those cultures diverge widely, the amount of human translations has always been limited, and the quality of Korean-English human translation has historically been poor (often 3rd hand from Japanese). All of this has major consequences, not only for Google translation but for bringing people from one of the world's major economies into the vast global human English conversation.
