Mike Cohen, who leads Google's speech technology efforts, and Franz Och, a machine translation researcher, chat with Alfred Spector, VP of Research and Special Initiatives at Google, about two technologies that might seem unrelated to Google's core competency. Both statistical machine translation and speech recognition are search problems, and Google's computing infrastructure can process the large amounts of data needed to build language models. Another big advantage for Google is that its popular services generate a lot of useful data.
"When we first created GOOG-411, we had no speech data. Because we had so much query data here at Google (textual queries that people had typed to Google Maps), we could already train a pretty good language model. Now, obviously, text is a little different than speech and now that we've also trained on speech, we have better performance than we had back then, but even out of the box we could get good performance on that problem because we had so much textual data," says Mike Cohen.
That's excellent -- now if we could just get some of that voice recognition technology applied to Grand Central... It would be about 200% more useful to me if I could "say" one, two, three, etc. rather than having to hit a key during call screening...
Completely agree, Chuck. It would be nice to get voicemails as text too. All of these technologies have enormous potential for GrandCentral, but there seems to be no GC focus whatsoever.
The view that "...speech recognition [is a] search problem" is common, but I think there are better ways to capture/identify speech. Translation is a different issue, and mass data analysis may be appropriate there.
The English-to-Russian automatic translator is poor. It translates simple and obvious phrases incorrectly, and it doesn't translate some words at all. :(
I will third that comment by Chuck. My phone doesn't let you input numbers when you've received a call and it's password-locked, so if my phone happens to be locked (it locks automatically every few hours) and I get a call, I can't answer and I'm forced to wait until the other party inevitably hangs up. For this reason, I don't give out my GC number unless I'm sure someone is going to try to bug me. I'm annoyed at how long it's taking them to get GrandCentral up to speed, but I can only assume they must be planning a big overhaul soon, especially considering their Android platform is rolling out.
Check out speechsystems.in, which works on speech recognition. All the guys are from CMU.
Speech recognition... I'm going mad with it. I'm doing a final-year project for my university course, and I have to build an online chatterbot specialized in nutrition, with speech recognition :-S
Can anyone at least give me some help or hints on where to start... languages, programming platform, etc.?
@steeleweed:
ReplyDelete"mass data analysis" is part of what happens when doing speech recognition as a search problem. Search is also applicable for parsing syntactic structure, and is thus an important ingredient in many machine translations systems. Besides, most of the NLP technology out there uses a rather eclectic mix of methods...
Another cool company making tremendous progress is SpeechCycle, based in NYC. See www.speechcycle.com.
Spinvox.com is already there: artificial intelligence made in Cambridge, patented and awesome quality. Sorry Google, this time you're very late!
As impressive as this development is, the work of an actual translation agency won't be replaced so easily.
I would like to see Google offer a speech recognition service that lets you record a long audio message on your smartphone and have it transcribed into text and posted on your blog. This service would be useful to writers and reporters because they wouldn't have to worry about losing a thought while scrambling to write it down.
This service could aid Google's speech recognition and language translation efforts in the following way: in exchange for the free service outlined above, Google would periodically send each user a paragraph to read aloud and submit. In addition, since Google has such deep pockets, it could offer a lottery to solicit even more people to use the service.
So basically, a smartphone application would record your speech and upload it to Google's servers when you get close to a hotspot. The speech could then be transcribed asynchronously into text and posted on your blog. There's no need to transcribe it in real time, since accuracy is the goal for everyone.
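For what it's worth, a minimal sketch of that asynchronous flow might look like the following, where transcribe() and post_to_blog() are purely hypothetical stand-ins for whatever speech-to-text and blog-publishing services would actually be used; nothing here is an existing Google API.

```python
# Hypothetical sketch of the record -> upload -> transcribe -> post flow:
# recordings are queued when a connection is available, and a background
# worker transcribes each one and posts the text to a blog.
import queue
import threading

recordings = queue.Queue()

def transcribe(audio_path):
    # Stand-in for a real speech-to-text call.
    return f"(transcript of {audio_path})"

def post_to_blog(text):
    # Stand-in for a real blog-publishing call.
    print("Posted:", text)

def worker():
    while True:
        audio_path = recordings.get()
        if audio_path is None:        # sentinel to stop the worker
            break
        post_to_blog(transcribe(audio_path))
        recordings.task_done()

threading.Thread(target=worker, daemon=True).start()

# The phone app would enqueue uploads whenever it reaches a hotspot.
recordings.put("memo_2008-10-21.wav")
recordings.join()
recordings.put(None)
```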
Since everybody has a different voice, voice recognition (and live interpretation) seems much more difficult than machine translation of text alone. We already know there is much work to be done on machine translation, so I don't expect voice recognition to progress to the point where live machine interpretation is feasible. That would also probably require quite a lot of training from the users. If you have ever tried a voice recognition program on your PC to replace your keyboard, you should know...
Anyway, we have to start somewhere, and any initiative to improve machine translation and/or voice recognition should be welcomed.
The interface in science is friendlier than Facebook's.
Working on a final-year machine translation project to convert English text to a local language in my country (Nigeria). Please, any advice? I don't know where to start.
I need help with this machine translation thing. It's supposed to be my final-year project, but I don't know where to start.