Voice Search is a Google Chrome extension that lets you search using your voice. It's not developed by Google, but it uses an experimental Chrome feature called form speech input. The feature is enabled by default in the dev channel builds, and it can be manually enabled in other builds by adding a command-line flag.
"Voice Search comes pre-loaded with the following default services: Google, Wikipedia, YouTube, Bing, Yahoo, DuckDuckGo and Wolfram|Alpha. You can also add your own user-defined search engines. It also integrates a speech input button for all websites using HTML5 search boxes. This extension requires a microphone. Speech input is very experimental, so don't be surprised if it doesn't work. Also, try to speak clearly for best speech recognition results," suggests the author.
Speech recognition is limited to English and it doesn't work very well, but this extension is a good way to test a feature that will be enabled in future Chrome releases. If you have a website, it's quite easy to add support for speech input, but it may take a while until Google's Speech Input API specification becomes a standard and all browsers implement it.
{ Thanks, Silviu. }
This is cool! I don't know if I want to be shouting at my monitor to search, but it is cool.
On a related note, I have a question for everyone that may have a simple answer:
If Android phones can do Google searches for "chubby bunny," with near perfection, while the speaker has five marshmallows in his mouth, why is Google Voice's transcription service often far from accurate?
@Cougar:
Some possible explanations: less training data, less feedback, more background noise, more complex phrases.
Thanks for the response, Alex.
I'm personally unfamiliar with "training data."
I'm very comfortable with your explanations.
Do you think it's the same software/machinery that performs the transcriptions?
Google's approach to voice recognition is similar to the one used for Google Translate, so the two services probably share a lot of data. Google needs to build a language model using large amounts of data, then use a lot of audio samples to build a voice model and then develop a voice recognition system that tries to connect the two models and produce some useful results.
Google has to find text sources (probably from the Web) and audio sources (these are more difficult to find). It's much easier to build a recognition system for voice search because Google already has a lot of queries that could be used as text sources and the system can self-improve by using the audio samples collected using Voice Search. It's a lot more difficult to build a speech-to-text system for voicemail or YouTube videos because the input is more complex and less predictable.
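To make the idea of connecting the two models a bit more concrete, here is a toy sketch in Python. It is not how Google's system works: the bigram language model, the add-one smoothing, the sample queries and the acoustic scores are all assumptions made up for illustration. It only shows how a language model built from past queries can overrule a slightly wrong guess from the audio side.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count words and word pairs in a small text corpus (the toy language model)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def language_log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under the bigram model, with add-one smoothing."""
    words = ["<s>"] + sentence.lower().split()
    vocab_size = len(unigrams)
    total = 0.0
    for prev, word in zip(words, words[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def pick_transcription(candidates, unigrams, bigrams, lm_weight=1.0):
    """Return the candidate text with the best combined acoustic + language score."""
    return max(
        candidates,
        key=lambda c: c[1] + lm_weight * language_log_prob(c[0], unigrams, bigrams),
    )[0]


# Hypothetical past search queries used as the text source.
queries = ["chubby bunny", "chubby bunny game", "funny bunny video", "weather today"]
unigrams, bigrams = train_bigram_model(queries)

# Hypothetical acoustic log-scores: the mumbled audio slightly favours the
# wrong words, so the language model has to correct the result.
candidates = [("chubby money", -2.0), ("chubby bunny", -2.3)]
best = pick_transcription(candidates, unigrams, bigrams)
print(best)
```

With `lm_weight=0.0` the language model is ignored and the acoustically preferred "chubby money" wins, which hints at why longer, less predictable input such as voicemail is harder: the text side has less to say about what is likely.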
I found a better explanation from Google:
"Creating a general voice input service had different requirements and technical challenges compared to voice search. While voice search was optimized to give the user the correct web page, voice input was optimized to minimize (Hangul) character error rate. Voice inputs are usually longer than searches (short full sentences or parts of sentences), and the system had to be trained differently for this type of data. The current system's language model was trained on millions of Korean sentences that are similar to those we expect to be spoken. In addition to the queries we used for training voice search, we also used parts of web pages, selected blogs, news articles and more. Because the system expects spoken data similar to what it was trained on, it will generally work well on normal spoken sentences, but may yet have difficulty on random or rare word sequences -- we will work to keep improving on those."
@Alex.
Cool, thanks for the additional explanations.
I'm sure things will improve soon enough, especially after Google purchased that voice recognition firm from (I believe) the UK.
It does not work on my Chrome.
I use Google Chrome, but my grey speaker has not been showing for two days. Please help me.