May 29, 2009

Google's Context-Sensitive Spell Checker

Spell checkers aren't usually very smart: they highlight words that aren't in a dictionary and suggest a list of similar words. Even if they take into account words that aren't included in dictionaries and deal with plurals and verb tenses, most spell checkers can't find words that are spelled correctly but used in the wrong context.

Wikipedia gives this example: "Their coming too sea if its reel", a phrase that has 5 spelling mistakes, even though all the words can be found in the dictionary. If you enter this text in Gmail's editor and click on "Check spelling", Gmail won't find any errors. Type the same text in Google's search box, and you'll get a "did you mean" message that suggests searching for "Their coming to see if its real". As you can see, Google's search box has a better spell checker than Gmail: instead of relying on a dictionary, it uses a huge number of searches to determine the most probable sequences of words that follow a certain pattern. Unfortunately, the spell checker available at is optimized for searches, which are usually short, so you can't use it to spell check an email message or a blog post.
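To make the "most probable sequences of words" idea concrete, here is a minimal sketch in Python. It is not Google's implementation: the bigram counts and the sets of easily confused words are tiny made-up tables standing in for the statistics a search engine could gather from its queries, and the function names are hypothetical. For each word, the sketch picks the member of its confusion set that most often follows the previous word.

```python
# Hypothetical bigram counts, standing in for statistics mined from searches.
BIGRAM_COUNTS = {
    ("coming", "to"): 900, ("coming", "too"): 5,
    ("to", "see"): 800, ("to", "sea"): 40,
    ("its", "real"): 300, ("its", "reel"): 10,
}

# Hypothetical confusion sets: words that are easily mistaken for one another.
CONFUSION_SETS = [{"to", "too", "two"}, {"see", "sea"}, {"real", "reel"}]


def candidates(word):
    """Return the confusion set containing the word, or just the word itself."""
    for group in CONFUSION_SETS:
        if word in group:
            return group
    return {word}


def correct(sentence):
    """Greedily replace each word with its most likely alternative in context."""
    fixed = []
    for word in sentence.lower().split():
        if not fixed:
            fixed.append(word)  # no left context for the first word
            continue
        prev = fixed[-1]
        # Prefer the highest bigram count; on a tie, keep the original word.
        best = max(
            sorted(candidates(word)),
            key=lambda c: (BIGRAM_COUNTS.get((prev, c), 0), c == word),
        )
        fixed.append(best)
    return " ".join(fixed)


print(correct("Their coming too sea if its reel"))
```

With these toy counts the sketch prints "their coming to see if its real". Like the suggestion described above, it leaves "their" and "its" alone, because nothing in the made-up tables says they are wrong.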

Google Wave, the service demoed yesterday at Google I/O, includes a context-sensitive spell checker that highlights errors as you type. Google uses the language models built for Google Translate to find words that don't belong in a certain context.
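How a language model can find words that don't belong in a context can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the model used in Google Wave or Google Translate: the "model" is a table of word pairs counted from a toy sentence collection, and the names train and flag_out_of_context are made up for the example. A word is highlighted when it is in the vocabulary, so a dictionary check would accept it, but it has never been seen next to either of its neighbours.

```python
from collections import Counter

# Toy training text; a real language model is trained on far more data.
CORPUS = "they want to see the sea . we went to the sea . it is too late to see"


def train(text):
    """Return the vocabulary and the counts of adjacent word pairs."""
    tokens = text.split()
    return set(tokens), Counter(zip(tokens, tokens[1:]))


def flag_out_of_context(sentence, vocab, bigrams):
    """Return known words that were never seen next to any of their neighbours."""
    tokens = sentence.lower().split()
    flagged = []
    for i, word in enumerate(tokens):
        if word not in vocab:
            continue  # unknown words are left to an ordinary dictionary check
        pairs = []
        if i > 0:
            pairs.append((tokens[i - 1], word))
        if i + 1 < len(tokens):
            pairs.append((word, tokens[i + 1]))
        # Flag the word only if every pair it takes part in is unseen.
        if pairs and all(bigrams[pair] == 0 for pair in pairs):
            flagged.append(word)
    return flagged


vocab, bigrams = train(CORPUS)
print(flag_out_of_context("we went too the sea", vocab, bigrams))
```

On this toy data the sketch prints ['too']: the word is spelled correctly, but it never appears after "went" or before "the" in the training text. A real system would use smoothed probabilities over longer word sequences instead of a seen/unseen test.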

