The Google Docs API is testing a new feature that lets you perform OCR (optical character recognition) on an image. There's a live demo that illustrates this feature: you can upload a high-resolution JPG, GIF, or PNG image smaller than 10 MB, and Google Docs extracts the text and converts it into a new document. Google mentions that "the operation can currently take up to 40 seconds", and a small test showed that the service is not yet reliable: it's slow and it frequently returns errors.
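
If you plan to script uploads, a quick local check can catch files the demo would reject before you wait for the round trip. Here's a minimal sketch in Python, assuming that "10 MB" means 10 × 1024 × 1024 bytes and that the file extension reflects the real image format. The function name is made up for illustration and is not part of any Google API.

```python
import os

# Limits quoted in the post; the exact byte count behind "10 MB" is an assumption.
MAX_BYTES = 10 * 1024 * 1024
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def can_submit_for_ocr(filename, size_bytes):
    """Return True if the file matches the format and size limits described above."""
    # Compare extensions case-insensitively so "SCAN.PNG" is accepted too.
    extension = os.path.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS and 0 < size_bytes < MAX_BYTES
```

For a file on disk you could call it as `can_submit_for_ocr(path, os.path.getsize(path))`. It only checks the name and size, so it says nothing about whether the image is sharp enough for the text to be recognized.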

The results are far from perfect and you'll find many errors, but the service is free and it's constantly improving. Here's the result of the OCR for this scanned document:

There aren't many free OCR services available, so an OCR service provided by Google would be very popular.
ABBYY FineReader Online is one of the best online OCR services, but the free version is limited to 10 pages a day.
Google sponsors the development of an open-source OCR system called OCRopus, but it's not clear if the online service provided by Google Docs uses OCRopus.
Labels: Google Docs
webboodah said on September 29, 2009 11:29 AM PDT:
This fits in very well with their acquisition of reCAPTCHA....
Jurgi said on September 29, 2009 12:04 PM PDT:
@webboodah: yup, they can send hard words to users of reCaptcha to get amazing accuracy. And soon after that, cybercriminals will invent an API to send reCaptcha images to Google Docs OCR for great spamming accuracy. Perpetuum mobile. :D
said on September 29, 2009 12:06 PM PDT:
Great idea! But you're right: this doesn't work very well. I'm excpecting a much more performant version :).
said on September 29, 2009 12:31 PM PDT:
Is this English only for now? Currently we're very pleased with the Acrobat tools' performance, but that installation is not available to all users. A web-based version would be great.
said on September 29, 2009 6:05 PM PDT:
If you are going to criticize Google for not having perfect OCR translations, at least get the spelling right in your post.....
"excpecting a much more performant version :)"
said on September 29, 2009 9:33 PM PDT:
Is this like steganography?
Fetard said on September 30, 2009 12:45 AM PDT:
I just tested it on some pictures. In some cases I had to double the size of the image before the text was recognized, but it's awesome to be able to use OCR online!
Don't forget that OCR is a very complex technology, so be patient!
Kirill said on September 30, 2009 2:06 AM PDT:
Very poor quality
The best online OCR service is:
http://www.finereaderonline.com
Chris said on September 30, 2009 1:37 PM PDT:
Google and OCR technology. My prediction is pretty close:
http://detailcode.com/news/future-predictions-of-googles-recaptcha/
Has anyone tested the accuracy when images are slightly skewed/rotated?
Dear Google,
it would be so kind of you if you could go further in your endeavour and make Google Docs OCR work for Hebrew and Arabic. Perhaps Cyrillic too. Certainly, people who read Chinese, Japanese or Hindi have similar levels of enthusiasm.
This is great news, as I have a lot of books I wish to scan. Unlike people who love collecting books on shelves but not reading them, I love reading books but not keeping them. Printed books are really heavy and inconvenient to move around with. Printed books also do not lend themselves to searching their contents, which is a very bad feature.
I want to be able to read my stuff wherever I am in the world. I would scan the books, recycle the paper, but keep the covers in a trunk in an obscure corner of the cellar as proof of my ownership of the books.
BTW, what's the meaning of "performant"? I am afraid the frequent appearance of this word is about to make it join the list of annoying words like "obligated" that have replaced "obliged", "percentage points" instead of "percent" or "presumptive candidate" instead of "presumed candidate" as real words.
Jurgi,
That way, Google can find the criminals much faster and black-list all their property faster.
Hence,
Isn't “candidate” enough?
said on November 17, 2009 1:26 AM PDT:
Good start, the other so-called free services suck.