An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

December 15, 2009

Google Docs Indexes PDF Files

This feature should've been added a long time ago: Google Docs now indexes PDF files, so you can finally search the contents of all your files.

Google Docs search was also recently improved with support for automatic stemming and synonyms, so you can search for [create shortcut] and find documents that contain [creating shortcuts] or [creates a shortcut].
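
To get a feel for why stemming makes [create shortcut] match [creating shortcuts], here is a minimal Python sketch. It is not how Google Docs does it (the post doesn't say); the tiny suffix list and the names naive_stem, tokenize and matches are made up for illustration, and synonyms aren't covered at all.

```python
import re

# Hypothetical, deliberately tiny rule set. Real stemmers (Porter, for
# example) use far more rules; this only illustrates the idea.
SUFFIXES = ("ing", "es", "s")


def naive_stem(word: str) -> str:
    """Reduce a word to a crude stem by stripping one common suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words survive intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so "create" and "creating" share a stem.
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(query: str, document: str) -> bool:
    """True if every stemmed query term appears among the document's stems."""
    doc_stems = {naive_stem(token) for token in tokenize(document)}
    return all(naive_stem(term) in doc_stems for term in tokenize(query))


if __name__ == "__main__":
    for doc in ("creating shortcuts", "creates a shortcut", "delete a file"):
        print(doc, "->", matches("create shortcut", doc))
```

Both the query and the documents are stemmed before comparison, so the different word forms collapse to the same terms.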


To make things even better, Google should detect scanned PDF documents and use OCR to extract their text. Google's search engine already uses OCR to index scanned documents, and the feature is available as an experiment in the Google Docs API.

7 comments:

  1. Another good reason to start using docs. Thanks for the info.

  2. Doesn't appear to be working with PDFs that haven't already been OCR'd, which has always been the case since they introduced PDF support. :\

  3. It isn't working for now; my PDFs are still only title-indexed.

  4. It's detecting scanned text in a PDF that I uploaded last week. I'm guessing that it just takes time.

  5. It seems to index only the first 100 pages of a PDF file!

  6. Once you have a PDF you want to index, edit the keywords in the preferences. I have found that Google indexes the keywords as well.

