Tuesday, July 07, 2009

Google's PDF Viewer for Search Results

Google's PDF viewer is now integrated with Google's search results, replacing the "view as HTML" option. Google converts PDF files into PNG images, but the files are still searchable and you can copy some of their content.

Not all the PDF files from Google's search results include the new option, so it's likely that Google doesn't perform the conversion on the fly and not all the files have been converted.



The PDF viewer is also used for the PDF files uploaded to Google Docs and for Gmail attachments.


15 comments
Looks like the PDF files on .edu domains are indexed for now!
Google Plans to Introduce a PC Operating System
http://bit.ly/euY2s
One more new feature from Google. That's really great.
@Hprofl, indeed. Even Zoho has already jumped on the train: http://blogs.zoho.com/general/if-the-browser-is-the-os-then-a-dockable-smart-phone-should-be-the-pc The real question is which part of the OS market this OS will take: Linux's or µ$oft's?
It seems you must have a Google Docs account and log in to view PDF files. The View link points, for example, to http://docs.google.com/gview?a=v&q=cache:2gqRQKstQzoJ:www.glencoe.com/sec/literature/litlibrary/pdf/hamlet.pdf
No, you don't need a Google account to view the file.
@Alex: Then what is the view link for you? Isn't it like http://docs.google.com/gview?a=v&q=cache:2gqRQKstQzoJ:www.glencoe.com/sec/literature/litlibrary/pdf/hamlet.pdf
That's the address, but you don't need to log in to view it. What would be the point?
Well, Google asked me to log in with that link. Ask Google about the point. Getting more Docs users?
Not for all PDFs (at the moment). Still a very welcome development!

Although... sometimes I prefer the HTML view - it's much quicker for large files and easier to copy text from.
Great!!

Inline access to PDF content in those instances when a client-side viewer isn't available or functional!!!

Is there a way to exclude the "View as HTML" results, leaving just the files processed by the PDF viewer in search results?
I think it is generated on-the-fly, since it gives an error if it's in their cache but not available on the web at the moment.
If someone manages to tweak the URL and view any PDF file hosted online, please let me know.
I doubt you can tweak the URL, as a proper key/hash code is required after q=cache: and this key/hash is connected to the URL in the second part of the cache parameter. So only cached (read: converted) PDFs on Google's servers can be used.

The viewer URLs, which I could access without being logged in to my Google Account, make me wonder if Google Docs will allow publishing user PDFs soon (currently they can only be shared).
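
To make the URL structure described above concrete, here is a minimal Python sketch that splits a viewer link into the key after q=cache: and the address of the original PDF. The function name split_viewer_url is made up for illustration, and the layout "cache:KEY:ADDRESS" is only what can be read off the example link quoted in these comments, not a documented format.

```python
from urllib.parse import urlsplit, parse_qs


def split_viewer_url(viewer_url):
    """Split a gview link into (cache_key, original_pdf_address).

    Assumes the q parameter looks like "cache:KEY:ADDRESS", as in the
    example link quoted in the comments. This layout is an observation,
    not a documented format.
    """
    query = parse_qs(urlsplit(viewer_url).query)
    q_value = query["q"][0]
    # Split on the first two colons only, so colons inside the address survive.
    prefix, key, target = q_value.split(":", 2)
    if prefix != "cache":
        raise ValueError("not a cache-style viewer link")
    return key, target


link = ("http://docs.google.com/gview?a=v&q=cache:2gqRQKstQzoJ:"
        "www.glencoe.com/sec/literature/litlibrary/pdf/hamlet.pdf")
key, target = split_viewer_url(link)
print(key)     # 2gqRQKstQzoJ
print(target)  # www.glencoe.com/sec/literature/litlibrary/pdf/hamlet.pdf
```

Swapping in a different address while keeping the key is exactly the tweak that, per the comment above, should not work, since the key appears to be tied to the address.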
Looks like the folks at Google have figured out how to use Ghostscript, gs(1).
Is it possible to learn to use gs yourself, even if you don't work at Google? Who knows? Is it possible to use gs without installing a new operating system? Who knows?
They have also discovered that pdf viewers are slower than web browsers.
Who knows, maybe they'll discover that image viewers are faster than web browsers.
Let's wait and see.
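
For anyone who does want to try gs themselves, here is a minimal Python sketch of a PDF-to-PNG conversion, one image per page. It assumes Ghostscript is installed and available as gs on the PATH; the helper names, the output file pattern and the resolution are made up for illustration. Nothing in the post confirms that Google's viewer actually uses Ghostscript; that is only the guess made in the comment above.

```python
import subprocess


def build_gs_command(pdf_path, output_pattern="page-%03d.png", dpi=100):
    """Build a Ghostscript command that renders each PDF page to a PNG.

    The output pattern and dpi defaults are illustrative assumptions.
    """
    return [
        "gs",
        "-dNOPAUSE",          # do not pause between pages
        "-dBATCH",            # exit when the input has been processed
        "-dSAFER",            # restrict file access while rendering
        "-sDEVICE=png16m",    # 24-bit colour PNG output device
        f"-r{dpi}",           # rendering resolution in dots per inch
        f"-sOutputFile={output_pattern}",  # %03d becomes the page number
        pdf_path,
    ]


def pdf_to_png(pdf_path, **options):
    """Run Ghostscript; raises CalledProcessError if gs reports a failure."""
    subprocess.run(build_gs_command(pdf_path, **options), check=True)


# Show the command without running it:
print(" ".join(build_gs_command("hamlet.pdf", dpi=150)))
```

Calling pdf_to_png("hamlet.pdf") would write page-001.png, page-002.png and so on into the current directory.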