An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

December 8, 2007

Google Starts to Index Images Uploaded to Blogger

Hard as it may be to believe, Google Image Search only started to index images uploaded to Blogger in December 2007. Until this month, these images were blocked from search engine indexing for unknown reasons. The change is closely related to the fact that images hosted at Picasa Web Albums have also started to be indexed by Google.

The chart from Google Analytics shows the number of referrals from images.google.com for this blog:


... and here are some results from Google Image Search:


The images uploaded from Blogger's interface are hosted at Picasa Web Albums, but they're also available at subdomains like bpX.blogger.com, where X is a digit. Another Blogger oddity, inherited from Picasa Web, is that you can't directly link to an image (if you click on a link to one of the two images uploaded above, you'll see a dialog that asks you to download the image). Blogger even has a workaround for this silly restriction: it automatically creates web pages that include the pictures (here's a link to the same image, but this time the image is included in a web page).
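
A download dialog like this is what browsers typically show when a response carries a Content-Disposition: attachment header, which is what one of the comments below reports for these images. Here is a minimal Python sketch of that check. The function name and the sample header values are hypothetical and were not captured from Blogger's servers.

```python
def forces_download(headers):
    """Return True if the response headers ask the browser to download the body.

    `headers` is a plain dict of header names to values (hypothetical input;
    in practice it would come from an HTTP client).
    """
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    disposition = normalized.get("content-disposition", "")
    # The disposition type is the token before any ";parameter" list.
    disposition_type = disposition.split(";", 1)[0].strip().lower()
    return disposition_type == "attachment"


# Hypothetical header sets, not real responses from bpX.blogger.com.
image_as_download = {"Content-Type": "image/png", "Content-Disposition": "attachment"}
image_inline = {"Content-Type": "image/png"}

print(forces_download(image_as_download))  # True
print(forces_download(image_inline))       # False
```

With the header present, the browser saves the file instead of rendering it, even though the content type says it is an image.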


Images uploaded before August 2006, when Blogger launched its latest major upgrade, are still not crawlable.

8 comments:

  1. "You can't directly link to an image (if you click on a link to one of the two images uploaded above, you'll see a dialog that asks you to download the image)"

    That's just Firefox acting stupid. Opera works fine.

  2. That's Opera acting stupid then. The image should be downloaded, because Google gives it an application/something tag, and it is wrong to show it as an image in that case, since this is a widely used technique to make any file downloadable.

  3. Ionut, there's one thing I don't understand:
    1- Pictures uploaded from Blogger are stored in Picasa Web AS PRIVATE, right? And private albums are not indexed.

    2- And even if you make them public... images shouldn't be indexed unless you click the checkbox saying "Make my public albums searchable"

    So....?

  4. Blogger adds this HTTP header:

    Content-Disposition:attachment

    which is used to force a file download.

    @Richard:
    Well, the images are stored both at bpX.blogger.com (Blogger) and lhX.google.com (Picasa Web Albums). Example for one of the images from this post:

    (Blogger)
    bp0.blogger.com/_ZaGO7GjCqAI/R1r3POlcqbI/AAAAAAAAG14/rt5jrfj3Hrk/s640/image-search-results-from-blogspot.png

    (Picasa Web)
    http://lh3.google.com/picasaalbums/R1r3POlcqbI/AAAAAAAAG14/LQrAODi9fFg/s640/image-search-results-from-blogspot.png

  5. @Anonymous:
    That's inaccurate. Opera treats the header the same as other browsers so you'll see the same download dialog (tested in Opera 9.23/Windows).

  6. Blogger forces image download to prevent cross-site scripting.

    Internet Explorer's overzealous MIME type sniffing can actually decide that a valid PNG image, served as image/png, is an HTML file if it encounters any HTML-like markup in the first 256 bytes of the file. So a malicious user could create a PNG image with a script tag in the comment header, and voilà, they have script running on the Google.com domain name!

    Fortunately, by forcing a download, the image can never be rendered in IE as an HTML file from the Google.com domain name.

  7. "2- And even if you make them public... images shouldn't be indexed unless you click the checkbox saying "Make my public albums searchable""

    I'm not sure why you would expect anything you publish publicly on the web not to be searchable by default. If you upload a picture anywhere else it will be searchable, so why not on Picasa?

  8. IrfanView, yeh!:))

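
The content-sniffing concern raised in comment 6 can be illustrated with a rough Python sketch. It assumes, as that comment describes, that only the first 256 bytes of the file are inspected. The list of markup hints and the function name are hypothetical, and this is not Internet Explorer's actual algorithm.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Assumption taken from comment 6: only the first 256 bytes are examined.
SNIFF_WINDOW = 256

# Hypothetical list of HTML-like markers; a real sniffer may use a different set.
HTML_HINTS = (b"<html", b"<script", b"<body")


def could_be_sniffed_as_html(data, window=SNIFF_WINDOW):
    """Return True if HTML-like markup appears within the first `window` bytes."""
    head = data[:window].lower()
    return any(hint in head for hint in HTML_HINTS)


# Byte strings that merely start with the PNG signature; they are not complete PNG files.
crafted = PNG_SIGNATURE + b"<script>alert(1)</script>" + b"\x00" * 64
harmless = PNG_SIGNATURE + b"\x00" * 512

print(could_be_sniffed_as_html(crafted))   # True
print(could_be_sniffed_as_html(harmless))  # False
```

Serving the file as a forced download sidesteps the problem described in the comment, because the browser never gets the chance to render the bytes as a page.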