March 16, 2007

XML Output for Google Search Results

Note: This tool may conflict with Google's TOS, but I thought it was interesting enough to tell you about it.

Many people would like to have feeds for Google search results. They could use them to monitor keywords or to build web applications. But Google hasn't shown any interest in providing this feature; moreover, it cut support for the SOAP API.

In an interesting twist, someone realized that Google does have a way to return results as XML, but you need to do some work to retrieve them. The URL looks the same as the standard URL for a Google search, except that you have to add two new parameters:

* ch=[value of a checksum]
* client=navclient-auto

Basically, you pretend to be Google Toolbar (which explains the client parameter) and add a checksum for the query, computed with an algorithm similar to the one behind the checksum used to find the PageRank value. Unlike the API, you won't be subject to any usage limits (although Google might realize you're not Google Toolbar).
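To make the URL structure concrete, here is a minimal Python sketch that only assembles the request URL. The function name build_xml_search_url, the SEARCH_URL constant, and the sample checksum string are made up for illustration. The checksum algorithm itself is not described in this post, so the sketch takes the checksum as an already-computed value instead of guessing how to derive it.

```python
from urllib.parse import urlencode

# Assumption: the standard Google search endpoint. The post only says the URL
# "looks the same as the standard URL for a Google search".
SEARCH_URL = "http://www.google.com/search"


def build_xml_search_url(query: str, checksum: str) -> str:
    """Return a search URL carrying the two extra parameters from the post.

    `checksum` must be computed elsewhere; this sketch does not implement
    the checksum algorithm, which the post does not spell out.
    """
    params = {
        "q": query,                  # the usual search query parameter
        "client": "navclient-auto",  # identifies the request as Google Toolbar
        "ch": checksum,              # checksum of the query
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # "PLACEHOLDER" is not a valid checksum; it only shows where the value goes.
    print(build_xml_search_url("xml feeds", "PLACEHOLDER"))
```

The sketch stops at building the URL: it sends no request and makes no claim about what the server returns.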

The code and some demos are available here.

Homework:
1. How does this code breach Google's TOS more than screen scraping does?
2. Do you know where this feature is used in Google Toolbar?

5 comments:

  1. Let's have a pool on how many days this service lasts.

  2. Obviously this doesn't work anymore. Is there any way to obtain XML results as of now?

  3. Is it possible to get Google results as an XML file?

  4. AFAIK XML results are only delivered with Google's paid site search, which is $100 a year unless you need to search a huge web site.

  5. Does this script work? I want to get the search results in XML.
