Tuesday, April 22, 2008

Google Search REST API

More than a year after Google discontinued the SOAP Search API, it finally has a proper replacement. The AJAX Search API can now be used from any Web application, not just from JavaScript. The other two Google AJAX APIs, for feeds and translations, were also updated for non-AJAX use.

"For Flash developers, and those developers that have a need to access the AJAX Search API from other Non-Javascript environments, the API exposes a simple RESTful interface. In all cases, the method supported is GET and the response format is a JSON encoded result set with embedded status codes."

"Using the APIs from your Flash or Server Side framework couldn't be simpler. If you know how to make an http request, and how to process a JSON response, you are in business," says Mark Lucovsky. Here's a simple example for web search:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Earth%20Day
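
As a rough sketch of what such a request could look like from a server-side language, the Python snippet below builds the example URL above and unpacks a JSON reply. The endpoint and the `v` and `q` parameters come from the example; the field names `responseData`, `responseStatus`, `responseDetails`, `results`, `url` and `titleNoFormatting`, as well as the sample payload, are assumptions made for illustration and should be checked against the official documentation.

```python
import json
from urllib.parse import quote, urlencode

# Endpoint taken from the example URL above.
BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(query, version="1.0"):
    """Return the GET URL for a web search query."""
    # quote_via=quote encodes spaces as %20, matching the example URL.
    params = urlencode({"v": version, "q": query}, quote_via=quote)
    return BASE_URL + "?" + params


def parse_response(body):
    """Decode a JSON reply and return its list of results.

    The status code is embedded in the JSON body, so it is checked here
    instead of relying on the HTTP status. Field names are assumed.
    """
    data = json.loads(body)
    if data.get("responseStatus") != 200:
        raise RuntimeError(data.get("responseDetails") or "search request failed")
    return data["responseData"]["results"]


# Made-up sample reply, used here instead of a live request.
SAMPLE_BODY = json.dumps({
    "responseData": {
        "results": [
            {"url": "http://example.com/", "titleNoFormatting": "Example result"}
        ]
    },
    "responseDetails": None,
    "responseStatus": 200,
})

if __name__ == "__main__":
    print(build_search_url("Earth Day"))
    for result in parse_response(SAMPLE_BODY):
        print(result["titleNoFormatting"], result["url"])
```

In a real client, the body passed to `parse_response` would come from an HTTP GET on the URL returned by `build_search_url`.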

There are some differences between the old SOAP API and the REST one.

PROs:
- the new API doesn't require a key
- there's no limit on the number of queries
- it's much easier to use
- you can use the REST API for web search, but also for image search, news search, video search, local search, blog search and book search.

CONs:
- you need to send "a valid and accurate http referer header"
- you can only get up to 8 results in a single call and you can't go beyond the first 32 results
- the terms of use are pretty restrictive: for example, you need to attribute the results to Google and you are not allowed to change the order of search results.
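
To make the first two restrictions concrete, here is a hedged Python sketch that prepares requests carrying a Referer header and steps through the 32 available results, 8 at a time. The `start` and `rsz` parameter names and the referer URL are assumptions for illustration, not something confirmed by the quotes above; the requests are only constructed, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"
PAGE_SIZE = 8      # at most 8 results per call
MAX_RESULTS = 32   # nothing beyond the first 32 results


def page_offsets(page_size=PAGE_SIZE, max_results=MAX_RESULTS):
    """Return the start offsets needed to cover all reachable results."""
    return list(range(0, max_results, page_size))


def build_request(query, start, referer):
    """Prepare (but do not send) a GET request with a Referer header.

    "rsz" and "start" are assumed parameter names for the page size
    and the result offset.
    """
    params = urlencode(
        {"v": "1.0", "q": query, "rsz": "large", "start": start},
        quote_via=quote,
    )
    return Request(BASE_URL + "?" + params, headers={"Referer": referer})


if __name__ == "__main__":
    # Hypothetical referer: the page on your site that shows the results.
    referer = "http://www.example.com/search"
    for offset in page_offsets():
        request = build_request("Earth Day", offset, referer)
        print(request.full_url, request.get_header("Referer"))
```

Each prepared request could then be passed to `urllib.request.urlopen`; four calls are enough to exhaust everything the API returns for a query.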

It's interesting to note that Yahoo's search APIs are more developer-friendly: although they require an application ID and have some usage limits (5,000 queries per IP per day), they offer more features and are more flexible, since they also include XML output. Another important difference is that Yahoo doesn't require "a valid and accurate http referer header".

Philipp Lenssen suggests that it's much easier to just screenscrape the results, but search engines could change their code or block your requests.

Update. Check this excellent interview with Mark Lucovsky, who mentions that the API has been available for almost two years, but it wasn't officially documented:


6 comments
is it really that difficult to create a 'valid and accurate http referer'? how much work is that? 30 seconds? more? less?

as for Yahoo having a 'more developer-friendly' api - one could argue that. for instance, would you rather develop in python or cobol? that's your choice with search: you can search with either google or yahoo.

also, the obvious. the lesser search engine _has_ to provide more liberal usage terms, else nobody would even bother at all, would they?

:)
Hi, I've just launched my new Job Search site JobGeni.com (beta) - The Spider that aggregates the best job boards on the web with the new Google AJAX Feed API.
No nightly spidering, no big servers, just realtime rss feed search! have a look:
http://www.jobgeni.com/
What about the Live Search APIs? They are SOAP-only and require a key, but they allow 25K calls a day and up to 50 results per page, a lot more than what Yahoo offers, and they come with fewer strings attached than Google's.
And you get image search and phonebook search too.
They are not too easy to use, but there is sample code in Flash out there, and it is not rocket science.
actually, i don't think a referer is needed. cut'n'paste of the examples from http://code.google.com/apis/ajaxsearch/documentation/reference.html#_intro_fonje into a new tab works.
> but search engines could change
> their code or block your requests

That's a valid downside to screenscraping. Just a side-note: even APIs sometimes change their output, and if you're unlucky, that may break your code. I remember that Yahoo's REST API changed their output one day by adding a seemingly harmless XML-namespace declaration on top. However, these XMLNS thingies happen to confuse the PHP XML DOM parser so much that it won't be able to work with it anymore (unless you adjust it with a workaround).
Do you have a search API for sponsored links?
(e.g. the link for the string "lawyer" is http://www.google.com/sponsoredlinks?q=lawyer&hl=en&um=1&ie=UTF-8)