Google has a new search API that can only be used for research. Another educational-only API allows programmatic access to detailed results obtained by Google's machine translation.
"The University Research Program for Google Search is designed to give university faculty and their research teams high-volume programmatic access to Google Search, whose huge repository of data constitutes a valuable resource for understanding the structure and contents of the web. Our aim is to help bootstrap web research by offering basic information about specific search queries. Since the program builds on top of Google's search technology, you'll no longer have to operate your own crawl and indexing systems."
Although this new API is less limited than the now-unsupported SOAP API, you can't send more than a single query in one second and you can't use it to display "interactive search results for end users".
Greg Linden thinks that Google should be more open. "It is good that Google is making tools available to researchers, but they may have to go further than a throttled search API. As is, many researchers trying to work at large scale still will have to build their own crawls and indexes."
Other search engines offer less restrictive APIs, but Google tries to protect its most valuable asset as much as possible. The only official Google Search API is limited to client-side code and returns just the top 8 search results.
The translation service "provides researchers, in the field of automatic machine translation, tools to help compare and contrast with, and build on top of, Google's statistical machine translation system". For example, you can request a list of the best possible translations of a text with detailed information about the scores.
Pffft. Just because I'm not in university doesn't mean I can't do some research.
Google:
> The program must not be used to
> display or retrieve interactive
> search results for end users.
What about research involving end users?
You:
> you can't send more than a single
> query in one second and you can't
> use it to display "interactive
> search results for end users".
Perhaps you're allowed to use multi-threading to send more in 1 second.
> What about research involving end users?
Maybe it's more difficult to prove you're actually doing research.
> Perhaps you're allowed to use multi-threading to send more in 1 second.
I don't think so. "A time period of at least one second must be allowed between requests." So you can get at most 86,400 queries/day (24 × 60 × 60 seconds). Yahoo's API offers 5,000 queries/IP/day and Microsoft gives 25,000 queries a day.
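To make the arithmetic concrete, here is a minimal sketch of a client-side throttle that spaces requests at least one second apart, even when several threads share it. The names `MinIntervalThrottle` and `max_queries_per_day` are hypothetical, the program's actual endpoint and client library are not shown, and the sketch assumes the interval is measured between the moments at which requests start, which the quoted rule does not spell out.

```python
import threading
import time


class MinIntervalThrottle:
    """Space calls at least `min_interval` seconds apart (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._last = None  # start time of the previous request, if any

    def wait(self):
        # The lock serializes callers, so extra threads simply queue up here
        # and cannot push the overall rate above one request per interval.
        with self._lock:
            now = self._clock()
            if self._last is not None:
                remaining = self.min_interval - (now - self._last)
                if remaining > 0:
                    self._sleep(remaining)
                    now = self._clock()
            self._last = now


def max_queries_per_day(min_interval=1.0):
    """Upper bound on requests in 24 hours for a given minimum spacing."""
    return int(24 * 60 * 60 / min_interval)


if __name__ == "__main__":
    print(max_queries_per_day())  # 86400 with the default one-second spacing
```

Each worker would call `throttle.wait()` on one shared instance before sending a query. Because every caller goes through the same lock, adding threads changes who sends the next request, not how many are sent per second.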
According to the documentation for the University Research Program for Google Translate, this is the Google RS2 service I found in June last year.