June 3, 2007

Google Shows More Fresh Results

The New York Times has a long article about Google's search quality team and how it constantly improves the ranking algorithms.

"Search over the last few years has moved from Give me what I typed to Give me what I want", says Amit Singhal from Google. His team tries to find patterns in the list of queries that return bad results, obtained from other Googlers or from users. Tweaking the ranking algorithm to favor some web pages in certain conditions is difficult because the results may change in unexpected ways.

One of the most important patterns from last year was that people expected to see fresh pages for queries related to recent events. For example, a search for "Google Finance" didn't return Google's financial site for many days after the launch.

Mr. Singhal introduced the freshness problem, explaining that simply changing formulas to display more new pages results in lower-quality searches much of the time. He then unveiled his team's solution: a mathematical model that tries to determine when users want new information and when they don't. (And yes, like all Google initiatives, it had a name: QDF, for "query deserves freshness.") (...)

The QDF solution revolves around determining whether a topic is "hot." If news sites or blog posts are actively writing about a topic, the model figures that it is one for which users are more likely to want current information. The model also examines Google's own stream of billions of search queries, which Mr. Singhal believes is an even better monitor of global enthusiasm about a particular subject.
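To make the idea of a "hot" topic more concrete, here is a minimal Python sketch. It is only an illustration of the general approach described in the article, not Google's model: the `TopicActivity` fields, the add-one smoothing, the 0.6 query weight and the threshold of 3.0 are all assumptions made up for this example. It compares recent activity with a historical baseline and gives the query stream a bit more weight than news and blog coverage, since Mr. Singhal considers it the better signal.

```python
from dataclasses import dataclass


@dataclass
class TopicActivity:
    """Hypothetical counts for one topic; none of these fields come from Google."""
    recent_news_mentions: int       # news/blog posts in the recent window
    baseline_news_mentions: float   # typical posts per window, historically
    recent_queries: int             # searches in the recent window
    baseline_queries: float         # typical searches per window, historically


def burst_ratio(recent: float, baseline: float) -> float:
    """Return how many times above its usual level the recent activity is."""
    # Add-one smoothing keeps the ratio finite for topics with no history.
    return (recent + 1.0) / (baseline + 1.0)


def is_hot(activity: TopicActivity,
           query_weight: float = 0.6,
           threshold: float = 3.0) -> bool:
    """Treat a topic as hot when the weighted burst reaches the threshold.

    The weight and threshold are arbitrary illustration values.
    """
    news = burst_ratio(activity.recent_news_mentions,
                       activity.baseline_news_mentions)
    queries = burst_ratio(activity.recent_queries,
                          activity.baseline_queries)
    score = query_weight * queries + (1.0 - query_weight) * news
    return score >= threshold


# A topic that suddenly gets far more coverage and searches than usual.
launch = TopicActivity(recent_news_mentions=50, baseline_news_mentions=4,
                       recent_queries=900, baseline_queries=99)
# A topic with the same level of activity as always.
steady = TopicActivity(recent_news_mentions=4, baseline_news_mentions=4,
                       recent_queries=100, baseline_queries=99)

print(is_hot(launch), is_hot(steady))
```

A real system would presumably work on continuous streams and many more signals; the point here is only that "hot" means unusually active compared with the topic's own past, not simply popular.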

The visible part of QDF is the recently launched Hot Trends site, but this is just the tip of the iceberg. For queries related to things that are suddenly popular, Google's ranking algorithms are biased towards recent web pages. You may see results from Google News inside the search results pages or a blog search OneBox at the bottom of the page. Google also seems to be crawling pages at a much faster pace, and not only on popular, frequently updated sites as it did before. I often see some of my posts in the search results hours after they're published.
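The bias towards recent pages for suddenly popular queries could be sketched roughly as follows. This is a toy example under stated assumptions, not how Google actually ranks pages: the exponential decay, the 24-hour half-life, the `boost_weight` parameter and the sample URLs and scores are all hypothetical. The key point is that the boost is applied only when the topic is considered hot, which matches the article's remark that simply favoring new pages everywhere lowers quality.

```python
from typing import List, Tuple

# One search result: (url, base relevance score, page age in hours).
Result = Tuple[str, float, float]


def freshness_boost(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential decay: 1.0 for a brand-new page, 0.5 after one half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def rerank(results: List[Result],
           topic_is_hot: bool,
           boost_weight: float = 1.0) -> List[str]:
    """Return URLs ordered by score, boosting fresh pages only for hot topics."""
    def score(item: Result) -> float:
        _, base, age_hours = item
        if not topic_is_hot:
            # Ordinary queries keep their usual ranking.
            return base
        return base * (1.0 + boost_weight * freshness_boost(age_hours))

    ordered = sorted(results, key=score, reverse=True)
    return [url for url, _, _ in ordered]


# Hypothetical results: an established page and a two-hour-old post.
results = [
    ("example.com/established-page", 1.0, 24.0 * 365),
    ("example.com/new-post", 0.7, 2.0),
]

print(rerank(results, topic_is_hot=False))
print(rerank(results, topic_is_hot=True))
```

With these made-up numbers, the established page stays first for an ordinary query, while the new post moves to the top once the topic is flagged as hot.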

