An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

April 26, 2007

Ranking Web Pages Based on Their History

A new Google patent describes several scores that could be used to rank search results. The scores draw on a document's history, from the moment Google first finds it to the present. That history could help Google decide whether a page's content is fresh, still useful, or outdated.

"Search engine may use the inception date of a document for scoring of the document. For example, it may be assumed that a document with a fairly recent inception date will not have a significant number of links from other documents (i.e., back links). For existing link-based scoring techniques that score based on the number of links to/from a document, this recent document may be scored lower than an older document that has a larger number of links (e.g., back links)."

"For some queries, documents with content that has not recently changed may be more favorable than documents with content that has recently changed. As a result, it may be beneficial to adjust the score of a document based on the difference from the average date-of-change of the result set. In other words, search engine may determine a date when the content of each of the documents in a result set last changed, determine the average date of change for the documents, and modify the scores of the documents (either positively or negatively) based on a difference between the documents' date-of-change and the average date-of-change. "

"Documents for which there is an increase in the rate of change might be scored higher than those documents for which there is a steady rate of change, even if that rate of change is relatively high. The amount of change may also be a factor in this scoring."

"Using this date as a reference, search engine may then monitor the time-varying behavior of links to the document, such as when links appear or disappear, the rate at which links appear or disappear over time, how many links appear or disappear during a given time period, whether there is trend toward appearance of new links versus disappearance of existing links to the document, etc. (...) By analyzing the change in the number or rate of increase/decrease of back links to a document (or page) over time, search engine may derive a valuable signal of how fresh the document is."

If a page still gets new links a year after it was created, Google might assume it's still useful. If a page is constantly updated (like Wikipedia pages), its content could be more relevant to the reader. Simple rules like these could push outdated pages out of the top results.

{ via Russel Shaw. }
