An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

March 19, 2007

How Google Blog Search Ranks Results

Unlike most blog search engines, Google Blog Search ranks the results by relevancy. You can change that by clicking on "sort by date", but the default option is useful if you want to find the most significant blog posts about a topic. But how does Google rank blog posts?

A new patent application gives us some answers. Google uses indicators (signals) that reflect the quality of a blog or of a blog post.

Positive signals
  • links from blogrolls (especially from high-quality blogrolls or blogrolls of "trusted bloggers")
  • links from other sources (mail, chats)
  • using tags to categorize a post
  • PageRank
  • the number of feed subscriptions (from feed readers)
  • clicks in search results

Negative signals (spam signals)
  • posts added at a predictable time
  • different content between the site and the feed
  • the amount of duplicate content
  • using words/n-grams that appear frequently in spam blogs
  • posts that have identical size
  • linking to a single web page
  • a large number of ads
  • the location of ads ("the presence of ads in the recent posts part of a blog")
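
To picture how signals like these could be mixed into one number, here is a minimal sketch in Python. It assumes a plain weighted sum in which positive signals add to the score and negative ones subtract from it. The signal names, the weights and the idea that each signal is normalized to the 0..1 range are all hypothetical; they are only for illustration and are not taken from Google.

```python
# Hypothetical weights: the signals are named in the list above,
# but the numbers here are invented for illustration only.
POSITIVE_WEIGHTS = {
    "blogroll_links": 1.0,
    "uses_tags": 0.5,
    "feed_subscriptions": 1.0,
}
NEGATIVE_WEIGHTS = {
    "duplicate_content": 1.5,
    "spam_ngrams": 2.0,
    "ad_density": 1.0,
}


def quality_score(signals):
    """Return a weighted sum of signals, each assumed to lie in 0..1.

    Positive signals raise the score, negative (spam) signals lower it.
    Signals missing from the input count as 0.
    """
    positive = sum(w * signals.get(name, 0.0) for name, w in POSITIVE_WEIGHTS.items())
    negative = sum(w * signals.get(name, 0.0) for name, w in NEGATIVE_WEIGHTS.items())
    return positive - negative
```

With these made-up weights, a blog that shows up in many blogrolls and tags its posts gets a positive score, while one stuffed with spammy n-grams goes negative.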

To rank the search results, Google combines a quality score obtained by mixing those signals with a relevance score (IR score) that depends on the query. "The IR score may be determined based on the number of occurrences of the search terms in the document. The IR score may be determined based on where the search terms occur within the document (e.g., title, content, etc.) or characteristics of the search terms (e.g., font, size, color, etc.). A search term may be weighted differently from another search term when multiple search terms are present. The proximity of the search terms when multiple search terms are present may influence the IR score." (the quote was slightly altered for clarity)
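
As a rough illustration of the quote, the sketch below counts how often the search terms occur and weights a match in the title more than a match in the content, then adds a query-independent quality score. The field weights, the `quality` key and the linear way the two scores are combined are assumptions made for this example. Term weighting and proximity, which the quote also mentions, are left out to keep it short.

```python
import re

# Hypothetical field weights: a term found in the title counts more
# than the same term found in the body of the post.
FIELD_WEIGHTS = {"title": 3.0, "content": 1.0}


def ir_score(query, post):
    """Count occurrences of each query term, weighted by where it occurs."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = re.findall(r"\w+", post.get(field, "").lower())
        for term in terms:
            score += weight * words.count(term)
    return score


def rank(query, posts, quality_weight=0.5):
    """Sort posts by IR score plus a weighted quality score (assumed linear mix)."""
    def combined(post):
        return ir_score(query, post) + quality_weight * post.get("quality", 0.0)
    return sorted(posts, key=combined, reverse=True)
```

In this toy version a very relevant post from a low-quality-score blog can still outrank a barely relevant post from a popular one, because the relevance part depends on the query and the quality part does not.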

We learned that Google uses all kinds of factors to determine the popularity and the quality of a blog, but that doesn't mean less popular blogs are left out if they have relevant content. You should also try to avoid all the negative signals that may indicate your blog is spammy.

{ via Search Engine Roundtable }

14 comments:

  1. About the only one I don't understand is "predictable time" (?)

  2. There are two different types of timings related to posts in the patent application.

    The writers state that spam blogs are posted to either with a lot of posts in short bursts, or at very consistent and predictable time intervals, such as every three hours and 43 minutes.

  3. Interesting that they count simply the USE of tags as a positive indicator. It's something I've been doing myself for some time, judging the care and attention someone puts into a post by the care they put into categorisation. Judging the categorisation is quicker than reading a post!

  4. How long before blog-spam tools adapt to this information?

    How many years before computers can publish original-looking fake blogs that are good enough to fool a human? With a million "original" blogs clogging up the rankings a spammer can do a lot of spamvertising.

  5. Yup, spam blogs will catch up, and I trust that Google will adapt to treat them as such. It's the typical game of leap-frog with one side trying to out-better the other.

  6. I have a few questions about this which I've voiced here. Does anyone know the answer to these questions?

    http://www.seoblogpro.com/archives/google/make-google-blog-search-fall-in-love-with-your-blog/

  7. Can anyone define "Trusted Bloggers"? How is Google coming up with this list?

  8. They certainly don't have a hand-built list of popular blogs. Google could use PageRank, the number of subscribers from feed readers and other factors to compile a list of popular blogs. One problem with top blogs is that they tend to link only to other top blogs.

  9. Excellent scoop! Thanks a lot for the brand new information.

    I wrote a post about this on my blog (Blogging India), based on your findings!

    Tim, what they call "trusted bloggers" are those high-PR, high-traffic, high-readership top blogs that have been in the blogosphere for at least 3-5 years and have a good reputation... When it comes to Google, it's not the opinion that counts, but the content quality, readership and reputation.

  10. Spammers may use this info.

  11. This is a great site. Thank you for your information. I THANK YOU, I SALUTE YOU, IT'S AN AMAZING SITE.

  12. All this, of course, relies on your site being indexed by the search engine. Google seems to take ages to do this... and even then you might still not be indexed.

  13. I trust that Google will adapt to treat them as such. It's the typical game of leap-frog with one side trying to out-better the other.

  14. I'm curious why the top rankings and ads on the blog search home page are mostly right-wing, anti-Obama blogs. How can the same blogs top the list day after day? It only makes sense if it's fixed or the blogs know how to game the search engine.

