A leaked copy of Google's quality rater guidelines (PDF), used internally to evaluate the quality of search results, reveals some interesting things about Google's approach to search. "According to the document, which is dated April 2007 and at least looks legitimate, a quality rater has the job to first research and understand a specific search query – say [cell phones] –, to then look at the quality of a website returned for this query," notes Philipp Lenssen.
Queries can be navigational (the user has a single site in mind, the official homepage of a company/product), informational (searching for information), transactional (trying to obtain something: buy a product, download a video) or a combination of these categories. Depending on the query, search results can be: vital (for navigational queries with a dominant interpretation), useful (comprehensive, authoritative resources), relevant (pages that don't cover all the aspects of a query), not relevant (marginally related to the query) or off-topic.
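The two taxonomies above can be sketched as a small data model. This is only an illustration: the names `QueryIntent`, `Rating`, and `can_be_vital` are hypothetical, and the numeric ordering of the ratings is an assumption, since the guidelines as summarized here do not define a data format.

```python
from enum import Flag, IntEnum, auto


class QueryIntent(Flag):
    """Query categories; a Flag lets one query combine several intents."""
    NAVIGATIONAL = auto()
    INFORMATIONAL = auto()
    TRANSACTIONAL = auto()


class Rating(IntEnum):
    """Rating scale, ordered from worst to best (the numbers are assumed)."""
    OFF_TOPIC = 0
    NOT_RELEVANT = 1
    RELEVANT = 2
    USEFUL = 3
    VITAL = 4


def can_be_vital(intent: QueryIntent, has_dominant_interpretation: bool) -> bool:
    """'Vital' applies to navigational queries with a dominant interpretation."""
    return bool(intent & QueryIntent.NAVIGATIONAL) and has_dominant_interpretation


# A query that is both navigational and transactional.
mixed = QueryIntent.NAVIGATIONAL | QueryIntent.TRANSACTIONAL
print(can_be_vital(mixed, True))
```

Using a flag type rather than a plain enumeration reflects the point that a query can fall into a combination of categories.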
According to the guidelines, queries and search results should match in generality: broad queries are best matched by broad pages, specific queries by specific pages. Search results must also take into account the dominant interpretation of a query in a given location and at a given moment.
Spam is treated separately from the evaluation of search results. A web page may be spammy even if it's considered "vital" for some queries or is very authoritative. "Webspam is the term for web pages that are designed by webmasters to trick search engine robots and direct traffic to their websites," explains Google. Web pages that include ads and content scraped from other sites, but add no original information, are considered spam. "When trying to decide if a page is Spam, it is helpful to ask yourself this question: If I remove the scraped (copied) content, the ads, and the links to other pages, is there anything of value left? If the answer is no, the page is probably Spam."
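The question quoted above is meant for human raters, but its logic can be sketched in a few lines. In this minimal sketch the `Block` type, the kind labels, and the `min_words` threshold are all hypothetical; it assumes someone has already labeled each part of the page, which is the hard part the rater does by judgment.

```python
from dataclasses import dataclass

# Hypothetical labels for the parts of a page the rater is told to remove.
REMOVABLE_KINDS = {"scraped", "ad", "link"}


@dataclass
class Block:
    kind: str  # assumed labels: "original", "scraped", "ad", or "link"
    text: str


def is_probably_spam(blocks, min_words=1):
    """Remove scraped content, ads, and links; check whether anything is left."""
    remaining = [b for b in blocks if b.kind not in REMOVABLE_KINDS]
    words_left = sum(len(b.text.split()) for b in remaining)
    # The word-count threshold is an assumed stand-in for "anything of value".
    return words_left < min_words


page = [Block("ad", "Buy now"), Block("scraped", "Text copied from another site")]
print(is_probably_spam(page))
```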
{ via Google Blogoscoped and SEO Book }
This explains why Google's search engine puts more weight on blogs.
Blogs have tags, which can serve as query keywords. The relationship between keywords and content is closer in a blog.
This also explains why Google hates spam blogs. For example, if you post to a blog too fast, say, with MS Live Writer, publishing may be blocked for a moment by a CAPTCHA.