February 16, 2011

More About Google's Reading Level Filter

Google's Daniel M. Russell has more information about the reading level filter, a feature recently added to the advanced search page.

The reading-level is based primarily on statistical models we built with the help of teachers. We paid teachers to classify pages for different reading levels, and then took their classifications to build a model of the intrinsic complexity of the text. (...) We also used data from Google Scholar, since most of the articles in Scholar are considered advanced.

So the breakdown isn't grade- or age-specific, but reflects the judgments of teachers as to overall level of difficulty. Roughly speaking, "Basic" is elementary level texts, while "Intermediate" is anything above that level up to technical and scholarly articles, a la the articles you'd find in Scholar.

That's not exact, but it's a fairly robust model that works across a wide variety of different text styles and web pages.

Unfortunately, the feature works only for English, and adding support for other languages is probably difficult.
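
The approach Russell describes, in which a small set of human-labeled pages is used to build a statistical model that then classifies other text, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: Google has not published its features or model, so the two features (average sentence length and average word length), the nearest-centroid classifier, and the hand-written training sentences are hypothetical stand-ins, not the real system.

```python
import re
from collections import defaultdict


def features(text):
    """Return (average words per sentence, average letters per word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return (0.0, 0.0)
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    return (avg_sentence_len, avg_word_len)


def train(labeled_texts):
    """Average the feature vectors per label to get one centroid per level."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for text, label in labeled_texts:
        sentence_len, word_len = features(text)
        acc = sums[label]
        acc[0] += sentence_len
        acc[1] += word_len
        acc[2] += 1
    return {label: (a / n, b / n) for label, (a, b, n) in sums.items()}


def classify(model, text):
    """Pick the level whose centroid is closest to the text's features."""
    f = features(text)

    def squared_distance(label):
        c = model[label]
        return (c[0] - f[0]) ** 2 + (c[1] - f[1]) ** 2

    return min(model, key=squared_distance)


# Hypothetical stand-ins for the pages that teachers classified.
TRAINING = [
    ("The cat sat. The dog ran. We had fun.", "basic"),
    ("I like my red ball. It is big. I play all day.", "basic"),
    ("The town council voted to build a new library near the old "
     "train station.", "intermediate"),
    ("Many birds fly south each winter because food becomes hard "
     "to find.", "intermediate"),
    ("Stochastic gradient methods approximate the empirical risk "
     "minimizer by iteratively sampling observations, thereby trading "
     "statistical precision for computational tractability in "
     "high-dimensional estimation problems.", "advanced"),
    ("Longitudinal analyses demonstrate that socioeconomic heterogeneity "
     "significantly moderates the relationship between educational "
     "attainment and subsequent occupational mobility across successive "
     "generational cohorts.", "advanced"),
]

model = train(TRAINING)
print(classify(model, "We go to the park. I see a duck."))
```

A production system would presumably use far more labeled pages, many more features, and a properly scaled model; the sketch only shows how a handful of human judgments can generalize to unseen text.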


  1. How can humans categorize every website's reading level?
    Does having an "Advanced" reading level hurt a website's search engine ranking?

    I hope these are only filters and not ranking factors.

    1. Google is likely using machine learning algorithms to extract features from pages that professionals have classified at the different levels. It then uses these data to train an algorithm that can classify future pages based on those features.

  2. Teachers only categorized a small number of pages and the results were used to create a statistical model.

  3. I am currently studying English as a second language, and I am thinking about creating an app to make it easier for people like me.

    To create such an app, two filters need to be applied to Google Search:
    1. Grammar rules filter: present continuous (I am doing), future time (will), reported speech, etc. (filtering for texts that are full of examples of the selected rule)
    2. Reading level filter: basic, intermediate, advanced.

    The second filter (reading level) is not a problem, as it already exists in Google's advanced search. But the first filter (grammar rules) does not exist. Is it possible to create that?

    Here is how it should work: you enter the query and set your language skill level and the grammar rule that you want to practice by reading. Then you press ENTER and it returns filtered results from Google from around the globe, with the option of further filtering using Google's native filters.

  4. So in addition to assuming that I now want my search results to be at least partially censored (automatic "safe-search" filters that must be changed after every reboot), Google now assumes that I may be far from proficient in reading/speaking/comprehending my chosen browser language as well?
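
On the grammar-rule filter asked about in comment 3: a very rough version could be built by counting pattern matches in each result's text and keeping only the texts with enough examples of the chosen rule. The sketch below is an assumption-laden illustration, not an existing Google feature. The rule names, the regular expressions, the `min_matches` threshold, and the sample snippets are all hypothetical, and no search API is called. Regular expressions are a crude heuristic here; a real tool would likely need a part-of-speech tagger to avoid false matches.

```python
import re

# Crude, hypothetical patterns for two of the grammar rules mentioned.
RULES = {
    # "am/is/are" (optionally "not") followed by a word ending in -ing
    "present_continuous": re.compile(
        r"\b(?:am|is|are)\s+(?:not\s+)?\w+ing\b", re.IGNORECASE
    ),
    # "will" (optionally "not") followed by another word
    "future_will": re.compile(r"\bwill\s+(?:not\s+)?\w+\b", re.IGNORECASE),
}


def count_rule(text, rule):
    """Count how many times the chosen rule's pattern appears in the text."""
    return len(RULES[rule].findall(text))


def filter_by_rule(texts, rule, min_matches=2):
    """Keep only texts with at least min_matches examples of the rule."""
    return [t for t in texts if count_rule(t, rule) >= min_matches]


# Hypothetical snippets standing in for search results.
SNIPPETS = [
    "She is reading a book while her brother is cooking dinner.",
    "The museum opened in 1990 and closed in 2005.",
    "We will travel tomorrow, and they will meet us there.",
]

for snippet in filter_by_rule(SNIPPETS, "present_continuous"):
    print(snippet)
```

Rules such as reported speech are much harder to capture with patterns like these, which is one reason the sketch covers only the two simplest cases.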


