February 16, 2011

More About Google's Reading Level Filter

Google's Daniel M. Russell has shared more information about the reading level filter, a feature recently added to the advanced search page:

The reading-level is based primarily on statistical models we built with the help of teachers. We paid teachers to classify pages for different reading levels, and then took their classifications to build a model of the intrinsic complexity of the text. (...) We also used data from Google Scholar, since most of the articles in Scholar are considered advanced.

So the breakdown isn't grade- or age-specific, but reflects the judgments of teachers as to overall level of difficulty. Roughly speaking, "Basic" is elementary level texts, while "Intermediate" is anything above that level up to technical and scholarly articles, a la the articles you'd find in Scholar.

That's not exact, but it's a fairly robust model that works across a wide variety of different text styles and web pages.
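
To make the idea of "labeled pages in, model out" more concrete, here is a toy Python sketch. It is not Google's method, which isn't public beyond the description above. It assumes two crude features (average word length and average sentence length) and a nearest-centroid rule, and the example texts, labels, and function names are all made up for illustration.

```python
import re
from collections import defaultdict


def features(text):
    """Return (average word length, average sentence length in words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not words or not sentences:
        return (0.0, 0.0)
    avg_word_len = sum(len(w) for w in words) / len(words)
    avg_sent_len = len(words) / len(sentences)
    return (avg_word_len, avg_sent_len)


def train(labeled_texts):
    """Average the feature vectors of each label into one centroid per label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for text, label in labeled_texts:
        word_len, sent_len = features(text)
        acc = sums[label]
        acc[0] += word_len
        acc[1] += sent_len
        acc[2] += 1
    return {label: (a / n, b / n) for label, (a, b, n) in sums.items()}


def classify(text, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    word_len, sent_len = features(text)
    return min(
        centroids,
        key=lambda label: (centroids[label][0] - word_len) ** 2
        + (centroids[label][1] - sent_len) ** 2,
    )


# Hypothetical stand-ins for the teacher-classified pages and Scholar articles.
TRAINING = [
    ("The cat sat on the mat. The dog ran to the park.", "Basic"),
    ("I like my red ball. It is big and fun.", "Basic"),
    ("The committee reviewed the budget proposal before the meeting, "
     "and several members asked for more details.", "Intermediate"),
    ("The statistical estimation procedure minimizes regularized empirical "
     "risk under distributional assumptions concerning heteroscedastic "
     "residuals.", "Advanced"),
]

centroids = train(TRAINING)
print(classify("The sun is hot. We play in the sand.", centroids))
```

A real model would need far more labeled text and much richer features, but the shape is the same: human judgments define the levels, and the classifier only generalizes from them.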

Unfortunately, the feature only works for English, and adding support for other languages is probably difficult, since each one would likely need its own set of teacher-classified pages.


