An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

August 13, 2006

Google's Digital Library of Alexandria

Google's founders have wanted to digitize library books since they were students at Stanford. The idea that you can find and read books by typing a few keywords into a program may seem great for a library, but when it comes to digitizing all the books in the world, practical obstacles get in the way: many books are still under copyright and are sold in bookstores. Google started scanning public domain and out-of-print books in 2004 and wants to continue with the rest. Google also has partnerships with some publishers and with universities like Stanford.

"When Google announced the library scanning project, in December 2004, it had four library partners besides Stanford. Two of them (Oxford University and the New York Public Library) took a legally cautious approach to digitization, permitting Google to copy only public domain works. A third, the University of Michigan, took the opposite view, asserting forcefully that Google could scan every one of its 7 million books. Harvard hedged its bets, initially agreeing only to a limited test program. Last week, the University of California signed on as a sixth Google partner. Its scanning program will include both public domain and copyrighted material," reports the Washington Post.

Last year, the Authors Guild and major publishing houses like McGraw-Hill and Penguin Group sued Google for scanning books without permission. Google says its digitizing process doesn't infringe copyright, as it's a transformative use covered by fair use. Google also compares scanning and indexing books with crawling and indexing web pages: in both cases, Google has to store a cached copy of the content, transform it into an index of keywords and make it searchable. Publishers who don't want a website or a book in the index can ask to have it excluded. The difference is that web pages are mostly available for free, while books must be bought. Google Book Search shows only a small number of pages from a book, and doesn't allow copying book content. "Copyrighted books are indexed to create an electronic card catalog and only small portions of the books are shown unless the content owner gives permission to show more," says Google.
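To make the comparison with web indexing more concrete, here is a minimal sketch of the general technique: store the text, build a keyword index, honor opt-outs and show only a short excerpt. This is not Google's code. The function names, the sample "books" and the `opted_out` set are all invented for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents, opted_out=frozenset()):
    """Map each keyword to the set of document ids that contain it.

    Documents listed in opted_out (a hypothetical exclusion list)
    are skipped, so they never show up in search results.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        if doc_id in opted_out:
            continue
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every query keyword."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

def snippet(text, query, radius=40):
    """Return a short excerpt around the first query keyword found.

    Uses a plain substring match, which is enough for a sketch.
    """
    lowered = text.lower()
    for word in tokenize(query):
        pos = lowered.find(word)
        if pos != -1:
            return text[max(0, pos - radius): pos + len(word) + radius]
    return ""

# Invented sample data: three tiny "books", one of which opted out.
books = {
    "book-a": "The library of Alexandria held many scrolls.",
    "book-b": "A modern library lends books and music.",
    "book-c": "Scrolls were copied by hand.",
}
index = build_index(books, opted_out={"book-c"})

for doc_id in sorted(search(index, "library scrolls")):
    print(doc_id, "->", snippet(books[doc_id], "scrolls", radius=15))
```

The full text has to be stored somewhere to build the index, but a searcher only ever sees the matching titles and a short excerpt, which is the distinction Google is relying on.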

What publishers fail to understand is that a book search engine will increase their sales, as people will discover books they wouldn't have found otherwise. The vast collection of human knowledge would be available to anyone interested. The quality of the content is also better than the web's thin information. The book search engine could also morph into a digital library that lets you read, download and print books for a price. Publishers are afraid that Google would undermine their power and take advantage of their content for free, but webmasters had the same fears when Google started to crawl the web and slow down their servers.

You can read more about Google Book Search in the Washington Post's "Google Wants to Digitize Every Book. Publishers Say Read the Fine Print First."


  1. People are allowed to freely read books in bookstores and libraries without ever having to pay for them. Book publishers seem to love it, in fact, because it's obvious how readers go on to buy the book from there.

    And of course, readers pay for the convenience of being able to read it outside the store, and they like to reward the author for their work, and add the book to their shelves to read later (or just show off).

    The point is, Google Print is no different. They want people to be able to read (parts of) books from anywhere in the world. You'd still need to buy them in order to be able to read them outside of your computer screen.

    The only difference is that, simply put, Old Media is afraid of the internet. They've bought into the lies of the RIAA and MPAA that people downloading music and movies has cut their sales (it hasn't), and they're afraid that if people could read their books online, they wouldn't buy any books (they would).

    I imagine that eventually, Google's dream of an online library will come true. But it will take a generation of people unafraid of the internet to make it happen. Oh it'll happen, just not for a few decades.

  2. In my personal experience, Google Books has done exactly what it's supposed to do: I looked up a book and ended up spending $30 on the actual book. I think it's a great service; I don't see why the publishers are so jumpy with all of the options they have to opt out of the program.


  3. You've only got to look at the Baen Free Library for a publisher giving away free books...

    OK, they're an odd one out in the world of publishers, but the Eric Flint-authored synopsis of the reasoning behind the decision makes interesting reading.

    It all boils down to "There ain't no such thing as bad publicity", and getting someone to read books is definitely good publicity.

  4. I think it's a wonderful idea. This seems to tie into the debate over Wikipedia's usefulness too, in that most colleges and schools reject it as a source of information due to its inaccuracies, but it is so successful as a source of information for so many people because there is NO other freely available source of information. Google's attempt to scan books is a huge step in allowing people a more freely available source of information (public domain works straight away) and allowing them to find the things that aren't freely available in a simpler, free and more accessible manner.
    The biggest hurdle I see is getting the companies that own all these books and "sources" to realize that information, and the right to view, cite, and study it, shouldn't be a limited and/or premium-paid only sort of thing. People don't, won't, stand for this sort of commercialized form of censorship forever. Encyclopedias are about as common in the home as a Ford Pinto these days, and electronic versions (like Encarta and Britannica's electronic format) haven't penetrated the home market much better. Yet home users often need information for countless reasons. For this to happen these companies have to come off this "we compiled all these things already known by people as a whole into THIS book so you can't see it b/c THIS BOOK is THE BOOK you can use 'officially' and we want $ for it" and get to the point where we understand as a people that information on topics like Copernicus, the Roman Empire, etc. is NOT private copyright-able data. It is existing data that millions of people already know, and therefore by its very nature (and 99% of the data in things like encyclopedias is that way) it should be public domain. There's no way around it.
    For works of fiction and of pure entertainment: okay, I'll concede that books currently under copyright will need some protection, but Google seems to have done that very well. It gives you just enough to get the idea of a book to see if you want to read it or not. Older books, iconic authors and novels, should be public domain; it serves no one to have a book sit in a basement somewhere for decades until the company forgets to renew a copyright and then sit for a few decades more to ensure that they've allowed it to lapse long enough for it to be challenged into public domain...
    I hope Google can get these non-PD books scanned and help people find great books and things they could otherwise never see. And for the PD stuff - I want to see Google become a hub unlike anything else for these books/materials!

  5. The metaphorical phrase "Don't judge a book by its cover" will become a literal expression after Google's Digital Library! Well done, Sasha & Lawrence!

  6. Thanks to Google's founders for their thoughtfulness.

  7. It is a great idea. It will benefit everyone.