An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

August 12, 2006

Human Nature As Seen On AOL Search

This week AOL published 36 million search queries from more than 650,000 US users. The data has been anonymized, but you can still use the queries to find information about each person. The data, released by AOL Labs as research material, stirred a lot of controversy and was removed from AOL's site. AOL spokesman Andrew Weinstein admitted: "This was a screw-up, and we're angry and upset about it. It was an innocent enough attempt to reach out to the academic community with new research tools, but it was obviously not appropriately vetted, and if it had been, it would have been stopped in an instant."

The valuable data is still available on other sites, including this searchable database and this mirror (439 MB, tgz file). You can pick a user like 1983280 and track all of their searches between March 1st and May 31st this year. The data set includes these fields: UserID, Query, Query Time, Clicked Rank, Destination Domain. So what can you find out about our user? She's a teenager interested in politics, she's from Washington DC, she likes photography and American Idol, one of her parents died and she's about to get married. From other users' queries, you can find the name, the address, the workplace and other details that allow you to identify the person. The New York Times identified user no. 4417749 as Thelma Arnold, a 62-year-old widow who lives in Lilburn. AOL didn't realize that, in the name of science, it had committed the biggest privacy breach a search engine has ever made. Google refused to let the government obtain a similar data set, yet AOL, which gets its search results from Google, released one to the public.
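To illustrate why a numeric ID offers so little protection, here is a minimal Python sketch of how someone might group such a log by user and read one person's queries in order. The field order follows the list above; the tab delimiter, the lack of a header row, the timestamp format, and the sample rows are all assumptions made for illustration, not details taken from the AOL files.

```python
import csv
import io
from collections import defaultdict

# Field order as listed in the post. The tab delimiter and the absence of a
# header row are assumptions about the file layout.
FIELDS = ["user_id", "query", "query_time", "clicked_rank", "destination_domain"]


def group_queries_by_user(lines):
    """Map each user ID to its (query_time, query) pairs, sorted by time."""
    reader = csv.DictReader(
        lines, fieldnames=FIELDS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    by_user = defaultdict(list)
    for row in reader:
        by_user[row["user_id"]].append((row["query_time"], row["query"]))
    for history in by_user.values():
        # Assumed timestamp format (YYYY-MM-DD HH:MM:SS) sorts correctly as text.
        history.sort()
    return dict(by_user)


# Made-up sample rows with hypothetical user IDs, not real AOL data.
SAMPLE = (
    "1001\tcheap flights\t2006-03-01 10:15:00\t1\texample.com\n"
    "1002\tweather\t2006-03-02 08:00:00\t\t\n"
    "1001\thotel deals\t2006-03-01 09:00:00\t\t\n"
)

histories = group_queries_by_user(io.StringIO(SAMPLE))
for when, query in histories["1001"]:
    print(when, query)
```

Once the rows are grouped this way, every query a person typed over three months sits in a single list, which is what makes piecing together an identity possible.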

Despite all the privacy considerations, the database is fascinating and it could be the subject of a book about human nature.

What happens when your life is exposed to the public by small fragments of text? You reveal your intentions, your problems and fears, your friendships and your hidden desires. Your queries reveal more than any detective or psychiatrist could find about your life.

4 comments:

  1. Your queries reveal more than any detective or psychiatrist could find about your life.

    Or not. You never know if people are searching for something they've just heard in a song or movie. You may be searching for your clients or for anybody else. You may be just searching dumb terms to see how dumb the content on the Internet is. You might be doped. Or you are right. But you don't know, and you could never condemn anyone for searching for anything at all. If I search for "how to make a bomb", that doesn't mean I want to make a bomb. I may just be interested in knowing if there is such information on the Internet, and get concerned about that.

    Assumptions are very dangerous.

  2. But after analyzing what you find, if you find something occasionally or as a trend, it's more than likely not just a random assumption. Tie the pieces together and you get a lot of scary information.

  3. For the longest time, I've been recommending leaving messages for various groups in Google queries. I don't have an AOL account, but maybe someone read what I wrote and did the same thing there...

    These were some of my suggestions:

    "Ever hear of the Fourth Amendment?"
    "STOP INVADING MY PRIVACY!"
    "Alberto R. Gonzales can Kiss My @$$"
    "How Many DOJ Employees Does It Take to Change A Light Bulb?"

  4. We discuss particular AOL users at aol.zanoza.lv
