July 3, 2008

Viacom Wanted the Source Code for Google's Search Engine, But Obtained YouTube's Server Logs

In the ongoing trial between Viacom and Google, regarding the videos uploaded to YouTube that infringe Viacom's copyright, Viacom really wants to prove that the most popular videos watched at YouTube were from its programs. Viacom even claimed that Google's search results are biased to give better ranking to the infringing YouTube videos, so it asked for... Google's source code (and YouTube's source code too). Here are some excerpts from the rulings:
Plaintiffs move jointly pursuant to Fed. R. Civ. P. 37 to compel YouTube and Google to produce certain electronically stored information and documents, including a critical trade secret: the computer source code which controls both the YouTube.com search function and Google's internet search tool "Google.com". YouTube and Google cross-move pursuant to Fed. R. Civ. P. 26(c) for a protective order barring disclosure of that search code, which they contend is responsible for Google's growth "from its founding in 1998 to a multi-national presence with more than 16,000 employees and a market valuation of roughly $150 billion" and cannot be disclosed without risking the loss of the business.

The search code is the product of over a thousand person-years of work. There is no dispute that its secrecy is of enormous commercial value. Someone with access to it could readily perceive its basic design principles, and cause catastrophic competitive harm to Google by sharing them with others who might create their own programs without making the same investment. Plaintiffs seek production of the search code to support their claim that "Defendants have purposefully designed or modified the tool to facilitate the location of infringing content." (...) YouTube and Google maintain that "no source code in existence today can distinguish between infringing and non-infringing video clips -- certainly not without the active participation of rights holders".

Unfortunately for Viacom and Google's competitors, the request to provide the source code has been rejected. But another request, this time for YouTube's server logs, has been approved.
Defendants' "Logging" database contains, for each instance a video is watched, the unique "login ID" of the user who watched it, the time when the user started to watch the video, the internet protocol address other devices connected to the internet use to identify the user’s computer ("IP address"), and the identifier for the video. That database (which is stored on live computer hard drives) is the only existing record of how often each video has been viewed during various time periods. Its data can "recreate the number of views for any particular day of a video." Plaintiffs seek all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website. They need the data to compare the attractiveness of allegedly infringing videos with that of non-infringing videos.

Google argued that producing the data would require a lot of resources, since the logging database holds 12 TB, and that it would violate users' privacy. However, Google had previously stated in a blog post that an IP address without additional information cannot identify people, so it's not personal information. "Therefore, the motion to compel production of all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website is granted."

Viacom wanted other things as well: the schema for Google's advertising database and for Google Video's database, data about private YouTube videos, etc. The entire document is worth reading, as it's pretty entertaining.

Salon thinks that "all's not lost. Google might manage to reverse this decision on appeal, and Viacom, gauging the outrage, could decide to withdraw or limit its request." After all, getting YouTube's server logs just to determine the popularity of the infringing videos is an abuse: YouTube could have offered aggregated data about those videos.
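To illustrate what "aggregated data" could look like, here is a minimal Python sketch. It assumes a hypothetical record layout built from the four fields the ruling lists (login ID, start time, IP address, video identifier); the sample values, the Unix-timestamp format and the function name are made up for the example and say nothing about how YouTube actually stores its logs. The sketch counts views per video per day and never copies the login ID or the IP address into the result.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical records: (login_id, start_time as Unix timestamp, ip_address, video_id).
# The field list follows the ruling; the values and the format are invented.
sample_log = [
    ("alice", 1214870400, "203.0.113.5", "vid_a"),
    ("bob", 1214874000, "203.0.113.9", "vid_a"),
    ("alice", 1214960400, "203.0.113.5", "vid_a"),
    ("carol", 1214877600, "198.51.100.7", "vid_b"),
]


def daily_view_counts(records):
    """Count views per (video_id, UTC date), discarding login IDs and IP addresses."""
    counts = Counter()
    for _login_id, start_time, _ip_address, video_id in records:
        # Reduce the exact start time to a calendar day in UTC.
        day = datetime.fromtimestamp(start_time, tz=timezone.utc).date().isoformat()
        counts[(video_id, day)] += 1
    return counts


counts = daily_view_counts(sample_log)
for (video_id, day), views in sorted(counts.items()):
    print(video_id, day, views)
```

A table like this is enough to compare how often two videos were watched on a given day, which is the stated purpose of the request, without revealing who watched them.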

Update: According to Search Engine Land, Google sent a letter to Viacom regarding the removal of personal data.
Given Plaintiffs' stated reasons for seeking information from the logging database -- to conduct proportionality analyses -- potentially personally identifiable information should be irrelevant. Indeed, Plaintiffs have previously represented that they do not desire to investigate users' viewing activities, and Viacom's general counsel is on record today stating that Viacom does not want to receive individuals' usernames and IP addresses. Accordingly, we request that Plaintiffs agree that YouTube may redact usernames and IP addresses from the viewing data in the interests of protecting user privacy.

Update 2 (July 15): "We are pleased to report that Viacom, MTV and other litigants have backed off their original demand for all users' viewing histories and we will not be providing that information," says YouTube Blog.

8 comments:

  1. Google had two choices: hand over source code, or compromise users' privacy. Apparently lawyers fought so the latter choice would be taken instead.
    Apparently the philosophy of "do no evil" can be stated as: "our code is more important than anything else, including users' privacy!"

  2. Some people can't research (@Anonymous)... Google fought both of these and Viacom pushed for both of these. Google won the fight regarding source code but lost the fight to protect log files. However, they are continuing to follow up and fight that user data should be allowed to be anonymized, which they will likely win.

    In other words, Google didn't just fight for one thing and give up for another. Look at the real evil in this picture. Viacom. They are fighting to get Google's trade secrets AND fighting to get Google's customers' private information. That is terribly far reaching.

  3. Give them time. As the update says, they are now trying to get Viacom to agree to let YouTube redact usernames/IPs from the data to preserve user privacy. Given that the reasons they claimed to want the data do not require or involve that info in any way, I can see them getting away with it. Of course, given the way RIAA/MPAA have tried to claim things were infringing at random in the past, I'm certain that the results of Viacom's analysis will show that infringing videos receive more views, simply because anything that contains any term from any of their productions (in title or description) will be presumed to be infringing with no further look. Sort of like how the RIAA will try to bust you for sharing a file that has a filename that looks like it might be something infringing, maybe.

  4. this is absolutely crazy. I can only hope this comes back to bite viacom. until then, I think I will be avoiding using my youtube account. take a look, I just wrote my opinion here... http://webpoet.wordpress.com/2008/07/04/time-wasted-not-forgotten/

  5. I tried to spread awareness of this, but the only attempt that gained traction was hijacked by Google-phobic activists. :(

    BTW, the comment verification system is broken for me; the word verification image just shows up as blank in Firefox 3 on Vista here.

  6. Maybe everyone should boycott Viacom. Eventually, they would run out of money to sue people with.

    Our legal system should also be adjusted to where a plaintiff can more easily be held liable for legal fees and other expenses if the accused are found innocent. That way, if the RIAA sues me for streaming my mp3's from home to work, I'm not out thousands in lawyer fees and thousands in lost work.

  7. Viacom International, Inc. et al v. Youtube, Inc. et al

    Plaintiffs: Viacom International, Inc., Comedy Partners, Country Music Television, Inc., Paramount Pictures Corporation and Black Entertainment Television, LLC
    Defendants: Youtube, Inc., Youtube, LLC and Google, Inc.

    Case Number: 1:2007cv02103
    Filed: March 13, 2007

    Court: New York Southern District Court
    Office: Foley Square Office
    County: New York
    Presiding Judge: Judge Louis L. Stanton

    Nature of Suit: Intellectual Property - Copyrights
    Cause: 17:501 Copyright Infringement
    Jurisdiction: Federal Question
    Jury Demanded By: None
    Amount Demanded: $1,000,000,000.00

