Concurring Opinions is a general-interest legal blog operated by Concurring Opinions LLC, a Pennsylvania Limited Liability Corporation.

Toward a Public Alternative in Digital Archiving and Search

posted by Frank Pasquale

With inimitable clarity, Cory Doctorow made the case for an open alternative to Google in The Guardian earlier this month. He focused on the secrecy of search:

[S]earch engines routinely disappear websites for violating unpublished, invisible rules. Many of these sites are spammers, link-farmers, malware sneezers and other gamers of the system. . . . The stakes for search-engine placement are so high that it’s inevitable that some people will try anything to get the right placement for their products, services, ideas and agendas. Hence the search engine’s prerogative of enforcing the death penalty on sites that undermine the quality of search.

[Nevertheless, i]t’s a terrible idea to vest this much power with one company, even one as fun, user-centered and technologically excellent as Google. It’s too much power for a handful of companies to wield.

Search engines like Google have some good reasons for keeping their algorithms confidential–if they were public, manipulators could quickly swamp Google users with irrelevant results. However, just as Comcast cannot circumvent net neutrality regulation by saying all its traffic management and spam-fighting methods are trade secrets, search engines should not be able to use such arguments to escape regulation altogether. Moreover, there are ways of developing a qualified transparency that would let a trusted third party examine a search engine’s conduct without exposing its business methods for all the world to see.

But Doctorow does not want regulation here; he wants an alternative. Having argued for a similar “public option” in health insurance, I like this line of argument, but I think Doctorow underestimates the barriers to entry. Though he’s aware of the failure of Wikia, Doctorow wonders if a “wikipedia for search” could be built:

We can imagine a public, open process to write search engine ranking systems, crawlers and the other minutiae. But can an ad-hoc group of net-heads marshall the server resources to store copies of the entire Internet? . . . . It would require vast resources. But it would have one gigantic advantage over the proprietary search engines: rather than relying on weak “security through obscurity” to fight spammers, creeps and parasites, such a system could exploit the powerful principles of peer review that are the gold standard in all other areas of information security.

The “rival public system” approach has been suggested for search engines a few times before. About a decade ago, Introna & Nissenbaum demonstrated that “the conditions needed for a marketplace to function in a ‘democratic’ and efficient way are simply not met in the case of search engines.” Recognizing this, Jean-Noel Jeanneny made a case for a French language alternative to dominant US-based search engines. The Quaero project in the EU appears to be answering that call, though in a far more dirigiste manner than Doctorow would probably like.

I have a few thoughts on a “public option” in search, building on a talk I gave at Yale Law’s Library 2.0 conference in the spring.

First, we have to understand just how big Google’s present operation is. The company is using somewhere between 100,000 and a million computers to index the web. Could a SETI-style program, or some other distributed computing system, “store” an index of that size across many computers? Indexing the web is a project orders of magnitude more storage- and processing-intensive than hosting an online encyclopedia like Wikipedia, or even hosting the collaborative editing process that is Wikipedia’s “secret sauce.”

Nevertheless, there are some steps that could lead to an infrastructure for a public option in search. Google’s supporters have frequently argued that it needs to scan and store books because they could be lost in disasters. Couldn’t a similar case be made that government or an NGO needs to index Google’s archive of web pages and books in case, say, a tornado hits a central Google storage facility? At what point does it become critical infrastructure?

Note that such a proposal should strictly separate information a search engine company properly owns (such as user data patterns, records of how many people clicked on what, etc.) from the underlying collection of materials that would be “archived” as a base of content for the public option. For example, to take one small slice of search, books: I would argue that any settlement of the current lawsuit between Google and publishers should direct the U.S. Copyright Office to require digital deposit of all copyrighted books in the US, as a database for a future public option in search. In antitrust terms, the digitized copies are an “essential facility” for future advances in book search, particularly if the cozy relationship between Google and a books “Registry” envisioned in the current settlement documents is ratified by the courts.

The big question here is whether we want a government entity to do all this archiving for the web generally, or some publicly funded third party. Some might think that the latter entity is a better bet in terms of privacy protections. But the more one understands how flimsy a legal barrier separates government actors from “private” data stores, the less difference it makes whether the database used for the public option is in governmental or NGO hands.

Finally, even if a public alternative in search seems unlikely, I deeply believe we need to guarantee one in book search. Note that in web searches, Google’s role is usually only to direct us toward what is most relevant–not to ration access to knowledge, a role it so often plays in book search with snippets, restricted portions, etc. In this new role it is much more like a private health insurer rationing access to care than it is your traditional Web 2.0 info-company organizing access to the web by creatively accessing the wisdom of crowds. It’s a middleman, and if we’ve learned anything from the health care field, it’s that highly concentrated provider markets combined with highly concentrated insurer markets lead to ever-higher prices for everyone outside that charmed circle of bilateral monopoly. Here’s how Joseph White characterized the developments in health care:

One might wonder why consolidation among insurers did not allow them to resist the providers’ demand for increased payments. The simple answer is that there were two concentrated parts of the market and one fragmented part. The insurers had to choose between fighting a full-pitched battle with the providers or exploiting their own market power vis-a-vis employers. Raising premiums to employers was a lot easier.

Substitute “publishers” for “providers,” “Google” for “insurers,” and “readers” for “employers” in that dynamic, and you have a pretty good sense of how the book search settlement will ultimately play out without some alternative service. Right now, Medicare is the only entity exercising genuine price discipline and providing universal access in the US health field. We need something like it in book search.

PS: I have more thoughts on Doctorow’s piece in the comments section of this interesting blog post by Berin Szoka. I really hope Doctorow does not endorse First Amendment protection for whatever dominant search engines do.


June 20, 2009 at 7:59 pm   Posted in: Antitrust, Google & Search Engines, Privacy

Responses (5)

  1. Seth Finkelstein - June 21, 2009 at 1:47 am

    I lost you somewhere between “Nevertheless, there are some steps that could lead to an infrastructure for a public option in search.”, and when you seemed to advocate for a public digitized BOOK collection. These are very different projects.

    Notably, books are static and don’t involve complicated link analysis algorithms.

  2. James M - June 21, 2009 at 12:00 pm

    I don’t think this is a problem. There are all sorts of search engines out there. Google is just the most successful because it is user focused and it DOES suppress most search engine spammers.

    For any public option to be even close to being useful it must also use some algorithms to suppress the spammers. You have no good technical or legal argument as to why a public or open algorithm would be better than any of the many existing search engines, and you have pointed out the one problem with such a suggestion: the spammers would have the source to out-game the system.

  3. Joe - June 22, 2009 at 6:47 am

    As if this conversation is really going to solve anything. As soon as any new search engine opens up, spammers and computer people everywhere will figure out ways to outsmart it. They’re surprisingly creative in getting what they want.

  4. David Schwartz - June 22, 2009 at 11:11 pm

    This probably won’t work for the same reason Wikipedia’s regulation system doesn’t work. Rather than being dominated by a desire to make money by giving people what they want, it will be dominated largely by people who have nothing better to do and have a perverse vested interest in people getting particular results.

    Fortunately, Wikipedia’s not that dependent on its regulation system, and there’s not much money in perverting people’s views on politically controversial subjects. However, for a search engine ranking, the incentives may well be much different.

    In any event, the argument is pretty bogus on its face. Why trust supermarkets with telling us what foods we can and can’t buy? That’s an awful lot of power. Let’s start a chain of non-profit supermarkets, just in case the chain stores someday choose to stop selling our favorite coffee. But why would they do that? Should I really worry that some other coffee maker will pay my local grocery to only offer me coffee I don’t want? How long can they do that?

  5. P2P Foundation » Blog Archive » Toward a Public Alternative in Digital Archiving and Search - July 9, 2009 at 2:47 am

    [...] X-Posted: Concurring Opinions. [...]
