the Law, the Universe, and Everything 

Concurring Opinions is a general-interest legal blog operated by Concurring Opinions LLC, a Pennsylvania Limited Liability Corporation.

November 24, 2007

Conditions for the Digital Library of Alexandria

posted by Frank Pasquale

I have been in the middle of a major rethink of search engines' efforts to digitize books. At first I enthusiastically celebrated their potential to tame information overload. But major research librarians are now questioning search engines' practices here:

Several major research libraries have rebuffed offers from Google and Microsoft to scan their books into computer databases, saying they are put off by restrictions these companies want to place on the new digital collections. The research libraries, including a large consortium in the Boston area, are instead signing on with the Open Content Alliance [OCA], a nonprofit effort aimed at making their materials broadly available.

As the article notes, "many in the academic and nonprofit world are intent on pursuing a vision of the Web as a global repository of knowledge that is free of business interests or restrictions."

As noble as I think this project is, I doubt it can ultimately compete with the monetary brawn of a Google. And why should delicate old books get scanned 3 or 4 times by duplicative efforts of Google, Microsoft, the OCA, and who knows what other private competitor? I also worry that a fragmented archiving system might create a library of Babel. So what is to be done?

My new position is: leverage current copyright challenges to Google's book search program to guarantee that it serves the public interest. Here's how that might work:

Google’s plans to scan and index hundreds of thousands of copyrighted books have provoked extraordinary public controversy and private litigation. This project aims to archive and provide text-based indexing for an enormous number of books. Google’s scanning of copyrighted books is prima facie infringement, but Google is presently asserting a fair use defense. The debate has largely centered on the rival property rights of Google and the owners of the copyrights of the books it would scan and edit.

Given Google’s alliance with some of the leading libraries in the world, journalistic narratives have largely portrayed the Google Book Search project as an untrammeled advance in public access to knowledge. However, other libraries are beginning to question the restrictive terms of the contracts that Google strikes when it agrees to scan and create a digital database of a library’s books. While each library is guaranteed access to the books it agrees to have scanned, it is not guaranteed access to the entire index of scanned works.

Those restrictive terms foreshadow potential future restrictions on, and tiering of, Google's book search services. Well-funded libraries may pay a premium to gain access to all sources; lesser institutions may be left to scrounge among digital scraps. If permitted to become prevalent, such tiered access to information would threaten to rigidify and reinforce existing inequalities in access to knowledge and in life chances. Such tiering divides society into two groups: those who can afford to access the information, and those who cannot. To the extent that the latter group’s relative poverty is not its own fault, information tiering inequitably subjects it to yet another disadvantage, whereby others’ wealth can be leveraged into status, educational, or occupational advantage.

Given the diciness of the fair use case for projects like Google Book Search, courts should condition the legality of such archiving of copyrighted content on universal access to the contents of the resulting database. Landmark cases like Sony v. Universal have set a precedent for taking such broad public interests into account in the course of copyright litigation. Given the importance of “commerciality” in the first of the four fair use factors, suspicion of tiered access could also be figured into that prong of the test. A more ambitious (if less likely) solution would require Congress to set such terms in a legislative settlement of the issue.

However the matter is ultimately settled, any outcome in favor of dominant categorizers should be conditioned on their maintaining open access to search results. Such a condition would help assure that the type of “tiered access” common for legal resources would not further pervade the networked world. If Google’s proposed extension of the fair use defense succeeds, such a holding should be limited to current versions of the services that conduce to a common informational infrastructure. To the extent it or other search engines limit access to parts of their index, their public-spirited defenses of their archiving and indexing projects are suspect.

PS: For more thoughts on the future of digital archiving, see Diane Leenheer Zimmerman's Can Our Culture Be Saved?

PPS: This post is part of a series, which starts here.

Photo Credit: ekornblut, Wall of Library of Alexandria.

Posted by Frank Pasquale at November 24, 2007 08:11 PM

Comments

Your position is exactly backwards. Far and away the biggest threat to the universally accessible library is hold-up from copyright holders. They can kill the library. A dominant search engine can only delay or hamper it.

Duplication of scanning efforts is not a problem. If we spend 10x as much as we should making 5 scanned copies of everything, so what? It's still money well spent, and good search engines will still iron out any inconsistencies. And overspending on the digitization project is a far far easier way to prevent a tiered access future than trying to scan once and set exactly the right conditions on it. We should all be pushing as hard as we can for scanning and searching to be an unambiguous fair use, so that lots of institutions get into the scan+search business. (Keep in mind, too, that the costs of scanning are continually falling, whereas the stock of things that need to be scanned is not growing at anywhere near the same rate.)

In general, you're too eager to see search markets as natural monopolies. They aren't. Universal search does have high costs and thus high barriers to entry, but smaller pieces of the search market are still cheap to play in. Especially given mobility of users, Google's dominance today isn't a function of unique market factors forcing concentration; it's a result of a Schumpeterian breakthrough in search technology that Google spearheaded and is still milking. Another paradigm shift in how things are done could dethrone it in the space of a few years -- and it's quite possible that that shift could be to something not under the control of any one company.

There are serious issues that large search engines present. We should face and address those issues. But for a lot of the deeper problems you worry about in search, there are more pressing present dangers from other powerful entities. That's the case with neutrality (the incumbent broadband ISPs) and it's the case with book scanning (the publishers).

Posted by: James Grimmelmann at November 25, 2007 12:51 AM


James,

Some responses:

1. You say: "[T]here are more pressing present dangers from other powerful entities. That's the case with neutrality (the incumbent broadband ISPs) and it's the case with book scanning (the publishers)."

Agreed. Google book search without the conditions I've advanced above would be better than a Google book search contingent on a million licensing deals.

The real question is whether conditions like mine would scuttle the project. And I don't think they are that burdensome. They can also be pared down; for example, there is a much greater societal need for scholarship indices to be free and open access than for, say, indices of Danielle Steel books.

2. You say "smaller pieces of the search market are still cheap to play in." I agree, but I think a) any one of those people is still pretty reliant on Google to route it customers and b) Google does not need to monopolize that space to still be a dominant force that deserves scrutiny.

Do you really think someone else is poised to make a "Schumpeterian breakthrough" on general-purpose search?

3. The falling costs of scanning are a good argument for multiple scanning enterprises. And yes, the risk of one inaccurate or incomplete scan should be factored in against the risk of harming a book via multiple scans. But I still think that this project is such a small part of the overall Google business plan that even if the conditions I've proposed were applied, they would not significantly deter the project.

Nobody is saying "Google can't advertise on the index"--that's their core business plan. I'm just saying, don't try to make money off tiered access--just as Google says to the carriers when it lobbies for net neutrality.

Posted by: Frank at November 25, 2007 04:29 PM


And my meta-responses:

"The real question is whether conditions like mine would scuttle the project. And I don't think they are that burdensome."

Perhaps not as an end-result, but your means of getting there -- through current copyright challenges to doing the project at all -- is playing with fire.

"[A]ny one of these people is still pretty reliant on Google to route it customers"

Maybe, maybe not. Dopplr mostly isn't, Abebooks mostly isn't, Altlaw mostly isn't. The principal Google searches they care about are navigational queries on their own names.

"Do you really think someone else is poised to make a 'Schumpeterian breakthrough' on general-purpose search?"

Yes. The key will be to redefine the problem; my best guess is that whatever comes next will be significantly user-generated.

Posted by: James Grimmelmann at November 26, 2007 07:43 AM

