Category: Google and Search Engines

Fifty Years of “I know it when I see it.”

On June 22, 1964, Justice Potter Stewart coined the phrase “I know it when I see it” in his concurring opinion in Jacobellis v. Ohio. Fifty years later, that expression holds the distinction of being one of the few modern legal phrases to become a regularly accepted expression among educated Americans. The half-century anniversary of Jacobellis provides a fitting opportunity to ask why “I know it when I see it” has enjoyed such popularity and what lessons that phrase and its history might hold for us today.

Jacobellis reversed the conviction of an Ohio movie theater manager for showing obscene material in the form of the French film Les Amants (The Lovers), which included a sex scene at its conclusion. The court’s 6-3 decision was highly fragmented, with six opinions in total and the plurality garnering only two votes.

In a short 144-word concurring opinion, Stewart wrote that he found it almost impossible to define obscenity precisely, which should only include “hard-core pornography.” His now famous line concluded the opinion:

 “But I know it when I see it, and the motion picture involved in this case is not that.”

At the time, the pithy phrase actually garnered little interest in the public sphere. Many newspapers chose instead to focus on another obscenity case decided that same day, Quantity of Books v. Kansas. Those journalists who did write about Jacobellis largely ignored “I know it when I see it” and chose to focus on the legal technicalities the case posed.

While it is difficult to pinpoint exactly when Stewart’s iconic expression became common, we can chart its growing popularity via Google’s Ngram search engine. Google Ngram charts how often a phrase of up to five words appears in its corpus of English-language books. Because “I know it when I see it” is seven words, I ran the search for each five-word segment of the phrase (“I know it when I”; “know it when I see”; “it when I see it”). The graph clearly shows the steeply rising and still growing interest in Stewart’s phrase, starting slightly after 1964:

[Graph: Google Ngram results for “I know it when I see it”]
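
The splitting of the seven-word phrase into five-word windows described above can be sketched in a few lines of Python. This is only an illustration, assuming plain whitespace tokenization; the function name word_windows is hypothetical, and nothing here reflects how Google Ngram itself processes text.

```python
def word_windows(phrase: str, size: int = 5) -> list[str]:
    """Return every run of `size` consecutive words in `phrase`.

    Hypothetical helper for illustration; words are split on whitespace only.
    """
    words = phrase.split()
    # A phrase of n words has n - size + 1 windows (none if it is shorter than size).
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


segments = word_windows("I know it when I see it")
for segment in segments:
    print(segment)
```

Each printed segment is short enough to be entered as its own search term.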

The Ngram search also reveals some interesting instances of similar phrases, both legal and not, pre-dating Jacobellis. Consider two examples: In an obituary for Benjamin Cardozo that ran in the Columbia, Yale and Harvard law journals in 1939, Learned Hand praised Justice Cardozo for his wisdom, writing:

“And what is wisdom — that gift of God which the great prophets of his race exalted? I do not know; like you, I know it when I see it, but I cannot tell of what it is composed.”

Google Books and the Social (Justice) Contract

In channeling Judge Baer, Judge Chin at long last dropped the other shoe in the judicial effort to bring new information technology uses for copyrighted works fully into the copyright regime. Congress has been slow to address the challenge of tapping the full copyright social utility/justice potential of these advances and it’s been left to the courts to sort it all out in the context of individual adversarial conflicts. Poignantly, when Jonathan Band asks “What [was] the Authors Guild fighting for?”, he also illustrates the tree-myopic/forest blind nature of the Guild’s position. What the Guild failed to see is that property rights fit into a larger socio-legal system: Yes, your neighbor is precluded from trespassing onto your land, but your ability to engage in whatever “private” activity strikes your fancy while thereon is limited by the legal system as a whole. Your land is individual private property, not an independent sovereign state.

Judge Baer reminded rights holders of this aspect of the social contract and now Judge Chin has made it clear to the Guild that this is not some narrow, eccentric application of copyright social utility. Property rights, including copyrights, exist to advance society, and to state the obvious, information technology has evolved our society. As with all other rights, customs, and expectations, however, some aspects of copyright as previously envisioned fit comfortably into our new configuration while others don’t fit at all. And when that ill fit impedes important social progress, modifications must be made and, if necessary, expectations altered.

The courts’ reasoning in both HathiTrust and Google Books moves fair use jurisprudence further toward the express consideration of copyright social justice in the application of the doctrine. As Kevin Smith notes, the judges in both cases have seized this opportunity to retrofit fair use, and it seems to me that these decisions push beyond questions of aesthetic and even functional transformation and pave the way for weighing social transformation in assessing the first fair use factor. I have also applied some of the legal conclusions drawn from Bill Graham Archives and other Grateful Dead archive projects to specific copyright social justice needs, for example, that of socially beneficent access to the literature of the Harlem Renaissance. Like some other historically and culturally important works, many of these books enjoy only marginal commercial market value, and, as with the information harvested through data mining, “digital fair use” may be the only means by which to return these works to the general public. The social resuscitation of significant works through mass-digitization, and other uses that serve important and otherwise unattainable copyright social objectives, should be considered a purpose that satisfies the first fair use factor.

Authors and other copyright holders would do well to finally get ahead of the information technology curve. The Authors Guild’s mistake was not so much in the effort to preserve what they considered to be their property rights or even in the effort to extract every conceivable drop of revenue out of those rights, but rather, in failing to accept that in order for these rights to retain any value they must function as part of a thriving societal system or eventually forfeit the basis for legal recognition. In the analog world, the public’s access to most books remains largely dependent upon the vagaries of the commercial marketplace. Digital information technology has presented the opportunity to compile the world’s books toward the creation of global libraries accessible to every human being on a socially equitable basis. To believe that analog social inequity will be permitted to endure indefinitely in the face of digital information possibilities is simply unrealistic. Keeping in mind that the stimulation, perpetuation, and re-ignition of the cultural expression/dissemination/inspiration combustive cycle is the raison d’être of copyright will enable authors to embrace digital change and, as Gil Scott-Heron sang, possibly even direct the change rather than simply be put through it.

Google Books and Author’s Rights

I agree with James Grimmelmann that the Google Books decision is a bit anticlimactic (although the appeal has the potential to add suspense by bringing the case back from the dead). After last October’s decision in Authors Guild v. HathiTrust, the only question really was whether Judge Chin would distinguish HathiTrust on the grounds that the defendants there were nonprofit institutions of higher education, while the defendant here was a commercial entity. To be sure, Judge Chin was not bound by Judge Baer’s analysis that HathiTrust’s use was transformative and did not in any way harm the market for the works at issue, but these holdings were so consistent with precedent in the Second and Ninth Circuits that it was hard to imagine that Judge Chin would disagree with them. That left the commercial/non-commercial distinction, which has become far less significant in recent years in cases involving transformative uses.

Both judges’ recognition of the enormous social utility of creating a searchable index of books, and the absence of harm to authors caused by such an index (to the contrary, the index benefits authors by making their works more discoverable), highlights the mystery at the heart of these cases: What is the Authors Guild fighting for? Why did it not settle last year, when the publishers dropped their suit against Google? Why did it continue to pursue its litigation against HathiTrust after HathiTrust abandoned its orphan works project?

For some Authors Guild members, it might be about the money. They may believe that there is a pot of gold at the end of the Google rainbow. If the Internet could make instant millionaires (if not billionaires) out of all these kids who express themselves through Internet acronyms, emoticons, and 140 character tweets, then surely authors who spend years writing finely crafted books deserve a share of that fortune.

For others, it seems to be a matter of principle. But exactly what principle? Apparently, that no one should use their works without their permission. While they may agree with fair use in the abstract, they oppose it as applied to their works. The fact that the use is socially beneficial and does not harm them economically is irrelevant. I would amend James’s “three c” formulation with a fourth c: creators should have complete control over copies.

The Authors Guild’s belief in complete control is based more on the Continental “author’s rights” (droit d’auteur) tradition than on the Anglo-American utilitarian tradition. In the author’s rights approach, copyright springs not from statutes but from natural law. The relationship between the author and his work is intimate and indivisible. By contrast, in the Anglo-American system, copyright is not a response to natural law, but rather is a matter of legislative choice directed at incentivizing the creation of works for the benefit of society.  The Anglo-American utilitarian approach in theory provides only as much protection as is necessary to encourage creative activity, while the author’s rights approach provides more robust protections of both economic rights and moral rights such as the right of attribution and integrity.  Historically, the difference between the two approaches translated into longer copyright terms and narrower exceptions in author’s rights jurisdictions.

However, in response to lobbying by rights-holders, Congress has enacted certain features of author’s rights systems — for example, the ever-increasing copyright term. The first U.S. copyright act provided a term of 14 years, renewable for another 14 years, for a total of 28 years. Now, the copyright term matches the European Union’s term of life of the author plus 70 years.

Efforts are underway to import other author’s rights features. The U.S. Copyright Office just released a report recommending that Congress consider adoption of a resale royalty (droit de suite) for visual artists. Under this framework, a visual artist would receive a percentage of the amount paid for a work each time it was resold by a third party.  A resale royalty is in effect a tax on the sale of copyright products and is directly contrary to the long-established first sale doctrine.

The complete control over copyrighted works sought by the Authors Guild and reflected by proposals such as a resale royalty is inconsistent with the public interest purpose of our copyright system. Fortunately, Judge Chin, and Judge Baer before him, recognized that the objective of copyright is not to enrich rights-holders, but “to advance the progress of the arts and sciences.”

Is Big Data Overhyped?

For some in Silicon Valley, the rise of new data and communication networks creates unprecedented opportunities to solve problems like obesity, traffic, and flu pandemics. For example, an app like FitBit or LoseIt can keep track of calories and buzz a dieter once he goes over his daily limit. Futuristic early warning systems can warn drivers away from bottlenecks, and detect emerging influenza outbreaks.

Evgeny Morozov’s illuminating book To Save Everything, Click Here challenges both “internet centrism” and “solutionism.” The internet may, for instance, make traffic worse. Moreover, solutionism tends to “reach for the answer before the questions have been fully asked.” Is the problem really traffic, or something deeper in the way cities and opportunities are arranged? Solutionism tends to prioritize issues that widely accessible tech can address: small, algorithmically decomposable bits of wicked problems.

While a solutionist might think of gamified calorie counting as a wonderful new way to fight obesity, a more sober analysis of the problem will lead us to doubt the smartphone will make us svelte. Similarly, calorie counts may be a great disclosure tactic, but disclosure is only the first step on the road to changing behavior. And our food problem, like our traffic problem, may entail reconsideration of privilege, taste, and inequality as far deeper problems than individual struggles for self-control.

Big data has been a linchpin of solutionist narratives about the future of tech in health care. However, there are still major challenges in data quality. Even if the data were perfect, causal inference still may be a challenge, as Hoffman & Podgurski explain:

The Importance of Section 230 Immunity for Most

Why leave the safe harbor provision intact for site operators, search engines, and other online service providers that do not attempt to block offensive, indecent, or illegal activity but by no means encourage or are principally used to host illicit material as cyber cesspools do? If we retain that immunity, some harassment and stalking — including revenge porn — will remain online because site operators hosting it cannot be legally required to take it down. Why countenance that possibility?

Because of the risk of collateral censorship—blocking or filtering speech to avoid potential liability even if the speech is legally protected. In what is often called the heckler’s veto, people may abuse their ability to complain, using the threat of liability to ensure that site operators block or remove posts for no good reason. They might complain because they disagree with the political views expressed or dislike the posters’ disparaging tone. Providers would be especially inclined to remove content in the face of frivolous complaints in instances where they have little interest in keeping up the complained-about content. Take, as an illustration, the popular newsgathering site Digg. If faced with legal liability, it might automatically take down posts even though they involve protected speech. The newsgathering site lacks a vested interest in keeping up any particular post given its overall goal of crowdsourcing vast quantities of news that people like. Given the scale of its operation, it may lack the resources to hire enough people to cull through complaints to weed out frivolous ones.

Sites like Digg differ from revenge porn sites and other cyber cesspools whose operators have an incentive to refrain from removing complained-about content such as revenge porn and the like. Cyber cesspools obtain economic benefits by hosting harassing material that may make it worth the risk to continue to do so. Collateral censorship is far less likely—because it is in their economic interest to keep up destructive material. As Slate reporter and cyber bullying expert Emily Bazelon has remarked, concerns about the heckler’s veto get more deference than they should in the context of revenge porn sites and other cyber cesspools. (Read Bazelon’s important new book Sticks and Stones: Defeating the Culture of Bullying and Rediscovering the Power of Character and Empathy.) Those concerns do not justify immunizing cyber cesspool operators from liability.

Let’s be clear about what this would mean. Dispensing with cyber cesspools’ immunity would not mean that they would be strictly liable for user-generated content. A legal theory would need to sanction remedies against them.

Stanford Law Review Online: Software Speech

The Stanford Law Review Online has just published a Note by Andrew Tutt entitled Software Speech. Tutt argues that current approaches to determining when software or speech generated by software can be protected by the First Amendment are incorrect:

When is software speech for purposes of the First Amendment? This issue has taken on new life amid recent accusations that Google used its search rankings to harm its competitors. This spring, Eugene Volokh coauthored a white paper explaining why Google’s search results are fully protected speech that lies beyond the reach of the antitrust laws. The paper sparked a firestorm of controversy, and in a matter of weeks, dozens of scholars, lawyers, and technologists had joined the debate. The most interesting aspect of the positions on both sides—whether contending that Google search results are or are not speech—is how both get First Amendment doctrine only half right.

He concludes:

By stopping short of calling software “speech,” entirely and unequivocally, the Court would acknowledge the many ways in which software is still an evolving cultural phenomenon unlike others that have come before it. In discarding tests for whether software is speech on the basis of its literal resemblance either to storytelling (Brown) or information dissemination (Sorrell), the Court would strike a careful balance between the legitimate need to regulate software, on the one hand, and the need to protect ideas and viewpoints from manipulation and suppression, on the other.

Read the full article, Software Speech, at the Stanford Law Review Online.

Some more on ISPs and 6 Strikes – Where’s The Citizen Policing?

I wrote about the Six Strikes plan earlier today. I wanted to add a call for transparency on download speeds so the average citizen could police the penalties. The Wired report noted that responses “might include reducing internet speeds.” Given the problems with ISPs providing clear and consistent speeds, it seems to me that if they can reduce speeds in the name of copyright enforcement, they should also be open about what those speeds are. Google’s speed test may be useful and its M-Lab may play a role (M-Lab claims “Measurement Lab (M-Lab) is an open, distributed server platform for researchers to deploy Internet measurement tools. The goal of M-Lab is to advance network research and empower the public with useful information about their broadband connections. By enhancing Internet transparency, M-Lab helps sustain a healthy, innovative Internet.”). Hmm. I wonder whether Google’s foray into broadband will not only show the speeds easily but jump onto the ISP copyright enforcement bandwagon. I suppose that would be a consistent approach given the copyright/search results policy, but it may be one that starts to indicate that the alleged tech industry/online activist solidarity is, well, alleged.

SOPA, PIPA and some truth about activism

As folks start to claim they saved the Internet and rally for alleged ways to keep the Internet open for all, I want to call out something Rep. Issa said at Stanford in April. Step one, and to me the but-for moment, in stopping SOPA and PIPA was the security and CS community speaking (which was rare) about just how dangerous (“A potpourri of dumb things” – Issa at around 8:15) the bills were. Without that the activism probably could never have gotten in place. Furthermore, as I noted elsewhere, science can shift. Science is, by definition, amoral. If you build it, it will work. So expect the copyright industry to demand new things. Expect them to hire and fund studies about how to get what they want without using “A potpourri of dumb things.” And note that Google’s recent shift in approach regarding links and alleged pirate sites shows that things change.

This is not an apolitical moment. It is deeply political, but pretends that it is not about a power shift. When Internet and tech companies swear they are there for you, be skeptical. In some senses they are. Many folks I know at Google really are interested in serving users. Many are also scientists who will pursue, as they should, the truth of what is possible. The current bus-stop tour by Reddit’s co-founder, Alexis Ohanian, is political. Per the Washington Post, for him, “[T]he key issue is getting Internet openness on the minds and into the talking points of politicians in this election.”

What does openness mean? What are the politics of openness? Why do Facebook, Google, and Reddit want openness? South by Southwest looks like it may have a panel on disrupting DC. The description reads like an evangelical rally (a good tip that thought is replaced by faith). But to its big credit (except for saying the questions will be answered), the panel looks at some decent issues:

1. The Industrial Revolution brought about a political realignment that created the existing party system. Can the Internet do the same?
2. Beyond “openness,” what are the essential characteristics that define the Internet’s political identity? Market oriented or socially conscious? Libertarian or progressive? (Or all of the above?)
3. Politically, does the Internet most resemble an interest group (like big business or labor unions), a movement, or something we haven’t seen before?
4. Is Internet culture weakening partisanship — or making it worse?
5. Technology drives growth, but some say it also kills jobs. How do we make sure that the benefits of the Internet are widespread? Is there a consistent political viewpoint here among Internet activists, or does this break down along typical political lines?

I doubt one panel can tackle all these questions. Much will depend on the panelists and whether the panel is really open in that it has voices other than those who all agree. Nonetheless, one thing that is missing is a deeper look at the power structures and history that inform the issue. For example, the idea of realigning parties still relies on parties. And, there is an essentialism to Internet identity that is ironic at best and willfully blind and lacking irony at worst.

Have I abandoned my Google brothers and sisters? Oh perhaps, but I don’t think so. These questions were ones I raised while there. Some disliked them. Some took them seriously. The people I respected and loved the most pushed me to dig into these points. Like society, Google has many people with many views and agendas. That’s the point. With all companies and all people asserting truth, administer several grains of salt, reflect, (maybe add some lime and tequila first). For those wishing a good book on the problems with saying we know where we are going, check Professor Wendy Brown’s work, especially Politics Out of History.

Google Says “No, No” to Mr. or Ms. Pirate; What About Hate Speech?

Fred von Lohmann posted that Google has changed its algorithm. Now “it’ll start generally downranking sites that receive a high volume of copyright infringement notices from copyright holders.” The Verge reports that:

because its existing copyright infringement reporting system generates a massive amount of data about which sites are most frequently reported — the company received and processed over 4.3 million URL removal requests in the past 30 days alone, more than all of 2009 combined. Importantly, Google says the search tweaks will not remove sites from search results entirely, just rank them lower in listings. Removal of a listing will still require a formal request under the existing copyright infringement reporting system — and Google is quick to point out that those unfairly targeted can still file counter-notices to get their content reinstated into search listings.

The data-driven basis makes sense to me. So what other areas could be monitored and adjusted? I disagree with the idea that search engines should take on policing roles for certain speech that Danielle Citron and others have urged. But this shift may open the door to more arguments for Google to be a gatekeeper and policer of content. Assuming enough data is available, Google, or any data-driven service, could make decisions to include or exclude entries (or shift ranking). Those moves already happen. But the difficult question will now be why it acts on some issues but not others. James Grimmelmann has a work in progress on search and speech that gets into this question. I believe the algorithm issues still control. Nonetheless, by nodding to the copyright industry, Google may be opening the door to further calls to be the Internet’s gatekeeper. Of course, if it does that, others will attack Google for doing just that from competition and other angles.

Automated Arrangement of Information: Speech, Conduct, and Power

Tim Wu’s opinion piece on speech and computers has attracted a lot of attention. Wu’s position is a useful counterpoint to Eugene Volokh’s sweeping claims about 1st Amendment protection for automated arrangements of information. However, neither Wu nor Volokh can cut the Gordian knot of digital freedom of expression with maxims like “search is speech” or “computers can’t have free speech rights.” Any court that respects extant doctrine, and the normative complexity of the new speech environment, will need to take nuanced positions on a case-by-case basis.

Digital Opinions

Wu states that “The argument that machines speak was first made in the context of Internet search,” pointing to cases like Langdon v. Google, Kinderstart, and SearchKing. In each scenario, Google successfully argued to a federal district court that it could not be liable in tort for faulty or misleading results 1) because it “spoke” the offending arrangement of information and 2) the arrangement was Google’s “opinion,” and could not be proven factually wrong (a sine qua non for liability).