Taking on the Known Unknowns
posted by Frank Pasquale
As information proliferates, I predict that secret algorithms are going to play a much larger role in our society. That’s one reason I joined Oren Bracha to write a piece on the proper limits of secrecy in dominant search engines’ operations.
Silicon Valley prawf Eric Goldman is a lot more optimistic about the role of search engines in society, and he suggests that Oren and I are digital Don Quixotes:
While illegitimate suppression is an analytically interesting issue (not dissimilar to a situation where a newspaper editorial team is out to “get” a particular individual or company), the paper doesn’t offer much empirical evidence showing search engines actually engage (or have engaged) in the behavior that the paper seeks to redress. Thus, the paper may build a strong theoretical construct to attack a non-existent practice.
Oren has responded, and I have a bit of a “surreply” to Eric’s response to Oren.
First, here is an excerpt of Oren’s defense of our attempt to address the “known unknown” of search manipulation (which Goldman graciously posted):
Do search engines engage in targeted discrimination? . . . In fact, it is nearly impossible to know. This is an important part of the problem and this is our point! Search Engines’ practices form a “black box” nearly inaccessible to others.
The basic technology is asymmetric: the search engine has all the information about its practices and the user or listed entity has almost none. As a matter of business practices, search engines are [justifiably] very secretive . . . . The law too plays its part in creating this veil of secrecy. Trade secret law affords protection to such information. [Courts' tendency to dismiss] any cause of action in regard to manipulation minimizes the prospect that the information will ever be divulged.
The net result is an opaque veil which is very hard to pierce. We simply don’t know what search engines do. This is part of a broader, and to my mind, troubling phenomenon of “the closed hood society” where areas of information crucial to public policy are shut away behind a secrecy wall.
To put it in fancy legal theory terms, there is a classic article by Felstiner, Abel and Sarat called “The Emergence and Transformation of Disputes: Naming, Blaming, Claiming.” It explains how before a legal dispute can ever arise there have to be conditions that allow claimants to recognize (name) the wrong, associate it with the culprit (blaming), and claim redress. In the current black box world of search engines there are some first signs of naming, but blaming and claiming is almost impossible.
Our first important point in the paper is that, while there are some legitimate reasons for search engines’ secrecy, someone has to be allowed to peer into the black box and scrutinize their practices, to ascertain whether there is a problem. Toward this end we discuss several mechanisms that can balance search engines’ legitimate interest in secrecy and the need to look inside the black box.
* * *
Concentrated power is suspicious. Concentrated power that operates in the dark is even more suspicious.
Goldman then responded:
Do a global replace of the words “search engine” below with “newspaper editor” and does your analysis change one bit? If not, what implications for your article? In other words, do you know what method or criteria your newspaper editors (or any other print publisher) decide which stories to cover, how many column-inches to give stories, and where to place them in the newspaper? (i.e., what stories go on the front page with a big headline; what stories get reduced to a “notes” section of a paper)? If you don’t know, does it matter? I think explaining how/why search engine placement decisions differ from newspaper editorial decisions will make explicit a set of key assumptions about the ways that people “consume” media and the nature of trust we as media consumers repose in our media intermediaries.
This reminds me of my post on the proper precedent for search engine regulation. Here is Oren’s reply:
You ask: “Do a global replace of the words ‘search engine’ below with ‘newspaper editor’ and does your analysis change one bit?” The answer is yes and no.
“No” because, as we explain at length in the paper, the search engines as intermediaries debate is really a reincarnation of the familiar debate about the mass media. In the early days of the Net it was common to think and argue that the Internet with its “democratizing” effect would do away with the intermediaries and solve all the problems of the mass media system. . . .
It turns out that search engines are the new intermediaries that replicate many of the difficulties raised by the old ones. Thus the debate over search engines has the same general patterns as that over traditional mass media. However, there are differences of degree and nuances between the two that may convince in the search engines context even some that were concerned by mass media but were ultimately unconvinced that anything should be done.
Which brings me to the “yes” or to some of the differences between the two contexts.
First, one important difference seems to be concentration. Traditional mass media is concentrated enough, but I don’t think it is as concentrated as the search engine market (especially if the yardstick is newspapers). I would be much less worried in the absence of a picture in which a handful of titans control the bulk of the market. The point is not merely the existence of gatekeepers, but the fact that a very small number of them control huge chunks of the market. In this respect search engines seem, at the moment, worse than the mass media.
Second, it is somewhat disingenuous for search engines to claim that they are just like newspaper editors. In other contexts search engines strongly maintain that they are merely conduits and not media outlets. They need that in order to justify sweeping immunity as in the case of DMCA 512 and CDA 230. But they cannot have it both ways. Search engines cannot be “just conduits” for purposes of immunities and “media outlets” when it comes to regulating their discriminatory practices. [emphasis added] If you ask me they are right when they claim to be closer to mere conduits. Of course, both conduits and media outlets have a gatekeeping element, but search engines as conduits seem to be located at a deeper layer of the system. Also they are and are perceived as less associated with the content toward which they channel people.
Third [are] the first amendment implications. It’s impossible to develop here the full array of arguments for and against first amendment protection to search engines’ discretion/discrimination, but one point is directly related to their distinction from more traditional media. Even if one starts from the (controversial but good law) Tornillo premise that interfering with the media’s absolute discretion about what to carry would be prohibited forced speech, that does not mean that there are good reasons to apply this rule to search engines. The common rationale of Tornillo and all its various extensions was that the “carrier” is associated with the speech or content. However, because search engines, unlike a newspaper, for example, are generally perceived and experienced as conduits rather than speakers or media outlets, they are simply not associated with the content toward which they point people. Most people do not associate Google with the content they find using its search engine, just as they do not associate the content of a telephone conversation with their telephone service provider. Hence, the Tornillo line of cases is distinguished and there is less doctrinal and substantive reason to shield search engines’ discretion.
The list could be extended, but I’ll stop here. In short, despite the general common patterns, there are nuanced important differences between search engines and other more traditional media. Most of those differences, I think, point in the direction of a stronger justification to impose some scrutiny on search engines’ discretion to discriminate and manipulate.
And here’s a final response from Eric:
I think your weakest argument is that people *think* of search engines as conduits. Even if this is true, I guarantee that in the near future people will realize that search engines are active content mediators. When this consumer perception changes, you have a tough time distinguishing Tornillo. Also, assuming you read the newspapers, does it bother you that you don’t know how your editors make decisions?
It’s also pretty weak to argue that newspapers in the 1970s (i.e., at the time of Tornillo) were less concentrated than the search engine market today, both as a matter of local market concentration and regarding switching costs/procuring substitutes.
Okay, is there anything left to say? Here’s a little “surreply” to Eric from me:
1) We’ve got to take concentration in the search industry seriously. One of the commenters on our article claimed we were too conciliatory on that issue, and that in fact an HHI (the Herfindahl-Hirschman Index, a standard measure of market concentration) of over 3000 prevails. Though I’m happy about developments in “social search,” I still think that the fundamental fodder for that industry will come from the dominant search engine(s). We give several reasons in the paper for thinking that industry will remain concentrated. We could also add another suggested by the same commenter (who has a good understanding of some mechanics at MS-Search and Google): “the Web is huge, and merely indexing even a respectable portion of it will require a huge server farm.”
2) As for our tolerating ignorance of the way that newspapers make editorial decisions: here’s the difference. There are hundreds of newspapers out there, and we can perform good studies of, say, front-page news in order to figure out how some might be biased. For example, if the Daily X never reports on Iraq on the front page, and everyone else does, it’s easy for us to perceive its bias relative to some baseline of coverage. You can’t really do that for search engines. The only studies I’ve seen are based on four search engines, and I would think it’s pretty difficult to “reverse engineer” any proof of bias in that situation.
3) Google makes a lot of assurances about the quality of its searches–whatever happened to “trust but verify”? Consider this comment on Google’s marketing for the Bourne Ultimatum:
Google (who once said they “never manipulate rankings to put [their] partners higher in [their] search results” and that their “ads are created and managed under the exact same guidelines, principles, practices and algorithms as the ads of any other advertiser”) got paid with “goods” as they received product placement in the Bourne Ultimatum: as part of the collaboration which convinced Google it’s OK to rent out their results space, one movie scene shows a Google search taking place.
Is this “stealth marketing?” I don’t know, and I acknowledge that you’ve argued that secret marketing agreements are not as big a problem as is commonly thought. However, I think consumers deserve some protection. They might gain confidence about the fairness and accountability of search engines if they know that a neutral third party can on occasion verify the numerous representations that Google makes about its services. Responsible commentators now consider some baseline rules like the SEC’s a bulwark of the securities industry, and perhaps some day a similar commission will protect public confidence in the search industry.
4) As I have expressed to someone who’s worked on TM issues for Google: wouldn’t the regulatory option in fact help the search engines in some ways? Consider what you’ve said regarding the Rhino Sports v. Sport Court case, where “Sport Court . . . issued a subpoena to Google requesting” many records:
Maybe I’m missing something big, and maybe I’m not aware of how common these types of subpoenas to search engines are, but this data sounds like it would have significant competitive value. At minimum, I suspect every trademark owner and SEO would LOVE to have this data. This data may be yours for the price of a complaint and a subpoena–Google appears to be complying with the subpoena without a fight.
If the search engine is truly worried that manipulative SEOs might abuse this data, wouldn’t the type of regulatory option that we propose, with significant secrecy safeguards in place, be a better centralized option for scrutiny of search engine practices than hundreds of court cases scattered around the country?