Networks, Crowds, and Markets (first tip: Crowds Are Not So Wise)

Deven Desai

Deven Desai is an associate professor of law and ethics at the Scheller College of Business, Georgia Institute of Technology. He was also the first, and to date, only Academic Research Counsel at Google, Inc., and a Visiting Fellow at Princeton University’s Center for Information Technology Policy. He is a graduate of U.C. Berkeley and the Yale Law School. Professor Desai’s scholarship examines how business interests, new technology, and economic theories shape privacy and intellectual property law and where those arguments explain productivity or where they fail to capture society’s interest in the free flow of information and development. His work has appeared in leading law reviews and journals including the Georgetown Law Journal, Minnesota Law Review, Notre Dame Law Review, Wisconsin Law Review, and U.C. Davis Law Review.

8 Responses

  1. A.J. Sutter says:

    Apropos of “the state prices actually do converge to the true probabilities as the size of the crowd grows”: This is a somewhat misleading statement, both in the original book and in the deracinated context presented in the post.

    What are “true probabilities”? This phrase glosses over some serious issues about the meaning of probability. For a frequentist interpretation — which many mathematicians would hold is the only one constitutive of “true probabilities” — one would have to have a series of repetitions of the same field of horses competing against each other under the same conditions.

    Horse races rarely qualify as repeatable events of this sort. Not only does the race depend on the field, but also on the weather, the condition of the track, the physical condition of each horse, the jockey and his physical condition, etc., etc. For such a situation one could estimate probabilities, e.g. by the Bayesian method (which is what many bettors in effect do, based on prior data in the Racing Form or similar data sheet), by intuition (maybe more common at the track), etc. It’s not appropriate to call these estimates “true probabilities” — they’re just estimates.

    To have “true probabilities” pertinent to a horse race under a frequentist understanding, it wouldn’t be enough to show that there are only a few identifiable and repeatable conditions that affect the outcome of a race. One would then also have to actually repeat the races, holding those conditions constant, to derive the probabilities. Obviously, this doesn’t happen too often. (To clarify, I’m assuming — and the context of Easley & Kleinberg’s discussion of betting justifies the assumption — that we’re talking about the probability of a horse winning a race before it’s been run; obviously, after the race the “true probability” of the horse’s victory is either identically 0 or identically 1.)

    E&K ignore these issues, and in the passage quoted above simply assume the existence of a “true probability” of a horse winning a race, without offering any explanation of what this might mean. They then show by algebra (or actually, assert a certain result of algebra that occurs offstage) that a big enough crowd’s opinions converge “to the truth.” This may be correct purely in a tautological sense, if we accept their unjustified assumption. But when applied to the real world, it’s rhetorical sleight of hand. Without a “true probability” in the horse race scenario, there isn’t any basis for asserting that “truth” exists. So even under the qualifications they mention, they’re overstating the “wisdom” of crowds. Unfortunately, the strong positive connotations of words like “truth” and “wisdom” make these sorts of careless arguments very manipulative.

  2. Ted Sichelman says:

    Deven–Thanks for the heads up on this book. I wholly agree it’s the type of work we in the legal academy should be reading to gain a fuller understanding of the various buzz-concepts we throw around.

    A.J.–Your comment on “true probabilities” seems more a quibble with terminology than a substantive criticism of the methodology here. The notion relates not to whether we can in actuality repeat the horse race under the same conditions. Rather, it concerns how often each horse would win if, in some ideal world, we were to repeat the horse race of interest infinitely many times under the same measurable conditions (assuming the immeasurable conditions play some role in the outcome). In other words, the usage of “true probability” in this context does not turn on whether, philosophically speaking, a “true probability” exists, any more than a discussion of drawing a circle in the sand depends on whether a “true circle” can physically exist. So while I agree there may be a terminological concern with the specific phrase “true probability,” the concept seems clear enough in context, and I don’t see how the phrase lends itself to “manipulation,” at least for those readers who understand the appropriate contextual usage.

  3. A.J. Sutter says:

    Ted, it’s not a quibble at all: it’s very much a criticism of both the methodology, which is GIGO, and the motivation, which is not only to vindicate the “wisdom” of crowds but to support the supposed predictive power of markets (@704).

    E&K use the expression “true probability” in a specific sense: namely, that there is a well-defined probability a that, in their example, horse A will win a race (see section 22.10). They call a a “true probability,” and assert that this is learnable through observance of repeated trials of independent races, using Bayesian learning. In their example, they don’t address the issue of how to define events so that you can tell when they are being repeated — they merely take it for granted that one race is sufficiently like another. (In 22.10, they posit that A is always racing against the same one horse, B, which is already so oversimplified from real life as to be useless.)
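The kind of Bayesian learning referred to here can be illustrated with a minimal sketch. It assumes exactly what the comment disputes: that every race is an independent repetition of the same event with a fixed win probability for horse A. The Beta prior, the function name, and the parameters are hypothetical choices for illustration and are not taken from the book.

```python
import random


def posterior_mean_after_races(true_a, n_races, seed=0,
                               prior_wins=1.0, prior_losses=1.0):
    """Beta-Bernoulli update of a belief about P(horse A wins).

    Assumption (hypothetical, for illustration): each race is an
    independent draw in which A wins with fixed probability true_a.
    """
    rng = random.Random(seed)
    wins, losses = prior_wins, prior_losses  # Beta(1, 1) is a uniform prior
    for _ in range(n_races):
        if rng.random() < true_a:
            wins += 1      # observed a win for A
        else:
            losses += 1    # observed a loss for A
    # Posterior mean of a Beta(wins, losses) distribution
    return wins / (wins + losses)


if __name__ == "__main__":
    for n in (0, 10, 1000, 20000):
        print(n, round(posterior_mean_after_races(0.6, n), 3))
```

Under these assumptions the posterior mean drifts toward `true_a` as races accumulate. The sketch says nothing about whether real races satisfy the repeatability assumption it is built on.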

    So one question is, under what conditions do two horse races (even with the same horses involved) constitute repetitions of the same event? And what about a more realistic horse race, with many different horses in the field on each occasion: is this ever repeatable? In the realistic case, a probability for A’s winning the race cannot be defined even in the ideal sense (e.g. as a number that we might not be able to know). The definition of probability of an outcome of an event requires repeatability of the event. And not just in the frequentist view: E&K’s own Bayesian learning example in 22.10 is based on the observation of repeated races. If a real-world horse race at a track is not repeatable, then the probability of a given outcome of the race is undefined, not a known, knowable or even unknown number.

    (The probability of horse A winning a horse race has a very different ontological status from a circle. A circle, even in an ideal sense, can be defined. Husserl would even argue that circles exist only in an ideal sense, so in effect all circles are “true” circles. And as you know, departures from true circularity are measured all the time, e.g. planetary oblateness and orbital eccentricity, just to name some astronomical examples. But the probability of an outcome of an unrepeatable event is a non sequitur, and doesn’t exist in any “true” sense.
    Of course, people estimate “probabilities” of unrepeatable events all the time, such as what are the odds of the euro collapsing, a CDO tanking, Ron Paul winning Iowa, etc., and they base their actions on those estimates. This is an apt description of what goes on at the track, too. But these are subjective estimates, and in context, most people would acknowledge that, and wouldn’t call them “true” in any epistemic sense other than as a description of their judgment.)

    Since E&K assume without justification that the races are repeatable events, we’ve already got “garbage in.” But their argument is even sloppier: it plays with loaded dice. In the passage leading up to Deven’s quote (@700-703), they talk mostly about an individual bettor’s estimated probability, e.g. a_i is bettor i’s estimate of the probability that horse A will win. They then suppose (see quote above) that “the opinions in the crowd about the probability of horse A winning are independently drawn from a distribution whose mean is equal to the true probability of horse A winning.” But why should this be the case? Even if a “true probability” existed, why should it just happen to be the mean of the distribution of opinions? This is just question-begging — the crowd is wise because the authors already supposed it to be when they defined the problem. This constitutes both “garbage in” and “garbage out”.

    I leave aside the independence issue, which Deven rightly highlights. E.g., suppose 17th Century Massachusetts townsfolk placed bets on the outcomes of a series of witch-dunkings. Would their opinions converge to the “true probability”? If not, might there be an independence issue? And if that issue exists in that example, why not also in connection with presidential prediction markets, equity markets, etc.?
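How much the quoted supposition carries can be seen in a small simulation. This is a sketch under stated assumptions, not E&K's model: each bettor's opinion is the assumed true probability plus individual noise, and optionally plus an error shared by the whole crowd (a crude stand-in for opinions that are not independent draws centered on the truth). The function name and all parameters are hypothetical.

```python
import random


def crowd_average(true_p, n_bettors, shared_bias=0.0, noise=0.2, seed=0):
    """Average opinion of a simulated crowd about P(horse A wins).

    Assumptions (hypothetical, for illustration): a true probability
    true_p exists; each bettor adds independent uniform noise; every
    bettor also carries the same shared_bias.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_bettors):
        opinion = true_p + shared_bias + rng.uniform(-noise, noise)
        total += min(1.0, max(0.0, opinion))  # keep the opinion inside [0, 1]
    return total / n_bettors


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        independent = crowd_average(0.5, n)
        biased = crowd_average(0.5, n, shared_bias=0.2)
        print(n, round(independent, 3), round(biased, 3))
```

With no shared error, the average settles near `true_p` as the crowd grows. With a shared error, it settles near `true_p + shared_bias` instead, and adding bettors does not remove the gap. The convergence comes from the assumption about the distribution, not from the size of the crowd.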

    I also leave aside the issue of how could you ever empirically test E&K’s claim that the results converge to a “true probability.” After all, you can’t know a a priori — you’ve got to run lots of independent trials to determine it. You can run lots of horse races and compute a number, but this begs the question of whether it’s appropriate to consider all those races repetitions of the same event. The “truth” of the “true probability” then becomes tautologous: you’re just saying that whatever number you compute based on your assumptions is “true.” But if “true probabilities” really are reifiable in the same way as circles, you should have some independent way of determining them. Otherwise call them “truthy”, at best.

    So now the problem of manipulation (more “garbage out”): You may be correct, Ted, that the authors don’t care whether or not “philosophically speaking, a ‘true probability’ exists.” My point: they’re wrong not to care. Their rhetorical point is that somehow the “crowd” arrives at the “truth”. For realistic examples of unrepeatable events, this is an abuse of the word “truth,” and therefore is misleading about the power of “crowds.” Moreover, the authors’ question-begging leads them to claim that the market is “predicting” something, when the definition of the textbook problem has already constrained “the market” to the desired result. This flimflammery and generalization from oversimplified examples are in support of an ideology it’s hard to imagine anyone retaining after 2008:

    Our overall conclusion, that the market selects for the trader with the most accurate beliefs, and asymptotically prices assets according to these beliefs, applies equally well in other settings such as prediction markets. … [T]his idea draws on a long history of economic arguments for market efficiency based on natural selection [11, 157, 172], in which smarter traders come to hold an increasingly large fraction of the wealth in the market, and thereby exert an increasingly large influence on the market.[@729-730]

    After reading this, some Bayesian learning leads to my next bet: we’re in for many more financial crises in the future. And if the legal academy thinks it needs to catch up with the idea of the Efficient Market Hypothesis, then in the words of Basque anthropologist Julio Caro Baroja, we might as well “put out the lights and go home.”

  4. Ken Arromdee says:

    By this reasoning, if I wanted to compute “probability of getting heads when flipping a coin at ____ time on ___ day” I could not do it, since the act “flipping the coin at this particular moment” is not repeatable, and I don’t know whether any confounding factors that exist at this particular moment might exist at another time.

  5. Ken Arromdee says:

    They then suppose (see quote above) that “the opinions in the crowd about the probability of horse A winning are independently drawn from a distribution whose mean is equal to the true probability of horse A winning.” But why should this be the case? Even if a “true probability” existed, why should it just happen to be the mean of the distribution of opinions?

    They’re not saying that. They’re saying that it’s the mean of the distribution, and that the opinions are drawn from the distribution. That is not the same as saying that it’s the mean of the distribution of opinions.

  6. A.J. Sutter says:

    Ken@4: You could have established that the coin is “fair” by a series of prior trials — the notion is that the confounding factors will average out. (OTOH, you could indeed maintain that this is not possible at all, and claim that all probability is subjective, and the “true” values don’t exist.) In any case, specific social events like a specific horse race (other than in textbooks) or a specific presidential election don’t involve analogues to “fair coins.” In practice, people ignore confounding factors, but choosing to ignore them isn’t the same as showing that the confounding factors even out. If that were true, it might still be 2006, so to speak.
    @5: Yes, thanks for the correction. My argument’s objection then collapses (i) to the issue of independence, in the case of a repeatable event, i.e., why is the supposition of independence reasonable in real life, and (ii) to the non sequitur issue, in the case of an unrepeatable event, i.e. “true probability” is undefined.

  7. Ted Sichelman says:

    Thanks for the detailed response, A.J. While I agree that we can measure deviations of “real” circles from an “ideal” circle but cannot do the same for probabilities of unrepeatable events, I still think one can suitably define a “true” probability in this context (in the manner I did in my previous post). In that sense, I still hold to my general comment about the similarity between “ideal” circles and “true” probabilities.

    Namely, determine the set of measurable characteristics with respect to a given event (e.g., horse race) and an observer, such that there is some residual uncertainty (that depends on the immeasurable characteristics, or inherent randomness) as to a given outcome (e.g., winner of the race) of the given event. Based on this single event, keep the measurable characteristics constant and allow the immeasurable characteristics to vary over an infinite series of events (or simply assume inherent randomness over the series of events). Although this cannot be done in practice, we can imagine an ideal world in which we can do so. The “true” probability of a given outcome is the percentage of times this outcome results. So while we don’t have an ideal definition like that of the circle, we have one that seems sufficiently coherent in this context.
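One way to write this definition down is as a limiting relative frequency. The notation is an assumption introduced here for illustration: x stands for the measurable characteristics held fixed, O for the outcome of interest, and the indicator equals 1 when trial k (run with characteristics x) results in O and 0 otherwise.

```latex
p(O \mid x) \;=\; \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \mathbf{1}\bigl[\text{trial } k \text{ results in } O\bigr]
```

The definition presupposes that the limit exists and that the trials differ only in the immeasurable characteristics, which is the point in dispute in this thread.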

    Now, the question becomes whether the views of the crowd get us closer to the ideal true probability defined as such. In situations where we cannot repeat a particular event (holding measurable conditions constant) a suitable number of times, then there is no direct empirical test of the claim in any robust sense. However, if we have independent reasons to believe that the measurable conditions of the specific event remain fairly constant from similar event to similar event, then we can roughly compare the predictions of the crowd to those of individual “experts” empirically (though I agree there is no a priori reason to believe crowds will be any better than individual “experts” at predicting outcomes or that markets will select “accurate traders” in the long run, and any assumptions otherwise are “garbage”).

    Of course, those independent reasons to believe the measurable characteristics remain fairly constant from event to event may be wholly suspect, which casts into doubt the entire enterprise. However, all of scientific theory rests upon these sorts of assumptions. For instance, it may very well be the case that the sun rises every day for reasons completely unrelated to the laws of gravity as we know them, but we infer from _non-scientific_ knowledge that we have good reason to believe otherwise. Of course, keeping conditions sufficiently constant for most social phenomena is incomparably more difficult than for physical phenomena, and it may be that some proponents of prediction markets have generally ignored potentially confounding factors. Nonetheless, it seems that the concepts involved are defined well enough to test these claims empirically for a broad and important set of social phenomena.

  8. A.J. Sutter says:

    Thanks again, Ted. I think I can boil my reply down to some concise points:
    1. I don’t have any objection to comparing predictions of a crowd to predictions of “experts” — other than thinking that this comparison will yield “truth”. Cf. the view of the Bayesian mathematician and game theorist Leonard Savage, that all probabilities are subjective.
    2. While we may be able to imagine an ideal world in which we can determine measurable characteristics plus a residual, the question is whether this is any closer to our real world than, say, Vulcan or the Planet of the Apes would be. The connotations of the word “residual,” for example, may be misleadingly reassuring that the really important stuff has already been measured. In any case, the more loosely you characterize the event that is being repeated, the less meaningful are the results of your calculation, and the more danger there is that you’ll come up with a GIGO interpretation.
    3. Husserl, in The Crisis of the European Sciences and his essay “The Origin of Geometry,” critiques far more eloquently than I can the notion that manipulations in the ideal world have any bearing on what happens in the real world (“life-world”). (He wasn’t just talking about social phenomena, but physical ones as well.) The method you propose is a textbook case of what he was so riled about. The ascription of “truth” to such manipulations was the essence of the “crisis” he was writing about (in Germany in 1938).
    4. The financial crisis of 2008 is another perfect example of the hazards of reifying probabilities (the Efficient Market Hypothesis, “risk management,” Black-Scholes, etc. etc.). That E&K should still be defending the EMH may mean they’re suffering from cognitive dissonance, but my New Year’s hope is that their readers will be more critical.