Law Profs Who Code
posted by Paul Ohm
Law professors who write about the Internet tend to develop facts through a combination of anecdote and secondary-source research: information about the conduct of computer users, the network’s structure and architecture, and the effects of regulation on innovation is intuited, developed through stories, or recounted from others’ research. Although I think a lot of legal writing about the Internet is very, very good, I’ve long yearned for more “primary source” analysis.
In other words, there is room and need for Internet law scholars who write code. Although legal scholars aren’t about to break fundamental new ground in computer science, the hidden truths of the Internet don’t run very deep, and some very simple code can elicit some important results. Also, there is a growing cadre of law professors with the skills needed to do this kind of research. I am talking about a new form of empirical legal scholarship, and empiricists should embrace the perl script and network connection as parts of their toolbox, just as they adopted the linear regression a few decades ago.
I plan to talk about this more in a subsequent post or two, but for now, let me give some examples of what I’m describing. Several legal scholars (or people closely associated with legal scholarship) are pointing the way for this new category of “empirical Internet legal studies”.
- Jonathan Zittrain and Ben Edelman, curious about the nature and extent of filtering in China and Saudi Arabia, wrote a series of scripts to “tickle” web proxies in those countries to analyze the amount of filtering that occurs.
- Edelman has continued to engage in a particularly applied form of Internet research; see, for example, his work on spyware and adware.
- Ed Felten—granted, a computer scientist not a law professor—and his graduate students at Princeton have investigated DRM and voting machines with a policy bent and a particular focus on applied, clear results. Although the level of technical sophistication found in these studies is unlikely to be duplicated in the legal academy soon, his methods and approaches are a model for what I’m describing.
- Journalist Kevin Poulsen created scripts that searched MySpace’s user accounts for names and zip codes that matched the DOJ’s National Sex Offender Registry database, and found more than 700 likely matches.
- Finally, security researchers have set up vulnerable computers as “honeypots” or “honeynets” on the Internet, to give them a vantage point from which to study hacker behavior.
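To give a sense of how little code some of these projects need at their core, here is a minimal Python sketch of the kind of record matching described in the Poulsen example: comparing two lists on name and zip code. It is not the code used in that or any other project listed above. The function names, field names, and records are all invented for illustration, and a real study would also need data collection, much more careful name normalization, and manual verification.

```python
# Illustrative sketch only: not the code used in any project mentioned above.
# All names, fields, and records below are invented for this example.

def normalize(name: str) -> str:
    """Lowercase a name and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def likely_matches(profiles, registry):
    """Return the profiles whose (name, zip code) pair also appears in the registry."""
    registry_keys = {(normalize(r["name"]), r["zip"]) for r in registry}
    return [p for p in profiles if (normalize(p["name"]), p["zip"]) in registry_keys]


# Hypothetical user profiles and hypothetical registry entries.
profiles = [
    {"name": "Pat  Example", "zip": "80302"},
    {"name": "Sam Sample", "zip": "10001"},
]
registry = [
    {"name": "pat example", "zip": "80302"},
    {"name": "Sam Sample", "zip": "94110"},
]

matches = likely_matches(profiles, registry)
print(matches)
```

A shared name and zip code is only a likely match, not a confirmed one, so the output of a script like this is a starting point for human review rather than a finding in itself.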
What are other notable examples of EILS? Let’s keep with the grand Solovian tradition, and call this a Census. Is this sub-sub-discipline ready to take off, or should we mere lawyers leave the coding to the computer scientists?
February 20, 2007 at 9:28 am
Posted in: Empirical Analysis of Law
Responses (11)
James Grimmelmann - February 20, 2007 at 11:00 am
The field began with a debacle: the Rimm cyber-porn study.
I’d add two more virtues of law-profs who code. The first is that they can often build things that they think ought to exist. That’s a great way to translate ideas into practice that often complements or substitutes for directly “legal” activities. The second is that getting inside the process of programming is a great BS detector for examining claims about Internet technologies. There are aspects of programs and programming that are hard to grok without hands-on experience coding.
gr - February 20, 2007 at 11:11 am
I don’t know who wrote the code, but Thomas Smith’s ‘Web of Law’ (on SSRN) could be on there. It’s empirical and network-ish, but not about the internet. Rather, it’s taking internet-type concepts and applying them to citations.
Eric Goldman - February 20, 2007 at 11:23 am
Geist and Kesan have both done good work doing empirical studies of the domain name system. Noveck has also done a number of interesting coding projects at her “dotank.” Eric.
Greg Lastowka - February 20, 2007 at 11:34 am
Paul — Just my opinion, I’ve felt that learning to code has helped me most by giving me an instinct for avoiding common mistakes made in software policy debates. It helps you see below the surface of what you’re observing.
But with regard to law profs scripting (not just understanding it) — above you have examples of how scripts can be a useful tool for gathering data. But gathering data is essentially gathering relevant facts, right? So I’m curious: is your model of “Law and Scripting” more or less a sub-category of empirical/descriptive research? Or are you suggesting there is some scholarly benefit found in the process of scripting the code to gather the data? And if it is the latter, what is the benefit?
Maybe you’ll address this in future posts?
mmmbeer - February 20, 2007 at 11:52 am
What about lawyers who code? I’m a Comp. Sci. major and was a web-application jockey before and during law school. Now I do lots of transactional intellectual property work (e.g. those license agreements no one else reads).
I find contract drafting to essentially be another type of programming language.
Paul Ohm - February 20, 2007 at 12:18 pm
Thanks for the comments, all. I should have mentioned Geist’s, Kesan’s and Smith’s work, which I have read and admired. I don’t know about the “dotank,” but I plan to learn.
Greg, thus far, most of the projects we’ve described do little more than gather information, and EILS (I’m determined to coin a new acronym, here!) is at the very least a nascent branch of empirical legal studies.
But along with the virtues you and James cite, there are reasons why I think Law Profs will benefit from writing the code themselves, rather than waiting for someone else or outsourcing the work to others, aside from merely the potential efficiency gains.
For one thing, nobody is writing code to answer some types of policy-laden questions. They just aren’t the kind of questions that interest Computer Scientists or professional programmers, for example.
I also think that some of the research we’ve discussed–Felten’s Sony Root Kit work is the best example–begins to cross the line between mere info gathering and full-blown scientific method-based inquiries. Thanks to the generativity of PCs and code (see Zittrain) and the openness of the Internet, lawprofs have the opportunity to engage in the “science” of this field more so than in other fields. I’ll try to flesh all of this out in a later post.
Finally, I don’t know much about the historical evolution of ELS, so let me turn your question around and ask, why do lawprofs run their own regressions? Is it “more or less a sub-category of empirical/descriptive research, or [do those professors] suggest[] there is some scholarly benefit found in the process of” running ANOVAs themselves?
Ben Edelman - February 20, 2007 at 1:40 pm
Paul,
Interesting post. Thanks. For some time I thought about my work in much the spirit that you describe EILS, so it’s particularly nice to see this set out so clearly.
Greg, I find it’s helpful to do my own coding, my own data collection, and my own empirical work (be it ANOVA or otherwise) because I have greater flexibility doing this myself than in asking others to do it for me. I can’t imagine having to send out documents to be professionally word-processed; I’m far more efficient just sitting down with a text editor and revising on my own. I have the same view about writing software. Separately, I also share James’s view of coding as “BS detector”; certainly I’ve found plenty of lies and half-truths through code-assisted data collection.
I do like the idea of making a list of law professors who code. But I wonder how long it would be if you were strict about it. Insist on 1) people who were/are actual faculty members at a law school, and who 2) wrote their own code (not just supervising others). I’m not on that list because I’m not a law professor. Neither are Felten and Poulsen. So far as I know, Zittrain hasn’t written his own code in recent years. Still, there may be some folks truly in the intersection of these sets. It’s often hard to know who writes their own code, versus hiring others to do it for them — though in principle the organizing principle could be the EILS spirit, even if the law professor doesn’t actually do the data collection & coding personally.
Ben
Bruce Boyden - February 20, 2007 at 1:41 pm
Almost more than empirical studies, I think explaining how this stuff works, in detail but in a non-arcane way, would be a tremendous service. E.g., I still only have a foggy sense of how internet traffic is routed.
greglas - February 20, 2007 at 3:14 pm
Paul> Is it “more or less a sub-category of empirical/descriptive research, or [do those professors] suggest[] there is some scholarly benefit found in the process of” running ANOVAs themselves?
I’m afraid my answer is that the ANOVA is the former, not the latter. Running regressions may teach you statistics, but I don’t think there is much payoff for the average law prof from regularly utilizing that skill–apart from (as Ben mentions) the flexible control it affords over an empirical project that might have to be done in a more cumbersome way.
Though, perhaps going through the formal rigor of statistical regressions might be a healthy exercise for a legal formalist!
Paul Ohm - February 20, 2007 at 4:04 pm
Greg says: Running regressions may teach you statistics, but I don’t think there is much payoff for the average law prof from regularly utilizing that skill–apart from (as Ben mentions) the flexible control it affords over an empirical project that might have to be done in a more cumbersome way.
That’s what I thought you’d say. I never intended to make any grand statements about the scholarly benefits that would redound to those who code. My original post was mostly about digging up information that is very hard to dig up.
I would, however, characterize this particular virtue as more than mere “flexible control,” but only because of the baseline of where we are with this kind of research. There is A LOT of data we don’t have access to today, because nobody (not the sociologists, not the economists, not the computer scientists) has spent much time looking for it. People ran regressions long before lawprofs “discovered” them. I think this situation is different–maybe different in degree not kind–but different.
Of course, what I’ve just said doesn’t separate the coders from the “people with money to give to coders,” but I have the strong feeling that for boring, practical reasons having to do with resources and priorities, the coders are the only ones likely to dig up this information, at least in the near term.
As to whether there’s a deeper, more fundamental benefit here, I think there is, but your comments, Greg, are forcing me to think more deeply about it before trying to convince you…
Michael Froomkin - February 21, 2007 at 11:21 am
Hmm. I’m not sure how I fit on your survey. I was a programmer before I was a lawyer and still write little scripts now and then (and do PHP / Perlish stuff for web pages).
Coding not only helped me understand what I was writing about when doing internet law work, but even before that it imposed a discipline of thinking about logic flow that I still find helpful in all legal work.
But I don’t code stuff up for empirical studies. I do try to publish useful sites, from time to time, but that’s not exactly coding either.