Should Empirical Legal Scholars Have Special Responsibilities?
posted by David Schwartz
Before delving into the substance of my first post, I wanted to thank the crew at Concurring Opinions for inviting me to guest blog this month.
Recently, I have been thinking about whether empirical legal scholars have, or should have, special ethical responsibilities. Why special responsibilities? Two basic reasons. First, nearly all law reviews lack formal peer review. The absence of peer review allows dubious data to be reported, without differentiation, alongside quality data. Second, empirical legal scholarship has the potential to be extremely influential in policy debates because it provides “data” to substantiate or refute claims. Unfortunately, many consumers of empirical legal scholarship — including other legal scholars, practitioners, judges, the media, and policy makers — are not sophisticated in empirical methods. Even more importantly, subsequent citations of empirical findings by legal scholars rarely take care to explain a study’s qualifications and limitations. Instead, subsequent citations often amplify the “findings” of the empirical study by over-generalizing the results.
My present concern is weak data. By weak data, I don’t mean data that is flat-out incorrect (such as from widespread coding errors) or that misuses empirical methods (such as when a model’s assumptions are not met). Others have previously discussed issues relating to incorrect data and analysis in empirical legal studies. Rather, I am referring to reporting data that invites weak or flawed inferences, that is not statistically significant, or that is of such limited value that it may be misused. The precise question I have been considering is under what circumstances one should report weak data, even with an appropriate explanation of the methodology used and its potential limitations. (A different yet related question, for another discussion, is whether one should report lots of data without telling the reader which data the researcher views as most relevant. This scattershot approach raises many of the same concerns as weak data.)
To take a contrived example, let’s say a researcher is relying in part upon old data and has reason to believe that if more recent data were gathered and analyzed, the results would be different. Is it okay to publish results from the old data with a disclaimer saying that new data may yield different findings? To take another contrived example, let’s say a study reports, as a secondary finding, that 6 out of 10 randomly selected cases found X. For empiricists interested in statistical significance, almost nothing useful can be said about the population from such a small sample. Would it be appropriate to report that 60% of the sampled cases found X? Would it be sufficient if the article included a paragraph disclosing that the sample size was too small to support inferences? Some may argue that law review editors can sort this out. But weak data may appear in an article that also includes solid data. While student editors may be able to identify articles that rely solely on weak data, they will likely have more difficulty with articles that report both strong and weak data.
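To make the small-sample point concrete, here is a minimal sketch in Python. It uses the standard Wilson score approximation for a binomial proportion; the numbers are purely illustrative and are not drawn from any actual study:

    import math

    def wilson_ci(successes, n, z=1.96):
        # Approximate 95% Wilson score interval for a binomial proportion
        p_hat = successes / n
        denom = 1 + z**2 / n
        center = (p_hat + z**2 / (2 * n)) / denom
        margin = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
        return center - margin, center + margin

    low, high = wilson_ci(6, 10)
    print(f"6 of 10: 95% interval runs from about {low:.0%} to {high:.0%}")
    # prints roughly 31% to 83%

An interval running from roughly one-third to more than four-fifths is consistent with almost any story about the underlying population, which is precisely why reporting the bare “60%” figure invites over-reading.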
In many cases, I don’t believe transparency about a study’s methodology and its limitations is sufficient. Transparency is a bedrock requirement of solid empirical work, and good empirical scholarship notes significant real or potential limitations. But readers of law review articles may unknowingly misuse the reported data because they do not appreciate those limitations. There is no peer review filter to remove the portions of an article containing weak data, and student editors are unlikely to insist upon removal during the editing process.
So where do I come out? Given the non-peer-reviewed law review system we have now, my current thinking is that sometimes a researcher should report weak data, and sometimes she shouldn’t. One guidepost is peer review: although not every legal scholar is familiar with its criteria, a researcher should try to evaluate whether the data would be publishable through a peer review process and, when in doubt, consult more experienced colleagues. To my mind, when the data in question supports the scholar’s own normative views, she should be especially cautious in reporting it, because she is vulnerable to undercounting the data’s weaknesses.
The nature and timing of the issue being studied matter too. The scholar should err on the side of caution if she knows of amicus briefs in a pending case that would likely cite to the data. And even putting aside normative biases, scholars often want to include all of the data they have spent time gathering. Because they want credit for their hard work, they may be less willing to cut weak data than an unaffiliated peer reviewer would be. (Of course, empirical legal scholars should also consider submitting to peer-reviewed journals, either as a substitute for student-edited law reviews or as a companion to a paper published in one.)
Another guidepost is whether the weak data is the best data the researcher can reasonably obtain and analyze given the available resources and constraints. When it is the best available data, this factor weighs toward reporting. Likewise, if the weaker data is consistent with other, more solid data, and the researcher intends to supplement the findings later, reporting may be appropriate. Sometimes reporting weak or preliminary data can spur further, more rigorous research, and that benefit may outweigh concerns about the data being misunderstood. And sometimes weak data is just one data point among many, in which case the risk of misuse is smaller.
To reduce the possibility of misuse or misunderstanding, a heightened disclosure requirement is prudent. If the weak data is included, the article should carry a very explicit warning for the reader. Scholars need to think more carefully about how to disclose uncertainty about the relevance of weak data.