Monday, September 24, 2012
Still More Thoughts on the Phillips & Yoo Citation Study
Mr. Phillips writes, in reply to the criticisms noted last week:
Prof. Leiter’s colleague’s concerns are about false positives, which would inflate scores. We find this the lesser evil, and thus diverged from the Leiter method by not using his sampling technique (which we failed to make clear in our methods section, and have since corrected) because we find the technique problematic from a sampling-methodology and measurement-theory perspective. Leiter looks at the first and last ten citations, counts up the number of “legitimate” ones, and multiplies that percentage by the total number of cites to get his initial raw value. Thus, someone with 1000 “cites” in Westlaw’s JLR, with 16 legitimate cites among the first and last ten, would have a raw value of 800. This has three major problems. First, Leiter is using a non-random sample to represent the underlying population. That is a statistical no-no unless there is some kind of sophisticated statistical “correction.” Second, even if the sample were randomly drawn, it is too small to make useful inferences. The hypothetical professor listed above (the average number of cites a professor had in our study was 976), with a random sample of 20 (16 of them legitimate) and 1000 total cites, would have a 95% confidence interval of 626-974, meaning the “true” number of legitimate cites is most likely somewhere in that range—which is not very useful. Finally, the Leiter method makes it more difficult to compare scholars since some professors’ scores will be biased high and some biased low due to the non-random nature of the sampling, negating the value of the Leiter scores as a comparative metric, which is the only real value such scores have.
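The arithmetic in the quoted passage can be checked with a short sketch. The function names here are ours, and the interval is a standard normal-approximation (Wald) interval for a proportion scaled by the total citation count; it matches the quoted 626-974 range up to rounding, though the authors may have used a slightly different interval formula.

```python
import math

def leiter_raw_value(total_cites, sample_size, legit_in_sample):
    """Leiter-style raw value: scale the total citation count by the
    fraction of sampled citations judged legitimate."""
    return total_cites * (legit_in_sample / sample_size)

def legit_cites_ci(total_cites, sample_size, legit_in_sample, z=1.96):
    """Confidence interval for the number of legitimate cites, treating
    the sample as if it were random (Wald interval for a proportion,
    scaled back up by the total count)."""
    p_hat = legit_in_sample / sample_size
    se = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    return (total_cites * (p_hat - z * se), total_cites * (p_hat + z * se))

# The hypothetical professor above: 1000 cites, 16 of 20 sampled legitimate.
print(leiter_raw_value(1000, 20, 16))  # 800.0
print(legit_cites_ci(1000, 20, 16))    # ~(624.7, 975.3), i.e. the quoted 626-974 up to rounding
```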
Our methodology just counts everything in the JLR database, biasing the scores higher than the “truth,” but treating everyone the same—equality of inflation—so that comparisons can be more easily made. Our method is also very easily reproduced, as Prof. Leiter’s colleague demonstrated. And we are not claiming our method (or any citation-based measure) is a measure of quality, but of relevance (and given that many citations are put in by student editors, citation studies are a long way from perfect). As to Prof. Strandburg, her situation is so rare—having highly cited works in an unrelated field, then completely shifting career trajectory and turning to the law—that the one or two people like her can be easily corrected for when brought to our attention (as we did with her score). That is a lesser evil than completely excluding relevant work in peer-reviewed journals, in our opinion. And as for Prof. Cohen, while we have received much feedback wondering who he is and why he is included, we have also received feedback that “he should be included [because] he is well known in the IP field by those who read economics as well as law journals…[and] has done path breaking empirical research in IP for many years.” We appreciate the extensive feedback we have been receiving as we seek to refine our measure and paper.
I have to say this strikes me as unpersuasive. A few quick points: (1) ideally, random sampling for false positives would have been best, but in all the years of doing it non-randomly, no one has ever come forward with a single case where this method distorted the results; (2) by contrast, it is both a "statistical" and intellectual "no-no" to fail to correct for huge rates of false positives, since such rates are not evenly distributed across all names for the obvious reasons (e.g., someone with the last name "Judge"), and several cases of large false positives have now been identified; (3) in any case, it's an empirical, not statistical, question which method yields the most reliable outcomes, but I'm betting on the approach that I and now Sisk have used for quite some time; (4) using Web of Science was a good addition to the mix, but there clearly need to be some sensible protocols in place to screen out citations utterly irrelevant to legal scholarship, and also more sensible protocols about who counts as a member of a law faculty (tenure-stream status in law was our criterion, which would eliminate a lot of the strange inclusions in the Phillips & Yoo lists). James Heckman and Gary Becker are now cross-appointed to the law faculty at Chicago, and they crush Cohen (and almost everyone else!) on Web of Science, but it would be bizarre to think that should be decisive in a ranking of law faculties!
Thoughts from readers about all this? Full name and valid e-mail address required.
I agree with Brian that the quoted explanation is unpersuasive. In this setting, imperfect correction of false positives creates more meaningful results than no correction of false positives.
If you'll allow me to make a related point, I think Brian's methodology of counting the number of cites within the previous 4 or 5 years is superior to the Yoo/Phillips methodology of counting the number of cites per year in legal academia. The problem with counting cites per year in legal academia is that different people get tenure-track jobs at different stages of their publishing careers. Not everyone starts at the same point. Some get jobs near the beginning of their publishing careers, and so they start with few or zero citations per year. Others come in after they have engaged in considerable publishing, so they start with many citations a year. The Yoo/Phillips methodology has the strange effect of biasing the numbers based on how well-published people were before they began teaching, especially in the early years.
To see how this biases the numbers, consider Michael Greve of George Mason. Michael started at George Mason this fall, so he has only a few weeks as a law professor under his belt. But he already has 794 citations in the Westlaw database alone, because for the last 25 years he has been writing (some 9 books and over 25 articles) while at think tanks in DC. Under the Yoo/Phillips methodology, Michael Greve is by far the most influential scholar in all of legal academia: He has 794 citations per year of teaching, and that's if you give him credit for a full year of teaching after he has been a legal academic for just a few weeks. That's nearly 3x the citations per year of Mark Lemley, the current #1 at merely 268 per year. I think Greve's work is very interesting, but I don't think anyone thinks he is by far the most influential person in legal academia.
Posted by: Orin Kerr | Sep 24, 2012 8:41:03 AM
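The normalization bias Kerr describes is easy to see in a few lines. This is a minimal sketch, assuming the figures he quotes; the function name and the at-least-one-full-year crediting rule are ours, following his description.

```python
def cites_per_teaching_year(total_cites, years_teaching):
    """Phillips & Yoo-style rate: total cites divided by years in legal
    academia, crediting at least one full year for brand-new professors."""
    return total_cites / max(years_teaching, 1.0)

# Figures from the comment above: Greve, a few weeks into teaching, already
# has 794 Westlaw JLR cites; Lemley's reported rate is 268 cites per year.
greve_rate = cites_per_teaching_year(794, 0.1)
print(greve_rate)          # 794.0
print(greve_rate / 268.0)  # ~2.96: nearly triple the current #1's rate
```

The point of the sketch is that the numerator reflects a whole pre-teaching career while the denominator counts only teaching years, so recent entrants with long prior publishing records get inflated rates.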