Thursday, January 16, 2020

On the limits of regression analysis by economists and justice researchers

Since Grits recently issued dire complaints regarding the role of economists in criminal-justice policy, the publication of this statistician's related complaints the following day set off all sorts of wonderful, confirmation-bias-generated dopamine in your correspondent's brain. It also fleshed out a more rigorous critique of economists' use of regression analysis that my offering only hinted at. The author, Andrew Gelman, writes about the interdisciplinary use of statistics in the social sciences. In his view:
We’re in a situation now with forking paths in applied-statistics-being-done-by-economists where we were, about ten years ago, in applied-statistics-being-done-by-psychologists. (I was going to use the terms “econometrics” and “psychometrics” here, but that’s not quite right, because I think these mistakes are mostly being made, by applied researchers in economics and psychology, but not so much by actual econometricians and psychometricians.) 
It goes like this. There’s a natural experiment, where some people get the treatment or exposure and some people don’t. At this point, you can do an observational study: start by comparing the average outcomes in the treated and control group, then do statistical adjustment for pre-treatment differences between groups. This is all fine. Resulting inferences will be model-dependent, but there’s no way around it. You report your results, recognize your uncertainty, and go forward. 
That’s what should happen. Instead, what often happens is that researchers push that big button on their computer labeled REGRESSION DISCONTINUITY ANALYSIS, which does two bad things: First, it points them toward an analysis that focuses obsessively on adjusting for just one pre-treatment variable, often a relatively unimportant variable, while insufficiently adjusting for other differences between treatment and control groups. Second, it leads to an overconfidence borne from the slogan, “causal identification,” which leads researchers, reviewers, and outsiders to think that the analysis has some special truth value. 
What we typically have is a noisy, untrustworthy estimate of a causal effect, presented with little to no sense of the statistical challenges of observational research. And, for the usual “garden of forking paths” reason, the result will typically be “statistically significant,” and, for the usual “statistical significance filter” reason, the resulting estimate will be large and newsworthy. 
Then the result appears in the news media, often reported entirely uncritically or with minimal caveats (“while it’s too hasty to draw sweeping conclusions on the basis of one study,” etc.).
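Gelman's "statistical significance filter" is concrete enough to simulate. The sketch below is not from his post; it's a minimal Python illustration, with an assumed small true effect and an assumed noise level, of how reporting only "significant" results inflates the estimates that get reported:

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.1   # assumed small real effect
SE = 0.2            # assumed standard error: a noisy study, noise > effect
N_STUDIES = 10_000

# Each "study" yields one noisy estimate of the true effect.
estimates = [random.gauss(TRUE_EFFECT, SE) for _ in range(N_STUDIES)]

# A two-sided result is "statistically significant" at p < .05
# roughly when the estimate exceeds 1.96 standard errors in magnitude.
significant = [e for e in estimates if abs(e) > 1.96 * SE]

print(f"mean of all estimates:           {statistics.mean(estimates):.3f}")
print(f"mean of 'significant' estimates: {statistics.mean(significant):.3f}")
print(f"share reaching significance:     {len(significant) / N_STUDIES:.1%}")
```

Because only estimates large enough to clear the significance threshold survive the filter, the average of the "significant" subset overstates the true effect severalfold under these assumed parameters — the large, newsworthy numbers Gelman describes, elsewhere called Type M (magnitude) errors.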
Sound familiar? I couldn't begin to count the number of criminal-justice-related news stories I've seen over the years built around that formula. Gelman cast shade on journalists for not interrogating academic research more deeply, but his sharpest message was for economists:
Savvy psychologists have realized that just because a paper has a bunch of experiments, each with a statistically significant result, it doesn’t mean we should trust any of the claims in the paper. It took psychologists (and statisticians such as myself) a long time to grasp this. But now we have. 
So, to you economists: Make that transition that savvy psychologists have already made. In your case, my advice is, no longer accept a claim by default just because it contains an identification strategy, statistical significance, and robustness checks. Don’t think that a claim should stand, just cos nobody’s pointed out any obvious flaws. And when non-economists do come along and point out some flaws, don’t immediately jump to the defense. 
Psychologists have made the conceptual leap: so can you.
There's much more; you should go read it. Gelman also has an earlier essay hypothesizing how economists justify to themselves contradictions in their worldview that seem self-evident to nearly everyone else listening to them. I enjoyed both of these offerings.

Couple Gelman's observation about the limits of the methods economists use with Grits' analysis of the limits of the data they're analyzing, and the foundations underlying their mathematical pronouncements in the justice realm begin to crumble. As I'd written the other day:
The greater problem with applied math in the criminal-justice realm is the data to which said math is applied. The justice system typically doesn't gather data on the points upon which policy debates often hinge. Rather, it gathers data at the points where different bureaucratic entities interact when dealing with an individual. Cops hand off a suspect to the county jail: a record is created. Prosecutors file charges against that person: another record is created. Then more, potentially, as prosecutors interact with judges and defense counsel, as those convicted enter prisons or probation, and so on. 
For the most part, data generated from these interactions cannot answer the most pressing questions facing the justice system, such as what causes crime to rise or fall, what causes people to desist from crime, what incentives face various decision makers throughout the process, etc.
Economists aren't policy makers (even if some of them aspire to be). But taken as a whole, their profession allowed, and often encouraged, the misuse of economic theory to justify and bolster the ideological underpinnings of mass incarceration. Before the #cjreform movement looks to economists for further solutions, Grits believes we should demand their assistance in exposing and undoing that harm.

MORE: Wow, the rabbit hole goes even deeper!


Anonymous said...

"this statistician's related complaints"

link is a 404.

Steven Michael Seys said...

The real problem with the use of statistics in any field is that statisticians explore correlations assuming they're causes. They forgot the earliest rule of applied statistics, "correlation is not causation."

Gritsforbreakfast said...

Fixed the link, thanks 7:31.