Statistical Evidence and the Law
In my consulting and expert-witness work, I encounter a common problem: a statistical analysis has been offered to support a legal conclusion, and the real question is whether the analysis actually does the job it has been asked to do.
These essays collect some thoughts from that experience. They discuss recurring issues that arise when statistical models are used to test for disparities, identify anomalies, estimate damages, or compare competing explanations. My focus here is practical rather than theoretical: when statistical evidence supports a legal inference, when it does not, and what I would encourage lawyers and fact-finders to ask before treating a numerical result as legally meaningful.
For readers looking for a broader introduction to statistical and data-scientific reasoning, I also maintain a free textbook, Data Science in R: A Gentle Introduction.
For consulting and expert-witness inquiries, see my Consulting page.
Beware of expert words with two meanings
Technical words like “bias,” “significance,” and “independent” can mislead when their statistical meanings are allowed to borrow weight from ordinary English. In litigation, the crucial question is often not whether the expert’s technical statement is true, but what legal or factual conclusions actually follow from it.
When a healthcare fraud model sees a green apple
Healthcare fraud models can identify providers whose billing patterns are unusual, but unusual billing is not the same as fraud. This essay explains why counsel should demand both statistical calibration and substantive validation before a billing outlier is treated as evidence of false claims.
When AI Performance Becomes a Legal Dispute
When an AI tool sold to improve a business outcome produces poor results, the dispute turns on statistical questions about what the system was promised to do and how that promise was tested. Lawyers need to connect the observed failure to a specific account of the model’s target, validation, reliability, deployment, and causal effect.
Anomaly Is Not Causation
When an expert says a pattern of behavior is statistically unlikely, that assessment depends on a mathematical model of “ordinary conduct” that the expert has constructed. Lawyers should examine that model carefully before allowing the mere fact of a statistical anomaly to do the work of legal inference.
What Can a Finished Product Prove About How It Was Made?
Finished products may look alike for reasons that matter legally, or for reasons that do not. In manufacturing disputes, the strongest evidence often comes from designing an experiment that shows whether resemblance points to the disputed process or to ordinary lawful alternatives.
The Denominator Problem in Statistical Evidence
Counts of bad events can look powerful in litigation, but a count is only half of a rate. Lawyers evaluating statistical evidence should ask what population, exposure, or opportunity produced the count before treating it as proof of risk, systemic conduct, medical necessity, damages, or classwide harm.
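As a rough illustration of why the denominator matters, the sketch below compares two hypothetical facilities. All names and figures are invented for this example and are not drawn from any case or from the essay itself; the only point is that the facility with the larger count can have the lower rate once exposure is taken into account.

```python
# Hypothetical figures, invented purely for illustration.
facilities = {
    "Facility A": {"adverse_events": 40, "patient_days": 20_000},
    "Facility B": {"adverse_events": 15, "patient_days": 3_000},
}


def rate_per_1000(events, exposure):
    """Return events per 1,000 units of exposure (here, patient-days)."""
    return 1000 * events / exposure


# Convert each raw count into a rate by dividing by its own exposure.
rates = {
    name: rate_per_1000(f["adverse_events"], f["patient_days"])
    for name, f in facilities.items()
}

for name, rate in rates.items():
    count = facilities[name]["adverse_events"]
    print(f"{name}: {count} events, {rate:.1f} per 1,000 patient-days")
```

In this made-up example, Facility A has more than twice as many events as Facility B but a lower rate per patient-day. Whether patient-days is the right measure of exposure is itself a substantive question that the arithmetic cannot answer.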
The More Places You Look, The More You Find
A statistical expert may search a large body of records and find one pattern that looks unusually powerful. The legal problem is that the final calculation may treat that pattern as though it had been specified in advance, even though it was discovered only after many other possibilities were examined. This so-called “multiple testing problem” arises in areas as diverse as product liability, employment, procurement, securities, and healthcare fraud.
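A minimal sketch of the arithmetic behind this concern follows. It assumes, for simplicity, that every test is independent and that each is run at a conventional 5% threshold; real searches through records rarely satisfy either assumption exactly. The function names are hypothetical, and the Bonferroni adjustment is shown only as one common and conservative correction, not as the method any particular expert should have used.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of `num_tests` independent tests
    crosses the `alpha` threshold when no real effect exists anywhere."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test threshold that keeps the overall false-positive chance
    at or below `alpha` (the Bonferroni adjustment)."""
    return alpha / num_tests


for m in (1, 10, 50, 100):
    p_any = chance_of_false_positive(m)
    cutoff = bonferroni_threshold(m)
    print(f"{m:>3} tests: chance of a spurious 'finding' = {p_any:.2f}, "
          f"adjusted per-test threshold = {cutoff:.4f}")
```

Under these assumptions, the chance of at least one spurious result grows quickly with the number of places examined, which is why it matters how many comparisons were run before the reported one was selected.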