Anomaly Is Not Causation

Why an unusual statistical pattern may justify scrutiny in litigation, but does not by itself identify the cause of the anomaly.
Author

James G. Scott

Misconduct, or innocent anomaly?

Suppose a public agency investigates a series of bids submitted for emergency-facilities maintenance contracts. Two vendors have submitted bids with unusually high overlap across dozens of line items: the same hourly labor rates, the same equipment charges, the same materials prices, and the same markup on several highly specialized services. The agency finds this suspicious. It hires an expert to build a statistical model to evaluate whether the overlap is consistent with ordinary bidding or suggestive of coordination. The expert concludes that the observed overlap is extraordinarily unlikely if the vendors were acting independently.

This is one example of a recurring litigation pattern: a party observes conduct that looks unusual, retains an expert to quantify how “improbable” or “anomalous” the behavior seems to be, and uses the resulting probability to support a legal claim. The same structure appears in fraud cases, where transaction patterns are said to depart from normal business activity; in antitrust cases, where pricing or bidding patterns are said to reflect coordination; and in employment cases, where pay, promotion, hiring, or discipline outcomes are compared to a benchmark. Across these settings, the statistical model is asked to convert “unusualness” or “improbability” into legal meaning.

The difficulty here is that “unlikely” and “caused by misconduct” are not the same conclusion. A model may show that the observed pattern is surprising under a particular account of ordinary behavior, while still saying nothing about the ultimate legal question: why did this pattern occur?

Probabilities come from models

In these cases, the probability is not sitting in the data waiting to be read off. It comes from a model. That is true whether the case involves bid overlap, unusual transaction patterns, pricing behavior, employment outcomes, or some other alleged anomaly. Before an expert can say that a pattern is “unlikely,” the expert has to specify what ordinary conduct is assumed to look like and what kinds of departures from that conduct will count as surprising.

That point is easy to miss because non-specialists may be tempted to think of a statistical model as a calculator: plug in the facts, press a button, and get back an objectively correct probability. But that is a misleading picture. A better way to think of a model is as a disciplined, judgment-laden translation of a real-world story into mathematical form.

Return to the bid-overlap example. The overlap cannot be deemed unlikely in the abstract. It can only be deemed unlikely relative to some model of the world: a mathematically precise account of how bids would have been prepared if the vendors were acting lawfully. No such model can answer, in any direct legal sense, the question of whether the parties coordinated. The model instead answers a narrower mathematical question: if these bidders were not coordinating, and if ordinary bidding worked in the particular way the model assumes, how likely would we be to see bid overlaps this extreme? The same structure appears in other settings. A fraud model may ask how unusual a transaction pattern is under a model of ordinary business activity. An employment model may ask how unusual a pay or promotion pattern is under a model of ordinary personnel decisions. Those are useful questions. But they are not the same as asking whether the alleged misconduct occurred.
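A small numerical sketch can make this model-dependence concrete. Every number in it is hypothetical and chosen only for illustration: 40 line items, a 5% chance that two independent vendors happen to match on any one item, 15 observed matches, and 12 items assumed to be pass-through costs quoted by a common supplier. The function names are illustrative, and the binomial model is a simplifying assumption rather than a description of how any particular expert would model bids.

```python
from math import comb


def binom_tail(n: int, p: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def overlap_tail(n_items: int, p_match: float, k_observed: int,
                 n_pass_through: int = 0) -> float:
    """Probability of at least k_observed matching line items when vendors
    price independently, except that n_pass_through items always match
    because both vendors copy the same supplier quote (an assumption)."""
    free_items = n_items - n_pass_through
    return binom_tail(free_items, p_match, k_observed - n_pass_through)


# Hypothetical inputs, not drawn from any real case.
N_ITEMS = 40      # line items in each bid
P_MATCH = 0.05    # chance two independent vendors match on one item
K_OBSERVED = 15   # matching items actually observed

# Model A: every item is priced independently.
p_model_a = overlap_tail(N_ITEMS, P_MATCH, K_OBSERVED)

# Model B: 12 items are pass-through costs that match automatically.
p_model_b = overlap_tail(N_ITEMS, P_MATCH, K_OBSERVED, n_pass_through=12)

print(f"Model A (all items independent):  {p_model_a:.2e}")
print(f"Model B (12 pass-through items):  {p_model_b:.2e}")
```

Under the first model the observed overlap is vanishingly rare. Under the second, which changes only the assumption about pass-through items, the same overlap is unremarkable. The data are identical in both calculations; only the model of ordinary conduct has changed.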

That gap is the central legal and inferential problem. Once an anomaly has been observed, the ordinary explanation looks weak, and more than one explanation that seemed implausible beforehand may become plausible. In our example of two vendors submitting suspiciously similar bids, coordination is one possibility. But so are common suppliers, shared estimating software, or any number of other benign alternatives. If, for example, the contracts cover hospital equipment and there is only one company in the state qualified to service the cold head in a superconducting MRI magnet, two vendors may submit strikingly similar line items because both are passing through the same supplier quote.
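The same point can be sketched with Bayes' rule. The prior probabilities and likelihoods below are invented for illustration and are not estimates from any real matter; the four competing explanations are treated as mutually exclusive and exhaustive, which is itself a simplifying assumption.

```python
def posterior(priors: dict[str, float],
              likelihoods: dict[str, float]) -> dict[str, float]:
    """Apply Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical prior plausibility of each explanation before seeing the bids.
priors = {
    "independent pricing": 0.90,
    "coordination": 0.02,
    "common supplier": 0.05,
    "shared estimating software": 0.03,
}

# Hypothetical probability of seeing overlap this extreme under each one.
likelihoods = {
    "independent pricing": 1e-9,
    "coordination": 0.5,
    "common supplier": 0.3,
    "shared estimating software": 0.2,
}

post = posterior(priors, likelihoods)
for hypothesis, prob in post.items():
    print(f"{hypothesis:28s} {prob:.4f}")
```

With these made-up inputs, the tiny probability under independent pricing all but eliminates that explanation, yet coordination ends up with roughly a one-in-three posterior probability because the benign alternatives explain the overlap about as well. A small probability under the ordinary-conduct model is not the probability that the vendors are innocent.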

The litigation takeaway

When an expert relies on statistical evidence of an anomaly, lawyers should press on the conversion from statistical rarity to legal meaning.

How reasonable is the model’s operational version of ordinary, lawful conduct? What feature of the data is being tested for anomalous behavior, and was that feature chosen before the analysis? What comparison set was used? What innocent mechanisms could produce the same pattern? And what evidence, apart from the small probability itself, links the anomaly to the alleged misconduct?
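The question about whether the tested feature was chosen in advance has a simple arithmetic core. As a rough sketch, assume an analyst could have examined several features of the data, that each test is independent, and that each has a 5% chance of looking anomalous even when nothing is wrong. Those assumptions and the function name are illustrative only; real features are rarely independent.

```python
def chance_of_false_alarm(alpha: float, n_features: int) -> float:
    """Probability that at least one of n_features independent tests looks
    anomalous at threshold alpha when conduct is entirely ordinary."""
    return 1.0 - (1.0 - alpha) ** n_features


# Hypothetical threshold and feature counts.
ALPHA = 0.05
for n in (1, 5, 20):
    rate = chance_of_false_alarm(ALPHA, n)
    print(f"{n:2d} features examined -> {rate:.0%} chance of some 'anomaly'")
```

Under these assumptions, a search across twenty candidate features turns up at least one apparent anomaly more often than not, even if every vendor behaved lawfully. That is why it matters whether the feature was specified before the data were examined.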

A statistical anomaly becomes much more probative when the model itself is tied to a credible mechanism: why the overlap appears on discretionary items rather than pass-through costs, why the same pattern recurs across bids, who had access to what information, and whether documents, communications, or business practices point in the same direction. Without that connection, the model may identify a pattern worth investigating. It has not yet identified misconduct.