Beware of Expert Words with Two Meanings

Why technical terms in statistics like bias, significance, and independence can mislead legal factfinders when their meaning is conflated with ordinary English.
Author

James G. Scott

Familiar words can be dangerous

Expert testimony often turns on technical words that sound familiar. That familiarity can be dangerous. A judge or jury may hear a word in its ordinary English sense, while the expert is using the same word in a narrower technical sense. If the difference is not made explicit, a correct technical statement can inadvertently become a misleading legal inference.

Consider “bias.” In ordinary speech, bias suggests prejudice, unfairness, or improper motive. In statistics, though, bias has a highly specific meaning. It describes a procedure that has a systematic tendency to overshoot or undershoot the quantity it is trying to estimate. The semantic trap is obvious: a factfinder may hear that an analysis involves a “biased estimate” and infer that someone acted unfairly or with improper motive. But that does not follow. A statistical estimate can be biased without anyone being prejudiced. In fact, for reasons not worth detailing here, a modestly biased estimator may even be preferred in many settings, because its estimates can land closer to the truth, on average, than those of an unbiased alternative.
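For readers who want to see the statistical sense of bias in action, here is a minimal simulation sketch. It uses a textbook case that I have chosen for illustration: estimating the variance of normally distributed data from a small sample. Dividing by n - 1 gives an unbiased estimate, while dividing by n gives an estimate that systematically undershoots. The sample size, number of trials, and seed are arbitrary assumptions, and nothing here comes from a real case.

```python
import random
import statistics


def compare_variance_estimators(n=5, trials=20000, true_sd=1.0, seed=0):
    """Simulate two variance estimators and return (bias, mean squared error) for each.

    All parameter values are illustrative assumptions, not data from any case.
    """
    rng = random.Random(seed)
    true_var = true_sd ** 2
    unbiased, biased = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, true_sd) for _ in range(n)]
        unbiased.append(statistics.variance(sample))   # divides by n - 1
        biased.append(statistics.pvariance(sample))    # divides by n

    def summarize(estimates):
        # Bias: how far the estimates are from the truth on average.
        bias = statistics.fmean(estimates) - true_var
        # Mean squared error: typical squared distance from the truth.
        mse = statistics.fmean([(e - true_var) ** 2 for e in estimates])
        return bias, mse

    return {"unbiased": summarize(unbiased), "biased": summarize(biased)}


results = compare_variance_estimators()
for name, (bias, mse) in results.items():
    print(f"{name:>8}: bias = {bias:+.3f}, mean squared error = {mse:.3f}")
```

In this setup the divide-by-n estimator is biased downward, yet its mean squared error is smaller. No one in the simulation is prejudiced; "biased" describes only the arithmetic tendency of a procedure.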

Or consider “significance.” To lay people, significance means importance. But in statistics, significance means something almost pedantically narrow: that a result is sufficiently inconsistent with a specified null model, at a specified threshold, under specified assumptions. Here too, the word invites the wrong inference. A result declared in court to be “statistically significant” may sound like a result that matters. But it may be too small to matter legally or economically to anyone other than a statistician performing their day job.
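A quick calculation shows how a trivially small difference can still clear the statistical bar. The sketch below assumes a simple two-sided z-test for a difference in means with a known, common standard deviation. The dollar figures and sample sizes are hypothetical numbers I made up for illustration; the function name is my own.

```python
import math


def two_sample_z_test(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means, assuming a known common sd."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # Two-sided tail area under the standard normal curve.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical inputs: a $5 gap in average pay, a $1,000 standard deviation,
# and one million people in each group.
z, p = two_sample_z_test(mean_diff=5.0, sd=1000.0, n_per_group=1_000_000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
print(f"gap as a share of one standard deviation: {5.0 / 1000.0:.1%}")
```

With these made-up inputs the p-value falls well below the conventional 0.05 threshold, so the gap is "statistically significant," even though it amounts to half of one percent of a standard deviation. Whether such a gap matters is a separate question that the test does not answer.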

Independent actors and independent outcomes are different ideas

But for my money, the most dangerous technical word in statistics is “independent.” The statistical meaning is not the same as the ordinary English meaning, but the two are close enough that one can easily be mistaken for the other, especially in litigation, where “independent” can already carry legal and factual weight.

Just think of all the ways a lawyer might use the word:

  • an independent contractor, as opposed to an employee;
  • an independent director, as opposed to someone beholden to management;
  • an independent investigation, as opposed to an internal whitewash;
  • an independent medical examiner, as opposed to a treating physician;
  • an independent expert, as opposed to a hired advocate;
  • an independent decision maker, as opposed to someone following orders.

In each setting, the word carries a sense of separateness, neutrality, or freedom from control. That is already a lot for one word to carry. When a statistical expert then uses the same word in a technical sense, the risk of confusion is obvious.

Take one common legal usage: independent actors. In ordinary legal usage, two actors behave independently if they act separately. They do not coordinate, collaborate, or act under a common instruction. In an employment case, for example, a company might argue that challenged hiring, promotion, or pay decisions were made by independent managers exercising their own judgment.

In probability, on the other hand, independence means something specific and technical: two events are independent if learning that one occurred does not change our assessment of the probability the other will occur. For example, suppose we meet two siblings and learn that one is colorblind. That fact will change what we think about whether the other sibling is colorblind, because siblings share genetic risk factors. The two events are not independent in the probabilistic sense.

By contrast, suppose a bird poops on my car in Austin. Does learning this fact tell us anything useful about the probability that the Texas Longhorns will subsequently win the national championship? Of course not. Those events are ridiculous to mention in the same sentence, and that is exactly the point: they are independent because one gives us no information about the other.

The definition deserves a slower look. Probability is a language for updating expectations in light of information. To say that two events are independent is to say that one event gives us no probabilistic information about the other. To say that they are not independent is to say the opposite: learning about one event changes what we should expect about the other.

That is important information, but it is not the same as saying why the events are related.
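For readers who prefer symbols, the definition above has a standard textbook form. The labels A and B for the two events are my own notation, not anything specific to a case.

```latex
% Independence of two events A and B:
\[
  P(A \cap B) = P(A)\,P(B).
\]
% When P(B) > 0, this is equivalent to saying that learning B
% leaves the probability of A unchanged:
\[
  P(A \mid B) = P(A).
\]
% Dependence is the failure of that equality. In the sibling example,
% with A = "second sibling is colorblind" and B = "first sibling is colorblind":
\[
  P(A \mid B) > P(A).
\]
```

Notice that neither expression mentions a cause or a mechanism. The formulas say only whether information about one event changes the probability of the other.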

A hiring example

Consider a national employer with offices in several cities. The company hires for the same technical sales role in each office. A group of rejected applicants brings a discrimination claim, alleging that the company’s hiring process produced a common pattern of disadvantage across locations.

The company responds that there was no company-wide hiring decision to challenge. Each office had its own hiring manager. The managers interviewed candidates locally, made their own recommendations, and did not discuss individual applicants with one another. No one at headquarters instructed them to prefer or disfavor any protected group. In the company’s telling, these were independent hiring decisions made by independent managers.

Now suppose a statistical expert analyzes the hiring data and concludes that the outcomes were not independent. The expert means something quite specific: hiring rates move together across offices. When one office hires a given group at a lower rate, the other offices tend to hire that group at a lower rate as well. Once we know what happened in one office, we can make a better probabilistic prediction about what happened in another.

That is dependence in the probabilistic sense. It may be a real statistical fact, exactly the sort of pattern a good statistical expert should notice. It may also matter legally, because it bears on whether the challenged outcomes look like isolated local decisions or like results produced within a shared hiring system.

But notice how the word “independent” now carries two meanings at once. The company is using the word to describe how the managers acted. The expert is using the word to describe how the outcomes behave statistically. Those are related ideas, but they are not the same.

A pattern is not an explanation

Even if the expert has established that hiring outcomes are statistically related, that does not yet tell us why they are related. There are many ways for hiring decisions to become probabilistically dependent without the hiring managers coordinating in the legal sense.

The mechanism is not hard to imagine. Separate managers can make separate decisions inside a shared system. If the offices draw from the same recruiting platform, use the same job posting, apply the same minimum qualifications, work within the same compensation bands, and rely on the same applicant-tracking system, their outcomes may resemble one another. Add a common screening tool, a common interview rubric, similar labor-market conditions, shared headcount pressures, and the same basic understanding about what experience matters for the role, and the probability point becomes intuitive. Learning something about one office’s outcomes may tell us something about another office’s outcomes because, even if they didn’t coordinate, the offices are not operating in wholly separate worlds.
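The mechanism described above can be sketched in a toy simulation. Everything in it is a hypothetical assumption of mine: two offices, 200 applicants per office per period, and a hire probability that is either set by one shared screening standard or drawn separately for each office. In both versions the managers decide applicant by applicant and never see each other's decisions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_offices(shared_system, periods=500, applicants=200, seed=1):
    """Correlation between two offices' hiring rates across periods (toy model)."""
    rng = random.Random(seed)
    rates_a, rates_b = [], []
    for _ in range(periods):
        if shared_system:
            # One company-wide screening standard sets both offices' hire probability.
            p_a = p_b = rng.uniform(0.1, 0.5)
        else:
            # Each office operates under its own, unrelated conditions.
            p_a, p_b = rng.uniform(0.1, 0.5), rng.uniform(0.1, 0.5)
        # Managers decide separately; neither sees the other's decisions.
        hires_a = sum(rng.random() < p_a for _ in range(applicants))
        hires_b = sum(rng.random() < p_b for _ in range(applicants))
        rates_a.append(hires_a / applicants)
        rates_b.append(hires_b / applicants)
    return pearson(rates_a, rates_b)


print(f"shared system:    correlation = {simulate_offices(True):.2f}")
print(f"separate systems: correlation = {simulate_offices(False):.2f}")
```

In the shared-system version the two offices' hiring rates are strongly correlated, even though the simulated managers never coordinate. In the separate-systems version the correlation is close to zero. The sketch is not a model of any real employer; it only illustrates that common inputs are enough to produce statistical dependence.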

A finding of “statistically dependent decisions” can matter a great deal. It may justify discovery into common practices or point counsel toward the institutional machinery behind the decisions. But statistical dependence is a pattern requiring explanation, not an explanation by itself.

The litigation lesson

Technical words with ordinary meanings should not be allowed to glide through the courtroom without definition. The lawyer should ask the expert to define the word, distinguish the technical meaning from the everyday meaning, and state exactly what conclusion follows from the technical definition. Just as important, the expert should be asked what conclusions do not follow. Good answers may follow!