Correlation and Causation

We often hear that correlation doesn't imply causation, but sometimes it does. When, why, and how?

This page is a stub, created on 2023-07-01. Its contents are notes on the issues and angles I want to address about this topic.

We must know something about bona fide causation in order to be able to make claims about it, so it must be the case that we can--at least sometimes--establish causation.

It would be rather contradictory to say that we can never establish causation, since that would depend on first having acquired causal knowledge to make such a claim. (There's also a parallel here to asking "How can you ever know if you're awake and not dreaming?", since one must first understand awake and non-awake states to even pose such a question.) My snarky retort to the "correlation doesn't imply causation" line is "Oh, and how did you establish that--by reference to some correlation of seeing that a particular correlation didn't establish causation?"; it's a bit meta, but it points out the inherent contradiction.

It's not just that correlation is necessary but insufficient to establish causation. All we need to establish causation is correlation, but it must be a particular kind of correlation.

Sometimes, a mere correlation does not establish that there is a causal connection between the phenomena observed. But sometimes, a correlation does prove causation. I'm not going to give a full account of induction here, but if you introspect carefully and honestly enough, you have to admit that you do possess some kind of inductive knowledge of which you are absolutely--and legitimately--confident. The chief factor involved in a proper inductive inference from a correlation to causation is a certain cognitive context that connects the correlation to a causal claim via an already-understood mechanism.

So, for instance, as I often claim, statistics is not science; not in itself, anyway. But it is proto-science, in that it describes the correlations that serve as the foundations for scientific knowledge. It's "science" when we have a causal understanding of the mechanisms involved.

When we develop new pharmaceuticals, we often deal with a lot of correlations before we attempt to make a causal claim (eg, drug X cures/ameliorates condition Y). We do a lot of controlled studies. Typically, there's a lot of careful precision involved. But notice the stark contrast between many psychotropic drugs, where scientists/researchers/doctors readily admit not (fully) understanding the mechanisms involved, and drugs like ibuprofen, where the anti-inflammatory mechanism is very well understood. These may not be the best examples, but they get at an important distinction: Sometimes we understand a lot about the mechanisms (and so we can properly call our claims "scientific"), and sometimes we know very little. In the latter case, the best we have are mere correlations, and we have done our best to at least show that we have good reason to believe that there is some mechanism (ie, that it is causal, but not what the causality really is). This is a bit of an aside, but unless we can understand the causality itself, it's very difficult (if not impossible) to actually prove that there is a causality, since the way we often do that is by ruling out alternative explanations, which amounts to proving a negative.

But my problem with "correlation doesn't imply causation" is that it's often used very sloppily. Rhetorically, it's used to dismiss evidence introduced in favor of a claim without actually engaging with the substance of it. It's a smear. Instead of pointing out how, for instance, the correlations observed are insufficient to infer a causal connection or mechanism, "correlation doesn't imply causation" amounts to "But what if you're wrong?". It's almost unanswerable, because the speaker has already indicated that they're probably not interested in evidence and logic; they just want it to seem as though they've won an argument. A serious, honest interlocutor would ask what about the observed correlation might lead to a causal claim; they might ask about the possible mechanism involved or why a mechanism is suspected.

Frankly, I don't see any good reason to ever say "correlation doesn't imply causation".
