The Ghost in the Model: Why Transformers Still Don’t Understand Causality

With the way everyday tools have evolved over the last few years, conversations often end with “ChatGPT [knows and] said …” or “ChatGPT doesn’t know.” This may be reasonable for well-known, established facts, but “knowing” something in the scientific sense requires establishing causality, and that is still a big gap for the transformer models that power LLMs.

Causality has always been challenging to establish, especially in complex human disease, where causes are a mix of genetic, environmental, and social factors. In general, to establish causality, we want to see that when we change a variable A, a change in a variable B follows. The gold standard is a randomized controlled trial (RCT), but we often can’t run one for ethical or practical reasons. In the absence of an RCT, there are established study designs and statistical frameworks for estimating counterfactuals and drawing conclusions about causality. Studies that apply these frameworks to groups of people are used to suggest what might work for an individual, and ultimately, as we’ve all experienced, we each must see what works for ourselves.
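To make the gap between correlation and causation concrete, here is a minimal, entirely synthetic sketch (the variable names and numbers are hypothetical, not from any study discussed here): a confounder drives both an “exposure” A and an “outcome” B, so the naive association looks large even though the true causal effect of A on B is zero. A crude adjustment for the confounder, which is the kind of thing the counterfactual frameworks above formalize far more carefully, recovers the truth.

```python
# Toy sketch (synthetic data, hypothetical names): naive association vs. a
# confounder-adjusted estimate. The true causal effect of A on B is zero here.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

confounder = rng.normal(size=n)                      # shared cause of A and B
a = (confounder + rng.normal(size=n) > 0).astype(float)  # "exposure" driven by the confounder
b = 2.0 * confounder + rng.normal(size=n)            # "outcome" driven ONLY by the confounder

# Naive comparison: A appears to have a large effect on B.
naive = b[a == 1].mean() - b[a == 0].mean()

# Adjusted comparison: regress B on A and the confounder (a crude stand-in for
# the more careful designs and estimators used in real causal analyses).
X = np.column_stack([np.ones(n), a, confounder])
coef, *_ = np.linalg.lstsq(X, b, rcond=None)

print(f"naive difference in means: {naive:.2f}")    # far from zero
print(f"adjusted coefficient on A: {coef[1]:.2f}")  # close to the true effect, 0
```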

Transformers are very good at solving the problems we set them up to solve with the data and model architectures we give them, but modern transformer models continue to underperform in benchmark studies of causal inference tasks. Ultimately, a transformer learns the data distribution we feed it and uses the correlations it finds there to drive predictions. We could carefully deploy transformers inside statistical frameworks for causal inference, in place of existing estimators, but the recent paper by Neugebauer et al. assessing whether GLP-1s help prevent major adverse cardiovascular events in type 2 diabetes demonstrates how much thought, careful experimental planning, and post-hoc assessment go into evaluating and proposing a causal relationship – and ultimately this is still a human task.
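As a hedged illustration of what “deploying a model inside a causal framework” can mean, the sketch below uses G-computation (standardization): a flexible machine-learning model serves only as the outcome regressor, while the causal reasoning lives in the surrounding procedure. This is not the Neugebauer et al. analysis; the data, names, and effect size are hypothetical, and a transformer could in principle slot in where the boosted trees are.

```python
# Minimal G-computation sketch on synthetic data: the ML model is just a
# plug-in estimator of E[Y | treatment, covariates] inside a causal framework.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 20_000

covariates = rng.normal(size=(n, 3))               # hypothetical baseline features
propensity = 1 / (1 + np.exp(-covariates[:, 0]))   # treatment depends on covariates
treatment = rng.binomial(1, propensity)
# True treatment effect on the outcome is -0.5 in this simulation.
outcome = -0.5 * treatment + covariates @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

# Fit the outcome model with any flexible learner.
X = np.column_stack([treatment, covariates])
model = GradientBoostingRegressor().fit(X, outcome)

# Standardize: predict every person's outcome under treatment=1 and treatment=0,
# then average the difference to estimate the average treatment effect.
X1 = np.column_stack([np.ones(n), covariates])
X0 = np.column_stack([np.zeros(n), covariates])
ate_estimate = (model.predict(X1) - model.predict(X0)).mean()
print(f"estimated average treatment effect: {ate_estimate:.2f}")  # ~ -0.5
```

Even in this toy version, the model only supplies predictions; deciding which covariates to adjust for, whether the design supports a causal reading, and how to check the result afterward is the human part.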

Like other machine learning models, transformers can be trained to predict the probability of a next event – Shmatko et al.’s recent Nature paper released a model that does this using healthcare claims data from UK Biobank. While this can be useful clinically, it isn’t establishing causality. Still, this probability is something we can use, for example through advertising, to improve a person’s ability to self-advocate. If someone is likely to receive a diagnosis in the near future, we should provide them with information that may help and encourage them to have a conversation with their doctor. It only gets spooky if we start listening to the ghost in the model whispering correlations and mistake those correlations for causation.
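For readers who want to see the shape of that kind of model, here is a minimal next-event prediction sketch over synthetic sequences of coded events (think diagnosis codes). It is an illustration of the general idea only, with made-up data and a tiny architecture, not the Shmatko et al. model; the output is a probability over possible next events – a risk score, not a statement about what caused what.

```python
# Minimal sketch: a tiny causal-masked transformer trained to predict the next
# event code in a sequence. Entirely synthetic data; hypothetical sizes.
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 200, 64, 16, 32

class NextEventModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(seq_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)
        # Causal mask: each position may only attend to earlier events.
        L = tokens.size(1)
        mask = torch.triu(torch.full((L, L), float("-inf"), device=tokens.device), diagonal=1)
        return self.head(self.encoder(x, mask=mask))

model = NextEventModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Train on synthetic "event histories" to predict each next event.
for step in range(100):
    seq = torch.randint(0, vocab_size, (batch, seq_len))
    logits = model(seq[:, :-1])                    # predict event t+1 from events <= t
    loss = loss_fn(logits.reshape(-1, vocab_size), seq[:, 1:].reshape(-1))
    optimizer.zero_grad(); loss.backward(); optimizer.step()

# A distribution over possible next events for each sequence: a risk estimate,
# learned from correlations in the training data, with no causal claim attached.
next_event_probs = torch.softmax(model(seq[:, :-1])[:, -1], dim=-1)
```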