42 B. Edmonds
However, there is another reason that prediction is valued: it is considered the
gold standard of science. The ability of a model or theory to predict is taken
as the most reliable indicator of a model’s truth. This assessment is made in two principal
ways: (a) model A fits the evidence better than model B, a comparative approach,⁶
or (b) model A is falsified (or not) by the evidence, a falsification approach. In
either case, the idea is that, given a sufficient supply of different models, better models
will be gradually selected over time, either because the bad ones are discarded or
because they are outcompeted by better models.
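The two approaches can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the chapter: the toy models `model_a` and `model_b`, the made-up evidence, the mean-absolute-error measure and the tolerance value are all hypothetical choices.

```python
from statistics import mean

def mean_abs_error(model, evidence):
    """Average absolute gap between model output and observed values."""
    return mean(abs(model(x) - y) for x, y in evidence)

def compare(model_a, model_b, evidence):
    """Comparative approach: which model fits the evidence better?"""
    err_a = mean_abs_error(model_a, evidence)
    err_b = mean_abs_error(model_b, evidence)
    return "A" if err_a < err_b else "B"

def is_falsified(model, evidence, tolerance):
    """Falsification approach: reject the model if any observation
    lies further from its output than the (assumed) tolerance."""
    return any(abs(model(x) - y) > tolerance for x, y in evidence)

# Hypothetical evidence: (input, observed outcome) pairs.
evidence = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]

model_a = lambda x: 2 * x   # candidate model (assumed for illustration)
model_b = lambda x: 5.0     # null model: always predicts a constant

better = compare(model_a, model_b, evidence)
rejected_a = is_falsified(model_a, evidence, tolerance=0.5)
rejected_b = is_falsified(model_b, evidence, tolerance=0.5)
```

Note that the comparative check only ranks the two models against each other, whereas the falsification check judges each model on its own against a threshold that has to be chosen in advance.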
Definition
By ‘prediction’, we mean the ability to reliably anticipate data that is not currently known
to a useful degree of accuracy via computations using the model.
Unpacking this definition:
• The model has to predict reliably; that is, under some known (but not necessarily
precise) conditions, the model will work. Otherwise one would not know when one could
use it.
• The data it anticipates has to be unknown to the modeller. ‘Predicting’ out-of-sample
data that the modeller already holds is not enough, since the pressures to redo a model
until it fits better are huge and negative results are difficult to publish.
• The anticipation has to be to a useful degree of accuracy. This will depend upon
the purpose to which it is being put, e.g. as in weather forecasting.
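One way to picture the second and third conditions is to fix a prediction before the outcome is known and to judge it later against a tolerance set by its purpose. The following Python sketch is only an illustration under stated assumptions: the `commit` and `check` helpers, the temperature figures and the tolerance are hypothetical, and the hash-based commitment is one possible device, not a method described in the chapter. Reliability, the first condition, would need the same check repeated over many cases.

```python
import hashlib
import json

def commit(prediction):
    """Return a digest that fixes the prediction before the data is known."""
    payload = json.dumps(prediction, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def check(prediction, digest, observed, tolerance):
    """Verify the prediction was not altered, then test its accuracy."""
    if commit(prediction) != digest:
        raise ValueError("prediction changed after it was committed")
    return abs(prediction["value"] - observed) <= tolerance

# Hypothetical forecast: tomorrow's maximum temperature in degrees Celsius.
prediction = {"target": "max_temp_c", "value": 21.0}
digest = commit(prediction)   # published before the outcome is known

# Later, once the outcome is observed (a made-up number):
useful = check(prediction, digest, observed=22.5, tolerance=2.0)
```

Because the digest is published first, the modeller cannot quietly refit the model after seeing the data, which is the weakness of out-of-sample testing noted above.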
Unfortunately, there are at least two different uses of the word ‘predict’. Almost
all scientific models ‘predict’ in the weak sense of being used to calculate some
result given some settings or data, but this is different from correctly anticipating
unknown data. For this reason, some use the term ‘forecast’ for anticipating
unknown data and use the word ‘prediction’ for almost any calculation of one aspect
from another using a model. However, this convention causes confusion of its own, so
it does not necessarily make things clearer. Firstly, ‘forecasting’ implies that the
unknown data is in the future (which is not always the case), and, secondly, large
parts of science use the word ‘prediction’ for the process of anticipating unknown
data. For example, if a modeller says their model ‘predicts’ something when they
simply mean that it calculates it, then most of the audience may misunderstand and
assume the author is claiming more utility than is intended.
As Watts (2014) points out, useful prediction does not have to be a ‘point’
prediction of a future event. For example, one might predict that some particular
thing will not happen, the existence of something in the past (e.g. the existence of
Pluto), something about the shape or direction of trends or distributions or even
qualitative facts. What matters is that what is being predicted is not known
beforehand by the modeller and that it can be unambiguously checked once it becomes
known.
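Predictions of this non-point kind can still be stated so that they are unambiguously checkable. In the minimal Python sketch below, each prediction is written as a yes/no test that is fixed first and applied only once the data become known. The helper names, the threshold and the observed series are hypothetical and serve only as illustration.

```python
def never_exceeds(threshold):
    """Prediction that something will not happen: no value above threshold."""
    return lambda series: all(v <= threshold for v in series)

def trend_is_rising():
    """Prediction about direction only: the last value exceeds the first."""
    return lambda series: series[-1] > series[0]

# Predictions are fixed first ...
predictions = {
    "stays below 100": never_exceeds(100),
    "overall upward trend": trend_is_rising(),
}

# ... and checked only once the (hypothetical) data become known.
observed = [42, 47, 45, 53, 60]
results = {name: test(observed) for name, test in predictions.items()}
```

Neither test names a single future value, yet each returns a clear pass or fail, which is what makes the prediction checkable.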
An Example. Nate Silver aims to predict social phenomena, such as the results of
elections and the outcome of sports competitions. This is a data-hungry activity,
⁶ Where model B may be a random or null model but also might be a rival model.