5 What is an inference?
In the assembly-line decision problem of § 1, the probability of early failure was very important in determining the optimal decision. If the probability had been \(5\%\) instead of \(10\%\), the optimal decision would have been different. And if the probability had been \(100\%\) or \(0\%\), it would have meant that we knew for sure which decision would be the successful one.
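As a toy illustration of this dependence, here is a minimal Python sketch in which the optimal choice between accepting and discarding a component flips when the early-failure probability drops from \(10\%\) to \(5\%\). All monetary values are invented for the illustration; they are not the figures used in § 1.

```python
# Minimal sketch: how the early-failure probability determines the optimal
# decision in an accept/discard problem. The gain of accepting a good
# component, the loss of accepting one that fails early, and the cost of
# discarding are placeholders, not the figures of section 1.

def expected_gains(p_fail, gain_good=1.0, loss_fail=-20.0, cost_discard=-0.5):
    """Expected gain of each decision for a given early-failure probability."""
    return {
        "accept": (1 - p_fail) * gain_good + p_fail * loss_fail,
        "discard": cost_discard,
    }

for p in (0.10, 0.05):
    gains = expected_gains(p)
    best = max(gains, key=gains.get)
    print(f"P(early failure) = {p:.0%}: {gains} -> choose {best}")
```

With these made-up numbers, discarding is optimal at \(10\%\) (expected gain \(-0.5\) versus \(-1.1\)), while accepting is optimal at \(5\%\) (expected gain \(-0.05\) versus \(-0.5\)): the best decision depends directly on the probability.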
In that decision problem, the probabilities of the outcomes were already given. But in real decision problems the probabilities of the outcomes almost always need to be calculated, and their calculation can be the most time- and resource-demanding stage in solving a decision problem.
We’ll loosely refer to problems of calculating probabilities as “inference problems”, and to their calculation as “drawing an inference”. Drawing inferences is very often a goal or need in itself, without any underlying decision process.
Our purpose now is to learn how to draw inferences – that is, how to calculate probabilities. We’ll proceed by facing the following questions, in order:
What do we mean by “inference” and “probability”, more precisely? What important aspects about inferences and probabilities should we keep in mind?
What kind of mathematical notation do we use for inferences and probabilities?
What are the rules for drawing inferences, that is, for calculating probabilities?
5.1 The wide scope and characteristics of inferences
Let’s look at some more informal examples of inference problems. For some of them an underlying decision-making problem is also alluded to:
Looking at the weather, we try to assess if it’ll rain today, to decide whether to take an umbrella.
Considering a patient’s symptoms, test results, and medical history, a clinician tries to assess which disease affects the patient, in order to decide on the optimal treatment.
Looking at the present game position, the X-player, who moves next, wonders whether placing the next X on the mid-right position leads to a win.
The computer of a self-driving car needs to assess, from the current set of camera frames, whether a particular patch of colours in the frames is a person, in order to slow down the car and stop if that’s the case.
Given that \(G=6.67 \cdot 10^{-11}\,\mathrm{m^3\,s^{-2}\,kg^{-1}}\), \(M = 5.97 \cdot 10^{24}\,\mathrm{kg}\) (mass of the Earth), and \(r = 6.37 \cdot 10^{6}\,\mathrm{m}\) (radius of the Earth), a rocket engineer needs to know the value of \(\sqrt{2\,G\,M/r\,}\) (a worked computation is shown just after this list).
We’d like to know whether the rolled die is going to show a particular face.
An aircraft’s autopilot system needs to assess how much the aircraft’s roll will change, if the right wing’s angle of attack is increased by \(0.1\,\mathrm{rad}\).
By looking at the dimensions, shape, and texture of a newly unearthed fossil bone, a palaeontologist wonders whether it belonged to a Tyrannosaurus rex.
A voltage test on a newly produced electronic component yields a value of \(100\,\mathrm{mV}\). The electronic component turns out to be defective. An engineer wants to assess whether the voltage-test value could have been \(100\,\mathrm{mV}\) even if the component had not been defective.
Same as above, but the engineer wants to assess whether the voltage-test value could have been \(80\,\mathrm{mV}\) if the component had not been defective.
From measurements of the Sun’s energy output, measurements of concentrations of various substances in the Earth’s atmosphere over the past 500 000 years, and measurements of the emission rates of various substances in the years 1900–2022, climatologists and geophysicists try to assess the rate of mean-temperature increase in the years 2023–2100.
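The quantity needed by the rocket engineer in the list above can be computed exactly by substituting the given values into the formula:

\[
\sqrt{2\,G\,M/r\,}
= \sqrt{\frac{2 \cdot 6.67\cdot 10^{-11} \cdot 5.97\cdot 10^{24}}{6.37\cdot 10^{6}}}\;\mathrm{m/s}
\approx 1.12\cdot 10^{4}\,\mathrm{m/s} ,
\]

that is, roughly \(11.2\,\mathrm{km/s}\).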
From the examples and from your answers to the exercise we observe some very important characteristics of inferences:
Some inferences can be made exactly, that is, without uncertainty: it is possible to say for sure whether the object of the inference is true or false. Other inferences, instead, involve an uncertainty.
All inferences are based on some data and information, which may be explicitly expressed or only implicitly understood.
An inference can be about something past, but based on present or future data and information. In other words, inferences can show all sorts of temporal relations.
An inference can be essentially unrepeatable, because it’s about something unrepeatable or based on unrepeatable data and information.
The data and information on which an inference is based can themselves be unknown; that is, they may be only provisionally entertained as if they were real. Such an inference is said to be based on hypothetical reasoning.
The object of an inference can even be something already known to be false or not real: the inference assesses it under the supposition that some data or information had been different. Such an inference is said to be based on counterfactual reasoning.
5.2 Where are inferences drawn from?
This question is far from trivial. In fact it is connected with the ground-breaking developments and theorems in the foundations of mathematics that appeared in the early twentieth century.
The proper answer to this question will take up the next sections. But a central point can be emphasized now:
In order to draw an inference – calculate a probability – we usually go up a chain: we must first draw other inferences, and for drawing those we must draw yet other inferences, and so on.
At some point we must stop at inferences that we take for granted without further proof. These typically concern direct experiences and observations. For instance, you see a tree in front of you, so you can take “there’s a tree here” as a true fact. Yet, notice that the situation is not so clear-cut: how do you know that you aren’t hallucinating, and there’s actually no tree there? That is taken for granted. If you analyse the possibility of hallucination, you realize that you are taking other things for granted, and so on.
Probably most philosophical research in the history of humanity has been about grappling with this runaway process – which is also a continuous source of sci-fi films. In logic and mathematical logic, this corresponds to the fact that in order to prove some theorem, we must always start from some axioms. There are “inferences”, called tautologies, that can be drawn without requiring others, but they are all trivial: for example “this component failed early, or it didn’t”. These tautologies are of little use in a real problem, although they have a deep theoretical importance. Useful inferences, on the other hand, must always start from some axioms.
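For instance, writing \(A\) for “this component failed early”, the tautology “\(A\) or not-\(A\)” is true under either truth value of \(A\), as the truth table shows:

\[
\begin{array}{c|c|c}
A & \text{not-}A & A \text{ or not-}A\\
\hline
\text{true} & \text{false} & \text{true}\\
\text{false} & \text{true} & \text{true}
\end{array}
\]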
In concrete applications, we start from many inferences upon which everyone, luckily, agrees. But sometimes we must also use starting inferences that are more dubious or not agreed upon by everyone. In this case the final inference has a somewhat contingent character. We accept it (as well as the solution of any underlying decision problem) as the best available one for the moment. This is partly the origin of the term “model”.
5.3 Basic elements of an inference
Let us introduce some mathematical notation and more precise terminology for inferences.
Every inference has an “object”: what is to be assessed or guessed. We call this object the proposal of the inference.
Every inference also has data, information, hypotheses, or hypothetical scenarios on which it is based. We call this the conditional of the inference.
We separate proposal and conditional with a vertical bar “\(\big\vert\)”, which can be pronounced “given” or “conditional on”.
Finally, we put parentheses around this and a “\(\mathrm{P}\)” in front, short for “probability”:

\[
\mathrm{P}(\ \underbrace{\boldsymbol{\cdots}}_{\textit{proposal}} \ \big\vert\ \underbrace{\boldsymbol{\cdots}}_{\textit{conditional}}\ ) = \boldsymbol{\cdots}\%
\]

This means “the probability that [proposal], supposing [conditional], is …%”. Or also: “supposing [conditional], we can infer [proposal] with …% probability”.

“Proposal” is Johnson’s (1924) terminology; Keynes (1921) uses “conclusion”; modern textbooks do not seem to use any specialized term. “Conditional” is modern terminology; other terms used: “evidence”, “premise”, “supposal”. The vertical bar, originally a solidus, was introduced by Keynes (1921).
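As a concrete illustration (the \(70\%\) figure and the wording of the conditional are invented for the example), the umbrella inference of § 5.1 could be written

\[
\mathrm{P}(\,\text{it will rain today} \ \big\vert\ \text{the sky looks overcast this morning}\,) = 70\% ,
\]

to be read: “the probability that it will rain today, supposing the sky looks overcast this morning, is \(70\%\)”.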
We have remarked that in order to calculate the probability for an inference, we must use the probabilities of other inferences, which in turn are calculated by using the probabilities of yet other inferences, and so on, until we arrive at probabilities that are taken for granted. A basic inference process can therefore be schematized as a chain: probabilities taken for granted lead to intermediate inferences, which in turn lead to the final inference.
The next important task ahead of us is to introduce a flexible and sufficiently general mathematical representation for the objects of an inference. Thereafter we shall study the rules for drawing correct inferences.