7 Truth inference
\[ \DeclarePairedDelimiters{\set}{\{}{\}} \DeclareMathOperator*{\argmax}{argmax} \]
Some inferences can be drawn with absolute certainty, that is, we can ascertain for sure the truth or falsity of their proposal. We call this particular “sure” kind of inferences truth inferences. Mathematical inferences are a typical example of this kind. You probably have some acquaintance with rules for drawing truth inferences, so we start from these.
7.1 A trivial inference
Consider again the assembly-line scenario of § 1, and suppose that an inspector has the following information about an electric component:
This electric component had an early failure (within a year of use). If an electric component fails early, then at production it didn’t pass either the shock test or the heating test. This component passed the shock test.
The inspector wants to assess whether the component did not pass the heating test.
From the data and information given, the conclusion is that the component for sure did not pass the heating test. This conclusion is certain and somewhat trivial. But how did we obtain it? Which rules did we follow to arrive at it from the given data?
Formal logic, with its deduction systems, is the huge field that formalizes and makes rigorous the rules that a rational person or an artificial intelligence should use in drawing sure inferences like the one above. We’ll now get a glimpse of it, as a trampoline for jumping towards more general and uncertain inferences.
7.2 Analysis and representation of the problem
First let’s analyse our simple problem and represent it with compact symbols.
Atomic sentences
We can introduce the following atomic sentences and symbols:
\[ \begin{aligned} \mathsfit{h}&\coloneqq\textsf{\small`The component passed the heating test'} \\ \mathsfit{s}&\coloneqq\textsf{\small`The component passed the shock test'} \\ \mathsfit{f}&\coloneqq\textsf{\small`The component had an early failure'} \\ \mathsfit{I}&\coloneqq\textsf{\small (all other implicit background information)} \end{aligned} \]
Proposal
The proposal is \(\lnot\mathsfit{h}\), but in the present case we could also have chosen \(\mathsfit{h}\).
Conditional
The bases for the inference are two known facts in the present case: \(\mathsfit{s}\) and \(\mathsfit{f}\). There may also be other obvious facts implicitly assumed in the inference, which we denote by \(\mathsfit{I}\).
Starting inferences
Let us emphasize again that any inference is drawn from other inferences, which are either taken for granted, or drawn in turn from others. In the present case we are told that if an electric component fails early, then at production it didn’t pass either the shock test or the heating test. We write this as
\[ \lnot\mathsfit{s}\lor \lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I} \]
and we shall take this to be true
(that is, to have probability \(100\%\)).
But our scenario actually has at least one more, hidden, inference. We said that the component failed early, and that it did pass the shock test. This means, in particular, that it must be possible for the component to pass the shock test, even if it fails early. This means that
\[ \mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I} \]
cannot be false
.
Target inference
The inference that the inspector wants to draw can be compactly written:
\[ \lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{s}\land \mathsfit{f}\land \mathsfit{I} \]
7.3 Truth-inference rules
Deduction systems; a specific choice
Formal logic gives us a set of rules for correctly drawing sure inferences, when sure inferences are possible. These rules can be formulated in different ways, leading to a wide variety of deduction systems (each one with a wide variety of possible notations). These systems are all equivalent, of course. The picture on the margin, for instance, shows how a proof of how our inference would look like, using the so-called sequent calculus, which consists of a dozen or so inference rules.
We choose to compactly encode all truth-inference rules in the following way.
First, represent true
by the number \(\mathbf{1}\), and false
by \(\mathbf{0}\).
Second, symbolically write that a proposal \(\mathsfit{Y}\) is true
, given a conditional \(\mathsfit{X}\), as follows:
\[ \mathrm{T}(\mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{X}) = 1 \]
or “\(=0\)” if it’s false
.
The rules of truth-inference are then encoded by the following equations, which must always hold for any atomic or composite sentences \(\mathsfit{X},\mathsfit{Y},\mathsfit{Z}\):
Let’s see two examples:
from one rule for “and” we can obtain the equality
\[ {\color[RGB]{102,204,238}\mathrm{T}(\mathsfit{X}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Y}\land \mathsfit{Z})} =\color[RGB]{204,187,68}\frac{\mathrm{T}(\mathsfit{X}\land \mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z})}{\mathrm{T}(\mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z})} \]
provided that \(\mathrm{T}(\mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z}) \ne 0\). Then wherever we see the left side, we can replace it with the fraction on the right side, and vice versa.
from the rule for “or” we can obtain the equality
\[ {\color[RGB]{102,204,238} \mathrm{T}(\mathsfit{X}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z}) - \mathrm{T}(\mathsfit{X}\land \mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z})} =\color[RGB]{204,187,68} \mathrm{T}(\mathsfit{X}\lor \mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z}) - \mathrm{T}(\mathsfit{Y}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{Z}) \]
Again wherever we see the left side, we can replace it with the sum on the right side, and vice versa.
Target inference in our scenario
Let’s see how these rules allow us to arrive at our target inference,
\[ \color[RGB]{238,102,119}\mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{s}\land \mathsfit{f}\land \mathsfit{I}) \]
starting from the given ones
\[ \color[RGB]{34,136,51}\mathrm{T}(\lnot\mathsfit{s}\lor \lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) = 1 \ , \qquad \mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) \ne 0 \]
One possibility is to work backwards from the target inference:
\[ \begin{aligned} &\color[RGB]{238,102,119}\mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{s}\land \mathsfit{f}\land \mathsfit{I})&& \\[1ex] &\qquad=\frac{\mathrm{T}(\lnot\mathsfit{h}\land \mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} {\color[RGB]{34,136,51}\underbracket[1pt]{\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})}_{\ne 0}} &&\text{\small ∧-rule and starting inference} \\[1ex] &\qquad=\frac{\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \lnot\mathsfit{h}\land \mathsfit{f}\land \mathsfit{I})\cdot \mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) } {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small ∧-rule} \\ &\qquad=\frac{\bigl[1-\mathrm{T}(\lnot\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \lnot\mathsfit{h}\land \mathsfit{f}\land \mathsfit{I})\bigr]\cdot \mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) } {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small ¬-rule} \\ &\qquad=\frac{\mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})- \mathrm{T}(\lnot\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \lnot\mathsfit{h}\land \mathsfit{f}\land \mathsfit{I})\cdot \mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) } {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small algebra} \\ &\qquad=\frac{\mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})- \mathrm{T}(\lnot\mathsfit{s}\land \lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) } {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small ∧-rule} \\ &\qquad=\frac{{\color[RGB]{34,136,51}\mathrm{T}(\lnot\mathsfit{s}\lor \lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})}- \mathrm{T}(\lnot\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) } {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small ∨-rule} \\ &\qquad=\frac{{\color[RGB]{34,136,51}1} - \mathrm{T}(\lnot\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I}) } {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small starting inference} \\ &\qquad=\frac{\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} {\mathrm{T}(\mathsfit{s}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{f}\land \mathsfit{I})} &&\text{\small ¬-rule} \\[1ex] &\qquad=\color[RGB]{238,102,119}1 &&\text{\small algebra} \end{aligned} \]
Therefore \(\color[RGB]{238,102,119}\mathrm{T}(\lnot\mathsfit{h}\nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{s}\land \mathsfit{f}\land \mathsfit{I}) = 1\). We find that, indeed, the electronic component must for sure have failed the heating test!
The way in which the rules can be applied to arrive at the target inference is not unique. In fact, in some concrete applications it can require a lot of work to find how to connect target inference with starting ones via the rules. The result, however, will always be the same:
[Optional] Equivalence with truth-tables
If you have studied Boolean algebra, you may be familiar with truth-tables; for instance the one for “and” displayed on the side. The truth-inference rules (7.1)–(7.4) contain the truth-tables that you already know as special cases.
\(\mathsfit{X}\) | \(\mathsfit{Y}\) | \(\mathsfit{X}\land \mathsfit{Y}\) |
---|---|---|
1 | 1 | 1 |
1 | 0 | 0 |
0 | 1 | 0 |
0 | 0 | 0 |
The truth-inference rules (7.1)–(7.4) are more complicated than truth-tables, but have two important advantages. First, they allow us to work with conditionals, and to move sentences between proposals and conditionals. Second, they provide a smoother transition to the rules for probability-inference.
7.4 Logical AI agents and their limitations
The truth-inference discussed in this section are also the rules that a logical AI agent should follow. For example, the automated control and fault-management programs in NASA spacecrafts, mentioned in § 6.1, are programmed according to these rules.
Many – if not most – inference problems that human and AI agents must face are, however, of the uncertain kind: it is not possible to surely infer the truth of some outcome, and the truth of some initial data or initial inferences may not be known either. We shall now see how to generalize the truth-inference rules to uncertain situations.