37  Making decisions

Published

2024-09-28

37.1 Maximization of expected utility

In the previous chapter we associated utilities to consequences, that is, pairs \((\mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y})\) of decisions and outcomes. We can also associate utilities to the decisions alone; these are used to determine the optimal decision.

The expected utility of a decision \(\mathsfit{\color[RGB]{204,187,68}D}\) is calculated as a weighted average over all possible outcomes, the weights being the outcomes’ probabilities:

\[\mathrm{U}(\mathsfit{\color[RGB]{204,187,68}D}\nonscript\:\vert\nonscript\:\mathopen{}\mathsfit{I}) = \sum_{\color[RGB]{238,102,119}y} \mathrm{U}(\mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{I}) \cdot \mathrm{P}({\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{34,136,51}X\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}x} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{\color[RGB]{34,136,51}data}\mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I}) \]
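As a minimal sketch of the weighted average above, the following Python snippet computes the expected utility of each decision in the general case where the outcome probabilities depend on the decision. The decision names, outcome names, and all numbers are hypothetical, chosen only for illustration; this is not the course's own code.

```python
# Hypothetical utilities U(D, Y=y | I): one row per decision, one entry per outcome.
utilities = {
    "treat": {"recover": 8.0, "no_change": -2.0},
    "wait": {"recover": 10.0, "no_change": 0.0},
}

# Hypothetical probabilities P(Y=y | D, X=x, data, I): they differ between decisions.
probabilities = {
    "treat": {"recover": 0.9, "no_change": 0.1},
    "wait": {"recover": 0.5, "no_change": 0.5},
}


def expected_utility(decision):
    """Weighted average of the utilities, weights being the outcome probabilities."""
    return sum(
        utilities[decision][outcome] * probabilities[decision][outcome]
        for outcome in utilities[decision]
    )


expected = {decision: expected_utility(decision) for decision in utilities}
print(expected)
```

With these made-up numbers, "treat" has the larger expected utility even though "wait" has the larger utility for the best outcome: the probabilities matter as much as the utilities.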


According to Decision Theory the agent’s final decision is determined by the

Principle of maximal expected utility

The optimal decision, which should be made by the agent, is the one having maximal expected utility:

\[ \mathsfit{\color[RGB]{204,187,68}D}_{\text{optimal}} = \argmax_{\mathsfit{\color[RGB]{204,187,68}D}} \sum_{\color[RGB]{238,102,119}y} \mathrm{U}(\mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{I}) \cdot \mathrm{P}({\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{34,136,51}X\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}x} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{\color[RGB]{34,136,51}data}\mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I}) \]

(where, in some tasks, the probabilities may not depend on \(\mathsfit{\color[RGB]{204,187,68}D}\))

In the formula above, \(\argmax\limits_z G(z)\) is the value \(z^*\) which maximizes the function \(G(z)\). Note the difference:  \(\max\limits_z G(z)\)  is the value of the maximum itself (its \(y\)-coordinate, so to speak), whereas  \(\argmax\limits_z G(z)\)  is the value of the argument that gives the maximum (its \(x\)-coordinate). For instance

\[\max\limits_z \bigl[-(1-z)^2\bigr] = 0 \qquad\text{\small but}\qquad \argmax\limits_z \bigl[-(1-z)^2\bigr] = 1\]
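The distinction between the maximum and the argument of the maximum can also be sketched in a few lines of Python. The function \(G(z) = -(1-z)^2\) and the grid of candidate arguments are illustrative assumptions.

```python
def G(z):
    """An illustrative function with a single maximum at z = 1."""
    return -((1 - z) ** 2)


# Candidate arguments: a grid from -2 to 3 in steps of 0.5 (an arbitrary choice).
grid = [i / 2 for i in range(-4, 7)]

max_value = max(G(z) for z in grid)  # the maximum itself (the "y-coordinate")
argmax_value = max(grid, key=G)      # the argument giving the maximum (the "x-coordinate")

print(max_value, argmax_value)
```

In decision-making it is the argmax that matters: the agent needs to know which decision to make, not only how large the best expected utility is.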


It may happen that several decisions have equal, maximal expected utility. In this case any one of them can be chosen. A useful strategy is to choose one among them with equal probability. Such a strategy helps minimize the loss from possible small errors in the specification of the utilities, or from the presence of an antagonist agent that tries to predict what our agent is doing.
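A minimal sketch of this tie-breaking strategy in Python follows. The function name `optimal_decision`, the tolerance `tol` used to decide when two expected utilities count as equal, and the example numbers are all assumptions made for illustration.

```python
import random


def optimal_decision(expected_utilities, rng=random, tol=1e-12):
    """Return a decision with maximal expected utility.

    If several decisions are tied (within the assumed tolerance `tol`),
    one of them is selected with equal probability.
    """
    best = max(expected_utilities.values())
    tied = [d for d, eu in expected_utilities.items() if best - eu <= tol]
    return rng.choice(tied)


# Hypothetical expected utilities in which decisions "B" and "C" are tied.
eus = {"A": 1.5, "B": 2.0, "C": 2.0}
choice = optimal_decision(eus, rng=random.Random(0))
print(choice)
```

Passing a seeded `random.Random` instance makes a run reproducible; in deployment one would typically leave the generator unseeded so that an antagonist cannot predict the choice.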

Numerical implementation in simple cases

The principle of maximal expected utility is straightforward to apply numerically in many important problems.

In § 36.3 we represented the set of utilities by a utility matrix \(\boldsymbol{\color[RGB]{68,119,170}U}\). If the probabilities of the outcomes do not depend on the decisions, we represent them as a column matrix \(\boldsymbol{\color[RGB]{34,136,51}P}\), having one entry per outcome:

\[ \boldsymbol{\color[RGB]{34,136,51}P}\coloneqq \begin{bmatrix} \mathrm{P}({\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y}' \nonscript\:\vert\nonscript\:\mathopen{} {\color[RGB]{34,136,51}X\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}x} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{\color[RGB]{34,136,51}data}\mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I}) \\ \mathrm{P}({\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y}'' \nonscript\:\vert\nonscript\:\mathopen{} {\color[RGB]{34,136,51}X\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}x} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{\color[RGB]{34,136,51}data}\mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I}) \\ \dotso \end{bmatrix} \]

Then the collection of expected utilities is a column matrix, having one entry per decision, given by the matrix product \(\boldsymbol{\color[RGB]{68,119,170}U}\boldsymbol{\color[RGB]{34,136,51}P}\).

All that’s left is to check which of the entries in this final matrix is maximal.
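The matrix product and the final check can be sketched in plain Python as follows. The matrix entries are hypothetical (two decisions, three outcomes), and the helper name `expected_utilities` is an assumption made for this illustration.

```python
def expected_utilities(U, P):
    """Matrix product U P: one expected utility per decision (per row of U)."""
    return [sum(u * p for u, p in zip(row, P)) for row in U]


# Hypothetical utility matrix: rows are decisions, columns are outcomes.
U = [
    [4.0, 0.0, -2.0],
    [1.0, 1.0, 1.0],
]
# Hypothetical outcome probabilities, one entry per outcome (they sum to 1).
P = [0.5, 0.25, 0.25]

EU = expected_utilities(U, P)
best_index = max(range(len(EU)), key=EU.__getitem__)  # row index of the optimal decision
print(EU, best_index)
```

This form applies only when the probabilities do not depend on the decision; otherwise each decision needs its own probability column.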

37.2 Concrete example: targeted advertisement

As a concrete application of the principle of maximal expected utility, let’s keep using the adult-income task from chapter 35, in a typical present-day scenario.

Some corporation, which offers a particular phone app, wants to bombard its users with advertisements, because advertisement generates much more revenue than making the users pay for the app. For each user the corporation can choose one among three ad-types, let’s call them \({\color[RGB]{204,187,68}A}, {\color[RGB]{204,187,68}B}, {\color[RGB]{204,187,68}C}\). The revenue obtained from these ad-types depends on whether the target user’s income is \({\color[RGB]{238,102,119}{\small\verb;<=50K;}}\) or \({\color[RGB]{238,102,119}{\small\verb;>50K;}}\). A separate study run by the corporation has shown that the average revenues (per user per minute) depending on the three ad-types and the income levels are as follows:

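Table 37.1: Revenue depending on ad-type and income level

\[
\begin{array}{lcc}
 & \mathit{\color[RGB]{238,102,119}income}\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\color[RGB]{238,102,119}{\small\verb;<=50K;}} & \mathit{\color[RGB]{238,102,119}income}\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\color[RGB]{238,102,119}{\small\verb;>50K;}} \\
\hline
\text{ad-type } {\color[RGB]{204,187,68}A} & {\color[RGB]{68,119,170}-1\,\$} & {\color[RGB]{68,119,170}3\,\$} \\
\text{ad-type } {\color[RGB]{204,187,68}B} & {\color[RGB]{68,119,170}2\,\$} & {\color[RGB]{68,119,170}2\,\$} \\
\text{ad-type } {\color[RGB]{204,187,68}C} & {\color[RGB]{68,119,170}3\,\$} & {\color[RGB]{68,119,170}-1\,\$}
\end{array}
\]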

Ad-type \({\color[RGB]{204,187,68}B}\) is a neutral advertisement type that yields the same revenue regardless of the target user’s income. Ad-type \({\color[RGB]{204,187,68}A}\) targets high-income users, leading to higher revenue from them; but it leads to a loss if shown to the wrong target (more money is spent on making and deploying the ad than is gained from users’ purchases). Vice versa, ad-type \({\color[RGB]{204,187,68}C}\) targets low-income users, with the reverse effect.

The corporation doesn’t have access to its users’ income levels, but it covertly collects, through some other app, all or some of the eight predictor variates \(\color[RGB]{34,136,51}\mathit{workclass}\), \(\color[RGB]{34,136,51}\mathit{education}\), \(\color[RGB]{34,136,51}\dotsc\), \(\color[RGB]{34,136,51}\mathit{sex}\), \(\color[RGB]{34,136,51}\mathit{native\_country}\) from each of its users. The corporation also has access to the adult-income dataset (or let’s say a more recent version of it).

In this scenario the corporation would like to use an AI agent that can choose and show the optimal ad-type to each user.


Our prototype agent from chapters 33, 34, 35 can be used for such a task. It has already been trained with the dataset, and can use any subset (possibly even empty) of the eight predictors to calculate the probability for the two income levels.

All that’s left is to equip our prototype agent with a function that outputs the optimal decision, given the calculated probabilities and the set of utilities. In our code this is done by the function decide() described in chapter  34 and reprinted here:

decide(probs, utils=NULL)
Arguments:
  • probs: a probability distribution for one or more variates.
  • utils: a named matrix or array of utilities. The rows of the matrix correspond to the available decisions, the columns or remaining array dimensions correspond to the possible values of the predictand variates.
Output:

a list of elements EUs and optimal:

  • EUs is a vector containing the expected utilities of all decisions, sorted from highest to lowest
  • optimal is the decision having maximal expected utility, or one of them, if more than one, selected with equal probability
Notes:
  • If utils is missing or NULL, a matrix of the form \(\begin{bsmallmatrix}1&0&\dotso\\0&1&\dotso\\\dotso&\dotso&\dotso\end{bsmallmatrix}\) is assumed (which corresponds to using accuracy as evaluation metric).
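The note about the default utility matrix can be checked with a short Python sketch. With an identity utility matrix, the expected utility of each decision equals the probability of the corresponding value, so maximizing expected utility amounts to choosing the most probable value. The probabilities below are hypothetical, and the snippet is an independent illustration, not the implementation of decide().

```python
# Hypothetical probabilities for three possible values of a predictand.
probs = [0.2, 0.5, 0.3]

# Identity utility matrix: utility 1 for the matching value, 0 otherwise.
n = len(probs)
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Expected utility of each decision: row of the utility matrix times the probabilities.
EUs = [sum(u * p for u, p in zip(row, probs)) for row in identity]

optimal = max(range(n), key=EUs.__getitem__)          # maximal expected utility
most_probable = max(range(n), key=probs.__getitem__)  # maximal probability
print(EUs, optimal, most_probable)
```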

Example

A new user logs in; all eight predictors are available for this user:

\[\color[RGB]{34,136,51} \begin{aligned} &\mathit{workclass} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Private;} && \mathit{education} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Bachelors;} \\ & \mathit{marital\_status} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Never-married;} && \mathit{occupation} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Prof-specialty;} \\ & \mathit{relationship} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Not-in-family;} && \mathit{race} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;White;} \\ & \mathit{sex} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Female;} && \mathit{native\_country} \mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;United-States;} \end{aligned} \]

The agent calculates (using the infer() function) the probabilities for the two income levels, which turn out to be

\[ \begin{aligned} &\mathrm{P}({\color[RGB]{238,102,119}\mathit{\color[RGB]{238,102,119}income}\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\color[RGB]{238,102,119}{\small\verb;<=50K;}}} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{\color[RGB]{34,136,51}predictor}, \mathsfit{\color[RGB]{34,136,51}data}, \mathsfit{I}) = 83.3\% \\&\mathrm{P}({\color[RGB]{238,102,119}\mathit{\color[RGB]{238,102,119}income}\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\color[RGB]{238,102,119}{\small\verb;>50K;}}} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{\color[RGB]{34,136,51}predictor}, \mathsfit{\color[RGB]{34,136,51}data}, \mathsfit{I}) = 16.7\% \end{aligned} \]

and can be represented by the column matrix

\[\boldsymbol{\color[RGB]{34,136,51}P}\coloneqq \color[RGB]{34,136,51}\begin{bmatrix} 0.833\\0.167 \end{bmatrix} \]

The utilities previously given can be represented by the matrix

\[\boldsymbol{\color[RGB]{68,119,170}U}\coloneqq \color[RGB]{68,119,170}\begin{bmatrix} -1&3\\2&2\\3&-1 \end{bmatrix} \]

Multiplying the two matrices above we obtain the expected utilities of the three ad-types for the present user:

\[ \boldsymbol{\color[RGB]{68,119,170}U}\boldsymbol{\color[RGB]{34,136,51}P}= \color[RGB]{68,119,170}\begin{bmatrix} -1&3\\2&2\\3&-1 \end{bmatrix} \, \color[RGB]{34,136,51}\begin{bmatrix} 0.833\\0.167 \end{bmatrix}\color[RGB]{0,0,0} = \begin{bmatrix} {\color[RGB]{68,119,170}-1}\cdot{\color[RGB]{34,136,51}0.833} + {\color[RGB]{68,119,170}3}\cdot{\color[RGB]{34,136,51}0.167} \\ {\color[RGB]{68,119,170}2}\cdot{\color[RGB]{34,136,51}0.833} + {\color[RGB]{68,119,170}2}\cdot{\color[RGB]{34,136,51}0.167} \\ {\color[RGB]{68,119,170}3}\cdot{\color[RGB]{34,136,51}0.833} + ({\color[RGB]{68,119,170}-1})\cdot{\color[RGB]{34,136,51}0.167} \end{bmatrix} = \begin{bmatrix} -0.332\\ 2.000\\ \boldsymbol{2.332} \end{bmatrix} \]

The highest expected utility is that of ad-type \({\color[RGB]{204,187,68}C}\), which is therefore shown to the user.
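The same calculation can be reproduced with a short Python sketch, using the utilities of Table 37.1 and the two probabilities found for this user. This is an independent illustration of the arithmetic, not the course's decide() function; the sorting step only mimics its documented output.

```python
# Utilities from Table 37.1, in the order (<=50K, >50K).
utilities = {"A": (-1.0, 3.0), "B": (2.0, 2.0), "C": (3.0, -1.0)}
# Probabilities for this user, in the same order.
probs = (0.833, 0.167)

# Expected utility of each ad-type: utilities weighted by the probabilities.
EUs = {ad: sum(u * p for u, p in zip(row, probs)) for ad, row in utilities.items()}

# Sort from highest to lowest expected utility.
ranked = sorted(EUs.items(), key=lambda item: item[1], reverse=True)
optimal = ranked[0][0]

for ad, eu in ranked:
    print(f"{ad}: {eu:.3f}")
```

The ranking puts ad-type C first (2.332), then B (2.000), then A (-0.332), in agreement with the matrix product above.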

Powerful flexibility of the optimal predictor machine

In the previous chapters we already emphasized and witnessed the flexibility of the optimal predictor machine with regard to the availability of the predictors: it can draw an inference even if some or all predictors are missing.

Now we can see another powerful kind of flexibility: the optimal predictor machine can in principle use different sets of decisions and different utilities for each new application. The decision criterion is not “hard-coded”; it can be customized on the fly.

The possible number of ad-types and the utilities could even be a function of the predictor values. For instance, there could be a set of three ad-types targeting users with \(\color[RGB]{34,136,51}\mathit{education}\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Bachelors;}\), a different set of four ad-types targeting users with \(\color[RGB]{34,136,51}\mathit{education}\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}{\small\verb;Preschool;}\), and so on.
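A minimal Python sketch of this idea: the utility table, and with it the set of available ad-types, is looked up from the user's education value before the expected utilities are computed. The ad-type names E to H, all numbers in the Preschool table, and the probabilities are hypothetical; the Bachelors table simply reuses the numbers of Table 37.1 for illustration.

```python
# One utility table per education value; columns are (<=50K, >50K).
utilities_by_education = {
    "Bachelors": {"A": (-1.0, 3.0), "B": (2.0, 2.0), "C": (3.0, -1.0)},
    "Preschool": {
        "E": (1.0, 1.0),
        "F": (2.0, -3.0),
        "G": (0.0, 4.0),
        "H": (3.0, -2.0),
    },
}


def choose_ad(education, probs):
    """Pick the ad-type with maximal expected utility from the table for this education value."""
    table = utilities_by_education[education]
    eus = {ad: sum(u * p for u, p in zip(row, probs)) for ad, row in table.items()}
    return max(eus, key=eus.get), eus


# Hypothetical income probabilities for a user with education = Preschool.
ad, eus = choose_ad("Preschool", (0.75, 0.25))
print(ad, eus)
```

The inference step is untouched by this customization: only the table handed to the decision step changes from user to user.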


37.3 The full extent of Decision Theory

The simple decision-making problems and framework that we have discussed in these notes are only the basic building blocks of Decision Theory. The theory also covers more complicated decision problems. We only mention some examples:

  • Sequential decisions. Many decision-making problems involve sequences of possible decisions, alternating with sequences of possible outcomes. These sequences can be represented as decision trees. Decision Theory allows us to find the optimal decision sequence, for instance through the “averaging out and folding back” procedure.


  • Uncertain utilities. It is possible to recast Decision Theory and the principle of maximal expected utility in terms not of utility functions \(\mathrm{U}(\mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{I})\), but of probability distributions over utility values:

    \[\mathrm{P}({\color[RGB]{68,119,170}U\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}u} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{\color[RGB]{204,187,68}D}\mathbin{\mkern-0.5mu,\mkern-0.5mu}{\color[RGB]{238,102,119}Y\mathclose{}\mathord{\nonscript\mkern 0mu\textrm{\small=}\nonscript\mkern 0mu}\mathopen{}y} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I})\]

    Formally the two approaches can be shown to be equivalent.


  • Acquiring more information. In many situations the agent has one more possible choice: to gather more information in order to calculate sharper probabilities, rather than deciding immediately. This kind of decision is also accounted for by Decision Theory, and constitutes one of the theoretical bases of “reinforcement learning”.


  • Multi-agent problems. To some extent it is possible to consider situations (such as games) with several agents having different and even opposing utilities. This area of Decision Theory is apparently still under development.
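The “averaging out and folding back” procedure mentioned under sequential decisions can be sketched in Python on a tiny decision tree. The tree encoding (nested tuples), the node names, and all utilities and probabilities are hypothetical assumptions made for this illustration. Chance nodes are averaged out with their probabilities; decision nodes are folded back by keeping the best option.

```python
def fold_back(node):
    """Expected utility of a tree node under optimal decisions further down the tree."""
    if isinstance(node, (int, float)):
        return float(node)  # leaf: a utility
    kind, branches = node
    if kind == "chance":
        # Averaging out: probability-weighted average over the outcomes.
        return sum(p * fold_back(child) for p, child in branches)
    if kind == "decision":
        # Folding back: keep the value of the best available option.
        return max(fold_back(child) for child in branches.values())
    raise ValueError(f"unknown node kind: {kind!r}")


def best_option(decision_node):
    """Name of the option with maximal folded-back value at a decision node."""
    kind, branches = decision_node
    if kind != "decision":
        raise ValueError("best_option expects a decision node")
    return max(branches, key=lambda name: fold_back(branches[name]))


# Hypothetical problem: launch a product now, or pay 1 unit for a test that
# reveals the outcome first and then decide whether to launch or stop.
tree = ("decision", {
    "launch_now": ("chance", [(0.5, 10), (0.5, -6)]),
    "test_first": ("chance", [
        (0.5, ("decision", {"launch": 9, "stop": -1})),
        (0.5, ("decision", {"launch": -7, "stop": -1})),
    ]),
})

print(best_option(tree), fold_back(tree))
```

In this made-up tree, testing first has the larger folded-back value, which also illustrates the point about acquiring more information: gathering information is itself a decision whose expected utility can be compared with the others.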