Lecture 14: Uncertainty And Expected Utility
Advanced Microeconomics I, ITAM
Xinyang Wang∗
In this lecture, we will formally define uncertainty and study preferences over uncertainty. Our main focus will be on linear preferences.
1 Uncertainty
There are two ways to model uncertainty, each with its own mathematical advantages.
1.1 Lottery
Let C be a set of outcomes, which could be almost anything, but we will think of it as numbers. In addition, to reduce the level of technicalities, we assume C is a finite set in the following analysis.¹
First, we could describe uncertainty of outcomes by a probability distribution p over C.
Such a distribution is called a lottery. Formally, a lottery p is a function from C to R such that

p(c) ≥ 0, ∀c ∈ C,    ∑_{c∈C} p(c) = 1

Here, p(c) is the probability that outcome c happens.
We will denote the set of lotteries by L.
L = {p ∈ R^{|C|}_+ : ∑_{c∈C} p(c) = 1}
∗Please email me at [email protected] for typos or mistakes. Version: March 21, 2021.
¹All analysis below remains valid when we study a general, not necessarily finite, set C.
It is worth noting that L is a convex set. (Exercise.)
Figure 1: The set of lotteries when there are three outcomes
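To make the definitions concrete, here is a minimal Python sketch representing a lottery as a dictionary and checking that a convex combination of lotteries is again a lottery; the helper names are our own, not from the lecture.

```python
# A lottery over a finite outcome set C can be represented as a dict
# mapping outcomes to probabilities. The helper names (is_lottery, mix)
# are our own, not from the lecture.

def is_lottery(p, tol=1e-9):
    """Check p(c) >= 0 for all c and that the probabilities sum to 1."""
    return all(v >= 0 for v in p.values()) and abs(sum(p.values()) - 1) < tol

def mix(alpha, p, q):
    """Convex combination alpha*p + (1 - alpha)*q of two lotteries."""
    support = set(p) | set(q)
    return {c: alpha * p.get(c, 0.0) + (1 - alpha) * q.get(c, 0.0) for c in support}

p = {0: 0.5, 10: 0.5}
q = {0: 0.25, 10: 0.5, 20: 0.25}
assert is_lottery(p) and is_lottery(q)
assert is_lottery(mix(0.3, p, q))   # L is convex: a mixture is again a lottery
```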
1.2 Random Variable
A second approach to describe uncertainty is to fix a set S of states of the world together with
a probability distribution π : S → [0, 1] on S such that ∑_{s∈S} π(s) = 1 and π(s) ≥ 0, ∀s ∈ S.
For simplicity, we will for now assume S is a finite set.
The uncertainty of outcome is described by a function x : S → C, called a random
variable.
Figure 2: An illustration of a random variable
Note that a distribution π on S and a random variable x induce a lottery p via the equation: for any outcome c ∈ C,

p(c) = ∑_{s : x(s)=c} π(s)
As we think of the outcomes as numbers, we denote the set of random variables by R^{|S|}.
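This induced lottery is easy to compute. A minimal sketch, assuming the distribution π and the random variable x are given as dictionaries (the function name induced_lottery is our choice):

```python
def induced_lottery(pi, x):
    """p(c) = sum of pi(s) over states s with x(s) = c."""
    p = {}
    for s, prob in pi.items():
        c = x[s]                       # outcome realized in state s
        p[c] = p.get(c, 0.0) + prob    # accumulate probability mass on c
    return p

# The one-ticket example below: S = {wins, loses}, pi uniform.
pi = {"wins": 0.5, "loses": 0.5}
x = {"wins": 10, "loses": 0}
assert induced_lottery(pi, x) == {10: 0.5, 0: 0.5}
```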
1.3 Examples
Now, we illustrate the difference between these two concepts by two examples.
Example 1. First, say you have a lottery ticket. It wins with probability 50% and returns
10 dollars, and loses with probability 50% and returns 0.
To start with, we take the outcome set to be the set of integer numbers of dollars up to 100:

C = {0, 1, 2, ..., 100}
(As you may notice, a better outcome set is R+, which is infinite. But we will try to avoid
the mathematical complications created by infinity as much as possible.)
There are two ways to describe the uncertainty involved in this lottery ticket.
First, we can view the uncertainty as a lottery p over C. That is, the function p : C → R
is defined by
p(c) =
    1/2, if c = 0
    1/2, if c = 10
    0,   otherwise
Second, we can view it as a random variable. Here, the state space is

S = {wins, loses}

with the probability distribution π(wins) = 1/2, π(loses) = 1/2. The random variable x : S → C is defined by

x(wins) = 10, x(loses) = 0
As we have mentioned above, the state space (S, π) and the random variable x will induce
a lottery p over the set of outcomes by
p(c) = ∑_{s : x(s)=c} π(s) =
    1/2, if c = 0
    1/2, if c = 10
    0,   otherwise
Example 2. Now, suppose you have 2 such lottery tickets. Each of them wins with probability 50% and returns 10 dollars, and loses with probability 50% and returns 0. Assume the winning events of these tickets are independent.
First, to use a lottery to describe the uncertainty over outcomes, we notice there are three cases: neither ticket wins, exactly one ticket wins, and both tickets win, with probability 1/4, 1/2, 1/4 respectively. (If you do not understand how these probabilities are computed, keep reading to the random variable part.) Therefore, the lottery p over C is defined by
p(c) =
    1/4, if c = 0
    1/2, if c = 10
    1/4, if c = 20
    0,   otherwise
Second, to use a random variable to describe the uncertainty, we notice there are four
states of this event: both tickets lose, the first ticket wins while the second ticket loses, the
first ticket loses while the second ticket wins, both tickets win. We refer to these four states
as s1, s2, s3, s4 respectively. Therefore,
S = {s1, s2, s3, s4}
and each of these states appears with equal probability due to independence, i.e.
π(s1) = π(s2) = π(s3) = π(s4) = 1/4
The random variable x describes the map from states to outcomes:
x(s1) = 0, x(s2) = x(s3) = 10, x(s4) = 20
Now, we check that the state space (S, π) and the random variable x defined above induce the lottery p over outcomes:
p(0) = ∑_{s : x(s)=0} π(s) = π(s1) = 1/4

p(10) = ∑_{s : x(s)=10} π(s) = π(s2) + π(s3) = 1/2

p(20) = ∑_{s : x(s)=20} π(s) = π(s4) = 1/4

p(c) = ∑_{s : x(s)=c} π(s) = 0, ∀c ≠ 0, 10, 20
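The computation above can be checked mechanically. A sketch, assuming we encode the four states as pairs of ticket results:

```python
from itertools import product

# Build the four equally likely states for two independent tickets and
# the induced lottery over total winnings, as in Example 2.
states = list(product(["win", "lose"], repeat=2))        # s1, s2, s3, s4
pi = {s: 0.25 for s in states}                           # independence: 1/2 * 1/2
payoff = lambda ticket: 10 if ticket == "win" else 0
x = {s: payoff(s[0]) + payoff(s[1]) for s in states}     # the random variable

p = {}
for s, prob in pi.items():                               # p(c) = sum over x(s) = c
    p[x[s]] = p.get(x[s], 0.0) + prob
assert p == {20: 0.25, 10: 0.5, 0: 0.25}
```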
2 Expected Utility
Now, we start to study preferences over uncertainty. Mathematically, such a preference is an object ⪰ defined on the set of uncertainties, described either by the set of lotteries L or by the set of random variables R^{|S|}.
In this section, we define the expected utility representation. As there are two approaches
to describe uncertainty, there are two definitions:
When we describe uncertainties by lotteries, a von Neumann-Morgenstern utility function U : L → R is one of the form

U(p) = ∑_{c∈C} p(c)u(c)

where the payoff function over outcomes u : C → R is called the Bernoulli utility function.
When we describe uncertainties by random variables, a von Neumann-Morgenstern utility function U : R^{|S|} → R is one of the form

U(x) = ∑_{s∈S} π(s)u(x(s))

Here, π(s) is the probability of state s and u : C → R is the Bernoulli utility function.
In both cases, for clear reasons, the von Neumann-Morgenstern utility is called the expected utility. From now on, we will mainly focus on the first approach to model uncertainty, i.e. using lotteries. We will come back to the second approach at the end of this lecture.
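As a sketch of the two formulas, the two ways of computing expected utility agree whenever the random variable induces the lottery. The Bernoulli utility u(c) = √c below is an arbitrary choice of ours, not prescribed by the lecture.

```python
import math

# A sketch of the two expected-utility formulas with an arbitrary
# Bernoulli utility of our choosing, u(c) = sqrt(c).
u = math.sqrt

def eu_lottery(p, u):
    """U(p) = sum over c of p(c) u(c)."""
    return sum(prob * u(c) for c, prob in p.items())

def eu_random_variable(pi, x, u):
    """U(x) = sum over s of pi(s) u(x(s))."""
    return sum(prob * u(x[s]) for s, prob in pi.items())

p = {0: 0.5, 10: 0.5}                    # the lottery of Example 1
pi = {"wins": 0.5, "loses": 0.5}
x = {"wins": 10, "loses": 0}
# The two computations agree because x induces p under pi.
assert abs(eu_lottery(p, u) - eu_random_variable(pi, x, u)) < 1e-12
```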
3 von Neumann-Morgenstern Utility is Cardinal
In this section, we will see that the von Neumann-Morgenstern utility function is cardinal: the preference ordering over lotteries described by a von Neumann-Morgenstern utility function U remains unchanged after an increasing affine transformation of U, but not after an arbitrary monotone transformation of U (which would break its linearity).
To prove this fact, we will use the following lemma.
Recall the set of lotteries L is convex. We say a function U : L → R is linear if

U(∑_{k=1}^K α_k p_k) = ∑_{k=1}^K α_k U(p_k)

for any positive integer K, lotteries p1, ..., pK ∈ L, and any numbers α1, ..., αK such that α_k ≥ 0 for all 1 ≤ k ≤ K and ∑_{k=1}^K α_k = 1.²
Lemma. A function U : L → R is a von Neumann-Morgenstern utility function if and only
if it is linear.
Proof. First, we show any von Neumann-Morgenstern utility function U(p) = ∑_{c∈C} p(c)u(c) is linear. By definition, we take any probability vector (α1, ..., αK) and lotteries p1, ..., pK ∈ L. Let q = ∑_{k=1}^K α_k p_k. We have

q(c) = (∑_{k=1}^K α_k p_k)(c) = ∑_{k=1}^K α_k p_k(c), ∀c ∈ C
²Note L is convex, so a convex combination of lotteries p1, ..., pK, in the form ∑_{k=1}^K α_k p_k, is also a lottery in L. Therefore, U(∑_{k=1}^K α_k p_k) is well-defined. Moreover, note the coefficient vector (α1, ..., αK) is itself a lottery. Therefore, the object ∑_{k=1}^K α_k p_k is a lottery over a collection of lotteries, and is thus sometimes referred to as a compound lottery. The convexity of L implies that compound lotteries are themselves lotteries.
It implies

U(∑_{k=1}^K α_k p_k) = U(q)
    = ∑_{c∈C} q(c)u(c)    (as U is a von Neumann-Morgenstern utility function)
    = ∑_{c∈C} (∑_{k=1}^K α_k p_k(c)) u(c)
    = ∑_{c∈C} ∑_{k=1}^K α_k p_k(c) u(c)
    = ∑_{k=1}^K ∑_{c∈C} α_k p_k(c) u(c)³
    = ∑_{k=1}^K α_k ∑_{c∈C} p_k(c) u(c)
    = ∑_{k=1}^K α_k U(p_k)

Therefore, we proved that U is linear.
Conversely, we want to show any linear function U is a von Neumann-Morgenstern Utility
function. To prove this, for any outcome c ∈ C, we use 1c to denote the lottery that will
have an outcome c for sure. That is
1_c(c̃) =
    1, if c̃ = c
    0, if c̃ ≠ c
Note, any lottery p ∈ L can be written as

p = ∑_{c∈C} p(c) 1_c ⁴
³Switching the order of two finite sums is always valid. When we want to switch the order of two infinite sums, we should refer to Fubini's theorem.
⁴For instance, the lottery (1/2, 1/3, 1/6) can be written as 1/2·(1, 0, 0) + 1/3·(0, 1, 0) + 1/6·(0, 0, 1).
By the linearity of U, and since p is a probability distribution, we know

U(p) = ∑_{c∈C} p(c)U(1_c) = ∑_{c∈C} p(c)u(c)
where u(c) is defined to be U(1_c). Therefore, U is a von Neumann-Morgenstern utility function. □
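The first direction of the lemma can be illustrated numerically; a sketch with an arbitrary Bernoulli utility of our choosing:

```python
# Numerical illustration of the lemma's first direction: a von
# Neumann-Morgenstern utility is linear in the lottery. The Bernoulli
# utility values below are an arbitrary choice of ours.
C = [0, 10, 20]
u = {0: 0.0, 10: 1.0, 20: 1.5}

def U(p):
    """U(p) = sum over c of p(c) u(c)."""
    return sum(p[c] * u[c] for c in C)

p1 = {0: 0.5, 10: 0.5, 20: 0.0}
p2 = {0: 0.25, 10: 0.5, 20: 0.25}
alpha = 0.3
q = {c: alpha * p1[c] + (1 - alpha) * p2[c] for c in C}   # compound lottery
assert abs(U(q) - (alpha * U(p1) + (1 - alpha) * U(p2))) < 1e-12
```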
Next, we state the main observation in this section.
Theorem. Let U : L → R and V : L → R be two von Neumann-Morgenstern utility functions. U and V represent the same preference on L if and only if V = aU + b for some a > 0, b ∈ R.
Remark.
• Recall U represents a preference ⪰ on L if for all p, p′ ∈ L, p ⪰ p′ ⇐⇒ U(p) ≥ U(p′).
• This theorem says that the von Neumann-Morgenstern utility function is cardinal, not ordinal.
Proof. The (⇐=) part is easy:

p ⪰ q ⇐⇒ U(p) ≥ U(q)
    ⇐⇒ aU(p) + b ≥ aU(q) + b
    ⇐⇒ V(p) ≥ V(q)
Conversely, we suppose U and V are von Neumann-Morgenstern utility functions that represent the same preference over L. By the lemma, U and V are linear.

Case 1: U is constant on L. Then V must be constant as well. So V = U + b for some b ∈ R.

Case 2: U is not constant. Then, there exist lotteries p1, p2 ∈ L such that U(p1) < U(p2). Now, we prove V(p) = aU(p) + b for all p ∈ L for some a > 0 and b ∈ R.
Consider any p ∈ L and write

U(p) = λU(p1) + (1 − λ)U(p2)

Here, the number λ can be solved from this equation and we obtain

λ = (U(p2) − U(p)) / (U(p2) − U(p1))

Also, by linearity of U,

U(p) = U(λp1 + (1 − λ)p2)
Case 2.1: If p ∈ L is such that λ ∈ [0, 1], we have

V(p) = V(λp1 + (1 − λ)p2)    (as U, V represent the same preference)
    = λV(p1) + (1 − λ)V(p2)    (by linearity)
    = V(p2) − λ(V(p2) − V(p1))
    = V(p2) − [(U(p2) − U(p)) / (U(p2) − U(p1))] (V(p2) − V(p1))
    = [(V(p2) − V(p1)) / (U(p2) − U(p1))] U(p) + V(p2) − [U(p2) / (U(p2) − U(p1))] (V(p2) − V(p1))
    = aU(p) + b

where a = (V(p2) − V(p1)) / (U(p2) − U(p1)) > 0 and b = V(p2) − [U(p2) / (U(p2) − U(p1))] (V(p2) − V(p1)).
Case 2.2: p ∈ L is such that λ ∉ [0, 1]; then either λ < 0 or λ > 1. The proofs of these two cases are the same. We study λ < 0.

We note the only obstacle to repeating the process of Case 2.1 is the step using linearity. That is, if we could prove

V(p) = λV(p1) + (1 − λ)V(p2)

for λ < 0, we could repeat the process above and obtain V = aU + b, with coefficients a, b defined in the same way.
To prove this, we observe that if λ < 0, then U(p) is to the right of U(p1) and U(p2). Therefore, U(p2) is a convex combination of U(p1) and U(p). Formally,

U(p) = λU(p1) + (1 − λ)U(p2)

U(p2) = [−λ/(1 − λ)]U(p1) + [1/(1 − λ)]U(p)
Let µ = −λ/(1 − λ). As µ = 1 − 1/(1 − λ), µ is decreasing in λ. Thus, for λ ∈ (−∞, 0), µ ∈ (0, 1). Therefore, by linearity,

U(p2) = µU(p1) + (1 − µ)U(p) = U(µp1 + (1 − µ)p)
As U, V represent the same preference,

V(p2) = V(µp1 + (1 − µ)p)
    = µV(p1) + (1 − µ)V(p)
    = [−λ/(1 − λ)]V(p1) + [1/(1 − λ)]V(p)

Rearranging the terms, we have V(p) = λV(p1) + (1 − λ)V(p2). □
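The theorem can be illustrated numerically: an increasing affine transformation of a vNM utility preserves the ranking of lotteries, while a nonlinear monotone transformation of the Bernoulli utility u can reverse it. A sketch, where all the particular numbers are our own choices:

```python
# An increasing affine transform of a vNM utility preserves the ranking
# of lotteries, while a nonlinear monotone transform of the Bernoulli
# utility u can reverse it. All numbers below are our own choices.
u = {0: 0.0, 10: 1.2, 100: 2.0}

def EU(p, u):
    return sum(prob * u[c] for c, prob in p.items())

safe = {10: 1.0}               # 10 dollars for sure
risky = {0: 0.5, 100: 0.5}     # 0 or 100 with equal chance

V = lambda p: 3 * EU(p, u) + 7                 # affine transform: V = 3U + 7
assert (EU(safe, u) > EU(risky, u)) == (V(safe) > V(risky))

w = {c: v ** 3 for c, v in u.items()}          # monotone (cubic) transform of u
assert EU(safe, u) > EU(risky, u)              # safe preferred under u ...
assert EU(risky, w) > EU(safe, w)              # ... but reversed under w
```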
4 Axiomatic Characterization of Expected Utility
In this section, we present assumptions on the preference ordering ⪰ on L that imply it has a von Neumann-Morgenstern utility representation.
Recall that if a preference has a utility representation, it must be complete and transitive.
In addition, by Debreu’s theorem, on a convex set of possible choices, a preference has a
continuous representation if and only if it is continuous. Since von Neumann-Morgenstern
utility is clearly continuous, we will also need � to be continuous.
Now, we define what it means for a preference ⪰ on lotteries to be continuous: ⪰ is continuous on L if one of the following equivalent definitions holds:

• ⪰ is preserved at the limit. That is, for any convergent sequences of lotteries pn, qn in L, if pn → p, qn → q, and pn ⪰ qn for all n, then p ⪰ q.

• For any p, p′, p′′ ∈ L, the sets

{α ∈ [0, 1] : αp + (1 − α)p′ ⪰ p′′}
{α ∈ [0, 1] : p′′ ⪰ αp + (1 − α)p′}

are closed.
Exercise.
1. Prove the equivalence of these two definitions.
2. Prove that if ⪰ is continuous, then for all p ⪰ p′′ ⪰ p′, there exists an α ∈ [0, 1] such that αp + (1 − α)p′ ∼ p′′.
Intuitively, the second statement in the above exercise says that if ⪰ is continuous, then for any lotteries p ⪰ p′′ ⪰ p′, p′′ must be indifferent to some mixture of p and p′. For instance, suppose a decision maker prefers having an apple for certain to having a pear for certain, and prefers having a pear for certain to having a banana for certain. If this decision maker has a continuous preference, then he or she will be indifferent between having a pear for certain and the random event of having an apple or a banana with some probability distribution.
Now, we have completeness, transitivity and continuity of preferences. By Debreu's theorem, there is a continuous representation of the preference over lotteries. However, as we saw in the last section, the von Neumann-Morgenstern utility function is characterized⁵ by its linearity. Therefore, we naturally need an assumption on the preference that corresponds to the linearity of the representation. This assumption is the Independence Axiom.

We say ⪰ on L satisfies the independence axiom if for any p, p′, p′′ ∈ L and α ∈ (0, 1),

p ⪰ p′ ⇐⇒ αp + (1 − α)p′′ ⪰ αp′ + (1 − α)p′′

This axiom says that when a decision maker compares αp + (1 − α)p′′ to αp′ + (1 − α)p′′, he or she should focus on the distinction between p and p′ and hold the same preference independently of both α and p′′.
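For a vNM utility, the independence axiom is a direct consequence of linearity; a quick numerical sketch with lotteries of our own choosing:

```python
# For a vNM utility, mixing both sides with the same p'' and the same
# weight alpha never reverses the comparison. Numbers are our own.
u = {0: 0.0, 10: 1.0, 20: 1.5}

def EU(p):
    return sum(prob * u[c] for c, prob in p.items())

def mix(alpha, p, q):
    support = set(p) | set(q)
    return {c: alpha * p.get(c, 0.0) + (1 - alpha) * q.get(c, 0.0) for c in support}

p = {10: 1.0}                       # plays the role of p
p2 = {0: 0.5, 20: 0.5}              # plays the role of p'
p3 = {0: 0.2, 10: 0.3, 20: 0.5}     # the common third lottery p''
alpha = 0.4
assert (EU(p) > EU(p2)) == (EU(mix(alpha, p, p3)) > EU(mix(alpha, p2, p3)))
```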
With these four assumptions on preference, we are ready to state our representation
theorem.
Theorem (von Neumann-Morgenstern). A complete and transitive preference � on L is
continuous and satisfies the independence axiom if and only if it has an expected utility rep-
resentation. Moreover, such a representation is unique up to positive affine transformations.
Idea of proof. Let q and q′ be two lotteries with q ≻ q′. Define U(q) = 1 and U(q′) = 0. First, if p ∈ L is such that q ⪰ p ⪰ q′, then by continuity, there is an α ∈ [0, 1] such that

p ∼ αq + (1 − α)q′

We define U(p) = α. Similarly, when p ∈ L is such that p ⪰ q ≻ q′, by continuity, we have q ∼ αp + (1 − α)q′, and we define U(p) = 1/α; when p ∈ L is such that q ≻ q′ ⪰ p, by continuity, we have q′ ∼ αq + (1 − α)p, and we define U(p) = −α/(1 − α). The independence axiom then implies that the function U defined above is linear. (Although proving this requires some hard work.) By the characterization result in the previous section, we know U is a von Neumann-Morgenstern utility function. The uniqueness was proved in the previous section. □

⁵That is, by the "if and only if" in the lemma of the previous section.
5 The Boundedness of the Bernoulli Utility Function
As observed by Bernoulli, the Bernoulli utility function u : C → R needs to be a bounded function. (Here, we extend our outcome set C to be infinite; think of it as R+.)
The following argument is usually referred to as the St. Petersburg paradox. This paradox challenges the old idea that people value random outcomes according to their expected returns.
Suppose a fair coin is tossed until a head appears. If the first head appears on the n-th toss, the payoff is 2^n dollars. Therefore, the expected payoff of this lottery is

∑_{n=1}^{∞} P(head first appears at step n) · 2^n = ∑_{n=1}^{∞} (1/2^n) · 2^n = ∞

That is, no matter how much income a decision maker has, he would be better off giving all his income in exchange for this lottery. This seems unreasonable. Therefore, it suggests that we probably do not decide according to the expected return of a lottery.
By the same argument, the Bernoulli utility function u needs to be bounded.
Proof. Formally, if u is unbounded, then there exists a sequence c1, ..., cn, ... ∈ C such that

u(c_n) ≥ 2^n, ∀n ∈ N

Consider a lottery p defined by flipping a fair coin: if the head first appears on the n-th toss, the return is c_n. Then the expected utility of the lottery p is

∑_{n=1}^{∞} (1/2^n) u(c_n) ≥ ∑_{n=1}^{∞} (1/2^n) 2^n = ∞

That is, the decision maker would be willing to pay any amount of money to have this lottery, which seems unreasonable. Therefore, we assume that u is bounded. □
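A short computation illustrates both points: the truncated expected payoff grows without bound, while a bounded Bernoulli utility (here a crude cap, our own choice) keeps the expected utility finite.

```python
# Truncated sums for the coin-tossing lottery: the expected payoff grows
# without bound in N, while a bounded Bernoulli utility (a crude cap,
# our own choice) keeps the expected utility finite.
def expected_payoff(N):
    """Partial sum of (1/2^n) * 2^n for n = 1..N; equals N exactly."""
    return sum((1 / 2**n) * 2**n for n in range(1, N + 1))

def expected_utility(N, bound=100.0):
    """Same sum with a Bernoulli utility capped at `bound`."""
    u = lambda c: min(c, bound)
    return sum((1 / 2**n) * u(2**n) for n in range(1, N + 1))

assert expected_payoff(50) == 50.0        # diverges linearly in N
assert expected_utility(60) < 8.0         # converges once u is bounded
```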
6 Subjective Probability
Recall the example in Section 1: we toss a fair coin; with probability 1/2 the lottery ticket wins and we get 10 dollars, and otherwise the lottery ticket loses and we get nothing. We know the probability distribution π = (1/2, 1/2) on S = {wins, loses}, and we can discuss how we value the random variable which gives 10 dollars when the lottery wins and gives nothing otherwise. One such valuation is defined by the expected utility

U(x) = ∑_{s∈S} π(s)u(x(s))
Now, we note that sometimes the probability distribution over states is not a commonly agreed object. For instance, if we change the interpretation of the above state space to the winning state and the losing state of a football game between Mexico and the United States (that is, s = wins if Mexico wins, s = loses if Mexico loses), then, regarding π, Mexicans and Americans will probably have different opinions.
Savage (1954) suggests that this probability π over states should be derived from the preference over random variables, if we assume one maximizes expected returns. For instance, we can ask a Mexican football fan to choose between the above random variable (i.e. returns 10 dollars if Mexico wins, and otherwise zero) and a random variable which returns k dollars in both states. If he or she is indifferent between these random variables when k = 5, we infer that he or she probably thinks the probability that Mexico wins is 50%; if he or she is indifferent between these random variables when k = 9.9, we infer that he or she thinks the probability that Mexico wins is 99%.
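Under the simplifying assumption (ours) of a risk-neutral Bernoulli utility u(c) = c, the elicited probability is just k divided by the prize; a sketch with a function name of our own:

```python
# Eliciting a subjective probability from an indifference point, under
# the simplifying assumption (ours) of a risk-neutral Bernoulli utility
# u(c) = c: indifference between the bet and k for sure gives
# pi * prize = k, hence pi = k / prize.
def implied_probability(k, prize=10.0):
    return k / prize

assert implied_probability(5.0) == 0.5                  # k = 5   -> 50%
assert abs(implied_probability(9.9) - 0.99) < 1e-12     # k = 9.9 -> 99%
```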
Now, we ask whether this argument works for general decision problems; that is, whether such a probability distribution π on the states always exists. Recall the set of random variables is R^{|S|}. We have a preference ⪰ on R^{|S|}, and we wonder under which conditions ⪰ can be represented by a utility function of the form

U(x) = ∑_{s∈S} π(s)x(s)
Theorem 1. If the preference ordering ⪰ on R^{|S|} is complete and transitive and satisfies

a) Continuity: for all x ∈ R^{|S|}, the sets {y ∈ R^{|S|} : y ⪰ x} and {y ∈ R^{|S|} : x ⪰ y} are closed.

b) Monotonicity: x > y implies x ≻ y.

c) Independence: for all x, y, z ∈ R^{|S|} and α ∈ (0, 1),

x ⪰ y ⇐⇒ αx + (1 − α)z ⪰ αy + (1 − α)z

then there exists a probability distribution π on S such that ⪰ can be represented by

U(x) = Ex = ∑_{s∈S} π(s)x(s)
Remark. The proof uses the separating hyperplane theorem; the monotonicity assumption ensures we can apply this theorem.
7 Other types of preferences
So far, we focus on the von Neumann-Morgenstern utility function towards uncertainty. In
this section, we mention a few other possibilities.
1. Preference for uniformity:

U(p) = −∑_{n=1}^{N} (p_n − 1/N)²
2. Preference for certainty:

U(p) = max_n p_n
3. Preference for the worst case: with C = {c1, ..., cN} ⊂ R,

U(p) = min{c_n : p_n > 0}
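These three alternatives are easy to write down; a sketch, taking a lottery as a list (p_1, ..., p_N) over outcomes (c_1, ..., c_N), with function names of our own:

```python
# Sketches of the three alternative preferences, taking a lottery as a
# list p = [p_1, ..., p_N] over outcomes C = [c_1, ..., c_N].
def uniformity(p):
    """U(p) = -sum_n (p_n - 1/N)^2: maximized by the uniform lottery."""
    N = len(p)
    return -sum((pn - 1 / N) ** 2 for pn in p)

def certainty(p):
    """U(p) = max_n p_n: maximized by degenerate (certain) lotteries."""
    return max(p)

def worst_case(p, C):
    """U(p) = min{c_n : p_n > 0}: the worst outcome in the support."""
    return min(c for c, pn in zip(C, p) if pn > 0)

C = [0, 10, 20]
assert uniformity([1/3, 1/3, 1/3]) == 0.0
assert certainty([0.0, 1.0, 0.0]) == 1.0
assert worst_case([0.5, 0.5, 0.0], C) == 0
```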