Lecture 14: Uncertainty And Expected Utility
Advanced Microeconomics I, ITAM
Xinyang Wang∗
In this lecture, we will formally define uncertainty and study preferences over uncertainty. Our main focus will be on linear preferences.
1 Uncertainty
There are two ways to model uncertainty, each with its own mathematical advantages.
1.1 Lottery
Let C be a set of outcomes, which could be almost anything, but we will think of it as numbers. In addition, to reduce the level of technicalities, we assume C is a finite set in the following analysis.¹
First, we could describe uncertainty of outcomes by a probability distribution p over C.
Such a distribution is called a lottery. Formally, a lottery p is a function from C to R such that

p(c) ≥ 0, ∀c ∈ C,    ∑_{c∈C} p(c) = 1

Here, p(c) is the probability that outcome c happens.
We will denote the set of lotteries by L.
L = {p ∈ R^{|C|}_+ : ∑_{c∈C} p(c) = 1}
∗Please email me at [email protected] for typos or mistakes. Version: March 21, 2021.
¹All analysis below remains valid when we study a general, not necessarily finite, set C.
It is worth noting that L is a convex set. (Exercise.)
Figure 1: The set of lotteries when there are three outcomes
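To make the definitions concrete, here is a minimal Python sketch representing a lottery as a dictionary and checking that a convex combination of lotteries is again a lottery; the helper names are our own, not from the lecture.

```python
# A lottery over a finite outcome set C can be represented as a dict
# mapping outcomes to probabilities. The helper names (is_lottery, mix)
# are our own, not from the lecture.

def is_lottery(p, tol=1e-9):
    """Check p(c) >= 0 for all c and that the probabilities sum to 1."""
    return all(v >= 0 for v in p.values()) and abs(sum(p.values()) - 1) < tol

def mix(alpha, p, q):
    """Convex combination alpha*p + (1 - alpha)*q of two lotteries."""
    support = set(p) | set(q)
    return {c: alpha * p.get(c, 0.0) + (1 - alpha) * q.get(c, 0.0) for c in support}

p = {0: 0.5, 10: 0.5}
q = {0: 0.25, 10: 0.5, 20: 0.25}
assert is_lottery(p) and is_lottery(q)
assert is_lottery(mix(0.3, p, q))   # L is convex: a mixture is again a lottery
```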
1.2 Random Variable
A second approach to describe uncertainty is to fix a set S of states of the world together with
a probability distribution π : S → [0, 1] on S such that ∑_{s∈S} π(s) = 1 and π(s) ≥ 0, ∀s ∈ S.
For simplicity, we will for now assume S is a finite set.
The uncertainty of outcome is described by a function x : S → C, called a random
variable.
Figure 2: An illustration of a random variable
Note that a distribution π on S and a random variable x induce a lottery p via the equation: for any outcome c ∈ C,

p(c) = ∑_{s : x(s)=c} π(s)
As we think of the outcomes as numbers, we denote the set of random variables by R^{|S|}.
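This induced lottery is easy to compute. A minimal sketch, assuming the distribution π and the random variable x are given as dictionaries (the function name induced_lottery is our choice):

```python
def induced_lottery(pi, x):
    """p(c) = sum of pi(s) over states s with x(s) = c."""
    p = {}
    for s, prob in pi.items():
        c = x[s]                       # outcome realized in state s
        p[c] = p.get(c, 0.0) + prob    # accumulate probability mass on c
    return p

# The one-ticket example below: S = {wins, loses}, pi uniform.
pi = {"wins": 0.5, "loses": 0.5}
x = {"wins": 10, "loses": 0}
assert induced_lottery(pi, x) == {10: 0.5, 0: 0.5}
```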
1.3 Examples
Now, we illustrate the difference between these two concepts by two examples.
Example 1. First, say you have a lottery ticket. It wins with probability 50% and returns
10 dollars, and loses with probability 50% and returns 0.
To start with, we take the outcome set to be the set of integer numbers of dollars up to 100:

C = {0, 1, 2, ..., 100}
(As you may notice, a better outcome set is R+, which is infinite. But we will try to avoid
the mathematical complications created by infinity as much as possible.)
There are two ways to describe the uncertainty involved in this lottery ticket.
First, we can view the uncertainty as a lottery p over C. That is, the function p : C → R
is defined by
p(c) =
    1/2, if c = 0
    1/2, if c = 10
    0,   otherwise
Second, we can view it as a random variable. Here, the state space is

S = {wins, loses}

with the probability distribution π(wins) = 1/2, π(loses) = 1/2. The random variable x : S → C is defined by

x(wins) = 10, x(loses) = 0
As we have mentioned above, the state space (S, π) and the random variable x will induce
a lottery p over the set of outcomes by
p(c) = ∑_{s : x(s)=c} π(s) =
    1/2, if c = 0
    1/2, if c = 10
    0,   otherwise
Example 2. Now, suppose you have 2 such lottery tickets. Each of them wins with probability 50% and returns 10 dollars, and loses with probability 50% and returns 0. Assume the winning events of these tickets are independent.
First, to use a lottery to describe the uncertainty over outcomes, we notice there are three cases: neither ticket wins, exactly one ticket wins, and both tickets win, with probability 1/4, 1/2, 1/4 respectively. (If you do not understand how these probabilities are computed, keep reading to the random variable part.) Therefore, the lottery p over C is defined by
p(c) =
    1/4, if c = 0
    1/2, if c = 10
    1/4, if c = 20
    0,   otherwise
Second, to use a random variable to describe the uncertainty, we notice there are four
states of this event: both tickets lose, the first ticket wins while the second ticket loses, the
first ticket loses while the second ticket wins, both tickets win. We refer to these four states
as s1, s2, s3, s4 respectively. Therefore,
S = {s1, s2, s3, s4}
and each of these states appears with equal probability due to independence, i.e.
π(s1) = π(s2) = π(s3) = π(s4) = 1/4
The random variable x describes the map from states to outcomes:
x(s1) = 0, x(s2) = x(s3) = 10, x(s4) = 20
Now, we check that the state space (S, π) and the random variable x defined above induce the lottery p over outcomes:
p(0) = ∑_{s : x(s)=0} π(s) = π(s1) = 1/4

p(10) = ∑_{s : x(s)=10} π(s) = π(s2) + π(s3) = 1/2

p(20) = ∑_{s : x(s)=20} π(s) = π(s4) = 1/4

p(c) = ∑_{s : x(s)=c} π(s) = 0, ∀c ≠ 0, 10, 20
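The computation above can be checked mechanically. A sketch, assuming we encode the four states as pairs of ticket results:

```python
from itertools import product

# Build the four equally likely states for two independent tickets and
# the induced lottery over total winnings, as in Example 2.
states = list(product(["win", "lose"], repeat=2))        # s1, s2, s3, s4
pi = {s: 0.25 for s in states}                           # independence: 1/2 * 1/2
payoff = lambda ticket: 10 if ticket == "win" else 0
x = {s: payoff(s[0]) + payoff(s[1]) for s in states}     # the random variable

p = {}
for s, prob in pi.items():                               # p(c) = sum over x(s) = c
    p[x[s]] = p.get(x[s], 0.0) + prob
assert p == {20: 0.25, 10: 0.5, 0: 0.25}
```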
2 Expected Utility
Now, we start to study preferences over uncertainty. Mathematically, such a preference is an object ⪰ defined on the set of uncertainties, described either by the set of lotteries L or by the set of random variables R^{|S|}.
In this section, we define the expected utility representation. As there are two approaches
to describe uncertainty, there are two definitions:
When we describe uncertainties by lotteries, a von Neumann-Morgenstern utility function U : L → R is one of the form

U(p) = ∑_{c∈C} p(c)u(c)

where the payoff function over outcomes u : C → R is called the Bernoulli utility function.
When we describe uncertainties by random variables, a von Neumann-Morgenstern utility function U : R^{|S|} → R is one of the form

U(x) = ∑_{s∈S} π(s)u(x(s))

Here, π(s) is the probability of state s and u : C → R is the Bernoulli utility function.
In both cases, for clear reasons, the von Neumann-Morgenstern utility is called the expected utility. From now on, we will mainly focus on the first approach to model uncertainty, i.e. using lotteries. We will come back to the second approach at the end of this lecture.
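As a sketch of the two formulas, the two ways of computing expected utility agree whenever the random variable induces the lottery. The Bernoulli utility u(c) = √c below is an arbitrary choice of ours, not prescribed by the lecture.

```python
import math

# A sketch of the two expected-utility formulas with an arbitrary
# Bernoulli utility of our choosing, u(c) = sqrt(c).
u = math.sqrt

def eu_lottery(p, u):
    """U(p) = sum over c of p(c) u(c)."""
    return sum(prob * u(c) for c, prob in p.items())

def eu_random_variable(pi, x, u):
    """U(x) = sum over s of pi(s) u(x(s))."""
    return sum(prob * u(x[s]) for s, prob in pi.items())

p = {0: 0.5, 10: 0.5}                    # the lottery of Example 1
pi = {"wins": 0.5, "loses": 0.5}
x = {"wins": 10, "loses": 0}
# The two computations agree because x induces p under pi.
assert abs(eu_lottery(p, u) - eu_random_variable(pi, x, u)) < 1e-12
```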
3 von Neumann-Morgenstern Utility is Cardinal
In this section, we will see that the von Neumann-Morgenstern utility function is cardinal: the preference ordering over lotteries described by a von Neumann-Morgenstern utility function U remains unchanged after an increasing affine transformation of U, but not after an arbitrary monotone transformation of U (which would break its linearity).
To prove this fact, we will use the following lemma.
Recall the set of lotteries L is convex. We say a function U : L → R is linear if

U(∑_{k=1}^K α_k p_k) = ∑_{k=1}^K α_k U(p_k)

for any positive integer K, lotteries p1, ..., pK ∈ L, and any numbers α1, ..., αK such that α_k ≥ 0 for all 1 ≤ k ≤ K and ∑_{k=1}^K α_k = 1.²
Lemma. A function U : L → R is a von Neumann-Morgenstern utility function if and only
if it is linear.
Proof. First, we show any von Neumann-Morgenstern utility function U(p) = ∑_{c∈C} p(c)u(c) is linear. By definition, we take any probability vector (α1, ..., αK) and lotteries p1, ..., pK ∈ L. Let q = ∑_{k=1}^K α_k p_k. We have

q(c) = (∑_{k=1}^K α_k p_k)(c) = ∑_{k=1}^K α_k p_k(c), ∀c ∈ C
²Note L is convex, so a convex combination of lotteries p1, ..., pK, in the form ∑_{k=1}^K α_k p_k, is also a lottery in L. Therefore, U(∑_{k=1}^K α_k p_k) is well-defined. Moreover, note the coefficient vector (α1, ..., αK) is itself a lottery. Therefore, the object ∑_{k=1}^K α_k p_k is a lottery over a collection of lotteries, and is thus sometimes referred to as a compound lottery. The convexity of L implies that compound lotteries are themselves lotteries.
It implies

U(∑_{k=1}^K α_k p_k) = U(q)
    = ∑_{c∈C} q(c)u(c)    (as U is a von Neumann-Morgenstern utility function)
    = ∑_{c∈C} (∑_{k=1}^K α_k p_k(c)) u(c)
    = ∑_{c∈C} ∑_{k=1}^K α_k p_k(c) u(c)
    = ∑_{k=1}^K ∑_{c∈C} α_k p_k(c) u(c)³
    = ∑_{k=1}^K α_k ∑_{c∈C} p_k(c) u(c)
    = ∑_{k=1}^K α_k U(p_k)

Therefore, we proved that U is linear.
Conversely, we want to show any linear function U is a von Neumann-Morgenstern Utility
function. To prove this, for any outcome c ∈ C, we use 1c to denote the lottery that will
have an outcome c for sure. That is
1_c(c̃) =
    1, if c̃ = c
    0, if c̃ ≠ c
Note, any lottery p ∈ L can be written as

p = ∑_{c∈C} p(c) 1_c ⁴
³Switching the order of two finite sums is always valid. When we want to switch the order of two infinite sums, we should refer to Fubini's theorem.
⁴For instance, the lottery (1/2, 1/3, 1/6) can be written as 1/2·(1, 0, 0) + 1/3·(0, 1, 0) + 1/6·(0, 0, 1).
By the linearity of U, and since p is a probability distribution, we know

U(p) = ∑_{c∈C} p(c)U(1_c) = ∑_{c∈C} p(c)u(c)
where u(c) is defined to be U(1_c). Therefore, U is a von Neumann-Morgenstern utility function. □
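The first direction of the lemma can be illustrated numerically; a sketch with an arbitrary Bernoulli utility of our choosing:

```python
# Numerical illustration of the lemma's first direction: a von
# Neumann-Morgenstern utility is linear in the lottery. The Bernoulli
# utility values below are an arbitrary choice of ours.
C = [0, 10, 20]
u = {0: 0.0, 10: 1.0, 20: 1.5}

def U(p):
    """U(p) = sum over c of p(c) u(c)."""
    return sum(p[c] * u[c] for c in C)

p1 = {0: 0.5, 10: 0.5, 20: 0.0}
p2 = {0: 0.25, 10: 0.5, 20: 0.25}
alpha = 0.3
q = {c: alpha * p1[c] + (1 - alpha) * p2[c] for c in C}   # compound lottery
assert abs(U(q) - (alpha * U(p1) + (1 - alpha) * U(p2))) < 1e-12
```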
Next, we state the main observation in this section.
Theorem. Let U : L → R and V : L → R be two von Neumann-Morgenstern utility functions. U and V represent the same preference on L if and only if V = aU + b for some a > 0, b ∈ R.
Remark.
• Recall U represents a preference ⪰ on L if for all p, p′ ∈ L, p ⪰ p′ ⇐⇒ U(p) ≥ U(p′).
• This theorem says that the von Neumann-Morgenstern utility function is cardinal, not ordinal.
Proof. The (⇐=) part is easy:

p ⪰ q ⇐⇒ U(p) ≥ U(q)
    ⇐⇒ aU(p) + b ≥ aU(q) + b
    ⇐⇒ V(p) ≥ V(q)
Conversely, we suppose U and V are von Neumann-Morgenstern utility functions that represent the same preference over L. By the lemma, U and V are linear.

Case 1: U is constant on L. Then V must be constant as well. So V = U + b for some b ∈ R.

Case 2: U is not constant. Then, there exist lotteries p1, p2 ∈ L such that U(p1) < U(p2). Now, we prove V(p) = aU(p) + b for all p ∈ L for some a > 0 and b ∈ R.
Consider any p ∈ L and write

U(p) = λU(p1) + (1 − λ)U(p2)

Here, the number λ can be solved from this equation and we obtain

λ = (U(p2) − U(p)) / (U(p2) − U(p1))

Also, by linearity of U,

U(p) = U(λp1 + (1 − λ)p2)
Case 2.1: If p ∈ L is such that λ ∈ [0, 1], we have

V(p) = V(λp1 + (1 − λ)p2)    (as U, V represent the same preference)
    = λV(p1) + (1 − λ)V(p2)    (by linearity)
    = V(p2) − λ(V(p2) − V(p1))
    = V(p2) − [(U(p2) − U(p)) / (U(p2) − U(p1))] (V(p2) − V(p1))
    = [(V(p2) − V(p1)) / (U(p2) − U(p1))] U(p) + V(p2) − [U(p2) / (U(p2) − U(p1))] (V(p2) − V(p1))
    = aU(p) + b

where a = (V(p2) − V(p1)) / (U(p2) − U(p1)) > 0 and b = V(p2) − [U(p2) / (U(p2) − U(p1))] (V(p2) − V(p1)).
Case 2.2: p ∈ L is such that λ ∉ [0, 1]; then either λ < 0 or λ > 1. The proofs of these two cases are the same. We study λ < 0.

We note the only obstacle to repeating the process of Case 2.1 is the step using linearity. That is, if we could prove

V(p) = λV(p1) + (1 − λ)V(p2)

for λ < 0, we could repeat the process above and obtain V = aU + b, with coefficients a, b defined in the same way.
To prove this, we observe that if λ < 0, then U(p) is to the right of U(p1) and U(p2). Therefore, U(p2) is a convex combination of U(p1) and U(p). Formally,

U(p) = λU(p1) + (1 − λ)U(p2)

U(p2) = [−λ/(1 − λ)]U(p1) + [1/(1 − λ)]U(p)
Let µ = −λ/(1 − λ). As µ = 1 − 1/(1 − λ), µ is decreasing in λ. Thus, for λ ∈ (−∞, 0), µ ∈ (0, 1). Therefore, by linearity,

U(p2) = µU(p1) + (1 − µ)U(p) = U(µp1 + (1 − µ)p)
As U, V represent the same preference,

V(p2) = V(µp1 + (1 − µ)p)
    = µV(p1) + (1 − µ)V(p)
    = [−λ/(1 − λ)]V(p1) + [1/(1 − λ)]V(p)

Rearranging the terms, we have V(p) = λV(p1) + (1 − λ)V(p2). □
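The theorem can be illustrated numerically: an increasing affine transformation of a vNM utility preserves the ranking of lotteries, while a nonlinear monotone transformation of the Bernoulli utility u can reverse it. A sketch, where all the particular numbers are our own choices:

```python
# An increasing affine transform of a vNM utility preserves the ranking
# of lotteries, while a nonlinear monotone transform of the Bernoulli
# utility u can reverse it. All numbers below are our own choices.
u = {0: 0.0, 10: 1.2, 100: 2.0}

def EU(p, u):
    return sum(prob * u[c] for c, prob in p.items())

safe = {10: 1.0}               # 10 dollars for sure
risky = {0: 0.5, 100: 0.5}     # 0 or 100 with equal chance

V = lambda p: 3 * EU(p, u) + 7                 # affine transform: V = 3U + 7
assert (EU(safe, u) > EU(risky, u)) == (V(safe) > V(risky))

w = {c: v ** 3 for c, v in u.items()}          # monotone (cubic) transform of u
assert EU(safe, u) > EU(risky, u)              # safe preferred under u ...
assert EU(risky, w) > EU(safe, w)              # ... but reversed under w
```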
4 Axiomatic Characterization of Expected Utility
In this section, we present assumptions on the preference ordering ⪰ on L that imply it has a von Neumann-Morgenstern utility representation.
Recall that if a preference has a utility representation, it must be complete and transitive.
In addition, by Debreu’s theorem, on a convex set of possible choices, a preference has a
continuous representation if and only if it is continuous. Since von Neumann-Morgenstern
utility is clearly continuous, we will also need � to be continuous.
Now, we define what it means for a preference ⪰ on lotteries to be continuous: ⪰ is continuous on L if one of the following equivalent definitions holds:

• ⪰ is preserved at the limit. That is, for any convergent sequences of lotteries pn, qn in L, if pn → p, qn → q, and pn ⪰ qn for all n, then p ⪰ q.

• For any p, p′, p′′ ∈ L, the sets

{α ∈ [0, 1] : αp + (1 − α)p′ ⪰ p′′}
{α ∈ [0, 1] : p′′ ⪰ αp + (1 − α)p′}

are closed.
Exercise.
1. Prove the equivalence of these two definitions.
2. Prove that if ⪰ is continuous, then for all p ⪰ p′′ ⪰ p′, there exists an α ∈ [0, 1] such that αp + (1 − α)p′ ∼ p′′.
Intuitively, the second statement in the above exercise says that if ⪰ is continuous, then for any lotteries p ⪰ p′′ ⪰ p′, p′′ must be indifferent to some mixture of p and p′. For instance, suppose a decision maker prefers having an apple for certain to having a pear for certain, and prefers having a pear for certain to having a banana for certain. If this decision maker has a continuous preference, then he or she will be indifferent between having a pear for certain and the random event of having an apple or a banana with some probability distribution.
Now, we have completeness, transitivity and continuity of preferences. By Debreu's theorem, there is a continuous representation of the preference over lotteries. However, as we saw in the last section, the von Neumann-Morgenstern utility function is characterized⁵ by its linearity. Therefore, we naturally need an assumption on the preference that corresponds to the linearity of the representation. This assumption is the Independence Axiom.

We say ⪰ on L satisfies the independence axiom if for any p, p′, p′′ ∈ L and α ∈ (0, 1),

p ⪰ p′ ⇐⇒ αp + (1 − α)p′′ ⪰ αp′ + (1 − α)p′′

This axiom says that when a decision maker compares αp + (1 − α)p′′ to αp′ + (1 − α)p′′, he or she should focus on the distinction between p and p′ and hold the same preference independently of both α and p′′.
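For a vNM utility, the independence axiom is a direct consequence of linearity; a quick numerical sketch with lotteries of our own choosing:

```python
# For a vNM utility, mixing both sides with the same p'' and the same
# weight alpha never reverses the comparison. Numbers are our own.
u = {0: 0.0, 10: 1.0, 20: 1.5}

def EU(p):
    return sum(prob * u[c] for c, prob in p.items())

def mix(alpha, p, q):
    support = set(p) | set(q)
    return {c: alpha * p.get(c, 0.0) + (1 - alpha) * q.get(c, 0.0) for c in support}

p = {10: 1.0}                       # plays the role of p
p2 = {0: 0.5, 20: 0.5}              # plays the role of p'
p3 = {0: 0.2, 10: 0.3, 20: 0.5}     # the common third lottery p''
alpha = 0.4
assert (EU(p) > EU(p2)) == (EU(mix(alpha, p, p3)) > EU(mix(alpha, p2, p3)))
```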
With these four assumptions on preference, we are ready to state our representation
theorem.
Theorem (von Neumann-Morgenstern). A complete and transitive preference � on L is
continuous and satisfies the independence axiom if and only if it has an expected utility rep-
resentation. Moreover, such a representation is unique up to positive affine transformations.
Idea of proof. Let q and q′ be two lotteries with q ≻ q′. Define U(q) = 1 and U(q′) = 0. First, if p ∈ L is such that q ⪰ p ⪰ q′, then by continuity, there is an α ∈ [0, 1] such that

p ∼ αq + (1 − α)q′

We define U(p) = α. Similarly, when p ∈ L is such that p ⪰ q ≻ q′, by continuity, we have q ∼ αp + (1 − α)q′, and we define U(p) = 1/α; when p ∈ L is such that q ≻ q′ ⪰ p, by continuity, we have q′ ∼ αq + (1 − α)p, and we define U(p) = −α/(1 − α). The independence axiom then implies that the function U defined above is linear. (Although proving this requires some hard work.) By the characterization result in the previous section, we know U is a von Neumann-Morgenstern utility function. The uniqueness was proved in the previous section. □

⁵That is, by the "if and only if" in the lemma of the previous section.
5 The Boundedness of the Bernoulli Utility Function
As observed by Bernoulli, the Bernoulli utility function u : C → R needs to be a bounded function. (Here, we extend our outcome set C to be infinite; think of it as R+.)
The following argument is usually referred to as the St. Petersburg paradox. This paradox challenges the old idea that people value random outcomes according to their expected returns.
Suppose a fair coin is tossed until a head appears. If the first head appears on the n-th toss, the payoff is 2^n dollars. Therefore, the expected payoff of this lottery is

∑_{n=1}^{∞} P(head first appears at step n) · 2^n = ∑_{n=1}^{∞} (1/2^n) · 2^n = ∞

That is, no matter how much income a decision maker has, he would be better off giving all his income in exchange for this lottery. This seems unreasonable. Therefore, it suggests that we probably do not decide according to the expected return of a lottery.
By the same argument, the Bernoulli utility function u needs to be bounded.
Proof. Formally, if u is unbounded, then there exists a sequence c1, ..., cn, ... ∈ C such that

u(c_n) ≥ 2^n, ∀n ∈ N

Consider a lottery p defined by flipping a fair coin: if the head first appears on the n-th toss, the return is c_n. Then the expected utility of the lottery p is

∑_{n=1}^{∞} (1/2^n) u(c_n) ≥ ∑_{n=1}^{∞} (1/2^n) 2^n = ∞

That is, the decision maker would be willing to pay any amount of money to have this lottery, which seems unreasonable. Therefore, we assume that u is bounded. □
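A short computation illustrates both points: the truncated expected payoff grows without bound, while a bounded Bernoulli utility (here a crude cap, our own choice) keeps the expected utility finite.

```python
# Truncated sums for the coin-tossing lottery: the expected payoff grows
# without bound in N, while a bounded Bernoulli utility (a crude cap,
# our own choice) keeps the expected utility finite.
def expected_payoff(N):
    """Partial sum of (1/2^n) * 2^n for n = 1..N; equals N exactly."""
    return sum((1 / 2**n) * 2**n for n in range(1, N + 1))

def expected_utility(N, bound=100.0):
    """Same sum with a Bernoulli utility capped at `bound`."""
    u = lambda c: min(c, bound)
    return sum((1 / 2**n) * u(2**n) for n in range(1, N + 1))

assert expected_payoff(50) == 50.0        # diverges linearly in N
assert expected_utility(60) < 8.0         # converges once u is bounded
```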
6 Subjective Probability
Recall the example in Section 1: we toss a fair coin; with probability 1/2 the lottery ticket wins and we get 10 dollars, and otherwise the lottery ticket loses and we get nothing. We know the probability distribution π = (1/2, 1/2) on S = {wins, loses}, and we can discuss how we value the random variable which gives 10 dollars when the lottery wins and gives nothing otherwise. One such valuation is defined by the expected utility

U(x) = ∑_{s∈S} π(s)u(x(s))
Now, we note that sometimes the probability distribution over states is not a commonly agreed object. For instance, if we change the interpretation of the above state space to the winning state and the losing state of a football game between Mexico and the United States (that is, s = wins if Mexico wins, s = loses if Mexico loses), then, regarding π, Mexicans and Americans will probably have different opinions.
Savage (1954) suggests that this probability π over states should be derived from the preference over random variables, if we assume one maximizes expected returns. For instance, we can ask a Mexican football fan to choose between the above random variable (i.e. returns 10 dollars if Mexico wins, and otherwise zero) and a random variable which returns k dollars in both states. If he or she is indifferent between these random variables when k = 5, we infer that he or she probably thinks the probability that Mexico wins is 50%; if he or she is indifferent between these random variables when k = 9.9, we infer that he or she thinks the probability that Mexico wins is 99%.
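Under the simplifying assumption (ours) of a risk-neutral Bernoulli utility u(c) = c, the elicited probability is just k divided by the prize; a sketch with a function name of our own:

```python
# Eliciting a subjective probability from an indifference point, under
# the simplifying assumption (ours) of a risk-neutral Bernoulli utility
# u(c) = c: indifference between the bet and k for sure gives
# pi * prize = k, hence pi = k / prize.
def implied_probability(k, prize=10.0):
    return k / prize

assert implied_probability(5.0) == 0.5                  # k = 5   -> 50%
assert abs(implied_probability(9.9) - 0.99) < 1e-12     # k = 9.9 -> 99%
```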
Now, we ask whether this argument works for general decision problems; that is, whether such a probability distribution π on the states always exists. Recall the set of random variables is R^{|S|}. We have a preference ⪰ on R^{|S|}, and we wonder under which conditions ⪰ can be represented by a utility function of the form

U(x) = ∑_{s∈S} π(s)x(s)
Theorem 1. If the preference ordering ⪰ on R^{|S|} is complete and transitive and satisfies

a) Continuity: for all x ∈ R^{|S|}, the sets {y ∈ R^{|S|} : y ⪰ x} and {y ∈ R^{|S|} : x ⪰ y} are closed.

b) Monotonicity: x > y implies x ≻ y.

c) Independence: for all x, y, z ∈ R^{|S|} and α ∈ (0, 1),

x ⪰ y ⇐⇒ αx + (1 − α)z ⪰ αy + (1 − α)z

then there exists a probability distribution π on S such that ⪰ can be represented by

U(x) = Ex = ∑_{s∈S} π(s)x(s)
Remark. The proof uses the separating hyperplane theorem; the monotonicity assumption ensures we can apply this theorem.
7 Other types of preferences
So far, we focus on the von Neumann-Morgenstern utility function towards uncertainty. In
this section, we mention a few other possibilities.
1. Preference for uniformity:

U(p) = −∑_{n=1}^{N} (p_n − 1/N)²
2. Preference for certainty:

U(p) = max_n p_n
3. Preference for the worst case: with C = {c1, ..., cN} ⊂ R,

U(p) = min{c_n : p_n > 0}
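These three alternatives are easy to write down; a sketch, taking a lottery as a list (p_1, ..., p_N) over outcomes (c_1, ..., c_N), with function names of our own:

```python
# Sketches of the three alternative preferences, taking a lottery as a
# list p = [p_1, ..., p_N] over outcomes C = [c_1, ..., c_N].
def uniformity(p):
    """U(p) = -sum_n (p_n - 1/N)^2: maximized by the uniform lottery."""
    N = len(p)
    return -sum((pn - 1 / N) ** 2 for pn in p)

def certainty(p):
    """U(p) = max_n p_n: maximized by degenerate (certain) lotteries."""
    return max(p)

def worst_case(p, C):
    """U(p) = min{c_n : p_n > 0}: the worst outcome in the support."""
    return min(c for c, pn in zip(C, p) if pn > 0)

C = [0, 10, 20]
assert uniformity([1/3, 1/3, 1/3]) == 0.0
assert certainty([0.0, 1.0, 0.0]) == 1.0
assert worst_case([0.5, 0.5, 0.0], C) == 0
```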