Preferences

Mathematical foundations of microeconomic theory:

Preference, utility, choice

Mark Voorneveld

September 6, 2010

Contents

Preface

1 Preference
    1.1 Preference relations
    1.2 Preference over commodity bundles

2 Utility
    2.1 Utility functions
    2.2 From preference to utility: finite or countable sets
    2.3 Preference, but no utility
    2.4 In no-man's-land: A necessary and sufficient condition for utility representation
    2.5 Continuous utility
    2.6 Some special functional forms

3 Choice
    3.1 Existence of most preferred elements
    3.2 Revealed preference
    3.3 Exercises

4 Choices of a consumer: classical demand theory
    4.1 The preference/utility maximization problem
    4.2 Properties of the demand correspondence and indirect utility
    4.3 The expenditure minimization problem
    4.4 Relations between UMP and EMP
    4.5 Welfare analysis for the consumer
    4.6 Welfare and Hicksian demand

5 Choices of a producer: classical supply theory
    5.1 Production sets
    5.2 Properties of production sets
    5.3 The profit maximization problem
    5.4 Solving the PMP
    5.5 The cost minimization problem
    5.6 Linking the PMP and the CMP
    5.7 Efficiency

6 General equilibrium
    6.1 What is an equilibrium?
    6.2 Pure exchange economies
    6.3 Welfare analysis
    6.4 Private ownership economies
    6.5 Exercises

7 Expected utility theory
    7.1 Simple and compound gambles
    7.2 Preferences over gambles
    7.3 von Neumann-Morgenstern utility functions
    7.4 Exercises

8 Risk attitudes
    8.1 In for a gamble?
    8.2 Certainty equivalent and risk premium
    8.3 Arrow-Pratt measure of absolute risk aversion
    8.4 A derivation of the Arrow-Pratt measure

9 Some critique on expected utility theory
    9.1 Problems with unbounded utility: a variant of the St. Petersburg paradox
    9.2 Allais' paradox
    9.3 Probability matching
    9.4 Rabin's calibration theorem

10 Time preference
    10.1 Stationarity and exponential discounting
    10.2 Preference reversal and hyperbolic discounting
    10.3 Limit-of-means and overtaking
    10.4 Better may be worse

11 Probabilistic choice
    11.1 The Luce model
    11.2 The logit model
    11.3 The linear probability model
    11.4 Exercises

Full circle: overview

Notation

References

Suggested solutions

Preface

Overview

The purpose of these notes is to introduce you to some mathematical foundations of economic theory. These are building blocks of economics that will hopefully contribute to your understanding of formal modeling in your other courses and in the research papers you will read and, eventually, write.

The typical model of the behavior of an economic agent requires careful answers to the following questions:

• (Q1) What can the agent choose from, i.e., what is the set of feasible alternatives?

• (Q2) What does the agent like, i.e., what are the preferences over alternatives?

• (Q3) How are the former two combined to make a choice, i.e., to select among alternatives?

Although we make some brief excursions into bounded rationality, the main building block of traditional economics is "rational" choice: choose from your set of feasible alternatives a most preferred one. This raises important related questions:

• (Q4) When do most preferred elements exist?

• (Q5) How are they affected when the agent's environment changes?

The fourth question is extremely important: you'd be surprised how many people simply skip over the existence issue and write papers about how solutions to economic problems are affected by parameter changes, without ever wondering whether there even is a solution. The fifth question concerns things like how a consumer's demand is affected by price changes, wage increases, etc.

Try to keep this in mind, because this is what will occupy us most of the time and constitutes the common thread of the course: regardless of the setting, we first have to answer (Q1) to (Q3) to provide a meaningful "microfounded" model of an economic agent's behavior. Sections 1 to 3 provide a general framework for modeling preferences over and choice from a feasible set of alternatives.

This general framework is then applied to a number of specific cases: traditional models of consumer choice (Section 4), producer choice (Section 5), choice over outcomes that are no longer deterministic, but occur with certain probabilities (Section 7), choice over outcomes occurring over time (Section 10), and even the modeling of seemingly suboptimal choices (Section 11).

Special features

Every course reflects some of the teacher's own preferences. Although the material covered here is pretty standard for a first PhD course in microeconomic theory, what distinguishes these notes from other graduate texts is:

Focus on preferences: The notes have a relatively strong focus on preferences, rather than utility functions. Utility functions are practical in the sense that they allow you to use standard calculus tools, but this tends to blur the picture by making economics into an exercise in advanced differentiation. I try to avoid this. Although people make statements like "I like coffee more than tea", you hardly ever see them in a supermarket with a calculator and their utility function written on a piece of paper.

This allows us to give a much more general answer, with a remarkably simple proof, to the question of when most preferred elements exist; see Proposition 3.1.

From preferences to utility:

• Not all preferences can be represented by means of a utility function. Graduate texts typically give exactly one example, lexicographic preferences, as if it were an exotic phenomenon. These notes try to give some counterweight by providing several economically relevant examples, all arising from the same general principle; see Section 2.3.

• So the question remains: when does a utility function exist? Section 2.4 provides necessary and sufficient conditions.

• As an important special case, when does a continuous utility function exist? Proposition 2.6 provides a detailed proof. Remarkably, not even Fishburn (1970a), the standard reference on utility theory, contains such a proof, and neither do any of the standard textbooks in microeconomic theory.

  I don't actually expect you to know the proof; I just wanted to fill a gap and make sure you have access to it.

Miscellanea: Other things not commonly found in standard texts include:

• An existence result for Walrasian equilibria in terms of excess demand correspondences, rather than excess demand functions; see Proposition 6.5.

• Some excursions into the realm of bounded rationality, with brief discussions of hyperbolic discounting (Section 10.2), probabilistic choice (Section 11), and some exotic preferences (Section 3.3).

Solutions manual: Like any textbook, these notes contain exercises. They also contain solutions to all [1] of the exercises, in the hope of facilitating self-study: if you have time to do some exercises, you can immediately check your solutions. If you're pressed for time, you can treat the worked exercises as a collection of a few dozen (cleverly disguised) examples and applications.

Recommended reading

The lecture notes are the reading material for the course. You may omit the proofs of Propositions 2.5 and 2.6, as well as the more mathematical exercises in Section 10.3. For the interested reader, the following table refers to related material in Mas-Colell, Whinston, and Green (1995, MWG), which is by no means obligatory.

Lecture notes                  See also MWG
1. Preference                  1.A-B, 2.A-C, 3.B
2. Utility                     3.C
3. Choice                      1.C-D, 2.D
4. Choices of a consumer       2.E, 3.D-E, G, I
5. Choices of a producer       5.A-C, F-G
6. General equilibrium         15.A-C, 16.A-D, 17.A-C, 18.A-B
7. Expected utility theory     6.A-B
8. Risk attitudes              6.C
9. Some critique               6.B
10. Time preference            20.A-B
11. Probabilistic choice       none

[1] Well, almost all exercises, as a couple of them will be used as this year's home assignments...

Terminology

In economics, there is little consensus on terminology. For instance, following Arrow (1959) and Fishburn (1970b), I refer to a complete transitive binary relation that models an economic agent's preferences as a 'weak order'. Other names include 'rational preference relation' (Mas-Colell et al., 1995), a very loaded term; simply 'preference relation' (Rubinstein, 2006); 'complete preordering' (Debreu, 1959); 'complete weak order' (Fishburn, 1979); and 'complete ordering' (Debreu, 1954). The Micro I course and its exam use the definitions from these lecture notes.

1. Preference

1.1. Preference relations

Rational choice essentially means choosing from a set of feasible options a most preferred alternative. Let X be a set of alternatives. A preference relation ≿ is a binary relation on X, allowing the comparison between pairs of alternatives. For each x, y ∈ X, read

x ≿ y as "x is at least as good as/weakly preferred to/weakly better than y".

A binary relation ≿ is a weak order if it satisfies:

Completeness: for all x, y ∈ X, x ≿ y or y ≿ x (or both).

Transitivity: for all x, y, z ∈ X, if x ≿ y and y ≿ z, then x ≿ z.
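
On a finite set, both conditions can be checked mechanically. The following Python sketch is only an illustration: the relation is stored as a set of ordered pairs, and the three-element set X, the scores, and all function names are hypothetical choices made for this example, not part of the notes.

```python
from itertools import product

def is_complete(X, R):
    """R is a set of pairs (x, y), read as 'x is at least as good as y'."""
    return all((x, y) in R or (y, x) in R for x, y in product(X, repeat=2))

def is_transitive(X, R):
    """Whenever (x, y) and (y, z) are in R, (x, z) must be in R as well."""
    return all((x, z) in R
               for x, y, z in product(X, repeat=3)
               if (x, y) in R and (y, z) in R)

def is_weak_order(X, R):
    return is_complete(X, R) and is_transitive(X, R)

# Hypothetical example: three alternatives ranked by a numerical score.
X = ["a", "b", "c"]
score = {"a": 2, "b": 1, "c": 1}
R = {(x, y) for x, y in product(X, repeat=2) if score[x] >= score[y]}

# Removing the comparisons between b and c makes them incomparable,
# so completeness fails while transitivity still holds.
R_incomplete = R - {("b", "c"), ("c", "b")}

print(is_weak_order(X, R))             # complete and transitive
print(is_complete(X, R_incomplete))    # b and c can no longer be compared
```

Note that completeness, applied with x = y, already forces every alternative to be weakly preferred to itself.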

Exercise 1.1 Are the following binary relations ≿ necessarily complete, transitive?

(a) X consists of the items in an English dictionary, ≿ is the alphabetical order in which they are listed.

(b) X is a group of people and for x, y ∈ X: x ≿ y if and only if x knows y.

From a preference relation ≿, one can derive two other binary relations:

Strict preference: x ≻ y if x ≿ y, but not y ≿ x ("x is better than/strictly preferred to y").

Indifference: x ∼ y if x ≿ y and y ≿ x ("x and y are equally good/equivalent").

We sometimes write y ≾ x instead of x ≿ y and y ≺ x instead of x ≻ y. Economic theory relies heavily on preferences. You should be aware of some "hidden" assumptions:

• Preferences are deterministic: they are not susceptible to a change of mind or mood shocks. Statements like "I like coffee more than tea at any time, but today I prefer a cup of tea." are ruled out.

• Preferences are ordinal: the intensity of preferences, as in "I'm rather fond of the 6 o'clock news, but detest soap operas.", plays no role.

• Preference is a binary relation: it compares pairs of alternatives, independently of external factors. Conditional statements like "If there are twenty types of coffee to choose from, I prefer tea to any type of coffee. Otherwise, I take an espresso." are ruled out.

Completeness and transitivity also deserve scrutiny. Completeness rules out the existence of incomparable alternatives. Transitivity is violated in a number of plausible situations:

Majority rule voting: Consider three agents with strict preferences over three alternatives a, b, c as follows:

a ≻_1 b ≻_1 c and c ≻_2 a ≻_2 b and b ≻_3 c ≻_3 a.

This involves a slight but common abuse of notation: although this was not stated explicitly, the notation above is taken to suggest, for instance, that also a ≻_1 c. Define a new preference relation ≻ via majority rule voting: a ≻ b, because a majority (namely agents 1 and 2) strictly prefers a over b. Similarly, b ≻ c and c ≻ a, in violation of transitivity. This example is sometimes referred to as the Condorcet paradox.
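
The cycle can be reproduced by counting votes pair by pair. The Python sketch below is a minimal illustration; the dictionary layout and the function name majority_prefers are assumptions made for this example.

```python
# Each ranking lists the alternatives from most to least preferred,
# matching the three agents in the text.
rankings = {
    1: ["a", "b", "c"],
    2: ["c", "a", "b"],
    3: ["b", "c", "a"],
}

def majority_prefers(x, y, rankings):
    """True if a strict majority of agents ranks x above y."""
    votes = sum(r.index(x) < r.index(y) for r in rankings.values())
    return votes > len(rankings) / 2

alternatives = ["a", "b", "c"]
majority = {(x, y)
            for x in alternatives for y in alternatives
            if x != y and majority_prefers(x, y, rankings)}

for x, y in sorted(majority):
    print(f"{x} beats {y} under majority rule")
```

The resulting relation contains (a, b) and (b, c) but not (a, c), so it is not transitive, even though each individual ranking is.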

Nonperceivable differences and similarity: The human body cannot perceive differences in stimuli unless they exceed a certain threshold. For instance, you will typically not sense the difference between a cup of tea with n ∈ N grains of sugar and n + 1 grains of sugar. Therefore, you will be indifferent between them. If preferences are transitive, you will be indifferent between a cup of tea with 1 grain of sugar, 2 grains of sugar, 3 grains of sugar, ..., one kilo of sugar. Are you? This example is related to the more general issue of similarity: nearby alternatives may be perceived as similar and therefore equally good. But with a long chain of nearby alternatives, you can create a huge change between alternatives, so that you may no longer be indifferent between them.
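
A threshold model makes the failure of transitivity concrete. In the Python sketch below, the threshold of one grain and the chain length of 1000 cups are hypothetical numbers chosen purely for illustration.

```python
THRESHOLD = 1  # assumption: differences of at most one grain go unnoticed

def indifferent(m, n):
    """Cups with m and n grains of sugar are perceived as equally good."""
    return abs(m - n) <= THRESHOLD

# Every pair of neighbours in the chain 1, 2, ..., 1000 is indifferent ...
chain_ok = all(indifferent(n, n + 1) for n in range(1, 1000))
# ... but the two ends of the chain are not.
ends_ok = indifferent(1, 1000)

print(chain_ok, ends_ok)
```

Indifference defined this way is reflexive and symmetric, but not transitive, so it cannot be the indifference relation of a weak order.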

Properties of ≿ imply some properties of the indifference relation ∼ and the strict preference relation ≻. The proofs involve only simple manipulations of the definitions of ∼ and ≻; check that you can do this. I only prove part (d).

Proposition 1.1 Let ≿ be a weak order on X.

(a) The indifference relation ∼ is an equivalence relation, i.e., it satisfies:

• reflexivity: ∀x ∈ X: x ∼ x.
• symmetry: ∀x, y ∈ X: if x ∼ y, then y ∼ x.
• transitivity: ∀x, y, z ∈ X: if x ∼ y and y ∼ z, then x ∼ z.

(b) The strict preference relation ≻ satisfies:

• irreflexivity: ∀x ∈ X: not x ≻ x.
• asymmetry: ∀x, y ∈ X: if x ≻ y, then not y ≻ x.
• transitivity: ∀x, y, z ∈ X: if x ≻ y and y ≻ z, then x ≻ z.

(c) ∀x, y, z ∈ X: if x ∼ y and y ≿ z, then x ≿ z.

(d) ∀x, y, z ∈ X: if x ≻ y and y ≿ z, then x ≻ z.

Proof. (d): Let x, y, z ∈ X with x ≻ y and y ≿ z. By definition of ≻, x ≻ y implies x ≿ y. With y ≿ z and transitivity of ≿, this implies x ≿ z. It is not true that z ≿ x: if it were, it would imply with y ≿ z and transitivity of ≿ that y ≿ x, contradicting that x ≻ y. Since x ≿ z, but not z ≿ x: x ≻ z. □

Exercise 1.2 Complete the proof of the proposition.

1.2. Preference over commodity bundles

In the standard microeconomic model of consumer choice, the set of alternatives X is usually taken to be R^L or R^L_+ for some L ∈ N. The interpretation is that there are L commodities, in the latter case to be "consumed" in nonnegative amounts. An element x = (x_1, ..., x_L) ∈ X is called a (commodity) bundle; its k-th coordinate x_k indicates the quantity of commodity k.

The additional structure obtained this way allows us to introduce a number of new properties; throughout this subsection, assume therefore that X equals R^L_+ or R^L. These properties are typically illustrated using indifference curves. The indifference curve containing x ∈ X is the set {y ∈ X : x ∼ y} of points equivalent to x.

Recall that the (Euclidean) distance between vectors x, y ∈ R^L is defined as

\[ \|x - y\| = \sqrt{\sum_{\ell=1}^{L} (x_\ell - y_\ell)^2}. \]

The preference relation ≿ over X satisfies local nonsatiation if, for every alternative x, there is an alternative arbitrarily close to x that is better: for each x ∈ X and each ε > 0 there is a y ∈ X with ‖x − y‖ < ε and y ≻ x.

Monotonicity properties come in different varieties, all reflecting the intuition that "more is better". Let k ∈ {1, ..., L} and let e_k ∈ R^L denote the k-th standard basis vector with k-th coordinate equal to one and all other coordinates equal to zero. For x, y ∈ R^L, write

x ≥ y if x_i ≥ y_i for all coordinates i = 1, ..., L,

x > y if x_i > y_i for all coordinates i = 1, ..., L.

The preference relation ≿ is:

strongly monotonic in coordinate k if increasing this coordinate gives better alternatives: for each x ∈ X and each ε > 0: x + εe_k ≻ x.

strongly monotonic if an increase in at least one coordinate gives better alternatives: for all x, y ∈ X, if x ≥ y and x ≠ y, then x ≻ y.

monotonic if, for all x, y ∈ X: x ≥ y implies x ≿ y, and x > y implies x ≻ y.

For instance, a strongly monotonic preference relation ≿ is strongly monotonic in each coordinate. The converse holds if ≿ is transitive.

Exercise 1.3

(a) Prove the previous two sentences.

(b) Give an example of a preference relation over R^2_+ that is strongly monotonic in coordinate k, both for k = 1 and k = 2, but not strongly monotonic.

(c) Let X = R^L_+ and assume that according to the preference relation ≿, "less is better" (think of the coordinates as measures of pollution, unhealthy commodities, etc.) in the sense that x ≥ y and x ≠ y imply that x ≺ y. Is this preference relation locally nonsatiated?

(d) Answer the same question as in (c), but with X = R^L.

Each of the three monotonicity properties implies local nonsatiation. On the other hand, local nonsatiation has no implications for monotonicity: the preference relation ≿ on R^2_+ with

(x_1, x_2) ≿ (y_1, y_2) ⇔ (x_1 − x_2)^2 + x_1 ≥ (y_1 − y_2)^2 + y_1

is locally nonsatiated, but satisfies none of the monotonicity properties. Figure 1 contains three indifference curves of this preference relation, with the "better" ones further away from the origin, and shows that small increases in one or both of the coordinates may lead into areas with strictly worse alternatives.

[Figure 1: Local nonsatiation has no implications for monotonicity. Axes x_1 and x_2, each running from 0 to 3.]
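
The claims about this example can be probed numerically. The Python sketch below is an illustration, not a proof: the specific test points and the helper better_point_nearby are choices made for this example. The helper only tries small increases in a single coordinate, which turns out to be enough here because the function (x_1 − x_2)^2 + x_1 always increases in at least one of these two directions.

```python
def u(x1, x2):
    """Function from the example: x is weakly preferred to y iff u(x) >= u(y)."""
    return (x1 - x2) ** 2 + x1

# Small increases can make things strictly worse:
worse_in_1 = u(0.1, 1.0) < u(0.0, 1.0)      # more of good 1 only
worse_in_2 = u(1.0, 0.1) < u(1.0, 0.0)      # more of good 2 only
worse_in_both = u(1.05, 0.5) < u(1.0, 0.0)  # more of both goods

def better_point_nearby(x1, x2, eps):
    """Search for a strictly better point in R^2_+ within distance eps.

    Only two candidates are tried: increase x1 or increase x2 by eps/2.
    Returns the first candidate that is strictly better, or None.
    """
    step = eps / 2
    for cand in [(x1 + step, x2), (x1, x2 + step)]:
        if u(*cand) > u(x1, x2):
            return cand
    return None

for point in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (0.5, 3.0)]:
    print(point, "->", better_point_nearby(*point, eps=0.01))
```

At (0, 1), for instance, increasing x_1 makes the bundle worse, but increasing x_2 makes it better, so a strictly better bundle still lies arbitrarily close by.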

Figure 2 summarizes the relations between the three monotonicity properties and local nonsatiation. An arrow from strong monotonicity to monotonicity means that the former implies the latter; the absence of an arrow in the opposite direction means that the converse is not true.

[Figure 2: Relation between monotonicity properties and local nonsatiation on R^L or R^L_+. Implication diagram with the nodes "strongly monotonic", "monotonic", "strongly monotonic in coordinate k", and "locally nonsatiated".]

A preference relation ≿ is continuous if for every y ∈ X, the set {x ∈ X : x ≿ y} of alternatives weakly better than y and the set {x ∈ X : x ≾ y} of alternatives weakly worse than y are closed. The literature contains some alternative definitions as well:

Proposition 1.2 Let ≿ be a weak order on X. The following properties are equivalent:

(a) ≿ is continuous, i.e., for every y ∈ X, the sets {x ∈ X : x ≿ y} and {x ∈ X : x ≾ y} are closed.

(b) For every y ∈ X, the sets {x ∈ X : x ≻ y} and {x ∈ X : x ≺ y} are open.

(c) The graph {(x, y) ∈ X × X : x ≿ y} of ≿ is closed.

(d) For all sequences (x_n)_{n∈N} and (y_n)_{n∈N} in X, if x_n → x, y_n → y, and x_n ≿ y_n for all n ∈ N, then also x ≿ y.

(e) For all x, y ∈ X, if x ≻ y, then there is a neighborhood U_x of x (i.e., an open set U_x containing x) and a neighborhood U_y of y such that x′ ≻ y′ for all x′ ∈ U_x, y′ ∈ U_y.

Proof. Statements (a) and (b) are equivalent: by completeness, {x ∈ X : x ≻ y} is the complement of {x ∈ X : x ≾ y} and {x ∈ X : x ≺ y} is the complement of {x ∈ X : x ≿ y}, and the complement of an open set is closed, and vice versa. The equivalence of (c) and (d) is also a matter of definition: (x_n, y_n) ∈ X × X is an element of the graph of ≿ if and only if x_n ≿ y_n. Proving three implications suffices to close the circle and make sure that all five statements are equivalent:

[(b) implies (e):] Assume (b) holds. Let x, y ∈ X with x ≻ y. Distinguish two cases:

Case 1: There is an m ∈ X with x ≻ m ≻ y.

Define U_x = {z ∈ X : z ≻ m} and U_y = {z ∈ X : m ≻ z}. These sets are open by (b). Moreover, x ∈ U_x and y ∈ U_y by assumption. Let x′ ∈ U_x, y′ ∈ U_y. Then x′ ≻ m and m ≻ y′. By Proposition 1.1, ≻ is transitive, so x′ ≻ y′, as we had to show.

Case 2: There is no m ∈ X with x ≻ m ≻ y.

Define U_x = {z ∈ X : z ≻ y} and U_y = {z ∈ X : x ≻ z}. These sets are open by (b). Moreover, x ∈ U_x and y ∈ U_y by assumption. Let x′ ∈ U_x, y′ ∈ U_y. Then x′ ≻ y. It cannot be that x ≻ x′, otherwise we would have x ≻ x′ ≻ y, contradicting the assumption of Case 2. By completeness, x′ ≿ x. Similarly, y ≿ y′. So x′ ≿ x ≻ y ≿ y′. By Proposition 1.1, x′ ≻ y′, as we had to show.

Conclude from cases 1 and 2 that (e) holds.

[(e) implies (c):] Assume (e) holds. To establish (c), we need to show that the complement of the graph {(x, y) ∈ X × X : x ≿ y} is open. By completeness of ≿, this complement is the set

S = {(x, y) ∈ X × X : x ≺ y}.

For each (x, y) ∈ S, fix, using (e), neighborhoods U_x of x and U_y of y such that x′ ≺ y′ for all x′ ∈ U_x, y′ ∈ U_y. Conclude that

∀(x, y) ∈ S : (x, y) ∈ U_x × U_y ⊆ S.

Taking the union over all (x, y) ∈ S, one obtains

S = ∪_{(x,y)∈S} U_x × U_y.

As the union of open sets, S is open, as we had to show.

[(c) implies (a):] [2] Assume (c) holds. Let y ∈ X. We show that the set {x ∈ X : x ≿ y} is closed; establishing that also the set {x ∈ X : x ≾ y} is closed is analogous.

Let (x_n)_{n∈N} be a sequence in {x ∈ X : x ≿ y} with limit x*. We need to show that x* also lies in this set. By definition, (x_n, y)_{n∈N} is a sequence in the graph of ≿, which is closed by assumption. Therefore, it contains the limit (x*, y), i.e., x* ≿ y, as we had to show. □

[2] The following proof is more general, but requires some knowledge of topology. In case of emergency, don't worry. Simply forget that this footnote even exists!

[(c) implies (b):] Assume (c) holds, i.e., the set S defined above is open. Let y ∈ X. We show that L(y) = {x ∈ X : x ≺ y} is open; establishing that also the set {x ∈ X : x ≻ y} is open is analogous.

Let x ∈ L(y). Then (x, y) ∈ S. Since S is open in the product topology generated by Cartesian products of open sets in X, we can fix neighborhoods U_x of x and U_y of y such that U_x × U_y ⊆ S. In particular, for each x′ ∈ U_x, it follows that (x′, y) ∈ U_x × U_y, so x′ ≺ y. Conclude that

∀x ∈ L(y) : x ∈ U_x ⊆ L(y).

Taking the union over all x ∈ L(y), one obtains

L(y) = ∪_{x∈L(y)} U_x.

As the union of open sets, L(y) is open, as we had to show.

Very roughly speaking, continuity of preferences requires that the strict preference relation is unaffected by small changes in the alternatives: if x is better than y, the same holds for nearby alternatives.

A subtlety about open sets: Continuity properties are typically defined in terms of open [3] subsets of the feasible set X. We often consider commodity spaces like X = R^L_+. Open sets are defined using the usual distance between vectors x and y:

\[ \|x - y\| = \sqrt{\sum_{\ell=1}^{L} (x_\ell - y_\ell)^2}. \]

A subset Y ⊆ X is open if each y ∈ Y is an interior point of Y, i.e., if for each y ∈ Y, all points x ∈ X sufficiently close to y lie in Y as well:

there is an ε > 0 such that for all x ∈ X with ‖x − y‖ < ε: x ∈ Y.    (1)

Many people overlook a slight subtlety, namely the statement "... for all x ∈ X ..." in (1). This looks innocuous: if you want to define whether a subset of X is open, then obviously you're not interested in stuff that is outside of X. But it does matter in identifying open subsets! Notice, for instance, that as subsets of X = R^2_+, but not as subsets of X = R^2, sets like

Y^1 = R^2_+, Y^2 = {y ∈ R^2_+ : y_1 < 1}, Y^3 = {y ∈ R^2_+ : y_1 + 2y_2 < 4}

are open. You might want to draw their pictures. In topological language, X = R^L_+ is endowed with the relative topology that it inherits from the larger set R^L: a set Y ⊆ X is open if and only if Y = X ∩ O, where O is an open set in the larger space R^L. This provides quick proofs that the sets Y^1, Y^2, Y^3 are open subsets of X = R^2_+:

Y^1 = X ∩ R^2, Y^2 = X ∩ {y ∈ R^2 : y_1 < 1}, Y^3 = X ∩ {y ∈ R^2 : y_1 + 2y_2 < 4},

and the sets

R^2, {y ∈ R^2 : y_1 < 1}, {y ∈ R^2 : y_1 + 2y_2 < 4}

are open in R^2.

[3] Of course, this requires knowing which subsets of X are open. In general, as you will recall from the math course, this requires X to be a topological space, i.e., it comes equipped with a definition of open sets, subject to three restrictions: (1) the empty set and X are open, (2) unions of open sets are open, (3) intersections of finitely many open sets are open.

The next two properties are related to other changes, namely shifts in or rescaling of the coordinates. The preference relation ≿ is:

quasilinear in coordinate k if, for all x, y ∈ X and all ε > 0, x ≿ y implies that x + εe_k ≿ y + εe_k: the preference relation is insensitive to parallel shifts in the sense that adding the same positive amount of commodity k to both alternatives does not affect the preference over them.

homothetic if rescaling the coordinates does not affect the preferences: for all x, y ∈ X and all α > 0, if x ≿ y, then αx ≿ αy.

For instance, any preference relation where only the difference between the first coordinates matters, like

(x_1, x_2) ≿ (y_1, y_2) ⇔ 3x_1 + exp x_2 ≥ 3y_1 + exp y_2,

is quasilinear in the first coordinate. Often, such a coordinate is referred to as "numeraire" or "money" and the economic idea is that not the exact amounts of money associated with two alternatives matter, but the difference between them. A simple example of homothetic preferences arises in most linear production processes: let alternatives x and y denote vectors of ingredients and let x be weakly preferable to y if the ingredients of x suffice to make at least as much of your favorite cake as y. Then also αx yields at least as much cake as αy. More generally, any preference relation defined in terms of a homogeneous function is homothetic. Recall that a function f : R^L_+ → R is homogeneous of degree k ∈ R if for each x ∈ R^L_+ and each α > 0: f(αx) = α^k f(x). Suppose that x ≿ y if and only if f(x) ≥ f(y). Then ≿ is homothetic:

x ≿ y ⇔ f(x) ≥ f(y) ⇔ f(αx) = α^k f(x) ≥ α^k f(y) = f(αy) ⇔ αx ≿ αy.

Therefore, functions defined by f(x_1, x_2) = min{x_1, x_2} and f(x_1, x_2) = x_1 x_2^3 generate homothetic preferences.
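
The homotheticity of these two examples can be spot-checked on random bundles. The Python sketch below is only a sanity check under stated assumptions: bundles and scaling factors are drawn as integers so that all arithmetic is exact, and the names f_min, f_prod, f_quasi and homothetic_on_sample are hypothetical. It also shows that the quasilinear example from the text is not homothetic.

```python
import math
import random

def f_min(x1, x2):
    return min(x1, x2)            # homogeneous of degree 1

def f_prod(x1, x2):
    return x1 * x2 ** 3           # homogeneous of degree 4

def f_quasi(x1, x2):
    return 3 * x1 + math.exp(x2)  # quasilinear example; not homogeneous

def weakly_prefers(f, x, y):
    """x is weakly preferred to y iff f(x) >= f(y)."""
    return f(*x) >= f(*y)

def homothetic_on_sample(f, trials=1000, seed=0):
    """Check that rescaling never changes the ranking on random integer bundles."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = (rng.randint(0, 20), rng.randint(0, 20))
        y = (rng.randint(0, 20), rng.randint(0, 20))
        alpha = rng.choice([2, 3, 5, 10])
        ax = (alpha * x[0], alpha * x[1])
        ay = (alpha * y[0], alpha * y[1])
        if weakly_prefers(f, x, y) != weakly_prefers(f, ax, ay):
            return False
    return True

print(homothetic_on_sample(f_min), homothetic_on_sample(f_prod))

# The quasilinear example ranks (1, 0) above (0, 1), but after tripling
# both bundles the ranking is reversed, so it is not homothetic.
print(weakly_prefers(f_quasi, (1, 0), (0, 1)),
      weakly_prefers(f_quasi, (3, 0), (0, 3)))
```

A passed sample check is of course no proof; the proof is the chain of equivalences above.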

Exercise 1.4 Give an example of a weak order ≿ on R^2_+ that satisfies:

(a) strong monotonicity in coordinate 1, but not quasilinearity in coordinate 1.

(b) quasilinearity in coordinate 1, but not strong monotonicity in coordinate 1.

(c) homotheticity, but none of the three monotonicity properties.

(d) all three monotonicity properties, but not homotheticity.

Exercise 1.5 Consider a weak order ≿ on X = R^L_+ with x ≻ y if x > y.

(a) Prove: if ≿ is continuous, then ≿ is monotonic. That is, also x ≿ y if x ≥ y.

(b) Not a drop too much: Your favorite drink requires mixing its two ingredients in the same amount: if x_1, x_2 ≥ 0 indicate the two amounts, you can mix min{x_1, x_2} of your drink, whereas max{x_1, x_2} − min{x_1, x_2} goes to waste. If you are primarily concerned about the amount of drink, but also feel it is unfortunate to waste ingredients, the following weak order ≿ on R^2_+ may reflect your preferences: for all x, y ∈ R^2_+, x ≿ y if and only if

• x yields more of the drink than y: min{x_1, x_2} > min{y_1, y_2}, or
• x gives the same amount of the drink as y, but not more waste: min{x_1, x_2} = min{y_1, y_2}, but max{x_1, x_2} − min{x_1, x_2} ≤ max{y_1, y_2} − min{y_1, y_2}.

Show that x ≻ y whenever x > y, but not necessarily x ≿ y if x ≥ y.

A preference relation ≿ is convex if for each y ∈ X, the set {x ∈ X : x ≿ y} of weakly better alternatives is convex.

Proposition 1.3 Let ≿ be a weak order on X. Then ≿ is convex if and only if for all x, y ∈ X with x ≿ y and all α ∈ [0, 1], also αx + (1 − α)y ≿ y. Informally, if x is at least as good as y, just walking part of the way from y to x is a weak improvement.

Exercise 1.6

(a) Prove this proposition.

(b) Give an example to show that the proposition is false if ≿ is not a weak order.

A somewhat stronger version: a preference relation ≿ is strictly convex if for all x, y ∈ X with x ≠ y and x ≿ y and all α ∈ (0, 1), it holds that αx + (1 − α)y ≻ y.

This property implies that if you are indifferent between two distinct alternatives x, y ∈ X, you can still improve upon them: by strict convexity, the alternative (1/2)x + (1/2)y is strictly better.

2. Utility

2.1. Utility functions

In many cases, preferences over alternatives can be evaluated by some numerical assessment: "I prefer the alternative with the higher percentage of alcohol" or "I prefer the alternative yielding the higher profit". In that case, we say that these functions (in the latter case, the function assigning to each alternative its associated profit) represent the decision maker's preferences. Formally, a function u : X → R is a utility function representing ≿ if for all x, y ∈ X:

x ≿ y ⇔ u(x) ≥ u(y).    (2)

One often uses the following simple result to verify that u represents a complete preference relation ≿.

Proposition 2.1 Let ≿ be a complete preference relation on a set X and let u : X → R be a function. The following two claims are equivalent:

(a) u represents ≿;

(b) For all x, y ∈ X: if x ≻ y, then u(x) > u(y), and if x ∼ y, then u(x) = u(y).

Proof. (a) ⇒ (b): Assume (a) holds. Let x, y ∈ X. If x ≻ y, by definition of ≻: x ≿ y and not y ≿ x. Hence, by definition of a utility function, u(x) ≥ u(y) and not u(y) ≥ u(x). Conclude that u(x) > u(y). Similarly, if x ∼ y, u(x) = u(y).

(b) ⇒ (a): Assume (b) holds. Let x, y ∈ X. To show:

x ≿ y ⇔ u(x) ≥ u(y).

One direction is easy: if x ≿ y, then x ≻ y or x ∼ y, so by (b), either u(x) > u(y) or u(x) = u(y). Hence u(x) ≥ u(y). Conversely, assume that u(x) ≥ u(y). By completeness, x ≿ y or y ≿ x. Suppose x ≿ y is not true. Then y ≻ x, so by (b), u(y) > u(x), a contradiction. □

Exercise 2.1 The completeness condition in Proposition 2.1 cannot be omitted. Indeed, consider the preference relation ≿ on R with

∀x, y ∈ R : x ≿ y ⇔ x ≥ y + 1

and the function u : R → R with u(x) = x for all x ∈ R. Show that:

(a) ≿ is transitive, but not complete.

(b) u satisfies Proposition 2.1(b), but not Proposition 2.1(a).

If one function represents a preference relation, then many others do as well: if preferences are represented by a profit function, then also "twice the profit" or "profit to the power three" represent the same preference relation. In general:

Proposition 2.2 If u : X → R represents ≿ and f : R → R is strictly increasing, then also the function v : X → R defined by v(x) = f(u(x)) represents ≿.

Proof. By (2) and the definition of strictly increasing, we find for all x, y ∈ X:

x ≿ y ⇔ u(x) ≥ u(y) ⇔ v(x) = f(u(x)) ≥ f(u(y)) = v(y),

so v represents ≿. □
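
A small numerical illustration of this proposition: the Python sketch below compares the preference relations induced by a profit function, by twice the profit, and by the profit cubed. The four alternatives and their profits are hypothetical numbers. As a contrast, squaring is included as well; since t ↦ t^2 is not strictly increasing on R, it may change the ranking when profits can be negative.

```python
from itertools import product

# Hypothetical profits of four alternatives (one of them negative).
profit = {"x": 3.0, "y": -1.0, "z": 3.0, "w": 0.5}

def relation(u):
    """All pairs (a, b) with u(a) >= u(b): the relation represented by u."""
    return {(a, b) for a, b in product(u, repeat=2) if u[a] >= u[b]}

twice = {a: 2 * p for a, p in profit.items()}
cubed = {a: p ** 3 for a, p in profit.items()}
squared = {a: p ** 2 for a, p in profit.items()}  # not strictly increasing on R

same_twice = relation(twice) == relation(profit)
same_cubed = relation(cubed) == relation(profit)
same_squared = relation(squared) == relation(profit)

print(same_twice, same_cubed, same_squared)
```

Here w yields a higher profit than y, but the squared profit of y exceeds that of w, so the squared function represents a different relation.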

Since the ≥ ordering of the real numbers is complete and transitive, a preference relation that can be represented by a utility function is necessarily complete and transitive: it must be a weak order. But is being a weak order enough to guarantee the existence of a utility function? The answer is positive for finite or countable sets.

2.2. From preference to utility: finite or countable sets

Representing a weak order on a finite set by means of a utility function is easy: the more preferred an alternative x ∈ X is, the larger is the set of elements weakly worse than x. Therefore, counting how many elements are weakly worse than x measures its utility.

Proposition 2.3 Assume:

• X is finite,

• ≿ is a weak order on X.

Then there is a utility function representing ≿.

Proof. For each x ∈ X, define u(x) = |{z ∈ X : x ≿ z}|. Then u : X → R represents ≿: let x, y ∈ X. If x ∼ y, then for each z ∈ X with y ≿ z, Proposition 1.1(c) gives that x ≿ z. So {z ∈ X : y ≿ z} ⊆ {z ∈ X : x ≿ z}. Similarly, the converse inclusion holds, so

{z ∈ X : x ≿ z} = {z ∈ X : y ≿ z}. (3)

Hence u(x) = u(y). If x ≻ y, Proposition 1.1(d) and the fact that x lies in the former set, but not in the latter, imply:

{z ∈ X : x ≿ z} ⊃ {z ∈ X : y ≿ z}. (4)

Hence u(x) > u(y). □
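The counting construction in this proof can be sketched in a few lines. The set X and the weak order below (bundles ranked by total quantity, with ties) are hypothetical choices for illustration only.

```python
def counting_utility(X, weakly_prefers):
    """u(x) = number of elements z of X with x weakly preferred to z."""
    return {x: sum(1 for z in X if weakly_prefers(x, z)) for x in X}

# Hypothetical finite set of bundles and a weak order with ties:
# bundles are ranked by their total quantity.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]
weakly_prefers = lambda x, y: sum(x) >= sum(y)

u = counting_utility(X, weakly_prefers)
for x in X:
    print(x, u[x])
```

Equivalent bundles such as (1, 0) and (0, 1) receive the same count, and strictly better bundles receive strictly larger counts, as in (3) and (4).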

If X is countable, simply counting the number of weakly worse alternatives does not work: there may be infinitely many of them. But we can give each element a positive weight, make sure that the weights have a well-defined sum even if we add infinitely many of them, and use the total weight of the elements weakly worse than x as a measure of the utility of x. For instance, label X = {x1, x2, …} and divide a bar of chocolate by giving half (weight 2^{−1}) to x1, then half of the remainder (weight 2^{−2}) to x2, then half of the remainder (weight 2^{−3}) to x3, and so on.

Proposition 2.4 Assume:

• X is countable;

• ≿ is a weak order on X.

Then there is a utility function representing ≿.

Proof. Since X is countable, there is an injective function n : X → N. For each x ∈ X, define

u(x) = ∑_{z ∈ X : x ≿ z} 2^{−n(z)}.

The sequence (2^{−n})_{n∈N} has a finite sum ∑_{n∈N} 2^{−n} = 1, so u is well-defined. To see that u represents ≿, let x, y ∈ X. If x ∼ y, (3) holds, so u(x) = u(y). If x ≻ y, (4) holds, so u(x) − u(y) ≥ 2^{−n(x)} > 0. □
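The weighted sum can be approximated by a partial sum. The sketch below assumes X = {1, 2, 3, …} with the enumeration n(z) = z and a hypothetical weak order that compares remainders modulo 3, so every element has infinitely many weakly worse alternatives. Exact fractions avoid rounding; the neglected tail is smaller than 2^{−n_terms}.

```python
from fractions import Fraction

def weighted_utility(x, weakly_prefers, n_terms=60):
    """Partial sum of 2^(-z) over z = 1..n_terms with x weakly preferred to z.

    Assumes X = {1, 2, 3, ...} enumerated by n(z) = z. The omitted tail of
    the series is smaller than 2^(-n_terms).
    """
    return sum(Fraction(1, 2 ** z) for z in range(1, n_terms + 1)
               if weakly_prefers(x, z))

# Hypothetical weak order on the positive integers: compare remainders mod 3.
weakly_prefers = lambda x, y: x % 3 >= y % 3

for x in (3, 1, 4, 2):
    print(x, float(weighted_utility(x, weakly_prefers)))
```

The truncation is harmless here because each indifference class already has members among the first few integers; for an element whose index exceeds n_terms, the strict gap 2^{−n(x)} from the proof would be lost in the partial sum.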

2.3. Preference, but no utility

Not all preference relations, not even all weak orders, can be represented by means of a utility function. Graduate textbooks usually give exactly one example (lexicographic preferences), as if it were an exotic phenomenon. This section gives some counterweight by providing several economically relevant examples, all arising from the following general principle.

Fix a set of alternatives X. Suppose you can associate with each number z in some uncountable set I ⊆ R one bad alternative b(z) ∈ X and one good alternative g(z) ∈ X with the following two properties. Firstly, for each z ∈ I, the good alternative is strictly preferred to the bad one:

g(z) ≻ b(z). (5)

Secondly, if z < z′, then the good alternative associated with z is worse than the bad alternative associated with z′:

∀z, z′ ∈ I : z < z′ ⇒ b(z′) ≻ g(z). (6)

Combining (5) and (6), representing such preferences by a utility function u requires, for z < z′:

u(b(z)) < u(g(z)) < u(b(z′)) < u(g(z′)).

So for each z ∈ I, the interval [u(b(z)), u(g(z))] has positive length, and if z, z′ ∈ I have z ≠ z′, the intervals [u(b(z)), u(g(z))] and [u(b(z′)), u(g(z′))] are disjoint: one of them lies entirely to the left of the other on the real axis. So uncountably many intervals [u(b(z)), u(g(z))] of positive length must somehow be placed on the real line without any two of them intersecting. This is impossible: we simply run out of space! Formally, each interval [u(b(z)), u(g(z))] contains a rational number r(z) ∈ ℚ. Since the intervals associated with different values of z are disjoint, z ≠ z′ implies r(z) ≠ r(z′), i.e., the function r : I → ℚ is injective. But I is uncountable and ℚ is countable, a contradiction. Some examples:

Lexicographic preferences. (Debreu, 1954) Let X = R^2. Define ≿ as follows:

(x1, x2) ≿ (y1, y2) ⇔ x1 > y1 or (x1 = y1 and x2 ≥ y2).

Alternatives are compared according to their first coordinates; if these happen to be equal, they are compared according to their second coordinates. Think of the way words are ordered in a dictionary. For each z ∈ R, let b(z) = (z, 0) and g(z) = (z, 1). Then g(z) ≻ b(z) and, if z, z′ ∈ R, z < z′, then g(z) = (z, 1) ≺ (z′, 0) = b(z′). So (5) and (6) hold: this preference relation cannot be represented by a utility function.
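The lexicographic relation and the two properties (5) and (6) can be checked mechanically on a handful of thresholds. This is only a finite spot check under the definitions above; the non-existence of a utility function rests on the uncountability argument, which no finite computation can reproduce. The sample thresholds are arbitrary.

```python
def lex_weakly_prefers(x, y):
    """Lexicographic weak preference on pairs, as defined in the text."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])

def lex_strictly_prefers(x, y):
    return lex_weakly_prefers(x, y) and not lex_weakly_prefers(y, x)

b = lambda z: (z, 0)   # bad alternative associated with threshold z
g = lambda z: (z, 1)   # good alternative associated with threshold z

thresholds = [-1.0, 0.0, 0.5, 2.0]   # arbitrary sample of real numbers

property_5 = all(lex_strictly_prefers(g(z), b(z)) for z in thresholds)
property_6 = all(lex_strictly_prefers(b(z2), g(z1))
                 for z1 in thresholds for z2 in thresholds if z1 < z2)
print(property_5, property_6)
```

Python's built-in tuple comparison is itself lexicographic, so `lex_weakly_prefers(x, y)` agrees with `x >= y` for pairs of numbers.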

Preferences over information. (Dubra and Echenique, 2001) It is common in economics to model information by means of partitions of a state space. Let z ∈ R be a certain threshold. Suppose you get the following information about a number x ∈ R: you are told the exact value of x if x < z, otherwise you are told that x lies in the interval [z,∞). That means you can perfectly distinguish between all real numbers x with x < z, but cannot distinguish between the numbers in the interval [z,∞). Therefore, information is summarized by the partition

b(z) = {{x} : x < z} ∪ {[z,∞)}

of R. Similarly, define the information partition

g(z) = {{x} : x ≤ z} ∪ {(z,∞)}

that arises if you are told the exact value of x also in the case where x = z: all numbers x ≤ z can be perfectly distinguished, but larger ones cannot. Assume it is preferable to have more precise information, i.e., finer information partitions (partition P is finer than partition Q if every set from P is contained in a set from Q). Partition g(z) is finer than partition b(z), so g(z) ≻ b(z). Also, if z < z′, partition b(z′) is finer than partition g(z), so b(z′) ≻ g(z). So (5) and (6) hold: this preference relation cannot be represented by a utility function.

Preferences over utility flows. At every moment in time t ∈ [0,∞), an agent receives payoff zero or one: an alternative x is simply a function x : [0,∞) → {0, 1}. Suppose preferences satisfy the following monotonicity condition: if x(t) ≥ y(t) at all times t, with strict inequality for at least one time period, then x ≻ y. Define, for each z ∈ [0,∞), the alternative b(z) giving payoff one before time z and payoff zero from time z onwards:

b(z)(t) = 1 if t < z, and b(z)(t) = 0 otherwise.

Similarly, alternative g(z) gives payoff one at and before time z and payoff zero afterwards:

g(z)(t) = 1 if t ≤ z, and g(z)(t) = 0 otherwise.

By the monotonicity requirement, g(z) ≻ b(z) and, if z < z′, b(z′) ≻ g(z). So (5) and (6) hold: this preference relation cannot be represented by a utility function.

2.4. In no-man's-land: A necessary and sufficient condition for utility representation

We saw above that preference relations where there are uncountably many disjoint intervals between bad and good alternatives cannot be represented by means of a utility function. On the other hand, complete and transitive preferences on a countable set do have a utility representation. Is there something in between these two cases that allows uncountably many alternatives, but still has enough of a "countable character" that it allows a utility representation?

Let ≿ be a complete, transitive preference relation over a set X. The pair (X, ≿), or with a minor abuse of notation the set X, is Jaffray order-separable if there is a countable subset C ⊆ X such that for all x, y ∈ X:

if x ≻ y, then there exist c1, c2 ∈ C s.t. x ≿ c1 ≻ c2 ≿ y.

The condition roughly says that countably many alternatives suffice to keep all pairs x, y ∈ X with x ≻ y apart: x lies on one side of the "no-man's-land" between c1 and c2, whereas y lies on the other. This condition is both necessary and sufficient for the existence of a utility representation:

Proposition 2.5 Let ≿ be a weak order on a set X. There is a utility function representing ≿ if and only if X is Jaffray order-separable.

Exercise 2.2 This exercise guides you through the steps of the proof. Assume that u represents ≿. Let U = {u(x) : x ∈ X} be the range of u. A jump in U is a pair (u1, u2) ∈ U × U where u1 < u2 and the open interval (u1, u2) contains no elements of U: (u1, u2) ∩ U = ∅.

(a) Prove that U contains at most countably many jumps. (Suppose not. Use the idea behind (5) and (6) to find a contradiction.)

For each jump (u1, u2), fix a point x(u1, u2) with utility u1 and a point y(u1, u2) with utility u2. Let J = ∪{x(u1, u2), y(u1, u2)} be the union (over all jumps (u1, u2)) of these points. By (a), J is countable.

Next, for each pair of rational numbers r1, r2 ∈ ℚ with r1 < r2 and (r1, r2) ∩ U ≠ ∅, fix an element x(r1, r2) ∈ X with utility in (r1, r2) ∩ U. Let R be the union of all such points x(r1, r2). Since there are only countably many pairs (r1, r2) as above, R is countable. Let C = J ∪ R.

(b) Show that C makes X Jaffray order-separable.

Conversely, assume X is Jaffray order-separable via the set C. Let n : C → N be injective. Define u by

u(x) = ∑_{c ∈ C : c ≾ x} 2^{−n(c)}.

(c) Show that u represents ≿.

For finite or countable sets X, simply let C = X to show that X is Jaffray order-separable. For preferences over uncountable sets, additional restrictions are required. We will see in Proposition 2.8, for instance, that on R^L_+, adding continuity to our list of requirements works.

2.5. Continuous utility

Economists usually work with continuous utility functions. Establishing existence of a continuous utility function is troublesome: not even Fishburn (1970a), the standard reference in the field, bothers to give the proof. A well-known continuity result is often wrongly attributed to Debreu (1954). However, his proof is flawed (Debreu, 1964) and a more general continuity result was already known from much older research on order types in the classical theory of sets, due to Georg Cantor. See, for instance, Kamke (1950). The proof of Proposition 2.6 is not obligatory reading; it follows Jaffray (1975).

Proposition 2.6 Assume:

• ≿ is a weak order on X;

• X is Jaffray order-separable;

• X is endowed with a topology where, for all y ∈ X, the sets {x ∈ X : x ≻ y} and {x ∈ X : x ≺ y} are open, i.e., ≿ is continuous.

Then there exists a continuous utility function representing ≿.

Proof. Let C ⊆ X make X Jaffray order-separable. Omitting redundant elements from C if necessary, one may assume that no two distinct elements of C are equivalent: for all c, c′ ∈ C with c ≠ c′, either c ≻ c′ or c′ ≻ c.

[Define utility on C:] Since C is countable, label C = {c1, c2, …}. Since the set Q = (0, 1) ∩ ℚ of rationals in (0, 1) is countable, label Q = {q1, q2, …}. Define a utility function f : C → Q by induction: f(c1) := q1. Let n ∈ N, n ≥ 2, and assume f was defined on {c1, …, c_{n−1}}. To extend the utility function to {c1, …, c_n}, define f(c_n) to be the first element of Q (defined⁴ as the element q_ℓ ∈ Q with smallest index ℓ) among those elements q_ℓ that give the desired extension:

∀k ∈ {1, …, n − 1} : q_ℓ > f(c_k) ⇔ c_n ≻ c_k. (7)

A useful implication: let a, b ∈ C with a ≺ b. If the set of points in C between a and b,

β(a, b) = {c ∈ C : a ≺ c ≺ b},

is nonempty, it has a first element (Why?), say c_m. By construction, c_m is the first element in β(a, b) to be assigned its value by f and therefore its image f(c_m) is the first element in (f(a), f(b)) ∩ Q.

[Extend utility to X:] For each x ∈ X, define u(x) = sup {f(c) : c ∈ C, c ≾ x}. The set over which the supremum is taken is nonempty (it contains x) and bounded from above (by 1), so this supremum exists. Moreover, u represents ≿. Let x, y ∈ X. If x ∼ y, the supremum is taken over the same set, so u(x) = u(y). If x ≻ y, there exist, by Jaffray order-separability, elements a, b ∈ C with x ≿ a ≻ b ≿ y, so that u(x) ≥ f(a) > f(b) ≥ u(y).

[Establish continuity of utility:] The usual topology on R is generated by the intervals (−∞, r) and (r,∞), with r rational. Therefore, it suffices to prove that u^{−1}((−∞, r)) and u^{−1}((r,∞)) are open for all r ∈ ℚ. Let's do the former; the latter is similar.

Now u^{−1}((−∞, r)) equals (i) ∅ if r ≤ inf f(C), (ii) X if r > sup f(C) or if r = sup f(C) and r ∉ f(C), (iii) {x ∈ X : x ≺ f^{−1}(r)} if r ∈ f(C). By assumption, all these sets are open.

The only remaining case is when r ∉ f(C) and inf f(C) < r < sup f(C). We show that r belongs to a jump of f(C). Recall from Exercise 2.2 that a jump in f(C) is a pair of points (f1, f2) ∈ f(C) × f(C) with f1 < f2 and (f1, f2) ∩ f(C) = ∅.

Suppose not. Since inf f(C) < r < sup f(C), there exist a, b ∈ C with f(a) < r < f(b). Let m ∈ N be the maximum of the indices of f(a), r, f(b) ∈ Q. Then {q1, …, q_m} contains r and elements p, p′ ∈ f(C) with p < r < p′. Let n ∈ N be the smallest index for which {q1, …, q_n} has this property. Let

p1 = max f(C) ∩ {q1, …, q_n} ∩ (−∞, r), so (p1, r) ∩ {q1, …, q_n} = ∅,

p2 = min f(C) ∩ {q1, …, q_n} ∩ (r,∞), so (r, p2) ∩ {q1, …, q_n} = ∅.

So r is the first element of (p1, p2). Since it contains r, the interval (p1, p2) cannot be a jump, i.e., it contains elements from f(C). We show that this yields a contradiction.

Since p1, p2 ∈ f(C), there exist b1, b2 ∈ C with p1 = f(b1), p2 = f(b2). Since (p1, p2) ∩ f(C) ≠ ∅, there is a p ∈ C with p1 < f(p) < p2, i.e., the set β(b1, b2) of points in C between b1 and b2 is nonempty. Let b∗ be its first element. By the implication following (7), its image f(b∗) must be the first element of (p1, p2), which was r. But r ∉ f(C), a contradiction.

This shows that r belongs to a jump (f1, f2) of f(C). But then u^{−1}((−∞, r)) = {x ∈ X : x ≺ f^{−1}(f2)}, which is open by assumption. □

Let us apply this result to show that continuous weak orders on R^L_+ can be represented by a continuous utility function. We first establish an auxiliary result that is of interest in its own right whenever we want to find alternatives "in between" two others.

⁴ Caveat: 'first element' is defined in terms of the chosen enumerations of C and Q. This allows us to speak, for instance, of the first element in (0, 1), which makes absolutely no sense if one were to believe, mistakenly, that it was defined in terms of the usual ≥ order on R.

Proposition 2.7 Intermediate Value Theorem for preferences: Assume:

• X = R^L_+ for some L ∈ N;

• ≿ is a continuous weak order on X;

• Y is a connected subset of X.

The following two results hold:

(a) If x ∈ X and y, y′ ∈ Y are such that y ≿ x ≿ y′, then there is a y′′ ∈ Y with x ∼ y′′.

(b) If y, y′ ∈ Y are such that y ≻ y′, then there is a y′′ ∈ Y with y ≻ y′′ ≻ y′.

Proof. (a): Suppose not: all elements of Y are strictly better or strictly worse than x. That is, each element of Y belongs to exactly one of the sets A = {z ∈ X : z ≺ x} and B = {z ∈ X : z ≻ x}. The former contains y′, the latter y. As A and B are open by continuity, they separate the connected set Y, a contradiction.

(b): Suppose not. Then each element of Y belongs to exactly one of the sets A = {z ∈ X : y ≻ z} and B = {z ∈ X : z ≻ y′}. The former contains y′, the latter y. As A and B are open by continuity, they separate the connected set Y, a contradiction. □

In typical applications of this proposition, one takes Y to be equal to the entire set X, as in Proposition 2.8, or to a suitably chosen convex set like the diagonal {x ∈ R^L_+ : x1 = · · · = xL} in Proposition 2.9.

Proposition 2.8 Assume:

• X = R^L_+ for some L ∈ N;

• ≿ is a continuous weak order on X.

Then there is a continuous utility function representing ≿.

Proof. The countable set C = ℚ^L_+ makes X Jaffray order-separable: let x, y ∈ X with x ≻ y. By Proposition 2.7, there is a z ∈ X with x ≻ z ≻ y. By continuity, the set

{a ∈ X : x ≻ a ≻ z} = {a ∈ X : x ≻ a} ∩ {a ∈ X : a ≻ z}

is the intersection of two open sets, hence open itself. It is nonempty by Proposition 2.7. The set C is dense in X: every nonempty, open set in X has a nonempty intersection with C. Hence, there is a c1 ∈ C with x ≻ c1 ≻ z. Similarly, there is a c2 ∈ C with z ≻ c2 ≻ y. Conclude that x ≻ c1 ≻ c2 ≻ y, in correspondence with the requirement for Jaffray order-separability. Now all conditions of Proposition 2.6 are satisfied. □

Below we present a special case of Proposition 2.8 with a particularly simple proof.

Proposition 2.9 Assume:

• X = R^L_+ for some L ∈ N;

• ≿ is a continuous, monotonic weak order on X.

Then there is a continuous utility function representing ≿.

Proof. Let e = (1, …, 1) ∈ R^L_+ denote the vector of ones.

Step 1: For each x ∈ X, there is a unique α_x ≥ 0 with x ∼ α_x e.

Let x ∈ X. Choose β ≥ max{x1, …, xL}. By monotonicity, βe ≿ x ≿ 0e. By Proposition 2.7, the diagonal

{x ∈ R^L_+ : x1 = · · · = xL},

being connected, contains an element equivalent to x: there is an α_x ≥ 0 with x ∼ α_x e. Uniqueness follows from monotonicity: increasing α_x gives better alternatives, decreasing it gives worse ones.

Step 2: Define u(x) = α_x. Then u represents ≿.

Let x, y ∈ X. Then x ≿ y ⇔ α_x e ≿ α_y e ⇔ u(x) = α_x ≥ α_y = u(y).

Step 3: u is continuous.

It suffices to show that the preimage u^{−1}((α, β)) of every open interval (α, β) is open. Now

u^{−1}((α, β)) = {x ∈ X : x ≻ αe} ∩ {x ∈ X : x ≺ βe}

is the intersection of two open sets by continuity, and therefore open. □
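Step 1 suggests a simple numerical procedure: given only a way to compare bundles, bisect along the diagonal until the bundle αe is (approximately) equivalent to x. The sketch below assumes a hypothetical continuous, monotonic preference relation, namely the one induced by the geometric mean of two goods; the function names and the tolerance are illustrative choices, not part of the text.

```python
import math

def diagonal_equivalent(x, weakly_prefers, tol=1e-9):
    """Approximate the alpha >= 0 with x ~ alpha * e by bisection.

    Invariant: x is weakly preferred to lo * e, and hi * e is weakly
    preferred to x (this holds initially by monotonicity).
    """
    L = len(x)
    lo, hi = 0.0, float(max(x))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weakly_prefers(x, (mid,) * L):
            lo = mid      # x is at least as good as mid * e
        else:
            hi = mid      # mid * e is strictly better than x
    return (lo + hi) / 2

# Hypothetical preferences on R^2_+: ranked by the geometric mean.
weakly_prefers = lambda x, y: math.sqrt(x[0] * x[1]) >= math.sqrt(y[0] * y[1])

print(diagonal_equivalent((1.0, 4.0), weakly_prefers))
print(diagonal_equivalent((2.0, 8.0), weakly_prefers))
```

The resulting number plays the role of u(x) = α_x from Step 2. Because these example preferences are also homothetic, doubling the bundle doubles the result, up to the tolerance.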

As a simple application, suppose that preferences are also homothetic. Then x ∼ α_x e and β ≥ 0 imply that βx ∼ βα_x e, so u(βx) = βα_x = βu(x). This proves:

Corollary 2.10 If, in addition to the assumptions in Proposition 2.9, the preference relation ≿ is homothetic, there is a utility function homogeneous of degree one representing ≿.

The next exercise studies the connection between continuous preferences and continuous utility. The fact that statement (a) in that exercise is true is useful: you will have relatively little trouble recognizing continuous functions, and continuous utility implies continuous preferences!

Exercise 2.3 Consider a weak order ≿ on a topological space X represented by utility function u : X → R. Are the following claims true or false?

(a) If u is continuous, then ≿ is continuous.

(b) If ≿ is continuous, then u is continuous.

2.6. Some special functional forms

Recall that if a preference relation over commodity bundles is quasilinear in some coordinate, this coordinate is often referred to by economists as 'money' or a 'numeraire'. Under mild additional assumptions, such quasilinear preferences can be represented by means of a utility function of the form 'money plus whatever utility I get from the other commodities'.

Proposition 2.11 Assume:

• X = R^L_+ for some L ∈ N;

• ≿ is a weak order on X;

• ≿ is quasilinear and strongly monotonic in the first coordinate;

• "Getting something is at least as good as getting nothing": x ≿ (0, …, 0) for every x ∈ X;

• "Any difference can be compensated for by money": ∀x, y ∈ X: if x ≿ y, there is a v ≥ 0 s.t. x ∼ (y1 + v, y2, …, yL).

Then there is a utility function of the form u(x) = x1 + v(x2, …, xL) representing ≿.

Proof. Let x ∈ X. By assumption:

(0, x2, …, xL) ≿ (0, …, 0).

Hence there is a number v(x2, …, xL) ≥ 0 s.t.

(0, x2, …, xL) ∼ (v(x2, …, xL), 0, …, 0).

This number is unique, since ≿ is strongly monotonic in the first coordinate. Adding x1 ≥ 0 to the first coordinate, quasilinearity implies that

(x1, x2, …, xL) ∼ (x1 + v(x2, …, xL), 0, …, 0).

The utility function u : X → R with u(x) = x1 + v(x2, …, xL) represents ≿:

∀x, y ∈ X : x ≿ y ⇔ (x1 + v(x2, …, xL), 0, …, 0) ≿ (y1 + v(y2, …, yL), 0, …, 0)

⇔ x1 + v(x2, …, xL) ≥ y1 + v(y2, …, yL),

where the second equivalence follows from strong monotonicity of ≿ in the first coordinate. □
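To see the construction at work, consider a hypothetical two-good example that is not taken from the text: suppose the preference relation happens to be the one represented by $x_1 + \sqrt{x_2}$. Following the steps of the proof recovers the function $v$ as the amount of the first commodity that is equivalent to holding $x_2$ alone.

```latex
% Hypothetical example with L = 2 (an assumption made for illustration):
% \succsim is the relation represented by x_1 + \sqrt{x_2}.
\begin{align*}
(0, x_2) &\sim (v(x_2), 0)
  &&\Longleftrightarrow\quad 0 + \sqrt{x_2} = v(x_2) + \sqrt{0}
  &&\Longrightarrow\quad v(x_2) = \sqrt{x_2}, \\
(x_1, x_2) &\sim (x_1 + \sqrt{x_2},\, 0),
  && \text{e.g. } (1, 9) \sim (4, 0), \\
(1, 9) \succsim (3, 0.25) &\Longleftrightarrow 1 + 3 \ge 3 + 0.5 .
\end{align*}
```

In this example every bundle is worth a certain amount of the first commodity alone, which is the sense in which utility is measured in units of commodity 1.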

The proof establishes that each alternative is equivalent to receiving a sufficiently large amount of just the first commodity: utility can be measured in units of commodity 1. This explains the frequent use of quasilinear preferences: only if they are measured on the same scale can one do meaningful comparisons between, say, your utility and mine.

Exercise 2.4 Is the final property

∀x, y ∈ X : if x ≿ y, there is a v ≥ 0 s.t. x ∼ y + ve1 (8)

in Proposition 2.11 implied by the others?

Exercise 2.5 Preferences with money (Kaneko, 1976): Let A be a nonempty set. Let X = A × R_+, where an element (a, m) ∈ X is interpreted as receiving a ∈ A and an amount of money m ∈ R_+. A decision maker has a weak order ≿ on X with the following three properties:

• "strict preference can be compensated for by money": for all alternatives (a, m) and (a′, m′) in X: if (a, m) ≻ (a′, m′), there is a number m∗ ≥ 0 such that (a, m) ∼ (a′, m∗).

• "≿ is strongly monotonic in money": for all a ∈ A and m, m′ ∈ R_+: if m > m′, then (a, m) ≻ (a, m′).

• "indifference is insensitive to shifts in money": for all alternatives (a, m) and (a′, m′) in X and all c ≥ 0: if (a, m) ∼ (a′, m′), then (a, m + c) ∼ (a′, m′ + c).

We construct a utility function assigning to each (a, m) ∈ X a utility of the form "money plus utility from a".

(a) Let a, a′ ∈ A. Show that there exist amounts of money m, m′ ∈ R_+ such that (a, m) ∼ (a′, m′).

(b) Let a, a′ ∈ A and m, m′, w, w′ ∈ R_+ satisfy (a, m) ∼ (a′, m′) and (a, w) ∼ (a′, w′). Show that m − m′ = w − w′.

Fix an arbitrary element a∗ ∈ A. Define the function v : A → R by taking, for each a ∈ A:

v(a) = m∗ − m, where m, m∗ are chosen such that (a∗, m∗) ∼ (a, m).

Such m, m∗ exist by (a) and the function v is independent of the particular choices of m, m∗ by (b), so this function is well-defined.

(c) Show that the function u : X → R with u(a, m) = v(a) + m is a utility function representing ≿.

Convexity and strict convexity of preferences also have implications for the form of the utility function. Recall that a real-valued function u on a convex domain X (Why convex?) is

quasiconcave if for all x, y ∈ X and all α ∈ (0, 1):

u(αx + (1 − α)y) ≥ min{u(x), u(y)}.

strictly quasiconcave if for all x, y ∈ X with x ≠ y and all α ∈ (0, 1):

u(αx + (1 − α)y) > min{u(x), u(y)}.

Proposition 2.12 Assume:

• X = R^L_+ for some L ∈ N;

• ≿ is a convex weak order on X;

• u : X → R represents ≿.

Then u is quasiconcave. If ≿ is strictly convex, u is strictly quasiconcave.

Proof. Let x, y ∈ X and α ∈ (0, 1). Assume without loss of generality that x ≿ y. Then u(x) ≥ u(y), so min{u(x), u(y)} = u(y). By convexity of ≿: αx + (1 − α)y ≿ y, so

u(αx + (1 − α)y) ≥ u(y) = min{u(x), u(y)},

as we had to show. The proof for strict quasiconcavity is analogous. □
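The quasiconcavity inequality can be probed numerically by sampling pairs of bundles and mixing them. The sketch below is only a randomized spot check, not a proof; the two utility functions, the sampling box [0, 10]^2 and the tolerance are hypothetical choices for illustration.

```python
import random

def mix(x, y, alpha):
    """Convex combination alpha * x + (1 - alpha) * y."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(x, y))

def violates_quasiconcavity(u, x, y, alpha, tol=1e-9):
    """True if u(mix) < min(u(x), u(y)), beyond a small rounding tolerance."""
    return u(mix(x, y, alpha)) < min(u(x), u(y)) - tol

def find_violation(u, dim=2, trials=2000, seed=0):
    """Random search for a counterexample; returns None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = tuple(rng.uniform(0, 10) for _ in range(dim))
        y = tuple(rng.uniform(0, 10) for _ in range(dim))
        alpha = rng.uniform(0, 1)
        if violates_quasiconcavity(u, x, y, alpha):
            return x, y, alpha
    return None

cobb_douglas = lambda x: (x[0] * x[1]) ** 0.5       # quasiconcave
sum_of_squares = lambda x: x[0] ** 2 + x[1] ** 2    # not quasiconcave

print(find_violation(cobb_douglas))
print(violates_quasiconcavity(sum_of_squares, (1.0, 0.0), (0.0, 1.0), 0.5))
```

For the sum of squares, the bundles (1, 0) and (0, 1) both have utility 1, while their midpoint has utility 0.5: mixing makes the agent worse off, so the corresponding preferences are not convex.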

Exercise 2.6

(a) An equivalent way of defining a quasiconcave function u on a convex domain X is that for all r ∈ R, the upper contour set X_u(r) = {x ∈ X : u(x) ≥ r} is convex. Provide a second proof of Proposition 2.12, using this definition.

(b) As a converse to Proposition 2.12, prove that if u : X → R is a (strictly) quasiconcave utility function on a convex set X, the corresponding preference relation ≿ is (strictly) convex.

(c) Give an example of a convex weak order on R that can be represented by a utility function, but not by a concave one.

Next, we provide conditions for a weak order to be representable by a linear utility function. Although we go into more detail, the proof follows Diecidue and Wakker (2002). A convenient mathematical tool is treated in the following exercise.

Exercise 2.7 Cauchy's functional equation: On two domains, we show that, under mild assumptions, additive functions are linear. Let f : R → R be additive: f(x + y) = f(x) + f(y) for all x, y ∈ R.

(a) Let u ∈ R. Show that f(xu) = xf(u) for all rational x. Hint: First establish the claim for x ∈ N, then for x ∈ Z, then for x ∈ ℚ.

Setting u = 1 and c = f(1), it follows that f(x) = cx for all rational x, i.e., f is linear on the field ℚ. Approximating real numbers by rational ones and taking limits, it follows that continuous additive functions f : R → R are linear. But much weaker conditions than continuity suffice:

(b) Suppose f is not linear on R. Show that its graph {(x, y) ∈ R^2 | y = f(x)} is dense.

So any assumption that prevents the graph of f from being dense implies that f must be linear! Such conditions include continuity in a single point, boundedness or sign restrictions on small intervals, monotonicity, etc.

We now extend the domain to n-dimensional real vectors. Let F : R^n → R be additive: F(x + y) = F(x) + F(y) for all x, y ∈ R^n.

(c) Reduce this to the previously solved case by showing that there exist additive functions f_i : R → R for i = 1, …, n such that, for all x ∈ R^n, F(x) = f_1(x1) + · · · + f_n(xn).

With this tool in our baggage, we can prove the linear representation result:

Proposition 2.13 Assume:

• X = R^L for some L ∈ N;

• ≿ is a weak order on X;

• ≿ is strongly monotonic;

• ≿ is additive: for all x, y, z ∈ X, if x ≿ y, then x + z ≿ y + z;

• For each x ∈ X there is a constant α ∈ R such that x ∼ α(1, …, 1).

Then there are α1, …, αL ∈ R_{++} such that the function u : X → R with u(x) = α1x1 + · · · + αLxL represents ≿.

Proof. Let e = (1, …, 1). By assumption, there is, for each x ∈ X, a number u(x) ∈ R such that x ∼ u(x)e. By strong monotonicity, this number is unique. So the function u : R^L → R is well-defined and represents preferences ≿.

Moreover, u is additive. Let x, y ∈ X. Using additivity of ≿ twice (for ≿ and ≾), x ∼ u(x)e implies that x + y ∼ u(x)e + y. Similarly, y ∼ u(y)e implies that u(x)e + y ∼ u(x)e + u(y)e = (u(x) + u(y))e. By transitivity, x + y ∼ (u(x) + u(y))e. Hence u(x + y) = u(x) + u(y).

As u : R^L → R satisfies Cauchy's functional equation, Exercise 2.7 implies that there are additive functions u_i : R → R (i = 1, …, L) with u(x) = ∑_{i=1}^{L} u_i(x_i). By strong monotonicity, each u_i is strictly increasing: its graph cannot be dense. Hence, each u_i is linear: there are α1, …, αL ∈ R such that u(x) = ∑_{i=1}^{L} α_i x_i. The constants α1, …, αL are positive by strong monotonicity. □

Most assumptions are familiar. Strong monotonicity assures that all the α_i are positive; with milder monotonicity requirements, one can only assure that some of them are. If you don't like the final assumption, recall from Proposition 2.9 that it can be replaced by continuity. Additivity of preferences is obviously the key assumption. It essentially states that in evaluating two alternatives x, y ∈ X, only their difference x − y matters: preferences are insensitive to translations.

With later applications in mind (see Proposition 2.14), there is no nonnegativity assumption on the vectors over which preferences were defined: X = R^L, not R^L_+. If this makes you nervous, notice that the proof hinges on the linearity of the function satisfying Cauchy's functional equation. Fortunately, linearity can be derived even if additivity holds only on the nonnegative orthant.

The remainder of this section is based on Voorneveld (2008), which contains more general results. Due to its analytical tractability, the Cobb-Douglas utility function

u : R^L_+ → R with u(x) = x_1^{a_1} · · · x_L^{a_L} = ∏_{i=1}^{L} x_i^{a_i} (L ∈ N, a_1, …, a_L > 0)

is among the most commonly used in economics; see also Exercise [which?]. Its name credits Cobb and Douglas (1928), who used it in the context of production theory. What properties of an agent's preferences assure that they can be represented by a Cobb-Douglas utility function?

Part of the trick is in exploiting the fact that this function also goes under the name of log-linear utility: taking logarithms, we have that for all x, y ∈ R^L_{++}:

x ≿ y ⇔ ∑_{i=1}^{L} a_i ln x_i ≥ ∑_{i=1}^{L} a_i ln y_i.

This reduces preferences to a linear utility function in the logarithm of the variables, allowing us to exploit Proposition 2.13. Of course, this trick goes only part of the way, as one cannot take logarithms on the boundary of R^L_+, where some coordinates equal zero.

Proposition 2.14 Assume:

• X = R^L_+ for some L ∈ N;

• ≿ is a weak order on X;

• ≿ is strongly monotonic;

• ≿ is homothetic in each coordinate: for each i ∈ {1, …, L}, all x, y ∈ X, and each t > 0: if x ≿ y, then (x1, …, x_{i−1}, tx_i, x_{i+1}, …, xL) ≿ (y1, …, y_{i−1}, ty_i, y_{i+1}, …, yL).

• For each x ∈ X there is a constant α ∈ R_+ such that x ∼ α(1, …, 1).

Then ≿ can be represented by a Cobb-Douglas utility function.

Proof. We use Proposition 2.13 to show that ≿ can be represented by a Cobb-Douglas utility function on R^L_{++}. The domain is then extended to R^L_+.

Step 1, domain R^L_{++}: Define f : R^L → R^L_{++} for each x ∈ R^L by f(x) = (exp x1, …, exp xL). Notice that f and its inverse f^{−1} : R^L_{++} → R^L with f^{−1}(y) = (ln y1, …, ln yL) are continuous. Given the weak order ≿ on R^L_{++}, define a weak order ≿_f on R^L as follows:

∀x, y ∈ R^L : x ≿_f y ⇔ f(x) ≿ f(y). (9)

The exponential function is strictly increasing, so by substitution in (9), properties imposed on ≿ carry over in a straightforward way to properties of ≿_f: one easily verifies that it is a weak order satisfying strong monotonicity, and there exists, for each x ∈ R^L, a scalar α such that x ∼_f α(1, …, 1). Applying coordinatewise homotheticity L times, it follows that

∀x, y, t ∈ R^L_{++} : x ≿ y ⇒ (t1x1, …, tLxL) ≿ (t1y1, …, tLyL).

Hence, by definition (9), (ln x1, …, ln xL) ≿_f (ln y1, …, ln yL) implies that

(ln x1, …, ln xL) + (ln t1, …, ln tL) ≿_f (ln y1, …, ln yL) + (ln t1, …, ln tL).

As f is bijective, it follows that ≿_f is additive.

Conclude that ≿_f on R^L satisfies all assumptions of Proposition 2.13: there are a1, …, aL > 0 such that ≿_f is represented by the utility function x ↦ ∑_{i=1}^{L} a_i x_i. By (9), for all x, y ∈ R^L_{++}:

x ≿ y ⇔ (ln x1, …, ln xL) ≿_f (ln y1, …, ln yL) ⇔ ∑_{i=1}^{L} a_i ln x_i ≥ ∑_{i=1}^{L} a_i ln y_i.

Taking exponentials, ≿ is represented by the utility function u with u(x) = ∏_{i=1}^{L} x_i^{a_i} on R^L_{++}.

Step 2, domain R^L_+: To see that u represents ≿ on the entire domain R^L_+, we must establish that x ∼ (0, …, 0) for each x ∈ R^L_+ with some, but not all, coordinates equal to zero. Pick such an x. As x + (1/n)e ∈ R^L_{++} for each n ∈ N, strong monotonicity implies (0, …, 0) ≺ x + (1/n)e. Hence, there is an ε_n > 0 with x + (1/n)e ∼ ε_n e. As at least one coordinate of x + (1/n)e goes to zero:

0 = lim_{n→∞} u(x + (1/n)e) = lim_{n→∞} u(ε_n e) = lim_{n→∞} ε_n^{a1+···+aL}.

As a1 + · · · + aL > 0, it follows that lim_{n→∞} ε_n = 0.

By assumption, x ∼ αe for some α ≥ 0. Positive α are ruled out: x ≺ x + (1/n)e ∼ ε_n e for all n ∈ N and lim_{n→∞} ε_n = 0. So α must be zero. □
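The log-linear trick behind Step 1 can be checked numerically: on strictly positive bundles, the Cobb-Douglas function and the weighted sum of logarithms induce the same ranking. The exponents and bundles below are hypothetical and were chosen so that no two distinct bundles tie, which keeps floating-point rounding from affecting the comparison.

```python
import math
from itertools import product

def cobb_douglas(x, a):
    """u(x) = product of x_i ** a_i, defined on the nonnegative orthant."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))

def log_linear(x, a):
    """Sum of a_i * ln(x_i); only defined for strictly positive bundles."""
    return sum(ai * math.log(xi) for xi, ai in zip(x, a))

a = (0.3, 0.7)  # hypothetical exponents
bundles = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (0.5, 4.0)]

same_ranking = all(
    (cobb_douglas(x, a) >= cobb_douglas(y, a))
    == (log_linear(x, a) >= log_linear(y, a))
    for x, y in product(bundles, repeat=2)
)
print(same_ranking)
print(cobb_douglas((0.0, 5.0), a))
```

The last line illustrates Step 2: a bundle on the boundary, with at least one coordinate equal to zero, gets utility zero, just like the zero bundle, whereas its logarithmic form is undefined.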

Again, most assumptions are familiar. The homotheticity requirement says that rescaling of specific coordinates does not affect preferences.

3. Choice

3.1. Existence of most preferred elements

Hitherto, we discussed how microeconomists usually model what economic agents want. The obvious next step is to consider what they actually do. The rationality paradigm underlying classical microeconomic theory requires that, given (1) a set of mutually exclusive alternatives and (2) a nicely behaved preference relation or utility function over the alternatives, the agent will choose a most preferred alternative. This sounds pretty obvious, but an abundance of economic terminology sometimes blurs the picture: most of traditional microeconomics is plain and simple constrained optimization.

This raises the question: when do most preferred alternatives exist? This is not straightforward: if you have strongly monotonic preferences over apples and face no consumption constraints whatsoever, there is no optimal amount of apples. Here is a very general existence result:

Proposition 3.1 Assume:

• ≿ is a weak order on a set X;

• ≿ is upper semicontinuous: for all x ∈ X, the lower contour set L(x) = {y ∈ X | y ≺ x} is open;

• Y is a nonempty, compact subset of X.

Then Y contains a most preferred element:

∃y∗ ∈ Y : y∗ ≿ y for all y ∈ Y.

Proof. Suppose not: for every y ∈ Y there is a y′ ∈ Y with y′ ≻ y. Then the lower contour sets {L(y) : y ∈ Y} are an open covering of the compact set Y. By compactness, there is a finite subcovering, i.e., a finite subset Y′ ⊆ Y such that {L(y′) : y′ ∈ Y′} covers Y. Since Y′ is finite, it contains a most preferred element y∗. But then L(y∗) covers Y, i.e., y∗ is a best element of Y, contradicting our assumption. □

Application to consumer model: Let X = RL+. Suppose a consumer has a continuous (orupper semicontinuous) weak order % on X re�ecting his preferences and an amount of moneyw > 0 in his pocket (w for �wealth�). Suppose the price vector is p ∈ RL++. The budget setB(p, w) at prices p and wealth w consists of all a�ordable feasible commodity bundles:

B(p, w) = {x ∈ RL+ | p · x ≤ w}.

This set is:

• nonempty: it contains the zero vector,

• closed: it is the intersection of finitely many closed halfspaces:⁵

B(p, w) = ∩_{i=1}^{L} {x ∈ R^L | x_i ≥ 0} ∩ {x ∈ R^L | p · x ≤ w}. (10)

• bounded: 0 ≤ x_i ≤ w/p_i for all commodities i,

• compact: it is a closed and bounded subset of R^L and therefore compact by the Heine-Borel theorem,

• convex: by (10), it is the intersection of convex halfspaces.

⁵ Recall that a halfspace in R^n is a set of the type {x ∈ R^n : a · x ≤ c} or {x ∈ R^n : a · x ≥ c}, where a ∈ R^n, a ≠ 0, and c ∈ R.

Since B(p, w) is nonempty and compact and ≿ is assumed to be an upper semicontinuous weak order, the budget set contains at least one most preferred alternative.
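As a small numerical companion to the properties listed above, the following Python sketch tests membership in B(p, w) and computes the coordinate bounds w/p_i. The function names, the tolerance, and the sample prices are illustrative assumptions and do not come from the text.

```python
from typing import List, Sequence


def in_budget_set(x: Sequence[float], p: Sequence[float], w: float,
                  tol: float = 1e-12) -> bool:
    """True if x >= 0 componentwise and p . x <= w (up to a small tolerance)."""
    nonnegative = all(xi >= 0 for xi in x)
    cost = sum(pi * xi for pi, xi in zip(p, x))
    return nonnegative and cost <= w + tol


def coordinate_bounds(p: Sequence[float], w: float) -> List[float]:
    """Upper bounds w / p_i that show B(p, w) is bounded."""
    return [w / pi for pi in p]


# Hypothetical sample data: two goods, prices (2, 1), wealth 2.
p, w = (2.0, 1.0), 2.0
x, y = (1.0, 0.0), (0.0, 2.0)  # two affordable bundles
midpoint = tuple(0.5 * xi + 0.5 * yi for xi, yi in zip(x, y))

print(in_budget_set((0.0, 0.0), p, w))  # nonempty: the zero vector is affordable
print(in_budget_set(midpoint, p, w))    # convexity: the midpoint stays affordable
print(coordinate_bounds(p, w))          # bounds 0 <= x_i <= w / p_i
```

The check on the midpoint only illustrates convexity for one pair of bundles; it is not a proof.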

Exercise 3.1 A decision maker has lexicographic preferences ≿ over R^2:

(x_1, x_2) ≿ (y_1, y_2) ⇔ x_1 > y_1 or (x_1 = y_1 and x_2 ≥ y_2).

(a) Is ≿ upper semicontinuous?

(b) Does each nonempty, compact subset Y ⊂ R^2 contain a most preferred element?

3.2. Revealed preference

Rather than going from preferences to choices, this subsection, based on Arrow (1959), tries to move in the opposite direction: can we, under suitable assumptions, explain observed choices by constructing a preference relation that makes such choices rational? Formally, a choice structure is a tuple (X, ℬ, C), where

• X is a nonempty set of alternatives.

• ℬ is a nonempty collection of choice sets. Each element of ℬ is a nonempty subset B ⊆ X, interpreted as a potential problem for a decision-maker: 'Please choose from B.'

• C is a choice rule, assigning to each choice set B ∈ ℬ a nonempty set C(B) ⊆ B, interpreted as those elements from B that the decision maker finds acceptable.

The choice structure (X, ℬ, C) is rationalizable if there is a weak order ≿ on X such that for each choice set B ∈ ℬ, the associated choices C(B) are the most preferred ones under ≿:

∀B ∈ ℬ : C(B) = {x ∈ B | x ≿ y for all y ∈ B}. (11)

Consider two properties one might expect from revealed preferences:

Weak axiom of revealed preference (WARP) The choice structure (X, ℬ, C) satisfies WARP if

∀A, B ∈ ℬ, ∀x, y ∈ A ∩ B : if x ∈ C(A), y ∈ C(B), then x ∈ C(B).

The idea behind WARP is this: in both choice problems A and B, alternatives x and y are available. If x ∈ C(A), this reveals x to be at least as good as y; otherwise x wouldn't be acceptable. Similarly, if y ∈ C(B), then y must be at least as good as x. But then x and y ought to be equivalent and you should find x acceptable also in B.

Independence of irrelevant alternatives (IIA) The choice structure (X, ℬ, C) satisfies IIA if

∀A, B ∈ ℬ : if A ⊆ B and C(B) ∩ A ≠ ∅, then C(A) = C(B) ∩ A.

Intuitively, suppose that some items on "menu" B are not feasible after all and choice is restricted to A. If A still contains some acceptable elements from B, choice should remain unaffected: an element is acceptable in the smaller set A if and only if it was acceptable in the larger set B.
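For a finite choice structure, both axioms can be checked by brute force. The sketch below is one possible encoding, not part of the text: a choice rule is assumed to be a dictionary mapping each choice set (a frozenset) to its set of acceptable elements, and the two example rules `best` and `odd` are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(X):
    """All nonempty subsets of X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def satisfies_warp(choice):
    """WARP: if x in C(A), y in C(B) and x, y in A ∩ B, then x in C(B)."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            # (CB & A): some chosen y of B lies in A; (CA & B): chosen x of A in B.
            if (CB & A) and not (CA & B) <= CB:
                return False
    return True


def satisfies_iia(choice):
    """IIA: if A ⊆ B and C(B) ∩ A is nonempty, then C(A) = C(B) ∩ A."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            if A <= B and (CB & A) and CA != (CB & A):
                return False
    return True


X = {1, 2, 3}
# Hypothetical rule 1: always choose the largest number in the set.
best = {B: frozenset({max(B)}) for B in nonempty_subsets(X)}
# Hypothetical rule 2: same, except that 1 is chosen from the full set.
odd = dict(best)
odd[frozenset({1, 2, 3})] = frozenset({1})

print(satisfies_warp(best), satisfies_iia(best))
print(satisfies_warp(odd), satisfies_iia(odd))
```

The second rule fails both axioms: 1 is chosen from {1, 2, 3}, yet 3 is chosen from the smaller set {1, 3} that still contains 1.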


Proposition 3.2 Consider a choice structure (X, ℬ, C).

(a) If it satisfies WARP, then it satisfies IIA.

(b) If it satisfies IIA and all choice sets with at most three elements are contained in ℬ, then (X, ℬ, C) is rationalizable.

Proof. (a): Assume WARP holds. Let A, B be as in the definition of IIA. Let a ∈ C(A) and b ∈ C(B) ∩ A. To show: a ∈ C(B) ∩ A, b ∈ C(A). Since C(A) ⊆ A ⊆ B, we have

a, b ∈ A ∩ B, a ∈ C(A), b ∈ C(B).

By WARP, a ∈ C(B), b ∈ C(A).

(b): For all x, y ∈ X, the set {x, y} lies in ℬ by the assumption on ℬ. Hence, we may define x ≿ y if x ∈ C({x, y}). We need to check three things:

[≿ is complete:] Let x, y ∈ X. By nonemptiness, either x ∈ C({x, y}) or y ∈ C({x, y}), i.e., x ≿ y or y ≿ x.

[≿ is transitive:] Let x, y, z ∈ X and assume that x ≿ y and y ≿ z. By definition of ≿: x ∈ C({x, y}) and y ∈ C({y, z}). To show: x ≿ z, i.e., x ∈ C({x, z}).

If x = y or y = z, this follows immediately. If x = z, then x ≿ z is the same as x ≿ x, which follows from completeness. So let x, y, z be distinct and consider the set {x, y, z} ∈ ℬ. It suffices to show that x ∈ C({x, y, z}), because then x ∈ C({x, z}) by IIA.

Suppose, to the contrary, that x ∉ C({x, y, z}). By nonemptiness of C, C({x, y, z}) ∩ {y, z} ≠ ∅. By IIA and y ≿ z: y ∈ C({y, z}) = C({x, y, z}) ∩ {y, z}. So C({x, y, z}) ∩ {x, y} ≠ ∅. By IIA and x ≿ y: x ∈ C({x, y}) = C({x, y, z}) ∩ {x, y}, contradicting the assumption that x ∉ C({x, y, z}).

[≿ rationalizes (X, ℬ, C):] To show that (11) holds, let B ∈ ℬ.

Firstly, let z ∈ C(B). To show: z ≿ y for all y ∈ B. So let y ∈ B. Then {y, z} ∈ ℬ, {y, z} ⊆ B, and z ∈ C(B) ∩ {y, z} ≠ ∅. By IIA, z ∈ C({y, z}). So z ≿ y.

Secondly, let z ∈ B satisfy z ≿ y for all y ∈ B. To show: z ∈ C(B). By nonemptiness, there is a y ∈ C(B). Then {y, z} ∈ ℬ, {y, z} ⊆ B, and y ∈ C(B) ∩ {y, z} ≠ ∅. By z ≿ y and IIA: z ∈ C({y, z}) = C(B) ∩ {y, z}, so z ∈ C(B). □
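The construction in part (b) can be replayed on a small finite example. The sketch below assumes the same dictionary encoding of a choice rule (choice set to chosen set) and a hypothetical value function with one tie; it builds the relation from the two-element choice sets and then checks completeness, transitivity and condition (11). All names are illustrative.

```python
from itertools import combinations


def nonempty_subsets(X):
    """All nonempty subsets of X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def revealed_relation(X, choice):
    """Pairs (x, y) with x in C({x, y}); read as 'x is at least as good as y'."""
    return {(x, y) for x in X for y in X if x in choice[frozenset({x, y})]}


def is_complete(X, R):
    return all((x, y) in R or (y, x) in R for x in X for y in X)


def is_transitive(X, R):
    return all((x, z) in R
               for x in X for y in X for z in X
               if (x, y) in R and (y, z) in R)


def rationalizes(choice, R):
    """Condition (11): C(B) is exactly the set of R-best elements of each B."""
    return all(CB == frozenset(x for x in B if all((x, y) in R for y in B))
               for B, CB in choice.items())


# Hypothetical example: choices maximize a value function in which a and b tie.
value = {"a": 2, "b": 2, "c": 1}
X = set(value)
choice = {B: frozenset(x for x in B if value[x] == max(value[y] for y in B))
          for B in nonempty_subsets(X)}

R = revealed_relation(X, choice)
print(is_complete(X, R), is_transitive(X, R), rationalizes(choice, R))
```

Because this rule comes from maximizing a value function, it satisfies IIA, so the proposition predicts that all three checks succeed.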

Exercise 3.3 investigates the other relations between rationalizability, WARP, and IIA.

3.3. Exercises

Exercise 3.2 Weierstrass' Maximum Theorem: Use Proposition 3.1 to prove that a continuous function f : X → R on a nonempty, compact set X achieves a maximum and a minimum.

Exercise 3.3

(a) Show that if (X, ℬ, C) is rationalizable, it satisfies WARP.

(b) Does IIA imply WARP?

(c) Can the restriction on ℬ in Proposition 3.2 be omitted?

(d) Does WARP imply rationalizability?


Exercise 3.4 Let X = {1, 2, . . . , n} for some n ∈ N, n ≥ 3, and let ℬ consist of all nonempty subsets of X. For each of the following choice rules C, prove whether the choice structure (X, ℬ, C) satisfies WARP and/or IIA. If possible, construct a weak order ≿ rationalizing it.

(a) Satisficing (Simon, 1955): A function v : X → R assigns to each alternative x ∈ X a value v(x) ∈ R. Those with a value at or above a given threshold r ∈ R are deemed 'satisfactory'. For each B ∈ ℬ, the choice C(B) is defined as follows: go through the elements of B in increasing order and choose the first satisfactory one. If no such element exists, choose the final (i.e., largest) element of B.

(b) Madly in love: Assume your partner has a weak order ≿ on X in which no two distinct elements are equivalent. For each choice set B ∈ ℬ with two or more elements, you politely abstain from choosing your partner's favorite: C(B) = {x ∈ B | ∃y ∈ B : y ≻ x}.

Exercise 3.5 A taste for precious metals: A consumer faces two luxury goods, the first is gold, the second platinum, and spends the entire wealth on the good with the highest price. If prices are equal, half of the wealth is spent on each good. To investigate the rationality of such behavior, consider a choice structure (X, ℬ, C), where X = R^2_+, the commodity space, and ℬ consists of two choice sets: B_1 = B((2, 1), 2), the budget set at prices p = (2, 1) and wealth w = 2, and B_2 = B((1, 2), 2).

(a) Draw the choice sets B_1 and B_2 in the same figure. Given the assumptions above, find C(B_1) and C(B_2) and also draw these in your figure.

(b) Does the choice structure (X, ℬ, C) satisfy IIA?

(c) Does the choice structure (X, ℬ, C) satisfy WARP?

(d) Is the choice structure (X, ℬ, C) rationalizable?

Economic models of luxury goods often allow price-dependent preferences.

(e) Give an example of a utility function depending both on the commodity bundle x and the price vector p, denoted u(x, p), that makes the consumer's behavior utility maximizing for every (p, w) ∈ R^3_{++}.


4. Choices of a consumer: classical demand theory

4.1. The preference/utility maximization problem

Section 3.1 set the stage for the classical model of consumer behavior. This model consists of a specification of: (i) what the consumer wants: a preference relation or utility function; (ii) what the consumer finds feasible: a budget set indicating the commodity bundles that he can choose from; (iii) what the consumer, putting these two together, finds the most preferable commodity bundles. Formally:

• there are L ∈ N commodities that can be consumed in nonnegative quantities, so the commodity space is X = R^L_+;

• a price vector p ∈ R^L_{++} assigns to each commodity i ∈ {1, . . . , L} a price p_i > 0;

• the consumer has a given income/'wealth' w > 0, i.e., an amount of money to spend on buying a commodity bundle;

• the consumer has a preference relation ≿ on X or even a utility function u : X → R representing these preferences.

Typically, no additional restrictions are imposed on consumption, so the budget set

B(p, w) = {x ∈ R^L_+ : p · x ≤ w}

specifies the commodity bundles the consumer can afford. At this stage, it would be a good idea to look back at Section 3.1 to recapitulate some properties of this budget set. The consumer solves the following preference maximization problem (≿-MP):

≿-MP: Find the set of most preferable commodity bundles according to ≿ in the budget set B(p, w).

Given utility function u, this yields the utility maximization problem (UMP):

UMP: Solve max u(x) s.t. x ∈ B(p, w).

It is common economic practice to assign special names to the set of solutions and, in case a utility function is given, the corresponding optimal value of such optimization problems. The (Walrasian) demand correspondence assigns to each price vector p ∈ R^L_{++} and wealth w > 0 the associated set x(p, w) of optimal commodity bundles:

x(p, w) = {x ∈ B(p, w) : x ≿ y for all y ∈ B(p, w)}
        = {x ∈ B(p, w) : u(x) = max_{y ∈ B(p,w)} u(y)}.

Given a utility function u, the indirect utility function v : R^{L+1}_{++} → R assigns to each price vector p ∈ R^L_{++} and wealth w > 0 the maximal utility the consumer can achieve. It is easy to compute:

v(p, w) = u(x∗), where x∗ ∈ x(p, w),

is the utility of an arbitrary vector in the demand at (p, w). This is independent of the particular choice of x∗ ∈ x(p, w): since all such vectors are utility maximizers, their utility is the same.


Remark 4.1 If the utility function u is a C^1-function (its partial derivatives exist and are continuous on an open set containing X), the UMP

max u(x)
s.t. p · x ≤ w,
     x_1 ≥ 0, . . . , x_L ≥ 0,

is usually solved using the associated Kuhn-Tucker conditions. ◁

Remark 4.2 If the Walrasian demand correspondence is single-valued, i.e., if x(p, w) consists of a single element for each (p, w) ∈ R^{L+1}_{++}, it is common to treat demand as a function, rather than a correspondence. ◁

Let us conclude this subsection with an example involving a well-known type of utility function.

Leontiev utility: Baking your favorite cake requires fixed proportions of its L ≥ 2 ingredients: one unit of cake takes a vector (a_1, . . . , a_L) ∈ R^L_{++} of ingredients. Given ingredient vector x ∈ R^L_+, how much cake can you produce? Well, looking at the i-th ingredient, your guess will be at most x_i/a_i units. What constrains you are those ingredients i where this fraction is the smallest. Therefore, a suitable utility function would be

u(x) = min{x_1/a_1, . . . , x_L/a_L}, (12)

specifying how many units of cake you can make from x. This utility function is not differentiable, so the Kuhn-Tucker conditions are not applicable.
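Formula (12) translates directly into code. The following minimal sketch uses a hypothetical recipe vector a; the function name and the sample numbers are assumptions chosen for illustration.

```python
def leontiev_utility(x, a):
    """Units of cake obtainable from ingredients x when one unit needs a, as in (12)."""
    if len(x) != len(a):
        raise ValueError("x and a must have the same length")
    return min(xi / ai for xi, ai in zip(x, a))


# Hypothetical recipe: one cake needs 2, 1 and 4 units of the three ingredients.
a = (2.0, 1.0, 4.0)
print(leontiev_utility((4.0, 3.0, 4.0), a))   # 1.0: the third ingredient binds
print(leontiev_utility((40.0, 3.0, 4.0), a))  # 1.0: extra first ingredient is wasted
```

The second call shows why the utility function is monotonic but not strongly monotonic: more of a non-binding ingredient does not produce more cake.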

Exercise 4.1 Check that the associated preference relation is continuous, monotonic (but not strongly), convex (but not strictly), and homothetic.

Let prices and wealth be (p, w) ∈ R^{L+1}_{++}. Since preferences are continuous and the budget set B(p, w) is nonempty and compact, there is at least one solution to the UMP (see Section 3.1): x(p, w) ≠ ∅. Let's compute it. Firstly, if x∗ solves the UMP, it must be that

x∗_1/a_1 = · · · = x∗_L/a_L. (13)

Why? Well, suppose this were not true: min{x∗_1/a_1, . . . , x∗_L/a_L} < max{x∗_1/a_1, . . . , x∗_L/a_L}. Then you're using the ingredients in the wrong proportions: you can only make u(x∗) = min{x∗_1/a_1, . . . , x∗_L/a_L} units of cake, but there are commodities i where you have enough for x∗_i/a_i = max{x∗_1/a_1, . . . , x∗_L/a_L} units, an utter waste. If you were to trade a small amount of these wasted ingredients for the non-wasted ones, you would still be in your budget set, but able to make more cake. Hurray!

Secondly, preferences are monotonic, so you will use your entire budget on ingredients: p · x∗ = w. Writing t for the common ratio in (13), so that x∗_i = a_i t for all i, the budget equation becomes t Σ_{i=1}^{L} a_i p_i = w. Hence there is a unique solution to the UMP at (p, w), namely

\[ x^* = \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right). \]


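By Remark 4.2, it is common to write this result down as a demand function:

\[ \forall (p, w) \in \mathbb{R}^{L+1}_{++} : \quad x(p, w) = \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right), \]

instead of a single-valued demand correspondence:

\[ \forall (p, w) \in \mathbb{R}^{L+1}_{++} : \quad x(p, w) = \left\{ \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right) \right\}. \]

Substituting the demand vector in the utility function, we find the indirect utility function:

\[ \forall (p, w) \in \mathbb{R}^{L+1}_{++} : \quad v(p, w) = u\left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right) = \frac{w}{\sum_{i=1}^{L} a_i p_i}. \]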
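The closed forms for Leontiev demand and indirect utility are easy to check numerically. The sketch below uses hypothetical values for a, p and w, and verifies that the whole budget is spent, that the demanded bundle has equal ratios x_i/a_i, and that rescaling prices and wealth leaves demand unchanged. Function names are illustrative.

```python
def weighted_price(p, a):
    """The sum of a_i * p_i: the cost of the ingredients for one unit of cake."""
    return sum(ai * pi for ai, pi in zip(a, p))


def leontiev_demand(p, w, a):
    """Walrasian demand x_i = a_i * w / sum_j a_j p_j."""
    s = weighted_price(p, a)
    return [ai * w / s for ai in a]


def leontiev_indirect_utility(p, w, a):
    """Indirect utility v(p, w) = w / sum_j a_j p_j."""
    return w / weighted_price(p, a)


# Hypothetical data: two ingredients.
a, p, w = (1.0, 2.0), (3.0, 1.0), 10.0
x = leontiev_demand(p, w, a)
spent = sum(pi * xi for pi, xi in zip(p, x))
utility_at_x = min(xi / ai for xi, ai in zip(x, a))

print(x)                                   # demanded bundle
print(spent)                               # equals w: all money is spent
print(utility_at_x, leontiev_indirect_utility(p, w, a))
print(leontiev_demand([2 * pi for pi in p], 2 * w, a))  # same bundle as x
```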

Exercise 4.2 Our definition of the budget set is standard, but other realistic restrictions can be modeled just as easily. In the commodity space X = R^2_+, let the price vector be p = (8, 4). The consumer has wealth w = 40 and an upper semicontinuous weak order ≿ on X. In each of the following cases separately, specify the budget set given the additional information. Does the new budget set necessarily contain at least one most preferred bundle?

(a) Indivisibilities: The commodities cannot be cut into ever smaller pieces. Only integer quantities are feasible.

(b) Rationing: The consumer is not allowed to buy more than three units of the first commodity.

(c) Rebates 1: If the consumer buys more than five units of the second commodity, these additional units in excess of the first five have a lower price, namely two.

(d) Rebates 2: If the consumer buys more than five units of the second commodity, the price of this commodity (also the first five units) is decreased to two.

(e) Initial endowment: Instead of having wealth w, suppose the consumer has an initial endowment ω = (1, 1) of one unit of both commodities. He can sell (parts of) his initial endowment to generate income to purchase other commodity bundles.

(f) Package deal: The consumer has to buy the same quantity of both commodities.

(g) Gift certificate: The consumer has received a gift certificate of one monetary unit, which he can spend in its entirety on commodity one.

4.2. Properties of the demand correspondence and indirect utility

Section 1 listed a lot of properties that can be imposed on the consumer's preferences. The next result indicates the consequences of such restrictions on the demand correspondence.

Proposition 4.3 Let X = R^L_+ for some L ∈ N and let ≿ be a weak order on X. The Walrasian demand correspondence has the following properties:

(a) If ≿ is upper semicontinuous, then x(p, w) is nonempty for all (p, w) ∈ R^{L+1}_{++}.

(b) If ≿ is continuous, the Walrasian demand correspondence has a closed graph: for each sequence (p_n, w_n, x_n)_{n∈N} in R^{L+1}_{++} × X with limit (p, w, x) ∈ R^{L+1}_{++} × X: if x_n ∈ x(p_n, w_n) for all n ∈ N, then also x ∈ x(p, w).

(c) Homogeneity of degree zero: ∀(p, w) ∈ R^{L+1}_{++}, ∀α > 0 : x(αp, αw) = x(p, w).

(d) If ≿ is convex, or equivalently, if u is quasiconcave, then x(p, w) is a convex set for all (p, w) ∈ R^{L+1}_{++}.

(e) If ≿ is strictly convex, or equivalently, if u is strictly quasiconcave, then x(p, w) contains at most one element for all (p, w) ∈ R^{L+1}_{++}.

(f) Walras' law: "All money is spent": If ≿ is locally nonsatiated, then p · x = w for all (p, w) ∈ R^{L+1}_{++} and x ∈ x(p, w).

Proof. (a): See Section 3.1.

(b): Let (p_n, w_n, x_n)_{n∈N} be a sequence in R^{L+1}_{++} × X with limit (p, w, x) ∈ R^{L+1}_{++} × X. Assume that x_n ∈ x(p_n, w_n) for all n ∈ N. To show: x ∈ x(p, w).

Firstly, x ∈ X and p_n · x_n ≤ w_n, so taking limits: p · x ≤ w. Conclude that x ∈ B(p, w).

Suppose that x ∉ x(p, w): there is a y ∈ B(p, w) with y ≻ x. By continuity of ≿ and Proposition 1.2, there are neighborhoods U_x of x and U_y of y such that y′ ≻ x′ for all (x′, y′) ∈ U_x × U_y.

Choose y′ ∈ U_y with p · y′ < w. This is possible: y ∈ B(p, w) implies that p · y ≤ w. In case of strict inequality, take y′ = y. In case of equality, small decreases in the positive coordinates of y will give the desired y′.

As (p_n, w_n) → (p, w), it follows that p_n · y′ ≤ w_n for n sufficiently large, so y′ ∈ B(p_n, w_n) ∩ U_y. As x_n → x, x_n ∈ U_x for n sufficiently large. Hence, for large n, x_n ∈ U_x and y′ ∈ B(p_n, w_n) ∩ U_y. But then y′ ≻ x_n, contradicting that x_n was optimal at prices p_n and wealth w_n.

(c): Since B(αp, αw) = {x ∈ R^L_+ : (αp) · x ≤ αw} = {x ∈ R^L_+ : p · x ≤ w} = B(p, w), the ≿-MP has the same domain before and after rescaling and therefore the same set of solutions.

(d): Assume ≿ is convex. If x(p, w) = ∅, it is convex. If x(p, w) ≠ ∅, let x∗ ∈ x(p, w). Then x(p, w) = B(p, w) ∩ {x ∈ X : x ≿ x∗} is the intersection of two convex sets and therefore convex.

(e): Assume ≿ is strictly convex. Suppose there are x, y ∈ x(p, w), x ≠ y. Then (1/2)x + (1/2)y lies in B(p, w) by convexity of B(p, w). By strict convexity of ≿, this bundle is strictly better than x and y, contradicting that these were most preferred bundles in B(p, w).

(f): Assume ≿ is locally nonsatiated. Let x ∈ x(p, w). Then p · x ≤ w, since x ∈ B(p, w). Suppose p · x < w. For ε > 0 sufficiently small, the entire neighborhood {y ∈ X : ‖x − y‖ < ε} is contained in the budget set. By local nonsatiation, this neighborhood contains a point y with y ≻ x, contradicting that x is a most preferred bundle in the budget set. □

An important consequence of the closed-graph property is that if Walrasian demand is single-valued, the Walrasian demand function is continuous!

Exercise 4.3 If ≿ is homothetic, then. . . what can you conclude about Walrasian demand?

To formulate properties of indirect utility, we will need to assume (Surprise!) that preferences are represented by means of a utility function and that the demand correspondence is nonempty-valued: otherwise, indirect utility is undefined.

Proposition 4.4 Assume:

• X = R^L_+ for some L ∈ N;

• The consumer's preference relation ≿ is represented by utility function u : X → R;

• Walrasian demand is nonempty-valued: ∀(p, w) ∈ R^{L+1}_{++}, x(p, w) ≠ ∅.

Then the indirect utility function has the following properties:

(a) Homogeneity of degree zero: ∀(p, w) ∈ R^{L+1}_{++}, ∀α > 0: v(αp, αw) = v(p, w).

(b) For each commodity i, v is nonincreasing in the price of i (higher prices cannot make you better off).

(c) v is nondecreasing in wealth; if ≿ is locally nonsatiated, v is even strictly increasing in wealth.

(d) v is quasiconvex: ∀r ∈ R : {(p, w) ∈ R^{L+1}_{++} : v(p, w) ≤ r} is a convex set.

(e) If ≿ is represented by a continuous utility function u, then v is continuous.

Proof. (a): Follows from Proposition 4.3(c).

(b): Let (p, w) ∈ R^{L+1}_{++} and let i ∈ {1, . . . , L} be a commodity. Let p′ be obtained from p by a strict increase in the price of commodity i. Then B(p′, w) ⊂ B(p, w), so

v(p′, w) = max_{y ∈ B(p′,w)} u(y) ≤ max_{y ∈ B(p,w)} u(y) = v(p, w),

since the second maximum is taken over a larger set.

(c): The nondecreasing part is similar to (b), so we will only do the strictly-increasing part. Assume ≿ is locally nonsatiated. Let p ∈ R^L_{++} and 0 < w < w′. To show: v(p, w) < v(p, w′).

Let x ∈ x(p, w). Then x ∈ B(p, w), so p · x ≤ w < w′. Since p · x < w′, for ε > 0 sufficiently small, the entire neighborhood {y ∈ X : ‖x − y‖ < ε} is contained in the budget set B(p, w′). By local nonsatiation, this neighborhood contains a point y with y ≻ x. Conclude that

v(p, w) = u(x) < u(y) ≤ max_{z ∈ B(p,w′)} u(z) = v(p, w′).

(d): Let r ∈ R. If {(p, w) ∈ R^{L+1}_{++} : v(p, w) ≤ r} = ∅, it is convex. If it is nonempty, let (p, w), (p′, w′) lie in this set and let α ∈ [0, 1]. Write (p′′, w′′) = α(p, w) + (1 − α)(p′, w′). To show: v(p′′, w′′) ≤ r, i.e., u(x) ≤ r for all x ∈ B(p′′, w′′).

Let x ∈ B(p′′, w′′). Then x ∈ R^L_+ and α(p · x) + (1 − α)(p′ · x) ≤ αw + (1 − α)w′. Therefore, p · x ≤ w or p′ · x ≤ w′ (or both). W.l.o.g., p · x ≤ w. Then x ∈ B(p, w), so u(x) ≤ v(p, w) ≤ r.

(e): Follows from Proposition 4.3(b). □
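Quasiconvexity of v is equivalent to v(α(p, w) + (1 − α)(p′, w′)) ≤ max{v(p, w), v(p′, w′)}, which can be spot-checked numerically. The sketch below does so for the Leontiev indirect utility v(p, w) = w / Σ a_i p_i derived in Section 4.1. The sampling ranges, the seed and the function names are arbitrary assumptions, and a random test of this kind illustrates the property without proving it.

```python
import random


def indirect_utility(p, w, a):
    """Leontiev indirect utility v(p, w) = w / sum_i a_i p_i."""
    return w / sum(ai * pi for ai, pi in zip(a, p))


def quasiconvexity_violations(a, trials=1000, seed=0, tol=1e-9):
    """Count sampled pairs where v at a convex combination exceeds the max."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        p1 = [rng.uniform(0.1, 5.0) for _ in a]
        p2 = [rng.uniform(0.1, 5.0) for _ in a]
        w1, w2 = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        t = rng.random()
        p_mix = [t * x + (1 - t) * y for x, y in zip(p1, p2)]
        w_mix = t * w1 + (1 - t) * w2
        bound = max(indirect_utility(p1, w1, a), indirect_utility(p2, w2, a))
        if indirect_utility(p_mix, w_mix, a) > bound + tol:
            violations += 1
    return violations


a = (1.0, 2.0)  # hypothetical recipe vector
print(quasiconvexity_violations(a))                   # expected: no violations
print(indirect_utility((3.0, 1.0), 10.0, a),
      indirect_utility((4.0, 1.0), 10.0, a))          # higher price, lower utility
```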

Exercise 4.4

(a) Proposition 4.4(c) might suggest that also (b) can be strengthened a bit: "If ≿ is locally nonsatiated, indirect utility is strictly decreasing in the price of commodity i." But this is wrong. Why?

(b) Write out the proof of Proposition 4.4(e) in detail.

(c) Why not just write "If ≿ is continuous, v is continuous"?


4.3. The expenditure minimization problem

Consider a consumer with utility function u : R^L_+ → R, prices p ∈ R^L_{++}, and a utility level u ∈ R. What is the minimal amount the consumer has to pay, i.e., the minimal level of wealth needed to reach utility level u? The answer is given by the expenditure minimization problem (EMP):

min p · x
s.t. x ∈ R^L_+,
     u(x) ≥ u.

The Hicksian or compensated demand correspondence assigns to each price vector p ∈ R^L_{++} and each utility level u the associated set h(p, u) of solutions to the EMP:

h(p, u) = {x ∈ R^L_+ : u(x) ≥ u and p · x ≤ p · y for all y ∈ R^L_+ with u(y) ≥ u}.

The Hicksian demand correspondence specifies the set of consumption bundles solving the EMP, the expenditure function e(p, u) indicates its value:

e(p, u) = min_{x ∈ R^L_+, u(x) ≥ u} p · x = p · x∗ for all x∗ ∈ h(p, u).

Similar to our earlier approach to Walrasian demand and indirect utility, one can derive properties of Hicksian demand and the expenditure function. To make the proposition at all sensible, one needs to restrict attention to utility levels that are actually reachable; therefore, let U = {u(x) : x ∈ R^L_+} be the range of the utility function u.

Proposition 4.5 Let X = R^L_+ for some L ∈ N and let u : X → R represent a consumer's weak order ≿. The Hicksian demand correspondence has the following properties:

(a) If ≿ is upper semicontinuous, then h(p, u) is nonempty for all (p, u) ∈ R^L_{++} × U.

(b) Homogeneity of degree zero in prices: ∀(p, u) ∈ R^L_{++} × U, ∀α > 0 : h(αp, u) = h(p, u).

(c) If ≿ is convex, or equivalently, if utility is quasiconcave, then h(p, u) is a convex set for all (p, u) ∈ R^L_{++} × U.

(d) If utility is continuous and ≿ is strictly convex, or equivalently, if utility is strictly quasiconcave, then h(p, u) contains at most one element for all (p, u) ∈ R^L_{++} × U.

(e) "No excess utility": If utility is continuous, then u(x) = u for all (p, u) ∈ R^L_{++} × U with⁶ u ≥ u(0, . . . , 0) and all x ∈ h(p, u).

(f) Compensated law of demand: let p′, p′′ ∈ R^L_{++} and u ∈ U. If x′ ∈ h(p′, u) and x′′ ∈ h(p′′, u), then (p′ − p′′) · (x′ − x′′) ≤ 0.

⁶ Why this restriction? Well, suppose that u < u(0, . . . , 0). Since p · x ≥ 0 for all x ∈ R^L_+, it follows that h(p, u) = {(0, . . . , 0)}: the expenditure-minimizing bundle does not have utility exactly u, because the zero vector, with higher utility, is the cheapest option. Under suitable monotonicity restrictions, however, this will turn out to be an exotic case: the zero vector will often give you the lowest utility in R^L_+, so that this footnote becomes irrelevant.


Proof. (a): Let (p, u) ∈ R^L_{++} × U. By feasibility, u(y) = u for some y ∈ X. By upper semicontinuity of preferences, the set {x ∈ X : u(x) ≥ u} = {x ∈ X : x ≿ y} is closed. Therefore, the solution of the EMP lies in the nonempty set {x ∈ R^L_+ : u(x) ≥ u} ∩ {x ∈ R^L_+ : p · x ≤ p · y}, which is the intersection of a closed and a compact set and therefore compact. The goal function x ↦ p · x is continuous. A continuous function on a nonempty, compact set achieves a minimum; see Section 3.1 and Exercise 3.2.

(b): Minimizing x ↦ (αp) · x gives the same solutions as minimizing x ↦ p · x.

(c): Let (p, u) ∈ R^L_{++} × U. If h(p, u) = ∅, it is convex. If h(p, u) ≠ ∅, let y ∈ h(p, u). By definition,

h(p, u) = {x ∈ R^L_+ : u(x) ≥ u} ∩ {x ∈ R^L_+ : p · x ≤ p · y}

is the intersection of convex sets, hence convex.

(d): The result is true if u ≤ u(0, . . . , 0), as h(p, u) = {(0, . . . , 0)} in those cases. So let u > u(0, . . . , 0). Suppose h(p, u) contains two distinct alternatives, x, x′. By strict convexity, (x + x′)/2 is strictly better, yet causes the same expenses. As (x + x′)/2 ≻ x ≻ (0, . . . , 0), (x + x′)/2 ≠ (0, . . . , 0): some of its coordinates are positive. By continuity, slight decreases in these coordinates still yield alternatives at least as good as x, i.e., they remain feasible in the EMP at (p, u), but cheaper than x, a contradiction.

(e): Assume the utility function is continuous. Let (p, u) ∈ R^L_{++} × U with u ≥ u(0, . . . , 0) and x ∈ h(p, u).

If u = u(0, . . . , 0), then h(p, u) = {(0, . . . , 0)}, so the result is true: u(x) = u(0, . . . , 0) = u. Next, let u > u(0, . . . , 0). Suppose u(x) > u. Then x ≠ (0, . . . , 0), so that at least some coordinates of x exceed zero. By continuity, lim_{α→1} u(αx) = u(x) > u, so u(αx) > u for α ∈ (0, 1) close to one. But p · (αx) = α(p · x) < p · x, contradicting that x ∈ h(p, u).

(f): Since x′ is optimal and x′′ feasible in the EMP at (p′, u), it follows that

p′ · x′ ≤ p′ · x′′.

Similarly,

p′′ · x′′ ≤ p′′ · x′.

Adding these inequalities and rewriting gives the compensated law of demand. □
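The compensated law of demand can be illustrated numerically. For the Leontiev example Hicksian demand does not depend on prices, so the inequality holds trivially; the sketch below therefore uses a Cobb-Douglas utility u(x) = x_1^α x_2^(1−α), which is an assumption brought in for illustration and is not taken from this section. Its Hicksian demand follows from the tangency condition x_1/x_2 = αp_2/((1 − α)p_1) together with u(x) = u. All names and sample values are hypothetical.

```python
import random


def cobb_douglas_hicksian(p, u, alpha):
    """Hicksian demand for u(x) = x1**alpha * x2**(1 - alpha), utility level u > 0."""
    p1, p2 = p
    ratio = alpha * p2 / ((1.0 - alpha) * p1)  # optimal x1 / x2
    x1 = u * ratio ** (1.0 - alpha)
    x2 = u * ratio ** (-alpha)
    return (x1, x2)


def law_of_demand_gap(p_a, p_b, u, alpha):
    """(p' - p'') . (x' - x''), which the proposition says is at most zero."""
    x_a = cobb_douglas_hicksian(p_a, u, alpha)
    x_b = cobb_douglas_hicksian(p_b, u, alpha)
    return sum((pa - pb) * (xa - xb)
               for pa, pb, xa, xb in zip(p_a, p_b, x_a, x_b))


rng = random.Random(1)
gaps = []
for _ in range(500):
    p_a = (rng.uniform(0.5, 5.0), rng.uniform(0.5, 5.0))
    p_b = (rng.uniform(0.5, 5.0), rng.uniform(0.5, 5.0))
    gaps.append(law_of_demand_gap(p_a, p_b, u=1.5, alpha=0.3))

print(max(gaps) <= 1e-9)  # no sampled price pair violates the inequality
```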

If h is single-valued, we will treat it as a function, rather than a correspondence, just as we did for Walrasian demand (see Remark 4.2). The compensated law of demand implies that if you raise the price of one of the goods, then the Hicksian demand for this good will not increase.

The next proposition states some properties of the expenditure function. Given the similarity with earlier results, proofs are left as an exercise.

Proposition 4.6 Assume:

• X = R^L_+ for some L ∈ N;

• The consumer's preference relation ≿ is represented by utility function u : X → R;

• Hicksian demand is nonempty-valued: ∀(p, u) ∈ R^L_{++} × U : h(p, u) ≠ ∅.

Then the expenditure function e : R^L_{++} × U → R has the following properties:

(a) Homogeneity of degree one in prices: ∀(p, u) ∈ R^L_{++} × U, ∀α > 0 : e(αp, u) = αe(p, u).

(b) Monotonicity in u: If utility is continuous, then for all p ∈ R^L_{++} and all u′, u′′ ∈ U with u(0, . . . , 0) ≤ u′ < u′′:

e(p, u′) < e(p, u′′).

(c) For each commodity i, expenditure is nondecreasing in the price of i.

(d) For all u ∈ U, e(·, u) is concave in p.

Exercise 4.5 Prove this proposition.

Remark 4.7⁷ Establishing continuity properties for Hicksian demand and expenditure is less straightforward than for Walrasian demand and indirect utility. Concave functions are continuous, so Proposition 4.6(d) implies that expenditure is continuous in prices. The utility function u : R_+ → R with u(x) = max{0, x − 1} shows that expenditure is not necessarily continuous in utility levels. Letting p > 0 be the price of the only commodity, one finds

\[ e(p, u) = \begin{cases} 0 & \text{if } u = 0, \\ p(u + 1) & \text{if } u > 0. \end{cases} \]

Since p > 0, e(p, ·) has a discontinuity at u = 0. However, if the utility function is both continuous and locally nonsatiated, continuity of the expenditure function e : R^L_{++} × U → R can be established using a result known as Berge's Maximum Theorem. Contrary to what most textbooks (which do not provide the proof) suggest, the proof is not straightforward. To establish continuity at an arbitrary (p_0, u_0) ∈ R^L_{++} × U, local nonsatiation is used to establish existence of a y ∈ R^L_+ with u(y) > u_0. Next, on a neighborhood of (p_0, u_0), the EMP reduces to minimizing p · x subject to x ∈ {z ∈ R^L_+ : u(z) ≥ u, p · z ≤ p · y}. This final condition assures that the conditions of the Maximum Theorem are satisfied. ◁
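The jump in the one-commodity example of Remark 4.7 can be seen numerically. The sketch below implements the closed form for e(p, u) and, as a sanity check, a crude grid search over quantities. The grid size, the upper bound on x and the function names are assumptions made for illustration.

```python
def expenditure_closed_form(p, u):
    """e(p, u) for u(x) = max(0, x - 1): 0 at u = 0, p * (u + 1) for u > 0."""
    if u < 0:
        raise ValueError("only utility levels in the range U = [0, inf) are allowed")
    return 0.0 if u == 0 else p * (u + 1.0)


def expenditure_on_grid(p, u, x_max=5.0, steps=5000):
    """Cheapest grid point x in [0, x_max] with max(0, x - 1) >= u."""
    grid = (x_max * k / steps for k in range(steps + 1))
    return min(p * x for x in grid if max(0.0, x - 1.0) >= u)


p = 2.0
print(expenditure_closed_form(p, 0.0))    # 0.0
print(expenditure_closed_form(p, 1e-9))   # just above p: the jump at u = 0
print(expenditure_on_grid(p, 0.5), expenditure_closed_form(p, 0.5))
```

However small the positive utility level, the consumer must buy more than one unit, so expenditure jumps from 0 to just above p.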

Let us proceed with the example on Leontiev utility functions.

Leontiev utility (Continued): The Leontiev utility function in (12) has range U = R_+. In order not to waste resources, a solution x∗ to the EMP at (p, u) ∈ R^L_{++} × U must satisfy (13). Moreover, by continuity, it satisfies u(x∗) = u. Combining these two conditions gives us that there is a unique solution to the EMP at (p, u), namely x∗ = (a_1 u, . . . , a_L u). Since the solution is unique, it is common to write the result as a Hicksian demand function, rather than a correspondence: h(p, u) = (a_1 u, . . . , a_L u) and e(p, u) = p · (a_1 u, . . . , a_L u) = u Σ_{i=1}^{L} a_i p_i.

The following result gives a relation between h(p, u) and e(p, u) in a particularly simple case.

Proposition 4.8 Assume the utility function u : R^L_+ → R is continuous and represents locally nonsatiated, strictly convex preferences. Then for all p ∈ R^L_{++} and all u > u(0, . . . , 0), Hicksian demand for each good ℓ = 1, . . . , L can be found by differentiating the expenditure function with respect to the price p_ℓ:

∀ℓ = 1, . . . , L : h_ℓ(p, u) = ∂e(p, u)/∂p_ℓ. (14)

⁷ Requires some knowledge of topology. Can be omitted.


Proof. We will not prove that the expenditure function is differentiable.⁸ The remainder of the proof proceeds as follows. By strict convexity of preferences, Hicksian demand is single-valued, so we treat h(·) as a function. Fix p ∈ R^L_{++} and u ∈ U and let x = h(p, u) denote Hicksian demand at prices p and utility level u. For every price vector p′ ∈ R^L_{++},

e(p′, u) = min_{x′ ∈ R^L_+, u(x′) ≥ u} p′ · x′ ≤ p′ · x,

with equality if p′ = p. Hence, the function f : R^L_{++} → R with f(p′) = e(p′, u) − p′ · x is maximized at p′ = p. By the first order conditions, its partial derivatives at p must be zero:

∀ℓ = 1, . . . , L : ∂f(p)/∂p_ℓ = ∂e(p, u)/∂p_ℓ − x_ℓ = ∂e(p, u)/∂p_ℓ − h_ℓ(p, u) = 0,

proving the result. □
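Identity (14) can be illustrated with a finite-difference derivative. The sketch below uses the Leontiev expenditure function e(p, u) = u Σ a_i p_i from the running example. Leontiev preferences are convex but not strictly convex, so this is only a numerical illustration of the formula and not an instance of the proposition's exact hypotheses. Step size, names and sample values are assumptions.

```python
def leontiev_expenditure(p, u, a):
    """e(p, u) = u * sum_i a_i p_i for Leontiev utility."""
    return u * sum(ai * pi for ai, pi in zip(a, p))


def price_derivative(f, p, l, step=1e-6):
    """Central finite difference of f with respect to the l-th price."""
    up, down = list(p), list(p)
    up[l] += step
    down[l] -= step
    return (f(up) - f(down)) / (2.0 * step)


# Hypothetical data.
a, p, u = (1.0, 2.0), (3.0, 1.0), 2.0
gradient = [price_derivative(lambda q: leontiev_expenditure(q, u, a), p, l)
            for l in range(len(p))]
hicksian = [ai * u for ai in a]  # h(p, u) = (a_1 u, ..., a_L u)

print(gradient)   # numerically close to the Hicksian demand below
print(hicksian)
```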

Exercise 4.6 Roy's identity: Similarly, one can prove:

Assume the utility function u : R^L_+ → R is continuous and represents locally nonsatiated, strictly convex preferences. Assume that the indirect utility function v(·) is differentiable at a point (p, w) with p ∈ R^L_{++} and w > 0. Then the Walrasian demand for each good ℓ = 1, . . . , L can be found as follows:

\[ \forall \ell = 1, \ldots, L : \quad x_\ell(p, w) = - \frac{\partial v(p, w) / \partial p_\ell}{\partial v(p, w) / \partial w}. \]

Do this by showing that the function f : R^L_{++} → R with f(p′) = v(p′, p′ · x), where x = x(p, w), achieves its minimum at p′ = p.

4.4. Relations between UMP and EMP

Proposition 4.9 Assume the utility function u : R^L_+ → R is continuous and represents locally nonsatiated preferences. Fix a price vector p ∈ R^L_{++}. Then:

(a) If x∗ is optimal in the UMP with wealth w > 0, then x∗ is optimal in the EMP with utility level u = u(x∗). Moreover, the expenditure level in this EMP is exactly p · x∗ = w:

x∗ ∈ x(p, w) ⇒ x∗ ∈ h(p, u(x∗)) and e(p, u(x∗)) = w.

(b) If x∗ is optimal in the EMP with utility level u ∈ U, u > u(0, . . . , 0), then x∗ is optimal in the UMP with wealth w = p · x∗. Moreover, the indirect utility level in this UMP is exactly u:

x∗ ∈ h(p, u) ⇒ x∗ ∈ x(p, p · x∗) and v(p, p · x∗) = u.

Proof. (a): Let x∗ ∈ x(p, w). By Walras' law, p · x∗ = w. Bundle x∗ is feasible in the EMP with prices p and utility level u(x∗). Let x ∈ h(p, u(x∗)). By definition,

e(p, u(x∗)) = p · x ≤ p · x∗ = w and u(x) ≥ u(x∗).

The first inequality means that x ∈ B(p, w). But then its utility cannot exceed that of the utility maximizing bundle x∗ ∈ x(p, w). So u(x) = u(x∗), hence x ∈ x(p, w) as well, and by Walras' law:

e(p, u(x∗)) = p · x = p · x∗ = w.

Conclude that x∗ ∈ h(p, u(x∗)) and e(p, u(x∗)) = w.

(b): Let x∗ ∈ h(p, u). By Proposition 4.5(e), u(x∗) = u. Bundle x∗ is feasible in the UMP at prices p and wealth p · x∗. Let x ∈ x(p, p · x∗). By definition,

v(p, p · x∗) = u(x) ≥ u(x∗) = u and p · x ≤ p · x∗.

The first claim shows that x is feasible in the EMP at (p, u). But then the inequality in the second claim cannot be strict, because x∗ minimizes expenditure: p · x = p · x∗. Hence x ∈ h(p, u) and, by Proposition 4.5(e),

v(p, p · x∗) = u(x) = u(x∗) = u.

Conclude that x∗ ∈ x(p, p · x∗) and v(p, p · x∗) = u. □

⁸ It follows from a duality result in convex analysis: for fixed u, e(·, u) is the support function of the strictly convex set {x ∈ X : u(x) ≥ u}.

Under the assumptions above, we obtain important relations between the UMP and EMP:

e(p, v(p, w)) = w (15)

v(p, e(p, u)) = u (16)

x(p, w) = h(p, v(p, w)) (17)

h(p, u) = x(p, e(p, u)) (18)

Proof. (15): Let x∗ ∈ x(p, w). By definition, v(p, w) = u(x∗). By Proposition 4.9(a), e(p, v(p, w)) = e(p, u(x∗)) = w.

(17): We first show that x(p, w) ⊆ h(p, v(p, w)). Let x ∈ x(p, w). Then u(x) = v(p, w), so x ∈ h(p, u(x)) = h(p, v(p, w)) by Proposition 4.9(a). Secondly, we show that h(p, v(p, w)) ⊆ x(p, w). Let x ∈ h(p, v(p, w)). By Proposition 4.9(b), x ∈ x(p, p · x). Moreover, x ∈ h(p, v(p, w)) and (15) imply that p · x = e(p, v(p, w)) = w. Conclude that x ∈ x(p, p · x) = x(p, w).

(16), (18): Similar. □

These results give convenient ways to find solutions to the UMP from those of the EMP and vice versa. Let us illustrate this in our Leontiev example.

Leontiev utility (Continued): Recall that

\[ v(p, w) = \frac{w}{\sum_{i=1}^{L} a_i p_i} \quad \text{and} \quad x(p, w) = \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right). \]

By (16), expenditure solves u = v(p, e(p, u)) = e(p, u) / ∑_{i=1}^{L} a_i p_i, so e(p, u) = u ∑_{i=1}^{L} a_i p_i, exactly (Good news, isn't it!) as we saw before. Hicksian demand can now be found in different ways. Firstly, using Proposition 4.8:

\[ \forall \ell = 1, \ldots, L : \quad h_\ell(p, u) = \frac{\partial e(p, u)}{\partial p_\ell} = a_\ell u, \]

and, secondly, using (18): h(p, u) solves

\[ h(p, u) = x(p, e(p, u)) = \left( \frac{a_1 e(p, u)}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L e(p, u)}{\sum_{i=1}^{L} a_i p_i} \right) = (a_1 u, \ldots, a_L u). \]
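The identities (15) to (18) can also be checked numerically for the Leontiev example. The sketch below is only an illustration: it assumes the running example has the form u(x) = min_i x_i / a_i (which is consistent with the formulas above), and the weights, prices, wealth and utility level are hypothetical numbers.

```python
# Sketch: check the duality identities (15)-(18) for Leontiev utility.
# Assumption (not restated in this section): u(x) = min_i x_i / a_i, a_i > 0.

def weighted_price(p, a):
    """Cost of one unit of utility: sum_i a_i * p_i."""
    return sum(ai * pi for ai, pi in zip(a, p))

def utility(x, a):
    return min(xi / ai for xi, ai in zip(x, a))

def walrasian_demand(p, w, a):
    return [ai * w / weighted_price(p, a) for ai in a]

def indirect_utility(p, w, a):
    return w / weighted_price(p, a)

def expenditure(p, u, a):
    return u * weighted_price(p, a)

def hicksian_demand(p, u, a):
    return [ai * u for ai in a]

# Hypothetical numbers, chosen only for illustration.
a = [1.0, 2.0, 0.5]
p = [2.0, 1.0, 4.0]
w = 12.0
u = 3.0

identity_15 = expenditure(p, indirect_utility(p, w, a), a)      # e(p, v(p, w)), should equal w
identity_16 = indirect_utility(p, expenditure(p, u, a), a)      # v(p, e(p, u)), should equal u
identity_17 = hicksian_demand(p, indirect_utility(p, w, a), a)  # h(p, v(p, w)), should equal x(p, w)
identity_18 = walrasian_demand(p, expenditure(p, u, a), a)      # x(p, e(p, u)), should equal h(p, u)
```

With these numbers the weighted price is 6, so v(p, w) = 2 and e(p, u) = 18, and both demand identities hold coordinate by coordinate.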


Exercise 4.7 For Leontiev utility, use (15) and (17) to find Walrasian demand and indirect utility from the solutions of the EMP.

Exercise 4.8 Slutsky equation: The so-called Slutsky equation provides a relation between the sensitivity to price changes of the Walrasian and Hicksian demand functions.

Assume the utility function u : R^L_+ → R is continuous and represents locally nonsatiated, strictly convex preferences. We know that in this case there are unique solutions to the UMP and EMP: we can consider Walrasian and Hicksian demand functions. If these functions are differentiable, the following holds. Fix (p, w) ∈ R^{L+1}_{++} and utility level u = v(p, w) > u(0, …, 0).⁹ Then for all commodities k, ℓ ∈ {1, …, L}:

\[ \frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w}\, x_k(p, w). \tag{19} \]

Prove (19) as follows: You know from (18) that h_ℓ(p, u) = x_ℓ(p, e(p, u)). Differentiate this equation w.r.t. p_k, using the Chain rule. Continue by substituting (14), (15), and (18).

4.5. Welfare analysis for the consumer

Welfare analysis studies how changes in the consumer's environment (in our case: the budget set) affect his well-being. Let B0 be the budget set before, and B1 the budget set after the change. Assuming that optimal bundles exist, the consumer is better off after the change if and only if whatever is optimal in B1 is strictly preferred to whatever is optimal in B0. This is welfare analysis in a nutshell. Some obvious ways of detecting changes that are (at least weakly) welfare improving are:

• The budget set has grown: B0 ⊂ B1.

• An optimal bundle in B0 remains feasible in B1.

Exercise 4.9 How is the consumer's welfare affected by the changes described in Exercise 4.2?

Whereas the above describes the idea behind welfare analysis in its full generality and simplicity, economic textbooks tend to restrict attention to changes only in prices and wealth. The initial vector of prices and wealth is denoted (p^0, w^0) ∈ R^{L+1}_{++} and the vector of prices and wealth after the change is denoted (p^1, w^1) ∈ R^{L+1}_{++}. This allows changes in prices only, keeping wealth constant (p^0 ≠ p^1, w^0 = w^1), changes in wealth only, keeping prices constant (p^0 = p^1, w^0 ≠ w^1), or simultaneous changes in prices and wealth (p^0 ≠ p^1, w^0 ≠ w^1).

Exercise 4.10 Let ≿ be a locally nonsatiated weak order on R^L_+. Consider a change from (p^0, w^0) to (p^1, w^1). Let x^0 ∈ x(p^0, w^0). Show that if p^1 · x^0 < w^1, the consumer is strictly better off under (p^1, w^1) than under (p^0, w^0).

Assume that the consumer's continuous, locally nonsatiated preference relation ≿ can be represented by means of a utility function. We can derive the consumer's indirect utility function v and conclude that the consumer is better off after the change if and only if v(p^1, w^1) > v(p^0, w^0).

⁹ This inequality holds because the zero vector cannot solve the utility maximization problem: by local nonsatiation and strict positivity of prices and wealth, there is an affordable bundle preferred to the zero vector.

However, since the indirect utility function depends on which utility function is chosen to represent ≿, this does not tell us how much better off the consumer is. To express welfare changes unambiguously in monetary units, one constructs a so-called money metric indirect utility function using the expenditure function. Fix an arbitrary price vector p ∈ R^L_{++}. Consider the real-valued function e(p, ·). By Proposition 4.6, this function is strictly increasing, so

e(p, v(p1, w1)) > e(p, v(p0, w0))⇔ v(p1, w1) > v(p0, w0).

Moreover, since the expenditure function is expressed in monetary units,

e(p, v(p1, w1))− e(p, v(p0, w0)) (20)

can be used as a monetary measure of welfare change: if it is positive, the welfare of the consumer increases as a consequence of the change from (p^0, w^0) to (p^1, w^1); if it is negative, the welfare of the consumer has decreased. It remains to prove that this money metric does not depend on the choice of utility function representing the consumer's preferences. This follows from the fact that expenditure can be expressed in a form independent of the utility function: for all (p, u) ∈ R^L_{++} × U, there is a y ∈ R^L_+ with u(y) = u, so

\[ e(p, u) = \min_{x \in \mathbb{R}^L_+} \{ p \cdot x : u(x) \ge u \} = \min_{x \in \mathbb{R}^L_+} \{ p \cdot x : x \succsim y \}. \]

In (20), two natural choices for p would be the initial vector of prices p^0 and the new vector of prices p^1. These choices give rise to two well-known measures of welfare change: equivalent variation (EV) and compensating variation (CV). Let u^0 = v(p^0, w^0) and u^1 = v(p^1, w^1). Notice that e(p^0, u^0) = w^0 and e(p^1, u^1) = w^1 by local nonsatiation. We define

\[ EV((p^0, w^0), (p^1, w^1)) = e(p^0, u^1) - e(p^0, u^0) = e(p^0, u^1) - w^0, \]

\[ CV((p^0, w^0), (p^1, w^1)) = e(p^1, u^1) - e(p^1, u^0) = w^1 - e(p^1, u^0). \]

There is no obvious way to say that one of the measures is better than the other, although the equivalent variation has an advantage when comparing alternative changes: suppose (p^0, w^0) changes either to (p^1, w^1) or (p^2, w^2). Both EV((p^0, w^0), (p^1, w^1)) and EV((p^0, w^0), (p^2, w^2)) are expressed in terms of wealth at prices p^0 and can consequently be compared. However, CV((p^0, w^0), (p^1, w^1)) is expressed in wealth at prices p^1 and CV((p^0, w^0), (p^2, w^2)) in wealth at prices p^2, so they are incomparable.

Leontiev utility (Continued): The equivalent and compensating variation for Leontiev utility follow immediately from the indirect utility function and expenditure function computed earlier:

\[ u^0 = v(p^0, w^0) = \frac{w^0}{\sum_{i=1}^{L} a_i p^0_i} \quad \text{and} \quad u^1 = v(p^1, w^1) = \frac{w^1}{\sum_{i=1}^{L} a_i p^1_i}, \]

so

\[ EV((p^0, w^0), (p^1, w^1)) = e(p^0, u^1) - e(p^0, u^0) = w^1 \left( \frac{\sum_{i=1}^{L} a_i p^0_i}{\sum_{i=1}^{L} a_i p^1_i} \right) - w^0, \]

\[ CV((p^0, w^0), (p^1, w^1)) = e(p^1, u^1) - e(p^1, u^0) = w^1 - w^0 \left( \frac{\sum_{i=1}^{L} a_i p^1_i}{\sum_{i=1}^{L} a_i p^0_i} \right). \]
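As a quick illustration, the two Leontiev formulas above translate directly into code. This is a minimal sketch; the function names and the numbers are hypothetical and serve only to show the sign and size of the two measures for a price increase, and the special case of a pure change in wealth.

```python
# Sketch: EV and CV for Leontiev utility, using the closed forms above.

def weighted_price(p, a):
    """sum_i a_i * p_i."""
    return sum(ai * pi for ai, pi in zip(a, p))

def equivalent_variation(p0, w0, p1, w1, a):
    # EV = w1 * (sum a_i p0_i / sum a_i p1_i) - w0
    return w1 * weighted_price(p0, a) / weighted_price(p1, a) - w0

def compensating_variation(p0, w0, p1, w1, a):
    # CV = w1 - w0 * (sum a_i p1_i / sum a_i p0_i)
    return w1 - w0 * weighted_price(p1, a) / weighted_price(p0, a)

a = [1.0, 1.0]                       # hypothetical weights
p0, w0 = [1.0, 1.0], 10.0            # before the change
p1, w1 = [2.0, 1.0], 10.0            # the price of good 1 doubles

ev_price_rise = equivalent_variation(p0, w0, p1, w1, a)    # 10 * 2/3 - 10
cv_price_rise = compensating_variation(p0, w0, p1, w1, a)  # 10 - 10 * 3/2

# Pure wealth change at fixed prices: both measures equal the wealth change.
ev_wealth_only = equivalent_variation(p0, w0, p0, w0 - 4.0, a)
cv_wealth_only = compensating_variation(p0, w0, p0, w0 - 4.0, a)
```

Both measures are negative for the price increase, but they differ in size because they value the same utility change at different prices.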

Lump-sum tax: Given initial prices and wealth (p^0, w^0), suppose that the government levies a lump-sum tax T ∈ (0, w^0) on the consumer's wealth, keeping prices unchanged. Then (p^1, w^1) = (p^0, w^0 − T). Hence e(p^0, u^0) = e(p^1, u^0) = w^0 and e(p^1, u^1) = e(p^0, u^1) = w^1 = w^0 − T, so EV((p^0, w^0), (p^1, w^1)) = CV((p^0, w^0), (p^1, w^1)) = −T. This is intuitive: since the prices remain unchanged, the monetary measure of welfare change as a consequence of a decrease of T in the consumer's wealth should equal −T.

Deadweight loss: Let the preference relation ≿ be a continuous, locally nonsatiated, strictly convex weak order on R^L_+. Fix a price vector p^0 ∈ R^L_{++} and wealth w > 0. Suppose the government levies a commodity tax t > 0 on the price of good ℓ. Thus, the new price vector is p^1 = p^0 + t e_ℓ, where e_ℓ = (0, …, 0, 1, 0, …, 0) is the ℓ-th standard basis vector of R^L with ℓ-th coordinate 1 and all other coordinates 0. The total tax revenue is T = t x_ℓ(p^1, w) and

\[ EV((p^0, w), (p^1, w)) = e(p^0, u^1) - w \le 0, \]

where u^1 = v(p^1, w) as before. Alternatively, to raise the same amount, the government can levy a lump-sum tax T directly on the wealth of the consumer, keeping prices fixed, yielding an equivalent variation −T.

The consumer is at least weakly better off under lump-sum taxation. Let x∗ solve the UMP under commodity taxation. Then x∗ ∈ B(p^0 + t e_ℓ, w), so p^0 · x∗ + t x∗_ℓ ≤ w, i.e., p^0 · x∗ ≤ w − t x∗_ℓ = w − T. So x∗ ∈ B(p^0, w − T), i.e., x∗ is feasible in the UMP under lump-sum taxation: the consumer cannot be worse off under lump-sum taxation than under commodity taxation.

In other words, v(p^0, w − T) ≥ u^1. Since e(p^0, ·) is increasing, (15) gives e(p^0, u^1) ≤ e(p^0, v(p^0, w − T)) = w − T. Therefore, e(p^0, u^1) − w ≤ −T. The difference

\[ w - T - e(p^0, u^1) \ge 0 \]

is called the deadweight loss of commodity taxation.
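To get a feeling for the size of the deadweight loss, here is a small numerical sketch. It is not part of the derivation above. It assumes two goods and uses the standard closed forms for Cobb-Douglas utility u(x) = x_1^α x_2^{1−α} (stated here without derivation) next to the Leontiev formulas from the running example. All numbers are hypothetical.

```python
# Sketch: deadweight loss w - T - e(p0, u1) of a tax t on good 1, two goods.
# Assumed closed forms (not derived in the text):
#   Cobb-Douglas u(x) = x1^alpha * x2^(1 - alpha):
#     x1(p, w) = alpha * w / p1
#     v(p, w)  = w * (alpha / p1)^alpha * ((1 - alpha) / p2)^(1 - alpha)
#     e(p, u)  = u * (p1 / alpha)^alpha * (p2 / (1 - alpha))^(1 - alpha)
#   Leontiev (running example): x1 = a1 * w / S, v = w / S, e = u * S,
#     with S = a1 * p1 + a2 * p2.

def cobb_douglas_dwl(p0, w, t, alpha):
    p1 = (p0[0] + t, p0[1])                      # taxed price vector
    revenue = t * alpha * w / p1[0]              # T = t * x1(p1, w)
    u1 = w * (alpha / p1[0]) ** alpha * ((1 - alpha) / p1[1]) ** (1 - alpha)
    e_p0_u1 = u1 * (p0[0] / alpha) ** alpha * (p0[1] / (1 - alpha)) ** (1 - alpha)
    return w - revenue - e_p0_u1

def leontiev_dwl(p0, w, t, a):
    p1 = (p0[0] + t, p0[1])
    s0 = a[0] * p0[0] + a[1] * p0[1]
    s1 = a[0] * p1[0] + a[1] * p1[1]
    u1 = w / s1
    revenue = t * a[0] * u1                      # T = t * x1(p1, w)
    return w - revenue - u1 * s0

dwl_cd = cobb_douglas_dwl(p0=(1.0, 1.0), w=100.0, t=1.0, alpha=0.5)
dwl_leontiev = leontiev_dwl(p0=(1.0, 1.0), w=100.0, t=1.0, a=(1.0, 1.0))
dwl_no_tax = cobb_douglas_dwl(p0=(1.0, 1.0), w=100.0, t=0.0, alpha=0.5)
```

In this example the Cobb-Douglas consumer substitutes away from the taxed good, and the deadweight loss is roughly 4.29. With Leontiev utility there is no substitution and the deadweight loss vanishes, so the inequality above can hold with equality.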

Exercise 4.11 Cobb-Douglas utility: In Section 4, the Leontiev utility function was used as a running example to illustrate all definitions. Go through the same steps, now using the Cobb-Douglas utility function u : R^L_+ → R defined for all x ∈ R^L_+ by u(x) = x_1^{a_1} x_2^{a_2} ··· x_L^{a_L}, where a_1, …, a_L > 0.

4.6. Welfare and Hicksian demand

Assume that the preferences of the consumer are continuous, locally nonsatiated and strictly convex. If the only change is in the price of a single good, equivalent and compensating variation can simply be expressed in terms of the Hicksian demand function. To somewhat simplify notation, we denote for an arbitrary price vector p′ and an arbitrary p_ℓ > 0 the price vector obtained from p′ by changing the price of good ℓ to p_ℓ by (p_ℓ, p′_{−ℓ}).

So given initial prices and wealth (p^0, w), suppose that only the price of good ℓ ∈ {1, …, L} is changed to p^1_ℓ ≠ p^0_ℓ, giving rise to (p^1, w) = ((p^1_ℓ, p^0_{−ℓ}), w). Recall that

\[ \frac{\partial e(p, u)}{\partial p_\ell} = h_\ell(p, u) \quad \text{and} \quad e(p^1, u^1) = w. \]

Hence

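\[ \begin{aligned}
EV((p^0, w), (p^1, w)) &= e(p^0, u^1) - w \\
&= e(p^0, u^1) - e(p^1, u^1) \\
&= \int_{p^1_\ell}^{p^0_\ell} \frac{\partial e(p_\ell, p^0_{-\ell}, u^1)}{\partial p_\ell}\, dp_\ell \\
&= \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^1)\, dp_\ell.
\end{aligned} \tag{21} \]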


Similarly,

\[ CV((p^0, w), (p^1, w)) = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^0)\, dp_\ell. \tag{22} \]

This means that the equivalent and compensating variation due to such a simple price change can be represented by areas "to the left of" the Hicksian demand curve.

Normal goods: Suppose good ℓ is a normal good (i.e., its Walrasian demand is weakly increasing in income) and that its price is decreased: p^0_ℓ > p^1_ℓ. We claim that EV((p^0, w), (p^1, w)) ≥ CV((p^0, w), (p^1, w)). To see this, write u^0 = v(p^0, w) and u^1 = v(p^1, w). Since v is nonincreasing in p_ℓ, u^0 ≤ u^1. Since e is increasing in u, this implies that e(p_ℓ, p^0_{−ℓ}, u^0) ≤ e(p_ℓ, p^0_{−ℓ}, u^1) for all p_ℓ > 0. Since good ℓ is normal and x_ℓ(p, e(p, u)) = h_ℓ(p, u), it follows that

\[ h_\ell(p_\ell, p^0_{-\ell}, u^0) = x_\ell(p_\ell, p^0_{-\ell}, e(p_\ell, p^0_{-\ell}, u^0)) \le x_\ell(p_\ell, p^0_{-\ell}, e(p_\ell, p^0_{-\ell}, u^1)) = h_\ell(p_\ell, p^0_{-\ell}, u^1) \]

for all p_ℓ > 0. Combining this with (21) and (22), it follows that

\[ \begin{aligned}
EV((p^0, w), (p^1, w)) - CV((p^0, w), (p^1, w)) &= \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^1)\, dp_\ell - \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^0)\, dp_\ell \\
&= \int_{p^1_\ell}^{p^0_\ell} \left[ h_\ell(p_\ell, p^0_{-\ell}, u^1) - h_\ell(p_\ell, p^0_{-\ell}, u^0) \right] dp_\ell \\
&\ge 0.
\end{aligned} \]
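The area interpretation in (21) and (22) and the inequality EV ≥ CV can be illustrated numerically. The sketch below assumes two goods and Cobb-Douglas utility u(x) = x_1^α x_2^{1−α}, whose standard closed forms are used without derivation; both goods are then normal. The midpoint rule and all numbers are illustrative choices, not part of the text.

```python
# Sketch: EV and CV as integrals of Hicksian demand for a fall in the price of good 1.
# Assumed closed forms for u(x) = x1^alpha * x2^(1 - alpha):
#   e(p, u)  = u * (p1 / alpha)^alpha * (p2 / (1 - alpha))^(1 - alpha)
#   v(p, w)  = w * (alpha / p1)^alpha * ((1 - alpha) / p2)^(1 - alpha)
#   h1(p, u) = de/dp1 = alpha * e(p, u) / p1

ALPHA = 0.5

def expenditure(p1, p2, u, alpha=ALPHA):
    return u * (p1 / alpha) ** alpha * (p2 / (1 - alpha)) ** (1 - alpha)

def indirect_utility(p1, p2, w, alpha=ALPHA):
    return w * (alpha / p1) ** alpha * ((1 - alpha) / p2) ** (1 - alpha)

def hicksian_good1(p1, p2, u, alpha=ALPHA):
    return alpha * expenditure(p1, p2, u, alpha) / p1

def integrate(f, lo, hi, n=10_000):
    """Midpoint rule on [lo, hi] with n subintervals."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

w, p2 = 100.0, 1.0
p1_old, p1_new = 2.0, 1.0                      # price of good 1 falls from 2 to 1
u0 = indirect_utility(p1_old, p2, w)
u1 = indirect_utility(p1_new, p2, w)

ev_closed = expenditure(p1_old, p2, u1) - w    # e(p0, u1) - w
cv_closed = w - expenditure(p1_new, p2, u0)    # w - e(p1, u0)
ev_area = integrate(lambda q: hicksian_good1(q, p2, u1), p1_new, p1_old)  # as in (21)
cv_area = integrate(lambda q: hicksian_good1(q, p2, u0), p1_new, p1_old)  # as in (22)
```

In this example the equivalent variation is about 41.4 and the compensating variation about 29.3, in line with EV ≥ CV for a price decrease of a normal good.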


5. Choices of a producer: classical supply theory

5.1. Production sets

Having treated the demand side of the economy in detail, we now turn to the supply side. The supply side consists of firms that use a technology to convert one set of commodities (inputs) to another (outputs). Just as for consumers, it is assumed that firms take prices as given and that all commodities are traded at the market at publicly quoted prices. Consider an economy with L ∈ N commodities. The firm's production can be described by a production vector or production plan y = (y_1, …, y_L) ∈ R^L which gives the net amount produced of each of the L commodities. If y_ℓ < 0, we say that good ℓ is used as an input in the production plan y; if y_ℓ > 0, we say that good ℓ is produced as an output in y. For instance, if L = 2, the production plan y = (−2, 6) indicates that two units of the first commodity are used as an input to produce an output of 6 units of the second commodity. The production set of technologically feasible production vectors is denoted by Y ⊂ R^L. This general description allows that a commodity is used as an input in some production vectors, but as an output in others.

You may come across the following special cases:

Transformation functions: Sometimes the production set can conveniently be described using a function F : R^L → R called the transformation function as follows:

Y = {y ∈ R^L : F(y) ≤ 0} and F(y) = 0 if y lies on the boundary of Y.

The set of boundary points {y ∈ R^L : F(y) = 0} is called the transformation frontier.

Single-output technologies: In many examples, one of the goods, say good L, is an output that is produced using the remaining goods, say 1, …, L − 1, as inputs. These are single-output technologies, typically summarized using a production function f : R^{L−1}_+ → R that assigns to each vector of input quantities z the maximal amount f(z) of output that can be produced from it. One can then write

Y = {(−z_1, …, −z_{L−1}, q) ∈ R^L : q ≤ f(z) and z ∈ R^{L−1}_+}.

Consider, for instance, a Cobb-Douglas production function f : R^2_+ → R given by f(z) = z_1^α z_2^β, where α, β > 0. Then

Y = {(−z_1, −z_2, q) ∈ R^3 : q ≤ z_1^α z_2^β, and z_1, z_2 ≥ 0}.

5.2. Properties of production sets

Properties that are often imposed on the production set Y ⊂ RL include:

Y is nonempty: there is at least one feasible production vector.

Possibility of inaction: 0 ∈ Y. It is possible to do nothing, i.e., produce zero outputs from zero inputs.

Y is closed. This assumption is mainly for mathematical convenience.

No free lunch: if y ∈ Y ∩ R^L_+, then y = 0. It is not possible to produce positive amounts of output without using inputs.


Free disposal: if y ∈ Y and y′ ≤ y, then y′ ∈ Y. If y is feasible and y′ uses at least as much of each input, yet gives no more of the outputs, then also y′ is feasible.

Irreversibility: if y ∈ Y and y ≠ 0, then −y ∉ Y. It is impossible to reverse a feasible production vector, i.e., to turn the outputs into the same amount of inputs used to produce it.

Nonincreasing returns to scale: if y ∈ Y and α ∈ [0, 1], then αy ∈ Y. This means that feasible production plans can be scaled down.

Nondecreasing returns to scale: if y ∈ Y and α ≥ 1, then αy ∈ Y. This means that feasible production plans can be scaled up.

Constant returns to scale (CRS): if y ∈ Y and α ≥ 0, then αy ∈ Y. This is the conjunction of the previous two properties.

Additivity/free entry: if y, y′ ∈ Y, then y + y′ ∈ Y. If both y and y′ are feasible, then it is feasible to set up two independent plants, one producing y, the other y′, together yielding y + y′.

Y is convex: if y, y′ ∈ Y and α ∈ [0, 1], then αy + (1 − α)y′ ∈ Y.

Y is a convex cone: if y, y′ ∈ Y and α, β ≥ 0, then αy + βy′ ∈ Y.

One easily establishes relations between these properties. Possibility of inaction implies nonemptiness. Nondecreasing and nonincreasing returns to scale together imply constant returns to scale. Some less trivial ones are:

Proposition 5.1 Let Y ⊂ RL be a production set.

(a) If Y is convex and 0 ∈ Y , then Y has nonincreasing returns to scale.

(b) Y is a convex cone if and only if Y is convex and has constant returns to scale.

(c) Y is a convex cone if and only if Y is additive and has nonincreasing returns to scale.

(d) If Y satisfies no free lunch and for all x, y ∈ Y and α ∈ (0, 1), there is a z ∈ Y with z ≥ αx + (1 − α)y, z ≠ αx + (1 − α)y, then Y satisfies irreversibility.

Proof. (a): Let y ∈ Y and α ∈ [0, 1]. By convexity, αy + (1 − α)0 = αy ∈ Y.

(b): Assume Y is a convex cone. Then Y is trivially convex. To establish CRS, let y ∈ Y and α ≥ 0. Since Y is a convex cone, αy = (½α)y + (½α)y ∈ Y. Conversely, assume Y is convex and has CRS. To show that Y is a convex cone, let y, y′ ∈ Y and α, β ≥ 0. By CRS, 2αy ∈ Y and 2βy′ ∈ Y. By convexity, ½(2αy) + ½(2βy′) = αy + βy′ ∈ Y.

(c): If Y is a convex cone, it is additive (take α = β = 1) and has nonincreasing returns to scale (similar to the proof of CRS above). Conversely, assume that Y is additive and has nonincreasing returns to scale. Let y, y′ ∈ Y and α, β ≥ 0. By additivity, ky ∈ Y and ky′ ∈ Y for all k ∈ N. Choose k ∈ N such that α/k ≤ 1 and β/k ≤ 1. Since Y has nonincreasing returns to scale, (α/k)y ∈ Y and (β/k)y′ ∈ Y. By additivity: αy = k(α/k)y ∈ Y and βy′ = k(β/k)y′ ∈ Y. Again by additivity αy + βy′ ∈ Y.

(d): Let y ∈ Y, y ≠ 0, and suppose −y ∈ Y. By assumption, there is a z ∈ Y such that z ≥ ½y + ½(−y) = 0, z ≠ 0, contradicting no free lunch. □


In the special case of a production function, properties of the production set are related to properties of the production function. For instance:

Proposition 5.2 Consider a single-output technology with production function f : R^{L−1}_+ → R with f(0, …, 0) = 0, so that

Y = {(−z, q) ∈ R^L : q ≤ f(z) and z ∈ R^{L−1}_+}.

(a) Y has constant returns to scale if and only if f is homogeneous of degree one.

(b) Y is convex if and only if f is concave.

Proof. (a): First, assume that Y satisfies constant returns to scale. We show that for each z ∈ R^{L−1}_+ and α > 0: αf(z) ≤ f(αz). So let z ∈ R^{L−1}_+ and α > 0.

• By definition of the production set Y, (−z, f(z)) ∈ Y.

• By CRS, this implies (−αz, αf(z)) ∈ Y.

• By definition of the production set Y, (−αz, αf(z)) ∈ Y means that αf(z) ≤ f(αz).

Next, we show that for each z ∈ R^{L−1}_+ and α > 0: αf(z) ≥ f(αz). So let z ∈ R^{L−1}_+ and α > 0. Fix z′ = αz ∈ R^{L−1}_+ and α′ = 1/α > 0.

• Apply the result above to z′ and α′: α′f(z′) ≤ f(α′z′).

• Substitute z′ = αz and α′ = 1/α: (1/α)f(αz) ≤ f((1/α)αz) = f(z).

• Multiply both sides with α: f(αz) ≤ αf(z).

Using the above, it follows that if Y has CRS, then for each z ∈ R^{L−1}_+ and each α > 0: αf(z) = f(αz). So f is homogeneous of degree one.

Conversely, assume that f is homogeneous of degree one: for each z ∈ R^{L−1}_+ and each α > 0: αf(z) = f(αz). To show: Y has CRS, i.e., if (−z, q) ∈ Y and α ≥ 0, then (−αz, αq) ∈ Y. If α = 0, this follows from the assumption that f(0, …, 0) = 0. So let (−z, q) ∈ Y and α > 0.

• By definition of the production set Y, q ≤ f(z).

• Multiply both sides with α > 0: αq ≤ αf(z).

• Since f is homogeneous of degree one, αf(z) = f(αz), so αq ≤ αf(z) = f(αz).

• By definition of the production set Y, αq ≤ f(αz) together with αz ∈ R^{L−1}_+ implies that (−αz, αq) ∈ Y.

(b): The function f : R^{L−1}_+ → R is concave if and only if its subgraph

{(z, q) ∈ R^{L−1}_+ × R : q ≤ f(z)} = {(z, q) ∈ R^L : q ≤ f(z) and z ∈ R^{L−1}_+}

is convex. Multiplying the first L − 1 coordinates with −1 maintains convexity, so this is equivalent with

{(−z, q) ∈ R^L : q ≤ f(z) and z ∈ R^{L−1}_+} = Y

being convex. □
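Proposition 5.2(a) can be illustrated with the Cobb-Douglas production function f(z) = z_1^α z_2^β from Section 5.1, which is homogeneous of degree α + β. The sketch below is only a spot check on a few hypothetical plans, not a proof: it tests membership in Y before and after scaling, once with α + β = 1 and once with α + β = 2.

```python
# Sketch: returns to scale of Y = {(-z1, -z2, q) : q <= z1^alpha * z2^beta, z >= 0}.

def cobb_douglas(z, alpha, beta):
    return z[0] ** alpha * z[1] ** beta

def in_production_set(y, alpha, beta, tol=1e-9):
    """y = (-z1, -z2, q) lies in Y iff z >= 0 and q <= f(z) (up to rounding)."""
    z = (-y[0], -y[1])
    q = y[2]
    return z[0] >= 0 and z[1] >= 0 and q <= cobb_douglas(z, alpha, beta) + tol

def scale(y, t):
    return tuple(t * coordinate for coordinate in y)

# alpha + beta = 1: f is homogeneous of degree one, so scaling keeps plans feasible.
plan_crs = (-4.0, -1.0, 2.0)                   # q = f(4, 1) = 2 when alpha = beta = 0.5
crs_checks = [in_production_set(scale(plan_crs, t), 0.5, 0.5) for t in (0.25, 3.0, 10.0)]

# alpha + beta = 2: scaling up stays feasible, scaling down does not.
plan_irs = (-4.0, -1.0, 4.0)                   # q = f(4, 1) = 4 when alpha = beta = 1
scaled_up_feasible = in_production_set(scale(plan_irs, 2.0), 1.0, 1.0)
scaled_down_feasible = in_production_set(scale(plan_irs, 0.5), 1.0, 1.0)
```

In the second case the plan (−2, −0.5, 2) would need output 2 from inputs that only yield f(2, 0.5) = 1, so Y does not have constant (or even nonincreasing) returns to scale.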


5.3. The profit maximization problem

The production set Y specifies a firm's set of feasible options. To make the choice problem of the firm complete, we have to endow it with preferences. These preferences are particularly simple. It is assumed that firms maximize profits given the commodity prices and the firm's production set: given production set Y ⊂ R^L and a price vector p ∈ R^L_{++}, the profit maximization problem (PMP) is

max p · y s.t. y ∈ Y.

The profit function π assigns to every price vector p ∈ R^L_{++} the maximal profit

π(p) = max{p · y : y ∈ Y}.

The supply correspondence y(·) assigns to every price vector p ∈ R^L_{++} the set of profit-maximizing production vectors:

y(p) = {y ∈ Y : p · y = π(p)}.

As opposed to the utility maximization problem, which has a solution under mild conditions (like continuity of the utility function), there may not be a solution to the PMP: profits may be unbounded. In that case, we set π(p) = +∞. Indeed, we may have the following:

Proposition 5.3 Let Y ⊂ R^L be nonempty and satisfy nondecreasing returns to scale. For each price vector p ∈ R^L_{++}, either p · y ≤ 0 for all y ∈ Y, which means that no positive profit can be made, or π(p) = +∞.

Proof. Consider a price vector p ∈ R^L_{++}. Suppose that p · y > 0 for some y ∈ Y. Since Y has nondecreasing returns to scale, αy ∈ Y for all α ≥ 1, so p · (αy) = α(p · y) can be made arbitrarily large by letting α go to infinity. □
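The dichotomy in Proposition 5.3 is easy to see in a toy example. The sketch below assumes a hypothetical linear technology Y = {(−z, q) : q ≤ 2z, z ≥ 0}, which has constant and hence nondecreasing returns to scale; the prices are chosen for illustration only.

```python
# Sketch: profits under nondecreasing returns are either never positive or unbounded.
# Hypothetical technology: one unit of input yields at most two units of output.

def profit(p, y):
    return sum(pi * yi for pi, yi in zip(p, y))

def scale(y, alpha):
    return tuple(alpha * yi for yi in y)

plan = (-1.0, 2.0)                              # uses 1 unit of input, produces 2 units of output

# Cheap input, p = (1, 1): the plan earns 1, and scaling it up scales the profit.
profits_cheap_input = [profit((1.0, 1.0), scale(plan, alpha)) for alpha in (1, 10, 100, 1000)]

# Expensive input, p = (3, 1): even the best plans (-z, 2z) earn -z <= 0.
profits_expensive_input = [profit((3.0, 1.0), (-z, 2.0 * z)) for z in (0.0, 1.0, 10.0, 100.0)]
```

In the first case the profits 1, 10, 100, 1000 grow without bound, so the PMP has no solution. In the second case no plan earns a positive profit and inaction is optimal.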

This makes the existence of solutions to the PMP a nontrivial issue. The following two results provide sufficient conditions.

Proposition 5.4 Assume that the production set Y ⊂ R^L is:

• nonempty,

• closed,

• bounded above: there is an r ∈ R such that y_ℓ ≤ r for all y ∈ Y and all ℓ ∈ {1, …, L}.

Then the profit maximization problem has at least one solution for each price vector p ∈ R^L_{++}.

Proof. Let p ∈ R^L_{++}. By nonemptiness, there is a y′ ∈ Y. A solution to the PMP must lie in the set P = Y ∩ {y ∈ R^L : p · y ≥ p · y′}.

P is closed: Y is closed by assumption and the second set in the intersection is closed, since it is the upper contour set of a continuous function. The intersection of two closed sets is closed.

P is bounded: By assumption, the coordinates of vectors in P are bounded above by r. Moreover, all coordinates are bounded from below as well: let y ∈ P and consider an arbitrary coordinate ℓ ∈ {1, …, L}. Since p · y ≥ p · y′, it follows that

\[ p_\ell y_\ell \ge p \cdot y' - \sum_{k \ne \ell} p_k y_k \ge p \cdot y' - \sum_{k \ne \ell} p_k r, \]

so y_ℓ is bounded from below by (p · y′ − ∑_{k≠ℓ} p_k r)/p_ℓ.

Hence, P is compact. Since we maximize a continuous profit function over the nonempty compact set P, there is at least one solution. □

The following result establishes existence of solutions to the profit maximization problem under resource constraints.

Proposition 5.5 Assume that the production set Y ⊂ R^L:

• satisfies possibility of inaction,

• satisfies no free lunch,

• is closed,

• is convex,

• has a resource constraint: there is a nonzero vector ω ∈ R^L_+ of inputs restricting feasible production to vectors y ∈ Y with y ≥ −ω.

Then the profit maximization problem has at least one solution for each price vector p ∈ R^L_{++}.

Exercise 5.1 This exercise guides you through the proof of Proposition 5.5.

(a) Show that Y ′ = Y ∩ {y ∈ RL : y ≥ −ω} is nonempty and closed.

To show that Y′ = Y ∩ {y ∈ R^L : y ≥ −ω} is bounded, suppose it were not: there is a sequence (y^n)_{n∈N} of vectors in Y′ whose increasing length ‖y^n‖ diverges to infinity. Define z^n = y^n/‖y^n‖.

(b) Show that for n ∈ N large enough, z^n lies in Y and satisfies z^n + ω/‖y^n‖ ≥ 0.

(c) Show that (z^n)_{n∈N} has a convergent subsequence with limit z ≠ 0 in Y.

(d) Combine this with (b) to derive a contradiction.

As Y′ is nonempty and compact and the profit function is continuous, a maximum exists!

Thus, whenever we talk about properties of the profit function and the supply correspondence, we implicitly assume that the PMP has a solution, so that y(p) ≠ ∅ and π(p) < ∞.

Proposition 5.6 Consider a firm with production set Y ⊂ R^L.

(a) The profit function is homogeneous of degree one, the supply correspondence is homogeneous of degree zero.

(b) The profit function is convex.

(c) If Y is convex, y(p) is a convex set for all p ∈ R^L_{++}.

(d) Hotelling's lemma: Let p ∈ R^L_{++}. If y(p) consists of a single point y, then the profit function is differentiable at p and ∂π(p)/∂p_ℓ = y_ℓ for all goods ℓ = 1, …, L.

(e) Law of supply: for all p, p′ ∈ RL++ and all y ∈ y(p) and y′ ∈ y(p′):

(p− p′) · (y − y′) ≥ 0.


Proof. (a): Do this yourself.

(b): We give two proofs.

First proof: we show that the epigraph epi(π) = {(p, v) ∈ R^L_{++} × R : v ≥ π(p)} is a convex set. Let (p^1, v_1), (p^2, v_2) ∈ epi(π) and α ∈ [0, 1]. To show: αv_1 + (1 − α)v_2 ≥ π(αp^1 + (1 − α)p^2). So let y ∈ y(αp^1 + (1 − α)p^2). Then p^i · y ≤ π(p^i) ≤ v_i for both i = 1, 2, so

π(αp^1 + (1 − α)p^2) = αp^1 · y + (1 − α)p^2 · y ≤ αv_1 + (1 − α)v_2.

Second proof: we show that for all p^1, p^2 ∈ R^L_{++} and all α ∈ [0, 1]: π(αp^1 + (1 − α)p^2) ≤ απ(p^1) + (1 − α)π(p^2). So let p^1, p^2 ∈ R^L_{++} and α ∈ [0, 1]. Let y ∈ y(αp^1 + (1 − α)p^2). Then p^i · y ≤ π(p^i) for both i = 1, 2, so

π(αp^1 + (1 − α)p^2) = αp^1 · y + (1 − α)p^2 · y ≤ απ(p^1) + (1 − α)π(p^2).

(c): Let p ∈ R^L_{++}. Then y(p) = Y ∩ {y ∈ R^L : p · y = π(p)} is the intersection of Y and a hyperplane. Since both are convex, so is y(p).

(d): We prove Hotelling's lemma, assuming that π is differentiable at p. By definition of the profit function we know that for all p′ ∈ R^L_{++}: p′ · y ≤ π(p′), with equality if p′ = p. So the function h : R^L_{++} → R with h(p′) = π(p′) − p′ · y achieves its minimum at p. But then its partial derivatives at p must be zero:

∀ℓ = 1, …, L : ∂h(p)/∂p_ℓ = ∂π(p)/∂p_ℓ − y_ℓ = 0,

proving Hotelling's lemma.

(e): Notice that

(p − p′) · (y − y′) = (p · y − p · y′) + (p′ · y′ − p′ · y) ≥ 0,

where the inequality follows from the definition of profit maximizers: p · y = π(p) ≥ p · y′ and p′ · y′ = π(p′) ≥ p′ · y. □

5.4. Solving the PMP

Just like in the utility maximization problem UMP, the Kuhn-Tucker conditions can be used to find necessary first order conditions for the profit maximization problem PMP: if the production set is

Y = {y ∈ R^L : F(y) ≤ 0},

where F is continuously differentiable and the price vector is p ∈ R^L_{++}, a necessary first order condition for y∗ ∈ Y to be a solution to the PMP

max p · y s.t. F(y) ≤ 0

is that there exists a Lagrange multiplier λ ≥ 0 such that for each good ℓ = 1, …, L:

\[ p_\ell = \lambda \frac{\partial F(y^*)}{\partial y_\ell}. \tag{23} \]


If we divide the first order condition for good ℓ by that for good k, we find that for all pairs of goods ℓ, k:

\[ \frac{p_\ell}{p_k} = \frac{\partial F(y^*)/\partial y_\ell}{\partial F(y^*)/\partial y_k}, \]

i.e., in an optimal production plan y∗, the price ratio between two goods equals their so-called marginal rate of transformation. If the set Y is convex, the first order conditions in (23) are also sufficient for a solution to the PMP.

In the single-output case, assume the production function f is differentiable and that the price of input ℓ = 1, …, L − 1 equals w_ℓ > 0 and the price of the output equals p > 0.

Remark 5.7 I don't know the reason for this sudden change of notation from a price vector p to an output-input price vector (p, w). Do not confuse the vector of input prices w with the wealth level w of the consumer. This choice of notation is unfortunate, but widespread in economics.

The PMP can be rewritten as

max pf(z) − w · z s.t. z ∈ R^{L−1}_+.

If z∗ is optimal, the Kuhn-Tucker conditions imply the existence of Lagrange multipliers λ_ℓ ≤ 0 for each of the conditions z_ℓ ≥ 0 such that for all inputs ℓ = 1, …, L − 1:

\[ p \frac{\partial f(z^*)}{\partial z_\ell} - w_\ell = \lambda_\ell \quad \text{and} \quad \lambda_\ell z^*_\ell = 0. \tag{24} \]

Assuming an interior solution (z∗_ℓ > 0 for all ℓ), this implies that λ_ℓ = 0 for all ℓ, so the first order conditions become

\[ \forall \ell = 1, \ldots, L - 1 : \quad p \frac{\partial f(z^*)}{\partial z_\ell} = w_\ell, \]

so that for all inputs ℓ, k:

\[ \frac{w_\ell}{w_k} = \frac{\partial f(z^*)/\partial z_\ell}{\partial f(z^*)/\partial z_k}, \tag{25} \]

which has the interpretation that the price ratio between two inputs has to equal their so-called marginal rate of technical substitution. Again, if the set Y is convex, the first order conditions in (24) are also sufficient for a solution to the PMP.

5.5. The cost minimization problem

In a profit maximizing production plan, there is no way to produce the same amount of outputs at a lower total input cost. This motivates a study of the cost minimization problem (CMP), which we consider only in the single-output case. Assume the production function is f : R^{L−1}_+ → R and the input price vector is w ∈ R^{L−1}_{++}. We want to produce at least an amount q of the output. What is the minimal amount we have to spend on inputs to achieve this? The answer is given by the CMP:

min w · z s.t. z ∈ R^{L−1}_+, f(z) ≥ q.

The conditional factor demand correspondence assigns to each vector w ∈ R^{L−1}_{++} of input prices and each output level q the associated set z(w, q) of solutions to the CMP:

z(w, q) = {z ∈ R^{L−1}_+ : f(z) ≥ q and w · z ≤ w · z′ for all z′ ∈ R^{L−1}_+ with f(z′) ≥ q}.


The conditional factor demand correspondence specifies the set of input vectors solving the CMP, the cost function c(w, q) indicates its value:

\[ c(w, q) = \min_{z \in \mathbb{R}^{L-1}_+,\ f(z) \ge q} w \cdot z = w \cdot z^* \quad \text{for all } z^* \in z(w, q). \]

The cost minimization problem and the expenditure minimization problem

min p · x s.t. x ∈ R^L_+, u(x) ≥ u

are identical, up to a relabeling of the involved functions. Therefore, rewriting Propositions 4.5, 4.6, and 4.8 provides a long list of properties for conditional factor demand and the cost function.

If the production function f is continuously differentiable, the Kuhn-Tucker conditions can be used to show that at a solution z∗ of the CMP, there must be a Lagrange multiplier λ ≥ 0 associated with the condition q − f(z) ≤ 0 and Lagrange multipliers λ_ℓ ≥ 0 associated with the conditions −z_ℓ ≤ 0 such that for all ℓ = 1, …, L − 1:

\[ w_\ell = \lambda \frac{\partial f(z^*)}{\partial z_\ell} + \lambda_\ell \quad \text{and} \quad \lambda_\ell z^*_\ell = 0. \]

If the solution uses positive amounts of all inputs (z∗_ℓ > 0 for all ℓ), this implies that λ_ℓ = 0 for all ℓ, so

\[ w_\ell = \lambda \frac{\partial f(z^*)}{\partial z_\ell} \]

for all ℓ and consequently

\[ \frac{w_\ell}{w_k} = \frac{\partial f(z^*)/\partial z_\ell}{\partial f(z^*)/\partial z_k}, \]

as in (25)!

5.6. Linking the PMP and the CMP

In the case of a single-output economy with production function f : R^{L−1}_+ → R_+, input price vector w ∈ R^{L−1}_{++}, and output price p > 0, the PMP becomes

max pq − w · z s.t. q ≤ f(z), z ∈ R^{L−1}_+.

The set of solutions is commonly denoted as y(p, w) and the maximal profit as π(p, w). In a solution (z, q), positivity of the output price (p > 0) implies that q = f(z); otherwise the profit can be increased:

pq − w · z < pf(z) − w · z.

Consequently, the PMP simplifies to

\[ \max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z. \]

Moreover, production has to be as cheap as possible, so there is a link with the CMP:


Proposition 5.8 Consider a production function f : R^{L−1}_+ → R_+ such that for each q ≥ 0, the set {z ∈ R^{L−1}_+ : f(z) ≥ q} is nonempty and closed. Let w ∈ R^{L−1}_{++} be the vector of input prices, p > 0 the output price. Consider the optimization problems

(P1) max_{z ∈ R^{L−1}_+} pf(z) − w · z,

(P2) max_{q ≥ 0} pq − c(w, q).

The following claims are true:

(a) For each z ∈ R^{L−1}_+, there is a q_z ≥ 0 with pf(z) − w · z ≤ pq_z − c(w, q_z).

(b) For each q ≥ 0, there is a z_q ∈ R^{L−1}_+ with pf(z_q) − w · z_q ≥ pq − c(w, q).

(c) If one of the problems (P1) and (P2) has a solution, so does the other and the corresponding maximum values coincide:

\[ \max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z = \max_{q \ge 0} p q - c(w, q). \]

Exercise 5.2 Prove Proposition 5.8.

The PMP as formulated in (P2) is particularly easy: given the cost function, the PMP reduces to a single-variable maximization problem. In practice, this is often the easiest way to solve the PMP. Under suitable differentiability assumptions, the necessary Kuhn-Tucker condition at an optimum q∗ is that there exists a Lagrange multiplier λ ≥ 0 associated with the condition −q ≤ 0 such that:

\[ p - \frac{\partial c(w, q^*)}{\partial q} = -\lambda \quad \text{and} \quad -\lambda q^* = 0. \]

Assuming q∗ > 0, this means that λ = 0 and hence that price equals marginal costs at a profit maximizing quantity. If the cost function is convex in q, this condition is also sufficient.

Example: some calculations in a single-output economy: Consider a technology using a single input to produce a single output via the production function f : R_+ → R with f(z) = √z for all z ≥ 0. The production set is

Y = {(−z, q) ∈ R^2 : q ≤ f(z), z ≥ 0} = {y ∈ R^2 : y_1 ≤ 0, y_2 ≤ √(−y_1)}.

Assume that the input price is w > 0 and the output price is p > 0. The profit maximization problem (P1) becomes

\[ \max_{z \ge 0} \; p\sqrt{z} - wz. \]

At z = 0, the profit is zero. At an interior solution z∗ > 0, the following first order condition must be satisfied:

\[ \frac{p}{2\sqrt{z^*}} - w = 0, \]

so z∗ = (p/(2w))^2, yielding output √z∗ = p/(2w) and profit p^2/(2w) − p^2/(4w) = p^2/(4w) > 0. Conclude that the supply function is

\[ y(p, w) = \left( -\left( \frac{p}{2w} \right)^2, \frac{p}{2w} \right) \in Y \tag{26} \]

and the profit function π(p, w) = p^2/(4w). The cost minimization problem for production level q is

min wz s.t. z ≥ 0, √z ≥ q.

At an optimum z∗, it is clear that √z∗ = q: no inputs are wasted. Hence the conditional factor demand is z∗ = z(w, q) = q^2 and the cost function is c(w, q) = wq^2. This allows us to rewrite the profit maximization problem as in (P2):

\[ \max_{q \ge 0} \; pq - c(w, q) = \max_{q \ge 0} \; pq - wq^2. \]

Solving this optimization problem yields an optimal output quantity q∗ = p/(2w) as in (26).
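The calculations in this example can be double-checked numerically. The following sketch uses a crude grid search for (P1) and (P2) and central finite differences for Hotelling's lemma; the prices p = 4 and w = 1, the grids and the step size are arbitrary illustrative choices.

```python
import math

# Sketch: numerical check of the example with f(z) = sqrt(z).

def supply(p, w):
    """Supply function (26): (-(p / 2w)^2, p / 2w)."""
    q = p / (2 * w)
    return (-q ** 2, q)

def profit_function(p, w):
    return p ** 2 / (4 * w)

def cost(w, q):
    return w * q ** 2

def argmax_on_grid(objective, upper, steps=20_000):
    """Best point of an evenly spaced grid on [0, upper]."""
    grid = [upper * i / steps for i in range(steps + 1)]
    return max(grid, key=objective)

def central_difference(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

p, w = 4.0, 1.0

# (P1): choose the input z; (P2): choose the output q given the cost function.
z_star = argmax_on_grid(lambda z: p * math.sqrt(z) - w * z, upper=10.0)
q_star = argmax_on_grid(lambda q: p * q - cost(w, q), upper=5.0)

# Hotelling's lemma: the gradient of the profit function is the supply vector,
# ordered here as (input coordinate, output coordinate).
dpi_dw = central_difference(lambda wage: profit_function(p, wage), w)
dpi_dp = central_difference(lambda price: profit_function(price, w), p)
```

For these prices the grid search returns z∗ = 4 and q∗ = 2, matching y(p, w) = (−4, 2), and the finite differences are close to −4 and 2, as Hotelling's lemma predicts.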

5.7. Efficiency

A production plan y ∈ Y is efficient if there is no y′ ∈ Y with y′ ≥ y and y′ ≠ y. In words, there is no different production plan producing at least as much output while using at most as much input. There is a close connection between profit maximization and efficiency:

Proposition 5.9 Consider a production set Y ⊂ R^L.

(a) If y ∈ Y maximizes profits at prices p ∈ R^L_{++}, i.e., if y ∈ y(p), then y is efficient.

(b) If Y is convex, then for every efficient y∗ ∈ Y there is a nonzero price vector p ∈ R^L_+ such that y∗ is profit maximizing at prices p.

Proof. (a): Suppose y is not efficient: there is a y′ ∈ Y with y′ ≥ y, y′ ≠ y. Then p · y′ > p · y: the profit from y′ exceeds that from the profit-maximizing y, a contradiction.

(b): Let Z = {y′ ∈ R^L : y′ > y∗}. Since y∗ is efficient: Z ∩ Y = ∅. By the separating hyperplane theorem, there is a vector p ∈ R^L, p ≠ 0 such that p · y′ ≥ p · y for all y′ ∈ Z and y ∈ Y. Two things remain to be shown:

Firstly, that p ∈ R^L_+. Suppose, to the contrary, that p_ℓ < 0 for some coordinate ℓ. Then p · y′ < p · y∗ for some y′ ∈ Z with y′_ℓ − y∗_ℓ > 0 sufficiently large. A contradiction.

Secondly, that y∗ is profit maximizing at prices p. Let y ∈ Y. To show: p · y∗ ≥ p · y. For each n ∈ N, define the vector y^n = (y∗_1 + 1/n, …, y∗_L + 1/n) ∈ Z. Then p · y^n ≥ p · y. Since y^n → y∗, it follows that also in the limit p · y∗ ≥ p · y. □

Exercise 5.3 This exercise investigates the need for the different assumptions in Proposition 5.9.

(a) Give an example of a production set $Y \subset \mathbb{R}^2$, a point $y \in Y$ and price vector $p \in \mathbb{R}^2_+$, $p \ne (0, 0)$, such that $y$ maximizes profits at prices $p$, but $y$ is not efficient.

(b) Give an example of a convex production set $Y \subset \mathbb{R}^2$ and a point $y \in Y$ which is efficient but not profit maximizing for any $p \in \mathbb{R}^2_{++}$.

(c) Give an example of a production set $Y \subset \mathbb{R}^2$ which is not convex and a point $y \in Y$ which is efficient, but not profit maximizing for any nonzero price vector $p \in \mathbb{R}^2_+$.


6. General equilibrium

6.1. What is an equilibrium?

Earlier, we studied how consumers choose optimal consumption bundles given their preferences, wealth, and the price vector and how firms choose optimal production plans given their technology and the price vector. Are there price vectors where all these optimal choices are actually feasible? You don't, for instance, want people demanding ten apples if there are only five. Such a price vector and the corresponding demand and supply constitute a Walrasian equilibrium. Its definition follows the central idea behind any economic equilibrium concept with decent micro-foundations: it is a description of

• something feasible, where

• each involved agent, taking as given those things beyond his control, makes a choice that makes him as happy as possible.

Notice, in particular, that it involves no statements like "markets clear" or "supply equals demand". Economic agents, quite frankly, couldn't care less: they have their preferences, some constraints, and all they wish for is to choose optimally. Nevertheless, some people become very nervous when one doesn't assume that markets clear ("excess demand equal to zero") in equilibrium. I want to take this concern seriously, so let me briefly explain this.

• Market clearing is an assumption about aggregate behavior that is not in line with the microeconomic idea behind equilibrium that combines feasibility with optimal behavior of individual agents; Kreps (1990, p. 6), for instance, states:

"Generally speaking, an equilibrium is a situation in which each individual agent is doing as well as it can for itself, given the array of actions taken by others and given the institutional framework that defines the options of individuals and links their actions."

• Sometimes, it is downright silly to insist on market clearing. Suppose agents in an economy are endowed with a positive quantity of a commodity that is undesirable and of no use whatsoever as an input. Why would you insist on supply and demand for this commodity being equal? What are you going to do? Stuff the good down people's throat?

Or what if agents only want to consume gloves in matching pairs? If there happen to be more left- than right-hand gloves, simply leave excess gloves to gather dust somewhere.

• Consequently, market clearing is often not a part of the definition of equilibrium. See, for instance, Arrow and Hahn (1971, p. 107), Kreps (1990, p. 190), Mas-Colell (1985, p. 169), and Varian (1992, p. 316).

• Market clearing in equilibrium, however, turns out to be a consequence of commonly imposed restrictions. You may find Exercise 6.2(c) helpful.

To illustrate the main ideas behind general equilibrium analysis, we start by studying a pure exchange economy where there is no production, but where consumers are initially endowed with certain amounts of the different goods. This entails no real loss of generality: our main tool will be to study excess demand, regardless of whether it involves producers or not. Walrasian equilibrium is defined and shown to exist in a particularly simple case. Also, we study some of its welfare properties. After introducing producers into the model, a more general existence result is provided in Section 6.4.


6.2. Pure exchange economies

A pure exchange economy is a tuple $\mathcal{E} = (\succsim_h, \omega^h)_{h \in H}$, where:

• $H$ is a nonempty, finite set of consumers/households,

and each consumer $h \in H$ has

• a weak order $\succsim_h$ over $\mathbb{R}^L_+$, where $L \in \mathbb{N}$,

• an initial endowment $\omega^h \in \mathbb{R}^L_+$ of the $L$ commodities.

The total endowment is denoted $\omega = \sum_{h \in H} \omega^h$. An allocation $x = (x^h)_{h \in H}$ assigns to each consumer $h \in H$ a commodity bundle $x^h \in \mathbb{R}^L_+$. Allocation $x$ is:

feasible if $\sum_{h \in H} x^h \le \omega$,

nonwasteful if $\sum_{h \in H} x^h = \omega$.

If the price vector is $p$, the initial endowment of consumer $h \in H$ is worth $p \cdot \omega^h$, so consumer $h$ can afford bundles $x \in \mathbb{R}^L_+$ with $p \cdot x \le p \cdot \omega^h$, i.e., consumer $h$'s budget set is $B^h(p, p \cdot \omega^h)$. Let $x^h(\cdot)$ denote this consumer's demand correspondence.

The basic idea behind equilibria (feasibility and optimal choices) leads to the following definition. A Walrasian equilibrium of a pure exchange economy $\mathcal{E} = (\succsim_h, \omega^h)_{h \in H}$ is a pair $(p, x)$, where:

• $p \in \mathbb{R}^L_+$, $p \ne 0$, is a price vector,

• $x = (x^h)_{h \in H}$ is a feasible allocation,

• for each consumer $h \in H$, $x^h$ is a most preferred bundle at prices $p$, i.e., $x^h \in x^h(p, p \cdot \omega^h)$.

Properties of Walrasian equilibrium are often studied using the excess demand correspondence $z$ assigning to each price vector $p$ the difference between total demand for and the total availability of the commodities:

$$z(p) = \sum_{h \in H} \left( x^h(p, p \cdot \omega^h) - \{\omega^h\} \right) = \sum_{h \in H} x^h(p, p \cdot \omega^h) - \{\omega\}.$$

By definition of Walrasian equilibrium, $p$ is an equilibrium price vector if and only if there is a corresponding excess demand vector $z \in z(p)$ where no commodity has positive excess demand, i.e., a $z \in z(p) \cap \mathbb{R}^L_-$.

Budget sets are homogeneous of degree zero in prices:

$$\forall h \in H,\ \forall p \in \mathbb{R}^L_+,\ \forall \alpha > 0:\quad B^h(p, p \cdot \omega^h) = B^h(\alpha p, (\alpha p) \cdot \omega^h).$$

Therefore, if $p^*$ is an equilibrium price vector, then so is $\alpha p^*$ for all $\alpha > 0$. In the computation of Walrasian equilibria, this allows some simplifications, for instance by assuming that the equilibrium price of one of the goods is equal to one, or that the sum of the prices is equal to one, i.e., they lie in the unit simplex $\Delta = \{p \in \mathbb{R}^L_+ : \sum_{\ell=1}^L p_\ell = 1\}$ (also denoted $\Delta_L$ if we want to stress the dimension of the vectors).

To illustrate the idea behind existence proofs of Walrasian equilibria, the next result makes a lot of simplifications.

Proposition 6.1 Assume that excess demand $z$:

• is a well-defined function (rather than a correspondence) $z : \Delta \to \mathbb{R}^L$,

• is continuous,

• satisfies Walras' Law: $p \cdot z(p) = 0$ for all $p \in \Delta$.

Then there is a price vector $p \in \Delta$ with $z(p) \le 0$.

Proof. The idea is to change prices by making goods in excess demand relatively more expensive and hope that demand for them goes down. If there are no more changes, there is no excess demand, and we found an equilibrium price vector. Define $f : \Delta \to \Delta$ by

$$f(p) = \left( \frac{p_i + \max\{z_i(p), 0\}}{1 + \sum_{j=1}^L \max\{z_j(p), 0\}} \right)_{i = 1, \ldots, L}.$$

Function $f$ increases the price of commodities for which excess demand is positive and then rescales the resulting price vector so that its coordinates add up to one. As the composition of continuous functions, $f$ is continuous. By Brouwer's fixed point theorem, there is a $p \in \Delta$ with $f(p) = p$. We show that $z(p) \le 0$. By Walras' Law:

$$0 = p \cdot z(p) = f(p) \cdot z(p) = \frac{1}{1 + \sum_{j=1}^L \max\{z_j(p), 0\}} \Bigl[ \underbrace{p \cdot z(p)}_{=0} + \sum_{i=1}^L \max\{z_i(p), 0\}\, z_i(p) \Bigr].$$

Therefore,

$$\sum_{i=1}^L \max\{z_i(p), 0\}\, z_i(p) = 0. \tag{27}$$

Notice:

$$\max\{z_i(p), 0\}\, z_i(p) = \begin{cases} 0 & \text{if } z_i(p) \le 0, \\ z_i(p)^2 > 0 & \text{if } z_i(p) > 0. \end{cases}$$

So (27) is the sum of nonnegative terms. The only way in which it can be zero is if all its terms are zero, i.e., if $z_i(p) \le 0$ for all $i$, as we had to show. □

The price vector $p \in \Delta$ with $z(p) \le 0$ together with the allocation $x = (x^h(p, p \cdot \omega^h))_{h \in H}$ is a Walrasian equilibrium. Using $z(p) \le 0$ and Walras' Law ($p \cdot z(p) = 0$), it follows that excess demand is zero for commodities $i$ with $p_i > 0$: a good can be in excess supply in equilibrium, but only if its price equals zero. The desired properties of excess demand are usually derived from conditions on consumer preferences, using Proposition 4.3.
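To make the price-adjustment map $f$ from the proof of Proposition 6.1 concrete, here is a minimal sketch under assumptions that are not in the notes: two consumers with identical Cobb-Douglas preferences (expenditure shares $1/2$ on each good), endowments $(1, 0)$ and $(0, 1)$, and strictly positive prices only, since Cobb-Douglas demand is undefined when a price is zero. Repeatedly applying $f$ happens to converge in this example; Brouwer's theorem only guarantees that a fixed point exists, not that iteration finds it.

```python
def excess_demand(p, shares, endowments):
    """Aggregate excess demand z(p) of Cobb-Douglas consumers at prices p > 0."""
    L = len(p)
    z = [0.0] * L
    for a, omega in zip(shares, endowments):
        wealth = sum(pi * wi for pi, wi in zip(p, omega))
        for i in range(L):
            # Cobb-Douglas demand: spend the fraction a[i] of wealth on good i.
            z[i] += a[i] * wealth / p[i] - omega[i]
    return z

def price_map(p, z):
    """The map f from the proof: raise prices of goods in excess demand, rescale."""
    gains = [max(zi, 0.0) for zi in z]
    total = 1.0 + sum(gains)
    return [(pi + gi) / total for pi, gi in zip(p, gains)]

shares = [(0.5, 0.5), (0.5, 0.5)]        # assumed expenditure shares
endowments = [(1.0, 0.0), (0.0, 1.0)]    # assumed initial endowments

p = [0.3, 0.7]                           # arbitrary interior starting prices
for _ in range(50):
    p = price_map(p, excess_demand(p, shares, endowments))
z = excess_demand(p, shares, endowments)
```

In this economy the equilibrium price vector in the simplex is $(1/2, 1/2)$, where excess demand vanishes and $f(p) = p$; Walras' Law can be checked at any interior price vector, not just at equilibrium.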

6.3. Welfare analysis

A feasible allocation $x$ is:

Pareto dominated if there is another feasible allocation $\tilde{x}$ with $\tilde{x}^h \succsim_h x^h$ for all $h \in H$ and $\tilde{x}^h \succ_h x^h$ for some $h \in H$, i.e., if all consumers are at least as well off in $\tilde{x}$ as in $x$ and at least one of them is strictly better off.

Pareto optimal if it is not Pareto dominated.

Call a nonempty collection $S \subseteq H$ of consumers a coalition. Coalition $S$ can improve upon a feasible allocation $x$ if there are commodity bundles $\tilde{x}^h$ for all $h \in S$ such that

• these bundles simply redistribute initial endowments: $\sum_{h \in S} \tilde{x}^h = \sum_{h \in S} \omega^h$,

• all members of $S$ are better off: $\tilde{x}^h \succ_h x^h$ for all $h \in S$.

The core of $\mathcal{E}$ is the set of feasible allocations that no coalition can improve upon.

The requirement that no one-agent coalition can improve upon allocation $x$ simply requires that $x^h \succsim_h \omega^h$ for all $h \in H$. This condition is often referred to as individual rationality.

Proposition 6.2 If $(p, x)$ is a Walrasian equilibrium of $\mathcal{E}$, then $x$ lies in the core.

Proof. Suppose coalition $S \subseteq H$ can improve upon $x$ via commodity bundles $(\tilde{x}^h)_{h \in S}$. Then $\tilde{x}^h \succ_h x^h$ for each $h \in S$. By definition, $x^h$ is a most preferred bundle at prices $p$, so $\tilde{x}^h$ cannot lie in the budget set $B^h(p, p \cdot \omega^h)$, i.e., $p \cdot \tilde{x}^h > p \cdot \omega^h$. Summing over all $h \in S$ gives $p \cdot \sum_{h \in S} \tilde{x}^h > p \cdot \sum_{h \in S} \omega^h$. This contradicts that $(\tilde{x}^h)_{h \in S}$ redistributes initial endowments: $\sum_{h \in S} \tilde{x}^h = \sum_{h \in S} \omega^h$. □

Under weak assumptions, Walrasian equilibrium allocations are Pareto optimal:

Proposition 6.3 First fundamental welfare theorem: If $(p, x)$ is a Walrasian equilibrium of $\mathcal{E}$ and consumers have locally nonsatiated preferences, then $x$ is Pareto optimal.

Proof. Suppose $x$ is Pareto dominated by feasible allocation $\tilde{x}$: $\tilde{x}^h \succsim_h x^h$ for all $h \in H$ and $\tilde{x}^k \succ_k x^k$ for some $k \in H$. By local nonsatiation, $p \cdot \tilde{x}^h \ge p \cdot x^h = p \cdot \omega^h$ for all $h \in H$ and $p \cdot \tilde{x}^k > p \cdot x^k = p \cdot \omega^k$. So $p \cdot \sum_{h \in H} \tilde{x}^h > p \cdot \sum_{h \in H} \omega^h$. Since prices are nonnegative, this contradicts feasibility of $\tilde{x}$: $\sum_{h \in H} \tilde{x}^h \le \sum_{h \in H} \omega^h$. □

As a partial converse to the previous result, some additional assumptions guarantee that anything that is Pareto optimal can be sustained as a Walrasian equilibrium allocation, at least if initial endowments can somehow be redistributed.

Proposition 6.4 Second fundamental welfare theorem: Assume:

• for each redistribution of initial endowments in the pure exchange economy $\mathcal{E}$, a Walrasian equilibrium exists,

• consumers have strictly convex preferences.

If $x$ is a Pareto optimal allocation, redistribute initial endowments such that $\omega^h = x^h$ for all $h \in H$. Then $x$ is a Walrasian equilibrium allocation for the resulting pure exchange economy.

Proof. By assumption, the resulting pure exchange economy has a Walrasian equilibrium $(p, \tilde{x})$. For each $h \in H$, $\tilde{x}^h$ is optimal and $x^h$ is feasible in the budget set $B^h(p, p \cdot x^h)$, so $\tilde{x}^h \succsim_h x^h$. By Pareto optimality of $x$, none of these preferences can be strict, so $\tilde{x}^h \sim_h x^h$ for all $h \in H$. To see that $\tilde{x}^h = x^h$ for all $h \in H$, suppose there is an $h \in H$ with $\tilde{x}^h \ne x^h$. Consumer $h$ can afford $(x^h + \tilde{x}^h)/2$. By strict convexity of preferences, this bundle is strictly preferred to $\tilde{x}^h$, contradicting that $\tilde{x}^h$ is an optimal bundle for the consumer in the Walrasian equilibrium. □

6.4. Private ownership economies

Let us extend the pure exchange economy by adding firms, owned by the households: each household is entitled to a share (possibly zero) of each firm's profit. Formally, a private ownership economy is a tuple

$$\mathcal{E} = \bigl( (\succsim_h, \omega^h)_{h \in H},\ (Y^f)_{f \in F},\ (\theta^{hf})_{h \in H, f \in F} \bigr),$$

where:

• $H$ is a nonempty, finite set of consumers/households, $F$ a nonempty, finite set of firms,

• each firm $f \in F$ has a production set $Y^f \subset \mathbb{R}^L$, where $L \in \mathbb{N}$,

and each consumer $h \in H$ has

• a weak order $\succsim_h$ over $\mathbb{R}^L_+$,

• an initial endowment $\omega^h \in \mathbb{R}^L_+$ of the $L$ commodities,

• a claim to a share $\theta^{hf} \in [0, 1]$ of the profit of firm $f \in F$ (where $\sum_{h \in H} \theta^{hf} = 1$ for all $f \in F$).

An allocation $(x, y) = ((x^h)_{h \in H}, (y^f)_{f \in F})$ assigns to each consumer $h \in H$ a commodity bundle $x^h \in \mathbb{R}^L_+$ and to each firm $f \in F$ a production plan $y^f \in Y^f$. Allocation $(x, y)$ is feasible if

$$\sum_{h \in H} x^h \le \sum_{h \in H} \omega^h + \sum_{f \in F} y^f.$$

If the price vector is $p$ and firms decide on production plans $(y^f)_{f \in F}$, consumer $h \in H$ has budget set

$$\Bigl\{ x \in \mathbb{R}^L_+ : p \cdot x \le p \cdot \Bigl( \omega^h + \sum_{f \in F} \theta^{hf} y^f \Bigr) \Bigr\},$$

because the initial endowment is worth $p \cdot \omega^h$ and $h$ receives share $\theta^{hf}$ of the profit $p \cdot y^f$ of firm $f \in F$.

Let $x^h(\cdot)$ denote the demand correspondence of consumer $h \in H$, $y^f(\cdot)$ the supply correspondence of firm $f \in F$, and $\pi^f(\cdot)$ its profit function. The basic idea behind equilibria (feasibility and optimal choices) leads to the following definition. A Walrasian equilibrium of a private ownership economy $\mathcal{E}$ is a triple $(p, x, y)$, where

• $p \in \mathbb{R}^L_+$, $p \ne 0$, is a price vector,

• $(x, y) = ((x^h)_{h \in H}, (y^f)_{f \in F})$ is a feasible allocation,

• for each consumer $h \in H$, $x^h$ is a most preferred bundle at prices $p$:

$$x^h \in x^h\Bigl( p,\ p \cdot \Bigl( \omega^h + \sum_{f \in F} \theta^{hf} y^f \Bigr) \Bigr),$$

• for each firm $f \in F$, $y^f$ maximizes profits at prices $p$: $y^f \in y^f(p)$ and $\pi^f(p) = p \cdot y^f$.

Once again, existence of Walrasian equilibrium is usually established by looking at the excess demand correspondence $z$ assigning to each price vector $p$ the difference between total demand for and total availability of the commodities:

$$z(p) = \sum_{h \in H} x^h\Bigl( p,\ p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \pi^f(p) \Bigr) - \sum_{f \in F} y^f(p) - \{\omega\},$$

and the interest is in finding a price vector $p$ where $z(p) \cap \mathbb{R}^L_- \ne \emptyset$. The following result (Debreu, 1959, Section 5.6) establishes existence of such a price vector; as before, one may restrict attention to prices in the unit simplex $\Delta$.

Proposition 6.5 Assume that excess demand $z$:

• achieves values in some convex, compact set $Z \subset \mathbb{R}^L$: $z(p) \subseteq Z$ for all $p \in \Delta$,

• is nonempty-valued: $z(p) \ne \emptyset$ for all $p \in \Delta$,

• is convex-valued: $z(p)$ is a convex set for all $p \in \Delta$,

• has a closed graph: $\{(p, z) \in \Delta \times Z : z \in z(p)\}$ is a closed set,

• satisfies a weak form of Walras' Law: $p \cdot z \le 0$ for all $p \in \Delta$ and all $z \in z(p)$.

Then there is a price vector $p \in \Delta$ with $z(p) \cap \mathbb{R}^L_- \ne \emptyset$.

Proof. Once again, the idea is to make goods with large excess demand expensive in the hope of decreasing it. This is achieved by maximizing, for a given excess demand vector $z$, the expression $p \cdot z$, which requires putting all weight of $p \in \Delta$ on the largest coordinate(s) of $z$. Define the correspondence $\mu$ from $Z$ to $\Delta$ by

$$\mu(z) = \{p \in \Delta : p \cdot z = \max_{p' \in \Delta} p' \cdot z\}.$$

As it maximizes a continuous function $p \mapsto p \cdot z$ over a nonempty, compact set $\Delta$, $\mu$ is nonempty-valued. Let $z \in Z$ and $p^0 \in \mu(z)$. Then $\mu(z) = \Delta \cap \{p \in \mathbb{R}^L : p \cdot z = p^0 \cdot z\}$ is the intersection of convex sets, so $\mu$ is convex-valued. A standard continuity argument shows that $\mu$ has a closed graph.

The correspondence $\varphi$ from and to $\Delta \times Z$ with

$$\varphi(p, z) = \mu(z) \times z(p)$$

is nonempty-valued, convex-valued, and has a closed graph because $\mu$ and $z$ have these properties. By Kakutani's fixed point theorem, there is a $(p, z) \in \Delta \times Z$ with $(p, z) \in \varphi(p, z) = \mu(z) \times z(p)$.

As $z \in z(p)$, the weak Walras' Law implies that $p \cdot z \le 0$. As $p \in \mu(z)$, $p \cdot z \ge p' \cdot z$ for all $p' \in \Delta$. For each $\ell \in \{1, \ldots, L\}$, taking $p' = e_\ell \in \Delta$ gives that $z_\ell = p' \cdot z \le p \cdot z \le 0$, so $z \le 0$. □
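The correspondence $\mu$ used in this proof is easy to compute by hand: maximizing the linear function $p \mapsto p \cdot z$ over the simplex puts all weight on the largest coordinates of $z$. The sketch below is only an illustration; the function names and the choice to return the uniform distribution over the maximizing coordinates (one particular element of $\mu(z)$) are assumptions, not part of the notes.

```python
def mu(z, tol=1e-12):
    """Return one element of mu(z): equal weight on the largest coordinates of z."""
    top = max(z)
    winners = [i for i, zi in enumerate(z) if top - zi <= tol]
    return [1.0 / len(winners) if i in winners else 0.0 for i in range(len(z))]

def value(p, z):
    """Inner product p . z, the 'value' of excess demand z at prices p."""
    return sum(pi * zi for pi, zi in zip(p, z))

z = [1.0, 3.0, 3.0]   # hypothetical excess demand vector
p = mu(z)             # weight only on goods 2 and 3

# Unit vectors e_1, ..., e_L are the vertices of the simplex.
vertices = [[1.0 if i == j else 0.0 for i in range(len(z))] for j in range(len(z))]
```

Comparing `value(p, z)` with the value at each vertex mirrors the last step of the proof, where $p' = e_\ell$ yields $z_\ell \le p \cdot z$.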

The trick, of course, is to derive the desired properties of the excess demand correspondence by imposing properties on the components of the private ownership economy $\mathcal{E}$. Given the results of Sections 4 and 5, most of them should not come as a surprise. Only the first is somewhat complicated: what allows us to restrict attention to such a convex, compact set $Z$? Convexity of $Z$ is not the issue: if you can find a compact set containing all the images $z(p)$, they also lie in a sufficiently large (convex) ball. Without going into details, compactness of $Z$ is established by realizing that the relevant production plans, by feasibility, must satisfy $\sum_{f \in F} y^f + \omega \ge 0$. Following the lines of Proposition 5.5, this set of attainable production plans can be shown to be compact.

Appropriate modifications of the fundamental welfare theorems continue to hold for private ownership economies. As this section was meant only as a short introduction to the topic, the interested reader is referred to Debreu (1959) for a more comprehensive treatment. Textbooks on general equilibrium theory include Hildenbrand and Kirman (1988) and Starr (1997).

6.5. Exercises

Exercise 6.1

(a) What is wrong with the following argument: "Proposition 6.2 implies Proposition 6.3: if $x$ lies in the core, the coalition $S = H$ of all consumers cannot improve upon it. So $x$ is Pareto optimal."

(b) Give an example of a pure exchange economy $\mathcal{E}$ and a Walrasian equilibrium $(p, x)$ such that $x$ lies in the core, but is not Pareto optimal.


Exercise 6.2 Market clearing: Consider a (pure exchange/private ownership) economy $\mathcal{E}$ where Walras' Law holds: $p \cdot z = 0$ for all price vectors $p$ and all $z \in z(p)$. Prove:

(a) In equilibrium, markets with a positive price "clear":

$$\forall p \in \mathbb{R}^L_+,\ \forall z \in z(p) \cap \mathbb{R}^L_-,\ \forall \ell \in \{1, \ldots, L\}: \text{ if } p_\ell > 0, \text{ then } z_\ell = 0.$$

(b) If prices are positive and $L - 1$ markets "clear", then so does the final one:

$$\forall p \in \mathbb{R}^L_{++},\ \forall z \in z(p),\ \forall \ell \in \{1, \ldots, L\}: \text{ if } z_k = 0 \text{ for all } k \ne \ell, \text{ then } z_\ell = 0.$$

Markets clear in most standard applications:

(c) Consider an equilibrium. Suppose (c1) or (c2) is true for at least one consumer $h \in H$:

(c1) $\succsim_h$ is strongly monotonic on $X = \mathbb{R}^L_+$.

(c2) $h$ has a positive amount of money to spend, $h$'s least preferred alternatives are on the axes:

$$\forall x, y \in \mathbb{R}^L_+:\quad x \in \mathbb{R}^L_{++},\ y \notin \mathbb{R}^L_{++} \ \Rightarrow\ x \succ_h y,$$

and $\succsim_h$ is strongly monotonic on $X = \mathbb{R}^L_{++}$.

Prove that all markets clear.

Cobb-Douglas preferences, for instance, satisfy the requirements in (c2), not those in (c1).

Exercise 6.3 Consider a private ownership economy $\mathcal{E}$. A feasible allocation $(x, y)$ is

Pareto dominated if there is another feasible allocation $(\tilde{x}, \tilde{y})$ with $\tilde{x}^h \succsim_h x^h$ for all $h \in H$ and $\tilde{x}^h \succ_h x^h$ for some $h \in H$.

Pareto optimal if it is not Pareto dominated.

(a) Why do you think Pareto dominance is defined in terms of consumer preferences, ignoring those of producers?

(b) Prove the First fundamental welfare theorem: If $(p, x, y)$ is a Walrasian equilibrium of $\mathcal{E}$ and consumers have locally nonsatiated preferences, then $(x, y)$ is Pareto optimal.

Exercise 6.4 Restricting attention to prices in the unit simplex $\Delta$ (to avoid trivialities), give an example of a pure exchange economy $\mathcal{E}$ with two consumers, two commodities, and

(a) no Walrasian equilibrium.

(b) exactly one Walrasian equilibrium.

(c) exactly two Walrasian equilibria.

(d) infinitely many Walrasian equilibria.

Answer the same question for a private ownership economy by adding 714 producers (yes, seven hundred and fourteen... You don't seriously believe I'd ask this if the answer weren't trivial, do you?).

Exercise 6.5 King Solomon's problem: In a well-known parable, king Solomon settles a dispute between two women, each claiming that a certain baby is hers, by suggesting to cut it in two with his sword: the true mother is revealed as she is willing to give up her child to the liar, rather than have it killed. Swords make babies divisible commodities, so consider a pure exchange economy with two consumers (the two women), one commodity (the baby). Let $x \in [0, 1]$ be a share of a baby. The true mother has utility function $u_T : [0, 1] \to \mathbb{R}$ with $u_T(x) = x$ if $x \in \{0, 1\}$ and $u_T(x) = -1$ otherwise. The liar has utility function $u_L : [0, 1] \to \mathbb{R}$ with $u_L(x) = x$. Determine for each initial allocation $(\omega^T, \omega^L) \in \{z \in \mathbb{R}^2_+ : z_1 + z_2 = 1\}$ the set of feasible allocations, the set of Pareto optimal allocations, the core, and the set of Walrasian equilibria.


7. Expected utility theory

Hitherto, we assumed that decision makers act in a world of absolute certainty; typically, however, the consequences of decisions entail some stochastic elements. This section treats the development of expected utility theory, using the axiomatic approach of von Neumann and Morgenstern.

7.1. Simple and compound gambles

We maintain the notion of preferences, but instead of assuming that a decision maker (DM) has preferences over certain outcomes, we consider preferences over lotteries or gambles, which are probability distributions over outcomes. Formally, let $A = \{a_1, \ldots, a_n\}$ be a nonempty, finite set of (deterministic) outcomes. A simple gamble assigns a probability $p_i$ to each outcome $a_i \in A$. We denote a simple gamble by

$$g = (p_1 \circ a_1, \cdots, p_n \circ a_n).$$

Probabilities should be nonnegative and add up to one, so the set of simple gambles is

$$G_1 = \Bigl\{ (p_1 \circ a_1, \cdots, p_n \circ a_n) : p_1, \ldots, p_n \ge 0,\ \sum_{i=1}^n p_i = 1 \Bigr\}. \tag{28}$$

For instance, when tossing a coin, the outcome will be heads $H$ or tails $T$, so $A = \{H, T\}$. A fair coin corresponds with the simple gamble $(\tfrac{1}{2} \circ H, \tfrac{1}{2} \circ T)$. Some notational conventions:

• one often omits outcomes with probability zero from the notation of a simple gamble: $(\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_n)$ is an abbreviation for the simple gamble

$$\bigl( \tfrac{1}{2} \circ a_1,\ 0 \circ a_2,\ \cdots,\ 0 \circ a_{n-1},\ \tfrac{1}{2} \circ a_n \bigr).$$

• one often writes $a_i$ for the simple gamble $(1 \circ a_i)$ whose outcome is $a_i$ with probability one.

Not all gambles are simple. Perhaps you decided to bet one dollar on your favorite number in a roulette game, but toss a coin to decide which of two roulette wheels you want to play in a casino: the outcome of the first gamble (the coin toss) is another gamble (the roulette game). This is an example of a compound gamble. In principle, we can have any level of compound gambles. For convenience, we will assume that a compound gamble ends in a deterministic outcome after only finitely many steps. Formally, the set of compound gambles is defined as follows. Let $G_0 = A$ and, inductively, for each $m \in \mathbb{N}$, let $G_m$ be the set of gambles whose outcomes are gambles from the lower levels $G_0, \ldots, G_{m-1}$:

$$G_m = \Bigl\{ (p_1 \circ g_1, \cdots, p_k \circ g_k) : k \in \mathbb{N},\ p_1, \ldots, p_k \ge 0,\ \sum_{i=1}^k p_i = 1, \text{ and } g_1, \ldots, g_k \in \cup_{\ell=0}^{m-1} G_\ell \Bigr\}.$$

The set of compound gambles is $G = \cup_{m=0}^{\infty} G_m$.

Associated with each compound gamble is a simple one, specifying the effective probabilities with which the outcomes in $A$ occur. For instance, suppose that $A = \{a_1, a_2\}$ and consider the compound gamble $g$ yielding $a_1$ with probability $\alpha$ and a lottery ticket with probability $1 - \alpha$.


The lottery ticket is a simple gamble yielding $a_1$ with probability $\beta$ and $a_2$ with probability $1 - \beta$. Eventually, this implies that $a_1$ occurs with probability $\alpha + (1 - \alpha)\beta$ and $a_2$ occurs with probability $(1 - \alpha)(1 - \beta)$. Thus, $g$ gives rise to the simple gamble

$$\bigl( (\alpha + (1 - \alpha)\beta) \circ a_1,\ (1 - \alpha)(1 - \beta) \circ a_2 \bigr).$$

Similarly, for every gamble $g \in G$, let $p_i$ be the effective probability assigned to $a_i \in A$ by $g$. We say that $g$ induces the simple gamble $(p_1 \circ a_1, \cdots, p_n \circ a_n) \in G_1$ or that the latter is the reduced simple gamble associated with $g$. Notice that this reduced simple gamble is unique.
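The reduction of a compound gamble to its induced simple gamble is a short recursion. The sketch below is illustrative only: representing a gamble as a list of (probability, prize) pairs, where a prize is either another such list or a deterministic outcome, is an assumption made here, as are the function name and the values $\alpha = 1/3$, $\beta = 1/4$. Exact fractions avoid rounding noise.

```python
from fractions import Fraction

def reduce_gamble(gamble):
    """Return {outcome: effective probability} for a (possibly compound) gamble.

    A gamble is a list of (probability, prize) pairs; a prize is either another
    gamble (a list) or a deterministic outcome (any non-list value).
    """
    effective = {}
    for prob, prize in gamble:
        if isinstance(prize, list):
            # Prize is itself a gamble: weight its reduced probabilities by prob.
            for outcome, q in reduce_gamble(prize).items():
                effective[outcome] = effective.get(outcome, 0) + prob * q
        else:
            effective[prize] = effective.get(prize, 0) + prob
    return effective

alpha, beta = Fraction(1, 3), Fraction(1, 4)
ticket = [(beta, "a1"), (1 - beta, "a2")]      # the lottery ticket
g = [(alpha, "a1"), (1 - alpha, ticket)]       # the compound gamble from the text
reduced = reduce_gamble(g)
```

For these values the effective probability of `"a1"` is $\alpha + (1 - \alpha)\beta = 1/2$ and that of `"a2"` is $(1 - \alpha)(1 - \beta) = 1/2$, matching the formula in the text.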

7.2. Preferences over gambles

Assume the DM has a preference relation $\succsim$ over the set $G$ of compound gambles. Impose the following properties:

(G1) $\succsim$ is a weak order.

Given the set of deterministic outcomes $A = \{a_1, \ldots, a_n\}$, every simple gamble $g \in G_1$ is fully described by its vector $(p_1, \ldots, p_n) \in \mathbb{R}^n$ of probabilities, i.e., we can interpret $G_1$ simply as the unit simplex $\Delta_n = \{p \in \mathbb{R}^n_+ : \sum_i p_i = 1\}$. And in $\mathbb{R}^n$, we know what continuity means, so we can state:

(G2) Continuity on $G_1$: The preference relation $\succsim$ restricted to $G_1$ is continuous.

Continuous weak orders have played an extensive role also in our earlier sections; the following properties explicitly exploit the specific structure of our gambling framework. Our next property requires that in considering a gamble, the DM cares only about the effective probabilities assigned to each outcome in $A$: it suffices to restrict attention to simple gambles:

(G3) Reduction to simple gambles: for each $g \in G$, if $(p_1 \circ a_1, \cdots, p_n \circ a_n)$ is the simple gamble induced by $g$, then $g \sim (p_1 \circ a_1, \cdots, p_n \circ a_n)$.

This is a strong assumption. It rules out, for instance, any preference relation that takes into account the complexity of compound gambles: a DM may strictly prefer the associated reduced simple gamble to some $g \in G_{2562}$, since it involves a much less intricate chain of events leading to eventual deterministic outcomes.

Our next property, independence, says that if we mix two gambles $g$ and $g'$ with a third one, $g''$, then the preference between the mixtures should be independent of the particular choice of the third gamble. It essentially requires some form of independence of irrelevant alternatives: in the two gambles

$$(\alpha \circ g, (1 - \alpha) \circ g'') \quad \text{and} \quad (\alpha \circ g', (1 - \alpha) \circ g''),$$

the gamble $g''$ occurs with the same probability $1 - \alpha$. According to independence, this means that the preference should depend only on the part where the two gambles are different, i.e., on gambles $g$ and $g'$.

(G4) Independence: for all $g, g', g'' \in G$ and all $\alpha \in (0, 1)$:

$$g \succsim g' \iff (\alpha \circ g, (1 - \alpha) \circ g'') \succsim (\alpha \circ g', (1 - \alpha) \circ g'').$$


These four properties have a number of intuitive consequences:

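Proposition 7.1 Assume the preference relation $\succsim$ on $G$ satisfies (G1) to (G4).

(a) There is a best element $\bar{g}$ and a worst element $\underline{g}$ in $G_1$, i.e., for all $g \in G_1$: $\bar{g} \succsim g \succsim \underline{g}$.

(b) For each $g \in G$, there is a number $\alpha_g \in [0, 1]$ such that

$$g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}).$$

(c) Substitution: let $k \in \mathbb{N}$ and let $p_1, \ldots, p_k > 0$ add up to one. Let $g_1, \ldots, g_k, h_1, \ldots, h_k \in G$ be such that $g_i \sim h_i$ for all $i = 1, \ldots, k$. Then

$$(p_1 \circ g_1, \cdots, p_k \circ g_k) \sim (p_1 \circ h_1, \cdots, p_k \circ h_k).$$

Finally, let us assume that $\bar{g} \succ \underline{g}$, to avoid trivial cases.

(d) Monotonicity: for all $\alpha, \beta \in [0, 1]$, if $\alpha > \beta$, then

$$(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}).$$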

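Proof. (a): Immediate from continuity (G2) of the weak order (G1) $\succsim$ on the compact unit simplex $\Delta_n$.

(b): Let $g \in G$ and let $g_s \in G_1$ be its reduced simple gamble. Since $g \sim g_s$ by (G3) and $\bar{g} \succsim g_s \succsim \underline{g}$, it follows from transitivity (G1) that $\bar{g} \succsim g \succsim \underline{g}$.

Let $\bar{p}, \underline{p} \in \Delta_n$ be the associated probabilities of $\bar{g}$ and $\underline{g}$. By connectedness of the set of convex combinations of these best and worst gambles in the unit simplex, Proposition 2.7 implies that there is a gamble with probabilities

$$\alpha_g \bar{p} + (1 - \alpha_g) \underline{p}$$

equivalent with $g$. By reduction to simple gambles (G3), this means

$$g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}).$$

(c): By induction on $k \in \mathbb{N}$. The claim is trivially true if $k = 1$. Let $k \in \mathbb{N}$, $k \ge 2$, and suppose the claim is true for mixtures of less than $k$ gambles. To prove the case with mixtures of $k$ gambles, notice that

$$\begin{aligned}
(p_1 \circ g_1, \cdots, p_k \circ g_k) &\sim \bigl( p_1 \circ g_1,\ (1 - p_1) \circ ( \tfrac{p_2}{1 - p_1} \circ g_2, \cdots, \tfrac{p_k}{1 - p_1} \circ g_k ) \bigr) && \text{by (G1) and (G3)} \\
&\sim \bigl( p_1 \circ h_1,\ (1 - p_1) \circ ( \tfrac{p_2}{1 - p_1} \circ h_2, \cdots, \tfrac{p_k}{1 - p_1} \circ h_k ) \bigr) && \text{by induction} \\
&\sim (p_1 \circ h_1, \cdots, p_k \circ h_k) && \text{by (G1) and (G3)}
\end{aligned}$$

so the claim holds by transitivity of $\sim$.

(d): Assume $\bar{g} \succ \underline{g}$ and let $\alpha, \beta \in [0, 1]$ satisfy $\alpha > \beta$. If $\alpha = 1$ or $\beta = 0$, the result follows easily from reduction (G3) and independence (G4), so assume that $1 > \alpha > \beta > 0$. Then

$$\begin{aligned}
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) &\succ (\alpha \circ \underline{g}, (1 - \alpha) \circ \underline{g}) && \text{by (G4)} \\
&\sim \underline{g} && \text{by (G1) and (G3).}
\end{aligned}$$

Since $\succsim$ is a weak order (G1):

$$(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ \underline{g}.$$

Denote the left gamble by $\hat{g}$. Then

$$\begin{aligned}
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) &= \hat{g} \\
&\sim \bigl( \tfrac{\beta}{\alpha} \circ \hat{g},\ (1 - \tfrac{\beta}{\alpha}) \circ \hat{g} \bigr) && \text{by (G1) and (G3)} \\
&\succ \bigl( \tfrac{\beta}{\alpha} \circ \hat{g},\ (1 - \tfrac{\beta}{\alpha}) \circ \underline{g} \bigr) && \text{by (G4)} \\
&\sim (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}) && \text{by (G1) and (G3).}
\end{aligned}$$

Since $\succsim$ is a weak order (G1):

$$(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}),$$

as we had to show. □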

7.3. von Neumann-Morgenstern utility functions

Equipped with these results, one can show that properties (G1) to (G4) imply the existence of a utility function $u : G \to \mathbb{R}$ that is linear in the effective probabilities over the outcomes. Formally, a von Neumann-Morgenstern (vNM) utility function is a function $u : G \to \mathbb{R}$ that

• represents the preference relation $\succsim$ on $G$:

$$\forall g, h \in G:\quad g \succsim h \iff u(g) \ge u(h),$$

• and does so in a way that for every gamble $g \in G$:

$$u(g) = \sum_{i=1}^n p_i u(a_i),$$

where $(p_1 \circ a_1, \cdots, p_n \circ a_n)$ is the simple gamble induced by $g$.

In words: a vNM utility function represents the preferences of the DM and the utility assigned to a gamble equals the expected utility of the induced simple gamble.
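As a small illustration of the expected utility formula, the sketch below ranks two simple gambles. Everything specific in it is hypothetical: the three outcomes, their utilities $u(a_1) = 0$, $u(a_2) = 3/5$, $u(a_3) = 1$, the dictionary representation of a simple gamble, and the function name.

```python
from fractions import Fraction as F

def expected_utility(simple_gamble, u):
    """Sum of p_i * u(a_i) over the outcomes of a simple gamble {outcome: prob}."""
    return sum(prob * u[outcome] for outcome, prob in simple_gamble.items())

u = {"a1": F(0), "a2": F(3, 5), "a3": F(1)}    # assumed utilities of outcomes

safe = {"a2": F(1)}                             # a2 with certainty
risky = {"a1": F(1, 2), "a3": F(1, 2)}          # fair coin between a1 and a3

eu_safe = expected_utility(safe, u)             # 3/5
eu_risky = expected_utility(risky, u)           # 1/2
prefers_safe = eu_safe >= eu_risky
```

With these numbers the DM weakly prefers the safe gamble, since $3/5 \ge 1/2$; by the representation property this is the same as `safe` being weakly preferred to `risky`.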

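Proposition 7.2 If $\succsim$ is a preference relation over $G$ satisfying (G1) to (G4), there exists a vNM utility function representing $\succsim$.

Proof. By Proposition 7.1(a), there exists a best gamble $\bar{g}$ and a worst gamble $\underline{g}$ in $G_1$. In the trivial case where $\bar{g} \sim \underline{g}$, any constant function is a vNM utility function. So assume, w.l.o.g., that $\bar{g} \succ \underline{g}$.

For each $g \in G$, Proposition 7.1 implies the existence of a unique number $\alpha_g \in [0, 1]$ such that $g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g})$: existence follows from part (b), uniqueness from monotonicity (d). Define

$$u(g) = \alpha_g. \tag{29}$$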


This utility function represents $\succsim$: let $g, h \in G$. Then

$$\begin{aligned}
g \succsim h &\iff (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}) \succsim (\alpha_h \circ \bar{g}, (1 - \alpha_h) \circ \underline{g}) \\
&\iff u(g) = \alpha_g \ge \alpha_h = u(h),
\end{aligned}$$

where the first equivalence follows from transitivity (G1) of $\succsim$ and the second equivalence from monotonicity and the definition of $u$.

To obtain the expected utility expression, let $g \in G$ and let $g_s = (p_1 \circ a_1, \cdots, p_n \circ a_n)$ be the simple gamble induced by $g$. By (G3), $g \sim g_s$, so $u(g) = u(g_s)$. For each $a_i \in A$, we know from Proposition 7.1 and the definition of $u(a_i)$ that

$$a_i \sim (u(a_i) \circ \bar{g}, (1 - u(a_i)) \circ \underline{g}).$$

For each $i = 1, \ldots, n$, define $h_i = (u(a_i) \circ \bar{g}, (1 - u(a_i)) \circ \underline{g})$. By substitution:

$$g_s = (p_1 \circ a_1, \cdots, p_n \circ a_n) \sim (p_1 \circ h_1, \cdots, p_n \circ h_n).$$

Notice that $h_1, \ldots, h_n$ are gambles over the best and worst gambles only. By computing the probability for the best gamble $\bar{g}$ and using reduction to simple gambles (G3), one finds that $(p_1 \circ h_1, \cdots, p_n \circ h_n)$ is equivalent with

$$\Bigl( \Bigl( \sum_{i=1}^n p_i u(a_i) \Bigr) \circ \bar{g},\ \Bigl( 1 - \sum_{i=1}^n p_i u(a_i) \Bigr) \circ \underline{g} \Bigr).$$

Combining the above with transitivity of $\succsim$ we find:

$$g \sim g_s \sim (p_1 \circ h_1, \cdots, p_n \circ h_n) \sim \Bigl( \Bigl( \sum_{i=1}^n p_i u(a_i) \Bigr) \circ \bar{g},\ \Bigl( 1 - \sum_{i=1}^n p_i u(a_i) \Bigr) \circ \underline{g} \Bigr). \tag{30}$$

By definition, $u(g)$ is the unique number in $[0, 1]$ satisfying

$$g \sim (u(g) \circ \bar{g}, (1 - u(g)) \circ \underline{g}).$$

Combining this with (30) yields $u(g) = \sum_{i=1}^n p_i u(a_i)$. □

Remark 7.3 Conversely, it is straightforward to verify that if a preference relation $\succsim$ on $G$ can be represented by a vNM utility function, it must satisfy properties (G1) to (G4).

The linearity requirement on vNM utility implies that the earlier result from utility theory (any strictly increasing transformation of the utility function of the consumer still represents the same preferences) no longer holds. Indeed, the only transformations of a vNM utility function that remain vNM utility functions are positive affine transformations:

Proposition 7.4 Consider the vNM utility function $u : G \to \mathbb{R}$ defined in (29). For all $a, b \in \mathbb{R}$ with $a > 0$, also $au + b$ is a vNM utility function representing $\succsim$. Conversely, if $v : G \to \mathbb{R}$ is a vNM utility function representing $\succsim$ on $G$, there exist $a, b \in \mathbb{R}$ with $a > 0$ such that $v = au + b$.


Proof. To avoid trivialities, assume that $\bar{g} \succ \underline{g}$. The first claim is simple. To establish the second claim, let $a > 0$ and $b$ be the unique solution (do you understand why a solution exists and why it is unique?) to

$$v(\bar{g}) = a u(\bar{g}) + b,$$

$$v(\underline{g}) = a u(\underline{g}) + b.$$

Let $g \in G$. By construction, see (29), $g \sim (u(g) \circ \bar{g}, (1 - u(g)) \circ \underline{g})$, so

$$u(g) = u(g) u(\bar{g}) + (1 - u(g)) u(\underline{g}), \tag{31}$$

and, similarly,

$$\begin{aligned}
v(g) &= u(g) v(\bar{g}) + (1 - u(g)) v(\underline{g}) \\
&= u(g) [a u(\bar{g}) + b] + (1 - u(g)) [a u(\underline{g}) + b] \\
&= a [u(g) u(\bar{g}) + (1 - u(g)) u(\underline{g})] + b \\
&= a u(g) + b,
\end{aligned}$$

where the last equation follows from (31). □
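The role of positive affine transformations can be illustrated numerically. In the sketch below all specifics are hypothetical (the outcomes, the utilities, the constants $a = 2$, $b = 3$, and squaring as an example of a strictly increasing but non-affine transformation). It checks that $au + b$ yields expected utilities $a \cdot EU + b$ and hence the same ranking, whereas squaring the outcome utilities reverses the ranking of these two gambles.

```python
from fractions import Fraction as F

def expected_utility(simple_gamble, u):
    """Expected utility of a simple gamble given as {outcome: probability}."""
    return sum(prob * u[outcome] for outcome, prob in simple_gamble.items())

u = {"a1": F(0), "a2": F(3, 5), "a3": F(1)}     # assumed vNM utilities
a, b = F(2), F(3)                                # positive affine transformation
v = {x: a * ux + b for x, ux in u.items()}       # v = a*u + b
w = {x: ux ** 2 for x, ux in u.items()}          # increasing on [0, 1], not affine

safe = {"a2": F(1)}
risky = {"a1": F(1, 2), "a3": F(1, 2)}

same_ranking = (expected_utility(safe, u) > expected_utility(risky, u)) == (
    expected_utility(safe, v) > expected_utility(risky, v)
)
reversed_by_w = expected_utility(safe, w) < expected_utility(risky, w)
```

Here `safe` beats `risky` under both $u$ and $v$, but under the squared utilities the comparison flips ($9/25 < 1/2$), so taking expectations of $w$ does not represent the same preferences over gambles.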

Our development of vNM utilities involved a finite set $A$ of deterministic outcomes and compound gambles of finite length. These assumptions can be relaxed, but at the cost of increased topological and measure-theoretic complexity.

7.4. Exercises

Exercise 7.1 Throughout this exercise, let $G = \cup_{n=0}^{\infty} G_n$ be the set of compound gambles over a finite set $\{a_1, \ldots, a_k\} \subset \mathbb{R}$ of $k \ge 2$ different deterministic outcomes. Recall: $G_n$ is the set of $n$-th level gambles. For each of the preference relations $\succsim$ over $G$ defined below, answer the following questions:

• If possible, find the best and the worst elements of $G$.

• For each of the four properties (G1) to (G4) guaranteeing the existence of a vNM utility function, check whether $\succsim$ satisfies it.

• If (G1) to (G4) are satisfied, find a vNM utility function representing $\succsim$.

(a) Most likely outcomes: A decision maker bases preferences on the average of the deterministic outcomes that are most likely to occur. Let $g \in G$ and let $(p_1 \circ a_1, \cdots, p_k \circ a_k)$ be its induced simple gamble. Let

$$L(g) = \{a_i : p_i \ge p_j \text{ for all } j = 1, \ldots, k\}$$

be the set of most likely deterministic outcomes and $|L(g)|$ its number of elements. The preference relation $\succsim$ on $G$ is defined as follows: for all $g, h \in G$:

$$g \succsim h \iff \frac{1}{|L(g)|} \sum_{a_i \in L(g)} a_i \ge \frac{1}{|L(h)|} \sum_{a_i \in L(h)} a_i.$$

(b) Keeping it simple: A decision maker dislikes complex alternatives and has preferences $\succsim$ over $G$ represented by the following utility function: for each $g \in G$, there is a unique $n$ with $g \in G_n$. Let $(p_1 \circ a_1, \cdots, p_k \circ a_k)$ be its induced simple gamble. Then $u(g) = \sum_{m=1}^k p_m a_m - n$.

(c) Satisficing: A decision maker is content with all deterministic outcomes larger than 5. The preference relation $\succsim$ on $G$ is represented by the following utility function: for each $g \in G$, let $(p_1 \circ a_1, \cdots, p_k \circ a_k)$ be its induced simple gamble. Then $u(g) = \sum_{i : a_i > 5} p_i$.


8. Risk attitudes

8.1. In for a gamble?

Let us confine attention to cases where the outcomes of the gambles are amounts of money: A is a convex set in R. Despite the fact that we now allow an infinite set of outcomes, we will assume that every gamble assigns positive probability to only finitely many outcomes. The existence theorem of vNM utility functions can be adjusted to this case by modifying the properties (G1) to (G4) to infinite sets. We assume that the vNM utility function u is increasing in money and investigate the relation between this function and the DM's attitude towards risk.

Consider a nontrivial (i.e., at least two different deterministic outcomes have positive probability) simple gamble $g = (p_1 \circ w_1, \ldots, p_n \circ w_n)$ and suppose the DM is offered two scenarios:

1. Accept the gamble; this yields utility $u(g) = \sum_{i=1}^{n} p_i u(w_i)$.

2. Accept the outcome that gives the expected value of the gamble with certainty (this is where we need convexity of A!). The expected value of the gamble is equal to $E(g) = \sum_{i=1}^{n} p_i w_i$. This alternative has utility $u(E(g)) = u\left(\sum_{i=1}^{n} p_i w_i\right)$.

The DM is said to be:

• risk averse at g if u(g) < u(E(g)),

• risk neutral at g if u(g) = u(E(g)),

• risk loving at g if u(g) > u(E(g)).

The DM is said to be risk averse (on G) if he is risk averse at every nontrivial simple gamble g over outcomes in A. Risk neutral and risk loving behavior are defined analogously. These risk attitudes directly translate to properties of the associated vNM utility function over money:

Proposition 8.1 Let A ⊆ R be nonempty and convex. Assume the DM has a vNM utility function u. Then the DM is:

(a) risk averse if and only if u is strictly concave on A,

(b) risk neutral if and only if u is linear on A,

(c) risk loving if and only if u is strictly convex on A.

Proof. We only prove the first claim; the others are similar. Risk aversion means that for every nontrivial gamble $g = (p_1 \circ w_1, \ldots, p_n \circ w_n)$,

\[ u(p_1 \circ w_1, \ldots, p_n \circ w_n) = \sum_{i=1}^{n} p_i u(w_i) < u(E(g)) = u\left(\sum_{i=1}^{n} p_i w_i\right). \]

But this is equivalent to strict concavity: by induction it follows that the function u is strictly concave on A if and only if for all different $w_1, \ldots, w_n \in A$ and all $p_1, \ldots, p_n > 0$ with $\sum_{i=1}^{n} p_i = 1$: $\sum_{i=1}^{n} p_i u(w_i) < u\left(\sum_{i=1}^{n} p_i w_i\right)$. □

Although we can always check whether a DM is risk averse/neutral/loving at a specific gamble g, he does not have to be risk averse/neutral/loving over the entire collection of lotteries. It may well be, for instance, that he is risk averse at high-stake lotteries and risk loving at low-stake lotteries.
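The check "at a specific gamble" is mechanical: compare the expected utility of g with the utility of its expected value. A minimal sketch follows; the function name `risk_attitude`, the tolerance, and the three example utility functions are assumptions chosen for illustration.

```python
import math

def risk_attitude(gamble, u, tol=1e-12):
    """Classify the DM's attitude at a simple gamble of (probability, wealth) pairs."""
    expected_utility = sum(p * u(w) for p, w in gamble)
    expected_value = sum(p * w for p, w in gamble)
    diff = expected_utility - u(expected_value)
    if diff < -tol:
        return "risk averse"
    if diff > tol:
        return "risk loving"
    return "risk neutral"

# A hypothetical nontrivial gamble: 50-50 odds of ending with 50 or 150.
g = [(0.5, 50.0), (0.5, 150.0)]

print(risk_attitude(g, math.log))               # strictly concave utility
print(risk_attitude(g, lambda w: 2.0 * w + 1))  # affine utility
print(risk_attitude(g, lambda w: w * w))        # strictly convex utility
```

The tolerance only guards against floating-point noise in the neutral case; it is not part of the definition.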


8.2. Certainty equivalent and risk premium

The certainty equivalent of a simple gamble g is an amount of money CE(g) offered with certainty such that the DM is indifferent between the gamble g and accepting CE(g):

u(g) = u(CE(g)).

Remark 8.2 For topologists (can be omitted): generalizing the continuity requirement G2 to the case of an infinite set A ⊆ R of deterministic outcomes entails in particular that preferences on A are continuous. So for each simple gamble g, there is a $\overline{w} \in A$ (say, weight one on the best deterministic outcome in g) with $\overline{w} \succsim g$ and a $\underline{w} \in A$ with $g \succsim \underline{w}$. By the Intermediate value theorem for preferences, Proposition 2.7, there is a CE(g) ∈ A with g ∼ CE(g). By monotonicity of preferences in money, CE(g) is unique: the certainty equivalent is a well-defined notion.

The risk premium of a simple gamble g is an amount of money P (g) such that u(g) = u(E(g)−P (g)). Clearly,

P (g) = E(g)− CE(g).

Intuitively, a risk averse DM prefers E(g) with certainty over the gamble g. But there will be some amount that makes him indifferent between accepting that amount with certainty and accepting the gamble g. This amount is called the certainty equivalent. It is easy to show (see below) that for a risk averse DM who strictly prefers more money to less, the certainty equivalent is less than the expected value E(g) of the gamble: a risk averse person is willing to pay a positive amount of money to avoid the gamble's inherent risk. This willingness to pay is the risk premium.

Proposition 8.3 Consider a DM with vNM utility function u which is increasing in wealth. The following three statements are equivalent:

1. DM is risk averse,

2. CE(g) < E(g) for all nontrivial gambles g ∈ S,

3. P (g) > 0 for all nontrivial gambles g ∈ S.

Proof. Since P(g) = E(g) − CE(g), statements 2 and 3 are equivalent, so it suffices to show that statements 1 and 2 are equivalent. The DM is risk averse if and only if for every nontrivial g ∈ S, u(g) < u(E(g)). By definition of CE(g), this is equivalent to u(CE(g)) < u(E(g)), which is equivalent to CE(g) < E(g), since u is increasing. □

As a simple exercise, try to formulate similar characterizations of risk neutral and risk loving behavior.

Example. Take $A = \mathbb{R}_{++}$ and assume that u(w) = ln(w) for all w ∈ A. This DM is risk averse, since u is strictly concave. Assume DM's initial wealth is $w_0$ and DM faces a gamble g offering 50-50 odds of winning or losing an amount $h \in (0, w_0)$:

g = ((1/2) ◦ (w0 − h) , (1/2) ◦ (w0 + h)).


Hence $E(g) = \tfrac{1}{2}(w_0 - h) + \tfrac{1}{2}(w_0 + h) = w_0$. The certainty equivalent CE(g) must satisfy

\[ u(CE(g)) = u(g) = \tfrac{1}{2}\ln(w_0 - h) + \tfrac{1}{2}\ln(w_0 + h) = \ln\sqrt{w_0^2 - h^2}, \]

where the final equation follows from the properties of the natural logarithm. Hence $CE(g) = \sqrt{w_0^2 - h^2} < w_0 = E(g)$ and $P(g) = w_0 - \sqrt{w_0^2 - h^2} > 0$.
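The example can be checked numerically. The sketch below computes the certainty equivalent by inverting u on the expected utility, which presumes that u is strictly increasing with a known inverse; the helper names and the numbers $w_0 = 100$, h = 60 are assumptions chosen so that the closed form gives round values.

```python
import math

def certainty_equivalent(gamble, u, u_inv):
    """Amount of money whose utility equals the expected utility of the gamble."""
    return u_inv(sum(p * u(w) for p, w in gamble))

def risk_premium(gamble, u, u_inv):
    """Expected value of the gamble minus its certainty equivalent."""
    expected_value = sum(p * w for p, w in gamble)
    return expected_value - certainty_equivalent(gamble, u, u_inv)

# Hypothetical numbers: initial wealth 100, stake 60, logarithmic utility.
w0, h = 100.0, 60.0
g = [(0.5, w0 - h), (0.5, w0 + h)]

ce = certainty_equivalent(g, math.log, math.exp)
premium = risk_premium(g, math.log, math.exp)
print(ce, premium)  # compare with sqrt(w0**2 - h**2) and w0 - sqrt(w0**2 - h**2)
```

With these numbers the closed form gives $\sqrt{100^2 - 60^2} = 80$, so the DM would give up roughly 20 to avoid the gamble.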

8.3. Arrow-Pratt measure of absolute risk aversion

Arrow and Pratt considered the problem of measuring the extent of risk aversion. They assumed that the vNM utility function u is an increasing, strictly concave function of wealth levels that is twice differentiable. In particular, they assume:

∀w : u′(w) > 0 and u′′(w) < 0. (32)

Using this, the Arrow-Pratt measure of absolute risk aversion at wealth w is defined as

\[ R_a(w) = -\frac{u''(w)}{u'(w)}. \]

Why is this a sensible measure of risk aversion? A heuristic derivation is provided in the next subsection. The intuition is as follows: the more risk averse a DM is, the more he is willing to pay to avoid certain gambles. Thus, the size of the risk premium in some way measures risk aversion. It turns out that the Arrow-Pratt measure of absolute risk aversion is roughly proportional to the risk premium the DM is willing to pay to avoid actuarially fair bets (a bet is actuarially fair if its expected value equals initial wealth: the expected loss/gain is zero). Thus, if DM 1 is more risk averse than DM 2, his risk premium for every nontrivial gamble exceeds that of DM 2, so the same should hold (due to proportionality) for the Arrow-Pratt measures of absolute risk aversion. The actual proof is somewhat more complicated; we omit it.

Proposition 8.4 Consider two DMs with vNM utility functions u and v respectively, both satisfying (32). The following two claims are equivalent:

1. $R_a^1(w) = -\frac{u''(w)}{u'(w)} > -\frac{v''(w)}{v'(w)} = R_a^2(w)$ for all wealth levels w,

2. The risk premium $P^1(g)$ of the DM with utility function u is strictly larger than the risk premium $P^2(g)$ of the DM with utility function v for every nontrivial gamble g ∈ S.

Notice that positive affine transformations of the utility functions do not affect $R_a(w)$: it does not depend on the choice of vNM utility function.

It is common in the literature on, for instance, portfolio choice to assume that risk aversion decreases with wealth. This is the DARA assumption (Decreasing Absolute Risk Aversion):

Ra(·) is a decreasing function of w.
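As a worked illustration (an added example, reusing the logarithmic utility from the example in Section 8.2), the measure can be computed in closed form. The second line spells out why a positive affine transformation v = au + b with a > 0 leaves it unchanged.

```latex
\begin{align*}
u(w) = \ln w: \quad & u'(w) = \frac{1}{w}, \quad u''(w) = -\frac{1}{w^2}, \quad
  R_a(w) = -\frac{u''(w)}{u'(w)} = \frac{1}{w}, \\
v = au + b,\ a > 0: \quad & -\frac{v''(w)}{v'(w)} = -\frac{a\,u''(w)}{a\,u'(w)} = -\frac{u''(w)}{u'(w)} = R_a(w).
\end{align*}
```

Since 1/w is decreasing in w, logarithmic utility satisfies the DARA assumption.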


8.4. A derivation of the Arrow-Pratt measure

The argument in this section is due to Pratt (1964). Assume (32) and let the DM's initial wealth be $w_0$. Consider the gamble with 50-50 odds of winning or losing an amount h:

\[ g = \left(\tfrac{1}{2} \circ (w_0 - h),\ \tfrac{1}{2} \circ (w_0 + h)\right). \]

The gamble is fair: $E(g) = w_0$. Let P = P(g) > 0 be the risk premium of g:

\[ u(g) = \tfrac{1}{2}u(w_0 - h) + \tfrac{1}{2}u(w_0 + h) = u(E(g) - P) = u(w_0 - P). \tag{33} \]

Take a first order Taylor approximation of $u(w_0 - P)$ around $w_0$:

\[ u(w_0 - P) \approx u(w_0) - u'(w_0)P. \tag{34} \]

Take a second order Taylor approximation of $u(w_0 - h)$ and $u(w_0 + h)$ around $w_0$:

\[ u(w_0 - h) \approx u(w_0) - u'(w_0)h + \tfrac{1}{2}u''(w_0)h^2, \]

\[ u(w_0 + h) \approx u(w_0) + u'(w_0)h + \tfrac{1}{2}u''(w_0)h^2. \]

Consequently,

\[ \tfrac{1}{2}u(w_0 - h) + \tfrac{1}{2}u(w_0 + h) \approx u(w_0) + \tfrac{1}{2}u''(w_0)h^2. \tag{35} \]

Using (33), (34), and (35), it follows that

\[ u(w_0) + \tfrac{1}{2}u''(w_0)h^2 \approx u(w_0) - u'(w_0)P. \]

Rearranging terms, one finds

\[ P \approx \tfrac{1}{2}h^2 \cdot \left(-\frac{u''(w_0)}{u'(w_0)}\right). \]

Conclude that the Arrow-Pratt measure of absolute risk aversion is approximately proportional to the risk premium P, the willingness to pay in order to avoid the 50-50 odds of winning or losing an amount h.
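How good is the approximation? For logarithmic utility both sides are available in closed form: the exact premium is $w_0 - \sqrt{w_0^2 - h^2}$ (from the example in Section 8.2) and the approximation is $\tfrac{1}{2}h^2 \cdot (1/w_0)$. The sketch below compares them; the choice of utility and the numbers are assumptions for illustration only.

```python
import math

def exact_premium_log(w0, h):
    """Exact risk premium of the fair 50-50 gamble of size h under u = ln."""
    return w0 - math.sqrt(w0 * w0 - h * h)

def approx_premium_log(w0, h):
    """Approximation (1/2) h^2 R_a(w0), with R_a(w0) = 1 / w0 for u = ln."""
    return 0.5 * h * h / w0

for stake in (1.0, 10.0, 60.0):
    print(stake, exact_premium_log(100.0, stake), approx_premium_log(100.0, stake))
```

As one would expect from a Taylor argument, the two numbers nearly coincide for small stakes and drift apart as h grows relative to $w_0$.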


9. Some critique on expected utility theory

Expected utility theory is the main tool in economic models involving uncertainty. Nevertheless, expected utility theory has been under constant attack from behavioral economists and psychologists who show that subjects in experiments or real-life situations systematically violate the properties (G1) to (G4) or that mindless application of expected utility theory leads to counterintuitive conclusions. For this reason, many alternative models for decision making under risk and uncertainty have been developed. Perhaps the most well-known (especially since Daniel Kahneman was awarded the 2002 Nobel Prize in economics) is Kahneman and Tversky's prospect theory (Kahneman and Tversky, 1979). Although we lack time to go into such alternative models, we pause for a while and consider a number of blows to the expected utility model.

9.1. Problems with unbounded utility: a variant of the St. Petersburg paradox

Nothing in the development of our expected utility model required the utility function to be bounded. Unbounded utility functions, however, make decision-makers susceptible to cunning exploitation. Suppose a DM with initial wealth $w_0 > 0$ has a vNM utility function u over money which is not bounded from above.

By assumption, there is some wealth $w_1$ with $u(w_0) < \tfrac{1}{2}(u(0) + u(w_1))$. Smile and offer your victim the gamble $(\tfrac{1}{2} \circ 0,\ \tfrac{1}{2} \circ w_1)$, which he will accept by construction.

If he loses, he ends up with wealth zero. If he wins, hand him $w_1$ and just before he takes the money from your hand, retract it, turn your smile back on, and offer him a gamble $(\tfrac{1}{2} \circ 0,\ \tfrac{1}{2} \circ w_2)$, where $w_2$ is chosen such that $u(w_1) < \tfrac{1}{2}(u(0) + u(w_2))$. Again, by construction, the DM will accept.

As long as the DM goes on winning, keep offering such 50-50 odds gambles... The DM will end up with wealth zero with probability one!
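A small sketch makes the scheme concrete for one particular unbounded utility. Everything specific here is an assumption for illustration: the utility u(w) = √w, the rule $w_{n+1} = 4w_n + 1$ for choosing the next stake (which works because $\sqrt{4w + 1} > 2\sqrt{w}$), and the function names.

```python
import math

def accepts(w_now, w_next, u=math.sqrt):
    """True if the DM strictly prefers the 50-50 gamble over 0 and w_next to keeping w_now."""
    return u(w_now) < 0.5 * (u(0.0) + u(w_next))

def next_stake(w_now):
    """A hypothetical stake that is just tempting enough when u = sqrt."""
    return 4.0 * w_now + 1.0

def ruin_probability(rounds):
    """Probability of having lost at least once in the first `rounds` gambles."""
    return 1.0 - 0.5 ** rounds

wealth = 10.0
for round_number in range(1, 6):
    offer = next_stake(wealth)
    print(round_number, offer, accepts(wealth, offer), ruin_probability(round_number))
    wealth = offer  # the DM keeps playing only as long as he keeps winning
```

The DM accepts every offer, while the probability of still holding positive wealth after n rounds is $2^{-n}$, which vanishes as n grows.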

9.2. Allais' paradox

Consider the following four simple gambles:

g1 = (1 ◦ $1,000,000),

g2 = ((0.10) ◦ ($5,000,000), (0.89) ◦ ($1,000,000), (0.01) ◦ ($0)),

g3 = ((0.11) ◦ ($1,000,000), (0.89) ◦ ($0)),

g4 = ((0.10) ◦ ($5,000,000), (0.90) ◦ ($0)).

It turns out that in different experiments, most people prefer g1 to g2, but g4 to g3. This violates expected utility theory. Suppose a DM has vNM utility function u. Then

g1 ≻ g2 ⇔ u($1,000,000) > 0.10 u($5,000,000) + 0.89 u($1,000,000) + 0.01 u($0).

Rearranging terms, we find

g1 ≻ g2 ⇔ 0.11 u($1,000,000) > 0.10 u($5,000,000) + 0.01 u($0)
        ⇔ 0.11 u($1,000,000) + 0.89 u($0) > 0.10 u($5,000,000) + 0.90 u($0)
        ⇔ g3 ≻ g4,

where the last equivalence follows from computing the expected utility of g3 and g4.
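The rearrangement says that, for every utility function, the expected-utility gap between g1 and g2 is exactly the gap between g3 and g4. A minimal sketch with exact rational arithmetic illustrates this; the utility values in `example_utilities` are arbitrary assumptions, not data.

```python
from fractions import Fraction as F

ZERO, ONE_M, FIVE_M = 0, 1_000_000, 5_000_000

g1 = [(F(1), ONE_M)]
g2 = [(F(10, 100), FIVE_M), (F(89, 100), ONE_M), (F(1, 100), ZERO)]
g3 = [(F(11, 100), ONE_M), (F(89, 100), ZERO)]
g4 = [(F(10, 100), FIVE_M), (F(90, 100), ZERO)]

def expected_utility(gamble, u):
    """Expected utility of a simple gamble; u maps each outcome to its utility."""
    return sum(p * u[x] for p, x in gamble)

def allais_gaps(u):
    """Return (EU(g1) - EU(g2), EU(g3) - EU(g4)) for the utility assignment u."""
    return (expected_utility(g1, u) - expected_utility(g2, u),
            expected_utility(g3, u) - expected_utility(g4, u))

# Arbitrary increasing utility assignments (illustrative assumptions).
example_utilities = [
    {ZERO: F(0), ONE_M: F(10), FIVE_M: F(12)},
    {ZERO: F(0), ONE_M: F(10), FIVE_M: F(11)},
    {ZERO: F(-3), ONE_M: F(1), FIVE_M: F(2)},
]
for u in example_utilities:
    print(allais_gaps(u))
```

Both gaps always have the same sign, so no expected utility maximizer can strictly prefer g1 to g2 and also g4 to g3.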


9.3. Probability matching

You are paid $1 each time you guess correctly whether a red or a green light will flash. The lights flash randomly, but the red is set to turn on three times as often as the green. It has been found that many subjects in experiments of this type try to imitate the chance mechanism: they choose red about three quarters of the time and green one quarter. Obviously it would be more profitable to always choose red. Formally, the compound lottery of choosing red with probability 3/4 gives you a one dollar payoff with probability (3/4)^2 + (1/4)^2 = 10/16, corresponding with the simple gamble

((10/16) ◦ $1, (6/16) ◦ $0),

while choosing red with probability one corresponds with the simple gamble

((3/4) ◦ $1, (1/4) ◦ $0).

Since 3/4 > 10/16, the second gamble should be strictly preferred over the first.

This type of matching behavior has been frequently observed in real life, as well as laboratory experiments, using both humans and animals as subjects. In an experiment with animals, for instance, foraging behavior of pigeons was studied, using two food patches (call them red and green, as above) with food being dispatched at the red location three quarters of the time and at the green location one quarter of the time. The pigeons tried to match this probability distribution.
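The comparison between matching and always choosing red is a one-line computation: if red flashes with probability p and the subject independently guesses red with probability q, the guess is correct with probability qp + (1 − q)(1 − p). A minimal sketch with exact fractions (the function name is an assumption):

```python
from fractions import Fraction

def success_probability(q_red, p_red=Fraction(3, 4)):
    """Probability of a correct guess when red flashes w.p. p_red and is guessed w.p. q_red."""
    return q_red * p_red + (1 - q_red) * (1 - p_red)

matching = success_probability(Fraction(3, 4))
always_red = success_probability(Fraction(1))
print(matching, always_red)
```

Because the expression is linear in q with slope 2p − 1 > 0 whenever p > 1/2, the success probability is maximized at q = 1.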

A small personal anecdote: jointly with two colleagues, I published two papers on a game theoretic model of bounded rationality in which players are assumed to display matching behavior. To explain the type of behavior to laymen and motivate that it is observed in real life, we used different examples, among them the pigeon example mentioned above. This led the Dutch Foundation for Mathematical Research, which at that time was financing my work, to publish a press statement proudly proclaiming: "People behave like pigeons when dealing with probability", a press statement that gave us extensive media coverage but where we desperately tried to qualify our employers' overzealous interpretation. So in case you sometimes wonder what you are doing... you may just be behaving like a pigeon!

9.4. Rabin's calibration theorem

Matthew Rabin, one of the world's leading behavioral economists, published a remarkable article (Rabin, 2000) on the consequences of risk aversion with respect to small-stake gambles. Let us start with an example to illustrate the result. Consider a risk averse DM who for each initial wealth level rejects a 50-50 odds gamble of winning 11 dollars or losing 10 dollars: certainly a rather unremarkable level of risk aversion. What does this imply about his preferences for other gambles? Consider, for instance, the following statements:

1. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 150 dollars.

2. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,500 dollars.

3. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,000,000 dollars.


4. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,000,000,000,000,000,000 dollars.

5. You can proceed with gains G as high as you want, but the DM will always reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining G.

Which of these statements are true? The first and the second may perhaps not be so surprising and I probably wouldn't be asking you if the question was trivial, so even the third could be true. On the other hand, one would certainly doubt the sanity of a DM rejecting the bet in the fourth claim and lingering doubt turns to certainty in the fifth case. Yet this is exactly what the DM will do: no amount of money in the world will make him accept a gamble with a 50 percent chance of losing 100 dollars. Clearly, such behavior is absurd.

Let us try to establish some intuition. The fact that the DM at each wealth level w rejects the gamble $\left(\tfrac{1}{2} \circ (w - 10),\ \tfrac{1}{2} \circ (w + 11)\right)$ implies that

\[ \forall w: \tfrac{1}{2}u(w - 10) + \tfrac{1}{2}u(w + 11) < u(w), \]

or, rewriting the expression, that

\[ \forall w: u(w + 11) - u(w) < u(w) - u(w - 10). \]

Hence, on average, the DM values each dollar between w and w + 11 by at most 10/11 times as much as he, on average, values each dollar between w − 10 and w:

\[ \forall w: \frac{u(w + 11) - u(w)}{11} < \frac{10}{11} \cdot \frac{u(w) - u(w - 10)}{10}. \]

By concavity of the utility function, this means that the marginal utility of the (w + 11)-th dollar is at most 10/11 times the marginal utility of the (w − 10)-th dollar:

\[ \forall w: u'(w + 11) < \frac{10}{11}\,u'(w - 10). \tag{36} \]

Repeated application of (36) implies an enormous decrease in marginal utility of money: the marginal utility of dollar w + 32 is at most 10/11 times the marginal utility of dollar w + 11, which is at most 10/11 times the marginal utility of dollar w − 10, so the marginal utility of dollar w + 32 is at most $(10/11)^2 \approx 0.83$ times the value of dollar w − 10. Similarly, the DM values dollar w + 53 by at most $(10/11)^3 \approx 0.75$ times the value of dollar w − 10. More generally, the DM values dollar w + 11 + 21k, where k ∈ N, by at most $(10/11)^{k+1}$ times the value of dollar w − 10, which is an extremely high rate of deterioration for the value of money.
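To get a feeling for how fast this bound shrinks, one can simply tabulate $(10/11)^{k+1}$. The sketch below does so; the function name and the selected values of k are assumptions for illustration, and the code only evaluates the bound from (36), not Rabin's full theorem.

```python
def marginal_utility_bound(k):
    """Upper bound on u'(w + 11 + 21k) / u'(w - 10) from k + 1 applications of (36)."""
    return (10.0 / 11.0) ** (k + 1)

for k in (0, 1, 2, 10, 50, 100):
    dollars_above_w = 11 + 21 * k
    print(k, dollars_above_w, marginal_utility_bound(k))
```

For k = 50, that is, about a thousand dollars above w, the bound is already below one percent: the DM would value that marginal dollar at less than a hundredth of dollar w − 10.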


10. Time preference

Discounting essentially means that a given benefit is valued higher when it is received immediately than when it is received with a delay. A common economic motivation for discounting is that, say, one dollar today is worth more than one dollar next year, as the immediate reward can be put into a bank at an annual interest rate r > 0, making the dollar today worth 1 + r dollars next year. Another motivation, common in evolutionary models, is the risk that a delayed benefit may not be realized: you may die before receiving it (or be interrupted in achieving it, or be cheated in the promise of receiving it).

In addition to the question how to model discounting in an appropriate way, decision theory in the presence of time involves a number of careful considerations:

• Choice of horizon: should one look finitely or infinitely far into the future? Keynes' famous quote "In the long run, we are all dead" could be an argument in favor of a finite horizon. Many economic models involve just two time periods as an abstraction of "now" and "the future". On the other hand, many decisions have no clearly defined final period: you (or in an evolutionary sense as in overlapping generations models, your genes) may live to see another day. In such cases, an infinite horizon makes sense.

• Choice of time as a discrete or continuous variable: also here, common sense, the appropriate level of abstraction, and (not rarely) the modeler's choice of mathematical tools are decisive.

Unless specified otherwise, this section takes time as being discrete and uses an infinite horizon. We derive the standard exponential discounting model from a stationarity assumption on preferences and briefly discuss a violation of stationarity and hyperbolic discounting. Section 10.3, based on Osborne and Rubinstein (1994, Sec. 8.3), considers two criteria for evaluating outcomes over time without discounting. The final section, based on Voorneveld (2007), illustrates the somewhat paradoxical statement that a sequence of utility-maximizing choices can minimize utility.

10.1. Stationarity and exponential discounting

The standard model of preferences over time assumes that:

• the set of alternatives consists of sequences of outcomes $c = (c_0, c_1, \ldots) = (c_t)_{t=0}^{\infty}$ in some arbitrary set, where $c_t$ denotes the outcome at time t.

• preferences ≿ over such sequences are represented by a utility function U of the form

\[ U(c) = \delta(0)u(c_0) + \delta(1)u(c_1) + \delta(2)u(c_2) + \cdots = \sum_{t=0}^{\infty} \delta(t)u(c_t), \tag{37} \]

with δ(0) = 1.

The function in (37) is often interpreted as a sum of discounted instantaneous utilities: the outcome $c_t$ at time t ∈ N gives utility $u(c_t)$, but is discounted by a factor δ(t) ∈ (0, 1) as it lies in the future. The discount factor δ(0) = 1 for current outcomes is mostly cosmetic, facilitating the notation involving an infinite sum.

Exercise 10.1 The expression in (37) involves an infinite sum, which may not be well-defined.

(a) Give an example to show this.


(b) Prove that the sum is well-defined if the sequence of discount factors is summable ($\sum_{t=0}^{\infty} \delta(t) < \infty$) and the instantaneous utility function u is bounded.

The most common form of (37) involves exponential discounting: there is a δ ∈ (0, 1) such that $\delta(t) = \delta^t$ for all t, turning utility into

\[ U(c) = \sum_{t=0}^{\infty} \delta^t u(c_t). \tag{38} \]

Recall the earlier motivation for discounting of money: given a fixed interest rate r > 0 per period, one dollar tomorrow is worth only $(1 + r)^{-1}$ dollars today, so it makes sense to discount future money by powers of $\delta = (1 + r)^{-1}$. Following Koopmans (1960), exponential discounting can also be derived by imposing a stationarity requirement on preferences.
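For a stream that is cut off after finitely many periods, (38) is straightforward to evaluate. The following minimal sketch does so and also converts an interest rate into a discount factor; the function names and the truncation to a finite list are assumptions made for illustration.

```python
def discounted_utility(utilities, delta):
    """Sum of delta**t * u_t over a finite list of instantaneous utilities u_0, u_1, ..."""
    return sum(delta ** t * x for t, x in enumerate(utilities))

def discount_factor_from_interest(r):
    """Discount factor delta = 1 / (1 + r) implied by a per-period interest rate r > 0."""
    return 1.0 / (1.0 + r)

print(discounted_utility([1, 1, 1], 0.5))         # 1 + 0.5 + 0.25
print(discount_factor_from_interest(0.25))        # approximately 0.8
print(discounted_utility([1.0] * 200, 0.9))       # approaches 1 / (1 - 0.9)
```

The last line illustrates why boundedness of u together with δ < 1 keeps the infinite sum finite: a constant stream of ones is worth 1/(1 − δ).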

Preferences ≿ satisfy stationarity if they are not affected if a common first outcome is dropped, and the timing of all other outcomes is advanced by one period. By repeated application, it implies that for a comparison between two sequences all initial periods with common outcomes can be dropped, and the first period of different outcomes can be taken as the initial period. Formally, the preference relation ≿ is stationary if for all pairs $(c_t)_{t=0}^{\infty}$ and $(d_t)_{t=0}^{\infty}$ with $c_0 = d_0$:

\[ (c_0, c_1, c_2, \ldots) \succsim (d_0, d_1, d_2, \ldots) \iff (c_1, c_2, \ldots) \succsim (d_1, d_2, \ldots). \]

Deriving exponential discounting usually proceeds along the following lines:

Proposition 10.1 For notational convenience, let 0 be a feasible outcome. Assume that:

• preferences ≿ can be represented by a utility function as in (37),

• preferences ≿ satisfy stationarity,

• the decision-maker is indifferent between:

option 1: getting α today and α′ tomorrow (i.e., the sequence (α, α′, 0, 0, ...)),

option 2: getting β today and β′ tomorrow,

where u(α′) ≠ u(β′).

Then the discount factor is exponential: $\delta(t) = \left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right)^t$.

Proof. By induction on t. The result is trivial if t = 0. Let t ∈ N and assume that $\delta(\tau) = \left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right)^{\tau}$ for all τ < t. Repeated application of stationarity implies that

\[ (\underbrace{0, \ldots, 0}_{t-1 \text{ times}}, \alpha, \alpha', 0, 0, \ldots) \sim (\underbrace{0, \ldots, 0}_{t-1 \text{ times}}, \beta, \beta', 0, 0, \ldots), \]

i.e., their utility must be the same. Substitution in (37) gives

\[ \delta(t - 1)(u(\alpha) - u(\beta)) + \delta(t)(u(\alpha') - u(\beta')) = 0, \]

so

\[ \delta(t) = \delta(t - 1)\left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right) = \left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right)^t, \]

where the final equality uses the induction hypothesis. □


Exercise 10.2 Rational suicide: A decision maker (DM) lives for at most two periods, t = 0 and t = 1. At each time t ∈ {0, 1} that he is alive, he must decide, depending on his mood, whether or not to commit suicide. Regardless of his initial mood, at time t = 1 he will be depressed with probability 1/2 or happy with probability 1/2. His instantaneous utility is state-dependent, i.e., it depends not only on his action but also on the state of the world at time t. The set of states is S = {h, d, D}, where h is "alive and happy", d is "alive and depressed", and D is "dead". The set of actions is A = {k, ℓ}, where k is "commit suicide" and ℓ is "go on living". The instantaneous utility function u : S × A → R is defined as follows:

\[ u(s, a) = \begin{cases} 1 & \text{if } (s, a) = (h, \ell), \\ -1 & \text{if } (s, a) = (h, k), \\ -\alpha & \text{if } (s, a) = (d, \ell), \\ 0 & \text{otherwise,} \end{cases} \]

where α > 0 is the intensity of the depression. Thus, given that you're happy, killing yourself appears silly, but if you're depressed, it may seem less so. State D is irreversible: should the DM decide to kill himself at time t = 0, then he receives utility 0 at time t = 1. The DM discounts the future exponentially with discount factor 0 < δ < 1, and maximizes expected lifetime utility (of the standard additive form). We solve the decision problem by backward induction, starting with optimal behavior in the final period t = 1. Assume the DM is alive at time t = 1.

(a) What is the optimal action and the resulting instantaneous utility if the DM at t = 1 is (a1) happy? (a2) depressed?

Now consider the initial period: assume the DM is depressed at time t = 0.

(b) Assuming optimal behavior at time t = 1, what is the optimal action at time t = 0? Note: (i) if the DM does not kill himself immediately, there is uncertainty about his mood at time t = 1; (ii) the answer depends on α and δ.

(c) A psychologist claims that the option of future suicide might prevent depressed people from killing themselves straight away. Explain this claim using the answers above.

10.2. Preference reversal and hyperbolic discounting

Stationarity requires that if you prefer one apple today over two apples tomorrow, then shifting this choice by one year (one apple next year versus two apples one year and a day from now) doesn't change that you'd still rather have the single apple. On the other hand, empirical evidence (Thaler, 1981) seems to suggest that people are much more sensitive to a waiting time of one day when it occurs right now than to a waiting time in the far future: if you anyway have to wait an entire year for a lousy apple, you might as well wait one day more and double the booty. Different attempts to capture such a preference reversal go under the heading of "hyperbolic discounting". It simply involves discount factors that are not exponential. Arguably the simplest approach is the so-called (β, δ)-model of Phelps and Pollak (1968). They define, for β, δ ∈ (0, 1), the discount factors as $\delta(0) = 1, \delta(1) = \beta\delta, \delta(2) = \beta\delta^2, \delta(3) = \beta\delta^3, \ldots$, turning the utility function (37) into:

\[ U(c) = u(c_0) + \beta \sum_{t=1}^{\infty} \delta^t u(c_t). \]

To see that this model can explain the preference reversal for the apples, assume that utility u satisfies u(0) = 0 and is strictly increasing in apples. Preferring one apple today over two apples tomorrow means that

\[ u(1) > \beta\delta\,u(2). \tag{39} \]


Preferring two apples one year and a day from now to one apple a year from now (and assuming we're not in a leap year) means that

\[ \beta\delta^{366}u(2) > \beta\delta^{365}u(1). \tag{40} \]

For (39) and (40) to hold simultaneously, we simply need (β, δ) ∈ (0, 1) × (0, 1) to satisfy

\[ \beta\delta < \frac{u(1)}{u(2)} < \delta. \]

Taking δ sufficiently close to one and β sufficiently close to zero will do the trick.
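A quick numerical check of the reversal, as a sketch: the utility u(x) = √x (which satisfies u(0) = 0 and is strictly increasing) and the parameter values β = 0.7, δ = 0.999 are assumptions picked so that βδ < u(1)/u(2) < δ; time is measured in days.

```python
import math

def beta_delta_factor(t, beta, delta):
    """Discount factor of the (beta, delta)-model: 1 at t = 0 and beta * delta**t afterwards."""
    return 1.0 if t == 0 else beta * delta ** t

def prefers_earlier(t, beta, delta, u=math.sqrt):
    """True if one apple at time t is strictly preferred to two apples at time t + 1."""
    earlier = beta_delta_factor(t, beta, delta) * u(1)
    later = beta_delta_factor(t + 1, beta, delta) * u(2)
    return earlier > later

beta, delta = 0.7, 0.999
print(prefers_earlier(0, beta, delta))     # choice between today and tomorrow
print(prefers_earlier(365, beta, delta))   # same choice shifted by one year
```

Setting β = 1 recovers exponential discounting, under which the two comparisons necessarily give the same answer.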

Exercise 10.3 (Loewenstein and Prelec, 1992) Discount factors $\delta(t) = (1 + \alpha t)^{-\gamma/\alpha}$, with α, γ > 0, fit experimental data well. Show that also this model captures the preference reversal described above.

Exercise 10.4 (Wärneryd, 2007) Sex and time preference: In some evolutionary models of intertemporal consumption, time periods represent generations and people care about future consumption to the extent that it is exercised by their offspring (children, grandchildren, etc.). To simplify matters, assume that (i) a DM cares about consumption of its offspring only if it has a specific gene; (ii) mates are selected at random and have the relevant gene with probability α ∈ [0, 1], (iii) offspring gets in expectation half of its genes from each parent, (iv) we consider one unit of offspring per time period.

(a) Let t ∈ N. Show, for instance by conditioning on the giver of the gene, that the probability $p_t$ of the DM's t-th period offspring carrying the gene satisfies the recurrence relation $p_t = \tfrac{1}{2}p_{t-1} + \tfrac{1}{2}\alpha$.

(b) Set $p_0 = 1$ and show that the solution to the recurrence relation is $p_t = \alpha + (1 - \alpha)\left(\tfrac{1}{2}\right)^t$ for all t = 0, 1, 2, ...

With these kinship parameters $(p_t)_{t=0}^{\infty}$ in the place of discount factors, the standard separable utility function becomes $U(c) = \sum_{t=0}^{\infty} p_t u(c_t)$. Assume that consumption is in units of apples, u is strictly increasing, and u(0) = 0. Let's investigate the opportunity of preference reversal.

(c) Let α and u be such that the DM prefers 1 apple now (t = 0) to 2 apples next generation (t = 1). Prove: for T ∈ N sufficiently large, the DM prefers 2 apples at time T + 1 to 1 apple at time T.

10.3. Limit-of-means and overtaking

By discounting, less weight is assigned to future utilities. This section introduces two other ways of evaluating sequences of utilities, attaching equal weight to all periods. To save on notation, we will denote a sequence of utilities simply by a sequence $(x_t)_{t=0}^{\infty}$ of real numbers, rather than using the more elaborate $(u(c_t))_{t=0}^{\infty}$. Probably the first thing that comes to mind is to value a sequence of utilities $(x_t)_{t=0}^{\infty}$ using the long-term average of the utilities:

\[ \lim_{T \to \infty} \frac{x_0 + x_1 + \cdots + x_{T-1}}{T}. \]

However, even if the sequence is bounded, this limit may not exist: the average may continue to oscillate. We verify this statement with a binary (zero-one) sequence. The idea is to append enough ones to increase the average until it achieves a fixed high value, then to append enough zeroes to decrease the average until it reaches a fixed low value, and continue this process.

An oscillating average: Consider the binary sequence

(0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, . . .)


obtained by starting with a zero and two ones, and then, after each block of zeroes or ones, doubling the length of the sequence obtained so far with a block of the other number: after the first block of ones, we have three coordinates, so we double the length to six coordinates by appending some zeroes. Then we double the length to twelve coordinates by adding some ones, etc. A simple inductive proof shows that after the k-th block of ones, the sequence has $3 \cdot 2^{2k-2}$ coordinates, $2^{2k-1}$ of them equal to one, and therefore an average of 2/3. Doubling the length to $3 \cdot 2^{2k-1}$ coordinates by appending zeroes decreases the average by a factor 1/2 to 1/3. As appending zeroes decreases, and appending ones increases the average, it follows that the average continues to oscillate between 1/3 and 2/3.
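The construction is easy to reproduce. The sketch below builds the first few blocks and reports the exact average at the end of each block; the function names are assumptions, and of course a finite computation only illustrates the inductive claim rather than proving it.

```python
from fractions import Fraction

def oscillating_sequence(blocks):
    """Start with (0, 1, 1), then double the length `blocks` times, alternating zeroes and ones."""
    seq = [0, 1, 1]
    value = 0  # the first appended block consists of zeroes
    for _ in range(blocks):
        seq.extend([value] * len(seq))
        value = 1 - value
    return seq

def average(seq):
    """Exact average of a finite sequence."""
    return Fraction(sum(seq), len(seq))

for blocks in range(6):
    seq = oscillating_sequence(blocks)
    print(len(seq), average(seq))
```

The averages at the block ends alternate between 2/3 and 1/3 while the lengths 3, 6, 12, 24, ... double each time.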

Taking, instead, a pessimistic view of how the average utility changes over time will give us a well-defined criterion. This requires some mathematical preliminaries. Consider a bounded sequence $(x_t)_{t=0}^{\infty}$ of real numbers. For each t, $s_t = \inf\{x_s : s \ge t\}$ indicates the infimum (in somewhat colloquial terms, the "worst" value) of the tail of the sequence from time t onwards. This infimum is well-defined, as the sequence is bounded. Notice also that the sequence $(s_t)_{t=0}^{\infty}$ is weakly increasing: increasing t implies taking the infimum over a smaller set. As $(s_t)_{t=0}^{\infty}$ is a monotonic, bounded sequence, it converges. Its limit is called the lower limit or limes inferior (liminf) of the original sequence $(x_t)_{t=0}^{\infty}$:

\[ \liminf_{t \to \infty} x_t = \lim_{t \to \infty} \left(\inf\{x_s : s \ge t\}\right). \]

By convention, $\liminf_{t \to \infty} x_t = -\infty$ if $(x_t)_{t=0}^{\infty}$ is not bounded from below. If it is bounded from below, but not from above, the sequence of infima may diverge, in which case one sets $\liminf_{t \to \infty} x_t = +\infty$.

The following characterization of the lower limit may come in handy. Let $(x_t)_{t=0}^{\infty}$ be a sequence and let c ∈ R. Then $\liminf_{t \to \infty} x_t = c$ if and only if:

[L1] for each ε > 0, there is a T ∈ N such that $c - \varepsilon < x_t$ for all t ≥ T,

[L2] for each ε > 0 and each T′ ∈ N, there is a t ≥ T′ with $x_t < c + \varepsilon$.

In words, the sequence eventually remains above c − ε, but dives below c + ε infinitely often, no matter how small ε > 0.

Exercise 10.5 Prove this.

The limit-of-means criterion evaluates utility streams by means of the lower limit of the average utility:

Limit of means: Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be sequences in R. Then x is preferred to y by the limit-of-means criterion, denoted $x \succ_L y$, if and only if

\[ \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > 0. \tag{41} \]

Inequality (41) is equivalent to the statement that for some ε > 0, the average difference between sequences x and y eventually exceeds ε:

\[ \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > \varepsilon \quad \text{for all but finitely many periods } T. \]


Exercise 10.6 Prove this.

Changes in a single coordinate of a sequence become negligible once the average is taken over a long time, so under the limit-of-means criterion, changes in any finite number of periods do not matter. In particular, these preferences are stationary.

Exercise 10.7 Some authors refer to the limit-of-means criterion as the preference relation represented by the utility function assigning to each bounded sequence $x = (x_t)_{t=0}^{\infty}$ the number

\[
U(x) = \liminf_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} x_t.
\]

(a) Why must the sequences be bounded?

(b) Aside from this, are the two definitions really the same?

The following criterion also assigns equal weight to periods, but remains sensitive to changes in single coordinates:

Overtaking: Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be sequences in $\mathbb{R}$. Then $x$ is preferred to $y$ by the overtaking criterion, denoted $x \succ_O y$, if and only if

\[
\liminf_{T\to\infty} \sum_{t=0}^{T} (x_t - y_t) > 0.
\]

Let us compare exponential discounting, and the limit-of-means and overtaking criteria. The latter were defined in terms of strict preferences. Define the corresponding indifference relation $\sim_L$ as follows: $x \sim_L y$ if neither $x \succ_L y$ nor $y \succ_L x$. Of course, $\sim_O$ is defined similarly.

Comparison:

• The sequence $(1, -1, 0, 0, \ldots)$ is preferred to the sequence $(0, 0, \ldots)$ under exponential discounting for all $\delta \in (0, 1)$. Under the other two criteria, they are equivalent.

• The sequence $(-1, 2, 0, 0, \ldots)$ is preferred to the sequence $(0, 0, \ldots)$ under the overtaking criterion. Under the limit-of-means criterion, they are equivalent.

• For every $n \in \mathbb{N}$, the sequence $(\underbrace{0, \ldots, 0}_{n \text{ times}}, 1, 1, \ldots)$ is preferred to $(1, 0, 0, \ldots)$ under the limit-of-means criterion. However, for each $\delta \in (0, 1)$, a large enough delay in a constant stream of ones makes the instant gratification of getting 1 immediately the preferable option.
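The three comparisons can be checked on truncated sequences. The sketch below works with the difference $x - y$ and assumes that all coordinates beyond the truncation horizon are zero, which holds for the first two examples. The horizon, the value $\delta = 1/2$, and all function names are illustrative assumptions. For the delayed stream of ones, the closed form $\delta^n/(1-\delta) - 1$ of the discounted difference is used instead of a truncation.

```python
from fractions import Fraction
from itertools import accumulate


def discounted_sum(diff, delta):
    """Sum of delta**t * diff[t]; coordinates beyond the list are taken to be zero."""
    return sum(delta ** t * d for t, d in enumerate(diff))


def partial_sums(diff):
    """Overtaking criterion: sums of diff[0..T] for T = 0, 1, ..."""
    return list(accumulate(diff))


def running_means(diff):
    """Limit of means: (1/T) * (diff[0] + ... + diff[T-1]) for T = 1, 2, ..."""
    return [Fraction(total, T) for T, total in enumerate(accumulate(diff), start=1)]


HORIZON = 200                      # truncation length, an arbitrary choice
delta = Fraction(1, 2)             # illustrative discount factor
a = [1, -1] + [0] * (HORIZON - 2)  # (1, -1, 0, 0, ...) minus (0, 0, ...)
b = [-1, 2] + [0] * (HORIZON - 2)  # (-1, 2, 0, 0, ...) minus (0, 0, ...)

print(discounted_sum(a, delta))                   # positive: preferred under discounting
print(partial_sums(a)[-1], running_means(a)[-1])  # both settle at zero
print(partial_sums(b)[-1], running_means(b)[-1])  # sums settle at 1, means shrink like 1/T


def delayed_ones_vs_instant(n, delta):
    """Discounted value of (0,...,0,1,1,...) with n leading zeroes minus that of (1,0,0,...)."""
    return delta ** n / (1 - delta) - 1
```

With $\delta = 1/2$, the last function is negative as soon as $n \ge 2$, so the immediate payoff wins under discounting, whereas the running means of the same difference tend to one.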

10.4. Better may be worse

Consider an alcoholic who has to decide at each moment in discrete time whether to take a drink (action 1) or not (action 0). Given his uncertain life-length, common modeling practice is to treat this as an infinite horizon problem, discounting the impact of future decisions if so desired. A drinking pattern is a sequence $x = (x_t)_{t=0}^{\infty}$ of zeroes and ones, with $x_t = 1$ if the alcoholic takes a drink at time $t$ and $x_t = 0$ otherwise. With a minor abuse of notation, $(0, x_{-t})$ denotes the drinking pattern obtained from $x$ by not drinking at time $t$. The pattern $(1, x_{-t})$ is defined likewise.


The philosophy of Alcoholics Anonymous is to fight the temptations of alcohol by forgetting about the past or the future and concentrating exclusively on the present: stay away from a drink "one day at a time". Let us investigate the possibility of having a utility function $U$ that simultaneously models:

• Temptation: on any given day, the alcoholic is at least as well off, and sometimes better off, by choosing to drink:

\[
U(1, x_{-t}) \ge U(0, x_{-t}) \text{ for all } t, x_{-t}, \qquad U(1, x_{-t}) > U(0, x_{-t}) \text{ for some } t, x_{-t}.
\]

• Health concerns: nevertheless, the best thing is never to drink and the worst thing is to drink at all times: $(0, 0, \ldots)$ maximizes, and $(1, 1, \ldots)$ minimizes $U$.

This sounds paradoxical and is indeed impossible under a finite horizon: suppose there are only $T \in \mathbb{N}$ periods. Start with an arbitrary drinking pattern and switch, one period at a time, any abstention (0) to drinking (1). By temptation, each such switch weakly increases the utility function. So drinking at all times maximizes utility, in conflict with health concerns, which would require that all these weak increases in utility eventually lead to a plunge in utility: it is like climbing a stairway, but ending up lower than before (Figure 3).

Figure 3: An impossible stairway

The next example shows that temptation and health concerns can be reconciled under an infinite horizon.

Drinking paradox: Define the utility for each drinking pattern $x$ as follows:

\[
U(x) =
\begin{cases}
3 & \text{if } x_t = 1 \text{ for only finitely many } t \text{ (a rare drinker)}, \\
0 & \text{if } x_t = 0 \text{ for only finitely many } t \text{ (a heavy addict)}, \\
\sum_t x_t \cdot 2^{-t} & \text{otherwise.}
\end{cases}
\]

As a switch from 0 to 1 at time $t$ leaves the utility unaffected in the first two cases and increases it by $2^{-t} > 0$ otherwise, the temptation assumption is satisfied. However,

\[
U(0, 0, \ldots) = 3 = \max_x U(x) > \min_x U(x) = 0 = U(1, 1, \ldots),
\]

in conformance with health concerns.
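The utility function of the drinking paradox can be evaluated exactly on eventually periodic patterns. The sketch below rests on an assumption that is not part of the notes: a pattern is encoded as a finite prefix followed by a cycle that repeats forever. This covers rare drinkers (cycle of zeroes), heavy addicts (cycle of ones), and mixed patterns, but not every pattern. The names `utility` and `set_at` are hypothetical.

```python
from fractions import Fraction


def utility(prefix, cycle):
    """U for the pattern 'prefix' followed by 'cycle' repeated forever."""
    if 1 not in cycle:   # only finitely many drinks: a rare drinker
        return Fraction(3)
    if 0 not in cycle:   # only finitely many abstentions: a heavy addict
        return Fraction(0)
    head = sum(Fraction(x, 2 ** t) for t, x in enumerate(prefix))
    one_cycle = sum(Fraction(x, 2 ** k) for k, x in enumerate(cycle))
    # Geometric series over the repetitions of the cycle.
    tail = one_cycle / (1 - Fraction(1, 2 ** len(cycle)))
    return head + tail / 2 ** len(prefix)


def set_at(prefix, cycle, t, value):
    """Pattern with coordinate t replaced by value (unrolls whole cycles as needed)."""
    prefix = list(prefix)
    while len(prefix) <= t:
        prefix.extend(cycle)
    prefix[t] = value
    return tuple(prefix), tuple(cycle)


# Temptation on the alternating pattern (0, 1, 0, 1, ...): drinking at t adds 2^-t.
for t in range(4):
    gain = utility(*set_at((), (0, 1), t, 1)) - utility(*set_at((), (0, 1), t, 0))
    print(t, gain)

# Health concerns: never drinking beats always drinking.
print(utility((), (0,)), utility((), (1,)))
```

Within this restricted class, each switch from 0 to 1 weakly raises utility, yet the all-zero pattern attains the value 3 and the all-one pattern the value 0.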


11. Probabilistic choice

Consider a DM with a finite set $A$ of alternatives. Earlier, we saw that if the DM has a weak order $\succsim$ over these alternatives, there is a utility function $u : A \to \mathbb{R}$ representing these preferences and making an optimal choice reduces to choosing an alternative $a \in \arg\max_{b \in A} u(b)$, a utility maximizing alternative. However, in numerous experiments, it turns out that DMs:

• do not always make the same choice under seemingly identical circumstances,

• sometimes choose seemingly suboptimal alternatives.

Such apparently irrational behavior has led to the development of so-called probabilistic choice models, where the main idea is that:

• each alternative is chosen with some probability,

• if $a$ and $b$ are feasible choices and $a \succsim b$, then the probability of choosing $a$ should be at least as large as the probability of choosing $b$.

This section gives a very short introduction to three probabilistic choice models: the Luce model, the logit model, and the linear probability model. Often, probabilistic choice models are derived in a random utility framework, where the true utility of each alternative consists of a deterministic component plus a random component. Depending on the realization of the random utility component, a feasible choice will look good under some circumstances and bad under others, thus motivating that observed choice is probabilistic: an alternative is only chosen in circumstances where it looks optimal. We will not consider such random utility models: they are (or should be) treated in detail in the econometrics courses. The development of these models was one of the main causes for awarding Daniel McFadden the Nobel Prize in 2000. Instead, we derive the models either axiomatically or via the introduction of control costs: DMs want to choose optimally, but incur costs to precisely implement their choices.

A good introduction to probabilistic choice models can be found in Anderson et al. (1992, Ch. 2) and Ben-Akiva and Lerman (1985, Ch. 3). On the content of this section: The Luce model is due to Luce (1959). The derivation of the logit choice probabilities using the entropy cost function can be found in Mattsson and Weibull (2002). The derivation of the linear probability model using the Euclidean distance as cost function is due to Voorneveld (2006). It is based on an early contribution to the literature on bounded rationality in games by Rosenthal (1989).

11.1. The Luce model

Consider a finite set $A$ of alternatives. Some notation:

• In the remainder of this section, we assume that the DM has to choose from a subset of alternatives in $A$ containing at least two elements: choosing from a set with only one alternative is trivial. We will typically denote such sets by $S \subseteq A$ or $T \subseteq A$.

• If the DM has to make a choice from a set $S \subseteq A$, we denote the probability that the DM chooses $a \in S$ by $P_S(a) \in [0, 1]$. Obviously, we require that $\sum_{a \in S} P_S(a) = 1$.

• If $S \subseteq T \subseteq A$, we denote the probability that an element from $S$ is chosen when the choice set is $T$ by

\[
P_T(S) = \sum_{a \in S} P_T(a).
\]

By the assumption above: $P_T(T) = 1$ for all $T \subseteq A$.

• The set obtained from $S$ by removing an element $a \in A$ is denoted by $S \setminus \{a\}$.

With this notation, the following two properties should be intuitive. The first property states that if some alternative $a \in T$ is never chosen in a pairwise comparison with some other $b \in T$, i.e., $P_{\{a,b\}}(a) = 0$, then $a$ can be deleted from $T$ without affecting the choice probabilities of the remaining alternatives:

(L1) Let $T \subseteq A$ and $a \in T$. If there exists a $b \in T$ with $P_{\{a,b\}}(a) = 0$, then

\[
P_T(S) = P_{T \setminus \{a\}}(S \setminus \{a\})
\]

for all $S \subset T$.

Taking $S = T \setminus \{a\}$ in (L1), we get $P_T(T \setminus \{a\}) = P_{T \setminus \{a\}}(T \setminus \{a\}) = 1$, so $P_T(a) = 0$.

What about cases where no alternative $a$ is always rejected in pairwise comparisons? In that case it is reasonable to assume the following path independence condition: if $a \in S \subset T$, then the probability of choosing $a$ from $T$ should be equal to the probability of (i) first selecting the subset $S$ and (ii) from $S$ choosing the element $a$. Formally:

(L2) Let $S \subset T \subseteq A$ and $a \in S$. If $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all $b \in T$, then

\[
P_T(a) = P_T(S) P_S(a).
\]

When making a choice from a set $T \subseteq A$, (L1) allows us to restrict attention to the alternatives for which there is "imperfect discriminatory power": $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all $a, b \in T$, $a \neq b$. The path independence condition then yields the following result:

Proposition 11.1 Assume that $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all different $a, b \in A$. Path independence (L2) holds if and only if there is a function $u : A \to \mathbb{R}_{++}$ such that

\[
P_S(a) = \frac{u(a)}{\sum_{b \in S} u(b)} \tag{42}
\]

for every $S \subseteq A$. Moreover, the function $u$ is unique up to multiplication by a positive scalar.

Proof.
Step 1: Assume path independence (L2) holds. We first prove that $P_A(a) > 0$ for all $a \in A$. Suppose, to the contrary, that $P_A(a) = 0$ for some $a \in A$. By (L2), we know that for every $b \in A \setminus \{a\}$:

\[
0 = P_A(a) = P_A(\{a, b\}) P_{\{a,b\}}(a).
\]

Since $P_{\{a,b\}}(a) \neq 0$, it follows that $P_A(\{a, b\}) = P_A(a) + P_A(b) = 0$ for all $b \in A \setminus \{a\}$. Probabilities are nonnegative, so it must be that

\[
\forall b \in A : P_A(b) = 0,
\]

contradicting $\sum_{b \in A} P_A(b) = 1$. Having shown that $P_A(a) > 0$ for all $a \in A$, define $u(a) = P_A(a)$. Path independence (L2) implies that for every $S \subseteq A$:

\[
P_S(a) = \frac{P_A(a)}{P_A(S)} = \frac{P_A(a)}{\sum_{b \in S} P_A(b)} = \frac{u(a)}{\sum_{b \in S} u(b)}.
\]


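Step 2: Conversely, suppose that there is a function $u : A \to \mathbb{R}_{++}$ such that

\[
P_S(a) = \frac{u(a)}{\sum_{b \in S} u(b)}
\]

for every $S \subseteq A$. To show: (L2) holds. So let $S \subset T \subseteq A$ and $a \in S$. Then

\[
P_T(a) = \frac{u(a)}{\sum_{b \in T} u(b)} = \frac{\sum_{b \in S} u(b)}{\sum_{b \in T} u(b)} \cdot \frac{u(a)}{\sum_{b \in S} u(b)} = P_T(S) P_S(a).
\]

Step 3: To show that the function $u$ in (42) is unique up to multiplication with a positive constant, suppose there are two such functions $u$ and $u'$. It follows that for every $a \in A$:

\[
P_A(a) = \frac{u(a)}{\sum_{b \in A} u(b)} = \frac{u'(a)}{\sum_{b \in A} u'(b)}.
\]

Hence $u(a) = \alpha u'(a)$, where $\alpha = \left( \sum_{b \in A} u(b) \right) / \left( \sum_{b \in A} u'(b) \right) > 0$. □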

In words: In Luce's choice model, each alternative can be assigned a positive value such that the probability of choosing a given alternative from a choice set is proportional to its value.
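Formula (42) and the path independence condition (L2) lend themselves to a brute-force check on a small example. The sketch below is a sanity check, not a proof; the alternatives, their values, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations


def luce(u, S):
    """Luce choice probabilities (42) on the choice set S for positive values u."""
    total = sum(u[b] for b in S)
    return {a: Fraction(u[a]) / total for a in S}


def subsets(items, min_size):
    """All subsets of items with at least min_size elements."""
    items = sorted(items)
    for k in range(min_size, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def path_independent(u):
    """Check P_T(a) = P_T(S) * P_S(a) for all a in S, S a proper subset of T."""
    for T in subsets(u, 2):
        P_T = luce(u, T)
        for S in subsets(T, 1):
            if S == T:
                continue
            P_S = luce(u, S)
            P_T_of_S = sum(P_T[b] for b in S)
            if any(P_T[a] != P_T_of_S * P_S[a] for a in S):
                return False
    return True


u = {"a": 1, "b": 2, "c": 5}   # hypothetical positive values
print(luce(u, {"a", "b"}))     # pairwise choice between a and b
print(path_independent(u))

# Uniqueness up to scale: multiplying u by a positive constant changes nothing.
scaled = {k: 7 * v for k, v in u.items()}
print(luce(scaled, u.keys()) == luce(u, u.keys()))
```

Exact rational arithmetic is used so that the equalities in (L2) can be tested without rounding tolerance.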

Debreu (1960) showed that path independence, although reasonable at first sight, can lead to counterintuitive conclusions. Consider, for instance, the following well-known variant of Debreu's argument:

The blue bus/red bus paradox. A DM has to make a traveling mode decision: he can either go to his destination by car or by bus. Assume the DM assigns the same probability to both alternatives:

\[
P_{\{\text{car}, \text{bus}\}}(\text{car}) = P_{\{\text{car}, \text{bus}\}}(\text{bus}) = 1/2. \tag{43}
\]

Suppose now that two buses can be used, which are completely identical, except in their colors: one of them is red, the other is blue. So the choice set is $A = \{\text{car}, \text{blue bus}, \text{red bus}\}$. Assume that the DM pays no attention to color:

\[
P_{\{\text{blue bus}, \text{red bus}\}}(\text{blue bus}) = P_{\{\text{blue bus}, \text{red bus}\}}(\text{red bus}). \tag{44}
\]

Intuitively, since the DM according to (43) doesn't seem to care whether he goes by car or by bus, it would seem reasonable to expect that he will choose to go by car with probability 1/2 and to go by bus with probability 1/2, choosing randomly between the blue and the red bus:

\[
P_A(\text{car}) = 1/2 \text{ and } P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/4,
\]

or, at least, that the probability of taking the car should be larger than the probability of taking any of the two buses. However, path independence (L2) implies

\[
P_A(\text{car}) = P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/3.
\]

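To see this, notice that

\[
P_A(\text{car}) \overset{(L2)}{=} P_A(\{\text{blue bus}, \text{car}\}) \, P_{\{\text{blue bus}, \text{car}\}}(\text{car}) \overset{(43)}{=} \tfrac{1}{2} P_A(\{\text{blue bus}, \text{car}\}) \overset{\text{def}}{=} \tfrac{1}{2} P_A(\text{blue bus}) + \tfrac{1}{2} P_A(\text{car}),
\]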


so $P_A(\text{car}) = P_A(\text{blue bus})$ and, similarly, $P_A(\text{car}) = P_A(\text{red bus})$. As the probabilities must add up to one:

\[
P_A(\text{car}) = P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/3.
\]

So: in the choice problem with only one bus, the DM will choose to go by car or by bus with equal probability, but when faced with the choice between going by car or going by bus in case there are two virtually identical buses, the probability of choosing the car decreases from 1/2 to 1/3.
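The drop from 1/2 to 1/3 can be reproduced directly from formula (42). In the sketch below, the equal pairwise probabilities in (43) and (44) are encoded by giving all three alternatives the same value; the common value 1 is an arbitrary normalization, and the helper `luce` is a hypothetical name.

```python
from fractions import Fraction


def luce(u, S):
    """Luce choice probabilities (42) on the choice set S."""
    total = sum(u[b] for b in S)
    return {a: Fraction(u[a], total) for a in S}


# Equal pairwise choice probabilities force equal values under (42).
u = {"car": 1, "blue bus": 1, "red bus": 1}

one_bus = luce(u, ["car", "blue bus"])
two_buses = luce(u, ["car", "blue bus", "red bus"])
print(one_bus["car"], two_buses["car"])
```

Adding a duplicate of the bus takes probability away from the car, which is exactly the counterintuitive feature of path independence.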

11.2. The logit model

Again, consider a choice set $A = \{1, \ldots, n\}$ with at least two distinct elements. Assume that each alternative $i \in A$ gives some utility or payoff $\pi(i)$. In the logit model with parameter $\delta > 0$, the probability of choosing alternative $i$ from $A$ is equal to

\[
P_A(i) = \frac{e^{\pi(i)/\delta}}{\sum_{j \in A} e^{\pi(j)/\delta}} = \frac{\exp(\pi(i)/\delta)}{\sum_{j \in A} \exp(\pi(j)/\delta)}. \tag{45}
\]

Notice from (42) that this is just a special case of Luce's model, where the utility assigned to each alternative $i \in A$ is equal to $u(i) = \exp(\pi(i)/\delta) > 0$. Our goal will be two-fold:

1. motivating these choice probabilities by introducing control costs,

2. studying the role of the parameter $\delta > 0$.

Control costs. We allow the DM to choose each of the alternatives with a certain probability, so the DM chooses a probability distribution from

\[
\Delta_n = \left\{ p \in \mathbb{R}^n_+ : \sum_{i=1}^{n} p_i = 1 \right\}.
\]

Of course, if the DM is faced with choice set $A$ and has preferences $\succsim$ over the outcomes such that $i \succsim j$ if and only if $\pi(i) \ge \pi(j)$, the optimal thing to do is to choose only elements from the set $\arg\max_{i \in A} \pi(i)$ with positive probability. In most real-life situations, the DM cannot guarantee the exact implementation of his choices: a careless driver may drive off the road, an absentminded shopper may by mistake buy the wrong item. To model this, we assume that it requires effort to implement choices: associated with each choice $p \in \Delta_n$ will be a disutility or control cost $c(p) \in \mathbb{R}$.

The (expected total) utility associated with each choice $p \in \Delta_n$ is defined as the difference between the expected payoff $\sum_{i=1}^{n} p_i \pi(i)$ and $\delta > 0$ times the control cost $c(p)$, where $\delta$ is a positive scalar representing the relative weight assigned to the effort of implementing choice $p$. Hence, the DM aims to solve

\[
\max_{p \in \Delta_n} \sum_{i=1}^{n} p_i \pi(i) - \delta c(p).
\]

Different cost functions give rise to different choice probabilities. A common control cost function that appears in many branches of science (physics, chemistry, information science, to name but a few) is the following entropy function:

\[
c(p) = \sum_{i=1}^{n} p_i \ln(p_i), \tag{46}
\]


where we use the convention that $0 \ln 0 = 0$. One can show (we will not do so) that this is a strictly convex function achieving its minimum at the vector $(1/n, \ldots, 1/n)$, where all alternatives are chosen with equal probability.

Proposition 11.2 The optimization problem

\[
\max_{p \in \Delta_n} \sum_{i=1}^{n} p_i \pi(i) - \delta c(p), \tag{47}
\]

with the control cost function from (46) has a unique maximum location with

\[
\forall i \in A : p_i = \frac{\exp(\pi(i)/\delta)}{\sum_{j \in A} \exp(\pi(j)/\delta)},
\]

the logit choice probabilities from (45).

Proof. The cost function $c$ is strictly convex, so the function $\sum_{i=1}^{n} p_i \pi(i) - \delta c(p)$ is strictly concave. Since we maximize a strictly concave, continuous function over a compact set, a maximum exists and is unique. Since the feasible set is entirely defined by linear (in)equalities, the Kuhn-Tucker conditions give necessary and sufficient conditions for a solution to be a maximum. The condition for an interior solution $p \in \Delta_n$, i.e., a solution where $p_i > 0$ for all $i$, is that there exists a Lagrange multiplier $\lambda \in \mathbb{R}$ associated with the constraint $\sum_{i=1}^{n} p_i = 1$, such that

\[
\forall i = 1, \ldots, n : \pi(i) - \delta(\ln p_i + 1) + \lambda = 0, \tag{48}
\]

since the gradient at $p$ of the goal function $\sum_{i=1}^{n} p_i \pi(i) - \delta c(p)$ has $i$-th coordinate

\[
\pi(i) - \delta \frac{\partial c(p)}{\partial p_i} = \pi(i) - \delta(\ln p_i + 1).
\]

Rewriting (48) gives, for each $i = 1, \ldots, n$:

\[
p_i = c \exp(\pi(i)/\delta), \text{ with } c = \exp((\lambda - \delta)/\delta) \text{ a constant.}
\]

As $\sum_{j=1}^{n} p_j = 1$, it follows that

\[
\forall i = 1, \ldots, n : p_i = \frac{\exp(\pi(i)/\delta)}{\sum_{j \in A} \exp(\pi(j)/\delta)},
\]

as we had to show. □
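Proposition 11.2 can be sanity-checked numerically: the logit probabilities should yield a value of the objective in (47) that no other point of the simplex exceeds. The sketch below compares against randomly drawn distributions, which is a plausibility check rather than a proof. Substituting the logit probabilities into (47) also suggests that the optimal value equals $\delta \ln \sum_{j} \exp(\pi(j)/\delta)$; this follows from $\ln p_i = \pi(i)/\delta - \ln \sum_j \exp(\pi(j)/\delta)$. The payoffs are borrowed from Exercise 11.3, while $\delta = 2$, the random seed, and all function names are assumptions.

```python
import math
import random


def logit(payoffs, delta):
    """Logit choice probabilities (45); the max is subtracted for numerical stability."""
    top = max(payoffs)
    weights = [math.exp((x - top) / delta) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def objective(p, payoffs, delta):
    """Expected payoff minus delta times the entropy cost (46), with 0 ln 0 = 0."""
    expected = sum(q * x for q, x in zip(p, payoffs))
    cost = sum(q * math.log(q) for q in p if q > 0)
    return expected - delta * cost


payoffs, delta = [0.0, 2.0, 8.0], 2.0   # payoffs from Exercise 11.3; delta is arbitrary
p_star = logit(payoffs, delta)
best = objective(p_star, payoffs, delta)

# Compare against random points of the simplex (normalized exponential draws).
rng = random.Random(0)
for _ in range(10_000):
    draws = [rng.expovariate(1.0) for _ in payoffs]
    q = [d / sum(draws) for d in draws]
    assert objective(q, payoffs, delta) <= best + 1e-12

print(p_star, best)
```

Subtracting the largest payoff before exponentiating leaves the probabilities unchanged, since the common factor cancels in (45), and avoids overflow for small $\delta$.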

The role of δ. Let us investigate what happens with the logit choice probabilities in (45) as $\delta \to 0$ and as $\delta \to \infty$. Consider two alternatives $i, j \in A$, $i \neq j$. Notice that the ratio of their logit choice probabilities equals

\[
\frac{P_A(i)}{P_A(j)} = \frac{\exp(\pi(i)/\delta)}{\exp(\pi(j)/\delta)} = \exp\left( \frac{\pi(i) - \pi(j)}{\delta} \right), \tag{49}
\]

which converges to one as $\delta \to \infty$. But if the ratios of any two choice probabilities converge to one, their limits must be equal; together with the fact that probabilities add up to one, we conclude that the choice probabilities converge to $1/n$ as $\delta \to \infty$.


To consider the limit behavior as $\delta \to 0$, suppose that $\pi(i) > \pi(j)$. Then ratio (49) goes to infinity as $\delta \to 0$. Since we are dealing with probabilities here, which are bounded below by zero and above by one, it must be that $P_A(j) \to 0$. If we let $i$ be the alternative with maximal payoff $\pi(i)$, it follows that the probability of choosing an alternative with less than maximal payoff converges to zero. So in the limit, all probability is restricted to optimal alternatives and it is clear from the definition of the choice probabilities that all of these will be chosen with equal probability.

In summary, the parameter $\delta$ can be interpreted as a measure of irrationality of the DM: for large values of $\delta$, the DM chooses by more or less blindly picking any of the alternatives, while for small values of $\delta$, the choice of the DM is more or less optimal.
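Both limits are easy to observe numerically. The sketch below evaluates (45) for a range of values of $\delta$; the payoff vector, with two optimal alternatives, and the grid of $\delta$ values are illustrative assumptions.

```python
import math


def logit(payoffs, delta):
    """Logit choice probabilities (45); the max is subtracted for numerical stability."""
    top = max(payoffs)
    weights = [math.exp((x - top) / delta) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


payoffs = [4.0, 0.0, 4.0]   # illustrative payoffs with two optimal alternatives

for delta in (0.01, 1.0, 100.0, 1e6):
    print(delta, [round(p, 4) for p in logit(payoffs, delta)])
```

For small $\delta$ the probability mass splits evenly over the two optimal alternatives, and for large $\delta$ all three probabilities approach 1/3.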

11.3. The linear probability model

The idea behind the linear probability model is the same as behind Luce's model and the logit model: the probability of choosing an alternative should be (weakly) increasing in the payoff associated to the alternative:

\[
\pi(i) \ge \pi(j) \Rightarrow P_A(i) \ge P_A(j). \tag{50}
\]

The adjective linear indicates that the difference between these two probabilities should be linear in the payoff difference: for a parameter $\delta > 0$, we require that

\[
P_A(i) - P_A(j) = \delta(\pi(i) - \pi(j)). \tag{51}
\]

Unfortunately, it is not always possible to combine these two properties for large values of $\delta$. Let's consider a simple example with two alternatives: $A = \{1, 2\}$ and respective payoffs $\pi(1) = 4$, $\pi(2) = 0$. By (50), we want $P_A(1) \ge P_A(2)$ and by (51), we want $P_A(1) - P_A(2) = \delta(\pi(1) - \pi(2)) = 4\delta$. If we take $\delta = 1/8$, this gives $P_A(1) - P_A(2) = 1/2$. The probabilities have to add up to one, so the unique solution is that $P_A(1) = 3/4$ and $P_A(2) = 1/4$. So far, so good. Now take $\delta = 100$: $P_A(1) - P_A(2) = 4\delta = 400$. Since $P_A(1)$ and $P_A(2)$ are probabilities between zero and one, making their difference equal to 400 (or, for that matter, any number larger than 1) is simply impossible.

So we have to relax our requirements (50) and (51) somewhat. Unwilling to change (50), let us adapt (51). Indeed, we require the linearity condition whenever possible, but when we run into problems like the one in the example above, we simply require that alternatives with low payoff are chosen with probability zero. Formally, choice probabilities $P_A(i)$ for all alternatives $i \in A$ satisfy the linear probability model with parameter $\delta > 0$ if the following holds:

\[
\text{if } P_A(i) > 0, \text{ then } P_A(i) - P_A(j) \le \delta(\pi(i) - \pi(j)) \text{ for all } j \in A. \tag{52}
\]

Let us check to see that (52) gives us what we want:

• If both $i$ and $j$ are chosen with positive probability, we find from (52) that

\[
P_A(i) - P_A(j) \le \delta(\pi(i) - \pi(j)) \text{ and } P_A(j) - P_A(i) \le \delta(\pi(j) - \pi(i)).
\]

This implies

\[
P_A(i) - P_A(j) = \delta(\pi(i) - \pi(j)),
\]

in correspondence with the linearity requirement (51).

• The choice probabilities also satisfy (50): take $i, j \in A$ with $\pi(i) \ge \pi(j)$. We need to show that the choice probabilities in the linear probability model satisfy $P_A(i) \ge P_A(j)$. Discern two cases. First, if $P_A(j) = 0$, it automatically follows that $P_A(i) \ge 0 = P_A(j)$. If $P_A(j) > 0$, application of (52) yields

\[
P_A(j) - P_A(i) \le \delta(\pi(j) - \pi(i)) \le 0,
\]

since $\pi(i) \ge \pi(j)$ and $\delta > 0$.

• Combining the two points above, we see that the choice probabilities are weakly increasing in the associated payoffs. By necessity, we had to set the probability of choosing low-payoff alternatives equal to zero, but those that are chosen with positive probability still satisfy the linearity requirement.

Control costs. The choice probabilities can be derived in the same way as before by making a clever choice of the cost function. Consider the cost function that assigns to every probability vector $p \in \Delta_n$ the squared Euclidean distance to the vector $(1/n, \ldots, 1/n)$ that chooses each of the $n$ alternatives with equal probability:

\[
c(p) = \sum_{i=1}^{n} \left( p_i - \frac{1}{n} \right)^2. \tag{53}
\]

So choosing all alternatives with equal probability gives zero costs and costs increase the further away you go from the vector $(1/n, \ldots, 1/n)$. As in the proof of Proposition 11.2, it follows that:

Proposition 11.3 For each $\delta > 0$, there is a unique solution to the maximization problem

\[
\max_{p \in \Delta_n} \sum_{i=1}^{n} p_i \pi(i) - \frac{1}{2\delta} c(p) \tag{54}
\]

with the cost function given in (53). The solution coincides with the choice probabilities in the linear probability model with parameter $\delta$.
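One way to compute the solution of (54) numerically, offered here as a sketch and not as the argument intended for Exercise 11.1, is to complete the square. Writing $e/n = (1/n, \ldots, 1/n)$, the objective in (54) equals $-\frac{1}{2\delta} \lVert p - (e/n + \delta\pi) \rVert^2$ plus a term that does not depend on $p$, so maximizing (54) amounts to projecting the point $e/n + \delta\pi$ onto $\Delta_n$. The code uses the standard sort-based Euclidean projection onto the simplex; the function names are hypothetical.

```python
from fractions import Fraction


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based method)."""
    ordered = sorted(v, reverse=True)
    cumulative = 0
    theta = 0
    for j, v_j in enumerate(ordered, start=1):
        cumulative += v_j
        candidate = (cumulative - 1) / Fraction(j)
        if v_j - candidate > 0:   # holds exactly for the coordinates that stay positive
            theta = candidate
    return [max(x - theta, 0) for x in v]


def linear_probability(payoffs, delta):
    """Maximizer of (54), computed as the projection of (1/n + delta * payoff) onto the simplex."""
    n = len(payoffs)
    target = [Fraction(1, n) + Fraction(delta) * Fraction(x) for x in payoffs]
    return project_to_simplex(target)


def satisfies_52(p, payoffs, delta):
    """Check condition (52) for every alternative chosen with positive probability."""
    indices = range(len(p))
    return all(
        p[i] - p[j] <= delta * (payoffs[i] - payoffs[j])
        for i in indices if p[i] > 0
        for j in indices
    )


# The two-alternative example with payoffs 4 and 0.
print(linear_probability([4, 0], Fraction(1, 8)))
print(linear_probability([4, 0], 100))
```

For $\delta = 1/8$ the projection leaves both alternatives with positive probability and (51) holds with equality; for $\delta = 100$ the low-payoff alternative is pushed to probability zero, as (52) permits.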

The role of δ. Comparing the parameter $\delta$ in the two optimization problems with control costs in (47) and (54), you will notice that they switched roles: large values of $\delta$ correspond with a large weight assigned to the control cost function in the logit model, but with a small weight assigned to the control cost function in the linear probability model. This change was necessary because I wanted to follow the standard definition of the linear probability model in (52). But the intuition remains the same: $\delta$ measures (ir)rationality. In the case of the linear probability model: for large values, (52) indicates that the difference in the probability of choosing an optimal alternative (highest $\pi(i)$) and a suboptimal alternative must be large. In the limit, this forces the probability of choosing suboptimal alternatives to zero.

Conversely, for small values of $\delta$, (52) indicates that the difference in the probability of choosing any two alternatives must be small. Combining this with the fact that probabilities add up to one, this implies that in the limit, all alternatives will be chosen with equal probability.


11.4. Exercises

Exercise 11.1 Prove Proposition 11.3.

Exercise 11.2 Let $A = \{1, 2\}$, $\pi(1) = 4$, $\pi(2) = 0$.

(a) Compute for every $\delta > 0$ the choice probabilities satisfying the linear probability model.

(b) What happens with the choice probabilities as $\delta \to 0$? Interpret.

(c) What happens with the choice probabilities as $\delta \to \infty$? Interpret.

Exercise 11.3 Let $A = \{1, 2, 3\}$, $\pi(1) = 0$, $\pi(2) = 2$, $\pi(3) = 8$.

(a) Compute for each $\delta > 0$ the choice probabilities in the logit model. Do these choice probabilities, for each $\delta > 0$, satisfy path independence? What happens with the choice probabilities as $\delta \to \infty$?

(b) Answer the same questions for the linear probability model.

Exercise 11.4 The penalty function approach: Two of the probabilistic choice models considered above could be rationalized using control cost functions giving a penalty to deviations from uniform randomization. This exercise gives the general argument behind such rationalizations.

A penalty function on $\mathbb{R}^n$ is a function $c : \mathbb{R}^n \to \mathbb{R}_+$. A symmetric penalty function is independent of rearranging the coordinates: for each bijection $r : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and each $x \in \mathbb{R}^n$, it follows that $c(x_1, \ldots, x_n) = c(x_{r(1)}, \ldots, x_{r(n)})$.

Consider a probabilistic choice model over a finite set $A = \{1, \ldots, n\}$ with $n \ge 2$ elements and payoff function $\pi : A \to \mathbb{R}$. Suppose a decision maker's choice probabilities can be rationalized using a symmetric penalty function: given parameter $\delta \ge 0$, they solve the problem

\[
P(\delta) : \max_{p \in \Delta_n} \sum_{i=1}^{n} p_i \pi(i) - \delta c(p - (1/n, \ldots, 1/n)).
\]

Show that the resulting choice probabilities satisfy the desired monotonicity requirement: if $p$ solves $P(\delta)$ and $\pi(i) > \pi(j)$, then $p_i \ge p_j$.


Full circle

To make sure you get the big picture, let us, at the end of this course, turn back to where we started: the overview of the course goals in the preface, and briefly summarize how we achieved them.

The general framework

A meaningful "microfounded" model in any branch of economics derives its conclusions from assumptions about the behavior of individual economic agents. It requires careful answers to the following questions:

(Q1) What can the agent choose from, i.e., what is the set of feasible alternatives?

(Q2) What does the agent like, i.e., what are the preferences over alternatives?

(Q3) How are the former two combined to make a choice, i.e., to select among alternatives?

We mostly stuck to "rational" choice: choose from your set of feasible alternatives a most preferred one.

Sections 1 to 3 provided a general framework for modeling preferences over and choice from arbitrary sets of alternatives. Important stops along the way included:

Utility theory: utility functions are convenient tools to summarize an agent's preferences. Nevertheless, in relevant cases, no utility function exists (Section 2.3). We provided an exact answer to when preferences can be represented by a utility function (Section 2.4). Moreover, we provided conditions under which utility functions had some additional nice structure. For instance, continuity was studied in Section 2.5, cases where preferences could be expressed in terms of a numeraire in Section 2.6.

Existence of solutions: Proposition 3.1 gave a general answer to a fourth central question:

(Q4) When do most preferred elements exist?

If the weak order reflecting the agent's preferences is upper semicontinuous, the agent can find a most preferred alternative in any nonempty, compact set of options. We regularly appealed to this result to establish that problems faced by economic agents actually have a solution; sometimes (as in Propositions 4.3(a) and 7.1(a)) the result could be applied immediately, but sometimes (as in Propositions 4.5(a), 5.4, and 5.5) a little more caution was needed.

Applications of the general framework

In many of the remaining sections, this general framework was applied to specific economic problems. This required giving the set of alternatives as well as the preferences a specific meaning that seems relevant to the problem under consideration. Moreover, this allowed us to study a fifth central question:

(Q5) How are most preferred elements affected by changes in the agent's environment?

Below, I will go through these applications, summarize how feasible sets and preferences were defined, and, if applicable, indicate where we studied the answer to (Q5).


Application 1: consumer facing budget constraint.

• Feasible alternatives: commodity bundles $x \in \mathbb{R}^L_+$ in a budget set $B(p, w)$.

• Preferences: an arbitrary weak order $\succsim$ over the commodity space $X = \mathbb{R}^L_+$.

• Changes in agent's environment: see Sections 4.2 and 4.5.

Application 2: consumer minimizing expenditure.

• Feasible alternatives: commodity bundles $x \in \mathbb{R}^L_+$ achieving a desired utility level.

• Preferences: defined in terms of the expenses $p \cdot x$ at price vector $p \in \mathbb{R}^L_{++}$.

• Changes in agent's environment: see Section 4.3.

Application 3: producer maximizing profit.

• Feasible alternatives: production plans $y$ in a production set $Y \subseteq \mathbb{R}^L$.

• Preferences: defined in terms of the profit $p \cdot y$ at price vector $p \in \mathbb{R}^L_{++}$.

• Changes in agent's environment: see Section 5.3.

Application 4: producer minimizing costs.

• Feasible alternatives: input vectors $z \in \mathbb{R}^{L-1}_+$ achieving a desired output level.

• Preferences: defined in terms of the costs $w \cdot z$ at input price vector $w \in \mathbb{R}^{L-1}_{++}$.

• Changes in agent's environment: see Section 5.5.

Application 5: expected utility theory.

• Feasible alternatives: compound gambles $g$ over a set of deterministic outcomes.

• Preferences: an arbitrary weak order $\succsim$ over the set of compound gambles $G$, under some assumptions resulting in a von Neumann-Morgenstern utility function.

• Changes in agent's environment: see Section 8 on risk attitudes.

Application 6: time preference.

• Feasible alternatives: sequences $c = (c_t)_{t=0}^{\infty}$ of outcomes occurring over time $t$.

• Preferences: come in different forms, for instance:

1. represented by a utility function of the form $U(c) = \sum_{t=0}^{\infty} \delta(t) u(c_t)$,

2. in terms of the limit of means criterion,

3. in terms of the overtaking criterion.

Application 7: probabilistic choice. Although slightly outside the general framework, in some probabilistic choice models like the logit and linear probability model, agents choose probabilities as if they maximize expected payoffs subject to implementation costs:

• Feasible alternatives: choice probabilities assigned to a finite set $A$ of alternatives.

• Preferences: represented by a utility function of the form "expected payoff minus control costs"; see Propositions 11.2 and 11.3.


Beyond these notes

Applications of the general framework abound also in other branches of economics. In macroeconomics, a government may evaluate alternative policies in terms of some social welfare function summarizing the well-being of its citizens. In game theory (the mathematical toolbox used to study interaction between agents, used in many branches of microeconomics, industrial organization, and political economics) players have different strategies to choose from and evaluate them in terms of a preference relation that incorporates the uncertainty they face about, for instance, the choices of the other players.

And what if we leave the realm of rational decision making? Parts of these notes (see, for instance, Exercises 3.4, 3.5, and Section 11) illustrate that as long as we can write down formal postulates about agents' behavior, our mathematical tools allow us to study their consequences in a rigorous and consistent way. This is just the right amount of "rationality" we need:

Behavior is procedurally rational when it is the outcome of appropriate deliberation. Its procedural rationality depends on the process that generated it. (Simon, 1976, p. 131)

Behavior is procedurally rational if there is a procedure (a recipe, if you wish) that translates a decision problem to a well-defined choice. Procedurally rational decision makers are not wild maniacs choosing without any logic whatsoever. Paraphrasing Shakespeare:

Though this be madnesse/Yet there is Method in't. Hamlet, 1603, Act 2, Sc. 2.

I hope that the tools you acquired during this course will help you to address also other economic problems in a structured way.


Notation

If $X$ is a finite set, $|X|$ denotes its cardinality, i.e., its number of elements.
Weak set inclusion (each element of $A$ is also an element of $B$): $A \subseteq B$.
Strict/proper set inclusion ($A \subseteq B$, but $A \neq B$): $A \subset B$.
Set of positive integers: $\mathbb{N} = \{1, 2, 3, \ldots\}$.
Set of integers: $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
Set of rational numbers: $\mathbb{Q} = \{p/q : p, q \in \mathbb{Z}, q \neq 0\}$.
Set of real numbers: $\mathbb{R}$.
For arbitrary $L \in \mathbb{N}$:
Set of vectors in $\mathbb{R}^L$ with nonnegative coordinates: $\mathbb{R}^L_+ = \{x \in \mathbb{R}^L : x_1, \ldots, x_L \ge 0\}$.
Set of vectors in $\mathbb{R}^L$ with positive coordinates: $\mathbb{R}^L_{++} = \{x \in \mathbb{R}^L : x_1, \ldots, x_L > 0\}$.
Sets like $\mathbb{Q}^L_{++}$ are defined analogously.
For two vectors $x, y \in \mathbb{R}^L$, their inner product is denoted by $x \cdot y = x_1 y_1 + \cdots + x_L y_L$.
Moreover, write

$x \ge y$ if $x_i \ge y_i$ for all coordinates $i = 1, \ldots, L$,

$x > y$ if $x_i > y_i$ for all coordinates $i = 1, \ldots, L$.

Relations $\le$ and $<$ are defined analogously.
For $k \in \{1, \ldots, L\}$, $e_k \in \mathbb{R}^L$ denotes the $k$-th standard basis vector with $k$-th coordinate equal to one and all other coordinates equal to zero:

\[
e_k = (0, \ldots, 0, \underbrace{1}_{k\text{-th coordinate}}, 0, \ldots, 0).
\]

The vector of ones is denoted by $e = (1, \ldots, 1) \in \mathbb{R}^L$.


References

Anderson, S.P., de Palma, A., Thisse, J.-F., 1992. Discrete choice theory of product differentiation. MIT Press.

Arrow, K.J., 1959. Rational choice functions and orderings. Economica 26, 121-126.

Arrow, K.J., Hahn, F.H., 1971. General competitive analysis. Amsterdam: North-Holland.

Ben-Akiva, M., Lerman, S.R., 1985. Discrete choice analysis. MIT Press.

Cobb, C.W., Douglas, P.H., 1928. A theory of production. American Economic Review (supplement) 18, 139-165.

Debreu, G., 1954. Representation of a preference ordering by a numerical function. In: Decision Processes. Thrall, Davis, Coombs (eds.), John Wiley, pp. 159-165.

Debreu, G., 1959. Theory of value. Yale University Press.

Debreu, G., 1960. Review of R.D. Luce, Individual Choice Behavior: A Theoretical Analysis. American Economic Review 50, 186-188.

Debreu, G., 1964. Continuity properties of Paretian utility. International Economic Review 5, 285-293.

Diecidue, E., Wakker, P.P., 2002. Dutch books: avoiding strategic and dynamic complications, and a comonotonic extension. Mathematical Social Sciences 43, 135-149.

Dubra, J., Echenique, F., 2001. Monotone preferences over information. Topics in Theoretical Economics 1, article 1. http://www.bepress.com/bejte/topics/vol1/iss1/art1

Fishburn, P.C., 1970a. Utility theory for decision making. New York: John Wiley & Sons.

Fishburn, P.C., 1970b. Intransitive individual indifference and transitive majorities. Econometrica 38, 482-489.

Fishburn, P.C., 1979. Transitivity. Review of Economic Studies 46, 163-173.

Hildenbrand, W., Kirman, A.P., 1988. Equilibrium analysis. North-Holland.

Jaffray, J.-Y., 1975. Existence of a continuous utility function: An elementary proof. Econometrica 43, 981-983.

Kahneman, D., Tversky, A., 1979. Prospect theory: an analysis of decision under risk. Econometrica 47, 263-291.

Kamke, E., 1950. Theory of sets. New York: Dover Publications.

Kaneko, M., 1976. Note on transferable utility. International Journal of Game Theory 5, 183-185.

Koopmans, T.C., 1960. Stationary ordinal utility and impatience. Econometrica 28, 287-309.

Kreps, D.M., 1990. A course in microeconomic theory. Hertfordshire: Harvester Wheatsheaf.

Loewenstein, G., Prelec, D., 1992. Anomalies in intertemporal choice: evidence and interpretation. Quarterly Journal of Economics 107, 573-597.

Luce, R.D., 1959. Individual choice behavior: A theoretical analysis. Wiley.

Mas-Colell, A., 1985. The theory of general economic equilibrium: A differentiable approach. Cambridge: Cambridge University Press.

Mas-Colell, A., Whinston, M.D., Green, J.R., 1995. Microeconomic theory. Oxford: Oxford University Press.

Mattsson, L.-G., Weibull, J.W., 2002. Probabilistic choice and procedurally bounded rationality. Games and Economic Behavior 41, 61-78.

Osborne, M.J., Rubinstein, A., 1994. A course in game theory. Cambridge, MA: MIT Press.

Phelps, E.S., Pollak, R.A., 1968. On second-best national saving and game-equilibrium growth. Review of Economic Studies 35, 201-208.

Pratt, J.W., 1964. Risk aversion in the small and in the large. Econometrica 32, 122-136.


Rabin, M., 2000. Risk aversion and expected-utility theory: a calibration theorem. Econometrica 68, 1281-1292.

Rosenthal, R.W., 1989. A bounded-rationality approach to the study of noncooperative games. International Journal of Game Theory 18, 273-292.

Rubinstein, A., 2006. Lecture notes in microeconomic theory. Princeton NJ: Princeton University Press. http://arielrubinstein.tau.ac.il/Rubinstein2007.pdf

Simon, H., 1955. A behavioral model of rational choice. Quarterly Journal of Economics 69, 99-118.

Simon, H.A., 1976. From substantive to procedural rationality. In: Method and Appraisal in Economics. Latsis, S.J. (ed.), Cambridge University Press, pp. 129-146.

Starr, R.M., 1997. General equilibrium theory. Cambridge University Press.

Thaler, R., 1981. Some empirical evidence on dynamic inconsistency. Economics Letters 8, 201-207.

Varian, H.R., 1992. Microeconomic analysis. New York: W.W. Norton & Company, 3rd edition.

Voorneveld, M., 2006. Probabilistic choice in games: properties of Rosenthal's t-solutions. International Journal of Game Theory 34, 105-121.

Voorneveld, M., 2007. The possibility of impossible stairways: Tail events and countable player sets. To appear in Games and Economic Behavior.

Voorneveld, M., 2008. From preferences to Cobb-Douglas utility. SSE/EFI Working Paper Series in Economics and Finance, No. 701.

Wärneryd, K., 2007. Sexual reproduction and time-inconsistent preferences. Economics Letters 95, 14-16.


Suggested solutions

These are (sometimes short) solutions to most exercises in the lecture notes. In solutions to the home assignments and exam questions, you are expected to start from relevant definitions, and clearly deduce and motivate your answers. Suggestions for improvements (and corrections of potential mistakes?) are welcome!

Section 1

Exercise 1.1

(a): Each pair of words can be arranged in alphabetical order, so $\succsim$ is complete. Moreover, if word $x$ is found before or at the same place as (in case the words are identical) word $y$ in the dictionary, and word $y$ is found before or at the same place as word $z$ in the dictionary, then word $x$ is found before or at the same place as word $z$ in the dictionary. Conclude that $\succsim$ is transitive.

(b): The binary relation $\succsim$ defined by "knows" is not necessarily complete or transitive. A violation of completeness occurs if there exist people who are unfamiliar with each other. Violations of transitivity are also common: I know my wife, my wife knows her boss, but I do not know my wife's boss.

Exercise 1.2

(a): [Reflexivity of $\sim$] Let $x \in X$. By completeness of $\succsim$: $x \succsim x$ and (simply changing the order of writing) $x \precsim x$. By definition of $\sim$: $x \sim x$. Conclude that $\sim$ is reflexive.

[Symmetry of $\sim$] Let $x, y \in X$ with $x \sim y$. By definition of $\sim$, $x \succsim y$ and $y \succsim x$. But this is also the definition of $y \sim x$. Conclude that $\sim$ is symmetric.

[Transitivity of $\sim$] Let $x, y, z \in X$ have $x \sim y$ and $y \sim z$. By definition of $\sim$, this means that $x \succsim y$, $y \succsim x$, $y \succsim z$, $z \succsim y$. By transitivity of $\succsim$, $x \succsim y$ and $y \succsim z$ give $x \succsim z$. Similarly, $z \succsim y$ and $y \succsim x$ give $z \succsim x$. Since $x \succsim z$ and $z \succsim x$: $x \sim z$. Conclude that $\sim$ is transitive.

(b): [Irreflexivity of $\succ$] Let $x \in X$. By definition of $\succ$, $x \succ x$ would require that $x \succsim x$ but not $x \succsim x$, a contradiction. Conclude that $\succ$ is irreflexive.

[Asymmetry of $\succ$] Let $x, y \in X$ with $x \succ y$. By definition of $\succ$, $x \succsim y$ but not $y \succsim x$. By definition of $\succ$, not $y \succ x$. Conclude that $\succ$ is asymmetric.

[Transitivity of $\succ$] Let $x, y, z \in X$ have $x \succ y$ and $y \succ z$. By definition of $\succ$, this means that $x \succsim y$ but not $y \succsim x$, and that $y \succsim z$ but not $z \succsim y$. By transitivity of $\succsim$, $x \succsim y$ and $y \succsim z$ give $x \succsim z$. It is not true that $z \succsim x$. If it were, transitivity of $\succsim$ with $z \succsim x$ and $x \succsim y$ would imply $z \succsim y$, contradicting $y \succ z$. Since $x \succsim z$, but not $z \succsim x$: $x \succ z$. Conclude that $\succ$ is transitive.

(c): Let $x, y, z \in X$ have $x \sim y$ and $y \succsim z$. By definition of $\sim$, this implies that $x \succsim y$. As $x \succsim y$ and $y \succsim z$, transitivity of $\succsim$ gives $x \succsim z$.

Exercise 1.3

(a): Assume $\succsim$ is strongly monotonic. Let $k \in \{1, \ldots, L\}$ be one of the coordinates, let $x \in X$, and $\varepsilon > 0$. Then $x + \varepsilon e_k \geq x$ and $x + \varepsilon e_k \neq x$, so by strong monotonicity, $x + \varepsilon e_k \succ x$. Conclude that $\succsim$ is strongly monotonic in coordinate $k$.

Now assume that $\succsim$ is strongly monotonic in each of its coordinates and transitive. Let $x, y \in X$ with $x \geq y$ and $x \neq y$. To show: $x \succ y$.

Starting with $x$, change the coordinates one by one to those of $y$. Formally, let $z(0) = x$ and, for each $k \in \{1, \ldots, L\}$, define

$$z(k) = x + \sum_{\ell=1}^{k} (y_\ell - x_\ell) e_\ell.$$

Then either $z(k) = z(k-1)$ if the $k$-th coordinates of $x$ and $y$ are the same, or $z(k-1) \succ z(k)$ by strong monotonicity in the $k$-th coordinate. Since $x \neq y$, the strict comparison occurs for at least one $k$. By transitivity, we find that $x = z(0) \succ z(L) = y$.

(b): The preference relation $\succsim$ on $\mathbb{R}^2_+$ with

$$\forall x, y \in \mathbb{R}^2_+ : \quad x \succsim y \iff x_k > y_k \text{ for exactly one coordinate } k \in \{1, 2\}$$

is strongly monotonic in coordinate $k$ for both $k = 1$ and $k = 2$, but not strongly monotonic: $(2, 2)$ is not strictly preferred to $(1, 1)$. Notice: in line with (a), relation $\succsim$ is not transitive.

(c): No. The point $(0, \ldots, 0) \in \mathbb{R}^L_+$ cannot be improved upon: since "less is better", $(0, \ldots, 0) \succ x$ for every $x \in \mathbb{R}^L_+$ with $x \neq (0, \ldots, 0)$.

(d): Yes. Notice that the issue above, that improvements beyond the zero vector are impossible if one is constrained to vectors with nonnegative coordinates, disappears. Let $x \in \mathbb{R}^L$ and $\varepsilon > 0$. Define $y = x - \frac{\varepsilon}{2} e_1 \in \mathbb{R}^L$. Then $\|x - y\| = \frac{\varepsilon}{2} < \varepsilon$ and $y \leq x$, with strict inequality in the first coordinate. Since "less is better", $y \succ x$. Conclude that $\succsim$ is locally nonsatiated.
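The counterexample in Exercise 1.3(b) can be checked mechanically. The following is a minimal sketch, not part of the original solutions; the function names `weakly_prefers` and `strictly_prefers` are illustrative choices, and the bundles are the ones used in the text plus the intermediate bundle $(1, 2)$.

```python
def weakly_prefers(x, y):
    """x is weakly preferred to y iff x_k > y_k for exactly one coordinate k."""
    return sum(xk > yk for xk, yk in zip(x, y)) == 1


def strictly_prefers(x, y):
    """Strict part: x is weakly preferred to y, but not the other way around."""
    return weakly_prefers(x, y) and not weakly_prefers(y, x)


# Raising one coordinate at a time is a strict improvement ...
print(strictly_prefers((2, 1), (1, 1)))  # True
print(strictly_prefers((1, 2), (1, 1)))  # True
# ... but raising both at once is not, so strong monotonicity fails.
print(strictly_prefers((2, 2), (1, 1)))  # False
# Transitivity fails along the chain (2, 2), (1, 2), (1, 1).
print(weakly_prefers((2, 2), (1, 2)),
      weakly_prefers((1, 2), (1, 1)),
      weakly_prefers((2, 2), (1, 1)))  # True True False
```

The last line makes the failure of transitivity explicit, which is why part (a) does not apply to this relation.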

Exercise 1.4

(a): The preference relation on $\mathbb{R}^2_+$ with

$$x \succsim y \iff (x_1 + 1)(x_2 + 1) \geq (y_1 + 1)(y_2 + 1)$$

is strongly monotonic in coordinate 1, but not quasilinear in coordinate 1: let $x = (1, 2)$ and $y = (2, 1)$. Then

$$(x_1 + 1)(x_2 + 1) = (y_1 + 1)(y_2 + 1) = 6, \text{ so } x \sim y.$$

Increase the first coordinate of $x$ and $y$ by $\varepsilon > 0$. Then

$$(x_1 + \varepsilon + 1)(x_2 + 1) = 3(2 + \varepsilon) > 2(3 + \varepsilon) = (y_1 + \varepsilon + 1)(y_2 + 1), \text{ so } x + \varepsilon e_1 \succ y + \varepsilon e_1.$$

Quasilinearity would require that the indifference remains unaffected.

(b): The preference relation $\succsim$ on $\mathbb{R}^2_+$ where all alternatives are equivalent with each other ($x \succsim y$ for all $x, y$, represented by a constant utility function) is trivially quasilinear but not strongly monotonic in coordinate 1.

(c): Same preference relation as in (b).

(d): The preference relation on $\mathbb{R}^2_+$ with

$$x \succsim y \iff 4x_1 + 3x_2^2 \geq 4y_1 + 3y_2^2$$

satisfies all three monotonicity properties, but is not homothetic. For instance, $(1, 0) \succ (0, 1)$, as $4 \cdot 1 + 3 \cdot 0^2 > 4 \cdot 0 + 3 \cdot 1^2$, but $2(1, 0) \prec 2(0, 1)$, as $4 \cdot 2 + 3 \cdot 0^2 < 4 \cdot 0 + 3 \cdot 2^2$.
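The arithmetic behind parts (a) and (d) of Exercise 1.4 is easy to confirm numerically. This is a small sketch under stated assumptions: the names `u_a`, `u_d` and `shift_first` are hypothetical helpers, and `eps = 0.5` is an arbitrary illustrative value of $\varepsilon$.

```python
def u_a(x):
    """Utility behind the relation in part (a): (x1 + 1)(x2 + 1)."""
    return (x[0] + 1) * (x[1] + 1)


def u_d(x):
    """Utility behind the relation in part (d): 4*x1 + 3*x2**2."""
    return 4 * x[0] + 3 * x[1] ** 2


def shift_first(x, eps):
    """Add eps to the first coordinate of the bundle x."""
    return (x[0] + eps, x[1])


x, y, eps = (1, 2), (2, 1), 0.5

# Part (a): x ~ y, but the indifference breaks after shifting both bundles.
print(u_a(x) == u_a(y))                                    # True
print(u_a(shift_first(x, eps)) > u_a(shift_first(y, eps)))  # True

# Part (d): the ranking of (1, 0) and (0, 1) flips after doubling both bundles.
print(u_d((1, 0)) > u_d((0, 1)))  # True
print(u_d((2, 0)) < u_d((0, 2)))  # True
```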

Exercise 1.5

(a): Let $x, y \in \mathbb{R}^L_+$ have $x \geq y$. For each $n \in \mathbb{N}$, $x^n = x + (1/n, \ldots, 1/n) \in \mathbb{R}^L_+$ satisfies $x^n > y$, so $x^n \succsim y$ (in fact, even $x^n \succ y$). Letting $n \to \infty$, continuity implies that $\lim_{n \to \infty} x^n = x \succsim y$.

(b): Let $x, y \in \mathbb{R}^L_+$ have $x > y$. Then $\min\{x_1, x_2\} > \min\{y_1, y_2\}$, so $x \succsim y$, but not $y \succsim x$, i.e., $x \succ y$. Let $x = (2, 1)$ and $y = (1, 1)$. In both cases, you can only mix one unit of drink, but $x$ wastes one unit of the first ingredient, so even though $x \geq y$, $x \prec y$.

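Exercise 1.6

(a): Assume the first definition of convexity holds. Let $y \in X$. To show: $\{x \in X : x \succsim y\}$ is a convex set.

Let $z, z' \in \{x \in X : x \succsim y\}$ and $\alpha \in [0, 1]$. Using completeness of $\succsim$, we may assume w.l.o.g. that $z \succsim z'$. By convexity, $\alpha z + (1 - \alpha) z' \succsim z' \succsim y$, so $\alpha z + (1 - \alpha) z' \succsim y$ by transitivity of $\succsim$.

Conversely, assume the second definition of convexity holds. Let $x, y \in X$ with $x \succsim y$ and $\alpha \in [0, 1]$. To show: $\alpha x + (1 - \alpha) y \succsim y$.

Elements $x$ and (by completeness) $y$ both lie in the set $\{z \in X : z \succsim y\}$, which is convex by assumption, so it also contains $\alpha x + (1 - \alpha) y$. Conclude that $\alpha x + (1 - \alpha) y \succsim y$.

(b): Consider the preference relation $\succsim$ on $\mathbb{R}$ with

$$\forall x, y \in \mathbb{R} : \quad x \succsim y \iff x \geq 0 > y.$$

For each $y \in \mathbb{R}$:

$$\{x \in \mathbb{R} : x \succsim y\} = \begin{cases} \emptyset & \text{if } y \geq 0, \\ \mathbb{R}_+ & \text{if } 0 > y, \end{cases}$$

is convex. Therefore, it satisfies the convexity condition stated in terms of upper contour sets (the second definition in (a)). However, if $x = 1$, $y = -3$, $\alpha = 1/2$, then $x \succsim y$, but not $\alpha x + (1 - \alpha) y \succsim y$, in violation of the convexity definition stated in terms of convex combinations (the first definition in (a)).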

Section 2

Exercise 2.1

(a): [Transitivity] Let $x, y, z \in \mathbb{R}$ satisfy $x \succsim y$, $y \succsim z$. By definition, $x \geq y + 1$ and $y \geq z + 1$, so $x \geq y + 1 \geq z + 2 \geq z + 1$, so $x \succsim z$.

[Violation of completeness] Completeness requires in particular that for each $x \in \mathbb{R}$: $x \succsim x$, i.e., that $x \geq x + 1$. Clearly, this is not true.

(b): [Prop. 2.1(b) satisfied] Let $x, y \in \mathbb{R}$. If $x \succ y$, then $x \succsim y$, so $x \geq y + 1$. Therefore, $u(x) = x \geq y + 1 > y = u(y)$. Moreover, there are no $x, y \in X$ with $x \sim y$ (as this would require $x \geq y + 1$ and $y \geq x + 1$), so the second condition is vacuous.

[Prop. 2.1(a) violated] $u$ does not represent $\succsim$, since $\succsim$ is not complete and the order induced by $u$ is.

Exercise 2.2

(a): Suppose the collection of jumps in $U$ is uncountable. Consider two distinct jumps $(u_1, u_2)$ and $(v_1, v_2)$. The intervals $(u_1, u_2)$ and $(v_1, v_2)$ are disjoint by definition of a jump. Moreover, each such interval contains a rational number, necessarily distinct from the one in the other interval, since these intervals are disjoint. Therefore, there is an injective function from the uncountable set of jumps to the countable set of rational numbers, a contradiction.

(b): $C$ is the union of two countable sets $J$ and $R$ and therefore countable itself. Let $x, y \in X$ with $x \succ y$. To show: there are $c_1, c_2 \in C$ with $x \succsim c_1 \succ c_2 \succsim y$.

Case 1: $(u(y), u(x))$ is a jump in $U$. By definition of $J$, there are points $c_1, c_2 \in J \subseteq C$ with utility $u(c_1) = u(x)$, $u(c_2) = u(y)$. Hence $x \sim c_1 \succ c_2 \sim y$, as in the requirement for Jaffray order-separability.

Case 2: $(u(y), u(x))$ is not a jump in $U$. Then $(u(y), u(x)) \cap U \neq \emptyset$. By definition of $R$, there is a $c \in R \subseteq C$ with $u(c) \in (u(y), u(x))$. Now apply the reasoning so far to $(u(c), u(x))$. If it is a jump in $U$, Case 1 says that there are $c_1, c_2 \in C$ with $x \sim c_1 \succ c_2 \sim c \succ y$, as in the requirement for Jaffray order-separability. If it is not a jump, repeating the construction of Case 2 says that there is a $c' \in C$ with $u(c') \in (u(c), u(x))$, so that $x \succ c' \succ c \succ y$, as in the requirement for Jaffray order-separability.

(c): Let $x, y \in X$. If $x \succ y$, there exist, by Jaffray order-separability, $c_1, c_2 \in C$ with $x \succsim c_1 \succ c_2 \succsim y$. Therefore, $\{c \in C : c \precsim x\} \supset \{c \in C : c \precsim y\}$, as the former set includes $c_1$, whereas the latter doesn't. Conclude that $u(x) - u(y) \geq 2^{-n(c_1)} > 0$. If $x \sim y$, then $\{c \in C : c \precsim x\} = \{c \in C : c \precsim y\}$, so $u(x) = u(y)$.

Exercise 2.3

(a): True. By definition of a continuous function, pre-images of open sets are open sets. Consequently, for each $x \in X$, the sets

$$\{y \in X : y \prec x\} = u^{-1}\big(\underbrace{(-\infty, u(x))}_{\text{open}}\big) \quad \text{and} \quad \{y \in X : y \succ x\} = u^{-1}\big(\underbrace{(u(x), \infty)}_{\text{open}}\big)$$

are open sets.

(b): False. The usual "greater than or equal to" order $\geq$ on $\mathbb{R}$ is represented by the continuous utility function $u : \mathbb{R} \to \mathbb{R}$ with $u(x) = x$ and hence, by (a), continuous. However, any strictly increasing function $u : \mathbb{R} \to \mathbb{R}$ represents $\geq$, including the discontinuous function

$$u(x) = \begin{cases} x & \text{if } x < 0, \\ x + 1 & \text{if } x \geq 0. \end{cases}$$

Exercise 2.4

• No. Lexicographic preferences (modified in such a way that you start comparing the second coordinates, then the first) on $\mathbb{R}^2_+$ constitute an example where preferences cannot even be represented by a utility function. Let $x, y \in \mathbb{R}^2_+$ have $x_2 > y_2$. The modified lexicographic preference starts by looking at these second coordinates, so no matter how much money you add to the first coordinate of $y$, you will strictly prefer $x$.

• Here is an example where preferences can be represented by a utility function. It makes having a second coordinate below one so bad that you can never compensate for it with money and make it look as nice as an alternative whose second coordinate is at least one. The preference relation $\succsim$ on $\mathbb{R}^2_+$ represented by the utility function

$$u(x) = \begin{cases} \Phi(x_1) + 1 & \text{if } x_2 \geq 1, \\ \Phi(x_1) & \text{if } x_2 < 1, \end{cases}$$

where $\Phi : \mathbb{R} \to (0, 1)$ is strictly increasing (like the cdf of a standard normal distribution), satisfies all properties in Proposition 2.11, except (8).

• Under additional assumptions (like continuity, monotonicity), the answer is yes. See Rubinstein (2006, Lecture 4).

Exercise 2.5

(a): Consider $(a, 0)$ and $(a', 0)$ in $X$. Either $(a, 0) \sim (a', 0)$, in which case we take $m = m' = 0$, or one of the alternatives is strictly preferred over the other, w.l.o.g. $(a, 0) \succ (a', 0)$. In the latter case, invoke the first property to conclude that there is an amount of money $m^*$ such that $(a, 0) \sim (a', m^*)$. Take $m = 0$, $m' = m^*$.

(b): W.l.o.g., $m \leq w$. By the third property with $c = w - m$:

$$(a', w') \sim (a, w) = (a, m + (w - m)) \sim (a', m' + (w - m)),$$

so $(a', w') \sim (a', m' + (w - m))$ by transitivity of $\sim$. But then $w' = m' + (w - m)$ by strong monotonicity in money.

(c): Let $(a, m), (a', m') \in X$. To show: $(a, m) \succsim (a', m')$ if and only if $u(a, m) \geq u(a', m')$.

By the first two properties, there are unique amounts of money $m_1, m_2 \geq 0$ such that $(a, m) \sim (a^*, m_1)$ and $(a', m') \sim (a^*, m_2)$. By definition of $v$, we find that

$$u(a, m) = (m_1 - m) + m = m_1 \quad \text{and, similarly, that} \quad u(a', m') = m_2. \tag{55}$$

Therefore,

$$\begin{aligned} (a, m) \succsim (a', m') &\iff (a^*, m_1) \succsim (a^*, m_2) \\ &\iff m_1 \geq m_2 \\ &\iff u(a, m) \geq u(a', m'), \end{aligned}$$

where the first equivalence follows from the fact that $(a, m) \sim (a^*, m_1)$ and $(a', m') \sim (a^*, m_2)$, the second equivalence from strong monotonicity in money, and the final one from (55).

Exercise 2.6

(a): Let $r \in \mathbb{R}$. If $X_u(r)$ contains at most one element, it is convex. If it contains two or more, let $x, y \in X_u(r)$ and let $\alpha \in (0, 1)$. To show: $\alpha x + (1 - \alpha) y \in X_u(r)$.

Without loss of generality, assume that $x \succsim y$, so that $u(x) \geq u(y) \geq r$. By convexity of $\succsim$: $\alpha x + (1 - \alpha) y \succsim y$, so $u(\alpha x + (1 - \alpha) y) \geq u(y) \geq r$, i.e., $\alpha x + (1 - \alpha) y \in X_u(r)$.

(b): Let's do the quasiconcavity part; strict quasiconcavity proceeds similarly. Assume $u : X \to \mathbb{R}$ is quasiconcave. Let $y \in X$. To show: $\{x \in X : x \succsim y\}$ is a convex set.

By definition, $\{x \in X : x \succsim y\} = \{x \in X : u(x) \geq u(y)\} = X_u(r)$, with $r = u(y)$. The latter set is convex by the definition of a quasiconcave function under (a).

(c): A function $u$ on a convex domain $X$ is concave if its subgraph

$$\text{subgraph}(u) = \{(x, y) \in X \times \mathbb{R} : y \leq u(x)\}$$

is a convex set. Consider the weak order $\succsim$ on $X = \mathbb{R}$ represented by the utility function

$$u(x) = \begin{cases} 0 & \text{if } x \leq 0, \\ 1 & \text{if } x > 0. \end{cases}$$

This preference relation is convex, as, for each $y \in X$, the upper contour sets are convex:

$$\{x \in X : x \succsim y\} = \begin{cases} \mathbb{R} & \text{if } y \leq 0, \\ (0, \infty) & \text{if } y > 0. \end{cases}$$

Suppose $v : X \to \mathbb{R}$ were a concave utility function representing $\succsim$. By definition, $(-1, v(-1))$ and $(1, v(1))$ are elements of $\text{subgraph}(v)$. Take $\alpha = 1/2$ and consider the convex combination

$$\tfrac{1}{2}(-1, v(-1)) + \tfrac{1}{2}(1, v(1)) = \big(0, \tfrac{1}{2} v(-1) + \tfrac{1}{2} v(1)\big).$$

Since $v(-1) < v(1)$, this point does not lie in the subgraph of $v$:

$$v(0) = v(-1) < \tfrac{1}{2} v(-1) + \tfrac{1}{2} v(1).$$

So $\text{subgraph}(v)$ is not convex, contradicting the concavity of $v$.

Exercise 2.7

(a):

• For all $n \in \mathbb{N}$, $f(nu) = n f(u)$ by additivity and induction on $n$.

• $f(0) = f(0 + 0) = f(0) + f(0)$, so $f(0) = 0$. Hence $f(0u) = 0 f(u)$.

• For all $n \in \mathbb{N}$, $f(-nu) = -n f(u)$: indeed, $0 = f(0) = f(nu + (-nu)) = f(nu) + f(-nu)$, so $f(-nu) = -f(nu) = -n f(u)$.

• So $f(xu) = x f(u)$ for all $x \in \mathbb{Z}$.

• For $x \in \mathbb{Q}$, write $x = p/q$ for some $p, q \in \mathbb{Z}$, $q \neq 0$. Rewriting $xu = (p/q)u$ gives $q(xu) = pu$. Hence $f(q(xu)) = f(pu)$. By the above, $q f(xu) = p f(u)$, so $f(xu) = (p/q) f(u) = x f(u)$.

(b): If $f$ is not linear, there are $x, y \in \mathbb{R} \setminus \{0\}$ with $f(x)/x \neq f(y)/y$. Hence, vectors $a = (x, f(x))$ and $b = (y, f(y))$ are linearly independent: vectors $\alpha a + \beta b$ with $\alpha, \beta \in \mathbb{R}$ span $\mathbb{R}^2$. So vectors $\alpha a + \beta b$ with $\alpha, \beta \in \mathbb{Q}$ are dense in $\mathbb{R}^2$. The latter vectors are in the graph of $f$: for $\alpha, \beta \in \mathbb{Q}$, (a) implies that

$$(\alpha x + \beta y, f(\alpha x + \beta y)) = (\alpha x + \beta y, f(\alpha x) + f(\beta y)) = (\alpha x + \beta y, \alpha f(x) + \beta f(y)) = \alpha a + \beta b.$$

(c):

• For each $i \in \{1, \ldots, n\}$, define $f_i : \mathbb{R} \to \mathbb{R}$ as follows:

$$\forall x_i \in \mathbb{R} : \quad f_i(x_i) = F(x_i e_i).$$

• Applying additivity of $F$ $(n - 1)$ times gives, for each $x \in \mathbb{R}^n$, that

$$F(x) = F\Big(\sum_{i=1}^{n} x_i e_i\Big) = \sum_{i=1}^{n} F(x_i e_i) = \sum_{i=1}^{n} f_i(x_i).$$

• To see that each $f_i$ must be additive, let $x_i, y_i \in \mathbb{R}$. By additivity of $F$:

$$f_i(x_i + y_i) = F(x_i e_i + y_i e_i) = F(x_i e_i) + F(y_i e_i) = f_i(x_i) + f_i(y_i).$$

Section 3

Exercise 3.2

By continuity of $f$, the weak order $\succsim$ on $X$ with

$$\forall x, y \in X : \quad x \succsim y \iff f(x) \geq f(y)$$

has open lower contour sets: for each $x \in X$,

$$L(x) = \{y \in X : y \prec x\} = \{y \in X : f(y) < f(x)\} = f^{-1}((-\infty, f(x)))$$

is the pre-image of an open interval. By Proposition 3.1, $X$ contains a best element. By definition of $\succsim$, this best element is a maximum of $f$. Existence of a minimum can be established by applying the proposition to the weak order $\succsim^*$ with

$$\forall x, y \in X : \quad x \succsim^* y \iff f(x) \leq f(y).$$

Exercise 3.3

(a): Assume $(X, \mathcal{B}, C)$ is rationalizable by the weak order $\succsim$ on $X$. Let $A, B \in \mathcal{B}$, $x, y \in A \cap B$, $x \in C(A)$, $y \in C(B)$. To show: $x \in C(B) = \{z \in B : z \succsim z' \text{ for all } z' \in B\}$.

Since $y \in A$ and $x \in C(A) = \{z \in A : z \succsim z' \text{ for all } z' \in A\}$: $x \succsim y$. Let $z' \in B$. Since $y \in C(B)$: $y \succsim z'$. Using $x \succsim y$ and transitivity of $\succsim$: $x \succsim z'$. So $x \succsim z'$ for all $z' \in B$, i.e., $x \in C(B)$.

(b): No. Consider the choice structure with

$$X = \{a, b, c, d\}, \quad \mathcal{B} = \{\{a, b, c\}, \{b, c, d\}\}, \quad C(\{a, b, c\}) = \{b\}, \quad C(\{b, c, d\}) = \{c\}.$$

It trivially satisfies IIA: there are no distinct sets $A, B \in \mathcal{B}$ with $A \subseteq B$. It does not satisfy WARP: in the first problem, $b$ is revealed at least as good as $c$, in the second $c$ is revealed at least as good as $b$. So $b$ should have been contained in $C(\{b, c, d\})$.

(c): No. The choice structure in (b) satisfies IIA, but is not rationalizable. Suppose, to the contrary, that $\succsim$ rationalizes it. Since $C(\{a, b, c\}) = \{b\}$, we must have that $b \succsim c$ and $b \succsim a$. Since $C(\{b, c, d\}) = \{c\}$, we must have that $c \succsim b$ and $c \succsim d$. But then $b \sim c$, so $c \sim b \succsim a$ implies $c \succsim a$. But then $c \succsim y$ for all $y \in \{a, b, c\}$, so $c$ should have been included in $C(\{a, b, c\})$.

(d): No. Consider the choice structure with $X = \{a, b, c\}$, $\mathcal{B} = \{\{a, b\}, \{b, c\}, \{a, c\}\}$, $C(\{a, b\}) = \{a\}$, $C(\{b, c\}) = \{b\}$, $C(\{a, c\}) = \{c\}$. As distinct choice sets have only one point in common, WARP is trivially satisfied. It is not rationalizable, as a rationalizing $\succsim$ should satisfy $a \succ b$, $b \succ c$, $c \succ a$, in violation of transitivity.
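For finite choice structures like those in Exercise 3.3, the three properties can be verified by brute force. The sketch below is only an illustration: the function names are hypothetical, a choice structure is encoded as a dictionary mapping each choice set to its chosen subset, WARP and IIA are implemented in the form they are used in these solutions, and rationalizability is tested by enumerating all utility assignments with values in $\{0, \ldots, |X| - 1\}$, which covers every weak order on a finite $X$.

```python
from itertools import product


def satisfies_warp(choice):
    """If x, y lie in both A and B, x in C(A), y in C(B), then x in C(B)."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            common = A & B
            for x in common & CA:
                if common & CB and x not in CB:
                    return False
    return True


def satisfies_iia(choice):
    """If A is a subset of B and C(B) meets A, then C(A) equals C(B) & A."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            if A <= B and CB & A and CA != CB & A:
                return False
    return True


def is_rationalizable(X, choice):
    """Search for a weak order (encoded by utility levels) generating C."""
    X = sorted(X)
    for levels in product(range(len(X)), repeat=len(X)):
        u = dict(zip(X, levels))
        if all(CB == frozenset(z for z in B if u[z] == max(u[t] for t in B))
               for B, CB in choice.items()):
            return True
    return False


# Choice structure from parts (b) and (c).
choice_b = {frozenset("abc"): frozenset("b"), frozenset("bcd"): frozenset("c")}
# Choice structure from part (d).
choice_d = {frozenset("ab"): frozenset("a"),
            frozenset("bc"): frozenset("b"),
            frozenset("ac"): frozenset("c")}

print(satisfies_iia(choice_b), satisfies_warp(choice_b),
      is_rationalizable("abcd", choice_b))  # True False False
print(satisfies_warp(choice_d), is_rationalizable("abc", choice_d))  # True False
```

The enumeration is exponential in $|X|$, so this is only meant for small examples of the kind used in the exercise.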

Exercise 3.4

(a): [WARP satisfied] Let $A, B \in \mathcal{B}$, $x, y \in A \cap B$, $x \in C(A)$, and $y \in C(B)$. To show: $x \in C(B)$.

We will simply show that $x = y$. By definition of $C$:

$$\forall B \in \mathcal{B} : \text{ if } B \text{ contains a satisfactory alternative, } C(B) \text{ selects one of them.} \tag{56}$$

Distinguish two cases:

Case 1: $v(x) < r$. Then also $v(y) < r$ by (56). Now $x \in C(A)$ implies that $x$ is the largest element of $A$. In particular, since $y \in A$: $y \leq x$. Similarly, $y \in C(B)$ implies that $x \leq y$. So $x = y \in C(B)$.

Case 2: $v(x) \geq r$. Then also $v(y) \geq r$ by (56). Now $x \in C(A)$ implies that $x$ is the smallest satisfactory element of $A$. In particular, since $y \in A$: $x \leq y$. Similarly, $y \in C(B)$ implies that $y \leq x$. So $x = y \in C(B)$.

[IIA satisfied] WARP implies IIA.

[A rationalizing weak order] Some conditions need to be satisfied: a satisfactory element is always preferred to a nonsatisfactory one; among nonsatisfactory alternatives, the largest is chosen, so there having a high index is preferable; among satisfactory alternatives, the smallest is chosen, so there having a low index is preferable. One weak order (verify!) rationalizing the choice structure is obtained by writing down (from worst to best) all nonsatisfactory alternatives from smallest to largest, then all satisfactory alternatives from largest to smallest.

(b): [IIA violated] For each $B \in \mathcal{B}$, let $x^*(B)$ be your partner's most preferred element of $B$. By definition of $C$: $C(B) = B \setminus \{x^*(B)\}$ for each $B \in \mathcal{B}$ with more than one element. Take $B = X$, $A = C(B)$. Both sets lie in $\mathcal{B}$ and $A \subseteq B$. Moreover, $C(B) \cap A = C(B) \neq \emptyset$. IIA would imply that $C(A) = C(B) \cap A = C(B)$, but $C(A) = A \setminus \{x^*(A)\} \subset A = C(B)$, a contradiction.

[WARP violated] WARP implies IIA and is therefore violated as well.

[Rationalizability] As WARP is violated, the choice structure is not rationalizable.
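The "(verify!)" in part (a) of Exercise 3.4 can be done exhaustively on a small instance. The sketch below assumes, as the solution suggests, that alternatives are indexed by integers, that the procedure picks the smallest satisfactory alternative if one exists and the largest alternative otherwise, and that every nonempty subset is a choice set. The payoffs `v` and the threshold `r` are made-up illustrative values, and the function names are hypothetical.

```python
from itertools import combinations


def satisficing_choice(B, v, r):
    """Smallest satisfactory alternative if any exists, else the largest one."""
    satisfactory = [x for x in B if v[x] >= r]
    return min(satisfactory) if satisfactory else max(B)


def rationalizing_rank(X, v, r):
    """Rank from worst to best: nonsatisfactory alternatives from smallest to
    largest, followed by satisfactory alternatives from largest to smallest."""
    bad = sorted(x for x in X if v[x] < r)
    good = sorted((x for x in X if v[x] >= r), reverse=True)
    return {x: rank for rank, x in enumerate(bad + good)}


X = [1, 2, 3, 4, 5]
v = {1: 0.2, 2: 0.9, 3: 0.4, 4: 0.7, 5: 0.1}  # hypothetical payoffs
r = 0.5                                        # hypothetical aspiration level

rank = rationalizing_rank(X, v, r)

# On every nonempty choice set, the procedure picks the top-ranked element.
for size in range(1, len(X) + 1):
    for B in combinations(X, size):
        assert satisficing_choice(B, v, r) == max(B, key=rank.get)

print(rank)  # {1: 0, 3: 1, 5: 2, 4: 3, 2: 4}
```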

Exercise 3.5

(a): In $B_1$, the first commodity has the highest price $p_1 = 2$, so spending wealth $w = 2$ on the first commodity gives $C(B_1) = \{(1, 0)\}$. Similarly, $C(B_2) = \{(0, 1)\}$.

(b): Yes, there is no set-inclusion between the two choice sets, so IIA holds vacuously.

(c): No, bundles $x = (1, 0)$ and $y = (0, 1)$ lie in $B_1 \cap B_2$. Since $x \in C(B_1)$ and $y \in C(B_2)$, WARP would require $x \in C(B_2)$.

(d): No: $C(B_1) = \{x\}$ would require $x \succ y$, whereas $C(B_2) = \{y\}$ would require $y \succ x$.

(e): For instance:

$$u(x, p) = \begin{cases} x_1 & \text{if } p_1 > p_2, \\ x_2 & \text{if } p_2 > p_1, \\ x_1 x_2 & \text{if } p_1 = p_2. \end{cases}$$

Section 4

Exercise 4.1

[Continuity:] As $\succsim$ is represented by the continuous utility function $u$, it is continuous. Formally, for each $y \in X$,

$$\{x \in X : x \succsim y\} = \{x \in X : u(x) \geq u(y)\} = u^{-1}([u(y), \infty))$$

is the preimage of a closed set under the continuous function $u$ and therefore closed. Similarly, the set $\{x \in X : x \precsim y\}$ is closed.

[Monotonicity, but not strong:] Take $x, y \in \mathbb{R}^L_+$ with $x \geq y$. There is an $i \in \{1, \ldots, L\}$ such that $u(x) = \min\{x_1/a_1, \ldots, x_L/a_L\} = x_i/a_i$. As $x \geq y$, it follows that

$$u(x) = x_i/a_i \geq y_i/a_i \geq \min\{y_1/a_1, \ldots, y_L/a_L\} = u(y), \text{ so } x \succsim y.$$

Similarly, if $x > y$, then $x \succ y$. For a violation of strong monotonicity, notice that

$$u(0, \ldots, 0) = u(1, 0, \ldots, 0) = 0,$$

i.e., if you start with nothing, but get one unit of the first ingredient, you still cannot bake a cake due to lack of all the other ingredients!

[Convexity, but not strict:] Let $y \in \mathbb{R}^L_+$ and let $u(y) = \alpha$. Then

$$\{x \in \mathbb{R}^L_+ : x \succsim y\} = \{x \in \mathbb{R}^L_+ : \min\{x_1/a_1, \ldots, x_L/a_L\} \geq \alpha\} = \bigcap_{\ell=1}^{L} \{x \in \mathbb{R}^L_+ : x_\ell/a_\ell \geq \alpha\}$$

is the intersection of convex halfspaces and therefore convex. For a violation of strict convexity, take $x = (a_1 + 1, a_2, \ldots, a_L)$, $y = (a_1, \ldots, a_L)$. Both vectors (and any convex combination) suffice to make one cake: for each $\alpha \in (0, 1)$:

$$x \sim y \sim \alpha x + (1 - \alpha) y,$$

in contradiction with strict convexity.

[Homotheticity:] $u$ is homogeneous of degree one.

Exercise 4.2

With the additional restrictions, the budget sets become:

Indivisibilities: $B(p, w) \cap \mathbb{Z}^2_+$.

Rationing: $B(p, w) \cap \{x \in \mathbb{R}^2_+ : x_1 \leq 3\}$.

Rebates 1: $\{x \in \mathbb{R}^2_+ : p_1 x_1 + 4 \min\{x_2, 5\} + 2 \max\{x_2 - 5, 0\} \leq w\}$, as the first five units of commodity two cost $p_2 = 4$ and any additional ones only 2.

Rebates 2: $\{x \in \mathbb{R}^2_+ : x_2 \leq 5, p \cdot x \leq w\} \cup \{x \in \mathbb{R}^2_+ : x_2 > 5, 8x_1 + 2x_2 \leq 40\}$.

Initial endowment: $\{x \in \mathbb{R}^2_+ : p \cdot x \leq p \cdot \omega\}$.

Package deal: $B(p, w) \cap \{x \in \mathbb{R}^2_+ : x_1 = x_2\}$.

Gift certificate: $B(p, w) \cup \{x \in \mathbb{R}^2_+ : x_1 \geq 1/p_1, p_1(x_1 - 1/p_1) + p_2 x_2 \leq w\}$, the first set being the budget set if he does not use the gift certificate, the second one if he does and therefore acquires $1/p_1$ units of the first commodity without needing to address his budget.

Except for Rebates 2, the budget sets are nonempty and compact, so a most preferred bundle exists. Under Rebates 2, the budget set is not closed (it doesn't contain the boundary point $(30/8, 5)$) and a most preferred bundle need not exist. For instance, if the utility function of the consumer is $u(x) = \min\{4x_1, 3x_2\}$, there is no optimal bundle in the budget set. Drawing the budget set and some indifference curves will help you to verify this.
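The failure of existence under Rebates 2 can also be seen numerically. The sketch below assumes $p = (8, 4)$ and $w = 40$; these values are inferred from the displayed inequality $8x_1 + 2x_2 \leq 40$ and from $p_2 = 4$, and are not quoted from the exercise statement. The function names are hypothetical. Along the sequence $x_2 = 5 + d$, $x_1 = 30/8 - d/4$ with $d \to 0$, utility approaches 15, the value at the excluded boundary point $(30/8, 5)$, without ever reaching it.

```python
def in_rebates2_budget(x1, x2, p1=8.0, p2=4.0, w=40.0):
    """Budget set under Rebates 2: the full price p2 applies to all units if
    x2 <= 5, and half that price applies to all units if x2 > 5."""
    if x1 < 0 or x2 < 0:
        return False
    if x2 <= 5:
        return p1 * x1 + p2 * x2 <= w
    return p1 * x1 + (p2 / 2) * x2 <= w


def u(x1, x2):
    """Utility function used in the counterexample."""
    return min(4 * x1, 3 * x2)


# The boundary point (30/8, 5) would give utility 15 but is not affordable.
print(in_rebates2_budget(30 / 8, 5), u(30 / 8, 5))  # False 15.0

# Affordable bundles just above x2 = 5 get arbitrarily close to utility 15.
# Powers of two keep the floating-point arithmetic exact.
for k in (1, 5, 10, 20):
    d = 2.0 ** -k
    x1, x2 = 30 / 8 - d / 4, 5 + d
    print(in_rebates2_budget(x1, x2), u(x1, x2) == 15 - d)  # True True
```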

Exercise 4.3

Walrasian demand is homogeneous of degree one in wealth: for all $(p, w) \in \mathbb{R}^{L+1}_{++}$ and all $\alpha > 0$, if $x \in x(p, w)$, then $\alpha x \in x(p, \alpha w)$.

Proof. Suppose not: there is a $z \in B(p, \alpha w)$ with $z \succ \alpha x$. Then $y := (1/\alpha) z \in B(p, w)$. As $x \in x(p, w)$, $x \succsim y$. As $\succsim$ is homothetic, also $\alpha x \succsim \alpha y = z$, contradicting that $z \succ \alpha x$. $\square$

Exercise 4.4

(a): Consider a consumer with utility function $u(x) = x_1 + x_2$. Local nonsatiation is obvious. If $p_1 > p_2$, the consumer spends the entire income on the second commodity, so $v(p, w) = w/p_2$ if $p_1 > p_2$. Increasing $p_1$ even further does not affect indirect utility, i.e., indirect utility is not strictly decreasing in the price of commodity 1.

(b): To show: for each sequence $(p^n, w^n)_{n \in \mathbb{N}}$ in $\mathbb{R}^{L+1}_{++}$ with limit $(p, w) \in \mathbb{R}^{L+1}_{++}$, $v(p^n, w^n) \to v(p, w)$.

Proof. For each $n \in \mathbb{N}$, let $x^n \in x(p^n, w^n)$, which is possible by the assumptions in Proposition 4.4. As $x^n \in B(p^n, w^n)$ for all $n$ and $(p^n, w^n) \to (p, w)$, the sequence $(x^n)_{n \in \mathbb{N}}$ eventually lies in the slightly enhanced budget set $B(p, w + 1)$, which is compact: taking a subsequence if necessary, we may assume w.l.o.g. that the sequence $(x^n)_{n \in \mathbb{N}}$ is convergent, with limit $x \in X$. The sequence $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ satisfies the properties of Proposition 4.3(b). In particular, $x \in x(p, w)$, so $\lim_{n \to \infty} v(p^n, w^n) = \lim_{n \to \infty} u(x^n) = u(x) = v(p, w)$, where the second equality uses continuity of $u$. $\square$

(c): Roughly speaking, because continuous preferences may be represented by discontinuous utility functions, which may cause jumps in the indirect utility function as well.

For instance, suppose a consumer has continuous utility function $U : \mathbb{R}_+ \to \mathbb{R}$ with $U(x) = \min\{x, 1\}$ and hence continuous preferences. These preferences can also be represented by the discontinuous utility function $u : \mathbb{R}_+ \to \mathbb{R}$ with

$$u(x) = \begin{cases} x & \text{if } x < 1, \\ 2 & \text{if } x \geq 1. \end{cases}$$

(The jump has to occur at $x = 1$ itself: under $U$, the bundle $x = 1$ is indifferent to every $x > 1$, so $u$ must assign all of them the same value.) Notice that

$$x(p, w) = \begin{cases} \{w/p\} & \text{if } w \leq p, \\ [1, w/p] & \text{if } w > p. \end{cases}$$

The indirect utility function given $u$ is

$$v(p, w) = \begin{cases} w/p & \text{if } w < p, \\ 2 & \text{if } w \geq p, \end{cases}$$

with discontinuities at all points where $p = w$.

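Exercise 4.5

(a): Follows since

$$e(\alpha p, u) = \min\{(\alpha p) \cdot x : x \in \mathbb{R}^L_+, u(x) \geq u\} = \alpha \min\{p \cdot x : x \in \mathbb{R}^L_+, u(x) \geq u\} = \alpha e(p, u).$$

(b): Let $p \in \mathbb{R}^L_{++}$. Suppose there are $u', u'' \in U$ with $u(0, \ldots, 0) \leq u' < u''$ and $e(p, u') \geq e(p, u'')$. Let $x' \in h(p, u')$ and $x'' \in h(p, u'')$. Then $x'' \neq (0, \ldots, 0)$ and $p \cdot x' \geq p \cdot x''$. By continuity, $\lim_{\alpha \to 1} u(\alpha x'') = u(x'') \geq u'' > u'$, so $u(\alpha x'') > u'$ for $\alpha \in (0, 1)$ close to one. But then $p \cdot (\alpha x'') = \alpha (p \cdot x'') \leq \alpha (p \cdot x') < p \cdot x'$, contradicting that $x' \in h(p, u')$.

(c): Let $(p, u) \in \mathbb{R}^L_{++} \times U$, $i \in \{1, \ldots, L\}$, and $\varepsilon > 0$. For each $x \in \mathbb{R}^L_+$ with $u(x) \geq u$, $(p + \varepsilon e_i) \cdot x \geq p \cdot x$, so $e(p + \varepsilon e_i, u) \geq e(p, u)$.

(d): Let $u \in U$. To show: the set $\{(p, r) \in \mathbb{R}^L_{++} \times \mathbb{R} : r \leq e(p, u)\}$ is convex.

Let $(p^1, r_1), (p^2, r_2)$ lie in this set, let $\alpha \in [0, 1]$, and define $(p, r) = \alpha (p^1, r_1) + (1 - \alpha)(p^2, r_2)$. Let $x \in h(p, u)$. Then $x$ is feasible in the EMP at $(p^1, u)$ and at $(p^2, u)$, so

$$\begin{aligned} e(p, u) &= p \cdot x \\ &= \alpha (p^1 \cdot x) + (1 - \alpha)(p^2 \cdot x) \\ &\geq \alpha e(p^1, u) + (1 - \alpha) e(p^2, u) \\ &\geq \alpha r_1 + (1 - \alpha) r_2 \\ &= r. \end{aligned}$$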

Exercise 4.6

Let $(p, w) \in \mathbb{R}^{L+1}_{++}$ and $x = x(p, w)$. By Walras' Law, $p \cdot x = w$. For each $p' \in \mathbb{R}^L_{++}$, $x$ is feasible in the UMP at prices $p'$ and wealth $p' \cdot x$:

$$v(p', p' \cdot x) \geq u(x) = v(p, w) = v(p, p \cdot x).$$

So the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = v(p', p' \cdot x)$ achieves its minimum at $p' = p$. By the first order conditions, its partial derivatives must be zero at $p$:

$$\forall \ell = 1, \ldots, L : \quad \frac{\partial f(p)}{\partial p_\ell} = \frac{\partial v(p, p \cdot x)}{\partial p_\ell} + \frac{\partial v(p, p \cdot x)}{\partial w} x_\ell = 0.$$

As $x = x(p, w)$ and $p \cdot x = w$, the result now follows.

Exercise 4.7

By (15), indirect utility solves $w = e(p, v(p, w)) = v(p, w) \sum_{i=1}^{L} a_i p_i$, so $v(p, w) = w / \sum_{i=1}^{L} a_i p_i$. By (17),

$$x(p, w) = h(p, v(p, w)) = (a_1 v(p, w), \ldots, a_L v(p, w)) = \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right).$$
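As a sanity check on the formulas in Exercise 4.7, the sketch below evaluates them for one made-up example and confirms that the demanded bundle exhausts the budget and yields exactly the indirect utility level. It assumes the utility function $u(x) = \min\{x_1/a_1, \ldots, x_L/a_L\}$ from Exercise 4.1; the function names and the numbers for `a`, `p`, `w` are illustrative only.

```python
def leontief_utility(x, a):
    """u(x) = min over i of x_i / a_i."""
    return min(xi / ai for xi, ai in zip(x, a))


def leontief_indirect_utility(p, w, a):
    """v(p, w) = w / sum_i a_i p_i."""
    return w / sum(ai * pi for ai, pi in zip(a, p))


def leontief_demand(p, w, a):
    """x(p, w) = (a_1 v(p, w), ..., a_L v(p, w))."""
    v = leontief_indirect_utility(p, w, a)
    return [ai * v for ai in a]


a, p, w = (1, 2), (3, 1), 10   # hypothetical recipe, prices and wealth
x = leontief_demand(p, w, a)

print(x)                                        # [2.0, 4.0]
print(sum(pi * xi for pi, xi in zip(p, x)))     # 10.0, the budget is exhausted
print(leontief_utility(x, a) == leontief_indirect_utility(p, w, a))  # True
```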


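Exercise 4.8

We know from (18) that $h_\ell(p, u) = x_\ell(p, e(p, u))$. Differentiating this equation w.r.t. $p_k$ and using the chain rule gives

$$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, e(p, u))}{\partial p_k} + \frac{\partial x_\ell(p, e(p, u))}{\partial w} \frac{\partial e(p, u)}{\partial p_k}.$$

Recall from (14) that $\frac{\partial e(p, u)}{\partial p_k} = h_k(p, u)$:

$$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, e(p, u))}{\partial p_k} + \frac{\partial x_\ell(p, e(p, u))}{\partial w} h_k(p, u).$$

It follows from (15) and $u = v(p, w)$ that $e(p, u) = e(p, v(p, w)) = w$ and it follows from (18) and $u = v(p, w)$ that $h(p, u) = h(p, v(p, w)) = x(p, w)$, so:

$$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w} x_k(p, w).$$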

Exercise 4.9

Indivisibilities, rationing, package deals, as well as the specific initial endowment $\omega = (1, 1)$ imply smaller budget sets and therefore a (weakly) lower welfare. Rebates 1 and 2 and the gift certificate imply a larger budget set and therefore a (weakly) higher welfare.

Exercise 4.10

As $p^1 \cdot x^0 < w^1$, $x^0 \in B(p^1, w^1)$. As $x^0$ does not exhaust the budget, $p^1 \cdot y \leq w^1$ for all $y$ with $\|x^0 - y\|$ sufficiently close to zero. By local nonsatiation, this neighborhood contains a $y$ strictly preferred to $x^0$.

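Exercise 4.11

Write $A = \sum_{i=1}^{L} a_i$. Standard calculations give:

$$x(p, w) = \left( \frac{a_1}{A} \frac{w}{p_1}, \ldots, \frac{a_L}{A} \frac{w}{p_L} \right),$$

$$v(p, w) = \left( \frac{w}{A} \right)^{A} \prod_{i=1}^{L} \left( \frac{a_i}{p_i} \right)^{a_i},$$

$$h(p, u) = u^{1/A} \prod_{i=1}^{L} \left( \frac{p_i}{a_i} \right)^{a_i/A} \left( \frac{a_1}{p_1}, \ldots, \frac{a_L}{p_L} \right),$$

$$e(p, u) = A u^{1/A} \prod_{i=1}^{L} \left( \frac{p_i}{a_i} \right)^{a_i/A},$$

$$EV((p^0, w^0), (p^1, w^1)) = A (u^1)^{1/A} \prod_{i=1}^{L} \left( \frac{p^0_i}{a_i} \right)^{a_i/A} - w^0,$$

$$CV((p^0, w^0), (p^1, w^1)) = w^1 - A (u^0)^{1/A} \prod_{i=1}^{L} \left( \frac{p^1_i}{a_i} \right)^{a_i/A}.$$

It is commonly assumed (w.l.o.g., as this is just a monotonic transformation of the utility) that $A = 1$, which yields slightly more sympathetic expressions.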
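The closed forms in Exercise 4.11 can be cross-checked against each other numerically. The sketch below assumes the Cobb-Douglas utility $u(x) = \prod_i x_i^{a_i}$ and tests the duality identities $v(p, w) = u(x(p, w))$, $e(p, v(p, w)) = w$ and $h(p, v(p, w)) = x(p, w)$ for one arbitrary choice of parameters. The function names and the numerical values are illustrative, not taken from the notes.

```python
import math


def utility(x, a):
    """Cobb-Douglas utility: product of x_i ** a_i."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))


def demand(p, w, a):
    """Walrasian demand x(p, w)."""
    A = sum(a)
    return [ai / A * w / pi for ai, pi in zip(a, p)]


def indirect_utility(p, w, a):
    """Indirect utility v(p, w)."""
    A = sum(a)
    return (w / A) ** A * math.prod((ai / pi) ** ai for ai, pi in zip(a, p))


def expenditure(p, u, a):
    """Expenditure function e(p, u)."""
    A = sum(a)
    return A * u ** (1 / A) * math.prod(
        (pi / ai) ** (ai / A) for ai, pi in zip(a, p))


def hicksian(p, u, a):
    """Hicksian demand h(p, u)."""
    A = sum(a)
    scale = u ** (1 / A) * math.prod(
        (pi / ai) ** (ai / A) for ai, pi in zip(a, p))
    return [scale * ai / pi for ai, pi in zip(a, p)]


a = [0.5, 1.0, 1.5]   # hypothetical exponents, so A = 3
p = [2.0, 1.0, 4.0]   # hypothetical prices
w = 12.0              # hypothetical wealth

x = demand(p, w, a)
v = indirect_utility(p, w, a)

print(math.isclose(utility(x, a), v))           # True
print(math.isclose(expenditure(p, v, a), w))    # True
print(all(math.isclose(hi, xi)
          for hi, xi in zip(hicksian(p, v, a), x)))  # True
```

The same functions give the welfare measures directly, e.g. $EV = e(p^0, u^1) - w^0$ and $CV = w^1 - e(p^1, u^0)$.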

Section 5

Exercise 5.1

(a): $Y \cap \{y \in \mathbb{R}^L : y \geq -\omega\}$ is the intersection of closed sets, hence closed. It contains the zero vector $0$.

(b): As the length of the vectors $(y^n)_{n \in \mathbb{N}}$ diverges to infinity, $\|y^n\| \geq 1$ for $n$ sufficiently large. By convexity and possibility of inaction,

$$z^n = \frac{1}{\|y^n\|} y^n + \left( 1 - \frac{1}{\|y^n\|} \right) 0 \in Y.$$

By assumption, $y^n + \omega \geq 0$, so dividing by $\|y^n\|$ gives $z^n + \omega / \|y^n\| \geq 0$.

(c): All vectors $z^n$ have length one. A bounded sequence contains a convergent subsequence. Let $z$ be its limit. Firstly, $z \neq 0$, as it is the limit of a sequence of vectors of length one. Secondly, as $z^n$ lies in $Y$ for $n$ large, and $Y$ is closed, also the limit $z$ lies in $Y$.

(d): Letting $n \to \infty$, and realizing that $\omega / \|y^n\| \to 0$, (b) implies that $z \geq 0$. As $z \neq 0$, this contradicts no free lunch.

Exercise 5.2

Reasoning as in the EMP, the assumptions on $f$ guarantee that the CMP is solvable.

(a): Define $q_z = f(z) \geq 0$. The CMP at $(w, q_z)$ has a solution and $z$ is feasible in this CMP, so $c(w, q_z) \leq w \cdot z$. Conclude that $p f(z) - w \cdot z \leq p q_z - c(w, q_z)$.

(b): Let $z_q$ solve the CMP at $(w, q)$, i.e., $z_q \in \mathbb{R}^{L-1}_+$, $f(z_q) \geq q$, and $c(w, q) = w \cdot z_q$. Conclude that $p f(z_q) - w \cdot z_q \geq p q - c(w, q)$.

(c): Assume (P1) has a solution $z$ (the case where (P2) has a solution is similar). By (a), there is a feasible $q_z$ in (P2) with equal or higher profit. It cannot be higher. Otherwise, by (b), there is a feasible $z_{q_z}$ in (P1) yielding a higher profit than $q_z$ and therefore higher than the profit maximizing $z$, a contradiction. Conclude that $q_z$ solves (P2) and yields the same profit as $z$ in (P1).

Exercise 5.3
(a), (b): Consider the convex production set $Y = \{y \in \mathbb{R}^2 : y_1 \leq 0,\ y_2 \leq \sqrt{-y_1}\}$. The point $(0, -1) \in Y$ maximizes profit at price vector $p = (1, 0) \in \mathbb{R}^2_+$, but is not efficient, as also $(0, 0) \in Y$. The point $(0, 0) \in Y$ is efficient, but does not maximize profit at strictly positive prices.
(c): Consider the production set $Y = \{y \in \mathbb{R}^2 : y \leq (1, 1),\ (y_1 - 1)^2 + (y_2 - 1)^2 \geq 2\}$. The point $(0, 0) \in Y$ is efficient, but not profit maximizing for any nonzero vector $p \in \mathbb{R}^2_+$: if $p_1 \geq p_2$, then $(1, 1 - \sqrt{2}) \in Y$ yields a positive profit, and if $p_1 \leq p_2$, then $(1 - \sqrt{2}, 1) \in Y$ yields a positive profit, whereas $(0, 0) \in Y$ yields only zero profit.
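The profit comparison in part (c) can be spot-checked numerically. The sketch below tests membership in $Y$ with a small floating-point tolerance and evaluates profits at a handful of nonzero price vectors; the helper names and the tolerance are illustrative assumptions. It does not verify efficiency of $(0, 0)$, only the profit claim.

```python
import math

def in_Y(y, tol=1e-12):
    """Membership in Y = {y <= (1, 1), (y1 - 1)^2 + (y2 - 1)^2 >= 2}, up to tol."""
    y1, y2 = y
    return y1 <= 1 + tol and y2 <= 1 + tol and (y1 - 1) ** 2 + (y2 - 1) ** 2 >= 2 - tol

def profit(p, y):
    """Profit p . y of production plan y at prices p."""
    return p[0] * y[0] + p[1] * y[1]

def better_plan(p):
    """The plan named in the solution: (1, 1 - sqrt 2) if p1 >= p2, else (1 - sqrt 2, 1)."""
    s = 1 - math.sqrt(2)
    return (1.0, s) if p[0] >= p[1] else (s, 1.0)

if __name__ == "__main__":
    for p in [(1.0, 0.0), (1.0, 1.0), (0.5, 4.0)]:
        y = better_plan(p)
        print(p, y, in_Y(y), profit(p, y))
```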

Section 6

Exercise 6.1
(a): Look at the definitions of improvements and Pareto optimality: the fact that the coalition $S = H$ of all consumers cannot improve upon $x$ means that there is nothing feasible that makes everybody better off. But there may still be room for improvement for some if not all consumers: it may still be Pareto dominated.
(b): Consider a pure exchange economy with two consumers and two commodities. The first consumer's preferences are represented by the utility function $u_1(x) = x_1 x_2$, the second consumer's preferences by a constant utility function: he is indifferent between all commodity bundles. If $\omega^1 = \omega^2 = (1, 1)$, then $(p, x) = ((1, 1), (1, 1), (1, 1))$ (i.e., prices are equal and each consumer sticks to the initial endowment) is a Walrasian equilibrium. By Proposition 6.2, the allocation lies in the core. But the allocation is not Pareto optimal: giving the total endowment to the first consumer makes him better off, while not affecting the happiness of the second consumer.

Exercise 6.2
(a): Let $p \in \mathbb{R}^L_+$, $z \in z(p) \cap \mathbb{R}^L_-$. By Walras' Law, $p \cdot z = \sum_{k : p_k > 0} p_k z_k = 0$. As the sum of nonpositive terms, it can be zero only if $z_\ell = 0$ whenever $p_\ell > 0$.
(b): Let $p$, $z$, $\ell$ be as in the statement of the exercise. As $z_k = 0$ for $k \neq \ell$, Walras' Law implies $p \cdot z = p_\ell z_\ell = 0$. As $p_\ell > 0$, this implies $z_\ell = 0$.
(c): If in equilibrium the market for good $\ell \in \{1, \ldots, L\}$ does not clear, its price is zero by (a). So consumer $h$ is not constrained in his consumption of $\ell$. In equilibrium, $h$ must choose a most preferred bundle from the budget set, but there is none: under (c1), each bundle can be improved upon by adding more of good $\ell$; under (c2), a most preferred bundle can't lie on the axes, as $h$ can afford a better alternative in $\mathbb{R}^L_{++}$; the latter can be improved upon by adding more of good $\ell$.

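Exercise 6.3
(a): Pareto dominance tries to compare allocations regardless of prices. Preferences of firms (profit) are functions of prices.
(b): Let $(p, x, y)$ be a Walrasian equilibrium of $\mathcal{E}$. Suppose there is a feasible allocation $(\hat{x}, \hat{y})$ Pareto dominating $(x, y)$. Local nonsatiation implies, for all $h \in H$:
$$\hat{x}^h \succsim_h x^h \Rightarrow p \cdot \hat{x}^h \geq p \cdot x^h = p \cdot \Big(\omega^h + \sum_{f \in F} \theta^{hf} y^f\Big),$$
$$\hat{x}^h \succ_h x^h \Rightarrow p \cdot \hat{x}^h > p \cdot x^h = p \cdot \Big(\omega^h + \sum_{f \in F} \theta^{hf} y^f\Big).$$
By Pareto dominance, such a weak preference holds for all, and strict preference for some $h \in H$. Summing over $h \in H$ and using that equilibrium production plans $(y^f)_{f \in F}$ are profit maximizing at prices $p$ gives
$$p \cdot \sum_{h \in H} \hat{x}^h > p \cdot \sum_{h \in H} x^h = \sum_{h \in H} p \cdot \Big(\omega^h + \sum_{f \in F} \theta^{hf} y^f\Big) = p \cdot \omega + p \cdot \sum_{f \in F} y^f \geq p \cdot \omega + p \cdot \sum_{f \in F} \hat{y}^f.$$
But $p \cdot \sum_{h \in H} \hat{x}^h > p \cdot \omega + p \cdot \sum_{f \in F} \hat{y}^f$ contradicts feasibility of $(\hat{x}, \hat{y})$.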

Exercise 6.4
Pure exchange economies: You may verify that the following pure exchange economies $\mathcal{E} = (\succsim_1, \succsim_2, \omega^1, \omega^2)$ have the desired property:
(a): Let $\succsim_1$ and $\succsim_2$ be lexicographic preferences over $\mathbb{R}^2_+$, $\omega^1 = (1, 0)$, and $\omega^2 = (0, 1)$. There is no Walrasian equilibrium:

• if $p \in \Delta$ has both prices positive, then consumer 1 demands $\omega^1$ and consumer 2 demands $(p_2/p_1, 0)$, so there is excess demand for the first commodity;

• if one of the commodities has price zero, demand for this commodity is unbounded.

(b): The standard Cobb-Douglas case.
(c): Let $\succsim_1$ be represented by the utility function $u_1(x) = \max\{\min\{2x_1, x_2\}, \min\{x_1, 2x_2\}\}$ and $\succsim_2$ by the utility function $u_2(x) = x_1 + x_2$. Let $\omega^1 = \omega^2 = (1, 1)$.

• If one of the commodities has price zero, demand for this commodity is unbounded: there are no Walrasian equilibria at such prices.

• If both prices are positive and $p_1 > p_2$, the first consumer demands a bundle with $2x_1 = x_2$, i.e., the bundle $\big(p \cdot \omega^1/(p_1 + 2p_2),\ 2p \cdot \omega^1/(p_1 + 2p_2)\big)$, and the second consumer spends the entire income on the second commodity, i.e., demands the bundle $(0, p \cdot \omega^2/p_2)$. In particular, demand for the second commodity is at least twice the demand for the first commodity. As the total endowment of both commodities is equal, not both markets can clear at the same time, contradicting the fact that (given local nonsatiation) markets with a positive price must clear. There are no Walrasian equilibria at such prices.

• Similarly, Walrasian equilibria with positive prices and $p_2 > p_1$ are ruled out.

• If both prices are positive and equal, i.e., $p = (1/2, 1/2)$, the first consumer's demand is $\{(2/3, 4/3), (4/3, 2/3)\}$ and the second consumer's demand is $\{x \in \mathbb{R}^2_+ : x_1 + x_2 = 2\}$. There are two (equilibrium/market clearing) allocations: $((2/3, 4/3), (4/3, 2/3))$ and $((4/3, 2/3), (2/3, 4/3))$.
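The demand computations in (c) can be reproduced with exact rational arithmetic. The sketch below relies on an observation not spelled out in the notes: at positive prices each of the two Leontief pieces of $u_1$ is maximized on the budget line at its kink, so the maximum of $u_1$ is attained at the better of the two kinks. The function names are illustrative.

```python
from fractions import Fraction

def u1(x):
    """Consumer 1's utility: max{min{2*x1, x2}, min{x1, 2*x2}}."""
    return max(min(2 * x[0], x[1]), min(x[0], 2 * x[1]))

def demand_1(p, w):
    """Consumer 1's demand at positive prices p and income w.

    Each Leontief piece is maximized at its kink on the budget line,
    so only the two kink bundles need to be compared.
    """
    kink_a = (w / (p[0] + 2 * p[1]), 2 * w / (p[0] + 2 * p[1]))  # x2 = 2 * x1
    kink_b = (2 * w / (2 * p[0] + p[1]), w / (2 * p[0] + p[1]))  # x1 = 2 * x2
    best = max(u1(kink_a), u1(kink_b))
    return [k for k in (kink_a, kink_b) if u1(k) == best]

if __name__ == "__main__":
    half = Fraction(1, 2)
    income = half * 1 + half * 1          # p . omega^1 with omega^1 = (1, 1)
    for x1 in demand_1((half, half), income):
        x2 = (2 - x1[0], 2 - x1[1])       # what is left of the total endowment (2, 2)
        print(x1, x2, half * x2[0] + half * x2[1] == income)
```

At equal prices every bundle on consumer 2's budget line is optimal for $u_2(x) = x_1 + x_2$, so it suffices to check that the remainder of the total endowment costs exactly his income.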

(d): Preferences $\succsim_1, \succsim_2$ are such that the consumers are indifferent between all commodity bundles; $\omega^1 = \omega^2 = (1, 1)$. Every $(p, x)$ with $p \in \Delta$ and $x = (x^1, x^2) \in \mathbb{R}^2_+ \times \mathbb{R}^2_+$ with $x^h \in B^h(p, p \cdot \omega^h)$ for both $h = 1, 2$ is a Walrasian equilibrium.
Private ownership economies: Take the examples above and give the producers the trivial production set $\{0\}$ consisting of the remarkable feat of producing absolutely nothing using absolutely nothing. If you prefer slightly larger production sets, you may want to choose them equal to $\mathbb{R}^2_-$, containing all production plans producing absolutely nothing, possibly using something.

Exercise 6.5
Feasible allocations: $\{(x_T, x_L) \in \mathbb{R}^2_+ : x_T + x_L \leq 1\}$.
Pareto optimal allocations: Must be nonwasteful, otherwise the remainder can be given to the liar, who becomes happier, while the true mother is not harmed. Moreover, $x_T \notin (0, 1)$: otherwise the true mother can be made happier by giving her 0, while not harming the liar. Only allocations $(0, 1)$ and $(1, 0)$ are Pareto optimal.
Core: The core depends on the initial allocation $(\omega_T, \omega_L)$. Denote an allocation by a vector $x = (x_T, x_L)$.

• The liar can improve upon any allocation with $x_L < \omega_L$, so $x_L \geq \omega_L$ in the core.

• For the true mother:

  ◦ if $\omega_T = 0$, individual rationality and feasibility require that $x_T \in \{0, 1\}$,
  ◦ if $\omega_T \in (0, 1)$, individual rationality has no bite: everything is at least as good as her initial allocation,
  ◦ if $\omega_T = 1$, individual rationality and feasibility require that $x_T = 1$.

• The coalition of both women can improve upon any feasible allocation with $x_T \in (0, 1)$ by giving the liar the entire baby, so $x_T \in \{0, 1\}$ in the core.

• Combining the above gives that the core is
$$\begin{cases} \{(1, 0)\} & \text{if the initial endowment is } (\omega_T, \omega_L) = (1, 0), \\ \{(0, x) : \omega_L \leq x \leq 1\} & \text{if the initial endowment has } \omega_T \in (0, 1), \\ \{(0, 1)\} & \text{if the initial endowment is } (\omega_T, \omega_L) = (0, 1). \end{cases}$$

• Notice that if $\omega_T \in (0, 1)$, there are wasteful core allocations.

Walrasian equilibria: The Walrasian equilibria depend on the initial allocation $(\omega_T, \omega_L)$. As equilibrium involves a nonzero price vector, we may assume w.l.o.g. that the equilibrium price is $p > 0$.

• The true mother demands 0 if $\omega_T \in [0, 1)$ and 1 if $\omega_T = 1$.

• The liar demands $\omega_L$.

• Therefore, the set of Walrasian equilibria is
$$\begin{cases} \{(p, x_T, x_L) \in \mathbb{R}^3 : p > 0,\ x_T = 0,\ x_L = \omega_L\} & \text{if the initial endowment has } \omega_T \in [0, 1), \\ \{(p, x_T, x_L) \in \mathbb{R}^3 : p > 0,\ x_T = 1,\ x_L = 0\} & \text{if } \omega_T = 1. \end{cases}$$

Section 7

Exercise 7.1
(a):

• Best elements of $G$: those whose reduced simple gambles put largest probability on $\max\{a_1, \ldots, a_k\}$. Worst elements of $G$: those whose reduced simple gambles put largest probability on $\min\{a_1, \ldots, a_k\}$.

• (G1) satisfied: preferences represented by utility function $u(g) = \frac{1}{|L(g)|} \sum_{a_i \in L(g)} a_i$.

(G2) violated: assume w.l.o.g. that $a_1 > a_2$ and consider the gambles $a_1$ and $(p \circ a_1, (1 - p) \circ a_2)$. If $p > 1/2$, $a_1$ is the most likely outcome in both gambles, so the DM is indifferent between them. Continuity would require $a_1 \sim (\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_2)$. However, at $p = 1/2$, the DM assigns value $\frac{a_1 + a_2}{2} < a_1$ to the second gamble, so he strictly prefers the gamble giving $a_1$ for sure.

(G3) satisfied: preferences are defined in terms of reduced simple gambles: $u(g) = u(g_s)$.

(G4) violated: assume w.l.o.g. that $a_1 > a_2$. Then the DM strictly prefers $g = a_1$ to $g' = a_2$. Independence requires that also
$$(\alpha \circ g, (1 - \alpha) \circ a_1) \succ (\alpha \circ g', (1 - \alpha) \circ a_1)$$
for all $\alpha \in (0, 1)$. However, for $\alpha$ close to zero, $a_1$ is the most likely outcome in both gambles, so the DM is indifferent between them.

• As (G2) and (G4) are violated, Remark 7.3 implies that $\succsim$ cannot be represented by a vNM utility function.

(b):


• Best element of $G$: deterministic outcome $\max\{a_1, \ldots, a_k\}$. Worst elements of $G$ do not exist: for each $g \in G$, the gamble $(\tfrac{1}{2} \circ g, \tfrac{1}{2} \circ g)$ has higher complexity and is therefore worse than $g$.

• (G1) satisfied: preferences represented by a utility function.

(G2) satisfied: on $G_1$, the DM's utility function $u(g) = \sum_{m=1}^{k} p_m a_m - 1$ is continuous.

(G3) violated: the gambles $a_1 \in G_0$ and $(\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_1) \in G_1$ both have reduced simple gamble $(1 \circ a_1)$, yet the former lies in $G_0$ and is therefore strictly preferred to the latter in $G_1$.

(G4) violated: Let
$$g = a_1 \in G_0, \qquad g' = (\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_1) \in G_1, \qquad g'' = (\tfrac{1}{2} \circ g', \tfrac{1}{2} \circ g') \in G_2.$$
Let $\alpha \in (0, 1)$. By construction,
$$(\alpha \circ g, (1 - \alpha) \circ g''),\ (\alpha \circ g', (1 - \alpha) \circ g'') \in G_3.$$
Hence
$$u(g) = a_1 - 0,$$
$$u(g') = a_1 - 1,$$
$$u(\alpha \circ g, (1 - \alpha) \circ g'') = a_1 - 3,$$
$$u(\alpha \circ g', (1 - \alpha) \circ g'') = a_1 - 3,$$
in violation of (G4).

• As (G3) and (G4) are violated, Remark 7.3 implies that $\succsim$ cannot be represented by a vNM utility function.

(c):

• To characterize the best and worst elements of $G$, distinguish two cases:

1. $\min\{a_1, \ldots, a_k\} \leq 5 < \max\{a_1, \ldots, a_k\}$.
Best elements of $G$: those putting probability one on outcomes $a_m > 5$ (utility equal to its maximum, one).
Worst elements of $G$: those putting probability one on outcomes $a_m \leq 5$ (utility equal to its minimum, zero).

2. Otherwise, if all $a_m$ exceed 5 or all $a_m$ are at most five, the utility function is constant (one in the former case, zero in the latter), so all gambles are equivalent (and hence both best and worst elements of $G$).

• Shortcut: for each $i = 1, \ldots, k$, define $u(a_i) = 0$ if $a_i \leq 5$ and $u(a_i) = 1$ otherwise. Then for every $g \in G$ with reduced simple gamble $(p_1 \circ a_1, \ldots, p_k \circ a_k)$, we have $u(g) = \sum_{i : a_i > 5} p_i = \sum_{i=1}^{k} p_i u(a_i)$, i.e., this defines a vNM utility function. By Remark 7.3, $\succsim$ must satisfy (G1) to (G4).
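The shortcut in (c) amounts to an expected-utility computation with a 0/1 utility over outcomes. A minimal sketch, assuming a reduced simple gamble is stored as a list of (probability, outcome) pairs; this representation and the function names are illustrative, not taken from the notes.

```python
from fractions import Fraction

def outcome_utility(a):
    """u(a) = 1 if the outcome exceeds 5, and 0 otherwise."""
    return 1 if a > 5 else 0

def expected_utility(gamble):
    """vNM utility of a reduced simple gamble given as (probability, outcome) pairs."""
    return sum(p * outcome_utility(a) for p, a in gamble)

if __name__ == "__main__":
    g = [(Fraction(1, 4), 3), (Fraction(1, 4), 7), (Fraction(1, 2), 10)]
    # The value equals the total probability of outcomes above 5.
    print(expected_utility(g))
```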


Section 10

Exercise 10.1
(a): If $u$ has no upper bound, construct a sequence of instantaneous utilities $(u(c_t))_{t=0}^{\infty}$ with $u(c_t) > 1/\delta(t)$ for each time $t$. Then $\delta(t) u(c_t) > 1$ at each time $t$ and $\sum_{t=0}^{\infty} \delta(t) u(c_t)$ diverges.
(b): Let $u$ be bounded by $B \in \mathbb{R}$ and let $c = (c_t)_{t=0}^{\infty}$ be an arbitrary stream of choices. For each $t$, $|\delta(t) u(c_t)| \leq B \delta(t)$ and $\sum_{t=0}^{\infty} B \delta(t) = B \sum_{t=0}^{\infty} \delta(t)$ converges. By the comparison test for summable sequences, also $\sum_{t=0}^{\infty} \delta(t) u(c_t)$ converges.

Exercise 10.2
(a1) $k$ gives instantaneous utility $u(h, k) = -1$, $\ell$ gives instantaneous utility $u(h, \ell) = 1$, so the optimal action is $\ell$ with instantaneous utility 1.
(a2) $k$ gives instantaneous utility $u(d, k) = 0$, $\ell$ gives instantaneous utility $u(d, \ell) = -\alpha$, so the optimal action is $k$ with instantaneous utility 0.
(b) $k$ gives expected discounted utility $u(d, k) + \delta \cdot 0 = 0$, $\ell$ gives expected discounted utility $u(d, \ell) + \tfrac{1}{2}\delta(u(h, \ell) + u(d, k)) = -\alpha + \tfrac{1}{2}\delta(1 + 0) = \tfrac{1}{2}\delta - \alpha$, so the optimal action is
$$\begin{cases} k & \text{if } \tfrac{1}{2}\delta - \alpha < 0, \\ k \text{ and } \ell & \text{if } \tfrac{1}{2}\delta - \alpha = 0, \\ \ell & \text{if } \tfrac{1}{2}\delta - \alpha > 0. \end{cases}$$
(c) If the severity of the depression is relatively small ($\tfrac{1}{2}\delta - \alpha > 0$), an initially depressed person may decide not to take his life in the hope of becoming happy later while still having the option of suicide in case of continued depression.

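Exercise 10.3
Preferring one apple today over two apples tomorrow means that
$$u(1) > (1 + \alpha)^{-\gamma/\alpha} u(2).$$
Preferring two apples one year and a day from now to one apple a year from now (and assuming we're not in a leap year) means that
$$(1 + 366\alpha)^{-\gamma/\alpha} u(2) > (1 + 365\alpha)^{-\gamma/\alpha} u(1).$$
These two inequalities hold simultaneously if
$$\left(\frac{1}{1 + \alpha}\right)^{\gamma/\alpha} < \frac{u(1)}{u(2)} < \left(\frac{1 + 365\alpha}{1 + 366\alpha}\right)^{\gamma/\alpha}.$$
Given $\alpha$, it remains possible to choose the exponent $\gamma/\alpha$ arbitrarily: having it equal to $\beta$ simply means choosing $\gamma = \alpha\beta$. So we can simplify the problem and show that there are $\alpha, \beta > 0$ solving
$$\left(\frac{1}{1 + \alpha}\right)^{\beta} < \frac{u(1)}{u(2)} < \left(\frac{1 + 365\alpha}{1 + 366\alpha}\right)^{\beta},$$
or similarly
$$\frac{1}{1 + \alpha} < \left(\frac{u(1)}{u(2)}\right)^{1/\beta} < \frac{1 + 365\alpha}{1 + 366\alpha}.$$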


Notice that $\alpha > 0$ implies that
$$0 < \frac{1}{1 + \alpha} < \frac{1 + 365\alpha}{1 + 366\alpha} < 1.$$
The expression $(u(1)/u(2))^{1/\beta}$ is a continuous function of $\beta > 0$. As $u(1)/u(2) \in (0, 1)$, it goes to zero as $\beta \to 0$ and to one as $\beta \to \infty$. By the Intermediate Value Theorem, there exists, for each $\alpha > 0$, a $\beta > 0$ such that $(u(1)/u(2))^{1/\beta}$ lies between the two desired bounds.
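The existence argument can be made constructive. The sketch below picks $\beta$ so that $(u(1)/u(2))^{1/\beta}$ equals the midpoint of the two bounds, then checks both preference statements directly. Time is measured in days and the discount function is taken to be $\delta(t) = (1 + \alpha t)^{-\gamma/\alpha}$, as in the inequalities above; the midpoint choice and the function names are illustrative assumptions.

```python
import math

def find_beta(alpha, ratio):
    """Return beta > 0 with 1/(1+alpha) < ratio**(1/beta) < (1+365*alpha)/(1+366*alpha).

    ratio stands for u(1)/u(2) and must lie strictly between 0 and 1.
    """
    if not 0 < ratio < 1 or alpha <= 0:
        raise ValueError("need alpha > 0 and 0 < ratio < 1")
    lower = 1 / (1 + alpha)
    upper = (1 + 365 * alpha) / (1 + 366 * alpha)
    target = (lower + upper) / 2           # any point strictly between the bounds works
    # Solve ratio**(1/beta) = target; both logarithms are negative, so beta > 0.
    return math.log(ratio) / math.log(target)

def discount(t, alpha, gamma):
    """Hyperbolic discount factor (1 + alpha*t)**(-gamma/alpha) for a delay of t days."""
    return (1 + alpha * t) ** (-gamma / alpha)

if __name__ == "__main__":
    alpha, u_one, u_two = 0.01, 1.0, 2.0
    beta = find_beta(alpha, u_one / u_two)
    gamma = alpha * beta
    print(u_one > discount(1, alpha, gamma) * u_two)                              # one apple today
    print(discount(366, alpha, gamma) * u_two > discount(365, alpha, gamma) * u_one)  # two apples later
```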

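Exercise 10.5
$\liminf_{t \to \infty} x_t = c$ implies [L1] and [L2]: Let $\varepsilon > 0$. As $\lim_{t \to \infty} \inf\{x_s : s \geq t\} = c$, there is a $T \in \mathbb{N}$ such that
$$c - \varepsilon < \inf\{x_s : s \geq t\} < c + \varepsilon$$
for all $t \geq T$. Apply the first inequality in the special case of $t = T$:
$$c - \varepsilon < \inf\{x_s : s \geq T\},$$
so $c - \varepsilon < x_t$ for all $t \geq T$, proving [L1].
Let $T' \in \mathbb{N}$ and apply the second inequality to the special case of $t = \max\{T, T'\}$:
$$\inf\{x_s : s \geq \max\{T, T'\}\} < c + \varepsilon,$$
i.e., there is a $t \geq T'$ with $x_t < c + \varepsilon$, proving [L2].
[L1] and [L2] imply $\liminf_{t \to \infty} x_t = c$: Let $\varepsilon > 0$. By [L1] there is a $T \in \mathbb{N}$ such that
$$c - \varepsilon/2 < x_t$$
for all $t \geq T$. Hence,
$$c - \varepsilon < c - \varepsilon/2 \leq \inf\{x_s : s \geq T\}.$$
As the infimum increases weakly if the bound $T$ does, it follows that, for each $t \geq T$:
$$c - \varepsilon < \inf\{x_s : s \geq t\}. \qquad (57)$$
By [L2] applied to an arbitrary $t \geq T$, there is an $s \geq t$ such that
$$x_s < c + \varepsilon/2 < c + \varepsilon,$$
i.e.,
$$\inf\{x_s : s \geq t\} \leq c + \varepsilon/2 < c + \varepsilon. \qquad (58)$$
Combining (57) and (58) gives that for each $\varepsilon > 0$ there is a $T \in \mathbb{N}$ such that, for all $t \geq T$,
$$c - \varepsilon < \inf\{x_s : s \geq t\} < c + \varepsilon,$$
i.e., $\liminf_{t \to \infty} x_t = c$.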

Exercise 10.6
It suffices to show, for an arbitrary sequence $(x_t)_{t=0}^{\infty}$:
$$\liminf_{t \to \infty} x_t > 0 \iff \exists \varepsilon > 0 : x_t > \varepsilon \text{ for all but finitely many } t.$$
($\Rightarrow$): Assume $\liminf_{t \to \infty} x_t > 0$. If the liminf is infinite, the weakly increasing sequence of infima $\inf\{x_s : s \geq t\}$ diverges, so there is a $T \in \mathbb{N}$ with $\inf\{x_s : s \geq T\} \geq 1$. In particular, $x_t \geq 1$ for all $t \geq T$. If the liminf is finite, say equal to $c > 0$, [L1] with $\varepsilon = c/2$ implies that there is a $T \in \mathbb{N}$ with $x_t > c - \varepsilon = c/2$ for all $t \geq T$.
($\Leftarrow$): Assume there is an $\varepsilon > 0$ such that $x_t > \varepsilon$ for all but finitely many $t$: there is a $T \in \mathbb{N}$ such that $x_t > \varepsilon$ for $t \geq T$. Then $\inf\{x_s : s \geq t\} \geq \varepsilon$ for $t \geq T$, so the limit of the infima is at least $\varepsilon$: it must be positive!

Exercise 10.7
(a): If a sequence is unbounded, the liminf of average payoffs need not be finite. For instance, the unbounded sequence $x = (x_t)_{t=0}^{\infty}$ defined recursively by $x_0 = 1$ and, for all $t \in \mathbb{N}$, $x_t = (t + 1)^2 - \sum_{k=0}^{t-1} x_k$, has time average $\frac{1}{T} \sum_{t=0}^{T-1} x_t = T$, so its liminf diverges to infinity.

(b): Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be two bounded sequences. We need to investigate whether
$$\liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > 0 \iff \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} x_t > \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t. \qquad (59)$$
To see that this is not the case, let $x = (0, 0, \ldots)$ be the zero sequence. Substitution in (59) and using, for any sequence $z = (z_t)_{t=0}^{\infty}$, that $\liminf_{t \to \infty} (-z_t) = -\limsup_{t \to \infty} z_t$ (where the limes superior is defined analogously to liminf as $\limsup_{t \to \infty} z_t = \lim_{t \to \infty} \sup\{z_s : s \geq t\}$) yields
$$\limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t < 0 \iff \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t < 0.$$
This is obviously false. For an explicit example, take the sequence from page 73 with the oscillating average and subtract 1/2 from each entry to obtain a sequence of averages with liminf equal to $1/3 - 1/2 = -1/6 < 0$, but limsup equal to $2/3 - 1/2 = 1/6 > 0$.

Section 11

Exercise 11.1
The cost function $c$ is strictly convex, so the function $\sum_{i=1}^{n} p_i \pi(i) - \frac{1}{2\delta} c(p)$ is strictly concave. Since we maximize a strictly concave, continuous function over a compact set, a maximum exists and is unique. Notice that the gradient of the goal function has $i$-th coordinate
$$\pi(i) - \frac{1}{2\delta} \frac{\partial c(p)}{\partial p_i} = \pi(i) - \frac{1}{2\delta}\, 2\left(p_i - \frac{1}{n}\right) = \pi(i) - \frac{1}{\delta}\left(p_i - \frac{1}{n}\right).$$
Since the feasible set is entirely defined by linear (in)equalities, the Kuhn-Tucker conditions give necessary and sufficient conditions for a solution to be a maximum. So $p^* \in \Delta$ solves the maximization problem if and only if there are Lagrange multipliers $\lambda_i \geq 0$ associated with the inequality constraints $p^*_i \geq 0$ and $\mu \in \mathbb{R}$ associated with the equality constraint $\sum_{i=1}^{n} p^*_i = 1$ such that for each $i = 1, \ldots, n$:
$$\pi(i) - \frac{1}{\delta}\left(p^*_i - \frac{1}{n}\right) + \lambda_i + \mu = 0 \quad \text{and} \quad \lambda_i p^*_i = 0. \qquad (60)$$
Rewriting we find
$$\forall i = 1, \ldots, n : \quad p^*_i = \delta \pi(i) + \delta(\lambda_i + \mu) + \frac{1}{n}.$$


Assume that $p^*$ solves the maximization problem. We check that it satisfies the linear probability model with parameter $\delta$. If $p^*_i > 0$, then $\lambda_i = 0$ by complementary slackness. Hence for every $j \in A$, we find, using (60):
$$p^*_i - p^*_j = \left[\delta \pi(i) + \delta \mu + \frac{1}{n}\right] - \left[\delta \pi(j) + \delta(\lambda_j + \mu) + \frac{1}{n}\right] = \delta(\pi(i) - \pi(j)) - \delta \lambda_j \leq \delta(\pi(i) - \pi(j)),$$
where the inequality follows from the fact that $\delta > 0$ and $\lambda_j \geq 0$. This is exactly requirement (52).

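Conversely, if $p^* \in \Delta$ satisfies requirement (52), one can easily show that it satisfies the Kuhn-Tucker conditions. Recall that if $p^*_i > 0$ and $p^*_j > 0$, then
$$p^*_i - p^*_j = \delta(\pi(i) - \pi(j)),$$
so
$$\frac{1}{\delta}\left(p^*_i - \frac{1}{n}\right) - \pi(i) = \frac{1}{\delta}\left(p^*_j - \frac{1}{n}\right) - \pi(j). \qquad (61)$$
Hence if we choose $i \in \{1, \ldots, n\}$ with $p^*_i > 0$ and define
$$\mu = \frac{1}{\delta}\left(p^*_i - \frac{1}{n}\right) - \pi(i) \in \mathbb{R},$$
we have from (61) that
$$\mu = \frac{1}{\delta}\left(p^*_j - \frac{1}{n}\right) - \pi(j)$$
for all $j$ with $p^*_j > 0$. Now define for each $k$:
$$\lambda_k = \begin{cases} 0 & \text{if } p^*_k > 0, \\ \frac{1}{\delta}\left(p^*_k - \frac{1}{n}\right) - \pi(k) - \mu & \text{if } p^*_k = 0. \end{cases}$$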

To see that $\lambda_k \geq 0$ if $p^*_k = 0$, choose an alternative $j$ with $p^*_j > 0$. By definition of the linear probability model,
$$p^*_j - p^*_k \leq \delta(\pi(j) - \pi(k)),$$
which implies
$$(\pi(j) - \pi(k)) - \frac{1}{\delta}\left(p^*_j - p^*_k\right) \geq 0.$$
Hence
$$\lambda_k = \frac{1}{\delta}\left(p^*_k - \frac{1}{n}\right) - \pi(k) - \mu = \frac{1}{\delta}\left(p^*_k - \frac{1}{n}\right) - \pi(k) - \frac{1}{\delta}\left(p^*_j - \frac{1}{n}\right) + \pi(j) = (\pi(j) - \pi(k)) - \frac{1}{\delta}\left(p^*_j - p^*_k\right) \geq 0,$$
as we had to show. Substituting the definition of the Lagrange multipliers in (60) shows that the Kuhn-Tucker conditions are satisfied.
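The Kuhn-Tucker characterization suggests a simple numerical procedure. By (60), a positive probability has the form $p^*_i = \delta(\pi(i) + \mu) + 1/n$ and a zero probability requires $\delta(\pi(i) + \mu) + 1/n \leq 0$, so $p^*_i = \max\{0, \delta(\pi(i) + \mu) + 1/n\}$, with $\mu$ pinned down by $\sum_i p^*_i = 1$. The sketch below finds $\mu$ by bisection. The bisection approach, its bracket, and the function name are assumptions made for illustration; the payoffs $(0, 2, 8)$ in the usage example are those of Exercise 11.3.

```python
def linear_probability_choice(payoffs, delta, iterations=200):
    """Choice probabilities p_i = max(0, delta*(pi_i + mu) + 1/n) that sum to one.

    mu (the multiplier of the adding-up constraint) is found by bisection,
    using that the total probability is nondecreasing in mu.
    """
    n = len(payoffs)

    def total(mu):
        return sum(max(0.0, delta * (pi + mu) + 1.0 / n) for pi in payoffs)

    lo = -max(payoffs) - 1.0 / (n * delta)   # every term is (numerically) zero here
    hi = -min(payoffs) + 1.0 / delta         # every term is at least 1 + 1/n here
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if total(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    mu = (lo + hi) / 2
    return [max(0.0, delta * (pi + mu) + 1.0 / n) for pi in payoffs]

if __name__ == "__main__":
    for delta in (0.05, 0.15, 1.0):
        print(delta, linear_probability_choice([0, 2, 8], delta))
```

For small $\delta$ all alternatives receive positive probability; as $\delta$ grows, low-payoff alternatives drop out one by one.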

Exercise 11.2
(a): Choice probabilities are weakly increasing in payoffs, so the probability of choosing 1 must be positive. If also the probability of choosing 2 is positive, the linearity requirement implies
$$P_A(1) - P_A(2) = \delta(\pi(1) - \pi(2)) = 4\delta.$$
Together with $P_A(1) + P_A(2) = 1$, this gives
$$P_A(1) = \frac{4\delta + 1}{2}, \qquad P_A(2) = \frac{1 - 4\delta}{2}. \qquad (62)$$
Obviously, this is possible if and only if both these probabilities are nonnegative, i.e., if and only if $\delta \leq 1/4$. So for $\delta \in (0, 1/4]$, the choice probabilities in (62) satisfy the linear probability model and we know that for every $\delta$ there is only one such vector of choice probabilities. For $\delta > 1/4$, we find
$$P_A(1) = 1, \qquad P_A(2) = 0. \qquad (63)$$
(b) (c): Answered in the notes on The role of $\delta$.
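Formulas (62) and (63) combine into a single piecewise rule. A minimal sketch using exact fractions, with the payoff gap $\pi(1) - \pi(2) = 4$ from the exercise as the default; the function name and the `payoff_gap` parameter are illustrative.

```python
from fractions import Fraction

def two_alternative_probabilities(delta, payoff_gap=4):
    """(P_A(1), P_A(2)) in the linear probability model when pi(1) - pi(2) = payoff_gap > 0."""
    delta = Fraction(delta)
    if delta * payoff_gap <= 1:
        p1 = (1 + payoff_gap * delta) / 2   # interior case, formula (62)
        return p1, 1 - p1
    return Fraction(1), Fraction(0)         # corner case, formula (63)

if __name__ == "__main__":
    for d in (Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)):
        print(d, two_alternative_probabilities(d))
```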

Exercise 11.3
(a):

• In the logit model with parameter $\delta > 0$, the choice probability for each alternative $i \in A$ is
$$P_A(i) = \frac{\exp(\pi(i)/\delta)}{\sum_{j \in A} \exp(\pi(j)/\delta)}. \qquad (64)$$

• Substituting the payoffs, we find:
$$P_A(1) = \frac{\exp(0/\delta)}{\exp(0/\delta) + \exp(2/\delta) + \exp(8/\delta)} = \frac{1}{1 + \exp(2/\delta) + \exp(8/\delta)},$$
$$P_A(2) = \frac{\exp(2/\delta)}{1 + \exp(2/\delta) + \exp(8/\delta)},$$
$$P_A(3) = \frac{\exp(8/\delta)}{1 + \exp(2/\delta) + \exp(8/\delta)}.$$
Since the exponential function takes strictly positive values, all choice probabilities lie in $(0, 1)$.

• The logit model is a special case of Luce's choice model (see (42) and (45)), which satisfies path independence. Hence the logit model satisfies path independence.

• As $\delta \to \infty$, the choice probabilities converge to 1/3. See the motivation in Section 11.2.
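The logit probabilities in (64) and the two properties claimed for them can be checked numerically. The sketch below subtracts the largest payoff before exponentiating, a standard trick that leaves the probabilities unchanged but avoids overflow for small $\delta$; the function name is illustrative.

```python
import math

def logit_probabilities(payoffs, delta):
    """Logit choice probabilities exp(pi_i/delta) / sum_j exp(pi_j/delta)."""
    m = max(payoffs)
    # Shifting all payoffs by m rescales numerator and denominator by the same factor.
    weights = [math.exp((pi - m) / delta) for pi in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    delta = 2.0
    p = logit_probabilities([0, 2, 8], delta)
    q = logit_probabilities([0, 2], delta)      # the smaller problem {1, 2}
    # Path independence: P_A(1) = P_A({1, 2}) * P_{1,2}(1).
    print(p[0], (p[0] + p[1]) * q[0])
    # For very large delta the probabilities approach the uniform distribution.
    print(logit_probabilities([0, 2, 8], 1e6))
```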

(b):

• Choice probabilities $P_A(i)$ for all alternatives $i \in A$ satisfy the linear probability model with parameter $\delta > 0$ if the following holds:
$$\text{if } P_A(i) > 0, \text{ then } P_A(i) - P_A(j) \leq \delta(\pi(i) - \pi(j)) \text{ for all } j \in A. \qquad (65)$$

• Since choice probabilities are weakly increasing in payoffs and $\pi(3) > \pi(2) > \pi(1)$, there are three cases to consider:

  ◦ Case 1: $P_A(i) > 0$ for all $i \in A$.
  ◦ Case 2: $P_A(3), P_A(2) > 0$, $P_A(1) = 0$.
  ◦ Case 3: $P_A(3) > 0$, $P_A(2) = P_A(1) = 0$, or equivalently, $P_A(3) = 1$.

• Using (65), the first case requires:
$$P_A(3) - P_A(2) = \delta(\pi(3) - \pi(2)) = 6\delta,$$
$$P_A(3) - P_A(1) = \delta(\pi(3) - \pi(1)) = 8\delta,$$
$$P_A(1) + P_A(2) + P_A(3) = 1.$$
So:
$$P_A(2) = P_A(3) - 6\delta,$$
$$P_A(1) = P_A(3) - 8\delta,$$
$$3P_A(3) - 14\delta = 1.$$
Conclude that
$$P_A(3) = \frac{1 + 14\delta}{3}, \qquad P_A(2) = \frac{1 + 14\delta}{3} - 6\delta = \frac{1 - 4\delta}{3}, \qquad P_A(1) = \frac{1 + 14\delta}{3} - 8\delta = \frac{1 - 10\delta}{3}. \qquad (66)$$
To make sure that all probabilities are positive, this requires that $\delta \in (0, 1/10)$. So the probabilities in (66) satisfy the linear probability model for $\delta \in (0, 1/10)$.

• Using (65), the second case requires:
$$P_A(3) - P_A(2) = \delta(\pi(3) - \pi(2)) = 6\delta,$$
$$P_A(3) - P_A(1) = P_A(3) \leq \delta(\pi(3) - \pi(1)) = 8\delta,$$
$$P_A(1) = 0,$$
$$P_A(1) + P_A(2) + P_A(3) = 1.$$
Rewrite:
$$P_A(2) = P_A(3) - 6\delta,$$
$$P_A(1) = 0,$$
$$P_A(3) \leq 8\delta,$$
$$2P_A(3) - 6\delta = 1.$$
Conclude that
$$P_A(3) = \frac{1 + 6\delta}{2}, \qquad P_A(2) = \frac{1 + 6\delta}{2} - 6\delta = \frac{1 - 6\delta}{2}, \qquad P_A(1) = 0. \qquad (67)$$
To make sure that $P_A(2)$ and $P_A(3)$ are positive and
$$P_A(3) = \frac{1 + 6\delta}{2} \leq 8\delta,$$
this requires that $\delta \in [1/10, 1/6)$. Conclude that the choice probabilities in (67) satisfy the linear probability model for $\delta \in [1/10, 1/6)$.

• Using (65), the third case requires:
$$P_A(3) - P_A(2) = 1 \leq \delta(\pi(3) - \pi(2)) = 6\delta,$$
$$P_A(3) - P_A(1) = 1 \leq \delta(\pi(3) - \pi(1)) = 8\delta.$$
So choice probabilities $P_A(1) = P_A(2) = 0$, $P_A(3) = 1$ satisfy the linear probability model as long as $\delta \geq 1/6$.

• The linear probability model violates path independence for some values of $\delta > 0$. In particular, we will show that for a specific value of $\delta > 0$, $P_A(1) \neq P_A(\{1, 2\}) P_{\{1,2\}}(1)$. This means that we have to consider choice probabilities in the smaller problem with only alternatives 1 and 2. Let us assume that both $P_{\{1,2\}}(1)$ and $P_{\{1,2\}}(2)$ are positive. This requires that
$$P_{\{1,2\}}(2) - P_{\{1,2\}}(1) = \delta(\pi(2) - \pi(1)) = 2\delta,$$
$$P_{\{1,2\}}(1) + P_{\{1,2\}}(2) = 1,$$
so
$$P_{\{1,2\}}(1) = \frac{1 - 2\delta}{2}, \qquad P_{\{1,2\}}(2) = \frac{1 + 2\delta}{2}.$$
These choice probabilities satisfy the linear probability model as long as $\delta \in (0, 1/2)$. Now let us choose $\delta = 1/20$. Then
$$P_A(1) = \frac{1 - 10\delta}{3} = \frac{1}{6}$$
but
$$P_A(\{1, 2\}) P_{\{1,2\}}(1) = \left(\frac{1 - 10\delta}{3} + \frac{1 - 4\delta}{3}\right) \frac{1 - 2\delta}{2} = \frac{2 - 14\delta}{3} \cdot \frac{1 - 2\delta}{2} = \frac{(1 - 7\delta)(1 - 2\delta)}{3} = \frac{39}{200} \neq \frac{1}{6}.$$

• As $\delta \to \infty$, it follows from our earlier analysis that Case 3 is the only feasible one: the decision maker rationally chooses alternative 3 with probability one.

Exercise 11.4
Suppose $p_i < p_j$. Exchange the probabilities assigned to the $i$-th and $j$-th alternative to obtain a vector $p'$. By construction, $\sum_{k=1}^{n} p'_k \pi(k) > \sum_{k=1}^{n} p_k \pi(k)$, and by symmetry, the control cost term is unaffected, contradicting that $p$ solves $P(\delta)$.
