Ambiguity, data and preferences for information – A case-based approach

30
Available online at www.sciencedirect.com Journal of Economic Theory 148 (2013) 1433–1462 www.elsevier.com/locate/jet Ambiguity, data and preferences for information – A case-based approach Jürgen Eichberger a,1 , Ani Guerdjikova b,a University of Heidelberg, Alfred Weber Institute, Grabengasse 14, 69117 Heidelberg, Germany b University of Cergy-Pontoise, THEMA, 33 bd. du Port, 95011 Cergy-Pontoise, France Received 9 June 2011; final version received 18 December 2012; accepted 7 January 2013 Available online 24 April 2013 Abstract We model decision making under ambiguity based on available data. Decision makers express pref- erences over actions and data sets. We derive an α-max–min representation of preferences, in which beliefs combine objective characteristics of the data (number and frequency of observations) with sub- jective features of the decision maker (similarity of observations and perceived ambiguity). We identify the subjectively perceived ambiguity and separate it into ambiguity due to a limited number of observations and ambiguity due to data heterogeneity. The special case of no ambiguity provides a behavioral foundation for beliefs as similarity-weighted frequencies as in Billot et al. (2005) [3]. © 2013 Elsevier Inc. All rights reserved. JEL classification: D81 Keywords: Case-based decisions; Ambiguity; Similarity; Optimism; Pessimism; Preferences for information precision This research was supported by grant PCIG-GA-2011-293529 of the European Research Council as well as by grant ANR 2011 CHEX 006 01 of the Agence Nationale de Recherche. We would like to thank the associate editor and the anonymous referee, Larry Blume, Eric Danan, David Easley, Gaby Gayer, Itzhak Gilboa, John Hey, Edi Karni, David Kelsey, Nick Kiefer, Peter Klibanoff, Mark Machina, Charles Manski, Francesca Molinari, Ben Polak, Regis Renault, Karl Schlag, David Schmeidler, conference participants at RUD 2008 in Oxford, FUR 2008 in Barcelona, the Workshop on Learning and Similarity 2008 in Alicante, RUD 2012 in Northwestern, ESEM 2012 in Malaga, as well as seminar participants at Cornell, UC Davis, Cergy Pontoise, George Washington University, École Polytechnique, University of Paris 2 and the University of Sankt Gallen for helpful suggestions and comments. * Corresponding author. Fax: +33 1 34 25 62 33. E-mail addresses: [email protected] (J. Eichberger), [email protected] (A. Guerdjikova). 1 Fax: +49 62 21 54 29 97. 0022-0531/$ – see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jet.2013.04.002

Transcript of Ambiguity, data and preferences for information – A case-based approach

Page 1: Ambiguity, data and preferences for information – A case-based approach

Available online at www.sciencedirect.com

Journal of Economic Theory 148 (2013) 1433–1462

www.elsevier.com/locate/jet

Ambiguity, data and preferences for information –A case-based approach ✩

Jürgen Eichberger a,1, Ani Guerdjikova b,∗

a University of Heidelberg, Alfred Weber Institute, Grabengasse 14, 69117 Heidelberg, Germanyb University of Cergy-Pontoise, THEMA, 33 bd. du Port, 95011 Cergy-Pontoise, France

Received 9 June 2011; final version received 18 December 2012; accepted 7 January 2013

Available online 24 April 2013

Abstract

We model decision making under ambiguity based on available data. Decision makers express pref-erences over actions and data sets. We derive an α-max–min representation of preferences, in whichbeliefs combine objective characteristics of the data (number and frequency of observations) with sub-jective features of the decision maker (similarity of observations and perceived ambiguity). We identify thesubjectively perceived ambiguity and separate it into ambiguity due to a limited number of observations andambiguity due to data heterogeneity. The special case of no ambiguity provides a behavioral foundation forbeliefs as similarity-weighted frequencies as in Billot et al. (2005) [3].© 2013 Elsevier Inc. All rights reserved.

JEL classification: D81

Keywords: Case-based decisions; Ambiguity; Similarity; Optimism; Pessimism; Preferences for information precision

✩ This research was supported by grant PCIG-GA-2011-293529 of the European Research Council as well as by grantANR 2011 CHEX 006 01 of the Agence Nationale de Recherche. We would like to thank the associate editor and theanonymous referee, Larry Blume, Eric Danan, David Easley, Gaby Gayer, Itzhak Gilboa, John Hey, Edi Karni, DavidKelsey, Nick Kiefer, Peter Klibanoff, Mark Machina, Charles Manski, Francesca Molinari, Ben Polak, Regis Renault,Karl Schlag, David Schmeidler, conference participants at RUD 2008 in Oxford, FUR 2008 in Barcelona, the Workshopon Learning and Similarity 2008 in Alicante, RUD 2012 in Northwestern, ESEM 2012 in Malaga, as well as seminarparticipants at Cornell, UC Davis, Cergy Pontoise, George Washington University, École Polytechnique, University ofParis 2 and the University of Sankt Gallen for helpful suggestions and comments.

* Corresponding author. Fax: +33 1 34 25 62 33.E-mail addresses: [email protected] (J. Eichberger), [email protected]

(A. Guerdjikova).1 Fax: +49 62 21 54 29 97.

0022-0531/$ – see front matter © 2013 Elsevier Inc. All rights reserved.http://dx.doi.org/10.1016/j.jet.2013.04.002

Page 2: Ambiguity, data and preferences for information – A case-based approach

1434 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

1. Introduction

Since Hume’s “Enquiry Concerning Human Understanding”, see [23], it has been recognizedthat the notion of induction is fundamental to human knowledge. “Reasoning rests on the prin-ciple of analogy”, writes Knight [25, p. 199], “we judge the future by the past”. In practice,decisions are often informed by data consisting of past observations. Randomized statistical ex-periments represent an ideal method of data collection, since the recorded information can bedirectly aggregated into a probability distribution over outcomes. In contrast, in real-life decisionsituations, the available data might contain a limited number of heterogeneous observations withdiffering degrees of relevance for the decision situation at hand. [12, p. 657] summarizes theproblem as follows: “What is at issue might be called the ambiguity of this information, a qualitydepending on the amount, type, reliability, and “unanimity” of information, and giving rise toone’s degree of “confidence” in an estimate of relative likelihoods”.

This naturally raises the question of how decision makers aggregate information from datainto beliefs over outcomes. In this paper, we pursue a behavioral approach to this issue. We usethe case-based framework pioneered by [16] to study agents who choose actions in the face ofdata. The natural primitives of the model are the decision maker’s preferences over pairs of ac-tions and data sets. We impose axioms which allow us to deduce agents’ beliefs about uncertainoutcomes and directly relate them to the content of the available data. In particular, we identifythe perceived ambiguity of information and decompose it into two parts: ambiguity due to limitednumber of observations and ambiguity due to data heterogeneity. We also derive a representa-tion of preferences which shows how data influence the evaluation of actions. Furthermore, ourapproach allows us to determine how decision makers value information of different type.

The analysis of decisions based on data is important, since data represent a major sourceof information, but do not fit the standard classification of information into risk and uncertainty.While data carry objective information about the stochastic process of outcomes, this informationmight be insufficient to point-identify the probability distribution of outcomes. Limited numberof observations, heterogeneity of observations and missing data are the main reasons for thisambiguity, see [27]. Hence, the agent’s beliefs will have to combine objective characteristics ofthe data (such as number, frequency and type of observations) with subjective considerations(such as perception of ambiguity and similarity between observations).

The two common approaches towards beliefs cannot capture this type of considerations. Thestate-based approach established by [28] derives a purely subjective probability distribution overstates. It thus completely detaches prior beliefs from objectively given information and leaves noroom for using data inductively in the process of belief formation.

The objective approach integrates available probabilistic information directly into the decisionmaker’s beliefs. When such information is not directly available, non-Bayesian statisticians useobserved frequencies in the data to infer probabilities of outcomes. In this case, the number andthe relevance of observations become an issue.

These two approaches represent limit cases of our analysis: beliefs based on a large data setresulting from a randomized statistical experiment will be (almost) objective, while purely sub-jective beliefs seem appropriate when the data set is small or contains cases of limited relevance.By considering the set of all possible data sets, our model spans a universe of possible scenarios,in which both objective and subjective factors determine beliefs.

Decision theory has so far treated the issues of data and ambiguity separately. Following [25],the literature deals with two types of situations: risk and uncertainty. In the expected utilityframework developed by [28,31], this distinction is inconsequential. The models of Knightian

Page 3: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1435

uncertainty, which have emerged in response to Ellsberg’s experiments in [12], highlight theimportance of ambiguity and ambiguity aversion for human behavior. Different models of am-biguity and ambiguity attitude have been provided by [4,6,15,18,26,29]. In this literature, bothperceived ambiguity and ambiguity attitude are purely subjective concepts, and, hence, unrelatedto any potentially available information. At the opposite end of the spectrum, several recent pa-pers [1,14,30], assume that ambiguity can be related to an objectively given set of probabilitydistributions. This allows them to separate the “objective ambiguity” from the “subjective atti-tude” towards ambiguity.

Case-based decision theory developed by [16] incorporates data directly into the decisionmaking process.2 It allows for heterogeneity of observations and introduces the concept of simi-larity to capture the different relevance of these observations for the evaluation of a given action.The different representations fail, however, to separate beliefs over outcomes from the evaluationof these outcomes. [3] represent beliefs as similarity-weighted frequencies of observations. How-ever, their method is not behavioral, i.e., the existence of such beliefs is postulated. Moreover, itdoes not take into account ambiguity of data and attitude towards such ambiguity.

The empirical relationship between ambiguity and data is largely unexplored. Some recentexperimental studies [2,22] examine, however, behavior when information is provided in theform of data. Both studies report significant behavioral effects of the form in which data is pro-vided.

In this paper, we combine the case-based approach to information processing with the litera-ture on ambiguity. We model a decision maker who chooses from a finite set of actions knowingthe set of possible outcomes. As in the case-based theory of [16], the information context ofthe decision situation is specified by a data set containing past observations of actions and theiroutcomes. Similar to [14], we assume that the decision maker can compare pairs consisting ofan action and an information context. Based on behavioral axioms, we derive a representation ofpreferences by an α-MEU functional, as in [6,15]:

V (a;D) = α maxp∈Ha(D)

∑r∈R

u(r)p(r) + (1 − α) minp∈Ha(D)

∑r∈R

u(r)p(r).

A pair (a;D), consisting of an action a and a data set D, is evaluated by the convex combinationof the maximal and minimal expected utility over outcomes r ∈ R,

∑r∈R u(r)p(r) obtained on

a set of probability distributions Ha(D). The beliefs over outcomes Ha(D) are set-valued, thuscapturing the fact that information might be ambiguous. They depend on the action a and on thecharacteristics of the data D. The decision maker’s attitude towards ambiguity is described bya degree of optimism α, the weight assigned to the maximal expected utility, and the degree ofpessimism (1 − α), the weight assigned to the minimal expected utility.

Our first contribution is to deduce the sets of probability distributions over outcomes, Ha(D),associated with a specific action a in a given information context D. Thus, we provide a behav-ioral foundation for the work of [3,9]. The set Ha(D) combines objective criteria, such as thefrequency and the number of observations in the data, with subjective features of the decisionmaker, such as the perceived degree of ambiguity and the similarity of observations. The rep-resentation obtained generalizes the idea of beliefs as similarity-weighted frequencies in [3] byallowing for ambiguity.

Our second contribution is to identify the degree of ambiguity associated with a particulardata set and behaviorally separate it from ambiguity attitude captured by α. Perceived ambiguity

2 See also [17,19] for alternative axiomatizations.

Page 4: Ambiguity, data and preferences for information – A case-based approach

1436 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

can be separated into two parts: vanishing ambiguity due to a limited number of observationsand persistent ambiguity due to the heterogeneity of cases in the data set. Since cases containthe observation of a single action, correlation across actions cannot be learned from the data.Hence, there is ambiguity associated with predictions about the performance of a given action a

using cases containing actions different from a and this ambiguity is persistent. We thus capturethe well-known insight of identification theory: if relevant characteristics are unobservable, themodel might not be identifiable [27]. The distinction between vanishing and persistent ambiguitycorresponds to a similar distinction in [13]. Our model also extends the approach of [7,20] toincorporate data heterogeneity.

Our third contribution is to demonstrate that the representation obtained in this paper can serveto model preferences for additional information. In particular, we show that the degrees of opti-mism and pessimism determine the decision maker’s preferences for more precise information:the more pessimistic the decision maker, the more he values precision of information.

The rest of the paper is organized as follows. The next section describes the framework andprovides examples which illustrate the scope of our approach. Section 3 introduces the desiredrepresentation and considers important special cases. Section 4 provides the axioms. Section 5contains the main representation theorem and discusses the notion of preferences for more pre-cise information. All proofs are collected in Appendix A.

2. Framework and motivating examples

We start this section by presenting the framework for our analysis.

2.1. Framework

Consider a decision problem (A;R) consisting of a finite set of actions A and a finite set ofoutcomes R. The decision maker’s information is given in form of data. A data set consists ofcases c which are tuples of an action a ∈ A and the outcome r ∈ R observed as a consequenceof this action, c = (a; r). The set of all cases is C = A × R. An information context is identifiedwith a data set D. A data set of length T , D = (c1 . . . cT ) = ((a1; r1); . . . ; (aT ; rT )), is a vectorof T cases. The set of all data sets of length T ∈ N, is denoted by DT := CT . D := ⋃

T�1 DT

denotes the set of data sets of finite length. The empty data set denoted D∅ does not contain anycases and captures a situation in which no information is available. D∗ =:D∪ {D∅} is the set ofall data sets including the empty one.

We remain agnostic as to how the information context D has been generated. The presumptionis that the decision maker trusts that the data is objective and reproducible and that the processdetermining the outcomes of the actions has not changed. Furthermore, we assume that an ob-servation of an action per se (i.e., without reference to its outcome) does not carry additionalinformation about the desirability of this action.3

Decision makers compare actions belonging to different information contexts. They rank pairsof actions and data sets, expressing preferences of the type (a;D) � (a′;D′). Hence, A ×D∗ isthe domain of the decision maker’s preference order �.

3 E.g., if the observations refer to past choices, the presumption is that these choices were not made based on superiorinformation which is not available to the decision maker.

Page 5: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1437

Table 1Data set D1.

R

1 0

A ab 2 1

aw 4 3

Table 2Data set D2.

R

1 0

A ab 80 90

aw 60 70

2.2. Discussion of the preference order � and examples

We are not the first to include the information context of an action in the domain of prefer-ences. [14] model preferences over pairs of Savage acts and sets of probability distributions overstates. The interpretation of the two preference relations is similar. As [14, p. 29] note, it canbe best understood using Ellsberg’s two urn example. In urn 1, there are 50 black and 50 whiteballs. The composition of urn 2 is unknown. The decision maker chooses the color and the urn tobet on. We can describe urn 2 by the empty data set and urn 1 by a very large data set, in whichthe frequency of success is 1

2 for both bets. While the bets are the same, the information contextof the bet varies across the two urns, rendering natural the idea of preferences on action-data-setpairs. Our first example illustrates how describing urns by data sets provides a generalization ofthe Ellsberg experiment.

Example 1 (Betting on a draw from an urn). A decision maker is offered to bet on the color ofthe ball drawn from one of two urns. Each of the urns contains the same total number of blackand white balls. A bet on white (black), aw (ab) pays 1 if the ball drawn is white (black) and 0,otherwise. Hence, the set of actions is A = {aw;ab} and the set of outcomes is R = {0;1}. Aninformation context specifies the available information about urn i, i ∈ {1;2} and is representedby a data set. Suppose, e.g., that there are 10 observations available for urn 1 as summarized indata set D1:

D1 = ((aw;1); (aw;1); (ab;0); (aw;0); (aw;1); (aw;0); (aw;0);

(ab;1); (ab;1); (aw;1)).

If the order of the draws does not matter, D1 can be represented as in Table 1. The data set forurn 2, D2, contains 300 observations, see Table 2.

In D1, half of the available observations involved a white ball being drawn and half – a blackone. The same is true for D2, but the number of observations is significantly larger.

Preferences � are defined on the set A × D∗: the decision maker decides which urn to beton and which bet to place. Given two urns characterized by equal frequencies, but differentnumber of observations, would the decision maker prefer the longer data set, regardless of the

Page 6: Ambiguity, data and preferences for information – A case-based approach

1438 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

bet, (ab;D2) � (ab;D1) and (aw;D2) � (aw;D1)? As in the Ellsberg paradox, such behaviorcould be attributed to the higher precision of information in D2.

Our second example shows that information contexts can describe distinct economic environ-ments, which offer the same set of choices, but differ with respect to available information. Sincedata provide information about the process governing the outcome realizations of actions, theyaffect the decision maker’s beliefs about a given action in a given information context.

Example 2 (Loan market). Consider an economy, in which entrepreneurs invest in risky projectsa ∈ A, such as environmental technologies. The set of possible financial returns r is given by R.To start a project, an entrepreneur needs one unit of capital. Entrepreneurs do not possess capital,but can borrow from lenders, each of whom owns exactly one unit of capital.

Lenders and entrepreneurs can decide in which of two markets to be active, a well establishedmarket 1, or an emerging market 2. The set of projects A and the set of outcomes R are identicalfor both markets. The information about each market i ∈ {1;2} is summarized in a data set Di

containing the observed returns of projects in this market, Di = ((ai1; ri

1) . . . (aiT i ; ri

T i )).Agents’ preferences � are defined on the set A × D∗ expressing the idea that each of them

has to choose a project and a market, in which to invest. E.g., a preference of an entrepreneur(a1;D1) � (a2;D2) means that he prefers to invest in project a1 in market 1 described by infor-mation context D1 to investing in a2 in market 2 described by D2. This might reflect a higherfrequency of good outcomes, better information precision, or higher relevance of observations inthe environment described by D1, or some combination of all of these. The information charac-teristics of the markets determine the agents’ market participation decisions.

Preferences on the domain A × D∗ can be elicited in an experimental setting by asking deci-sion makers to choose between bets on urns characterized by different sets of past observations,or, more generally, between information contexts characterized by different data sets. Our frame-work allows a decision maker to express also preferences for additional information. E.g., givena small sample of cases D2 about the emerging market, an entrepreneur can order a marketingstudy for the project a he is interested in. The data set obtained from the study, D′

2 might makehim feel more confident in the success of the project, (a;D′

2) � (a;D2). Note that such prefer-ences for data sets related to a given action do not imply the availability of all data sets. Whilethe entrepreneur can decide on the type and quantity of information collected by the study, hecannot control the outcomes in the new data set. Hence, the decision to collect additional obser-vations will depend on the perceived benefits from potential data sets generated by such a study.Preferences for information are thus hypothetical, but experimentally testable.

2.3. Notation

We conclude this section by introducing some notation which will be used throughout thepaper. Let N =: {1;2; . . .}. For a case c, ac is the action observed in c. For D = (c1 . . . cT ), andk ∈N, Dk denotes the k-fold of D:

Dk = (c1 . . . cT ; . . . ; c1 . . . cT︸ ︷︷ ︸).

k-times
Page 7: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1439

ck is the data set with k observations of c. For T ∈ N, the frequency of D ∈ DT is fD =(fD(c))c∈C =: (

|{t |ct=c}|T

)c∈C . δc is the Dirac measure on case c. The set of frequencies of alldata sets of length T is given by

FT =:{f ∈ �|C|−1 with f (c) ∈

{0; 1

T. . .

k

T. . .

T − 1

T;1

}for all c ∈ C

}.

For μ ∈ [0;1], a convex combination of two frequencies f and f ′ is defined as: μf +(1−μ)f ′ =(μf (c) + (1 − μ)f ′(c))c∈C . Note that (μf + (1 − μ)f ′) need not itself be a frequency of a dataset. Da is the set of data sets containing only observations of action a, Da = {D ∈ D | fD(a′; r) =0 for all a′ �= a and all r ∈ R}. DT

a =:Da ∩DT . FTa is the corresponding set of frequencies. δr is

the Dirac measure on outcome r .

3. The representation

We propose the following α-MEU representation of preferences � over action-data-set pairs:

V (a;D) = α maxp∈Ha(D)

u · p + (1 − α) minp∈Ha(D)

u · p, (1)

where u : R → R is a utility function over outcomes, α ∈ [0;1] is the decision maker’s degreeof optimism and Ha(D) ⊆ �|R|−1 is a set of probability distributions over outcomes associatedwith action a given the information in D.

Before introducing the general form of Ha(D), we consider several special cases which cor-respond to well-known representations.

1. Objective probabilities as limit frequencies: Assume that D contains T observations of asingle action a, fD ∈ Fa . We suggest that Ha(D) should be a convex combination of thesimplex, �|R|−1 with the empirically observed frequency of outcomes,

Ha(D) = γT �|R|−1 + (1 − γT ){ ∑

r∈R

fD(a; r)δr

}.

The parameter γT captures the ambiguity due to the limited number of observations in thedata. As the number of observations in D, T grows, limT →∞ γT = 0 and ambiguity com-pletely vanishes. Using (1), we obtain

limT →∞V

(a;D = (fD;T )

) =∑r∈R

u(r)fD(a; r).

Hence, for statistical experiments with a large number of observations, beliefs coincide withthe limit frequencies of outcomes. Preferences are represented by EU with respect to theselimit frequencies.

2. Similarity-weighted frequencies: Assume that the decision maker has a data set D, in whichdifferent actions have been observed. Assume that for a given action a and for each observedcase c, whether c contains action a or a different action a′, he can form an unambiguousprediction about the outcome of a based on the observation of case c, ρc

a ∈ R. Suppose alsothat the relevance attached to each of the cases c is given by the similarity sa(ac) between a

and the action ac contained in case c. Then, for an arbitrary T , the set of beliefs is centeredaround the similarity-weighted frequencies:

Ha(D) = γT �|R|−1 + (1 − γT )

{ ∑ fD(c)sa(ac)δρca∑

c′∈C fD(c′)sa(ac′)

}.

c∈C

Page 8: Ambiguity, data and preferences for information – A case-based approach

1440 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

This is an extension of [3] to the case in which the decision maker perceives ambiguity γT

due to the limited number of observations in the data. As the number of observations T ina data set D with frequency fD grows, limT →∞ γT = 0 and we obtain the representation ofbeliefs suggested in [3] and an expected utility evaluation of actions and data,

limT →∞V (a;D) = (fD;T ) =

∑r∈R

u(r)∑c∈C

fD(c)sa(ac)δρca(r)∑

c′∈C fD(c′)sa(ac′).

3. Hurwicz criterion: If the decision maker has no information, i.e., D = D∅, he cannot ex-clude any probability distribution over outcomes as possible and his beliefs are given byHa(D∅) = �|R|−1. Hence,

V (a;D∅) = α maxp∈�|R|−1

u · p + (1 − α) minp∈�|R|−1

u · p.

In this case, the representation coincides with the Hurwicz criterion [24], applicable to thecase of complete ignorance about the true probability distribution.

Based on these special cases, we suggest the following general rule of belief formation basedon available data: Ha(D∅) =: �|R|−1 and for a data set D ∈ D with frequency fD and length T ,

Ha(D) =[γT + (1 − γT )

∑c∈C γ c

a fD(c)sa(ac)∑c′∈C fD(c′)sa(ac′)

]�|R|−1

+ (1 − γT )∑

c∈C(1 − γ ca )sa(ac)fD(c)∑

c′∈C sa(ac′)fD(c′)

{∑c∈C(1 − γ c

a )fD(c)sa(ac)δρca∑

c∈C(1 − γ ca )sa(ac)fD(c)

}(2)

where ρ : A × C → R is a prediction function, sa : A → R++, a ∈ A, is a family of similarityfunctions, (γT )T ∈N is a sequence of perceived degrees of ambiguity, and γ c

a : A×C → [0;1) are(minimal) coefficients of perceived ambiguity depending on cases and actions.

3.1. Interpretation of the proposed representation

The α-MEU representation in (1) says that when evaluating the choice of a for a given dataset D, the decision maker considers a set of probability distributions over outcomes Ha(D). Heassigns a weight of α (his degree of optimism) to the expected utility derived using the bestprobability distribution in this set and a weight (1 −α) (his degree of pessimism) to the expectedutility derived using the worst probability distribution in Ha(D).

To explain how beliefs depend on data, we restate (2) with a suggested interpretation for itscomponents:

Ha(D) =[

γT︸︷︷︸vanishingambiguity

+ (1 − γT )∑

c∈C γ ca fD(c)sa(ac)∑

c′∈C fD(c′)sa(ac′)︸ ︷︷ ︸persistent ambiguity

]

︸ ︷︷ ︸perceived ambiguity

�|R|−1

+ (1 − γT )∑

c∈C(1 − γ ca )sa(ac)fD(c)∑

c′∈C sa(ac′)fD(c′)︸ ︷︷ ︸{∑

c∈C(1 − γ ca )fD(c)sa(ac)δρc

a∑c∈C(1 − γ c

a )sa(ac)fD(c)︸ ︷︷ ︸}.

perceived confidence prediction

Page 9: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1441

Observe first that Ha(D) is a convex combination of the simplex, �|R|−1 with a probabilitydistribution referred to as prediction, which is computed based on the frequency of D, fD . Itaggregates the observations in the data set by assigning to each case c an unambiguous predictionabout the outcome of a given the observation of this case, ρc

a ∈ R. The probability distributionconcentrated on the unambiguous prediction ρc

a , δρca, is weighted by the frequency of case c in

the data, fD(c), by its similarity to the action a, sa(ac), as well as by the degree of confidence,(1 − γ c

a ) ∈ (0;1] assigned by the decision maker to this prediction.ρc

a is interpreted as the decision maker’s prediction about the outcome of a based on a dataset D = (cT ) containing only observations of case c. If c contains the action under considera-tion, c = (a; r), the agent should be persuaded that the payoff of a is r . For large values of T ,his confidence in this prediction should be 1. Hence, ρ

(a;r)a = r and γ

(a;r)a = 0, reflecting the

objective character of the data. In contrast, when c contains an action a′ different from a, thelack of objective information about the correlation between the actions means that the predictionρc

a is subjective and the confidence in it may be less than 1,4 i.e., γ ca > 0. By choosing γ c

a tobe minimal for each a ∈ A and c ∈ C, we implicitly assume that the decision maker adopts theprediction he is most confident in.

The weight assigned to the simplex captures the subjectively perceived ambiguity about theoutcome of a given data set D. It is composed of vanishing ambiguity due to a limited number ofobservations in the data, γT ∈ (0;1) and of persistent ambiguity due to the heterogeneity of casesin the data, captured by the weighted average of the coefficients γ c

a . For the empty data set, thenumber of observations is 0 and the corresponding degree of ambiguity can be set to 1, implyingthat Ha(D∅) coincides with the simplex �|R|−1. In general, while the number of observationsT is small, γT is close to 1 and the impact of ambiguity due to unobservables is relatively small.The sequence γT is strictly decreasing with limT →∞ γT = 0. Hence, as T increases, ambiguitydue to a limited number of observations goes to 0 and all remaining ambiguity can be attributedto data heterogeneity.

3.2. Connection with existing models of ambiguity

The α-MEU approaches in the literature5 differ from ours in two respects. Firstly, they usethe Savage framework which contains states as a primitive concept and derive a single act-independent set of priors over these states of the world. In contrast, the case-based frameworkdoes not specify states of the world, nor are these observables in the data. Hence, in general, un-certainty cannot be dissociated from the actions. Ha(D) is a set of probability distributions overpayoffs for a given action a. It corresponds to a marginal probability distribution over outcomesin the Savage framework obtained by conditioning on the choice of action. Without additionalassumptions which might allow us to specify a state space, there is no sense in which the set ofpriors over outcomes derived in our framework could be constant across actions.

While a general analysis falls outside the scope of this paper, the following remark indicatesfor a special case, how one can introduce a notion of subjective states into our framework.

Remark 3.1 (NEO-additive capacity). Assume that all observations are equally relevant,sa(a

′) = 1 for all a, a′ ∈ A, and that there is no ambiguity about unobservables, γ ca = 0 for

4 [9] provide an example.5 There does not seem to exist an axiomatization for the general α-MEU representation in the Savage framework.

[8,15] are relevant references. A special case has been axiomatized in [6].

Page 10: Ambiguity, data and preferences for information – A case-based approach

1442 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

all a ∈ A, c ∈ C. In this case, one can then define a state space, which coincides with the set ofcases, S = C. The outcome of action a in state sc = c is given by the prediction ρc

a . For a dataset D = (fD;T ), the representation can be written as

V (a;D) = α maxp∈Π(D)

∑c∈C

u(ρc

a

)p(sc) + (1 − α) min

p∈Π(D)

∑c∈C

u(ρc

a

)p(sc),

Π(D) := γT �|C|−1 + (1 − γT ){fD}.This representation is equivalent to Choquet expected utility with respect to a NEO-additivecapacity νD on S = C, defined as in [6] by νD(E) := αγT + (1 − γT )

∑{c|sc∈E} fD(c) for ∅ �=

E � S, ν(S) = 1, ν(∅) = 0. Here, the additive belief on S is frequency-based and given byfD(sc) := fD(c).

The second important distinction between the standard models of ambiguity and our approachlies in the fact that the set of probability distributions over outcomes Ha(D) incorporates objec-tive features of the data set.6 In particular, if the number of observations is large, frequenciesalone determine beliefs. The frequencies of observations in the data can then be taken as abenchmark of unambiguous beliefs.7 Deviations from this benchmark are interpreted as am-biguity perceived by the agent. In absence of any information, this ambiguity is maximal andthe representation coincides with the Hurwicz criterion. The evaluation of an action is then fullydetermined by the degrees of optimism and pessimism, i.e., by α.

We conclude by stating an alternative way of writing the representation in (1), which will beuseful in Section 5.2.

Remark 3.2. Since R is finite, for D ∈ D, the representation in (1) can be rewritten as8

V (a;D) =(

γT + (1 − γT )∑

c∈C γ ca fD(c)sa(ac)∑

c′∈C fD(c′)sa(ac′)

)[αu(r) + (1 − α)u(r)

]

+ (1 − γT )∑c∈C

(1 − γ ca )fD(c)sa(ac)∑

c′∈C fD(c′)sa(ac′)u(ρc

a

). (3)

4. Axioms

We now propose a set of axioms which imply the representation specified above. Theseaxioms can be roughly divided into three categories: Axiom 1 (Complete order), Axiom 2 (In-variance), Axiom 6 (Most preferred and least preferred outcome) and Axiom 7 (Continuity) arestandard in the literature. Axioms 3, 4, 5 and 10 all imply some sort of separability of preferences.The last group of axioms, Axioms 8 and 9, deal with ambiguity and the value of information.Axiom 8 (Neutral outcome) is the key axiom for calibrating the decision maker’s attitude towardsambiguity by determining the evaluation of an action in absence of information. Axiom 9 (De-creasing ambiguity) ensures that perceived ambiguity decreases as the number of observationsgrows.

6 In this sense, our representation resembles the inductive approach to probabilities taken in [5]. For a family of condi-tional preferences over acts, [32] provides an axiomatic foundation for Carnap’s formula. We are grateful to Peter Wakkerfor referring us to this strand of literature.

7 This is similar, e.g., to [14], who choose the Steiner point of the set of priors as a benchmark for ambiguity-neutrality.8 See, e.g., [6, p. 543, Remark 3.2].

Page 11: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1443

Axiom 1 (Complete order). The preference relation � on A ×D∗ is complete and transitive.

Axiom 1 is standard and without it a real-valued representation is impossible. While transitiv-ity seems to be an innocuous assumption, completeness might be too demanding in this setting.In particular, it requires the decision maker to be able to imagine any two hypothetical data sets D

and D′ and to be able to compare the prospects of any two actions a and a′ with respect to thesetwo data sets.9 In a given choice situation, however, only subsets of action-data set pairs may befeasible. The examples in Section 5.2 illustrate this point and demonstrate the applicability ofthis approach.

Axiom 2 (Invariance). For a given T ∈ N, let π be a one-to-one mapping π : {1 . . . T } →{1 . . . T }. Then, for any action a ∈ A and any data set D = ((ct )

Tt=1) ∈DT ,(

a; (ct )Tt=1

) ∼ (a; (cπ(t))

Tt=1

).

This is an exchangeability condition, which implies that the order in which data arrive isirrelevant. It is a standard assumption in case-based theory, see e.g., [3], and ensures that learningfrom data is possible. Axiom 2 implies that for any T ∈ N, an information context D ∈ DT isfully characterized by its length T and the frequency of observations fD . We can thus writeD = (fD;T ). We will use the notation D and (f ;T ) interchangeably for data sets in D. Note,however, that a combination (f ;T ) defines a data set only if f ∈ FT .

Axiom 3 (Betweenness for sets of equal length). For any a ∈ A, T , T ′ ∈ N, f , f ′ ∈ FT ∩ FT ′,

if (a; (f ;T ))�

(∼)(a; (f ′;T )), then (a; (f ;T ′)) �

(∼)(a; (f ′;T ′)), and for any μ ∈ (0;1) such

that μf + (1 − μ)f ′ ∈ FT ′

(a; (f ;T ′)) �

(∼)

(a; (μf + (1 − μ)f ′;T ′)) �

(∼)

(a; (f ′;T ′)).

Axiom 3 suggests that the length of the data set can be separated from the frequency of obser-vations when evaluating the information context of action a. Intuitively, the frequency determineshow much support the data provide for the choice of a, while the length determines the precisionof the information. Keeping the precision constant across two data sets, they can be ranked basedsolely on their content. Hence, the preference between (a; (f ;T )) and (a; (f ′;T )) should notchange, if the length of the two sets is changed to T ′. Moreover, since the linear combination ofthe two frequencies μf + (1 − μ)f ′ contains the preferred evidence from f and the less pre-ferred one from f ′ in proportions μ and (1 − μ), μf + (1 − μ)f ′ should be ranked between f

and f ′ as long as the length of all three data sets is equal.

Axiom 4 (Independence). For all T , T ′ ∈ N, all a and a′ ∈ A, all f1, f2 ∈ FTa , f ′

1, f ′2 ∈ FT ′

a′ and

all μ ∈ (0;1] such that μf1 + (1 − μ)f2 ∈ FTa and μf ′

1 + (1 − μ)f ′2 ∈ FT ′

a′(a; (f1;T )

) �(≺)

(a; (f ′

1;T ′)),(a; (f2;T )

) �(�)

(a; (f ′

2;T ′)) (4)

9 Similarly, in the Savage framework, the decision maker has to express preferences over acts he might never be ableto afford, or acts whose consequences contradict the state, see [28]. To avoid such absurd examples, the sets of states andconsequences, respectively, the set of cases have to be carefully chosen.

Page 12: Ambiguity, data and preferences for information – A case-based approach

1444 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

implies

(a; (μf1 + (1 − μ)f2;T

)) �(≺)

(a; (μf ′

1 + (1 − μ)f ′2;T ′)) (5)

and if (a; (f2;T )) ∼ (a; (f ′2;T ′)), then the two statements (4) and (5) are equivalent.

If the evidence (f1;T ) provides more support for the choice of a than (f ′1;T ′), and if (f2;T )

provides more support for the choice of a than (f ′2;T ′), then the convex combination of the fre-

quencies f1 and f2 should give stronger support for the choice of a than the convex combination(with the same coefficient μ) of f ′

1 and f ′2. This is justified since (f1;T ) and (f2;T ) have the

same length and contain only observations of the same action a. If cases containing the obser-vation of the same action are considered equally relevant, regardless of the outcomes observed,in the evaluation of (μf1 + (1 − μ)f2;T ), the weight on the evidence from (f1;T ) should be μ

and that of (f2;T ) should be (1 − μ). The same argument applies to (f ′1;T ′) and (f ′

2;T ′). Thisaxiom seems reasonable if the decision maker does not consistently overweigh or underweighthe evidence from cases based on the observed outcomes.

Axiom 5 (Action-independent evaluation of outcomes). For all a, a′ ∈ A, (a;D∅) ∼ (a′;D∅)

and for all T ∈ N, f ∈ FTa , f ′ ∈ FT

a′ such that for all r ∈ R, f (a; r) = f ′(a′; r),(a; (f ;T )

) ∼ (a′; (f ′;T ))

.

Axiom 5 requires that preferences on actions depend only on the available observations. If twoactions have performed identically for the same number of periods, their evaluation is the same.In absence of information, D = D∅, the decision maker should be indifferent among all actions.Hence, the evaluation of payoffs can be separated from the action they have resulted from.

Axiom 6 (Most preferred and least preferred outcome). There exist r and r ∈ R such that for alla ∈ A, (a; (a; r)) � (a;D∅) � (a; (a; r)) and for all c ∈ C,

(a; (a; r)) � (a; c)� (

a; (a; r)).Furthermore, for all a, a′ ∈ A, there are r ′ and r ′′ ∈ R such that

(a; (a; r)) � (

a; (a′; r ′)) � (a; (a′; r ′′))� (

a; (a; r))and at least one of the two weak inequalities is strict.

This axiom postulates that there is a best outcome r such that the observation of the case(a; r), provides the most preferred evidence in favor of choosing a compared to any other casein C or to obtaining no information. The least preferred outcome r is defined symmetrically. ByAxiom 5, r and r will coincide for all actions a ∈ A, and we disregard the potential dependenceon a in the statement of the axiom. The second part of the axiom is a richness condition, whichis not necessary for the representation, but guarantees that the utility function over outcomes isnon-constant and the similarity function is unique.

Remark 4.1. Axioms 3 and 6 imply that for all a ∈ A, and all T ∈ N, (a; (a; r)T ) � (a; (a; r)T )

and for all D ∈ DT , (a; (a; r)T ) � (a;D) � (a; (a; r)T ).

Page 13: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1445

Axiom 7 (Continuity). For all a, a′ ∈ A, D, D′ ∈ D∗, if (a;D) � (a′;D′), there is a ξ > 0 andfor each k ∈N, there exist a k > k and μk

1, μk2 ∈ {0; 1

k! ; . . . ; k!−1k! ;1} with |μk

1 −μk2|� ξ such that

(a;D) �(a; (μk

1δ(a;r) + (1 − μk

1

)δ(a;r); k!))

� (a; (μk

2δ(a;r) + (1 − μk

2

)δ(a;r); k!)) � (

a′;D′).Here, μk

1 and μk2 identify two data sets of length k!, which contain only observations of (a; r)

and (a; r) and which can be nested between (a;D) and (a′;D′). For a given k, choosing μk1 and

μk2 so that their difference is maximized amounts to choosing the best approximations to (a;D)

and (a′;D′) on the set of data sets of length k! containing only (a; r) and (a; r). The axiomrequires that if preferences between (a;D) and (a′;D′) are strict, the difference between theirbest approximations is bounded away from 0.

Axiom 8 (Neutral outcome). There exists an r ∈ R such that for all a ∈ A and all k ∈ N,(a;D∅) ∼ (a; (a; r)k).

Axiom 8 postulates the existence of an outcome r such that the observation of (a; r) is iden-tical to receiving no information about the performance of a. Hence, additional observations of(a; r) do not change the evaluation of a. E.g., r could indicate the missing record of an outcomeof an action, a problem often encountered in empirical analysis, see [27]. This axiom plays a keyrole for calibrating the decision maker’s attitude towards ambiguity. It allows us to identify thedegrees of optimism and pessimism by determining the decision maker’s evaluation of an actionin absence of information.

Axiom 9 (Decreasing ambiguity). For all a ∈ A, k ∈N and all D ∈ D, (a;D) � (a;D∅) implies(a;Dk+1) � (a;Dk) and (a;D∅) � (a;D) implies (a;Dk) � (a;Dk+1).

Axiom 9 captures the idea that ambiguity decreases with increasing number of observations.It establishes the connection between preferences for information precision and the content of adata set. If the choice of a given D is preferred to choosing a absent any information, then thecontent of D provides positive evidence for choosing a. As the number of observations increases,while the frequency remains constant, the information in the data set becomes more precise, thusproviding stronger evidence in favor of a, (a;Dk+1) � (a;Dk). The reverse holds for evidenceconsidered worse than no information at all.

The last axiom will guarantee that the utility over outcomes, the similarity of cases and theperceived correlation between actions do not depend on the length of the data sets.10 Consideran arbitrary action a and a data set D. We can find a sequence of data sets (Dk)k∈N such thateach Dk has a length of k!, contains only observations of (a; r) and (a; r) and the evaluation of(a;Dk) approximates the evaluation of (a;D). Furthermore, the frequencies fDk

(a; r) convergeto a limit value μa

D . Intuitively, the repeated observation of cases (a; r) and (a; r) represents anoutcome of a statistical experiment with respect to a. As the number of observations k! grows,

10 The first two properties are standard in case-based theory, see [16]. The last property of case-based beliefs wasintroduced in [3]. One may question these properties, compare the discussion in [9,16] but it makes sense to impose themon our representation in order to make it comparable to most of the case-based literature.

Page 14: Ambiguity, data and preferences for information – A case-based approach

1446 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

the ambiguity caused by a limited number of observations in Dk vanishes. The limit predictionabout a associated with the sequence of data sets (Dk)k∈N should therefore assign a probabilityμa

D to r and (1 − μaD) to r . Hence, (up to normalization), the utility assigned to (a;D) should

equal μaD . We call the so-defined μa

D the unambiguous equivalent of (a;D). A formal definitionof μa

D is given in Appendix A.

Remark 4.2. Under Axioms 1–9, the function V (a;D) =: μaD can be used to represent � on

A ×D∗, see Lemma A.2 in Appendix A.

Consider three data sets with equal number of observations T , and frequencies, f , f ′ andf ′′ ∈ FT such that (a; (f ;T )) � (a; (f ′′;T )) � (a; (f ′;T )). The corresponding unambiguousequivalents satisfy μa

(f ;T )> μa

(f ′′;T )> μa

(f ′;T ). This allows us to state:

Definition 4.1. For any a ∈ A, any T ∈ N and any three frequencies f , f ′ and f ′′ ∈ FT suchthat (a; (f ;T )) � (a; (f ′′;T )) � (a; (f ′;T )), the coefficient λ(f ;f ′;f ′′;T ) ∈ (0;1) is definedby

λ(f ;f ′;f ′′;T )

μa(f ;T ) + (

1 − λ(f ;f ′;f ′′;T ))

μa(f ′;T ) = μa

(f ′′;T ).

The so-defined coefficient λ(·) has different meanings depending on the specific frequenciesof the three data sets. For data sets with frequencies which are averages of other frequencies,f ′′ = ηf + (1 − η)f ′ for some η ∈ (0;1), the weight λ(f ;f ′;f ′′;T ) reflects the relative simi-larity between the action under consideration a and the different actions observed in the two datasets (f ;T ) and (f ′;T ). In particular, if f = δ(a;r), f ′ = δ(a′;r) and f ′′ = 1

2δ(a;r) + 12δ(a′;r), the

similarity between a and a′ is the weight put on the evidence from the observation of a′ in theevaluation of a and is given by sa(a

′) = 1−λ(f ;f ′;f ′′;T )λ(f ;f ′;f ′′;T )

.For three data sets, each of which contains only observations of a with a single outcome,

e.g., f = δ(a;r), f ′ = δ(a;r) and f ′′ = δ(a;r), λ(δ(a;r); δ(a;r); δ(a;r);T ) represents the evaluation ofoutcome r relative to the best and the worst outcome. Normalizing the utility of r to 1 and thatof r to 0, the utility of r is given by λ(δ(a;r); δ(a;r); δ(a;r);T ).

Finally, for three data sets with frequencies f = δ(a;r), f ′ = δ(a;r) and f ′′ = δ(a′;r ′) witha′ �= a, λ(δ(a;r); δ(a;r); δ(a′;r ′);T ) represents the evaluation of action a given the observation(a′; r ′). Hence, it reflects the perceived correlation between the outcomes of a and a′.

Axiom 10 (Length independence). Let a ∈ A. Suppose that for some f , f ′ and f ′′ ∈ FT ,(a; (f ;T )) � (a; (f ′′;T )) � (a; (f ′;T )) and

(i) either ηf + (1 − η)f ′ = f ′′ for some η ∈ (0;1),(ii) or f = δ(a;r), f ′ = δ(a;r) and f ′′ = δc for some c ∈ C.

Then, for any T ′ such that f , f ′ and f ′′ ∈ FT ′, λ(f ;f ′;f ′′;T ) = λ(f ;f ′;f ′′;T ′).

Axiom 10 requires11 that the coefficient λ(f ;f ′;f ′′;T ) is independent of T . Consider firstcase (i). Intuitively, the relevance of a case for the evaluation of an action is based on some a

11 It is possible to state a behavioral version of Length Independence, which does not make use of Definition 4.1 and isequivalent to Axiom 10, see [11, Appendix B] for details.

Page 15: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1447

priori information, which is encoded in the structure of the action set A and which cannot belearned from the data. Hence, λ(f ;f ′;f ′′;T ) should not depend on the number of observations.In case (ii), if f = δ(a;r), f ′ = δ(a;r) and f ′′ = δ(a′;r ′), the weight λ(δ(a;r); δ(a;r); δ(a′;r ′);T )

reflects the perceived utility of an outcome relative to the best and the worst outcome (if a = a′)or a perceived correlation between the realizations of a and a′ (if a �= a′), neither of whichcan be learned from the data. Hence, in this case λ(δ(a;r); δ(a;r); δ(a′;r ′);T ) is also a subjectivecharacteristic of the decision maker which should not depend on the length of the data set T .

Showing independence of the axioms is straightforward12 for Axioms 1, 3–9 and (a behavioralversion of) Axiom 10. Axiom 2 (Invariance) implies that only the frequency and the length ofthe data set determine preferences. It allows us to use the notation (f ;T ) to write the rest ofthe axioms in a more concise form. In this sense, the rest of the axioms are not independent ofAxiom 2.

5. The representation theorem

Axioms 1–10 imply the desired representation for the case of more than three outcomes,13

|R| > 3.

Theorem 5.1. Let |R| > 3. A preference relation � on A ×D∗ satisfies Axioms 1–10 if and onlyif it can be represented by V (a;D) as in (1), where for all a ∈ A, Ha(D∅) = �|R|−1 and fora given action a and a data set D ∈ D with frequency fD and length T , the set of probabilitydistributions Ha(D) is defined as in (2).

The elements of the representation satisfy the following conditions:

(i) u is unique up to affine-linear transformations;(ii) ρ is unique up to indifference14 and satisfies ρ

(a;r)a = r for all a ∈ A and all r ∈ R;

(iii) the functions sa are unique up to a multiplication by a positive number;(iv) α ∈ [0;1] is unique and for all a ∈ A, V (a;D∅) = u(r) = αu(r) + (1 − α)u(r), where r is

the best, r is the worst and r is the neutral outcome;(v) the sequence (γT )T ∈N is unique, strictly decreasing with γT ∈ (0;1) and limT →∞ γT = 0;

(vi) the minimal coefficients γ ca are unique and satisfy γ

(a;r)a = 0 for all a ∈ A and r ∈ R;

(vii) for all a, a′ ∈ A, there are r ′ and r ′′ ∈ R such that V (a; (a; r)) � V (a; (a′; r ′)) >

V (a; (a′; r ′′)) � V (a; (a; r)) and at least one of the weak inequalities is strict.

Theorem 5.1 yields the general representation proposed in Section 3. A slight modificationof Axiom 3 allows us to represent preferences of a decision maker who does not perceive anyambiguity due to the limited number of observations.

Remark 5.1. Extending Axiom 3 (Betweenness) to hold for data sets of arbitrary lengths togetherwith Axioms 1, 2 and 4–7 implies the representation of Theorem 5.1 with γT = 0 for all T .

12 Proofs are available from the authors upon request.13 This condition is not necessary for the representation and, hence, does not restrict the application of our model.Combined with Axiom 6, it ensures that the similarity function is unique. We can also prove the statement of the maintheorem for |R| = 3, and a somewhat more restrictive assumption on the preference order. Details are available from theauthors upon request.14 Lemma A.6 and its proof provide more details.

Page 16: Ambiguity, data and preferences for information – A case-based approach

1448 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

Furthermore, assuming that for each a ∈ A and c ∈ C, there exists a ρca ∈ R such that (a; c) ∼

(a; (a;ρca)) implies the representation of Theorem 5.1 with γ c

a = 0 for all a ∈ A and c ∈ C.The resulting representation is expected utility maximization with respect to beliefs given bysimilarity-weighted frequencies as in [3], see [11] for details.

5.1. Preferences for more precise information

The novel aspect of our approach is the domain of preferences: the decision maker can rankinformation contexts in which a given action is chosen. Increasing the length of the data set,while keeping frequencies unchanged increases the precision of information in an objectivesense. More precise information, i.e., less ambiguity, must not necessarily be desirable: “Thereare basically two types of people. There are “want-to-knowers” and there are “avoiders”. Thereare some people who, even in the absence of being able to alter outcomes, find information ofthis sort beneficial. The more they know, the more their anxiety level goes down. But there areothers who cope by avoiding, who would rather stay hopeful and optimistic and not have theunanswered questions answered” [21, p. 234].

We say that i values information precision more than j if, whenever j prefers to obtain alonger data set to a shorter one with the same frequency of observations, so does i:

Definition 5.1. Consider two preference relations �i and �j on A × D∗ which satisfyAxioms 1–10 and have identical15 utility functions u, similarity functions, s, prediction func-tions ρ and coefficients of perceived ambiguity γ c

a . We say that i values information precisionmore than j if for every a ∈ A, D ∈D, k ∈N, (a;Dk) �j (a;D) implies (a;Dk) �i (a;D).

Definition 5.1 is equivalent to i being more pessimistic than j , i.e. αi � αj :

Proposition 5.2. i values information precision more than j if and only if αi � αj .

Intuitively, a pessimist (optimist) overweighs the probability of the worst (best) outcome,relative to its frequency in the data. As the number of observations increases, the weight assignedto the extreme outcomes diminishes. Hence, controlling for the frequency of observations, themore pessimistic a decision maker is, the stronger his preference for longer data sets.

5.2. Examples resumed

In this section we reconsider the examples introduced in Section 2 and show how the repre-sentation derived in Theorem 5.1 can be applied to these decision problems.

Example 1 ((Resumed) Betting on a draw from an urn). An important characteristic of thisexample is that counterfactuals are observable, i.e., the observation of the outcome of a given bet,say (aw;1), uniquely identifies the color of the ball drawn from the urn (white), and with this the(counterfactual) outcome of the other bet, ab, 0. Hence, all observations are equally relevant forthe prediction, sai

(aj ) = 1 for all ai , aj ∈ A; the unambiguous predictions ρcai

satisfy

15 This would be the case if for any a ∈ A and any D and D′ ∈ DT for some T ∈ N, we have that (a;D) �i (a;D′) if

and only if (a;D) �j (a;D′) and if λi(f ;f ′;f ′′) = λj (f ;f ′;f ′′) for any three frequencies satisfying the conditionsof Axiom 10 [11, Lemma 5.1].

Page 17: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1449

ρcai

={

1 for c = (ai;1) or c = (aj ;0),

0 for c = (ai;0) or c = (aj ;1)for ai, aj ∈ A, ai �= aj ;

and there is no persistent ambiguity, γ ca = 0 for all a ∈ A and c ∈ C. Just as in case 1 in Sec-

tion 3.2, beliefs are centered around the empirical frequencies and converge to them as thenumber of observations becomes large. Hence, for D1 and D2 described in Tables 1 and 2, andfor a given degree of optimism α, we obtain

V (ab;Di) = V (aw;Di) = γTi

[αu(1) + (1 − α)u(0)

] + (1 − γTi)u(1) + u(0)

2.

Since γ10 > γ300, for α < 12 , we obtain the typical Ellsberg preferences, V (aw;D2) =

V (ab;D2) > V (aw;D1) = V (ab;D1). Whenever the decision maker is predominantly pes-simistic, he chooses to bet on the less ambiguous urn, regardless of the bet.

This example shows how the Ellsberg paradox can be extended to various degrees of informa-tion precision. The notion of a data set allows us to capture the “amount, quality, and unanimityof information” [12, p. 657] and incorporate them into the beliefs. The parameter α can then beused to characterize behavior in face of imprecise information.

The example further illustrates an important difference between our approach and theBayesian updating of a prior. The decrease in ambiguity due to additional information can forcethe evaluations of two perfectly negatively correlated actions, aw and ab , to move in the samedirection, strictly increasing (decreasing) if the decision maker is pessimistic (optimistic). Thiscannot happen in a Bayesian framework, in which incoming information considered to be astrictly positive signal for aw , will necessarily be interpreted as a strictly negative signal for ab .

Example 2 ((Resumed) Loan market). Assume that all agents have identical utility functions overoutcomes, u(·) and, for a data set of a given length T , the same degree of perceived ambiguity γT .Denoting an entrepreneur by E and a lender by L, the degree of optimism for an agent of typei ∈ {E;L} is given by αi , where αi = 1 stands for a pure optimist, and αi = 0 for a pure pessimist.Let there be just two outcomes: R = {r; r} with r > r . We consider standard credit contracts: foran agreed repayment qj in market j ∈ {1;2}, and a realization of the project r , the payoff ismin{qj ; r} for the lender and max{r − qj ;0}, for the entrepreneur.

The impact of information precision: We first focus on the impact of information precision onthe choice of a market. Assume that there is only one type of project. Assume as well thatthe empirically observed frequency of success is identical for both markets, but there are lessobservations for the emerging market than for the established one, i.e., D1 = (fD1;T1) and D2 =(fD2;T2) are such that fD1 = fD2 = f and T1 > T2. Similarly to Example 1, we can show thatif the cost of credit q is identical in both markets, a purely pessimistic agent (αi = 0) will preferto invest in the more informative market 1, whereas a pure optimist (αi = 1) will prefer theemerging market 2, for any value of f .

It is then easy to show that: (i) if both entrepreneurs and lenders are pure pessimists, for anyvalue of the frequency f , in equilibrium, investment occurs only in the established market 1and there are no transactions in the emerging market; (ii) if entrepreneurs are purely optimistic,whereas lenders are purely pessimistic, in equilibrium, investment occurs in both markets, butthe cost of credit is lower in the more informative established market 1.

Page 18: Ambiguity, data and preferences for information – A case-based approach

1450 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

The impact of similarity: We now assume that the data set of the established market 2 containsinformation about two types of projects, an old technology, aO , used in the past and a newtechnology, aN , which has become available only recently. To isolate the impact of similarityon the choice of a market, we assume that both D1 and D2 contain T observations of aN withthe same frequency of success fN(r). However, D1 also contains T observations of aO with afrequency of success given by fO(r). We set saN

(aO) = s ∈ (0;1) and γ(aO ;r)aN

= γ for r ∈ {r; r}.The cost of credit is assumed to be equal to q in both markets. Using Remark 3.2 to write theevaluations of the agent for (aN ;D1) and (aN ;D2), and normalizing u(0) = 0, it is easy to seethat the a purely pessimistic entrepreneur prefers to invest in aN in market 1 if

(1 − γ )[fO(r)u(min{ρ(aO ;r)aN

− q;0}) + (1 − fO(r))u(min{ρ(aO ;r)aN

− q;0})]fN(r)u(r − q)

� (1 − γT )s + (γ2T − γT )

(1 − γ2T )s.

Since the r.h.s. of this inequality is always smaller than 1, market 1 is preferred, if

– either the l.h.s. is larger than 1, i.e., the expected utility of aN predicted based on the in-formation about aO and discounted by the degree of confidence in this prediction (1 − γ )

exceeds the expected utility based on the prediction from the directly relevant data about aN ;– or, the l.h.s. is smaller than 1, but the similarity between aO and aN , s, is sufficiently small,

(or) the perceived level of ambiguity in the emerging market 2, γT , is sufficiently large, and(or) the perceived level of ambiguity in the established market 1, γ2T is sufficiently small.

It follows that available information about similar alternatives might make a market moreattractive for investors through two channels: either by providing positive evidence for the envi-sioned investment or by substantially reducing perceived ambiguity.

6. Conclusion

In this paper, we analyze decisions informed by data. Introducing preferences on action-data-set pairs allows us to derive an α-MEU representation and to separate the utility over outcomesfrom beliefs represented by sets of probability distributions over outcomes. We identify the sub-jectively perceived degree of ambiguity and separate it from the decision maker’s attitude towardsambiguity as represented by his degrees of optimism and pessimism. We distinguish between twotypes of ambiguity: ambiguity due to a limited number of observations and ambiguity due to het-erogeneity of cases. While the first type of ambiguity decreases as the number of observationsgrows, the second persists even for large data sets. We show how beliefs depend on the perceivedambiguity of information and represent them as a function of the frequency of cases in the dataset, the relevance of each of the cases for the prediction to be made and the unambiguous pre-diction associated with each case. Finally, we relate preferences for information precision to thedecision maker’s degrees of optimism and pessimism.

Assuming that the decision maker can compare pairs of actions and data sets is a novel featureof our model. Our examples show that preferences of this type are in principle observable. As-sumptions about the preference relation can thus be tested in a controlled experiment. Field datafor such preferences can be obtained, e.g., from market participation decisions for markets char-acterized by different information, from observed choices of technologies, for which differentquality and amount of data is available, or from observed decisions about data acquisition.

Page 19: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1451

The value added by a new model of decision making depends foremost on its applicabilityto real-life phenomena and on its ability to generate novel predictions. Our example shows howmarket participation decisions and, thus prices and allocations can be influenced by the infor-mation structure of markets. Extending this example can help understand the evolution of theinformation structure of a market and explain differences in price dynamics across markets. Animportant application of our approach is to learning under ambiguity: [10] apply this frameworkto technology adoption triggered by climate change.

Preferences for information are central to our approach and allow us to evaluate additionalinformation depending on its content and on the subjective characteristics of the decision maker.Hence, our framework can be also used to evaluate the welfare effects of different policies ofinformation provision and to design efficient institutions governing the flow of information.

Appendix A. Proofs

Before proving the main theorem, we give a formal definition for the unambiguous equivalentof (a;D) explained in the text and establish its existence. For some T ∈ N, let D ∈ DT ∪ {D∅}.For every k ∈ N, k � T , define μa

k!(D) and νak!(D) as the frequencies of occurrence of (a; r) in

the data sets of length k! which best approximate D from above and from below. Formally,

μak!(D) =

⎛⎜⎜⎜⎜⎜⎜⎝

μ ∈{

0; 1

k! . . .k! − 1

k! ;1

} ∣∣∣(a; (μδ(a;r) + (1 − μ)δ(a;r); k!)) � (a;D)

and there is no μ′ ∈ {0; 1k! . . .

k!−1k! ;1}

such that(a; (μδ(a;r) + (1 − μ)δ(a;r); k!))

� (a; (μ′δ(a;r) + (1 − μ′)δ(a;r); k!))� (a;D)

⎞⎟⎟⎟⎟⎟⎟⎠

(6)

and, similarly,

νak!(D) =

⎛⎜⎜⎜⎜⎝ν ∈

{0; 1

k! . . .k! − 1

k! ;1

} ∣∣∣(a;D) � (a; (νδ(a;r) + (1 − ν)δ(a;r); k!))and there is no ν′ ∈ {0; 1

k! . . .k!−1k! ;1}

such that(a;D) � (a; (ν′δ(a;r) + (1 − ν′)δ(a;r); k!))

� (a; (νδ(a;r) + (1 − ν)δ(a;r); k!))

⎞⎟⎟⎟⎟⎠ . (7)

We denote the common limit of μak!(D) and νa

k!(D) by μaD and call it the unambiguous equivalent

of data set D with respect to a, or the unambiguous equivalent of (a;D).

Definition 6.1. The unambiguous equivalent of (a;D), μaD , is the common limit of μa

k!(D) andνak!(D) as k → ∞.

Lemma A.1. Under Axioms 1–9, for any D ∈ D∗ and any a ∈ A, the sequences μak!(D) and

νak!(D) converge to a common limit μa

D . Hence, the unambiguous equivalent of (a;D) exists.

Proof. We proceed in three steps. Step 1 proves an intermediate result, which is then used toshow in Step 2 that limk→∞(μa

k!(D) − νak!(D)) = 0 and, in Step 3, that μa

k!(D) converges. Thesetwo statements imply the result of the lemma.

Page 20: Ambiguity, data and preferences for information – A case-based approach

1452 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

Step 1. For any a ∈ A, T ∈ N and μ, μ′ ∈ [0;1] such that μδ(a;r) + (1 − μ)δ(a;r) and μ′δ(a;r) +(1 − μ′)δ(a;r) ∈ FT , μ′ < μ iff(

a; (μδ(a;r) + (1 − μ)δ(a;r);T)) � (

a; (μ′δ(a;r) + (1 − μ′)δ(a;r);T

)). (8)

Proof. Axioms 3 and 6 imply that for all T ∈ N, (a; (a; r)T ) � (a; (a; r)T ). Axiom 4 then im-plies that for any μ ∈ (0;1],(

a; (μδ(a;r) + (1 − μ)δ(a;r);T)) � (

a; (δ(a;r);T )). (9)

Let first μ′ < μ. For μ′ = 0, (9) is equivalent to (8). If μ′ ∈ (0;μ), Axiom 4 gives(a; (μδ(a;r) + (1 − μ)δ(a;r);T

))�

(a;

(μ′

μ

(μδ(a;r) + (1 − μ)δ(a;r)

) +(

1 − μ′

μ

)δ(a;r);T

)),

which is equivalent to (8). Now assume (8) and note that by the argument above μ′ � μ implies(a; (μ′δ(a;r) + (

1 − μ′)δ(a;r);T))

�(a; (μδ(a;r) + (1 − μ)δ(a;r);T

)),

a contradiction. Hence, μ′ < μ. �Step 2. For any a ∈ A, any T ∈ N and any D ∈DT ∪ {D∅}, limk→∞(μa

k!(D) − νak!(D)) = 0.

Proof. By Axioms 6 and 9, sequences (μak!(D)) k∈N

k�Tand (νa

k!(D)) k∈Nk�T

satisfying (6) and (7)

exist. Step 1 allows us to rewrite (6) and (7) as

μak!(D) = min

μ∈{0; 1k! ...

k!−1k! ;1}

∣∣ (a; (μδ(a;r) + (1 − μ)δ(a;r); k!)) � (a;D)

},

νak!(D) = max

ν∈{0; 1k! ...

k!−1k! ;1}

∣∣ (a;D)�(a; (νδ(a;r) + (1 − ν)δ(a;r); k!))}. �

By Step 1, unless μak!(D) = νa

k!(D), it must be that μak!(D) − νa

k!(D) = 1k! . Hence, μa

k!(D) −νak!(D) � 1

k! , implying limk→∞(μak!(D) − νa

k!(D)) = 0. �Step 3. For any a ∈ A, any T ∈ N and any D ∈DT ∪ {D∅}, (μa

k!(D)) k∈Nk�T

converges.

Proof. By Axiom 8, there is an outcome r such that (a; (a; r)) ∼ (a; (a; r)n) for any n ∈ N.Suppose that (a;D)� (a; (a; r)). By (6),(

a; (μak!(D)δ(a;r) + (

1 − μak!(D)

)δ(a;r); k!)) � (a;D) �

(a; (a; r)).

By Axiom 9, this implies(a; (μa

k!(D)δ(a;r) + (1 − μa

k!(D))δ(a;r); (k + 1)!))

�(a; (μk!(D)δ(a;r) + (

1 − μk!(D))δ(a;r); k!)).

Since μa(k+1)!(D) ∈ {0; 1

(k+1)! . . .(k+1)!−1(k+1)! ;1}, by Step 1 we obtain μa

(k+1)!(D) � μak!(D). Hence,

the sequence (μak!(D)) k∈N is bounded and decreasing and, thus, it converges. For the case

k�T

Page 21: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1453

(a; (a; r)) � (a;D), a symmetric argument shows that νak!(D) converges and, by Step 2, so does

μak!(D). �

Proof of Theorem 5.1. It is straightforward to verify necessity of the axioms. Sufficiency isproved in four consecutive lemmas. Lemma A.2 shows that the axioms imply a utility represen-tation

V (a;D) = u · ha(D).

Here, u : R → R is a utility function over outcomes. h : A × D∗→�|R|−1 is any selection of amaximal with respect to set inclusion correspondence H : A × D∗ ⇒ �|R|−1, with the propertythat u · ha(D) = u · h′

a(D) for all ha(D) and h′a(D) ∈ Ha(D). In Lemma A.3, we use the result

proved in [9] to show that V (a;D) on A ×D can be expressed as

V (a;D) = u ·∑c∈C

sa(ac)fD(c)ha(cT )∑

c′∈C sa(ac′)fD(c′),

where T is the length of D and where for a ∈ A, sa : A →R++ is a family of similarity functionsacross actions, each of which is unique up to a multiplication by a positive number and ha(c

T )

is any element of Ha(cT ). In Lemma A.5, we identify the coefficients α and (γT )T ∈N and show

that

Ha

(cT

) = {h ∈ �|R|−1

∣∣ u · h = u · [γT

(αδr + (1 − α)δr

) + (1 − γT )hca

]},

where hca ∈ limT →∞ Ha(c

T ) and h(a;r)a = δr for all c ∈ C, a ∈ A and r ∈ R. In Lemma A.6, we

identify the prediction function ρ and the minimal coefficients of ambiguity γ ca and show that

u · hca = u · [γ c

a

(αδr + (1 − α)δr

) + (1 − γ c

a

)δρc

a

].

We then combine all four steps to show that Ha(D) can be chosen so as to have the desiredstructure in (2) and derive the representation in (1). �Lemma A.2. The preference relation � on A ×D∗ can be represented by a utility function

V (a;D) = u · ha(D)

where u : R →R is a utility function over outcomes and h : A ×D∗→�|R|−1 is any selection ofa maximal with respect to set inclusion correspondence H : A ×D∗ ⇒ �|R|−1 with the propertythat u · ha(D) = u · h′

a(D) for all ha(D) and h′a(D) ∈ Ha(D) and δr ∈ H (a;D∅) for all a ∈ A.

Proof. We proceed to prove the lemma in 4 steps. In Step 1, we define the function V usingthe unambiguous equivalents μa

D . In Step 2, we demonstrate that the so-defined V represents �.In Step 3, we elicit the utility function over outcomes u. In Step 4, we construct the correspon-dence H .

Step 1. Define the function V : A × D∗→[0;1] as V (a;D) =: μaD . By Lemma A.1 the unam-

biguous equivalents μaD ∈ [0;1] are well defined and so is the function V .

Remark A.1. Note that by Definition 6.1, limT →∞ μa(a;r)T = 1 and limT →∞ μa

(a;r)T = 0. Hence,

the definition of V (a;D) implies limT →∞ V (a; (a; r)T ) = 1 and limT →∞ V (a; (a; r)T ) = 0.

Page 22: Ambiguity, data and preferences for information – A case-based approach

1454 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

Step 2. The function V defined in Step 1 represents �.

Proof. To see that the function V represents �, consider two actions a and a′ ∈ A, and for T andT ′ ∈N, two data sets D ∈DT ∪ {D∅} and D′ ∈DT ′ ∪ {D∅}. Let T = max{T ;T ′}. By Axiom 5,for all k � T :(

a′; (μa′k!

(D′)δ(a′;r) + (

1 − μa′k!

(D′))δ(a′;r); k!))

∼ (a; (μa′

k!(D′)δ(a;r) + (

1 − μa′k!

(D′))δ(a;r); k!)). (10)

Suppose (w.l.o.g.) that (a;D)� (a′;D′). Then, (6) together with (10) implies(a; (μa

k!(D)δ(a;r) + (1 − μa

k!(D))δ(a;r); k!))

�(a; (μa′

k!(D′)δ(a;r) + (

1 − μa′k!

(D′))δ(a;r); k!)).

By Step 1 of Lemma A.1, this is equivalent to μak!(D) � μa′

k!(D′) for all k � T . Hence, by

Lemma A.1 and by the definition of V ,

V (a;D) = μaD = lim

k→∞μak!(D) � lim

k→∞μa′k!

(D′) = μa′

D′ = V(a′;D′).

Now suppose that V (a;D)� V (a′;D′), or μaD � μa′

D′ . If μaD > μa′

D′ , there is a k such that(a; (μa

k!(D)δ(a;r) + (1 − μa

k!(D))δ(a;r); k!))

� (a′; (μa′

k!(D′)δ(a′;r) + (

1 − μa′k!

(D′))δ(a′;r); k!)).

Assuming that (a′;D′) � (a;D) together with (10) contradicts (6) and we conclude (a;D) �(a′;D′).

Let now μaD = μa′

D′ and suppose that (a′;D′) � (a;D). By Axiom 7, there exists a ξ > 0 such

that for each k, there is a k > k and μk1, μk

2 ∈ {0; 1k! . . .

k!−1k! ;1} with |μk

1 − μk2| � ξ such that

(a;D)�(a; (μk

1δ(a;r) + (1 − μk

1

)δ(a;r); k!)) � (

a; (μk2δ(a;r) + (

1 − μk2

)δ(a;r)

); k!)�

(a′;D′).

Combining this with (8), (6) and (7), we obtain νa

k!(D) � μk1 � μk

2 + ξ � μa′k!(D

′)+ ξ . Hence, for

each k, there is a k > k such that νa

k!(D) − μa′k!(D

′) � ξ , in contradiction to the assumption that

limk→∞ νak!(D) = limk→∞ μa′

k!(D′) = μa

D = μa′D′ . It follows that (a;D)� (a′;D′) if μD

a = μD′a′ .

We conclude that (a;D) � (a′;D′) if and only if V (a;D)� V (a′;D′). �Step 3. Eliciting the function u : R → R.

For given a ∈ A, T ∈ N and r ∈ R, let λr =: λ(δ(a;r); δ(a;r); δ(a;r);T ), see Definition 4.1, andthus

V(a; (a; r)T ) = λrV

(a; (a; r)T ) + (1 − λr)V

(a; (a; r)T )

. (11)

Such a λr ∈ [0;1] exists by Axiom 6. By Axioms 5 and 10, λr is independent of a and T andis thus well defined. Define u : R → R by u(r) =: λr for all r ∈ R. Obviously, u(r) = 1 andu(r) = 0. By Remark A.1, taking T → ∞ on both sides of (11), we obtain for all r ∈ R:

lim V(a; (a; r)T ) = u(r). (12)

T →∞

Page 23: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1455

Note that by Axiom 8, V (a;D∅) = u(r) for all a ∈ A.

Step 4. Identifying the correspondence H : A ×D∗ ⇒ �|R|−1.

For any a ∈ A and D ∈D∗, let ha(D) ∈ �|R|−1 be such that

u · ha(D) = μaD, (13)

where u is the function defined in Step 3. Note that such an ha(D) exists for all a and D,e.g., ha(D) with ha(D)(r) = μa

D , ha(D)(r) = (1 − μaD) satisfies (13). The set of all ha(D)

satisfying (13) is

Ha(D) =: {h ∈ �|R|−1∣∣ u · h = μa

D

}. (14)

The correspondence H : A × D∗ ⇒ �|R|−1 satisfies the properties in the statement of thelemma. �Lemma A.3. The preference relation � on A ×D can be represented by

V (a;D) = u ·∑c∈C

sa(ac)fD(c)ha(cT )∑

c′∈C sa(ac′)fD(c′), (15)

for some and thus all ha(cT ) ∈ Ha(c

T ), where T is the length of D, Ha(cT ) are defined as

in (14) and (sa : A → R++)a∈A is a family of similarity functions, each of which is unique up toa multiplication by a positive number.

Proof. We proceed in 4 steps. In Step 1, we construct a correspondence P : A × D⇒ �|R|−1

such that for all D ∈ D, u · p = u · p′ =: V (a;D) for all p and p′ ∈ Pa(D) and V (a;D) repre-sents �. In Step 2, we show that the so constructed correspondence P satisfies a list of properties(B1)–(B5) stated below. In Step 3, we restate a theorem which appears in [9] and which impliesthat a correspondence P : A ×D⇒ �|R|−1 satisfying properties (B1)–(B5) can be representedas

Pa(D) =∑c∈C

sa(c)fD(c)Pa(cT )∑

c′∈C sa(c′)fD(c′), (16)

where T is the length of D and sa : A × R → R++ is a family of similarity functions, each ofwhich is unique up to a multiplication by a positive number. In Step 4, we show that all sa areindependent of r . Furthermore, for each a ∈ A, (sa(ac))c∈C are the unique up to a multiplicationby a positive number Ka > 0 coefficients such that for each D ∈ DT , μa

D can be represented as

μaD =

∑c∈C μa

cTsa(ac)fD(c)∑

c′∈C sa(ac′ )fD(c′) . This result combined with the definitions of V and the sets Ha(cT )

in (14) proves the lemma.

Step 1. Note that by Axiom 9, for any a ∈ A, c ∈ C, the sequence (μacT )T ∈N is either constant or

strictly monotonic. Since μacT ∈ [0;1], it follows that the limit limT →∞ μa

cT exists.For a given a ∈ A, let C∗

a =: {c ∈ C | (a; c) � (a; (a; r)) and (a; c) � (a; (a; r))}. By Ax-iom 6, C∗

a is non-empty. By Axiom 6 and Remark A.1, we can choose an ε ∈ (0;1) such that

ε< min ∗ min

{μa

c ;(1 − μa

c

); lim μacT ;

(1 − lim μa

cT

)}.

2 a∈A, c∈Ca T →∞ T →∞

Page 24: Ambiguity, data and preferences for information – A case-based approach

1456 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

Define the set P ∗ as P ∗ =: {p ∈ �|R|−1 | u · p = 12 } and for each D ∈ D, define

Pa(D) =: εP ∗ + (1 − ε)[μa

D{δr} + (1 − μa

D

){δr}]. (17)

The sets Pa(D) ⊂ �|R|−1 are non-empty, convex and compact and each Pa(D) is a translationof the set εP ∗ + (1 − ε){δr } with the property that u · p = (1 − ε)μa

D + ε2 for all p ∈ Pa(D).

Hence, the function V defined by V (a;D) =: u · p for some and thus, all p ∈ Pa(D), satisfiesV (a;D) = (1 − ε)V (a;D) + ε

2 for all a ∈ A and D ∈D, and, thus, represents � on A ×D.

Step 2. Fix a ∈ A and consider the projection of the correspondence P on D for this a, Pa(D) :D⇒ �|R|−1. Pa(D) satisfies the following properties:

(B1) Pa(D) depends only on the frequency and the length of D, but not on the order of casesin D.

(B2) Let (fi)ni=1 and f be in FT . Whenever

∑ni=1 ηifi = f , for some ηi ∈ (0;1),

∑ni=1 ηi =

1, there are coefficients (λi)ni=1 ∈ (0;1) with

∑ni=1 λi = 1 such that Pa(f ;T ) =∑n

i=1 λiPa(fi;T ).(B3) Under the conditions listed in (B2), Pa(f ;T ) = ∑n

i=1 λiPa(fi;T ) if and only if

Pa(f ; T ) = ∑ni=1 λiPa(fi; T ) holds for any T ∈ N, such that (fi)

ni=1 and f ∈ F T .

(B4) For all c ∈ C, the sequences (Pa(cT ))T ∈N have a limit, limT →∞ Pa(c

T ).(B5) No three of the sets limT →∞ Pa(c

T ) of dimension 0 or 1 are collinear.

Proof. (B1) follows directly from Axiom 2.(B2) and (B3): Take three data sets D, D′ and D′′ ∈ DT with corresponding frequencies f , f ′

and f ′′ ∈ FT such that ηf + (1 − η)f ′′ = f ′ for some η ∈ (0;1). If (f ;T ) � (f ′′;T ). Axiom 3implies (f ; T ) � (f ′; T ) � (f ′′; T ) for all T ∈N such that f , f ′ and f ′′ ∈ F T . Hence, by Step 2of Lemma A.2, μa

(f ;T )> μa

(f ′;T )> μa

(f ′′;T )and therefore, by Axiom 10, there is a λ(f ;f ′′;f ′) ∈

(0;1) independent of T such that μa(f ′;T )

= λ(f ;f ′′;f ′)μa(f ;T )

+ (1 − λ(f ;f ′′;f ′))μa(f ′′;T )

. If

(f ;T ) ∼ (f ′′;T ), Axiom 3 implies (f ; T ) ∼ (f ′; T ) ∼ (f ′′; T ) for all T ∈ N. Hence, μa

(f ;T )=

μa

(f ′;T )= μa

(f ′′;T )for all T ∈ N and the coefficient λ(f ;f ′′;f ′) can be chosen arbitrarily.

Applying the same reasoning inductively, we obtain that for any frequencies (fi)ni=1 and f

in FT such that∑n

i=1 ηifi = f , for some coefficients ηi ∈ (0;1) with∑n

i=1 ηi = 1, there arecoefficients (λi)

ni=1 ∈ (0;1) with

∑ni=1 λi = 1 such that μa

(f ;T )= ∑n

i=1 λiμa

(fi ;T )for all T ∈ N

such that (fi)ni=1 and f ∈ F T . It follows that for all T ∈N such that (fi)

ni=1 and f ∈ F T ,

Pa(f ; T ) = εP ∗ + (1 − ε){μa

(f ;T )δr + (

1 − μa

(f ;T )

)δr

}

=n∑

i=1

λi

[εP ∗ + (1 − ε)

{μa

(fi ;T )δr + (

1 − μa

(fi ;T )

)δr

}] =n∑

i=1

λiPa(fi; T ).

(B4): By Step 1, limT →∞ μacT exists. Hence, by the definition of Pa(c

T ),

lim Pa

(cT

) = εP ∗ + (1 − ε){

lim μacT δr +

(1 − lim μa

cT

)δr

}. (18)

T →∞ T →∞ T →∞

Page 25: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1457

(B5): Note that since |R| > 3, the set P ∗ is a subset of a hyperplane in �|R|−1 and has adimension 2 or higher and since ε ∈ (0;1), so do the sets in (18) for all c ∈ C. Hence, there areno three sets of dimension 0 or 1 and (B5) is trivially satisfied. �Step 3. In [9], we prove the following theorem:

Theorem A.4. Let Pa be a correspondence Pa : D⇒ �|R|−1 the images of which are non-emptyconvex, and compact sets and which satisfies (B4) and (B5). Then Pa satisfies (B1), (B2) and(B3) if and only if there exists a unique, up to multiplication by a positive number, functionsa : C →R++ such that for all T ∈N and any D ∈DT ,

Pa(D) =∑

c∈C sa(c)fD(c)Pa(cT )∑

c′∈C sa(c′)fD(c′). (19)

Step 4. The similarity functions sa : C → R++ derived in Theorem A.4 do not depend on the ob-served outcomes and can be written as sa : A → R++. Furthermore, for each a ∈ A, (sa(ac))c∈C

are the unique up to a multiplication by a positive number Ka > 0 coefficients such that for each

D ∈ DT , μaD can be represented as μa

D =∑

c∈C μa

cTsa(ac)fD(c)∑

c′∈C sa(ac′ )fD(c′) .

Proof. Take any a, a′ ∈ A. Note that by Axiom 6, for any r ∈ R, there is an r ′ ∈ R such that(a; (a′; r)) � (a; (a′; r ′)). Hence, μa

(a′;r)T �= μa(a′;r ′)T and Pa(a

′; r)T �= Pa(a′; r ′)T . For some

even number T ∈ N, consider a data set D ∈ DTa′ such that fD(a′; r) = fD(a′; r ′) = 1

2 . We willshow that μa

D = ∑r∈{r;r ′} 1

2μa(a′;r)T , which combined with (17) and (19) implies

Pa(D) = sa(a′; r)Pa(a

′; r)T + sa(a′; r ′)Pa(a

′; r ′)T

sa(a′; r) + sa(a′; r ′)= 1

2Pa

(a′; r)T + 1

2Pa

(a′; r ′)T

,

or, sa(a′; r) = sa(a′; r ′). Since for all r ′′ ∈ R\{r; r ′}, we have either (a; (a′; r ′′)) � (a; (a′; r ′)) or

(a; (a′; r ′′)) � (a; (a′; r)), an analogous argument implies that sa(a′; r ′′) = sa(a

′; r) = sa(a′; r ′).

To see that μaD = ∑

r∈{r;r ′} 12μa

(a′;r)T , construct for r ∈ {r; r ′} the sequences (μ∗ak! (a

′; r)T ) k∈Nk�T

and (ν∗ak! (a′; r)T ) k∈N

k�Tas in (6) and (7), however restricting them to be in {0; 2

k! ; 4k! . . .

k!−2k! ;1}

rather than {0; 1k! ; 2

k! . . .k!−1k! ;1}. Note that since for all k � T , μ∗a

k! (a′; r)T − ν∗a

k! (a′; r)T � 2k! ,

the convergence results obtained in Lemma A.1 apply. Since μ∗ak! (a

′; r)T − μak!(a

′; r)T � 1k! ,

limk→∞ μ∗ak! (a

′; r)T = limk→∞ μak!(a

′; r)T = μa(a′;r)T . Furthermore, for each k � T ,∑

r∈{r;r ′} 12μ∗a

k! (a′; r)T ∈ {0; 1

k! ; 2k! . . .

k!−1k! ;1}. Applying Axiom 4, we conclude that(

a;( ∑

r∈{r;r ′}

1

2

(μ∗a

k!(a′; r)T

δ(a;r) + (1 − μ∗a

k!(a′; r)T )

δ(a;r)); k!

))

� (a;D)�(

a;( ∑

r∈{r;r ′}

1

2

(ν∗ak!

(a′; r)T

δ(a;r) + (1 − ν∗a

k!(a′; r)T )

δ(a;r)); k!

)). (20)

Note that

limk→∞

∑′

1

2μ∗a

k!(a′; r)T = lim

k→∞∑

1

2ν∗ak!

(a′; r)T =

∑′

1

2μa

(a′;r)T .

r∈{r;r } r∈{r;r } r∈{r;r }

Page 26: Ambiguity, data and preferences for information – A case-based approach

1458 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

Assume that μaD >

∑r∈{r;r ′} 1

2μa(a′;r)T . Then there is a k ∈ N such that μa

k!(D) � νak!(D) >∑

r∈{r;r ′} 12μ∗a

k! (a′; r)T , in contradiction to (6), (7) and (20). A symmetric argument applies to

the case μaD <

∑r∈{r;r ′} 1

2μa(a′;r)T and we conclude that μa

D = ∑r∈{r;r ′} 1

2μa(a′;r)T .

Since sa does not depend on r , (19) implies Pa(D) = ∑c∈C

sa(ac)fD(c)Pa(cT )∑c′∈C sa(ac′ )fD(c′) , which by

the definition of Pa(D) in (17) is equivalent to μaD = ∑

c∈C

sa(ac)fD(c)μa

cT∑c′∈C sa(ac′ )fD(c′) . The unique-

ness of sa(ac) is established by Theorem A.4. Hence, V on A × D satisfies V (a;D) = u ·∑c∈C sa(ac)fD(c)ha(cT )∑

c′∈C sa(ac′ )fD(c′) for each h ∈ Ha(cT ). �

Lemma A.5. There exist a coefficient α ∈ [0;1] satisfying αu(r) + (1 − α)u(r) = u(r), astrictly decreasing sequence (γT )T ∈N satisfying γT ∈ (0;1) and limT →∞ γT = 0 and a func-tion h : A × C→�|R|−1 with hc

a ∈ limT →∞ Ha(cT ) and h

(a;r)a = δr for all a ∈ A, c ∈ C and

r ∈ R such that

Ha

(cT

) = {h ∈ �|R|−1

∣∣ u · h = u · [γT

(αδr + (1 − α)δr

) + (1 − γT )hca

]}.

The coefficient α and the sequence (γT )T ∈N are unique.

Proof.

Step 1. Define γT =: αT + βT , with αT =: μa(a;r)T and βT =: 1 − μa

(a;r)T . Then, γT > 0, the

sequence (γT )T ∈N is strictly decreasing and converges to 0.

Proof. Observe that by Axioms 6 and 9, (αT )T ∈N, (βT )T ∈N and (γT )T ∈N are decreasing. If(a; (a; r)) � (a; (a; r)), αT , is strictly decreasing and converges to 0, whereas if (a; (a; r)) ∼(a; (a; r)), αT = 0 for all T ∈ N. If (a; (a; r)) � (a; (a; r)), then βT is strictly decreasing andconverges to 0, whereas if (a; (α; r)) ∼ (a; (a; r)), βT = 0 for all T ∈ N. Since by Axiom 6,(a; (a; r)) � (a; (a; r)), (γT )T ∈N is always strictly decreasing, converges to 0 and γT > 0. �Step 2. There is a function h : A × C → �|R|−1 such that for all a ∈ A, r ∈ R, c ∈ C andT ∈N, hc

a ∈ limT →∞ Ha(cT ), h(a;r)

a = δr and Ha(cT ) = {h ∈ �|R|−1 | u ·h = u · [αT δr +βT δr +

(1 − γT )hca]}.

Proof. Consider a case c ∈ C. By Axioms 6 and 10, there is a λc =: λ(δ(a;r); δ(a;r); δc) ∈ [0;1]independent of T such that V (a; cT ) = μa

cT = λcμa(a;r)T + (1 − λc)μ

a(a;r)T = αT + (1 − γT )λc .

By Step 1 of Lemma A.3, we can take limT →∞ on both sides of the equation and obtain λc =limT →∞ μa

cT . Thus, for any hca ∈ {h | u · h = limT →∞ μa

cT } = limT →∞ Ha(cT ),

V(a; cT

) = u · [αT δr + βT δr + (1 − γT )hca

], (21)

and, thus Ha(cT ) = {h ∈ �|R|−1 | u · h = u · [αT δr + βT δr + (1 − γT )hc

a]}. Since by (12), for

every a ∈ A, r ∈ R, δr ∈ limT →∞ Ha((a; r)T ), we can set h(a;r)a = δr . �

Step 3. The ratio αT

αT +βTdoes not depend on T and equals u(r). Defining α := αT

αT +βT, we obtain

Ha

(cT

) = {h ∈ �|R|−1

∣∣ u · h = u · [γT

(αδr + (1 − α)δr

) + (1 − γT )hac

]}.

Page 27: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1459

Proof. By Axiom 8, the sequence (V (a; (a; r)T ))T ∈N is constant and by (21), it can be writtenas

V(a; (a; r)T ) = αT + (1 − αT − βT )u(r) = u(r)μa

(a;r)T + (1 − u(r)

)μa

(a;r)T .

By Step 1 of Lemma A.3, we can take limits on both sides and obtain limT →∞ V (a; (a; r)T ) =V (a; (a; r)T ) = u(r) for all T ∈ N. It follows that: αT + (1 − αT − βT )u(r) = u(r), or

α := αT

(αT + βT )= u(r) ∈ [0;1]. � (22)

Step 4. γT ∈ (0;1) for all T ∈N. The coefficients α and (γT )T ∈N are unique.

Proof. Using (22), we obtain for any a ∈ A, r ∈ R, T ∈ N,

V(a; (a; r)T ) = γT α + (1 − γT )u(r) = γT u(r) + (1 − γT )u(r). (23)

If (a; (a; r)) � (a; (a; r)), Axiom 9 implies that for all T ∈ N, V (a; (a; r)T +1)>V (a; (a; r)T )>

u(r). Since by (12) limT →∞ V (a; (a; r)T ) = u(r), we obtain that for all T ∈N, V (a; (a; r)T ) ∈(u(r);u(r)). It follows that γT ∈ (0;1) for all T ∈ N. The argument for (a; (a; r)) � (a; (a; r))is symmetric. The case in which (a; (a; r)) ∼ (a; (a; r)) for all r ∈ R is excluded by Ax-iom 6.

The uniqueness of α follows immediately from the requirement that for all T ∈N,

V(a; (a; r)T ) = γT

[αu(r) + (1 − α)u(r)

] + (1 − γT )u(r) = u(r)

together with the fact that u(r) > u(r). Once α is determined, the uniqueness of the sequence γT

follows from (23) and the fact that there is at least one r ∈ R such that u(r) �= u(r). �Lemma A.6. There exist minimal coefficients γ c

a ∈ [0;1) and a prediction function ρ : A ×C → R such that for every c ∈ C and every a ∈ A,

u · hca = u · [γ c

a

(αδr + (1 − α)δr

) + (1 − γ c

a

)δρc

a

], (24)

where hca and α are those identified in Lemma A.5. The minimal coefficients γ c

a are unique and

satisfy γ(a;r)a = 0 for all a ∈ A and all r ∈ R. The prediction function is unique up to indifference:

if ρ and ρ are two functions which satisfy (24), (a; (a;ρca)) ∼ (a; (a; ρc

a)) holds for all a ∈ A and

all c ∈ C. Furthermore, ρ(a;r)a = r for all a ∈ A and all r ∈ R.

Proof. For each a, a′ ∈ A define the function ρ(a′;r)a as follows: for r such that (a; (a; r)) �

(a; (a′; r)), let

ρ(a′;r)a =:

{r ∈ R

∣∣∣ (a; (a′; r)) � (a; (a; r)) and there is no r ′ ∈ R such that

(a; (a′; r)) � (a; (a; r ′)) � (a; (a; r))}

, (25)

for (a; (a′; r)) � (a; (a; r)), let

ρ(a′;r)a =:

{r ∈ R

∣∣∣ (a; (a; r)) � (a; (a′; r)) and there is no r ′ ∈ R such that

(a; (a; r)) � (a; (a; r ′)) � (a; (a′; r))}

(26)

Page 28: Ambiguity, data and preferences for information – A case-based approach

1460 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

and for r such that (a; (a′; r)) ∼ (a; (a; r)), let ρ(a′;r)a = r . If for a given case (a′; r) several

outcomes r satisfy condition (25) or (26), set ρ(a′;r)a equal to one of these outcomes.

Now define the coefficients γ(a′;r)a so that they satisfy

γ (a′;r)a lim

T →∞μa(a′;r)T + (

1 − γ (a′;r)a

)lim

T →∞μa

(a;ρ(a′;r)a )T

= limT →∞μa

(a′;r)T . (27)

By the definition of ρ(a′;r)a and Axiom 3, such γ

(a′;r)a ∈ [0;1] always exists. If (a; (a′; r)) �

(a; (a; r)), γ(a′;r)a is unique and satisfies γ

(a′;r)a < 1. If (a; (a′; r)) ∼ (a; (a; r)), set γ

(a′;r)a = 0.

Since h(a′;r)a ∈ limT →∞ Ha(a

′; r)T , (12) and (14) imply

u · h(a′;r)a = γ (a′;r)

a u(r) + (1 − γ (a′;r)

a

)u(ρ(a′;r)

a

)= u · [γ (a′;r)

a

(αδr + (1 − α)δr

) + (1 − γ (a′;r)

a

)δρ

(a′;r)a

]. (28)

To see that the so-defined coefficients γ ca are minimal, suppose that there exists a ρ �= ρ such

that (a; (a; ρca)) � (a; (a;ρc

a)) for some a ∈ A and c ∈ C. Let γ ca be the corresponding set of

coefficients which satisfy (24). Expression (28) implies that ρ and γ ca have to satisfy (27) for

all a ∈ A and c ∈ C. Since (a; (a; ρca)) � (a; (a;ρc

a)), the definition of ρca together with (27)

implies that γ ca > γ c

a . In particular, for ac = a, we have ρ(a;r)a = r and, hence, γ

(a;r)a = 0 for all

r ∈ R. Finally, by (28), if the coefficients γ ca have been chosen to be minimal, ρ

(a′;r)a is unique

up to indifference, i.e., r and r ′ both satisfy the definition of ρ(a′;r)a if and only if u(r) = u(r ′),

or (a; (a; r)) ∼ (a; (a; r ′)). �For a ∈ A and D ∈D, define Ha(D) as

Ha(D) =[γT + (1 − γT )

∑c∈C γ c

a fD(c)sa(ac)∑c′∈C fD(c′)sa(ac′)

]�|R|−1

+ (1 − γT )∑

c∈C(1 − γ ca )sa(ac)fD(c)∑

c′∈C sa(ac′)fD(c′)

×{∑

c∈C(1 − γ ca )fD(c)sa(ac)δρc

a∑c∈C(1 − γ c

a )sa(ac)fD(c)

}(29)

where (sa)a∈A is the family of similarity functions derived in Lemma A.3, (γT )T ∈N is thesequence of perceived degrees of ambiguity derived in Lemma A.5, ρ is the prediction func-tion and γ c

a are the coefficients of perceived ambiguity derived in Lemma A.6. Let Ha(D∅) =�|R|−1.

By Lemma A.2, V represents �. By Lemmas A.3, A.5 and A.6, V on A × D can be writtenas

V (a;D) = u ·∑c∈C

sa(ac)fD(c)∑c′∈C sa(ac′)fD(c′)

[γT

(αδr + (1 − α)δr

)

+ (1 − γT )[γ ca

(αδr + (1 − α)δr

) + (1 − γ c

a

)δρc

a

]]= αu ·

([γT + (1 − γT )

∑c∈C

γ ca

sa(ac)fD(c)∑c′∈C sa(ac′)fD(c′)

]δr

+ (1 − γT )

∑c∈C(1 − γ c

a )sa(ac)fD(c)δρca∑

′ ′

)

c′∈C sa(ac )fD(c )
Page 29: Ambiguity, data and preferences for information – A case-based approach

J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462 1461

+ (1 − α)u ·([

γT + (1 − γT )∑c∈C

γ ca

sa(ac)fD(c)∑c′∈C sa(ac′)fD(c′)

]δr

+ (1 − γT )

∑c∈C(1 − γ c

a )sa(ac)fD(c)δρca∑

c′∈C sa(ac′)fD(c′)

), (30)

whereas by Lemma A.2, V (a;D∅) = u(r) = αu · δr + (1 − α)u · δr .Combining (29) and (30), we obtain V (a;D) = α maxp∈Ha(D) u ·p+(1−α)minp∈Ha(D) u ·p,

which completes the proof of the existence part of the theorem. It remains to verify the unique-ness of the utility function u, which follows immediately from the fact that ρ

(a;r)a = r and, hence,

δr ∈ limT →∞ Ha(a; r)T for all a ∈ A and all r ∈ R. �Proof of Proposition 5.2. Suppose that αi � αj . According to Axiom 9, (a;Dk) �j (a;D) willhold whenever (a;Dk) �j (a; rj ), or whenever

∑c∈C

(1 − γ ca )fD(c)sa(ac)u(ρc

a)∑c′∈C(1 − γ c′

a )fD(c′)sa(ac′)− [

αju(r) + (1 − αj

)u(r)

]> 0. (31)

Since αi � αj , and therefore, [αiu(r) + (1 − αi)u(r)] � [αju(r) + (1 − αj )u(r)], (31) impliesthat (a;Dk) �i (a;D).

Conversely, if for all (a;D), (a;Dk) �j (a;D) implies (a;Dk) �i (a;D), it follows by Ax-iom 9 that whenever (a;D) �j (a; (a; rj )), (a;D) �i (a; (a; ri )). Since the utility functions ofi and j are identical, this implies that u(rj ) � u(ri ). Normalizing u(r) = 1, u(r) = 0 and notingthat u(rj ) = αj � u(ri) = αi , implies the result of the proposition. �References

[1] D. Ahn, Ambiguity without a state space, Rev. Econ. Stud. 71 (2008) 3–28.[2] A. Arad, G. Gayer, Imprecise datasets as a source of ambiguity: a model and experimental evidence, Manage.

Sci. 58 (2012) 188–202.[3] A. Billot, I. Gilboa, D. Samet, D. Schmeidler, Probabilities as similarity-weighted frequencies, Econometrica 73

(2005) 1125–1136.[4] T.F. Bewley, Knightian decision theory: part I, Discussion paper, Cowles Foundation, 1986.[5] R. Carnap, A basic system of inductive logic, in: R. Jeffrey (Ed.), Studies in Inductive Logic and Probability, vol. II,

UCP, Berkeley, 1980, pp. 7–155.[6] A. Chateauneuf, J. Eichberger, S. Grant, Choice under uncertainty with the best and the worst in mind: neo-additive

capacities, J. Econ. Theory 137 (2007) 538–567.[7] Y. Coignard, J.-Y. Jaffray, Direct decision making, in: S. Rios (Ed.), Decision Theory and Decision Analysis: Trends

and Challenges, Kluwer Academic Publishers, Boston, 1994, pp. 81–90.[8] J. Eichberger, S. Grant, D. Kelsey, G. Koshevoy, The α-MEU model: a comment, J. Econ. Theory 48 (2011) 1684–

1698.[9] J. Eichberger, A. Guerdjikova, Case-based belief formation under ambiguity, Math. Soc. Sci. 60 (2010) 161–177.

[10] J. Eichberger, A. Guerdjikova, Technology adoption and adaptation to climate change – a case-based approach,Climate Change Econ. 3 (2012).

[11] J. Eichberger, A. Guerdjikova, Ambiguity, data and preferences for information – a case-based approach, Workingpaper 2012-45, University of Cergy-Pontoise, 2012.

[12] D. Ellsberg, Risk, ambiguity and the savage axioms, Quart. J. Econ. 75 (1961) 643–669.[13] L. Epstein, M. Schneider, Learning under ambiguity, Rev. Econ. Stud. 74 (2007) 1275–1303.[14] Th. Gajdos, T. Hayashi, J.-M. Tallon, J.-C. Vergnaud, Attitude towards imprecise information, J. Econ. Theory 141

(2007) 68–99.[15] P. Ghirardato, F. Maccheroni, M. Marinacci, Differentiating ambiguity and ambiguity attitude, J. Econ. Theory 118

(2004) 133–173.[16] I. Gilboa, D. Schmeidler, A Theory of Case-Based Decisions, Cambridge University Press, Cambridge, UK, 2001.

Page 30: Ambiguity, data and preferences for information – A case-based approach

1462 J. Eichberger, A. Guerdjikova / Journal of Economic Theory 148 (2013) 1433–1462

[17] I. Gilboa, D. Schmeidler, Act similarity in case-based decision theory, Econ. Theory 9 (1997) 47–61.[18] I. Gilboa, D. Schmeidler, Maxmin expected utility with a non-unique prior, J. Math. Econ. 18 (1989) 141–153.[19] I. Gilboa, D. Schmeidler, P. Wakker, Utility in case-based decision theory, J. Econ. Theory 105 (2002) 483–502.[20] Ch. Gonzales, J.-Y. Jaffray, Imprecise sampling and direct decision making, Ann. Oper. Res. 80 (1998) 207–235.[21] S. Grant, A. Kaji, B. Polak, Intrinsic preference for information, J. Econ. Theory 83 (1998) 233–259.[22] R. Hau, T. Pleskac, R. Hertwig, Decisions from experience and statistical probabilities: why they trigger different

choices than a priory probabilities, J. Behav. Decis. Mak. 23 (2010) 48–68.[23] D. Hume, An Enquiry Concerning Human Understanding, Clarendon Press, Oxford, 1748.[24] L. Hurwicz, Some specification problems and applications to econometric models, Econometrica 19 (1951) 343–

344.[25] F.H. Knight, Risk, Uncertainty and Profit, Dover Publications, New York, 1921.[26] P. Klibanoff, M. Marinacci, S. Mukerji, A smooth model of decision making under uncertainty, Econometrica 73

(2005) 1849–1892.[27] C. Manski, Identification problems and decisions under ambiguity: empirical analysis of treatment response and

normative analysis of treatment choice, J. Econometrics 95 (2000) 415–432.[28] L.J. Savage, The Foundations of Statistics, John Wiley and Sons, New York, 1954.[29] D. Schmeidler, Subjective probability and expected utility without additivity, Econometrica 57 (1989) 571–587.[30] M. Stinchcombe, Choice with ambiguity as set of probabilities, Mimeo, University of Texas, Austin, 2003.[31] J. von Neumann, O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton,

NJ, 1944.[32] P. Wakker, Decision-principles to justify Carnap’s updating method and to suggest corrections of probability judge-

ments, in: A. Darwiche, N. Friedman (Eds.), UAI ’02, Proceedings of the 18th Conference in Uncertainty inArtificial Intelligence, Morgan Kaufmann, Alberta, 2002, pp. 544–551.