A very quick primer on risky choice and prospect theory

HW3 uses large-scale choice data collected for this paper…

COGNITIVE SCIENCE

Using large-scale experiments and machine learning to discover theories of human decision-making
Joshua C. Peterson1*, David D. Bourgin1†, Mayank Agrawal2,3, Daniel Reichman4, Thomas L. Griffiths1,2

Predicting and understanding how people make decisions has been a long-standing goal in many fields, with quantitative models of human decision-making informing research in both the social sciences and engineering. We show how progress toward this goal can be accelerated by using large datasets to power machine-learning algorithms that are constrained to produce interpretable psychological theories. Conducting the largest experiment on risky choice to date and analyzing the results using gradient-based optimization of differentiable decision theories implemented through artificial neural networks, we were able to recapitulate historical discoveries, establish that there is room to improve on existing theories, and discover a new, more accurate model of human decision-making in a form that preserves the insights from centuries of research.

Understanding how people make decisions is a central problem in psychology and economics (1–3). Having quantitative models that can predict these decisions has become increasingly important as automated systems interact more closely with people (4, 5). The search for such models goes back almost 300 years (6) but intensified in the latter half of the 20th century (7, 8) as empirical findings revealed the limitations of the idea that people make decisions by maximizing expected utility (EU) (9–11). This led to the development of new models such as prospect theory (PT) (8, 12). Recently, this theory-driven enterprise has been complemented by data-driven research using machine learning to predict human decisions (13–19). Although machine learning has the potential to accelerate the discovery of predictive models of human judgments (20–22), the resulting models are limited by small datasets and are often uninterpretable (5). To overcome these challenges, we introduce a new approach based on defining classes of machine-learning models that embody constraints based on psychological theory. We present the largest experiment studying people’s choices to date, allowing us to use our approach to systematically evaluate existing theories, identify a lower bound on optimal prediction performance, and propose a new descriptive theory that reaches this bound and contains classic theories as special cases.

We focus on risky choice, one of the most basic and extensively studied problems in decision theory (8, 23). Risky choice has largely been examined using “choice problems,” scenarios in which decision-makers face a choice between two gambles, each of which has a set of outcomes that differ in their payoffs and probabilities (Fig. 1A). Researchers studying risky choice seek a theory, which we formalize as a function that maps from a pair of gambles, A and B, to the probability P(A) that a decision-maker chooses gamble A over gamble B, that is consistent with human decisions for as many choice problems as possible. Discovering the best theory is a formidable challenge for two reasons. First, the space of choice problems is large. The value and probability of each outcome for each gamble define the dimensions of this space, meaning that describing a pair of gambles could potentially require dozens of dimensions. Second, the space of possible theories is even larger, with theories of choice between two options spanning all possible functions mapping choice problems in ℝ^{2d} to ℝ, i.e., from a vector of d gamble outcomes and d associated probabilities to a choice probability.

Machine-learning methods such as deep neural networks (24) excel at function approximation (25, 26) and thus provide a tool that could potentially be used to automate theory search. However, these methods typically require large amounts of data. Historically, datasets on risky choice have been small: Influential papers focused on a few dozen choice problems (27) and the largest previous dataset featured <300 (28). Consequently, off-the-shelf methods have performed poorly in predicting human choices (29). Furthermore, even when data are abundant, the functions discovered by machine-learning algorithms are notoriously hard to interpret (30), making for poor explanatory scientific models.

To address these challenges, we collected a large dataset of human decisions for almost 10,000 choice problems presented in a format that has been used in previous evaluations of models of decision-making (27–29) (Fig. 1A). This dataset includes >30 times the number of problems in the largest previous dataset (27) (Fig. 1B). We then used this dataset to evaluate differentiable decision theories that exploit the flexibility of deep neural networks but use psychologically meaningful constraints to pick out a smooth, searchable landscape of candidate theories with shared assumptions. Differentiable decision theories allow the intuitions of theorists to be combined with gradient-based optimization methods from machine learning to broadly search the space of theories in a way that yields interpretable scientific explanations.

More formally, we define a hierarchy over decision theories (Fig. 1C) reflecting the addition of an increasing number of constraints on the space of functions. These constraints express psychologically meaningful theoretical commitments. For example, one class of theories contains all functions in which the value that people assign to one gamble can be influenced by the contents of the other gamble. If theories in this class are more predictive than those that belong to the simpler classes contained within it (e.g., where the values of gambles are independent), then we know that these simpler theories should be eliminated. We enforce each constraint by modifying the architecture of artificial neural networks, resulting in differentiable decision theories. This theory-driven approach to defining constraints contrasts with generic methods for constraining neural networks, such as restricting their size or the ranges of their weights (31). After optimizing a differentiable theory to best fit human behavior, it will ideally have picked out the optimal theory in its class.

The lowest levels of our hierarchy contain the simplest theories, including classic models of choice. Objectively, gambles that yield higher payouts in the long run are those with higher expected value (EV), with the value V(A) of gamble A being V(A) = Σ_i x_i p_i, where outcome i of gamble A has payoff x_i and probability p_i. In our hierarchical partitioning, this is the simplest possible theory because it has no descendants. Moving up the hierarchy, and following expected utility (EU) theory (6, 32), we can ask the question of whether payouts x_i are viewed subjectively by decision-makers: V(A) = Σ_i u(x_i) p_i. When u(•) is the identity function u(x) = x, EU reduces to EV and thus contains it. Theories based on EU have historically relied on explicit proposals for the form of u(•), which are typically simple, nonlinear parametric functions (33). By contrast, we search the entire class by learning the optimal u(•) with a neural network (we call the resulting model “neural EU”), and use automatic differentiation to optimize the model P(A) ∝ exp{η Σ_i u(x_i) p_i}, where η captures the degree of determinism in people’s responses (34). This can be viewed as a neural network architecture in which the output layer is a softmax function e^{ηz_j} / Σ_k e^{ηz_k}, there is one node for each gamble, the hidden units in the second-to-last layer encode the utilities of the outcomes, and the final layer of weights corresponds to their probabilities (Fig. 1D).

Figure 2 shows that the discovered form of u(•) is similar to those proposed by human theorists (i.e., decreasing marginal utility and asymmetry) but outperforms any of those theories and can be learned using only a quarter of our data. [All theories are evaluated on their cross-validated generalization performance, meaning that model complexity is already implicitly accounted for in our analyses; we focus on mean-squared error (MSE) for consistency with previous evaluations of models of decision-making (28, 29) but also include analyses of cross-entropy in the supplementary materials.] The decision preference accuracy [i.e., the proportion of problems in which the model prediction for P(A) is >0.5 when the observed proportions are also >0.5] for this model was 81.41%.

Next, mirroring subjective EU (7) and PT, we can ask the question of whether the probabilities (p_i) are also viewed subjectively by decision-makers: V(A) = Σ_i u(x_i) π(p_i). Again, π(•) can take on classic forms or be learned from data (“neural PT”). Figure 2B shows that a form of π(•) that outperforms all proposals by human theorists can be learned using one-fifth of our data and exhibits overweighting of events with medium to low probability. This overweighting is much smaller than is typically found in applications of PT, in part reflecting the difference in the range of choice problems that we consider relative to classic studies. We will return to this point later. The decision preference accuracy for this model was 82.33%.

Allowing separate π(•) functions for positive and negative outcomes, respectively, and applying them cumulatively to an ordered set of outcomes corresponds to the most popular modern variant of PT: cumulative PT (CPT) (12, 35) (Fig. 2B; see the materials and methods). Notably, “neural CPT” does not contain neural PT because the former cannot violate stochastic dominance. With small amounts of data, corresponding to the largest previous experiments (28, 29), CPT outperforms PT, accounting for its popularity. However, this trend reverses as the amount of data is increased, which illustrates that suitably large datasets, in addition to aiding machine learning, provide more robust evaluation.

Next, we can ask whether the possible outcomes of a gamble affect the perception of each other and their probabilities and vice versa. More formally, we learn a neural network f(•,•) such that P(A) ∝ exp{f(x_A, p_A)}, where x_A and p_A are the vector of payoffs and probabilities associated with gamble A, respectively. This function class computes the value of a gamble (“value-based”) like PT and others but does not enforce linearity when combining payoffs and probabilities. Notably, this class of models includes those that violate the independence axiom in decision-making (32). Figure 3A shows that there exists a value-based theory that results in a greater improvement in performance over PT than PT does over EU.

Relaxing the constraint that each gamble is valued independently results in our most general class of functions, “context-dependent” functions g(•) where P(A) = g(x_A, p_A, x_B, p_B). This class of models includes those that violate both the independence and transitivity axioms in decision-making (32). This formulation provides a way to estimate the performance of the…

Peterson et al., Science 372, 1209–1214 (2021). 11 June 2021.

1Department of Computer Science, Princeton University, Princeton, NJ 08540, USA. 2Department of Psychology, Princeton University, Princeton, NJ 08540, USA. 3Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA. 4Department of Computer Science, Worcester Polytechnic Institute, Worcester, MA 01609, USA.
*Corresponding author. Email: [email protected]
†Present affiliation: Spotify.

Corrected 11 June 2021. See full text.


Fig. 1. Applying large-scale experimentation and theory-driven machine learning to risky choice. (A) Experiment interface in which participants made choices between pairs of gambles (“choice problems”) and were paid at the end of the experiment based on their choice in a single randomly selected gamble. (B) Each pair of gambles can be described by a vector of payoffs and probabilities. Reducing the resulting space to two dimensions (2D) allows us to visualize coverage by different experiments. Each point is a different choice problem, and colors show reconstructions of the problems used in influential experiments (green), the previous largest dataset (red), and our 9831 problems, which provide much broader coverage of the problem space. This 2D embedding results from applying t-distributed stochastic neighbor embedding (t-SNE) to the hidden-layer representation of our best-performing neural network model. (C) We define a hierarchy of theoretical assumptions expressed as partitions over function space that can be searched. More complex classes of functions contain simpler classes as special cases, allowing us to systematically search the space of theories and identify the impact of constraints that correspond to psychologically meaningful theoretical commitments. All model classes are described in the main text. (D) Differentiable decision theories use the formal structure of classic theories to constrain the architecture of the neural network. For example, our EU model uses a neural network to define the utility function but combines those utilities in a classic form, resulting in a fully differentiable model that can be optimized by gradient descent.
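The paper's choice rule, P(A) ∝ exp{η Σ_i u(x_i) π(p_i)}, is easy to sketch in a few lines. The following is a minimal illustration, not the paper's fitted model: it substitutes classic parametric forms for the learned u(•) and π(•) (a power utility with loss aversion and the Tversky–Kahneman inverse-S weighting curve), and all parameter values are illustrative.

```python
import math

# Illustrative stand-ins for the learned neural networks u(.) and pi(.).
# These parameter values are for illustration only, not fitted to the paper's data.
ALPHA, LAMBDA, GAMMA, ETA = 0.88, 2.25, 0.61, 1.0

def u(x):
    """Power utility with loss aversion: concave for gains, steeper for losses."""
    return x ** ALPHA if x >= 0 else -LAMBDA * (-x) ** ALPHA

def pi(p):
    """Inverse-S probability weighting: overweights low probabilities."""
    return p ** GAMMA / (p ** GAMMA + (1 - p) ** GAMMA) ** (1 / GAMMA)

def value(gamble):
    """V(A) = sum_i u(x_i) * pi(p_i); a gamble is a list of (payout, prob) pairs."""
    return sum(u(x) * pi(p) for x, p in gamble)

def p_choose_a(gamble_a, gamble_b):
    """Softmax choice rule: P(A) = exp(eta*V_A) / (exp(eta*V_A) + exp(eta*V_B))."""
    ea = math.exp(ETA * value(gamble_a))
    eb = math.exp(ETA * value(gamble_b))
    return ea / (ea + eb)

# A sure $0 versus a 50/50 bet over -$5/+$5: loss aversion favors the sure thing.
print(p_choose_a([(0.0, 1.0)], [(-5.0, 0.5), (5.0, 0.5)]))  # > 0.5
```

Setting GAMMA = 1 and u(x) = x recovers plain expected value; the "neural" variants in the paper instead learn u(•) and π(•) directly from data while keeping this same overall structure.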


Dataset of 13,005 choice problems!

Problem 1, Feedback = True, n = 15, bRate = 0.6267, std = 0.3845

        Gamble A                   Gamble B
   Payout   Probability       Payout   Probability
    26.0       0.95            21.0       0.95
    -1.0       0.05            23.0       0.05
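This problem already shows behavior that expected value alone cannot explain. A quick check (a sketch, with the gambles transcribed from the table above):

```python
def expected_value(gamble):
    """EV = sum_i x_i * p_i over (payout, probability) pairs."""
    return sum(x * p for x, p in gamble)

# Problem 1, transcribed from the table above.
gamble_a = [(26.0, 0.95), (-1.0, 0.05)]
gamble_b = [(21.0, 0.95), (23.0, 0.05)]

print(expected_value(gamble_a))  # ≈ 24.65
print(expected_value(gamble_b))  # ≈ 21.1
```

Gamble A has the higher expected value, yet (assuming bRate is the proportion of participants choosing Gamble B) bRate = 0.6267 means roughly 63% chose B: exactly the kind of systematic deviation from EV maximization that the models below try to capture.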

A couple more problems from the dataset…

Problem 4, Feedback = True, n = 18, bRate = 0.2222, std = 0.3874

        Gamble A                   Gamble B
   Payout   Probability       Payout   Probability
     8.0       0.95           -31.0       0.75
    37.0       0.05            86.5       0.125
                               87.5       0.125

Let’s take a computational view of the mind… How do people make these choices?

Can we explain these choices as computation? If so, what is the right algorithm?

A classic economic theory: People choose the option that maximizes expected value/utility

Problem 4, Feedback = True, n = 18, bRate = 0.2222, std = 0.3874

        Gamble A                   Gamble B
   Payout   Probability       Payout   Probability
     8.0       0.95           -31.0       0.75
    37.0       0.05            86.5       0.125
                               87.5       0.125

Expected value

EV = ∑_{i=1}^{N} x_i p_i

x: vector of payouts
p: vector of probabilities

EV_A = 8 × 0.95 + 37 × 0.05 = 9.45

EV_B = −31 × 0.75 + 86.5 × 0.125 + 87.5 × 0.125 = −1.5

Thus, an expected-value maximizer would choose Option A!
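The arithmetic above can be verified directly; a minimal sketch with the Problem 4 gambles transcribed from the table:

```python
def expected_value(gamble):
    """EV = sum_i x_i * p_i over (payout, probability) pairs."""
    return sum(x * p for x, p in gamble)

# Problem 4, transcribed from the table above.
gamble_a = [(8.0, 0.95), (37.0, 0.05)]
gamble_b = [(-31.0, 0.75), (86.5, 0.125), (87.5, 0.125)]

print(expected_value(gamble_a))  # ≈ 9.45
print(expected_value(gamble_b))  # ≈ -1.5
```

Here the EV prediction happens to match behavior: bRate = 0.2222 implies roughly 78% of participants chose A (again assuming bRate is the proportion choosing Gamble B).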

However, expected value/utility theory is a pretty poor fit to real human decisions… which brings us to prospect theory

Prospect theory

Value of a gamble = ∑_{i=1}^{N} u(x_i) p_i

u: utility function
x: vector of payouts
p: vector of probabilities

Point 1: People perceive gambles in terms of gains and losses, not their total wealth.

[Plots: utility u(x) as a function of payout x, measured relative to a reference point of zero]


Prospect theory: Loss aversion

Point 2: People are LOSS AVERSE: losses loom larger than equivalent gains, so they prefer a sure thing to a risky bet with the same expected payoff.

(Notice that the loss curve is steeper than the gain curve.)

        Gamble A                   Gamble B
   Payout   Probability       Payout   Probability
     0.0       1.0             -5.0       0.5
                                5.0       0.5

Preferred: Gamble A



Prospect theory: Risk aversion for gains

Point 3: Diminishing returns… $1B is not 1000x better than $1M

Leads to risk aversion for gambles with potential gains

        Gamble A                   Gamble B
   Payout   Probability       Payout   Probability
    50.0       1.0            100.0       0.5
                                0.0       0.5

Preferred: Gamble A



Prospect theory: Risk seeking for losses

Point 4: People are risk seeking for gambles with potential losses

        Gamble A                   Gamble B
   Payout   Probability       Payout   Probability
   -50.0       1.0           -100.0       0.5
                                0.0       0.5

Preferred: Gamble B
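All three preference patterns above fall out of a single S-shaped utility curve. A minimal sketch using commonly cited parameter estimates (α = 0.88, λ = 2.25, from Tversky and Kahneman's 1992 fits; probability weighting is omitted for simplicity):

```python
def u(x, alpha=0.88, lam=2.25):
    """S-shaped utility: concave for gains, convex and steeper (by lam) for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def value(gamble):
    """Prospect-theory value of a list of (payout, probability) pairs."""
    return sum(u(x) * p for x, p in gamble)

# Point 2 (loss aversion): a sure $0 beats a 50/50 bet over -$5/+$5.
assert value([(0.0, 1.0)]) > value([(-5.0, 0.5), (5.0, 0.5)])

# Point 3 (risk aversion for gains): a sure $50 beats a 50/50 shot at $100.
assert value([(50.0, 1.0)]) > value([(100.0, 0.5), (0.0, 0.5)])

# Point 4 (risk seeking for losses): a 50/50 shot at -$100 beats a sure -$50.
assert value([(-100.0, 0.5), (0.0, 0.5)]) > value([(-50.0, 1.0)])
```

Point 1 is built into u itself: payouts enter as gains and losses around a reference point of zero, not as total wealth.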