FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF...

46
FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEW RABIN AND JOEL L. SCHRAG Psychological research indicates that people have a cognitive bias that leads them to misinterpret new information as supporting previously held hypotheses. We show in a simple model that such con rmatory bias induces overcon dence: given any probabilistic assessment by an agent that one of two hypotheses is true, the appropriate beliefs would deem it less likely to be true. Indeed, the hypothesis that the agent believes in may be more likely to be wrong than right. We also show that the agent may come to believe with near certainty in a false hypothesis despite receiving an in nite amount of information. The human understanding when it has once adopted an opinion draws all things else to support and agree with it. And though there be a greater number and weight of instances to be found on the other side, yet these it either neglects and despises, or else by some distinction sets aside and rejects, in order that by this great and pernicious predetermination the authority of its former conclusion may remain inviolate. Francis Bacon 1 I. INTRODUCTION How do people form beliefs in situations of uncertainty? Economists have traditionally assumed that people begin with subjective beliefs over the different possible states of the world and use Bayes’ Rule to update those beliefs. This elegant and powerful model of economic agents as Bayesian statisticians is the foundation of modern information economics. Yet a large and growing body of psychological research * We thank Jimmy Chan, Erik Eyster, Bruce Hsu, Clara Wang, and especially Steven Blatt for research assistance. We thank Linda Babcock, Steven Blatt, Jon Elster, Jeffrey Ely, Roger Lagunoff, George Loewenstein, and seminar participants at the University of California at Berkeley, Carnegie-Mellon University, Cornell University, Emory University, the University of Michigan, Northwestern Univer- sity, the 1997 meetings of the Econometrics Society, and the 1997 meetings of the European Economic Association, as well as three referees, for helpful comments. For nancial support, Rabin thanks the Alfred P. Sloan and Russell Sage Foundations, and Schrag thanks the University Research Committee of Emory University. This draft was completed while Rabin was a Fellow at the Center for Advanced Studies in the Behavioral Sciences, supported by NSF Grant #SBR- 960123. 1. From The New Organon and Related Writings {1960; 1620}, quoted in Nisbett and Ross {1980, p. 167}. r 1999 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, February 1999 37

Transcript of FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF...

Page 1: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

FIRST IMPRESSIONS MATTERA MODEL OF CONFIRMATORY BIAS

MATTHEW RABIN AND JOEL L SCHRAG

Psychological research indicates that people have a cognitive bias that leadsthem to misinterpret new information as supporting previously held hypothesesWe show in a simple model that such conrmatory bias induces overcondencegiven any probabilistic assessment by an agent that one of two hypotheses is truethe appropriate beliefs would deem it less likely to be true Indeed the hypothesisthat the agent believes in may be more likely to be wrong than right We also showthat the agent may come to believe with near certainty in a false hypothesisdespite receiving an innite amount of information

The human understanding when it has once adopted anopinion draws all things else to support and agree with it Andthough there be a greater number and weight of instances tobe found on the other side yet these it either neglects anddespises or else by some distinction sets aside and rejects inorder that by this great and pernicious predetermination theauthority of its former conclusion may remain inviolate

Francis Bacon1

I INTRODUCTION

How do people form beliefs in situations of uncertaintyEconomists have traditionally assumed that people begin withsubjective beliefs over the different possible states of the worldand use Bayesrsquo Rule to update those beliefs This elegant andpowerful model of economic agents as Bayesian statisticians is thefoundation of modern information economics

Yet a large and growing body of psychological research

We thank Jimmy Chan Erik Eyster Bruce Hsu Clara Wang and especiallySteven Blatt for research assistance We thank Linda Babcock Steven Blatt JonElster Jeffrey Ely Roger Lagunoff George Loewensteinand seminar participantsat the University of California at Berkeley Carnegie-Mellon University CornellUniversity Emory University the University of Michigan Northwestern Univer-sity the 1997 meetings of the Econometrics Society and the 1997 meetings of theEuropean Economic Association as well as three referees for helpful commentsFor nancial support Rabin thanks the Alfred P Sloan and Russell SageFoundations and Schrag thanks the University Research Committee of EmoryUniversity This draft was completed while Rabin was a Fellow at the Center forAdvanced Studies in the Behavioral Sciences supported by NSF Grant SBR-960123

1 From The New Organon and Related Writings 1960 1620 quoted inNisbett and Ross 1980 p 167

r 1999 by the President and Fellows of Harvard College and the Massachusetts Institute ofTechnologyThe Quarterly Journal of Economics February 1999

37

suggests that the way people process information often departssystematically from Bayesian updating In this paper we formallymodel and explore the consequences of one particular departurefrom Bayesian rationality conrmatory bias A person suffersfrom conrmatory bias if he tends to misinterpret ambiguousevidence as conrming his current hypotheses about the worldTeachers misread performance of pupils as supporting theirinitial impressions of those pupils many people misread theirobservations of individual behavior as supporting their priorstereotypes about groups to which these individuals belongscientists biasedly interpret data as supporting their hypotheses

Our simple model by and large conrms an intuition commonin the psychology literature conrmatory bias leads to overcon-dence in the sense that people on average believe more stronglythan they should in their favored hypotheses The model alsoyields surprising further results An agent who suffers fromconrmatory bias may come to believe in a hypothesis that isprobably wrong meaning that a Bayesian observer who wasaware of the agentrsquos conrmatory bias would after observing theagentrsquos beliefs favor a different hypothesis than the agent Wealso show that even an innite amount of information does notnecessarily overcome the effects of conrmatory bias over time anagent may with positive probability come to believe with nearcertainty in the wrong hypothesis

In Section IImdashwhich readers impatient for math may wish toskipmdashwe review some of the psychological evidence that humansare prone to conrmatory bias In Section III we present ourformal model and provide examples and general propositionsillustrating the implications of conrmatory bias In our model anagent initially believes that each of two possible states of theworld is equally likely The agent then receives a series ofindependent and identically distributed signals that are corre-lated with the true state To model conrmatory bias we assumethat when the agent gets a signal that is counter to the hypothesishe currently believes is more likely there is a positive probabilitythat he misreads that signal as supporting his current hypothesisThe agent is unaware that he is misreading evidence in this wayand engages in Bayesian updating that would be fully rationalgiven his environment if he were not misreading evidence2

2 Researchers have of course documented many other biases in informationprocessing We develop a model ignoring these other biases assuming completerationality except for this one bias so as to keep our model tractable and becausewe feel that incorporating documented biases into the Bayesian model one at atime is useful for carefully identifying the effects of each particular bias

QUARTERLY JOURNAL OF ECONOMICS38

Because we assume that the agent always correctly interpretsevidence that conrms his current beliefs relative to properBayesian updating he is biased toward conrming his currenthypothesis

So for example a teacher may believe either that Marta issmarter than Bart or that Bart is smarter than Marta he initiallybelieves each is equally likely and over time he collects a series ofsignals that help him to identify who is smarter If after receivingone or more signals the teacher believes that Marta is probablysmarter than Bart conrmatory bias may lead him to erroneouslyinterpret his next signal as supporting this hypothesis Thereforethe teacherrsquos updated belief that Marta is smarter than Bart maybe stronger than is warranted

The notion that the teacher is likely to believe lsquolsquotoo stronglyrsquorsquothat Marta is smarter corresponds to the commonly held intuitionthat conrmatory bias leads to overcondence While qualifyingthis intuition with several caveats our model by and largeconrms it given any probabilistic assessment by an agent thatone of the hypotheses is probably true the appropriate beliefsshould on average deem it less likely to be true Intuitively aperson who believes strongly in a hypothesis is likely to havemisinterpreted some signals that conict with what he believesand hence is likely to have received more evidence against hisbelieved hypothesis than he realizes

Our analysis shows that a more surprising result arises whenconrmatory bias is severe a Bayesian observer with no directinformation of her own but who can observe the agentrsquos belief infavor of one hypothesis may herself believe that the otherhypothesis is more likely We show that such lsquolsquowrongnessrsquorsquo canarise when the agentrsquos evidence is sufficiently mixed Intuitively ifthe agent has perceived almost as much evidence against hishypothesis as supporting it then since some of the evidence heperceives as supportive is actually not supportive it is likely thata majority of the real signals oppose his hypothesis Because suchwrongness only arises when the agent has relatively weak evi-dence supporting his favored hypothesis however the agent onaverage correctly judges which of the two hypotheses is morelikely in the sense that his best guess is right most of the time

While seemingly straightforward the intuition for our over-condence and wrongness results conceals some subtle implica-tions of the agentrsquos conrmatory bias For example an agent whocurrently believes in Hypothesis A (say) may once have believed inHypothesis B at which time he had a propensity to misread as

FIRST IMPRESSIONS MATTER 39

supporting Hypothesis B evidence actually in favor of HypothesisA But then the agent may underestimate how many signalssupporting Hypothesis A he has received and thus he may beundercondent in his belief in favor of Hypothesis A Indeed weshow that an agent who has only recently come to believe in ahypothesis is likely to be undercondent in that hypothesisbecause until recently he has been biased against his currenthypothesis If a teacher used to think Bart was smarter thanMarta and only recently concluded that Marta is smarter thenprobably he has been ignoring evidence all along that Marta issmarter The simple overcondence and wrongness results holdbecause an agent has probably believed in his currently heldhypothesis during most of the time he has been receiving informa-tion and so on average has been biased toward this hypothesis

In Section IV we investigate the implications of conrmatorybias after the agent receives an innite sequence of signals In theabsence of conrmatory bias an agent will always come to believewith near certainty in the correct hypothesis if he receives aninnite sequence of signals If the conrmatory bias is sufficientlysevere or the strength of individual signals is weak however thenwith positive probability the agent may come to believe with nearcertainty that the incorrect hypothesis is true Intuitively oncethe agent comes to believe in an incorrect hypothesis the conrma-tory bias inhibits his ability to overturn his erroneous beliefs Ifthe bias is strong enough the expected drift once the agent comesto believe in the false hypothesis is toward believing more stronglyin that hypothesis guaranteeing a positive probability that theagent ends up forever believing very strongly in the false hy-pothesis The results of Section IV belie the common intuition thatlearning will eventually correct cognitive biases While this is truefor sufficiently mild conrmatory bias when the bias is suffi-ciently severe lsquolsquolearningrsquorsquo can exacerbate the bias

The premise of this paper is that explicit formalizations ofdepartures from Bayesian information processing are crucial toincorporating psychological biases into economic analysis For themost part we do not in this paper take the important next step ofdeveloping extended economic applications of the bias we modelIn Section V however we illustrate one implication of conrma-tory bias by sketching a simple principal-agent model We illus-trate how a principal may wish to mute the incentives that sheoffers an agent who suffers from conrmatory bias Indeed we

QUARTERLY JOURNAL OF ECONOMICS40

show that even if it is very easy for an agent to gather informationso that a principal can at negligible costs provide incentives for anagent to search for protable investment opportunities theprincipal may choose not to provide these incentives to a conrma-tory agent This arises when the expected costs in terms of anovercondent agent investing too much in risky projects outweighthe expected benet of the agent being better informed Weconclude in Section VI by discussing some other potential eco-nomic implications of conrmatory bias as well as highlightingsome likely obstacles to applying our model

II A REVIEW OF THE PSYCHOLOGY LITERATURE

Many different strands of psychological research yield evi-dence on phenomena that we are modeling under the rubric ofconrmatory bias Before reviewing this literature we rst wishto distinguish a form of lsquolsquoquasi-Bayesianrsquorsquo information processingfrom the bias we are examining Although the two phenomena arerelatedmdashand not always distinguished clearly in the psychologyliteraturemdashthey differ importantly in their implications for deci-sion theory Suppose that once they form a strong hypothesispeople simply stop being attentive to relevant new informationthat contradicts or supports their hypotheses Intuitively whenyou become convinced that one investment strategy is morelucrative than another you may simply stop paying attention toeven freely available additional information3

Bruner and Potter 1964 elegantly demonstrate such anchor-ing About 90 subjects were shown blurred pictures that weregradually brought into sharper focus Different subjects beganviewing the pictures at different points in the focusing processbut the pace of the focusing process and nal degree of focus wereidentical for all subjects Strikingly of those subjects who begantheir viewing at a severe-blur stage less than a quarter eventu-ally identied the pictures correctly whereas over half of thosewho began viewing at a light-blur stage were able to correctlyidentify the pictures Bruner and Potter p 424 conclude thatlsquolsquoInterference may be accounted for partly by the difficulty ofrejecting incorrect hypotheses based on substandard cuesrsquorsquo That

3 Such behavior corresponds to a natural economic lsquolsquocognitive-searchrsquorsquo modelif we posit a cost to information processing in many settings the natural stoppingrule would be to process information until beliefs are sufficiently strong in onedirection or another and then stop

FIRST IMPRESSIONS MATTER 41

is people who use weak evidence to form initial hypotheses havedifficulty correctly interpreting subsequent better informationthat contradicts those initial hypotheses4

This form of anchoring does not necessarily imply that peoplemisinterpret additional evidence to either disconrm or conrminitial hypotheses only that they ignore additional evidence Sucha tendency to anchor on initial hypotheses can therefore bereconciled with Bayesian information processing While suchanchoring is potentially quite important psychological evidencereveals a stronger and more provocative phenomenon people tendto misread evidence as additional support for initial hypotheses Ifa teacher initially believes that one student is smarter thananother she has the propensity to conrm that hypothesis wheninterpreting later performance5 Lord Ross and Lepper 1979 p2099 posited some of the underlying cognitive mechanismsinvolved in such propensities

there is considerable evidence that people tend to interpret subse-quent evidence so as to maintain their initial beliefs The biased assimilationprocesses underlying this effect may include a propensity to remember thestrengths of conrming evidence but the weaknesses of disconrmingevidence to judge conrming evidence as relevant and reliablebut disconrm-ing evidence as irrelevant and unreliable and to accept conrming evidenceat face value while scrutinizing disconrming evidence hypercritically Withconrming evidence we suspect that both lay and professional scientistsrapidly reduce the complexity of the information and remember only a fewwell-chosen supportive impressions With disconrming evidence they con-tinue to reect upon any information that suggests less damaging lsquolsquoalterna-tive interpretationsrsquorsquo Indeed they may even come to regard the ambiguitiesand conceptual aws in the data opposing their hypotheses as somehowsuggestive of the fundamental correctness of those hypotheses Thus com-

4 A similar experiment Wyatt and Campbell 1951 was cited by Perkins1981 as one interpretation of the perspective that lsquolsquofreshrsquorsquo thinkers may be betterat seeing solutions to problems than people who have meditated at length on theproblems because the fresh thinkers are not overwhelmed by the lsquolsquointerferencersquorsquo ofold hypotheses

5 A related arena where the conrmation bias has been studied widely is incounselor judgments counselors in clinical settings tend to conrm originalsuppositions in their eventual judgments If you are told ahead of time that aninterviewee is combative then both your conduct and your interpretation of hisconduct during an interview may reinforce that supposition even if he is in fact nomore combative than the average person See eg Haverkamp 1993 There hasalso been extensive research on conrmatory bias in the interviewing process moregenerally see eg Dougherty Turban and Callender 1994 and Macan andDipboye 1994 Research applying variants of conrmatory bias to other domainsincludes Arkes 1989 and Borum Otto and Golding 1993 to the law BaumannDeber and Thompson 1991 to medicine and Souter 1993 discusses theimplications of overcondence to business insurance

QUARTERLY JOURNAL OF ECONOMICS42

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 2: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

suggests that the way people process information often departssystematically from Bayesian updating In this paper we formallymodel and explore the consequences of one particular departurefrom Bayesian rationality conrmatory bias A person suffersfrom conrmatory bias if he tends to misinterpret ambiguousevidence as conrming his current hypotheses about the worldTeachers misread performance of pupils as supporting theirinitial impressions of those pupils many people misread theirobservations of individual behavior as supporting their priorstereotypes about groups to which these individuals belongscientists biasedly interpret data as supporting their hypotheses

Our simple model by and large conrms an intuition commonin the psychology literature conrmatory bias leads to overcon-dence in the sense that people on average believe more stronglythan they should in their favored hypotheses The model alsoyields surprising further results An agent who suffers fromconrmatory bias may come to believe in a hypothesis that isprobably wrong meaning that a Bayesian observer who wasaware of the agentrsquos conrmatory bias would after observing theagentrsquos beliefs favor a different hypothesis than the agent Wealso show that even an innite amount of information does notnecessarily overcome the effects of conrmatory bias over time anagent may with positive probability come to believe with nearcertainty in the wrong hypothesis

In Section IImdashwhich readers impatient for math may wish toskipmdashwe review some of the psychological evidence that humansare prone to conrmatory bias In Section III we present ourformal model and provide examples and general propositionsillustrating the implications of conrmatory bias In our model anagent initially believes that each of two possible states of theworld is equally likely The agent then receives a series ofindependent and identically distributed signals that are corre-lated with the true state To model conrmatory bias we assumethat when the agent gets a signal that is counter to the hypothesishe currently believes is more likely there is a positive probabilitythat he misreads that signal as supporting his current hypothesisThe agent is unaware that he is misreading evidence in this wayand engages in Bayesian updating that would be fully rationalgiven his environment if he were not misreading evidence2

2 Researchers have of course documented many other biases in informationprocessing We develop a model ignoring these other biases assuming completerationality except for this one bias so as to keep our model tractable and becausewe feel that incorporating documented biases into the Bayesian model one at atime is useful for carefully identifying the effects of each particular bias

QUARTERLY JOURNAL OF ECONOMICS38

Because we assume that the agent always correctly interpretsevidence that conrms his current beliefs relative to properBayesian updating he is biased toward conrming his currenthypothesis

So for example a teacher may believe either that Marta issmarter than Bart or that Bart is smarter than Marta he initiallybelieves each is equally likely and over time he collects a series ofsignals that help him to identify who is smarter If after receivingone or more signals the teacher believes that Marta is probablysmarter than Bart conrmatory bias may lead him to erroneouslyinterpret his next signal as supporting this hypothesis Thereforethe teacherrsquos updated belief that Marta is smarter than Bart maybe stronger than is warranted

The notion that the teacher is likely to believe lsquolsquotoo stronglyrsquorsquothat Marta is smarter corresponds to the commonly held intuitionthat conrmatory bias leads to overcondence While qualifyingthis intuition with several caveats our model by and largeconrms it given any probabilistic assessment by an agent thatone of the hypotheses is probably true the appropriate beliefsshould on average deem it less likely to be true Intuitively aperson who believes strongly in a hypothesis is likely to havemisinterpreted some signals that conict with what he believesand hence is likely to have received more evidence against hisbelieved hypothesis than he realizes

Our analysis shows that a more surprising result arises whenconrmatory bias is severe a Bayesian observer with no directinformation of her own but who can observe the agentrsquos belief infavor of one hypothesis may herself believe that the otherhypothesis is more likely We show that such lsquolsquowrongnessrsquorsquo canarise when the agentrsquos evidence is sufficiently mixed Intuitively ifthe agent has perceived almost as much evidence against hishypothesis as supporting it then since some of the evidence heperceives as supportive is actually not supportive it is likely thata majority of the real signals oppose his hypothesis Because suchwrongness only arises when the agent has relatively weak evi-dence supporting his favored hypothesis however the agent onaverage correctly judges which of the two hypotheses is morelikely in the sense that his best guess is right most of the time

While seemingly straightforward the intuition for our over-condence and wrongness results conceals some subtle implica-tions of the agentrsquos conrmatory bias For example an agent whocurrently believes in Hypothesis A (say) may once have believed inHypothesis B at which time he had a propensity to misread as

FIRST IMPRESSIONS MATTER 39

supporting Hypothesis B evidence actually in favor of HypothesisA But then the agent may underestimate how many signalssupporting Hypothesis A he has received and thus he may beundercondent in his belief in favor of Hypothesis A Indeed weshow that an agent who has only recently come to believe in ahypothesis is likely to be undercondent in that hypothesisbecause until recently he has been biased against his currenthypothesis If a teacher used to think Bart was smarter thanMarta and only recently concluded that Marta is smarter thenprobably he has been ignoring evidence all along that Marta issmarter The simple overcondence and wrongness results holdbecause an agent has probably believed in his currently heldhypothesis during most of the time he has been receiving informa-tion and so on average has been biased toward this hypothesis

In Section IV we investigate the implications of conrmatorybias after the agent receives an innite sequence of signals In theabsence of conrmatory bias an agent will always come to believewith near certainty in the correct hypothesis if he receives aninnite sequence of signals If the conrmatory bias is sufficientlysevere or the strength of individual signals is weak however thenwith positive probability the agent may come to believe with nearcertainty that the incorrect hypothesis is true Intuitively oncethe agent comes to believe in an incorrect hypothesis the conrma-tory bias inhibits his ability to overturn his erroneous beliefs Ifthe bias is strong enough the expected drift once the agent comesto believe in the false hypothesis is toward believing more stronglyin that hypothesis guaranteeing a positive probability that theagent ends up forever believing very strongly in the false hy-pothesis The results of Section IV belie the common intuition thatlearning will eventually correct cognitive biases While this is truefor sufficiently mild conrmatory bias when the bias is suffi-ciently severe lsquolsquolearningrsquorsquo can exacerbate the bias

The premise of this paper is that explicit formalizations ofdepartures from Bayesian information processing are crucial toincorporating psychological biases into economic analysis For themost part we do not in this paper take the important next step ofdeveloping extended economic applications of the bias we modelIn Section V however we illustrate one implication of conrma-tory bias by sketching a simple principal-agent model We illus-trate how a principal may wish to mute the incentives that sheoffers an agent who suffers from conrmatory bias Indeed we

QUARTERLY JOURNAL OF ECONOMICS40

show that even if it is very easy for an agent to gather informationso that a principal can at negligible costs provide incentives for anagent to search for protable investment opportunities theprincipal may choose not to provide these incentives to a conrma-tory agent This arises when the expected costs in terms of anovercondent agent investing too much in risky projects outweighthe expected benet of the agent being better informed Weconclude in Section VI by discussing some other potential eco-nomic implications of conrmatory bias as well as highlightingsome likely obstacles to applying our model

II A REVIEW OF THE PSYCHOLOGY LITERATURE

Many different strands of psychological research yield evi-dence on phenomena that we are modeling under the rubric ofconrmatory bias Before reviewing this literature we rst wishto distinguish a form of lsquolsquoquasi-Bayesianrsquorsquo information processingfrom the bias we are examining Although the two phenomena arerelatedmdashand not always distinguished clearly in the psychologyliteraturemdashthey differ importantly in their implications for deci-sion theory Suppose that once they form a strong hypothesispeople simply stop being attentive to relevant new informationthat contradicts or supports their hypotheses Intuitively whenyou become convinced that one investment strategy is morelucrative than another you may simply stop paying attention toeven freely available additional information3

Bruner and Potter 1964 elegantly demonstrate such anchor-ing About 90 subjects were shown blurred pictures that weregradually brought into sharper focus Different subjects beganviewing the pictures at different points in the focusing processbut the pace of the focusing process and nal degree of focus wereidentical for all subjects Strikingly of those subjects who begantheir viewing at a severe-blur stage less than a quarter eventu-ally identied the pictures correctly whereas over half of thosewho began viewing at a light-blur stage were able to correctlyidentify the pictures Bruner and Potter p 424 conclude thatlsquolsquoInterference may be accounted for partly by the difficulty ofrejecting incorrect hypotheses based on substandard cuesrsquorsquo That

3 Such behavior corresponds to a natural economic lsquolsquocognitive-searchrsquorsquo modelif we posit a cost to information processing in many settings the natural stoppingrule would be to process information until beliefs are sufficiently strong in onedirection or another and then stop

FIRST IMPRESSIONS MATTER 41

is people who use weak evidence to form initial hypotheses havedifficulty correctly interpreting subsequent better informationthat contradicts those initial hypotheses4

This form of anchoring does not necessarily imply that peoplemisinterpret additional evidence to either disconrm or conrminitial hypotheses only that they ignore additional evidence Sucha tendency to anchor on initial hypotheses can therefore bereconciled with Bayesian information processing While suchanchoring is potentially quite important psychological evidencereveals a stronger and more provocative phenomenon people tendto misread evidence as additional support for initial hypotheses Ifa teacher initially believes that one student is smarter thananother she has the propensity to conrm that hypothesis wheninterpreting later performance5 Lord Ross and Lepper 1979 p2099 posited some of the underlying cognitive mechanismsinvolved in such propensities

there is considerable evidence that people tend to interpret subse-quent evidence so as to maintain their initial beliefs The biased assimilationprocesses underlying this effect may include a propensity to remember thestrengths of conrming evidence but the weaknesses of disconrmingevidence to judge conrming evidence as relevant and reliablebut disconrm-ing evidence as irrelevant and unreliable and to accept conrming evidenceat face value while scrutinizing disconrming evidence hypercritically Withconrming evidence we suspect that both lay and professional scientistsrapidly reduce the complexity of the information and remember only a fewwell-chosen supportive impressions With disconrming evidence they con-tinue to reect upon any information that suggests less damaging lsquolsquoalterna-tive interpretationsrsquorsquo Indeed they may even come to regard the ambiguitiesand conceptual aws in the data opposing their hypotheses as somehowsuggestive of the fundamental correctness of those hypotheses Thus com-

4 A similar experiment Wyatt and Campbell 1951 was cited by Perkins1981 as one interpretation of the perspective that lsquolsquofreshrsquorsquo thinkers may be betterat seeing solutions to problems than people who have meditated at length on theproblems because the fresh thinkers are not overwhelmed by the lsquolsquointerferencersquorsquo ofold hypotheses

5 A related arena where the conrmation bias has been studied widely is incounselor judgments counselors in clinical settings tend to conrm originalsuppositions in their eventual judgments If you are told ahead of time that aninterviewee is combative then both your conduct and your interpretation of hisconduct during an interview may reinforce that supposition even if he is in fact nomore combative than the average person See eg Haverkamp 1993 There hasalso been extensive research on conrmatory bias in the interviewing process moregenerally see eg Dougherty Turban and Callender 1994 and Macan andDipboye 1994 Research applying variants of conrmatory bias to other domainsincludes Arkes 1989 and Borum Otto and Golding 1993 to the law BaumannDeber and Thompson 1991 to medicine and Souter 1993 discusses theimplications of overcondence to business insurance

QUARTERLY JOURNAL OF ECONOMICS42

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 3: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Because we assume that the agent always correctly interpretsevidence that conrms his current beliefs relative to properBayesian updating he is biased toward conrming his currenthypothesis

So for example a teacher may believe either that Marta issmarter than Bart or that Bart is smarter than Marta he initiallybelieves each is equally likely and over time he collects a series ofsignals that help him to identify who is smarter If after receivingone or more signals the teacher believes that Marta is probablysmarter than Bart conrmatory bias may lead him to erroneouslyinterpret his next signal as supporting this hypothesis Thereforethe teacherrsquos updated belief that Marta is smarter than Bart maybe stronger than is warranted

The notion that the teacher is likely to believe lsquolsquotoo stronglyrsquorsquothat Marta is smarter corresponds to the commonly held intuitionthat conrmatory bias leads to overcondence While qualifyingthis intuition with several caveats our model by and largeconrms it given any probabilistic assessment by an agent thatone of the hypotheses is probably true the appropriate beliefsshould on average deem it less likely to be true Intuitively aperson who believes strongly in a hypothesis is likely to havemisinterpreted some signals that conict with what he believesand hence is likely to have received more evidence against hisbelieved hypothesis than he realizes

Our analysis shows that a more surprising result arises whenconrmatory bias is severe a Bayesian observer with no directinformation of her own but who can observe the agentrsquos belief infavor of one hypothesis may herself believe that the otherhypothesis is more likely We show that such lsquolsquowrongnessrsquorsquo canarise when the agentrsquos evidence is sufficiently mixed Intuitively ifthe agent has perceived almost as much evidence against hishypothesis as supporting it then since some of the evidence heperceives as supportive is actually not supportive it is likely thata majority of the real signals oppose his hypothesis Because suchwrongness only arises when the agent has relatively weak evi-dence supporting his favored hypothesis however the agent onaverage correctly judges which of the two hypotheses is morelikely in the sense that his best guess is right most of the time

While seemingly straightforward the intuition for our over-condence and wrongness results conceals some subtle implica-tions of the agentrsquos conrmatory bias For example an agent whocurrently believes in Hypothesis A (say) may once have believed inHypothesis B at which time he had a propensity to misread as

FIRST IMPRESSIONS MATTER 39

supporting Hypothesis B evidence actually in favor of HypothesisA But then the agent may underestimate how many signalssupporting Hypothesis A he has received and thus he may beundercondent in his belief in favor of Hypothesis A Indeed weshow that an agent who has only recently come to believe in ahypothesis is likely to be undercondent in that hypothesisbecause until recently he has been biased against his currenthypothesis If a teacher used to think Bart was smarter thanMarta and only recently concluded that Marta is smarter thenprobably he has been ignoring evidence all along that Marta issmarter The simple overcondence and wrongness results holdbecause an agent has probably believed in his currently heldhypothesis during most of the time he has been receiving informa-tion and so on average has been biased toward this hypothesis

In Section IV we investigate the implications of conrmatorybias after the agent receives an innite sequence of signals In theabsence of conrmatory bias an agent will always come to believewith near certainty in the correct hypothesis if he receives aninnite sequence of signals If the conrmatory bias is sufficientlysevere or the strength of individual signals is weak however thenwith positive probability the agent may come to believe with nearcertainty that the incorrect hypothesis is true Intuitively oncethe agent comes to believe in an incorrect hypothesis the conrma-tory bias inhibits his ability to overturn his erroneous beliefs Ifthe bias is strong enough the expected drift once the agent comesto believe in the false hypothesis is toward believing more stronglyin that hypothesis guaranteeing a positive probability that theagent ends up forever believing very strongly in the false hy-pothesis The results of Section IV belie the common intuition thatlearning will eventually correct cognitive biases While this is truefor sufficiently mild conrmatory bias when the bias is suffi-ciently severe lsquolsquolearningrsquorsquo can exacerbate the bias

The premise of this paper is that explicit formalizations ofdepartures from Bayesian information processing are crucial toincorporating psychological biases into economic analysis For themost part we do not in this paper take the important next step ofdeveloping extended economic applications of the bias we modelIn Section V however we illustrate one implication of conrma-tory bias by sketching a simple principal-agent model We illus-trate how a principal may wish to mute the incentives that sheoffers an agent who suffers from conrmatory bias Indeed we

QUARTERLY JOURNAL OF ECONOMICS40

show that even if it is very easy for an agent to gather informationso that a principal can at negligible costs provide incentives for anagent to search for protable investment opportunities theprincipal may choose not to provide these incentives to a conrma-tory agent This arises when the expected costs in terms of anovercondent agent investing too much in risky projects outweighthe expected benet of the agent being better informed Weconclude in Section VI by discussing some other potential eco-nomic implications of conrmatory bias as well as highlightingsome likely obstacles to applying our model

II A REVIEW OF THE PSYCHOLOGY LITERATURE

Many different strands of psychological research yield evi-dence on phenomena that we are modeling under the rubric ofconrmatory bias Before reviewing this literature we rst wishto distinguish a form of lsquolsquoquasi-Bayesianrsquorsquo information processingfrom the bias we are examining Although the two phenomena arerelatedmdashand not always distinguished clearly in the psychologyliteraturemdashthey differ importantly in their implications for deci-sion theory Suppose that once they form a strong hypothesispeople simply stop being attentive to relevant new informationthat contradicts or supports their hypotheses Intuitively whenyou become convinced that one investment strategy is morelucrative than another you may simply stop paying attention toeven freely available additional information3

Bruner and Potter 1964 elegantly demonstrate such anchor-ing About 90 subjects were shown blurred pictures that weregradually brought into sharper focus Different subjects beganviewing the pictures at different points in the focusing processbut the pace of the focusing process and nal degree of focus wereidentical for all subjects Strikingly of those subjects who begantheir viewing at a severe-blur stage less than a quarter eventu-ally identied the pictures correctly whereas over half of thosewho began viewing at a light-blur stage were able to correctlyidentify the pictures Bruner and Potter p 424 conclude thatlsquolsquoInterference may be accounted for partly by the difficulty ofrejecting incorrect hypotheses based on substandard cuesrsquorsquo That

3 Such behavior corresponds to a natural economic lsquolsquocognitive-searchrsquorsquo modelif we posit a cost to information processing in many settings the natural stoppingrule would be to process information until beliefs are sufficiently strong in onedirection or another and then stop

FIRST IMPRESSIONS MATTER 41

is people who use weak evidence to form initial hypotheses havedifficulty correctly interpreting subsequent better informationthat contradicts those initial hypotheses4

This form of anchoring does not necessarily imply that peoplemisinterpret additional evidence to either disconrm or conrminitial hypotheses only that they ignore additional evidence Sucha tendency to anchor on initial hypotheses can therefore bereconciled with Bayesian information processing While suchanchoring is potentially quite important psychological evidencereveals a stronger and more provocative phenomenon people tendto misread evidence as additional support for initial hypotheses Ifa teacher initially believes that one student is smarter thananother she has the propensity to conrm that hypothesis wheninterpreting later performance5 Lord Ross and Lepper 1979 p2099 posited some of the underlying cognitive mechanismsinvolved in such propensities

there is considerable evidence that people tend to interpret subse-quent evidence so as to maintain their initial beliefs The biased assimilationprocesses underlying this effect may include a propensity to remember thestrengths of conrming evidence but the weaknesses of disconrmingevidence to judge conrming evidence as relevant and reliablebut disconrm-ing evidence as irrelevant and unreliable and to accept conrming evidenceat face value while scrutinizing disconrming evidence hypercritically Withconrming evidence we suspect that both lay and professional scientistsrapidly reduce the complexity of the information and remember only a fewwell-chosen supportive impressions With disconrming evidence they con-tinue to reect upon any information that suggests less damaging lsquolsquoalterna-tive interpretationsrsquorsquo Indeed they may even come to regard the ambiguitiesand conceptual aws in the data opposing their hypotheses as somehowsuggestive of the fundamental correctness of those hypotheses Thus com-

4 A similar experiment Wyatt and Campbell 1951 was cited by Perkins1981 as one interpretation of the perspective that lsquolsquofreshrsquorsquo thinkers may be betterat seeing solutions to problems than people who have meditated at length on theproblems because the fresh thinkers are not overwhelmed by the lsquolsquointerferencersquorsquo ofold hypotheses

5 A related arena where the conrmation bias has been studied widely is incounselor judgments counselors in clinical settings tend to conrm originalsuppositions in their eventual judgments If you are told ahead of time that aninterviewee is combative then both your conduct and your interpretation of hisconduct during an interview may reinforce that supposition even if he is in fact nomore combative than the average person See eg Haverkamp 1993 There hasalso been extensive research on conrmatory bias in the interviewing process moregenerally see eg Dougherty Turban and Callender 1994 and Macan andDipboye 1994 Research applying variants of conrmatory bias to other domainsincludes Arkes 1989 and Borum Otto and Golding 1993 to the law BaumannDeber and Thompson 1991 to medicine and Souter 1993 discusses theimplications of overcondence to business insurance

QUARTERLY JOURNAL OF ECONOMICS42

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 4: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

supporting Hypothesis B evidence actually in favor of HypothesisA But then the agent may underestimate how many signalssupporting Hypothesis A he has received and thus he may beundercondent in his belief in favor of Hypothesis A Indeed weshow that an agent who has only recently come to believe in ahypothesis is likely to be undercondent in that hypothesisbecause until recently he has been biased against his currenthypothesis If a teacher used to think Bart was smarter thanMarta and only recently concluded that Marta is smarter thenprobably he has been ignoring evidence all along that Marta issmarter The simple overcondence and wrongness results holdbecause an agent has probably believed in his currently heldhypothesis during most of the time he has been receiving informa-tion and so on average has been biased toward this hypothesis

In Section IV we investigate the implications of conrmatorybias after the agent receives an innite sequence of signals In theabsence of conrmatory bias an agent will always come to believewith near certainty in the correct hypothesis if he receives aninnite sequence of signals If the conrmatory bias is sufficientlysevere or the strength of individual signals is weak however thenwith positive probability the agent may come to believe with nearcertainty that the incorrect hypothesis is true Intuitively oncethe agent comes to believe in an incorrect hypothesis the conrma-tory bias inhibits his ability to overturn his erroneous beliefs Ifthe bias is strong enough the expected drift once the agent comesto believe in the false hypothesis is toward believing more stronglyin that hypothesis guaranteeing a positive probability that theagent ends up forever believing very strongly in the false hy-pothesis The results of Section IV belie the common intuition thatlearning will eventually correct cognitive biases While this is truefor sufficiently mild conrmatory bias when the bias is suffi-ciently severe lsquolsquolearningrsquorsquo can exacerbate the bias

The premise of this paper is that explicit formalizations ofdepartures from Bayesian information processing are crucial toincorporating psychological biases into economic analysis For themost part we do not in this paper take the important next step ofdeveloping extended economic applications of the bias we modelIn Section V however we illustrate one implication of conrma-tory bias by sketching a simple principal-agent model We illus-trate how a principal may wish to mute the incentives that sheoffers an agent who suffers from conrmatory bias Indeed we

QUARTERLY JOURNAL OF ECONOMICS40

show that even if it is very easy for an agent to gather informationso that a principal can at negligible costs provide incentives for anagent to search for protable investment opportunities theprincipal may choose not to provide these incentives to a conrma-tory agent This arises when the expected costs in terms of anovercondent agent investing too much in risky projects outweighthe expected benet of the agent being better informed Weconclude in Section VI by discussing some other potential eco-nomic implications of conrmatory bias as well as highlightingsome likely obstacles to applying our model

II A REVIEW OF THE PSYCHOLOGY LITERATURE

Many different strands of psychological research yield evi-dence on phenomena that we are modeling under the rubric ofconrmatory bias Before reviewing this literature we rst wishto distinguish a form of lsquolsquoquasi-Bayesianrsquorsquo information processingfrom the bias we are examining Although the two phenomena arerelatedmdashand not always distinguished clearly in the psychologyliteraturemdashthey differ importantly in their implications for deci-sion theory Suppose that once they form a strong hypothesispeople simply stop being attentive to relevant new informationthat contradicts or supports their hypotheses Intuitively whenyou become convinced that one investment strategy is morelucrative than another you may simply stop paying attention toeven freely available additional information3

Bruner and Potter 1964 elegantly demonstrate such anchor-ing About 90 subjects were shown blurred pictures that weregradually brought into sharper focus Different subjects beganviewing the pictures at different points in the focusing processbut the pace of the focusing process and nal degree of focus wereidentical for all subjects Strikingly of those subjects who begantheir viewing at a severe-blur stage less than a quarter eventu-ally identied the pictures correctly whereas over half of thosewho began viewing at a light-blur stage were able to correctlyidentify the pictures Bruner and Potter p 424 conclude thatlsquolsquoInterference may be accounted for partly by the difficulty ofrejecting incorrect hypotheses based on substandard cuesrsquorsquo That

3 Such behavior corresponds to a natural economic lsquolsquocognitive-searchrsquorsquo modelif we posit a cost to information processing in many settings the natural stoppingrule would be to process information until beliefs are sufficiently strong in onedirection or another and then stop

FIRST IMPRESSIONS MATTER 41

is people who use weak evidence to form initial hypotheses havedifficulty correctly interpreting subsequent better informationthat contradicts those initial hypotheses4

This form of anchoring does not necessarily imply that peoplemisinterpret additional evidence to either disconrm or conrminitial hypotheses only that they ignore additional evidence Sucha tendency to anchor on initial hypotheses can therefore bereconciled with Bayesian information processing While suchanchoring is potentially quite important psychological evidencereveals a stronger and more provocative phenomenon people tendto misread evidence as additional support for initial hypotheses Ifa teacher initially believes that one student is smarter thananother she has the propensity to conrm that hypothesis wheninterpreting later performance5 Lord Ross and Lepper 1979 p2099 posited some of the underlying cognitive mechanismsinvolved in such propensities

there is considerable evidence that people tend to interpret subse-quent evidence so as to maintain their initial beliefs The biased assimilationprocesses underlying this effect may include a propensity to remember thestrengths of conrming evidence but the weaknesses of disconrmingevidence to judge conrming evidence as relevant and reliablebut disconrm-ing evidence as irrelevant and unreliable and to accept conrming evidenceat face value while scrutinizing disconrming evidence hypercritically Withconrming evidence we suspect that both lay and professional scientistsrapidly reduce the complexity of the information and remember only a fewwell-chosen supportive impressions With disconrming evidence they con-tinue to reect upon any information that suggests less damaging lsquolsquoalterna-tive interpretationsrsquorsquo Indeed they may even come to regard the ambiguitiesand conceptual aws in the data opposing their hypotheses as somehowsuggestive of the fundamental correctness of those hypotheses Thus com-

4 A similar experiment Wyatt and Campbell 1951 was cited by Perkins1981 as one interpretation of the perspective that lsquolsquofreshrsquorsquo thinkers may be betterat seeing solutions to problems than people who have meditated at length on theproblems because the fresh thinkers are not overwhelmed by the lsquolsquointerferencersquorsquo ofold hypotheses

5 A related arena where the conrmation bias has been studied widely is incounselor judgments counselors in clinical settings tend to conrm originalsuppositions in their eventual judgments If you are told ahead of time that aninterviewee is combative then both your conduct and your interpretation of hisconduct during an interview may reinforce that supposition even if he is in fact nomore combative than the average person See eg Haverkamp 1993 There hasalso been extensive research on conrmatory bias in the interviewing process moregenerally see eg Dougherty Turban and Callender 1994 and Macan andDipboye 1994 Research applying variants of conrmatory bias to other domainsincludes Arkes 1989 and Borum Otto and Golding 1993 to the law BaumannDeber and Thompson 1991 to medicine and Souter 1993 discusses theimplications of overcondence to business insurance

QUARTERLY JOURNAL OF ECONOMICS42

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 5: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

show that even if it is very easy for an agent to gather informationso that a principal can at negligible costs provide incentives for anagent to search for protable investment opportunities theprincipal may choose not to provide these incentives to a conrma-tory agent This arises when the expected costs in terms of anovercondent agent investing too much in risky projects outweighthe expected benet of the agent being better informed Weconclude in Section VI by discussing some other potential eco-nomic implications of conrmatory bias as well as highlightingsome likely obstacles to applying our model

II A REVIEW OF THE PSYCHOLOGY LITERATURE

Many different strands of psychological research yield evi-dence on phenomena that we are modeling under the rubric ofconrmatory bias Before reviewing this literature we rst wishto distinguish a form of lsquolsquoquasi-Bayesianrsquorsquo information processingfrom the bias we are examining Although the two phenomena arerelatedmdashand not always distinguished clearly in the psychologyliteraturemdashthey differ importantly in their implications for deci-sion theory Suppose that once they form a strong hypothesispeople simply stop being attentive to relevant new informationthat contradicts or supports their hypotheses Intuitively whenyou become convinced that one investment strategy is morelucrative than another you may simply stop paying attention toeven freely available additional information3

Bruner and Potter 1964 elegantly demonstrate such anchor-ing About 90 subjects were shown blurred pictures that weregradually brought into sharper focus Different subjects beganviewing the pictures at different points in the focusing processbut the pace of the focusing process and nal degree of focus wereidentical for all subjects Strikingly of those subjects who begantheir viewing at a severe-blur stage less than a quarter eventu-ally identied the pictures correctly whereas over half of thosewho began viewing at a light-blur stage were able to correctlyidentify the pictures Bruner and Potter p 424 conclude thatlsquolsquoInterference may be accounted for partly by the difficulty ofrejecting incorrect hypotheses based on substandard cuesrsquorsquo That

3 Such behavior corresponds to a natural economic lsquolsquocognitive-searchrsquorsquo modelif we posit a cost to information processing in many settings the natural stoppingrule would be to process information until beliefs are sufficiently strong in onedirection or another and then stop

FIRST IMPRESSIONS MATTER 41

is people who use weak evidence to form initial hypotheses havedifficulty correctly interpreting subsequent better informationthat contradicts those initial hypotheses4

This form of anchoring does not necessarily imply that peoplemisinterpret additional evidence to either disconrm or conrminitial hypotheses only that they ignore additional evidence Sucha tendency to anchor on initial hypotheses can therefore bereconciled with Bayesian information processing While suchanchoring is potentially quite important psychological evidencereveals a stronger and more provocative phenomenon people tendto misread evidence as additional support for initial hypotheses Ifa teacher initially believes that one student is smarter thananother she has the propensity to conrm that hypothesis wheninterpreting later performance5 Lord Ross and Lepper 1979 p2099 posited some of the underlying cognitive mechanismsinvolved in such propensities

there is considerable evidence that people tend to interpret subse-quent evidence so as to maintain their initial beliefs The biased assimilationprocesses underlying this effect may include a propensity to remember thestrengths of conrming evidence but the weaknesses of disconrmingevidence to judge conrming evidence as relevant and reliablebut disconrm-ing evidence as irrelevant and unreliable and to accept conrming evidenceat face value while scrutinizing disconrming evidence hypercritically Withconrming evidence we suspect that both lay and professional scientistsrapidly reduce the complexity of the information and remember only a fewwell-chosen supportive impressions With disconrming evidence they con-tinue to reect upon any information that suggests less damaging lsquolsquoalterna-tive interpretationsrsquorsquo Indeed they may even come to regard the ambiguitiesand conceptual aws in the data opposing their hypotheses as somehowsuggestive of the fundamental correctness of those hypotheses Thus com-

4 A similar experiment Wyatt and Campbell 1951 was cited by Perkins1981 as one interpretation of the perspective that lsquolsquofreshrsquorsquo thinkers may be betterat seeing solutions to problems than people who have meditated at length on theproblems because the fresh thinkers are not overwhelmed by the lsquolsquointerferencersquorsquo ofold hypotheses

5 A related arena where the conrmation bias has been studied widely is incounselor judgments counselors in clinical settings tend to conrm originalsuppositions in their eventual judgments If you are told ahead of time that aninterviewee is combative then both your conduct and your interpretation of hisconduct during an interview may reinforce that supposition even if he is in fact nomore combative than the average person See eg Haverkamp 1993 There hasalso been extensive research on conrmatory bias in the interviewing process moregenerally see eg Dougherty Turban and Callender 1994 and Macan andDipboye 1994 Research applying variants of conrmatory bias to other domainsincludes Arkes 1989 and Borum Otto and Golding 1993 to the law BaumannDeber and Thompson 1991 to medicine and Souter 1993 discusses theimplications of overcondence to business insurance

QUARTERLY JOURNAL OF ECONOMICS42

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 6: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

is people who use weak evidence to form initial hypotheses havedifficulty correctly interpreting subsequent better informationthat contradicts those initial hypotheses4

This form of anchoring does not necessarily imply that peoplemisinterpret additional evidence to either disconrm or conrminitial hypotheses only that they ignore additional evidence Sucha tendency to anchor on initial hypotheses can therefore bereconciled with Bayesian information processing While suchanchoring is potentially quite important psychological evidencereveals a stronger and more provocative phenomenon people tendto misread evidence as additional support for initial hypotheses Ifa teacher initially believes that one student is smarter thananother she has the propensity to conrm that hypothesis wheninterpreting later performance5 Lord Ross and Lepper 1979 p2099 posited some of the underlying cognitive mechanismsinvolved in such propensities

there is considerable evidence that people tend to interpret subse-quent evidence so as to maintain their initial beliefs The biased assimilationprocesses underlying this effect may include a propensity to remember thestrengths of conrming evidence but the weaknesses of disconrmingevidence to judge conrming evidence as relevant and reliablebut disconrm-ing evidence as irrelevant and unreliable and to accept conrming evidenceat face value while scrutinizing disconrming evidence hypercritically Withconrming evidence we suspect that both lay and professional scientistsrapidly reduce the complexity of the information and remember only a fewwell-chosen supportive impressions With disconrming evidence they con-tinue to reect upon any information that suggests less damaging lsquolsquoalterna-tive interpretationsrsquorsquo Indeed they may even come to regard the ambiguitiesand conceptual aws in the data opposing their hypotheses as somehowsuggestive of the fundamental correctness of those hypotheses Thus com-

4 A similar experiment Wyatt and Campbell 1951 was cited by Perkins1981 as one interpretation of the perspective that lsquolsquofreshrsquorsquo thinkers may be betterat seeing solutions to problems than people who have meditated at length on theproblems because the fresh thinkers are not overwhelmed by the lsquolsquointerferencersquorsquo ofold hypotheses

5 A related arena where the conrmation bias has been studied widely is incounselor judgments counselors in clinical settings tend to conrm originalsuppositions in their eventual judgments If you are told ahead of time that aninterviewee is combative then both your conduct and your interpretation of hisconduct during an interview may reinforce that supposition even if he is in fact nomore combative than the average person See eg Haverkamp 1993 There hasalso been extensive research on conrmatory bias in the interviewing process moregenerally see eg Dougherty Turban and Callender 1994 and Macan andDipboye 1994 Research applying variants of conrmatory bias to other domainsincludes Arkes 1989 and Borum Otto and Golding 1993 to the law BaumannDeber and Thompson 1991 to medicine and Souter 1993 discusses theimplications of overcondence to business insurance

QUARTERLY JOURNAL OF ECONOMICS42

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 7: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

pletely inconsistent or even random datamdashwhen lsquolsquoprocessedrsquorsquo in a suitablybiased fashionmdashcan maintain or even reinforce onersquos preconceptions

The most striking evidence for the conrmatory bias is aseries of experiments demonstrating how providing the sameambiguous information to people who differ in their initial beliefson some topic can move their beliefs farther apart To illustratesuch polarization Lord Ross and Lepper 1979 asked 151undergraduates to complete a questionnaire that included threequestions on capital punishment Later 48 of these students wererecruited to participate in another experiment Twenty-four ofthem were selected because their answers to the earlier question-naire indicated that they were lsquolsquo lsquoproponentsrsquo who favored capitalpunishment believed it to have a deterrent effect and thoughtmost of the relevant research supported their own beliefs Twenty-four were opponents who opposed capital punishment doubted itsdeterrent effect and thought that the relevant research supportedtheir viewsrsquorsquo These subjects were then asked to judge the merits ofrandomly selected studies on the deterrent efficacy of the deathpenalty and to state whether a given study (along with criticismsof that study) provided evidence for or against the deterrencehypothesis Subjects were then asked to rate on 16 point scalesranging from 2 8 to 1 8 how the studies they had read moved theirattitudes toward the death penalty and how they had changedtheir beliefs regarding its deterrent efficacy Lord Ross andLepper pp 2102ndash2104 summarize the basic results (all of whichhold with condence p 01) as follows

The relevant data provide strong support for the polarization hypothe-sis Asked for their nal attitudes relative to the experimentrsquos startproponents reported that they were more in favor of capital punishmentwhereas opponents reported that they were less in favor of capital punish-ment Similar results characterized subjectsrsquo beliefs about deterrentefficacy Proponents reported greater belief in the deterrent effect of capitalpunishment whereas opponents reported less belief in this deterrent effect

Plous 1991 replicates the Lord-Ross-Lepper results in thecontext of judgments about the safety of nuclear technology Pro-and antinuclear subjects were given identical information andarguments regarding the Three Mile Island nuclear disaster anda case of false military alert that could have led to the launching ofU S nuclear missiles Plous p 1068 found that 54 percent ofpronuclear subjects became more pronuclear from the informa-tion while only 7 percent became less pronuclear By contrast

FIRST IMPRESSIONS MATTER 43

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 8: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

only 7 percent of the antinuclear subjects became less antinuclearfrom the information while 45 percent became more antinuclear6

Darley and Gross 1983 demonstrate a related and similarlystriking form of polarization due to conrmatory bias Seventyundergraduates were asked to assess a nine-year-old girlrsquos aca-demic skills in several different academic areas Before complet-ing this task the students received information about the girl andher family and viewed a video tape of the girl playing in aplayground One group of subjects was given a fact sheet thatdescribed the girlrsquos parents as college graduates who held white-collar jobs these students viewed a video of the girl playing inwhat appeared to be a well-to-do middle class neighborhood Theother group of subjects was given a fact sheet that described thegirlrsquos parents as high school graduates who held blue-collar jobsthese students viewed a video of the same girl playing in whatappeared to be an impoverished inner-city neighborhood Half ofeach group of subjects were then asked to evaluate the girlrsquosreading level measured in terms of equivalent grade level7 Therewas a small difference in the two groupsrsquo estimatesmdashthosesubjects who had viewed the lsquolsquoinner-cityrsquorsquo video rated the girlrsquos skilllevel at an average of 390 (ie 910 through third grade) whilethose who had viewed the lsquolsquosuburban videorsquorsquo rated the girlrsquos skilllevel at an average of 429 The remaining subjects in each groupwere shown a second video of the girl answering (with mixedsuccess) a series of questions Afterwards they were asked to

6 These percentages were derived from Table 2 of Plous 1991 p 1068aggregating across two studies the remaining subjects in each case reported nochange in beliefs For other papers following on Lord Ross and Lepper 1979 seeFleming andArrowood 1979 Jennings Lepper and Ross 1981 Hubbard 1984Lepper Ross and Lau 1986 See also Miller McHoskey Bane and Dowd 1993for more mixed evidence regarding the Lord-Ross-Lepper experiment In thepassage above Lord Ross and Lepper posit that even professional scientists aresusceptible to such same-evidence polarization Indeed many economists andother academics have probably observed how differing schools of thought interpretambiguous evidence differently An example was once told to one of us by acolleague He saw the same modelmdashcalibrating the elasticity of demand facing aCournot oligopolist as a function of the number of rms in an industrymdashdescribedat the University of Chicago and at the Massachusetts Institute of Technology AChicago economist derived the formula and said lsquolsquoLook at how few rms you needto get close to innite elasticities and perfect competitionrsquorsquo An MIT economistderived the same formula and said lsquolsquoLook at how large n the number of rms hasto be before you get anywhere close to an innite elasticity and perfect competi-tionrsquorsquo These different schools each interpreted the same mathematical formula asevidence reinforcing their respective views For related analysis in the scienticdomain see also Mahoney 1977

7 The subjects were also asked to evaluate the girlrsquos mathematics and liberalarts skill levels we report the results that are least supportive of the existence ofconrmatory bias

QUARTERLY JOURNAL OF ECONOMICS44

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 9: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

evaluate the girlrsquos reading level The inner-city video group ratedthe girlrsquos skill level at an average of 371 signicantly below the390 estimate of the inner-city subjects who did not view thequestion-answer video Meanwhile the suburban video grouprated the girlrsquos skill level at an average of 467 signicantly abovethe 429 estimate of the suburban subjects who did not view thesecond video Even though the two groups viewed the identicalquestion-and-answer video the additional information furtherpolarized their assessments of the girlrsquos skill level Darley andGross interpret this result as evidence of conrmatory biasmdashsubjects were inuenced by the girlrsquos background in their initialjudgments but their beliefs were evidently inuenced even morestrongly by the effect their initial hypotheses had on theirinterpretation of further evidence8

Our reading of the psychology literature leads us to concludethat any of three different information-processing problems con-tribute to conrmatory bias First researchers widely recognizethat conrmatory bias and overcondence arise when people mustinterpret ambiguous evidence (see eg Keren 1987 and Griffinand Tversky 1992) Lord Ross and Lepperrsquos 1979 studydiscussed above clearly illustrates the point Keren 1988 notesthe lack of conrmatory bias in visual perceptions and concludesthat conrmatory tendency depends on some degree of abstractionand lsquolsquodiscriminationrsquorsquo (ie the need for interpretation) not presentin simple visual tasks A primary mechanism of stereotype-maintenance is our tendency to interpret ambiguous behavioraccording to previous stereotype9 Similarly a teacher may inter-pret an ambiguous answer by a student as either creative or justplain stupid according to his earlier impressions of the student

8 It should be noted that polarization of the form identied by Darley andGross 1983 provides more direct evidence of conrmatory bias than doespolarization identied by Lord Ross and Lepper 1979 and related papersAs JeffEly pointed out to us Lord Ross and Lepper permit an alternative interpretationthat some people are predisposed to interpret ambiguous evidence one way andsome the other Hence observing further polarization by groups who already differmay not reect conrmatory bias per se but underlying differences in interpreta-tion of evidence that would appear irrespective of subjectsrsquo current beliefs Whilethis interpretationalso departs from common-priors Bayesian information process-ing and will often yield similar implicationsas conrmatory bias it is conceptuallydistinct and would sometimes yield different predictions By demonstratingpolarization based on differing beliefs induced in two ex ante identical groups ofsubjects Darley and Gross are not subject to this alternative interpretation

9 A vast literature explores the mechanisms by which people retain ethnicgender and other group stereotypes See eg Hamilton and Rose 1980Bodenhausen and Wyer 1985 Bodenhausen and Lichtenstein 1987 Stangor1988 Stangor and Ruble 1989 and Hamilton Sherman and Ruvolo 1990

FIRST IMPRESSIONS MATTER 45

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 10: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

but will be less likely to biasedly interpret more objective feedbacksuch as answers to multiple-choice questions

Second conrmatory bias can arise when people must inter-pret statistical evidence to assess the correlation between phenom-ena that are separated by time Nisbett and Ross 1980 arguethat the inability to accurately identify such correlation (egbetween hyperactivity and sugar intake or between performanceon exams and the time of day the exams are held) is one of themost robust shortcomings in human reasoning10 People oftenimagine a correlation between events when no such correlationexists11 Jennings Amadibile and Ross 1982 argue that illusorycorrelation can play an important role in the conrmation of falsehypotheses nding that people underestimate correlation whenthey have no theory of the correlation but exaggerate correlationand see it where it is not when they have a preconceived theory ofit12

Third conrmatory bias occurs when people selectively col-lect or scrutinize evidence One form of lsquolsquoscrutiny-basedrsquorsquo conrma-tory bias is what we shall call hypothesis-based ltering13 While itis sensible to interpret ambiguous data according to currenthypotheses people tend to use the consequent lsquolsquolteredrsquorsquo evidence

10 As JenningsAmabile and Ross 1982 p 212 put it lsquolsquoeven the staunchestdefenders of the laypersonrsquos capacities as an intuitive scientist have had littlethat was attering to say about the laypersonrsquos handling of bivariate observationrsquorsquo

11 Chapman and Chapman 1967 1969 1971 demonstrate that cliniciansand laypeople often perceive entirely illusory correlation among (for instance)pictures and the personality traits of the people who drew the pictures Stangor1988 and Hamilton and Rose 1980 also discuss the role of illusory correlation inthe context of conrmatory-like phenomena

12 Similarly Redelmeier and Tversky 1996 argue illusory correlation mayhelp explain the persistent belief that arthritis pain is related to the weather

13 Another mechanism can be dened as lsquolsquopositive test strategyrsquorsquo People tendto ask questions (of others of themselves or of data) that are likely to be true iftheir hypothesis is truemdashwithout due regard to the fact that they are likely to betrue even if the hypothesis is false See Einhorn and Hogarth 1978 Klayman andHa 1987 Beattie and Baron 1988 Devine Hirt and Gehrke 1990 Hodginsand Zuckerman 1993 Friedrich 1993 and Zuckerman Knee Hodgins andMiyake 1995 We are using this term a bit differently than we suspectpsychologists would use it As far as we know the term was coined by Klayman andHa to point out that much of what was put under the rubric of conrmatory biascould indeed be a rational form of hypothesis testing Fischhoff and Beyth-Marom1983 pp 255ndash256 and Friedrich also point out that if people are fully aware thatasking lsquolsquosoftrsquorsquo questions teaches them little about the truth of hypotheses then nobias has occurred While we feel research on the positive test strategy needs morecareful calibration versus Bayesian updating we believe that the evidencesuggests that people do not fully appreciate how little they have learned about thevalidity of their hypotheses when asking soft questions (Mehle Gettys ManningBaca and Fisher 1981 for instance show that people with specied hypothesesfor observed data tend to overuse such hypotheses to explain the data because theydo not have lsquolsquoavailablersquorsquo the many unspecied hypothesis that could also explain thedata)

QUARTERLY JOURNAL OF ECONOMICS46

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 11: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

inappropriately as further evidence for these hypotheses If astudent gives an unclear answer to an exam question it isreasonable for a teacher to be inuenced in his evaluation of theanswer by his prior perceptions of that studentrsquos mastery of thematerial However after assigning differential grades to studentsaccording to differential interpretation of comparable answers itis a mistake to then use differential grades on the exam as furtherevidence of the differences in the studentsrsquo abilities14 This sort oferror is especially likely when the complexity and ambiguity ofevidence requires the use of prior theories when interpreting dataand deciding what data to examine15

Finally one of the main results in our model is conrmation ofthe conjecture common in the psychological literature that conr-matory bias leads to overcondence A vast body of psychologicalresearch separate from research on conrmatory bias nds thatpeople are prone toward overcondence in their judgments16

14 Lord Ross and Lepper 1979 pp 2106ndash2107 note a similar distinction inreecting on the bias in their experiment discussed above They note that it isproper for people to differentially assess probative value of different studiesaccording to their current beliefs about the merits of the death penalty The lsquolsquosinrsquorsquo isin using their hypothesis-based interpretations of the strength of different studiesas further support for their beliefs

15 We suspect that hypothesis-based ltering is especially important inunderstanding persistence and strengthening of beliefs in tenuous lsquolsquoscienticrsquorsquotheories Indeed Jon Elster drew our attention to an illustration by philosopherofscience Karl Popper 1963 pp 34ndash35 of conrmatory bias in intellectual pursuitsPopper observed that followers of Marx Freud and Adler found lsquolsquoconrmationrsquorsquoeverywhere and described the process by which they strengthened their convic-tion over time in terms remarkably similar to the process as wersquove described itbased on psychological research

Once your eyes were thus opened you saw conrming instances every-where the world was full of verications of the theory Whatever happenedalways conrmed it The most characteristic element in this situationseemed to me the incessant stream of conrmations As for Adler I wasmuch impressed by a personal experience Once in 1919 I reported to him acase which to me did not seem particularly Adlerian but which he found nodifficulty in analysing in terms of his theory of inferiority feelings althoughhe had not even seen the child Slightly shocked I asked him how he could beso sure lsquolsquoBecause of my thousandfold experiencersquorsquo he replied whereupon Icould not help saying lsquolsquoAnd with this new case I suppose your experiencehas become thousand-and-one-foldrsquorsquo

What I had in mind was that his previous observations may not havebeen much sounder than this new one that each in its turn had beeninterpreted in the light of lsquolsquoprevious experiencersquorsquo and at the same timecounted as additional conrmation

16 See eg Oskamp 1982 Mahajan 1992 and Paese and Kinnaly 1993An early paper that makes this point is Fischhoff Slovic and Lichtenstein 1977who also tested the robustness of overcondence with monetary stakes rather thanreported judgments No decrease in overcondence was found relative to theno-money-stakes condition (As Camerer 1995 notes there exist very fewconclusions reached by researchers on judgment that have been overturned when

FIRST IMPRESSIONS MATTER 47

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 12: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

III CONFIRMATORY BIAS AND BELIEF FORMATION

Consider two states of the world x [ AB where A and B aretwo exhaustive and mutually exclusive hypotheses regardingsome issue We consider an agent whose prior belief about x isgiven by prob (x 5 A) 5 prob (x 5 B) 5 05 so the agent initiallyviews the two alternative hypotheses as equally likely to be trueIn every period t [ 123 the agent receives a signal st [ ab that is correlated with the true state of the world Signals receivedat different times t are independently and identically distributedwith prob (st 5 a A) 5 prob (st 5 b B) 5 u for some u [ (51)After receiving each signal the agent updates his belief about therelative likelihood of x 5 A and x 5 B

To model conrmatory bias we suppose that the agent maymisinterpret signals that conict with his current belief aboutwhich hypothesis is more likely Suppose that given the signalsthe agent thinks he has observed in the rst t 2 1 periods hebelieves that state A is more likely than state B Because of hisconrmatory bias the agent may misread a conicting signal st 5b in the next period believing instead that he observes st 5 a

Formally in every period t [ 123 the agent perceives asignal s t [ a b When the agent perceives a signal s t 5 a hebelieves that he actually received a signal st 5 a and if heperceives s t 5 b he believes that he actually received a signal st 5b He updates his beliefs using Bayesrsquo Rule given his (possiblyerroneous) perceptions of the signals he is receiving We assumethat with probability q 0 the agent misreads a signal st thatconicts with his belief about which hypothesis is more likely andthat the agent always correctly interprets signals that conrm hisbelief If he currently believes that Hypothesis A is more likelythen for sure he interprets a signal st 5 a as s t 5 a but withprobability q he misreads st 5 b as s t 5 a

This model of conrmatory bias incorporates several unrealis-tic simplifying assumptions For instance we assume that theseverity of the bias summarized by q does not depend on thestrength of the agentrsquos beliefs about which of the two states ismore likely It would be reasonable to expect that q is greater if the

monetary stakes are added) There have however been criticisms of the evidencein support of overcondence See Bjorkman 1994 Pfeifer 1994 TomassiniSolomon Romney and Krogstad 1982 Van Lenthe 1993 and Winman andJuslin 1993 We feel nevertheless that the evidence makes a strong case forovercondence Indeed see Soll 1996 for evidence that overcondence doesextend to ecologically valid domains

QUARTERLY JOURNAL OF ECONOMICS48

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 13: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

agentrsquos beliefs are more extreme We conjecture that our qualita-tive results would continue to hold if we were to relax thisassumption Also we assume that the agent misreads conictingevidence as conrming evidence While we feel that this is oftenthe case a reasonable alternative model would be to assumeinstead that the agent merely has a tendency to overlook evidencethat conicts with his beliefs This model too would yield thesame qualitative results as our model intuitively ignoring thecounterhypothesis evidence in a cluster of mixed but mostlycounterhypothesis evidence is equivalent to misreading the wholecluster as hypothesis-supportive

The presence of conrmatory bias means that the agentrsquosperceived signals s t are neither independently nor identicallydistributed Suppose that after receiving signals st2 1 5 (s1 st2 1)the agent has perceived a sequence of signals s t 2 1 5 ( s 1 s t2 1)and holds beliefs prob (x 5 A s t2 1) Dene

u prob (s t 5 a prob (x 5 A s t 2 1) 05 x 5 B )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 A )

u prob ( s t 5 a prob (x 5 A s t 2 1) 05 x 5 A )

5 prob ( s t 5 b prob (x 5 B s t2 1) 05 x 5 B )

u and u summarize the distribution of the agentrsquos per-ceived signal s t when the agent believes that one hypothesis ismore likely than the other ie when prob (x 5 A s t 2 1) THORN 05 u isthe probability that the agent perceives a signal conrming hisbelief that one hypothesis is more likely when in fact the otherhypothesis is true u is the probability that the agent perceives asignal conrming his belief that a hypothesis is more likely whenin fact it is true Because with probability q the agent misreads asignal that conicts with his beliefs u 5 (1 2 u ) 1 q u and u 5u 1 q(1 2 u ) When prob (x 5 A s t 2 1) 5 05 ie when the agentbelieves that the two possible hypotheses are equally likely theagent does not suffer from conrmatory bias In this case hecorrectly perceives the signal that he receives and he updatesaccurately so u prob ( s t 5 a prob (x 5 A s t2 1) 5 05x 5 A) 5prob ( s t 5 b prob (x 5 B s t2 1) 5 05x 5 B)

If q 5 0 then the agent is an unbiased Bayesian statisticianwhile if q 5 1 the agentrsquos rst piece of information completelydetermines his nal belief since he always misreads signals that

FIRST IMPRESSIONS MATTER 49

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 14: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

conict with the rst signal he receives More generally thehigher is q the more extreme is the conrmatory bias

Suppose that the agent has perceived n a a signals and nb bsignals where n a nb Because the agent believes he has receivedn a a signals and nb b signals his updated posterior beliefs aregiven by

prob (x 5 A na nb ) 5u na 2 nb

u na 2 nb 1 (1 2 u )na 2 nb

Dene

L (na n b ) 5prob (x 5 A n a nb )

prob (x 5 B na n b )

L (n a nb ) represents the agentrsquos beliefs in terms of a relativelikelihood ratio Using Bayesrsquo Rule L (na nb ) 5 (u na 2 nb )(1 2 u )na 2 nb IfL (n a nb ) 1 the agent believes that A is more likely than B to bethe true state while if L (n a n b ) 1 the agent believes that B ismore likely than A If L (n a nb ) 5 1 the agent believes that the twostates are equally likely The agentrsquos interpretation of an addi-tional signal is biased whenever L (na n b ) THORN 1

In order to identify the effects of conrmatory bias it ishelpful to compare the agentrsquos beliefs with the beliefs of ahypothetical unbiased Bayesian observer who learns how many aand b signals the agent has perceived and who knows that theagent suffers from conrmatory bias Like the agent the Bayesianobserver initially believes that prob (x 5 A) 5 prob (x 5 B) 5 05and she has no independent information about whether x 5 A orx 5 B This hypothetical observerrsquos beliefs therefore reect thetrue probability that x 5 A and x 5 B given the signals that theagent has perceived

Dene L (n a nb ) as the Bayesian observerrsquos likelihood ratio ofA versus B when she knows that an agent who suffers fromconrmation bias has perceived n a a signals and n b b signalswhere na n b In general when q 0 the biased agentrsquoslikelihood ratio L (na nb ) and the unbiased observerrsquos likelihoodratio L (na n b ) are not equal If L (na nb ) when n a n b the agent isovercondent his belief in favor of the hypothesis that x 5 A isstronger than is justied by the available evidence Similarly ifL (n a nb ) L (n a nb ) the agent is undercondent in his belief thatx 5 A

QUARTERLY JOURNAL OF ECONOMICS50

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 15: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

In the formal results that we develop below we assume thatwhile the unbiased observer knows how many a and b signals theagent has perceived she does not know the order in which theagent perceived his signals But when q 0 the order of theagentrsquos perceived signals if known would inuence a Bayesianobserverrsquos beliefs since the agentrsquos conrmatory bias implies thathis perceived signals are not distributed independently Supposethat the agent has perceived three a signals and two b signals inwhich case his beliefs are L (na 5 3nb 5 2) 5 u (1 2 u ) If theBayesian observer knew the order of the agentrsquos signals herposterior belief L (na 5 3n b 5 2) could be less than greater thanor equal to u (1 2 u ) depending on the order of the signals Thusfrom the perspective of an outside observer the agent could beovercondent undercondent or perfectly calibrated in his beliefs

Suppose for example that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( a a a b b ) In thiscase the observerrsquos posterior likelihood ratio is

L 5u ( u 1 q(1 2 u ))2 (1 2 u )2

(1 2 u )(1 2 u 1 q u )2 u 2

5( u 1 q(1 2 u ))2(1 2 u )

(1 2 u 1 q u )2 u

u

1 2 u q [ (01

Intuitively the Bayesian observer recognizes the possibilitythat the agent may have misread his second and third signalsperceiving that they supported the hypothesis that x 5 A when infact one or both may have supported the hypothesis that x 5 BTherefore the Bayesian observer is less convinced that x 5 A thanthe agent who is overcondent in his belief More generally anobserver who knows that a biased agent has always believed in hiscurrent hypothesis should judge the agent to be overcondent inhis belief since there is a positive probability that the agent hasmisread signals that are counter to his favored hypothesis Anobserver who knows that a teacher has always believed that Bartis smarter than Marta should recognize that the teacherrsquos conr-matory bias may have led him to misread evidence that Marta isin fact smarter

Alternatively suppose that the Bayesian observer knew thatthe agentrsquos sequence of perceived signals was ( b b a a a ) Now the

FIRST IMPRESSIONS MATTER 51

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 16: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

observerrsquos posterior likelihood ratio is

L 5(1 2 u )(1 2 u 1 q u )u 3

u ( u 1 q(1 2 u ))(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

In this case the Bayesian observer believes that the agentmay have misread his second signal perceiving that it supportedthe hypothesis that x 5 B when in fact it may have supported thehypothesis that x 5 A Thus the Bayesian observer believes thatthere is a greater likelihood that x 5 A than the agent who isundercondent in his belief More generally an observer whoknows that a biased agent only recently came to believe in hiscurrent hypothesis after long believing in the opposite hypothesisshould judge the agent to be undercondent in his belief since theagent may have misread one or more signals that support hiscurrent hypothesis when he believed the opposite An observerwho knows that a teacher initially thought that Bart was smarterthan Marta but eventually started to believe that it was slightlymore likely that Marta was smarter than Bart should conjecturethat the teacher is undercondent about his new hypothesisWhen the teacher believed that Bart was smarter than Marta hemay have misinterpreted signals that Marta was smarter Thefact that the teacher came to believe that Marta was smarterdespite his initial bias toward believing that Bart was smarterindicates that the evidence is very strong that Marta is smarter17

The preceding examples illustrate how information about theorder of the agentrsquos signals would signicantly inuence an

17 As a discussant for this paper Roger Lagunoff made an interestingsuggestion that is especially relevant for the examples we are discussing here Inour model once the agent interprets a signal he never goes back and reinterpretsitmdasheven if he later changes his hypothesis about the world Hence we are notcapturing a form of belief updating we sometimesobserve when somebody (nally)comes around to change his world-view that he held for quite a while hesometimes experiences an epiphany whereby he goes back and reinterpretsprevious evidence in light of his new hypothesis realizing that lsquolsquothe signs werethere all alongrsquorsquo This suggests a model in which an agent is biased in interpretingnot just the next signal but all past signals as supporting his current hypothesisabout the world While we suspect there is some truth to this we donrsquot believe thatpeople fully retroactively rebias themselves in this way (We have found nopsychological evidence about this one way or another) While such an alternativemodel would rule out the possibility of lsquolsquoundercondencersquorsquo for recent converts itwould leave all the predictions regarding overcondence discussed in the remain-der of the paper qualitatively the same and magnify the magnitude of our results(and simplify the proofs)

QUARTERLY JOURNAL OF ECONOMICS52

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 17: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

outside observerrsquos judgment about whether and in what directionthe agentrsquos beliefs were biased Nevertheless for the remainder ofthe paper we assume that an outside observer only knows thenumber of a and b signals that the agent has received and not theorder in which he received them This assumption enables us toidentify whether on average the agent is over- or undercondentThis appears to be the question that the psychological literatureaddresses presumably it is also of interest to economists

Clearly if q 5 0 then L (na n b ) 5 L (na nb ) When q 0however Proposition 1 establishes that L (n a nb ) L (n a nb ) Thatis when the agent perceives that a majority of his signals support(say) Hypothesis A he believes in A with higher probability thanis warranted18

PROPOSITION 1 Suppose that n a n b and na 1 nb 1 ThenL (n a nb ) L (n a nb )

Proposition 1 establishes that an agent who suffers fromconrmatory bias will be overcondent in his belief about whichstate is most likely

An observer who knows the agentrsquos beliefs cannot usuallyobserve the exact sequence of the agentrsquos perceived signalsTherefore the observerrsquos judgment about whether the agent isunder- or overcondent depends on her belief regarding thelikelihood of the different possible sequences of signals Proposi-tion 1 establishes that overcondence is the dominant force Theintuition for this result is fairly straightforward if you cannotdirectly observe the agentrsquos past beliefs but you know that he nowbelieves in Hypothesis A you should surmise that on average hespent more time in the past believing Hypothesis A than Hy-pothesis B Consequently you should surmise that on averagethe agent misread more signals while believing in HypothesisAmdashcontributing to overcondencemdashthan he misread while believ-ing in Hypothesis Bmdashcontributing to undercondence Proposi-tion 1 hinges to some extent on our assumption that the agentreceives signals that are the same strength in every period Webelieve that (far more complicated) versions of Proposition 1 holdin more general models but we show in Appendix 1 that undercon-dence is sometimes possible when the agentrsquos signals are ofdifferent strengths in different periods

18 All proofs are in Appendix 2 Because our model is entirely symmetric weshall for convenience present all results and much of our discussion solely for thecase where A is perceived as more likely

FIRST IMPRESSIONS MATTER 53

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 18: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Proposition 1 shows that when the agent believes that thestate is x 5 A with probability micro 05 the true probability thatthe state is x 5 A is less than micro Interestingly the true probabilitythat A is the true state may be less than 05 meaning that B ismore likely than A The possibility that the agent may suffer notmerely from overcondence but also from lsquolsquowrongnessrsquorsquo ariseswhen the agentrsquos conrmatory bias is severe and he has perceivedat least two signals in favor of each hypothesis

To see the intuition for this result suppose that the agent hassince his rst signal s1 5 a believed that Hypothesis A is morelikely than B but that he nevertheless has perceived two signalss t 5 s t 5 b at two times t t8 1 If the agentrsquos conrmatory bias issevere (ie q lt 1) only his rst perceived signal in favor of Aprovides true evidence that x 5 A Once the agent believes that Ais true his conrmatory bias predisposes him to perceive thatsubsequent signals support this belief and therefore additionalsignals in favor of A are not very informative But because theagentrsquos two perceived signals in favor of B conict with what hebelievesmdashthat x 5 A is more likelymdashthey reect actual signals infavor of B Thus although the agent has always believed that x 5A is more likely he has effectively received only one signal in favorof A and two signals in favor of B In this case the agentrsquos beliefthat x 5 A represents extreme overcondence if he had correctlyinterpreted evidence he would believe that x 5 B is more likely

It is of course possible that hypothesis A is more likely thanthe agent realizes if he rst perceives a signal s1 5 b falsely readsarsquos as brsquos for a while and only later perceives enough a rsquos to come tobelieve in A And it is true that getting more true arsquos than true brsquosimplies that Hypothesis A is more likely Yet it can be shown thatthese possibilities may be far less likely than the cases leading toextreme overcondence so that the net effect that is more likelythat B is true than that A is true if the agent believes in A withmixed evidence

For example suppose that the agent has perceived sevensignals four a rsquos and three b rsquos Given these signals the agentrsquosposterior beliefs are L (na 5 4nb 5 3) 5 u (1 2 u ) 1 the agentbelieves that the state x 5 A is more likely Meanwhile the truelikelihood ratio is L (na 5 4nb 5 3) 5

(1 2 u )38 u 4 1 8 u 3u 1 7 u 2u 2 1 5 u u 3

1 (1 2 u )2 u 3u u 1 4 u 4u 1 2 u 4(1 2 u ) u 2

u 38(1 2 u )4 1 8(1 2 u )3u 1 7(1 2 u )2u 2 1 5(1 2 u ) u 3

1 u 2(1 2 u )3u u 1 4(1 2 u )4 u 1 2 u (1 2 u )4u 2

QUARTERLY JOURNAL OF ECONOMICS54

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 19: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Suppose that u 5 75 Then the agentrsquos posterior likelihood ratio isL (43) 5 3 Suppose further that q 5 95 and therefore the agentsuffers from severe conrmatory bias Then the true likelihoodratio is L (43) 5 63 and therefore x 5 B is more likely to be thetrue state despite the agent having perceived more a signals thanb signals

Indeed it turns out to be the case that when conrmatorybias is very severe and the signals are very informative thenwhenever you observe the agent believing in Hypothesis A andhaving perceived two or more b signals then you should assumethat it is more likely that B is true than A We formalize this inProposition 2 Let L (na n b q u ) be the appropriate beliefs as afunction of q and u Then

PROPOSITION 2 For na nb and nb 1 lim e 0 L (na n b 1 2 e 1 2 e ) 1 For all na nb $ 2 lime 0 L (na nb 1 2 e 1 2 e ) 1

That is for u and q both very close to 1 when the agent hasperceived one or fewer b signals and believes in Hypothesis A sheis probably correct (though overcondent) in her beliefs when theagent has perceived two or more b signals and believes inHypothesis A she is probably incorrect in her beliefsmdashHypothesisB is more likely to be true

We emphasize that the very premise of the proposition meansthat the situations to which it applies are uncommon when both qand u are close to 1 the probability of perceiving anything besidesa sequence of signals favoring the correct hypothesis is smallTherefore Proposition 2 tells us about a very low-probabilityevent In our example with seven signals q 5 95 and u 5 75 theprobability that the signals are sufficiently mixed that the agent isprobably wrong is a little more than one-half percent

While we do not know more generally the highest probabilitywith which the agent can be wrong some calibrations illustratethat it can be relatively likely that the agent ends up with beliefsthat a Bayesian observer would deem probably wrong TablesIndashIV display for various values of n u and q the probability thatL h and L 1 or L 1 h and L 1 where h representsdifferent thresholds for how wrong the agent is Table entries arein percentage terms (rounded to the nearest percent) with rowscorresponding to different values of q and columns to differentvalues of u (Dashes indicate an entry exactly equal to zero)19

For instance with u 5 6 and q 5 5 the probability that the

19 The entries in Tables IndashIV reect direct calculations (performed bycomputer) of the probabilities in question

FIRST IMPRESSIONS MATTER 55

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 20: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

TABLE IPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 12 2 0 mdash

3 21 9 1 0

4 29 15 5 1

5 27 18 10 3

6 33 22 12 5

7 27 21 15 7

8 33 24 15 8

9 21 17 12 9

TABLE IIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 12

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 10 5 1 mdash

4 15 13 5 1

5 19 18 9 3

6 16 18 12 5

7 18 21 13 7

8 11 16 15 7

9 4 8 12 6

TABLE IIIPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 50 h 5 19

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash 5 mdash mdash

5 3 12 7 1

6 3 14 11 4

7 2 11 12 6

8 1 6 12 7

9 0 4 7 6

QUARTERLY JOURNAL OF ECONOMICS56

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 21: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

agent has beliefs after 50 signals that the observer would deemprobably wrong is about 27 percent The probability in this samecase that his beliefs will lead the observer to believe in the otherhypothesis with at least probability 23 is 19 percent and theprobability that the observer would believe in the hypothesisopposite to the agentrsquos with at least 910 probability is about 3percent20

In the example above and in Proposition 2 the agent can bewrong in her beliefs Even more surprising perhaps the trueprobability that A is the correct hypothesis need not be monotoni-cally increasing in the proportion of a signals the agent perceivesContinue to assume that the agent has received seven signals butnow suppose that ve support x 5 A and two support x 5 B Thenbecause u 5 75 the agentrsquos posterior likelihood ratio is L (52) 527 L (43) Meanwhile the true likelihood ratio is

L (n a 5 5n b 5 2)

5(1 2 u )27 u 2u 3 1 9 u u 4 1 4 u 3u 2 1 (1 2 u ) u 3u 2u

u 27(1 2 u )2u 3 1 9(1 2 u ) u 4 1 4(1 2 u )3u 2 1 u (1 2 u )3u 2u

20 Readers may note that these probabilities generally increase in q andthen decrease with probability about 0 for q 5 0 and q 5 1 But they are notsingle-peaked in q This is because there are two factors at work in determiningtheinuence of q on the probability As q increases the probability that the agent willend up with close-to-even mixes of a and b signals decreases continuously Butbecause an increase in q increases the likelihood that any given combination of a rsquosand b rsquos involves the agent being probabilistically wrong there will be at certainpoints discrete jumps upward in the likelihood of wrongness for some values of qThe result is an extremely poorly behaved function

TABLE IVPROBABILITY OF lsquolsquoWRONGNESSrsquorsquo n 5 7 h 5 1

q

u

06 07 07 09

1 mdash mdash mdash mdash

2 mdash mdash mdash mdash

3 mdash mdash mdash mdash

4 mdash mdash mdash mdash

5 mdash mdash mdash 2

6 mdash 5 3 1

7 3 3 2 5

8 10 8 6 3

9 3 2 2 1

FIRST IMPRESSIONS MATTER 57

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 22: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Maintaining the assumption that q 5 95 L (52) 5 62 L (43) 5 63 Therefore the relative likelihood that the true stateis x 5 A versus x 5 B is smaller if the agent perceives that ve outof seven signals support x 5 A than if he perceives that only fourout of seven signals support x 5 A

While seemingly counterintuitive this result reects the factthat the agent is more likely to have perceived (truly informative)signals s j 5 b that conict with a belief that x 5 A when he hasperceived only two signals in favor of B than when he hasperceived three signals in favor of B Intuitively the agent is morelikely to have believed for many periods that x 5 A in the formercase than in the latter case Put differently the agent is less likelyto have perceived (truly informative) signals s j 5 a that conictwith a belief that x 5 B when he has perceived only two signals infavor of B than when he has perceived three signals in favor of B

The preceding examples illustrate that an agent who suffersfrom conrmatory bias may believe that one of the two possiblestates is more likely than the other when in fact the reverse istrue Nevertheless Proposition 3 shows that a Bayesian observerwho knows only that a biased agent believes that x 5 A is morelikely than x 5 B will herself believe that x 5 A is more likelyTherefore an agent who suffers from conrmatory bias will lsquolsquoonaveragersquorsquo correctly judge which of the two possible states is morelikely though as Proposition 1 establishes he will always beovercondent in his belief

Dene L (n) as the likelihood ratio of a Bayesian observerwho knows that a conrmatory agent has perceived a total of nsignals and knows that n a nb but does not know the exactvalues of n a and nb That is the observer knows only that theagent believes A is more likely than B but observes nothing aboutthe strength of his beliefs Then

PROPOSITION 3 For all n L (n) 1

In light of the above examples where the agent may be wrongthe simple generality of Proposition 3 may seem surprising It isreconciled with the examples by observing that the agent suffersfrom lsquolsquowrongnessrsquorsquo only when his conrmatory bias is very severemeaning that q is close to 1 and yet he has perceived mixedsignals about which state is more likely But the agent is unlikelyto receive mixed signals when his conrmatory bias is strongbecause each signal s t will tend to mirror s 1

QUARTERLY JOURNAL OF ECONOMICS58

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 23: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

IV BELIEFS AFTER AN INFINITE NUMBER OF SIGNALS

A fully Bayesian agentmdashfor whom q 5 0mdashwill after aninnite number of signals come to believe with near certainty inthe correct hypothesis We now investigate the implications ofconrmatory bias in the limit as an agent receives an innitenumber of signals

We begin with denitions and a lemma that will help toanalyze this question Suppose that the agent has thus farreceived m 5 na 2 nb 0 more perceived signals in support ofHypothesis A than in support of Hypothesis B Suppose furtherthat as long as na nb prob ( s t 5 a ) 5 g Note that g 5 u if B istrue and g 5 u if A is true We wish to consider somepreliminary results that hold in either case We dene p(m g ) asthe probability that there exists some time in the future when theagent will have received an equal number of a and b signals (Atthat time the agentrsquos posterior belief is the same as his prior beliefprob (x 5 A) 5 05) We have the following lemma which is arestatement of a well-known result from Feller 1968 pp 344ndash347)

LEMMA 1 For all m 0 g $ 05 p(m g ) 5 (1 2 g )g m For g 05p(m g ) 5 1

We dene PW as the probability that the agent beginningwith the prior belief prob (x 5 A) 5 05 comes to believe withcertainty in the wrong hypothesis after receiving an innitenumber of signals21 That is PW is the probability that althoughthe true state is x 5 A the agent instead comes to believeirreversibly with near certainty that x 5 B Proposition 4characterizes PW as a function of q and u

PROPOSITION 4 If q 1 2 1(2 u ) then

PW 5(1 2 u ) middot (1 2 (1 2 u )u )

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u )) 0

If q 1 2 1(2 u ) then PW 5 0

When q 1 2 1(2 u ) u 5 (1 2 u ) 1 q u 05 When u 05then once the agent comes to believe that the wrong hypothesisabout x is more likely he is consequently more likely to receive a

21 Formally PW prob ( k 0 and e 0 rsquo n such that prob (n a 2 nb kfor all n n) 1 2 e )

FIRST IMPRESSIONS MATTER 59

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 24: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

signal s t that conrms this incorrect belief than he is to receive asignal that conicts with this incorrect belief This guaranteesthat there is a positive probability that the agent will neveroverturn his incorrect hypothesis and in fact come to believe moreand more strongly in that wrong hypothesis Conversely if q 1 21(2 u ) then u 5 which guarantees that the agent will everytime he comes to believe the wrong hypothesis is more likelyeventually come to abandon that belief This in turn implies thatthe agent will repeatedly come to believe the correct hypothesis ismore likely and since u 5 u 1 q(1 2 u ) u 5 he willeventually come to believe in it with near certainty

The proposition shows that despite receiving an innitenumber of signals the agent may become certain that theincorrect hypothesis is in fact true22 This occurs when the agentrsquosconrmatory bias is sufficiently severe To illustrate the magni-tude of PW Table V displays PW for various values of u and q Tableentries are in percentage terms (rounded to the nearest percent)with rows corresponding to different values of q and columns todifferent values of u (Dashes indicate an entry exactly equal tozero)

For example suppose that q 5 05 and u 5 75 Then PW 5 752meaning that approximately 13 percent of the time the agent willeventually come to believe with certainty in the wrong hypothesisAs the quality of the agentrsquos true signal worsens he is more likelyto believe with certainty in the wrong hypothesis Indeed acorollary to Proposition 4 is that xing any q 0 lim u 12 PW 5 12

We now investigate the related question of when the agentwill maintain an incorrect initial belief To do so we relax our

22 It is straightforward to show that the agent becomes certain that thecorrect hypothesis about the state of the world is true with complementaryprobability Therefore after an innite number of signals the agent will believethat one of the hypotheses is certainly true

TABLE VPROBABILITY OF BELIEVING IN WRONG HYPOTHESIS AFTER OBSERVING AN INFINITE

NUMBER OF SIGNALS

q u 5 6 u 5 667 u 5 75 u 5 9

25 18 mdash mdash mdash

333 26 12 mdash mdash

5 34 24 13 2

75 38 31 22 8

QUARTERLY JOURNAL OF ECONOMICS60

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 25: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

assumption that the agent initially believes that each state x isequally likely and suppose instead that the agent initially believesthat the wrong hypothesis is more likely to be true For example ifx 5 B is the true state of the world then the agent initiallybelieves prob (x 5 A) 5 micro 05 Crucially we assume that thisbelief arose from signals that are independent of the new signalsthat the agent receives which are distributed as outlined above

Given the assumption that the signals are independentlydistributed and ignoring integer problems these prior beliefs canbe interpreted as if the agent has already received D more signalssupporting the incorrect hypothesis where

micro 5 u Du D 1 (1 2 u )D

This formula implicitly denes a function D(micro) The agentmust receive D(micro) more conicting signals s t than conrmingsignals in order to reach a posterior belief that the two possiblestates of the world A and B are equally likely

We dene PW(micro) as the probability that the agent beginningwith the prior belief micro 05 that the wrong hypothesis about thestate of the world is true comes to believe with certainty in thewrong hypothesis after receiving an innite number of signals23

PROPOSITION 5 Choose any e 0 and any micro 05 Then(i) For all u [ (051) there exists q 0 such that PW (micro)

1 2 e (ii) For all q 0 there exists u 05 such that PW (micro) 1 2 e

Proposition 5 says that an agent who begins with an arbi-trarily small bias in the direction of the incorrect hypothesis willalmost surely maintain his belief in this hypothesis when either oftwo conditions is satised First and not very surprisingly thiswill occur when the agent is subject to severe conrmatory biasWhen q is very close to 1 then the agent almost never receivessignals that conict with his initial belief and therefore it is notsurprising that this belief is rarely overturned Second andsomewhat more surprisingly the agent almost surely maintainshis incorrect belief provided that his true signals are very weakmeaning that u is very close to 05 This result does not depend onthe level of conrmatory bias so long as q 0 This result meansthat if the agent receives only very weak feedback from hisenvironment and is subject to any conrmatory bias he almost

23 PW (05) 5 PW

FIRST IMPRESSIONS MATTER 61

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 26: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

never overcomes any initial beliefs that are signicantly incorrectand in fact comes to believe that the incorrect hypothesis iscertainly true While one should not overinterpret the secondresult in Proposition 5mdashwe can question whether agents reallypay attention to such weak feedbackmdashthe conclusion is neverthe-less very striking Propositions 4 and 5 show that an innitesequence of signals will not necessarily lead people to overcomeerroneous beliefs rather people may simply become more andmore condent in those erroneous beliefs

Table VI displays PW(micro) for various values of u q and micro If uand micro are chosen in such a way that D(micro) is an integer and q 1 212 u it follows from Lemma 1 that

PW(micro) 5 1 21 2 u

u

D(micro)

1 PW(05)1 2 u

u

D(micro)

0

Table entries are in percentage terms (rounded to the nearestpercent) with rows corresponding to different values of q andcolumns to different values of u and the prior belief micro (Dashesindicate an entry exactly equal to zero)

For example suppose that u 5 551 and micro 5 6 In this caseD(micro) 5 2 meaning that the agent must receive two more signalsthat conict with rather than conrm his prior belief in order tobelieve that the states A and B are equally likely But if q 5 333there is nearly an 80 percent chance that the agent will neveroverturn his incorrect prior belief Clearly learning does notnecessarily lead the agent to correctly identify the true state

V CONFIRMATORY BIAS IN A PRINCIPAL-AGENT MODEL

Conrmatory bias is likely to inuence economic behavior inmany different arenas In this section we develop a simple

TABLE VIPROBABILITY OF MAINTAINING AN INCORRECT PRIOR BELIEF AFTER RECEIVING AN

INFINITE NUMBER OF SIGNALS

qu 5 6 micro 5 6

D(micro) 5 1u 5 551 micro 5 6

D(micro) 5 2u 5 75 micro 5 75

D(micro) 5 1u 5 634 micro 5 75

D(micro) 5 2

25 32 67 mdash 24

33 51 79 mdash 56

5 72 92 48 85

75 89 99 82 98

QUARTERLY JOURNAL OF ECONOMICS62

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 27: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

illustrative model of a principal-agent relationship a contextwhere we think conrmatory bias is likely to be important Thepremise of the model is that an agent may take inappropriateactions not solely because of intentional misbehaviormdashmoralhazardmdashbut also because of unintentional errors arising fromconrmatory bias Specically because an agent who suffers fromconrmatory bias will be overcondent in his judgment about howlikely various actions are to pay off he may be prone to takingactions that are riskier and more lsquolsquoextremersquorsquo than is optimal for theprincipal Such overcondence seems to reect the intuitionamong some researchers at a conference one of the authorsattended a leading economist conjectured that bad investmentdecisions by businesses in Eastern Europe receiving bank loanswere more often the result of overcondence by borrowers than ofintentions to mislead banks Even more directly along the lines ofour model Wood 1989 asserts that money managers becomemore condent in their investment decisions as they gather moreinformationmdasheven when the quality of their investment decisionsis not improved

A principal who is aware of an agentrsquos conrmatory bias willwish to design incentives that both cause the agent to internalizethe negative consequences of bad choices and prevent decisionsbased on good-faith overcondence In particular incentives thatlead the agent to collect a lot of information may not be optimal ifthe agent suffers from severe conrmatory bias and hencebecomes more overcondent as he collects more information Theprincipal may therefore wish to mute the agentrsquos incentivesrelative to what would be optimal in the absence of conrmatorybias

While an exhaustive analysis of the effect of conrmatory biason agency relationships is beyond the scope of this paper we nowdevelop a simple illustrative model along these lines Supposethat a principal hires an agent to allocate initial wealth W 5 1between the different investments in the set I 5 IAIBIC Theinvestment IC is risk-free it always yields a gross return r(IC) 5 1Investments IA and IB on the other hand are risky their returnsdepend on the state of nature x [ AB Conditional on the state xthe gross returns from IA and IB are r(IA A) 5 r(IB B) 5 R [ (12)and r(IA B) 5 r(IB A) 5 0 Let u(middot) be the principalrsquos Von Neumann-Morgenstern utility function for money Because the principalmay be risk-averse we assume that u8 0 and u9 0 Consistentwith the model that we developed in the previous sections the

FIRST IMPRESSIONS MATTER 63

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 28: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

principal and the agent cannot observe the state x and they hold acommon prior belief that prob (x 5 A) 5 prob (x 5 B) 5 05Hence if the agent learns nothing more about the true state theoptimal investment is in the riskless investment IC if he learnssufficiently moremdashgenerating beliefs sufficiently different from5mdashhe will perceive it as optimal to invest some money in one ofthe two risky investments24

Before choosing how to invest the principalrsquos wealth theagent has the opportunity to observe informative signals aboutthe true state although the agentrsquos conrmatory bias may leadhim to misinterpret these signals We assume that the signalsthat the agent receives and the way he perceives these signalsaccord with the model in the previous sections

For both analytic ease and to highlight the role of conrma-tory bias we abstract away from the usual moral-hazard con-cerns we assume that the agent costlessly observes signals aboutthe state x and expends no effort when making decisions on theprincipalrsquos behalf Under this assumption an arbitrarily smallincentive to identify the true state would lead the agent to observean innite number of signals after which he would believe that hecould identify the true state with near certainty Furthermore toabstract away from issues of optimal risk-sharing between theagent and the principal we assume that the agent is (nearly)innitely risk-averse Therefore the principal must offer theagent a nearly constant wage25 Under these assumptions the

24 While the language and notation suggest that we are referring towell-dened investment portfolios (eg three different bonds) we mean for themodel to apply as well to internal organizational incentives to pursue ambiguouslydened projects Indeed this alternative interpretation may better t the formalmodel in some respects Note that it is crucial to our analysis that the agent cannotor does not merely report his beliefs to the principal but rather implements astrategy himself based on his beliefs If the principal knew the agentrsquos beliefs andthe extent of his conrmatory bias she could form her own beliefs about the truestate of the world and then directly choose the action that would maximize herexpected payoff

25 These assumptions raise a subtle point If the agent anticipated gatheringan innite number of signals he would be willing to accept a contract that yieldeda payoff that depended on the outcome of a risky investment even if he wereinnitely risk-averse This is because the agent would anticipate being able toidentify the true state with virtual certainty But if the hypothesis of the rst partof Proposition 4 is satised the agent will be overcondent in his judgment afterobserving an innite number of signals Therefore such a contract would imposemore risk on the agentmdashand yield a lower expected utilitymdashthan he anticipatedWe assume here that the principal cannot exploit the agentrsquos conrmatory bias byconvincing him to sign a contract that yields an expected payoff that is less thanthe agentrsquos reservation payoff This assumption means that the principal cannotuse the agent as a lsquolsquomoney pumprsquorsquo and its empirical validity deserves investigationThe issue of whether to focus analysis on lsquolsquoefficiency contractsrsquorsquo rather thanlsquolsquomoney-pumping contractsrsquorsquo is a more general one that is likely to arise in

QUARTERLY JOURNAL OF ECONOMICS64

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 29: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

incentives that the principal offers to the agent affect the princi-palrsquos payoff only through their effect on the investment decisionthat the agent makes The principal does not need to compensatethe agent for gathering information and she cannot transfer riskto the agent

These assumptions permit us to focus on two polar cases Inthe rst case the principal gives the agent no incentive to collectinformation and the agent allocates the principalrsquos wealth withno information beyond his prior belief In the second case theprincipal gives the agent an arbitrarily small incentive to collectinformation and the agent allocates the principalrsquos wealth afterobserving an innite number of signals and coming to believe withvirtual certainty that he has identied the true state

Suppose rst that the principal offers the agent no incentiveto gather information Because R 2 and the principalrsquos priorbelief is prob (x 5 A) 5 05 it is optimal for the principal to directthe agent to invest all of the principalrsquos wealth in the risk-freeasset IC Now suppose that the principal offers the agent a smallincentive to identify the true state For instance suppose that theprincipal offers to pay the agent an arbitrarily small fraction ofthe principalrsquos gross return Because it is free for the agent tocollect information he would observe an innite number ofsignals and come to believe that he could identify the true statewith certainty These extreme beliefs would lead the agent toallocate all of the principalrsquos wealth to one of the risky assets Iffor example the agent thought that A was surely the true state hewould allocate all of the principalrsquos wealth to asset IA26

If q 1 2 1(2 u ) (ie if conrmatory bias is sufficiently weak)Proposition 4 establishes that the agent will eventually identifythe true state with near certainty if he collects enough signalsHence under our assumption that it is costless for the agent togather information and that in every state of the world one or theother of the lsquolsquoriskyrsquorsquo investments is optimal it would then beoptimal for the principal to offer a contract that would lead theagent to collect an innite number of signals and the principalwould receive a payoff u(R) u(1)

If q 1 2 1(2 u ) on the other hand Proposition 4 establishes

developing formal models of incentives for boundedly rational agents See forinstance OrsquoDonoghue and Rabin 1997 who study incentive design for agentswho irrationally procrastinate and who discuss various rationales for focusing onefficiency contracts

26 We assume that short-selling is impossibleso the agent could not allocatemore than $1 to asset a by for instance selling investment IC short

FIRST IMPRESSIONS MATTER 65

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 30: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

that the agentrsquos (completely condent) belief about the true stateis wrong with positive probability It can be shown that hecorrectly identies the true state with probability

micro(u q) 5u 2( u 1 q(1 2 u )) 2 1(1 2 u 1 q u )

q1 2 2(1 2 q) u (1 2 u )

Since the agent is fully condent that he has identied the truestate he invests all of the principalrsquos wealth in the risky asset hebelieves is most protable so the principalrsquos payoff is micro(u q)u(R) 1(1 2 micro(u q))u(0) Note that micro(u q) is increasing in u and decreas-ing in q meaning that the agent identies the true state withhigher probability when he receives more informative signals andlower probability when his conrmatory bias is more severe

The principal does not want the agent to become lsquolsquoinformedrsquorsquowhen u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0) Dene micro as satisfy-ing u(1) 5 microu(R) 1 (1 2 micro)u(0) if the agent correctly identies thetrue state with probability micro after observing an innite number ofsignals the principal is just willing for the agent to becomeinformed about the state x Because micro(u q) $ u the principalalways offers the agent an incentive to become informed if u $ microThat is if the principal prefers all her money invested in a riskyinvestment based on just one signal to having all her moneyinvested in the risk-free investment she will provide incentives tothe agent When u micro on the other hand we have Proposition 6

PROPOSITION 6 Suppose that u micro

(i) There exists q [ (1 2 12 u 1 such that the principaldoes not offer the agent an incentive to become informedabout the state x if and only if q $ q

(ii) For any q [ 01 there exists u [ (05micro such that theprincipal does not offer the agent an incentive to becomeinformed about the state x if and only if u u

When u micro the principal does not want the agent to observesignals about the state x either if conrmatory bias is very severeor if the agent receives very weak signals In either case there is astrong possibility that an lsquolsquoinformedrsquorsquo agent would erroneouslyidentify the true state although the agent himself would overcon-dently believe that he could identify the true state with nearcertainty If the agentrsquos overcondence is sufficiently severe theprincipal prefers not to offer the agent any incentive to becomeinformed in which case the agent will invest the principalrsquoswealth in the riskless asset

QUARTERLY JOURNAL OF ECONOMICS66

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 31: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

The principalrsquos degree of risk aversion also inuences whetheror not he wants the agent to observe signals about the state xWhile Proposition 6 shows that there are conditions where even arisk-neutral principal eschews incentives for the agent the princi-pal is more bothered by the overcondence when she is morerisk-averse in the usual sense dened by Pratt 1964 Indeedwhenever conrmatory bias is severe enough that the agent mightbe wrong even after gathering an innite number of signals aprincipal who is sufficiently risk-averse will prefer not to offer heragent any incentive to become informed We formalize this idea inProposition 7

PROPOSITION 7 Suppose that u [ (051) q [ (1 2 12 u 1 and u(middot)is a Von Neumann-Morgenstern utility function u(middot) satisfy-ing u8 0 u9 0

(i) Suppose that u(1) micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then there exists a function g(middot) such that g8 0 g9 0and g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(ii) Suppose that u(1) $ micro(u q)u(R) 1 (1 2 micro(u q))u(0)Then for any function g(middot) such that g8 0 g9 0g(u(1)) $ micro(u q)g(u(R)) 1 (1 2 micro(u q))g(u(0))

(iii) In both (i) and (ii) v(middot) 5 g(u(middot)) is a Von Neumann-Morgenstern utility function that represents preferencesthat are globally more risk-averse than those repre-sented by u(middot)

Suppose that Marta whose preferences are represented byu(middot) wishes to give her agent the incentive to become informedabout x The proposition establishes that there exist preferencesthat are globally more risk-averse than Martarsquos under which aprincipal would prefer not to give her agent the incentive tobecome informed Furthermore if Marta does not wish to give heragent an incentive to become informed about the state x then anyprincipal who is globally more risk-averse than Marta would alsochoose not to offer incentives to an identically biased agent facingthe same investment decision

The preceding analysis reects an assumption that a biasedagent who feels he is fully informed will invest all of the principalrsquoswealth in a single risky asset The agent will pursue such astrategy if for example the principal offers the agent a xed shareof the principalrsquos gross investment return Our analysis assumesof course that the principal cannot directly contract on decisions

FIRST IMPRESSIONS MATTER 67

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 32: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

only on returns But it also implicitly assumes that the principalcannot punish the agent for having too high an expected return Ifshe could then she might wish for the agent to gather someinformationmdashand then provide incentives such that the (overcon-dent) agent will be afraid of making too much money for theprincipal27 There might be a variety of reasons of course whysuch contracts are infeasible or undesirable If for example theprincipal is uncertain about either the true value of R or theextent of the agentrsquos conrmatory bias q she may not have enoughinformation to propose a contract that always leads the agent toinvest optimally

Nevertheless even in the presence of uncertainty the princi-pal would generally be better off if she could restrain the agentrsquosability to take an extreme action If feasible to restrain the agentthe principal could propose a contract that stipulates that theagent cannot invest more than a fraction of her wealth in anysingle asset Even more simply the principal could simply give theagent only a portion of her wealth to invest All of these strategiesserve the same purpose namely preventing an overcondentagent from investing too much of the principalrsquos money in a singleasset while still taking advantage of the information that theagent actually does possess

VI DISCUSSION AND CONCLUSION

We believe that conrmatory bias is important in many socialand economic situations and that variants of the formulationdeveloped in this paper can be usefully applied in formal economicmodels For instance conrmatory bias is likely to matter when a

27 From the principalrsquos point of view the optimal proportional allocation to arisky investment a maximizes the objective function V(a) micro( u q)u(1 1 (R 2 1)a) 1 (1 2 micro(u q)) middot (1 2 a) The optimal allocation a then satisesthe following necessary and sufficient condition

$ 0 a 5 1

micro(u q)u8(1 1 (R 2 1)a)(R 2 1) 2 (1 2 micro(u q))u8(1 2 a) 5 0 a [ 01

0 a 5 0

The principal would like to propose a contract specifying that the agentreceives an arbitrarily small reward when the principalrsquos gross return is 1 1(R 2 1)a no payoff when the principalrsquos gross return is 1 2 a and a large penaltyfor any other gross return Such a contract would punish the agent if he chooses anallocation that is more extreme than the principal desires

QUARTERLY JOURNAL OF ECONOMICS68

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 33: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

decision-maker must aggregate information from many sourcesIn a setting where several individuals (nonstrategically) transmittheir beliefs to a principal how should she combine these reportsto form her own beliefs If the principal thought that the agentswere Bayesians then she would be very sensitive to the strengthof the agentsrsquo beliefs Suppose for instance that the principalknows that all agents receive signals of strength u 5 6 Then iftwo agents report believing Hypothesis A with probability 6 andone agent reports believing Hypothesis B with probability 77(meaning he has gotten three more b signals than a signals) theprincipal should believe in Hypothesis B with probability 6

What if the principal were aware that agents were subject toconrmatory bias If conrmatory bias is so severe that only anagentrsquos rst signal is very informative then the principal maywish to discount the strength of agentsrsquo beliefs and basicallyaggregate according to a lsquolsquomajority rulesrsquorsquo criterion In the exampleabove for instance the principal should perhaps think Hypothe-sis A is more likely because two of three agents believe in it Wethink this intuition has merit but it is complicated by the fact thatagents who believe relatively weakly in a hypothesis may be morelikely to be wrong than right So if the principal thoughtconrmatory bias were severe and were very sure that all agentshad received lots of information then in our example she shouldbelieve that all three agents have provided evidence in favor ofHypothesis B Hence she should believe more in Hypothesis Bthan she would if the agents were Bayesian

We suspect nonetheless that the lsquolsquomajority-rules intuitionrsquorsquo ismore valid especially when considering realistic uncertainty bythe principal about how many signals each agent has received Ifshe were highly uncertain about how much information eachagent received she would assume weak beliefs merely reectedthat an agent got few signals Similarly if the principal thinkssusceptibility to conrmatory bias is heterogeneous she mightinfer that an agentrsquos weak beliefs indicate merely that he is notsusceptible to overcondence and count weak beliefs as much asstrong beliefs Indeed she may then count them more heavilysince conrmation-free agents are not only less likely to beovercondent they are also less likely to be wrong

This intuition that when aggregating information from agroup it may be wise to count the number of people with given

FIRST IMPRESSIONS MATTER 69

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 34: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

beliefs rather than the strength of their convictions suggests arelated prescription for organizational design relative to whatshe would do with Bayesian agents a principal may prefer to hiremore agents to collect a given amount of information That iswhile the lower value of information processing by conrmatoryagents may mean that either more or fewer should be hired than ifthey were Bayesian xing the total amount of informationprocessing a principal wants done with conrmatory agents sheshould prefer more people thinking than if they were fullyrational agents

Imagine for instance that a principal allocated 1000 lsquolsquosig-nalsrsquorsquo among different agents whose reports she would aggregateto form her own beliefs There are various costs that mightinuence how many agents to have or (equivalently) how manysignals to allocate per agent eg the xed cost of hiring newagents and decreasing returns from each individual due to fatigueor the increasing opportunity cost of time But the optimalnumber of conrmatory agents is likely to be greater than theoptimal number of Bayesian agents Intuitively the value ofallocating a signal to a conrmatory agent is less than the value ofallocating it to a Bayesian agent unless the conrmatory agent isunbiased by previous signals For example if the principal hires1000 conrmatory agents each to report his observation of asingle signal then she receives all of the information contained inthe signals If the principal instead hires one conrmatory agentto report his beliefs after interpreting 1000 signals she may getfar less information Both signal allocations would yield the sameamount of information if the agents were Bayesian

We suspect that a similar issue plays out less abstractly indifferent aspects of the legal system While other explanations areprobably more important conrmatory bias may help to explainsome features of the American jury system such as the biastoward more rather than fewer jurors and the use of a majority-rules criterion with no mechanism (other than jury deliberations)to extract the strength of all participantsrsquo convictions Conrma-tory bias may also help to justify the use of multiple judges toreach a decision when using a single judge seems to be morecost-effective Appeals for example are usually heard by a panelof judges that does not include the trial judge and some legalscholars (eg Resnik 1982) argue that the judge who adjudicatesat trial should not also supervise settlement bargaining and

QUARTERLY JOURNAL OF ECONOMICS70

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 35: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

pretrial discovery which is the process by which litigants requestinformation from each other These observers fear that the trialjudge might learn things during pretrial activities that wouldlsquolsquobiasrsquorsquo her during the trial The notion that the quality of thejudgersquos decisions during the trial suffers if she has more informa-tion relevant to the case is somewhat puzzling worries that shecan be lsquolsquobiasedrsquorsquo by more information certainly ies in the face ofthe Bayesian model While there are various types of bias that onecould imagine (eg that the judge will use her rulings during thetrial to punish perceived misbehavior during the discovery pro-cess) the evidence on conrmatory bias raises the possibility thatthe judge will form preconceptions during the discovery phase oflitigation that will cause her to misread additional evidencepresented at trial

Finally the discovery process itself nicely illustrates how thepolarization associated with conrmatory bias may have impor-tant implications Discovery takes place when potential litigantsthink that a trial is relatively likely and hence wish to engage inthe costly effort of preparing for that trial Nonetheless litigantsoften settle their case out of court during or after the discoveryprocess Discovery encourages this settlement by promoting theexchange of information between the litigants and hence helpingto align their perceptions of the likely outcome at trial But whilethe evidence garnered during the discovery process sometimesdoes lead to settlement before trial conrmatory bias suggeststhat the discovery process may be less efficient at achieving suchsettlement than would be hoped if a piece of evidence is ambigu-ous it may move the partiesrsquo beliefs farther apart Each litigantwill interpret the evidence through the prism of his or her ownbeliefs and each may conclude that the evidence supports his orher case More generally efforts to reduce disagreements byproviding evidence to the parties involved in a conict may not beas easy to achieve as one would hope

Much of our discussion above implicitly makes an assumptionabout judgment whose psychological validity has not (to ourknowledge) been determined by research that somebody design-ing an institution is aware of the bias of others We suspect thatusefully incorporating conrmatory bias into economic analysiswill depend upon the extent to which people believe that otherssuffer from conrmatory bias It could be that people are wellaware of biases in othersrsquo judgment or that people are unaware of

FIRST IMPRESSIONS MATTER 71

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 36: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

the general tendency toward conrmatory bias28 Investors whohire a money manager might or might not believe that the moneymanager suffers from a conrmatory bias (and is therefore pronetoward overcondence) A principal hiring an employee to makedecisions might or might not know that the employee will be proneto making such errors By the logic of economic models thatinvolve multiple agents these distinctions are likely to matterJust as assuming that rationality is common knowledge is oftenvery different than merely assuming that people are rationalassuming that agents are aware of othersrsquo irrationality may bevery different than merely assuming that people are irrational

How might economic implications depend on peoplersquos aware-ness of othersrsquo conrmatory bias One possibility is that peoplemight exploit the bias of others A principal may for instancedesign an incentive contract for an agent that yields the agentlower wages on average than the agent anticipates because theagent will be overcondent about her judgments in ways that maylead her to exaggerate her yield from a contract Converselyothers may wish to mitigate bias rather than exploit it A principalmay be more concerned with overcoming costly bias of an agentthan with exploiting it and design contracts that avoid errors

APPENDIX 1 DIFFERENTIAL-STRENGTH SIGNALS AND

UNDERCONFIDENCE

If the agent receives signals of different strengths in differentperiods it is possible that the agent will be undercondent in hisbelief about which of the two states is most likely Suppose forexample that the agent receives three signals st [ ab t [ 123 Suppose that the rst two signals are distributed according toprob (st 5 a A) 5 prob (st 5 b B) 5 u 05 t [ 12 but that theagentrsquos third signal is distributed according to prob (s3 5 a A) 5prob (s3 5 b B) 5 u 3u 3 1 (1 2 u )3 That is the agentrsquos thirdsignal is three times as strong as rst- or second-period signals Asbefore with probability q 0 the agent misreads signals thatconict with his belief about which state is more likely (This

28 Unfortunately while this issue may turn out to be central to economicapplications of conrmatory bias (and to applications of other psychologicalbiases) we have not found psychological research that convincingly resolves thisissue There is a small literature in lsquolsquoconstrualrsquorsquo that concerns third-party aware-ness of biases See eg Ross 1987 and tangentially Paese and Kinnaly 1993We have not found investigation of this issue in the context of conrmatory bias orovercondence

QUARTERLY JOURNAL OF ECONOMICS72

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 37: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

means that the probability of misreading is independent of thestrength of the signal)

Suppose that the agent perceives that his rst two signalssupport Hypothesis B while his third signal supports HypothesisA Formally the agent perceives ( s 1 5 b s 2 5 b s 3 5 a ) Giventhese perceived signals the agentrsquos posterior likelihood ratio isL (s1 5 bs2 5 bs3 5 a) 5 u (1 2 u ) 1 Now suppose that aBayesian observer knows both that the agentrsquos posterior likeli-hood ratio is L 5 u (1 2 u ) and that the agent suffers fromconrmatory bias Given the distributions of the signals theobserver is able to infer that the agent has perceived ( s 1 5 b s 2 5 b s 3 5 a ) Then the observerrsquos belief regarding the relativelikelihood that the state is x 5 A versus x 5 B is given by

L (b b a ) 5(1 2 u )(1 2 u 1 q u )(1 2 q) u 3

u (u 1 q(1 2 u ))(1 2 q)(1 2 u )3

5(1 2 u 1 q u ) u 2

( u 1 q(1 2 u ))(1 2 u )2

u

1 2 u q [ (01

Thereforegiven what she infers about the agentrsquos sequence ofperceived signals a Bayesian observer believes that the biasedagent is undercondent in his belief that the true state is A

This undercondence result arises here because the observerinfers the exact sequence of the agentrsquos perceived signals from hislikelihood ratio In this light the results here are the same as thepath-dependent undercondence example in the textmdashif theagent is known to have only recently come to believe in ahypothesis then he will be undercondent In our main model inwhich the agent receives signals of equal strength an observerwho knows the agentrsquos beliefs cannot infer the exact sequence ofthe agentrsquos perceived signals

While there may be some domains in which this differential-signal model is applicable constructing examples of undercon-dence seem to require clever contrivance It is rst of all clear thatthe lsquolsquoovercondencersquorsquo result will be stronger than the undercon-dence result in one sense in the model of this paper theovercondence result holds for all nal beliefs by the agent Anyundercondence example will clearly hold for only some nalbeliefsmdashbecause it will always be the case that a conrmatoryagent is overcondent when all his perceived signals favor onehypothesis

FIRST IMPRESSIONS MATTER 73

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 38: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

We suspect moreover that more complicated and weakerversions of Proposition 1 will hold in more general models Theundercondence result seems to rely on the agent having receiveda small number of signals where certain nal beliefs can only begenerated by a unique path of updating Consequently it is verylikely that a lsquolsquolimit overcondencersquorsquo result would holdmdashonce anagent is likely to have received large numbers of signals of allstrengths we can assure that L L when L 1

APPENDIX 2 PROOFS

Proof of Proposition 1 We rst notice that

prob (n a nb A ) 5 oi 5 0

n b

prob (ii A )c(na 2 n b na 1 n b 2 2i)

middot u u 1 q(1 2 u )n a 2 12 i(1 2 q)(1 2 u )n b 2 i

and

prob (n a nb B ) 5 oi 5 0

n b

prob (ii B)c(na 2 nb na 1 nb 2 2i)

middot (1 2 u )(1 2 u ) 1 q u n a 2 12 i(1 2 q) u nb 2 i

where c(n a 2 nb n a 1 n b 2 2i) is the number of ways to choosen a 2 nb more a signals than b signals in na 1 nb 2 2i drawswithout ever having chosen an equal number of a and b signalsand prob (ii x) is the probability of observing i perceived a and iperceived b signals in 2i draws when the true state is x [ AB Given the symmetric distribution of the signals prob (ii A) 5prob (ii B)29 Therefore prob (n a nb A) and prob (n a nb B) differ

29 Formally

prob (ii A ) 5 prob (ii B ) 5 oj 5 0

i

ok 5 0

i 2 j

oi 5 0

max i 2 j 2 k 2 10)

middot djkl uj u k((1 2 q)(1 2 u )) j 1 k(1 2 u )i 2 j 2 k 2 l u l((1 2 q) u )i 2 j 2 k

The coefficient djkl is the number of ways to choose j signals in favor of the correcthypothesis when the agent believes the two hypotheses are equally likely k biasedsignals favoring the correct hypothesis i 2 j 2 k 2 l signals in favor of the incorrecthypothesis when the agent believes the two hypotheses are equally likely l biasedsignals in favor of the incorrect hypothesis j 1 k unbiased signals opposing abelief in favor of the correct hypothesis and i 2 j 2 k unbiased signals opposing abelief in favor of the incorrect hypothesis

QUARTERLY JOURNAL OF ECONOMICS74

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 39: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

only by the effect of the signals that the agent perceives after thelast time that he believes the two hypotheses are equally likely

Using Bayesrsquo Rule

(11) L (na nb ) 5prob (na n b A )

prob (na nb B )

5

S i 5 0n b prob (ii A )c(n a 2 nb na 1 n b 2 2i)

middot u u 1 q(1 2 u )na 2 1 2 i(1 2 q)(1 2 u )nb 2 i

S i 5 0nb prob (ii B )c(na 2 n b na 1 n b 2 2i)(1 2 u )

(1 2 u ) 1 q u na 2 12 i(1 2 q) u n b 2 i

Because u 1 q(1 2 u ) (1 2 u ) 1 q u u (1 2 u ) q [ (01it follows that

(12) u 1 q(1 2 u )na 2 12 i (1 2 u )n a 2 12 i (1 2 u ) 1 q u na 2 12 i u na 2 12 i

with a strict inequality for i 5 0 since the hypotheses imply thatn a $ 2 Factoring and multiplying (12) by (1 2 u )(1 2 q)n b 2 i andrearranging we have

(13) u u 1 q(1 2 u )na 2 12 i(1 2 q)nb 2 i(1 2 u )nb 2 i

(1 2 u )(1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 iu

1 2 u

na 2 n b

i with a strict inequality for at least i 5 0 since n a $ 2 Using(11) (13) and prob (ii A) 5 prob (ii B)

L (n a nb )

S i 5 0n b prob (ii A)c(n a 2 n b na 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u na 2 12 i(1 2 q)nb 2 i u n b 2 i( u (1 2 u ))n a 2 n b

S i 5 0n b prob (ii B )c(na 2 nb n a 1 nb 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n a 2 12 i(1 2 q)n b 2 i u nb 2 i

5u

1 2 u

n a 2 n b

5 L (na n b )

Proof of Proposition 2 Clearly lim e 0 L (n a 0 1 2 e 1 2 e ) 5` na 0 and lim e 0 L (na 1 1 2 e 1 2 e ) 5 (na 1 1)(na 2 1) middotn a 1

It can be shown that if the agentrsquos current beliefs are that

A and B are equally likely and A is truethen the probability that the next signal is a lt 1 is b 5 e

FIRST IMPRESSIONS MATTER 75

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 40: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

A and B are equally likely and B is truethen the probability that the next signal is a 5 e is b lt 1

A is probably true and A is truethen the probability that the next signal is a lt 1 is b 5 e 2

A is probably true and B is truethen the probability that the next signal is a lt 1 is b lt e

B is probably true and A is truethen the probability that the next signal is a lt e is b lt 1

B is probably true and B is truethen the probability that the next signal is a 5 e 2 is b lt 1

From these numbers we can calculate that if na nb

middot Suppose that A is the true state Consider all paths s suchthat (1) s 1 5 b and (2) there is always a strict majority of bsignals until 2n b 2 1 signals after which all signals are a Then the probability of any particular path s is about e n b 1 1All other paths each occur with probability on the order ofe nb 1 2 or greater when nb $ 2

middot Suppose that B is the true state Consider all paths s suchthat (1) s 1 5 a and (2) there is always a strict majority of asignals The probability of any particular path s is aboute nb 1 1 All other paths each occur with probability on theorder of en b 1 2 or greater when n b $ 2

To show that L (n a nb 1 2 e 1 2 e ) 1 with nb $ 2 thereforewe need only to show that the number of paths of type s isstrictly greater than the number of paths of type s This is easy toverify For every particular path of type s there exists a path oftype s that is the mirror image of that path for the rst 2n b 2 1signals (replacing each a with a b and each b with an a ) andwhose last na 2 nb 1 1 signals consist of na 2 n b 2 1 a rsquos followed by2 b rsquos In addition there will exist at least one more path of types for instance n a a rsquos followed by nb b rsquos

QED

Proof of Proposition 3 The proof is by induction Dene L (n)as the agentrsquos relative likelihood ratio after observing n signalsSuppose that L (1) 1 Then a Bayesian observer infers that theagent observed a single true lsquolsquoarsquorsquo signal and L (1) 5 u (1 2 u ) 1Now suppose that L (n) L (n) and L (n 1 1) 1 We must showthat L (n 1 1) 1 First suppose that n is an even numberBecause L (n) 1 after period n the agent has perceived at least

QUARTERLY JOURNAL OF ECONOMICS76

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 41: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

two more lsquolsquoarsquorsquo signals than lsquolsquobrsquorsquo signals Therefore knowing onlythat L (n) 1 a Bayesian observerrsquos relative likelihood ratioL (n) 5 prob (x 5 A)prob (x 5 B) is given by

(31) L (n) 5

S j 5 0(n2) 2 1 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j 5 0(n2) 2 1 S i 5 0

j p(ii B)c(n 2 2 jn 2 2i)(1 2 u )

middot (1 2 u ) 1 q u n 2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

where p(ii x) and c(middotmiddot) are dened as in the proof of Proposition 1Dene px(n) as the probability of perceiving a (strict) majority oflsquolsquo a rsquorsquo signals in n draws given the state x [ AB Then L (n 1 1) isgiven by

(32) L (n 1 1) 5pA(n) 1 p(n2n2 A) u

pB(n) 1 p(n2n2 B )(1 2 u )

Because p(n2n2 A) 5 p(n2 n2 B) and u 05 L (n 1 1) 1 follows immediately from the hypothesis that L (n) 1 whichimplies that pA(n) pB(n)

Now suppose that n is an odd number Because by hypothe-sis L (n) 1 after period n the agent has perceived more lsquolsquoarsquorsquo thanlsquolsquobrsquorsquo signals Therefore knowing only that L (n) 1 a Bayes-ian observerrsquos relative likelihood ratio L (n) 5 prob (x 5 A)prob (x 5 B) is given by

(33) L (n) 5

S j5 0(n2 1)2 S i 5 0

j p(ii A)c(n 2 2jn 2 2i)

middot u u 1 q(1 2 u )n2 12 j2 i(1 2 q) j2 i(1 2 u ) j2 i

S j5 0(n2 1)2 S i5 0

j p(ii B)c(n 2 2jn 2 2i)(1 2 u )

middot (1 2 u ) 1 qu n2 12 j2 i(1 2 q) j2 i u j2 i

5pA(n)

pB(n)

Meanwhile L (n 1 1) is given by

(34) L (n 1 1) 5

pA(n) 2 S i 5 0(n 2 1)2 p(ii A)c(1n 2 2i)

middot u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n 2 1)2) 2 i(1 2 q)(1 2 u )

pB(n) 2 S i 5 0(n 2 1)2p(ii B )c(1n 2 2i)

middot (1 2 u )(1 2 u ) 1 q u ((n2 1)2) 2 i

middot (1 2 q) u ((n 2 1)2) 2 i(1 2 q) u

FIRST IMPRESSIONS MATTER 77

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 42: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Because L (n) 1 implies that pA(n) pB(n) in order toestablish L (n 1 1) 1 it is sufficient to show that

(35) oi 5 0

(n 2 1)2

p(ii A )c(1n 2 2i) u u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 q)

middot (1 2 u )((n2 1)2) 2 i(1 2 q)(1 2 u )

oi 5 0

(n2 1)2

p(ii B )c(1n 2 2i)(1 2 u )(1 2 u )

1 q u ((n2 1)2) 2 i(1 2 q) u ((n2 1)2) 2 i(1 2 q) u

Using the fact that p(ii A) 5 p(ii B) and canceling liketerms the inequality in (35) is satised if

u 1 q(1 2 u )((n 2 1)2) 2 i(1 2 u )((n2 1)2) 2 i

1 2 u 1 q u ((n2 i)2) 2 1u ((n2 1)2) 2 i i [ 0 (n 2 1)2

But this inequality is always satised because ( u 1 q(1 2 u ))((1 2 u ) 1 q u ) u (1 2 u ) q [ (01 Therefore L (n 1 1) 1

QED

Proof of Proposition 4 The rst hypothesis implies that u 05 and therefore using Lemma 1 PW satises

PW 5 (1 2 u ) middot (1 2 p(1 u )) 1 p(1 u ) middot PW

1 u middot p(1 u ) middot PW

or

PW 5(1 2 u ) middot (1 2 ((1 2 u )u ))

(1 2 (1 2 u ) middot ((1 2 u ) u ) 2 u ((1 2 u )u ))

PW 0 because u 05 for all q $ 0The second hypothesis implies that u 05 and therefore

using Lemma 1 PW satises

PW 5 (1 2 u ) middot PW 1 u middot p(1 u ) middot PW

PW 5 0 because p(1 u ) 1

QED

Proof of Proposition 5 Ignoring integer problems and sinceq 1 2 1(2 u ) for the cases we consider below the denition of

QUARTERLY JOURNAL OF ECONOMICS78

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 43: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

D(micro) and Lemma 1 imply that

PW(micro) 5 1 21 2 u

u

D(micro)

11 2 u

u

D(micro)

PW(05) 0

(i) Note that limq 1 u 5 1 for all (micro u ) Therefore for qsufficiently close to 1 PW(micro) can be made arbitrarily close to 1

(ii) Note that lim u 05 u 05 and lim u 05 D(micro) 5 ` for all(microq) Therefore for u sufficiently close to 05 PW(micro) can be madearbitrarily close to 1

QED

Proof of Proposition 6 (i) Fix u [ (05micro The result followsdirectly from the fact that the principalrsquos payoff from having theagent observe signals P ( u q) 5 micro(u q)u(WR) 1 (1 2 micro(u q))u(0)is continuously monotone decreasing in q with P ( u 1 2 12 u ) u(W ) and P ( u 1) u(W ) (ii) Fix q [ (01 The result followsdirectly from the fact that P ( u q) is continuously monotoneincreasing in u with P (05q) u(W ) and P (uq) u(W )

QED

Proof of Proposition 7 The proof is by construction In orderto establish the result it is sufficient to show that there exists afunction g(middot) such that

g(u(WR) 2 g(u(W ))

g(u(W )) 2 g(u(0))

1 2 micro(u q)

micro(u q)

Dene the function g(middot) as

g(x) 5xx u(W )

u(W ) 1 e (x 2 u(W ))x u(W )

Clearly g8 0 g9 0 and

g(u(WR) 2 g(u(W ))

g(u(W)) 2 g(u(0))5

e (u(WR) 2 u(W))

u(W ) 2 u(0)

1 2 micro(u q)

micro(u q)

for e sufficiently small Parts (ii) and (iii) follow directly fromconcavity of g(middot) and Theorem 1 in Pratt 1964

QED

UNIVERSITY OF CALIFORNIA BERKELEY

EMORY UNIVERSITY

FIRST IMPRESSIONS MATTER 79

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 44: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

REFERENCES

Arkes H R lsquolsquoPrinciples in JudgmentDecision Making Research Pertinent toLegal Proceedingsrsquorsquo Behavioral Sciences and the Law VII (1989) 429ndash456

Bacon F (1620) The New Organon and Related Writings (New York NY LiberalArts Press 1960)

Baumann A O R B Deber and G G Thompson lsquolsquoOvercondence amongPhysicians and Nurses The lsquoMicro-Certainty Macro-Uncertaintyrsquo Phenome-nonrsquorsquo Social Science and Medicine XXXII (1991) 167ndash174

Beattie J and J Baron lsquolsquoConrmation and Matching Biases in HypothesisTestingrsquorsquo Quarterly Journal of Experimental Psychology Human Experimen-tal Psychology XL (1988) 269ndash297

Bjorkman M lsquolsquoInternal Cue Theory Calibration and Resolution of Condence inGeneral Knowledgersquorsquo Organizational Behavior and Human Decision Pro-cesses LVIII (1994) 386ndash405

Bodenhausen G V and M Lichtenstein lsquolsquoSocial Stereotypes and Information-Processing Strategies The Impact of Task Complexityrsquorsquo Journal of Personalityand Social Psychology LII (1987) 871ndash880

BodenhausenG V and R S Wyer lsquolsquoEffects of Stereotypes in Decision Making andInformation-ProcessingStrategiesrsquorsquoJournal of Personality and Social Psychol-ogy XLII (1985) 267ndash282

Borum R R Otto and S Golding lsquolsquoImproving Clinical Judgment and DecisionMaking in Forensic Evaluationrsquorsquo Journal of Psychiatry and Law XXI (1993)35ndash76

Bruner J and M Potter lsquolsquoInference in Visual Recognitionrsquorsquo Science CXLIV(1964) 424ndash425

Camerer C lsquolsquoIndividual Decision Makingrsquorsquo in Handbook of Experimental Econom-ics J Kagel and A E Roth eds (Princeton NJ Princeton University Press1995) pp 587ndash703

Chapman L J and J P Chapman lsquolsquoGenesis of Popular but Erroneous Psychodi-agnostic Observationsrsquorsquo Journal of Abnormal Psychology LXXII (1967)193ndash204

Chapman L J and J P Chapman lsquolsquoIllusory Correlation as an Obstacle to the Useof Valid Psychodiagnostic Signsrsquorsquo Journal of Abnormal Psychology LXXIV(1969) 271ndash280

Chapman L J and J P Chapman lsquolsquoTest Results Are What You Think They ArersquorsquoPsychology Today V (1971) 106

Darley J and P Gross lsquolsquoA Hypothesis-Conrming Bias in Labeling EffectsrsquorsquoJournal of Personality and Social Psychology XLIV (1983) 20ndash33

Devine P G E R Hirt and E M Gehrke lsquolsquoDiagnostic and ConrmationStrategies in Trait Hypothesis Testingrsquorsquo Journal of Personality and SocialPsychology LVIII (1990) 952ndash963

Dougherty T W D B Turban and J C Callender lsquolsquoConrming First Impressionsin the Employment InterviewA Field Study of Interviewer BehaviorrsquorsquoJournalof Applied Psychology LXXIX (1994) 659ndash665

Einhorn H and R Hogarth lsquolsquoCondence in Judgment Persistence of the Illusionof Validityrsquorsquo Psychological Review LXXXV (1978) 395ndash416

Feller W An Introduction to Probability Theory and Its Applications (New YorkNY John Wiley and Sons Inc 1968)

Fischoff B and R Beyth-Marom lsquolsquoHypothesis Evaluation from a BayesianPerspectiversquorsquo Psychological Review XC (1983) 239ndash260

Fischhoff B P Slovic and S Lichtenstein lsquolsquoKnowing with Certainty TheAppropriateness of Extreme Condencersquorsquo Journal of Experimental Psychol-ogy Human Perception and Performance III (1977) 552ndash564

Fleming J and A J Arrowood lsquolsquoInformation Processing and the Perseverance ofDiscredited Self-Perceptionsrsquorsquo Personality and Social Psychology Bulletin V(1979) 201ndash205

Friedrich J lsquolsquoPrimary Error Detection and Minimization(PEDMIN) Strategies inSocial Cognition A Reinterpretation of Conrmation Bias PhenomenarsquorsquoPsychological Review C (1993) 298ndash319

Griffin D and A Tversky lsquolsquoThe Weighing of Evidence and the Determinants ofCondencersquorsquo Cognitive Psychology XXIV (1992) 411ndash435

QUARTERLY JOURNAL OF ECONOMICS80

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 45: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Hamilton D L and T L Rose lsquolsquoIllusory Correlation and the Maintenance ofStereotypic Beliefsrsquorsquo Journal of Personality and Social Psychology XXXIX(1980) 832ndash845

Hamilton D L S J Sherman and C M Ruvolo lsquolsquoStereotype-Based Expectan-cies Effects on Information Processing and Social Behaviorrsquorsquo Journal of SocialIssues XLVI (1990) 35ndash60

Haverkamp B E lsquolsquoConrmatory Bias in Hypothesis Testing for Client-Identiedand Counselor Self-Generated Hypothesesrsquorsquo Journal of Counseling Psychol-ogy XL (1993) 303ndash315

Hodgins H S and M Zuckerman lsquolsquoBeyond Selecting Information Biases inSpontaneous Questions and Resultant Conclusionsrsquorsquo Journal of ExperimentalSocial Psychology XXIX (1993) 387ndash407

Hubbard M lsquolsquoImpression Perseverance Support for a Cognitive ExplanationrsquorsquoRepresentative Research in Social Psychology XIV (1984) 48ndash55

Jennings D L M R Lepper and L Ross lsquolsquoPersistence of Impressions of PersonalPersuasiveness Perseverance of Erroneous Self-Assessments outside theDebrieng Paradigmrsquorsquo Personality and Social Psychology Bulletin VII (1981)257ndash263

Jennings D L T M Amabile and L Ross lsquolsquoInformal Covariation AssessmentData-Based versus Theory-Based Judgmentsrsquorsquo in Judgment under Uncer-tainty Heuristics and Biases D Kahneman P Slovic and A Tversky eds(Cambridge Cambridge University Press 1982) pp 211ndash230

Keren G lsquolsquoFacing Uncertainty in the Game of Bridge A Calibration StudyrsquorsquoOrganizational Behavior and Human Decision Processes XXXIX (1987)98ndash114 lsquolsquoOn the Ability of Monitoring Non-Veridical Perceptions and UncertainKnowledge Some Calibration Studiesrsquorsquo Acta Psychologica LXVII (1988)95ndash119

Klayman J and Y-w Ha lsquolsquoConrmation Disconrmation and Information inHypothesis Testingrsquorsquo Psychological Review XCIV (1987) 211ndash228

Lepper M R L Ross and R R Lau lsquolsquoPersistence of Inaccurate Beliefs about theSelf Perseverance Effects in the Classroomrsquorsquo Journal of Personality andSocial Psychology L (1986) 482ndash491

Lord C G L Ross and M R Lepper lsquolsquoBiased Assimilation and AttitudePolarization The Effects of Prior Theories on Subsequently ConsideredEvidencersquorsquo Journal of Personality and Social Psychology XXXVII (1979)2098ndash2109

Macan T H and R L Dipboye lsquolsquoThe Effects of the Application on Processing ofInformation from the Employment Interviewrsquorsquo Journal of Applied SocialPsychology XXIV (1994) 1291ndash1314

Mahajan J lsquolsquoThe Overcondence Effect in Marketing Management PredictionsrsquorsquoJournal of Marketing Research XXIX (1992) 329ndash342

Mahoney M lsquolsquoPublication Prejudices An Experimental Study of ConrmatoryBias in the Peer Review Systemrsquorsquo Cognitive Therapy and Research I (1977)161ndash175

Mehle T C F Gettys C Manning S Baca and S Fisher lsquolsquoThe AvailabilityExplanation of Excessive PlausibilityAssessmentsrsquorsquo Acta Psychologica XLIX(1981) 127ndash140

Miller A G J W McHoskey C M Bane and T G Dowd lsquolsquoThe AttitudePolarizationPhenomenonRole of Response MeasureAttitude Extremity andBehavioral Consequences of Reported Attitude Changersquorsquo Journal of Personal-ity and Social Psychology LXIV (1993) 561ndash574

Nisbett R and L Ross Human Inference Strategies and Shortcomings of SocialJudgment (Englewood Cliffs NJ Prentice-Hall 1980)

OrsquoDonoghue E and M Rabin lsquolsquoIncentives for Procrastinatorsrsquorsquo NorthwesternUniversity CMSEMS Discussion Paper No 1181 February 25 1997

Oskamp S lsquolsquoOvercondence in Case-Study Judgmentsrsquorsquo in Judgment underUncertainty Heuristics and Biases D Kahneman P Slovic and A Tverskyeds (Cambridge Cambridge University Press 1982) pp 287ndash293

Paese P W and M Kinnaly lsquolsquoPeer Input and Revised Judgment Exploring theEffects of (Un)Biased Condencersquorsquo Journal of Applied Social Psychology XXIII(1993) 1989ndash2011

FIRST IMPRESSIONS MATTER 81

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82

Page 46: FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS · FIRST IMPRESSIONS MATTER: A MODEL OF CONFIRMATORY BIAS* MATTHEWRABINANDJOELL.SCHRAG Psychological research indicates that

Perkins D N The Mindrsquos Best Work (Cambridge MA Harvard University Press1981)

Pfeifer P E lsquolsquoAre We Overcondent in the Belief That Probability Forecasters AreOvercondentrsquorsquo Organizational Behavior and Human Decision ProcessesLVIII (1994) 203ndash213

Plous S lsquolsquoBiases in the Assimilation of Technological Breakdowns Do AccidentsMake Us Saferrsquorsquo Journal of Applied Social Psychology XXI (1991) 1058ndash1082

Popper Karl Conjectures and Refutations (London Routledge and Kegan PaulLtd 1963)

Pratt J lsquolsquoRisk Aversion in the Small and in the Largersquorsquo Econometrica XXXII(1964) 122ndash136

Redelmeier D andA Tversky lsquolsquoOn the Belief That Arthritis Pain is Related to theWeatherrsquorsquo Proceedings of the National Academy of Sciences USA XCIII (1996)2895ndash2896

Resnik J lsquolsquoManagerial Judgesrsquorsquo Harvard Law Review XCVI (1982) 374ndash445Ross L lsquolsquoThe Problem of Construal in Social Inference and Social Psychologyrsquorsquo in A

Distinctive Approach to Psychological Research The Inuence of StanleySchachter R E N Neil E Grunberg Judith Rodin and Jerome E Singereds (Hillsdale NJ Lawrence Erlbaum Associates Inc 1987) pp 118ndash150

Soll J lsquolsquoDeterminants of Overcondence and Miscalibration The Roles ofRandom Error and Ecological Structurersquorsquo Organizational Behavior and Hu-man Decision Processes LXV (1996) 117ndash137

Souter G lsquolsquoOvercondence Causes Reinsurers to Underprice Coverages Consul-tantrsquorsquo Business Insurance XXVII (1993) 47

Stangor C lsquolsquoStereotype Accessibility and Information Processingrsquorsquo Personalityand Social Psychology Bulletin XIV (1988) 694ndash708

Stangor C and D N Ruble lsquolsquoStrength of Expectancies and Memory for SocialInformation What We Remember Depends on How Much We Knowrsquorsquo Journalof Experimental Social Psychology XXV (1989) 18ndash35

Tomassini L A I Solomon M B Romney and J L Krogstad lsquolsquoCalibration ofAuditorsrsquo Probabilistic Judgments Some Empirical Evidencersquorsquo Organiza-tional Behavior and Human Performance XXX (1982) 391ndash406

Van Lenthe J lsquolsquoELI An Interactive Elicitation Technique for Subjective Probabil-ity Distributionsrsquorsquo Organizational Behavior and Human Decision ProcessesLV (1993) 379ndash413

Winman A and P Juslin lsquolsquoCalibration of Sensory and Cognitive Judgments TwoDifferent Accountsrsquorsquo Scandinavian Journal of Psychology XXXIV (1993)135ndash148

Wood A S lsquolsquoFatal Attractions for Money Managersrsquorsquo Financial Analysts JournalXLV (1989) 3ndash5

Wyatt D and D Campbell lsquolsquoOn the Liability or Stereotype of HypothesisrsquorsquoJournal of Abnormal and Social Psychology XLVI (1951) 496ndash500

Zuckerman M C R Knee H S Hodgins and K Miyake lsquolsquoHypothesis Conrma-tion The Joint Effect of Positive Test Strategy and Acquiescence ResponseSetrsquorsquo Journal of Personality and Social Psychology LXVIII (1995) 52ndash60

QUARTERLY JOURNAL OF ECONOMICS82