Exploring preference anomalies in double bounded contingent valuation


Journal of Health Economics 26 (2007) 463–482

Exploring preference anomalies in double bounded contingent valuation

Verity Watson ∗, Mandy Ryan ∗

Health Economics Research Unit, Institute of Applied Health Sciences, University of Aberdeen, Polwarth Building, Foresterhill, Aberdeen, AB25 9ZD, UK

Received 7 February 2006; received in revised form 24 October 2006; accepted 25 October 2006
Available online 22 November 2006

Abstract

Double bounded dichotomous choice (DBDC) contingent valuation offers increased efficiency of willingness to pay (WTP) estimates compared with the single bounded format. However, evidence suggests DBDC generates anomalous respondent behaviour. This paper provides the first investigation and explanation of these anomalies in health. Results suggest the incentives for truthful preference revelation are altered in the presence of a follow up question. This result is found using both regression techniques and analysis of raw responses. Although findings suggest ‘very certain’ respondents exhibit less anomalous behaviour, inconsistencies remain across bounds. The results of this study question the use of iterative valuation formats.

© 2006 Elsevier B.V. All rights reserved.

JEL classification: D12; D60; I10

Keywords: Contingent valuation; Anomalies; Prospect theory; Anchoring; Calibration

1. Introduction

Whilst contingent valuation is being increasingly used in health economics to estimate willingness to pay (WTP) (Diener et al., 1998; Klose, 1999; Smith, 2003), the appropriate elicitation format is a continuing source of debate (Smith, 2003). There is a consensus that an open-ended (OE) approach is not appropriate (Arrow et al., 1993; Donaldson et al., 1997). The payment card (PC) approach has proved popular (Smith, 2003). Proponents of this method argue it mimics real life, allowing individuals to “shop around” for a value that is the most they would pay (Donaldson et al., 1997).

∗ Corresponding authors. Tel.: +44 1224 555937; fax: +44 1224 550926. E-mail addresses: [email protected] (V. Watson), [email protected] (M. Ryan).

0167-6296/$ – see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.jhealeco.2006.10.009

Following the National Oceanic and Atmospheric Administration (NOAA) panel recommendations (Arrow et al., 1993), there exist many applications of the single bounded dichotomous choice (SBDC) approach in both health and environmental economics (Smith, 2003; Bateman et al., 2002). The advantages of this approach include cognitive simplicity for respondents and reduced incentives for strategic behaviour (Hoehn and Randall, 1987; Carson et al., 1999). The SBDC approach, however, provides limited information regarding respondents’ true WTP, with larger sample sizes required for an equivalent level of statistical precision compared with PC or OE elicitation formats.

To increase statistical efficiency the initial dichotomous choice question (DC1) can be supplemented with a second dichotomous choice question (DC2), resulting in the double bounded dichotomous choice (DBDC) format. Responses to DC1 determine the bid offered in DC2. If respondents state ‘yes’ in DC1, a higher bid is offered in DC2. Conversely, if respondents state ‘no’ in DC1, a lower bid is offered in DC2. Hanemann et al. (1991) provided empirical support for increased statistical efficiency through the reduced variance of WTP estimates from DBDC experiments. However, a well-documented result from DBDC applications is that resultant welfare estimates are lower than those from SBDC (Hanemann et al., 1991; Cameron and Quiggan, 1994; McFadden, 1994; Alberini et al., 1997; Clarke, 2000; Bateman et al., 2001; Kennedy, 2002; Whitehead, 2002). Moreover, responses to DC2 are dependent on DC1 (Alberini et al., 1997; Cameron and Quiggan, 1994). Within health economics, applications of the DBDC format are limited, although there appears to be growing interest in the method (Clarke, 2000; Liu et al., 2000; Kennedy, 2002; Asfaw and von Braun, 2005; Hackl and Pruckner, 2005; Prosser et al., 2005). Of the existing studies only Clarke (2000) and Kennedy (2002) reported comparisons of WTP estimates from DBDC and SBDC models, both finding lower WTP estimates from the DBDC model. Neither study investigated the behavioural motivations underlying their results.

Given the evidence of anomalous preference expressions, questions are raised concerning the behavioural motivations within DBDC studies. A number of specific explanations for the anomalous behaviour observed in DBDC studies have been proposed. Respondents may use the information provided by DC1 to inform their response to DC2 through anchoring or starting point bias (Boyle et al., 1985; Herriges and Shogren, 1996; Boyle et al., 1997). Alternatively, applying prospect theory (Kahneman and Tversky, 1979), respondents would frame DC2 as a gain or loss against the initial bid DC1 (DeShazo, 2002). Carson et al. (1999) proposed cost based expectations. Here respondents assume the initial bid amount represents the cost of providing the good. A higher DC2 is seen as an attempt by the government to raise additional revenue, and a lower DC2 will result in a lower quality good being provided. Bateman et al. (1999) argued that ‘Guilt and Indignation’ affect respondents when answering a DBDC question. Indignation occurs when respondents perceive that they struck a deal with the interviewer in DC1. When asked a higher bid amount in DC2 respondents feel the interviewer has reneged on the deal. Conversely, guilt (or a sense of social responsibility) occurs when respondents state ‘no’ to DC1 and are then asked a lower bid amount in DC2. Yea saying has been proposed as an explanation of SBDC resulting in higher welfare estimates than OE or PC CV approaches (Holmes and Kramer, 1995; Kanninen, 1995; Ready et al., 1996; Frew et al., 2003; Ryan et al., 2004). In DBDC, yea saying would manifest as a tendency for respondents to say ‘yes’ to all bids, resulting in higher welfare estimates. Finally, strategic behaviour may be induced where respondents perceive DC2 indicates uncertainty and price flexibility, resulting in respondents being provided with an incentive to understate true WTP (Carson et al., 1999).

This paper considers the application of DBDC to estimate WTP within health, and investigates if preference anomalies are observed. The application is concerned with preferences for the provision of a national air ambulance service (AAS). Following estimation of WTP, a number of explanations for any divergence between DC1 and DC2 are investigated. Consideration is also given to how preference anomalies differ according to respondents’ certainty when completing the CV task. Consequently, this study not only provides the first consideration of these issues in a health care context, but also tests the generalisability of DBDC results from other disciplines (Hunter, 2001) and the robustness of these findings when respondent certainty is taken into account. Section 2 discusses the study design, methods of analysis for DBDC data, and tests of preference anomalies. These tests can be split into two groups: (i) tests incorporated into the analytical framework and (ii) tests exploiting study design to consider aggregate response patterns across DC1 and DC2. Section 3 presents the results and Section 4 discusses these results.

2. Methods

2.1. The experiment

This study sought public perceptions of, and WTP for, the provision of a national air ambulance service.¹ Between August and September 2002, a representative sample of 1400 members of the public was interviewed using computer assisted telephone interviews (CATI). For the general survey results see Johnston and Ryan (2002).

Prior to the DBDC, respondents were asked if they were willing to pay anything for the good. Respondents stating ‘yes’ were offered a randomly assigned ‘base bid’; respondents stating ‘no’ were asked the reasons for their response. The objective of this paper is to consider anomalies in DBDC, thus respondents who stated ‘no’ to the screening question are not considered in the analysis.²

Advance disclosure has been argued to reduce the likelihood of preference anomalies; accordingly, respondents were informed they would be presented with two valuation questions. Respondents were told the second valuation question was dependent on their response to the initial question, where the bid amount would be higher if they stated ‘yes’ and lower if they stated ‘no’ to the initial valuation question. Respondents were randomly presented one of five ‘base bids’ (DC1): £25, £50, £100, £200, and £300. If a ‘yes’ (‘no’) response was given to DC1, respondents were offered a higher (lower) ‘follow up bid’ (DC2). The bid levels in DC2 for each ‘base bid’ level are: £25 (with lower = £10 and higher = £50); £50 (with lower = £25 and higher = £100); £100 (with lower = £50 and higher = £200); £200 (with lower = £100 and higher = £300); and £300 (with lower = £200 and higher = £400).³ The experimental design of the bid vector was chosen to ensure overlapping bid levels across bounds. A feature of this design is the ability to compare responses to the same bid amount (e.g. £50) across bounds. For instance, responses to the bid amount, £50, can be compared when £50 is presented as the base bid or the follow up bid. From the design, £50 is a follow up bid both after a lower base bid (£25) and after a higher base bid (£100). The experimental design permits consideration of four bid groupings (£50, £100, £200, and £300).

¹ Whilst the results presented provide insight into the value the community places on air ambulances, the primary objective is to investigate anomalies in DBDC data.

² In the valuation exercise those who answered ‘no’ to the screening question were asked why. This was to distinguish ‘protestors’ from genuine zero values (Johnston and Ryan, 2002).

³ The bid levels were obtained from a CV survey of air ambulances conducted in the Grampian Region of Scotland (Ryan et al., 2004). Response monitoring ensured an equal split of respondents across bid levels.
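The routing between bounds described above can be sketched in a few lines of Python. This is an illustrative sketch only: the bid amounts are those listed in the text, while the names `FOLLOW_UP_BIDS`, `follow_up_bid` and `bid_groupings` are hypothetical and are not taken from the study. The grouping rule (pairs of a lower and a higher amount that each serve as the other's follow up bid) is our reading of the overlapping design.

```python
# Base bid (DC1) -> (lower follow up, higher follow up), in pounds,
# transcribed from the bid design described in the text.
FOLLOW_UP_BIDS = {
    25: (10, 50),
    50: (25, 100),
    100: (50, 200),
    200: (100, 300),
    300: (200, 400),
}


def follow_up_bid(base_bid, said_yes_to_dc1):
    """Return the DC2 bid: higher after a 'yes' to DC1, lower after a 'no'."""
    lower, higher = FOLLOW_UP_BIDS[base_bid]
    return higher if said_yes_to_dc1 else lower


def bid_groupings():
    """Pairs (t_L, t): t follows a 'yes' to t_L, and t_L follows a 'no' to t."""
    pairs = []
    for t_low, (_, higher) in FOLLOW_UP_BIDS.items():
        # Keep the pair only if the higher amount is itself a base bid
        # whose lower follow up is t_low (assumed reading of "overlap").
        if higher in FOLLOW_UP_BIDS and FOLLOW_UP_BIDS[higher][0] == t_low:
            pairs.append((t_low, higher))
    return sorted(pairs)


print(follow_up_bid(100, True))
print(bid_groupings())
```

Under this reading the design yields four overlapping pairs, which is what allows the same amount to be observed both as a base bid and as a follow up bid.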

Following both DC1 and DC2 respondents were asked to assess their degree of certainty for the response. Certainty was measured on a 5-point scale (1 = very uncertain to 5 = very certain). Respondents’ certainty following DC1 is used to distinguish between ‘very certain’ respondents (certainty = 5) and ‘less certain’ respondents (certainty = 1, 2, 3, 4). Degree of certainty has previously been used in stated preference (SP) studies to filter ‘false’ yes responses. Evidence suggests that if response data are calibrated by certainty, resultant WTP estimates have higher external validity (Johannesson et al., 1998; Blumenschein et al., 1998; Blumenschein et al., 2001).

2.2. Data analysis

Table 1 provides a summary of the regression models estimated to investigate both welfare estimates from DBDC experiments (Models I–IV) and preference anomalies (Models V–VIII).

Table 1
Summary of empirical models

| Model | Description |
|---|---|
| Analysis and welfare estimates from DBDC | |
| Model I | Bivariate probit |
| Model II | Interval data model |
| Model III | Probit analysis (DC1 only) |
| Model IV | Probit analysis (DC2 only) |
| Models of preference anomalies | |
| Model V | Random effects probit (naïve model) |
| Model VI | Shift model (random effects probit) |
| Model VII | Anchoring model (random effects probit) |
| Model VIII | Shift and anchoring model (random effects probit) |

2.2.1. Estimating willingness to pay using DBDC data

Responses to DBDC data, as with SBDC data, are analysed within the random utility model (McFadden, 1974; Hanemann, 1984). Let t1 be the base bid at DC1 and t2 be the follow up bid at DC2. Possible responses are:

\[
\begin{aligned}
\text{Yes} \mid \text{Yes} &\Rightarrow \text{WTP} \geq t_2 \\
\text{Yes} \mid \text{No} &\Rightarrow t_1 \leq \text{WTP} < t_2 \\
\text{No} \mid \text{Yes} &\Rightarrow t_1 > \text{WTP} \geq t_2 \\
\text{No} \mid \text{No} &\Rightarrow \text{WTP} < t_2
\end{aligned}
\]

Following this:

\[
\text{WTP}_{ij} = z_{ij}\beta + \varepsilon_{ij} \tag{1}
\]

where WTPij is the jth individual’s WTP, i = 1, 2 represents DC1 and DC2, respectively, and zij and β are vectors of variables related to individuals and their parameters, respectively. The error term, εij, incorporates both individual and question specific error. Thus, Eq. (1) incorporates the notion that individual j may respond differently to each question, i. Combining Eq. (1) with the response descriptors above, the probability of respondent j answering ‘yes’ to DC1 and ‘no’ to DC2 is expressed as:

\[
\begin{aligned}
\Pr(\text{yes}, \text{no}) &= \Pr(\text{WTP}_{1j} \geq t_1,\ \text{WTP}_{2j} < t_2) \\
&= \Pr(z_{1j}\beta + \varepsilon_{1j} \geq t_1,\ z_{2j}\beta + \varepsilon_{2j} < t_2)
\end{aligned}
\tag{2}
\]

Expanding to incorporate all response combinations results in the likelihood function:

\[
\begin{aligned}
L_j(z\beta \mid t) ={} & \Pr(z_{1j}\beta + \varepsilon_{1j} \geq t_1,\ z_{2j}\beta + \varepsilon_{2j} < t_2)^{YN} \\
& \times \Pr(z_{1j}\beta + \varepsilon_{1j} \geq t_1,\ z_{2j}\beta + \varepsilon_{2j} \geq t_2)^{YY} \\
& \times \Pr(z_{1j}\beta + \varepsilon_{1j} < t_1,\ z_{2j}\beta + \varepsilon_{2j} < t_2)^{NN} \\
& \times \Pr(z_{1j}\beta + \varepsilon_{1j} < t_1,\ z_{2j}\beta + \varepsilon_{2j} \geq t_2)^{NY}
\end{aligned}
\tag{3}
\]

Assuming error terms ε1j and ε2j are normally distributed with mean zero and variances σ₁² and σ₂², respectively, and allowing for correlation between DC1 and DC2, expressed by ρ, Eq. (3) is estimated using the bivariate probit model (Cameron and Quiggan, 1994). This is referred to as Model I.

A restricted version of the bivariate probit model is the interval data model (Hanemann et al., 1991).⁴ Here responses to DC1 and DC2 are assumed to be motivated by the same latent WTP value, observed differences are due to randomness in the underlying WTP distribution, and ρ = 1. Thus:

\[
\begin{aligned}
z_{1j}\beta &= z_{2j}\beta \\
\varepsilon_{1j} &= \varepsilon_{2j} \\
\sigma_1 &= \sigma_2
\end{aligned}
\tag{4}
\]

These restrictions are tested using a likelihood ratio test to compare the bivariate probit (Model I) and interval data model (Model II).
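To make the interval data model concrete, the sketch below computes the probability of each DC1/DC2 answer pair when both answers are driven by a single latent WTP. It is a minimal illustration under stated assumptions, not the authors' estimation code: WTP is taken to be normal with mean `mu` and standard deviation `sigma` (a constant-only parameterisation chosen here for brevity), and all function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_probability(yes1, yes2, t1, t2, mu, sigma):
    """Probability of an answer pair when WTP ~ Normal(mu, sigma^2).

    yes1 and yes2 are True for 'yes' and False for 'no'. The design gives
    t2 > t1 after a 'yes' to DC1 and t2 < t1 after a 'no'.
    """
    cdf1 = normal_cdf((t1 - mu) / sigma)
    cdf2 = normal_cdf((t2 - mu) / sigma)
    if yes1 and yes2:          # yes, yes: WTP >= t2
        return 1.0 - cdf2
    if yes1 and not yes2:      # yes, no: t1 <= WTP < t2
        return cdf2 - cdf1
    if not yes1 and yes2:      # no, yes: t2 <= WTP < t1
        return cdf1 - cdf2
    return cdf2                # no, no: WTP < t2


def log_likelihood(responses, mu, sigma):
    """Sum of log probabilities over (yes1, yes2, t1, t2) tuples."""
    return sum(
        math.log(interval_probability(y1, y2, t1, t2, mu, sigma))
        for y1, y2, t1, t2 in responses
    )
```

For a given base bid the four answer pairs partition the WTP line, so their probabilities sum to one. Maximising `log_likelihood` over `mu` and `sigma` with any numerical optimiser would give the constant-only version of the model.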

Analysing responses to DC1 and DC2 separately (referred to as Models III and IV, respectively), as if elicited from independent DC experiments, assumes no correlation between the responses (ρ = 0). Again, this restriction is tested using a likelihood ratio test between Model I and Models III and IV.

Further, comparing the interval data model with responses to DC1 only (Model III) provides a test of the common finding that DBDC results in lower WTP estimates than SBDC. For simplicity all models are estimated with only a constant and the bid vector.

Models I–IV are re-estimated for ‘very certain’ and ‘less certain’ respondents to test if any observed preference anomalies persist. The restriction that preferences are the same across ‘very certain’ and ‘less certain’ respondents is tested using a likelihood ratio test. Given evidence that ‘very certain’ responses have higher external validity, it is hypothesised that preference anomalies may differ across groups.

⁴ The interval data model is the most usual method of analysis for DBDC data. Further, Alberini (1995) found welfare estimates from the model were relatively unbiased even when ρ was as low as 0.2.


2.2.2. Incorporating preference anomalies into empirical models

Regression analysis is used to investigate any preference anomalies found in Models I–IV above. The hypothesis tested is that respondents’ preferences shift between DC1 and DC2. Assuming responses to DC1 are based on respondents’ true WTP, then responses to DC2 are based on respondents’ true WTP plus the effect of a follow up question. This effect is captured through the inclusion of a structural shift parameter, δ (Alberini et al., 1997):

\[
\text{WTP}_2 = \text{WTP}_1 + \delta \tag{5}
\]

A negative coefficient indicates the follow up question increases respondents’ probability of rejecting the bid amount in DC2. This finding is consistent with several proposed explanations for the divergence between SBDC and DBDC, including prospect theory, indignation, and cost expectations. A positive coefficient would be consistent with yea saying.

Anchoring (or starting point bias) can also be incorporated into this model (Herriges and Shogren, 1996). Again responses to DC1 are based on respondents’ true WTP. However, WTP expressed in DC2 is based upon the weighted average of respondents’ WTP expressed at DC1 and t1. Respondents faced with the second bid level, t2, assume the true value of the good lies between t1 and t2. Accordingly, WTP expressed at DC2 (WTP2) is the weighted average of true WTP (WTP1) and the bid level t1:

\[
\text{WTP}_2 = (1 - \gamma)\,\text{WTP}_1 + \gamma t_1 \tag{6}
\]

where 0 ≤ γ ≤ 1. If γ = 0 no anchoring is present and WTP2 = WTP1 (responses to both DC1 and DC2 are based on the same underlying WTP). However, if γ > 0 anchoring is present and WTP2 ≠ WTP1. As γ tends to 1, anchoring is increased, and WTP2 tends to the base bid, t1.
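A small numerical sketch may help fix ideas about Eq. (6). The respondent below is hypothetical (a true WTP of £100 facing a base bid of £50) and the function name `anchored_wtp` is ours; the code simply evaluates the weighted average for three values of γ.

```python
def anchored_wtp(wtp1, t1, gamma):
    """WTP expressed at DC2 under weighted average anchoring, Eq. (6)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return (1.0 - gamma) * wtp1 + gamma * t1


# Hypothetical respondent: true WTP of 100 pounds, base bid of 50 pounds.
for gamma in (0.0, 0.25, 1.0):
    print(gamma, anchored_wtp(100.0, 50.0, gamma))
```

With γ = 0 the expressed value stays at £100, with γ = 0.25 it moves a quarter of the way towards the base bid (£87.50), and with γ = 1 it collapses to the base bid of £50.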

The presence of a structural shift and anchoring are tested using:

\[
\text{WTP}_i = \beta_0 + \beta_t t_i + \beta_D D + \gamma t_1 D \tag{7}
\]

where β0 is the constant term, incorporating respondents’ preference for the good’s provision, and ti is the bid amount at bound i, where i = 1, 2. A shift effect can be tested through consideration of δ = βD, where D is a dummy variable equal to 1 for DC2 and equal to 0 for DC1. Under the weighted average hypothesis 0 < γ < 1, where γ is the coefficient when t1 is included as a covariate in modelling responses to DC2. In the case of DC1, where there is no shift effect of the follow up question and no anchoring (D = 0), Eq. (7) would be reduced to:

\[
\beta_0 + \beta_t t_1 \tag{8}
\]

For DC2 (D = 1), Eq. (7) becomes:

\[
\beta_0 + \beta_t t_2 + \beta_D D + \gamma t_1 D \tag{9}
\]
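In practice Eq. (7) implies that each respondent contributes two stacked observations, one per bound. The sketch below shows one way such rows could be assembled before estimating a random effects probit; the column names (`id`, `yes`, `bid`, `D`, `t1_x_D`) and the function `stack_responses` are hypothetical choices for illustration and are not described in the paper.

```python
def stack_responses(respondent_id, t1, t2, yes1, yes2):
    """Return one row per bound carrying the covariates of Eq. (7)."""
    rows = []
    for bound, bid, answer in ((1, t1, yes1), (2, t2, yes2)):
        d = 1 if bound == 2 else 0      # D: 1 for the follow up question
        rows.append({
            "id": respondent_id,        # groups rows for the random effect
            "yes": int(answer),         # dependent variable
            "bid": bid,                 # t_i
            "D": d,                     # shift term (coefficient beta_D)
            "t1_x_D": t1 * d,           # anchoring term (coefficient gamma)
        })
    return rows


# Hypothetical respondent: 'yes' to 100 pounds, then 'no' to 200 pounds.
for row in stack_responses(1, 100, 200, True, False):
    print(row)
```

The DC1 row has both `D` and `t1_x_D` equal to zero, matching Eq. (8), while the DC2 row carries the shift dummy and the base bid, matching Eq. (9). Dropping `t1_x_D`, `D`, or both gives the covariate sets of the shift, anchoring and naïve models.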

Four random effects probit models are estimated (see Table 1): the naïve model with no controls (Model V); the shift effect model (Model VI); the anchoring model (Model VII); and the combined shift and anchoring model (Model VIII). These models are estimated for the whole sample and subgroups of respondents according to reported response certainty to base bid levels (certainty = 5 and certainty = 1, 2, 3, 4). Respondents reporting certainty = 5 are assumed to hold well-defined preferences and be less likely to base subsequent valuations on information provided in DC1. In this case the shift and anchoring parameters are expected to be insignificant.


2.2.3. Investigating anomalies through response patterns

Consideration is given here to the pattern of raw responses to explain preference anomalies (DeShazo, 2002). Hanemann and Kanninen (1999) tested response consistency based on a nonparametric test where the unconditional probability of stating ‘yes’ to bid amount, t, in DC1 is compared with the conditional probability of the same bid in DC2 having stated ‘yes’ to a lower bid, tL, or ‘no’ to a higher bid, tH. Accordingly, it is expected that consistent responses will satisfy the following conditions:

\[
\Pr\{\text{‘yes’ to } t\} = \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

or

\[
\Pr\{\text{‘yes’ to } t\} = \Pr\{\text{‘yes’ to } t \mid \text{‘no’ to } t_H\} \times \Pr\{\text{‘no’ to } t_H\}
\]

Applying this framework permits the proposed explanations for the divergences between SBDC and DBDC to be investigated.

2.2.3.1. Prospect theory. Prospect theory (Kahneman and Tversky, 1979) assumes respondents form a reference in answering ‘yes’ to DC1 (respondents answering ‘no’ are assumed not to form a reference). A higher follow up bid, DC2, is compared with this reference point. DC2 is negatively framed, thus:

\[
\Pr\{\text{‘yes’ to } t\} > \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

\[
\Pr\{\text{‘no’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\} > \Pr\{\text{‘yes’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

and

\[
\Pr\{\text{‘no’ to } t_L\} = \Pr\{\text{‘no’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

2.2.3.2. Guilt and Indignation. Bateman et al. (2001) proposed indignation occurs when respondents believe they struck a deal with the interviewer in answering ‘yes’ to DC1. This increases the probability of a negative follow up response. Conversely, when respondents state ‘no’ to DC1, guilt, or a sense of social responsibility, increases the probability of a positive response to follow up questions. The effect in the middle interval is dependent on the relative strength of the guilt and indignation effects, so the direction of the second relation below is ambiguous (marked ‘?’).

\[
\Pr\{\text{‘yes’ to } t\} > \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

\[
\Pr\{\text{‘no’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\} \;\;?\;\; \Pr\{\text{‘yes’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

and

\[
\Pr\{\text{‘no’ to } t_L\} > \Pr\{\text{‘no’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

2.2.3.3. Cost expectations. Carson et al. (1999) argue that respondents may interpret DC1 as the cost of providing the good. Accordingly, when respondents state ‘yes’ to DC1, DC2 is seen as an attempt by government to obtain additional funds beyond the actual cost of the good, reducing conditional ‘yes’ responses thus:

\[
\Pr\{\text{‘yes’ to } t\} > \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

Conversely, following a ‘no’ response to DC1, respondents perceive a lower quality of good will be provided, reducing conditional ‘yes’ responses to DC2 thus:

\[
\Pr\{\text{‘no’ to } t_L\} < \Pr\{\text{‘no’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

When the hypotheses at the upper and lower intervals are combined the predicted impact on the middle interval is:

\[
\Pr\{\text{‘no’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\} > \Pr\{\text{‘yes’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

Under the cost expectations explanation respondents do not express their valuation of the good in DC2 but rather react to the new information, based on the expectation that the good will ‘cost’ DC1.

2.2.3.4. Strategic behaviour. Proponents of this explanation suggest the presence of DC2 indicates price flexibility, compromising incentive compatibility and provoking strategic behaviour (Carson et al., 1999; DeShazo, 2002). It is argued that respondents are more likely to answer ‘no’ to DC2, regardless of whether DC2 is in the ascending or descending sequence. Thus:

\[
\Pr\{\text{‘yes’ to } t\} > \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

\[
\Pr\{\text{‘no’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\} > \Pr\{\text{‘yes’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

and

\[
\Pr\{\text{‘no’ to } t_L\} < \Pr\{\text{‘no’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

2.2.3.5. Yea saying. Yea-sayers will state ‘yes’ to any bid offered (Holmes and Kramer, 1995; Kanninen, 1995; Ready et al., 1996). This should not affect the descending sequence, given this starts with a ‘no’. Thus, the descending sequence is anomaly free and is used as a reference with which to compare the ascending sequence. The probability of respondents answering ‘yes’ to both DC1 and DC2 will be greater than the probability of stating ‘yes’ to DC1. Thus:

\[
\Pr\{\text{‘yes’ to } t\} < \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

\[
\Pr\{\text{‘no’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\} < \Pr\{\text{‘yes’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

and

\[
\Pr\{\text{‘no’ to } t_L\} < \Pr\{\text{‘no’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

2.2.3.6. Anchoring. The weighted average anchoring explanation assumes follow up responses to DC2 are based on a weighted average of DC1 and DC2 (Herriges and Shogren, 1996). Respondents in ascending sequences form an anchor at DC1, with DC2 interpreted as a lower weighted average bid, increasing the probability of ‘yes’ to DC2:

\[
\Pr\{\text{‘yes’ to } t\} < \Pr\{\text{‘yes’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\}
\]

Respondents in descending sequences anchor responses at the higher DC1 level, and DC2 is interpreted as a higher weighted average bid, decreasing the probability of ‘yes’ to DC2:

\[
\Pr\{\text{‘no’ to } t_L\} < \Pr\{\text{‘no’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

The effect upon the middle interval of the sequence is dependent on the relative strengths of effects in the upper and lower intervals, so its direction is ambiguous (marked ‘?’).

\[
\Pr\{\text{‘no’ to } t \mid \text{‘yes’ to } t_L\} \times \Pr\{\text{‘yes’ to } t_L\} \;\;?\;\; \Pr\{\text{‘yes’ to } t_L \mid \text{‘no’ to } t\} \times \Pr\{\text{‘no’ to } t\}
\]

Raw responses to DC1 and DC2 from the DBDC experiment are used to investigate the above hypotheses. The design of the bid vector allows comparison across four bid groupings (£25 and £50; £50 and £100; £100 and £200; £200 and £300). The significance of differences between conditional and unconditional probabilities is tested using a test of proportions. Response patterns are again investigated according to respondents’ self reported certainty following DC1. A priori, less certain respondents are expected to be more susceptible to using the information presented in DC1 to inform their valuations in DC2.
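As an illustration of how one such comparison could be carried out, the sketch below applies a pooled two-sample z-test of proportions to the upper interval of the £25 and £50 grouping, using full-sample counts from Table 2. The paper states only that “a test of proportions” was used, so the pooled z statistic, the treatment of the joint probability as a sample proportion, and the function name `two_proportion_z` are all assumptions made here.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-sample z statistic for a difference in proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_a - p_b) / se


# Full-sample counts from Table 2.
# Unconditional: 103 of 150 respondents said 'yes' to 50 pounds as base bid.
# Joint: of 157 respondents offered 25 pounds, 120 said 'yes' and 42 of those
# also said 'yes' to the 50 pound follow up, so conditional x marginal is
# (42 / 120) * (120 / 157) = 42 / 157.
z = two_proportion_z(103, 150, 42, 157)
print(round(z, 2))
```

A large positive statistic means the unconditional probability of ‘yes’ to £50 exceeds the joint probability from the ascending sequence, the direction predicted by prospect theory, indignation, cost expectations and strategic behaviour for the upper interval.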

3. Results

A summary of responses across bid levels, for the full sample and the subgroups according to certainty, is presented in Table 2. A priori expectations of the probability of ‘yes’ falling as the bid level increases were fulfilled for all the data. For the full sample the proportion of ‘yes’ responses to DC1 ranged from 76% for £25 to 23% for £300. A similar pattern was observed for the proportion of ‘yes’ responses to DC2 in the lower bound (76–21%) and the upper bound (35–20%). Across bounds ‘very certain’ respondents were more likely to state ‘yes’ to lower bid levels and less likely to state ‘yes’ to higher bid levels. For example, in the case of ‘very certain’ respondents (certainty = 5), the probability of ‘yes’ to the initial bid fell from 80% for £25 to 15% for £300. With ‘less certain’ respondents (certainty = 1, 2, 3, 4) the probability of ‘yes’ to the initial bid fell from 72% to 31%.

3.1. Comparison of willingness to pay estimates

Estimates of mean WTP from Models I–IV (Table 3) were in line with existing evidence. Willingness to pay estimated from DBDC data, using the interval data model (Model II), was lower than from the SBDC model of initial bids (Model III) (£104.34 < £141.91). In the bivariate probit model (Model I) rho (ρ) was negative and significant, implying a negative correlation between responses to DC1 and DC2 (ρ = −0.239; χ² (d.f.) = 9.928 (1)). This was confirmed by WTP estimates, where WTP estimated from Model III, using only initial responses, was higher than WTP estimated from responses to DC2 (Model IV). Likelihood ratio tests indicated restrictions imposed by Models II, III and IV were invalid compared to the unrestricted bivariate probit model (Model I) (Model I versus Model II: 239.71 ∼ χ²(2); Models III and IV versus Model I: 8.32 ∼ χ²(2)).

A likelihood ratio test indicated the restriction of equal preferences for ‘very certain’ and ‘less certain’ respondents was not valid in Models I–III (96.7 ∼ χ²(2), 53.54 ∼ χ²(2), and 16.52 ∼ χ²(2), respectively). For Model IV the likelihood ratio test statistic (0.16 ∼ χ²(2)) cannot reject the restriction that preferences were the same across ‘very certain’ and ‘less certain’ respondents.

Table 2
Responses to DBDC bid levels

| Base bid (DC1) | N | Yes N (%) | No N (%) | Upper bound (DC2) | Yes N (%) | No N (%) | Lower bound (DC2) | Yes N (%) | No N (%) |
|---|---|---|---|---|---|---|---|---|---|
| Full sample | | | | | | | | | |
| £25 | 157 | 120 (76.4) | 37 (23.6) | £50 | 42 (35) | 24 (60) | £10 | 28 (75.7) | 9 (24.3) |
| £50 | 150 | 103 (68.6) | 47 (31.3) | £100 | 35 (33.9) | 68 (66.1) | £25 | 31 (65.9) | 16 (34.1) |
| £100 | 172 | 92 (53.4) | 80 (46.5) | £200 | 20 (21.7) | 72 (78.3) | £50 | 44 (55) | 36 (45) |
| £200 | 151 | 56 (37.1) | 95 (62.9) | £300 | 13 (23.2) | 43 (76.8) | £100 | 42 (44.2) | 53 (55.8) |
| £300 | 172 | 39 (22.7) | 133 (77.3) | £400 | 8 (20.5) | 31 (79.5) | £200 | 28 (21) | 105 (79) |
| ‘Very certain’ (certainty = 5) | | | | | | | | | |
| £25 | 85 | 68 (80) | 17 (20) | £50 | 32 (47.1) | 36 (52.9) | £10 | 14 (82.4) | 3 (17.6) |
| £50 | 69 | 47 (68.1) | 22 (31.9) | £100 | 21 (44.7) | 26 (55.3) | £25 | 11 (50) | 11 (50) |
| £100 | 81 | 42 (51.9) | 39 (48.1) | £200 | 14 (33.3) | 28 (66.7) | £50 | 18 (46.2) | 21 (53.8) |
| £200 | 77 | 18 (23.4) | 59 (76.6) | £300 | 7 (38.9) | 11 (61.1) | £100 | 22 (37.3) | 37 (62.7) |
| £300 | 96 | 15 (15.6) | 81 (84.4) | £400 | 5 (33.3) | 10 (66.7) | £200 | 10 (12.3) | 71 (87.7) |
| ‘Less certain’ (certainty = 1–4) | | | | | | | | | |
| £25 | 72 | 52 (72.2) | 20 (27.8) | £50 | 10 (19.2) | 42 (80.1) | £10 | 14 (70) | 6 (30) |
| £50 | 80 | 55 (68.8) | 25 (31.3) | £100 | 14 (25.5) | 41 (74.5) | £25 | 20 (80) | 5 (20) |
| £100 | 89 | 50 (56.2) | 39 (43.8) | £200 | 6 (12) | 44 (88) | £50 | 26 (66.7) | 11 (28.2) |
| £200 | 74 | 38 (51.3) | 36 (48.6) | £300 | 6 (15.8) | 32 (84.2) | £100 | 20 (55.6) | 16 (44.4) |
| £300 | 74 | 23 (31.1) | 51 (68.9) | £400 | 3 (13) | 20 (86.9) | £200 | 18 (35.3) | 33 (64.7) |

Table 3
Welfare estimates (mean WTP, £) from DBDC data

| | Model I bivariate model | Model II interval model | Model III DC1 responses | Model IV DC2 responses |
|---|---|---|---|---|
| Full sample | | | | |
| Lower bound | 142.81 (142.27 to 143.42) | | 141.91 (141.15 to 142.27) | |
| Upper bound | −33.59 (−26.64 to −40.52) | | | 27.46 (26.14 to 28.78) |
| Overall | | 104.34 (95.93 to 112.77) | | |
| Observations | 802 | 802 | 802 | 802 |
| Log-likelihood | −984.71 | −1224.42 | −491.12 | −497.75 |
| Rho (ρ) | −0.239 | 1 | | |
| Chi-squared (d.f.) | 9.928 (1) | | | |
| ‘Very certain’ (certainty = 5) | | | | |
| Lower bound | 121.68 (120.46 to 121.74) | | 123.05 (122.05 to 123.33) | |
| Upper bound | 38.29 (36.31 to 40.26) | | | 10.73 (7.84 to 13.64) |
| Overall | | 95.37 (91.79 to 107.17) | | |
| Observations | 408 | 408 | 408 | 408 |
| Log-likelihood | −487.96 | −506.42 | −227.87 | −261.55 |
| Rho (ρ) | 0.182 | 1 | | |
| Chi-squared (d.f.) | 2.936 (1) | | | |
| ‘Less certain’ (certainty = 1, 2, 3, 4) | | | | |
| Lower bound | 178.72 (177.45 to 180.01) | | 177.65 (176.36 to 178.96) | |
| Upper bound | −204.44 (−679.47 to 270.59) | | | 31.73 (30.23 to 33.24) |
| Overall | | 112.28 (100.52 to 125.29) | | |
| Observations | 394 | 394 | 394 | 394 |
| Log-likelihood | −469.98 | −669.65 | −254.99 | −236.28 |
| Rho (ρ) | −0.628 | 1 | | |
| Chi-squared (d.f.) | 42.576 (1) | | | |

95% confidence intervals for welfare estimates are reported in parentheses, calculated by bootstrapping with 1000 replications.

For all models WTP estimated for ‘very certain’ respondents was lower than for the full sample, in line with existing literature (Johannesson et al., 1998; Blumenschein et al., 1998; Blumenschein et al., 2001). Further, within ‘very certain’ respondents WTP estimated using the interval data model was lower than WTP from the SBDC model of initial bids. Combined with a ρ significantly less than 1 (ρ = 0.182; χ² = 2.936) this implies a positive association between responses to DC1 and DC2. This suggests that preference anomalies persist when only ‘very certain’ respondents are considered.

For this group, likelihood ratio tests of the restrictions in Models II, III and IV compared with the unrestricted Model I indicated that whilst the restriction imposed by Model II was not valid (36.92 ∼ χ²(2)), the restriction that the models are independent cannot be rejected (Models III and IV versus Model I: 2.92 ∼ χ²(2)).
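The two statistics for ‘very certain’ respondents can be reproduced from the log-likelihoods in Table 3. The sketch below is illustrative rather than the authors' code: the helper names are hypothetical, and the p-value uses the fact that the upper tail of a chi-squared distribution with 2 degrees of freedom is exp(−x/2), with 2 degrees of freedom taken from the text.

```python
import math


def likelihood_ratio(loglik_unrestricted, loglik_restricted):
    """LR statistic: twice the gap between the maximised log-likelihoods."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_pvalue_2df(statistic):
    """Upper-tail probability of a chi-squared variable with 2 d.f."""
    return math.exp(-statistic / 2.0)


# Log-likelihoods for 'very certain' respondents, from Table 3.
model_1, model_2, model_3, model_4 = -487.96, -506.42, -227.87, -261.55

# Model II (interval data) against the unrestricted bivariate probit.
lr_interval = likelihood_ratio(model_1, model_2)
# Independent probits: the restricted log-likelihood is the sum of III and IV.
lr_independent = likelihood_ratio(model_1, model_3 + model_4)

print(round(lr_interval, 2), chi2_pvalue_2df(lr_interval) < 0.05)
print(round(lr_independent, 2), chi2_pvalue_2df(lr_independent) < 0.05)
```

The first statistic rejects the interval data restrictions at the 5% level, while the second does not reject independence, matching the conclusions reported for this subgroup.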

For ‘less certain’ respondents, WTP estimated using the interval data model from DBDC data was lower than WTP from the SBDC model of initial bids. Welfare estimates for this group were higher than for ‘very certain’ respondents and for the full sample. Here, ρ was significant and negative, with a stronger negative association than for the full sample, implying a negative association between responses to DC1 and DC2. Likelihood ratio tests rejected the restrictions imposed by Model II and Models III and IV compared with the unrestricted Model I (Model I versus Model II: 399.34 ∼ χ²(2); Models III and IV versus Model I: 42.58 ∼ χ²(2)).

3.2. Exploring anomalies based on regression analysis

The random effects probit models, incorporating controls for preference anomalies, are presented in Table 4. Model V presents the results of the random effects probit model with no anomaly tests incorporated. The shift parameter in Model VI, δ, was negative and significant, indicating respondents’ WTP differs significantly across bounds. In Model VII, the coefficient on t1 × D, γ in Eq. (7), was negative and significant across all samples, implying weighted average anchoring was not observed. The shift and anchoring effects were both included in Model VIII. For the full sample, the shift variable was significant and negative, and the anchoring variable was positive and significant at the 10% level. This suggests that when the effect of a shift in preferences across bounds was controlled for, a weak weighted average anchoring effect was present in the data. When Model VIII was re-estimated to include only ‘very certain’ respondents, the shift parameter remained significant but no significant anchoring effect was found. This implies the follow up question affected respondents’ stated WTP, but the anomalies were not explained by anchoring. For ‘less certain’ respondents, a significant anchoring effect was found. This result was in line with behavioural explanations in which respondents who are uncertain of their true WTP anchor on DC1 (Herriges and Shogren, 1996).
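To illustrate how the shift and anchoring terms enter such a model, the sketch below stacks the two responses of one respondent and evaluates the probability of a ‘yes’ using the rounded full-sample Model VIII coefficients printed in Table 4. It assumes a linear probit index of the form α + β·bid + δ·D + γ·(t1 × D), with D = 1 for the follow up question and t1 the initial bid, and it ignores the random effect. The function names, the example bids and the exact functional form are assumptions made for illustration, not the authors' estimation code.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def stack_responses(t1, t2):
    """Return regressor rows (bid, D, t1 x D) for DC1 and DC2."""
    return [
        {"bid": t1, "D": 0, "t1D": 0.0},        # initial question
        {"bid": t2, "D": 1, "t1D": float(t1)},  # follow up question
    ]


def prob_yes(row, alpha, beta, delta, gamma):
    """Probability of a 'yes' under the assumed linear probit index."""
    index = alpha + beta * row["bid"] + delta * row["D"] + gamma * row["t1D"]
    return normal_cdf(index)


# Rounded full-sample Model VIII coefficients as printed in Table 4.
coef = {"alpha": 0.673, "beta": -0.005, "delta": -0.544, "gamma": 0.0009}

# Hypothetical respondent: initial bid 50, follow up bid 100.
dc1, dc2 = stack_responses(t1=50, t2=100)
p_dc1 = prob_yes(dc1, **coef)
p_dc2 = prob_yes(dc2, **coef)

# Probability of 'yes' to 100 had it been presented as an initial bid.
p_100_initial = prob_yes({"bid": 100, "D": 0, "t1D": 0.0}, **coef)

print(p_dc1, p_dc2, p_100_initial)
```

Under these assumptions the negative shift parameter lowers the probability of accepting a given amount when it is presented as a follow up rather than as an initial bid, while the small positive anchoring coefficient only partly offsets this.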

3.3. Exploring anomalies based on response patterns

Results from the investigation of the raw data are shown in Table 5. Table 5a reports the results for the total group. Across intervals and groupings, response probabilities were significantly different at the 5% level (except the lower interval of


Table 4
Empirical test of preference anomalies

Variable        Model V             Model VI            Model VII           Model VIII
                Naïve model         Shift model         Anchoring model     Shift and anchoring model

Full sample
Constant        0.407 (7.537)***    0.628 (9.63)***     0.435 (7.91)***     0.673 (9.62)***
Bid             −0.004 (12.68)***   −0.004 (12.85)***   −0.003 (10.87)***   −0.005 (12.12)***
D                                   −0.418 (6.37)***                        −0.544
t1D                                                     −0.0012 (3.43)***   0.0009 (1.79)*
Observations    1604                1604                1604                1604
Log-likelihood  −1012.9947          −992.5204           −1007.0404          −990.91958
Rho (ρ)         8.32 × 10^−8        8.32 × 10^−8        8.32 × 10^−8        8.32 × 10^−8
                −0.0001             −0.0001             −0.0001             −0.0001

‘Very certain’ (certainty = 5)
Constant        0.590 (5.11)***     0.783 (5.83)***     0.591 (5.67)***     0.808 (5.46)***
Bid             −0.006 (−7.31)***   −0.007 (−7.48)***   −0.006 (−7.38)***   −0.007 (−6.67)***
D                                   −0.332 (−3.29)***                       −0.382 (−2.46)**
t1D                                                     −0.0012 (−2.27)**   0.0003 (0.043)
Observations    816                 816                 816                 816
Log-likelihood  −497.3287           −491.8025           −494.9089           −491.7094
Rho (ρ)         0.274               0.286               0.217               0.300
                0.092               0.090               0.075               0.094

‘Less certain’ (certainty = 1, 2, 3, 4)
Constant        0.396 (5.15)***     0.666 (7.25)***     0.412 (5.31)***     0.794 (7.98)***
Bid             −0.004 (−8.03)***   −0.004 (−7.97)***   −0.003 (−6.75)***   −0.005 (−8.61)***
D                                   −0.539 (−5.74)***                       −0.889 (−6.39)***
t1D                                                     −0.0009 (−1.86)*    0.003 (3.47)***
Observations    778                 778                 778                 778
Log-likelihood  −502.0453           −485.3973           −500.3123           −479.4125
Rho (ρ)         1.13 × 10^−7        1.13 × 10^−7        1.13 × 10^−7        1.13 × 10^−7
                0.00004             0.00004             0.00004             0.00004

(***), (**), (*) denote significance at the 1%, 5% and 10% levels, respectively. t-statistics reported in parentheses.

grouping four). This provides evidence of anomalous behaviour between DC1 and DC2. Responses were best explained by the guilt/indignation hypothesis (Bateman et al., 1999).

Table 5b reports patterns for ‘very certain’ respondents. In contrast to the full sample, response patterns were mixed: responses for the lower bid groups were consistent with prospect theory and indignation (without guilt), whereas at the higher bid groupings response patterns were consistent with well-defined preferences.

Table 5c reports response patterns for ‘less certain’ respondents. These patterns indicated significant indignation and guilt effects.
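The comparisons summarised in Table 5 rest on the idea that, with well-defined preferences, the proportion accepting a given amount should not depend on the path by which it is reached. For example, P(Y25Y50) on the bid increasing path and P(Y50) on the bid decreasing path both estimate the share of respondents with WTP of at least £50. One common way to compare two such proportions is a pooled two-sample z-test, sketched below. The test actually used by the authors is not stated in this section, and the subsample sizes of 100 are hypothetical placeholders because per-path counts are not reported here.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-sample z-test for equality of two proportions.

    Returns the z statistic and the two-sided p-value.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Table 5a, grouping 1, upper interval: 26.6% versus 67.8%.
# n1 and n2 are hypothetical; the per-path sample sizes are not given here.
z, p_value = two_proportion_z_test(0.266, 100, 0.678, 100)
print(z, p_value)
```

A negative z statistic here corresponds to the ‘<’ pattern in the table: fewer respondents accept the higher amount when it is reached through a follow up question than when it is the initial bid.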


Table 5a
Results from tests for consistency of responses (all respondents N = 802)

                  Bid increasing path  Proportion (%)  Bid decreasing path  Proportion (%)  Observed pattern  Possible explanations

Grouping 1: £25 and £50
Upper interval    P(Y25Y50)            26.6            P(Y50)               67.8            <***              Guilt/indignation
Middle interval   P(Y25N50)            49.7            P(N50Y25)            20.3            >***
Lower interval    P(N25)               23.5            P(N50N25)            10.2            >**

Grouping 2: £50 and £100
Upper interval    P(Y50Y100)           22.9            P(Y100)              52.9            <***              Guilt/indignation
Middle interval   P(Y50N100)           44.7            P(N100Y50)           25.3            >***
Lower interval    P(N50)               30.9            P(N100N50)           20.7            >**

Grouping 3: £100 and £200
Upper interval    P(Y100Y200)          11.5            P(Y200)              37.1            <***              Guilt/indignation
Middle interval   P(Y100N200)          41.4            P(N200Y100)          27.8            >***
Lower interval    P(N100)              45.9            P(N200N100)          35.1            >**

Grouping 4: £200 and £300
Upper interval    P(Y200Y300)          8.6             P(Y300)              22.5            <***              Guilt/indignation
Middle interval   P(Y200N300)          28.4            P(N300Y200)          16.1            >***
Lower interval    P(N200)              62.9            P(N300N200)          60              >

(***), (**), (*) denote significant differences between proportions at the 1%, 5%, and 10% levels, respectively.

Table 5b
Results from tests for consistency of responses (‘very certain’ respondents [certainty = 5] N = 408)

                  Bid increasing path  Proportion (%)  Bid decreasing path  Proportion (%)  Observed pattern  Possible explanations

Grouping 1: £25 and £50
Upper interval    P(Y25Y50)            27.3            P(Y50)               57.7            <***              Prospect theory/indignation
Middle interval   P(Y25N50)            42.4            P(N50Y25)            15.7            >***
Lower interval    P(N25)               20              P(N50N25)            15.7            =***

Grouping 2: £50 and £100
Upper interval    P(Y50Y100)           30              P(Y100)              50.6            <***              Prospect theory/indignation
Middle interval   P(Y50N100)           38.6            P(N100Y50)           21.7            >***
Lower interval    P(N50)               31.4            P(N100N50)           27.7            =***

Grouping 3: £100 and £200
Upper interval    P(Y100Y200)          16.8            P(Y200)              23.3            =***              Consistent
Middle interval   P(Y100N200)          33.7            P(N200Y100)          28.0            =***
Lower interval    P(N100)              49.4            P(N200N100)          48.1            =***

Grouping 4: £200 and £300
Upper interval    P(Y200Y300)          9.1             P(Y300)              16.3            >*                Unclear
Middle interval   P(Y200N300)          14.3            P(N300Y200)          10.2            =***
Lower interval    P(N200)              76.6            P(N300N200)          73.5            =***

(***), (**), (*) denote significant differences between proportions at the 1%, 5%, and 10% levels, respectively.


Table 5c
Results from tests for consistency of responses (‘less certain’ respondents [certainty = 1–4], N = 389)

                  Bid increasing path  Proportion (%)  Bid decreasing path  Proportion (%)  Observed pattern  Possible explanations

Grouping 1: £25 and £50
Upper interval    P(Y25Y50)            13.8            P(Y50)               69.1            <***              Indignation/guilt
Middle interval   P(Y25N50)            58.3            P(N50Y25)            24.7            >***
Lower interval    P(N25)               27.8            P(N50N25)            6.2             >***

Grouping 2: £50 and £100
Upper interval    P(Y50Y100)           17.3            P(Y100)              54.9            <***              Indignation/guilt
Middle interval   P(Y50N100)           51.8            P(N100Y50)           28.6            >***
Lower interval    P(N50)               30.8            P(N100N50)           16.5            >***

Grouping 3: £100 and £200
Upper interval    P(Y100Y200)          6.6             P(Y200)              51.3            <***              Indignation/guilt
Middle interval   P(Y100N200)          48.3            P(N200Y100)          27              >***
Lower interval    P(N100)              45              P(N200N100)          21.6            >***

Grouping 4: £200 and £300
Upper interval    P(Y200Y300)          8.1             P(Y300)              31.5            <***              Indignation/prospect theory
Middle interval   P(Y200N300)          43.2            P(N300Y200)          19.5            >**
Lower interval    P(N200)              48.6            P(N300N200)          44.7            =

(***), (**), (*) denote significant differences between proportions at the 1%, 5%, and 10% levels, respectively.

4. Discussion

Consistent with previous studies, evidence was found that welfare estimates derived from DBDC data were lower than those generated from SBDC. Further, the bivariate probit model indicated low correlation between DC1 and DC2. These results held when models were re-estimated according to respondents’ certainty. Rho (ρ) was positive in the case of ‘very certain’ respondents and negative for ‘less certain’ respondents, suggesting behavioural motivations differed between groups.

These findings were consistent with results from the random effects probit model incorporating tests of preference shifts and anchoring. All models provided evidence of a significant shift effect between DC1 and DC2. This suggested DC1 and DC2 were not drawn from the same underlying distribution, and provided evidence of incentive incompatibility of the follow up question. Weak evidence of a weighted average anchoring effect was found for the full sample. When ‘very certain’ respondents were considered no significant anchoring effect was found, again suggesting different behavioural motivations across certainty groups.

Response patterns in the raw data indicated results consistent with the indignation/guilt hypothesis for the full sample. These results are in contrast to DeShazo (2002), who reported evidence of prospect theory (although DeShazo’s results were consistent with indignation). ‘Very certain’ respondents at lower bid levels had similar response patterns to those reported by DeShazo, whilst response patterns at higher bid groupings were consistent with theory.

Overall, greater response consistency was found at higher bid levels. Alberini et al. (1997) argued that when bid amounts are close to true WTP, respondents face a more difficult task. Accordingly, when bid amounts are (considerably) higher than respondents’ WTP, the task is simplified. Thus, respondents should find it easier to provide consistent responses (and be more certain of their response). An alternative explanation is that when respondents are faced with higher bid amounts


they are more considered in their responses, as they perceive making a ‘mistake’ to be more costly.

‘Less certain’ response patterns were consistent with the indignation/guilt hypothesis. It was not possible to distinguish between prospect theory and indignation for the highest bid groupings. Bateman et al. (2002), through debriefing focus groups, reported qualitative evidence of an indignation effect. Further research is needed to investigate the findings from the analysis of raw responses, and qualitative data is likely to be useful here.

DeShazo (2002) concluded only responses in the ascending sequence were anomalous and thus recommended only ‘no’ responses be followed by a subsequent bid. The results of this study found anomalies in both the ascending and descending sequences. This suggests the presence of any follow up question creates anomalous responses. Our results have potential implications for all multiple bounded elicitation formats, including the payment card, iterative bidding and random card sort (Smith, 2006). For example, in payment card formats, respondents consider each of the amounts presented to them, and state if they would be willing to pay that amount. Whilst respondents have advanced disclosure (they see all amounts they face), the study presented here also informed respondents they would face a second valuation question that would be higher if they stated ‘yes’ to DC1 and lower if they stated ‘no’ to DC1. Anomalous behaviour persisted, however. Future research should investigate these issues within the context of payment cards, considering ascending and descending bid amounts.

The NOAA panel recommended the use of dichotomous choice contingent valuation based on the perceived incentive compatibility of the method. The move from a single to double bounded dichotomous choice structure may compromise this incentive compatibility, thus implying the use of DBDC should be avoided in favour of SBDC (Carson et al., 1999). If anomalies are attributed to incentive incompatibility introduced by the follow up question, Carson et al. (1999) predict response patterns consistent with the cost expectations explanation. In both this study and DeShazo (2002) anomalies were not consistent with this explanation, but rather prospect theory and indignation. Indignation may also be a consequence of incentive incompatibility, as the respondent does not truthfully reveal their valuation of the good.

The explanations proposed for anomalous behaviour (with the exception of yea saying) assume utility maximisation at DC1. However, patterns observed may indicate respondents were not utility maximising, and did not hold well-defined preferences for the good in question (McFadden, 1994; Sugden, 1999). In such cases respondents may use the contingent valuation question frame to aid their preference formation. Gregory et al. (1993) acknowledge this when recommending researchers ‘should function not as archaeologists, carefully uncovering what is there, but as architects, working to build a defensible expression of value’. This would indicate preferences are malleable; dependent upon, and constructed in response to the contingent valuation frame (Bateman and Mawby, 2004; Hanley and Shogren, 2005; Sugden, 2005).

The existence of well-formed preferences may be related to respondents’ familiarity with, or experience of, the good being valued (Boyle et al., 1993; Roach et al., 1999; Braga and Starmer, 2005; McCollum and Boyle, 2005). Evidence here is mixed. Boyle et al. (1993) and McCollum and Boyle (2005), when looking at preferences for boating on the Colorado River and moose hunting in Maine, found no significant difference between WTP estimates elicited from more experienced and less experienced respondents. In contrast Roach et al. (1999), in a study concerned with boating preferences, found respondents’ experience influenced their WTP estimates. They question whether respondents with little or no experience are able to provide valid responses to stated preference tasks. Many applications of stated preference techniques in health economics elicit the preferences of patients for a treatment they have experienced. In these cases


respondents can reasonably be expected to have a greater familiarity with the good in question. Future work should consider if such anomalies as reported in this paper persist under varying degrees of respondent familiarity.

San Miguel and Ryan (2003), in a series of discrete choice experiments valuing goods that differ in familiarity (a supermarket, a dentist appointment, and bowel cancer screening), did not find evidence of preference construction. As respondents are presented with a number of choices, does this elicitation format allow individuals to form their preferences? Indeed, many DCEs include a set of warm up questions allowing respondents to form their preferences.

When respondents are familiar with the good in question researchers may be able to carefully uncover existing preferences (acting as archaeologists). Conversely, when respondents are asked to value a good they have no direct experience of, in this case an air ambulance service, respondents’ preferences may build on the information they receive in the question frame. Hanley and Shogren (2005) discuss the presence of a middle ground between well-defined preferences and constructed preferences, where respondents have a range of uncertainty. This study found ‘very certain’ respondents exhibited less anomalous behaviour than ‘less certain’ respondents, perhaps indicating ‘less certain’ respondents were more dependent on the question frame to form their preferences.

Providing respondents with the opportunity to ‘learn’ their preferences within a CV experiment may overcome some of the anomalies observed. Within health economics Dolan et al. (1999), eliciting patients’ views on priority setting over two focus group sessions, found views on priority setting were systematically different after respondents were given the opportunity for discussion and deliberation. This led the authors to conclude that the results of ‘surveys, which do not allow respondents the time or opportunity to reflect on their preferences’ may be doubtful. Future stated preference studies should consider information provision and how to best provide information to respondents prior to a valuation task. The citizens’ jury, a promising approach that allows respondents to reflect on their preferences, has been combined with stated preference techniques within the environmental economics literature (Kenyon et al., 2001; Kenyon et al., 2003; MacMillan et al., 2002; MacMillan et al., in press). This methodology combines a contingent valuation task with elements of participatory analysis. Respondents are given time to reflect upon their preferences and obtain larger amounts of information, especially in the case of unfamiliar goods. Of particular relevance is a study by MacMillan et al. (in press) that valued two goods, Red Kite preservation and renewable energy. Respondents’ familiarity with the goods differed; respondents were unfamiliar with Red Kite preservation and familiar with renewable energy. Findings indicated WTP values were significantly different for the unfamiliar good after respondents were given the opportunity to deliberate and obtain additional information. The authors concluded that CV could act as a ‘preference engine’.

The presence of observed anomalies in the application of CV questions its use at the policy level. Despite the UK Treasury’s Green Book recommending the use of monetary measures to value non-marketed goods, cost-benefit analysis (CBA) has not been used at the policy level within health care (Drummond et al., 2005). Anomalies, such as those reported in this paper, undermine the validity of CV. A recent special issue of the journal Environmental and Resource Economics (Volume 32, Number 1, September 2005) considered how stated preference (SP) research could cope with observed anomalies. In the opening paper Sugden (2005) proposes a framework for the discussion of observed anomalies: recognise the aspiration of SP techniques as legitimate; best practice in SP is not a closed world (evidence may be found from other judgement and decision-making tasks); and adopt a precautionary principle by developing methodologies to cope with anomalies. Following this he summarised five alternative ways to respond to the problems


anomalies create in CBA: be pragmatic, allow for preference discovery in experiments, develop new theories of preferences, consider market simulation, and measure happiness (Sugden, 2005). The interested reader is directed to that special issue.

In conclusion, this study found serious anomalies in a widely applied contingent valuation technique: the DBDC format. In line with previous studies WTP estimated from DBDC was lower than WTP from SBDC. Results of regression models and tests of raw data response patterns indicated anomalous behaviour across bounds. Anomalies were present for ascending and descending bounds, and for less certain and very certain respondents, although they were greater for less certain respondents. The results may indicate that for some health interventions, where there is limited familiarity, respondents do not hold theoretically consistent preferences. The implication of this conclusion could be seen as worrying. One interpretation is that the results of such experiments cannot be used within an economic evaluation framework. However, once we recognise that economists can act as architects, working to build preferences, rather than archaeologists, working to uncover them, CV tasks can be designed accordingly. If CV is to be adopted by policymakers the health economics community needs to discuss how to deal with anomalies.

Acknowledgements

The authors gratefully acknowledge comments from Ian Bateman on an earlier version of this paper. Financial support from the Department of Health, University of Aberdeen and Health Foundation is acknowledged. The Chief Scientist Office of the Scottish Executive Health Department (SEHD) funds HERU. The usual disclaimer applies.

References

Alberini, A., 1995. Efficiency vs bias of willingness to pay estimates: bivariate and interval data models. Journal of Environmental Economics and Management 29, 169–180.

Alberini, A., Kanninen, B., Carson, R., 1997. Modelling response incentive effects in dichotomous choice contingent valuation data. Land Economics 73, 309–324.

Arrow, K., Solow, R., Portney, P.R., Leamer, E.E., Radner, R., Schuman, H., 1993. Report of the NOAA panel on contingent valuation. Federal Register 58, 4601–4614.

Asfaw, A., von Braun, J., 2005. Innovations in health care financing: new evidence on the prospect of community health insurance schemes in the rural areas of Ethiopia. International Journal of Health Care Finance and Economics 5, 241–253.

Bateman, I.J., Mawby, J., 2004. First impressions count: interviewer appearance and information effects in stated preference studies. Ecological Economics 49, 47–55.

Bateman, I.J., Langford, I.H., Jones, A.P., Kerr, G.N., 2001. Bound and path effects in double and triple bounded dichotomous choice contingent valuation. Resource and Energy Economics 23, 191–213.

Bateman, I.J., Carson, R.T., Day, B., Hanemann, M., Hanley, N., Hett, T., Jones-Lee, M., Loomes, G., Mourato, S., Ozdemiroglu, E., Pearce, D.W., Sugden, R., Swanson, J., 2002. Economic Valuation with Stated Preference Techniques: A Manual. Edward Elgar.

Blumenschein, K., Johannesson, M., Yokoyama, K.K., Freeman, P.R., 2001. Hypothetical versus actual willingness to pay in the health care sector: results from a field experiment. Journal of Health Economics 20, 441–457.

Blumenschein, K., Johannesson, M., Blomquist, G.C., Liljas, B., O’Conor, R.M., 1998. Experimental results on expressed certainty and hypothetical bias in contingent valuation. Southern Economic Journal 65, 169–177.

Boyle, K., Bishop, R.C., Welsh, M.P., 1985. Starting point bias in contingent valuation bidding games. Land Economics 61, 188–194.

Boyle, K., Welsh, M.P., Bishop, R.C., 1993. The role of question order and respondent experience in contingent valuation studies. Journal of Environmental Economics and Management 25, 45–55.

Boyle, K., Johnston, F.R., McCollum, D.W., 1997. Anchoring and adjustment in single bounded, contingent valuation questions. American Journal of Agricultural Economics 79, 1495–1500.


Braga, J., Starmer, C., 2005. Preference anomalies, preference elicitation and the discovered preference hypothesis. Environmental and Resource Economics 32, 55–89.

Cameron, T., Quiggin, J., 1994. Estimation using contingent valuation data from dichotomous choice with follow up questionnaire. Journal of Environmental Economics and Management 27, 218–234.

Carson, R., Groves, T., Machina, M., 1999. Incentive and informational properties of preference questions—plenary address. In: Proceedings of the Ninth Annual Conference of the European Association of Environmental and Resource Economics (EAERE), Oslo, Norway, June 1999.

Clarke, P.M., 2000. Valuing the benefits of mobile mammographic screening units using the contingent valuation method. Applied Economics 32, 1647–1655.

DeShazo, J.R., 2002. Designing transactions without framing effects in iterative question formats. Journal of Environmental Economics and Management 43, 360–385.

Diener, A., O’Brien, B., Gafni, A., 1998. Health care contingent valuation studies: a review and classification of the literature. Health Economics 7, 313–326.

Dolan, P., Cookson, R., Ferguson, B., 1999. The effect of group discussions on the public’s view regarding priorities in health care. British Medical Journal 318, 916–919.

Donaldson, C., Thomas, R., Torgerson, D., 1997. Validity of open-ended and payment scale approaches to eliciting willingness to pay. Applied Economics 29, 79–84.

Drummond, M., Sculpher, M., Torrance, G., O’Brien, B., Stoddart, G., 2005. Methods for the Economic Evaluation of Health Care Programmes, third ed. Oxford University Press.

Frew, E.J., Whynes, D.K., Wolstenholme, J.L., 2003. Eliciting willingness to pay: comparing closed-ended with open-ended and payment scale formats. Medical Decision Making 23, 150–159.

Gregory, R., Lichtenstein, S., Slovic, P., 1993. Valuing environmental resources: a constructive approach. Journal of Risk and Uncertainty 7, 171–197.

Hackl, F., Pruckner, G.J., 2005. Warm glow, free riding and vehicle neutrality in a health-related contingent valuation study. Health Economics 14, 293–306.

Hanemann, W.M., 1984. Welfare evaluations in contingent valuation experiments with discrete responses. American Journal of Agricultural Economics 66, 332–341.

Hanemann, W.M., Loomis, J., Kanninen, B., 1991. Statistical efficiency of double bounded dichotomous choice contingent valuation. American Journal of Agricultural Economics 73, 1255–1263.

Hanemann, W.M., Kanninen, B., 1999. The statistical analysis of discrete-response CV data. In: Bateman, I.J., Willis, K.G. (Eds.), Valuing Environmental Preferences: Theory and Practice of the Contingent Valuation Method in the US, EU, and Developing Countries.

Hanley, N., Shogren, J.F., 2005. Is cost-benefit analysis anomaly-proof? Environmental and Resource Economics 32, 13–34.

Herriges, J.A., Shogren, J.F., 1996. Starting point bias in dichotomous choice with follow up questioning. Journal of Environmental Economics and Management 30, 112–131.

Hoehn, J., Randall, A., 1987. A satisfactory benefit cost indicator from contingent valuation. Journal of Environmental Economics and Management 14, 112–131.

Holmes, T.P., Kramer, R.A., 1995. An independent sample test of yea-saying and starting point bias in dichotomous choice contingent valuation. Journal of Environmental Economics and Management 29, 121–132.

Hunter, J.E., 2001. The desperate need for replications. Journal of Consumer Research 28, 149–158.

Johannesson, M., Liljas, B., Johansson, P.O., 1998. An experimental comparison of dichotomous choice valuation questions and real purchase decisions. Applied Economics 30, 643–647.

Johnston, D., Ryan, M., 2002. Air ambulances: public perceptions of value. Report prepared for Department of Health.

Kahneman, D., Tversky, A., 1979. Prospect theory: an analysis of decision under risk. Econometrica 47, 263–291.

Kanninen, B., 1995. Bias in discrete response contingent valuation. Journal of Environmental Economics and Management 28, 114–125.

Kennedy, C.A., 2002. Revealed preference valuation compared to contingent valuation: radon-induced lung cancer prevention. Health Economics 11, 585–598.

Kenyon, W., Hanley, N., Nevin, C., 2001. Citizens’ juries: an aid to environmental valuation? Environmental Planning C: Government and Policy 19, 557–566.

Kenyon, W., Nevin, C., Hanley, N., 2003. Enhancing environmental decision-making using citizens’ juries. Local Environment 8, 221–232.

Klose, T., 1999. The contingent valuation method in health care. Health Policy 47, 97–123.

Liu, J.T., Hammitt, J.K., Wang, J.D., Liu, J.L., 2000. Mother’s willingness to pay for her own and her child’s health: a contingent valuation study in Taiwan. Health Economics 9, 319–326.


MacMillan, D., Philip, L., Hanley, H.D., Alvarez-Farizo, B., 2002. Valuing non-market benefits of wild goose conservation: a comparison of interview and group-based approaches. Ecological Economics 43, 49–59.

MacMillan, D., Hanley, N., Lienhoop, N. Contingent valuation: environmental polling or preference engine? Ecological Economics, in press.

McCollum, D.W., Boyle, K.J., 2005. The effect of respondent experience/knowledge in the elicitation of contingent values: an investigation of convergent validity, procedural invariance and reliability. Environmental and Resource Economics 30, 23–33.

McFadden, D., 1974. Conditional logit analysis of qualitative choice behaviour. In: Zarembka, P. (Ed.), Frontiers in Econometrics. Academic Press, New York.

McFadden, D., 1994. Contingent valuation and social choice. American Journal of Agricultural Economics 76, 689–708.

Prosser, L.A., Bridges, C.B., Uyeki, T.M., Rego, V.H., Ray, G.T., Meltzer, M.I., Schwartz, B., Thompson, W.W., Fukuda, K., Lieu, T.A., 2005. Values for preventing influenza-related morbidity and vaccine adverse events in children. Health and Quality of Life Outcomes 3.

Roach, B., Boyle, K., Bergstrom, J.C., Reiling, S.D., 1999. The effect of instream flows on whitewater visitation and consumer surplus: a contingent valuation application to the Dead River, Maine. Rivers 7, 11–20.

Ready, R.C., Buzby, J.C., Hu, D., 1996. Differences between continuous and discrete contingent value estimates. Land Economics 72, 397–411.

Ryan, M., Scott, D.A., Donaldson, C., 2004. Valuing health care using willingness to pay: a comparison of the payment card and dichotomous choice methods. Journal of Health Economics 23, 237–258.

San Miguel, F., Ryan, M., 2003. Revisiting the axiom of completeness in healthcare. Health Economics 12, 295–307.

Smith, R.D., 2003. Construction of the contingent valuation market in health care: a critical assessment. Health Economics 12, 609–628.

Smith, R.D., 2006. It’s not just what you do, it’s the way that you do it: the effect of different payment card formats and survey administration on willingness to pay for health gain. Health Economics 15, 281–293.

Sugden, R., 1999. Alternatives to the neo-classical theory of choice. In: Bateman, I.J., Willis, K.G. (Eds.), Valuing Environmental Preferences: Theory and Practice of the Contingent Valuation Method in the US, EU, and Developing Countries. Oxford University Press.

Sugden, R., 2005. Anomalies and stated preference techniques: a framework for coping strategies. Environmental and Resource Economics 32, 1–12.

Whitehead, J.C., 2002. Incentive compatibility and starting point bias in iterative valuation questions. Land Economics 78, 285–297.