QALYs for resource allocation: probably not and certainly not now



POINT OF VIEW


Colin Burrows
Graduate School of Management, Monash University

Kaye Brown
National Centre for Health Program Evaluation, Monash University

Abstract

Quality-adjusted life years (QALYs) have the attractive characteristic of combining morbidity and mortality into a single index which purports to measure the outcomes of health interventions. Their primary aim, when combined with cost, is to permit comparisons across candidate spending programs and thereby promote economic efficiency in the use of rationed funds. QALYs, in fact, comprise a family of measures with major differences in approach and many variations in construction, process and methods of measurement. A necessary unifying characteristic is the ethical assumption of utilitarianism. The paper examines the state of the art in the development of QALY measures. It concludes that they fall far short of requirements for their advocated use in resource allocation decisions. Furthermore, their demands on measurement for this purpose are such that it is unlikely that methodological problems can be solved. (Aust J Public Health 1993; 17: 278-86)

What follows is a dissenting opinion on the proposition put by Schwartz et al.1 Considerable attention will be devoted to the two particular aspects of QALYs on which we spoke at the symposium, but we feel obliged to do more than that. The Schwartz et al. article goes further than the substance of the contributions at the symposium by the three authors and is, in effect, an expanded argument for the use of QALYs in cost-effectiveness analyses as the primary basis for rationing health expenditures. We think this argument cannot be sustained for technical, administrative and ethical reasons.

Based on the authors’ contributions to a symposium on quality-adjusted life years held at the annual meeting of the Public Health Association of Australia in Canberra, September 1992. Correspondence to Colin Burrows, Graduate School of Management, Monash University, Clayton, Vic 3168. Fax (03) 565 5499.

The basic premise of their paper is that ‘QALYs have evolved as the recommended way to compare the effects of disparate health interventions (clinical decisions as well as public health programs)’ (p. 272). On the perfectly acceptable general argument that, under conditions of rationing, funds should go where the benefits generated exceed those produced by putting resources into an alternative use, QALYs are favoured as an expression of this relationship in a single index that combines both quantity and quality of life.

This is both the appeal of QALYs and the problem. It is administratively appealing and probably emotionally comforting to have a simple criterion for complex decisions. To generate such a simple criterion, which can be technically and ethically justified, is the problem.

278 AUSTRALIAN JOURNAL OF PUBLIC HEALTH 1993 VOL. 17 NO. 3

As an example of the problem faced in trying to make decisions where alternative outcomes yield differing qualities of life, Schwartz et al. refer to the early analysis by Klarman et al. of end-stage renal disease where transplants result in higher quality years of life than does dialysis.1 They conclude that, even though the quality-of-life weighting was arbitrary, it produced a ‘more realistic analysis than no quality adjustment at all’ (p. 272). This is the archetypal straw man (or should it be straw person?) argument because the gross difference in quality of life for transplant and dialysis patients, especially in the mid 1960s, does not require anything like the measurement ‘precision’ of a single index. Most importantly, such an exercise would make no difference to this policy decision, which will be based not on the manifest improvement in quality of life, but on the availability of suitable donor organs.

This type of argument is carried over to the final paragraph of the article where the alternative to QALYs is asserted to be ‘the often ill-informed political process that currently operates ...’ (p. 277). This not only implies very unflattering characteristics and behaviours of clinicians and those who make policy decisions, but it avoids the difficult problem: what to do when there are not gross and obvious differences in outcomes.

It is here that the apparent benefits of QALYs are revealed because, with the complexities of the values of lives reduced to an index between zero and one, it is a simple task to calculate marginal cost-effectiveness ratios, as per their Table 1. But therein lies a big problem because it does not take much in the way of percentage error in the numerator and denominator to yield substantial changes in the rankings of the end product of such analyses, the cost per QALY gained.

Consider Table 1: in making adjustments to the numerator and denominator, we will simply point to the sources of potential error raised by Schwartz et al. themselves (and elaborated on later) and invite readers to look at the costs included in the many cost-per-QALY analyses in the literature.2 There they will find great variation in the cost categories and definitions used, the extent of costs sought and methods of valuation. This will always be so because there are no natural boundaries to effects of interventions, availability of data differs across types of interventions and some methods of valuation are unresolved (for example, discount rates) or definable only in general terms.

If we assume, for the candidate interventions in Table 1, a relatively small potential error of 10 per cent in the cost figures and a 0.02 range about the quality adjustments, both eminently defensible, the effects on the marginal costs per QALY are large. For the A to B comparison, the $23 077 per life year gained becomes $41 489 at the upper limit and $12 651 at the lower. For the sake of brevity, we leave it to readers to make their own assumptions about error limits and do similar exercises for the empirical studies in the literature, but the point has been well recognised.3,4 If the rank order of the ratios determines policy priorities, as Schwartz et al. suggest (p. 275), it is discomforting to know that rankings are likely to be highly sensitive to small errors in measurement.
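The sensitivity of a marginal cost-per-QALY ratio to small input errors can be illustrated with a minimal sketch. The inputs below are hypothetical and are not the figures in Schwartz et al.'s Table 1, which is not reproduced here. The sketch also assumes that a '0.02 range' means plus or minus 0.02 on each quality weight, and that the cost and weight errors can fall in opposite directions for the two interventions.

```python
def cost_per_qaly(cost_a, cost_b, qalys_a, qalys_b):
    """Marginal cost per QALY gained in moving from intervention A to B."""
    return (cost_b - cost_a) / (qalys_b - qalys_a)


def ratio_limits(cost_a, cost_b, years_a, years_b, weight_a, weight_b,
                 cost_error=0.10, weight_error=0.02):
    """Return (lower, base, upper) marginal cost per QALY under the stated errors."""
    base = cost_per_qaly(cost_a, cost_b,
                         years_a * weight_a, years_b * weight_b)
    # Upper limit: largest cost difference divided by smallest QALY gain.
    upper = cost_per_qaly(cost_a * (1 - cost_error), cost_b * (1 + cost_error),
                          years_a * (weight_a + weight_error),
                          years_b * (weight_b - weight_error))
    # Lower limit: smallest cost difference divided by largest QALY gain.
    lower = cost_per_qaly(cost_a * (1 + cost_error), cost_b * (1 - cost_error),
                          years_a * (weight_a - weight_error),
                          years_b * (weight_b + weight_error))
    return lower, base, upper


# Hypothetical inputs: A costs $10 000 for 5 years at weight 0.70,
# B costs $25 000 for 6 years at weight 0.75.
lower, base, upper = ratio_limits(10_000, 25_000, 5, 6, 0.70, 0.75)
print(f"lower {lower:,.0f}  base {base:,.0f}  upper {upper:,.0f}")
```

With these invented inputs a base ratio of $15 000 per QALY moves to roughly $9 400 at one limit and $23 700 at the other, which is the same order of spread as that reported for the A to B comparison above.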

QALYs and clinical trials

Here, too, it is opportune to make a clear distinction between the use of quality-of-life measures in clinical trials and in other applications. Much of the empirical work on health status assessment can be described by a cross-classification of applications that distinguishes between three functions of measurement (comparison or discrimination, evaluation, and allocation) and two focuses (the community and the individual). Clinical trials occupy one cell in the resulting matrix (evaluation, individual) and program trials another (evaluation, community). Needless to say, the validity, and hence the applicability, of any given quality-of-life instrument must be established for the particular function and focus.

This has important repercussions for applications concerned with allocation because the technical relationships among inputs and outcomes necessarily underpin answers about the relative efficiency of interventions. In particular, if direct comparisons are made for purposes of resource allocation, it is not the efficacy of the intervention that determines efficiency but its community effectiveness.5 In the domain of public health where long chains and weak links frequently obtain, the distinction is important.

Clinical trials should be viewed as a special case when dealing with quality-of-life measures. They are assessing efficacy in well-chosen populations under highly controlled conditions; they are concerned with hypothesis-testing and minimum clinical significance; they are looking for strong and consistent effects over several trials; they are concerned with information-seeking and disaggregated data about the impact of an intervention. Importantly, trials are about local comparisons, not global comparisons for decisions about resource allocation, which are the province of community effectiveness analyses. It is probable, too, that, even if a generic quality-of-life measure is used as part of the information sought, the required strength of effect will accommodate any well-validated measure, but one must ask why a disease-specific measure is not used, given the history of development and validation of such measures.

So what are QALYs?

As Schwartz et al. say, ‘the essential feature of the QALY approach is the assignment of weights (quality adjustments) to different health states’ (p. 273). However, within this ‘essential feature’ it may be argued that the QALY approach is best viewed as a family of approaches, with a number of identifiable genera, subgenera and species. The unifying theme is the assertion of the need to summarise health outcomes as a single number which integrates morbidity and mortality by expressing health in terms of equivalent well-years.

The pivot, in terms of dividing the QALY family into genera, is what is required of the respondent or the interviewee, that is, the task itself. One genus of approaches, exemplified by the Quality of Well-being Index,8 focuses on the health status of identified individuals and involves the administration of over-the-counter questionnaires, either to individuals with a specific disease or condition or to a proxy (health care provider or partner) who is familiar with the nominated individual’s health status. The emphasis is on scaling stimulus subjects; all respondents in the sample respond to the same set of stimuli (questions) in respect of a given subject with a particular condition and any differences in scores obtained are assumed to reflect real differences among subjects. This approach sits firmly within a long tradition of cognitive and psychometric research and it relies heavily on the content of the instrument and veridicality of the scaled responses.

To accomplish the desired end, health is represented by a number of dimensions, for example, mobility, physical activity, social activity and symptoms/problems (Quality of Well-being Index8) or distress and disability (Rosser Index9). The subject’s level of health on each dimension is determined from responses to specific content items. Then, to permit aggregation, dimensions and/or items are weighted according to their utility, relative desirability, social preference and importance to quality of life. Here, scaling methods vary and may include category scaling, variations of magnitude estimation or other methods in psychophysics or measurement theory. There is debate about whose weights to use, but, for decisions about the allocation of public funds, it is usually assumed that some embodiment of ‘the general public’ is required.10 Of course, how a sample of this general public should be selected is a tricky question and, in application, it varies widely. The Quality of Well-being Index used 867 residents of San Diego, ‘ethnically representative of the population’;8 the Sickness Impact Profile used 133 judges composed of 108 enrolees in a prepaid health plan and 25 health care professionals and preprofessional students;11 the Rosser Index used weights derived from interviews with ‘70 people having different personal and professional experiences of illness and health’.9

Once established, the weights effectively become one of the parameters of the instrument. Therefore, their validity is heavily dependent on the nature of the population sampled, with all the attendant connotations of cultural and social representativeness. Finally, the weighted scores within dimensions and the dimensions themselves must be aggregated to produce a single index and some aggregation rule must be used. There are several possible rules, each of which will yield different results.
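That the choice of aggregation rule matters can be shown with a small sketch. The two rules below (a weighted arithmetic mean and a weighted geometric mean) are illustrative assumptions and are not the rules of any instrument named above; the dimension names are borrowed from the text, while the weights and health-state scores are invented.

```python
import math


def additive_index(scores, weights):
    """Weighted arithmetic mean of dimension scores, each on a 0-1 scale."""
    total = sum(weights.values())
    return sum(weights[d] * scores[d] for d in scores) / total


def multiplicative_index(scores, weights):
    """Weighted geometric mean: one very poor dimension pulls the index down."""
    total = sum(weights.values())
    return math.prod(scores[d] ** (weights[d] / total) for d in scores)


# Hypothetical weights and two hypothetical health states.
weights = {"mobility": 0.3, "physical_activity": 0.2,
           "social_activity": 0.2, "symptoms": 0.3}
state_x = {"mobility": 0.9, "physical_activity": 0.9,
           "social_activity": 0.9, "symptoms": 0.2}
state_y = {"mobility": 0.6, "physical_activity": 0.6,
           "social_activity": 0.6, "symptoms": 0.6}

for name, rule in (("additive", additive_index),
                   ("multiplicative", multiplicative_index)):
    print(name, round(rule(state_x, weights), 3), round(rule(state_y, weights), 3))
```

Under the additive rule state X scores higher than state Y; under the multiplicative rule the order is reversed, although the weights and dimension scores are identical in both cases.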

The second genus of QALY approaches is grounded directly in axiomatic utility theory and relies on holistic judgements to establish preferences among alternative health states and, ultimately, their ‘values’. The essence of this approach is the presentation to subjects of ‘scenarios’: brief descriptions of the levels of sensory, physical, emotional, cognitive and self-care functioning, symptoms associated with the health problem, side-effects of treatments, if relevant, and levels of pain. Accordingly, if the relevant corpus of respondents for resource allocation decisions is deemed to be the general public, there is no requirement that they be experiencing (actually or vicariously) the health states described. The emphasis is on scaling stimulus objects rather than subjects; that is, concern is with the assignment of numerical values to different stimuli as represented by the scenarios, based on the responses of groups of respondents.

There are several subgenera in the utility-based approach that can be identified according to the scaling technique used: standard gamble, time trade-off and category scaling. The standard gamble is illustrated by Schwartz et al.; the time trade-off requires the respondent to trade off a given number of years in an ‘unhealthy’ state for a smaller number of years of life in ‘good health’; category scaling requires that the respondent locate health states on a scale with end points such as ‘death, least desirable’ and ‘healthy, most desirable’, at intervals that correspond to the differences in their preferences among health states.
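How each of the three techniques turns a response into a weight between zero and one can be sketched as follows. The formulas are the conventional textbook formulations rather than anything specified in this paper: the standard gamble is assumed to offer full health with probability p against death with probability 1 - p, and the function names and example responses are hypothetical.

```python
def standard_gamble_utility(indifference_probability):
    """Utility equals the probability of full health at which the respondent
    is indifferent between the gamble and the certain health state."""
    if not 0.0 <= indifference_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return indifference_probability


def time_trade_off_utility(years_in_state, equivalent_healthy_years):
    """Utility equals the ratio of healthy years accepted to years in the state."""
    if not 0 <= equivalent_healthy_years <= years_in_state:
        raise ValueError("healthy years must lie between 0 and years in state")
    return equivalent_healthy_years / years_in_state


def category_scale_utility(rating, death_rating=0.0, healthy_rating=100.0):
    """Rescale a rating so that 'death' maps to 0 and 'healthy' maps to 1."""
    return (rating - death_rating) / (healthy_rating - death_rating)


# Hypothetical responses to the same health-state scenario.
print(standard_gamble_utility(0.80))      # indifferent at p = 0.80
print(time_trade_off_utility(10, 7))      # 10 years in state ~ 7 healthy years
print(category_scale_utility(60))         # placed at 60 on a 0-100 scale
```

The three hypothetical responses yield three different weights for one scenario, which is the kind of divergence across techniques discussed in this section.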

To date, the locus of validation exercises in utility measures has been the degree of convergence across scaling methods, rather than the validity or otherwise of health state descriptions. The standard gamble and time trade-off do, as Schwartz et al. say, yield similar utilities. However, given the well-documented and acknowledged criticisms of standard gambles as valid measures of preferences, this should be worrying.12,13 Even more worrying is the scant attention given to the manner in which health state scenarios should be developed. In many cases their derivation appears to be essentially ad hoc, they have seldom been subjected to validation tests in their own right and their frequent omission from the write-up of studies serves only to compound problems of interpretability.

Finally, there is a fundamental difference between the multidimensional psychometric approach to quality-of-life measurement and utility-based holistic judgments which has seldom been raised and even less often addressed. The former measures have been developed as generic instruments with large numbers of standardised items that, at least in intention, should apply to a very wide range of illnesses and disabilities. How well they do this is an empirical question but it is a matter which can be tested both within and across instruments. Health-state scenarios for holistic judgments are necessarily brief and constructed for specific conditions and interventions. It is difficult to see how they could be otherwise unless heroic assumptions are made about the cognitive demands of the respondent’s task. How these condition-specific indexes derived from holistic judgements of health state scenarios compare with those obtained from responses to validated multidimensional generic instruments (across many conditions and using the same subjects in each comparison) has not been systematically tested.

Within these two broad approaches, or genera, variations are many, some but by no means all of which are instanced by Schwartz et al. (pp. 273-6). These include:

• The von Neumann and Morgenstern axioms underlying utility theory have been shown to be systematically violated (p. 273), with consequent implications for the use of standard gambles and time trade-offs.
• There are many variations in methods for estimating utilities (p. 273).
• Magnitude estimation and person trade-off produce results which may vary considerably from those produced by the other utility techniques and from rating scales (p. 274).
• Various measurement techniques produce different results because they involve quite different cognitive tasks and vary along many dimensions (p. 274).
• Not everyone accepts that utilities can be aggregated across people (p. 274).
• Individuals’ preferences are highly sensitive to subtle differences in measurement (p. 274).
• There is no universally accepted discount rate and different rates can produce significantly different answers (p. 276).
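The last point in the list, that different discount rates can produce different answers, lends itself to a short illustration. The two benefit streams and the rates below are hypothetical, and the sketch assumes the common convention that each year's QALY gain accrues at the end of that year.

```python
def discounted_qalys(qaly_stream, rate):
    """Present value of yearly QALY gains; element i accrues at end of year i + 1."""
    return sum(q / (1 + rate) ** (year + 1)
               for year, q in enumerate(qaly_stream))


# Hypothetical programs of equal cost.
early_program = [0.5] * 5                # 0.5 QALY a year in years 1-5
late_program = [0.0] * 20 + [1.0] * 4    # 1.0 QALY a year in years 21-24

for rate in (0.0, 0.05, 0.10):
    early = discounted_qalys(early_program, rate)
    late = discounted_qalys(late_program, rate)
    preferred = "early" if early > late else "late"
    print(f"rate {rate:.0%}: early {early:.2f}, late {late:.2f} -> {preferred}")
```

Undiscounted, the late program yields more QALYs (4.0 against 2.5); at 5 per cent the ranking is reversed, so a priority list built on these invented figures would depend entirely on the rate chosen.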

To those one could add:

• Whose weights should be used for importance of items and how should they be scaled?
• What are the effects of ethnicity, age, social status and other demographic and cultural variables on perceptions of health and what are the ramifications for sampling populations and the generality of application for generic quality-of-life measures?

It is difficult to disagree with their concluding paragraph in the section ‘Significance of psychological research’:

To summarise, there is no consensus about how the extra years of life produced by a health care intervention should be adjusted for their quality

and

Researchers and clinicians must recognise that, whatever weighting technique is adopted, the values obtained will be affected by the way health states are described, scores generated and surveys administered (p. 274).

Validity of quality-of-life measures

The previous section was concerned with essentially technical questions which are posed in the paper as just that: questions which would have to be resolved if resulting QALY figures were to be something more than broad estimates with very wide confidence limits. Other matters of fundamental importance, and not independent of these technical problems, are the particular aspects of QALYs that we addressed in the symposium: questions about the validity of quality-of-life measures.

The two aspects of validity we considered are both referred to by Schwartz et al., one indirectly, the other as virtually a dismissal in one sentence.

In their discussion of measurement, they say

Even fewer (researchers) have attempted to prove that their questionnaires actually measure what they purport to measure; that is, that they are ‘valid’ in the psychometric sense (p. 273).

If validation of an instrument does, in fact, demonstrate that it measures what it says it measures and not something else or even nothing knowable at all, it should warrant more than this statement; it should be a worry that few researchers have bothered much about it and, even more so, it should be a concern if cost-per-QALY comparisons are made on the basis of unknown meanings of the ratios.

The second aspect of validity addressed was an extension of the technical validity question to the consequences of using the measures. Cronbach recognised this consequential basis of validity in his observations that ‘tests that impinge on the rights and life chances of individuals are inherently disputable’ and

validators have an obligation to review whether a practice has appropriate consequences for individuals and institutions, and especially to guard against adverse consequences (p. 6).14

Messick, in particular, has argued that test validity should be construed broadly15-18 because

the process of construct interpretation inevitably places test scores in both a theoretical context of implied relationships to other constructs and a value context of implied relationships to good and bad, to desirable and undesirable attributes and behaviors (p. 41).19

Therefore, he defines test validity as ‘an evaluative judgment of the adequacy and appropriateness of inferences and actions based on test scores’ (p. 42, emphasis added).19

This aspect of QALY-based policy decisions is referred to by Schwartz et al. in their section on ‘Objections and rationing of health care’. However, the questions they themselves raise are more or less waved away by setting up a single and simplistic alternative criterion of maximising lives.

The consequential basis of validity will be addressed later, but first something will be said about validity in the psychometric sense and what is required to test for validity.

Requirements for validation

A central simple fact about quality-of-life measures is that quality of life is a construct; it does not exist other than in how we choose to define and measure it. Moreover, it cannot be validated by reference to a measurable external criterion. Acceptable measures of quality of life are acceptable only if there is agreement about the content of the construct and the processes by which the measures are derived, where agreement has strong theoretical and technical implications. If we accept the view of modern validation theory that the three historical categories of validity (criterion, content and construct) are all strands of construct validity, testing quality-of-life measures is concerned with the technicalities of construct validation.14,15

In this context, it is difficult to argue with the following statements:

1. Hardly any of the quality-of-life measures currently being used and/or advocated have been well validated.
2. Where validation exercises have been undertaken, with few exceptions the tests have been superficial and methodologically unsound.20
3. The well-established methodologies and processes that do exist for establishing validity are available in the validation literature, including many recent books and articles.21-24 This literature is seldom referenced in validation studies of quality-of-life instruments and there is little evidence that it has been referred to.

The requirements for construct validation are deceptively simple in outline but they are demanding in application, especially in this area. Validation methodology is based on the unarguable proposition that we need to evaluate both the constructs themselves and their methods of measurement because constructs are always imprecise and measurement is not only imprecise but affected by extraneous factors. Because there are no objectively measurable criteria of validity for a generic measure of quality of life, as there frequently are for clinical trials, the validity of quality-of-life instruments depends primarily on comparison with other constructs. Consequently, tests must be grounded in theory that embodies agreed upon and justifiable characteristics of the phenomenon. Tests must also demonstrate not only that instruments adequately measure an acceptable concept of quality of life but that they do not measure some other related but separate construct such as social support or differential adjustment to illness.

Validation, then, is concerned with hypothesis-testing and it necessarily utilises a variety of statistical techniques and a variety of theoretical constructs and linkages within constructs.

A detailed discussion of validity testing is given by Brown and Burrows as well as the books referred to earlier, but the generalised requirements for validation can be summarised as follows:25

1. There must be both convergent and divergent evidence in the characteristics of the construct itself and in theoretical predictions. That is, there should be evidence not only that alternative operational definitions of the construct are highly correlated (convergence), but that the same methods of measuring theoretically different definitions and constructs are not (divergence). The latter, too seldom investigated, avoids the very real possibility that supposedly different constructs yield similar results.
2. The theoretical underpinnings of the construct should provide a basis for deriving testable hypotheses about the linkages between test scores on the target construct and measures of other constructs of the same phenomenon, that is, quality of life.
3. The boundaries of the construct must be defined by demonstrating lack of association of the construct (using methodologically similar measurement methods) with substantially different but potentially confounding constructs such as social support.
4. There must be evidence of discriminant validity within the construct where the domain is multidimensional. For example, where a quality-of-life measure treats functional status, social functioning and emotional status as separate and different dimensions, tests should demonstrate high levels of discrimination among them.
5. Because reliability is a necessary but very insufficient general condition for validity, the tests must assess and differentiate between them.
6. To accomplish this, the process must examine together same methods of measurement, same characteristics; same methods, different characteristics; and different methods, different characteristics.

It is a truism that validity can never be complete. It is an evolving concept that requires many studies by different researchers using a variety of theoretically relevant variables and methods of measurement to achieve consistent results. It is an iterative process with interplay between tests and theoretically derived inferences. That is, validity accrues over time and requires both convergent and divergent evidence.

The standard method for validation studies is the multitrait, multimethod (MTMM) approach, which is not a set of tests and techniques but a research framework that utilises, in best practice, several methods of measuring the construct and yields multiple measures of association: convergent, divergent and discriminative.
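The kind of evidence an MTMM study produces can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two traits (`function`, `emotion`) and two methods (`rating`, `interview`) are hypothetical labels, and the grouping of correlations is only one simple way of summarising a multitrait, multimethod matrix. It is not the procedure of any particular instrument.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mtmm_summary(measures):
    """Average the correlations in each cell type of an MTMM matrix.

    `measures` maps (trait, method) -> list of scores, one per respondent.
    """
    groups = {
        "convergent": [],                # same trait, different method
        "same_method_other_trait": [],   # different trait, same method
        "other_method_other_trait": [],  # different trait, different method
    }
    for (k1, x), (k2, y) in combinations(measures.items(), 2):
        same_trait, same_method = k1[0] == k2[0], k1[1] == k2[1]
        r = pearson(x, y)
        if same_trait and not same_method:
            groups["convergent"].append(r)
        elif same_method and not same_trait:
            groups["same_method_other_trait"].append(r)
        else:
            groups["other_method_other_trait"].append(r)
    return {name: mean(rs) for name, rs in groups.items() if rs}


# Synthetic data (assumption): two independent latent traits, each measured
# by two methods that add their own method-specific bias plus random noise.
rng = random.Random(0)
n = 400
latent = {t: [rng.gauss(0, 1) for _ in range(n)] for t in ("function", "emotion")}
method_bias = {m: [rng.gauss(0, 1) for _ in range(n)] for m in ("rating", "interview")}
measures = {
    (t, m): [latent[t][i] + 0.3 * method_bias[m][i] + 0.5 * rng.gauss(0, 1)
             for i in range(n)]
    for t in latent
    for m in method_bias
}
summary = mtmm_summary(measures)
```

In a well-behaved construct the `convergent` average should be clearly higher than the two different-trait averages; an instrument validated only on convergence would report the first figure and never look at the other two.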

Hardly any of the available quality-of-life instruments, and none of those derived from standard utility theory, have been subjected to anything approaching this degree of rigorous testing. Rather, the tendency has been to rely on simple correlations between the target measure and one or two others, often similar in construction;28,29 and not infrequently to use indefensible samples and/or unvalidated content.30 Seldom are testable hypotheses proposed and the emphasis is usually entirely upon convergence with complete disregard for divergent and discriminative evidence. This does very little to establish validity and lends an aura of respectability to measures of unknown or uncertain meaning. Furthermore, where an instrument is to be used for evaluation rather than discrimination, evaluation being the chief objective in resource allocation decisions, validity must also encompass measurement of change or responsiveness in quality of life as a result of an intervention. That is, measures must be longitudinally valid. This is a particularly troublesome area of validation because measurement of change is a contentious issue.31,32

Again, most existing generic instruments have ignored this critical aspect of validity and investigators have carried out what exercises they have undertaken largely with other measures designed for discriminative purposes. Where tests on even the best-validated instruments have sought evidence of responsiveness, they have been found to be sufficiently lacking to cause concern about their general application.33

Overall, we must agree with Spitzer:

How often do you hear somebody get up and say, 'we've attempted to validate this and we've found that it is not valid so we've abandoned it'? Most of the time people talk about the superficial exercise they have done with their data, which they almost invariably choose to interpret as positive evidence of validity, and we are asked to accept it. That's not good enough. We ought to tighten up our rigor in this area.20

Given, then, the general state of validity of existing instruments and, in particular, the lack of evidence that the precision in change measurement required for the use of QALYs for resource allocation can be obtained, it is a bit worrying that these are regarded by Schwartz et al. as just technical problems that are potentially solvable (p. 276).

The consequential basis of validity

Messick argues that there are four questions to be addressed explicitly whenever a measure such as quality of life is proposed for a particular purpose (p. 9):

1. What evidence justifies the proposed test interpretation as balanced against counter evidence or evidence supporting rival interpretations?

2. What evidence justifies the proposed test use in contrast to evidence supporting alternative proposals based on other measures or methods, including non-testing methods?

3. What are the value implications of the preferred test interpretation and to what degree are its value implications and theoretical implications compatible or antagonistic?

4. What are the potential social consequences of the proposed test use and to what degree are they facilitative or debilitative of the intended purpose?17

282 AUSTRALIAN JOURNAL OF PUBLIC HEALTH 1993 VOL. 17 NO. 3

The first two questions encapsulate the technical aspects of validation. The last two address the values that are always embodied in constructs such as quality of life and the potential consequences of their application.

Sometimes these values and consequences are not obvious, but in the case of QALYs used specifically for resource allocation, they are clear, and worrying. Not only are they clear but they have been spelled out unambiguously: in the Schwartz et al. paper itself, elsewhere by at least one of the authors, and in books and papers by those promoting the use of QALYs (largely economists) and their opponents.34-38

The grand value judgment is that health care is an investment rather than a consumption good. Interventions generate streams of future QALYs, to be discounted to present values, completely analogous to future cash flows in a textbook capital investment analysis. The decision maker is entreated to act as a QALY maximiser akin to a profit-maximising entrepreneur in conventional economic theory. The most explicit statements of these values are those of Maynard,35,36 which say plainly that resources go to patients ‘whose care creates the largest volume of health (QALY) benefits’ and that people ‘who are poor investments in terms of generating QALYs do not get care’ (p. 113).36 He also sets out decision rules in terms of efficient levels of investment by equating marginal benefits (QALYs) and marginal costs.
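The investment analogy can be made concrete with a short sketch of discounting a QALY stream to present value. The function name, the end-of-year timing convention and the 5 per cent rate are assumptions chosen for illustration; the 0.25 gain over forty years is borrowed from the example given later in the text.

```python
def discounted_qalys(utility_gain, years, rate):
    """Present value of a constant annual utility gain.

    Assumes the gain accrues at the end of each year and is discounted at a
    constant compound rate, exactly as a cash flow would be.
    """
    return sum(utility_gain / (1 + rate) ** t for t in range(1, years + 1))


# A gain of 0.25 on the utility scale for forty years (figures from the text).
undiscounted = discounted_qalys(0.25, 40, 0.0)   # rate of zero: plain sum
discounted = discounted_qalys(0.25, 40, 0.05)    # 5% is an assumed rate
```

Under these assumptions the discounted total is well under half the undiscounted one, so the choice of rate, itself a value judgment, can move a programme a long way up or down a ranking.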

These statements are quite compatible with those of Schwartz et al. (p. 275) and other cost-utility advocates.

The second major value implication is the acceptance of the utilitarianism-based ethic of ‘neutral’ QALYs; a QALY is a QALY irrespective of how it is gained or for whom.

These two assumptions yield the types of simple arithmetic exercises illustrated by Schwartz et al. (p. 275) and in numerous books and papers demonstrating QALY applications or presenting results of investigations.39-41

The consequences of neutral QALYs are, at least to some, bizarre. Interventions of the same total cost that:

preserve a ‘full health’ life for six months for 10 000 babies; or
improve the quality of life of 20 000 elderly people for the remaining five years of statistical life by 0.05 units on the utility scale; or
restore partial health with consequent gain in quality of life of 0.25 for forty years for 500 people

are of equal value. Moreover, it doesn’t matter if a gain of 0.15 is from near-zero or from already reasonably good health; nor whether it applies to basic functioning or cosmetic surgery, as long as they are in the array of alternatives.
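The equivalence of the three interventions follows from simple multiplication, sketched below. Two assumptions are made that the text leaves implicit: preserving a ‘full health’ life is scored as a gain of 1.0 on the utility scale, and no discounting is applied.

```python
def total_qalys(people, utility_gain, years):
    """Undiscounted QALYs: people x utility gain x duration in years."""
    return people * utility_gain * years


babies = total_qalys(10_000, 1.0, 0.5)    # six months of full health each
elderly = total_qalys(20_000, 0.05, 5)    # 0.05 gain for five years each
partial = total_qalys(500, 0.25, 40)      # 0.25 gain for forty years each

for label, value in (("babies", babies), ("elderly", elderly), ("partial", partial)):
    print(f"{label}: {round(value)} QALYs")
```

Each intervention yields 5000 QALYs, so at equal cost a ‘neutral’ QALY ranking cannot distinguish between them.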

This value-free characteristic of QALYs is viewed as a desirable quality but it does, of course, lead to biases in favour of magic bullets and simply years of remaining life.34 The very old will seldom do well because they don’t have enough additional years of life left to generate many QALYs; nor will those with chronic diseases where interventions can do no more than make very painful lives a bit better. Given, too, the relationship between health and social characteristics of populations, as Evans points out, ‘it should also mean treating women rather than men, white patients rather than black, and upper social classes rather than the lower’ (p. 189).37

To the usual response: ‘but things other than QALY rankings can be taken into account’, we would respond: of course they can.34 The questions are whether cost per QALY rankings are or can be of sufficient precision to serve even as a basis for ‘other things’; the extent to which ‘other things’ are likely to be taken into account, given the implied precision and power of hard numbers; whether there are other characteristics of QALYs that would mitigate these apparent consequences; and why, with ‘other things’, the desirable simplicity of reducing ‘unstructured’ decisions to a ranking of ratios will not be subverted.

An additional, and related, claimed advantage of QALY-based decisions is the public or explicit nature of rationing.42-44 Explicitness is to be valued as a desirable quality in itself and, presumably, it enhances understanding of the processes and consequences vis-à-vis other ways of arriving at policy decisions. The Oregon experiment is referred to favourably as an example of such openness. Britain’s National Health Service, on the other hand, is characterised in deprecatory terms as ‘inconsistent, incoherent and implicit’ (p. 188)45 and

a silent conspiracy between a dense obscurating bureaucracy, intentionally avoiding written policy for macroallocation, and a publicly unaccountable medical profession privately managing misallocation so as to conceal life and death decisions from patients (pp. 662-3).46

Two implications of these statements can be questioned. In the first place, the assumed value of explicitness as a desirable quality is arguable. The extent to which rules governing such policy decisions should be ‘open’ is a controversial question.47-50 In their book, which could well be read by those with simple views of democracy and public policy, Calabresi and Bobbitt argue that ‘tragic choices’ are best made out of full public view in order to sustain important symbolic values such as the sanctity of life;47 and that the usefulness and social acceptability of policies where there are no ‘solutions’ depend on the maintenance of a public ethos that they serve their stated purposes (p. 24). They also argue that society copes with such choices by means of a strategy of cycles. Flawed systems are criticised and eventually replaced by new reformed systems that offer new ‘solutions’. These, once implemented, inevitably display their own shortcomings (because there is no solution) and ‘degrade those values they had sought to protect’ (p. 196). They are themselves subjected to criticism and replacement, sometimes by older and formerly rejected solutions. As witness to this process, it is difficult to go beyond the rise (or resurrection) of neoclassical economics (or ‘economic rationalism’), its adoption as the basis of economic policy by political parties and other institutions to which it would have been anathema two decades ago, and the increasing questioning of its foundations and social implications in the last few years. To be in favour of openness is currently politically correct and we do think that moves toward greater community consultation and participation in health policy are long overdue. However, the embodiment of ‘explicitness’ in the arguments for resource allocation criteria is at least a debatable manifestation of community empowerment in an area of social policy with complex and difficult ethical implications and high emotive content. It is also highly questionable whether the process of QALY calculations can be explicit to any but a few with considerable technical expertise, a point taken up in the next paragraph. It may well be that the current advocacy of QALYs, with their clear rankings and claimed explicitness, simply reflects Calabresi and Bobbitt’s strategy of cycles and they, too, will have their day.47

A second problem with explicitness of QALYs is the extent to which they are explicit, a question which has been addressed by several writers. What is explicit is that a construct called quality of life, when related to a number called cost, is reduced to a single ratio called cost per QALY that is capable of at least interval ranking. How much more is explicit to decision makers and the community at large is highly debatable and we should not confuse quantification with explicitness. For example, is it known that equal increases in quality of life from 0 to 0.09 and from 0.9 to 0.99 are of equal ‘value’? Or that time trade-off assumes a simple linear relationship with time? Or that the discounting of QALYs is always assumed to follow the same compound interest relationship as that for a financial investment?

Even if standardised publicly available instruments are used, few other than the investigators will know much about the content of their health state items (or scenarios), the method of elicitation, sample of respondents, scaling and weighting procedures, methods of aggregation of values and weights and how weights were derived. Even fewer will understand, or even be aware of, their underlying assumptions or their technical defensibility. It is doubtful, too, if most investigators in the field fully understand the technicalities of matters such as scales used and the sources of possible error in health state descriptions and methods of elicitation of responses.

Mosteller, on the basis of experience on committees dealing with such measures, concluded that, whilst ‘talented lay people’ were willing to make complex choices, they felt single numbers concealed something from them; they were not willing to accept somebody else’s summary numbers (pp. S285-6).52 Ware similarly warned that when we aggregate to achieve simplicity, we should remember what was lost; that many profiles of health can lead to the same number and the same profile can lead to many different aggregate scores (p. S287).53 Carr-Hill, too, reminds us that ‘index numbers are not an observation upon the world; they are generated by a specific set of technical procedures’ (p. 361)54 and they are not neutral; they serve different interests.53
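Ware's warning can be shown in a few lines. The sketch below is illustrative only: the three dimensions are those mentioned earlier in the text (functional status, social functioning, emotional status), but the 0 to 100 scores, the weighted-average form of the index and both sets of weights are hypothetical.

```python
def composite_index(profile, weights):
    """Weighted average of dimension scores (each scored 0 to 100)."""
    return sum(w * s for w, s in zip(weights, profile)) / sum(weights)


# Scores for (functional status, social functioning, emotional status).
profile_a = (90, 50, 10)   # good function, very poor emotional status
profile_b = (50, 50, 50)   # moderate on every dimension

equal_weights = (1, 1, 1)          # hypothetical weighting scheme 1
function_weighted = (3, 1, 1)      # hypothetical weighting scheme 2

same_number = (composite_index(profile_a, equal_weights),
               composite_index(profile_b, equal_weights))
same_profile = (composite_index(profile_a, equal_weights),
                composite_index(profile_a, function_weighted))
```

With equal weights the two very different profiles both collapse to 50, while the first profile alone scores 50 or 66 depending on which weighting scheme is applied. Neither fact is visible to someone shown only the final index.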

Another concern is not that decision makers cannot know how final indexes and ratios are obtained, but that they might not worry too much about it. Some writers are concerned about the tyranny of numbers;37,55 that once complex administrative decisions are reduced to simple quantified comparisons of cost and benefit, it not only seems irrational not to act in accordance with the numbers (p. 544),55 but that conversion of difficult political decisions to routine technical procedures is inherently attractive (p. 189).37 No-one who has served on policy committees should doubt this and there is almost certainly a strong element of it in the current urge to develop ‘policy-relevant’ health outcome measures. This is a highly desirable endeavour and is long overdue but it is not the legitimate function of research to ignore or obfuscate the social and ethical implications of decision-friendly measures.

Conclusion

Our first concern is that Schwartz et al. do regard the many technical problems of QALYs as ‘all potentially solvable’ (p. 276), despite the fact that they neither offer solutions nor point to where and how solutions might be found. To the extent that quality-of-life measures are to be used in the manner advocated in their paper, that is, collapsed into a single index to permit calculation of cost per QALY ratios and ranked to serve as a (or the) criterion for policy priorities, we don’t think they are solvable. This is not the place to go into detailed technical justification of this position but the sources of error are cumulative and they derive from lack of precision in the many components of the process of developing such a measure. To appreciate the complexity of the process (and the problem), one must refer to the abundant literatures on survey methodology, medical anthropology and sociology, behavioural decision theory and cognitive psychology, psychometrics, psychophysics and measurement theory, and test validity as well as utility theory and economics.

In some of these areas, the research has been comprehensive and rigorous and there are few answers to problems raised by Schwartz et al., and to others they didn’t raise. In other areas, for example the dimensions that constitute health-related quality of life and the cultural implications of health and illness, much less has been done and the current state of knowledge can be described, without apology, as very incomplete.

The second concern is that validation of instruments is poor and that this does not seem to be a problem in the forefront of the minds of many investigators, nor of Schwartz et al. If lack of validity means, and it does, that we cannot be at all sure what it is that is being measured and how well, it should be of central concern.

The third major concern is the social and ethical implications of decisions by QALYs. Not only are the ratios, and therefore their rankings, likely to be highly sensitive to small percentage errors in both the numerator and the denominator, the assumption of value-neutral QALYs must have potential consequences that are very disturbing. If, on the other hand, cost per QALY is simply one of the criteria for choice, a case must be made that these single numbers, with all their assumptions and errors in measurement, are the best way of incorporating quality-of-life considerations in a more complex and messy decision process.
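The sensitivity of rankings to errors in both parts of the ratio can be sketched as follows. The two programmes, their costs and QALY totals, and the 10 per cent error band are all hypothetical; the point is only that errors in the numerator and denominator compound.

```python
def cost_per_qaly_range(cost, qalys, rel_error):
    """Lowest and highest cost per QALY when cost and QALYs may each be
    wrong by up to +/- rel_error (a proportion, e.g. 0.10 for 10%)."""
    low = cost * (1 - rel_error) / (qalys * (1 + rel_error))
    high = cost * (1 + rel_error) / (qalys * (1 - rel_error))
    return low, high


# Hypothetical programmes: point estimates of 10 000 and 11 500 per QALY.
a_low, a_high = cost_per_qaly_range(1_000_000, 100, 0.10)
b_low, b_high = cost_per_qaly_range(1_150_000, 100, 0.10)

# True when the two plausible ranges overlap, so the ranking could reverse.
ranking_uncertain = b_low < a_high
```

A 10 per cent error in each component widens the ratio by roughly 20 per cent in either direction, which here is enough to make a 15 per cent difference in point estimates an unreliable basis for ranking.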

The Schwartz et al. paper concludes with a question: what is the alternative to the use of QALYs for decision making? Their answer is an often ill-informed political process that currently operates. Their conclusion is that cost-utility analysis, by imposing a structured decision process, reduces the likelihood of gross error.

Our responses would be:

1. Health policy decisions should be a political (though, one hopes, informed) process that involves not only economics but ethical and social values. The simple utilitarianism that underpins ‘neutral’ QALYs is an unacceptable value basis.

2. A structured decision process does not presuppose an analysis that encapsulates all the dimensions and aspects of quality of life into a single number. Quality of life, even a quality-of-life measure, as an outcome of health interventions can be incorporated into policy decisions as part of a more complex and less deterministic process. The apparent assumption that choices are either QALYs or chaos makes one wonder how we’ve managed to survive as a society without clean criteria and simple choice processes for all the social policy decisions that are embodied in the way we live.

3. Unless it can be shown that cost-utility analysis is capable of reducing the likelihood of gross errors in final decisions compared with other processes for incorporating quality of life, including quality-of-life measures that do not require a single index as the outcome, in the face of the numerous opportunities for (arguably inevitability of) error in the several stages of development and application of instruments, it is probably at least as likely to encourage simple solutions with consequent gross policy errors.

References

1. Klarman HE, Francis JO'S, Rosenthal GD. Cost effectiveness analysis applied to the treatment of chronic renal disease. Med Care 1968; 6. Reprinted in: Cooper MH, Culyer AJ, editors. Health economics. Harmondsworth: Penguin, 1973: 230-40.
2. Gerard K. A review of cost-utility studies: assessing their policy-making relevance. Discussion paper 11/91. Health Economics Research Unit. Aberdeen: University of Aberdeen, 1991.
3. Drummond MF. Resource allocation decisions in health care: a role for quality of life assessments. J Chron Dis 1987; 40: 605-16.
4. Weinstein MC. A cost-effectiveness approach to decision making. In: Deber RB, Thompson GG, editors. Choices in health care. Decision making and evaluation of effectiveness. Toronto: Dept of Health Administration, University of Toronto, 1982: 95-107.
5. Tugwell P, Bennett KJ, Sackett DL, Haynes RB. The measurement iterative loop: a framework for the critical appraisal of need, benefits and costs of health interventions. J Chron Dis 1985; 38: 339-51.
6. Guyatt GH, Bombardier C, Tugwell PX. Measuring disease-specific quality of life in clinical trials. Canad Med Ass J 1986; 134: 889-95.
7. Guyatt GH, Mitchell A, Irvine EJ, Singer J, et al. A new measure of health status for clinical trials in inflammatory bowel disease. Gastroenterology 1989; 96: 804-10.
8. Kaplan RM, Anderson JP. The quality of well-being scale: rationale for a single quality of life index. In: Walker SR, Rosser RM, editors. Quality of life: assessment and application. Lancaster: MTP Press, 1988: 51-77.
9. Rosser R. From health indicators to quality adjusted life years: technical and ethical issues. In: Hopkins A, Costain D, editors. Measuring the outcomes of medical care. London: Royal College of Physicians of London, 1990: 1-17.
10. Hadorn DC. The role of public values in setting health care priorities. Soc Sci Med 1991; 32: 773-81.
11. Bergner M. Developing, testing, and use of the Sickness Impact Profile. In: Walker SR, Rosser RM, editors. Quality of life: assessment and application. Lancaster: MTP Press, 1988: 79-94.
12. Slovic P, Lichtenstein S. Preference reversals: a broader perspective. Am Econ Rev 1983; 73: 596-605.
13. Schoemaker PJH. The expected utility model: its variants, purposes, evidence and limitations. J Econ Lit 1982; 20: 529-63.
14. Cronbach LJ. Five perspectives on validity argument. In: Wainer H, Braun HI, editors. Test validity. Hillsdale, NJ: Erlbaum, 1988: 19-32.
15. Messick S. Meaning and values in measurement and evaluation. Amer Psychol 1975; 30: 955-66.
16. Messick S. Test validity and the ethics of assessment. Amer Psychol 1980; 35: 1012-27.
17. Messick S. Evidence and ethics in the evaluation of tests. Ed Researcher 1981; 10: 9-20.
18. Messick S. Validity. In: Linn RL, editor. Educational measurement (3rd edn). New York: Macmillan/American Council on Education, 1989: 13-103.
19. Messick S. The once and future issues of validity: assessing the meaning and consequences of measurement. In: Wainer H, Braun HI, editors. Test validity. Hillsdale, NJ: Erlbaum, 1988: 33-45.
20. Spitzer WO. Discussion: Advances in health assessment conference. J Chron Dis 1987; 40 (Supp): 187-9.
21. Anastasi A. Evolving concepts of validation. Ann Rev Psychol 1986; 37: 1-15.
22. Cronbach LJ. Construct validation after thirty years. In: Linn RL, editor. Intelligence: measurement theory and public policy. Urbana, Ill: University of Illinois Press, 1990: 147-71.
23. Jaeschke R, Guyatt G. How to develop and validate a new quality of life measure. In: Spilker BF, editor. Quality of life assessments in clinical trials. New York: Raven Press, 1990: 47-57.
24. Wainer H, Braun H, editors. Test validity. Hillsdale, NJ: Lawrence Erlbaum, 1988.
25. Brown K, Burrows C. What is validity? A prologue to an evaluation of selected health status instruments. Fairfield, Vic: National Centre for Health Program Evaluation, 1992. Research Report No. 1.
26. Shortell SM, Richardson WC. Health program evaluation. St Louis: CV Mosby, 1978.
27. Fiske DW. Convergent-discriminant validation in measurements and research strategies. In: Brinberg D, Kidder LH, editors. Forms of validity in research. San Francisco: Jossey-Bass, 1982: 77-92.
28. Torrance GW. Social preferences for health states: an empirical evaluation of three measurement techniques. Socioecon Plan Sci 1976; 10: 129-36.
29. Buxton M, Ashby J. The time trade-off approach to health state valuation. In: Teeling Smith G, editor. Measuring health: a practical approach. Chichester: Wiley, 1988: 69-87.
30. Richardson J, Hall J, Salkeld G. Cost-utility analysis: the compatibility of measurement techniques and the measurement of utility through time. In: Selby-Smith C, editor. Economics and health: 1989. Proceedings of the eleventh Australian Conference of Health Economists. Clayton, Vic: Public Sector Management Institute, Monash University, 1989: 31-60.
31. Brown K, Burrows C. How should we measure 'change' in utility measures of health status - or should we? In: Selby-Smith C, editor. Economics and health: 1992. Proceedings of the fourteenth Australian Conference of Health Economists. Clayton, Vic: Public Sector Management Institute, Monash University: 195-235.
32. Guyatt GH, Walter S, Norman C. Measuring change over time: assessing the usefulness of evaluative instruments. J Chron Dis 1987; 40: 171-8.
33. MacKenzie CR, Charlson ME, DiGioia D, Kelley K. Can the Sickness Impact Profile measure change? An example of scale assessment. J Chron Dis 1986; 39: 429-38.
34. Richardson J, Cook J. The QALY victim of misinformation. Health Issues 1992; 32: 63-7.
35. Maynard A. Logic in medicine: an economic perspective. Br Med J 1987; 295: 1537-41.
36. Maynard A. The inevitability of outcome measurement: how should QALYs be used? J Manage Med 1987; 2: 107-14.
37. Evans JG. Symposium proceedings: the ethics of resource allocation. J Epidemiol Community Health 1990; 44: 187-90.
38. Cubbon JE. The principle of QALY maximisation as the basis for allocating health care resources. J Med Ethics 1991; 17: 181-4.
39. Boyle MH, Torrance GW, Sinclair JC, Horwood SP. Economic evaluation of neonatal intensive care of very-low-birth-weight infants. N Engl J Med 1983; 308: 1330-7.
40. Williams AH. Economics of coronary artery bypass grafting. BMJ 1985; 291: 326-9.
41. Weinstein MC, Schiff I. Cost-effectiveness of hormone replacement therapy in the menopause. Obstet Gynecol Survey 1983; 38: 445-55.
42. Daniels N. Is the Oregon rationing plan fair? JAMA 1991; 265: 2232-5.
43. Dixon J, Welsch HG. Priority setting: lessons from Oregon. Lancet 1991; 337: 891-4.
44. Hadorn DC. The Oregon priority-setting exercise: quality of life and public policy. Hastings Center Report 1991; (Supp): 11-16.
45. Maynard A. Symposium proceedings: the ethics of resource allocation. J Epidemiol Community Health 1990; 44: 187-90.
46. Cranshaw R. Health care rationing [letter]. Science 1990; 247: 662-3.
47. Calabresi G, Bobbitt P. Tragic choices. New York: Norton, 1978.
48. Friedman DD. Comments on ‘rationing and publicity’. In: Agich GJ, Begley CE, editors. The price of health. Dordrecht: Reidel, 1986: 217-24.
49. Rhoads S, editor. Valuing life: public policy dilemmas. Boulder, Co: Westview Press, 1980.
50. Winslow GR. Rationing and publicity. In: Agich GJ, Begley CE, editors. The price of health. Dordrecht: Reidel, 1986: 199-216.
51. Carr-Hill RA. Allocating resources to health care: is the QALY (quality adjusted life year) a technical solution to a political problem? Int J Health Serv 1991; 21: 351-63.
52. Mosteller F. Final panel: comments on the conference on advances in health status assessment. Med Care 1989; 27 (Supp): S202-86.
53. Ware JE. Final panel: comments on the conference on advances in health status assessment. Med Care 1989; 27 (Supp): S286-90.
54. Carr-Hill RA. Social indicators for basic needs: who benefits from which numbers? In: Cole S, Lucas H, editors. Models, planning and basic needs. Oxford: Pergamon Press, 1982.
55. Mulkay M, Ashmore M, Pinch T. Measuring the quality of life: a sociological invention concerning the application of economics to health care. Sociology 1987; 21: 541-64.

LETTERS TO THE EDITOR

Invisibility of carers

In their article on the costs and experiences of caring for sick and disabled patients, Smith et al. claim that ‘. . . there is no doubt that the physical, personal and emotional cost to the carer is priceless’.1 Despite the sentiment underscoring this remark I cannot let it go unchallenged. It is misleading because it results in a zero value being placed on informal care.

The reluctance to place a value on the time spent by carers (or more accurately what they can do with that time) leads to what Waring describes as the invisibility of women’s work in caring.2 If the value of the carer’s work remains invisible then the carers themselves will remain invisible. Though carers are not exclusively women, women do bear a disproportionate share of the burden of caring for older people.

The trend towards shifting the cost of care from the public sector (nursing homes and hospitals) to private individuals (providing unpaid informal care) seems to be based on the assumption that home care is, and always will be, a cheaper alternative. This rests on the notion that unpaid labour inputs are ‘free’. To an economist nothing is ‘free’ if something has been forfeited to attain it. The time spent by carers could possibly have been spent on paid work or leisure time. A price can be placed on this paid work or leisure time forgone by using an appropriate market wage rate as a proxy measure. Green, in costing informal care for older people in New Zealand, used wage rates for substitute labour appropriate for the task.3 Of course the opportunity cost of productive paid work or leisure time forgone will depend on the age, sex and occupational characteristics of the carers, the pool of unemployed with ‘carer’ skills as well as the dependence of the older person being cared for.

Costing informal care is anything but straightforward. However, evaluative studies that compare the substitutability of paid formal care for informal unpaid care must attempt to put a price on informal care. To omit a price compromises any conclusions made about the substitutability or cost of services.

I sympathise with the authors; the methods for costing informal care are far from clear and are unlikely to incorporate the emotional aspects of caring. However, their assertion that the cost to the carer is priceless produces the perverse result that they are not valued at all. This can only perpetuate the invisibility of carers amongst some health policy makers.

Glenn Salkeld
Department of Public Health
University of Sydney

References

1. Smith B, O’Malley S, Lawson J. Costs and experiences of caring for sick and disabled geriatric patients - Australian observations. Aust J Public Health 1993; 17: 131-4.
2. Waring M. If women counted. A new feminist economics. San Francisco: Harper and Row, 1988.
3. Green FT, Raper AC. The resource costs of community care of the dependent elderly. In: Selby-Smith C, editor. Economics and health: 1990. Proceedings of the Eleventh Australian Conference of Health Economists. Clayton, Victoria: Public Sector Management Institute, Monash University, 1991.

Chemical hazards in the Melbourne metropolitan area

Carlo, Sund and coworkers, in their recent papers, seek to dismiss the toxicological hazards of dioxins and the role of Nufarm in the emissions of dioxins to Werribee sewage farm.1,2 Close examination of these papers, however, reveals a number of omissions and inconsistencies and we would like to draw attention to the following points.

The article by Sund et al. addresses the dioxin congener profiles in selected effluents and soils from urban and industrial sites in the area. They conclude that Nufarm’s effluent congener profile does not resemble the contamination profiles at Werribee or elsewhere and that Nufarm is therefore not a primary contributor to the contamination. Some concerns attach to the inconsistencies in the two tables of results (Tables 3 and 4), which report duplicate analy-
