The Condorcet's Jury Theorem in a Bioethical Context: The Dynamics of Group Decision Making

Group Decision and Negotiation 9: 379–392, 2000

© 2000 Kluwer Academic Publishers. Printed in the Netherlands


TOM KOCH, University of British Columbia, Department of Geography, 1984 West Mall, Vancouver, BC, Canada V6K 2S1

MARK RIDGLEY, University of Hawaii (Manoa), Dept. of Geography, Porteus Hall # 445, Honolulu, HI, USA 96822

Abstract

The Condorcet Jury Theorem was the first and remains a central model of collective decision making in both social and political theory. Advanced as an argument for small group or “jury” decision processes over those of individual experts, its axioms and conclusions have been a subject of rigorous debate in recent years. Those considerations have typically been mathematical and theoretical, however, rather than concrete and descriptive. This paper considers the applicability of the Jury Theorem in light of data collected in a series of focus groups organized at The Hospital for Sick Children (Toronto, Canada) to review organ transplant eligibility criteria. With each of four focus groups we used the Analytic Hierarchy Process to elicit views of hospital members and users on the relative importance of criteria commonly used to define organ transplant eligibility. Analysis of the priority measures obtained provided clear insights into issues of consensus, the role of experts, and the process of collective decision making by heterogeneous juries. The conclusions may be of use to those interested in democratic process and social theory in all contexts – legal, moral, and political – involving small group decision making.

Key words: analytic hierarchy process, bioethics, Condorcet jury theorem, multicriterion decision making, organ transplantation

First stated in the late eighteenth century, the Condorcet Jury Theorem (CJT) remains a central plank in the platforms of modern democratic political theory (Berg, 1996) and social choice theory (Shapley and Grofman, 1984). Simply stated, it was the first and remains a central model of collective decision making. Democratic in its essence, it predicts that “the collective performance of a group in arriving at a correct judgment on the basis of majority rule will be superior to the average individual performance, provided that certain conditions hold” (Berg and Nurmi, 1996, 208). Then and now, the question has been: what conditions are required for groups to make better decisions than the individual members of that group might make alone? What are the mechanisms by which collective decision making may in fact lead to better choices than the individual choices of officials and official experts? The answer is of general importance across the broad arena of democratic processes – from the classical context of the courtroom jury to a general understanding of voter patterns.

This paper considers elements of the CJT from the perspective of small-group decision makers. Its focus is the state of knowledge required for effective decision making by small groups. The assumption in much of the literature has been that the probability of correct decision making is dependent on the professional expertise – the knowledge base – of specific voters. On the surface, this seems to contradict the general thrust of the CJT, that “majorities of individuals are likely to be more often correct than individuals” (Grofman and Feld, 1988, 569). This paper explores, within the framework of the CJT (one concerned with aggregated decision making), the process by which small groups of decision makers, possessing different levels of expertise in a specific subject area, arrive at a decision.

While a large literature has developed to address the CJT and its application, the majority of that work has been mathematical and theoretical rather than descriptive and concrete. It thus suggests “what should be” rather than “what is,” how groups might behave rather than what occurs when small-scale voter groups are charged with making critical decisions. This paper’s discussion is based upon data returned in a series of focus groups at The Hospital for Sick Children in Toronto, Canada. Participants were asked to consider the problem of organ-transplant eligibility: Who among patients who are medically equally needy should be accepted as a potential organ recipient? Previous reports on this research focused on, first, the problem of organ-transplant eligibility in a context of absolute scarcity (Koch, 1996), and secondly, on the general conclusions of these focus groups (Koch and Rowell, 1997; Ott, 1997). A third paper reviewed the relevance of group findings to the general issues of consensus in bioethics (Koch and Rowell, 1998).

1. Condorcet’s Jury Theorem

Let $(X_1, \ldots, X_n)$ be $n$ independent identically distributed binary random variables such that $\Pr(X_i = 1) = p > 1/2$ and $P_n = \Pr(\sum_i X_i > n/2)$. Then (a) $P_n > p$ and (b) $P_n$ is monotonically increasing in $n$ and $P_n \to 1$ as $n \to \infty$. If $p < 1/2$ then $P_n < p$ and $P_n \to 0$ as $n \to \infty$. Finally, when $p = 1/2$, then $P_n = 1/2$ for all $n$. (Berg, 1996, 230)

Summarized, the CJT states that when the probability of each participant individually reaching a correct decision is greater than 0.50, then the probability of the group reaching a correct decision increases as the number of group participants increases. On the other hand, if the probability of each person voting incorrectly is greater than 0.50, the likelihood of an incorrect decision increases as the number of participants increases. The CJT thus suggests it would be better to entrust decision making to a larger group all of whose members have a lower probability of making a correct decision than to another, much smaller group each of whose members has a higher probability of being correct. This seemingly counter-intuitive assertion assumes, however, that the former group’s members’ individual probabilities of correctness are all greater than 0.50.
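The behavior summarized above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the majority probability of the theorem as a binomial tail sum for odd group sizes. The function name and the competence values used are illustrative assumptions.

```python
from math import comb


def majority_probability(n: int, p: float) -> float:
    """Probability that more than n/2 of n independent voters are correct,
    when each voter is correct with probability p (binomial tail sum)."""
    threshold = n // 2 + 1  # smallest integer strictly greater than n/2
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical individual competence levels, chosen only for illustration.
    for p in (0.6, 0.5, 0.4):
        row = ", ".join(
            f"n={n}: {majority_probability(n, p):.3f}" for n in (1, 3, 11, 51)
        )
        print(f"p={p}: {row}")
```

With three voters who are each correct with probability 0.6, for instance, the majority is correct with probability 0.648. Under the same assumptions, a group of 51 such voters outperforms a group of three voters who are each correct with probability 0.8, which is the seemingly counter-intuitive comparison described above.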

As Boland and others have pointed out, the CJT reflects a general, political thesis. Condorcet argued the then-radical notion that elective assemblies and citizen groups would better serve the nation than the estates and corporate bodies that still ruled 18th-century France in the aftermath of feudalism (Boland, 1989). Grofman and Feld have suggested that the CJT is a formalization of Rousseau’s “general will.” From this perspective, that general will becomes a common field defined by the aggregate decisions of citizens at least somewhat more likely than not to make a “better” choice between alternatives (Grofman and Feld, 1988, 569).

A number of researchers have criticized individual assumptions of the CJT, sometimes advancing alternate statements in an attempt to maximize its potential applicability in different contexts (Miller, 1996). Two such assumptions have received special consideration. They are (a) that voters in a decision making system share a common goal (Berg, 1996, 230), and (b) that individual voters be statistically independent. What has often been unclear is how one might define a person’s or group’s probability of reaching a correct decision.

Implicitly or explicitly, some theorists have endorsed the assumption that the probability of achieving a correct decision is a function of “the estimated quantity of information” held by participants (Schmidt, 1985, 30). This has led some to believe that the best juries will therefore be composed of experts whose knowledge of the problem being deliberated will be greater. To the degree that those with specialized expertise are at least somewhat more likely than not to make a “better choice” – physicians rather than laypersons, for example, in cases of medical decision making – does this deny the supposedly democratic sense of the CJT?

Another issue is the CJT’s assumption that all jurors vote independently. Voting may be independent, but both in juries and in focus groups participants act interdependently, pooling perspectives and knowledge in their search for answers. Thus one can consider an interdependent model of the Condorcet jury system in which jurors or group members are assumed to be not independent but interdependent. As Miller and others have noted (Miller, 1994), interdependence can increase average individual competence through the sharing of data, improving each participant’s knowledge base. It also presents the possibility that erroneous data introduced by one or another jury member may be accepted by other voters, resulting in a diminished probability of any member’s reaching a correct conclusion.

The issue of interdependence and its effect on the probability of making the correct choice remains central in this literature. As Berg points out, the presence of an influential voting leader may affect other, dependent voters whose views will be modified as a result of a group member’s arguments (Berg, 1994). Where that leader’s views are inaccurate, incorrect, or erroneously value-laden, his or her influence will tend to lead to an incorrect determination by a jury or focus group. An experiential consideration of the dynamics of a group from this perspective is required.
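The effect of an influential leader can be illustrated with a deliberately simple toy model. This is a sketch under stated assumptions and is not the model analyzed by Berg or by the other authors cited: one leader votes first, and each remaining voter copies the leader with a fixed probability (the hypothetical parameter `copy_prob`), otherwise voting independently with competence `p`.

```python
from math import comb


def tail(n: int, q: float, k_min: int) -> float:
    """P(Binomial(n, q) >= k_min)."""
    k_min = max(k_min, 0)
    return sum(
        comb(n, k) * q**k * (1 - q) ** (n - k) for k in range(k_min, n + 1)
    )


def majority_with_leader(n: int, p: float, p_leader: float, copy_prob: float) -> float:
    """Probability of a correct strict majority among n voters when one leader
    votes first and each of the other n - 1 voters copies the leader with
    probability copy_prob, voting independently (correct with probability p)
    otherwise. Illustrative toy model only."""
    need = n // 2 + 1  # correct votes needed for a strict majority
    followers = n - 1
    q_if_right = copy_prob + (1 - copy_prob) * p  # follower correct, leader correct
    q_if_wrong = (1 - copy_prob) * p              # follower correct, leader wrong
    return (
        p_leader * tail(followers, q_if_right, need - 1)
        + (1 - p_leader) * tail(followers, q_if_wrong, need)
    )
```

With `copy_prob = 0` the model reduces to the independent case of the theorem, and with `copy_prob = 1` the group is exactly as reliable as its leader. Intermediate values show how a poorly informed but persuasive leader can pull the group's reliability below what independent voting would achieve.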

1.1 Defining “correct” decisions

A problem in examining CJT axioms has been the inability to define correctness in many areas of real-life decision making. While probability and correctness can be defined theoretically, there are almost insuperable difficulties in applying a standard of correctness in areas of social and bioethical decision making. Decisions in these venues necessarily occur in a context of uncertainty and incomplete knowledge. More generally, “correctness” remains difficult to define except in a limited context. A legal jury may find a person guilty only to be faced at a later date with evidence proving innocence where guilt was assigned. Voters may elect an official on the basis of available data only to learn at some future time that data was false, inaccurate, or incomplete. Even in the realm of science, where data is supposedly free of subjective interpretation, implicit and explicit values on the part of “expert” evaluators affect data interpretation and application (Caldwell, 1996, 395). In situations where a “correct” choice is difficult or impossible to define, “the notion of the expected choice that would be made by an infinitely large voting population under a majority voting procedure” can be substituted (Shapley and Grofman, 1984, 330). In such situations, perhaps the best that can be said is that (a) decisions were reached by a correct (socially appropriate) process that was (b) inclusive – including all appropriate jury members – and (c) utilized pertinent data. Some ethicists go further, however. “Discourse theorists maintain that talk is constitutive of the realities within which we live, rather than expressive of an earlier, discourse-independent reality” (Wood and Kroger, 1995, 83). From this perspective, one common in some schools of contemporary bioethics, group consensus represents not verifiable truth, but the closest we can come to it in a context of incomplete knowledge.

2. Methodology

In the mid-1990s, a series of highly publicized North American cases focused attention on the issue of organ-transplant eligibility. In some cases, persons with disabilities were rejected as potential recipients. In other cases, famous persons with a history of alcoholism or cancer were accepted. In both professional and popular literatures the problem was stated clearly: What criteria are fair and appropriate in deciding between potential organ-transplant recipients?

A hierarchy of criteria generally acknowledged as important to transplant eligibility was created on the basis of a careful review of both the technical literature (Cook et al., 1989; Corley and Sneed, 1994; Olbrisch and Levenson, 1991) and the public debate surrounding controversial transplant decisions in Canada and the US. These included two cases in which persons with Down syndrome – California independent-living activist Sandra Jensen (Delsohn and Philip, 1995) and Canadian Special Olympics skier Terry Urquart (Dawson, 1995; Donohue, 1995) – were denied eligibility to the organ-transplant waiting list. In other cases, former baseball player Mickey Mantle (Hoppe, 1995) and actor Larry Hagman were given organ transplants despite histories of cancer and alcoholism.

The resulting criteria set was then organized using the Analytic Hierarchy Process (AHP), an approach to multicriterion decision making (Saaty, 1980). The AHP facilitates the quantification of preferences through a series of paired comparisons of hierarchically organized criteria. Criteria are compared in a pair-wise fashion as to their importance or contribution to a specific goal, in this case organ-transplant eligibility. Typically, comparisons are given on verbal or numerical scales. Each participant’s set of paired comparisons yields a ratio scale on which the relative importance or priority of each criterion is measured. Different group members’ priorities for the same criterion are aggregated into a collective, group assessment using the geometric mean. This permits inter- and intra-group comparisons.
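How a set of paired comparisons yields a priority scale can be sketched as follows. This is an illustration only: it approximates the principal eigenvector of a reciprocal comparison matrix by power iteration, which is the standard AHP derivation, but it is not the software used in the study. The judgments shown are invented for the example and are not the focus groups' data.

```python
def ahp_priorities(matrix, iterations=100):
    """Approximate the principal eigenvector of a positive reciprocal
    pairwise-comparison matrix by power iteration. The result is
    normalized to sum to 1 and read as criterion priorities."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Hypothetical judgments on a 1-9 numerical scale for three of the criteria.
# Entry [i][j] states how much more important criterion i is than criterion j;
# entry [j][i] is its reciprocal.
criteria = ["survival", "social independence", "public recognition"]
judgments = [
    [1.0, 3.0, 9.0],
    [1 / 3, 1.0, 3.0],
    [1 / 9, 1 / 3, 1.0],
]

if __name__ == "__main__":
    for name, weight in zip(criteria, ahp_priorities(judgments)):
        print(f"{name}: {weight:.3f}")
```

Because the invented judgments are perfectly consistent (3 × 3 = 9), the priorities are simply proportional to 9 : 3 : 1. Real judgments are rarely this tidy, which is why an inconsistency measure is also needed.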


The hierarchy was then submitted to a hospital research review committee. After its review, a set of focus groups was constituted. These included two groups (HSC1 and HSC2) composed of staff persons from the departments of bioethics, surgery, neonatology, nursing, psychiatry, and social work, at The Hospital for Sick Children. The groups, numbering six and eleven persons respectively, were heterogeneous rather than department-specific. Another focus group was composed solely of ten members of a local chapter of the Down Syndrome Family Association whose members had expressed an interest in the issues of organ-transplant eligibility. Finally, a control group of eight lay citizens drawn from Toronto’s Beaches community was asked to consider the organ transplant hierarchy. None of the last group’s members had a medical background or personal experience with transplantation.

Groups were constituted differently, depending on their relation to the hospital. The hospital’s department of bioethics sent a brief description of the testing program to members of varying departments, soliciting their participation. HSC1 sessions excited sufficient interest to attract additional participants to the second, larger group. Questions of transplant accessibility for persons with Down syndrome (DS) had been expressed publicly by members of the Down Syndrome Family Association. They agreed to participate and were asked to choose the members of their own group. The control group was scheduled informally with persons from the Beaches community who had expressed a general interest in issues of hospital organization, responsibility, or “these tough choices”.

The inclusion of stakeholders from the Down Syndrome Family Association and the community was critical. Because of an historical failure to treat persons with DS born with surgically correctable problems (Pueschel, 1989), individuals with this condition are more likely to require heart or heart-lung transplants than are members of the general population. They have thus been primary potential litigants, as the cases of both Sandra Jensen and Terry Urquart demonstrated, when organs are refused on the basis of diminished IQ or social independence. A citizen group was constituted because in both Canada and the US graft organs are defined in law as national resources to be used for the benefit of the nation’s citizenry. Thus hospital representatives agreed a public voice was required in a consideration of allocation policy.

Prior to meeting, members of each group were given a package of background materials that described the problem of organ transplant allocation in general terms. Reference to then-current newspaper cases was included to assure a common practical as well as general body of data across all groups. Also provided were an illustration of the hierarchy, and the definitions of all criteria to be considered by the groups. Finally, a brief description of the Analytic Hierarchy Process’ pair-wise comparison procedure was offered to familiarize participants with this approach.

Permission to tape-record sessions was secured from all participants. Individual decisions for each pair-wise comparison were recorded both on tape and manually. For each comparison, the geometric mean of the judgments of each group’s members was entered into Expert Choice, a software package for AHP, while a record of individual comparisons was reserved for later analysis. The results of each group’s deliberations yielded a cardinal measure of relative importance for each criterion, thus permitting intra-group comparisons.
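The aggregation step described above can be sketched as an entry-by-entry geometric mean of members' comparison matrices. The function names and the sample judgments are hypothetical. The sketch also shows why the geometric mean suits paired comparisons: it preserves the reciprocal relation between a judgment and its inverse.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def aggregate_judgments(individual_matrices):
    """Combine members' pairwise-comparison matrices entry by entry
    using the geometric mean."""
    n = len(individual_matrices[0])
    return [
        [geometric_mean(m[i][j] for m in individual_matrices) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two hypothetical members comparing the same pair of criteria.
    member_a = [[1.0, 2.0], [1 / 2, 1.0]]
    member_b = [[1.0, 8.0], [1 / 8, 1.0]]
    group = aggregate_judgments([member_a, member_b])
    print(group)  # the off-diagonal entries remain reciprocals of each other
```

An arithmetic mean would not have this property: averaging 2 and 8 gives 5, while averaging 1/2 and 1/8 gives 0.3125, which is not 1/5.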

For each criterion and for each vote, the group’s mean value rather than those of individual participants was considered. The results were not binary – yes/no – in the manner of classic jury decision making. While members were striving for general consensus in each group’s discussion – “I want to hear why you say that,” was a frequent request – differences in intensity, and sometimes in basic judgments, often remained between discussants. Aggregation thus resulted in a majority-rule rather than a wholly consensual vote on each criterion.

Finally, an index of inconsistency was calculated for each group’s decision set in an attempt to determine the relative degree of internal agreement for each group, and to allow comparison between groups. To measure the inconsistency of all the judgments in the hierarchy, the inconsistency value of each set of paired comparisons is first calculated and then multiplied by the priority of the criterion with respect to which those comparisons were made. These values are then summed over all criteria, giving a single, overall measure of inconsistency. The inconsistency index is then calculated as the ratio of this value to that which would result if all pair-wise comparison matrices had been filled with random judgments.
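The calculation just described can be sketched as follows. Two assumptions are made that the text does not state: the inconsistency value of a single matrix is taken to be Saaty's consistency index, (λmax − n)/(n − 1), and the random-judgment benchmark uses commonly tabulated random index values. The software used in the study may differ in detail.

```python
# Commonly tabulated random index values by matrix size (an assumption here;
# published tables and software packages vary slightly).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def principal_eigenvalue(matrix, iterations=200):
    """Estimate the largest eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    eigenvalue = float(n)
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        eigenvalue = sum(product)  # weights sum to 1, so this estimates lambda_max
        weights = [value / eigenvalue for value in product]
    return eigenvalue


def consistency_index(matrix):
    """(lambda_max - n) / (n - 1); zero for perfectly consistent judgments."""
    n = len(matrix)
    return (principal_eigenvalue(matrix) - n) / (n - 1)


def hierarchy_inconsistency(comparison_sets):
    """comparison_sets is a list of (priority_of_parent_criterion, matrix) pairs,
    with matrices of size 3 to 9. Returns the priority-weighted sum of
    consistency indices divided by the same weighted sum of random indices."""
    observed = sum(weight * consistency_index(m) for weight, m in comparison_sets)
    random_benchmark = sum(
        weight * RANDOM_INDEX[len(m)] for weight, m in comparison_sets
    )
    return observed / random_benchmark
```

For the top level of a hierarchy the parent is the goal itself, whose priority is 1. A set of judgments in which A is preferred to B, B to C, and C to A, each by the same factor, produces a ratio well above the conventional 0.1 threshold.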

Although AHP accepts inconsistency as a human trait and not bad per se, it also recognizes that too much is undesirable for good decision making. An inconsistency index of 0.1 or less is generally considered quite acceptable, while one higher than 0.1 is generally considered problematic. In the latter case, the respondent is advised to reconsider his or her judgments (pairwise comparisons). However, the objective is not to minimize the inconsistency, but to elicit considered judgments. If due reflection during the comparisons has been given, the resulting priorities are accepted even if inconsistency is deemed “high”. In fact, some practitioners working in areas of complex public policy argue that inconsistency indices as high as 0.2 may be both common and defensible (Hamalainen, Pers. Comm.).

Figure 1 summarizes this data for this project, identified as Series 1. Series 2 and 3 represented other problems addressed using a similar methodology. In Series 1, hospital-based group decisions (0.04 and 0.06 respectively) were extremely consistent while those of the Down syndrome focus group (0.08) were less consistent. The inconsistency value of the Citizen Group, however, was comparatively high (0.14). Assuming that consistency represented an informed reading of the available data from a stable, constant perspective, and that decision quality increases with consistency, the greater inconsistency evidenced by non-professional participants would suggest less confident decisions by these non-professional groups.

A careful analysis of the specific judgments of all groups suggests a very different conclusion, however. Relatively inconsistent decision sets were related principally to votes on two criteria in the hierarchy. Understanding these apparently eccentric judgments yielded valuable insights into both the process and assumptions embedded in the hierarchy itself.

3. Results

Substantiation of individual model criteria, and a general report on group responses, has been fully reported elsewhere (Koch, 1998, Ch. 5). For the sake of brevity, only the highest-order (level-one) criteria are considered here. For each of these, lower-level sub-criteria further specified dimensions of their “parent’s” applicability. Level-1 criteria related to the goal of organ-transplant eligibility included:

• Potential for post-operative patient survival: It was assumed those with a greater statistical likelihood of survival would be favored by participants.

• Level of physical activity expected following a successful transplant: It was assumed that those most likely to achieve full, post-transplant physical function would be favored.

• Level of social independence expected following a successful transplant: It was assumed those most likely to achieve full, post-transplant social independence would be favored.

• Intelligence measured by IQ of transplant candidate: It was expected on the basis of the literature review that those with “moral intelligence” (an IQ of at least 70 points) would be favored.

• Level of compliance or non-compliance by transplant candidate: On the basis of the literature review we assumed “compliant” patients who followed medical orders and directives would be favored.

• Degree of transplant candidate’s public recognition or notoriety: It was expected that those who were public figures would be favored.

The results of Level-1 judgments are presented graphically as Figure 2. The vertical axis shows the priorities given each criterion, indicated on the horizontal axis. Despite a great deal of congruence, two areas of divergence were problematic. The first involved the relative weight given to “activity” levels expected for a successful candidate following a transplant, while the second and greater disagreement centered on the issue of “compliance.” The strong emphasis given this latter criterion by the citizen control group raised its inconsistency level across the overall hierarchy from approximately 0.08 to a level of 0.15.

Figure 1. Inconsistency Index.

In other areas, however, a clear consensus dominated inter- and intra-group judgments. Survival was the most strongly valued criterion, for example. Similarly, members of these disparate groups uniformly gave little weight to public recognition and intelligence measured by IQ. In addition, social independence was accorded roughly the same importance by members of stakeholder (Down syndrome), lay (Citizen) and expert groups. Further, participant voting patterns indicated these judgments were made with equal intensity across all groups. Participants in all groups rated survival as “extremely” more important than most other criteria, except physical activity. One group (HSC1) rated physical activity as more important than unqualified survival. Both intelligence, as measured by IQ, and public recognition were in almost all cases viewed as “very much” or “extremely” less important than other criteria in the pair-wise comparisons. This was especially significant given, first, the reliance on IQ-defined intelligence reported in the literature (Olbrisch and Levenson, 1991; Majeske, 1995) and, second, the importance given to public stature by medical professionals defending the decision of colleagues to provide a liver transplant for former baseball player Mickey Mantle (Kolata, 1995).

Others have noted the significance of these groups’ decisions, which anticipated changes in US organ transplant policy (Ott, 1997). Historically, in both Canada and the US, “treatment decisions were made by considering only what was best for each individual” (Ott, 1997). In late 1996, however, the organization charged with setting US organ-transplant policy (the United Network for Organ Sharing) announced policy changes emphasizing survival irrespective of medical urgency as a principal criterion in liver transplantation (Kolata, 1996; Estrin, 1996; Ott, 1997). Thus a low probability of post-transplant survival should be, under these guidelines, reason for disqualifying a candidate whatever his or her state of medical urgency.

Figure 2. Level-1 Criteria.


As an example, the decision to transplant a liver to former baseball player Mickey Mantle was condemned by focus group members on the basis of his extremely poor, long-term prognosis. Like many public commentators (Hoppe, 1995), group participants argued the unfairness of providing graft organs to famous persons whose cases are medically problematic. Simply put, public recognition should not, participants argued, be a criterion in organ-transplant eligibility.

More importantly, perhaps, the low valuation given intelligence as measured by IQ presaged several changes in regional transplant policy following the Jensen and Urquart cases. Inspired by Jensen’s fight for transplant eligibility, a bill prohibiting physicians and hospitals from denying access to life-saving transplantation on the basis of a person’s disability was passed by the California legislature in July, 1996 (Hubert, 1997). At the participating hospital in Alberta, Canada, the debate over Terry Urquart’s refusal resulted in a new standard that recognizes family support as a value supporting the transplant candidacy of those with cognitive deficits. With these two cases, precedents were set in both Canada and the US acknowledging that however it is measured, “reasonable intelligence” does not serve as a sole, independent criterion in transplant eligibility.

Citizen and Down Syndrome Family Association discussants argued strongly that IQ does not measure the range of social and general intelligence they most valued. Its narrow focus on a type of abstract reasoning did not, for them, reflect the worth of persons who may be loving and socially important even if limited in their ability for abstract reasoning. Participants in both hospital groups similarly questioned its validity as a measure of either individual independence or social worth. Further, they noted that persons with extremely low IQs will have other limitations that in any case would likely exclude them from transplant eligibility. Thus while it is attractive as a “firm number” clinical practitioners can use, participants agreed that, as a criterion for organ-transplant eligibility, it was at best redundant and at worst prejudicial.

3.1 Problematic criteria

Areas of disagreement included the criteria “Activity” and “Compliance”. One group, HSC1, ranked the potential for post-transplant physical activity as more important than “survival”. They were concerned that without qualification, physical “survival” as a criterion would permit transplantation for persons in a vegetative state, or for others so wholly disabled as to make them unacceptable candidates to many participants. This was a concern shared by others who ranked “activity” as important, if less important than survival. For members of all groups, activity became important at extreme levels of medical dependence. Thus none would offer an organ to someone in a vegetative state, for example. Most agreed a person on sustained life support – even if conscious – would be too fragile to be an appropriate organ candidate. Thus the apparent disagreement at Level-1 masked an underlying accord revealed at lower levels of the hierarchy in which definitions of parent criteria were considered.

Far more problematic was the distance between group judgments regarding patient compliance as a criterion. This disagreement resulted from very different definitions employed by members of different groups. In making comparisons, each group operationalized “compliance” in a different way despite having received identical handouts and definitions prior to meeting. HSC1 members defined compliance as a measure of communication between staff, patients, and patient families, rejecting its utility as a solely clinical criterion. HSC2 participants at first defined “compliance” as a predictive criterion directly contributing to post-transplant survival. A discussion of the efficacy of compliance as a predictive criterion, and of HSC1’s reservations, led them to accept the first group’s definition.

The group from the Down Syndrome Family Association defined “compliance” with medical directives as a social rather than clinical measure. For them, it represented patient and citizen responsibility. Compliant persons were those who did what they were told was required to assure their – and the organs’ – survival. Non-compliant persons who refused medical directives diminished the probability of organ survival, wasting a precious resource. Finally, members of the Citizen Group defined this criterion as a measure of patient trust in the medical professionals, and secondarily, the public health system at large. “You don’t know,” said one person. “So you have to trust the doctors.”

Thus Citizen and Down Syndrome group members defined compliance not on clinical grounds but as a criterion representing social values related to patient participation in the publicly funded health-care system, as well as trust in its administrators. In discussion, members of the lay focus group agreed that medical personnel had a responsibility to explain clinically relevant requests to patients and their families. For them, compliance necessarily included a process of staff explanation, discussion, and perhaps negotiation with patients. All assumed this regularly occurred, while professionals saw more open staff-patient relationships as a goal for which compliance was a surrogate indicator.

Obviously, this criterion was ambiguous and open to various interpretations. And yet, the literature of transplant decision making cited earlier assumes compliance to be clearly understood. In arriving at their decision, group participants performed the invaluable service of showing the ways in which this criterion – almost universally accepted in the expert literature – could be variously considered.

4. Discussion

In effect, focus-group voters took criteria that had become ends unto themselves – “Compliance” is perhaps the best example – and through their deliberations considered the relevance of those criteria to the goal of transplant-eligibility policy. In utilizing these results, the original, literature-based hierarchy was reformulated with “survival” as the model’s goal. Eligibility thus became a matter of the likelihood of two level-1 criteria: (a) transplant survival and (b) the quality of social participation that would result for the patient within his or her community. To the degree “compliance” could be shown to affect survival, it was retained as a clinical criterion labeled, in a subsequent iteration of the hierarchy, “medical compliance”. Eligibility thus would be denied on clinical grounds to an alcoholic who continued to drink (an activity prohibited by necessary post-transplant medications) or a patient who refused all oral medications.


“Social Compliance” was also included, however, as a separate, socially constructed criterion representing both patient-staff relationships and social responsibility within a national health-care system. In this reordering of the original, literature-based hierarchy, the insights of non-professional participants were as important as those of the professional. Citizens, stakeholders, and professionals came to similar conclusions in many of the pair-wise comparisons in this hierarchy. Where differences occurred, they reflected a distinct but no less valid assumption set about the importance of one or another criterion to the model’s goal. Multiple focus groups composed of homogeneous voters were more likely to uncover these differences than were members of any single group. Indeed, it was in the process of critical evaluation both within specific groups and between the set of groups that a broadly consensual perspective was found.

A review of the dialogues of each group suggested a general good will among group members, one aimed at both their own and the greater community’s needs. Medical personnel sought both the best outcomes for patients and what was, from their perspective, the best use of a socially scarce resource. Where “compliance” was defined as a matter of their comfort or convenience, it was devalued. Similarly, members of the Down syndrome group agonized over what seemed to them just and right for their members and what seemed to them important for the greater society. They thus debated for almost forty-five minutes whether those with extremely high IQs should receive priority over those with lower IQs because, some suggested, society needs its very brightest persons to address contemporary problems.

The jury process operative in each group was dynamic and fluid. Members of professional, layperson, and stakeholder groups found their views changing and their judgments shifting in the discussion of pair-wise criteria. Examples of these dialogues are given in another report (Koch, 1998). This suggests that, while group members may vote independently, the interdependence of the small-group discussion process may enhance the probability of each “voter” (each group) reaching a generally acceptable decision irrespective of other participants’ levels of expertise.

This conclusion differs from that of previous researchers like Nitzan and Paroush who, as Berg points out, conclude that independent choices are generally superior to interdependent decisions in which weaker jury or group members are swayed by a single, stronger group participant (Nitzan and Paroush, 1985; Berg, 1994, 72). In these groups, that did not happen. Rather, there seemed to have been a revolving leadership in which different individual positions may have affected other jury members on any single decision.

In the Down Syndrome Family Association group, for example, members spent forty-five minutes discussing one aspect of IQ and its relation to other criteria because one member felt strongly that some special consideration should be given to persons with extremely high intelligence scores, those above 140 points. Not all were convinced, but some participants altered their votes on this point. The leader in that discussion, however, was an average-to-minor discussant on points where others felt strongly.

This was the general pattern among groups: leaders developed for one or two questions but were not dominant across the process of criterion evaluation. Each person was listened to, and at different points the perspective of different voters may have been of singular importance in the deliberations of others. The need to explain one’s position to others, the desire for cohorts to understand even where they do not accept an individual’s viewpoint, created a context in which judgments acceptable to all group members were made. A broadly consensual position based on the experiences and perspectives of hospital personnel, stakeholders, and citizens resulted from the aggregation of the judgments of all participating groups.

In this case one may say that the CJT’s broad promise is fulfilled even if some of its restrictions have been relaxed. The conclusion may be that the likelihood of reaching the best possible decision is enhanced by involving multiple juries whose constituency, while generally homogeneous group to group, is heterogeneous across all sets of discussants. This offers the broadest mechanism for the exchange and comparison of a range of perspectives, knowledge bases, and value sets. The diversity it represents leads to conclusions that are synthetic and, not coincidentally in this case, promise the broadest possible base of social support.
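For readers who want the classical benchmark against which this relaxation is measured, the short sketch below computes the CJT’s “broad promise” in its strictest form. It is an illustration only, not a model of the focus groups described here: it assumes an odd number of voters who vote independently, each with the same probability p of being correct, and a simple majority rule. The function name and the sample values of n and p are hypothetical choices made for the example.

```python
from math import comb


def majority_correct_probability(n: int, p: float) -> float:
    """Probability that a simple majority of n voters is correct.

    Assumes the textbook CJT conditions: n is odd, voters are
    independent, and each is correct with the same probability p.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number (avoids ties)")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    smallest_majority = n // 2 + 1
    # Sum the binomial probabilities of every outcome in which at
    # least a bare majority of voters is correct.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(smallest_majority, n + 1)
    )


if __name__ == "__main__":
    # Illustrative jury sizes and competence level, chosen arbitrarily.
    for jury_size in (1, 3, 5, 11, 21):
        print(jury_size, round(majority_correct_probability(jury_size, 0.6), 4))
```

Under these assumptions the majority’s reliability rises with jury size when p exceeds one half and falls when p is below it. The deliberating groups discussed here depart from the independence assumption, which is why the theorem’s restrictions are described as relaxed.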

The potential of the heterogeneous jury lies not simply in the specialized knowledge its individual members may possess but in the diverse types of knowledge various members bring to a single problem. Expertise, and the “knowledge” it supposedly bestows, may be clinical or scientific, social, or experiential. The inclusion of heterogeneous focus groups as voters suggests that, at least in the realm of bioethics, all these varying types of knowledge contribute to the resolution of a problem whose solution must be at least generally acceptable both clinically and socially. To make “correct” bioethical decisions, hierarchies will have to be structured to reflect a pluralism of both knowledge (ways of knowing) and of values.

From this perspective, the probability of arriving at “correct” decisions increases to the degree voters share (a) a compatible knowledge base that is (b) applied to a shared set of critical definitions reflecting (c) a shared set of cultural values. In the jury process, critical definitions and the cultural values they reflect become the context within which specific and technically precise data are interpreted. The participation of layperson and stakeholder groups assured, in this example, that values unrelated to outcome, which are often implicit in professionally accepted criteria, were represented in what may at one level appear to be a purely technical (or in this case clinical) problem. Put another way, judgments made without the input of diverse constituencies may inhibit rather than advance the probability of correct decision making. It is the necessity of these diverse knowledge types to bioethical decision making that argues for a CJT approach, one in which the probability of a correct (in this case socially and clinically acceptable) policy judgment is increased by the participation of a broadly rather than narrowly constituted jury.

Acknowledgements

The authors wish to express their appreciation to Prof. Hannu Nurmi, University of Turku, Department of Political Science, Finland, for his comments and suggestions on preliminary drafts of this paper. They also acknowledge the contribution of reviewers whose comments on an earlier draft were incorporated into this version.


References

Austen-Smith, D., and J. Banks. (1996). “Information Aggregation, Rationality, and the Condorcet Jury Theorem,” American Political Science Review 90, 34–35.

Berg, S. (1996). “Condorcet’s Jury Theorem and the Reliability of Majority Voting,” Group Decision and Negotiation 5(3), 229–238.

Berg, S., and H. Nurmi. (1996). “Group Decision Quality and Social Choice Theory,” Group Decision and Negotiation 5(3), 207–209.

Berg, S. (1994). “Evaluation of Some Weighted Majority Decision Rules Under Dependent Voting,” Mathematical Social Sciences 28, 71–83.

Boland, P. J. (1989). “Majority Systems and the Condorcet Jury Theorem,” The Statistician 38, 181–189.

Caldwell, Lynton K. (1996). “Science Assumptions and Misplaced Certainty in Natural Resources and Environmental Problem Solving,” in John Lemons (ed.), Scientific Uncertainty and Environmental Problem Solving. Cambridge, MA: Blackwell, 394–42, 395.

Cook, R. D., S. Staschak, W. T. Green, and L. G. Vargas. (1989). “A Method to Allocate Livers for Orthotopic Transplantation: An Application of the Analytic Hierarchy Process,” Proceedings of the International Conference on Multiple Criteria Decision Making. Bangkok, December 6–8.

Corley, M. E., and G. Sneed. (1994). “Criteria in the Selection of Organ Transplant Recipients,” Heart and Lung 23(6), 453–454.

Dawson, C. (1995, March 28). “Supporters Rally for Teen’s Transplant,” Calgary Herald, B3.

Donohue, T. (1995, March 22). “Morally Outrageous,” Ottawa Citizen, A12.

Delsohn, B., and T. Philip. (1995, August 11). “Activist Takes on Fight for her Life,” Sacramento Bee, A1.

Edwards, W., and B. F. Hutton. (1994). “SMARTS and SMARTER: Improved simple methods for multiattribute utility measurement,” Organizational Behavior and Human Decision Processes 60, 306–325.

Estrin, R. (1996, November 15). “Liver Transplant Policy to Favor Patients Most Likely to Survive,” The Washington Post, A3.

Grofman, B., and S. L. Feld. (1988). “Rousseau’s General Will: A Condorcetian Perspective,” American Political Science Review 82(2), 567–576.

Hamalainen, Raimo (Personal communication to Ridgley). Address: Systems Analysis Laboratory, Helsinki University of Technology, Helsinki, Finland. E-mail: [email protected]

[…], A. (1995, June 16). “Mickey Mantle’s Lucky Liver,” San Francisco Chronicle, A29.

Hubert, C. (1997, May 25). “Transplant Pioneer Loses Battle for Life,” Sacramento Bee, A3.

Koch, T. (1998). The Limits of Principle: Deciding Who Lives and What Dies. Westport, CT: Praeger Books.

Koch, T., and M. Rowell. (1998). “The Dream of Consensus: Finding Common Ground in a Bioethical Context,” Theoretical Medicine and Bioethics 19(5).

Koch, T., and M. Rowell. (1997). “A Pilot Study on Transplant Eligibility Criteria: Valuing the Stories in Numbers,” Pediatric Nursing 23(2), 160–166.

Koch, T. (1996). “Normative and Prescriptive Criteria: The Efficacy of Organ Transplantation Allocation Protocols,” Theoretical Medicine 17(1), 75–93.

Kolata, G. (1996, November 15). “In Shift, Prospects for Survival will Decide Liver Transplants,” The New York Times, A1, A26.

Kolata, G. (1995, June 11). “Transplants, Morality, and Mickey Mantle,” The New York Times, Section 4, 5.

Majeske, R. A. (1995, August 5). “Criteria for Transplant Candidate Selection,” BIOMED-L conference list.

Miller, N. R. (1996). “Information, Individual Errors, and Collective Performance: Empirical Evidence on the Condorcet Jury Theorem,” Group Decision and Negotiation 5(3), 211–218.

Nitzan, S., and J. Paroush. (1985). Collective Decision Making. Cambridge University Press, Cambridge. Also quoted in Berg (1994).

Obrisch, M. E., and J. L. Levenson. (1991). “Psychosocial Evaluation of Heart Transplant Candidates: An International Survey of Process, Criteria, and Outcomes,” Journal of Heart Lung Transplant 10, 948–955.

Ott, B. (1997). “Commentary on Koch and Rowell Article: Changes in Liver Transplantation,” Pediatric Nursing 23(2), 167–168.

Pueschel, S. M. (1989). “Ethical Considerations in the Life of a Child with Down Syndrome,” Issues in Law and Medicine 5(1), 87–99.

Saaty, T. L. (1980). The Analytic Hierarchy Process. McGraw-Hill.

Schmidt, Frederick F. (1985). “Consensus, Respect, and Weighted Averaging,” Synthese 62, 25–46.

Shapley, L., and B. Grofman. (1984). “Optimizing Group Judgmental Accuracy in the Presence of Interdependencies,” Public Choice 43, 329–343.

Wood, L. A., and R. O. Kroger. (1995). “Discourse Analysis in Research on Aging,” Canadian Journal on Aging 14(1), 82–99.
