Lawyers' probability misconceptions and the implications for legal education



Peter Hawkins Solicitor. Principal Lecturer, College of Law, London

Anne Hawkins Director, Royal Statistical Society Centre for Statistical Education, University of Nottingham

INTRODUCTION

This article describes an empirical study that was undertaken to see whether lawyers possess the statistical and, more particularly, probabilistic understanding that they need for their professional duties. Their earlier training in mathematics might have been expected to provide them with the relevant understanding, but was found to be wanting. The implications for legal decision making are discussed, and recommendations are made for preparing lawyers with the necessary skills.

WHY IS AN UNDERSTANDING OF PROBABILITY PARTICULARLY IMPORTANT TO LAWYERS?

Assessing likelihoods is fundamental to the work of lawyers and therefore, either directly or indirectly, to each member of society. Whether or not they recognise the fact, lawyers do work with probability. They need to assess what is likely to have happened in a particular case. They must then evaluate the case and select the optimal strategy to be followed. They must predict the uncertain behaviour of other persons, eg witnesses and jurors, in order to influence it. They must have the ability to select and represent information, both qualitative and quantitative, guiding the court’s interpretation of the evidence in the face of uncertainty. Finally, in all civil and criminal cases, they must convince the court to a required level of probability.

In the civil courts in England and Wales, this level is ‘the balance of probabilities’, ie ‘more likely than not’ (or a probability that is greater than 0.5). In the criminal courts, the standard for prosecutors is ‘beyond reasonable doubt’, the precise meaning of which is much more difficult to gauge. A judge may say to the jury that they ‘have to be sure’, but people’s ideas of ‘being sure’ are subject to variability. Furthermore, if a judge advises them to be ‘certain beyond reasonable doubt’, the jury is being presented with a contradiction in terms. To be ‘certain’ is to be 100% sure, ie is associated with a probability of unity (p=1), whereas ‘beyond reasonable doubt’ is, by definition, a lower standard of proof (for example, p=0.99, or 0.95, or 0.90, or 0.75, etc).


The two standards of proof, civil and criminal, are very different. It is perfectly possible for a defendant to be found not guilty in the criminal courts but nevertheless to be held liable in the civil courts. The two recent O J Simpson trials in the USA were a good example of this.¹ It is not clear, however, what these probabilistic standards, particularly the second, actually mean to lawyers or jurors. Lord Denning said that ‘there may be degrees of proof within that [the criminal proof] standard’.² Indeed, Simon and Mahan³ would agree. They administered a questionnaire to groups of US judges, jurors⁴ and students of sociology and found evidence to suggest that the more serious the crime, the greater was the presumption of innocence. Figure 1⁵ shows the mean probabilities of guilt that the three groups of respondents considered corresponded with ‘beyond reasonable doubt’ for five types of crime. The judges’ standard of proof appears to be approximately 0.9 whatever the nature of the crime, and is less subject to variability than the standards proposed by the other two groups.⁶ The jurors seem to have a markedly lower standard of proof, though, with most of their judgments of ‘beyond reasonable doubt’ being in the region of 0.75. They are also out of step with the other two groups in giving a lower standard for crimes of rape. This may be a reflection of the age or context of this particular study. If standards of proof can vary according to the type of crime then they are likely also to vary over time as society revises its views of the seriousness of different offences. Cultural differences might also be expected to have a bearing. Moreover, Eggleston makes the point that stated standards of proof do not necessarily correspond directly with the inclination of judges or jurors to convict or acquit in practice. The situation is clearly more complex than Figure 1 might suggest (see also Aitken⁷).

1. People of California v Simpson (1995) No BA097211, Superior Court of the State of California for the County of Los Angeles; Rufo v Simpson et al, Goldman, etc. v Simpson et al, Brown, etc. v Simpson (1997) Nos SC031947, SC036340 & SC036876 respectively, Superior Court of the State of California for the County of Los Angeles.
2. R Eggleston Evidence, Proof and Probability (London: Weidenfeld and Nicolson, 2nd edn, 1983).
3. R J Simon and L Mahan ‘Quantifying burdens of proof’ (1971) 5 Law and Society Rev 319-330.
4. Research into how real juries arrive at their decisions is prohibited in the UK, so most of the research in this field has been done in North America.
5. From Eggleston, above n 2.
6. C G G Aitken Statistics and the Evaluation of Evidence for Forensic Scientists (Chichester: Wiley, 1995).
7. Aitken asks, for example, what might the judges’ 0.92 standard of proof for murder mean? If it is 0.92 to 0.08 (approximately 12:1) odds in favour of guilt, does this mean that out of every 13 Americans convicted for murder one was innocent? Aitken posits that this is unlikely to be the case, and suggests that it is more likely that ‘the judges do not have a good intuitive feel for the meaning of probability figures’.


318 Legal Studies

Figure 1: Differing interpretations of ‘Beyond Reasonable Doubt’ (Simon and Mahan, 1971)

                 Mean response of people surveyed
Crime            Judges    Jurors    Students
Murder           0.92      0.86      0.93
Forcible Rape    0.91      0.75      0.89
Burglary         0.89      0.79      0.86
Assault          0.88      0.75      0.85
Petty Larceny    0.87      0.74      0.82

BACKGROUND TO THE PRESENT STUDY

Research in the field of statistical education has shown that some people have great difficulty in dealing with information that involves conditionality⁸ or causality, especially when these concepts are set in the context of decision-making under uncertainty. The tree-diagrams in Figure 2 show four different outcomes that might occur in a legal context. These are derived from the combinations of two possible realities (Guilt and Innocence) with two possible verdicts (‘Guilty’ and ‘Not guilty’). Example probability values have been entered alongside the four possible outcomes.

The distinct nature of the four outcomes shown in the tree-diagrams in Figure 2 is not likely to cause too many problems. However, it is not uncommon for people to confuse the conditional probability of a verdict of ‘Guilty’ given that the accused is Guilty, ie Pr(‘G’|G)⁹, with its inversion - the probability of Guilt given a verdict of ‘Guilty’, ie Pr(G|‘G’). As we shall see, this and similar ‘confusions of the inverse’ have been at the heart of a number of disputes in recent court cases.¹⁰

The following sporting example, after Koehler,¹¹ may help to underline the point that a conditional probability cannot be replaced by its inverted form. The probability that Manchester United beat Liverpool, given that Manchester United scored five goals, is not the same as its inversion - the probability that Manchester United scored five goals, given that they beat Liverpool. The former probability, Pr(ManU wins | ManU scores five goals), is very high because it is rare for either football team to score five goals in a match, let alone both of the teams. However, the second probability, Pr(ManU scores five goals | ManU wins), is quite low because (notwithstanding differing opinions about the relative merits of the two teams) Manchester United could have scored any number of goals, even as few as one, in beating Liverpool.

8. Where we are interested in something happening ‘given that’ some other event has also occurred or will do in the future.
9. Pr(‘G’|G), where the vertical line signifies the ‘given that’ condition.
10. For further discussion of such errors, see A S Hawkins ‘Uncertain Justice’ (1993) 112(9) Law Notes 21-23.
11. J J Koehler ‘Probabilities in the courtroom: An evaluation of the Objections and Policies’ in D K Kagehiro and W S Laufer (eds) Handbook of Psychology and Law (New York: Springer-Verlag, 1992) pp 167-184.


Figure 2: Tree-diagrams demonstrating that Pr(‘G’|G) is not the same as Pr(G|‘G’)

[Two tree-diagrams, each with the columns: Truth, Verdict, Probabilities]

Pr(‘Guilty’ verdict | Guilt): Given G, Guilt, only the top half of the tree-diagram is relevant. Then, only the top branch relates to ‘G’, a ‘Guilty’ verdict. Pr(‘G’|G) = 0.54/(0.54 + 0.36) = 0.54/0.9 = 0.6

Pr(Guilt | ‘Guilty’ verdict): Given ‘G’, a ‘Guilty’ verdict, only branches 1 and 3 are relevant. Then, only the top branch relates to G, Guilt. Pr(G|‘G’) = 0.54/(0.54 + 0.06) = 0.54/0.6 = 0.9
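The two calculations for Figure 2 can be reproduced from the joint probabilities on the four branches of the tree. The following Python sketch is illustrative only: the table layout and function names are hypothetical, and the fourth branch value (0.04, Innocence with a ‘Not guilty’ verdict) is an assumption, taken as the remainder needed for the four branches to sum to 1.

```python
from fractions import Fraction

# Joint probabilities keyed by (truth, verdict).
# "G" = Guilt / 'Guilty' verdict, "I" = Innocence, "NG" = 'Not guilty' verdict.
JOINT = {
    ("G", "G"): Fraction(54, 100),   # branch 1
    ("G", "NG"): Fraction(36, 100),  # branch 2
    ("I", "G"): Fraction(6, 100),    # branch 3
    ("I", "NG"): Fraction(4, 100),   # branch 4 (assumed: remainder so the total is 1)
}


def pr_verdict_given_truth(verdict, truth):
    """Pr(verdict | truth): keep only the branches that share the given truth."""
    total = sum(p for (t, _), p in JOINT.items() if t == truth)
    return JOINT[(truth, verdict)] / total


def pr_truth_given_verdict(truth, verdict):
    """Pr(truth | verdict): keep only the branches that share the given verdict."""
    total = sum(p for (_, v), p in JOINT.items() if v == verdict)
    return JOINT[(truth, verdict)] / total


print(float(pr_verdict_given_truth("G", "G")))  # Pr('G'|G) = 0.6
print(float(pr_truth_given_verdict("G", "G")))  # Pr(G|'G') = 0.9
```

The same joint table gives two different answers because the two questions restrict attention to different pairs of branches.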

The problems that people experience with conditional probabilities can be seen to be very important when we consider the ‘prosecutor’s fallacy’, which featured in R v Deen.¹² The prosecutor’s fallacy is a particular example of a ‘confusion of the inverse’. A simple representation of the fallacy is as follows: in the criminal courts it often falls to the prosecutor to ask the jurors to consider the probability of the accused being innocent given the evidence, ie Pr(Not Guilty | Evidence).¹³ Instead, however, the prosecutor may try to make them consider a different probability, namely the probability of the evidence occurring given that the accused is innocent, ie Pr(Evidence | Not Guilty). For example, if there are ten items of forensic evidence tending to implicate the accused, the prosecutor’s fallacy is to pose the wrong question - ‘How unlikely is it for all these ten pieces of evidence to occur in conjunction with one another if the accused is not guilty?’ The implication is that it is very unlikely, and therefore the accused ‘must be’ guilty.

12. R v Deen (1994) Times, 10 January.
13. The authors would like to make the point that they have shown the probabilities in the way in which they are typically portrayed in the literature. Thus, the correct question for the court has been stated to be ‘Given the evidence, what is the probability of the defendant being innocent?’ However, this is actually a rather worrying way of presenting the information, given that we have a system of justice which proclaims that the accused is ‘innocent until proved guilty’. We should really be interested in the probability of guilt given the evidence, ie Pr(G|Evid), not in the probability of innocence given the evidence, ie Pr(NG|Evid). Even if these two probabilities are opposite sides of the same coin, the latter formulation puts the emphasis in the wrong place.


The landmark case in the United States, People v Collins,¹⁴ involved a street robbery. There were six items of forensic evidence, but no positive identification of the two accused. The prosecutor estimated that ‘the chances of anyone else besides these defendants being there . . . having every similarity . . . is something like one in a billion’. Although the California Supreme Court adopted a lower probability of one in twelve million, the defendants were still convicted. The judgment was reversed, however, because the real question should not have been - ‘How unlikely is it for all those aspects to occur in conjunction?’ (ie multiplying small decimal probabilities to get even smaller decimals) but rather - ‘Given the evidence, what is the probability of the defendants being guilty?’. In fact it was found that the combination of distinguishing aspects could probably have occurred by chance more than once in the locality. The defendants could not, therefore, be found guilty ‘beyond reasonable doubt’ because the possibility of even just one other ‘match’ existing destroyed the prosecutor’s case. Indeed, if it were 50:50 between two possible defendants (one in court, and another hypothetical one ‘somewhere in the relevant locality’), not even the civil burden of proof (balance of probabilities) would have been satisfied.
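The point that a very rare combination of characteristics may still occur more than once in a locality can be illustrated numerically. The Python sketch below is not the court's own calculation: it assumes that each couple in the locality matches independently with the one-in-twelve-million probability quoted above, and the population size is a purely hypothetical figure chosen for illustration.

```python
def prob_more_than_one_match(p, n):
    """Given that at least one of n couples matches, the probability that
    more than one does (independent matches, each with probability p)."""
    p_none = (1 - p) ** n
    p_exactly_one = n * p * (1 - p) ** (n - 1)
    p_at_least_one = 1 - p_none
    return (p_at_least_one - p_exactly_one) / p_at_least_one


P_MATCH = 1 / 12_000_000   # figure quoted in the text
N_COUPLES = 12_000_000     # hypothetical population size, for illustration only

print(round(prob_more_than_one_match(P_MATCH, N_COUPLES), 2))
```

Under these assumptions the result is a little over 0.4, far too large for a second matching couple to be dismissed; a different assumed population size would give a different figure.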

Accepting that lawyers deal in probabilities rather than certainties, and that they must help others to do likewise, their probabilistic intuitions must be sound. Cases like those described above would suggest that this is not something that can be taken for granted. The research instrument (see Figure 4) used in the present study permitted an exploration of lawyers’ susceptibility to some of the probabilistic misconceptions that have been shown by earlier researchers to be prevalent among other groups. Apart from work with young children looking at the development of probabilistic understanding, or conversely at the emergence of misconceptions (eg Piaget and Inhelder,¹⁵ Fischbein¹⁶), most of the studies have been with adults, particularly university students. Kahneman, Slovic and Tversky¹⁷ and Shaughnessy¹⁸ provide, respectively, a useful early and a useful later review of research findings in this area. The error types and associated research methodology are also described in Hawkins, Jolliffe and Glickman.¹⁹ Some of the items in the present research instrument were designed to look at the lawyers’ ability to handle the ‘building blocks’ of probabilistic reasoning (selection, conjunction, random order, etc). Others were more closely related to ‘applied’ law and to more global examples of legal and probabilistic inference.

14. People v Collins (1968) 68 Cal 2d 319, 66 Cal Rptr 497, 438 P 2d 33, 36 ALR 3rd 1176 at 34.
15. J Piaget and B Inhelder (trans L Leake Jr, P Burrell, H D Fischbein) La genèse de l’idée de hasard chez l’enfant (The origin of the idea of chance in children) (London: Routledge & Kegan Paul, 1951).
16. E Fischbein The intuitive sources of probabilistic thinking in children (trans C A Sherrard) (Dordrecht: D Reidel Publishing Co, 1975).
17. D Kahneman, P Slovic and A Tversky (eds) Judgment under Uncertainty: Heuristics and biases (Cambridge: Cambridge University Press, 1982).
18. J M Shaughnessy ‘Research in probability and statistics: reflections and directions’ in D A Grouws (ed) Handbook on Research on Mathematics Teaching and Learning (New York: Macmillan, 1992) pp 465-494.
19. A Hawkins, F Jolliffe and L Glickman Teaching Statistical Concepts (Harlow: Longman, 1992).


THE PRESENT STUDY

Methodology

The study required each respondent to complete a ‘Likelihoods Schedule’, which consisted of a multiple-choice paper-and-pencil test. The subjects were told that the test would take approximately six minutes to complete. In fact, more time was allowed if it was needed, but all the respondents completed the test well within 10 minutes. The test was anonymous and no conferring was allowed.

Subjects

This report focuses on 168 students who had just begun a Legal Practice Course, although there were further groups who participated in the main study, including a group of statistics teachers. The lawyers who took part were all graduates, although not necessarily in law, and all had received a minimum of one year’s training in law.

Most of the respondents (124) were asked to record the highest qualification in mathematics that they had obtained, chosen from a list of examples. On the basis of this, it was possible to classify this sub-group of respondents into two groups (see Figure 3) in order to consider the effect of mathematical training on the lawyers’ probabilistic intuitions and reasoning. Those (61%) who had received compulsory school education in mathematics or less are designated ‘the lower mathematics group’, and those who had received more than this level of mathematics education will be identified as ‘the higher mathematics group’. Almost all the former group had been educated up to the equivalent of GCSE mathematics. Most of the latter group had been educated up to the equivalent of A level mathematics, although a few (7% of 124) had received some university training in this subject. Where mathematics training was shown not to be a key factor, the results from the whole group of 168 lawyers are reported.

Figure 3: Mathematical backgrounds of the lawyers

[Chart categories: ‘More than compulsory’; ‘Compulsory or less’]

The research instrument

The Likelihoods Schedule (see Figure 4) included six multiple-choice items, all of which were couched in legal contexts. They all followed the format of items that can be found in the research literature, in particular Kahneman et al.²⁰ The present study featured those misconceptions and patterns of reasoning which have sometimes been described as the ‘availability heuristic’, the ‘representativeness heuristic’, the ‘conjunction fallacy’, ‘inferential asymmetries’, and the ‘base-rate fallacy’.

20. Above n 17.

Different multiple-choice variants were piloted to find the optimal level of forced choice on the different items. As will be seen, the answer ‘Neither’ was included where this was the objectively correct answer to a question. The answer ‘Don’t know’ was used on one item to minimise interference from pure guesswork. Ranking was used as the response to one question.

Figure 4: Likelihoods Schedule

Question 1

In England, we have an Appeal Court system which normally operates with a panel of three senior judges, who are chosen from a larger pool of judges, in order to ensure that panel membership varies. Some other countries apply a similar system, but use larger panels of judges (eg seven) for similar purposes.

Suppose that the pool of senior judges eligible for selection numbers ten.

Which system is likely to produce a greater number of different panels of judges?

Question 2

Mr Tyne and Mr Tees are barristers. They were appointed to act for the plaintiffs in two different cases in the same court house, both cases being scheduled to start at 10 am. They share a taxi to the court house, but are delayed when this taxi is involved in a minor road accident. They arrive at the court at 10.45 am.

On arrival, Mr Tyne is told that his case was called on time and was struck out in his absence. The judge has now left to play golf.

Mr Tees is told that his case was called just 5 minutes ago and has also been struck out in his absence. The judge has also now left to play golf.

Who is likely to be more upset?


Question 3

Twelve cases were dealt with today in Toontown Court, six in court A and six in court B. The judge in court A found the first three defendants liable (L) and the last three defendants not liable (N). The judge in court B found the first defendant liable, the second and third defendants not liable, the fourth defendant liable, the fifth defendant not liable, and the sixth defendant liable.

ie Court A: LLLNNN
   Court B: LNNLNL

Assume that you have no more information about the cases or the judges. Which set of judgments is more likely?

Question 4

Roger is 19 years old. He is intelligent but not very creative. He has always wanted to be a solicitor. At school, he excelled in English and history, but was rather weak in mathematics. Last year, he passed three ‘A’ Levels, obtaining good grades in each.

Rank these four statements, ie put them in order. (Put 1 for the most likely statement, 2 for the second most likely, and so on.)

(a) Roger is a law student
(b) Roger is a student
(c) Roger likes listening to jazz
(d) Roger is a law student who likes listening to jazz

Question 5

Four lawyers attend a meeting at a client’s office. Two are solicitors and two are barristers. They all arrive separately.

[The wording of parts (a) and (b) is missing from this copy. Response options for each part: 1 in 1; 1 in 2; 1 in 3; 1 in 4]


Question 6

A taxi-cab was involved in a hit and run accident at night. Two cab companies, Green Cabs and Blue Cabs, operate in the city, and they both deny responsibility. An action for damages is brought against Blue Cabs. The court is given this evidence:

(a) 85% of the cabs in the city are Green and 15% are Blue; and
(b) An independent witness identifies the guilty cab as being Blue.

The court tests the reliability of the witness when attempting to identify the colours of cabs in the dark, ie under the same circumstances that existed on the night of the accident. The court concludes that the witness has correctly identified each colour 80% of the time, but that he has failed to do so 20% of the time.

Should the Blue Cab company be held liable for the accident?

[Response boxes partly illegible in this copy; the one legible option is ‘Not Liable’]

Give reason(s) for your answer: ............................................................

RESULTS

Question 1: Selecting appeal court panels

The objectively correct answer to this question is ‘Neither’. For every different set of three judges who are selected, there is a residual set of seven who are not selected. Consequently, the different panels of size three that could be chosen are just as prolific as those of size seven. For example, selecting judges ‘A’, ‘B’ and ‘C’ leaves ‘D, E, F, G, H, I and J’, and choosing ‘A’, ‘B’ and ‘D’ leaves ‘C, E, F, G, H, I and J’, etc.
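The pairing argument can be checked by listing the panels directly. The following Python sketch labels the ten judges ‘A’ to ‘J’ as in the example above; the variable names are illustrative only.

```python
from itertools import combinations
from math import comb

JUDGES = "ABCDEFGHIJ"  # the pool of ten senior judges

panels_of_three = list(combinations(JUDGES, 3))
panels_of_seven = list(combinations(JUDGES, 7))

# Each panel of three leaves behind exactly one panel of seven (the unselected judges).
leftovers = {tuple(j for j in JUDGES if j not in panel) for panel in panels_of_three}

print(len(panels_of_three), len(panels_of_seven))  # 120 120
print(leftovers == set(panels_of_seven))           # True
print(comb(10, 3) == comb(10, 7))                  # True
```

The second check shows that the unselected judges from the panels of three are exactly the panels of seven, which is why neither system can produce more different panels than the other.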

Despite this, many people give the answer ‘Three’ to this question. In the research literature, this has sometimes been attributed to their over-reliance on the availability heuristic, suggesting that it occurs because it is easier for people to think of different groups of three than of seven (the former being more ‘available’ mentally). Whatever the reason, failure to understand the principles of selection (building blocks for probabilistic understanding) will inevitably result in subsequent comparisons of likelihood being flawed.

The lawyers’ responses to this question were extremely poor (in the order of only 10% of them being objectively correct). Both mathematics groups were very susceptible to the predicted error type (ie answering ‘Three’ rather than ‘Neither’). This was the only item in the test where the lawyers’ mathematics background was significantly associated (in statistical terms²¹) with their response to the question. Perhaps surprisingly, those lawyers who had studied mathematics beyond compulsory level were significantly more likely than members of the ‘lower mathematics group’ to be drawn towards the availability response ‘Three’ (81% and 71% respectively).

21. At the 5% level of significance.


Question 2: The reactions of the two barristers arriving late for court

The objectively correct answer to this question is ‘Neither’. Both barristers were late; both had their cases struck out; and both would have to apply for reinstatement. However, the tendency is for people to say that Mr Tees, who just missed getting there in time, would be more upset. This tendency was certainly observed in the present study, notwithstanding the fact that a few respondents added a footnote to the effect that it was the barristers’ clients who would, in fact, be most upset.

Some researchers have described this as another misapplication of the availability heuristic. Others have used the term ‘simulation heuristic’. The suggestion is that the response is chosen based on the ease with which it is perceived that the past could be undone - the ‘if only’ syndrome. In other words, respondents find it easier to imagine Mr Tees arriving early enough. This may be similar to the exaggerated feeling of irritation that you feel when the telephone stops ringing while you are fumbling to unlock the front door. Hearing that you missed a phone call much earlier in the day while you were out does not usually evoke the same degree of frustration. It might be argued that such reasoning could result in a failure by lawyers and jurors alike to appreciate the relative significance or impact of different pieces of information, or indeed in errors on the part of the lawyers in predicting jurors’ reactions to different items of evidence.

Once again, although they fared better than they did with question 1, the lawyers’ responses to this question still inclined towards the error type described in the research literature. The mathematics backgrounds of the lawyers made no significant difference here. Both groups performed very similarly. Overall, 60% of the 168 lawyers replied that Mr Tees would be more upset, with only 38% choosing the objectively correct response.

Question 3: Sequence of judgments in courts A & B

Again, the objectively correct answer to this question is ‘Neither’, because the regular sequence of judgments is just as likely as the irregular sequence. However, people tend to assert that the series of judgments in court B (LNNLNL) is more likely than that in court A (LLLNNN). In the research literature this has been said to result from people misapplying the ‘representativeness heuristic’. Researchers have suggested that this occurs because people’s judgment is influenced by the degree to which the sample of observations matches up to their expectations about a population of outcomes. This results in information about the order of events not being ignored by respondents, although it is irrelevant in this particular test item.
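The equality of the two sequences can be verified for any assumed chance of a ‘liable’ finding. The Python sketch below rests on assumptions that are not spelled out in the question itself: the judgments are treated as independent, and every case has the same probability of a ‘liable’ finding. The three probability values tried are arbitrary.

```python
from fractions import Fraction


def sequence_probability(sequence, p_liable):
    """Probability of one exact sequence of independent L/N judgments."""
    prob = Fraction(1)
    for outcome in sequence:
        prob *= p_liable if outcome == "L" else 1 - p_liable
    return prob


# Arbitrary illustrative values for the chance of a 'liable' finding.
for p in (Fraction(1, 2), Fraction(3, 10), Fraction(4, 5)):
    court_a = sequence_probability("LLLNNN", p)
    court_b = sequence_probability("LNNLNL", p)
    print(p, court_a, court_b, court_a == court_b)
```

Both sequences contain three ‘L’s and three ‘N’s, so under these assumptions the same six factors are multiplied together in each case and only their order differs.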

This example is actually of far more significance than might appear at first sight, because it concerns people’s sense of what occurs randomly, or by chance, as opposed to what is ‘caused’.²² The lawyers did considerably better on this item than on the first two. The difference between the two mathematics groups was not statistically significant, although the ‘higher mathematics group’ did appear to do slightly better with 77% of them, compared to 69% of the ‘lower mathematics group’, giving the objectively correct response. Overall, 66% of the 168 lawyers chose the correct answer. This still left one in three of them who did not, most of them (27%) making the error predicted by the representativeness fallacy explanation, ie believing wrongly that a sequence of events had a specific cause.

22. The ideas of randomness and causation should not be assumed to be universal. There is certainly a cultural dimension that should not be overlooked when considering people’s understanding of likelihoods, probability, etc. In some regions, for example, there is no understanding of ‘chance’, and everything is thought to have a cause, whether it be the influence of the god(s), or that of the local witch doctor, etc.

Question 4: Comparative likelihoods - law students who like jazz

This question attempts to test what has been described as the ‘conjunction fallacy’, under which people tend to rate certain types of conjunctive events as more likely than their parent stem events. In the present study (see Figure 5), all ‘Law students who like jazz’ are necessarily part of ‘Law students’ who are in turn part of ‘Students’. The relative size of the population of people who ‘Like jazz’ is not important. What matters is the ordering of the other three statements, which must be ranked in descending order of likelihood, ie ‘Student’, ‘Law student’, and ‘Law student who likes jazz’. The lawyers’ rankings were evaluated, both paired and overall, allowing for the fact that there was more than one ‘correct’ answer, depending on each respondent’s view of the popularity of jazz.
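The nesting of the three categories fixes their order of likelihood whatever the actual numbers are. The Python sketch below uses head-counts that are entirely hypothetical (they are not data from the study) simply to show the subset relationship at work.

```python
from fractions import Fraction

# Hypothetical head-counts, invented purely for illustration.
POPULATION = 10_000
STUDENTS = 2_000
LAW_STUDENTS = 150               # every law student is also a student
LAW_STUDENTS_LIKING_JAZZ = 40    # every one of these is also a law student


def likelihood(count):
    """Proportion of the whole population falling in a category."""
    return Fraction(count, POPULATION)


p_student = likelihood(STUDENTS)
p_law = likelihood(LAW_STUDENTS)
p_law_and_jazz = likelihood(LAW_STUDENTS_LIKING_JAZZ)

# A subset can never be larger than the set that contains it.
print(p_law_and_jazz <= p_law <= p_student)  # True
```

Changing the counts alters the three proportions but cannot reverse their order, provided each group remains a subset of the one before it.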

Figure 5: Example of relative population sizes of conjunctive characteristics

[Diagram key: marked region = Law students who like jazz]

Here, the standard of the lawyers’ responses was not very different from that for question 3, with approximately one in three of the lawyers failing to order all the categories correctly. Of the lawyers, 85% realised that ‘Student’ was most likely, but only 65% were correct with the least likely statement, ie ‘Law Student who Likes Jazz’. The rest over-estimated its likelihood. The level of the lawyers’ mathematics backgrounds was not a significant influence on success with this question, although there is some evidence to suggest that the ‘higher mathematics group’ fared a little (but not significantly) better. Seventy-five per cent of this group (compared with only 63% of the ‘lower mathematics group’) were correct with all their rankings.


The lawyers’ errors on this item prompt the question of how this relates to the prosecutor’s fallacy outlined earlier. In the present study, about a third of the respondents tended to over-estimate the likelihood of conjunctive events. In the prosecutor’s fallacy, however, the converse seems to happen, with people tending to be over-influenced by the perceived un-likelihood of conjunctive events, ie many things in conjunction seem so unlikely as to be very noteworthy or ‘significant’. At the very least, there appears to be evidence that people are inconsistent in the way they interpret low probabilities arising from conjunction of events, bending their significance to suit their purposes in adversarial contexts.

Question 5: Arrival of the solicitors and barristers

The objectively correct response to both parts of this question is ‘one in three’. This can easily be demonstrated by counting all the possible outcomes and excluding those that are impossible on the given facts (eg ‘given that a solicitor arrives first’) in the two scenarios. This is shown below for both parts of the question, with the three possible outcomes in each case marked with an asterisk, only one of which has the solicitors arriving both first and second.

(a) Probability (solicitor second, given solicitor first) = 1/3: ssbb* sbsb* sbbs* bssb bsbs bbss

(b) Probability (solicitor first, given solicitor second) = 1/3: ssbb* sbsb sbbs bssb* bsbs* bbss

The first part of this question tends to pose fewer problems. Given that the first ‘s’ is fixed, there are one ‘s’ and two ‘b’s left, so the chance of the second arrival being a solicitor is correctly seen to be one in three. The second part of the question tends to be more counter-intuitive. Again, though, the correct answer can be shown to be one in three.
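The counting argument can be reproduced mechanically. The following is a minimal sketch, assuming (as the enumeration above implies) that two solicitors and two barristers arrive one at a time and that every arrival order is equally likely. The helper `conditional` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# All distinct arrival orders of two solicitors ('s') and two barristers ('b').
orders = sorted({"".join(p) for p in permutations("ssbb")})


def conditional(event, given):
    """Pr(event | given), found by counting equally likely arrival orders."""
    possible = [o for o in orders if given(o)]         # outcomes not excluded by the facts
    favourable = [o for o in possible if event(o)]     # those in which the event also occurs
    return Fraction(len(favourable), len(possible))


def first_is_solicitor(order):
    return order[0] == "s"


def second_is_solicitor(order):
    return order[1] == "s"


part_a = conditional(second_is_solicitor, given=first_is_solicitor)
part_b = conditional(first_is_solicitor, given=second_is_solicitor)
print(orders)
print(part_a, part_b)
```

Both parts reduce to the same count: three arrival orders survive the given fact, and only one of them has solicitors in both of the first two places.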

Question 5(a) appears to have been easy for the lawyers, with 87% of the ‘lower mathematics group’ and 95% of the ‘higher mathematics group’ selecting the objectively correct response. Both groups of lawyers, however, did significantly worse (in statistical terms23) on part (b). Overall, 84% of the 168 lawyers got the first answer correct, but only 55% succeeded on the second part, with 25% choosing ‘one in four’ and 17% choosing ‘one in two’. Mathematics background was not a statistically significant factor, although the ‘higher mathematics group’ appeared to do slightly worse. Only 50% of these respondents, as opposed to 64% of the ‘lower’ group, gave ‘one in three’ as their answer. Readers might wish to speculate on what reasoning might have led some of the respondents to choose ‘one in four’, or why others selected ‘one in two’.

Some researchers have interpreted this discrepancy between performance on the two parts of the item as evidence of respondents’ inconsistency when confronted with inferential asymmetries, ie it may be easier to reason the ‘forward influence’ of events rather than their ‘backward influence’. It has also been described as another example of ‘confusion of the inverse’ (see Figure 2). In fact, it seems to the authors that a simpler way of describing the respondents’

23. At the 1% level of significance.


basic difficulty is that they fail to formulate the original problem and/or fail to enumerate all the possible outcomes.24 This is important, especially as the lack of these skills can also be seen to underpin other types of misconception, irrespective of the more specific error labels used in the research literature, which are neither universally nor unambiguously applied. As we saw earlier, in the examples of flawed court case decisions, a thorough grasp of conditional probabilities is vital for lawyers.

Question 6: Taxi cab problem

Objectively, this problem is easy to evaluate using Bayes’ theorem.25 It is necessary to compare the probability of the cab being blue, given that the witness identified it as being blue, ie Pr(B | ‘B’), with the probability of the cab being green, given that the witness identified it as being blue, ie Pr(G | ‘B’). The result of applying Bayes’ theorem to the data given in this item is that the Blue Cab Company is not liable (see Hawkins and Hawkins26). The likelihood that the cab was blue, given that the witness said it was blue, is actually 0.41 (ie less than 0.5, which is the level of proof required in the Civil Courts).27 Hence ‘on the balance of probabilities’ the Blue Cab Company is not liable for the accident.
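The calculation can be sketched in a few lines, using the figures that appear in this item: 15% of cabs are blue, 85% are green, and the witness is 80% reliable. The sketch assumes that the witness is equally reliable for both colours. The function name `posterior_blue` is hypothetical and is introduced only for this illustration.

```python
def posterior_blue(prior_blue, reliability):
    """Pr(cab was blue | witness says 'blue'), by Bayes' theorem.

    Assumes the witness identifies either colour correctly with the
    same probability (`reliability`).
    """
    prior_green = 1 - prior_blue
    # Pr('B' | B) x Pr(B): a blue cab, correctly identified.
    blue_and_says_blue = reliability * prior_blue
    # Pr('B' | G) x Pr(G): a green cab, mistakenly identified as blue.
    green_and_says_blue = (1 - reliability) * prior_green
    # Pr('B') is the sum of the two ways the witness could come to say 'blue'.
    return blue_and_says_blue / (blue_and_says_blue + green_and_says_blue)


p_blue = posterior_blue(prior_blue=0.15, reliability=0.80)
print(round(p_blue, 2))  # 0.41, ie below the civil standard of 0.5
```

With these inputs the base-rate (85% green) outweighs the witness’ 80% reliability, which is why the result falls below 0.5.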

24. Reducing the two problems to comparisons of the number of positive instances to the number of possible instances given certain prior knowledge is similar to the approach used in the tree diagrams shown in Figure 2. There, however, the comparisons were based on given (example) probability values as opposed to numbers of events. Both approaches, though, relied on a formulation of the problem that yielded an exhaustive set of possible outcomes.

25. Bayes’ theorem is a formal rule for reviewing probability assessments when additional information is available. In particular, in legal contexts, it may be used to derive probabilities based on a combination of subjective beliefs and statistical evidence in the form of relative frequencies. Its importance lies in providing a link between, for example, the Pr(guilt) and the Pr(guilt | evidence). The formal application of Bayes’ theorem to item 6 follows:

\[ \Pr(B \mid \text{‘B’}) = \frac{\Pr(\text{‘B’} \mid B) \times \Pr(B)}{\Pr(\text{‘B’})} \tag{1} \]

Similarly,

\[ \Pr(G \mid \text{‘B’}) = \frac{\Pr(\text{‘B’} \mid G) \times \Pr(G)}{\Pr(\text{‘B’})} \tag{2} \]

Dividing (1) by (2) allows us to compare the probability of the cab being blue with the probability of the cab being green given the evidence of the witness who says ‘B’, ie ‘Blue’. This comparison is known as a likelihood ratio. It is not difficult to see how helpful this likelihood ratio is in determining an outcome depending on the balance of probabilities. If we replace B with Liable, G with Not Liable, and ‘B’ with Evidence in the above formulae, the resulting likelihood ratio gives a direct comparison of the relative likelihoods of liability and non-liability given the evidence.

26. P J Hawkins and A S Hawkins ‘Bayes Watch: Balancing the Probabilities’ (1992) 111(6) Law Notes 26-27.

27. Pr(B | ‘B’)/Pr(G | ‘B’) = (0.8 × 0.15)/(0.2 × 0.85) = 0.12/0.17. This gives the ratio of the likelihoods of the two possibilities. The actual probability of the cab being Blue, given that the witness said it was Blue, is Pr(B | ‘B’) = 0.12/(0.12 + 0.17) = 0.41.


Some researchers have attributed people’s errors with this type of item to their inability to recognise the relevance of base-rate information (ie the relative frequencies of green and blue cabs). One suggestion is that people tend to ignore base-rates if they interpret them as being incidental information rather than causal. In fact, in this example, the base-rate information (85% and 15% respectively) is more extreme than the witness is credible, because the witness’ evidence is only 80% reliable. However, it has been argued that if respondents perceive the base-rate information about the distribution of cab colours to be incidental they will rely too heavily, or exclusively, on the witness’ evidence and therefore decide that the Blue Cab Company is liable.

This is the item on the Likelihoods Schedule that would appear to be of most immediate relevance to the lawyers, and yet the standard of the lawyers’ answers was very poor. Only 44% chose the correct answer (‘not liable’), while 41% were incorrect (choosing ‘liable’) and 14% stated that they did not know the answer. These results were not significantly different from those that would be expected if the respondents had merely been guessing. Research (eg Bar-Hillel28) has shown that many people get this type of problem wrong if they rely on their intuitions, so the lawyers in the present study were not unusual in this respect. However, if neither prospective members of the pool of jurors nor lawyers can cope with reasoning of this kind, the standard of decision-making in court cases may be seriously compromised, as it was in the Dutch case described by Wagenaar.29

In this case, in which Wagenaar gave expert testimony, the issue was whether the vehicle involved was being driven by the male or the female occupant. The outcome should have been decided by comparing the probabilities Pr(Male driver | ID evidence) and Pr(Female driver | ID evidence). The judges, however, seized on an estimate provided by the expert witness of Pr(ID evidence | Female driver). Wagenaar was then unable to persuade them that this was only one component out of the four that were needed to derive the required likelihood ratio of probabilities.30 In fact, the judges made at least three separate errors; they confused the inverse, they ignored base-rate information, and they failed to understand that a likelihood ratio depends on two probabilities and not just one. The accused in this case was, perhaps fortunately, found not guilty, but for totally the wrong reasons. This is certainly not a standard of decision-making that these authors would wish to see endorsed.
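Why one component is not enough can be illustrated with a minimal sketch. All the numbers below are hypothetical and are not taken from the case or from Wagenaar’s testimony. The function names are likewise introduced only for this illustration. The same value of Pr(ID evidence | Female driver) is combined with two different sets of the other three components.

```python
def posterior_odds_male(prior_male, p_id_given_male, p_id_given_female):
    """Odds of a male driver against a female driver, given the ID evidence.

    posterior odds = prior odds x likelihood ratio, so four quantities are
    involved: Pr(male), Pr(female), Pr(ID | male) and Pr(ID | female).
    """
    prior_odds = prior_male / (1 - prior_male)
    likelihood_ratio = p_id_given_male / p_id_given_female
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of an outcome into a probability."""
    return odds / (1 + odds)


# Hypothetical figure for the single component the judges relied on.
p_id_given_female = 0.10

# Two hypothetical sets of values for the remaining components.
odds_1 = posterior_odds_male(0.5, 0.90, p_id_given_female)
odds_2 = posterior_odds_male(0.2, 0.05, p_id_given_female)

print(round(odds_to_probability(odds_1), 2))
print(round(odds_to_probability(odds_2), 2))
```

Under the first set of assumptions a male driver is much the more probable; under the second a female driver is, although Pr(ID evidence | Female driver) is identical in both.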

28. M Bar-Hillel ‘The Base-Rate Fallacy in Probability Judgments’ (1980) 44 Acta Psychologica 211-233.

29. A Wagenaar ‘The proper seat: A Bayesian discussion of the position of expert witnesses’ (1988) 12 Law and Human Behaviour 499-510.

30. Applying Bayes’ theorem to the Grosscamp case, the required likelihood ratio of posterior probabilities is derived as:

\[ \frac{\Pr(\text{male} \mid \text{‘male’})}{\Pr(\text{female} \mid \text{‘male’})} = \frac{\Pr(\text{male})}{\Pr(\text{female})} \times \frac{\Pr(\text{‘male’} \mid \text{male})}{\boldsymbol{\Pr(\text{‘male’} \mid \text{female})}} \]

In fact, the judges only considered Pr(‘male’ | female), shown in bold, confusing this with its inverse Pr(female | ‘male’); ignoring base-rate information, Pr(male) compared to Pr(female), the prior probabilities of male and female drivers in the Netherlands; and failing to realise that a likelihood ratio required the comparison of two probabilities, in this case Pr(male | ‘male’) and Pr(female | ‘male’).


In the present study, when the authors introduced the extra requirement that the lawyers explain the reasons for their choice of answer it became clear that, in addition to admitted ‘honest guesses’, some very dubious and inadequate reasoning was taking place, irrespective of whether the respondent chose the correct answer or not. Frequently, this reasoning accorded neither with the legal principles that these students had been taught31 nor with objective probability. Although the ‘higher mathematics group’ performed slightly better than the ‘lower mathematics group’ (48% and 43% correct respectively) this difference was not statistically significant. Both groups were as good (or, more importantly, as bad) as each other.

DOES MATHEMATICS BACKGROUND PREDICT THE ABILITY TO DEAL WITH UNCERTAINTY?

On all of the items the lawyers’ responses were significantly different in statistical terms32 from what would be objectively correct. The question then arises as to whether the lawyers’ mathematical background was a good predictor of how well they would perform on the Likelihoods Schedule. The answer to this question is ‘no’. The observed problems generally occurred irrespective of the lawyers’ mathematical backgrounds. Put more bluntly, their mathematics training had failed to equip the lawyers with the skills and understanding that they needed for making judgments in the face of uncertainty. Clearly, something needs to be done if lawyers are to be effective in this crucial aspect of their professional duties.

DISCUSSION

Koehler33 made a number of recommendations for tackling this problem. First, he stated that there should be statistical education for lawyers, in which law students, practising attorneys and judges should be exposed to elementary probability theory, using real cases as examples. Secondly, he advocated the introduction of courtroom education, involving lectures for jurors where appropriate and the use of charts (eg Bayesian) as decision aids. Thirdly, he saw the need for research into jurors’ probabilistic reasoning and into the impact of any instructional aids that are used.

There is much to be said for Koehler’s suggestions, and their adoption would assist the legal system to produce just decisions. Nor are his ideas entirely unprecedented. In 1970, more than a quarter of a century ago, Finkelstein and Fairley34 argued that the use of Bayes’ theorem in a trial setting would be helpful to enable jurors to assess certain types of probabilistic evidence. See

31. For example, it was clear that some respondents were applying the criminal standard of proof in what was a civil case.
32. At the 0.01% level of significance.
33. See above n 11.
34. M O Finkelstein and W B Fairley ‘A Bayesian Approach to Identification Evidence’ (1970) 83 Harv LR 489.


also Tribe.35 However, in England and Wales at least, this sensible course of action may have been precluded for the time being as a result of the rulings in R v Adams36 and R v Adams (No 2).37 The Adams case involved a prosecution for rape, where the prosecution case was based on statistical evidence about a DNA profile. The rest of the evidence was not statistical. It was clear that the jurors needed a method of combining, or weighing, the various items of evidence. This method had to accommodate statements of the kind ‘there is one chance in two hundred million of a random, unconnected DNA match’, with ‘I was at home with my girlfriend’ and ‘Yes, he was with me all night’, etc.

The defence called an expert witness, a statistician, who asserted that the appropriate, and indeed the only logically consistent, way to reach a conclusion based on such disparate types of evidence was to evaluate the relative likelihoods of each item of evidence, given innocence and guilt respectively. The judge and jurors were shown how this could be done by comparing subjectively estimated percentages. It was emphasised that the percentages used were to be chosen entirely at the jurors’ discretion. The jurors were then shown how to use Bayesian techniques to combine these estimates of the weight of each item of qualitative evidence in order to arrive at a figure that could be compared with the DNA likelihood statistic. The court of first instance attempted to follow this statistical procedure, although the judge made it clear that the jury was not under any obligation to do so. There was no suggestion that the expert witness was usurping the function of the jury by answering the ‘ultimate question’.

The accused was convicted, and the case went to appeal, not because Bayes’ theorem had been introduced, nor because attempts had been made to teach the court its use, but rather because the judge made errors in his summing up of the statistical evidence, eg by confusing percentages and statements about how many times it was more likely that something had occurred. However, the Court of Appeal expressed very grave doubts as to whether the evidence of Bayes’ theorem was properly admissible. Their Lordships ruled that:

‘. . . the attempt to determine guilt or innocence on the basis of a mathematical formula, applied to each separate piece of evidence, is simply inappropriate to the jury’s task. Jurors evaluate evidence and reach a conclusion not by means of a formula, mathematical or otherwise, but by the joint application of their individual common sense and knowledge of the world to the evidence before them’.

What their Lordships appeared to be saying is that juries have a duty to be unscientific. Even where evidence can be evaluated objectively, the jury must ignore this and take a guess. As the present study has shown, though, when lawyers (let alone jurors) do this, they are capable of gross errors. In the second appeal, the Court of Appeal went even further, stating that expert evidence should not be admitted that might encourage jurors to attach mathematical probability values to items of qualitative evidence. Their Lordships stated that evaluating such evidence is the sort of task that jurors perform every day, carefully and conscientiously, along conventional (intuitive) lines. It is noteworthy that they

35. L H Tribe ‘Trial by Mathematics: Precision and Ritual in the Legal Process’ (1971) 84 Harv LR 1329.
36. R v Adams [1996] 2 Cr App R 467.
37. R v Adams (1997) Times, 3 November.


did not add that jurors would by these means be guaranteed to perform their duties competently.

Until we can guarantee that every member of society is statistically literate, the Court of Appeal rulings in R v Adams are bound to result in jurors making incorrect decisions. They may do this because they themselves lack relevant skills and understanding, or because they are misled by lawyers. What else can be expected when the blind attempt to lead the blind? Prevention, of course, would be preferable to cure. Ideally, statistical literacy should be seen as an immediate priority of everyone’s education. However, given that this clearly does not pertain at present, then remediation strategies like those recommended by Koehler38 are the only realistic alternative. Statistical education must be introduced for lawyers, and the Court of Appeal rulings should be challenged. To prohibit any statistical ‘training’ of jurors will be to ignore the fundamental point that achieving the right decision in any court case matters more than how that decision is reached. After all, ‘the basic purpose of a trial is the determination of truth’.39

Their Lordships are also reported as saying that ‘we had never heard it suggested that a jury should consider the relationship between . . . scientific evidence and other evidence by reference to probability formula[e]’.40 In fact, there is a growing body of literature on the subject with which lawyers should be familiar. Although it is perhaps not surprising that British judges would not have read American literature on this topic, presumably they should at least be aware of the implications of books (eg Aitken41) and articles published in Britain (eg Hawkins and Hawkins,42 Evett,43 this last author being the Head of Interpretation Research at the Home Office Forensic Science Service, who has written extensively on this very subject). Apart from the unsatisfactory nature of the Court of Appeal ruling itself, the inadequate preparation of judges to try cases involving statistical evidence is further indicated by the fact that the Adams case went to appeal at all, ie because the judge in the court of first instance made statistical errors.

According to Lupton,44 Laplace (1749-1827)45 defined the theory of probability as ‘common-sense reduced to calculation’. Lupton argued, therefore, that instruction in common sense and calculation should form part of boys’ general education. The argument is in no way diminished in our present times, although we would, of course, include girls. No doubt their Lordships might have

38. See above n 11.
39. Tehan v US, ex rel Shott (1966) 382 US 406 at 416.
40. See above n 36 at 481.
41. Above n 6.
42. Above n 26.
43. I W Evett ‘Bayesian inference and forensic science: problems and perspectives’ (1987) 36(2) The Statistician 99-105.
44. S Lupton ‘On the educational value of the theory of probability’ (1892) 276 Journal of Education 357-359.
45. Probably the most accessible reference is P Simon, Marquis de Laplace A Philosophical Essay on Probabilities, a translation of Essai Philosophique sur les Probabilités (1812). This introduction (153 pages) to Théorie Analytique des Probabilités (645 pages) has been translated by F W Truscott and F L Emory (New York: Dover, 1951) with an introductory note by E T Bell.


made a better ruling if they had considered the following (cited, unattributed, in Lupton):46

‘It is more and more generally perceived that the facts and theories about which we can ever hope to obtain any positive proof are very few in comparison to the great majority, which we can only hope to render more or less probable. Hence, the domain of the theory of probability is perpetually increasing, and the more and more necessary is it to investigate how far, by its aid, we are able to obtain valuable conclusions from probable, and not certain, data. The more ignorant a man is, speaking generally, the more certain he is of the correctness of conclusions derived by invalid methods from incorrect premises.’

Lawyers might further appreciate Lupton’s later point relating to problems over jury-type decisions, which he acknowledged to be difficult (if not ‘insuperable’) because ‘all the voters [jurors] must be assumed to be exactly equal in their power and intention of judging correctly; and, in practice, many apparently small circumstances influence the decision’. Lupton suggested that ‘a judge would, nowadays [1892], hardly follow the example of Lord Eldon, and allow a man to be hanged for horse-stealing because he was said to have skeleton keys suitable for turnpike gates in his pocket’. Can we be so sanguine a century later? The evidence would suggest not. It is safer by far if judges, and jurors, are shown how to apply what we know about probability rules to achieve the decision that is more likely to be correct.

CONCLUSIONS AND STRATEGIES FOR THE FUTURE

The present study has shown that the assessment of probabilities causes real problems for lawyers, and that these problems persist even when they have received considerable amounts of formal mathematical education. This could be due to defects in the school mathematics syllabus, or perhaps it is the way in which statistical concepts are typically taught that is the problem. The lawyers certainly did significantly worse in statistical terms than the group of statistics teachers,47 so possibly it is the type of mathematics training that is a more influential factor than the level at which mathematics has been studied. However, what remains clear is that mathematics education is currently failing those pupils who later become lawyers, because they are not learning those probability concepts that they need in order to do their work effectively. It is also failing those people who become jurors or who are otherwise involved in, or dependent upon, the outcomes of judicial processes.

Until this situation can be rectified, the present authors concur with Koehler’s recommendations,48 namely that there is a need for:

Statistical education for lawyers.
Courtroom education in relevant aspects of statistics and probability.
Research into jurors’ probabilistic reasoning.

46. Above n 44.
47. At the 0.01% level of significance.
48. Above n 11.


to which we would add:

Further research into the impact of instructional methods and materials.

However, while remedial education in the courtroom is effectively precluded by the Court of Appeal in R v Adams, the only short-term education strategy would seem to be one that focuses on preparing lawyers more adequately. This could be achieved by introducing relevant aspects of statistics and probability into the pre-service legal education syllabus, as part of law degrees or in the Legal Practice and Bar Vocational Courses. Continuing professional education courses are also an obvious way of enhancing the skills and understanding of in-service, and more senior, lawyers. The emphasis placed by the Dearing Report49 on the development of key skills such as communication and numeracy, and of cognitive skills such as the understanding of methodologies or ability in critical analysis, would be wholly commensurate with such advances in legal training.

A key issue arises concerning how best to implement such legal curriculum developments. Koehler suggests the importance of using real cases. Certainly, this approach accords with current thinking about statistical education. For many, statistics (and especially probability) can seem to be counter-intuitive. It is therefore very important that the methods and materials used to teach lawyers should reflect what is known about how non-specialist students learn and are best taught. There is an extensive and highly relevant (research) literature in the field of statistical education that should not be overlooked.

The current authors feel that it is important to adopt teaching strategies which take account of the faulty intuitions with which lawyers start out. However, they also suggest that cognitive research into errors and misconceptions that emphasises labels such as ‘inferential asymmetry’, ‘confusion of the inverse’, ‘representativeness heuristic’ and ‘base-rate fallacy’, etc, probably lends itself more to remediation than to prevention or pedagogic strategies. It might be more effective to adopt teaching strategies that will counter people’s inability to formulate probability models, and their failure to enumerate all the alternatives when developing representations or analogies of reality. Indeed, this should be an approach that would find favour with lawyers, because it has much in common with the methodical attention to detail that is normally expected of them when they extract and present key aspects of cases. Encouraging a more pedantic approach to lawyers’ interpretation of uncertainty, probabilistic and statistical information may well be beneficial. This particular teaching strategy was one that was advocated by the late Leslie Glickman (see Hawkins et al50) as being likely to yield more generalisable, or transferable, skills and understanding.

Certainly, there is evidence (including the findings of the present study) to suggest that the kind of mathematics education that typically emphasises the manipulation of models is not an effective way of developing statistical literacy. Green51 advocated that there should be more emphasis on teaching people how

49. Higher Education in the Learning Society (1997). Report of the National Committee of Inquiry, chaired by R Dearing (Full report http://www.leeds.ac.uk/educol/ncihe/).
50. See above n 19.
51. D Green Probability Concepts in 11-16 year old Pupils (Loughborough: Centre for Advancement of Mathematical Education in Technology, University of Technology, Loughborough, 2nd edn, 1982).


to construct probabilistic models of reality. Attempts to implement probabilistic training for lawyers should take account of this and other recommendations of those well versed in statistical education research. Statistical education of non-specialists has had a rather chequered history, and developments in the training of lawyers need not, and should not, follow the less than optimal teaching practices of earlier times. As always, in advocating educational innovations, though, the question arises as to who will teach the teachers. At present there are relatively few academic lawyers with an interest in, and experience of, statistical methods. There are even fewer who have been involved in statistical education. It is important, therefore, that there should be co-operation between law teachers and statisticians, and that that co-operation should also involve experts in statistical education.