
Transitive reasoning distorts induction in causal chains

Momme von Sydow 1,4,5 & York Hagmayer 2 & Björn Meder 3

© Psychonomic Society, Inc. 2015

Abstract A probabilistic causal chain A→B→C may intuitively appear to be transitive: If A probabilistically causes B, and B probabilistically causes C, A probabilistically causes C. However, probabilistic causal relations can only be guaranteed to be transitive if the so-called Markov condition holds. In two experiments, we examined how people make probabilistic judgments about indirect relationships A→C in causal chains A→B→C that violate the Markov condition. We hypothesized that participants would make transitive inferences in accordance with the Markov condition although they were presented with counterevidence showing intransitive data. For instance, participants were successively presented with data entailing positive dependencies A→B and B→C. At the same time, the data entailed that A and C were statistically independent. The results of two experiments show that transitive reasoning via a mediating event B influenced and distorted the induction of the indirect relation between A and C. Participants’ judgments were affected by an interaction of transitive, causal-model-based inferences and the observed data. Our findings support the idea that people tend to chain individual causal relations into mental causal chains that obey the Markov condition and thus allow for transitive reasoning, even if the observed data entail that such inferences are not warranted.

Keywords Transitivity · Causality · Markov condition · Mixing of causal relationships · Probabilistic reasoning · Causal induction · Knowledge-based induction · Causal learning · Causal coherence · Transitive distortion effects · Categorization

Transitive reasoning enables judgments about unobserved relationships based on indirect evidence. If one observes that object A is heavier than object B, and that B is heavier than C, one can infer that A is heavier than C. Not all relations, however, are transitive. If A is the mother of B, and B is the mother of C, this does not mean that A is the mother of C.

We investigate whether and to what extent people reason transitively about causal relations, even when the conditions for transitive inferences do not hold true. We focus on probabilistic causal chains of the type A→B→C, where individual relations A→B and B→C can be combined to form a chain A→B→C to make probabilistic inferences from the chain’s initial event A to the terminal event C. First, we specify the conditions under which transitive reasoning in causal chains is valid. We then report the findings of two experiments investigating whether people make transitive inferences even when the available data entail that such inferences are not warranted.

Our research builds on the idea that people represent the world in terms of mental causal models (Sloman, 2005; Waldmann, 1996; Waldmann, Cheng, Hagmayer, & Blaisdell, 2008; Waldmann & Hagmayer, 2001) that share key characteristics with causal Bayes nets, which originated in philosophy and machine learning (Pearl, 2000; Spirtes, Glymour, & Scheines, 1993). The key question of the present research is whether judgments about indirect relations in causal chains are influenced by transitive reasoning even when the data show that the chain is intransitive.

* Momme von Sydow [email protected]; [email protected]

1 Department of Psychology, Ruprecht-Karls-Universität Heidelberg, Hauptstr. 47, 69117 Heidelberg, Germany

2 Department of Psychology, University of Göttingen, Göttingen, Germany

3 Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development, Berlin, Germany

4 Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, Heidelberg, Germany

5 Munich Center for Mathematical Philosophy (MCMP), University of Munich, Munich, Germany

Mem Cogn DOI 10.3758/s13421-015-0568-5

Transitive reasoning in causal chains

When is transitive reasoning in causal chains valid? Consider a researcher conducting two studies with knockout mice to investigate the causal relations between a certain gene, the level of a particular neurotransmitter, and a behavioral phenotype. The first study finds that knockout mice tend to have an elevated neurotransmitter level,

P(elevated transmitter|knockout mice) > P(elevated transmitter|normal mice)

The second study finds that an elevated transmitter level raises the probability of showing anxious behavior, that is,

P(anxiety|elevated transmitter) > P(anxiety|no elevated transmitter)

Given these findings, what can be concluded regarding the relation between gene and behavior? A transitive inference may seem intuitively plausible, namely, inferring that knockout mice are more likely to show anxiety than normal mice, that is,

P(anxiety|knockout mice) > P(anxiety|normal mice)

Note, however, that neither study has directly assessed this indirect relation. Rather, the two direct relations are integrated into a causal chain that guides the inference about the indirect relation.

One way to formalize causal chains is to use causal Bayes net theory (Pearl, 2000; Spirtes et al., 1993). The framework couples directed acyclic graphs with probability distributions, with the directed edges representing the causal dependencies between the domain variables. A central assumption of the causal Bayes net framework is the causal Markov condition, which states that a variable conditioned on its direct causes is independent of all other variables in the causal network, except for its causal descendants (Hausman & Woodward, 1999, 2004; Pearl, 2000; Spirtes et al., 1993; Spohn, 2001). It follows from the Markov condition that the probability distribution over the variables in the causal graph factors such that

$$P(X_i) = \prod_{X_i \in X} P(X_i \mid pa(X_i)), \tag{1}$$

where the joint probability of variables, $P(X_i) = P(X_1, \ldots, X_n)$, is equal to the product of the probabilities of these variables, conditional on $pa(X_i)$, which denotes the set of direct causes of variable $X_i$ in the graph (see, e.g., Hausman & Woodward, 1999, p. 531ff.). Applying the Markov condition to a causal chain A→B→C entails A and C being conditionally independent given B, that is, P(C|B∧A) = P(C|B∧¬A) and P(C|¬B∧A) = P(C|¬B∧¬A). In other words, the probability of C depends only on the state of variable B and not on the state of variable A.

Importantly, if the Markov condition holds, the conditional probability P(C|A) can be calculated from the parameters of the two direct causal relations with Eq. 2:

$$P(C \mid A) = P(B \mid A)\,P(C \mid B) + P(\neg B \mid A)\,P(C \mid \neg B). \tag{2}$$

We refer to such inferences about indirectly related events in causal chains as transitive inferences. Formally, probabilistic transitive inferences in chains are valid if the Markov condition holds (Bonnefon, Da Silva Neves, Dubois, & Prade, 2012). If the Markov condition does not hold, probabilities inferred through transitive inferences via Eq. 2 may deviate from the actual relations in the data.
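The step from the Markov condition to Eq. 2 can be spelled out as follows. This is a sketch that uses only the law of total probability and the conditional independencies stated above; it is not a derivation taken from the cited sources.

```latex
\begin{align*}
P(C \mid A)
  &= P(B \mid A)\,P(C \mid B \wedge A) + P(\neg B \mid A)\,P(C \mid \neg B \wedge A)
     && \text{(law of total probability)} \\
  &= P(B \mid A)\,P(C \mid B) + P(\neg B \mid A)\,P(C \mid \neg B)
     && \text{(Markov condition)}
\end{align*}
```

The second line is licensed only when P(C|B∧A) = P(C|B) and P(C|¬B∧A) = P(C|¬B); when these equalities fail, the first line still holds but Eq. 2 need not.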

Markov violations and category-based transitive inference

The Markov condition enables transitive causal inferences. Its normative and descriptive status, however, is highly disputed. Some advocates of Bayes nets have defended the Markov condition as being a universal characteristic of causal relations in the world or of their representations (Hausman & Woodward, 1999, 2004; Spohn, 2001). Others have criticized the condition’s ontological or epistemological necessity (Arntzenius, 2005; Cartwright, 2001, 2002, 2006, 2007; Sober, 1987, 2001; Sober & Steel, 2012; Steel, 2006). However, even advocates of a universal Markov condition concede that the latter can be violated psychologically when (i) the event categories used are inadequate, (ii) there is a mismatch between causal representations and the true causal structure, or (iii) hidden external variables are correlated (Hausman & Woodward, 1999, 2004; Spohn, 2001). In short, both advocates and critics agree that the Markov condition can be violated in practice when descriptions of the world are incomplete or inadequate.

In this paper, we focus on situations in which the categories used to classify the instances involved in the causal relations result in causal chains that violate the Markov condition and also do not warrant transitive inferences. Consider again the example of the causal relations between a gene, a neurotransmitter level, and a behavioral phenotype. Assume that six mice serve as a sample: three knockout mice and three normal mice (A vs. ¬A). The neurotransmitter level (e.g., elevated vs. normal, B vs. ¬B) and the anxiety level (e.g., high vs. low, C vs. ¬C) of each mouse are assessed. Hence, the six mice are classified according to their genetic makeup, their neurotransmitter level, and their anxiety level. A plausible hypothesis might be that the gene causally influences a mouse’s anxiety through regulating the neurotransmitter level. Figure 1 illustrates these temporally ordered variables and the parameters estimated from the data. Knockout mice are more likely to have an elevated neurotransmitter level than normal mice, P(B|A) = 2/3 > P(B|¬A) = 1/3. Also, mice with an elevated transmitter level are more likely to be anxious than mice with a normal transmitter level, P(C|B) = 2/3 > P(C|¬B) = 1/3. Despite these two positive relations, however, it does not hold that knockout mice are more likely to be anxious than normal mice. In fact, knockout mice are less likely to be anxious than normal mice, P(C|A) = 1/3 < P(C|¬A) = 2/3.¹

Whereas the data indicate a negative relation between gene and behavior, a different conclusion results from transitive reasoning. Inferring the probability P(C|A) from the parameters of the chain’s direct relations via Eq. 2 yields erroneous estimates of P(C|A) = .56 and P(C|¬A) = .44, indicating that knockout mice are more likely to show high anxiety levels than normal mice. This discrepancy results from the causal chain being intransitive due to a violation of the Markov condition. Inferring P(C|A) via Eq. 2 is normally only valid if A and C are independent conditional on B, which is not the case, as P(C|B∧A) = .5, but P(C|B∧¬A) = 1.
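The discrepancy can be checked with a few lines of Python. The sketch below is illustrative only: the six (A, B, C) records are an assumed reconstruction chosen to reproduce the probabilities reported in the text (Fig. 1 remains the authoritative source), and the helper names `cond_prob` and `transitive_estimate` are hypothetical.

```python
from fractions import Fraction

# Six mice as A/B/C records: A = knockout, B = elevated transmitter,
# C = anxious. Assumed reconstruction that reproduces the probabilities
# reported in the text; not copied from the original figure.
ROWS = [(1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 1, 1), (0, 0, 1), (0, 0, 0)]
MICE = [dict(zip("ABC", row)) for row in ROWS]


def cond_prob(target, **given):
    """Relative frequency of `target` among mice matching the `given` values."""
    selected = [m for m in MICE if all(m[k] == v for k, v in given.items())]
    return Fraction(sum(m[target] for m in selected), len(selected))


def transitive_estimate(p_b, p_c_given_b, p_c_given_not_b):
    """Eq. 2: chain the two direct relations, assuming the Markov condition."""
    return p_b * p_c_given_b + (1 - p_b) * p_c_given_not_b


# Direct relations estimated from the data.
p_c_b, p_c_nb = cond_prob("C", B=1), cond_prob("C", B=0)

# Indirect relation: observed directly versus inferred through Eq. 2.
observed = (cond_prob("C", A=1), cond_prob("C", A=0))
inferred = (
    transitive_estimate(cond_prob("B", A=1), p_c_b, p_c_nb),
    transitive_estimate(cond_prob("B", A=0), p_c_b, p_c_nb),
)

# Markov check: these two values would be equal if the condition held.
markov_pair = (cond_prob("C", B=1, A=1), cond_prob("C", B=1, A=0))

print("observed P(C|A), P(C|not A):", observed)
print("inferred P(C|A), P(C|not A):", inferred)
print("P(C|B,A), P(C|B,not A):", markov_pair)
```

Under these assumed records the observed pair is (1/3, 2/3) while Eq. 2 gives (5/9, 4/9), that is, the rounded values .56 and .44 quoted above, so the inferred relation has the opposite sign of the observed one.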

In this example the Markov condition is violated at the category level due to mixing heterogeneous items (or subclasses) with varying deterministic relationships (Cartwright, 2001; Hausman & Woodward, 1999; Spirtes et al., 1993). For instance, for a mouse (here symbolized by a circle) in the top left corner of Fig. 1, the relations A→B→C and A→C hold. Conversely, for the mouse in the bottom left corner, A→¬B→¬C and A→¬C hold. Thus, on the item level there is no discrepancy between the direct relations A→B and B→C and the indirect relation A→C. Causal relations, however, are typically defined at the level of categories, for example, mice that have gene A versus mice that do not. On this level, however, the causal chain A→B→C is intransitive. The key question of our studies is whether people make transitive inferences on the category level even when the observed causal chain is intransitive.

Previous research on transitive inferences in causal chains

Previous research on causal chains has shown that people make transitive inferences from A to C when observing only relations A→B and B→C (Ahn & Dennis, 2000; Baetu & Baker, 2009). Ahn and Dennis (2000) used a sequential learning paradigm, providing participants with evidence on the covariation of events A and B (fertilizer→level of chemicals in soil) intermixed with evidence about events B and C (chemicals→blooming of flower). They additionally studied trial-order effects by varying whether positive or negative evidence for local contingencies (between A and B, and between B and C) was presented first. Participants received no data on the relation between fertilization (A) and blooming (C) but were asked to judge this indirect relation. Average causal judgments were positive, with higher judgments in the positive-evidence-first condition. A primacy effect when constructing the local relations in conjunction with transitive reasoning may explain this.

[Figure 1 not reproduced. Panel labels: A: Genetic makeup, B: Neurotransmitter level, C: Behavior. Annotated parameters: P(B|A) = 2/3, P(B|¬A) = 1/3, P(C|B) = 2/3, P(C|¬B) = 1/3, P(C|A) = 1/3, P(C|¬A) = 2/3.]

Fig. 1 Causal chain A→B→C (gene → neurotransmitter → behavior) with positive relations between A and B and between B and C (solid arrows), but a negative relation between A and C (dashed arrow). Each symbol denotes an individual mouse (or mouse population), which differ in the presence versus absence of the three variables

¹ Likewise, causal strength measures such as ΔP (Allan, 1980; Jenkins & Ward, 1965; cf. White, 2003) or causal power (Cheng, 1997; Griffiths & Tenenbaum, 2005; Meder, Mayrhofer, & Waldmann, 2014) indicate that the individual links are positive, whereas the causal strength for the indirect relation of A and C is negative when marginalizing over B.

Baetu and Baker (2009) investigated the influence of transitive reasoning with chains in more detail. In their studies, positive, negative, and zero contingencies for A→B and B→C were combined. Participants learned about the contingency between two lights A and B (while light C was covered), and between lights B and C (while A was covered); trials occurred intermixed. Subsequently, participants first rated the global A→C relationship and then the local relationships A→B and B→C. Baetu and Baker used ΔP = P(effect|cause) − P(effect|¬cause) as a measure of causal strength. If the causal Markov condition holds, ΔP_{A→C} = ΔP_{A→B} × ΔP_{B→C} (Baetu & Baker, 2009, Appendix). Although participants never observed the contingency A→C, their judgments were consistent with a multiplication of the individual contingencies, indicating transitive reasoning.
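The multiplication rule for ΔP follows directly from Eq. 2. The short derivation below is a sketch of that step under the Markov condition; it is not reproduced from Baetu and Baker's appendix.

```latex
\begin{align*}
\Delta P_{A \to C}
  &= P(C \mid A) - P(C \mid \neg A) \\
  &= \bigl[P(B \mid A) - P(B \mid \neg A)\bigr]\,P(C \mid B)
     + \bigl[P(\neg B \mid A) - P(\neg B \mid \neg A)\bigr]\,P(C \mid \neg B) \\
  &= \bigl[P(B \mid A) - P(B \mid \neg A)\bigr]\,
     \bigl[P(C \mid B) - P(C \mid \neg B)\bigr] \\
  &= \Delta P_{A \to B} \times \Delta P_{B \to C}
\end{align*}
```

The second line applies Eq. 2 to both P(C|A) and P(C|¬A); the third uses P(¬B|A) − P(¬B|¬A) = −[P(B|A) − P(B|¬A)].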

Goals and scope

The studies of Ahn and Dennis (2000) and Baetu and Baker (2009) indicate that in the absence of direct evidence on the relation A→C, people make transitive inferences as if they are inducing causal chains that obey the Markov condition. In these studies, no contradictory data were available, so assuming the Markov condition seems like a reasonable default for learners.

Our research goes beyond these studies by investigating reasoning with intransitive chains in which the Markov condition is violated. Intransitive chains provide a stronger test for the hypothesis that people integrate the links into causal models and tend to reason in a causally coherent way, deviating from the data, as if the Markov condition holds (causal coherence hypothesis). Also, we did not use a sequential learning task (Ahn & Dennis, 2000; Baetu & Baker, 2009; cf. Hebbelmann & von Sydow, 2014; von Sydow, Hagmayer, Meder, & Waldmann, 2010) but presented all items in an overview format. This type of format allows participants to detect that the Markov condition does not hold on the category level, because subclasses of items with different contingencies are mixed. In addition to eliciting probability judgments on the category level, we obtained judgments about individual items to investigate the relationship between category-based and data-driven inferences.

Figure 2 illustrates our experimental paradigm. Each black square represents an individual item defined by the feature dimensions “size” and “grayscale” (see Fig. 3). The item space is partitioned into three categories A/¬A, B/¬B, and C/¬C, with different boundaries. The data imply a positive contingency between A and B, P(B|A) = .75 > P(B|¬A) = .25, as well as between B and C, P(C|B) = .75 > P(C|¬B) = .25. However, A and C are independent, P(C|A) = P(C|¬A) = .5, as indicated by the orthogonal category boundaries of A and C. The Markov condition is violated because P(C|B∧A) = 2/3 but P(C|B∧¬A) = 1. Likewise, P(C|¬B∧A) = 0 but P(C|¬B∧¬A) = 1/3 (see Fig. 2). This is due to subclasses of A items having different probabilities of co-occurring with event C. For instance, A items that are bright and small always lead to C, but A items that are dark and small never do.

The participants’ task was to judge the conditional probability of C given A, P(C|A), after being presented with the data.² Our key question was whether people would recognize the independence of A and C based on the observed data, or whether they would induce a Markov-coherent causal chain and transitively infer a positive relation. If they based their inferences solely on the available data, participants should judge that A and C are independent, P(C|A) = P(C|¬A) = .5. In contrast, if people induce a causal chain A→B→C and assume the Markov condition, they should infer that A and C are positively related, with P(C|A) = .625 (see Eq. 2). In Experiment 1, we elicited judgments on the category and item level. In Experiment 2 we compared several intransitive and transitive chains and investigated additional boundary conditions (e.g., judgment order).
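The two competing predictions can be reproduced from cell counts. In the sketch below, the eight (A, B, C) counts for the 40 items are an assumption: they are the counts implied by the conditional probabilities reported above (15 items in A∧B of which 10 are C, and so on), not values read off Fig. 2. The names `COUNTS`, `INDEX`, and `cond_prob` are hypothetical.

```python
from fractions import Fraction

# Item counts per (A, B, C) cell for the 40 stimuli. Assumed reconstruction
# derived from the conditional probabilities reported in the text.
COUNTS = {
    (1, 1, 1): 10, (1, 1, 0): 5,
    (1, 0, 1): 0,  (1, 0, 0): 5,
    (0, 1, 1): 5,  (0, 1, 0): 0,
    (0, 0, 1): 5,  (0, 0, 0): 10,
}
INDEX = {"A": 0, "B": 1, "C": 2}


def cond_prob(target, **given):
    """P(target = 1 | given), computed from the cell counts."""
    def matches(cell):
        return all(cell[INDEX[k]] == v for k, v in given.items())

    total = sum(n for cell, n in COUNTS.items() if matches(cell))
    hits = sum(n for cell, n in COUNTS.items()
               if matches(cell) and cell[INDEX[target]] == 1)
    return Fraction(hits, total)


# Data-driven answer: read P(C|A) directly off the items.
data_based = cond_prob("C", A=1)

# Model-driven answer: chain the direct relations with Eq. 2.
p_b_a = cond_prob("B", A=1)
chain_based = (p_b_a * cond_prob("C", B=1)
               + (1 - p_b_a) * cond_prob("C", B=0))

print(float(data_based), float(chain_based))  # prints: 0.5 0.625
```

A participant who reads P(C|A) off the items arrives at .5, whereas one who chains the two links arrives at .625; the gap between the two is the transitive distortion the experiments are designed to detect.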

Experiments 1a and 1b

In both experiments, participants were sequentially provided with data regarding the relations A→B and B→C, which also allowed them to observe A→C. The data entailed a violation of the Markov condition, rendering A and C statistically independent. Participants first judged P(B|A) and P(C|B) for the individual relations and finally estimated P(C|A). In Experiment 1a, data were removed before participants judged P(C|A). In Experiment 1b, the data remained visible. The question was whether participants would recognize that A and C were independent (e.g., by realizing that the category boundaries were orthogonal), or whether they would derive estimates for P(C|A) from their causal model representation and reason transitively from A to C, concluding a positive relation. In a control condition participants were presented with only data on A→C. Since the mediating event B was omitted, they were expected to realize the independence of A and C. We also requested judgments for individual items; the goal was to investigate to what extent people’s judgments were sensitive to the varying contingencies on the item level.

Method

Participants and design. One hundred twenty-eight students from the University of Göttingen participated, in exchange for candy and participation in a lottery where they could win €50. There were sixty-four participants in each experiment (Experiment 1a: 68% female, M_age = 24 years; Experiment 1b: 45% female, M_age = 23 years). Participants were randomly assigned to either the chain condition (A→B→C) or the control condition (A→C). In Experiment 1a, one participant who had previously participated in a related study was replaced.

Materials and procedure. Both experiments used the same materials and counterbalancing conditions. The procedure was also identical except for varying whether the data were removed before judging P(C|A) (Experiment 1a) or remained visible (Experiment 1b). Participants were asked to take the role of a developmental biologist investigating three developmental stages of microbes. Specifically, the relations between the kinds of carotene developed by the microbes in three consecutive stages (α-carotene, β-carotene, and γ-carotene) were of interest.

² Conditional probabilities are simple, uncontroversial measures (Evans & Over, 2004; Oberauer, Weidenfeld, & Fisher, 2007). Measures of causality (ΔP, causal power) yield qualitatively similar predictions for the investigated contingencies.

[Figure 3 not reproduced. Three panels, one per stage. Stage 1: microbes with alpha-carotene vs. microbes with no alpha-carotene. Stage 2: microbes with beta-carotene vs. microbes with no beta-carotene. Stage 3: microbes with gamma-carotene vs. microbes with no gamma-carotene.]

Fig. 3 Stimuli (“microbes”) of Experiments 1 and 2 as shown to participants. Participants were instructed that microbes did not change their appearance across stages. Microbes were sorted in each stage according to whether they produced the stage-specific carotene (alpha-, beta-, or gamma-carotene). Participants’ task was to judge the conditional probability of one type of microbe later becoming another specific type of microbe, that is, to judge P(B|A), P(C|B), and P(C|A)

[Figure 2 not reproduced. Three item-space panels (categories A/¬A, B/¬B, and C/¬C), each with the axes size (1 to 8) and grayscale (1 to 8). Contingency tables (item counts) shown below the panels:]

A→B:      B    ¬B
  A      15     5
  ¬A      5    15

B→C:      C    ¬C
  B      15     5
  ¬B      5    15

A→C:      C    ¬C
  A      10    10
  ¬A     10    10

Fig. 2 Item space and examples for the event structure in the experimental conditions of Experiments 1a and 1b. Each black square represents one stimulus item (“microbe”; see Fig. 3), created by combining different levels of grayscale (1 = bright, 8 = dark) and size (1 = small, 8 = large). Squares with a white dot denote the four individual test items. The black lines bisecting the item space denote the category boundaries. Below the squares, contingencies and some resulting measures are shown. After the conditional probabilities the corresponding values on the used scale are shown in parentheses. Δp(AB) denotes Delta P as a measure of causal strength between two variables, P_data(C|A) represents the conditional probability of C given A, as entailed by the data, and P_trans(C|A) represents the conditional probability as estimated transitively, based on P_data(B|A), P_data(B|¬A), P_data(C|B) and P_data(C|¬B), and Eq. 2


Stimuli consisted of 40 individual “microbes” represented as circles (Fig. 3), varying on the dimensions “grayscale” and “size” (Fig. 2). A pretest showed that participants could accurately distinguish the individual items. Categories A, B, and C were created by rotating the category boundary, resulting in orthogonal categories A and C. To permit three linearly separable categories, some feature combinations were eliminated (Fig. 2).

Eight counterbalancing conditions with identical contingencies were created by rotating the three category boundaries in steps of 45° (Fig. 2). Accordingly, categories A and C had a one-dimensional boundary in four counterbalancing conditions, whereas category B involved a two-dimensional boundary. In the other four conditions, A and C had a two-dimensional and B a one-dimensional category boundary. In all conditions, there was a positive contingency between A and B, and B and C, respectively, whereas A and C were independent: P(C|A) = P(C|¬A) = .5.

Participants were first presented with data regarding the first and second developmental stages, printed on large panels (Fig. 3). Data for the first stage arranged the 40 microbes according to whether they produced α-carotene (A) or not (¬A). The same microbes in the second stage were arranged according to whether they generated β-carotene (B) or not (¬B). Panels for the first and second stages were displayed simultaneously.

Participants first judged the conditional probability P(B|A); that is, they judged whether microbes with α-carotene (A) tended to produce β-carotene (B) or not (¬B). The experimenter pointed to the corresponding categories on the panels. We used an 11-step rating scale of −100 to +100, with a midpoint of 0 indicating the independence of A and B (Fig. 4). On this scale P(B|A) < .5 corresponds to values below zero, P(B|A) > .5 corresponds to values above zero, and P(B|A) = .5 corresponds to zero.
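For readers converting between probabilities and scale values, the mapping can be sketched as below. The linear form and the equal spacing of the 11 response options are assumptions; they are consistent with the anchor points the paper reports (.5 corresponds to 0, and .75 corresponds to +50 in the Results), but the paper does not state the formula. The function names are hypothetical.

```python
# 11 response options from -100 to +100; equal spacing is an assumption.
SCALE_POINTS = list(range(-100, 101, 20))


def prob_to_rating(p):
    """Map a conditional probability in [0, 1] to the scale (assumed linear)."""
    return 200 * (p - 0.5)


def rating_to_prob(rating):
    """Inverse of the assumed linear mapping."""
    return rating / 200 + 0.5


print(len(SCALE_POINTS), prob_to_rating(0.5), prob_to_rating(0.75))
# prints: 11 0.0 50.0
```

Under this assumed mapping, ratings of −100 and +100 correspond to probabilities of 0 and 1.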

After participants judged P(B|A), the C panel was added, showing the microbes that had produced γ-carotene (C) and those that had not (¬C; Fig. 3). Using the same rating scale, participants judged the conditional probability P(C|B), that is, whether microbes producing β-carotene (B) would or would not produce γ-carotene.

In Experiment 1a, after participants judged P(C|B), all data panels were removed and participants had to judge P(C|A) on the same 11-step rating scale. In Experiment 1b, after participants judged P(C|B), all data panels remained visible when participants judged P(C|A), so that the data showing the independence of A and C were available during the judgment.

The procedure and materials in the control conditions of both experiments were identical, except that participants were shown data only for categories A and C. Accordingly, they provided an estimate of only P(C|A), with panels for A and C being visible during judgment.

After participants completed all probability judgments, the data panels were removed and participants were presented with four individual microbes (presented in one of two random orders). Figure 2 indicates the location of these test items with a white spot. We selected these items since they were at least one step away from the category boundary in all counterbalancing conditions, and in all counterbalancing conditions each item belonged to one of four combinations of categories (A∧C, ¬A∧C, A∧¬C, and ¬A∧¬C). For instance, in Fig. 2 the item with size = 1 and grayscale = 3 is of type A∧C (i.e., the microbe produced α-carotene and γ-carotene). The goal was to investigate whether judgments were influenced by observed contingencies on the item level and/or transitive inferences based on the category level.

For each test item, participants judged the probability that it would generate γ-carotene. To eliminate uncertainty regarding the category membership of A, information was provided for each item on whether it had or had not produced α-carotene. Again, a rating scale of −100 to +100 was used, labeled “This [non] alpha microbe tends not to develop gamma-carotene later” on the left side and “tends to develop gamma-carotene later” on the right side. Different ratings for A and ¬A items yielding the same effect C would indicate an influence of category-based transitive reasoning. Different ratings for items belonging to categories C or ¬C would indicate an influence of the actually observed contingency on the item level.

Results

For the analyses, the eight counterbalancing conditions were recoded to match the data structure depicted in Fig. 2. Our predictions for transitive inferences for the global A→C relation rely on qualitatively correct judgments of the local A→B and B→C relationships. Therefore, following previous research (Baetu & Baker, 2009), we included only participants who correctly judged both local relationships to be positive. In Experiment 1a, 25% of participants failed to meet this criterion in each of the two relations; in Experiment 1b an average of 19% failed in each of the two relations.³

Fig. 4 Example scale used to elicit conditional probability judgments in Experiments 1a and 1b

Figure 5 shows participants’ mean probability judgments. In both studies, judgments for the local relations P(B|A) and P(C|B) reflect the probabilities in the data. (The objective probabilities P(B|A) = P(C|B) = .75 correspond to a value of +50 on the scale used.) The key finding is that in both experiments participants in the chain condition judged P(C|A) to be greater than zero, Experiment 1a: t(15) = 6.49, p < .001; Experiment 1b: t(20) = 2.19, p < .05. Thus, although A and C were independent in the observed data (corresponding to zero on the scale used), participants judged this relation to be positive. In contrast, in the control conditions in which the intermediate event B was omitted, judgments did not differ from zero, Experiment 1a: t(31) = −1.39, p = .17; Experiment 1b: t(31) = .21, p = .22. Judgments for P(C|A) differed between experimental and control condition in both Experiment 1a, t(46) = 4.52, p < .001, and Experiment 1b, t(51) = 2.84, p < .05.⁴

These findings indicate that participants relied on transitive reasoning to judge the relation between A and C. Despite being able to detect that these categories of items were independent when B was omitted, they inferred a positive relation when B was the intermediate event. The lower judgments for P(C|A) in Experiment 1b indicate that making the data available during judgment increased people’s sensitivity to the independence of A and C.

We next analyzed participants’ probability judgments regarding C for the individual test items, which consisted of four items combining membership of categories A and C (i.e., items of type A and C, A and ¬C, ¬A and C, and ¬A and ¬C; see Fig. 2). Theoretically, if judgments are based solely on transitive reasoning on the category level, there should be identical positive judgments for A items around +25, regardless of whether an item belonged to category C or ¬C. The two ¬A items (¬A and C and ¬A and ¬C) should receive negative ratings of around −25. Conversely, if participants’ judgments are driven solely by the observed data, C items should receive a rating of +100, and ¬C items one of −100, independent of whether they are type A or ¬A. Note that the maximal "bottom-up" effect (C vs. ¬C items) is larger than the maximal "top-down" effect (A vs. ¬A items).
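The category-level predictions of about +25 and −25 can be reproduced with a short calculation. The sketch below is illustrative only: it assumes the symmetric contingencies P(C|¬B) = 1 − P(C|B) and P(B|¬A) = 1 − P(B|A), integrates the two links independently, and maps a probability p onto the rating scale as 200p − 100 (consistent with .75 corresponding to +50 and .5 to 0). The function names are hypothetical.

```python
from fractions import Fraction


def chain_probability(p_b_given_a, p_c_given_b, p_c_given_not_b):
    """P(C|A) for a chain A -> B -> C when the two links are integrated independently."""
    return p_c_given_b * p_b_given_a + p_c_given_not_b * (1 - p_b_given_a)


def to_rating(p):
    """Map a probability in [0, 1] onto the -100..+100 scale (assumed to be linear)."""
    return 200 * p - 100


P_B_GIVEN_A = Fraction(3, 4)         # local relation A -> B in the data
P_C_GIVEN_B = Fraction(3, 4)         # local relation B -> C in the data
P_C_GIVEN_NOT_B = 1 - P_C_GIVEN_B    # symmetry assumption
P_B_GIVEN_NOT_A = 1 - P_B_GIVEN_A    # symmetry assumption

# Transitive (category-level) predictions for A items and for non-A items
rating_a = to_rating(chain_probability(P_B_GIVEN_A, P_C_GIVEN_B, P_C_GIVEN_NOT_B))
rating_not_a = to_rating(chain_probability(P_B_GIVEN_NOT_A, P_C_GIVEN_B, P_C_GIVEN_NOT_B))

print(rating_a, rating_not_a)  # 25 -25
```

Exact fractions are used so that the scale values come out without floating-point rounding.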

3 The predictions for people who failed to meet the selection criteria are not clear, since they may have failed for various reasons, such as a lack of concentration or because they recognized the intransitivity. To explore whether participants used the local relations as predictors, we correlated the empirically found estimates of P(C|A) with the product P(C|B) × P(B|A) and with P(C|B) × P(B|A) + (1 − P(C|B)) × (1 − P(B|A)). The latter estimate relies on a symmetry assumption in line with the observed data: P(C|B) = P(¬C|¬B). In Experiment 1a, both correlations were positive and large when we included all participants (r = .53 and r = .55) or only those meeting the selection criterion (r = .63 and r = .63). In Experiment 1b, in contrast, we obtained no correlation when considering all participants (r = .12 and r = .08) but strong positive correlations for participants meeting the criterion (r = .55 and r = .53). This suggests that people in the group meeting the selection criterion were often guided by transitivity, but that at least in Experiment 1b many of the participants failing to meet the criterion cannot be modeled by transitivity.

Fig. 5 Mean judgments (±SE) on the category level in Experiments 1a and 1b. Judgments were given on an 11-step scale of −100 to +100. P(C|A) was judged with data being removed in Experiment 1a and data being present in Experiment 1b. For the local relations, P(B|A) = P(C|B) = .75 holds in the data, corresponding to +50 on the scale used. The observed data entail P(C|A) = P(C|¬A) = .5, corresponding to a value of 0 on the scale used. Deriving P(C|A) from a causal chain via Eq. 2 yields .625, corresponding to a value of +25 on the scale used. Judgments of P(B|A) and P(C|B) were not collected in the control condition

4 Additionally, we controlled for the two patterns of one-dimensional (1D) and two-dimensional (2D) category boundaries used in the eight control conditions (1D–2D–1D vs. 2D–1D–2D categorization type). Descriptively, the distortion effect was larger in the 2D–1D–2D condition. However, a two-way ANOVA concerning the P(C|A) judgments showed a significant main effect only of experimental condition vs. control, Experiment 1a, F(1, 44) = 19.50, p < .001; Experiment 1b, F(1, 49) = 5.15, p < .05. There was no effect of categorization type, Experiment 1a, F(1, 44) = 1.7, p = .20; Experiment 1b, F(1, 49) < 1, p = .99, and no interaction, Experiment 1a, F(1, 44) < 1, p = .94; Experiment 1b, F(1, 49) = 1.43, p = .24.


Figure 6 shows participants’ mean judgments for the four test items regarding the probability of C. For these judgments the learning data had been removed in both studies. Judgments in the control condition serve as baseline, showing the influence of the data on participants’ judgments without intermediate event B. In the control condition, in both studies judgments varied only as a function of whether an item did or did not belong to category C, irrespective of whether the item belonged to category A or ¬A. Consequently, in both experiments an analysis of variance (ANOVA) for the control condition, with item types A versus ¬A and C versus ¬C as within-subject factors, yielded a significant effect only for items of type C versus ¬C, Experiment 1a, F(1, 31) = 17.90, p < .001, ηp² = .37; Experiment 1b, F(1, 31) = 6.33, p < .05, ηp² = .17; there was no influence of type A versus ¬A, F(1, 31) < 1, and no interaction, Experiment 1a, F(1, 31) < 1; Experiment 1b, F(1, 31) = 1.27, p = .27.

A different pattern of judgments was obtained in the chain conditions. For instance, the two items belonging to category C were judged differently depending on the status of A, with higher judgments for A than for ¬A items. Analogously, items that belonged to category ¬C received higher judgments when belonging to A than when belonging to ¬A. The pattern also reflects an influence of the data, as the judgments for the two A items and the two ¬A items varied as a function of true membership regarding C. Accordingly, in Experiment 1a, a main effect of type C versus ¬C, F(1, 15) = 11.67, p < .01, ηp² = .43, a main effect of type A versus ¬A, F(1, 15) = 6.04, p < .05, ηp² = .28, and an interaction, F(1, 15) = 9.72, p < .05, ηp² = .39, resulted. Thus, in the chain condition participants’ judgments were influenced by both the category level relations and observations on the item level.

In Experiment 1b, the pattern was qualitatively similar, although less pronounced. The ANOVA for the chain condition yielded a significant effect of items of type C versus type ¬C, F(1, 20) = 14.69, p < .01, ηp² = .46, as well as (at least for a one-tailed test of our prediction) an influence of type A versus ¬A, F(1, 20) = 3.80, p(one-tailed) < .05, ηp² = .16, but no interaction, F(1, 20) < 1. The two main effects indicate that judgments on the item level were still influenced by the inferred positive relation between A and C on the category level and the observed data concerning category C versus ¬C. As in Experiment 1a, the effect of the category membership of C was larger than that of category A, consistent with the theoretically maximal effect of these factors.

Table 1 shows the distribution of judgments, that is, the proportion of participants judging P(C|A) to be positive, zero, or negative. Although a larger proportion of participants in the experimental conditions detected the zero contingency in Experiment 1b than in Experiment 1a (exact Fisher test, p < .05), both experiments show a larger proportion of positive answers in these conditions than in the respective control conditions (Experiment 1a: p < .001; Experiment 1b: p < .05).

[Figure 6: two panels (Experiment 1a, Experiment 1b); x-axis: Item (A and C, ¬A and C, A and ¬C, ¬A and ¬C); y-axis: Judged probability P(C|item) (±SE), ticks at −50, 0, 50; series: Chain, Control]

Fig. 6 Mean judgments (±SE) of P(C|item) in Experiments 1a and 1b. Each item corresponds to a combination of categories A/¬A and C/¬C (see Fig. 2 for locations in the item space). For each item, information on A vs. ¬A was given and the task was to infer the probability of C. Judgments were given on an 11-step scale of −100 to +100

Table 1 Percentage (and frequency) of participants in Experiments 1a and 1b judging the relation between A and C to be negative, zero, or positive on a scale of −100 to +100

| Experiment | Condition | Negative | Zero | Positive |
|---|---|---|---|---|
| 1a | Intransitive chain | 0% (0) | 6% (1) | 94% (15) |
| 1a | Control | 28% (9) | 56% (18) | 16% (5) |
| 1b | Intransitive chain | 19% (4) | 33% (7) | 48% (10) |
| 1b | Control | 28% (9) | 56% (18) | 16% (5) |

Note. Boldface entries correspond to the prediction of transitive distortion effects in chain conditions and zero judgments in the control condition. Participants’ judgments were classified as positive for the interval 0 < x ≤ 100, negative for −100 ≤ x < 0, and zero if and only if they answered "zero."
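The exact Fisher tests on the proportion of positive answers can be approximated from the frequencies in Table 1. The sketch below is an illustration, not the authors' analysis: it assumes that each comparison collapses the judgments into positive versus non-positive (negative or zero) and that a two-sided test is used. The function name `fisher_exact_two_sided` is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    # Unnormalised hypergeometric weights of all tables sharing the observed margins
    weights = {k: comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(lowest, highest + 1)}
    observed = weights[a]
    # Sum over all tables that are at most as probable as the observed one
    extreme = sum(w for w in weights.values() if w <= observed)
    return Fraction(extreme, comb(n, row1))


# Rows: chain condition, control condition; columns: positive, non-positive (Table 1)
p_exp_1a = fisher_exact_two_sided(15, 1, 5, 27)
p_exp_1b = fisher_exact_two_sided(10, 11, 5, 27)

print(float(p_exp_1a), float(p_exp_1b))
```

Under these assumptions both p-values fall below the thresholds reported in the text (.001 for Experiment 1a and .05 for Experiment 1b).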

Discussion

The studies show that participants who made correct judgments regarding the two local relations A→B and B→C judged P(C|A) to be larger than zero, indicating a transitive inference from the chain’s initial event A to the final event C. These judgments strongly differed from those in the control conditions without intermediate variable B, in which judgments around zero were obtained. The size of the effect was modulated by the specific testing conditions, with a stronger influence of the category-based transitive inference when the data were not visible during the judgment. Probability judgments for individual test items were influenced by transitive inferences on the category level and item-specific knowledge. The results support the idea that participants induced a causal chain A→B→C and reasoned transitively from A to C, although the Markov condition was violated and transitivity did not hold in the data.

Experiment 2

The goal of Experiment 2 was to investigate reasoning with intransitive chains in a wider array of circumstances. In addition to the intransitive chain of Experiment 1 (A→B, B→C, A independent of C, henceforth denoted ++0) we included a new intransitive chain involving two preventive causal relations (A→¬B, ¬B→C, A independent of C, denoted −−0). The goal was to rule out that positive judgments for the local relations created a response bias toward a positive rating for the A→C relation. Additionally, we used a new control condition that matched the complexity of the intransitive chains. This involved a positive A→B contingency, followed by independent B and C variables, and likewise independent variables A and C (denoted +00).

We also included two transitive chains that obeyed the Markov assumption (A→B, B→C, A→C, denoted +++; and A→¬B, ¬B→C, A→C, denoted −−+) as a comparison for the respective intransitive chains (++0 and −−0). This allowed us to investigate whether participants would use the observable evidence in addition to transitive reasoning to judge the indirect relation between A and C. Finally, we included a Markov-coherent chain with a negative overall relation (−+−) to examine whether people correctly learn a negative overall relationship.

Based on the findings of Experiment 1 we expected distortion effects due to transitive inferences in conditions ++0 and −−0. However, we expected to find higher ratings in the respective transitive conditions +++ and −−+, due to an additional influence of the observable positive relation between A and C. In the control condition +00, both transitive inferences and the learning data entailed a zero contingency. Therefore we expected participants to correctly detect A’s and C’s independence.

Finally, we controlled for question order. In the local–global conditions, similar to Experiment 1, participants rated the individual causal links before judging the conditional probability of C given A. In the global–local conditions, this order was reversed.

Method

Participants. One hundred twenty-four participants (56% female; mean age = 23 years), mostly students from the University of Heidelberg, took part in exchange for chocolate and participation in a lottery where they could win €50. Three participants were excluded from the analyses (two clear outliers in the time used and one who gave a rating of −100 in all local judgments).

Design. The experiment had a 2 (Judgment Order: local–global vs. global–local, between-subjects) × 6 (Contingency Condition, within-subject) mixed factorial design. The six within-subject conditions, which involved different A→B, B→C, and A→C contingencies, were presented in random order. Figure 7 shows the item space and resulting contingencies for the six contingency conditions. Conditions ++0 and −−0 were intransitive and Markov-incoherent. The +++, −−+, and −+− conditions were transitive and Markov-coherent. The neutral control condition +00 entailed a zero A→C contingency, both in the data and when using Eq. 2. For each condition (see Fig. 7), participants were randomly assigned to one of eight counterbalancing conditions created by rotating the category boundaries, resulting in 48 data sets, each with three data panels presenting 40 microbes.5
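The sign of the A→C relation that a purely transitive reasoner would infer in each condition can be sketched as follows. This is an illustration under stated assumptions: positive, negative, and null local relations are taken to be conditional probabilities of .75, .25, and .5 (ratings of +50, −50, and 0), the links are assumed symmetric, that is, P(C|¬B) = 1 − P(C|B), and ratings are assumed to be a linear transform 200p − 100 of the probability. The names `LOCAL_LINKS` and `chain_rating` are hypothetical.

```python
from fractions import Fraction

POS, NEG, NULL = Fraction(3, 4), Fraction(1, 4), Fraction(1, 2)

# Assumed local links (P(B|A), P(C|B)) for each contingency condition
LOCAL_LINKS = {
    "++0": (POS, POS),
    "--0": (NEG, NEG),
    "+00": (POS, NULL),
    "+++": (POS, POS),
    "--+": (NEG, NEG),
    "-+-": (NEG, POS),
}


def chain_rating(p_b_given_a, p_c_given_b):
    """Rating of P(C|A) implied by chaining the two links independently."""
    p_c_given_not_b = 1 - p_c_given_b  # symmetry assumption
    p_c_given_a = p_c_given_b * p_b_given_a + p_c_given_not_b * (1 - p_b_given_a)
    return 200 * p_c_given_a - 100


predicted = {name: chain_rating(*links) for name, links in LOCAL_LINKS.items()}
for name, rating in predicted.items():
    print(name, rating)
```

Under these assumptions the chain-based prediction is positive for ++0, −−0, +++, and −−+, zero for +00, and negative for −+−; only in the two intransitive conditions does it disagree with the observed independence of A and C.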

Materials and procedure. We used the same scenario and materials as in Experiment 1 (see Fig. 3), but in a computer-based experiment. As in Experiment 1b, judgments about P(C|A) were elicited in the presence of the learning data. Since participants were presented with several contingency conditions, we omitted the single-item test.

First, participants were told about the microbes, their three developmental phases, the three types of carotenes (α-carotene, β-carotene, γ-carotene), and the question order. To elicit probability judgments, we again used an 11-step scale but with different labels and no numbers (see Fig. 8). For reasons of comparability we report results using a scale of −100 to +100.

Participants were randomly assigned to a question order. In the local–global condition, participants were first shown the A/¬A panel and the B/¬B panel and asked to judge P(B|A). Subsequently, the C/¬C panel was added and participants judged P(C|B). Finally, they estimated P(C|A), with the previous judgments and data remaining visible. In the global–local condition, all three panels were presented from the outset and participants first estimated P(C|A). Subsequently, we asked for judgments for P(B|A) and P(C|B). Finally, we included an item-sensitivity test to make sure that participants were able to distinguish between neighboring feature values of size or brightness (see Appendix 1).

5 With 40 microbes, some conditions only approximate the Markov condition. We did not increase the number of items in order to retain comparability with Experiment 1 and to ensure that the task did not become more difficult.

[Figure 7: six groups of three panels (A, B, C), one group per contingency condition: Intransitive generative chain (++0), Intransitive preventive chain (−−0), Transitive generative chain (+++), Transitive preventive chain (−−+), Transitive mixed chain (−+−), Neutral control condition (+00); panel axes: Size and Grayscale, each with values 1 to 8]

Fig. 7 Item space in the six contingency conditions of Experiment 2. Black squares represent stimulus items ("microbes") in each of the three developmental stages showing a corresponding carotene (A, B, C); lighter squares refer to items without the corresponding carotenes (¬A, ¬B, ¬C)

Results

All answers were recoded to match the counterbalancing conditions shown in Fig. 7. We used the same selection criterion as before; that is, participants had to judge the two local links qualitatively correctly (cf. Baetu & Baker, 2009), because this is a prerequisite for testing the impact of transitive reasoning in intransitive chains. Judgments for positive local relations had to be in the interval +20 ≤ x ≤ +100 (i.e., one of the five farthest points to the right on the 11-point rating scale); for a negative relation in the interval −100 ≤ x ≤ −20 (i.e., one of the five farthest points to the left); and for the relation with a predicted zero mean in the interval −40 ≤ x ≤ +40 (i.e., one of the five points mid-scale). These intervals are centered on the predicted values for the respective relations: positive = 50, negative = −50, and null = 0.6 Selections were made for each condition separately.

Figure 9 shows the mean estimates for judgments of P(C|A). Judgments of P(B|A) and P(C|B) were close to the true values (for details, see Appendix 2, Table 5). Importantly, the intransitive chains ++0 and −−0 yielded positive judgments, despite A and C being independent.7 In the Markov-coherent +00 condition, answers were close to zero, the transitive chains +++ and −−+ yielded strong positive judgments, and the transitive chain −+− yielded negative judgments. Question order (local–global vs. global–local) did not influence judgments.
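The selection criterion for local judgments described above can be written as a small check. The sketch assumes that the 11 scale points are equally spaced in steps of 20 between −100 and +100; the names `CRITERION`, `meets_criterion`, and `accepted_points` are hypothetical.

```python
# The 11 points of the rating scale, assumed to be equally spaced
SCALE_POINTS = list(range(-100, 101, 20))

# Inclusive acceptance interval for each predicted local relation
CRITERION = {
    "positive": (20, 100),
    "negative": (-100, -20),
    "zero": (-40, 40),
}


def meets_criterion(rating, predicted):
    """True if a local judgment counts as qualitatively correct."""
    low, high = CRITERION[predicted]
    return low <= rating <= high


def accepted_points(predicted):
    """Scale points that satisfy the criterion for the given predicted relation."""
    return [r for r in SCALE_POINTS if meets_criterion(r, predicted)]


for relation in CRITERION:
    print(relation, accepted_points(relation))
```

Each interval accepts five of the 11 scale points, so six levels are excluded for any given local relation.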

We first analyzed whether ratings of P(C|A) differed from zero (Table 2). Unsurprisingly, ratings differed from zero in the three conditions in which there was a relation between A and C (+++, −−+, and −+−). However, consistent with transitive reasoning, estimates also differed from zero in the Markov-incoherent conditions (++0 and −−0), although A and C were independent. They did not differ from zero in the control condition (+00).
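For a one-sample test against zero, the t statistic is the mean divided by its standard error, so the t values in Table 2 can be recomputed from the reported means and standard errors. The sketch below is only a consistency check on rounded values; small deviations from the reported statistics are expected because M and SE are rounded to two decimals. The name `t_against_zero` is hypothetical.

```python
# (mean, standard error, reported t) for each condition, taken from Table 2
TABLE_2 = {
    "++0": (16.71, 5.30, 3.15),
    "--0": (11.81, 5.43, 2.16),
    "+00": (-1.97, 4.62, -0.43),
    "+++": (35.29, 4.27, 8.27),
    "--+": (31.85, 5.48, 5.81),
    "-+-": (-32.81, 4.85, -6.77),
}


def t_against_zero(mean, standard_error):
    """One-sample t statistic for the null hypothesis that the mean is zero."""
    return mean / standard_error


for condition, (mean, se, reported_t) in TABLE_2.items():
    recomputed = t_against_zero(mean, se)
    print(f"{condition}: recomputed t = {recomputed:.2f}, reported t = {reported_t:.2f}")
```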

Second, we compared estimates of P(C|A) across the different conditions, controlling for question order.8 Table 3 shows the results of respective 2 × 2 ANOVAs. For no comparison were Question order or the Order × Contingency interaction significant. The comparisons of the generative intransitive chain ++0 and the preventive intransitive chain −−0 with the neutral control condition +00 (see Table 3, upper two rows) indicate illicit transitive inferences. In both intransitive chain conditions, judgments were higher than in the control condition (differences of 18.7 and 13.8, respectively), although the difference between the preventive chain and the control condition was only marginally significant (p = .06, one-sided test).9

To investigate to what extent participants were sensitive to the observed data, ratings in the intransitive conditions (++0 and −−0) were compared to the corresponding transitive conditions (+++ and −−+; see Table 3, two middle rows). Judgments were higher in the latter cases when P(C|A) > .5 than when P(C|A) = .5. These results show that judgments in the intransitive conditions were influenced not only by transitive reasoning on the category level, but also by the observed contingencies.

Additionally, we examined potential response biases by comparing the ++0 condition with the −−0 condition, and the +++ condition with the −−+ condition (see Table 3, bottom two rows). If observing two positive relationships for the direct links or giving two positive judgments creates a tendency to judge the indirect relation positively, too, judgments of P(C|A) should differ between conditions. The analyses show that this was not the case: There was no response bias and no effect of question order (see also Fig. 9).

6 Across all conditions, an average of 29% of answers concerning local relations fell into the six excluded levels of the scale (out of 11 levels).

7 We also tested if the difference between the categorization types modulated the size of the distortion effect. There was no significant difference in the two relevant intransitive conditions, ++0: F(1, 71) = 2.55, p = .11 and −−0: F(1, 64) = 3.35, p = .07, but descriptively the distortion effect was larger in the 2 dimension-1 dimension-2 dimension (2D–1D–2D) condition.

8 We did not conduct a global ANOVA on question order because applying the selection criterion of qualitatively correct local judgments to all conditions simultaneously would have excluded too many participants.

9 Note that this test has a lower statistical power than the test against zero, because applying the selection criterion to both conditions lowered the number of participants involved.

10 However, for the preventive intransitive chain condition this difference seems to have been driven mainly by the local–global condition where the difference between positive and negative answers became significant (exact binomial test, p < .05).

Fig. 8 Example of scale used for eliciting conditional probability judgments in Experiment 2. Judgments were given on an 11-step scale without values; for reasons of comparability with Experiment 1, we report our results using a scale from −100 (mostly no beta-carotene) to +100 (mostly beta-carotene). Note that the scale in Experiment 1 had −100 on the left and +100 on the right. Here, the equivalent of −100 (mostly no beta-carotene) is on the right and the equivalent of +100 (mostly beta-carotene) is on the left

Furthermore, Table 4 presents an analysis on the individual level for the target inference P(C|A). Each participant was classified according to judging a particular relation qualitatively as positive, negative, or zero. In both the generative intransitive condition (++0) and the preventive intransitive chain condition (−−0) the overall proportion of positive answers was higher than in the neutral control condition (+00; four-field χ² tests, p < .001, p < .05).10 There was an even higher proportion of positive P(C|A) judgments in the corresponding transitive conditions +++ and −−+ (χ² tests, p < .05, p < .001). The proportion of negative answers in the mixed transitive condition (−+−) was of a similar size to the proportion in the positive transitive conditions and higher than that in the control condition +00 (χ² test, p < .001).
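The four-field comparisons of positive answers can be approximated from the aggregated ("All") frequencies in Table 4. The sketch below is an illustration under stated assumptions rather than the authors' analysis: judgments are collapsed into positive versus non-positive, the two conditions are treated as independent samples, and Pearson's χ² is computed without continuity correction. For one degree of freedom the p-value equals erfc(√(χ²/2)). The function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-square with df = 1
    return chi2, p_value


# (positive, non-positive) counts from the "All" rows of Table 4
POSITIVE_COUNTS = {
    "++0": (45, 28),
    "--0": (32, 33),
    "+00": (19, 42),
    "+++": (68, 17),
    "--+": (43, 11),
}

COMPARISONS = [("++0", "+00"), ("--0", "+00"), ("+++", "++0"), ("--+", "--0")]

results = {}
for first, second in COMPARISONS:
    results[(first, second)] = chi_square_2x2(*POSITIVE_COUNTS[first], *POSITIVE_COUNTS[second])
    chi2, p_value = results[(first, second)]
    print(f"{first} vs. {second}: chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Under these assumptions the four comparisons fall below the significance thresholds reported in the text.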

Finally, the item-sensitivity test indicated that participants were reasonably able to distinguish between neighboring feature values of the items shown (microbes) and that differences in this ability do not seem to have driven the distortion effect (see Appendix 1).

Discussion

In Experiment 2 we replicated and extended the results of Experiment 1, while controlling for alternative explanations. In both intransitive chain conditions (++0 and −−0) participants’ judgments deviated from the observed data, consistent with the idea that people tend to induce Markov-coherent causal chains and use them to reason transitively from A to C. The fact that participants gave higher ratings when reasoning with transitive chains suggests that judgments were also influenced by the learning data. In the transitive condition the ratings even seem a bit too high, but this may relate to the numberless rating scale in this experiment (see also Rehder & Burnett, 2005).

Our analyses also show that the distortion effect in the intransitive conditions cannot be explained by answer tendencies due to influences from previous judgments or by previous beliefs about the global relation of A and C. First, the +00 control condition yielded judgments close to zero; second, the intransitive −−0 and ++0 conditions yielded similar judgments. Third, judgments in the positive (+++, −−+) and negative (−+−) transitive conditions had similar absolute positive or negative values, and fourth, the order in which judgments were elicited was irrelevant.

General discussion

Our goal was to investigate whether transitive inferences in probabilistic causal chains of the type A→B→C distort the induction of the relationship between A and C. We examined cases in which a transitive inference based on the independent combination of the observed local relationships A→B and B→C entails, for instance, a positive indirect relationship, while the data directly show that A and C are independent. We studied the influence of transitive inference in intransitive chains that violated the Markov condition on the category level because heterogeneous subclasses of items were mixed. Our results show that people made judgments about P(C|A) that systematically deviated from the observed data but were consistent with a transitive inference from A to C based on a mental causal model (illicitly) obeying the Markov assumption.

Experiment 1a demonstrated the influence of inappropriate transitive reasoning when participants learned consecutively about the two individual links and made judgments after the learning data were removed. Experiment 1b showed that this finding was also obtained when all data were available while judging P(C|A), although in this case the judgments were influenced more strongly by the learning data. When the intermediate event B was omitted from the data, participants had no difficulty recognizing that A and C were unrelated. Further analyses showed that judgments on the level of individual items were influenced by transitive inferences on the category level and the item-level relations.

[Figure 9: Experiment 2; x-axis: Chain conditions (++0, −−0, +00, +++, −−+, −+−); y-axis: Judged probability P(C|A) (±SE), ticks at −50, −25, 0, 25, 50; series: Local–global, Global–local, All]

Fig. 9 Mean judgments of P(C|A) (±SE) on the category level in Experiment 2. Judgments were given on an 11-step scale of −100 to +100. Local–global: Participants first judged the local relations P(B|A) and P(C|B) before judging P(C|A). Global–local: Participants first judged P(C|A) and subsequently the local relations P(B|A) and P(C|B). All: Mean judgments aggregated across question order. See text for descriptions of chain conditions

Table 2 t Tests (one-tailed) for judgments of P(C|A) against zero in Experiment 2

| Condition | M (SE) | df | t | p |
|---|---|---|---|---|
| ++0 | 16.71 (5.30) | 72 | 3.15 | .002 |
| −−0 | 11.81 (5.43) | 65 | 2.16 | .033 |
| +00 | −1.97 (4.62) | 60 | −0.43 | .672 |
| +++ | 35.29 (4.27) | 84 | 8.27 | .001 |
| −−+ | 31.85 (5.48) | 83 | 5.81 | .001 |
| −+− | −32.81 (4.85) | 63 | −6.77 | .001 |

Experiment 2 investigated the robustness of these findings while controlling for alternative explanations, such as task complexity and possible answer tendencies. Similar findings to Experiments 1a and 1b were obtained. A direct comparison of intransitive versus transitive chains showed that participants’ judgments on the category level were influenced not only by illicit transitive reasoning but also by the observed data. Results in the different control conditions refute the idea that these distortion effects are due to answer tendencies resulting from previous judgments (cf. atmosphere effects in syllogistic reasoning; Seels, 1936) or to prior beliefs concerning the global relation. Judgments of the indirect relation were correct and independent of question order in Markov-coherent, transitive conditions: positive for generative as well as preventive relations, negative for mixed relations, and zero in the neutral control condition.

Table 3 Analyses of variance (2 × 2) comparing judgments of P(C|A) in the within-subject contingency conditions while controlling for between-subjects question order

| Comparison | Contingency: F | ηp² | p | Question order: F | ηp² | p | Contingency × Order: F | ηp² | p | N |
|---|---|---|---|---|---|---|---|---|---|---|
| ++0 vs. +00 | 6.16 | .12 | .017 | 0.12 | .00 | .725 | 0.19 | .00 | .668 | 44 |
| −−0 vs. +00 | 2.57 | .05 | .116 | 2.42 | .06 | .127 | 0.06 | .00 | .805 | 42 |
| ++0 vs. +++ | 9.61 | .13 | .003 | 0.04 | .00 | .848 | 0.18 | .01 | .674 | 63 |
| −−0 vs. −−+ | 7.66 | .16 | .008 | 0.14 | .00 | .641 | 0.10 | .00 | .748 | 42 |
| ++0 vs. −−0 | 0.05 | .00 | .828 | 0.40 | .00 | .529 | 0.05 | .00 | .768 | 45 |
| +++ vs. −−+ | 0.46 | .01 | .500 | 0.22 | .00 | .995 | 1.99 | .04 | .165 | 47 |

Note. See Fig. 9 for mean judgments in the different conditions.

Table 4 Percentage (and frequency) of participants in Experiment 2 judging the relationship between A and C to be negative, zero, or positive, on a scale of −100 to +100

| Contingency condition | Order | Negative | Zero | Positive |
|---|---|---|---|---|
| Generative intransitive chain (++0) | Local–global | 30% (11) | 6% (2) | 64% (23) |
| | Global–local | 19% (7) | 22% (8) | 59% (22) |
| | All | 25% (18) | 14% (10) | 62% (45) |
| Preventive intransitive chain (−−0) | Local–global | 24% (9) | 24% (9) | 53% (20) |
| | Global–local | 36% (10) | 21% (6) | 42% (12) |
| | All | 29% (19) | 23% (14) | 48% (32) |
| Neutral control (+00) | Local–global | 25% (8) | 41% (13) | 34% (11) |
| | Global–local | 44% (13) | 28% (8) | 28% (8) |
| | All | 34% (21) | 34% (21) | 31% (19) |
| Generative transitive chain (+++) | Local–global | 14% (6) | 7% (3) | 79% (35) |
| | Global–local | 10% (4) | 10% (4) | 80% (33) |
| | All | 12% (10) | 8% (7) | 80% (68) |
| Preventive transitive chain (−−+) | Local–global | 11% (3) | 7% (2) | 81% (22) |
| | Global–local | 22% (6) | 0% (0) | 78% (21) |
| | All | 17% (9) | 4% (2) | 80% (43) |
| Mixed transitive chain (−+−) | Local–global | 71% (25) | 11% (4) | 17% (6) |
| | Global–local | 86% (25) | 10% (3) | 3% (1) |
| | All | 78% (50) | 11% (7) | 11% (7) |

Note. Boldface entries indicate the main prediction for conditions assuming that participants engage in transitive reasoning without modeling subjective distributions over values. Participants’ judgments were deemed positive for the interval 0 < x ≤ 100, negative for −100 ≤ x < 0, and zero if and only if they answered "zero." All: Percentage and frequency of judgments aggregated across question order.

The present research goes beyond previous studies (Ahn & Dennis, 2000; Baetu & Baker, 2009) that investigated the influence of transitive reasoning in the absence of data about A and C, so that learners could not assess whether the Markov condition held true. While these studies demonstrated that people made transitive inferences in the absence of data on the relation between A and C, our results suggest they do so even in the presence of counterevidence. Although participants could observe the indirect relation between A and C, judgments were substantially influenced by transitive reasoning.

Should one assume transitivity and the Markov condition when inducing causal structures?

Cartwright (2001, 2002, 2007) criticized the concept of assuming the Markov condition as a universal property of causal relations in the world. Even proponents of a universal assumption of the Markov condition concede that the condition need not hold for inadequate category schemes or incomplete causal structures actually in use (Hausman & Woodward, 1999, 2004; Spohn, 2001). Inspired by these ideas, we investigated the influence of transitive reasoning in intransitive chains that violate the Markov condition.

In our scenarios the Markov condition is violated due to mixing subclasses of items with different contingencies. Aggregating these subclasses into the same category results in a violation of the Markov condition and of transitivity, given the provided categories. However, categories play an indispensable role in causal induction and causal reasoning, as causal relations are typically defined on the category level (Lien & Cheng, 2000; Waldmann & Hagmayer, 2006; Waldmann, Meder, von Sydow, & Hagmayer, 2010; also Hagmayer, Meder, von Sydow, & Waldmann, 2011). Even if one assumes that causal relationships at a more fine-grained level adhere to the Markov condition, there is no guarantee that this is the case for a given category scheme. One rarely knows whether categories are homogeneous, and causal relationships may often involve mixtures of different causal relationships at some lower level or involve hidden variables. Thus it seems plausible for transitive distortion effects to play a substantial role in everyday as well as scientific reasoning.
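A small numerical example may help to show how positive local relations can coexist with independence of A and C. The counts below are hypothetical (they are not the stimulus sets used in the experiments) and were chosen only to reproduce P(B|A) = P(C|B) = .75 and P(C|A) = .5 for 40 items; the helper names `matches` and `prob` are likewise illustrative.

```python
from fractions import Fraction

# Hypothetical counts for 40 items, keyed by (A, B, C) with 1 = present, 0 = absent
COUNTS = {
    (1, 1, 1): 10, (1, 1, 0): 5, (1, 0, 1): 0, (1, 0, 0): 5,
    (0, 1, 1): 5, (0, 1, 0): 0, (0, 0, 1): 5, (0, 0, 0): 10,
}
INDEX = {"A": 0, "B": 1, "C": 2}


def matches(item, condition):
    """True if the item has the required value on every variable in the condition."""
    return all(item[INDEX[name]] == value for name, value in condition.items())


def prob(event, given):
    """Conditional relative frequency P(event | given) in the count table."""
    denominator = sum(n for item, n in COUNTS.items() if matches(item, given))
    numerator = sum(n for item, n in COUNTS.items()
                    if matches(item, given) and matches(item, event))
    return Fraction(numerator, denominator)


print("P(B|A)    =", prob({"B": 1}, {"A": 1}))
print("P(C|B)    =", prob({"C": 1}, {"B": 1}))
print("P(C|A)    =", prob({"C": 1}, {"A": 1}))
print("P(C|not A) =", prob({"C": 1}, {"A": 0}))
# The Markov condition would require these two values to be equal:
print("P(C|A,B)  =", prob({"C": 1}, {"A": 1, "B": 1}))
print("P(C|not A,B) =", prob({"C": 1}, {"A": 0, "B": 1}))
```

In this constructed table the B items fall into two subclasses with different rates of C depending on A, so C is not independent of A given B, and chaining the two local relations overestimates P(C|A).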

Do our findings show that people’s probabilistic inferences are generally flawed and error prone? The results do show that transitive reasoning that assumes an independent integration of causal links can systematically deviate from objective data. Yet every cognitive system needs to make inductive inferences about unobserved relations, and the virtue of the Markov condition is that it enables such inferences (Pearl, 2000; Spirtes et al., 1993). Moreover, the independence assumptions formalized in the Markov condition facilitate a parsimonious representation of relationships between variables (Domingos & Pazzani, 1997). Thus the Markov condition may provide a reasonable default assumption that guides human learning at least initially, even if the assumption does not hold (von Sydow et al., 2010; Jarecki, Meder, & Nelson, 2013). Although the Markov condition does not need to hold for the categories we used, it may provide reasonable guidelines for an ideal construction of causal relationships and categories (Hausman & Woodward, 1999, 2004; but cf. Cartwright, 2007). We focused here on chains where we find this idea convincing. Even if transitive distortion effects show that the independent integration of single links may lead learners astray, we take this as support for the idea that people tend to assume that causal chains are transitive. Apart from resolving such situations by differentiating categories into different subclasses (for which we here found only weak evidence), a further way to prevent intransitive chains is that people may already induce categories in a way that allows for transitive reasoning (Hagmayer et al., 2011).

Transitive reasoning in causal chains: boundary conditions and future directions

Our findings suggest several avenues for future research. A key question concerns the boundary conditions for illicit transitive reasoning.

One way to eliminate transitive distortion effects for a chain with two nonzero local relations and a zero global contingency may be to highlight a possible direct relation between A and C. Although the temporal order of events in our experiments constrained the set of plausible causal models (Lagnado & Sloman, 2006), such constraints do not exclude a chain with an additional direct link between A and C, as opposed to a chain in which these variables are only indirectly connected via the intermediate event B. Our results suggest that people tend to induce a parsimonious chain model without an additional link. Future research should investigate whether better calibrated judgments are obtained if alternative causal structures are explicitly pointed out.

The obtained distortion effects might have been caused by a focus on causal relations, and might have been attenuated by a focus on the involved categories. In fact, research on causal-based category induction (Lien & Cheng, 2000; Waldmann & Hagmayer, 2006; Waldmann et al., 2010) suggests that people can use causal information to induce categories. The results of Hagmayer et al. (2011) suggest that people tend to continue to use categories from earlier nodes in a chain. They investigated the transfer of category schemes when learning causal chains A→B→C, where the dichotomous events A and C were precategorized but the intermediate event B consisted of uncategorized exemplars. They showed that the categorization of B based on A was subsequently used for the second causal relation, even if not optimal. In our task all three events were precategorized, and no transfer of categories occurred; otherwise participants would have realized that A and C were orthogonal. Nonetheless, tasks that focus more strongly on categories than on relationships between categories might reduce transitive distortion effects.

Similarly, an emphasis on different subclasses of items with different contingencies (mixing) might reduce transitive distortion effects. Although we used only a two-dimensional item space and, for the intransitive chains, deterministic relationships on the subclass level, a stronger emphasis on the existence of subclasses, communication of several different causes for categories (Bonnefon et al., 2012), or an even simpler item space (see Fig. 1) might reduce distortion effects.

The semantics and pragmatics of scenarios and judgments appear to provide an additional important dimension relevant to issues of causal intransitivity. For example, being hungry (A) causes one to eat (B), which in turn causes one to feel full (C). Here, a transitive inference would suggest, counterintuitively, that being hungry first causes one to feel full. In fact, research on verbally communicated causal relations suggests that people do regard some causal chains as intransitive (Bonnefon et al., 2012; Mayrhofer, Hildenbrand, & Waldmann, 2013). Future research should aim to investigate the conditions under which chains are considered to be transitive.

A further important direction for future research concerns the relationships to different models of (causal) learning. While our investigation of intransitive causal chains was motivated by the postulated central role of the Markov condition in causal Bayes nets (Cartwright, 2002, 2006; Hausman & Woodward, 1999, 2004; Spohn, 2001; cf. Mayrhofer & Waldmann, 2015; Rehder & Burnett, 2005), an important question is to what extent associative models of learning could account for our findings. Although associative and causal learning differ with respect to important normative and descriptive issues (Goedert & Spellman, 2005; Waldmann, 1996), there is also some overlap and convergence between associative and probabilistic models of contingency judgment (Chater, 2009; De Houwer & Beckers, 2002; Mitchell, De Houwer, & Lovibond, 2009; Pineño & Miller, 2007). For instance, in line with Marr’s (1982) distinction between computational and algorithmic models, the Rescorla–Wagner model of associative learning (Rescorla & Wagner, 1972) converges under specific circumstances on the probabilistic contrast ΔP (Jenkins & Ward, 1965), a prominent measure of statistical contingency or causal strength (Chapman & Robbins, 1990; Chater, 2009; Cheng, 1997; Danks, 2003; Griffiths & Tenenbaum, 2005). Regarding our transitive distortion effects, associative approaches that model updating of associative strength in a pure bottom-up fashion based on directly observable contingencies between events cannot explain our results. However, associative approaches that additionally model “inferred” associations may be able to account for our results (e.g., Baetu & Baker, 2009). Future research on transitive reasoning should aim to investigate the different models and to characterize the relationships among them.
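The convergence of the Rescorla–Wagner model on ΔP can be illustrated with a minimal sketch. It iterates the expected weight update for one target cue plus an always-present context cue, assuming a single learning rate for outcome-present and outcome-absent trials. The function name, the parameter values, and the use of expected (rather than trial-by-trial sampled) updates are assumptions of this sketch, not part of the reported studies.

```python
def rescorla_wagner_equilibrium(p_o_given_cue, p_o_given_no_cue,
                                p_cue=0.5, rate=0.1, steps=5000):
    """Iterate the expected Rescorla-Wagner update; return (w_context, w_cue)."""
    w_ctx, w_cue = 0.0, 0.0
    # Trial types as (probability, cue present?, outcome present?).
    trials = [
        (p_cue * p_o_given_cue, 1, 1),
        (p_cue * (1 - p_o_given_cue), 1, 0),
        ((1 - p_cue) * p_o_given_no_cue, 0, 1),
        ((1 - p_cue) * (1 - p_o_given_no_cue), 0, 0),
    ]
    for _ in range(steps):
        d_ctx = d_cue = 0.0
        for p, cue, outcome in trials:
            # Prediction error: lambda is 1 when the outcome occurs, else 0.
            error = outcome - (w_ctx + w_cue * cue)
            d_ctx += p * rate * error        # the context is present on every trial
            d_cue += p * rate * error * cue  # the cue is updated only when present
        w_ctx += d_ctx
        w_cue += d_cue
    return w_ctx, w_cue


# Hypothetical contingency: P(O | cue) = .75, P(O | no cue) = .25.
w_ctx, w_cue = rescorla_wagner_equilibrium(0.75, 0.25)
delta_p = 0.75 - 0.25
print(round(w_ctx, 3), round(w_cue, 3), delta_p)
```

Under these assumptions the weight of the target cue approaches ΔP and the context weight approaches P(O | no cue). The update uses only directly observed cue–outcome pairings, which is why a purely bottom-up model of this kind has no route to a transitive distortion.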

Finally, an important question is whether inferential distortion effects are restricted to causal chains. There is some preliminary evidence that they do not generalize to common-effect structures (A→B←C) with similar positive local A→B and B←C contingencies and zero contingencies between A and C (von Sydow et al., 2010). This would be expected from a causal Bayes net perspective. Another question concerns common-cause structures, which play a central role in the philosophical criticism of the Markov condition (Cartwright, 2007; Salmon, 1978; Sober, 1987; cf. Hausman & Woodward, 1999, 2004). According to Bayes nets, causal chains and common-cause structures are “Markov equivalent.” This suggests identical inferential distortion effects for both structures. Empirically, however, the evidence on the direct psychological validity of the Markov condition for common-cause structures is mixed (Rehder & Burnett, 2005; see also Jarecki et al., 2013; Mayrhofer, Goodman, Waldmann, & Tenenbaum, 2008; Mayrhofer & Waldmann, 2015; Rottman & Hastie, 2013; von Sydow, 2011, 2013). Future research should compare reasoning with different causal structures when the data violate the Markov assumption (von Sydow et al., 2010).

Relations to and differences from other research

Although our results are novel in the causal domain, related findings in other fields point in a similar direction. For example, the Simpson paradox (Simpson, 1951) describes how statistical dependencies can vanish or even be reversed when moving from populations to subpopulations. Some studies (Fiedler, Walther, Freytag, & Nickel, 2003; Waldmann & Hagmayer, 2001) have demonstrated participants’ problems in adequately controlling for a third, confounding variable that reverses the relation between two events. Whereas in our experiments participants integrated individual causal links and thereby misjudged the distal relation, participants in the mentioned experiments integrated subpopulations, violating the relation among variables in the overall population.
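A short numerical sketch may help to make the reversal described by the Simpson paradox tangible. The counts below are hypothetical and purely illustrative; they are not taken from any of the cited studies, and the names `DATA`, `contingency`, and `pooled` are assumptions of this sketch.

```python
# Hypothetical (outcome count, total count) pairs per subpopulation.
DATA = {
    "subpop_1": {"cue": (8, 10), "no_cue": (70, 100)},
    "subpop_2": {"cue": (30, 100), "no_cue": (2, 10)},
}


def contingency(cue, no_cue):
    """Delta-P = P(outcome | cue) - P(outcome | no cue)."""
    return cue[0] / cue[1] - no_cue[0] / no_cue[1]


def pooled(key):
    """Aggregate counts over all subpopulations for one cue status."""
    hits = sum(d[key][0] for d in DATA.values())
    total = sum(d[key][1] for d in DATA.values())
    return hits, total


# Contingency within each subpopulation and in the pooled population.
within = {name: contingency(d["cue"], d["no_cue"]) for name, d in DATA.items()}
overall = contingency(pooled("cue"), pooled("no_cue"))
print(within, overall)
```

With these counts the contingency is positive within each subpopulation but negative in the pooled data, because the cue is unevenly distributed across subpopulations that differ in their outcome base rates.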

Other research has shown distortion of zero cue–outcome contingencies, based on high or low base rates of an outcome (Baker, Berbrier, & Vallée-Tourangeau, 1989, Experiment 3; Dickinson, Shanks, & Evandon, 1994; also Buehner, Cheng, & Clifford, 2003). Our results are neutral with regard to such an “outcome density bias,” because we used no skewed outcomes (i.e., P(A) = P(B) = P(C) = .5) and, empirically, we did not find distortion effects in the zero-contingency control conditions.

Furthermore, so-called pseudo-contingencies have been discussed (Fiedler & Freytag, 2004; Fiedler, Freytag, & Meiser, 2009; Fiedler, Kutzner, & Vogel, 2013; Meiser & Hewstone, 2004; cf. Kutzner, Vogel, Freytag, & Fiedler, 2011), normally referring to illicit inferences about relations between events based on skewed marginal distributions. For instance, when many students in one class watch a lot of television, and many students in the same class show aggressive behavior, one might infer that students who watch a lot of television tend to be aggressive, even if the events are not correlated. Such pseudo-contingencies, however, are unlikely to apply in our scenarios, as our distributions (including the ones with zero contingency) were not skewed.

Concluding remarks

Our results contribute to a view that emphasizes the role of top-down or knowledge-based inference processes in induction. It has been argued in different fields, such as perception (Gregory, 1980), memory (Loftus & Hoffman, 1989), and language comprehension (Graesser, Singer, & Trabasso, 1994), that top-down processes favoring broadly coherent representations have a substantial and occasionally distorting impact on induction. Overall, our results corroborate the idea that people derive probability estimates by combining single causal links into complex causal models in a modular way (Waldmann et al., 2008). The present results, however, suggest that people base their probability judgments in causal structures not only on bottom-up data (even if observations are directly available during judgments) but also on transitive inferences based on mental causal models that obey the Markov condition, even if transitivity does not hold in the data.

Acknowledgments The work of M. v. S. and the running of the experiments were supported by a grant from the Deutsche Forschungsgemeinschaft (DFG Sy 111/2), as part of the priority program “New Frameworks of Rationality” (SPP 1516). B. M. was supported by grant ME 3717/2 from the same program. Portions of Experiments 1a and 1b were presented at the 2009 Cognitive Science conference in Amsterdam (von Sydow, Meder, & Hagmayer, 2009). We thank Alexander Wendt, Antonia Lange, Alina Greis, Christin Corinth, and Martine Vardar for assistance in data collection and Anita Todd and Martha Cunningham for correcting the manuscript. We are grateful to Ben Newell, Dennis Hebbelmann, Klaus Fiedler, Martha Cunningham, Ralf Mayrhofer, and Michael R. Waldmann for helpful comments on this research.

Appendix 1

Because Experiment 2 was run on a computer with a smaller screen than the paper used in Experiments 1a and 1b (with a DIN A4 page for each of the three panels), we ran an additional item-sensitivity test in Experiment 2 to control for each participant’s ability to distinguish feature values of the microbes. The test involved two counterbalancing conditions using different items. For both conditions, participants compared six pairs of microbes according to size (“Is Item 1 bigger or smaller than Item 2?”) and six other pairs according to grayscale. The comparisons each concerned five minimal (one-step) differences and one larger (four-step) difference in the 8 × 8 stimulus space (cf. Fig. 7).

The results of the item-sensitivity test showed a reasonable average error rate of 8%. There were no differences in error between size and grayscale (7% vs. 8%), and only small differences between clearly distinguishable (four-step) and hard-to-distinguish (one-step) items (5% vs. 8%). Thus, only a small proportion of errors can plausibly be attributed to an inability to distinguish similar items rather than to general noise. Additionally, the error rate in the item-sensitivity task did not correlate with the number of errors in the local judgments, r(121) = .07, or with the proportion of answers in line with our predictions, r(121) = .09. Participants’ individual ability to differentiate between items does not seem to have mediated the distortion effects we found.


Appendix 2

References

Ahn, W., & Dennis, M. (2000). Induction of causal chains. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the twenty-second annual conference of the cognitive science society (pp. 19–24). Mahwah, NJ: Erlbaum.

Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgment tasks. Bulletin of the Psychonomic Society, 415, 147–149. doi:10.3758/BF03334492

Arntzenius, F. (2005). Reichenbach’s common cause principle. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. Retrieved from http://plato.stanford.edu/archives/win2008/entries/physics-Rpcc/

Baetu, I., & Baker, A. G. (2009). Human judgments of positive and negative causal chains. Journal of Experimental Psychology: Animal Behavior Processes, 35, 153–168. doi:10.1037/a0013764

Baker, A. G., Berbrier, M. W., & Vallée-Tourangeau, F. (1989). Judgments of a 2 × 2 contingency table: Sequential processing and the learning curve. Quarterly Journal of Experimental Psychology, 41B, 65–97. doi:10.1080/14640748908401184

Bonnefon, J.-F., Da Silva Neves, R., Dubois, D., & Prade, H. (2012). Qualitative and quantitative conditions for the transitivity of perceived causation. Annals of Mathematics and Artificial Intelligence, 64, 311–333. doi:10.1007/s10472-012-9291-0

Buehner, M., Cheng, P. W., & Clifford, D. (2003). From covariation to causation: A test of the assumption of causal power. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 1119–1140. doi:10.1037/0278-7393.29.6.1119

Cartwright, N. (2001). What is wrong with Bayes nets? The Monist, 84, 242–264. doi:10.5840/monist20018429

Cartwright, N. (2002). Against modularity, the causal Markov condition, and the link between the two: Comments on Hausman and Woodward. British Journal for the Philosophy of Science, 53, 411–453. doi:10.1093/bjps/53.3.411

Cartwright, N. (2006). From metaphysics to method: Comments on manipulability and the causal Markov condition. British Journal for the Philosophy of Science, 57, 197–218. doi:10.1093/bjps/axi156

Cartwright, N. (2007). Hunting causes and using them: Approaches in philosophy and economics. Cambridge, England: Cambridge University Press. doi:10.1017/CBO9780511618758

Table 5 Detailed mean estimates (±SE) on the category level in Experiment 2

Condition   Prediction                      Results                                   N
            P(B|A)  P(C|B)  P(C|A)  P(C|A)  P(B|A)        P(C|B)        P(C|A)
                            Data    Trans.

Local–global
++0         50      50      0       25      57.2 (3.9)    52.8 (3.5)    16.7 (7.5)    36
−−0         −50     −50     0       25      −58.9 (3.1)   −59.4 (3.1)   14.7 (7.5)    38
+00         50      0       0       0       63.1 (3.7)    3.7 (4.5)     2.5 (6.8)     32
+++         50      50      25      25      59.1 (3.0)    50.9 (3.1)    35.0 (6.0)    44
−−+         −50     −50     25      25      −57.0 (3.9)   −54.8 (4.1)   38.5 (7.5)    27
−+−         −50     50      −25     −25     −52.6 (3.4)   51.4 (3.1)    −26.3 (6.9)   35

Global–local
++0         50      50      0       25      51.3 (3.9)    48.6 (3.6)    16.8 (7.6)    37
−−0         −50     −50     0       25      −50.7 (5.1)   −64.3 (3.6)   7.9 (8.1)     28
+00         50      0       0       0       46.2 (4.1)    −8.2 (4.5)    −6.9 (6.1)    29
+++         50      50      25      25      51.7 (3.3)    47.3 (3.3)    35.6 (6.1)    41
−−+         −50     −50     25      25      −48.9 (3.6)   −39.3 (4.2)   25.2 (7.8)    27
−+−         −50     50      −25     −25     −57.2 (3.5)   46.2 (3.7)    −40.7 (6.4)   29

All
++0         50      50      0       25      54.2 (2.7)    50.7 (2.5)    16.7 (5.3)    73
−−0         −50     −50     0       25      −55.4 (2.8)   −61.5 (2.4)   11.8 (5.5)    66
+00         50      0       0       0       55.1 (2.9)    −2.0 (3.2)    −2.0 (4.6)    61
+++         50      50      25      25      55.5 (2.3)    49.2 (2.2)    35.3 (4.2)    85
−−+         −50     −50     25      25      −52.9 (2.7)   −47.0 (3.1)   31.8 (5.4)    54
−+−         −50     50      −25     −25     −54.7 (2.4)   49.1 (2.4)    −32.8 (4.8)   64

Note. In the local–global conditions, participants rated the individual causal links before judging the conditional probability of C given A. In the global–local conditions, this order was reversed. All = mean judgments aggregated across question order. Data = conditional probability as entailed by the data. Trans. = the conditional probability as estimated transitively. All conditional probability judgments are presented here using a scale of −100 to +100, although the scale was not explicitly numbered (see Fig. 8).
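The transitive predictions (Trans.) in Table 5 can be reproduced with a short sketch, under two assumptions that are labeled as such. First, a rating r on the −100 to +100 scale is assumed to map linearly onto a probability, p = (r + 100) / 200; this mapping is an assumption of the sketch, since the scale was not explicitly numbered. Second, base rates of .5 for all events imply P(C|¬B) = 1 − P(C|B). The function names are hypothetical.

```python
def to_probability(rating):
    """Map a rating on the -100..+100 scale to a probability (assumed linear)."""
    return (rating + 100) / 200


def to_rating(probability):
    """Inverse of to_probability."""
    return 200 * probability - 100


def transitive_rating(rating_ab, rating_bc):
    """P(C|A) implied by the Markov condition, on the -100..+100 scale.

    Assumes P(C|not-B) = 1 - P(C|B), which follows from base rates of .5,
    and that B screens off A from C.
    """
    p_b_a = to_probability(rating_ab)
    p_c_b = to_probability(rating_bc)
    p_c_not_b = 1 - p_c_b
    # Chain rule over B with screening off: sum over B present and B absent.
    p_c_a = p_c_b * p_b_a + p_c_not_b * (1 - p_b_a)
    return to_rating(p_c_a)


# Pairs of local predictions, P(B|A) and P(C|B), on the rating scale.
for pair in [(50, 50), (-50, -50), (-50, 50), (50, 0)]:
    print(pair, transitive_rating(*pair))
```

On this scale the transitive prediction reduces to the product of the two local ratings divided by 100, so two links of the same sign yield a positive prediction and a zero link yields zero.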


Chapman, G. B., & Robbins, S. J. (1990). Cue interaction in human contingency judgment. Memory & Cognition, 18, 537–545. doi:10.3758/BF03198486

Chater, N. (2009). Rational and mechanistic perspectives on reinforcement learning. Cognition, 113, 350–364. doi:10.1016/j.cognition.2008.06.014

Cheng, P. W. (1997). From covariation to causation: A causal power theory. Psychological Review, 104, 367–405. doi:10.1037/0033-295X.104.2.367

Danks, D. (2003). Equilibria of the Rescorla-Wagner model. Journal of Mathematical Psychology, 47, 109–121. doi:10.1016/S0022-2496(02)00016-0

De Houwer, J., & Beckers, T. (2002). A review of recent developments in research and theories on human contingency learning. The Quarterly Journal of Experimental Psychology. B, 55, 289–310. doi:10.1080/02724990244000034

Dickinson, A., Shanks, D., & Evandon, J. (1994). Judgement of act-outcome contingency: The role of selective attribution. Quarterly Journal of Experimental Psychology, Section A: Human Experimental Psychology, 36A, 29–50. doi:10.1080/14640748408401502

Domingos, P., & Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29, 103–130. doi:10.1023/A:1007413511361

Evans, S. B. T., & Over, D. E. (2004). If. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780198525134.001.0001

Fiedler, K., & Freytag, P. (2004). Pseudocontingencies. Journal of Personality and Social Psychology, 87, 453–467. doi:10.1037/0022-3514.87.4.453

Fiedler, K., Freytag, P., & Meiser, T. (2009). Pseudocontingencies: An integrative account of an intriguing cognitive illusion. Psychological Review, 116, 187–206. doi:10.1037/a0014480

Fiedler, K., Kutzner, F., & Vogel, T. (2013). Pseudocontingencies–Logically unwarranted but smart inferences. Current Directions in Psychological Science, 1–6. doi:10.1177/0963721413480171

Fiedler, K., Walther, E., Freytag, P., & Nickel, S. (2003). Inductive reasoning and judgment interference: Experiments on Simpson’s paradox. Personality and Social Psychology Bulletin, 29, 14–27. doi:10.1177/0146167202238368

Goedert, K. M., & Spellman, B. A. (2005). Non-normative discounting: There is more to cue-interaction effects than controlling for alternative causes. Learning & Behavior, 33, 197–210. doi:10.3758/BF03196063

Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371–395. doi:10.1037/0033-295X.101.3.371

Gregory, R. L. (1980). Perceptions as hypothesis. Philosophical Transactions of the Royal Society of London, Series B, 290, 181–197. doi:10.1098/rstb.1980.0090

Griffiths, T. L., & Tenenbaum, J. B. (2005). Structure and strength in causal induction. Cognitive Psychology, 51, 334–384. doi:10.1016/j.cogpsych.2005.05.004

Hagmayer, Y., Meder, B., von Sydow, M., & Waldmann, M. R. (2011). Category transfer in sequential causal learning: The unbroken mechanism hypothesis. Cognitive Science, 35, 842–873. doi:10.1111/j.1551-6709.2011.01179.x

Hausman, D., & Woodward, J. (1999). Independence, invariance, and the causal Markov condition. British Journal for the Philosophy of Science, 50, 521–583. doi:10.1093/bjps/50.4.521

Hausman, D., & Woodward, J. (2004). Modularity and the causal Markov condition: A restatement. British Journal for the Philosophy of Science, 55, 147–167. doi:10.1093/bjps/55.1.147

Hebbelmann, D., & von Sydow, M. (2014). Betting on transitivity in an economic setting. Proceedings of the Thirty-Sixth Annual Conference of the Cognitive Science Society (pp. 2339–2344). Austin, TX: Cognitive Science Society.

Jarecki, J., Meder, B., & Nelson, J. D. (2013). The assumption of class-conditional independence in category learning. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual conference of the cognitive science society (pp. 2650–2655). Austin, TX: Cognitive Science Society.

Jenkins, H. M., & Ward, W. C. (1965). Judgment of contingency between responses and outcomes. Psychological Monographs: General and Applied, 79, 1–17. doi:10.1037/h0093874

Kutzner, F., Vogel, T., Freytag, P., & Fiedler, K. (2011). Contingency inferences driven by base rates: Valid by sampling. Judgment and Decision Making, 6, 211–221.

Lagnado, D. A., & Sloman, S. A. (2006). Time as a guide to cause. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 451–460. doi:10.1037/0278-7393.32.3.451

Lien, Y., & Cheng, P. W. (2000). Distinguishing genuine from spurious causes: A coherence hypothesis. Cognitive Psychology, 40, 87–137. doi:10.1006/cogp.1999.0724

Loftus, E. F., & Hoffman, H. G. (1989). Misinformation and memory: The creation of memory. Journal of Experimental Psychology: General, 118, 100–104. doi:10.1037/0096-3445.118.1.100

Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco, CA: Freeman & Co.

Mayrhofer, R., Goodman, N. D., Waldmann, M. R., & Tenenbaum, J. B. (2008). Structured correlation from the causal background. In V. Sloutsky, B. Love, & K. McRae (Eds.), Proceedings of the 30th annual conference of the cognitive science society (pp. 303–308). Austin, TX: Cognitive Science Society.

Mayrhofer, R., Hildenbrand, I., & Waldmann, M. R. (2013). Causal dispositions and transitivity in causal chains. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual conference of the cognitive science society (p. 4041). Austin, TX: Cognitive Science Society.

Mayrhofer, R., & Waldmann, M. R. (2015). Agents and causes: Dispositional intuitions as a guide to causal structure. Cognitive Science, 39, 65–95. doi:10.1111/cogs.12132

Meder, B., Mayrhofer, R., & Waldmann, M. (2014). Structure induction in diagnostic causal reasoning. Psychological Review, 121, 277–301. doi:10.1037/a0035944

Meiser, T., & Hewstone, M. (2004). Cognitive processes in stereotype formation: The role of correct contingency learning for biased group judgments. Journal of Personality and Social Psychology, 87, 599–614. doi:10.1037/0022-3514.87.5.599

Mitchell, C. J., De Houwer, J., & Lovibond, P. F. (2009). The propositional nature of human associative learning. Behavioral and Brain Sciences, 32, 183–198. doi:10.1017/S0140525X09000855

Oberauer, K., Weidenfeld, A., & Fischer, K. (2007). What makes us believe a conditional? The roles of covariation and causality. Thinking and Reasoning, 13, 340–369. doi:10.1080/13546780601035794

Pearl, J. (2000). Causality: Models, reasoning, and inference. New York, NY: Cambridge University Press.

Pineño, O., & Miller, R. R. (2007). Comparing associative, statistical, and inferential accounts of human contingency learning. Quarterly Journal of Experimental Psychology, 60, 310–329. doi:10.1080/17470210601000680

Rehder, B., & Burnett, R. (2005). Feature inference and the causal structure of categories. Cognitive Psychology, 50, 264–314. doi:10.1016/j.cogpsych.2004.09.002

Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). New York, NY: Appleton-Century-Crofts.

Rottman, B. M., & Hastie, R. (2013). Reasoning about causal relationships: Inferences on causal networks. Psychological Bulletin, 140, 109–139. doi:10.1037/a0031903


Salmon, W. (1978). Why ask, “Why?”? Proceedings of the American Philosophical Association, 51, 683–705. doi:10.2307/3129654

Seels, S. B. (1936). The atmosphere effect: An experimental study of reasoning. Archives of Psychology, 200, 1–72.

Simpson, E. H. (1951). The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society, Series B (Methodological), 13, 238–241.

Sloman, S. (2005). Causal models: How people think about the world and its alternatives. New York, NY: Oxford University Press. doi:10.1093/acprof:oso/9780195183115.001.0001

Sober, E. (1987). The principle of the common cause. In J. Fetzer (Ed.), Probability and causality (pp. 211–228). Dordrecht, The Netherlands: Reidel.

Sober, E. (2001). Venetian sea levels, British bread prices, and the principle of the common cause. British Journal for the Philosophy of Science, 52, 331–346. doi:10.1093/bjps/52.2.331

Sober, E., & Steel, M. (2012). Screening-off and causal incompleteness: A no-go theorem. British Journal for the Philosophy of Science, 52, 1–38. doi:10.1093/bjps/axs021

Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. New York, NY: Springer. doi:10.1007/978-1-4612-2748-9

Spohn, W. (2001). Bayesian nets are all there is to causal dependence. In M. C. Galavotti, P. Suppes, & D. Costantini (Eds.), Stochastic dependence and causality (pp. 157–172). Stanford, CA: CSLI.

Steel, D. (2006). Comment on Hausman & Woodward on the causal Markov condition. British Journal for the Philosophy of Science, 57, 219–231. doi:10.1093/bjps/axi154

von Sydow, M. (2011). The Bayesian logic of frequency-based conjunction fallacies. Journal of Mathematical Psychology, 55, 119–139. doi:10.1016/j.jmp.2010.12.001

von Sydow, M. (2013). Logical patterns in individual and general predication. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual conference of the cognitive science society (pp. 3693–3698). Austin, TX: Cognitive Science Society.

von Sydow, M., Hagmayer, Y., Meder, B., & Waldmann, M. (2010). How causal reasoning can bias empirical evidence. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd annual conference of the cognitive science society (pp. 2087–2092). Austin, TX: Cognitive Science Society.

von Sydow, M., Meder, B., & Hagmayer, Y. (2009). A transitivity heuristic of probabilistic causal reasoning. In N. A. Taatgen & H. van Rijn (Eds.), Proceedings of the 31st annual conference of the cognitive science society (pp. 803–808). Austin, TX: Cognitive Science Society.

Waldmann, M. R. (1996). Knowledge-based causal induction. In D. R. Shanks, K. J. Holyoak, & D. L. Medin (Eds.), The psychology of learning and motivation, Vol. 34: Causal learning (pp. 47–88). San Diego, CA: Academic Press. doi:10.1016/s0079-7421(08)60558-7

Waldmann, M. R., Cheng, P. W., Hagmayer, Y., & Blaisdell, A. P. (2008). Causal learning in rats and humans: A minimal rational model. In N. Chater & M. Oaksford (Eds.), The probabilistic mind: Prospects for Bayesian cognitive science (pp. 453–484). Oxford, England: Oxford University Press. doi:10.1093/acprof:oso/9780199216093.003.0020

Waldmann, M. R., & Hagmayer, Y. (2001). Estimating causal strength: The role of structural knowledge and processing effort. Cognition, 82, 27–58. doi:10.1016/S0010-0277(01)00141-X

Waldmann, M. R., & Hagmayer, Y. (2006). Categories and causality: The neglected direction. Cognitive Psychology, 53, 27–58. doi:10.1016/j.cogpsych.2006.01.001

Waldmann, M. R., Meder, B., von Sydow, M., & Hagmayer, Y. (2010). The tight coupling between category and causal learning. Cognitive Processing, 11, 143–158. doi:10.1007/s10339-009-0267-x

White, P. A. (2003). Making causal judgements from contingency information: The pCI rule. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 710–727. doi:10.1037/0278-7393.29.4.710
