Society for American Archaeology

Lars Fogelin

Processual and postprocessual archaeologists implicitly employ the same epistemological system to evaluate the worth of different explanations: inference to the best explanation. This is good since inference to the best explanation is the most effective epistemological approach to archaeological reasoning available. Underlying the logic of inference to the best explanation is the assumption that the explanation that accounts for the most evidence is also most likely to be true. This view of explanation often reflects the practice of archaeological reasoning better than either the hypothetico-deductive method or hermeneutics. This article explores the logic of inference to the best explanation and provides clear criteria to determine what makes one explanation better than another. Explanations that are empirically broad, general, modest, conservative, simple, testable, and address many perspectives are better than explanations that are not. This article also introduces a system of understanding explanation that emphasizes the role of contrastive pairings in the construction of specific explanations. This view of explanation allows for a better understanding of when, and when not, to engage in the testing of specific explanations.


Arqueólogos de las orientaciones teóricas procesual y postprocesual implícitamente emplean el mismo sistema epistemológico para evaluar el mérito de diferentes interpretaciones: inferencia a la mejor explicación. Esto es bueno ya que inferencia a la mejor explicación es el método epistemológico más efectivo del razonamiento arqueológico disponible. Fundamental a esta lógica es la suposición de que la explicación que incorpora la mayor evidencia es también la más probable de ser verdad. Este método de explicación refleja más correctamente la práctica real del razonamiento arqueológico comparado con el método hipotético-deductivo o la hermenéutica. Este ensayo explora la lógica de la inferencia a la mejor explicación y proporciona criterios claros para determinar qué hace una explicación mejor que otra. Las explicaciones que son empíricamente comprensivas, generales, modestas, conservativas, simples, que son refutables y que hacen referencia a múltiples perspectivas son mejor que las explicaciones que no lo son. Este ensayo además introduce un sistema para el entendimiento de explicaciones que acentúa el papel que juegan pares contrastantes en la construcción de explicaciones específicas. Esta perspectiva de explicación permite un mejor entendimiento de cuándo, y cuándo no, es necesario probar explicaciones específicas.

Lars Fogelin, Department of Anthropology and Sociology, Albion College, 611 East Porter St., Albion, MI 49224

American Antiquity, 72(4), 2007, pp. 603-625

Copyright © 2007 by the Society for American Archaeology

This article begins with a simple observation. Whatever theoretical perspectives archaeologists have brought to their research, they have often created long-lasting, powerful explanations concerning the lives of people in the past. I am not suggesting that all archaeological research has been good (some has been terrible), but throughout all the differing perspectives and approaches in archaeology, a steady output of compelling, seemingly right explanations of the past has emerged. How has all this good research been possible?

This may seem an odd question. However, when viewed against the backdrop of archaeological theory, it is an important one. At two points in archaeology's recent disciplinary history, theoretical revolutions are said to have occurred: first in the 1960s with the new archaeology (later termed processual archaeology) and again in the 1980s with postprocessual archaeology. In both cases, proponents claimed that new approaches to archaeology that signified a radical break with the past were being developed. If the rhetoric of the rival camps is taken seriously, processual and postprocessual archaeologists were engaged in wholly different enterprises and should not have been able to have any productive discourse.1 At first glance, the processual/postprocessual debates would seem to fit this characterization well. However, today many archaeologists, myself included, claim to work in the middle ground between these two archaeological perspectives. Archaeologists of a more empiricist bent talk of doing "processual plus" archaeology (Hegmon 2003). Those with more interpretive leanings are actively engaging in fieldwork and re-embracing many of the "scientific" methodologies pioneered by the new archaeology. In the meantime, both sides borrow data from one another and continue to rely on the work of archaeologists from the early twentieth century. How has all of this been possible? How do archaeologists balance two supposedly irreconcilable perspectives? How is it that the data and knowledge promoted by each of these perspectives are usable in the other? If processual and postprocessual archaeology are truly as incompatible as their proponents once claimed, no synthesis should be possible. Yet it is occurring.

My explanation is straightforward. Processual and postprocessual archaeology are not as different from each other as their practitioners claim. They have each brought new concerns and questions to archaeological inquiry, but both often rely implicitly on the same underlying style of reasoning: inference to the best explanation.

Making an inference to the best explanation is, at its heart, a straightforward and common process. According to Lipton (1991:58), "Given our data and background beliefs, we infer what would, if true, provide the best of the competing explanations we can generate of those data." One additional point is critical. The ability of an explanation to explain a wide variety of data makes it more likely to be true. Clearly, this is a problematic statement in terms of the official epistemologies of processual and postprocessual archaeology. I ask, for the moment, that readers reserve judgment on these epistemological issues and consider how they actually engage in archaeological research. All systems of reasoning have their own sets of epistemological problems. The question then is not which system of reasoning can or cannot provide some measure of objective truth, but rather, which one is most useful in terms of archaeological research.

This article has three interlocking goals. First, I define what inference to the best explanation is and show how it works. Second, I argue that inference to the best explanation has been "standard practice" in archaeology for over a century. Its implicit use partially accounts for much of the best work that archaeology has to offer. Third, this article is not a detached discussion of the philosophy of science; it is intended to be a practical guide for improving archaeological research. For this reason, it concludes with several straightforward suggestions toward improving upon the implicit use of inference to the best explanation in archaeology.

Following Wylie (2002:25-41), I recognize that the social theories and perspectives of archaeologists have been widely divergent.2 The argument presented here is only that the underlying standards used to assess archaeological explanation are largely the same throughout much of archaeology. The larger social theories that inform these explanations are often at odds with one another in profound and important ways. These differences should not be minimized. Even if my suggestions concerning inference to the best explanation are widely accepted, archaeologists will find no lack of issues to debate.

Philosophy of Science in Archaeology

With the advent of the new archaeology in the 1960s, archaeologists began a process of explicitly addressing epistemological questions. It was argued that archaeology should employ the scientific method, initially understood as a deductive-nomological enterprise as described by Hempel (covering law model: Hempel 1965, 1966).3 Archaeologists were advised to develop laws of culture, create testable hypotheses, and apply them through deduction to the archaeological record (Binford 1967, 1968a, 1968b; Binford and Binford, eds. 1968; Fritz and Plog 1970; Hill 1970; Watson et al. 1971, 1984).

Following the standard definition of deduction in philosophy, Hempel (1966:10) stated that "in a deductively valid argument, the conclusion is related to the premises in such a way that if the premises are true then the conclusion cannot fail to be true as well."4 At first glance this definition might seem to imply that any valid deduction must yield true results. That is incorrect. Given the standard definition of deductive validity, it is possible for a valid deductive argument to have a false conclusion, for example:

All hominids have wings.

Homo ergaster is a hominid.

Thus, Homo ergaster has wings.

This is a valid deduction because if the premises were true then the conclusion would have to be true as well. Of course, both the first premise and the conclusion are false, but that does not affect the deductive validity of the argument. In fact, this is a pattern of reasoning commonly used in science to refute proposed universal statements or laws. If Homo ergaster is a hominid and specimens of Homo ergaster do not have wings, then we can validly reason that not all hominids have wings. This simple point of logic lies at the heart of the hypothetico-deductive method Hempel advocated for testing scientific hypotheses.

For Hempel, those hypotheses that fail a valid deductive test must be rejected or at least modified. Conversely, those hypotheses that survive multiple tests are taken to be stronger (Hempel 1966:8). For Hempel, this measure of strength is derived from inductive reasoning, not deductive reasoning. Hempel is explicit on this point.

Even extensive testing with entirely favorable results does not establish a hypothesis conclusively, but provides only more or less strong support for it. Hence, while scientific inquiry is certainly not inductive in the narrow sense ... it may be said to be inductive in a wider sense, inasmuch as it involves the acceptance of hypotheses on the basis of data that afford no deductively conclusive evidence for it, but lend it only more or less strong "inductive support," or confirmation [Hempel 1966:18; emphasis in original].

While it is possible for a scientist to assess whether a deduction is valid, it is not logically possible to state with certainty that any of its premises are irrefutably true. The latter would, as noted by Hempel (1966:7), fall into the fallacy of affirming the consequent. Thus, deduction by itself has no mechanism to establish any form of independent or objective truth.
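The contrast between a valid refutation and the fallacy of affirming the consequent can be checked mechanically. What follows is a minimal Python sketch, not part of Hempel's account. It enumerates every truth assignment for two propositions, h (a hypothesis) and o (an observation the hypothesis predicts), and reports whether an argument form can have true premises and a false conclusion. The names is_valid and implies are illustrative assumptions.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a then b'."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars=2):
    """An argument form is valid when no truth assignment makes
    every premise true while the conclusion is false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample found
    return True


# Refutation (modus tollens): if h then o; not o; therefore not h.
modus_tollens = is_valid(
    premises=[lambda h, o: implies(h, o), lambda h, o: not o],
    conclusion=lambda h, o: not h,
)

# Affirming the consequent: if h then o; o; therefore h.
affirming_consequent = is_valid(
    premises=[lambda h, o: implies(h, o), lambda h, o: o],
    conclusion=lambda h, o: h,
)

print(modus_tollens)         # True
print(affirming_consequent)  # False
```

Under these assumptions the refutation pattern is valid while the confirmation pattern is not, which is consistent with Hempel's point that favorable test results lend a hypothesis only inductive support.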

The deductive-nomological approach, as understood by Hempel, required that a set of universal laws of human behavior be developed in archaeology. By the mid-1970s, even some of Hempel's strongest advocates recognized that little progress was being made in the creation of these laws (Binford 1977; Flannery 1973; Read and LeBlanc 1978; Tuggle et al. 1972; Watson et al. 1984). Forty years later, it does not appear that laws of human behavior are any closer to being developed, and few archaeologists are even trying (see also LaMotta and Schiffer 2001; Schiffer 1988).

With the recognition of the limitations of the deductive-nomological approach, archaeologists and philosophers began searching for a philosophy of science more appropriate for archaeology. Merilee and Wesley Salmon advocated approaches that could account for the statistical nature of archaeological reasoning (Salmon 1982; Salmon and Salmon 1979). Others promoted the falsification strategy of Karl Popper, arguing that hypotheses can never be confirmed, only rejected (Peebles 1992; Popper 1959, 1976). Others mined biology and geology for scientific methods deemed more appropriate for historical sciences (Dunnell 1989, 1992; Flannery 1986; Lyman and O'Brien 1998). Some strayed further from the hypothetico-deductive fold, appropriating philosophical discussions of realism (Gibbon 1989) or even rejecting outright the need for scientific positivism in archaeology at all (Hodder 1982, 1983, 1984; Shanks and Tilley 1987a, 1987b). With the exception of the anti-positivist outlook of postprocessual archaeology, few of the alternative systems appear to have caught on. For the most part, archaeologists seem to have gone about their research without a clearly articulated epistemological foundation. While some processual archaeologists make vague claims to the hypothetico-deductive method, in practice it seems that hypotheses are becoming rarer and rarer in the archaeological literature. This point can be most clearly seen in Hegmon's (2003:230) recent discussion of processual-plus archaeology, where the hypothetico-deductive method is discussed mostly for its historical interest.

Given the quantity and diversity of philosophical writings in archaeology, it is not possible to provide a detailed review of them here. Other reviews are more than adequate for this purpose (Gibbon 1989; Kelley and Hanen 1988; O'Brien et al. 2005; Wylie 2002; see also Bawden 2003). However, a few archaeological discussions of epistemology bear directly on the arguments presented here. In the background of the epistemological debates in archaeology, several archaeologists and philosophers have argued that little has actually changed, and that most archaeologists continue to employ the same sorts of reasoning they always have. Peter Kosso (1991; see also Arnold 2003) has argued that the middle-range theory of processual archaeology and hermeneutics of postprocessual archaeology are, in practice, pretty much the same. Christine VanPool and Todd VanPool (1999, 2001; see also Arnold and Wilkens 2001; Hutson 2001; Wylie 2002:200-210) have gone further, arguing that postprocessual archaeology is just as scientific as processual archaeology. One common problem unites all of these articles. What do archaeologists gain from the recognition of these fundamental similarities? How does it improve archaeological research? The recognition that archaeological reasoning across the discipline is similar should lead to specific suggestions for improving research.

To date, inference to the best explanation has played only a small role in archaeological discussions of epistemology. The most explicit use of the concept can be found in Marsha Hanen and Jane Kelley's "Inference to the Best Explanation in Archaeology" (1989; see also Kelley and Hanen 1988). Here Hanen and Kelley employed the concept of inference to the best explanation to argue that the underlying styles of reasoning of processual and earlier forms of archaeology were fundamentally similar. They contrasted Kelley's processual research at Cihuatan, El Salvador (Hanen and Kelley 1989) with Emil Haury's (1958) research at Point of Pines in the American Southwest. They concluded that both archaeologists evaluated multiple explanations for a given phenomenon, in the end accepting the "best" explanation as most likely to be true. The best explanation, in their view, was determined by eliminating explanations that were "less well supported" by the material evidence (Hanen and Kelley 1989:16). In terms of their analysis of specific cases, I am in full agreement with Hanen and Kelley. Further, I will argue that postprocessual archaeologists also employ inference to the best explanation in a similar fashion. My concern with Hanen and Kelley's discussion is what it does not address. It does not provide a clear definition of inference to the best explanation, nor does it present clear guidelines for how to evaluate which explanation is best. Further, even if one explanation is shown to be best, there is no guarantee that the explanation is any good. It might simply be the best of a bad lot. Any application of inference to the best explanation in archaeology requires methods to reject those "best" explanations that are not "good" explanations.


If we take as a starting point the limitations of the hypothetico-deductive method in archaeological reasoning, it would seem that archaeological explanations must often be derived from either flights of fancy or some form of inductive reasoning. As for the former, it would not account for the observation that began this article: that a great deal of good, seemingly right, research has occurred in archaeology. This suggests that induction must be involved in a great deal of good archaeological reasoning. The difficulty is that none of the standard systems of induction seem to fit many of the forms of explanation that archaeologists typically employ. This section examines traditional types of induction to demonstrate (1) that they could not produce the explanations commonly constructed by practicing archaeologists, and (2) that traditional forms of induction often rely implicitly on inference to the best explanation.

As defined by The Oxford Companion to Philosophy (Honderich, ed. 1995:405), "an inductive inference can be characterized as one whose conclusion, while not following deductively from its premises, is in some way supported by them or rendered plausible in light of them." In contrast to a valid deduction, the conclusion of an inductive argument can be false even if the premises are true.5 Induction can occur from specific cases to general principles and, despite a common misunderstanding, from general principles to specific cases.6 The most important element of inductive arguments is that they are ampliative.7 That is, their conclusions contain more information than is contained within their premises. This is a very valuable trait. To use a classic example, on observing many black ravens, I might infer that all ravens are black. Here I have amplified my limited number of observations of black ravens to a general statement about raveness (they are all black).

Central to all forms of induction lies a critical assumption, first identified by David Hume (1956[1777]): regularities that have occurred in the past will continue to occur in the future. If the sun has risen every day, it will continue to do so tomorrow. Inductions, then, are always underdetermined by empirical evidence. As in the example above, they are empirical generalizations from a limited sample of past experiences. Further, there always remains the chance that the sun will not continue to rise in the future (for example, when it explodes 5 billion years from now). All inductive arguments, no matter how robust, are always subject to rejection with the discovery of new information. This problem with induction is often referred to as Hume's skepticism concerning induction. The conclusions of an inductive argument are always under threat of the discovery of new evidence that could discredit them.

Humean skepticism is not the only form of skepticism that archaeologists must face. Cartesian skepticism serves as the epistemological foundation of many of the postmodern ideas that underwrite postprocessual archaeology. This form of skepticism is not based on the limits of inductive reasoning, but rather the unreliability of sensory information. Cartesian skeptics claim that the world outside our own mind can never be known objectively because our sensory perception is epistemologically unreliable. Building on this argument, Kant (1998[1781]) proposed that people cannot simply perceive an objective world; instead, they actively construct it. In this way, Kant serves as the bridge between Cartesian skepticism and the constructivist theories that inform postprocessual archaeology. Specifically, Cartesian skepticism is the foundation for claims that knowledge is conditioned by the social, political, and historical context of the observer. In the end, through an entirely different route, Cartesian skeptics reach more or less the same skeptical doubts concerning objective claims about external phenomena as Humean skeptics do. It is important to note that these skeptical doubts do not apply only to archaeological claims about the past, but to any claim about anything outside the mind of an individual.

I cannot resolve either of these skeptical problems. I doubt anyone can. There is simply no way to rule out the possibility that current regularities might change in the future or that our senses provide unreliable information about the world. The evaluation of different forms of reasoning, therefore, cannot be based on the ability of a system of reasoning to objectively identify irrefutable truths. Any claim or explanation about the world, past or present, requires a violation of absolute logical or epistemological ideals. The question is which standards to violate, and how? Rather than relying upon an epistemological guarantee of truth, archaeologists must employ more flexible, practical criteria based upon their experience in constructing and evaluating arguments to make these determinations. More simply phrased, archaeologists need to examine which violations of absolute epistemological ideals lead to explanations that seem to work over the long term. A successful example of this can be seen in the hypothetico-deductive approach when Hempel argued that hypotheses that survived multiple attempts at falsification could serve as widely accepted laws.

Philosophers have identified several different types of induction. These include analogical inference, statistical induction, and inference to the best explanation. Perhaps the most ink has been spilled in archaeology over analogical reasoning (Ascher 1961; Binford 1967; Gould and Watson 1982). This form of reasoning is covered well, in my opinion, by Alison Wylie (2002:136-153). Analogical reasoning follows a simple structure: if A is composed of a set of traits (A1, A2, A3...), and B shares those traits (B1, B2, B3...), plus others (B4 and B5), it follows that A might also have those traits (A4 and A5). Wylie notes the general philosophical understanding that good analogical arguments also note the points of dissimilarity. Further, Wylie (2002:147-148) argues that strong "analogical comparisons generally incorporate considerations of relevance that bring into play knowledge about underlying 'principles of connection' that structure the association of properties in the source and the subject." I will argue that strong analogical arguments also employ inference to the best explanation. The assumption of analogical reasoning is that some principle of connection exists between the traits being compared. In a strong use of analogical reasoning, this relationship is explained. In a weak analogical argument, the relationships are left unexplained (though assumed to exist). Analogical reasoning, on its own, cannot create principles of connection that link the traits being studied. As will be discussed below, this is the job of inference to the best explanation. Thus, strong analogical arguments employ inference to the best explanation to clarify the relationships between the traits selected for comparison.
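The bare structure of an analogical argument can be written out with sets. The following is a minimal Python sketch under stated assumptions: the function name analogical_inference and the trait labels are hypothetical, and the code captures only the bookkeeping of shared, dissimilar, and projected traits. It says nothing about the principles of connection that, on the argument above, distinguish a strong analogy from a weak one.

```python
def analogical_inference(subject_traits, source_traits):
    """Compare a subject (A) with a better-known source (B).

    Returns the traits they share, the points of dissimilarity,
    and the source-only traits that the analogy tentatively
    projects onto the subject.
    """
    shared = subject_traits & source_traits
    dissimilar = subject_traits - source_traits
    projected = source_traits - subject_traits
    return shared, dissimilar, projected


# Hypothetical trait labels, used only for illustration.
subject_a = {"trait1", "trait2", "trait3", "trait6"}
source_b = {"trait1", "trait2", "trait3", "trait4", "trait5"}

shared, dissimilar, projected = analogical_inference(subject_a, source_b)
print(sorted(shared))      # ['trait1', 'trait2', 'trait3']
print(sorted(dissimilar))  # ['trait6']
print(sorted(projected))   # ['trait4', 'trait5']
```

The projected traits are only candidates. Whether they should be accepted depends on an explanation of why the shared traits and the projected traits go together, which this sketch deliberately leaves out.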

Statistical induction is what most people think of when they consider inductive reasoning.8 The discussion of ravens above is a classic example of statistical induction. On seeing a sample of the total population of ravens, an empirical claim concerning the rest of them is proposed (they are all black). The most obvious standard employed to judge the quality of a statistical induction is the quantity of prior observations. A statistical induction based on a few raven sightings would be weaker than one based on thousands of raven sightings. Statistical inductions are also stronger, in practice, with increased temporal and geographic diversity in the observations. Archaeologists regularly employ these criteria when they examine inductive arguments. But archaeologists often run into problems at this point. When archaeologists evaluate inferences concerning the Pyramids of Giza, how many other massive, awe-inspiring pyramid complexes along the Nile can they observe? Sadly, this is not the only problem with statistical induction in terms of archaeological research.

Let's suppose that I am an archaeologist studying domestic architecture.9 I have excavated many domestic structures and read the reports of other excavations. It is now time to infer something. I first decide to make a highly universal empirical generalization: all domestic structures have some sort of entrance. On the face of it, this does not appear particularly illuminating. It brings to mind Flannery's (1973:51) oft-cited critique of "Mickey Mouse laws." By sacrificing universality, I might be able to state something more interesting: most domestic structures contain a hearth. This conclusion, while still not earth-shaking, at least allows for some ideas of where and how food preparation occurred. Finally, let's compare these examples with the results of a real archaeological study of domestic architecture, in this case Colin Richards's (1990) examination of Neolithic semi-subterranean houses in the Orkneys.

In Richards's (1990) study, he examined the layout and organization of different features within the houses in relation to one another. Where is the hearth in relation to the door, in relation to the beds, in relation to cooking areas, etc.? Through these examinations Richards identified several nested oppositions in the use of space. First, he identified a general right/left division of space within the houses in relation to the door. Since the doors of the houses enter toward the right side of the house and provide greater illumination to that side, right/left can also be read as interior/exterior and light/dark. By identifying gendered activities in the two areas, a male/female dichotomy was also found. Finally, Richards related the organization of houses in the Orkneys to Neolithic cosmological principles connected to seasonal changes.

Where did all that come from? It certainly does not appear to be a statistical induction of the form noted above, nor do I believe Richards intended it to be. First, Richards's sample is small. The richness of his explanations cannot be explained by his sample size. Second, the explanation seems different from the empirical generalizations that typically result from statistical induction. Rather than making a general statement about houses, he provides an interpretation that brings distinct elements of these particular houses within an overarching understanding. Richards's explanation may be wrong. New information might overturn his conclusion. For example, archaeologists might learn that traditional gender roles were reversed or absent in the Orkneys during the Neolithic. This only goes to show that whatever Richards is doing, it must be a form of induction. But just as clearly, it is not statistical induction.

Some might suggest, incorrectly I argue, that the richness of Richards's explanation was due to an underlying, complex set of statistical inductions. On this view, because so many statistical inductions were employed (concerning issues of gender, cooking patterns, concepts of lighting, cosmology, etc.), their roles in Richards's explanation were disguised. This, however, brings up a final difficulty with statistical induction. As the number of premises of a statistical induction increases, the reliability of its conclusions decreases. This can be viewed as the problem of multiplying error. All of the premises of a statistical induction can be discredited with new information. Some, or all, may also be wrong. If it later turns out that one of the premises is wrong, or at least not wholly right, faith in the conclusion is reduced. The more premises employed, the greater the chance that this might occur. Despite archaeologists' widely held conviction that multiple lines of evidence improve an argument, in terms of statistical induction they do not. This does not mean that multiple lines of evidence do not have value. They do. Rather, statistical induction does not provide a venue in which the value of multiple lines of evidence can be effectively accounted for.
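The problem of multiplying error can be illustrated with simple arithmetic. The following minimal Python sketch rests on two assumptions that are not in the argument above: each premise is treated as holding with a fixed probability, and the premises are treated as independent of one another, so that the chance that all of them hold is the product of the individual probabilities. The function name joint_reliability and the value 0.9 are hypothetical.

```python
def joint_reliability(premise_probabilities):
    """Probability that every premise holds, assuming independence."""
    result = 1.0
    for p in premise_probabilities:
        result *= p
    return result


# Each hypothetical premise is taken to be 90 percent reliable.
for n in (1, 3, 5, 10):
    print(n, round(joint_reliability([0.9] * n), 3))
# 1 0.9
# 3 0.729
# 5 0.59
# 10 0.349
```

Under these assumptions, an argument resting on ten fairly reliable premises is less likely to be wholly right than one resting on a single premise, which is the sense in which additional premises weaken rather than strengthen a statistical induction.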

To sum up, statistical induction has several inherent problems. First, it is always subject to rejection based upon new evidence. Inference to the best explanation does nothing to solve this problem; but if Hume is right, no system of reasoning does. Statistical inductions are also highly dependent upon the quantity and diversity of previous observations. For this reason, statistical induction does not address particularities in the past well at all. Finally, statistical induction does not have a mechanism to account for the value of multiple lines of evidence. It is in regard to these latter problems that inference to the best explanation is an advance over statistical induction. Inference to the best explanation places epistemological value on multiple lines of evidence and can accommodate explanations of unique phenomena.

Inference to the Best Explanation

Elements of inference to the best explanation can be found in the writings of Peirce (1931) and other pragmatic philosophers (Dewey 1929; Hanson 1958; Mill 1904). For these philosophers, inference to the best explanation (what Peirce called abduction or retroduction) was thought to characterize the creative process that scientists used to generate hypotheses. Peirce correctly noted that practicing scientists did not simply observe commonalities and make empirical generalizations. Rather, scientists sought to explain surprising observations by creating explanations that would account for them. More formally, if a proposed explanation made a surprising phenomenon explicable, there was sufficient reason to think it might be true. Peirce saw inference to the best explanation as a way to develop strong hypotheses for subsequent investigation through standard forms of scientific reasoning. In Peirce's formulation, then, inference to the best explanation was viewed as a model for the initial creative process of scientific inquiry.

In the 1960s and 1970s, Gil Harman developed the understanding of inference to the best explanation that underlies this discussion (Harman 1965, 1968a, 1968b, 1973; see also Brody 1970; Hanson 1958; Lipton 1991; Thagard 1978). As presented by Harman (1965:89), when conducting inference to the best explanation, "one infers, from the premise that a given hypothesis would provide a 'better' explanation for the evidence than would any other hypothesis, to the conclusion that the given hypothesis is true." Thus, it is assumed within inference to the best explanation that the best inference is also most likely to be true.10 The key difference between Harman's and Peirce's formulations is the recognition that inference to the best explanation is not limited to the initial stages of scientific reasoning. In practice, inferences to the best explanation can be sufficiently robust as to be accepted as true without further testing or investigation whatsoever. At times, some hypotheses are stronger at inception than those that have been subjected to numerous rounds of rigorous testing.

At first glance, inference to the best explanation may seem absurdly circular. It actually isn't. Proving that one explanation is "more true" than another would require prior knowledge of the "true" explanation. Even if it were possible, we would have to wonder why anyone would be evaluating the relative truth of explanations for a particular set of evidence if he or she already had the perfect explanation in hand. Still, there must be criteria to judge which explanations are best for inference to the best explanation to work. Rather than judging an explanation based upon its likeliness, an inference to the best explanation should be evaluated in terms of how compelling it is (Lipton 1991: chapter 7).11 Compelling explanations have traits that people have found to characterize successful explanations in a wide variety of contexts. Many of these traits are already familiar to archaeologists. For example, accounting for a greater quantity and diversity of empirical evidence will typically make an argument more compelling. Similarly, all things being equal, a simpler explanation is usually viewed as being more compelling than a complex explanation. I examine, in detail, toward the end of this article, several of the traits that philosophers have identified as characterizing compelling explanations.

People employ inference to the best explanation almost constantly with little thought about the system of reasoning they are engaging. A mechanic employs it to diagnose the problem with a car. A detective uses it to decide who committed a crime. The central element of inference to the best explanation is that it searches for that explanation that fits the diverse evidence available. Inference to the best explanation is, perhaps, most clearly demonstrated in detective stories. The fictional detective links the diverse, if not bizarre, evidence into a single explanation that clearly indicates that only one of a group of people is the culprit. While more complex than most inferences to the best explanation, detective novels serve to illustrate a form of reasoning that pervades our everyday lives.

As argued by Harman (1965) and Lipton (1991), inference to the best explanation effectively addresses some of the limitations of statistical induction. First, and perhaps most important for archaeology, it allows for the development of explanations for unique or infrequent archaeological phenomena. It does this by placing epistemological value on multiple lines of evidence. While strengthened by application in multiple cases, inference to the best explanation also focuses on the ability of an explanation to account for the diversity of evidence in specific cases. That explanation which accounts for the greatest diversity of evidence is assumed most likely to be true, even where that case is unique.

The second benefit of inference to the best explanation is that it allows for the construction and evaluation of explanations, not just the identification of empirical generalizations or regularities. An inference to the best explanation details the relationships between divergent evidentiary elements. In a sense, it makes elements thought to be independent lines of evidence dependent upon one another. Using Wylie's terms concerning analogical arguments, inference to the best explanation details the "principles of connection" between the available evidence (Wylie 2002:147-148). In this sense, inference to the best explanation provides answers to "why" and "how" questions rather than simply noting empirical regularities.

A fundamental confusion lies at the heart of many discussions of the philosophy of science. Science does not have a monopoly on reasoning. For example, in examining Shakespeare's Macbeth, I might ask if Macbeth is a good, but weak, character tempted toward regicide by Lady Macbeth, or an evil, but cowardly, character given backbone by Lady Macbeth. To address this question I could employ several lines of evidence: I could examine specific scenes within Macbeth, I could study Shakespeare's other plays and sonnets to see his general approach toward this sort of issue, or I could even consider general beliefs concerning good and evil, husbands and wives, and regicide in Elizabethan England. While potentially employing well-reasoned argumentation, including inference to the best explanation, no analysis of Macbeth could be considered scientific by any conventional definition.

Despite what some scientists and even a few humanists say, well-reasoned arguments exist throughout the humanities. When historians study the past, they do not simply make it up; they make arguments and employ reasoning to generate explanations about the past or present. Other historians evaluate their arguments, and accept, reject, or modify them accordingly. Bad reasoning and research exist in every discipline, but clearly there are also standards for acceptance and rejection. As I will argue below, inference to the best explanation is one of these standards. While inference to the best explanation is used in the sciences, it is not limited to the sciences. Thus, the use of inference to the

best explanation is unrelated to questions concerning the proper role of scientific reasoning in archaeology.


Below I provide several examples of the use of

inference to the best explanation in archaeological research. These examples serve two functions.

First, the examples further illustrate how inference

to the best explanation works. Second, they also

suggest its ubiquity in archaeological reasoning.

The Ubiquity of Inference to the Best Explanation in Archaeological Reasoning

A strong demonstration of the pervasiveness of inference to the best explanation in archaeological reasoning would require examining a large sample of archaeological research and determining the relative frequency of its application (in essence, the use of statistical induction). There is not sufficient time or space to do this here. Instead, I examine four specific archaeological studies and discuss the manner in which inference to the best explanation is embedded within their respective arguments. These cases are Alfred Kidder's (1924) explanations of pueblo aggregation, Lewis Binford's (1967) study of smudge pits, Ian Hodder's (1991) discussions of hermeneutics, and Michelle Hegmon and Wenda Trevathan's (1996) analysis of Mimbres birth scenes.

Alfred Kidder

In Southwestern Archaeology (1924), Alfred Kidder proposed that the shift from small, independent unit pueblos of the Pueblo I period to the large centralized pueblos of the Pueblo period in the San Juan Basin was the result of "hostile pressure" from a "nomadic enemy" (Kidder 1924:126-127). Kidder argued that the need for defense against these hostile nomads forced the abandonment of peripheral settlements and the aggregation of pueblo people into larger house blocks. Kidder contrasted his explanation for pueblo aggregation with another: that aggregation was the result of the progressive desiccation of the southwestern environment, forcing pueblo people into more restricted areas (Hewett et al. 1913; Huntington 1914). In evaluating the two potential explanations, Kidder stated:

To begin with, many of the districts which were shortly abandoned are still among the most favorable as to water supply in the entire Southwest; secondly, many peripheral ruins (as in western Utah and eastern New Mexico) were seemingly deserted at an early time; lastly, the more recent villages are larger, stronger, and occupy more easily defensible sites, than the older ones [Kidder 1924:126].

In Kidder's view, then, his explanation accounted for the observed chronological patterns of site abandonment in the peripheries, the aggregation of pueblo people in the core of the San Juan Basin, and the architectural form of the resulting larger pueblos. The desiccation explanation, in contrast, failed to account for the observed patterns of water availability, the known chronology of pueblo abandonment, or the form and location of subsequent aggregated pueblos. "I stress here, as before, the influence of the nomadic enemy; for this appears to me best to explain the observed facts of Pueblo history" (Kidder 1924:127). Kidder, implicitly following the principles of inference to the best explanation, inferred that the explanation that accounted for the greatest diversity of evidence was more likely to be true. Kidder (1924:128) also recognized that the "question is still an open one," showing that whatever form of reasoning he employed, it must have been inductive. Despite later criticisms of cultural-historical archaeology (e.g., Binford 1962, 1968a; Taylor 1948), Kidder was not simply engaged in description, and the system of reasoning he employed was, and still is, among the best available to practicing archaeologists. The errors of his explanation are not the product of epistemological issues, but rather of the limited archaeological evidence at his disposal.

Lewis Binford

As presented by Lewis Binford (1967), "Smudge Pits and Hide Smoking: The Use of Analogy in Archaeological Reasoning" was intended to illustrate how analogy could be effectively employed within a deductive-nomological approach to archaeological reasoning. Binford used an analysis of a particular type of archaeological feature common in the Eastern United States, smudge pits for hide smoking, to illustrate his argument. Smudge pits are small holes containing burnt material (corncobs, twigs, and other vegetable material), with the upper portion of the pit often covered loosely with soil (Binford 1967:3-4). I argue that rather than exemplifying the deductive-nomological approach, Binford's analysis of smudge pits implicitly employs inference to the

best explanation in much the same way as Kidder did.


Based upon the close correspondence between the ethnographically described smudge pits and the archaeological features, Binford (1967:8) "postulated that the archaeologically-known features were in fact facilities employed in the task of smoking hides by the former occupants of the archaeological sites on which they were found." The key term here was "postulated." Following the deductive-nomological approach of Hempel, "the final judgment of the archaeological reconstruction... must rest with testing through subsidiary hypotheses drawn deductively" (Binford 1967:10). What is odd, however, is that Binford never tested his postulate, nor were his subsidiary hypotheses drawn from valid deductions.

Binford proposed, but never tested, three hypotheses based upon further readings of the ethnographic literature. One of these hypotheses was that "smudge pits should occur in 'base camps' occupied during the period of the year when hunting activity was at a minimum" (Binford 1967:9). This is not a valid deduction. As discussed earlier, in a valid deduction the conclusion must necessarily follow from its premises. In a valid deduction, a single piece of contradictory evidence falsifies the hypothesis. The failure to find smudge pits at a single base camp would be sufficient to negate the hypothesis. Yet, as Binford himself noted (1967:9), the ethnographic literature indicates that hide smoking was performed by specialists. The absence of smudge pits, therefore, might only show that no hide smoking was performed at a particular base camp because the group living there did not include a hide smoking specialist at the time that the base camp was occupied. Thus, if we accept Binford's identification of these features as smudge pits for hide smoking, it cannot be due to his use of the hypothetico-deductive approach advocated by Hempel. His hypotheses were not validly deduced and he never tested them in any case. Rather, Binford seemed satisfied with his results, without testing of any sort, and concluded the article.

Why, then, is Binford's identification of the features as smudge pits so compelling? Embedded within the article are clues to Binford's actual system of reasoning. At the start of the article Binford discussed several different explanations for the smudge pits. He noted that some archaeologists (e.g., Newell and Krieger 1949:248-249) considered them post molds "presumably because of their small size" (Binford 1967:5-6). Other archaeologists (e.g., Cole et al. 1951:156) argued that these features were caches, "in spite of the fact that none of the corncobs had kernels attached" (Binford 1967:5). Finally, Binford noted his own earlier interpretation that the smoke from the smudge pits at one site "might have been employed in the control of mosquitoes" (Binford 1967:4). However, Binford considered this earlier explanation "speculation" based upon the "experience of the excavators" (Binford 1967:4). Though Binford's analysis of the competing explanations was limited, in each case he found the explanation weak (mosquito control and post molds) or contradicted by the existing evidence (caches). In contrast, Binford identified four specific elements in the morphology of the features (size, contents, treatment of contents, and final condition of the feature) that corresponded with the ethnographic accounts of hide smoking (Binford 1967:8). Thus, that explanation that accounted for the most, and the most diverse, evidence was considered best. This, in turn, was considered prima facie evidence for its truth.

Through the use of analogy, Binford constructed an explanation for the function of a particular archaeological feature. In Wylie's terms, he provided a principle of connection that linked the disparate elements of his analogy. Despite his claims to the contrary, the value of his explanation came from the empirical breadth that his explanation subsumed. His hypothesis explained the features better than any others he could think of. This, by itself, was sufficient reason to accept the explanation as likely to be true. Despite Binford's

rhetoric, his style of reasoning was essentially the same as Alfred Kidder's and, as we shall see, Ian Hodder's.


Ian Hodder

In "Interpretive Archaeology and its Role," Ian Hodder (1991) proposed an alternative system of reasoning for archaeological research, one based in the humanist and relativist perspective of postprocessual archaeology. This system was hermeneutics (an expanded discussion of hermeneutics can be found in Hodder 1999; see also Collingwood 1946; Ormiston and Schrift, eds. 1990; Shanks and Hodder 1995; Shanks and Tilley 1987a, 1987b; Thompson 1981). At the heart of the hermeneutic process is the identification of different contexts and an attempt to bring understandings of these different contexts into a broader coherent explanation. One context is that of the archaeologist: the preconceptions, theories, and social values that archaeologists bring to their research. Another is the context of the people who created the archaeological materials being investigated. Hodder argued that this latter context is different from our own as it follows its own rules and logic (see Hodder 1999:51-52). The hermeneutic process consists of circling between these two contexts, making each part fit into a coherent whole, altering our views of our own context and the archaeological context until the whole makes coherent sense.

This movement back-and-forth between "our context" and "their context" is not the only back-and-forth reasoning addressed by hermeneutics (Hodder 1991:8). There are many other potential oppositions that include: the past and the present, specific instances and larger cultural patterns, our culture and their culture, and others (see Hodder 1999:30-65; Shanks and Tilley 1987a). Metaphorically, hermeneutic research can be viewed as a circle or, perhaps, a spiral (Hodder 1999), with each movement back-and-forth altering the original propositions that began each loop.

Among the strengths of the hermeneutic approach is its emphasis on the continuing, dynamic creation of explanations. In this sense, hermeneutics is a somewhat better model for the creative process of explanation building than earlier, more "scientific," epistemologies. These earlier models also allowed for the continuing refinement of explanations, but did not emphasize it with the same zeal. Another strength of hermeneutics is its recognition of the contingent nature of knowledge. While Hodder does not resolve Cartesian skepticism (nor does he try), in a practical sense hermeneutics does seem to more clearly identify and accommodate issues of bias. What hermeneutic circles do not do, however, is provide clear standards for what constitutes good explanations or standards for the rejection of weaker ones. Rather, hermeneutics implicitly relies upon inference to the best explanation to evaluate explanations, in much the same way that Kidder and Binford did.

The reliance of hermeneutics on inference to the

best explanation is clearly illustrated within an

example of the hermeneutic process provided by Hodder (1991:7).

I recently came across a good example of the everyday working of hermeneutic principles while listening to the radio in the United States. I heard a phrase, or thought I did, "it was necessary to indoor suffering." ... I did not see why it should be necessary to suffer indoors, but then I know that North Americans ... are willing to try anything. So initially I understood the term as it sounded to me and assumed that the same word had the same meaning. I then corroborated and adjusted this meaning by placing it in the peculiar and particular rules of North American culture.

Gradually, however, this process of internal evaluation made less and less sense as I continued to listen to the radio program.... Sentences such as "to indoor suffering I took a pain killer" made little sense. I could only make sense of these examples when I hit upon the idea of another component of my understanding of the North American context: North Americans often pronounce words "wrongly." ... I searched and found "endure." Now everything made coherent sense and the whole had been reestablished. The hermeneutic circle had been closed [Hodder 1991:7].

While phrased in terms of the "part-whole relationships" and "meaning," this example appears fundamentally similar to the systems of reasoning discussed above. Hodder (1) encountered something that did not make sense from his pre-existing perspective (following Peirce's understanding of inference to the best explanation), (2) proposed an explanation, (3) compared it with the data, (4) found the explanation unable to account for the full diversity of evidence, (5) developed a new explanation, and (6) accepted it as best due to its ability to account for a broader range of empirical phenomena. Further, this is not simply my interpretation of this example. In the paragraph immediately following this example Hodder states, "[w]e measure our success in this enmeshing of theory and data (our context and their context) in terms of how much of the data is accounted for by our hypotheses" (Hodder 1991:8). As with all the previous examples, the ability of an explanation to account for greater empirical breadth is taken as prima facie evidence for its truth.

Hodder's primary goal in writing the article quoted above was to argue that postprocessual hermeneutics required some degree of "guarded objectivity" (Hodder 1991:8). The value of "guarded objectivity" is that "material culture as excavated by the archaeologist is different from our assumptions because it is organized partly at least according to other cultural rules" (Hodder 1991:12). Thus, at a basic level, Hodder rejects the relativism of earlier postprocessual archaeologists (including his own work). It is important not to overplay this. Hodder continues to argue for the importance of self-reflexive research and for the recognition that, to some degree, archaeologists' pre-existing beliefs will inform their interpretations of the past. The material remains, however, are not amenable to just any interpretation. Some interpretations will be shown wrong through a failure to account for the diversity of evidence that is structured by people in the past. For Hodder, this would allow for feminist and ethnic minority voices to challenge traditional archaeology while allowing for the rejection of bogus claims from the archaeological fringe (Hodder 1991:9).

I see no reason to quarrel with Hodder's claim to an independent objective world beyond the archaeologist. Like Hodder, I would not argue that this world can be objectively shown to exist, but it seems to be a useful starting point for further research. What's more, I doubt any processual archaeologist would have a quarrel either. A belief in an objective world, no matter how guarded, is fairly common. I also accept that archaeologists' pre-existing beliefs will influence their interpretations. The important point for this discussion is the reason why Hodder feels compelled to reassert the need for guarded objectivity. He is trying to reduce the inherent relativism of earlier postprocessual approaches to the past.

The organized material remains have an independence that can confront our taken for granteds. The notion that the data are partly objective is an old one in archaeology, and it was the basis for processual and positivist archaeology. But the trouble with positivist and processual archaeologists was that they did not incorporate hermeneutic and critical insights [Hodder 1991:12].

By Hodder's own words, then, the differences between postprocessual and processual archaeologies are the use of hermeneutics and critical theory. But if hermeneutics relies heavily upon the same epistemological system of evaluating explanations, the only significant differences between processual and postprocessual archaeologies are the greater dynamism of hermeneutics and the use of critical theory. These differences are far less pronounced with processual-plus archaeology.

In the end, it is not Hodder's acceptance of "guarded objectivity" or use of hermeneutic spirals that is doing the work of rejecting inferior interpretations, though they are important. Rather, it is the stated belief that the explanation that accounts for the greatest breadth of material evidence is also the one that is most likely to be true. It is not the external existence of an objective world or hermeneutics that allows for evaluation of explanations, only a straightforward standard of reasoning common throughout archaeology for at least a century. It is inference to the best explanation.

In later works, Hodder greatly expanded his discussion of the criteria used to judge explanations (see Hodder 1999:30-65). In some ways Hodder's criteria mirror those that I propose at the end of this article. Given this, I will address this point when I introduce my own criteria for evaluating explanations toward the end of this article.

Michelle Hegmon and Wenda R. Trevathan

To conclude this review of archaeological uses of inference to the best explanation, I will examine a recent article that falls into the processual-plus approach of American archaeological research. As the term "processual-plus" was coined by Michelle Hegmon (2003), who self-identifies as a practitioner, I will focus on an article she cowrote with Wenda R. Trevathan (Hegmon and Trevathan 1996). In this short article, Hegmon and Trevathan attempted to identify the gender of the people who painted Mimbres black-on-white pottery in the American Southwest during the Classic Mimbres period (A.D. 1000-1150). Through ethnographic analogy with later pueblo societies, some archaeologists argued that women were more likely to have been potters and pot-decorators (Moulard 1984; Shafer 1985), while other archaeologists suggested that men may have been the primary potters, or at least pot-decorators (Brody 1977; Jett and Moyle 1986). Others argued that different stages of pottery production were performed by men and women (Mills 1995; Wright 1991).

Within this context of competing explanations for ceramic production, Hegmon and Trevathan examined the depictions of birth on three Mimbres pots. In each they found that the depiction of birth was radically different from actual birthing practices. First, the babies were facing the same direction as the mothers, rather than backwards as is typical. Second, the babies were depicted being born hands-first rather than headfirst.

Our analysis of a Mimbres birth scene suggests

to us that it was unusual, perhaps impossible,

and therefore probably painted by someone

not familiar with the details of birthing. Thus, we conclude that the scene probably was

painted by a man and that men may have been

the primary painters of Mimbres figurative

designs [Hegmon and Trevathan 1996:752].

Since no woman would depict birth as inaccurately as is found on Mimbres vessels, Hegmon and Trevathan argued, it was more likely that men were pot-decorators. This explanation was further supported through ethnographic accounts from the Southwest stating that men rarely witnessed births. As in the previous example, Hegmon and Trevathan accept their explanation as true based solely upon its ability to explain the depictions of childbirth on Mimbres pottery better than any other explanations they could think of. Hegmon and Trevathan may be wrong, but again that only underscores the inductive nature of their arguments.


In all of the examples discussed above, hypotheses or explanations were accepted or rejected based on the breadth and diversity of evidence that they accounted for. None of them could be called deductions or statistical inductions. As for the use of analogy by Binford, his acceptance of the explanation that smudge pits were used in smoking hides still rested on inference to the best explanation.

These examples illustrate the ubiquity of inference to the best explanation in archaeological reasoning for at least a century. What remains, however, is a discussion of how the recognition that inference to the best explanation is typical of archaeological reasoning can improve archaeological research. If archaeologists are already effectively using inference to the best explanation, why worry about it? To answer this question I must first address another: what does an explanation explain and how does it explain it? This turns out to be a very difficult question, with several different approaches presented in the philosophical literature. Yet, if we are to determine what the best explanation is of any archaeological phenomenon, we must have a clear understanding of what an explanation is in the first place.


There are many different ways that philosophers have come to understand explanation. Far more, in fact, than can possibly be discussed here.12 All seem to have value in certain cases. Even the deductive-nomological system of Hempel (1965, 1966) seems to work in the natural sciences and, to some extent, in those areas of archaeology most closely linked with the natural sciences. In archaeology, explanation has been typically viewed in terms of causation. An alternative way to understand explanation is in terms of contrasting statements: an explanation explains why something occurred in terms of why another did not. As will be discussed below, I see contrastive explanation as a more productive avenue for archaeological reasoning.

Causal explanations

One way of understanding explanation is to conceive of it in terms of cause. An explanation of a phenomenon identifies the cause of that phenomenon. Thus, if an archaeologist wants to explain why there is a concentration of lithic debitage in one portion of a site, he or she might argue that it was created by a lithic workshop. This, of course, does not explain why there was a lithic workshop in the first place. This is the problem of infinite regress. Every causal explanation seems to demand a further explanation ad infinitum. Each of these earlier causes is a part of the causal history of a particular phenomenon. Further, at some point the why questions cannot be answered and the causal history ends with "I don't know." More problematically, it also becomes reasonable to argue that a cause of a high concentration of lithic debitage in one portion of an archaeological site is that some fish crawled onto land 350 million years ago. This

is not what we are looking for in an archaeological explanation. Several different approaches have been used to address the problem of infinite regress.


The most straightforward approach to the problem of infinite regress is to focus solely on proximate causes: the first, or first few, causes in the causal history of a phenomenon. The problem with this approach is that the next few steps of the causal history might be very interesting, perhaps more interesting than the proximate causes. Returning to the lithic concentration example, stating that there is a lithic workshop has several implications for the forms of craft production practiced by the people who lived at the site. It could suggest that one family produced all of the stone tools for people living at that particular site. Alternatively, it could suggest that cooperative labor was employed by the village within a communal workshop. Moving beyond proximate cause in this case could lead to potentially interesting explanations. In practice it seems that archaeologists are not looking for proximate causes but rather interesting and explanatory causes that are "proximatish."

Hempel (1965, 1966) had a different view of what makes a good scientific explanation. Rather than identifying proximate causes, Hempel sought to relate the particulars of a phenomenon to an existing set of universal or probabilistic laws (see Hempel 1966:chapter 5).

[L]aws play an essential role in deductive-nomological explanations. They provide the link by reason of which particular circumstances can serve to explain the occurrence of a given event. . . . The laws required for deductive-nomological explanations share a basic characteristic: they are, as we shall say, statements of universal form [Hempel 1966:54].

For example, an explanation for why a particular object falls, and why it falls at a specific rate, would make reference to the laws of gravity (Hempel 1966:54). The laws of gravity, then, serve to explain the particulars of falling bodies.

The deductive-nomological approach to explanation has been productive and useful in many natural sciences and those areas of archaeology most closely linked to them. However, as discussed earlier, universal laws of human behavior have not been forthcoming in archaeology. In the absence of widely accepted universal laws to reference, the deductive-nomological approach fails to provide a suitable foundation for many, if not most, archaeological explanations.

Channeling Ernst Mayr (1982) and Aristotle, Kent Flannery (1986) proposed an alternative way of understanding explanation in archaeology. Rather than searching out a single proximate cause or a set of universal laws to explain a phenomenon, Flannery argued that explanations in the historical sciences need to address four different types of causes, all first defined by Aristotle. These are: the material cause, the efficient cause, the formal cause, and the final cause. Each of these four types of causes has a different emphasis. For example, "The material cause is that out of which something is made," while "the final cause is that for the sake of which something is made" (Flannery 1986:517, emphasis in original).

Flannery's discussion of causation deserves more time than is available here. In my view, while Aristotle's types of causation are conceptually useful for understanding the variety of causal explanations, they do not provide a way to determine if any one causal explanation is good, bad, or indifferent. The types do not lead to a method for constructing or evaluating explanations about the past. Rather, these types only provide a system that more specifically identifies the ways that different "proximatish" causes can be considered interesting. As for a method to determine which explanations are good, here Flannery relies on Mayr, who in turn seems to rely on inference to the best explanation. According to Mayr (1982:26 as cited by Flannery 1986:513), biologists accept as true "that which is consistent with more, or more compelling, facts than competing hypotheses."

There is another problem with causal explanations. Many explanations do not identify causes; they focus on meaning. For example, I might be reading a book and come across a word with which I am not familiar. I might come up with several different ideas of what the word might mean based on similarities to words I know or the context of the word within the text. I would evaluate these potential meanings and, with luck, come up with an explanation of what the word, as used in the particular text, means. Following the common understanding of cause, it would not appear that this could be described as a causal explanation. The processual obsession with causal explanations was, in my view, correctly critiqued by postprocessual archaeologists. They stress the value of interpretive archaeology as a corrective (Hodder 1991; Shanks and Hodder 1995). As I will discuss below, I see interpretation and explanation as fundamentally similar enterprises. Within the understanding of explanation presented below, explanations of both cause and meaning are possible.

Contrastive Explanations

There is another way to understand explanation, one that is favored here. Explanations can be understood in terms of contrastive pairings (Lipton 1991:chapter 5). Asking why there is a concentration of lithic debitage in one portion of a site requires additional statements before it can be explained. It requires a foil. A foil is a counterpoint to the explanation being searched for. The lithic workshop explanation will need to address why there is a concentration of lithic debitage in one particular area of the site rather than an even scatter of debitage across the entire site. With the addition of the foil, the problem of infinite regress is limited. It no longer makes any sense to explain this through calls to 350-million-year-old fish. The fish explanation does not account for why lithic debitage would be concentrated rather than diffuse. In contrast, a lithic workshop would likely cause a concentrated scatter of lithic debitage and not cause an even scatter of debitage across the entire site. When viewed in terms of the contrastive pairing, a lithic workshop explains the observed pattern of lithic distribution and simultaneously refutes the foil. Admittedly, there are also numerous other explanations that could account for a concentration of lithic debitage within a single portion of a site (e.g., erosion patterns). This illustrates an important limitation of contrastive explanation: it only evaluates the stated explanations. It does not remove the potential for other, unknown explanations of the same phenomenon. Once more, however, this is typical of all inductive arguments.

Foils serve to focus explanations. While there are many potential explanations, not all of them will serve to explain why one specific thing occurred and another specific thing did not.13 Foils have another valuable feature: they can be changed. Suppose I rephrase the initial question to, "why is there a concentration of lithics in one particular area of the site rather than a concentration in a different area of the site?" Here the previous workshop explanation does not explain why that workshop was placed in one part of the site and not another. We might argue, employing some other evidence in the process, that the workshop was placed within a family compound because it was a family-controlled workshop. The explanation has changed as a result of the change in foils. There is one important point here. The explanation from the previous foil is still there; this concentration is still considered a lithic workshop. Explanations using different foils can supplement, even strengthen, one another. As will be discussed below, explanations that are consistent with multiple foils are, in general, better than those that only address one.

As presented by Lipton (1991:chapter 5), contrastive explanations are a form of causal explanation. I agree that contrastive explanations are often also causal explanations. I do not agree, however, that contrastive explanations are always causal explanations. Contrastive explanations and the use of foils are equally valuable in examinations of meaning. In the previous example of the meaning of an unknown word, it would make sense to seek to explain that the word means x rather than y. Thus, the use of contrastive foils is not limited to causal explanations, but applicable to interpretations of meaning as well.

Most archaeological explanations employ foils implicitly, but some do so explicitly as well. Returning to the archaeological examples discussed above, Kidder contrasted his "hostile nomad" explanation with a desiccation explanation of pueblo aggregation. Binford contrasted his interpretation of smudge pits for hide smoking with explanations that labeled these features post molds, caches, or smudge pits used for the control of mosquitoes. Hodder contrasted his interpretation of the word "endure" with his earlier understanding of "indoor." Finally, Hegmon and Trevathan contrasted their explanation that men painted the figurative designs on Mimbres pottery with explanations that claimed women did. In each case, the archaeologists used a foil to strengthen and direct their explanations.

At first glance, it might seem that I advocate contrasting any two explanations for a specific phenomenon to determine which is best. In some cases, in fact, this is exactly what I advocate. However, there is an important caveat. The fact and the foil must be mutually exclusive. For example, it is not possible to have a lithic scatter that is both concentrated and diffuse. However, the question, "why is there a concentration of lithic debitage in one particular area of the site rather than a high frequency of scrapers within the lithic assemblage?" will not produce a contrastive explanation. Typically, explanations must address the same foil in order to determine which is better. For example, inference to the best explanation could determine whether erosion or the presence of a lithic workshop is a better explanation for why a lithic scatter is concentrated rather than diffuse.

I am not proposing that archaeologists adopt foils in their archaeological explanations. Rather, I am suggesting that it is already the implicit practice of most archaeologists. By making our foils explicit, we can greatly improve our explanations and clarify the muddied debates that seem to pervade our discipline. Many of the explanations that are alleged to contradict each other are simply addressing different foils. The use of different foils also allows for specific archaeological phenomena to be explained in several different ways, providing a justification for postprocessual claims concerning the multivocality of archaeological data.


Seven Traits of Highly Successful Explanations

So far, this article has focused on presenting what inference to the best explanation is and discussing its strengths and weaknesses in terms of good archaeological research. The question remains, however, if archaeologists are already employing inference to the best explanation, why should they spend time reflecting on it? The remainder of this article seeks to answer this question. I conclude the article with a discussion of the role of hypothesis testing in archaeology in the light of inference to the best explanation. But first I present a set of criteria that makes explicit the implicit criteria archaeologists typically employ when evaluating explanations.


What, then, makes an explanation a good explanation, and what makes one more compelling than another? Philosophers have examined these questions for a long time, developing sophisticated understandings of good explanations. Quine and Ullian (1978:43) refer to "five virtues which count toward the plausibility [of a hypothesis]." These virtues are: generality, modesty, refutability, conservatism, and simplicity. To Quine and Ullian's virtues, I add two more: that good explanations should be empirically broad and should address multiple foils.

In some ways, these criteria are similar to those presented by Hodder within a discussion of evaluating the fit of an interpretation (1999:59-62). His criteria include: internal and external coherence, correspondence, fruitfulness, and simplicity. There is overlap between my criteria and Hodder's, but there are also significant differences in emphasis. Most importantly, I assert that these criteria should be used (and often are used) by both processual and postprocessual archaeologists. Hodder limits some, but not all, of his criteria to postprocessual archaeologists alone.

Before proceeding, I must note a critical limitation of these standards. These characteristics of good explanations are taken from Western philosophers and are clearly derived from a Western intellectual tradition. As with any system of explanation, political, social, historical, and other factors affect what explanations individual archaeologists determine are best. Further, based upon the constructivist arguments that underlie postprocessualism, there is no epistemologically consistent system to overcome these biases and achieve objective truth. This does not mean, however, that these biases cannot be addressed in several practical ways.

Just as I have argued that archaeologists should not judge the success of an explanation by its ability to establish irrefutable truths, the standards for judging bias should not be absolute either. In this light, archaeologists have long been using a variety of practical methods to address bias. Some of these methods deal with specific issues (e.g., the tendency to round numbers to 5 or 0), while other methods have far broader significance (e.g., the method of multiple working hypotheses [Chamberlain 1965]). Hermeneutics, among its other uses, can also serve as a practical method to address bias. These methods for the reduction of bias can and do fit within the perspective of inference to the best explanation. Ideally, the following list of traits that characterize compelling explanations would include methods for the reduction of bias, but there is simply no way to do justice to the topic without doubling the length of this already long article. While valuable, a discussion of bias must be postponed to another day.

The following standards that characterize compelling explanations are not absolute or distinct. They blend at the edges and, at times, conflict with each other. These are guides to good reasoning, not absolute prescriptions. The evaluation of any explanation has an impressionistic element that can never be fully removed. Any specific explanation may be strong in some elements and weak in others. Since even the best hypothesis might not be good enough to warrant serious consideration, the following standards can also be used to determine if the best explanation is also a good explanation. Finally, I am not arguing that these standards are complete. Just as I have added a new standard to Quine and Ullian's list, I hope that other archaeologists will add to those I present here.

Empirical Breadth

I begin with a virtue not explicitly mentioned by Quine and Ullian, though clearly implied throughout their discussion: a good explanation should address a wide variety of observations or evidence. A good explanation should explain many empirical observations and not be contradicted by others.



explanation to account for large numbers of highly similar phenomena. This understanding of breadth is used in evaluating statistical inductions and inferences to the best explanation. Alternatively, breadth can be measured by the diversity of phenomena explained. In the terms more familiar to archaeologists, the best explanation should address multiple lines of evidence. Only inference to the best explanation places value on this type of breadth.

The value of multiple lines of evidence in terms of inference to the best explanation is fairly straightforward. In theory, there are infinite numbers of explanations for any quantity of diverse observations. In practice, however, explanations that employ multiple lines of evidence are hard to come by, and stronger for it. As the quantity of evidence subsumed within the explanation increases, the number of potential explanations for that evidence decreases. The relative strength of any one explanation, therefore, is believed to increase as the quantity and variety of the evidence it encompasses increases.

The value placed on diverse evidence in the evaluation of inferences to the best explanation stands in contrast to statistical induction. As discussed earlier, statistical inductions are weakened by the addition of premises. This is due to the problem of multiplying error. Inference to the best explanation, when evaluated in terms of the diversity of its empirical breadth, provides justification for the importance archaeologists already place on multiple lines of evidence in archaeological reasoning.


Generality

A good explanation should also be applicable to a wide variety of phenomena. This is similar to the idea of empirical breadth, but there is a difference in emphasis. Here the measure is not that one specific set of empirical observations employs a wide variety of evidence, but rather that the same kind of explanation can be employed in a wide variety of cases. The strength of the concept of biological evolution is not simply judged on its ability to explain the diversity of finches on the Galapagos Islands, but on its broad application to biological phenomena in general. By these standards biological evolution is a very general explanation. Marxism, in its various guises, is a general explanation in the social sciences.


Modesty

With all this talk of generality and breadth, the value of modesty might seem an odd addition to this list. It is important, however, not to overreach. I am sure we can all think of examples of someone who has a pretty good explanation for some phenomena, but applies it to everything. This standard is a check on the previous standards. Don't try to explain too much.


Refutability

For good reason, people distrust explanations that cannot be shown to be wrong. Irrefutable explanations come in two general types. In the first, the explanation itself may defy falsification. Some creationists, for example, have argued that God has placed fossils in the ground as a test of scientists' faith. Mere humans will always lose a battle of wits with an omnipotent being. In the second case, an explanation might be refutable but the evidence to refute it is, in a practical sense, impossible to acquire. For example, falsification might require an experiment that would take 10,000 years to complete. Here the explanation is technically falsifiable but practically unfalsifiable. In the first case, the inability of the explanation to be refuted would be sufficient grounds for outright rejection; in the second it would only weigh heavily against the explanation.


Conservatism

Good explanations should also be conservative. They should not throw out well-established explanations or principles on a whim. At times, of course, new explanations will replace well-established ones. However, the standards for acceptance of these explanations will be higher than for those that are conservative.


Simplicity

This standard is similar to conservatism. Simplicity has long been recognized as a virtue in explanation (e.g., Occam's Razor). In general, a simple explanation for a particular phenomenon is accepted over a complex one, all other things being equal. Explanations should not create laws, universal principles, or similar concepts that are not needed, even where they do not conflict with well-established principles. At times, some explanations turn out to be very complex. This is okay. The standard of simplicity does not argue that every explanation must be simple, only that archaeologists should not complicate their explanations any more than is necessary.

Multiplicity of foils

To Quine and Ullian's set of virtues I add another, derived from Lipton's (1991:chapter 5) discussion of foils. The more foils accounted for by an explanation, the better the explanation. To my mind, this standard for assessing an explanation is almost as important as an evaluation of its empirical breadth. When discussing foils above, I presented what might be termed specific foils. For example, why is lithic debitage concentrated on one part of a specific archaeological site rather than another part of that site? Here it might be best to think of foils in a more general sense. Why do some phenomena occur at one place rather than another? Why do some phenomena occur at one time rather than another? Why do some phenomena take a certain form rather than another? Through simple word substitution, a whole variety of different foils can be created, and then applied to specific cases. As foils are explained, there will be a corresponding increase in the strength of the argument. Phrased more simply, an explanation that can account for both where and when a particular event occurred is usually better than an explanation that only accounts for when a particular event occurred.


On balance, those explanations that are empirically broad, general, modest, conservative, simple, refutable, and that address many foils are more compelling than ones that lack these traits. Based on past experience with other explanations, it is inferred that more compelling explanations are also more likely to be true. These standards for evaluating the strength of an explanation are exceptionally good at comparing the relative merits of different explanations of the same phenomena. They allow an archaeologist to say that one explanation is better than another. However, when engaging in inference to the best explanation, it must be remembered that the best of several explanations could still be awful.

The same standards used to determine if an explanation is best can also be used to determine if an explanation is sufficiently good as to merit serious consideration. When an explanation is particularly good or bad, the standards provide strong guidance. However, there will always be those explanations that lie in the middle. While agreeing on all the evidentiary and argumentative particulars of a middling explanation, two archaeologists might differ on whether it is good or bad. We should not trouble ourselves too much with this. In either case, it would make sense either to improve the explanation or to find another that is better. We should also recognize that the standards listed above often conflict with one another (e.g., modesty and generality). The judgments we make on explanations are impressionistic, based upon our past experiences with other explanations.

Multiple Explanations and Testing

As noted by Harman (1965, 1968) and Lipton (1991), many inferences to the best explanation are sufficiently strong as to be effectively unchallengeable. They are accepted as true without any further testing or examination. It seems likely that Binford never tested his assertion that smudge pits were used for hide smoking because he believed his initial explanation was already sufficiently strong. Even the most seemingly certain explanations could potentially be wrong, but testing and further examination can quickly reach a point of diminishing returns. Often, however, an explanation is not sufficiently strong and requires further investigations before it can be accepted. In these cases rigorous testing is a reasonable and effective system to evaluate an explanation. When explanations are of the appropriate form to construct hypotheses and valid deductions, the hypothetico-deductive method is a powerful approach for testing explanations. In other cases, testing can employ inference to the best explanation.

In contrast to deductive tests where hypotheses can only be rejected or confirmed, when using inference to the best explanation to perform tests, explanations are made more or less compelling. With a favorable test result the explanation is made more compelling by increasing its empirical breadth. With an unfavorable test, the empirical breadth of the explanation is reduced through the identification of a negative result. A larger proportion of the observed data now contradicts the explanation than before the test was performed.14 Another possibility, one that is surprisingly common, is that the archaeological remains uncovered are unrelated to the original research questions. In this case, no change in the strength of the explanation occurs directly. However, the new information might suggest a new explanation, one that accounts for both the previous evidence and the new material in a particularly robust way. In this case, the new explanation would be accepted since it would be more compelling than the previous one. It is in this light that Binford's (1967) "deductively drawn" hypotheses concerning smudge pits can be understood. Binford's hypotheses, if tested and confirmed, could have served to further expand the empirical breadth of his explanation even though his hypotheses were not deductively drawn. This in turn would have made his hide smoking explanation even more compelling.

Inference to the best explanation accounts for several elements of testing that are not supported within the hypothetico-deductive method. First, it allows for testing of explanations that do not lend themselves to deductively drawn hypotheses. Second, while testing of this sort does not clearly confirm or negate a hypothesis, it does expand or reduce the empirical breadth of an explanation. This in turn makes an explanation more or less compelling. Third, testing through inference to the best explanation allows for the investigation of unique archaeological phenomena. Tests can be designed to expand the diversity of empirical evidence concerning rare or unique archaeological phenomena. For example, an archaeologist could test an explanation against previously unexamined aspects of the Pyramids of Giza. Fourth, this view of testing allows for the relative success of a test to be gauged. In the hypothetico-deductive framework, hypotheses are either confirmed or denied. In reality, archaeologists have all seen test results that are more equivocal. Finally, testing in this light resembles what archaeologists actually do, whether processual, postprocessual, or something in between.

When reading the philosophical literature on inference to the best explanation, it sometimes feels as if philosophers assume that in most cases a single clear explanation will become evident, that the cream will inevitably rise to the top. In my own experience I am not so sure. It seems that the hard cases are remarkably common. Many archaeological phenomena have multiple explanations that are either of equal quality or oddly unrelated. In practice, postprocessualists seem right when they suggest that multiple perspectives can be brought to archaeological explanations. In terms of inference to the best explanation, there are two ways in which this can occur. In the first, two or more good explanations of the same archaeological phenomena that address the same foil are of similar quality. Different archaeologists might lean toward one or another, but neither of the potential explanations is clearly the best. In the second case, good explanations of the same archaeological phenomena address different foils. In this case there is no direct link between the explanations. They are more like ships passing in the night. For example, one explanation might explain why a certain archaeological phenomenon occurred at one time rather than another. Another explanation might explain why the same phenomenon occurred in one place rather than another.

These two sources for multiple explanations require different strategies to accommodate them. Where two explanations have different foils, no amount of testing will show which one is best. It would be possible to rank the explanations in terms of how well they satisfy explanatory standards, but since they do not share the same foil, there is no reason to reject the weaker of the two arguments as long as it is at least a good explanation and does not substantially contradict the stronger.15 There are two strategies to employ in this situation. First, be happy that you have two good explanations and use them for your research appropriately given the foil you are investigating. It's often hard to find even one good explanation, so count your blessings. Second, think up a better inference to the best explanation that accommodates the differing foils. This may entail synthesizing the two explanations, or replacing them with a wholly different, all-encompassing explanation.

When multiple explanations share the same foil, testing is a very practical way to evaluate them. One strategy would be to start testing each explanation independently and seeing which one winds up with the most empirical breadth. A better strategy, however, would be a test that served to contrast the explanations. The test should not only make one of the explanations more compelling, but also simultaneously make the others less so.

Even with our best efforts, multiple explanations for the same archaeological questions will likely persist in archaeology. In this sense, postprocessual archaeologists are right. There are multiple explanations of specific archaeological phenomena. Further, all of these different explanations have value to the extent that they are good explanations. One of the primary criticisms of postprocessual archaeology is that it does not have a mechanism to refute terrible explanations of past events (Earle and Preucel 1987). Inference to the best explanation, particularly when paired with contrastive explanations, provides a mechanism for dealing with this problem. Bad explanations can be rejected by reference to the standards of empirical breadth, generality, modesty, etc. Good explanations of equal worth when addressing the same foils, or even good explanations of different worth when addressing separate foils, can be accounted for.


Both hermeneutics and the hypothetico-deductive method are good approximations of inference to the best explanation and have value in archaeological research, but both also employ philosophical gymnastics that conceal the similarities between them. These similarities in reasoning serve to explain, in part, one of the issues that began this paper: how do processual and postprocessual archaeologists productively borrow data and ideas from each other despite the differences between them? By sharing a common form of reasoning, and common methods for the evaluation of competing explanations, processual and postprocessual approaches are not as different from each other as either group typically assumes.

This article is not a rallying cry for the status quo. Accepting that inference to the best explanation underlies a great deal of archaeological reasoning demands modifications to the practice of, and discourse about, archaeological research. Discussions of the relative worth of different explanations should more explicitly employ the standards that make an argument compelling (empirical breadth, generality, modesty, conservatism, simplicity, refutability, and addressing many foils). If testing an explanation, archaeologists must be clear about whether they are using deductive tests or inductive tests based on inferences to the best explanation. In the former, the results can only negate or, in some philosophical formulations, confirm hypotheses. More commonly, archaeological testing will only make explanations more or less compelling. Finally, by employing inference to the best explanation, archaeologists can explore unique or rare phenomena that defy investigation through deduction or statistical induction.

Throughout this article I have relied on a fairly simple approach. I have focused on what archaeologists do rather than on what they say they do.16 When viewed in this light, archaeologists have, for a long time, relied upon a style of reasoning well suited to their goals: inference to the best explanation. Despite several attempts to divert archaeologists' epistemological interest elsewhere, inference to the best explanation has persisted as a dominant form of reasoning in archaeology for a very simple reason: it works. Despite the rapidly changing social theories employed in the development of archaeological explanations, inference to the best explanation has continued to be the primary means for determining the value of individual explanations. For this reason, I am not proposing that archaeologists adopt inference to the best explanation. I only suggest that archaeologists should accept as worthwhile a style of reasoning they are already using and do a better job of using it.

Acknowledgments. For over a decade now, my father (Robert Fogelin, Professor Emeritus of Philosophy at Dartmouth College) and I have been talking about philosophy in one way or another. This article is my response to our discussions. Throughout the writing process he has continued to help, pointing out philosophical blunders small and large. Though I have relied on his assistance, there are several ways in which my father would disagree with what is presented here. For his advice and help, in things far more important than philosophy, I dedicate this article to him. I thank Laura Villamil for translating the abstract into Spanish. I also thank the people who read and commented on the previous drafts of this article, including: Andy Balkansky, Florence Fogelin, Severin Fowles, Jane Kelley, Scott Hutson, Alice Ritscherle, Norm Yoffee, and two anonymous reviewers. The article is far better for their insights. This article was written while serving as the Visiting Scholar at the Center for Archaeological Investigations at SIU Carbondale.


1. Throughout this article I treat processual and postprocessual archaeology as more theoretically unified than they actually are or were. I recognize that there is a great deal of diversity in the approaches of specific archaeologists associated with these different schools. The caricatures I employ are only intended to reduce hesitations and subclauses that would only serve to distract from the main thrust of this article.


2. Throughout this article I cite Alison Wylie's Thinking

from Things (2002). This work contains her collected articles,

spanning the years since 1982.

3. Throughout this article I primarily cite and quote Carl Hempel (1965, 1966) in my discussions of the hypothetico-deductive method. However, Hempel's primary contribution was the characterization of scientific explanations as requiring universal or probabilistic laws (the deductive-nomological approach to explanation). For the most part, Hempel's views on the hypothetico-deductive method follow those of his contemporaries.

4. A similar definition can be found in The Oxford Companion to Philosophy (Honderich, ed. 1995:181), which states that "[a deduction is] a species of argument or inference where from a given set of premises the conclusion must follow."


5. Hempel (1966:11) notes the same difference between inductive and deductive arguments: "The premises of an inductive inference are often said to imply a conclusion only with more or less high probability, whereas the premises of a deductive inference imply the conclusion with certainty."

6. Astronomical predictions, for example, are usually

inferential. Based upon previous observations of an asteroid's

location and some general laws of gravity, astronomers can

infer its future path. At first glance, this may seem a deduction. However, the conclusion is not the necessary product of its premises. For example, the asteroid may be hit by a comet or influenced by the gravity of an unknown object. In either case, all of the premises of the induction would still be strong, but the asteroid would not follow the predicted path. Astronomical observations of this sort, then, are a form of

inductive reasoning.

7. In contrast to inductions, deductions are non-ampliative. The conclusion of a valid deduction contains no more information than what is contained within its premises.

8. Harman (1965, 1968b) argues that even statistical

inductions rest upon a foundation of inference to the best

explanation. This is an exceedingly technical discussion that

need not be addressed here.

9. This argument concerning the limitations of statistical inference follows closely that of Harman (1965).

10. Harman (1965:89) notes, "There is, of course, a problem about how one is to judge that one hypothesis is sufficiently better than another hypothesis. Presumably such a

judgment will be based on consideration such as which

hypothesis is simpler, which is more plausible, which

explains more, which is less ad hoc, and so forth. I do not

wish to deny that there is a problem about explaining the

exact nature of these considerations; I will not, however, say

anything more about this problem."

11. Lipton (1991:chapter 7) contrasts the term "likeliness" with the term "loveliness." I find "lovely" to be problematic as a general term for describing the strength of an explanation. For this reason, I use the term "compelling" instead. Other than this shift in terms, my argument follows closely that of Lipton.

12. See, for example, John Stuart Mill's (1904) methods

for the identification of causes or Philip Kitcher's (1981)

analysis of explanation as a demonstration of unity, among

many others.

13. In a recent discussion of analogy in the study of

ancient states, Yoffee (2005:194) notes, "The search for

appropriate comparisons among the earliest states is a relatively new enterprise in social evolutionary theory, and it's important that the project's goals include not only explanations of why things happened as they did, but also why they didn't happen some other way." While phrased in terms of

analogical arguments, Yoffee's suggestion fits neatly with the

concepts of foils and contrastive explanations presented here.

14. The same logic explains why particularly strong

explanations do not seem to require further testing and why, at a certain point, progressive loops in hermeneutic circles

become less productive. Where an explanation encompasses a particularly large amount of empirical evidence, contradictory evidence gained from testing will not affect the proportion of supporting evidence for an explanation to any significant degree.

15. It is likely that if two explanations contradict each

other they share at least one foil.

16. Hodder (1999:30-65) also claims to examine what

archaeologists do rather than what they say they do. I would

argue, however, that Hodder compares what postprocessual

archaeologists do with what processual archaeologists say

they do. As exemplified by Binford's (1967) "Smudge Pits," the actual reasoning practices of processual and postprocessual archaeologists are remarkably similar.

Received May 18, 2006; Revised October 5, 2006;

Accepted February 14, 2007.