UNDERDETERMINATION AND INDIRECT MEASUREMENT
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF PHILOSOPHY
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Teru Miyake
June 2011
http://creativecommons.org/licenses/by-nc/3.0/us/
This dissertation is online at: http://purl.stanford.edu/cs884mb1574
© 2011 by Teru Miyake. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Michael Friedman, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Helen Longino
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Patrick Suppes
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
George Smith
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Abstract
We have been astonishingly successful in gathering knowledge about certain
objects or systems to which we seemingly have extremely limited access. Perhaps the
most difficult problem in the investigation of such systems is that our theories of them
are radically underdetermined by the available observations. What are the methods
through which these cases of underdetermination are resolved?
I argue in Chapter 1 that these methods are best understood by thinking of what
scientists are doing as gaining access to the previously inaccessible parts of these
systems through a series of indirect measurements. I then discuss two central problems
with such indirect measurements, theory mediation and the combining of effects, and
ways in which these difficulties can be dealt with.
In chapter 2, I examine the indirect measurement of planetary distances in the
solar system in the sixteenth and seventeenth centuries by Copernicus and Kepler. In
this case, there was an underdetermination between three different theories about the
motions of the planets, which could be partly resolved by the measurement of distances
between the planets. The measurement of these distances was enabled by making
certain assumptions about the motions of the planets. I argue that part of the
justification for making these assumptions comes from decompositional success in
playing off measurements of the earth's orbit and the orbit of Mars against each other.
In chapter 3, I examine the indirect measurement of mechanical properties such
as mass and forces in the solar system by Newton. In this case, there were two
underdeterminations, the first an underdetermination between two theories about the true
motion of the sun and the earth, and the second an underdetermination between various
theories for calculating planetary orbits. Newton resolved these two problems of
underdetermination through a research program in which the various sources of force
were identified and accounted for. This program crucially required the third law of
motion to apply between celestial objects, an assumption for which Newton was criticized by his
contemporaries. I examine the justification for the application of the third law of motion
through its successful use for decomposition of forces in the solar system in a long-term
research program. I further discuss comments by Kant on the role of the third law of
motion for Newton, in which Kant recognizes its indispensability for a long-term
program for determining the center of mass of the solar system and thus defining a
reference point relative to which forces can be identified.
Chapter 4 covers the indirect measurement of density in the earth's interior using
observations of seismic waves. One of the difficult problems in this case is that we can
think of the interior density of the earth as a continuous function of radius—in order to
determine this density function, you are in effect making a measurement of an infinite
number of points. The natural question to ask here is how much resolution the
observations give you. I will focus on the work of geophysicists who were concerned
with this problem, out of which a standard model for the earth's density was developed.
Acknowledgments
I am incredibly lucky to have been able to take two extraordinary seminars in
which the seeds for the ideas set forth in this dissertation were sown. The first is a
seminar on Newton's Principia that George Smith taught at Tufts University that I took
when I was an MA student. George's unwavering attention to the details that make a
difference, his way of identifying and trying to answer truly deep and interesting
questions about science, and above all his kindness and dedication to his students, all
made a deep impression on me. I sat in on this seminar again when George taught a
version of it when he visited Stanford University a few years later. I would like to sit in
on it many more times if I could—I'm sure I would get more out of it every time.
The one other seminar that made a similarly deep impression on me was Michael
Friedman's seminar on Kant's Metaphysical Foundations of Natural Science that I took
at Stanford. I found Michael to be a thinker of a completely different sort from George,
but I also saw a very similar uncompromising attitude with regard to the study of Kant
and the sciences of his time, and Michael‘s warm personality made it easy for me to
work with him as my advisor at Stanford. George and Michael are a pair of mentors
who, each in his own unique way, sets the highest standards in his area of research. I
only hope my own work could approach those standards someday.
The rest of the dissertation committee is no less distinguished. Pat Suppes is, of
course, in a league of his own. When I first talked to Pat, I have to admit that it was
with a mixture of awe and apprehension, but I grew to really enjoy walking out to visit
him at Ventura Hall. Helen Longino was always very helpful and encouraging, even
during a very busy stint as department chair. Tom Ryckman was not an official member
of the committee, but he was certainly a committee member in my eyes. I have had
countless discussions with him about the topics covered in this dissertation, and he was
the most dependable source of advice and support during my years at Stanford.
As I have already mentioned, I got my MA in philosophy at Tufts University,
and besides George I would like to thank Dan Dennett, Jody Azzouni, Kathrin Koslicki,
David Denby, and the members of my cohort. At Stanford, I would like to thank the
following faculty: Brian Skyrms, David Hills, Krista Lawlor, Lanier Anderson, Chris
Bobonich, Mark Crimmins, Nadeem Hussain, Marc Pauly, John Perry, and Dagfinn
Follesdal. Grad students and visiting scholars who have contributed to the development
of the ideas in this dissertation include Quayshawn Spencer, Angela Potochnik, Joel
Velasco, Alistair Isaac, Johanna Wolff, Tomohiro Hoshi, Sally Riordan, Ben Wolfson,
Dan Halliday, Danny Elstein, Shawn Burns, Micah Lewin, and Samuel Kahn. Part of
this dissertation was given as a talk at the UC Irvine LPS department, and I thank the
audience for their comments, and Jeff Barrett and Kyle Stanford in particular for their
hospitality.
I wrote much of this dissertation at the Max Planck Institute for the History of
Science in Berlin, where I was a Predoctoral Fellow. The Max Planck Institute provided
a perfect environment for writing this dissertation, and I would especially like to thank
Raine Daston and the scholars in Department II. Financial support for the years during
which I was working on this dissertation was provided by the Whiting Foundation and
the Ric Weiland Fellowship. In addition, I am proud to say that I was the very first Pat
Suppes Fellow at Stanford, for which I would like to thank Pat a second time.
I could not have had better preparation for the work I had to do for this
dissertation than my undergraduate experience at Caltech. I want to thank all of my
friends throughout those four very tough but ultimately rewarding years.
Finally, all the members of my family know that the roots of my philosophical
education began with long arguments over pretty much anything with my twin brother
Kay. I would like to thank Dad, Mom, Yochan, June, and Kay for their support.
Table of Contents
Chapter 1: Underdetermination and Indirect Measurement
Chapter 2: Copernicus, Kepler, and Decomposition
Chapter 3: Newton and Kant on the Third Law of Motion
Chapter 4: Underdetermination in the Indirect Measurement of the Density Distribution of the Earth's Interior
Epilogue
Bibliography
Chapter 1: Underdetermination and Indirect Measurement
1 Prelude
Suppose one day archeologists unearth a mysterious artifact—a perfect black
cube, 10 centimeters on a side, cool to the touch, made of what looks like the blackest
possible steel. They decide, rather unimaginatively, to call the artifact "Cube". It's a
mere curiosity at first, but scientists soon find that it has some mystifying features. The
material it is made out of is incredibly hard—it cannot be broken, cut, pierced, drilled, or
dynamited. It cannot even be scraped in order to take samples of the material. All
attempts to take CAT scans or MRI images of the inside of Cube have failed. On one
face are several white dots that look as if they are projected onto the face from within
Cube. The dots move across the face of Cube, tracing out trajectories over time.
Now suppose we are scientists trying to figure out what is going on inside Cube.
We will find, unfortunately, that our options are severely limited, since we have found
no way of accessing the interior of Cube. What do we do? Perhaps the only thing to do
is simply to assume that there are certain lawful connections between the internal and
external states of Cube, that is, the dynamics of the external states depends somehow
upon the dynamics of the internal states. We then make hypotheses about (a) the
dynamics of the internal states of Cube, and (b) the laws that connect the internal to the
external states of Cube. From these hypotheses, we deduce predictions about the
dynamics of the external states. If those predictions match our observations of the
external states, we say that those hypotheses have been confirmed. This method, the
hypothetico-deductive method, was described by Pierre Duhem in The Aim and Structure
of Physical Theory (1954) as being the method of physics, and it has been widely
adopted by philosophers, most notably Quine.
There is a problem with this method, though, as Duhem recognized. Since we
have no antecedent knowledge whatsoever about the internal states of Cube, there is
enormous leeway in the hypotheses we can come up with. For any given dynamics of
the external states of Cube, there will be many different sets of hypotheses that are
consistent with those dynamics. In philosophical parlance, our theory of the internal
states of Cube is massively underdetermined by our observation of the external states.
Because of this underdetermination, the mere agreement of predictions about the
dynamics of the external states of Cube with actual observations gives us little reason to
think that the hypotheses from which those predictions were deduced have, in any way,
characterized the true internal states of Cube. Faced with this predicament, we might
give up on the idea that we can gain any knowledge at all about the internal states of
Cube, and instead become instrumentalists. We change our aim to simply predicting the
dynamics of the external states of Cube without making any claim to having any
knowledge about the internal states.
2 Resolving underdetermination
According to one way of thinking about the methodology of planetary astronomy
in the sixteenth century, planetary astronomers were in a position very much like that
of the scientists studying Cube. All of our knowledge about the solar system
came from the observation of the motions of the planets as they moved across the night
sky. More specifically, we can think of ourselves as being located inside an immense,
hollow, black sphere, on the inner surface of which the constellations are painted. We
can then determine the positions of the planets on this sphere, as seen from the earth, and
thus record their apparent motions over time. We cannot, however, know how far away
a planet is from us merely by looking at it. So we are, in effect, looking at the two-
dimensional projection, onto the celestial sphere, of the actual three-dimensional
motions of the planets through space. Moreover, although we did not know for sure in
the sixteenth century, we are observing these motions from a platform, the earth, that is
itself moving.
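To see concretely how distance information is lost in this projection, here is a minimal sketch (the coordinates are invented for illustration): two planets lying in the same direction from the observer, but at different distances, produce exactly the same apparent position on the celestial sphere.

```python
import math

def apparent_direction(observer, planet):
    # Project a 3-D position onto the celestial sphere as seen from the
    # observer: keep only the direction (a unit vector) and discard the
    # distance, which is all naked-eye observation of a planet gives us.
    dx = [p - o for p, o in zip(planet, observer)]
    r = math.sqrt(sum(d * d for d in dx))
    return tuple(d / r for d in dx)

observer = (0.0, 0.0, 0.0)
near = (1.0, 2.0, 0.5)   # hypothetical planet
far = (3.0, 6.0, 1.5)    # same direction, three times as distant

d1 = apparent_direction(observer, near)
d2 = apparent_direction(observer, far)
# The two apparent positions agree to within rounding error, so the
# observations alone cannot distinguish the two configurations:
same = max(abs(a - b) for a, b in zip(d1, d2)) < 1e-12
```

Radically different three-dimensional arrangements are thus compatible with identical two-dimensional observations, which is the underdetermination in miniature.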
Drawing out the analogy with the story of Cube, we can think of the apparent
motions of the planets as corresponding to the external states of Cube, while the actual
three-dimensional motions correspond to the internal states. Like the scientists studying
Cube, astronomers in the sixteenth century faced a problem of radical
underdetermination. Famously, the apparent motions of the planets across the night sky
were compatible with three different theories of the actual motions of the planets—the
Ptolemaic, the Copernican, and the Tychonic theories1—in which the actual three-
dimensional motions of the planets are radically different from each other. This is a
1 I will describe these theories in more detail in chapter 2.
classic situation of underdetermination. There were three radically different theories that
could all be made to fit the observations then available to about the same degree of
precision. At the end of the sixteenth century, some astronomers, such as a
contemporary of Kepler's called Ursus, came to conclusions similar to those I discussed
above about Cube.2 They decided that the aim of planetary astronomy should not be
about acquiring knowledge about the actual motions of the planets at all. Instead, the
aim of planetary astronomy should simply be to provide a convenient way of calculating
the apparent motions of the planets.
How was this state of underdetermination eventually resolved? Well, suppose
the method of astronomy is, like for Cube, hypothetico-deductive. You make
hypotheses, deduce the observable consequences of these hypotheses, and then you
compare these consequences with actual observations. Since the problem is that there
were three theories that could fit the observations to the same degree of precision, we
might think that one way of resolving the underdetermination is through increasing the
precision in the actual observations. As we shall see in chapter 2, however, Johannes
Kepler shows in the Astronomia Nova that, with minor modifications, the Ptolemaic,
Copernican, and Tychonic systems can be made to give exactly the same predictions for
the apparent two-dimensional motions of the planets—they can be made empirically
equivalent. Thus, a mere increase in precision of the observations of the apparent
motions could not resolve the underdetermination. What actually happened is that
Galileo turned his telescope to the skies in 1619 and observed that Venus has phases,
just like the moon. This situation is inconsistent with the Ptolemaic theory, so it was
2 I will discuss Ursus in chapter 2.
eliminated from contention.3 A new kind of technology, the telescope, allowed us to
bring a new kind of evidence to bear on the question of what the actual motions of the
planets are.
I think, however, that there is a third way in which the underdetermination could
have gotten resolved. In fact, Kepler had a good argument, prior to 1619, that the
Ptolemaic theory is not the correct theory of planetary motion. I just got done saying
that Kepler showed all three theories of the planetary motions could be made empirically
equivalent to each other, and so could not be distinguished on the basis of observations
of the apparent two-dimensional motions of the planets. We might note, however, that
the three theories predict very different motions for the planets through three-
dimensional space. If we could somehow measure the actual distances between the
planets with confidence, we could eliminate one or more of the theories. As I said, we
cannot get planetary distances simply by direct observation of the two-dimensional
motions, but they can be inferred from these two-dimensional motions by indirect
measurement.
3 Indirect measurement
So we might be able to resolve underdetermination in some cases by using
indirect measurement. As we shall see, however, there is a problem. In order to carry
out indirect measurement, you have to presuppose certain facts about the system you are
investigating. The central question of this dissertation will be: How can we know with
confidence that indirect measurements are correct or approximately correct, given that
3 It was not until Newton that the Tychonic theory was conclusively laid to rest, as we will see in
chapter 3.
we must presuppose certain facts about the system? Let me sharpen this question
further by explaining what I mean by an indirect measurement, and giving some idea of
what the assumptions are that you have to make about the system.
Suppose there is a complicated, partially inaccessible system that I want to
acquire knowledge about. A complicated system is one that consists of many parts,
those parts having various properties and relations with each other. I say an object is
partially inaccessible if we can only confidently measure a proper subset of the properties
of, and relations between, the parts of that object. I call the properties that we can
confidently measure the accessible properties. I will also sometimes speak of accessible
parts, by which I simply mean the parts of the system that have properties that we can
confidently measure. In order to determine the properties and relations of the
inaccessible parts, we must make inferences based upon what we know about the
accessible parts. Indirect measurement, then, is the measurement of inaccessible
properties or relations of a complicated, partially inaccessible system, through inference
based upon observations of the accessible properties.
We can think of the solar system, as viewed by astronomers in the sixteenth
century, as a complicated, partially inaccessible system. It is complicated because it
consists of many parts, namely the planets, the sun, and the moon, each having
properties such as mass and size, and distance relations between them. It is partially
inaccessible because we have access to the two-dimensional motions of the planets, but
we do not have access to distances in three-dimensional space. So the measurement of
planetary distances based upon observations of the apparent two-dimensional motions of
the planets is indirect measurement.
Now let us go back to the question I asked a few paragraphs back. Could we
have used the observations of the apparent two-dimensional motions of the planets to
break out of the state of underdetermination prior to 1619? The answer to this question
depends on whether we could have made indirect measurements of planetary distances
with confidence prior to 1619. I think we could, as I will argue in chapter 2. But here, I
simply want to examine what might make us lack confidence about indirect
measurements.
Before I go on with my discussion of indirect measurement, I want to distinguish
indirect measurement from a somewhat similar kind of problem. Suppose there is a
system that is partially inaccessible but not complicated. For example, say we have
found a huge underground lake, and we want to know the mineral content in the various
parts of the lake, but we only have access to parts of it. We might then take samples of
the water from the parts we can access, measure the mineral content in these samples,
and then extrapolate to the entire lake. We are making the assumption here, of course,
that the mineral content in the parts of the lake that are inaccessible to us is going to be
similar to the mineral content in the parts that are accessible. If this assumption turns
out to be wrong, we will be wrong about the mineral content in the inaccessible parts.
There can be interesting epistemological problems with this kind of extrapolation, but it
will not be a central topic of this dissertation. I will stick to complicated systems, for
which I believe there are particular problems and ways of dealing with these problems.
I will now explain what I take to be the central problems with indirect
measurement. First, note that we can be very confident about the results of some
indirect measurements. I do not have direct access to the amount of electric current
flowing through a wire, but I can have great confidence in the value I measure using a
galvanometer. At least part of the reason for this confidence has to do with what I call
antecedent familiarity. If an object is of a type that is familiar to me, I can safely assume
certain facts about that object. I know that if I drop a shot put from a height of 10 meters,
it will reliably hit the ground in approximately 1.4 seconds, barring any extraordinary
circumstances. I know this because I know that objects like shot puts fall with a uniform
acceleration of approximately 9.8 m/s² at the surface of the Earth. There have been some
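The 1.4-second figure follows directly from the constant-acceleration law; a quick check, as a sketch:

```python
import math

g = 9.8   # m/s^2, gravitational acceleration at the Earth's surface
h = 10.0  # m, drop height

# For uniform acceleration from rest, h = (1/2) g t^2, so:
t = math.sqrt(2 * h / g)
# t comes out to roughly 1.4 seconds
```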
cases in the history of science, however, where we have wanted to know facts about an
object that is utterly unlike anything else we knew of at the time. The solar system is a
good example of such an object. For all astronomers knew in the sixteenth century, the
solar system could have been radically different from anything else we knew of, so it
was hard to know what a reasonable assumption to make about the solar system was.
I think there are two main difficulties when carrying out an indirect measurement
that would make us lack confidence in such a measurement, particularly if the system we
are making the measurement on is antecedently unfamiliar. The first difficulty is theory-
mediation. You have to make measurements of the inaccessible properties, based upon
observations of the properties that are accessible. In order to make such measurements,
you need to presuppose that a particular relation applies between the accessible
properties and the inaccessible properties. If the relation you use to make the
measurement is not known antecedently, then the question naturally arises as to how you
can know that the measurement is correct.
The second difficulty is the combining of effects. Again, the root of this
difficulty is that you have to make measurements of inaccessible properties based upon
observation of accessible properties. Suppose the system you are making a
measurement upon is complicated. If so, there could be more than one part of the
system that has an effect on the accessible parts. If you want to measure a property of
one of those parts, you might have to separate out, or decompose, the effects of the
various parts on the accessible part. If you do not antecedently know the composition of
the system, however, you might not know exactly how to carry out such a
decomposition. If so, you might not be confident that the measurement you make using
such a decomposition is correct.
I will discuss these difficulties in more detail in the following sections of this
chapter, but now let me return to the notion of underdetermination. Suppose that there is
a system that we are interested in acquiring knowledge about, but there are two or more
theories that can account for all observations equally well. As I mentioned, there are a
couple of ways in which we can think we could resolve this situation of
underdetermination. One way is simply to improve on the observations we already have,
by increasing the precision of these observations. The other way is to come up with an
entirely new set of observations, like Galileo observing the phases of Venus.
What I am arguing in this dissertation is that there is a third way to resolve the
underdetermination. This is to make indirect measurements by inference from the
observations that are available to us. In order to make these indirect measurements,
however, we must make certain assumptions about what the system is like. Because of
the problem of theory-mediation, you have to make assumptions about the relation
between the inaccessible properties and the accessible properties of the system. Because
of the problem of combining of effects, you have to make assumptions about the
composition of the system, that is, the relation between the parts of the system. Since
these assumptions enable indirect measurements to be made, I will sometimes refer to
them as enabling assumptions.
So the now sharpened-up central question of this dissertation is the following:
Given that, in order to carry out an indirect measurement, you must make inferences
from the accessible properties of a system to the inaccessible properties, and that in
order to make these inferences, you need to make the assumptions that (1) certain
relations between accessible and inaccessible properties apply, and (2) effects from
various inaccessible parts on the accessible parts can be decomposed in a certain way,
how do you ensure that the indirect measurement you made is correct, or approximately
correct? I will lay out a preliminary answer to this question in the rest of this chapter.
4 Theory mediation
If I want to find out how wide my window is, I simply take out a tape measure
and measure it. Sometimes, however, I do not have the right kind of access to an object
on which I want to make a measurement. As I write this, the Tokyo Skytree, which will
become the tallest freestanding structure in Japan when completed, is being built.
Suppose I want to figure out how tall it is at this point during its construction. I could
not very well take out a tape measure to measure its height. Instead, I might improvise a
device with which I measure the angle from the horizon to the top of the Skytree. I then
find out the distance from my position to the Skytree construction site. Simple geometry
tells me that the height of the Skytree should then be approximately this distance times
the tangent of the angle I measured (for a small angle, the sine gives nearly the same
result). With the help of geometry, I have made a measurement of something that is
physically inaccessible to me.
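The triangulation just described can be sketched as follows; the distance and angle here are invented numbers, not actual measurements of the Skytree:

```python
import math

distance = 5000.0  # m, hypothetical horizontal distance to the construction site
angle = 4.0        # degrees, hypothetical measured elevation of the top

# Height from simple geometry: horizontal distance times the tangent
# of the elevation angle.
height = distance * math.tan(math.radians(angle))

# For a small angle, the sine gives nearly the same answer:
height_small_angle = distance * math.sin(math.radians(angle))
```

With these made-up inputs the two estimates differ by less than a meter, which is why the small-angle approximation is harmless here.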
In a particularly philosophical moment, I might realize that I have made the
assumption here that the Skytree is the kind of thing to which Euclidean geometry
applies. We would never call this assumption into question in our day-to-day dealings.
But what if, instead of the Skytree, I were trying to calculate distances to something that
is utterly unfamiliar to me? Astronomers in the sixteenth century, for example, used
geometry in determining the orbits of the planets. If they had known of other geometries,
they might well have raised the question of whether Euclidean geometry really applies to
the planets. After all, those planets were known to be unimaginably distant, and nobody
had the faintest clue what kind of material they could be made out of. Why should we
believe Euclidean geometry applies to them?
For almost all practical purposes, when we make such a measurement, we are on
safe ground assuming that mathematics and geometry will apply to the objects that we
are investigating. But sometimes, in order to make a measurement, we need to assume
more than mathematics and geometry. Sometimes we have to assume that a system on
which we are trying to make a measurement has certain physical properties, and behaves
in accordance with certain mathematical relations. Because I make use of a bit of
physical theory in order to make this kind of measurement, I say that such measurements
are theory-mediated.
Now, when we make measurements using bits of physical theory, the way in
which the theory is used in the measurement can be surprisingly complicated. For
example, consider the problem of trying to measure the muzzle velocity of a cannon.
One way we might make this measurement is to fire the cannon and measure how far the
cannonballs fly. The following equation allows you to calculate, given the angle θ at
which a cannon is fired, the muzzle velocity v, and the gravitational acceleration g at the
surface of the Earth, the horizontal distance D at which a cannonball lands:
D = 2v²(cos θ sin θ)/g. (1)
This equation assumes no air resistance, a perfectly flat Earth, and a constant
acceleration due to gravity. In order to calculate the distance D, all you need to do is
plug in the values of the muzzle velocity and the angle of the cannon.
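A direct calculation in this forward direction might look like this, as a sketch under the same idealizations (the numerical values are invented):

```python
import math

def landing_distance(v, theta_deg, g=9.8):
    # Horizontal range of a projectile, equation (1): no air resistance,
    # flat Earth, constant gravitational acceleration.
    theta = math.radians(theta_deg)
    return 2 * v**2 * math.cos(theta) * math.sin(theta) / g

# A muzzle velocity of 100 m/s fired at 45 degrees lands about 1020 m away.
D = landing_distance(100.0, 45.0)
```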
Now, suppose we want to determine the muzzle velocity of a particular cannon,
but we do not have any means of directly measuring the velocity of the cannonballs as
they come shooting out of the muzzle. There is a way of using the equation given above
for making a measurement of this muzzle velocity. We can think of this method as a
way of measuring a property of something that is not directly accessible, much like our
determination of the height of the Tokyo Skytree.
First, we fire the cannon several times, at a predetermined angle, and measure the
distances at which cannonballs land. We then might guess various values of v, for which
we calculate the distances D at which we predict the cannonball ought to land. We take
the value of v that gives us a predicted value for D that is the nearest to the actually
observed values. Then we might refine our value of v further by taking a cluster of
values around this best value for v, and calculating the distances at which we predict the
cannonball ought to land given these values for v. We then compare these distances
with the distances we have actually measured, and take the value of v that is closest to
these distances. We can keep repeating this until we home in on a value for v. Using
this procedure, we hopefully will have measured the muzzle velocity.
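The homing procedure just described can be sketched as a successively refined grid search. The forward model is repeated so the sketch is self-contained, and the initial search bounds are arbitrary choices:

```python
import math

def landing_distance(v, theta_deg, g=9.8):
    # Forward model, equation (1): no air resistance, flat Earth.
    theta = math.radians(theta_deg)
    return 2 * v**2 * math.cos(theta) * math.sin(theta) / g

def estimate_muzzle_velocity(observed_D, theta_deg, v_lo=1.0, v_hi=2000.0):
    # Guess a spread of candidate values of v, keep the one whose predicted
    # distance is nearest the observed distance, then repeat the search on a
    # tighter cluster of values around that best guess.
    for _ in range(20):  # each pass shrinks the search window fivefold
        step = (v_hi - v_lo) / 10
        candidates = [v_lo + i * step for i in range(11)]
        best = min(candidates,
                   key=lambda v: abs(landing_distance(v, theta_deg) - observed_D))
        v_lo, v_hi = max(best - step, 0.0), best + step
    return best

# If cannonballs fired at 45 degrees land about 1020.4 m away, the
# procedure homes in on a muzzle velocity of about 100 m/s.
v_est = estimate_muzzle_velocity(1020.408, 45.0)
```

The refinement works here because D varies smoothly, indeed monotonically, with v at a fixed angle; as the text goes on to note, such homing would fail if the dependent variable were sensitive to small fluctuations in the independent one.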
Note that this procedure involves using a mathematical equation where v and θ are
independent variables, and D is a dependent variable. If the aim were to determine
D given values for v and θ, one could simply plug in the values and use the equation to
calculate D. In this case, however, we are using measured values of D in order to try to
determine the value of v—that is, we are trying to determine the value of an independent
variable using measured values for the dependent variable. The way in which we do this
is to vary the value of v until we find one that fits the value of D that we have observed.
Often, the independent variables such as v are called parameters, and this kind of
problem is called a parameter estimation problem, or a bit more colloquially, curve-
fitting. This kind of problem is also often called an inverse problem, particularly in
cases where instead of trying to estimate discrete parameters, you are trying to estimate a
continuous function.
Suppose we take the mathematical equation to be correct, and that θ is known.
Then the logical relation between v and D is in the form of an if-statement: if v has
such-and-such a value, then D has such-and-such a value. Note that this relation does
not uniquely determine v given D. What we would really need to guarantee uniqueness
for the value of v is a logical relation in the form of an if-and-only-if statement.
There is also a further problem having to do with the logic. We used a kind of homing
procedure to find the value of v, where we first guess a value and then adjust v until we
get a value for D that best fits our measured value. Note that this homing procedure
works because we know that D is going to be smooth over small variations in v. But if
the equation we were using were such that the dependent variable is sensitive to small
fluctuations in the independent variables, we would not be able to do such a homing
procedure. For the homing procedure to work, the logic has to be of the form if v has
very nearly such-and-such a value, then D has very nearly such-and-such a value. In
some cases of indirect measurement, the use of this very nearly relation is crucial, as we
shall see in chapter 3.
In some cases, due to the mathematical relation between the independent and
dependent variables, there are problems having to do with the nonuniqueness of
solutions. Methods for addressing these nonuniqueness problems have recently become
important in geophysics, computer imaging, and other fields, under the rubric of
"inverse problem theory". I will postpone discussion of this problem until chapter 4.
5 When a measurement is theory-mediated, how do we know it’s correct?
As we did with the measurement of the height of Tokyo Skytree, we might think
about the assumptions we are making when we carry out this measurement. How do we
know that these assumptions will result in correct measurements? For example, what
needs to be the case in order for us to come up with the correct value for v, the muzzle
velocity of the cannon?
Our initial impulse might be to say that the equation we are using, and the
assumptions we are making about this system, must be true of the system. But we
should immediately realize that the equation we are using, and the assumptions we are
making about the system, such as no air resistance and a constant acceleration due to
gravity, are, strictly speaking, false with respect to this system. Now, one might think
that we ought to try to make the measurement procedure as realistic as possible, by
including as many details as we can. We could, for example, try to include air resistance,
include known details of the terrain, even allow for things like wind and atmospheric
pressure. The problem is that, in many cases, adding too many details to the
measurement procedure complicates the procedure enormously, and in some cases
makes the determination of a value impossible.
On the other hand, we would be in trouble if the assumptions we make are too
unrealistic. In that case, we could perhaps carry out the measurement procedure and
determine values for the properties of the system, but the values we calculate
would give us properties of some imaginary cannon, not the real cannon we are
interested in. There is a tradeoff here. If the
assumptions we use are too unrealistic, then we would get the wrong answer for our
measurement. But if we are too realistic, then we won't be able to carry out the
measurement procedure. The trick is to find assumptions that are realistic enough so
that they will let us calculate a value for the muzzle velocity that is close enough, for our
purposes, to the correct value for the real cannon.
How, then, do we know that we are making the right assumptions, and using the
right equation, to calculate the correct value for v? With regard to the cannon example,
the answer to this question is ultimately going to be an appeal to our everyday
experience, and our experience with cannons in particular (hopefully, we are
experienced artillery engineers). Our familiarity with the type of thing that cannons are,
and the conditions under which they are fired, allows us to justify the assumptions we
make about the system.
There was also the further problem of the logic of the relation between v and D.
Even if I have found a value for v that is consistent with the value for D that I have
measured, the logic does not guarantee that the value for v that I found is unique. Here
again, though, we make the assumption that the value for v is unique because of our
familiarity with the situation. We know that, given a constant value for θ, and the
conditions under which the cannon is fired, Equation 1 ought to apply at least
approximately, and there should be a unique positive value of v for each value of D.
In this example, there is a part of the system to which we do not have direct
access—we cannot directly measure the muzzle velocity of the cannon. In order to
make an indirect measurement of this muzzle velocity, we must make a large number of
assumptions about the cannon. Fortunately, the cannon is a type of system that is
familiar to us, so we can have confidence in the assumptions we make. We might say
that the muzzle velocity of the cannon is an inaccessible property of a familiar system,
and our familiarity with systems of this type allows us to set up a procedure through
which we can measure this inaccessible property.
What do we do, though, if we want to measure inaccessible properties of
unfamiliar systems? Let us hold that thought until after I discuss the second of the two
main difficulties of indirect measurement, the combining of effects.
6 Representing partially inaccessible systems
Before I discuss the combining of effects, however, I first want to introduce the
following way of representing partially inaccessible systems. This will facilitate the
discussion by giving us an intuitive grasp of what is going on in cases of the combining
of effects.
Figure 1
We might represent our measurement of the muzzle velocity of the cannon as in
Figure 1. This diagram is in the form of a directed graph.4 The reason it is a directed
graph should become clear in the next few sections, but let's just take a look at the figure
first. There are two nodes, labeled X and Y. There is an arrow, labeled a, going from X
to Y. Here is how to interpret this picture. Y stands for the distance the cannonballs
travel, X stands for the muzzle velocity of the cannon, and the arrow a stands for the
relation between X and Y, namely Equation 1 given above. The relation a uniquely
determines Y, given X. That is, as I have mentioned, it is a logical relation of the form if
X = v, then Y = w. We have access to Y, that is, we have the means for confidently
measuring its value. What we want is to find the value of X. Ignoring, for now, the
difficulties I mentioned involving nonuniqueness, we can say that the value of X can be
determined if we know the value of Y, because we know the relation a.

4 I should say that some inspiration for these diagrams comes from Jim Woodward's work on
causation. The idea of these diagrams, however, is not to try to infer causes from observation.
In fact, it is almost the opposite—this sort of structure is assumed in order to enable
measurements of properties. I was also greatly influenced by George Smith's work, particularly
his paper "Closing the Loop", encapsulated in the idea of trying to find the "details that make a
difference, and the differences they make".
We can think of the arrow a, in this case, as representing a causal relation. But
in other cases, the arrow could stand for other relations. For example, the measurement
of the height of the Tokyo Skytree can also be represented by Figure 1. Think of X as
standing for the height of the Skytree, and Y as standing for the angle from the top of the
Skytree to the horizon and the distance from my position to the Skytree site. We are
now interpreting Y as standing for two variables. The arrow a now stands for a
geometrical relation between X and Y, which uniquely determines Y, given X. As in the
cannon example, we can determine the value of X, given Y, because we know the
relation a.
Note, though, that these graphs should not be taken to be faithful representations
of these systems. For example, as I mentioned with regard to the cannon example, the
relation represented by a, Equation 1, is not actually true with regard to the system. We
might further take issue with the structure of the diagram itself. There are factors, such
as the wind, that will influence the distance that the cannonball travels. Shouldn‘t there,
then, be other arrows that point towards the node Y? In fact, if we wanted to come up
with a complete picture of what is happening with the cannon, we would have to have a
very complicated graph, with nodes standing, say, for the wind, details of the terrain,
variations in the gravitational constant, Coriolis forces, and so on. As experienced
artillery engineers, we might decide not to consider any of those things. We
assume that those other things will not have much of an effect on the outcome, and we
feel justified in this assumption in virtue of our experience as artillery engineers.
very simple picture involving just X, Y, and a is sufficient for us to get a reasonably
accurate value for X, which is what we wanted. We assume that the relation a holds for
this system well enough for us to make this measurement.
One further remark: the diagram looks like the kind of thing that is often called a
model, both by scientists and philosophers. Because the word is used for many different
kinds of things in the philosophical literature, however, I have thought it best to avoid it.
The role of these diagrams is simply to represent the elements that are necessary for the
measurement to be carried out, and their relation with each other. I am using them as
conceptual tools for thinking about particular cases of measurement, and to facilitate
discussion about what is going on in such measurements. It should not be assumed that
a scientist carrying out a measurement has such a diagram explicitly in mind.
7 Combining of effects and decomposition
Now the discussion in the previous section raises an obvious question. What if
the system I am investigating is more complicated, having various different parts that
have significant effects on the accessible parts? This is the problem I discussed earlier
in this chapter as the problem of the combining of effects.
Figure 2
We can now discuss this problem using the diagrams I have just introduced.
What if I can't reduce a system to a very simple one like Figure 1, but it is more like
Figure 2? In Figure 2, there are now three nodes, X, Y, and Z, and two arrows—one
from X to Z, labeled a, and the other from Y to Z, labeled b. We can take a to be a
relation that licenses an inference of the following form: given that there are no other
factors affecting Z, then if X = v, then Z = w. Similarly, we can take b to be a relation
that licenses an inference of the form given that there are no other factors affecting Z,
then if Y = v, then Z = w. Now, suppose we have access to Z, and we want to measure
either Y or X. Since Z is affected by both Y and X, we need some way of separating out
their effects on Z. If we could somehow successfully separate out their effects, we
would be able to measure X or Y.
Let me illustrate this situation with the cannon example again. Let Z be the
distance the cannonball travels, and let X be the muzzle velocity of the cannon. The
arrow a going from X to Z again represents Equation 1. But now we have another factor,
represented by Y, that has an effect on the distance the cannonball travels. Say Y is the
speed of the headwind or tailwind in the direction the cannonball is shot. Then in order
to measure X given observations of Z, we would somehow have to compensate for the
effect of Y on Z.
How do we compensate? Perhaps the easiest way to do it is to wait to fire the
cannon at times when there is no wind. Since at such times there will be no effect of Y
on Z, we can effectively reduce Figure 2 to Figure 1. In this case, we are isolating the
effect of X on Z from the effect of Y on Z, in order to measure X. Now, it just happens
that in this example, this sort of measurement using isolation can be done. But what if
there is never a time when the wind dies down, and there is always a headwind, for
example?
More generally, what do you do in a situation like in Figure 2, where you have
access to Z, and you want to measure X, but there is always a significant effect of Y on
Z? You would have to find some way to separate out the effects of X and Y on Z. I call
the process of separating out the effects decomposition. How might you carry out this
decomposition? One way to do it would be to somehow try to model what the effect of
Y on Z would be, and then subtract that out in order to measure X. Of course, we are
making the assumption here that the effects of X and Y on Z will add linearly, which will
not always be the case. At this point, however, I don't want to make things too
complicated. Let me simply note that we are indeed making this
assumption about how the effects add together.
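Under that linear-additivity assumption, the compensation can be given a minimal sketch. The wind model here (a fixed number of extra meters of range per unit of wind speed) and all of the numbers are illustrative assumptions, not a real ballistic model; as before, I assume the idealized vacuum range formula for the effect of the muzzle velocity.

```python
import math

G = 9.81  # m/s^2

def vacuum_range(v, theta):
    """Idealized (no air resistance) range at muzzle velocity v and angle theta."""
    return v ** 2 * math.sin(2 * theta) / G

def estimate_muzzle_velocity(observed_D, theta, wind_speed, meters_per_unit_wind):
    # Model the wind's (Y's) contribution to the landing distance, subtract it
    # out, and invert the vacuum range formula for what remains (X's effect).
    wind_effect = wind_speed * meters_per_unit_wind
    D_from_v = observed_D - wind_effect
    return math.sqrt(D_from_v * G / math.sin(2 * theta))

# A 5 m/s tailwind, assumed to add 12 m of range per m/s of wind:
v_est = estimate_muzzle_velocity(2608.0, math.pi / 4, 5.0, 12.0)
```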
Figure 3
Now, there are other arrangements we can think of as well. In Figure 3, we have
an arrow going from X to Y, and an arrow from Y to Z. Suppose we have access to Z,
and we want to determine the value of X. In this case, X has a causal effect on Y, and Y
has a causal effect on Z, and we need somehow to measure X via its effect on Y.
Another possible arrangement is in Figure 4, where now in addition there are arrows
going between X and Y. We might think of this as a case where there is now some kind
of causal interaction. I call the various different ways in which we can arrange the
arrows and the nodes the relational structure. If all the relations are causal relations,
then we can think of it as a kind of causal structure. In all of these cases, if we want to
measure X or Y based on our observations of Z, we must somehow separate out the
effects of X and Y on Z—that is, we must carry out a decomposition.
Figure 4
All of this might seem complicated, but when we are trying to measure
properties of something that is familiar to us, isolating and decomposing the various
effects comes rather naturally. For example, suppose I am in a moving car and I have a
radar gun with me. I want to measure the speed of an oncoming car. I point the radar
gun at the car, then I look at the speed that the radar gun gives, and then I compensate
for my own speed by looking at my speedometer and subtracting my own estimated
speed. This is a form of decomposition that comes naturally because this is a system
that is made up of parts that are familiar to us. Of course, as the directed graph gets
more complicated and you have to decompose more effects, measurement can become
immensely more difficult.
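As a minimal sketch of this everyday decomposition (the function name and the numbers are mine, and the head-on geometry is assumed):

```python
def oncoming_speed(radar_reading, my_speed):
    # The radar gun reads the closing speed; for head-on motion this is
    # (approximately) the sum of my speed and the oncoming car's speed,
    # so subtracting my own speed recovers the other car's speed.
    return radar_reading - my_speed

# Radar reads 150 km/h while my speedometer shows 60 km/h:
other = oncoming_speed(150.0, 60.0)
```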
I think decomposition is an aspect of indirect measurement, and of scientific
methodology in general, which has been overlooked by philosophers. In individual
cases, scientists are certainly aware of the difficulties involved with separating out
various effects when carrying out measurements. But there has been very little
philosophical literature on the problems of decomposition.5
8 Antecedently unfamiliar systems
Up to this point, we have been talking about the measurement of inaccessible
properties of familiar systems. What if, instead of a familiar system such as a cannon, I
were trying to measure an inaccessible property of a system that is
antecedently unfamiliar to me? Is this even possible? Don't we have to know certain
things about the system antecedently in order to measure such inaccessible properties?
In the case of the cannon, we have to know the laws of physics, facts about the
environment of the cannon such as properties of the air and terrain, and less quantifiable
facts about cannons in general—how they are manufactured, how they are fired, and so
on. How could we possibly set up a measurement of an inaccessible property of an
antecedently unfamiliar system?
History seems to show, however, that successful measurements have been made
of inaccessible properties of antecedently unfamiliar systems. Consider planetary
astronomy again. The solar system—not to be confused with the traces we observe of
planets across the night sky, but the planetary system itself—was surely about as
inaccessible and antecedently unfamiliar as a system could be. We might now laugh at the
idea that the planets are carried around the heavens in crystalline spheres, but the
solar system was utterly unlike anything astronomers at the time knew about. There was
simply no way to know in advance what a reasonable assumption about the planets would be.

5 A few philosophers who have addressed this problem or related problems are George Smith
(2002a, 2002b), Hasok Chang (2004), and William Wimsatt (2007).
Yet, as I show in chapters 2 and 3, the work of Kepler and Newton provides examples of
how measurements of antecedently unfamiliar systems can be carried out successfully.
Let us think carefully about what makes the measurement of inaccessible
properties of antecedently unfamiliar systems difficult. As I have discussed earlier,
there are two basic problems—theory-mediation and the combining of effects. First, to
illustrate the problem of theory-mediation, let us return to the cannonball example.
Recall that the diagram for that example is given in Figure 1. There are two nodes, X
and Y, with an arrow, a, pointing from X to Y. Now, suppose we didn't know the laws of
physics, so we couldn't derive the relation, Equation 1, which relates X to Y and thereby
allows us to measure X by observing Y. If we only have access to Y, we would not be
able to measure X without knowing this equation. Let me represent this situation in
Figure 5. I have X and Y, but now only a dotted arrow from X to Y, with a question mark
next to it. This is an indication that we think we know that there is a relation between X
and Y, but we don‘t know exactly what it is.
Figure 5
How would we measure X in this case? One thing we might think of doing is
simply guessing the relation. But how could we be at all sure that we have measured X
correctly, using a guessed relation? If I were really a cannon maker, here's what I would
think of doing. I would try to build something like the cannon that launches a heavy
object like a cannonball but for which I know the initial velocity—perhaps a catapult of
some kind. By launching the object at different velocities, I might find some kind of
relation between the initial velocity and the distance traveled. Then, by induction, I
assume that the same relation holds for the cannon. Since I now have a relation between
X and Y, I can make the measurement. So in this case I do not have to derive something
like Equation 1 from fundamental theory—I can determine it empirically. Still, one
might ask whether the inductive move is justified—how do I know that the relation I
found from the catapult applies to cannons as well? Let us hold this thought for a while.
Figure 6
Now let us think about the problem of combining of effects. Think once again
about the cannonball example. Suppose we do know Equation 1, so we know of the
relation a relating X to Y. But perhaps we are inexperienced as artillery engineers. We
don't know whether there could be other influences on the distance traveled, such as the
wind. Without knowing whether there could be such other influences, we would not be
able to measure X with confidence. Let me represent this situation in Figure 6. I have X
and Y, and an arrow going from X to Y as in Figure 1, but now I have a couple of dotted
arrows going towards Y with question marks beside them, indicating possible effects on
Y. Now, again, if we were really cannon makers, there would be ways of determining
whether, say, wind is a factor. We could, for example, fire the cannon using the same
amount of powder under various conditions of wind to make sure that the distance the
cannonball travels is not affected too much by the wind. But, of course, there could be
further unforeseen conditions that affect the distance the cannonball flies. Without being
able to anticipate such unforeseen conditions, we have no way of correcting for them.
9 Indirect measurement and evidence
Let me now return to what I said is the central question of this dissertation: Given
that, in order to carry out an indirect measurement, you must make inferences from the
accessible properties of a system to the inaccessible properties, and that in order to
make these inferences, you need to make assumptions that (1) certain relations between
accessible and inaccessible properties apply, and (2) effects from various inaccessible
parts on the accessible parts can be decomposed in a certain way, how do you ensure
that the indirect measurement that you made is correct, or approximately correct?
If the system we are making an indirect measurement on is antecedently familiar,
we can often give plausibility arguments for assumptions (1) and (2). For example,
going back to the cannon example again, we take it as given that the laws of physics
apply to cannonballs, and that under the right conditions, Equation 1 will apply. And I
can give an argument based on past experience to say that the actual conditions are
indeed close enough to those conditions for us to be able to apply Equation 1 to this
particular situation—that, say, the wind is not going to be a factor. But what do we do if
the system is antecedently unfamiliar?
If we look at cases from the history of science, there is not a simple answer,
because the situations tend to be very complicated. Even in cases where the system you
are investigating is antecedently unfamiliar, you can give plausibility arguments for the
assumptions. For example, as we shall see in chapter 3, Newton referred to experiments
done in his laboratory to justify the applicability of the laws of motion in the Principia.
This is a reasonable assumption to make as a working hypothesis, but it could not have
been known at the time that the laws are in fact applicable to celestial objects.
Plausibility arguments are much weaker without the weight of experience behind them.
I think there is a different way of gaining confidence that an indirect
measurement is correct or approximately correct, which does not involve trying to come
up with a straight justification for the assumptions (1) and (2): let the indirect
measurements themselves be evidence that the assumptions were correct.
Figure 7
I think there are at least two strategies through which this can be done. The first
strategy is converging measurement.6 Suppose there is some system that we can
represent by Figure 7. There is a node X with two arrows out from it, arrow a to node Y,
and arrow b to node Z. Suppose both Y and Z are accessible properties, that is, we have
a way of measuring their values confidently. Suppose we don't have too much
confidence in the relations a and b. In this situation, there are two different ways of
measuring X, through observation of Y using relation a, and through observation of Z
using relation b. If we carry out both measurements and we get approximately the same
result, that is, they converge, then this is good reason to think that the
measurement of X is correct. We can, of course, have more than two
such converging measurements. The more the results converge, the better reason we
have to believe that the measurement of X is indeed correct. Note, however, that we can
get converging results even if the relations a and b are not strictly true of the system—it
could be the case that, say, relation a simply holds to a good approximation under the
circumstances of the measurement. It turns out that we have more reason to believe that
the measurement itself is correct than the assumptions we made in order to make the
measurement.

6 The term, and the idea, are George Smith's. See (Smith 2002a) and his unpublished
manuscript "Closing the Loop".
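A converging measurement can be given a minimal numerical sketch. The two relations a and b below, the observed values, and the tolerance are invented purely for illustration:

```python
def x_via_y(y):
    # Relation a (illustrative): Y = 2 * X, so X = Y / 2.
    return y / 2.0

def x_via_z(z):
    # Relation b (illustrative): Z = X ** 2, so X = sqrt(Z).
    return z ** 0.5

def converging(x1, x2, rel_tol=0.01):
    """Do two independent measurements of X agree to within rel_tol?"""
    return abs(x1 - x2) / max(abs(x1), abs(x2)) < rel_tol

x1 = x_via_y(10.1)   # measurement of X through observation of Y
x2 = x_via_z(25.2)   # measurement of X through observation of Z
agreement = converging(x1, x2)
```

The point is that the agreement of x1 and x2 is evidence about X even if neither relation, taken alone, is trusted.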
This has implications, by the way, for the way in which we view the "flow" of
evidence in science. In Figure 7, I have confidence in my measurements of Y. I have
low confidence in the relation a. Since I am using the relation a to measure X, one
might think that I should have low confidence in my measurement of X. This would
indeed be the case if I only measured X one way, but if I also measure X through the
other relation b, and they converge, then this will increase my confidence in X even if I
have low confidence in b. In fact, this might be reason to raise my confidence in the
applicability of the relations a and b. To put it in a loose but picturesque way, evidential
power does not flow monotonically from Y and Z towards X. Rather, under certain
circumstances such as converging measurements, X can be a new source of evidence,
and the evidential power can actually "flow outward" from X. Of course, we have to be
careful about what such converging measurements actually show about the relations a
and b. The conclusion we can draw from such convergent measurement is that the
relations a and b are applicable under the conditions of the measurements, but we would
not know whether they would be applicable in other conditions.
There are other strategies besides converging measurement by which we can let the
indirect measurements themselves serve as evidence that the assumptions were correct.
They involve more complicated relational structures. The following strategy is what I
call decompositional success. For example, take a look again at Figure 2. Here, the
accessible property Z is affected by both the inaccessible property X, via the relation a,
and the inaccessible property Y, via the relation b. Suppose we don't have too much
confidence in the relations a or b, and we want to measure X. We might first try
guessing the effect of Y on Z, subtracting that effect out, and then measuring X using the
relation a. We now have a way of modeling the effect of X on Z using the relation a.
Now subtracting that effect out, we measure the value of Y using the relation b. Using
this new value of Y, we model the effect of Y on Z. We subtract out that effect and
measure a more refined value for X. Using this new, refined value for X, we model the
effect on Z, and we now come up with a new, refined measurement for Y.
If my measurements of X and Y seem to be converging on certain values, then
this is good evidence that this relational structure is approximately correct and the
relations a and b are also at least approximately applicable. Why? Suppose the relation
a is not approximately applicable. Then when we model the effect of X on Z and
subtract out this modeled effect in order to measure Y, we do not expect to get a good
value when we measure Y. Then, when we model the effect of Y and subtract it out to
measure X, we should expect this measurement not to give a good value for X, and thus
it should not agree with the previous value for X. Thus, if the sequence of values for X is
converging, this is evidence that the values for X and Y are correct. To put it loosely, we
are "playing the measurements of X and Y off of each other"—the measurement of X
presupposes that the measurement of Y is approximately correct, and the measurement of
Y presupposes that the measurement of X is approximately correct. If either one is not
approximately correct, then in all probability the procedure will not work.
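The successive refinement just described can be sketched as an alternating iteration. The two linear relations assumed below (Z1 = X + 0.2·Y and Z2 = 0.3·X + Y) and the starting guess are invented for illustration; the point is only the structure of the procedure: model one effect, subtract it out, re-measure the other, and repeat.

```python
def refine(z1, z2, y_guess=0.0, passes=25):
    """Alternately measure X and Y, each time modeling and subtracting out the
    other's effect, until the values settle down.
    Assumed (illustrative) relations: Z1 = X + 0.2*Y and Z2 = 0.3*X + Y."""
    x = 0.0
    for _ in range(passes):
        x = z1 - 0.2 * y_guess      # subtract modeled effect of Y; measure X
        y_guess = z2 - 0.3 * x      # subtract modeled effect of X; measure Y
    return x, y_guess

# True values X = 5 and Y = 2 give Z1 = 5.4 and Z2 = 3.5:
x_est, y_est = refine(5.4, 3.5)
```

Here the sequence of values for X and Y settles down to the true values even though the starting guess for Y was badly wrong; had either assumed relation been badly off, the sequence would not have converged on values consistent with both observations.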
In actuality, these relational structures often turn out to be even more
complicated. But it is the very fact that these structures are so complicated that they can,
in some cases, confer very high confidence that some indirect measurements are correct.
The more complicated a structure, the more ways in which one can play measurements
off of one another, or try to measure one property in more than one way.
Now I want to discuss some limitations of these methods. First, as we shall see
when we start looking at actual cases of indirect measurement, most indirect
measurements are far from easy to do, especially when they involve systems that are
partially inaccessible. They often involve observations that are limited and hard to get,
and the calculations themselves can often be laborious, especially when we consider
sciences such as planetary astronomy in the sixteenth and seventeenth centuries. Thus,
indirect measurements will often be made in the hope that the assumptions made in
carrying them out will be shown, down the road, to be true. We will see in chapter 3,
for example, that this is the best way to view what
Newton was doing in the Principia.7
The second limitation also has to do with the temporal dimension. These
methods all involve comparing the results of different indirect measurements. In most
cases, the indirect measurements will be made at different times. If the property you are
measuring changes over time, then you will not be able to get converging measurements.
Thus, a fundamental presupposition in using these methods is that the property you are
measuring will not be changing its value significantly over time—that the value will be
stable. This is an issue that I will discuss in more detail in chapter 3.
7 This is George Smith's view of the methodology of the Principia. This dissertation is largely
the result of trying to understand Smith's views of methodology, particularly as they relate to
the problem of underdetermination.
10 Case studies
Now that I have laid out my general view of indirect measurement, the rest of
this dissertation is devoted to case studies of indirect measurement of complicated,
partially inaccessible systems. Each case initially involves a difficult problem of
underdetermination—the available observations are not good enough to uniquely
determine the inaccessible properties of the system. Indirect
measurement through the use of enabling assumptions will resolve at least part of that
underdetermination. I will, for the most part, focus on understanding the justification for
the enabling assumptions.
In chapter 2, I examine the indirect measurement of planetary distances in the
solar system in the sixteenth and seventeenth centuries by Copernicus and Kepler. In
this case, there was an underdetermination between three different theories about the
motions of the planets, which can be partly resolved by the measurement of distances
between the planets. The measurement of these distances was enabled by making
certain assumptions about the motions of the planets. I argue that part of the
justification for making these assumptions comes from decompositional success in
playing measurements of the earth's orbit and the orbit of Mars off against each other.
In chapter 3, I examine the indirect measurement of mechanical properties such
as mass and forces in the solar system by Newton. In this case, there were two
underdeterminations, the first an underdetermination between two theories about the
relative motion of the sun and the earth, and the second an underdetermination between
various theories for calculating planetary orbits. Newton resolves these two problems of
underdetermination through a research program where the various sources of force are
identified and accounted for. This program crucially requires the third law of motion to
apply between celestial objects, a point on which Newton was criticized. I examine the
justification for the application of the third law of motion through its successful use for
decomposition of forces in the solar system, in a long-term research program. I further
discuss comments by Kant on the role of the third law of motion for Newton, in which
Kant recognizes its indispensability for a long-term program for determining the center
of mass of the solar system and thus defining a reference point relative to which forces
can be identified.
Chapter 4 covers the indirect measurement of density in the earth’s interior using
observations of seismic waves. One of the difficult problems in this case is that we can
think of the interior density of the earth as a continuous function of radius—in order to
determine this density function, you are in effect making a measurement of an infinite
number of points. The natural question to ask here is how much resolution the
observations give you. I will focus on the work of geophysicists who were concerned
with this problem, out of which a standard model for the earth’s density eventually grew.
-2-
Copernicus, Kepler, and Decomposition
1 Planetary Astronomy
The most difficult problem of planetary astronomy in the sixteenth century
was that the observed two-dimensional motions of the planets across the night sky
are consistent with three different theories of the actual three-dimensional motions of
the planets through space—the Ptolemaic theory, the Copernican theory, and the
Tychonic theory. In other words, the theory of the actual motions of the planets was
underdetermined by the available observations. In fact, by making minor
modifications, you could make the theories empirically indistinguishable from each
other, given the kinds of observations that were available at the time. It seemed to
some astronomers in the sixteenth century that this underdetermination was
unresolvable, and that, in fact, trying to determine the actual motions of the planets
should not even be an aim of planetary astronomy.
This problem could be solved, however, if you could find a way of indirectly
measuring the distances between the planets, for the motions of the earth, the sun,
and the planets through space are different for each of these theories. Copernicus
and Kepler both use the method of triangulation to attempt to measure planetary
distances—setting up a triangle with the sun, the earth, and a planet at the corners
and using geometrical relations to determine distances. In order to carry out this
procedure, however, it is very important to know the angles of the triangle accurately.
But as I will explain, in order to determine these angles, you must perform what I
called a decomposition in chapter 1—you have to separate out the effects due to two
different features of the planetary motions. These features are called the first
inequality and the second inequality.
Thinking about this problem in terms of the picture of indirect measurement I
provided in chapter 1, we can take the solar system to be a complicated, partially
inaccessible system, with the apparent motions of the planets being the accessible
properties of the system, and the actual three-dimensional motions of the planets
being the inaccessible properties. In chapter 1, I explained that in order to carry out
an indirect measurement, you need to assume that (1) certain relations between the
accessible and the inaccessible properties apply, and (2) effects from various
inaccessible parts on the accessible parts can be decomposed in a certain way. The
central question was how you ensure that the indirect measurement is correct or
approximately correct in the face of (1) and (2).
With regard to assumption (1), the fundamental theory from which the relations
between the accessible and inaccessible properties, that is, the relations between the
apparent motions of the planets and the actual three-dimensional motions of the
planets, are derived, is Euclidean geometry. That Euclidean geometry is applicable
to the planets was never called into question by astronomers—they could not have
questioned it, of course, since they did not know of any geometry other than that of
Euclid.
Assumption (2), however, involves exactly how you break down the apparent
motions of the planets. All astronomers at the time, following Ptolemy, separated
out two motions, the first inequality and the second inequality. There were
disagreements, however, as to how to characterize each of these motions, and to what
actual motions of the planets the first and second inequality corresponded. Since this
separation of motions had to be done in order to determine planetary distances, how
could an astronomer know whether a measurement of planetary distances involving
decomposition is correct?
2 Planetary Astronomy in the sixteenth century
Although a more thorough treatment of planetary astronomy from the mid-
sixteenth to the early seventeenth century would certainly require a section on Tycho
Brahe, I will focus on the work of Copernicus and Kepler. We will be thinking about
the work of Copernicus and Kepler in terms of the framework I described in Chapter
1. We have access to the two-dimensional motions of the planets across the night
sky, that is, angular distances of the planets relative to the constellations, over time.
What we want to know are the actual motions of the planets in three dimensions, that
is, relative distances and directions of the planets over time. We will find that the
measurement of planetary distances crucially involves separating out two different
features of the motions of the planets—the first inequality and the second inequality.
I will explain what the first and second inequalities are shortly, but let us first
consider the apparent two-dimensional motions of the planets. We can think of the
night sky as a vast, hollow sphere, onto the inner surface of which are painted the
stars that are visible from the earth, some forming the familiar shapes of the
constellations. The sun appears to make one entire circuit around this sphere every
year, and the great circle along which it travels is called the ecliptic. The planets
appear to move roughly along the ecliptic, but their motions are somewhat
complicated. Movement along the direction of the ecliptic is called longitudinal
motion, while movement perpendicular to the ecliptic is called latitudinal motion.
Since it is the longitudinal motions that ultimately yield information about planetary
distances, I will talk almost exclusively of the longitudinal motions in what follows.
Now, let us consider these longitudinal motions. The motions of the planets
are fairly regular, but they exhibit two significant irregularities. One
irregularity is the famous retrograde motion. At some points in their journey
along the ecliptic, the planets will appear to stop and reverse direction for a while,
going the opposite direction along the ecliptic. This irregularity was called the
second inequality (or the second anomaly) by astronomers from the time of Ptolemy
through Kepler. We now know that the second inequality arises because we are
viewing the motions of the planets from a platform that is itself moving, namely the
earth.
The other irregularity is that the planets appear to speed up and slow down at
various points as they travel along the ecliptic. This variation in apparent angular
velocities was called the first inequality (or the first anomaly). We now know that
this variation occurs for two reasons. About half of the maximum variation in the
apparent angular velocity is because the planets actually do speed up and slow down
relative to the sun in accordance with Kepler’s area rule, while the remaining half
comes from the sun not being at the center of the planet’s orbit, but at a focus.
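This even split can be seen in a standard first-order expansion, given here in modern
notation as an illustrative aside (the symbols below are not in the original text). Write
M for the mean anomaly, E for the eccentric anomaly, ν for the true anomaly, and e
for the orbital eccentricity:

```latex
% Area rule: inverting Kepler's equation M = E - e \sin E, to first order in e
E \approx M + e \sin M
% Sun at a focus rather than at the center of the orbit, again to first order
\nu \approx E + e \sin E \approx E + e \sin M
% Combined equation of center: each effect contributes half
\nu \approx M + 2 e \sin M
```

At first order the two contributions of e sin M are equal, which is why roughly half
of the variation comes from the area rule and half from the displaced sun.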
Figure 1
Figure 1 (from Swerdlow and Neugebauer 1984, 615) is a representation of the
Ptolemaic theory. The earth is labeled O, and the position of a planet is labeled P.
In this theory, the second inequality is accounted for by the use of epicycles. The
planet moves in an epicycle, which is a circular orbit, while the center of the epicycle
itself moves in a circular orbit, called the deferent, around the earth. In Figure 1, the
center of the epicycle is labeled C, while the center of the deferent is labeled M. The
first inequality is accounted for by offsetting the earth O from the center M of the
deferent, and having another point called the equant point, labeled E, located on the
opposite side of the center from the earth, at the same distance from the center as the
earth. The planet travels at constant angular velocity as seen from the equant point,
and thus when seen from the earth it will appear to speed up and slow down at
various points along its orbit. Since the equant point does not coincide with the
center of the deferent, the planet’s actual motion along the deferent will not be
uniform circular motion.8
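The equant construction just described can be made concrete in a few lines of code.
The sketch below is schematic, not Ptolemy's actual parameterization: the function
name, the placement of the deferent center and equant on a common line through the
earth, and all numerical values are illustrative assumptions.

```python
import math

def ptolemaic_position(t, R=1.0, r=0.3, e=0.1, w_def=1.0, w_epi=5.0):
    """Position (x, y) of a planet relative to the earth O at the origin.

    Deferent: circle of radius R centered on M = (0, e).
    Equant:   Q = (0, 2*e), symmetric with O about M.
    The epicycle center C moves along the deferent so that its angular
    velocity as seen from the equant Q is the constant w_def; the planet
    then rides an epicycle of radius r about C at angular velocity w_epi.
    (All values here are illustrative, not historical parameters.)
    """
    theta = w_def * t                       # direction of C as seen from Q
    ux, uy = math.cos(theta), math.sin(theta)
    # Intersect the ray from Q along (ux, uy) with the deferent circle:
    # |Q + s*u - M|^2 = R^2  with  Q - M = (0, e)  gives a quadratic in s.
    s = -e * uy + math.sqrt((e * uy) ** 2 - e * e + R * R)
    cx, cy = s * ux, 2.0 * e + s * uy       # epicycle center C
    # Planet on the epicycle around C.
    px = cx + r * math.cos(w_epi * t)
    py = cy + r * math.sin(w_epi * t)
    return px, py
```

Setting r = 0 puts the planet at the epicycle center, which makes it easy to check
that C stays on the deferent even though its angular speed is uniform only about the
equant, not about the deferent's center—exactly the feature the crystalline-sphere
picture could not accommodate.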
Ptolemaic astronomy was enormously successful—since its development in
the second century, it was not superseded in accuracy for over a thousand years, until
the work of Kepler. There were some aspects of Ptolemaic astronomy that were
unsatisfactory, however, if one tried to think about how it could be physically
implemented. Almost all astronomers before the time of Kepler believed the planets
were carried along in their circular orbits by being embedded in rotating crystalline
spheres. As I just mentioned, according to Ptolemaic theory, the speed of the planet
along the deferent is not uniform—thus if it is being carried along by a crystalline
sphere, the sphere must somehow slow down and speed up in such a way that the
planet has constant angular velocity as seen from the equant point. It was difficult to
see how this speeding up and slowing down could be physically implemented. In
response to this difficulty, there was a school of Arabic astronomers connected to the
Maragha observatory in modern-day Iran who, in the thirteenth and fourteenth
centuries, developed planetary models using only uniform circular motion, with
epicycles accounting for the first inequality.
Famously, Copernicus came up with a theory in which the second inequality
is accounted for by putting the sun at the center of the solar system and having the
earth go around the sun.

8 See Evans 1984 for an excellent exposition of the role that the equant plays in Ptolemaic
astronomy, and why this innovation allowed Ptolemaic astronomy to be so empirically
successful.

The second inequality is then seen to be the effect of
observing the planets from a point that is itself moving. Although popular accounts
of Copernicus have him rejecting the Ptolemaic theory because of its epicycles, he
actually objected to it for the same reason as the Maragha astronomers—because it
departed from uniform circular motion (Swerdlow and Neugebauer 1984, 293-294).
In fact, Copernicus accounts for the first inequality using the same principles as the
Maragha astronomers did, with an epicycle.9 In order to get the theory to capture the
motions that Ptolemy could capture using the equant, this epicyclic theory for the first
inequality had to be rather complicated. Figure 2 (from Swerdlow and Neugebauer
1984, 616) shows the Copernican theory for the first inequality.
Figure 2
9 Swerdlow and Neugebauer go so far as to say that Copernicus "can be looked upon as, if
not the last, surely the most noted follower of the Maragha school" (295).
3 Triangulation
Since both Copernicus and Kepler use fundamentally the same method to get
planetary distances, I will first explain the basic method, so that the exposition will
be easier to follow when we look specifically at what each of them does. At root, the
method is very simple. First take a look at Figure 3. It shows the sun, the earth, and
a planet, surrounded by constellations. The constellations are taken to be fixed
permanently in their positions, and thus they provide a stable background against
which observations of the planets can be recorded. From the earth, I can observe the
position of the sun S
and the planet P along the ecliptic. The position along the ecliptic is called the
longitude. The longitude as seen from the earth is called the geocentric longitude.
In Figure 3, it just so happens that the earth, the sun, and the planet are lined
up so that the sun and the planet are on exactly opposite sides of the earth. Notice
here that when I observe the planet from the earth, I see it exactly the way I would
see it from the sun. When the sun, the earth, and a planet are in this configuration,
they are said to be in opposition. Call the longitude as seen from the sun the heliocentric
longitude. Then at opposition, the geocentric longitude and the heliocentric
longitude coincide.
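The coincidence of the two longitudes at opposition is easy to check numerically.
The sketch below uses hypothetical planar coordinates with the sun at the origin;
the function name and the particular positions are illustrative assumptions, not
anything from the text.

```python
import math

def longitude(observer, body):
    """Ecliptic longitude of body as seen from observer, in degrees [0, 360)."""
    dx, dy = body[0] - observer[0], body[1] - observer[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

sun = (0.0, 0.0)
earth = (1.0, 0.0)
planet = (3.0, 0.0)   # sun, earth, planet collinear: the planet is at opposition

geo = longitude(earth, planet)    # geocentric longitude of the planet
helio = longitude(sun, planet)    # heliocentric longitude of the planet
# At opposition the two longitudes agree; move the planet off the
# sun-earth line and they come apart.
```

Moving the planet off the sun-earth line makes the two longitudes differ, which is
the situation of Figure 4.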
Figure 3
Now suppose the planet, the earth, and the sun are in the configuration shown
in Figure 4. In this configuration, the planet would have a different longitude, that is,
it would appear to be moving through different constellations, depending on whether
I observe it from the earth or from the sun. We can see that when not in opposition,
the geocentric longitude and heliocentric longitude will be different.
Figure 4
Now suppose when the earth, a planet, and the sun are in the configuration of
figure 4, we want to find the distance from the earth to the planet, the distance EP, as
a ratio of the distance from the earth to the sun, the distance ES. Suppose we already
have a theory of the motion of the earth around the sun, so that we know, at any
given time, the longitude of the earth as seen from the sun. And suppose we have in
addition a theory of the motion of the planet P around the sun as well, so we know, at
any given time, the heliocentric longitude of the planet P. The theory of the earth’s
motion will give us the direction of the line ES, while the theory of the motion of P
will give us the direction of the line SP. Making one observation from the earth will
give us the direction of the line EP, thus allowing us to find all the angles in the
triangle EPS. This will then allow us to find, by simple geometry, the ratio of the
length of the line EP to the length of the line ES, which is what we wanted. Thus,
given that this is the actual configuration of the earth, the sun, and the planet, and
that I have the proper theories for the earth’s motion and the planet’s motion, I can
find the distance from the earth to the planet, relative to the size of the earth’s orbit.
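The procedure just described can be written out as a short calculation. This is a
minimal sketch in modern notation, assuming a simple configuration in which every
angle of the triangle is less than 180 degrees; the function name and the sample
configuration are illustrative, not from the text.

```python
import math

def distance_ratio(helio_lon_earth, helio_lon_planet, geo_lon_planet):
    """Ratio EP/ES for the triangle sun S, earth E, planet P.

    helio_lon_earth:  direction of the line S->E (earth's heliocentric longitude)
    helio_lon_planet: direction of the line S->P (planet's heliocentric longitude)
    geo_lon_planet:   direction of the line E->P (one observation from the earth)
    All angles in degrees.
    """
    # Angle at the sun, between the directions S->E and S->P.
    angle_s = abs(helio_lon_planet - helio_lon_earth)
    # The direction E->S is opposite to S->E.
    dir_earth_to_sun = (helio_lon_earth + 180.0) % 360.0
    # Angle at the earth, between the directions E->S and E->P.
    angle_e = abs(dir_earth_to_sun - geo_lon_planet)
    # The angles of a triangle sum to 180 degrees.
    angle_p = 180.0 - angle_s - angle_e
    # Law of sines: EP / sin(S) = ES / sin(P).
    return math.sin(math.radians(angle_s)) / math.sin(math.radians(angle_p))
```

For instance, with the earth at heliocentric longitude 0 degrees at distance 1 from
the sun and the planet at heliocentric longitude 90 degrees at distance 2, the planet's
observed geocentric longitude is about 116.57 degrees, and the function returns an
EP/ES ratio equal to sqrt(5), the true earth-to-planet distance in that configuration.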
4 Copernicus’s measurement of planetary distances
We will now move on to the specific method that Copernicus uses to measure
distances to the planets, which he does in Book 5 of De Revolutionibus. Since the
method he uses is basically the same for all five planets, with minor differences
depending upon whether the planet is an inner planet or an outer planet, I will only
describe his procedure for one of the planets, Saturn. The basic method is
triangulation, just as I described above. We can think of Figure 5 (from Swerdlow
and Neugebauer 1984, 635) as a much more detailed and complicated version of
Figure 4. Saturn is labeled P, the earth is labeled O, and the sun10
is labeled S. The
reason this figure is so much more complicated than figure 4 is that the theory of
Copernicus does not consist of the simple circles I have above. The theory of
motion for Saturn involves an epicycle to account for the first inequality, and there
are further complications because the sun is not located at the center of the orbit of
Saturn. But if we strip away some of these complications, the basic method is the
same. The idea is to determine the angles in the triangle formed by the earth, the sun,
and Saturn.
Figure 5
10 One detail that I will discuss in a later section is that the sun here is the mean sun, not the
true sun.
The first leg of the triangle, the direction of the line from the earth to the sun,
is given by the Copernican solar theory, which is really the theory of the earth’s