
  • UNDERDETERMINATION AND INDIRECT MEASUREMENT

    A DISSERTATION

    SUBMITTED TO THE DEPARTMENT OF PHILOSOPHY

    AND THE COMMITTEE ON GRADUATE STUDIES

    OF STANFORD UNIVERSITY

    IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

    FOR THE DEGREE OF

    DOCTOR OF PHILOSOPHY

    Teru Miyake

    June 2011

  • http://creativecommons.org/licenses/by-nc/3.0/us/

    This dissertation is online at: http://purl.stanford.edu/cs884mb1574

    © 2011 by Teru Miyake. All Rights Reserved.

    Re-distributed by Stanford University under license with the author.

    This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.


I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

    Michael Friedman, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

    Helen Longino

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

    Patrick Suppes

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

    George Smith

    Approved for the Stanford University Committee on Graduate Studies.

    Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.


    Abstract

    We have been astonishingly successful in gathering knowledge about certain

    objects or systems to which we seemingly have extremely limited access. Perhaps the

    most difficult problem in the investigation of such systems is that they are extremely

    underdetermined. What are the methods through which these cases of

    underdetermination are resolved?

    I argue in Chapter 1 that these methods are best understood by thinking of what

    scientists are doing as gaining access to the previously inaccessible parts of these

    systems through a series of indirect measurements. I then discuss two central problems

    with such indirect measurements, theory mediation and the combining of effects, and

    ways in which these difficulties can be dealt with.

    In chapter 2, I examine the indirect measurement of planetary distances in the

    solar system in the sixteenth and seventeenth centuries by Copernicus and Kepler. In

    this case, there was an underdetermination between three different theories about the

motions of the planets, which could be partly resolved by the measurement of distances

    between the planets. The measurement of these distances was enabled by making

    certain assumptions about the motions of the planets. I argue that part of the


    justification for making these assumptions comes from decompositional success in

playing off measurements of the earth's orbit and Mars's orbit against each other.

    In chapter 3, I examine the indirect measurement of mechanical properties such

    as mass and forces in the solar system by Newton. In this case, there were two

    underdeterminations, the first an underdetermination between two theories about the true

    motion of the sun and the earth, and the second an underdetermination between various

    theories for calculating planetary orbits. Newton resolves these two problems of

    underdetermination through a research program where the various sources of force are

    identified and accounted for. This program crucially requires the third law of motion to

    apply between celestial objects, an issue about which Newton was criticized by his

    contemporaries. I examine the justification for the application of the third law of motion

    through its successful use for decomposition of forces in the solar system in a long-term

    research program. I further discuss comments by Kant on the role of the third law of

    motion for Newton, in which Kant recognizes its indispensability for a long-term

    program for determining the center of mass of the solar system and thus defining a

    reference point relative to which forces can be identified.

Chapter 4 covers the indirect measurement of density in the earth's interior using

    observations of seismic waves. One of the difficult problems in this case is that we can

    think of the interior density of the earth as a continuous function of radius—in order to

determine this density function, you are in effect making a measurement of an infinite

    number of points. The natural question to ask here is how much resolution the

    observations give you. I will focus on the work of geophysicists who were concerned

with this problem, out of which a standard model for the earth's density was developed.


    Acknowledgments

    I am incredibly lucky to have been able to take two extraordinary seminars in

    which the seeds for the ideas set forth in this dissertation were sown. The first is a

seminar on Newton's Principia that George Smith taught at Tufts University that I took

when I was an MA student. George's unwavering attention to the details that make a

    difference, his way of identifying and trying to answer truly deep and interesting

    questions about science, and above all his kindness and dedication to his students, all

    made a deep impression on me. I sat in on this seminar again when George taught a

    version of it when he visited Stanford University a few years later. I would like to sit in

on it many more times if I could—I'm sure I would get more out of it every time.

    The one other seminar that made a similarly deep impression on me was Michael

Friedman's seminar on Kant's Metaphysical Foundations of Natural Science that I took

    at Stanford. I found Michael to be a thinker of a completely different sort from George,

    but I also saw a very similar uncompromising attitude with regard to the study of Kant

    and the sciences of his time, and Michael‘s warm personality made it easy for me to

    work with him as my advisor at Stanford. George and Michael are a pair of mentors

    who, each in his own unique way, sets the highest standards in his area of research. I

    only hope my own work could approach those standards someday.


    The rest of the dissertation committee is no less distinguished. Pat Suppes is, of

    course, in a league of his own. When I first talked to Pat, I have to admit that it was

    with a mixture of awe and apprehension, but I grew to really enjoy walking out to visit

    him at Ventura Hall. Helen Longino was always very helpful and encouraging, even

    during a very busy stint as department chair. Tom Ryckman was not an official member

    of the committee, but he was certainly a committee member in my eyes. I have had

    countless discussions with him about the topics covered in this dissertation, and he was

    the most dependable source of advice and support during my years at Stanford.

    As I have already mentioned, I got my MA in philosophy at Tufts University,

    and besides George I would like to thank Dan Dennett, Jody Azzouni, Kathrin Koslicki,

    David Denby, and the members of my cohort. At Stanford, I would like to thank the

    following faculty: Brian Skyrms, David Hills, Krista Lawlor, Lanier Anderson, Chris

    Bobonich, Mark Crimmins, Nadeem Hussain, Marc Pauly, John Perry, and Dagfinn

    Follesdal. Grad students and visiting scholars who have contributed to the development

    of the ideas in this dissertation include Quayshawn Spencer, Angela Potochnik, Joel

    Velasco, Alistair Isaac, Johanna Wolff, Tomohiro Hoshi, Sally Riordan, Ben Wolfson,

    Dan Halliday, Danny Elstein, Shawn Burns, Micah Lewin, and Samuel Kahn. Part of

    this dissertation was given as a talk at the UC Irvine LPS department, and I thank the

    audience for their comments, and Jeff Barrett and Kyle Stanford in particular for their

    hospitality.

    I wrote much of this dissertation at the Max Planck Institute for the History of

    Science in Berlin, where I was a Predoctoral Fellow. The Max Planck Institute provided

    a perfect environment for writing this dissertation, and I would especially like to thank


    Raine Daston and the scholars in Department II. Financial support for the years during

    which I was working on this dissertation was provided by the Whiting Foundation and

    the Ric Weiland Fellowship. In addition, I am proud to say that I was the very first Pat

    Suppes Fellow at Stanford, for which I would like to thank Pat a second time.

    I could not have had better preparation for the work I had to do for this

    dissertation than my undergraduate experience at Caltech. I want to thank all of my

    friends throughout those four very tough but ultimately rewarding years.

    Finally, all the members of my family know that the roots of my philosophical

    education began with long arguments over pretty much anything with my twin brother

    Kay. I would like to thank Dad, Mom, Yochan, June, and Kay for their support.


    Table of Contents

Chapter 1: Underdetermination and Indirect Measurement

Chapter 2: Copernicus, Kepler, and Decomposition

Chapter 3: Newton and Kant on the Third Law of Motion

Chapter 4: Underdetermination in the Indirect Measurement of the Density Distribution of the Earth’s Interior

Epilogue

Bibliography


Chapter 1

    Underdetermination and Indirect Measurement

    1 Prelude

    Suppose one day archeologists unearth a mysterious artifact—a perfect black

    cube, 10 centimeters on a side, cool to the touch, made of what looks like the blackest

possible steel. They decide, rather unimaginatively, to call the artifact "Cube". It's a

    mere curiosity at first, but scientists soon find that it has some mystifying features. The

    material it is made out of is incredibly hard—it cannot be broken, cut, pierced, drilled, or

    dynamited. It cannot even be scraped in order to take samples of the material. All

    attempts to take CAT scans or MRI images of the inside of Cube have failed. On one

    face are several white dots that look as if they are projected onto the face from within

    Cube. The dots move across the face of Cube, tracing out trajectories over time.

    Now suppose we are scientists trying to figure out what is going on inside Cube.

    We will find, unfortunately, that our options are severely limited, since we have found

    no way of accessing the interior of Cube. What do we do? Perhaps the only thing to do

    is simply to assume that there are certain lawful connections between the internal and

    external states of Cube, that is, the dynamics of the external states depends somehow


    upon the dynamics of the internal states. We then make hypotheses about (a) the

    dynamics of the internal states of Cube, and (b) the laws that connect the internal to the

    external states of Cube. From these hypotheses, we deduce predictions about the

    dynamics of the external states. If those predictions match our observations of the

    external states, we say that those hypotheses have been confirmed. This method, the

    hypothetico-deductive method, was described by Pierre Duhem in The Aim and Structure

    of Physical Theory (1954) as being the method of physics, and it has been widely

    adopted by philosophers, most notably Quine.

    There is a problem with this method, though, as Duhem recognized. Since we

    have no antecedent knowledge whatsoever about the internal states of Cube, there is

    enormous leeway in the hypotheses we can come up with. For any given dynamics of

    the external states of Cube, there will be many different sets of hypotheses that are

    consistent with those dynamics. In philosophical parlance, our theory of the internal

    states of Cube is massively underdetermined by our observation of the external states.

    Because of this underdetermination, the mere agreement of predictions about the

    dynamics of the external states of Cube with actual observations gives us little reason to

    think that the hypotheses from which those predictions were deduced have, in any way,

    characterized the true internal states of Cube. Faced with this predicament, we might

    give up on the idea that we can gain any knowledge at all about the internal states of

    Cube, and instead become instrumentalists. We change our aim to simply predicting the

    dynamics of the external states of Cube without making any claim to having any

    knowledge about the internal states.


    2 Resolving underdetermination

    According to one way of thinking about the methodology of planetary astronomy

in the sixteenth century, planetary astronomers were in a position very much like that of

the scientists studying Cube. All of our knowledge about the solar system

    came from the observation of the motions of the planets as they moved across the night

    sky. More specifically, we can think of ourselves as being located inside an immense,

    hollow, black sphere, on the inner surface of which the constellations are painted. We

    can then determine the positions of the planets on this sphere, as seen from the earth, and

    thus record their apparent motions over time. We cannot, however, know how far away

    a planet is from us merely by looking at it. So we are, in effect, looking at the two-

    dimensional projection, onto the celestial sphere, of the actual three-dimensional

    motions of the planets through space. Moreover, although we did not know for sure in

    the sixteenth century, we are observing these motions from a platform, the earth, that is

    itself moving.

    Drawing out the analogy with the story of Cube, we can think of the apparent

    motions of the planets as corresponding to the external states of Cube, while the actual

    three-dimensional motions correspond to the internal states. Like the scientists studying

    Cube, astronomers in the sixteenth century faced a problem of radical

    underdetermination. Famously, the apparent motions of the planets across the night sky

    were compatible with three different theories of the actual motions of the planets—the

Ptolemaic, the Copernican, and the Tychonic theories[1]—in which the actual three-

    dimensional motions of the planets are radically different from each other. This is a

[1] I will describe these theories in more detail in chapter 2.


    classic situation of underdetermination. There were three radically different theories that

    could all be made to fit the observations then available to about the same degree of

    precision. At the end of the sixteenth century, some astronomers, such as a

contemporary of Kepler's called Ursus, came to conclusions similar to those I discussed

above about Cube.[2] They decided that the aim of planetary astronomy should not be

    about acquiring knowledge about the actual motions of the planets at all. Instead, the

    aim of planetary astronomy should simply be to provide a convenient way of calculating

    the apparent motions of the planets.

    How was this state of underdetermination eventually resolved? Well, suppose

    the method of astronomy is, like for Cube, hypothetico-deductive. You make

    hypotheses, deduce the observable consequences of these hypotheses, and then you

    compare these consequences with actual observations. Since the problem is that there

    were three theories that could fit the observations to the same degree of precision, we

    might think that one way of resolving the underdetermination is through increasing the

    precision in the actual observations. As we shall see in chapter 2, however, Johannes

    Kepler shows in the Astronomia Nova that, with minor modifications, the Ptolemaic,

    Copernican, and Tychonic systems can be made to give exactly the same predictions for

    the apparent two-dimensional motions of the planets—they can be made empirically

    equivalent. Thus, a mere increase in precision of the observations of the apparent

    motions could not resolve the underdetermination. What actually happened is that

Galileo turned his telescope to the skies in 1610 and observed that Venus has phases,

just like the moon. This observation is inconsistent with the Ptolemaic theory, so it was

[2] I will discuss Ursus in chapter 2.


eliminated from contention.[3] A new kind of technology, the telescope, allowed us to

    bring a new kind of evidence to bear on the question of what the actual motions of the

    planets are.

    I think, however, that there is a third way in which the underdetermination could

have gotten resolved. In fact, Kepler had a good argument, prior to 1610, that the

    Ptolemaic theory is not the correct theory of planetary motion. I just got done saying

    that Kepler showed all three theories of the planetary motions could be made empirically

    equivalent to each other, and so could not be distinguished on the basis of observations

    of the apparent two-dimensional motions of the planets. We might note, however, that

    the three theories predict very different motions for the planets through three-

    dimensional space. If we could somehow measure the actual distances between the

    planets with confidence, we could eliminate one or more of the theories. As I said, we

    cannot get planetary distances simply by direct observation of the two-dimensional

    motions, but they can be inferred from these two-dimensional motions by indirect

    measurement.

    3 Indirect measurement

    So we might be able to resolve underdetermination in some cases by using

    indirect measurement. As we shall see, however, there is a problem. In order to carry

    out indirect measurement, you have to presuppose certain facts about the system you are

    investigating. The central question of this dissertation will be: How can we know with

    confidence that indirect measurements are correct or approximately correct, given that

[3] It was not until Newton that the Tychonic theory was conclusively laid to rest, as we will see in chapter 3.


    we must presuppose certain facts about the system? Let me sharpen this question

    further by explaining what I mean by an indirect measurement, and giving some idea of

    what the assumptions are that you have to make about the system.

    Suppose there is a complicated, partially inaccessible system that I want to

    acquire knowledge about. A complicated system is one that consists of many parts,

    those parts having various properties and relations with each other. I say an object is

    partially inaccessible if we can only confidently measure a proper part of the properties

    of, and relations between, the parts of that object. I call the properties that we can

    confidently measure the accessible properties. I will also sometimes speak of accessible

    parts, by which I simply mean the parts of the system that have properties that we can

    confidently measure. In order to determine the properties and relations of the

    inaccessible parts, we must make inferences based upon what we know about the

    accessible parts. Indirect measurement, then, is the measurement of inaccessible

    properties or relations of a complicated, partially inaccessible system, through inference

    based upon observations of the accessible properties.

    We can think of the solar system, as viewed by astronomers in the sixteenth

    century, as a complicated, partially inaccessible system. It is complicated because it

    consists of many parts, namely the planets, the sun, and the moon, each having

    properties such as mass and size, and distance relations between them. It is partially

    inaccessible because we have access to the two-dimensional motions of the planets, but

    we do not have access to distances in three-dimensional space. So the measurement of

    planetary distances based upon observations of the apparent two-dimensional motions of

    the planets is indirect measurement.


    Now let us go back to the question I asked a few paragraphs back. Could we

    have used the observations of the apparent two-dimensional motions of the planets to

break out of the state of underdetermination prior to 1610? The answer to this question

    depends on whether we could have made indirect measurements of planetary distances

with confidence prior to 1610. I think we could, as I will argue in chapter 2. But here, I

    simply want to examine what might make us lack confidence about indirect

    measurements.

    Before I go on with my discussion of indirect measurement, I want to distinguish

    indirect measurement from a somewhat similar kind of problem. Suppose there is a

    system that is partially inaccessible but not complicated. For example, say we have

    found a huge underground lake, and we want to know the mineral content in the various

    parts of the lake, but we only have access to parts of it. We might then take samples of

    the water from the parts we can access, measure the mineral content in these samples,

    and then extrapolate to the entire lake. We are making the assumption here, of course,

    that the mineral content in the parts of the lake that are inaccessible to us is going to be

    similar to the mineral content in the parts that are accessible. If this assumption turns

    out to be wrong, we will be wrong about the mineral content in the inaccessible parts.

    There can be interesting epistemological problems with this kind of extrapolation, but it

    will not be a central topic of this dissertation. I will stick to complicated systems, for

    which I believe there are particular problems and ways of dealing with these problems.

    I will now explain what I take to be the central problems with indirect

    measurement. First, note that we can be very confident about the results of some

    indirect measurements. I do not have direct access to the amount of electric current


    flowing through a wire, but I can have great confidence in the value I measure using a

    galvanometer. At least part of the reason for this confidence has to do with what I call

    antecedent familiarity. If an object is of a type that is familiar to me, I can safely assume

    certain facts about that object. I know that if I drop a shot put from a height of 10 meters,

    it will reliably hit the ground in approximately 1.4 seconds, barring any extraordinary

    circumstances. I know this because I know that objects like shot puts fall with a uniform

acceleration of approximately 9.8 m/s² at the surface of the Earth. There have been some

    cases in the history of science, however, where we have wanted to know facts about an

    object that is utterly unlike anything else we knew of at the time. The solar system is a

    good example of such an object. For all astronomers knew in the sixteenth century, the

    solar system could have been radically different from anything else we knew of, so it

    was hard to know what a reasonable assumption to make about the solar system was.
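The shot-put figure quoted above is an instance of such antecedent familiarity at work: it follows from the constant-acceleration relation h = ½gt², so t = √(2h/g). A minimal sketch in Python (the function name and the check are mine, for illustration):

```python
import math

def fall_time(height_m, g=9.8):
    """Time to fall from rest through height_m metres, assuming constant
    acceleration g and negligible air resistance: h = (1/2) g t^2."""
    return math.sqrt(2 * height_m / g)

# A shot put dropped from 10 m, as in the text:
print(round(fall_time(10.0), 2))  # 1.43 s, matching the ~1.4 s quoted
```

The confidence in this number rests entirely on the assumed law of uniform acceleration, which is exactly the point being made about familiar objects.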

    I think there are two main difficulties when carrying out an indirect measurement

    that would make us lack confidence in such a measurement, particularly if the system we

    are making the measurement on is antecedently unfamiliar. The first difficulty is theory-

    mediation. You have to make measurements of the inaccessible properties, based upon

    observations of the properties that are accessible. In order to make such measurements,

    you need to presuppose that a particular relation applies between the accessible

    properties and the inaccessible properties. If the relation you use to make the

    measurement is not known antecedently, then the question naturally arises as to how you

    can know that the measurement is correct.

    The second difficulty is the combining of effects. Again, the root of this

    difficulty is that you have to make measurements of inaccessible properties based upon


    observation of accessible properties. Suppose the system you are making a

    measurement upon is complicated. If so, there could be more than one part of the

    system that has an effect on the accessible parts. If you want to measure a property of

    one of those parts, you might have to separate out, or decompose, the effects of the

    various parts on the accessible part. If you do not antecedently know the composition of

    the system, however, you might not know exactly how to carry out such a

    decomposition. If so, you might not be confident that the measurement you make using

    such a decomposition is correct.

    I will discuss these difficulties in more detail in the following sections of this

    chapter, but now let me return to the notion of underdetermination. Suppose that there is

    a system that we are interested in acquiring knowledge about, but there are two or more

    theories that can account for all observations equally well. As I mentioned, there are a

    couple of ways in which we can think we could resolve this situation of

    underdetermination. One way is simply to improve on the observations we already have,

    by increasing the precision of these observations. The other way is to come up with an

    entirely new set of observations, like Galileo observing the phases of Venus.

    What I am arguing in this dissertation is that there is a third way to resolve the

    underdetermination. This is to make indirect measurements by inference from the

    observations that are available to us. In order to make these indirect measurements,

    however, we must make certain assumptions about what the system is like. Because of

    the problem of theory-mediation, you have to make assumptions about the relation

    between the inaccessible properties and the accessible properties of the system. Because

    of the problem of combining of effects, you have to make assumptions about the


    composition of the system, that is, the relation between the parts of the system. Since

    these assumptions enable indirect measurements to be made, I will sometimes refer to

    them as enabling assumptions.

    So the now sharpened-up central question of this dissertation is the following:

    Given that, in order to carry out an indirect measurement, you must make inferences

    from the accessible properties of a system to the inaccessible properties, and that in

    order to make these inferences, you need to make the assumptions that (1) certain

    relations between accessible and inaccessible properties apply, and (2) effects from

    various inaccessible parts on the accessible parts can be decomposed in a certain way,

    how do you ensure that the indirect measurement you made is correct, or approximately

    correct? I will lay out a preliminary answer to this question in the rest of this chapter.

    4 Theory mediation

    If I want to find out how wide my window is, I simply take out a tape measure

    and measure it. Sometimes, however, I do not have the right kind of access to an object

    on which I want to make a measurement. As I write this, the Tokyo Skytree, which will

    become the tallest freestanding structure in Japan when completed, is being built.

    Suppose I want to figure out how tall it is at this point during its construction. I could

    not very well take out a tape measure to measure its height. Instead, I might improvise a

    device with which I measure the angle from the horizon to the top of the Skytree. I then

    find out the distance from my position to the Skytree construction site. Simple geometry

    tells me that the height of the Skytree should then approximately be this distance times


    the sine of the angle I measured, assuming that the angle is small. With the help of

    geometry, I have made a measurement of something that is physically inaccessible to me.
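The angle-and-baseline calculation just described is easy to make concrete. A small sketch with hypothetical numbers (the exact relation uses the tangent of the angle; the sine version in the text agrees for small angles):

```python
import math

def height_from_elevation(baseline_m, elevation_deg):
    """Estimate the height of a distant structure from the horizontal
    distance to its base and the measured angle of elevation.
    Exact: h = d * tan(angle); for small angles tan ~ sin, which is
    the approximation used in the text."""
    return baseline_m * math.tan(math.radians(elevation_deg))

# Hypothetical observation: 5 km from the construction site, an
# elevation angle of 6.5 degrees gives a height of roughly 570 m.
print(round(height_from_elevation(5000.0, 6.5), 1))
```

The numbers are invented; the point is only that a single angle plus a known baseline, fed through geometry, yields a quantity we cannot reach directly.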

    In a particularly philosophical moment, I might realize that I have made the

    assumption here that the Skytree is the kind of thing to which Euclidean geometry

    applies. We would never call this assumption into question in our day-to-day dealings.

    But what if, instead of the Skytree, I was trying to calculate distances to something that

    is utterly unfamiliar to me? Astronomers in the sixteenth century, for example, used

    geometry in determining the orbits of the planets. If they had known of other geometries,

    they might well have raised the question of whether Euclidean geometry really applies to

    the planets. After all, those planets were known to be unimaginably distant, and nobody

    had the faintest clue what kind of material they could be made out of. Why should we

    believe Euclidean geometry applies to them?

    For almost all practical purposes, when we make such a measurement, we are on

    safe ground assuming that mathematics and geometry will apply to the objects that we

    are investigating. But sometimes, in order to make a measurement, we need to assume

    more than mathematics and geometry. Sometimes we have to assume that a system on

    which we are trying to make a measurement has certain physical properties, and behaves

    in accordance with certain mathematical relations. Because I make use of a bit of

    physical theory in order to make this kind of measurement, I say that such measurements

    are theory-mediated.

    Now, when we make measurements using bits of physical theory, the way in

    which the theory is used in the measurement can be surprisingly complicated. For

    example, consider the problem of trying to measure the muzzle velocity of a cannon.


    One way we might make this measurement is to fire the cannon and measure how far the

the cannonballs fly. The following equation allows you to calculate, given the angle θ

at which a cannon is fired, the muzzle velocity v, and the gravitational acceleration g at the

surface of the Earth, the horizontal distance D at which a cannonball lands:

D = 2v² (cos θ)(sin θ) / g. (1)

    This equation assumes no air resistance, a perfectly flat Earth, and a constant

    acceleration due to gravity. In order to calculate the distance D, all you need to do is

    plug in the values of the muzzle velocity and the angle of the cannon.

    Now, suppose we want to determine the muzzle velocity of a particular cannon,

    but we do not have any means of directly measuring the velocity of the cannonballs as

    they come shooting out of the muzzle. There is a way of using the equation given above

    for making a measurement of this muzzle velocity. We can think of this method as a

    way of measuring a property of something that is not directly accessible, much like our

    determination of the height of the Tokyo Skytree.

    First, we fire the cannon several times, at a predetermined angle, and measure the

    distances at which cannonballs land. We then might guess various values of v, for which

    we calculate the distances D at which we predict the cannonball ought to land. We take

    the value of v that gives us a predicted value for D that is the nearest to the actually

    observed values. Then we might refine our value of v further by taking a cluster of

    values around this best value for v, and calculating the distances at which we predict the

    cannonball ought to land given these values for v. We then compare these distances

    with the distances we have actually measured, and take the value of v whose predicted

    distance is closest to these measured distances. We can keep repeating this until we home in on a value for v. Using

    this procedure, we hopefully will have measured the muzzle velocity.
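    As a concrete illustration, this guess-and-refine procedure can be sketched in a few lines of Python. This is only a minimal sketch under the idealized assumptions of Equation 1 (no air resistance, flat Earth, constant g); the function names, the bracket endpoints, and the use of interval-halving to do the homing are my own choices, not anything from the text.

```python
import math

G = 9.81  # constant gravitational acceleration (m/s^2)

def predicted_range(v, theta):
    """Equation 1: horizontal distance D for muzzle velocity v, angle theta."""
    return 2 * v**2 * math.cos(theta) * math.sin(theta) / G

def home_in_on_v(observed_D, theta, lo=1.0, hi=2000.0, rounds=60):
    """Repeatedly narrow a bracket of guessed muzzle velocities, keeping
    the half whose predicted distance best matches the observed one."""
    for _ in range(rounds):
        mid = (lo + hi) / 2
        if predicted_range(mid, theta) < observed_D:
            lo = mid  # predicted distance falls short: v must be larger
        else:
            hi = mid
    return (lo + hi) / 2

theta = math.radians(45)
observed_D = 9174.3  # stand-in for an averaged measured distance (metres)
v_hat = home_in_on_v(observed_D, theta)
```

    With these made-up numbers the procedure recovers a muzzle velocity of roughly 300 m/s; the point is only that D varies smoothly and monotonically with v, which is what lets the homing procedure converge.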

    Note that this procedure involves using a mathematical equation where v and θ

    are independent variables, and D is a dependent variable. If the aim were to determine

    D given values for v and θ, one could simply plug in the values and use the equation to

    calculate D. In this case, however, we are using measured values of D in order to try to

    determine the value of v—that is, we are trying to determine the value of an independent

    variable using measured values for the dependent variable. The way in which we do this

    is to vary the value of v until we find one that fits the value of D that we have observed.

    Often, the independent variables such as v are called parameters, and this kind of

    problem is called a parameter estimation problem, or a bit more colloquially, curve-

    fitting. This kind of problem is also often called an inverse problem, particularly in

    cases where instead of trying to estimate discrete parameters, you are trying to estimate a

    continuous function.

    Suppose we take the mathematical equation to be correct, and that θ is known.

    Then the logical relation between v and D is in the form of an if-statement: if v has

    such-and-such a value, then D has such-and-such a value. Note that this relation does

    not uniquely determine v given D. What we really would want to guarantee uniqueness

    for the value of v would be a logical relation in the form of an if-and-only-if statement.

    There is also a further problem having to do with the logic. We used a kind of homing

    procedure to find the value of v, where we first guess a value and then adjust v until we

    get a value for D that best fits our measured value. Note that this homing procedure

    works because we know that D is going to be smooth over small variations in v. But if


    the equation we were using were such that the dependent variable is sensitive to small

    fluctuations in the independent variables, we would not be able to do such a homing

    procedure. For the homing procedure to work, the logic has to be of the form if v has

    very nearly such-and-such a value, then D has very nearly such-and-such a value. In

    some cases of indirect measurement, the use of this very nearly relation is crucial, as we

    shall see in chapter 3.

    In some cases, due to the mathematical relation between the independent and

    dependent variables, there are problems having to do with the nonuniqueness of

    solutions. Methods for addressing these nonuniqueness problems have recently become

    important in geophysics, computer imaging, and other fields, under the rubric of

    "inverse problem theory". I will postpone discussion of this problem until chapter 4.

    5 When a measurement is theory-mediated, how do we know it’s correct?

    As we did with the measurement of the height of Tokyo Skytree, we might think

    about the assumptions we are making when we carry out this measurement. How do we

    know that these assumptions will result in correct measurements? For example, what

    needs to be the case in order for us to come up with the correct value for v, the muzzle

    velocity of the cannon?

    Our initial impulse might be to say that the equation we are using, and the

    assumptions we are making about this system, must be true of the system. But we

    should immediately realize that the equation we are using, and the assumptions we are

    making about the system, such as no air resistance and a constant acceleration due to

    gravity, are, strictly speaking, false with respect to this system. Now, one might think


    that we ought to try to make the measurement procedure as realistic as possible, by

    including as many details as we can. We could, for example, try to include air resistance,

    include known details of the terrain, even allow for things like wind and atmospheric

    pressure. The problem is that, in many cases, adding too many details to the

    measurement procedure complicates the procedure enormously, and in some cases

    makes the determination of a value impossible.

    On the other hand, we would be in trouble if the assumptions we make are too

    unrealistic. In that case, we could perhaps carry out the measurement procedure and

    determine values for the properties of the system. But if the assumptions we make are

    too unrealistic, the values we calculate would give us properties of some imaginary

    cannon, not the real cannon we are interested in. There is a tradeoff here. If the

    assumptions we use are too unrealistic, then we would get the wrong answer for our

    measurement. But if we are too realistic, then we won't be able to carry out the

    measurement procedure. The trick is to find assumptions that are realistic enough so

    that they will let us calculate a value for the muzzle velocity that is close enough, for our

    purposes, to the correct value for the real cannon.

    How, then, do we know that we are making the right assumptions, and using the

    right equation, to calculate the correct value for v? With regard to the cannon example,

    the answer to this question is ultimately going to be an appeal to our everyday

    experience, and our experience with cannons in particular (hopefully, we are

    experienced artillery engineers). Our familiarity with the type of thing that cannons are,

    and the conditions under which they are fired, allows us to justify the assumptions we

    make about the system.


    There was also the further problem of the logic of the relation between v and D.

    Even if I have found a value for v that is consistent with the value for D that I have

    measured, the logic does not guarantee that the value for v that I found is unique. Here

    again, though, we make the assumption that the value for v is unique because of our

    familiarity with the situation. We know that, given a constant value for θ, and the

    conditions under which the cannon is fired, Equation 1 ought to apply at least

    approximately, and there should be a unique positive value of v for each value of D.
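    In fact, for Equation 1 this uniqueness claim can be made explicit: for a fixed angle between 0 and 90 degrees, D is strictly increasing in positive v, so the relation can be inverted in closed form. A minimal sketch, assuming Equation 1 exactly (the function name is mine):

```python
import math

G = 9.81  # gravitational acceleration (m/s^2), as in Equation 1

def v_from_D(D, theta):
    # Closed-form inversion of Equation 1: for D > 0 and 0 < theta < pi/2,
    # exactly one positive muzzle velocity is consistent with D.
    return math.sqrt(G * D / (2 * math.cos(theta) * math.sin(theta)))
```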

    In this example, there is a part of the system to which we do not have direct

    access—we cannot directly measure the muzzle velocity of the cannon. In order to

    make an indirect measurement of this muzzle velocity, we must make a large number of

    assumptions about the cannon. Fortunately, the cannon is a type of system that is

    familiar to us, so we can have confidence in the assumptions we make. We might say

    that the muzzle velocity of the cannon is an inaccessible property of a familiar system,

    and our familiarity with systems of this type allows us to set up a procedure through

    which we can measure this inaccessible property.

    What do we do, though, if we want to measure inaccessible properties of

    unfamiliar systems? Let us hold that thought until after I discuss the second of the two

    main difficulties of indirect measurement, the combining of effects.

    6 Representing partially inaccessible systems

    Before I discuss the combining of effects, however, I first want to introduce the

    following way of representing partially inaccessible systems. This will facilitate the


    discussion by giving us an intuitive grasp of what is going on in cases of the combining

    of effects.

    Figure 1

    We might represent our measurement of the muzzle velocity of the cannon as in

    Figure 1. This diagram is in the form of a directed graph.4 The reason it is a directed

    graph should become clear in the next few sections, but let's just take a look at the figure

    first. There are two nodes, labeled X and Y. There is an arrow, labeled a, going from X

    to Y. Here is how to interpret this picture. Y stands for the distance the cannonballs

    travel, X stands for the muzzle velocity of the cannon, and the arrow a stands for the

    relation between X and Y, namely Equation 1 given above. The relation a uniquely

    determines Y, given X. That is, as I have mentioned, it is a logical relation of the form if

    4 I should say that some inspiration for these diagrams comes from Jim Woodward's work on

    causation. The idea of these diagrams, however, is not to try to infer causes from observation.

    In fact, it is almost the opposite—this sort of structure is assumed in order to enable

    measurements of properties. I was also greatly influenced by George Smith's work, particularly

    his paper "Closing the Loop", encapsulated in the idea of trying to find the "details that make a

    difference, and the differences they make".


    X = v, then Y = w. We have access to Y, that is, we have the means for confidently

    measuring its value. What we want is to find the value of X. Ignoring, for now, the

    difficulties I mentioned involving nonuniqueness, we can say that the value of X can be

    determined if we know the value of Y, because we know the relation a.

    We can think of the arrow a, in this case, as representing a causal relation. But

    in other cases, the arrow could stand for other relations. For example, the measurement

    of the height of the Tokyo Skytree can also be represented by Figure 1. Think of X as

    standing for the height of the Skytree, and Y as standing for the angle from the top of the

    Skytree to the horizon and the distance from my position to the Skytree site. We are

    now interpreting Y as standing for two variables. The arrow a now stands for a

    geometrical relation between X and Y, which uniquely determines Y, given X. As in the

    cannon example, we can determine the value of X, given Y, because we know the

    relation a.

    Note, though, that these graphs should not be taken to be faithful representations

    of these systems. For example, as I mentioned with regard to the cannon example, the

    relation represented by a, Equation 1, is not actually true with regard to the system. We

    might further take issue with the structure of the diagram itself. There are factors, such

    as the wind, that will influence the distance that the cannonball travels. Shouldn't there,

    then, be other arrows that point towards the node Y? In fact, if we wanted to come up

    with a complete picture of what is happening with the cannon, we would have to have a

    very complicated graph, with nodes standing, say, for the wind, details of the terrain,

    variations in the gravitational constant, Coriolis forces, and so on. As experienced

    artillery engineers, we might decide that we need not consider any of those things. We


    assume that those other things will not have much of an effect on the outcome, and we

    feel right about this in virtue of our experience as artillery engineers. In this case, this

    very simple picture involving just X, Y, and a is sufficient for us to get a reasonably

    accurate value for X, which is what we wanted. We assume that the relation a holds for

    this system well enough for us to make this measurement.

    One further remark: the diagram looks like the kind of thing that is often called a

    model, both by scientists and philosophers. Because the word is used for many different

    kinds of things in the philosophical literature, however, I have thought it best to avoid it.

    The role of these diagrams is simply to represent the elements that are necessary for the

    measurement to be carried out, and their relation with each other. I am using them as

    conceptual tools for thinking about particular cases of measurement, and to facilitate

    discussion about what is going on in such measurements. It should not be assumed that

    a scientist carrying out a measurement has such a diagram explicitly in mind.

    7 Combining of effects and decomposition

    Now the discussion in the previous section raises an obvious question. What if

    the system I am investigating is more complicated, having various different parts that

    have significant effects on the accessible parts? This is the problem I discussed earlier

    in this chapter as the problem of the combining of effects.


    Figure 2

    We can now discuss this problem using the diagrams I have just introduced.

    What if I can't reduce a system to a very simple one like Figure 1, but it is more like

    Figure 2? In Figure 2, there are now three nodes, X, Y, and Z, and two arrows—one

    from X to Z, labeled a, and the other from Y to Z, labeled b. We can take a to be a

    relation that licenses an inference of the following form: given that there are no other

    factors affecting Z, then if X = v, then Z = w. Similarly, we can take b to be a relation

    that licenses an inference of the form given that there are no other factors affecting Z,

    then if Y = v, then Z = w. Now, suppose we have access to Z, and we want to measure

    either Y or X. Since Z is affected by both Y and X, we need some way of separating out

    their effects on Z. If we could somehow successfully separate out their effects, we

    would be able to measure X or Y.

    Let me illustrate this situation with the cannon example again. Let Z be the

    distance the cannonball travels, and let X be the muzzle velocity of the cannon. The

    arrow a going from X to Z again represents Equation 1. But now we have another factor,


    represented by Y, that has an effect on the distance the cannonball travels. Say Y is the

    speed of the headwind or tailwind in the direction the cannonball is shot. Then in order

    to measure X given observations of Z, we would somehow have to compensate for the

    effect of Y on Z.

    How do we compensate? Perhaps the easiest way to do it is to wait to fire the

    cannon at times when there is no wind. Since at such times there will be no effect of Y

    on Z, we can effectively reduce Figure 2 to Figure 1. In this case, we are isolating the

    effect of X on Z from the effect of Y on Z, in order to measure X. Now, it just happens

    that in this example, this sort of measurement using isolation can be done. But what if

    there is never a time when the wind dies down, and there is always a headwind, for

    example?

    More generally, what do you do in a situation like in Figure 2, where you have

    access to Z, and you want to measure X, but there is always a significant effect of Y on

    Z? You would have to find some way to separate out the effects of X and Y on Z. I call

    the process of separating out the effects decomposition. How might you carry out this

    decomposition? One way to do it would be to somehow try to model what the effect of

    Y on Z would be, and then subtract that out in order to measure X. Of course, we are

    making the assumption here that the effects of X and Y on Z will add linearly, which will

    not always be the case. At this point, however, I don't want to make things too

    complicated. Let me simply note, at this point, that we are indeed making this

    assumption about how the effects add together.
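    As a toy illustration of this subtract-and-measure strategy: suppose the wind's contribution to the distance adds linearly to the no-wind range of Equation 1. The wind model below is entirely hypothetical, chosen only to make the example concrete; the names and numbers are mine.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

def range_no_wind(v, theta):
    # Equation 1: the cannonball's range with no wind at all
    return 2 * v**2 * math.cos(theta) * math.sin(theta) / G

def wind_effect(wind_speed, v, theta):
    # Hypothetical linear wind model: a steady tailwind carries the ball
    # an extra (wind speed x time of flight) metres
    flight_time = 2 * v * math.sin(theta) / G
    return wind_speed * flight_time

def estimate_v(observed_Z, wind_speed, theta, lo=1.0, hi=2000.0, rounds=60):
    # Home in on v, each time subtracting the modeled wind effect from
    # the observed distance before comparing with the no-wind prediction
    for _ in range(rounds):
        mid = (lo + hi) / 2
        residual = observed_Z - wind_effect(wind_speed, mid, theta)
        if range_no_wind(mid, theta) < residual:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

theta = math.radians(45)
true_v, wind = 300.0, 10.0  # made-up "true" values used to generate data
observed = range_no_wind(true_v, theta) + wind_effect(wind, true_v, theta)
v_hat = estimate_v(observed, wind, theta)
```

    If the additivity assumption fails, of course, subtracting the modeled wind effect would bias the recovered value of v, which is exactly the worry raised in the text.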


    Figure 3

    Now, there are other arrangements we can think of as well. In Figure 3, we have

    an arrow going from X to Y, and an arrow from Y to Z. Suppose we have access to Z,

    and we want to determine the value of X. In this case, X has a causal effect on Y, and Y

    has a causal effect on Z, and we need somehow to measure X via its effect on Y.

    Another possible arrangement is in Figure 4, where now in addition there are arrows

    going between X and Y. We might think of this as a case where there is now some kind

    of causal interaction. I call the various different ways in which we can arrange the

    arrows and the nodes the relational structure. If all the relations are causal relations,

    then we can think of it as a kind of causal structure. In all of these cases, if we want to


    measure X or Y based on our observations of Z, we must somehow separate out the

    effects of X and Y on Z—that is, we must carry out a decomposition.

    Figure 4

    All of this might seem complicated, but when we are trying to measure

    properties of something that is familiar to us, isolating and decomposing the various

    effects comes rather naturally. For example, suppose I am in a moving car and I have a

    radar gun with me. I want to measure the speed of an oncoming car. I point the radar

    gun at the car, then I look at the speed that the radar gun gives, and then I compensate

    for my own speed by looking at my speedometer and subtracting my own estimated

    speed. This is a form of decomposition that comes naturally because this is a system

    that is made up of parts that are familiar to us. Of course, as the directed graph gets

    more complicated and you have to decompose more effects, measurement can become

    immensely more difficult.

    I think decomposition is an aspect of indirect measurement, and of scientific

    methodology in general, which has been overlooked by philosophers. In individual


    cases, scientists are certainly aware of the difficulties involved with separating out

    various effects when carrying out measurements. But there has been very little

    philosophical literature on the problems of decomposition.5

    8 Antecedently unfamiliar systems

    Up to this point, we have been talking about the measurement of inaccessible

    properties of familiar systems. What if, instead of a familiar system such as a cannon, I

    were trying to make a measurement on an inaccessible property of a system that is

    antecedently unfamiliar to me? Is this even possible? Don't we have to know certain

    things about the system antecedently in order to measure such inaccessible properties?

    In the case of the cannon, we have to know the laws of physics, facts about the

    environment of the cannon such as properties of the air and terrain, and less quantifiable

    facts about cannons in general—how they are manufactured, how they are fired, and so

    on. How could we possibly set up a measurement of an inaccessible property of an

    antecedently unfamiliar system?

    History seems to show, however, that successful measurements have been made

    of inaccessible properties of antecedently unfamiliar systems. Consider planetary

    astronomy again. The solar system—not to be confused with the traces we observe of

    planets across the night sky, but the planetary system itself—was surely about as

    inaccessible and antecedently unfamiliar as a system could be. We might now laugh at

    the idea that the planets are carried around the heavens in crystalline spheres, but the

    solar system is utterly unlike anything astronomers at the time knew about. There was

    5 A few philosophers who have addressed this problem or related problems are George Smith

    (2002a, 2002b), Hasok Chang (2004), and William Wimsatt (2007).


    simply no way to know in advance what a reasonable assumption about the planets is.

    Yet, as I show in chapters 2 and 3, the work of Kepler and Newton provides examples of how

    measurements of antecedently unfamiliar systems can be carried out successfully.

    Let us think carefully about what makes the measurement of inaccessible

    properties of antecedently unfamiliar systems difficult. As I have discussed earlier,

    there are two basic problems—theory-mediation and the combining of effects. First, to

    illustrate the problem of theory-mediation, let us return to the cannonball example.

    Recall that the diagram for that example is given in Figure 1. There are two nodes, X

    and Y, with an arrow, a, pointing from X to Y. Now, suppose we didn't know the laws of

    physics, so we couldn't derive the relation, Equation 1, which relates X to Y and thereby

    allows us to measure X by observing Y. If we only have access to Y, we would not be

    able to measure X, without knowing this equation. Let me represent this situation in

    Figure 5. I have X and Y, but now only a dotted arrow from X to Y, with a question mark

    next to it. This is an indication that we think we know that there is a relation between X

    and Y, but we don't know exactly what it is.

    Figure 5


    How would we measure X in this case? One thing we might think of doing is

    simply guessing the relation. But how could we be at all sure that we have measured X

    correctly, using a guessed relation? If I were really a cannon maker, here's what I would

    think of doing. I would try to build something like the cannon, that launches a heavy

    object like a cannonball, but for which I know the initial velocity—perhaps a catapult of

    some kind. By launching the object at different velocities, I might find some kind of

    relation between the initial velocity and the distance traveled. Then, by induction, I

    assume that the same relation holds for the cannon. Since I now have a relation between

    X and Y, I can make the measurement. So in this case I do not have to derive something

    like Equation 1 from fundamental theory—I can determine it empirically. Still, one

    might ask whether the inductive move is justified—how do I know that the relation I

    found from the catapult applies to cannons as well? Let us hold this thought for a while.

    Figure 6

    Now let us think about the problem of combining of effects. Think once again

    about the cannonball example. Suppose we do know Equation 1, so we know of the


    relation a relating X to Y. But perhaps we are inexperienced as artillery engineers. We

    don't know whether there could be other influences on the distance traveled, such as the

    wind. Without knowing whether there could be such other influences, we would not be

    able to measure X with confidence. Let me represent this situation in Figure 6. I have X

    and Y, and an arrow going from X to Y as in Figure 1, but now I have a couple of dotted

    arrows going towards Y with question marks beside them, indicating possible effects on

    Y. Now, again, if we were really cannon makers, there would be ways of determining

    whether, say, wind is a factor. We could, for example, fire the cannon using the same

    amount of powder under various conditions of wind to make sure that the distance the

    cannonball travels is not affected too much by the wind. But, of course, there could be

    further unforeseen conditions that affect the distance the cannonball flies. Without being

    able to anticipate such unforeseen conditions, we have no way of correcting for them.

    9 Indirect measurement and evidence

    Let me now return to what I said is the central question of this dissertation: Given

    that, in order to carry out an indirect measurement, you must make inferences from the

    accessible properties of a system to the inaccessible properties, and that in order to

    make these inferences, you need to make assumptions that (1) certain relations between

    accessible and inaccessible properties apply, and (2) effects from various inaccessible

    parts on the accessible parts can be decomposed in a certain way, how do you ensure

    that the indirect measurement that you made is correct, or approximately correct?

    If the system we are making an indirect measurement on is antecedently familiar,

    we can often give plausibility arguments for assumptions (1) and (2). For example,


    going back to the cannon example again, we take it as given that the laws of physics

    apply to cannonballs, and that under the right conditions, Equation 1 will apply. And I

    can give an argument based on past experience to say that the actual conditions are

    indeed close enough to those conditions for us to be able to apply Equation 1 to this

    particular situation—that, say, the wind is not going to be a factor. But what do we do if

    the system is antecedently unfamiliar?

    If we look at cases from the history of science, there is not a simple answer,

    because the situations tend to be very complicated. Even in cases where the system you

    are investigating is antecedently unfamiliar, you can give plausibility arguments for the

    assumptions. For example, as we shall see in chapter 3, Newton referred to experiments

    done in his laboratory to justify the applicability of the laws of motion in the Principia.

    This is a reasonable assumption to make as a working hypothesis, but it could not have

    been known at the time that the laws are in fact applicable to celestial objects.

    Plausibility arguments are much weaker without the weight of experience behind them.

    I think there is a different way of gaining confidence that an indirect

    measurement is correct or approximately correct, which does not involve trying to come

    up with a straight justification for the assumptions (1) and (2): let the indirect

    measurements themselves be evidence that the assumptions were correct.


    Figure 7

    I think there are at least two strategies through which this can be done. The first

    strategy is converging measurement.6 Suppose there is some system that we can

    represent by Figure 7. There is a node X with two arrows out from it, arrow a to node Y,

    and arrow b to node Z. Suppose both Y and Z are accessible properties, that is, we have

    a way of measuring their values confidently. Suppose we don't have too much

    confidence in the relations a and b. In this situation, there are two different ways of

    measuring X, through observation of Y using relation a, and through observation of Z

    using relation b. If we carry out both measurements, and we get approximately the same

    result, that is, they converge, then this is good reason to think that the measurements are

    good, and that the measurement of X is correct. We can, of course, have more than two

    such converging measurements. The more the results converge, the better reason we

    have to believe that the measurement of X is indeed correct. Note, however, that we can

    get converging results even if the relations a and b are not strictly true of the system—it

    could be the case that, say, relation a simply holds to a good approximation under the

    6 The term, and the idea, are George Smith's. See (Smith 2002a) and his unpublished

    manuscript "Closing the Loop".


    circumstances of the measurement. It turns out that we have more reason to believe that

    the measurement itself is correct than the assumptions we made in order to make the

    measurement.

    This has implications, by the way, for the way in which we view the "flow" of

    evidence in science. In Figure 7, I have confidence in my measurements of Y. I have

    low confidence in the relation a. Since I am using the relation a to measure X, one

    might think that I should have low confidence in my measurement of X. This would

    indeed be the case if I only measured X one way, but if I also measure X through the

    other relation b, and they converge, then this will increase my confidence in X even if I

    have low confidence in b. In fact, this might be reason to raise my confidence in the

    applicability of the relations a and b. To put it in a loose but picturesque way, evidential

    power does not flow monotonically from Y and Z towards X. Rather, under certain

    circumstances such as converging measurements, X can be a new source of evidence,

    and the evidential power can actually "flow outward" from X. Of course, we have to be

    careful about what such converging measurements actually show about the relations a

    and b. The conclusion we can draw from such convergent measurement is that the

    relations a and b are applicable under the conditions of the measurements, but we would

    not know whether they would be applicable in other conditions.

    There are other strategies besides converging measurement by which we can let

    the indirect measurements themselves serve as evidence that the assumptions were correct.

    They involve more complicated relational structures. The following strategy is what I

    call decompositional success. For example, take a look again at Figure 2. Here, the

    accessible property Z is affected by both the inaccessible property X, via the relation a,


    and the inaccessible property Y, via the relation b. Suppose we don't have too much

    confidence in the relations a or b, and we want to measure X. We might first try

    guessing the effect of Y on Z, subtracting that effect out, and then measuring X using the

    relation a. We now have a way of modeling the effect of X on Z using the relation a.

    Now subtracting that effect out, we measure the value of Y using the relation b. Using

    this new value of Y, we model the effect of Y on Z. We subtract out that effect and

    measure a more refined value for X. Using this new, refined value for X, we model the

    effect on Z, and we now come up with a new, refined measurement for Y.

    If my measurements of X and Y seem to be converging on certain values, then

    this is good evidence that this relational structure is approximately correct and the

    relations a and b are also at least approximately applicable. Why? Suppose the relation

    a is not approximately applicable. Then when we model the effect of X on Z and

    subtract out this modeled effect in order to measure Y, we do not expect to get a good

    value when we measure Y. Then, when we model the effect of Y and subtract it out to

    measure X, we should expect this measurement not to give a good value for X, and thus

    it should not agree with the previous value for X. Thus, if the sequence of values for X is

    converging, this is evidence that the values for X and Y are correct. To put it loosely, we

are "playing the measurements of X and Y off of each other"—the measurement of X

    presupposes that the measurement of Y is approximately correct, and the measurement of

    Y presupposes that the measurement of X is correct. If either one is not approximately

    correct, then in all probability the procedure should not work.

    In actuality, these relational structures often turn out to be even more

complicated. But it is the very complexity of these structures that allows them, in

some cases, to confer very high confidence that some indirect measurements are correct.

    The more complicated a structure, the more ways in which one can play measurements

    off of one another, or try to measure one property in more than one way.

    Now I want to discuss some limitations of these methods. First, as we shall see

    when we start looking at actual cases of indirect measurement, most indirect

    measurements are far from easy to do, especially when they involve systems that are

    partially inaccessible. They often involve observations that are limited and hard to get,

    and the calculations themselves can often be laborious, especially when we consider

    sciences such as planetary astronomy in the sixteenth and seventeenth centuries. Thus,

indirect measurements will often be made with the hope that the assumptions made in

carrying them out will be shown, down the road, to be true. We will see in chapter 3,

for example, that this is the best way to view what

    Newton was doing in the Principia.7

    The second limitation also has to do with the temporal dimension. These

    methods all involve comparing the results of different indirect measurements. In most

    cases, the indirect measurements will be made at different times. If the property you are

    measuring changes over time, then you will not be able to get converging measurements.

    Thus, a fundamental presupposition in using these methods is that the property you are

    measuring will not be changing its value significantly over time—that the value will be

    stable. This is an issue that I will discuss in more detail in chapter 3.

7 This is George Smith's view of the methodology of the Principia. This dissertation is largely

the result of trying to understand Smith's views of methodology, particularly as they relate to the

    problem of underdetermination.


    8 Case studies

    Now that I have laid out my general view of indirect measurement, the rest of

    this dissertation is devoted to case studies of indirect measurement of complicated,

partially inaccessible systems. Each case begins with a difficult problem of

underdetermination—the available observations are not good

    enough to uniquely determine the inaccessible properties of the system. Indirect

    measurement through the use of enabling assumptions will resolve at least part of that

    underdetermination. I will, for the most part, focus on understanding the justification for

    the enabling assumptions.

    In chapter 2, I examine the indirect measurement of planetary distances in the

    solar system in the sixteenth and seventeenth centuries by Copernicus and Kepler. In

    this case, there was an underdetermination between three different theories about the

    motions of the planets, which can be partly resolved by the measurement of distances

    between the planets. The measurement of these distances was enabled by making

    certain assumptions about the motions of the planets. I argue that part of the

    justification for making these assumptions comes from decompositional success in

playing off measurements of the earth's orbit and the orbit of Mars against each other.

    In chapter 3, I examine the indirect measurement of mechanical properties such

    as mass and forces in the solar system by Newton. In this case, there were two

    underdeterminations, the first an underdetermination between two theories about the

    relative motion of the sun and the earth, and the second an underdetermination between

    various theories for calculating planetary orbits. Newton resolves these two problems of

    underdetermination through a research program where the various sources of force are


    identified and accounted for. This program crucially requires the third law of motion to

    apply between celestial objects, a point on which Newton was criticized. I examine the

    justification for the application of the third law of motion through its successful use for

decomposition of forces in the solar system, in a long-term research program. I further

    discuss comments by Kant on the role of the third law of motion for Newton, in which

    Kant recognizes its indispensability for a long-term program for determining the center

    of mass of the solar system and thus defining a reference point relative to which forces

    can be identified.

Chapter 4 covers the indirect measurement of density in the earth's interior using

    observations of seismic waves. One of the difficult problems in this case is that we can

    think of the interior density of the earth as a continuous function of radius—in order to

determine this density function, you are in effect making a measurement of an infinite

    number of points. The natural question to ask here is how much resolution the

    observations give you. I will focus on the work of geophysicists who were concerned

with this problem, out of which eventually a standard model for the earth's density grew.


    -2-

    Copernicus, Kepler, and Decomposition

    1 Planetary Astronomy

    The most difficult problem of planetary astronomy in the sixteenth century

    was that the observed two-dimensional motions of the planets across the night sky

    are consistent with three different theories of the actual three-dimensional motions of

    the planets through space—the Ptolemaic theory, the Copernican theory, and the

    Tychonic theory. In other words, the theory of the actual motions of the planets was

    underdetermined by the available observations. In fact, by making minor

    modifications, you could make the theories empirically indistinguishable from each

    other, given the kinds of observations that were available at the time. It seemed to

some astronomers in the sixteenth century that this underdetermination was

    unresolvable, and that, in fact, trying to determine the actual motions of the planets

    should not even be an aim of planetary astronomy.

    This problem could be solved, however, if you could find a way of indirectly

    measuring the distances between the planets, for the motions of the earth, the sun,

    and the planets through space are different for each of these theories. Copernicus


    and Kepler both use the method of triangulation to attempt to measure planetary

    distances—setting up a triangle with the sun, the earth, and a planet at the corners

    and using geometrical relations to determine distances. In order to carry out this

    procedure, however, it is very important to know the angles of the triangle accurately.

    But as I will explain, in order to determine these angles, you must perform what I

    called a decomposition in chapter 1—you have to separate out the effects due to two

    different features of the planetary motions. These features are called the first

    inequality and the second inequality.

    Thinking about this problem in terms of the picture of indirect measurement I

    provided in chapter 1, we can take the solar system to be a complicated, partially

    inaccessible system, with the apparent motions of the planets being the accessible

    properties of the system, and the actual three-dimensional motions of the planets

    being the inaccessible properties. In chapter 1, I explained that in order to carry out

    an indirect measurement, you need to assume that (1) certain relations between the

    accessible and the inaccessible properties apply, and (2) effects from various

    inaccessible parts on the accessible parts can be decomposed in a certain way. The

    central question was how you ensure that the indirect measurement is correct or

approximately correct, given that assumptions (1) and (2) must be made.

With regard to assumption (1), the fundamental theory from which the relations

    between the accessible and inaccessible properties, that is, the relations between the

    apparent motions of the planets and the actual three-dimensional motions of the

    planets, are derived, is Euclidean geometry. That Euclidean geometry is applicable

    to the planets was never called into question by astronomers—they could not have,


    of course, since they did not know of any other geometry than that of Euclid.

    Assumption (2), however, involves exactly how you break down the apparent

    motions of the planets. All astronomers at the time, following Ptolemy, separated

    out two motions, the first inequality and the second inequality. There were

    disagreements, however, as to how to characterize each of these motions, and to what

    actual motions of the planets the first and second inequality corresponded. Since this

    separation of motions had to be done in order to determine planetary distances, how

    could an astronomer know whether a measurement of planetary distances involving

decomposition was correct?

    2 Planetary Astronomy in the sixteenth century

    Although a more thorough treatment of planetary astronomy from the mid-

    sixteenth to the early seventeenth century would certainly require a section on Tycho

    Brahe, I will focus on the work of Copernicus and Kepler. We will be thinking about

    the work of Copernicus and Kepler in terms of the framework I described in Chapter

    1. We have access to the two-dimensional motions of the planets across the night

    sky, that is, angular distances of the planets relative to the constellations, over time.

    What we want to know are the actual motions of the planets in three dimensions, that

    is, relative distances and directions of the planets over time. We will find that the

    measurement of planetary distances crucially involves separating out two different

    features of the motions of the planets—the first inequality and the second inequality.

    I will explain what the first and second inequalities are shortly, but let us first

    consider the apparent two-dimensional motions of the planets. We can think of the


    night sky as a vast, hollow sphere, onto the inner surface of which are painted the

    stars that are visible from the earth, some forming the familiar shapes of the

    constellations. The sun appears to make one entire circuit around this sphere every

    year, and the great circle along which it travels is called the ecliptic. The planets

    appear to move roughly along the ecliptic, but their motions are somewhat

    complicated. Movement along the direction of the ecliptic is called longitudinal

    motion, while movement perpendicular to the ecliptic is called latitudinal motion.

    Since it is the longitudinal motions that ultimately yield information about planetary

    distances, I will talk almost exclusively of the longitudinal motions in what follows.

    Now, let us consider these longitudinal motions. The motions of the planets

are fairly regular, but they exhibit two significant irregularities. One

irregularity is the famous retrograde motion. At some points in their journey

along the ecliptic, the planets will appear to stop and reverse direction for a while,

moving backward along the ecliptic. This irregularity was called the

    second inequality (or the second anomaly) by astronomers from the time of Ptolemy

    through Kepler. We now know that the second inequality arises because we are

    viewing the motions of the planets from a platform that is itself moving, namely the

    earth.

    The other irregularity is that the planets appear to speed up and slow down at

    various points as they travel along the ecliptic. This variation in apparent angular

    velocities was called the first inequality (or the first anomaly). We now know that

    this variation occurs for two reasons. About half of the maximum variation in the

    apparent angular velocity is because the planets actually do speed up and slow down


relative to the sun in accordance with Kepler's area rule, while the remaining half

comes from the sun not being at the center of the planet's orbit, but at a focus.

    Figure 1

    Figure 1 (from Swerdlow and Neugebauer 1984, 615) is a representation of the

    Ptolemaic theory. The earth is labeled O, and the position of a planet is labeled P.

    In this theory, the second inequality is accounted for by the use of epicycles. The

    planet moves in an epicycle, which is a circular orbit, while the center of the epicycle

    itself moves in a circular orbit, called the deferent, around the earth. In Figure 1, the

    center of the epicycle is labeled C, while the center of the deferent is labeled M. The

    first inequality is accounted for by offsetting the earth O from the center M of the

    deferent, and having another point called the equant point, labeled E, located on the

    opposite side of the center from the earth, at the same distance from the center as the

    earth. The planet travels at constant angular velocity as seen from the equant point,


    and thus when seen from the earth it will appear to speed up and slow down at

    various points along its orbit. Since the equant point does not coincide with the

center of the deferent, the planet's actual motion along the deferent will not be

    uniform circular motion.8
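A small numerical sketch may help make the equant construction concrete. The eccentricity below is an invented value, not a Ptolemaic parameter for any planet; the sketch simply verifies that motion which is uniform as seen from the equant point appears nonuniform as seen from the earth.

```python
import math

# The deferent is a unit circle centered at M = (0, 0); the earth O and the
# equant point E lie on opposite sides of M at the same (invented) distance e.
e = 0.1
O = (-e, 0.0)   # the earth
E = (e, 0.0)    # the equant point

def planet_position(theta):
    """Point on the deferent lying in the direction theta as seen from E."""
    # Solve |E + t*(cos theta, sin theta)| = 1 for the positive root t.
    c = e * math.cos(theta)
    t = -c + math.sqrt(c * c - (e * e - 1.0))
    return (e + t * math.cos(theta), t * math.sin(theta))

def bearing(p, q):
    """Direction of point q as seen from point p."""
    return math.atan2(q[1] - p[1], q[0] - p[0])

# The planet's bearing from E increases uniformly, one degree per step;
# record the apparent angular speed as seen from the earth O at each step.
step = math.radians(1.0)
speeds = []
for k in range(360):
    d = bearing(O, planet_position((k + 1) * step)) - bearing(O, planet_position(k * step))
    d = (d + math.pi) % (2.0 * math.pi) - math.pi   # wrap into [-pi, pi)
    speeds.append(d)

# Seen from O the planet never reverses on the deferent, but it does
# speed up and slow down over the course of its circuit.
print(min(speeds), max(speeds))
```

The variation in apparent speed grows with e, which is why the equant offset could be tuned to reproduce the observed first inequality.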

    Ptolemaic astronomy was enormously successful—since its development in

    the second century, it was not superseded in accuracy for over a thousand years, until

    the work of Kepler. There were some aspects of Ptolemaic astronomy that were

    unsatisfactory, however, if one tried to think about how it could be physically

    implemented. Almost all astronomers before the time of Kepler believed the planets

    were carried along in their circular orbits by being embedded in rotating crystalline

    spheres. As I just mentioned, according to Ptolemaic theory, the speed of the planet

    along the deferent is not uniform—thus if it is being carried along by a crystalline

    sphere, the sphere must somehow slow down and speed up in such a way that the

    planet has constant angular velocity as seen from the equant point. It was difficult to

    see how this speeding up and slowing down could be physically implemented. In

    response to this difficulty, there was a school of Arabic astronomers connected to the

    Maragha observatory in modern-day Iran who, in the thirteenth and fourteenth

    centuries, developed planetary models using only uniform circular motion, using

    epicycles to account for the first inequality.

    Famously, Copernicus came up with a theory in which the second inequality

    is accounted for by putting the sun at the center of the solar system and having the

    8 See Evans 1984 for an excellent exposition of the role that the equant plays in Ptolemaic

    astronomy, and why this innovation allowed Ptolemaic astronomy to be so empirically

    successful.


    earth go around the sun. The second inequality is then seen to be the effect of

    observing the planets from a point that is itself moving. Although popular accounts

    of Copernicus have him rejecting the Ptolemaic theory because of its epicycles, he

    actually objected to it for the same reason as the Maragha astronomers—because it

    departed from uniform circular motion (Swerdlow and Neugebauer 1984, 293-294).

    In fact, Copernicus accounts for the first inequality using the same principles as the

    Maragha astronomers did, with an epicycle.9 In order to get the theory to capture the

motions that Ptolemy could capture using the equant, this epicyclic theory for the first

    inequality had to be rather complicated. Figure 2 (from Swerdlow and Neugebauer

    1984, 616) shows the Copernican theory for the first inequality.

    Figure 2

9 Swerdlow and Neugebauer go so far as to say that Copernicus "can be looked upon as, if

not the last, surely the most noted follower of the Maragha school" (295).


    3 Triangulation

    Since both Copernicus and Kepler use fundamentally the same method to get

    planetary distances, I will first explain the basic method so that the explication will

    be easier when we look specifically at what Kepler and Copernicus do. At root, the

    method is very simple. First take a look at Figure 3. It shows the sun, the earth, and

    a planet, surrounded by constellations. The constellations are taken to be fixed

    permanently in their positions, and thus they provide a reference point for recording

    observations of the planets. From the earth, I can observe the position of the sun S

    and the planet P along the ecliptic. The position along the ecliptic is called the

    longitude. The longitude as seen from the earth is called the geocentric longitude.

    In Figure 3, it just so happens that the earth, the sun, and the planet are lined

up so that the sun and the planet are on exactly opposite sides of the earth. Notice

here that when I observe the planet from the earth, I see it in exactly the same

direction as I would see it from the sun. When the sun, the earth, and a planet are

in this configuration,

    this is called opposition. Call the longitude as seen from the sun the heliocentric

    longitude. Then at opposition, the geocentric longitude and the heliocentric

    longitude coincide.


    Figure 3

    Now suppose the planet, the earth, and the sun are in the configuration shown

    in Figure 4. In this configuration, the planet would have a different longitude, that is,

    it would appear to be moving through different constellations, depending on whether

    I observe it from the earth or from the sun. We can see that when not in opposition,

    the geocentric longitude and heliocentric longitude will be different.

    Figure 4


    Now suppose when the earth, a planet, and the sun are in the configuration of

    figure 4, we want to find the distance from the earth to the planet, the distance EP, as

    a ratio of the distance from the earth to the sun, the distance ES. Suppose we already

    have a theory of the motion of the earth around the sun, so that we know, at any

    given time, the longitude of the earth as seen from the sun. And suppose we have in

    addition a theory of the motion of the planet P around the sun as well, so we know, at

any given time, the heliocentric longitude of the planet P. The theory of the earth's

    motion will give us the direction of the line ES, while the theory of the motion of P

    will give us the direction of the line SP. Making one observation from the earth will

    give us the direction of the line EP, thus allowing us to find all the angles in the

    triangle EPS. This will then allow us to find, by simple geometry, the ratio of the

    length of the line EP to the length of the line ES, which is what we wanted. Thus,

    given that this is the actual configuration of the earth, the sun, and the planet, and

that I have the proper theories for the earth's motion and the planet's motion, I can

find the distance from the earth to the planet, relative to the size of the earth's orbit.
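As a worked illustration of this triangulation, suppose (with invented numbers, not historical data) that the theories give heliocentric longitudes of 40° for the earth and 100° for the planet, and that an observation gives a geocentric longitude of 130° for the planet. The law of sines then recovers the distance ratios:

```python
import math

# Invented ecliptic longitudes in degrees; none are historical values.
lam_earth = 40.0    # heliocentric longitude of the earth, from the earth's theory
lam_planet = 100.0  # heliocentric longitude of the planet P, from P's theory
lam_obs = 130.0     # geocentric longitude of P, from one observation

# Interior angles of the triangle with corners at the sun S, the earth E,
# and the planet P (valid for this configuration; in general the angle
# bookkeeping must respect orientation).
angle_S = abs(lam_planet - lam_earth)        # at the sun, between SE and SP
angle_E = abs(lam_earth + 180.0 - lam_obs)   # at the earth, between ES and EP
angle_P = 180.0 - angle_S - angle_E          # angles of a triangle sum to 180

# Law of sines: EP / sin(angle_S) = ES / sin(angle_P) = SP / sin(angle_E)
EP_over_ES = math.sin(math.radians(angle_S)) / math.sin(math.radians(angle_P))
SP_over_ES = math.sin(math.radians(angle_E)) / math.sin(math.radians(angle_P))
print(EP_over_ES, SP_over_ES)
```

With these numbers the angles at the sun, earth, and planet are 60°, 90°, and 30°, so the earth–planet distance comes out to about 1.73 times, and the sun–planet distance exactly twice, the earth–sun distance.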

    4 Copernicus’s measurement of planetary distances

    We will now move on to the specific method that Copernicus uses to measure

    distances to the planets, which he does in Book 5 of De Revolutionibus. Since the

    method he uses is basically the same for all five planets, with minor differences

    depending upon whether the planet is an inner planet or an outer planet, I will only

    describe his procedure for one of the planets, Saturn. The basic method is

    triangulation, just as I described above. We can think of Figure 5 (from Swerdlow


    and Neugebauer 1984, 635) as a much more detailed and complicated version of

Figure 4. Saturn is labeled P, the earth is labeled O, and the sun is labeled S.10 The

    reason this figure is so much more complicated than figure 4 is that the theory of

Copernicus does not consist of the simple circles I described above. The theory of

    motion for Saturn involves an epicycle to account for the first inequality, and there

    are further complications because the sun is not located at the center of the orbit of

    Saturn. But if we strip away some of these complications, the basic method is the

    same. The idea is to determine the angles in the triangle formed by the earth, the sun,

    and Saturn.

    Figure 5

10 One detail that I will discuss in a later section is that the sun here is the mean sun, not the

    true sun.


    The first leg of the triangle, the direction of the line from the earth to the sun,

is given by the Copernican solar theory, which is really the theory of the earth's