Gravimetry, Relativity, and the Global Navigation Satellite Systems

Albert Tarantola, Ludek Klimes, Jose Maria Pozo, and Bartolome Coll

January 12, 2005

Abstract

Relativity is an integral part of positioning systems, and this is taken into account in today’s practice by applying many ‘relativistic corrections’ to computations performed using concepts borrowed from Galilean physics. A different, fully relativistic paradigm can be developed for operating a positioning system. This implies some fundamental changes. For instance, the basic coordinates are four times (with a symmetric meaning, not three space coordinates and one time coordinate) and the satellites must have cross-link capabilities. Gravitation must, of course, be taken into account, but not using the Newtonian theory: the gravitation field is, and only is, the space-time metric. This implies that the positioning problem and the gravimetry problem cannot be separated. An optimization theory can be developed that, because it is fully relativistic, does not contain any ‘relativistic correction’. We suggest that all positioning satellite systems should be operated in this way. The first benefit of doing so would be a clarification and a simplification of the theory. We also expect, in the end, to be able to run the positioning systems with better accuracy.

Contents

1 Introduction

2 Setting of the Problem
2.1 Different Kinds of Data and of A Priori Constraints
2.2 First Constraint on the Metric (Zero Diagonal)
2.3 Second Constraint on the Metric (Einstein Equation)
2.4 Third Constraint on the Metric (Smoothness)
2.5 Proper Time Data
2.6 Arrival Time Data
2.7 Accelerometer Data
2.8 Gyroscope Data
2.9 Gradiometer Data
2.10 Total Misfit

3 Optimization
3.1 Iterative Algorithm
3.2 Tangent Linear Applications and their Transposes

4 Discussion and Conclusion

5 Bibliography

6 Appendixes
6.1 Perturbation of Einstein’s Tensor
6.2 Arrival Time Data
6.3 A Priori Information on the Metric
6.4 Kalman Filter

1 Introduction

Many relativistic corrections are applied to the Global Navigation Satellite Systems (GNSS). Neil Ashby presents in Physics Today (May 2002) a good account of how these relativistic corrections are applied, why, and what their orders of magnitude are. Unfortunately, it is generally proposed that relativity is only a correction to be applied to Newtonian physics. We believe instead that there is a fully relativistic way to understand a GPS system, which leads to a new way of operating it.

As gravitation has to be taken into account, it is inside the framework of general relativity that the theory must be developed. The shift from a Newtonian viewpoint (relativistic corrections included or not) into a relativistic framework requires some fundamental conceptual changes. Perhaps the most important concerns the operational definition of a system of four space-time coordinates. We reach the conclusion that there is an (essentially unique) coordinate system that, while being consistent with a relativistic formulation, allows an immediate positioning of observers (the traditional Minkowski coordinates $t, x, y, z$ of flat space-time do not allow such an immediate positioning).

These coordinates are defined as follows¹. If four clocks, each having an arbitrary space-time trajectory, broadcast their proper time using electromagnetic signals, then any observer receives, at any point along his personal space-time trajectory, four times, corresponding to the four signals arriving at that space-time point. These four times, say $\tau^1, \tau^2, \tau^3, \tau^4$, are, by definition, the coordinates of the space-time point. We don’t have one time coordinate and three space coordinates, as usual, but a ‘symmetric’ coordinate system with four time coordinates.

With the space-time endowed with those coordinates, any observer having the right receiver may measure (in real time) his personal trajectory. This is true, in particular, for the four clocks themselves: each clock constantly receives three of the coordinates and it defines the fourth. Therefore, each clock knows its own trajectory in this self-consistent coordinate system. Note that even if the clocks are satellites around the Earth, the coordinates and the orbits are defined without any reference to an Earth-based coordinate system: this allows maximum precision to be achieved for this primary reference system. Of course, for applications on the Earth’s surface, the primary coordinates must be attached to some terrestrial coordinate system, but this is just an attachment problem that should not interfere with the problem of defining the primary system itself.
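As a purely illustrative sketch of this definition, the four received times can be simulated in the simplest possible setting. Everything specific here is an assumption made for the example and not part of the theory: space-time is taken to be flat, the four clocks are at rest in an inertial frame (so their proper time equals the inertial time), their positions `EMITTERS` are hypothetical, and units are such that the speed of light is 1. The inertial frame is used only to generate the numbers; the four times themselves are the coordinates.

```python
import math

C = 1.0  # speed of light in geometric units (illustrative choice)

# Hypothetical positions (x, y, z) of four clocks at rest in an inertial frame.
EMITTERS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def emission_coordinates(event, emitters=EMITTERS):
    """Return the four proper times (tau1, ..., tau4) received at `event`.

    event : (t, x, y, z) in the inertial frame. Because the clocks are at
    rest, their proper time equals t, so the time received from a clock is
    t minus the light travel time from that clock (flat space-time assumed).
    """
    t, *position = event
    return tuple(t - math.dist(position, p) / C for p in emitters)


# An observer at this event reads four times; they label the event.
taus = emission_coordinates((5.0, 0.2, 0.3, 0.4))
print(taus)
```

An observer sitting on the first clock at inertial time 5 receives its own time, 5, and the time 4 from each of the three other clocks, which lie one light-unit away.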

In general relativity, the gravity field is the space-time metric. Should this metric be exactly known (in any coordinate system), the system just described would constitute an ideal positioning system (and the components of the metric could be expressed in these coordinates). In practice the space-time metric (i.e., the gravity field) is not exactly known, and the satellite system itself has to be used to infer it. This note is about the problem of using a satellite system for both positioning and measuring the space-time metric.

¹ Coll and Morales (1992), Coll (2000, 2002, 2004), Coll and Tarantola (2003).

Information on the space-time metric may come from different sources. First, any satellite system has more than four clocks. While four of the clocks define the coordinates, the redundant clocks can be used to monitor the space-time metric. The satellites considered may carry more than a clock: they may have an accelerometer, which provides more information on the space-time metric (in fact, on the space-time connection). Should the satellites also have a gradiometer, this would give additional information on the metric (in fact, on the Riemann tensor of space-time). Our theory will provide seamless integration of positioning systems with systems designed for gravimetry.

In the “post-Newtonian” paradigm used today for operating positioning or gravimetry systems, the ever-increasing accuracy of clocks means that more and more “relativistic effects” have to be taken into account. By contrast, the fully relativistic theory developed here will remain valid as long as relativity itself remains valid.

It is our feeling that when GNSS and gravimetry systems are operated using the principles set out here, new experimental possibilities will appear. One must realize that the optical clocks being developed may one day have a relative accuracy of $10^{-18}$. The possibility that some day we may approach this accuracy for positioning immediately suggests extraordinarily interesting applications.

These applications would simply be impossible if we stick to the present-day paradigm. To realize how deeply nonrelativistic this paradigm is, consider that GPS clocks are kept synchronized. In this year 2005, when we celebrate the centenary of relativity, this sounds strange: is there anything less relativistic than the obstinacy to keep synchronized a system of clocks in relative motion?

There is one implication of the theory developed here for the Galileo positioning system now being developed by the European Union. Our theory requires, as a fundamental fact, that the GNSS satellites exchange signals. The most recent GPS satellites (from the USA) do have this “cross-link” capability. One could, in principle, use the cross-link data (or an improved version of it) to operate the system in the way proposed here. To our knowledge, unfortunately, the Galileo satellites will not have this cross-link capability. This is a serious limitation that will complicate the evolution of the system towards a more precise one.

Finally, we need to write a disclaimer here. None of the algorithms proposed below are intended to be practical. They are the simplest algorithms that would be fully consistent with relativity theory. Passing from these to actually implementable algorithms will require a lot of numerical analysis.

2 Setting of the Problem

Four clocks (called below the principal clocks) broadcast their proper time. Any observer in space-time receives, at any point along its space-time trajectory, four times $\tau^\alpha = \{\tau^1, \tau^2, \tau^3, \tau^4\}$ that, by definition, are our space-time coordinates². If the observer has his own clock, with proper time denoted $\sigma$, then he knows his trajectory

$$ \tau^\alpha = \tau^\alpha(\sigma) , \tag{1} $$

i.e., he knows four functions (of his proper time σ ). Then the observer also knows his four-velocity

$$ u^\alpha(\sigma) = \frac{d\tau^\alpha}{d\sigma}(\sigma) \tag{2} $$

² The mathematical properties of these coordinates are examined in an accompanying paper (Pozo, 2005).

at every point along its space-time trajectory. If the clock is inside a ‘satellite’ that contains an accelerometer, the acceleration $a^\alpha$ is known as a function of proper time:

$$ a^\alpha = a^\alpha(\sigma) . \tag{3} $$

If the satellite also has a ‘gradiometer’, then the relative accelerations $\delta a^\alpha$ between some neighboring masses (‘tidal accelerations’) are also known as a function of proper time,

$$ \delta a^\alpha = \delta a^\alpha(\sigma) . \tag{4} $$

Although some of the satellites may have an accelerometer and, perhaps, a gradiometer, the only necessary assumptions are that all have a clock, and that

• they may broadcast

– their proper time,

– their trajectory,

– any other data they may have (like the acceleration or the tidal acceleration),

• they may receive

– the signals sent by (some of) the other satellites.

Of course, the signals are received after the necessary travel time of the electromagnetic waves.

The four principal clocks, which define the coordinates, also take part, in particular by emitting their trajectory: each of the principal clocks receives signals from the other three, so each principal clock knows its own trajectory in this coordinate system. For the first principal clock, one has the trajectory

$$ \tau^1 = \tau^1 ; \quad \tau^2 = \tau^2(\tau^1) ; \quad \tau^3 = \tau^3(\tau^1) ; \quad \tau^4 = \tau^4(\tau^1) , \tag{5} $$

for the second principal clock one has the trajectory

$$ \tau^1 = \tau^1(\tau^2) ; \quad \tau^2 = \tau^2 ; \quad \tau^3 = \tau^3(\tau^2) ; \quad \tau^4 = \tau^4(\tau^2) , \tag{6} $$

and so on.

2.1 Different Kinds of Data and of A Priori Constraints

The space-time metric shall be determined from five different kinds of data and three different constraints. The constraints are as follows:

1. the diagonal components $g^{11}, g^{22}, g^{33}, g^{44}$ of the contravariant metric must be zero in the natural basis associated to the coordinates $\tau^\alpha$ (see explanations below);

2. the metric has to approximately satisfy the Einstein equations, i.e., the Einstein tensor of the metric should be proportional to the stress-energy tensor of the space where the satellites evolve (essentially, the gas of the high atmosphere);

3. among all possible space-time metrics consistent with our data and other constraints, we wish the metric to be ‘smooth’ in the usual sense given to this term in least-squares theory.

The five possible kinds of data explicitly examined below are as follows:

1. the signal emitted by one clock at proper time $\rho$ reaches some other clock at proper time $\sigma$; the two values $\rho$ and $\sigma$ are known; as the trajectories are also known as a function of proper time, the two space-time points of emission and of arrival of the signal are known; the metric should be such that there is a zero-length geodesic (light signal) connecting the two space-time points (within some tolerable approximation);

2. the trajectories of the satellites are public (i.e., they are known), and each satellite has a clock; therefore, the metric has to be such that the integral of $\sqrt{g_{\alpha\beta} \, dx^\alpha dx^\beta}$ along each trajectory should approximately correspond to the proper time as given by the clock;

3. if the satellites have an accelerometer, the metric has to be such that the computed acceleration (computed via the connection) is close to the observed one;

4. if, in addition to having an accelerometer, the satellites have a gradiometer, the metric has to be such that the computed tidal accelerations (computed via the Riemann tensor) are close to the observed ones;

5. the satellites may have gyroscopes (as in the Gravity Probe B experiment), which provide further information on the space-time connection.

2.2 First Constraint on the Metric (Zero Diagonal)

[Question to Tolo and J.M.: Are we using here below all the available constraints on the metric?]

Note: give here the argument showing that in the ‘light-coordinates’ $\tau^\alpha$ being used, the contravariant components of the metric must have zeros on the diagonal,

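$$ g^{\alpha\beta} = \begin{pmatrix} 0 & g^{12} & g^{13} & g^{14} \\ g^{12} & 0 & g^{23} & g^{24} \\ g^{13} & g^{23} & 0 & g^{34} \\ g^{14} & g^{24} & g^{34} & 0 \end{pmatrix} , \tag{7} $$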

so the basic unknowns of the problem are the six quantities $g^{12}, g^{13}, g^{14}, g^{23}, g^{24}, g^{34}$. This constraint is imposed exactly, by just expressing all the relations of the theory in terms of these six quantities.

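The covariant components $g_{\alpha\beta}$ are defined, as usual, by the condition

$$ g_{\alpha\gamma} \, g^{\gamma\beta} = \delta_\alpha^\beta . \tag{8} $$

The diagonal components of the covariant metric $g_{\alpha\beta}$ are not zero.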
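The zero-diagonal property can be checked numerically in a toy setting. The sketch below is not the argument requested in the note above; it only illustrates it under loud assumptions: flat space-time, four clocks at rest at the hypothetical positions `EMITTERS`, units with $c = 1$, and a Minkowski metric of signature $(+,-,-,-)$. Each coordinate $\tau^\alpha$ is constant on the light cones of clock $\alpha$, so its gradient is a null covector, and the transformed components $g^{\alpha\beta} = \eta^{\mu\nu} \, \partial_\mu \tau^\alpha \, \partial_\nu \tau^\beta$ vanish for $\alpha = \beta$.

```python
import math

# Diagonal of the inverse Minkowski metric, signature (+,-,-,-), c = 1.
ETA_INV = (1.0, -1.0, -1.0, -1.0)

# Hypothetical positions of four clocks at rest (flat space-time assumed).
EMITTERS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def tau(event, emitter):
    """Time received at event = (t, x, y, z) from a clock at rest at `emitter`."""
    t, *position = event
    return t - math.dist(position, emitter)


def gradient(event, emitter, h=1e-6):
    """Central-difference gradient of tau with respect to (t, x, y, z)."""
    grad = []
    for mu in range(4):
        forward, backward = list(event), list(event)
        forward[mu] += h
        backward[mu] -= h
        grad.append((tau(forward, emitter) - tau(backward, emitter)) / (2.0 * h))
    return grad


def contravariant_metric(event):
    """g^{ab} = eta^{mu nu} d_mu tau^a d_nu tau^b in the four-times coordinates."""
    grads = [gradient(event, p) for p in EMITTERS]
    return [[sum(ETA_INV[mu] * ga[mu] * gb[mu] for mu in range(4)) for gb in grads]
            for ga in grads]


g_up = contravariant_metric((5.0, 0.2, 0.3, 0.4))
for row in g_up:
    print(["%+.6f" % value for value in row])
```

Up to finite-difference error, the printed matrix has zeros on the diagonal and six independent off-diagonal entries, matching the structure of equation 7.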

2.3 Second Constraint on the Metric (Einstein Equation)

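The notations used in this text for the connection $\Gamma^\alpha{}_{\beta\gamma}$, the Riemann tensor $R^\alpha{}_{\beta\gamma\delta}$, and the Ricci tensor $R_{\alpha\beta}$ associated to a metric $g_{\alpha\beta}$ are as follows:

$$ \begin{aligned} \Gamma^\alpha{}_{\beta\gamma} &= \tfrac{1}{2} \, g^{\alpha\sigma} \, ( \partial_\beta g_{\gamma\sigma} + \partial_\gamma g_{\beta\sigma} - \partial_\sigma g_{\beta\gamma} ) \\ R^\alpha{}_{\beta\gamma\delta} &= \partial_\gamma \Gamma^\alpha{}_{\delta\beta} - \partial_\delta \Gamma^\alpha{}_{\gamma\beta} + \Gamma^\alpha{}_{\mu\gamma} \, \Gamma^\mu{}_{\delta\beta} - \Gamma^\alpha{}_{\mu\delta} \, \Gamma^\mu{}_{\gamma\beta} \\ R_{\alpha\beta} &= R^\gamma{}_{\alpha\gamma\beta} . \end{aligned} \tag{9} $$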

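The Einstein tensor is then

$$ E_{\alpha\beta} = R_{\alpha\beta} - \tfrac{1}{2} \, g_{\alpha\beta} \, R , \tag{10} $$

where $R = g^{\alpha\beta} R_{\alpha\beta}$.

The Einstein equation states that, at every point of the space-time, the Einstein tensor $E_{\alpha\beta}$ (associated to the metric) is proportional to the stress-energy tensor $t_{\alpha\beta}$ describing the matter at this space-time point:

$$ E_{\alpha\beta} = \chi \, t_{\alpha\beta} , \tag{11} $$

where the proportionality constant is $\chi = 8\pi G/c^4$. For instance, in vacuo, $t_{\alpha\beta} = 0$ and, therefore, $E_{\alpha\beta} = 0$. When solving the Einstein equation for $t_{\alpha\beta}$,

$$ t_{\alpha\beta} = \frac{1}{\chi} \, E_{\alpha\beta} , \tag{12} $$

we obtain (when replacing $E_{\alpha\beta}$ by the expressions 9 and 10) the mapping

$$ \mathbf{g} \mapsto \mathbf{t}_{\mathrm{computed}} = \mathbf{t}(\mathbf{g}) , \tag{13} $$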

associating to any metric field $\mathbf{g}$ the corresponding stress-energy field $\mathbf{t}$.

Let $\mathbf{t}_{\mathrm{obs}}$ be our estimation of the stress-energy of the space-time. It could, for instance, be zero, if we take for the space-time the model of vacuo. More realistically, we may take a simple model of the rarefied gas that constitutes the high atmosphere. We wish that the space-time metric $\mathbf{g}$ is such that the associated stress-energy $\mathbf{t}(\mathbf{g})$ is close to $\mathbf{t}_{\mathrm{obs}}$.
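To get a feeling for the orders of magnitude involved, the following back-of-the-envelope sketch compares the curvature sourced by a rarefied gas, $\chi \rho c^2$, with a typical tidal curvature component $GM/(c^2 r^3)$ produced by the Earth. The gas density `RHO_GAS` and the orbital radius `R_ORBIT` are hypothetical placeholders chosen only for illustration, not measured values; the physical constants are standard rounded values.

```python
import math

G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8           # speed of light, m/s
GM_EARTH = 3.986e14   # Earth's gravitational parameter, m^3 s^-2

# Proportionality constant of equation 11, chi = 8 pi G / c^4 (s^2 kg^-1 m^-1).
chi = 8.0 * math.pi * G / C**4

# Hypothetical placeholders (assumptions, not data):
RHO_GAS = 1.0e-15     # mass density of the rarefied gas, kg/m^3
R_ORBIT = 2.66e7      # orbital radius, m

# Curvature sourced by the gas: E_00 ~ chi * rho * c^2 (in m^-2).
matter_curvature = chi * RHO_GAS * C**2

# Typical Riemann (tidal) component due to the Earth: GM / (c^2 r^3) (in m^-2).
tidal_curvature = GM_EARTH / (C**2 * R_ORBIT**3)

print(f"chi              = {chi:.3e}")
print(f"matter curvature = {matter_curvature:.3e} m^-2")
print(f"tidal curvature  = {tidal_curvature:.3e} m^-2")
```

With these placeholder values the matter term is many orders of magnitude below the tidal curvature, which is why the vacuum model $\mathbf{t}_{\mathrm{obs}} = 0$ is a natural first choice.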

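More precisely, we are going to impose that $\mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}}$ is small in the sense of a least-squares norm

$$ \| \, \mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_t} \equiv \langle \, \mathbf{C}_t^{-1} ( \mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}} ) \, , \, ( \mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}} ) \, \rangle , \tag{14} $$

where $\mathbf{C}_t$ is a covariance operator to be discussed later. The notation $\langle \, \cdot \, , \, \cdot \, \rangle$ stands for a duality product.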

2.4 Third Constraint on the Metric (Smoothness)

Let $\mathbf{g}_{\mathrm{prior}}$ be some simple initial estimation of the space-time metric field. For instance, we could take for $\mathbf{g}_{\mathrm{prior}}$ the metric of a flat space-time, the Schwarzschild metric of a point mass with the Earth’s mass, or a realistic estimation of the actual space-time metric around the Earth.

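We wish that our final estimation of the metric, $\mathbf{g}$, is close to the initial estimation. More precisely, letting $\mathbf{C}_g$ be a suitably chosen covariance operator, we are going to impose that the least-squares norm³

$$ \| \, \mathbf{g} - \mathbf{g}_{\mathrm{prior}} \, \|^2_{\mathbf{C}_g} \equiv \langle \, \mathbf{C}_g^{-1} ( \mathbf{g} - \mathbf{g}_{\mathrm{prior}} ) \, , \, ( \mathbf{g} - \mathbf{g}_{\mathrm{prior}} ) \, \rangle \tag{15} $$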

is small.

The covariance operator, to be discussed later, shall be a ‘smoothing operator’. This implies, on the one hand, that at every point of space-time the final metric is close to the initial metric and, on the other hand, that the difference of the two metrics is smooth. As the initial metric shall be smooth, this imposes that the final metric is also smooth. In particular, the final metric will be defined ‘continuously’, in spite of the fact that we only ‘sample’ it along the space-time trajectories of the satellites and of the light signals.
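How a smoothing covariance penalizes rough perturbations can be seen on a small discrete example. The sketch below is only illustrative: the Gaussian kernel, its correlation length, and the five-point grid are assumptions standing in for the covariance operator $\mathbf{C}_g$ (which the text leaves for later discussion), and the residual plays the role of one component of $\mathbf{g} - \mathbf{g}_{\mathrm{prior}}$ sampled along a line.

```python
import math


def gaussian_covariance(n, sigma=1.0, length=1.0):
    """Covariance matrix C_ij = sigma^2 exp(-(i - j)^2 / (2 length^2)) (assumed kernel)."""
    return [[sigma**2 * math.exp(-0.5 * ((i - j) / length) ** 2) for j in range(n)]
            for i in range(n)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        partial = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - partial) / a[r][r]
    return x


def squared_norm(residual, covariance):
    """Least-squares norm <C^-1 r, r> of a residual vector r."""
    weighted = solve(covariance, residual)
    return sum(r * w for r, w in zip(residual, weighted))


C_g = gaussian_covariance(5)
smooth = [1.0, 1.0, 1.0, 1.0, 1.0]      # slowly varying perturbation
rough = [1.0, -1.0, 1.0, -1.0, 1.0]     # oscillating perturbation, same Euclidean size

norm_smooth = squared_norm(smooth, C_g)
norm_rough = squared_norm(rough, C_g)
print(norm_smooth, norm_rough)
```

Both perturbations have the same Euclidean size, yet the oscillating one is assigned a much larger norm, so minimizing a term of the form of equation 15 favors metrics whose departure from the prior is smooth.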

³ The criterion in equation 15, which is based on a difference of (contravariant) metrics, is only provisional. In a more advanced state of the theory, we should introduce the logarithm of the metric, and base the minimization criterion on the difference of logarithmic metrics.

2.5 Proper Time Data

We have assumed that the hardware of the satellite constellation ensures that the trajectory $\tau^\alpha = \tau^\alpha(\sigma)$ of each satellite is known (equation 1). Then, the four-velocity $u^\alpha = d\tau^\alpha/d\sigma$ is also known along the trajectory (equation 2).

At any point along the trajectory of the satellite (whose proper time is σ ) we must have

$$ g_{\alpha\beta} \, d\tau^\alpha \, d\tau^\beta = d\sigma^2 , \tag{16} $$

or, introducing the four-velocity,

$$ g_{\alpha\beta} \, u^\alpha u^\beta = 1 . \tag{17} $$

The four-velocity vector $u^\alpha$ is known at every point along the trajectory; the metric $g_{\alpha\beta}$ is not known. But that special combination of the metric components is known. Imposing that constraint at every point along a trajectory shall ensure that the proper time given by the internal clock coincides with the proper time computed by integrating $\sqrt{g_{\alpha\beta} \, dx^\alpha dx^\beta}$.

Let

$$ \mathbf{g} \mapsto \mathbf{z}_{\mathrm{computed}} = \mathbf{z}(\mathbf{g}) \tag{18} $$

be the function that to any metric field $\mathbf{g}$ associates the values $g_{\alpha\beta} \, u^\alpha u^\beta$ at each point along each of the trajectories. Let $\mathbf{1}$ be the field defined on the trajectories that associates the value ‘one’ to each point of each trajectory. We wish the (least-squares) norm

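$$ \| \, \mathbf{z}(\mathbf{g}) - \mathbf{1} \, \|^2_{\mathbf{C}_z} \equiv \langle \, \mathbf{C}_z^{-1} ( \mathbf{z}(\mathbf{g}) - \mathbf{1} ) \, , \, ( \mathbf{z}(\mathbf{g}) - \mathbf{1} ) \, \rangle \tag{19} $$

to be small, where $\mathbf{C}_z$ is a suitably chosen covariance operator.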
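The mapping of equation 18 and the misfit of equation 19 can be sketched on a toy case. The assumptions are stated plainly: the example is written in ordinary inertial coordinates $(t, x, y, z)$ with $c = 1$ rather than in the four-times coordinates, the metric is constant, the clocks move along $x$ at constant speed, and the covariance is taken as diagonal with one common variance. The function names are illustrative.

```python
import math

# Flat metric, signature (+,-,-,-), in inertial coordinates with c = 1.
MINKOWSKI = [[1.0, 0.0, 0.0, 0.0],
             [0.0, -1.0, 0.0, 0.0],
             [0.0, 0.0, -1.0, 0.0],
             [0.0, 0.0, 0.0, -1.0]]


def z_value(g_cov, u):
    """The combination g_ab u^a u^b, which equals 1 when the metric is right."""
    return sum(g_cov[a][b] * u[a] * u[b] for a in range(4) for b in range(4))


def four_velocity(v):
    """Four-velocity of a clock moving along x with speed v (|v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [gamma, gamma * v, 0.0, 0.0]


def proper_time_misfit(g_cov, velocities, variance):
    """Sum of (z - 1)^2 / variance over sampled trajectory points (diagonal covariance)."""
    return sum((z_value(g_cov, u) - 1.0) ** 2 / variance for u in velocities)


samples = [four_velocity(v) for v in (0.0, 0.3, 0.6)]

# A trial metric whose time-time component is slightly wrong.
trial = [row[:] for row in MINKOWSKI]
trial[0][0] = 1.0 + 2.0e-3

print(proper_time_misfit(MINKOWSKI, samples, variance=1.0e-6))  # essentially zero
print(proper_time_misfit(trial, samples, variance=1.0e-6))      # clearly positive
```

The correct metric gives $g_{\alpha\beta} u^\alpha u^\beta = 1$ for every sampled four-velocity, while the perturbed trial metric is penalized, more strongly for the faster clocks.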

2.6 Arrival Time Data

Consider one of the auxiliary clocks, with proper time $\sigma$. As it receives the signals of the four principal clocks, the space-time trajectory of the auxiliary clock,

$$ \tau^\alpha = \tau^\alpha(\sigma) , \tag{20} $$

is known. We shall later need the fact that, as the trajectories are known, the four-velocity

$$ u^\alpha(\sigma) = \frac{d\tau^\alpha}{d\sigma}(\sigma) \tag{21} $$

is also known at each point along each trajectory. In this first approach to the problem, trajectories and four-velocities are assumed to be known exactly (without any observational uncertainty).

At some instants along their trajectory, the auxiliary clocks emit signals, each signal consisting of five quantities: the value of the proper time, say $\rho$, at the emission instant, and the position of the clock at this same instant, say $\tau^\alpha$.

This signal is received by some of the other auxiliary clocks. The signal emitted by one of the clocks at proper time $\rho$ is received by some other clock at proper time $\sigma$. As all the trajectories are exactly known (in our self-consistent system of coordinates), if the space-time metric $g_{\alpha\beta}(\tau^1, \tau^2, \tau^3, \tau^4)$ were known exactly, the arrival time $\sigma$ could be exactly computed (by tracing the zero-length geodesic connecting the emission point of the first clock and the trajectory of the second clock). In the real situation, the metric is only known approximately, and the computed value of the arrival time, say $\sigma_{\mathrm{computed}}$, will not be identical to the time actually observed, say $\sigma_{\mathrm{obs}}$.

Roughly speaking, our goal is going to be to determine the space-time metric that minimizes the differences between calculated and observed arrival times.

We could use a label for each emitting point, and, given this emitting point, use another label for each receiving point along each of the other trajectories. Instead, we choose to simplify the notations by using a single label that corresponds to each zero-length geodesic connecting an emitting to a receiving point, without making explicit the clock trajectories to which the emitting and receiving points belong.

Our data, therefore, consist of a set of $N$ values

$$ \sigma^i_{\mathrm{obs}} ; \quad i = 1, 2, \dots, N . \tag{22} $$

These values are assumed to be subject to some observational uncertainties, discussed below. Of course, assuming that there are uncertainties in these values, and not in the trajectories themselves, is an oversimplification that is only acceptable at this preliminary level of theoretical development.

In the context of inverse problem theory, the forward operator here is going to be the operator that to every conceivable space-time metric field $\mathbf{g}$ associates the computed data vector $\boldsymbol{\sigma}_{\mathrm{computed}}$. We write

$$ \mathbf{g} \mapsto \boldsymbol{\sigma}_{\mathrm{computed}} = \boldsymbol{\sigma}(\mathbf{g}) \tag{23} $$

for this forward operator. It corresponds to a (nonlinear) mapping from $G$ (the space of metric fields) into $D$ (the space of data).

From an algorithmic point of view, for a given $\mathbf{g}$ the computation of $\boldsymbol{\sigma}(\mathbf{g})$ involves taking one by one all the paths between a source and a receiver, and computing an arrival time for each path.

In more detail, the coordinates of the emitting point are known (because the emitting clock has been assumed to emit not only its proper time, but also its instantaneous coordinates). The trajectory of the receiving clock is also known. If the proper time along this trajectory is generically denoted $\sigma$, we must obtain the particular value of $\sigma$ where the light cone leaving the emitting point crosses the trajectory. This involves the tracing of zero-length geodesics, using some algorithm, whose particular details we don’t need to discuss.

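Let $\boldsymbol{\sigma}_{\mathrm{obs}}$ be the observed values of the arrival times of the signals, let $\mathbf{C}_\sigma$ be the covariance operator describing experimental uncertainties, and let $\| \, \boldsymbol{\sigma} \, \|_{\mathbf{C}_\sigma}$ be the associated norm. Let $\boldsymbol{\sigma} = \boldsymbol{\sigma}(\mathbf{g})$ be the relation solving the modeling problem (of computing the arrival times given an arbitrary metric model), an expression that we have already written in equation 23.

We wish our synthetic data, $\boldsymbol{\sigma}(\mathbf{g})$, to be close to the observed data, $\boldsymbol{\sigma}_{\mathrm{obs}}$. Therefore, we wish the (least-squares) norm

$$ \| \, \boldsymbol{\sigma}(\mathbf{g}) - \boldsymbol{\sigma}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_\sigma} \equiv \langle \, \mathbf{C}_\sigma^{-1} ( \boldsymbol{\sigma}(\mathbf{g}) - \boldsymbol{\sigma}_{\mathrm{obs}} ) \, , \, ( \boldsymbol{\sigma}(\mathbf{g}) - \boldsymbol{\sigma}_{\mathrm{obs}} ) \, \rangle \tag{24} $$

to be small, where $\mathbf{C}_\sigma$ is a covariance operator describing the experimental uncertainties in the measured arrival time values.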
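The forward computation of one arrival time can be sketched in the simplest setting. This is a stand-in under stated assumptions, not the geodesic tracing the theory requires: space-time is flat, coordinates are inertial with $c = 1$, the receiving clock moves inertially (its worldline passes through `start` at proper time zero), and the crossing of the light cone with the trajectory is found by bisection. All function names are hypothetical.

```python
import math


def receiver_event(sigma, start, velocity):
    """Event (t, x, y, z) on an inertial receiver worldline at proper time sigma."""
    gamma = 1.0 / math.sqrt(1.0 - sum(v * v for v in velocity))
    return (gamma * sigma,
            *(x0 + gamma * v * sigma for x0, v in zip(start, velocity)))


def light_cone_gap(sigma, emission, start, velocity):
    """Negative before the signal arrives, positive after (flat space-time)."""
    t, *position = receiver_event(sigma, start, velocity)
    return (t - emission[0]) - math.dist(position, emission[1:])


def arrival_time(emission, start, velocity, lo=0.0, hi=1.0e3, tol=1.0e-12):
    """Proper time at which the receiver crosses the future light cone of `emission`.

    Bisection on [lo, hi]; the bracket is assumed to contain the crossing.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if light_cone_gap(mid, emission, start, velocity) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def arrival_misfit(computed, observed, variance):
    """Sum of squared arrival-time residuals for a diagonal covariance."""
    return sum((c - o) ** 2 for c, o in zip(computed, observed)) / variance


emission = (2.0, 0.0, 0.0, 0.0)   # signal emitted at t = 2 from the origin
sigma_rest = arrival_time(emission, start=(3.0, 0.0, 0.0), velocity=(0.0, 0.0, 0.0))
sigma_moving = arrival_time(emission, start=(3.0, 0.0, 0.0), velocity=(0.5, 0.0, 0.0))
print(sigma_rest, sigma_moving)
```

In a curved space-time the function `light_cone_gap` would have to be replaced by an actual null-geodesic computation in the trial metric; the bisection on the receiver's proper time and the misfit would keep the same structure.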

2.7 Accelerometer Data

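We have to explore here the case where each ‘satellite’ has an accelerometer. The acceleration along a trajectory is

$$ a^\alpha = u^\beta \, \frac{\partial u^\alpha}{\partial x^\beta} + \Gamma^\alpha{}_{\beta\gamma} \, u^\beta u^\gamma = \frac{du^\alpha}{d\tau} + \Gamma^\alpha{}_{\beta\gamma} \, u^\beta u^\gamma , \tag{25} $$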

where τ is the proper time along the trajectory.

The easiest way to ‘measure’ the acceleration on board would be, of course, to force the satellite (or its clock) to be in free fall (i.e., to follow a geodesic of the space-time metric). Then, one would have $a^\alpha = 0$. Let us keep considering here the general case where the acceleration may be nonzero (because, for instance, of residual drag by the high atmosphere), but is measured.

The measurement of the acceleration provides information on the connection, i.e., in fact, on the gradients of the metric.

Given the metric field $\mathbf{g}$, equation 25 allows the acceleration to be computed at all the space-time points where it is measured. We write

$$ \mathbf{g} \mapsto \mathbf{a}_{\mathrm{computed}} = \mathbf{a}(\mathbf{g}) \tag{26} $$

for the mapping so defined. We wish that the computed accelerations, $\mathbf{a}(\mathbf{g})$, are close to the observed ones, say $\mathbf{a}_{\mathrm{obs}}$. More precisely, we wish the (least-squares) norm

$$ \| \, \mathbf{a}(\mathbf{g}) - \mathbf{a}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_a} \equiv \langle \, \mathbf{C}_a^{-1} ( \mathbf{a}(\mathbf{g}) - \mathbf{a}_{\mathrm{obs}} ) \, , \, ( \mathbf{a}(\mathbf{g}) - \mathbf{a}_{\mathrm{obs}} ) \, \rangle \tag{27} $$

to be small, where $\mathbf{C}_a$ is a covariance operator describing the experimental uncertainties in the measured acceleration values.
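A minimal sketch of the mapping of equation 26 follows, reduced to two dimensions $(t, x)$ to keep it short. The metric $ds^2 = (1 + 2 g x) \, dt^2 - dx^2$, the value `G_ACC`, and the finite-difference step are assumptions chosen for illustration, with $c = 1$. The connection is built from the metric with the convention of equation 9, and the acceleration with equation 25.

```python
import math

G_ACC = 1.0e-3  # hypothetical uniform field strength (units with c = 1)


def metric(point):
    """Covariant metric g_ab at point = (t, x): ds^2 = (1 + 2 g x) dt^2 - dx^2."""
    _, x = point
    return [[1.0 + 2.0 * G_ACC * x, 0.0], [0.0, -1.0]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix (gives the contravariant metric)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def metric_derivatives(point, h=1.0e-5):
    """dg[s][a][b] = d g_ab / d x^s, by central differences."""
    dg = []
    for s in range(2):
        forward, backward = list(point), list(point)
        forward[s] += h
        backward[s] -= h
        gf, gb = metric(forward), metric(backward)
        dg.append([[(gf[a][b] - gb[a][b]) / (2.0 * h) for b in range(2)]
                   for a in range(2)])
    return dg


def christoffel(point):
    """Gamma[a][b][c] = 1/2 g^{as} (d_b g_{cs} + d_c g_{bs} - d_s g_{bc})."""
    g_inv = inverse_2x2(metric(point))
    dg = metric_derivatives(point)
    return [[[0.5 * sum(g_inv[a][s] * (dg[b][c][s] + dg[c][b][s] - dg[s][b][c])
                        for s in range(2))
              for c in range(2)] for b in range(2)] for a in range(2)]


def acceleration(point, u, du_dtau):
    """a^a = du^a/dtau + Gamma^a_{bc} u^b u^c (equation 25)."""
    gamma = christoffel(point)
    return [du_dtau[a] + sum(gamma[a][b][c] * u[b] * u[c]
                             for b in range(2) for c in range(2))
            for a in range(2)]


# A clock held at rest at x = 0: u = (1 / sqrt(g_tt), 0), du/dtau = 0.
origin = (0.0, 0.0)
u_static = [1.0 / math.sqrt(metric(origin)[0][0]), 0.0]
a_static = acceleration(origin, u_static, [0.0, 0.0])
print(a_static)
```

The clock held at rest is not in free fall, and the computed acceleration has spatial component `G_ACC`: this is the quantity an on-board accelerometer would report, and it depends on the metric only through its gradients.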

2.8 Gyroscope Data

A gyroscope is described by its spin vector (or angular momentum vector) $s^\alpha$, a four-vector that is orthogonal to the four-velocity $u^\alpha$ of the rotating particle: $g_{\alpha\beta} \, u^\alpha s^\beta = 0$.

Assume that the gyroscope follows a trajectory $x^\alpha = x^\alpha(\tau)$, whose velocity is $u^\alpha = dx^\alpha/d\tau$ and whose acceleration $a^\alpha$ is that expressed in equation 25. Then, the evolution of the spin vector along the trajectory is described⁴ by the so-called Fermi-Walker transport:

$$ \frac{Ds^\alpha}{d\tau} \equiv \frac{ds^\alpha}{d\tau} + \Gamma^\alpha{}_{\beta\gamma} \, u^\beta s^\gamma = s_\beta \, ( a^\beta u^\alpha - a^\alpha u^\beta ) . \tag{28} $$

Should the gyroscope be in free fall, $a^\alpha = 0$ and $ds^\alpha/d\tau + \Gamma^\alpha{}_{\beta\gamma} \, u^\beta s^\gamma = 0$, meaning that the spin vector would be parallel-transported.

In our case, the monitoring of the spin vector $s^\alpha(\tau)$ (besides the monitoring of the acceleration $a^\alpha$) would provide the values $\Gamma^\alpha{}_{\beta\gamma} \, u^\beta s^\gamma$, information complementary to that provided by the monitoring of the acceleration (which provides the values $\Gamma^\alpha{}_{\beta\gamma} \, u^\beta u^\gamma$).

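Consider that our data is

$$ \pi^\alpha = \frac{ds^\alpha}{d\tau} . \tag{29} $$

Then we have

$$ \pi^\alpha = s_\beta \, ( a^\beta u^\alpha - a^\alpha u^\beta ) - \Gamma^\alpha{}_{\beta\gamma} \, u^\beta s^\gamma . \tag{30} $$

Given the metric field $\mathbf{g}$, this equation allows the vector $\pi^\alpha$ to be computed at all the space-time points where it is measured. We write

$$ \mathbf{g} \mapsto \boldsymbol{\pi}_{\mathrm{computed}} = \boldsymbol{\pi}(\mathbf{g}) \tag{31} $$

for the mapping so defined. We wish that the computed values, $\boldsymbol{\pi}(\mathbf{g})$, are close to the observed ones, say $\boldsymbol{\pi}_{\mathrm{obs}}$. More precisely, we wish the (least-squares) norm

$$ \| \, \boldsymbol{\pi}(\mathbf{g}) - \boldsymbol{\pi}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_\pi} \equiv \langle \, \mathbf{C}_\pi^{-1} ( \boldsymbol{\pi}(\mathbf{g}) - \boldsymbol{\pi}_{\mathrm{obs}} ) \, , \, ( \boldsymbol{\pi}(\mathbf{g}) - \boldsymbol{\pi}_{\mathrm{obs}} ) \, \rangle \tag{32} $$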

⁴ For details on the relativistic treatment of a spinning test particle, see Papapetrou (1951), Weinberg (1972), or Hernandez-Pastora et al. (2001).

to be small, where $\mathbf{C}_\pi$ is a covariance operator describing the experimental uncertainties in the measured values.

Of course, one may not wish to measure the evolution of the spin vector to provide information on the connection, but to ‘test’ general relativity, as in the Gravity Probe B experiment. From the viewpoint of the present work, the detection of any inconsistency in the data would put relativity theory in jeopardy.

2.9 Gradiometer Data

To study the gravity field around the Earth, different satellite missions are under way or planned⁵. Of particular importance are the gradiometers with which modern gravimetric satellites are equipped. In the GOCE satellite, there are three perpendicular “gradiometer arms”, each arm consisting of two masses (50 cm apart) that are subjected to electrostatic forces to keep each of them at the center of a cage. These forces are monitored, thus providing the accelerations. The basic data are the half-sum and the difference of these accelerations (for each of the three gradiometer arms).

The half-sum of the accelerations gives what a simple accelerometer would give. The difference corresponds to the “tidal forces” in the region where the satellite operates.

A simple model for the gradiometry data is as follows. A mass follows some space-time line that, to simplify the discussion, is assumed to be a geodesic (i.e., the mass is assumed to be in free fall, but taking into account its possible acceleration would be simple). [J.M.: Can you please write the formulas for the case when the initial trajectory is not a geodesic?] This geodesic is represented at the left in figure 1. Let $u^\alpha$ be the unit vector tangent to this geodesic trajectory. Consider, at some initial point along the geodesic, a “small” space-time vector $\delta v^\alpha$ that, to fix ideas, may be assumed to be a space-like vector. By parallel transport of $\delta v^\alpha$ along the geodesic one defines a second trajectory, which is not necessarily a geodesic (the line at the right in figure 1). Let us denote $w^\alpha$ the tangent vector to this trajectory, and $\delta a^\alpha$ the acceleration along it. Note that, as the trajectory is close to being geodesic, the acceleration $\delta a^\alpha$ is small (and would vanish if $\delta v^\alpha = 0$).

Figure 1: For the incorporation of gradiometry data, we consider a geodesic space-time trajectory, and the trajectory defined by transporting a small vector along the geodesic (see text for details). [Figure not reproduced; its labels are ‘geodesic’, ‘not geodesic’, and the vectors u, w, δv.]

Note: explain that

$$ g_{\mu\nu} \, u^\mu u^\nu = 1 ; \quad g_{\mu\nu} \, w^\mu \, \delta a^\nu = 0 . \tag{33} $$

Explain also that (i) the tangent vector $w^\alpha$ is obtained, all along the trajectory, by parallel transport of $u^\alpha$ along $\delta v^\alpha$, and (ii) at this level of approximation, the proper time along the second trajectory is identical to the proper time along the first trajectory.

⁵ The LAGEOS (LAser GEOdynamics Satellites) are passive spherical bodies covered with retroreflectors. Note that Ciufolini and Pavlis (2004) have recently been able to confirm the Lense-Thirring effect using LAGEOS data. The CHAMP (CHAllenging Minisatellite Payload) satellite is equipped with precise orbit determination and an accelerometer. GRACE (GRAvity recovery and Climate Experiment) consists of two satellites with precise orbit determination, accelerometers, and a measurement of their mutual distance with an accuracy of a few microns. The GOCE (Gravity Field and Steady-State Ocean Circulation Explorer) satellite is still to be launched. It will carry a three-axis gradiometer: six accelerometers in a so-called diamond configuration. The observables are the differences of the accelerations.

A mass can be forced to follow this line, and the forces required to do this can be monitored, this giving a measurement of the acceleration $\delta a^\alpha$ of the mass.

We do not need to evaluate exactly the theoretical relation expressing $\delta a^\alpha$; the approximation that is first order in $\delta v^\alpha$ will be sufficient (because $\delta v^\alpha$ is small). As demonstrated by Pozo et al. (2005), one has $\delta a^\alpha = R^\alpha{}_{\mu\nu\rho} \, u^\mu u^\rho \, \delta v^\nu + \dots$, where the remaining terms are at least second order in $\delta v^\alpha$. Then, with a sufficient approximation, we use below the expression

$$ \delta a^\alpha = R^\alpha{}_{\mu\nu\rho} \, u^\mu u^\rho \, \delta v^\nu . \tag{34} $$

As the three vectors $\delta a^\alpha$, $u^\alpha$, and $\delta v^\alpha$ are known, we have direct information on the components of the Riemann tensor.

A typical gradiometer contains three arms (in three perpendicular directions in space). This means that we have three different vectors $\delta v^\alpha$ with which to apply equation 34. The vector $u^\alpha$ is unique (fixed by the trajectory of the satellite). Should one have different satellites at approximately the same space-time point, with significantly different trajectories, one would have extra constraints on the Riemann tensor (at the given space-time point).

In order to simplify the notations in later sections of the paper, we drop the $\delta$ for the vector $\delta v^\alpha$, and we write $\omega^\alpha$ instead of $\delta a^\alpha$. Then, equation 34 becomes

$$ \omega^\alpha = R^\alpha{}_{\mu\nu\rho} \, u^\mu u^\rho \, v^\nu . \tag{35} $$

Given a metric field $\mathbf{g}$, the theoretical values of the tidal acceleration are denoted $\boldsymbol{\omega}_{\mathrm{computed}}$, and we write

$$ \mathbf{g} \mapsto \boldsymbol{\omega}_{\mathrm{computed}} = \boldsymbol{\omega}(\mathbf{g}) , \tag{36} $$

where $\omega^\alpha_{\mathrm{computed}} = R^\alpha{}_{\mu\nu\rho}(\mathbf{g}) \, u^\mu u^\rho \, v^\nu$. The gradiometer provides the ‘observed acceleration’ $\boldsymbol{\omega}_{\mathrm{obs}}$, with observational uncertainties represented by a covariance operator $\mathbf{C}_\omega$. We wish that the tidal accelerations, $\boldsymbol{\omega}(\mathbf{g})$, are close to the observed ones, $\boldsymbol{\omega}_{\mathrm{obs}}$. More precisely, we wish the (least-squares) norm

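$$ \| \, \boldsymbol{\omega}(\mathbf{g}) - \boldsymbol{\omega}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_\omega} \equiv \langle \, \mathbf{C}_\omega^{-1} ( \boldsymbol{\omega}(\mathbf{g}) - \boldsymbol{\omega}_{\mathrm{obs}} ) \, , \, ( \boldsymbol{\omega}(\mathbf{g}) - \boldsymbol{\omega}_{\mathrm{obs}} ) \, \rangle \tag{37} $$

to be small.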
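For orientation, the size of the tidal accelerations can be estimated in the weak-field, slow-motion limit, which is an assumption introduced here and not the relativistic formula itself. For a static potential $\Phi = -GM/r$ and a satellite nearly at rest, $u^\alpha \approx (1, 0, 0, 0)$, equation 35 with the conventions of equation 9 reduces to $\omega^i \approx \partial_i \partial_j \Phi \, v^j$, the force per unit mass needed to hold the test mass on the transported line. The orbital radius below is a hypothetical value; the 50 cm separation is the one quoted in the text for GOCE.

```python
import math

GM = 3.986e14          # Earth's gravitational parameter, m^3 s^-2
ORBIT_RADIUS = 6.628e6  # hypothetical orbital radius, m
ARM = 0.5              # separation of the two masses of an arm, m


def tidal_tensor(position):
    """Hessian of Phi = -GM/r: GM (delta_ij / r^3 - 3 x_i x_j / r^5)."""
    r = math.sqrt(sum(x * x for x in position))
    return [[GM * ((1.0 if i == j else 0.0) / r**3
                   - 3.0 * position[i] * position[j] / r**5)
             for j in range(3)] for i in range(3)]


def tidal_acceleration(position, separation):
    """Weak-field limit of equation 35: omega^i = d_i d_j Phi * v^j."""
    hessian = tidal_tensor(position)
    return [sum(hessian[i][j] * separation[j] for j in range(3)) for i in range(3)]


satellite = (ORBIT_RADIUS, 0.0, 0.0)
radial = tidal_acceleration(satellite, (ARM, 0.0, 0.0))       # arm along the vertical
transverse = tidal_acceleration(satellite, (0.0, ARM, 0.0))   # arm along the horizontal
print(radial, transverse)
```

With these values the accelerations are of the order of $10^{-6}\,\mathrm{m/s^2}$, with opposite signs for the radial and transverse arms, which gives an idea of the sensitivity a gradiometer must reach to constrain the Riemann tensor.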

2.10 Total Misfit

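Using standard arguments from least-squares theory (see Tarantola [2004]), we shall define here the ‘best metric field’ as the field $\mathbf{g}$ that minimizes the sum of all the misfit terms introduced above (equations 14, 15, 19, 24, 27, 32, and 37). The total misfit function, which we denote $S(\mathbf{g})$, is, therefore, given by

$$ \begin{aligned} 2 \, S(\mathbf{g}) = {} & \| \, \mathbf{g} - \mathbf{g}_{\mathrm{prior}} \, \|^2_{\mathbf{C}_g} + \| \, \mathbf{z}(\mathbf{g}) - \mathbf{1} \, \|^2_{\mathbf{C}_z} + \| \, \mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_t} + \| \, \boldsymbol{\sigma}(\mathbf{g}) - \boldsymbol{\sigma}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_\sigma} \\ & + \| \, \mathbf{a}(\mathbf{g}) - \mathbf{a}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_a} + \| \, \boldsymbol{\pi}(\mathbf{g}) - \boldsymbol{\pi}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_\pi} + \| \, \boldsymbol{\omega}(\mathbf{g}) - \boldsymbol{\omega}_{\mathrm{obs}} \, \|^2_{\mathbf{C}_\omega} , \end{aligned} \tag{38} $$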

i.e.,

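$$ \begin{aligned} 2 \, S(\mathbf{g}) = {} & \langle \, \mathbf{C}_g^{-1} ( \mathbf{g} - \mathbf{g}_{\mathrm{prior}} ) \, , \, ( \mathbf{g} - \mathbf{g}_{\mathrm{prior}} ) \, \rangle \\ & + \langle \, \mathbf{C}_z^{-1} ( \mathbf{z}(\mathbf{g}) - \mathbf{1} ) \, , \, ( \mathbf{z}(\mathbf{g}) - \mathbf{1} ) \, \rangle \\ & + \langle \, \mathbf{C}_t^{-1} ( \mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}} ) \, , \, ( \mathbf{t}(\mathbf{g}) - \mathbf{t}_{\mathrm{obs}} ) \, \rangle \\ & + \langle \, \mathbf{C}_\sigma^{-1} ( \boldsymbol{\sigma}(\mathbf{g}) - \boldsymbol{\sigma}_{\mathrm{obs}} ) \, , \, ( \boldsymbol{\sigma}(\mathbf{g}) - \boldsymbol{\sigma}_{\mathrm{obs}} ) \, \rangle \\ & + \langle \, \mathbf{C}_a^{-1} ( \mathbf{a}(\mathbf{g}) - \mathbf{a}_{\mathrm{obs}} ) \, , \, ( \mathbf{a}(\mathbf{g}) - \mathbf{a}_{\mathrm{obs}} ) \, \rangle \\ & + \langle \, \mathbf{C}_\pi^{-1} ( \boldsymbol{\pi}(\mathbf{g}) - \boldsymbol{\pi}_{\mathrm{obs}} ) \, , \, ( \boldsymbol{\pi}(\mathbf{g}) - \boldsymbol{\pi}_{\mathrm{obs}} ) \, \rangle \\ & + \langle \, \mathbf{C}_\omega^{-1} ( \boldsymbol{\omega}(\mathbf{g}) - \boldsymbol{\omega}_{\mathrm{obs}} ) \, , \, ( \boldsymbol{\omega}(\mathbf{g}) - \boldsymbol{\omega}_{\mathrm{obs}} ) \, \rangle . \end{aligned} \tag{39} $$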

Sometimes, in least-squares theory, these different terms are allowed to have different ‘weights’, by multiplying them by some ad-hoc numerical factors. This is not necessary if all the covariance operators are chosen properly. In any case, adding some extra numerical factors is a trivial task that we do not contemplate here.

Although in this paper we limit our scope to providing the simplest method that could be used to actually find the metric field $\mathbf{g}$ that minimizes the misfit function, it is interesting to know that the function $S(\mathbf{g})$ carries more fundamental information. In fact, as shown, for instance, in Tarantola (2004), the expression

f (g) = k exp(− S(g) ) (40)

defines a probability density (infinite-dimensional) that represents the information we have on theactual metric field, i.e., in fact, the respective ‘likelihoods’ of all possible metric fields.

3 Optimization

3.1 Iterative Algorithm

Once the misfit function S(g) has been defined (equation 39), and the associated probability distribution f(g) has been introduced (equation 40), the ideal (although totally impractical) approach for extracting all the information on g brought by the data of our problem would be to sample the probability distribution [6] f(g) . Examples of the sampling of a probability distribution in the context of inverse problems can be found in Tarantola (2004).

In the present problem, where the initial metric shall not be too far from the actual metric, the nonlinearities of the problem are going to be weak. This implies that the probability distribution f(g) is monomodal, i.e., the misfit function S(g) has a unique minimum (in the region of interest of the parameter space). Therefore, the general sampling techniques can here be replaced by the much more efficient optimization techniques. The basic question becomes: for which metric field g does the misfit function S(g) attain its minimum?

This problem can be solved using gradient-based techniques. These techniques are quite sophisticated, and require careful adaptation to the problem at hand if they are to work with acceptable efficiency. As we do not wish to develop this topic in this paper, we just choose here to explore the most universal of the gradient-based techniques: the quasi-Newton method.

[6] Sampling an infinite-dimensional probability distribution is not possible, but we could define a (dense enough) grid in the space-time where the values of g are considered, this discretization rendering the probability distribution finite-dimensional.


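To obtain the actual algorithm, one may use the formulas developed in Tarantola (2004). The resulting iterative algorithm can be written

$$ g_{k+1} = g_k - H_k^{-1}\, \gamma^*_k , \tag{41} $$

where the ‘Hessian operator’ $H_k$ is

$$
H_k = C_g^{-1} + Z_k^t\, C_z^{-1} Z_k + T_k^t\, C_t^{-1} T_k + \Sigma_k^t\, C_\sigma^{-1} \Sigma_k + A_k^t\, C_a^{-1} A_k + \Pi_k^t\, C_\pi^{-1} \Pi_k + \Omega_k^t\, C_\omega^{-1} \Omega_k ,
\tag{42}
$$

the ‘gradient vector’ is

$$
\begin{aligned}
\gamma^*_k = {} & C_g^{-1} ( g_k - g_{\rm prior} ) + Z_k^t\, C_z^{-1} ( z(g_k) - 1 ) \\
& + T_k^t\, C_t^{-1} ( t(g_k) - t_{\rm obs} ) \\
& + \Sigma_k^t\, C_\sigma^{-1} ( \sigma(g_k) - \sigma_{\rm obs} ) \\
& + A_k^t\, C_a^{-1} ( a(g_k) - a_{\rm obs} ) \\
& + \Pi_k^t\, C_\pi^{-1} ( \pi(g_k) - \pi_{\rm obs} ) \\
& + \Omega_k^t\, C_\omega^{-1} ( \omega(g_k) - \omega_{\rm obs} ) ,
\end{aligned}
\tag{43}
$$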

where the linear operators $Z_k$ , $T_k$ , $\Sigma_k$ , $A_k$ , $\Pi_k$ , and $\Omega_k$ are the Fréchet derivatives (tangent linear applications) of the operators z(g) , t(g) , σ(g) , a(g) , π(g) , and ω(g) , introduced in equations 13, 18, 23, 26, 31, and 36, all the operators being evaluated for $g = g_k$ , and where $Z_k^t$ , $T_k^t$ , $\Sigma_k^t$ , $A_k^t$ , $\Pi_k^t$ , and $\Omega_k^t$ are the respective transpose operators. We say transpose operators, rather than dual operators, because the difference between the two notions matters inside the theory of least-squares [7].

All the linear operators just introduced are evaluated in section 3.2. But before going into these details, some comments on the iterative algorithm are needed.

The quasi-Newton algorithm 41 can be initialized at an arbitrary point (i.e., at any metric field) $g_0$ . If working in the vicinity of an ordinary planet, the present problem will only be mildly nonlinear, and the convergence point will be independent of the initial point. The simplest choice, of course, is

$$ g_0 = g_{\rm prior} . \tag{44} $$

Before addressing the problem of how many iterations must be done in practice, let us take the strict mathematical point of view that, in principle, an infinite number of iterations should be performed. The optimal estimate of the space-time metric would then be

$$ \tilde{g} = g_\infty . \tag{45} $$

The least-squares method not only provides an optimal solution, it also provides a means of estimating the uncertainties on this solution. It can be shown (Tarantola, 2004) that these uncertainties are those represented by the covariance operator

$$ \tilde{C}_g = H_\infty^{-1} . \tag{46} $$

Crudely speaking, we started with the a priori metric $g_{\rm prior}$ , with uncertainties represented by the covariance operator $C_g$ , and we end up with the a posteriori metric $\tilde{g}$ , with uncertainties represented by the covariance operator $\tilde{C}_g$ .
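The structure of the iteration (equations 41 to 46) can be illustrated on a small finite-dimensional toy problem. In the sketch below, the two-parameter model, the `forward` operator, its Jacobian `frechet` and all numerical values are hypothetical stand-ins for the metric field and the operators of the paper; a single data type is used, and explicit inverses are formed only because the toy problem is tiny.

```python
import numpy as np

def forward(m):
    # Hypothetical mildly nonlinear forward operator (stand-in for t, sigma, a, ...).
    return np.array([m[0] + 0.1 * m[1] ** 2, m[1], m[0] + m[1]])

def frechet(m):
    # Jacobian of `forward` at m (the tangent linear operator).
    return np.array([[1.0, 0.2 * m[1]], [0.0, 1.0], [1.0, 1.0]])

def quasi_newton(m_prior, C_m, d_obs, C_d, n_iter=10):
    m = m_prior.copy()                      # analogue of equation 44
    Cm_inv, Cd_inv = np.linalg.inv(C_m), np.linalg.inv(C_d)

    def hessian_and_gradient(m):
        G = frechet(m)
        H = Cm_inv + G.T @ Cd_inv @ G       # analogue of equation 42
        grad = Cm_inv @ (m - m_prior) + G.T @ Cd_inv @ (forward(m) - d_obs)
        return H, grad                      # grad: analogue of equation 43

    for _ in range(n_iter):
        H, grad = hessian_and_gradient(m)
        m = m - np.linalg.solve(H, grad)    # analogue of equation 41
    H, grad = hessian_and_gradient(m)
    return m, np.linalg.inv(H), grad        # posterior covariance: equation 46

m_true = np.array([1.0, 2.0])
m_post, C_post, grad = quasi_newton(
    m_prior=np.array([0.8, 1.7]), C_m=0.25 * np.eye(2),
    d_obs=forward(m_true), C_d=0.01 * np.eye(3))
```

The returned gradient gives a simple convergence check: it should be negligible once the iterations have settled.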

[7] A proper introduction of the dual operators (denoted with a ‘star’) would respectively give $Z_k^* = C_g\, Z_k^t\, C_z^{-1}$ , $T_k^* = C_g\, T_k^t\, C_t^{-1}$ , $\Sigma_k^* = C_g\, \Sigma_k^t\, C_\sigma^{-1}$ , $A_k^* = C_g\, A_k^t\, C_a^{-1}$ , $\Pi_k^* = C_g\, \Pi_k^t\, C_\pi^{-1}$ , and $\Omega_k^* = C_g\, \Omega_k^t\, C_\omega^{-1}$ . See Tarantola (2004) for details.


The practical experience we have with the quasi-Newton algorithm for travel-time fitting problems suggests that the algorithm should converge to the proper solution (with sufficient accuracy) in a few iterations (fewer than 10). Then, for all practical purposes, we can replace ∞ by 10 in the two equations 45–46.

An important practical consideration is the following. The Hessian operator (equation 42) shall be completely characterized below, and the different covariance operators shall be directly given. But the algorithm in equations 41–43 contains the inverse of these linear operators. It is a very basic result of numerical analysis (Ciarlet, 1982) that the numerical resolution of a linear system may be dramatically more economical than the numerical evaluation of the inverse of a linear operator. Therefore, we need to rewrite the quasi-Newton algorithm, replacing every occurrence of the inverse of an operator by the associated resolution of a linear system.

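Let us start with the evaluation of the gradient vector $\gamma^*_k$ . Expression 43 can be rewritten

$$
\gamma^*_k = \delta g^*_k + Z_k^t\, \delta z^*_k + T_k^t\, \delta t^*_k + \Sigma_k^t\, \delta\sigma^*_k + A_k^t\, \delta a^*_k + \Pi_k^t\, \delta\pi^*_k + \Omega_k^t\, \delta\omega^*_k ,
\tag{47}
$$

where the vectors $\delta g^*_k$ , $\delta z^*_k$ , $\delta t^*_k$ , $\delta\sigma^*_k$ , $\delta a^*_k$ , $\delta\pi^*_k$ , and $\delta\omega^*_k$ are the respective solutions of the linear systems

$$
\begin{aligned}
C_g\, \delta g^*_k &= g_k - g_{\rm prior} \\
C_z\, \delta z^*_k &= z(g_k) - 1 \\
C_t\, \delta t^*_k &= t(g_k) - t_{\rm obs} \\
C_\sigma\, \delta\sigma^*_k &= \sigma(g_k) - \sigma_{\rm obs} \\
C_a\, \delta a^*_k &= a(g_k) - a_{\rm obs} \\
C_\pi\, \delta\pi^*_k &= \pi(g_k) - \pi_{\rm obs} \\
C_\omega\, \delta\omega^*_k &= \omega(g_k) - \omega_{\rm obs} .
\end{aligned}
\tag{48}
$$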

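Once the gradient vector $\gamma^*_k$ is evaluated, one can turn to the iterative step (equation 41). It can be written

$$ g_{k+1} = g_k - \gamma_k , \tag{49} $$

where $\gamma_k$ is the solution of the linear system

$$ H_k\, \gamma_k = \gamma^*_k . \tag{50} $$

Using the expression 42 for the operator $H_k$ we can equivalently say that $\gamma_k$ is the solution of the linear system

$$
\Delta g^*_k + Z_k^t\, \Delta z^*_k + T_k^t\, \Delta t^*_k + \Sigma_k^t\, \Delta\sigma^*_k + A_k^t\, \Delta a^*_k + \Pi_k^t\, \Delta\pi^*_k + \Omega_k^t\, \Delta\omega^*_k = \gamma^*_k ,
\tag{51}
$$

where, when introducing the vectors

$$
\begin{aligned}
\Delta g_k &= \gamma_k \\
\Delta z_k &= Z_k\, \gamma_k \\
\Delta t_k &= T_k\, \gamma_k \\
\Delta\sigma_k &= \Sigma_k\, \gamma_k \\
\Delta a_k &= A_k\, \gamma_k \\
\Delta\pi_k &= \Pi_k\, \gamma_k \\
\Delta\omega_k &= \Omega_k\, \gamma_k ,
\end{aligned}
\tag{52}
$$

the vectors $\Delta g^*_k$ , $\Delta z^*_k$ , $\Delta t^*_k$ , $\Delta\sigma^*_k$ , $\Delta a^*_k$ , $\Delta\pi^*_k$ , and $\Delta\omega^*_k$ are the respective solutions of the linear systems

$$
\begin{aligned}
C_g\, \Delta g^*_k &= \Delta g_k \\
C_z\, \Delta z^*_k &= \Delta z_k \\
C_t\, \Delta t^*_k &= \Delta t_k \\
C_\sigma\, \Delta\sigma^*_k &= \Delta\sigma_k \\
C_a\, \Delta a^*_k &= \Delta a_k \\
C_\pi\, \Delta\pi^*_k &= \Delta\pi_k \\
C_\omega\, \Delta\omega^*_k &= \Delta\omega_k .
\end{aligned}
\tag{53}
$$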
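One possible way to exploit this formulation numerically is to treat the Hessian as an operator that is only ever applied to vectors, and to solve equation 50 iteratively. The sketch below uses a plain conjugate-gradient loop for that purpose; this choice of solver, the helper names `apply_hessian` and `conjugate_gradient`, and the small random matrices standing in for the tangent operators and covariances are all assumptions made for illustration, not something prescribed by the text.

```python
import numpy as np

def apply_hessian(v, C_g, blocks):
    """Apply H to v in the spirit of equations 51-53: linear solves, no inverses."""
    out = np.linalg.solve(C_g, v)                  # solves C_g x = v
    for G, C in blocks:                            # (tangent operator, covariance)
        out += G.T @ np.linalg.solve(C, G @ v)     # G^t applied to the solution of C y = G v
    return out

def conjugate_gradient(apply_A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for a symmetric positive definite operator given as a function."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(0)
n = 4
C_g = np.eye(n)
blocks = [(rng.normal(size=(3, n)), 0.5 * np.eye(3)),
          (rng.normal(size=(2, n)), 2.0 * np.eye(2))]
gamma_star = rng.normal(size=n)
gamma = conjugate_gradient(lambda v: apply_hessian(v, C_g, blocks), gamma_star)
```

In a realistic discretization the dense `np.linalg.solve` calls would themselves be replaced by solvers adapted to the structure of each covariance operator.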

It is now time to determine the operators $Z_k$ , $T_k$ , $\Sigma_k$ , $A_k$ , $\Pi_k$ , and $\Omega_k$ , and their transposes $Z_k^t$ , $T_k^t$ , $\Sigma_k^t$ , $A_k^t$ , $\Pi_k^t$ , and $\Omega_k^t$ .

3.2 Tangent Linear Applications and their Transposes

The linear tangent application to an application

$$ g \mapsto h(g) , \tag{54} $$

is the linear application H defined by the first order development

$$ h(g + \delta g) = h(g) + H\, \delta g + \dots \tag{55} $$

The linear operator H depends, in general, on the ‘point’ g where it is evaluated, so, to be more explicit, one could write $h(g + \delta g) = h(g) + H(g)\, \delta g + \dots$ The linear operator H is called the Fréchet derivative of the operator h . It is this first order development that must actually be used to evaluate the Fréchet derivative of an operator.
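A candidate tangent linear operator can be checked numerically against the first order development: the remainder h(g + ε δg) − h(g) − ε H δg should shrink like ε². The following sketch does this for a hypothetical two-component operator `h` with hand-written Jacobian `H`; both are invented for the example and unrelated to the operators of the paper.

```python
import numpy as np

def h(g):
    # Hypothetical nonlinear operator acting on a 2-component "model".
    return np.array([g[0] * g[1], np.sin(g[0]), g[1] ** 2])

def H(g):
    # Its tangent linear operator (Jacobian) evaluated at the point g.
    return np.array([[g[1], g[0]], [np.cos(g[0]), 0.0], [0.0, 2.0 * g[1]]])

g = np.array([0.3, 1.2])
dg = np.array([1.0, -2.0])
errors = [np.linalg.norm(h(g + eps * dg) - h(g) - eps * H(g) @ dg)
          for eps in (1e-2, 1e-3)]
ratio = errors[0] / errors[1]   # close to 100 when the remainder is O(eps^2)
```

A ratio far from 100 would indicate an error in the proposed derivative.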

For a good text on functional analysis, in particular on the transpose and adjoint of a linear operator, see Taylor and Lay (1980). Some of the results in the following sections are provided without demonstration: to check the proposed results, the reader should become familiar with the concepts proposed in that book. Let us only mention here two elementary results. The transpose of the linear operator defined through

$$ y^{ij\dots}{}_{k\ell\dots} = A^{ij\dots\,\mu\nu\dots}{}_{k\ell\dots\,\alpha\beta\dots}\; x^{\alpha\beta\dots}{}_{\mu\nu\dots} \tag{56} $$

is the linear operator defined through

$$ x_{\alpha\beta\dots}{}^{\mu\nu\dots} = A^{ij\dots\,\mu\nu\dots}{}_{k\ell\dots\,\alpha\beta\dots}\; y_{ij\dots}{}^{k\ell\dots} \tag{57} $$

The transpose of the linear operator defined through

$$ y^{\alpha\beta\dots}{}_{\gamma\mu\nu\dots} = \nabla_\gamma\, x^{\alpha\beta\dots}{}_{\mu\nu\dots} \tag{58} $$

is the linear operator defined through

$$ x_{\alpha\beta\dots}{}^{\mu\nu\dots} = - \nabla_\gamma\, y_{\alpha\beta\dots}{}^{\gamma\mu\nu\dots} \tag{59} $$

There are typically some boundary conditions to be attached to a differential operator, which implies for the transpose operator a set of ‘dual’ boundary conditions, but we shall not enter into these ‘details’ in this preliminary version of the theory.
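Both elementary results can be checked numerically with the usual ‘dot-product test’, ⟨ A x , y ⟩ = ⟨ x , Aᵗ y ⟩. The sketch below is only an illustration: the index ranges, the random arrays and the periodic one-dimensional grid (chosen so that boundary terms vanish) are assumptions, and a centered finite difference stands in for the covariant derivative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Algebraic case (equations 56-57): y^i_k = A^{i mu}_{k alpha} x^alpha_mu.
A = rng.normal(size=(3, 4, 2, 5))            # array axes ordered as (i, mu, k, alpha)
x = rng.normal(size=(5, 4))                  # x^alpha_mu
y_dual = rng.normal(size=(3, 2))             # dual element, axes (i, k)
Ax = np.einsum("imka,am->ik", A, x)
Aty = np.einsum("imka,ik->am", A, y_dual)    # same array, contracted on the other side
lhs_alg, rhs_alg = np.sum(Ax * y_dual), np.sum(x * Aty)

# Differential case (equations 58-59) on a periodic grid, where boundary terms vanish.
n, dx = 64, 0.1
def ddx(f):
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)   # centered difference
f, w = rng.normal(size=n), rng.normal(size=n)
lhs_diff, rhs_diff = np.sum(ddx(f) * w), np.sum(f * (-ddx(w)))
```

On a non-periodic grid the two sides of the second test would differ by boundary terms, which is the discrete counterpart of the ‘dual’ boundary conditions.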


3.2.1 Proper Time Data

The operator z(g) defined in equation 18 is linear, and, therefore, its tangent linear operator, Z , is the operator z itself: for any δg , one has Z δg = z(δg) . Explicitly, the tangent linear operator is the (linear) operator that to any metric perturbation, say $\delta g_{\alpha\beta}$ , associates the values $\delta g_{\alpha\beta}\, u^\alpha u^\beta$ along the trajectories. This operator Z is to be used in the second of equations 52. The index in $Z_k$ can be dropped: the operator z being linear, its Fréchet derivative is constant.

We leave it to the reader to demonstrate that $Z^t$ , the transpose of Z , is the linear operator that to any set of scalar values δχ defined along the trajectories associates a contravariant metric field $\delta g^{\alpha\beta}$ that is zero everywhere except along the given trajectories, where it takes the value $\delta g^{\alpha\beta} = u^\alpha u^\beta\, \delta\chi$ .

The operator $Z^t$ appears twice in the algorithm, in equations 47 and 51. In equation 47 it acts on $\delta z^*_k$ , which is defined via $C_z\, \delta z^*_k = z(g_k) - 1$ (second of equations 48). The term $z(g_k) - 1$ represents, at each point along each trajectory, the difference between the computed clock rate (via the metric) and the actual clock rate. The term $\delta z^*_k$ represents the same values, but weighted with the inverse covariance values. The operation $Z^t\, \delta z^*_k$ produces, at every point along each of the space-time trajectories, the contravariant metric $\delta z^*_k\, u^\alpha u^\beta$ (and the value zero outside the trajectories).

In equation 51, the operator $Z^t$ acts on $\Delta z^*_k$ , defined through $C_z\, \Delta z^*_k = \Delta z_k$ (second of equations 53), where $\Delta z_k = Z\, \gamma_k$ (second of equations 52).
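As a rough discrete illustration, Z and its transpose can be written as two contractions over sampled trajectory points. In this sketch the sampling of the trajectory into n points, the array layout and the function names are assumptions; the field is stored only at the trajectory points, since the transpose is zero elsewhere.

```python
import numpy as np

def apply_Z(dg, u):
    """dg: (n, 4, 4) covariant perturbation at n trajectory points; u: (n, 4) tangent vectors."""
    return np.einsum("nab,na,nb->n", dg, u, u)

def apply_Zt(chi, u):
    """chi: (n,) scalars along the trajectory -> (n, 4, 4) contravariant field on the trajectory."""
    return np.einsum("n,na,nb->nab", chi, u, u)

rng = np.random.default_rng(2)
u = rng.normal(size=(5, 4))
dg = rng.normal(size=(5, 4, 4))
chi = rng.normal(size=5)
lhs = np.sum(apply_Z(dg, u) * chi)       # < Z dg , chi >
rhs = np.sum(dg * apply_Zt(chi, u))      # < dg , Z^t chi >
```

Equality of `lhs` and `rhs` is the discrete version of the statement left to the reader above.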

3.2.2 Einstein Equation

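Consider the application that to every metric tensor field $g_{\alpha\beta}$ associates the Einstein tensor field $E_{\alpha\beta}$ defined by equations 9–10, and then the stress-energy tensor field $t_{\alpha\beta} = \frac{1}{\chi} E_{\alpha\beta}$ (equation 12). This nonlinear application has, at every ‘point’ $g_{\alpha\beta}$ (i.e., for every given metric field) a linear tangent application. As demonstrated in appendix 6.1 (see equations 92–93), this linear tangent application is the (linear) application that to every $\delta g_{\alpha\beta}$ associates the $\delta t_{\alpha\beta}$ given by

$$
\delta t_{\alpha\beta} = \frac{1}{\chi} \left( A_{\alpha\beta}{}^{\mu\nu,\rho\sigma}\, \nabla_{(\rho}\nabla_{\sigma)}\, \delta g_{\mu\nu} + B_{\alpha\beta}{}^{\mu\nu}\, \delta g_{\mu\nu} \right) ,
\tag{60}
$$

where

$$
\begin{aligned}
A_{\alpha\beta}{}^{\mu\nu,\rho\sigma} &= 2\, g^{(\mu|(\sigma}\, \delta^{\rho)}_{(\alpha}\, \delta^{|\nu)}_{\beta)} - \tfrac{1}{2}\, g^{\mu\nu}\, \delta^{\rho}_{(\alpha}\, \delta^{\sigma}_{\beta)} - \tfrac{1}{2}\, g^{\rho\sigma}\, \delta^{\mu}_{(\alpha}\, \delta^{\nu}_{\beta)} + \tfrac{1}{2}\, g^{\mu\nu} g^{\rho\sigma} g_{\alpha\beta} - \tfrac{1}{2}\, g^{\mu(\rho} g^{\sigma)\nu} g_{\alpha\beta} \\
B_{\alpha\beta}{}^{\mu\nu} &= \tfrac{1}{2} \left( R^{\mu}{}_{(\alpha\beta)}{}^{\nu} + R^{(\mu}{}_{(\alpha}\, \delta^{\nu)}_{\beta)} + R^{\mu\nu} g_{\alpha\beta} - R\, \delta^{\mu}_{(\alpha}\, \delta^{\nu}_{\beta)} \right) .
\end{aligned}
\tag{61}
$$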

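The linear operator so defined was denoted T above. It appears in the third of equations 52. The index k in these equations means that in equations 60–61 it is the current value of the metric field, $g_k$ , that should be used.

Let us now evaluate the transpose (sometimes called the adjoint) of this linear operator. It is a linear operator that to any $\delta t^{\alpha\beta}$ (element of the dual of the stress-energy tensor space) associates some $\delta g^{\alpha\beta}$ (element of the dual of the metric tensor space). The obvious guess [note: this has to be demonstrated!] is that $\delta g^{\alpha\beta}$ is given by

$$
\delta g^{\mu\nu} = \frac{1}{\chi} \left( \nabla_{(\rho}\nabla_{\sigma)} \left( A_{\alpha\beta}{}^{\mu\nu,\rho\sigma}\, \delta t^{\alpha\beta} \right) + B_{\alpha\beta}{}^{\mu\nu}\, \delta t^{\alpha\beta} \right) .
\tag{62}
$$

As the tensor $A_{\alpha\beta}{}^{\mu\nu,\rho\sigma}$ only contains the metric, it commutes with the covariant derivative. Then this expression can be written

$$
\delta g^{\mu\nu} = \frac{1}{\chi} \left( A_{\alpha\beta}{}^{\mu\nu,\rho\sigma}\, \nabla_{(\rho}\nabla_{\sigma)}\, \delta t^{\alpha\beta} + B_{\alpha\beta}{}^{\mu\nu}\, \delta t^{\alpha\beta} \right) .
\tag{63}
$$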


The linear operator so defined was denoted Tt above. It appears in equations 47 and 51.

3.2.3 Arrival Time Data

When the metric is perturbed from $g_{\alpha\beta}$ to $g_{\alpha\beta} + \delta g_{\alpha\beta}$ , the computed arrival times are perturbed from σ to σ + δσ , where (see equation 122 in appendix 6.2)

$$
\delta\sigma = - \frac{1/2}{g_{\mu\nu}\, u^\mu \ell^\nu} \int_{\lambda(g)} d\lambda\; \ell^\alpha \ell^\beta\, \delta g_{\alpha\beta} ,
\tag{64}
$$

where $u^\alpha$ is the tangent vector to the trajectory of the receiver, $u^\alpha = dx^\alpha/d\tau$ , and $\ell^\alpha$ is the tangent vector to the trajectory of the light ray, $\ell^\alpha = dx^\alpha/d\lambda$ (where λ is an affine parameter along the ray). The linear operator so defined was denoted Σ above. It appears in the fourth of equations 52.

To any space-time metric field perturbation δg , expression 64 associates the scalar δσ . We shall demonstrate here [note: this has to be demonstrated] that the transpose of this operator is the linear operator that to any scalar δσ associates the contravariant metric field $\delta g^{\alpha\beta}$ that is zero everywhere in the space-time except along the given signal trajectory, where the value is

$$
\delta g^{\alpha\beta} = - \frac{1/2}{g_{\mu\nu}\, u^\mu \ell^\nu}\; \ell^\alpha \ell^\beta\, \delta\sigma .
\tag{65}
$$

The linear operator so defined was denoted Σt above. It appears in equations 47 and 51.

3.2.4 Accelerometer Data

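It follows from equation 25 that a perturbation of the metric $g_{\alpha\beta} \mapsto g_{\alpha\beta} + \delta g_{\alpha\beta}$ produces a perturbation of the computed acceleration given by $\delta a^\alpha = \delta\Gamma^\alpha{}_{\beta\gamma}\, u^\beta u^\gamma$ . The expression for $\delta\Gamma^\alpha{}_{\beta\gamma}$ is in appendix 6.1 (see equation 81): $\delta\Gamma^\alpha{}_{\beta\gamma} = g^{\alpha\sigma}\, \delta\Gamma_{\sigma\beta\gamma}$ , with $\delta\Gamma_{\alpha\beta\gamma} = \frac{1}{2} ( \nabla_\gamma \delta g_{\alpha\beta} + \nabla_\beta \delta g_{\alpha\gamma} - \nabla_\alpha \delta g_{\beta\gamma} )$ . This gives

$$
\delta a^\alpha = \tfrac{1}{2}\, g^{\alpha\sigma} \left( \nabla_\gamma\, \delta g_{\sigma\beta} + \nabla_\beta\, \delta g_{\sigma\gamma} - \nabla_\sigma\, \delta g_{\beta\gamma} \right) u^\beta u^\gamma .
\tag{66}
$$

The linear operator so defined was denoted A above. To any metric field perturbation $\delta g_{\alpha\beta}$ this operator associates, at every point of a space-time trajectory where the acceleration was measured, the values $\delta a^\alpha$ just written. The operator A appears in the fifth of equations 52.

It follows [note: demonstrate this] that the transpose operator $A^t$ is the linear operator that to any $\delta a_\alpha$ , defined at the points where the acceleration was measured, associates, at the same points, the values [note: this still has to be simplified]

$$
\delta g^{\alpha\beta} = - \tfrac{1}{2} \left( g^{\alpha\nu} u^\beta u^\mu + g^{\beta\nu} u^\alpha u^\mu - g^{\mu\nu} u^\alpha u^\beta \right) \nabla_\mu\, \delta a_\nu .
\tag{67}
$$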

This operator appears in equations 47 and 51.

3.2.5 Gyroscope Data

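The application $g \mapsto \pi(g)$ is (equation 30)

$$
\pi^\alpha \equiv \frac{ds^\alpha}{d\tau} = s_\beta \left( a^\beta u^\alpha - a^\alpha u^\beta \right) - \Gamma^\alpha{}_{\beta\gamma}\, u^\beta s^\gamma .
\tag{68}
$$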


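To compute the first order perturbation $\pi \mapsto \pi + \delta\pi$ produced by a perturbation $g \mapsto g + \delta g$ , we must, in this equation, make the replacements $\Gamma^\alpha{}_{\beta\gamma} \mapsto \Gamma^\alpha{}_{\beta\gamma} + \delta\Gamma^\alpha{}_{\beta\gamma}$ and $s^\alpha \mapsto s^\alpha + \delta s^\alpha$ , and then express $\delta\Gamma^\alpha{}_{\beta\gamma}$ and $\delta s^\alpha$ in terms of $\delta g_{\alpha\beta}$ . Rather than the exact computation, let us here simplify the problem by using measured values of $s^\alpha$ as auxiliary values, in which case we are simply left with

$$ \delta\pi^\alpha = - \delta\Gamma^\alpha{}_{\beta\gamma}\, u^\beta s^\gamma . \tag{69} $$

As mentioned in the previous section, the expression for $\delta\Gamma^\alpha{}_{\beta\gamma}$ is in appendix 6.1. Using this, we are immediately led to an expression similar to 66:

$$
\delta\pi^\alpha = - \tfrac{1}{2}\, g^{\alpha\sigma} \left( \nabla_\gamma\, \delta g_{\sigma\beta} + \nabla_\beta\, \delta g_{\sigma\gamma} - \nabla_\sigma\, \delta g_{\beta\gamma} \right) u^\beta s^\gamma .
\tag{70}
$$

The operator that to any $\delta g_{\alpha\beta}$ associates the $\delta\pi^\alpha$ given by this equation is the operator Π , appearing in the sixth of equations 52.

The transpose operator, $\Pi^t$ , is [note: this still has to be simplified]

$$
\delta g^{\alpha\beta} = - \tfrac{1}{4} \left( g^{\alpha\nu} ( u^\beta s^\mu + u^\mu s^\beta ) + g^{\beta\nu} ( u^\alpha s^\mu + u^\mu s^\alpha ) - g^{\mu\nu} ( u^\alpha s^\beta + u^\beta s^\alpha ) \right) \nabla_\mu\, \delta\pi_\nu .
\tag{71}
$$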

It appears in equations 47 and 51.

3.2.6 Gradiometer Data

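In view of equation 35, a perturbation of the metric field will produce the perturbation

$$ \delta\omega^\alpha = \delta R^\alpha{}_{\beta\gamma\delta}\; u^\beta u^\delta v^\gamma \tag{72} $$

of the tidal acceleration, where $\delta R^\alpha{}_{\mu\nu\rho}$ is the first order perturbation to the Riemann tensor. This perturbation is obtained as a by-product in our computation of the perturbation of the Einstein tensor in appendix 6.1 (see equation 83). The result is

$$ \delta R^\alpha{}_{\beta\gamma\delta} = 2\, \nabla_{[\gamma}\, \Omega^\alpha{}_{\delta]\beta} , \tag{73} $$

where

$$
\Omega^\alpha{}_{\beta\gamma} = g^{\alpha\sigma}\, \Omega_{\sigma\beta\gamma} \quad \text{with} \quad \Omega_{\alpha\beta\gamma} = \tfrac{1}{2} \left( \nabla_\gamma \delta g_{\alpha\beta} + \nabla_\beta \delta g_{\alpha\gamma} - \nabla_\alpha \delta g_{\beta\gamma} \right) .
\tag{74}
$$

This defines the operator Ω , that appears in the seventh of equations 52.

The transpose operator, $\Omega^t$ , associates to any $\delta\omega^\alpha$ the $\delta g^{\alpha\beta}$ given by [note: this still has to be simplified]

$$
\begin{aligned}
\delta g^{\alpha\beta} = {} & v^\mu u^\nu \left( u^\alpha\, \nabla_{\mu\nu}\, \delta\omega^\beta + u^\beta\, \nabla_{\mu\nu}\, \delta\omega^\alpha \right) + \left( u^\alpha v^\beta + u^\beta v^\alpha \right) u^\mu\, \nabla_{\mu\nu}\, \delta\omega^\nu \\
& - u^\mu u^\nu \left( v^\alpha\, \nabla_{\mu\nu}\, \delta\omega^\beta + v^\beta\, \nabla_{\mu\nu}\, \delta\omega^\alpha \right) - 2\, u^\alpha u^\beta\, v^\mu\, \nabla_{\mu\nu}\, \delta\omega^\nu .
\end{aligned}
\tag{75}
$$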

It appears in equations 47 and 51.

4 Discussion and Conclusion

We have been able to develop a consistent, fully relativistic theory, where the data brought by satellites emitting and receiving time signals are used to infer trajectories and the space-time metric. This constitutes both a kind of ultimate gravimeter and a positioning system. Any observer with receiving capabilities shall know its own space-time trajectory “in real time”. These coordinates are not the usual ‘geographical’ coordinates plus a time, but are four times. The problem of attaching these four time coordinates to any terrestrial system of coordinates is just an attachment problem that should not interfere with the basic problem of defining an accurate reference system, and of knowing space-time trajectories in this system.

For more generality, we have considered the possibility that the satellites may have accelerometers, gradiometers, or gyroscopes. This is because the positioning problem and the problem of estimating the gravity field (i.e., the space-time metric) are coupled. In fact, all modern gravimetry satellite missions are coupled with GNSS satellites. Our theory applies, in particular, to the GOCE satellite mission (orbiting gradiometers). It also applies to the Gravity Probe B experiment, but this is more anecdotal.

The optimization algorithm proposed (Newton algorithm) is by no means the most economical to be used in the present context, and considerable effort is required to propose a practical algorithm, possibly using the ‘Kalman filter’ approach briefly mentioned in appendix 6.4. We are quite confident in our prediction that, some day, all positioning systems will be run using the basic principles exposed in this paper: the ever-increasing accuracy of time measurements will eventually force everyone to take relativity theory seriously, at last.

5 Bibliography

Ashby, N., 2002, Relativity and the global positioning system, Physics Today, 55 (5), May 2002, pp. 41–47.

Ciarlet, P.G., 1982, Introduction à l’analyse numérique matricielle et à l’optimisation, Masson, Paris.

Cerveny, V., 2002, Fermat’s variational principle for anisotropic inhomogeneous media, Stud. geophys. geod., vol. 46, pp. 567–588, online at http://sw3d.mff.cuni.cz .

Ciufolini, I., and Pavlis, E.C., 2004, A confirmation of the general relativistic prediction of the Lense-Thirring effect, Nature, Vol. 431, pp. 958–960.

Coll, B., and Morales, J.A., 1992, 199 causal classes of space-time frames, International Journal of Theoretical Physics, Vol. 31, No. 6, pp. 1045–1062.

Coll, B., 2000, Elements for a theory of relativistic coordinate systems, formal and Physical aspects, ERES 2000, Valladolid.

Coll, B., 2001, Physical Relativistic Frames, JSR 2001, ed. N. Capitaine, Pub. Observatoire de Paris, pp. 169–174.

Coll, B., 2002, A principal positioning system for the Earth, JSR 2002, eds. N. Capitaine and M. Stavinschi, Pub. Observatoire de Paris, pp. 34–38.

Coll, B. and Tarantola, A., 2003, Galactic positioning system; physical relativistic coordinates for the Solar system and its surroundings, eds. A. Finkelstein and N. Capitaine, JSR 2003, Pub. St. Petersbourg Observatory, pp. 333–334.

Grewal, M.S., Weill, L.R., and Andrews, A.P., 2001, Global positioning systems, inertial navigation, and integration, John Wiley & Sons.

Hernandez-Pastora, J.L., Martín, J., and Ruiz, E., 2001, On gyroscope precession, arXiv:gr-qc/0009062.

Klimes, L., 2002, Second-order and higher-order perturbations of travel time in isotropic and anisotropic media, Stud. geophys. geod., vol. 46, pp. 213–248, online at http://sw3d.mff.cuni.cz .

Papapetrou, A., 1951, Spinning test-particles in general relativity (I), Proceedings of the Royal Society of London, A209, pp. 248–258.

Pozo, J.M., 2005, . . .

Tarantola, A., 2004, Inverse problem theory and methods for model parameter estimation, SIAM.

Taylor, A.E., and Lay, D.C., 1980, Introduction to functional analysis, Wiley.

Weinberg, S., 1972, Gravitation and Cosmology, Wiley.

6 Appendixes

6.1 Perturbation of Einstein’s Tensor

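Note: explain somewhere that, as for any matrix a , $(a + \delta a)^{-1} = a^{-1} - a^{-1}\, \delta a\; a^{-1} + \cdots$ , when imposing on the metric the perturbation

$$ g_{\alpha\beta} \mapsto g_{\alpha\beta} + \delta g_{\alpha\beta} , \tag{76} $$

the contravariant components have the perturbation

$$ g^{\alpha\beta} \mapsto g^{\alpha\beta} - g^{\alpha\gamma}\, \delta g_{\gamma\delta}\; g^{\delta\beta} + \cdots . \tag{77} $$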
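The first order formula for the inverse can be checked numerically. In the sketch below, the Minkowski matrix is used only as a convenient invertible reference, and the size of the random symmetric perturbation is an arbitrary choice made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
g = np.diag([-1.0, 1.0, 1.0, 1.0])           # Minkowski matrix as a convenient reference
dg = rng.normal(size=(4, 4))
dg = 1e-5 * (dg + dg.T)                       # small symmetric perturbation
g_inv = np.linalg.inv(g)
exact = np.linalg.inv(g + dg)
first_order = g_inv - g_inv @ dg @ g_inv      # first order expression of equation 77
err = np.max(np.abs(exact - first_order))     # expected to be of second order in dg
```

The residual `err` is of the order of the square of the perturbation, as expected for a first order development.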

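When introducing the perturbations 76–77 in the expressions 9, one obtains, keeping only first order terms in $\delta g_{\alpha\beta}$ , the perturbation $\delta E_{\alpha\beta}$ of the Einstein tensor.

We consider the perturbation of the metric and the perturbation of the connection:

$$
g_{\alpha\beta} \to g'_{\alpha\beta} = g_{\alpha\beta} + \delta g_{\alpha\beta} \; ; \qquad \Gamma^\alpha{}_{\beta\gamma} \to \Gamma'^\alpha{}_{\beta\gamma} = \Gamma^\alpha{}_{\beta\gamma} + \delta\Gamma^\alpha{}_{\beta\gamma} .
\tag{78}
$$

In this appendix, and in order to make the expressions more compact, let us denote

$$ \delta g_{\alpha\beta} = h_{\alpha\beta} \; ; \qquad \delta\Gamma^\alpha{}_{\beta\gamma} = \Omega^\alpha{}_{\beta\gamma} . \tag{79} $$

We will use the unperturbed metric to raise and lower indices. For instance, we will write $h^\alpha{}_\beta \equiv g^{\alpha\gamma} h_{\gamma\beta}$ .

By requiring both the unperturbed and the perturbed connection to be metric, $\nabla g = \nabla' g' = 0$ , and symmetric, $\Omega^\alpha{}_{[\beta\gamma]} = 0$ , we get:

$$
\left.
\begin{aligned}
\nabla'_\gamma\, g'_{\alpha\beta} = \nabla_\gamma h_{\alpha\beta} - \Omega^\delta{}_{\alpha\gamma}\, g'_{\delta\beta} - \Omega^\delta{}_{\beta\gamma}\, g'_{\alpha\delta} &= 0 \\
\Omega^\alpha{}_{[\beta\gamma]} &= 0
\end{aligned}
\right\}
\;\Rightarrow\;
g'_{\alpha\delta}\, \Omega^\delta{}_{\beta\gamma} = \tfrac{1}{2} \left( \nabla_\gamma h_{\alpha\beta} + \nabla_\beta h_{\alpha\gamma} - \nabla_\alpha h_{\beta\gamma} \right) .
\tag{80}
$$

Then, $\Omega^\mu{}_{\beta\gamma}$ is obtained by contracting this expression with the inverse of the perturbed metric, $g'^{\mu\alpha}$ . But the first order is given by the contraction with the unperturbed one:

$$
\Omega^\alpha{}_{\beta\gamma} = g^{\alpha\delta}\, \Omega_{\delta\beta\gamma} \quad \text{where} \quad \Omega_{\alpha\beta\gamma} = \tfrac{1}{2} \left( \nabla_\gamma h_{\alpha\beta} + \nabla_\beta h_{\alpha\gamma} - \nabla_\alpha h_{\beta\gamma} \right) .
\tag{81}
$$

The two Riemann tensors are related by

$$
R'^\alpha{}_{\beta\gamma\delta} = R^\alpha{}_{\beta\gamma\delta} + 2\, \nabla_{[\gamma}\, \Omega^\alpha{}_{\delta]\beta} + 2\, \Omega^\alpha{}_{\mu[\gamma}\, \Omega^\mu{}_{\delta]\beta} .
\tag{82}
$$

Thus, the first order perturbation of the Riemann tensor is given by

$$ \delta R^\alpha{}_{\beta\gamma\delta} = 2\, \nabla_{[\gamma}\, \Omega^\alpha{}_{\delta]\beta} , \tag{83} $$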


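from which we get the first order perturbation of the Ricci tensor:

$$
\delta R_{\alpha\beta} = \delta R^\delta{}_{\alpha\delta\beta} = 2\, \nabla_{[\delta}\, \Omega^\delta{}_{\beta]\alpha} = \nabla_{[\delta}\nabla_{\beta]}\, h^\delta{}_\alpha + \tfrac{1}{2} \left( \nabla_\delta \nabla_\alpha h^\delta{}_\beta - \nabla_\beta \nabla_\alpha h^\delta{}_\delta \right) - \tfrac{1}{2} \left( \nabla_\delta \nabla^\delta h_{\alpha\beta} - \nabla_\beta \nabla_\delta h^\delta{}_\alpha \right) .
\tag{84}
$$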
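For readers who wish to follow this step, the contraction can be written out explicitly. The lines below are a sketch that only uses the definition of $\Omega_{\alpha\beta\gamma}$ in equation 81 and the index conventions of this appendix; the intermediate result $\Omega^\delta{}_{\delta\alpha} = \frac{1}{2} \nabla_\alpha h^\delta{}_\delta$ follows from contracting that definition.

```latex
\begin{align*}
2\,\nabla_{[\delta}\,\Omega^{\delta}{}_{\beta]\alpha}
  &= \nabla_\delta\,\Omega^{\delta}{}_{\beta\alpha} - \nabla_\beta\,\Omega^{\delta}{}_{\delta\alpha} , \\
\Omega^{\delta}{}_{\beta\alpha}
  &= \tfrac{1}{2}\left( \nabla_\alpha h^{\delta}{}_{\beta} + \nabla_\beta h^{\delta}{}_{\alpha} - \nabla^{\delta} h_{\beta\alpha} \right) ,
  \qquad
  \Omega^{\delta}{}_{\delta\alpha} = \tfrac{1}{2}\,\nabla_\alpha h^{\delta}{}_{\delta} , \\
\delta R_{\alpha\beta}
  &= \tfrac{1}{2}\left( \nabla_\delta\nabla_\alpha h^{\delta}{}_{\beta} + \nabla_\delta\nabla_\beta h^{\delta}{}_{\alpha}
     - \nabla_\delta\nabla^{\delta} h_{\alpha\beta} - \nabla_\beta\nabla_\alpha h^{\delta}{}_{\delta} \right) .
\end{align*}
```

Writing $\nabla_\delta\nabla_\beta = \nabla_{[\delta}\nabla_{\beta]} + \nabla_{(\delta}\nabla_{\beta)}$ in the second term of the last line then isolates the commutator of covariant derivatives, which is what brings in the curvature terms.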

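Splitting the pairs of covariant derivatives into symmetric and antisymmetric parts,

$$
\begin{aligned}
\delta R_{\alpha\beta} = {} & \nabla_{[\delta}\nabla_{\beta]}\, h^\delta{}_\alpha + \tfrac{1}{2}\, \nabla_{[\delta}\nabla_{\alpha]}\, h^\delta{}_\beta + \tfrac{1}{2}\, \nabla_{[\beta}\nabla_{\delta]}\, h^\delta{}_\alpha \\
& + \tfrac{1}{2}\, \nabla_{(\delta}\nabla_{\alpha)}\, h^\delta{}_\beta - \tfrac{1}{2}\, \nabla_{(\beta}\nabla_{\alpha)}\, h^\delta{}_\delta - \tfrac{1}{2}\, \nabla_\delta\nabla^\delta h_{\alpha\beta} + \tfrac{1}{2}\, \nabla_{(\beta}\nabla_{\delta)}\, h^\delta{}_\alpha .
\end{aligned}
\tag{85}
$$

Substituting now the identity $2\, \nabla_{[\delta}\nabla_{\alpha]}\, h^\delta{}_\beta = R^\delta{}_{\mu\delta\alpha}\, h^\mu{}_\beta - R^\mu{}_{\beta\delta\alpha}\, h^\delta{}_\mu$ and introducing the notation

$$ H_{\alpha\beta,\gamma\delta} \equiv \nabla_{(\gamma}\nabla_{\delta)}\, h_{\alpha\beta} , \tag{86} $$

we obtain

$$
2\, \delta R_{\alpha\beta} = R_{\mu(\alpha}\, h^\mu{}_{\beta)} + R^\mu{}_{(\alpha\beta)\delta}\, h^\delta{}_\mu + H^\delta{}_{\beta,\alpha\delta} + H^\delta{}_{\alpha,\beta\delta} - H^\delta{}_{\delta,\alpha\beta} - H_{\alpha\beta,\delta}{}^{\delta} .
\tag{87}
$$

In order to obtain the perturbation of the Ricci scalar we also need the perturbation of the contravariant metric:

$$
g^{\alpha\gamma} g_{\gamma\beta} = \delta^\alpha_\beta \;\Rightarrow\; \delta g^{\alpha\beta} = - g^{\alpha\gamma}\, h_{\gamma\delta}\, g^{\delta\beta} = - h^{\alpha\beta} .
\tag{88}
$$

Thus, the perturbation of the Ricci scalar is

$$
\delta R = \delta g^{\alpha\beta} R_{\alpha\beta} + g^{\alpha\beta}\, \delta R_{\alpha\beta} = - R^{\alpha\beta} h_{\alpha\beta} + H^{\alpha\beta}{}_{,\alpha\beta} - H^\alpha{}_{\alpha,\beta}{}^{\beta} .
\tag{89}
$$

Finally we obtain the first order perturbation of the Einstein tensor,

$$
\begin{aligned}
\delta E_{\alpha\beta} &= \delta R_{\alpha\beta} - \tfrac{1}{2} \left( \delta R\; g_{\alpha\beta} + R\; \delta g_{\alpha\beta} \right) \\
&= \tfrac{1}{2} \left( R^\delta{}_{(\alpha\beta)}{}^{\mu}\, h_{\delta\mu} + R^\mu{}_{(\alpha}\, h_{\beta)\mu} + R^{\gamma\delta} h_{\gamma\delta}\; g_{\alpha\beta} - R\, h_{\alpha\beta} \right) \\
&\quad + \tfrac{1}{2} \left( H^\delta{}_{\beta,\alpha\delta} + H^\delta{}_{\alpha,\beta\delta} - H^\delta{}_{\delta,\alpha\beta} - H_{\alpha\beta,\delta}{}^{\delta} + H^\gamma{}_{\gamma,\delta}{}^{\delta}\, g_{\alpha\beta} - H^{\gamma\delta}{}_{,\gamma\delta}\; g_{\alpha\beta} \right) .
\end{aligned}
\tag{90}
$$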

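This result can be rewritten as

$$
\delta E_{\alpha\beta} = A_{\alpha\beta}{}^{\gamma\delta,\rho\sigma}\, H_{\gamma\delta,\rho\sigma} + B_{\alpha\beta}{}^{\gamma\delta}\, h_{\gamma\delta}
\tag{91}
$$

with

$$
\begin{aligned}
A_{\alpha\beta}{}^{\gamma\delta,\rho\sigma} &= 2\, g^{(\gamma|(\sigma}\, \delta^{\rho)}_{(\alpha}\, \delta^{|\delta)}_{\beta)} - \tfrac{1}{2}\, g^{\gamma\delta}\, \delta^{\rho}_{(\alpha}\, \delta^{\sigma}_{\beta)} - \tfrac{1}{2}\, g^{\rho\sigma}\, \delta^{\gamma}_{(\alpha}\, \delta^{\delta}_{\beta)} + \tfrac{1}{2}\, g^{\gamma\delta} g^{\rho\sigma} g_{\alpha\beta} - \tfrac{1}{2}\, g^{\gamma(\rho} g^{\sigma)\delta} g_{\alpha\beta} \\
B_{\alpha\beta}{}^{\gamma\delta} &= \tfrac{1}{2} \left( R^{\gamma}{}_{(\alpha\beta)}{}^{\delta} + R^{(\gamma}{}_{(\alpha}\, \delta^{\delta)}_{\beta)} + R^{\gamma\delta} g_{\alpha\beta} - R\, \delta^{\gamma}_{(\alpha}\, \delta^{\delta}_{\beta)} \right) .
\end{aligned}
\tag{92}
$$

Note that using the definition 86, equation 91 can be written, explicitly,

$$
\delta E_{\alpha\beta} = A_{\alpha\beta}{}^{\gamma\delta,\rho\sigma}\, \nabla_{(\rho}\nabla_{\sigma)}\, h_{\gamma\delta} + B_{\alpha\beta}{}^{\gamma\delta}\, h_{\gamma\delta} .
\tag{93}
$$

Observe that, by construction, the two tensors A and B are symmetric in each pair of indices,

$$
A_{\alpha\beta}{}^{\gamma\delta,\mu\nu} = A_{(\alpha\beta)}{}^{(\gamma\delta),(\mu\nu)} \quad \text{and} \quad B_{\alpha\beta}{}^{\gamma\delta} = B_{(\alpha\beta)}{}^{(\gamma\delta)} .
\tag{94}
$$

In addition, it turns out that $A_{\alpha\beta}{}^{\gamma\delta,\mu\nu}$ is symmetric with respect to the interchange of the two contravariant pairs:

$$ A_{\alpha\beta}{}^{\gamma\delta,\mu\nu} = A_{\alpha\beta}{}^{\mu\nu,\gamma\delta} . \tag{95} $$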


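This implies that not all the information in $H_{\alpha\beta,\gamma\delta}$ contributes to $\delta E_{\alpha\beta}$ . In fact, we can express the term $A_{\alpha\beta}{}^{\gamma\delta,\mu\nu}\, H_{\gamma\delta,\mu\nu}$ in an interesting form. Let us define $J_{\alpha\beta\gamma\delta} \equiv 2\, H_{[\delta|[\alpha,\beta]|\gamma]}$ . This tensor contains less information than $H_{[\gamma|[\alpha,\beta]|\delta]}$ , and has the same symmetries as a Riemann tensor:

$$
J_{\alpha\beta\gamma\delta} = J_{[\alpha\beta][\gamma\delta]} = J_{\gamma\delta\alpha\beta} \quad \text{and} \quad J_{\alpha[\beta\gamma\delta]} = 0 .
\tag{96}
$$

We can then take the traces of this tensor (obtaining a Ricci-like tensor and scalar): $J_{\alpha\beta} \equiv J^\gamma{}_{\alpha\gamma\beta}$ and $J \equiv J^\alpha{}_\alpha$ . Then, it is easy to check that the contribution of $H_{\alpha\beta,\gamma\delta}$ is only the Einstein-like tensor of $J_{\alpha\beta\gamma\delta}$ :

$$
A_{\alpha\beta}{}^{\gamma\delta,\mu\nu}\, H_{\gamma\delta,\mu\nu} = J_{\alpha\beta} - \tfrac{1}{2}\, J\, g_{\alpha\beta}
\tag{97}
$$

In contrast, observe that $B_{\alpha\beta}{}^{\gamma\delta}$ contains all the information of the Riemann tensor.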

6.2 Arrival Time Data

We need the linear operator Σ that is tangent to the forward operator σ at some g0 . Formally,

$$ \sigma(g + \delta g) = \sigma(g) + \Sigma\, \delta g + O(\delta g)^2 . \tag{98} $$

It is easy to understand the meaning of Σ . While σ associates to any metric g some arrival times $\sigma^i$ , the operator Σ associates to every metric perturbation δg (around g ) the perturbation $\delta\sigma^i$ of arrival times. Let us compute these perturbations.

6.2.1 Hamiltonian Formulation of Finsler Geometry

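The Finsler space is a generalization of the Riemann space. This generalization is appropriate for the description of the propagation of light and many other waves.

Proper time τ in the Finsler space satisfies the stationary Hamilton-Jacobi equation

$$ H(x^\kappa , \tau_{,\mu}) = \text{const.} , \tag{99} $$

where $H(x^\kappa , p_\mu)$ is the Hamiltonian. The geodesics can then be described by the Hamilton equations

$$ \frac{dx^\alpha}{d\lambda} = \frac{\partial H}{\partial p_\alpha} , \tag{100} $$

$$ \frac{dp_\alpha}{d\lambda} = - \frac{\partial H}{\partial x^\alpha} , \tag{101} $$

with initial conditions

$$ x^\alpha(\lambda_0) = x^\alpha_0 \; ; \qquad p_\alpha(\lambda_0) = \tau_{,\alpha}(x^\mu_0) . \tag{102} $$

Then

$$ \tau_{,\alpha}[x^\mu(\lambda)] = p_\alpha(\lambda) \tag{103} $$

along the geodesics. Parameter λ along a geodesic is determined by the form of the Hamiltonian and by the initial conditions (equation 102) for the geodesic. Proper time τ along the geodesic is then given by

$$ \tau(\lambda) = \tau(\lambda_0) + \int_{\lambda_0}^{\lambda} d\lambda\; p_\alpha \frac{\partial H}{\partial p_\alpha} , \tag{104} $$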


which follows from equations 100 and 103. Note that the same geodesics may be generated by various Hamiltonians. For example, the Hamiltonian $\tilde{H}(x^\kappa , p_\mu) = F[H(x^\kappa , p_\mu)]$ , where F(x) is an arbitrary function with a non-vanishing finite derivative at x equal to the right-hand side of equation 99, yields the same geodesics as the Hamiltonian $H(x^\kappa , p_\mu)$ . The Hamiltonian is often chosen as a homogeneous function of degree N in $p_\alpha$ . In particular, homogeneous Hamiltonians of degrees N = 2, N = 1 or N = −1 are frequently used.

If the Hamiltonian is chosen as a homogeneous function of degree N = 2 in $p_\alpha$ , and is properly normalized, then

$$ g^{\alpha\beta}(x^\kappa , p_\mu) = \frac{\partial^2 H}{\partial p_\alpha\, \partial p_\beta}(x^\kappa , p_\mu) \tag{105} $$

is the contravariant Finslerian metric tensor. If the metric tensor in equation 105 is independent of $p_\mu$ ,

$$ g^{\alpha\beta}(x^\kappa , p_\mu) = g^{\alpha\beta}(x^\kappa) , \tag{106} $$

the Finsler space reduces to the Riemann space.

On the other hand, if we know the contravariant metric tensor, we may construct a homogeneous Hamiltonian of degree N in $p_\alpha$ as

$$ H(x^\kappa , p_\mu) = \frac{1}{N} \left[ p_\alpha\, g^{\alpha\beta}(x^\kappa , p_\mu)\, p_\beta \right]^{\frac{N}{2}} . \tag{107} $$

Whereas degree N may be arbitrary for spatial or time-like geodesics, N ≠ 2 should be avoided for zero-length geodesics in order to keep the right-hand sides of Hamilton equations 100 and 101 finite and not identically vanishing.

For homogeneous Hamiltonians (equation 107), equation 104 reads

$$ \tau(\lambda) = \tau(\lambda_0) + \int_{\lambda_0}^{\lambda} d\lambda \left[ p_\alpha\, g^{\alpha\beta}(x^\kappa , p_\mu)\, p_\beta \right]^{\frac{N}{2}} , \tag{108} $$

and equation 100 yields

$$ \frac{dx^\alpha}{d\lambda}\, g_{\alpha\beta}\, \frac{dx^\beta}{d\lambda} = \left[ p_\alpha\, g^{\alpha\beta} p_\beta \right]^{N-1} . \tag{109} $$

Considering equation 109, equation 108 can be expressed in the form

$$ \tau(\lambda) = \tau(\lambda_0) + \int_{\lambda_0}^{\lambda} d\lambda \left[ \frac{dx^\alpha}{d\lambda}\, g_{\alpha\beta}(x^\kappa , p_\mu)\, \frac{dx^\beta}{d\lambda} \right]^{\frac{N}{2(N-1)}} . \tag{110} $$

In the Hamiltonian formulation, the Finsler geometry is no more complex than the Riemann geometry.
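The two relations used between equations 107 and 110, namely Euler's theorem for homogeneous functions (which turns equation 104 into equation 108) and equation 109, can be checked numerically. The sketch below assumes a momentum-independent (Riemannian) metric at a single point, and uses a hypothetical positive-definite matrix so that the fractional powers are real; the function names are invented for the example.

```python
import numpy as np

def hamiltonian(p, g_inv, N):
    """Equation 107 for a momentum-independent (Riemannian) contravariant metric."""
    return (p @ g_inv @ p) ** (N / 2) / N

def dH_dp(p, g_inv, N):
    """Analytic gradient of equation 107: [p g p]^(N/2 - 1) g^{ab} p_b."""
    return (p @ g_inv @ p) ** (N / 2 - 1) * (g_inv @ p)

g_inv = np.diag([1.0, 2.0, 0.5, 4.0])        # hypothetical positive-definite example
g = np.linalg.inv(g_inv)
p = np.array([0.3, -1.0, 2.0, 0.5])
checks = []
for N in (1, 2, 3):
    xdot = dH_dp(p, g_inv, N)                # right-hand side of equation 100
    q = p @ g_inv @ p
    euler = p @ xdot - N * hamiltonian(p, g_inv, N)   # Euler's theorem: should vanish
    eq109 = xdot @ g @ xdot - q ** (N - 1)            # equation 109: should vanish
    checks.append((euler, eq109))
```

For an indefinite metric and zero-length geodesics the quantity `q` vanishes, which is where the restriction to N = 2 discussed in the text becomes essential.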

6.2.2 Perturbation of Proper Time

The first-order perturbation of proper time (equation 104) is (Klimes, 2002, eq. 25)

$$ \delta\tau(\lambda) = \delta\tau(\lambda_0) - \int_{\lambda_0}^{\lambda} d\lambda\; \delta H . \tag{111} $$

If we wish to perform perturbations with respect to the components of the metric tensor along zero-length space-time geodesics, homogeneous Hamiltonians (equation 107) should be of degree N = 2 to avoid zero or infinite perturbations δH of the Hamiltonian.

One alternative to the present Hamiltonian formulation would be to use a Lagrangian formulation of the first degree, which leads to the usual Fermat's integral. There are four reasons why the formulation presented here is better:


• Perturbations of a homogeneous Lagrangian of degree N with respect to the components of the metric tensor are zero for N > 2 and infinite for N < 2, which results in singularities in the computation;

• Hamilton equations break down for N ≠ 2, which would prevent us from using efficient tools of Hamiltonian formulation;

• Dual Legendre transform between homogeneous Hamiltonian and Lagrangian of the first degree is not possible (Cerveny, 2002), which also holds for spatial and time-like geodesics;

• The integral is generally complex-valued for indefinite metric tensors.

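In the following, we shall thus consider a homogeneous Hamiltonian (equation 107) of degree N = 2,

$$ H(x^\kappa , p_\mu) = \tfrac{1}{2}\, p_\alpha\, g^{\alpha\beta}(x^\kappa , p_\mu)\, p_\beta . \tag{112} $$

Equation 100 then reads

$$ \frac{dx^\alpha}{d\lambda} = g^{\alpha\beta}(x^\kappa , p_\mu)\, p_\beta , \tag{113} $$

and equation 111, with $\delta\tau(\lambda_0) = 0$ , reads

$$ \delta\tau(\lambda) = - \tfrac{1}{2} \int_{\lambda_0}^{\lambda} d\lambda\; p_\alpha\, \delta g^{\alpha\beta}\, p_\beta . \tag{114} $$

Inserting $\delta g^{\alpha\beta} = - g^{\alpha\kappa}\, \delta g_{\kappa\mu}\, g^{\mu\beta}$ , we obtain

$$ \delta\tau(\lambda) = \tfrac{1}{2} \int_{\lambda_0}^{\lambda} d\lambda\; p_\alpha\, g^{\alpha\kappa}\, \delta g_{\kappa\mu}\, g^{\mu\beta}\, p_\beta . \tag{115} $$

Inserting equation 113 into equation 115, we arrive at

$$ \delta\tau(\lambda) = \tfrac{1}{2} \int_{\lambda_0}^{\lambda} d\lambda\; \frac{dx^\alpha}{d\lambda}\, \delta g_{\alpha\beta}\, \frac{dx^\beta}{d\lambda} . \tag{116} $$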

6.2.3 Perturbation of arrival time

Assume the trajectory

$$ x^i = y^i(\sigma) \tag{117} $$

parametrized by proper time σ along it (in general, σ may represent an arbitrary parameter along the trajectory). A light signal emitted at the given point will hit the given trajectory at proper time $\sigma = \sigma_0$ . Assume now that the space-time metric is perturbed from $g_{ij}$ to $g_{ij} + \delta g_{ij}$ . The light signal will now hit the trajectory at proper time $\sigma_0 + \delta\sigma$ . We shall now derive the first order relation between δσ and $\delta g_{ij}$ .

The space-time wavefront may be expressed in the form

$$ \tau(x^\alpha) = 0 , \tag{118} $$

where $\tau(x^\alpha)$ is measured along the geodesic from the given point to point $x^\alpha$ . The geodesics can be calculated by Hamiltonian ray tracing from the given point.


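Proper time σ at the point of intersection of the trajectory with the space-time wavefront then satisfies the equation

$$ \tau(y^\alpha(\sigma)) = 0 . \tag{119} $$

Perturbation of this equation yields

$$ \delta\tau(y^\beta(\sigma)) + \tau_{,\alpha}(y^\beta(\sigma))\, \frac{dy^\alpha}{d\sigma}(\sigma)\; \delta\sigma = 0 . \tag{120} $$

Then

$$ \delta\sigma = - \frac{\delta\tau(y^\beta(\sigma))}{\tau_{,\alpha}(y^\beta(\sigma))\, \dfrac{dy^\alpha}{d\sigma}(\sigma)} . \tag{121} $$

Inserting $p_\alpha$ from equation 113 for $\tau_{,\alpha}$ and equation 116 for $\delta\tau(y^\beta(\sigma))$ , equation 121 can be expressed in the form

$$ \delta\sigma = - \tfrac{1}{2} \left[ \frac{dy^\beta}{d\sigma}\, g_{\alpha\beta}\, \frac{dx^\alpha}{d\lambda} \right]^{-1} \int_{\lambda_0}^{\lambda} d\lambda\; \frac{dx^\alpha}{d\lambda}\, \delta g_{\alpha\beta}\, \frac{dx^\beta}{d\lambda} . \tag{122} $$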

6.3 A Priori Information on the Metric

Let $g_{\rm prior}$ be some reference space-time metric (for instance the Minkowski or the Schwarzschild metric), and let g be the actual metric. In the simple (and a little bit simplistic) approach proposed here, it is assumed that the difference

$$ g - g_{\rm prior} \tag{123} $$

is small, and is assumed to be a random realization of a Gaussian random field with zero mean and prescribed covariance. Because in the light coordinates used here it is the contravariant metric that has some simple properties, the difference in equation 123 is taken using the contravariant components.

To obtain a reasonable model of covariance operator for the metric, we could perform a thought experiment. We imagine a large number of metric fields, all of the form

$$
g^{\alpha\beta} =
\begin{pmatrix}
0 & g^{12} & g^{13} & g^{14} \\
g^{12} & 0 & g^{23} & g^{24} \\
g^{13} & g^{23} & 0 & g^{34} \\
g^{14} & g^{24} & g^{34} & 0
\end{pmatrix}
\tag{124}
$$

at every point, all smoothly varying over space-time, and with the quantities $g^{\alpha\beta}$ randomly varying around the values corresponding to the reference metric, with prescribed, simple probability distributions (independent, to start with). Then we could evaluate the covariance of such a ‘random field’ using the direct definition of covariance:

$$
C^{\alpha\beta\mu\nu}(\tau^1, \tau^2, \tau^3, \tau^4, \sigma^1, \sigma^2, \sigma^3, \sigma^4) = \overline{ \left( g^{\alpha\beta}(\tau^1, \tau^2, \tau^3, \tau^4) - \overline{g}^{\alpha\beta}(\tau^1, \tau^2, \tau^3, \tau^4) \right) \left( g^{\mu\nu}(\sigma^1, \sigma^2, \sigma^3, \sigma^4) - \overline{g}^{\mu\nu}(\sigma^1, \sigma^2, \sigma^3, \sigma^4) \right) } ,
\tag{125}
$$

where $\overline{x}$ means the mean value of x . The mean metric $\overline{g}^{\alpha\beta}$ would be the reference metric.


Another option is to try to insert more constraints that we know are satisfied by the metric. For instance, Pozo (2005) shows that the metric has necessarily the form

$$
\begin{pmatrix}
0 & g^{12} & g^{13} & g^{14} \\
g^{12} & 0 & g^{23} & g^{24} \\
g^{13} & g^{23} & 0 & g^{34} \\
g^{14} & g^{24} & g^{34} & 0
\end{pmatrix}
=
\begin{pmatrix}
a & 0 & 0 & 0 \\
0 & b & 0 & 0 \\
0 & 0 & c & 0 \\
0 & 0 & 0 & d
\end{pmatrix}
\begin{pmatrix}
0 & A & B & 1 \\
A & 0 & 1 & B \\
B & 1 & 0 & A \\
1 & B & A & 0
\end{pmatrix}
\begin{pmatrix}
a & 0 & 0 & 0 \\
0 & b & 0 & 0 \\
0 & 0 & c & 0 \\
0 & 0 & 0 & d
\end{pmatrix} ,
\tag{126}
$$

where the constants a, b, c, d are positive, and the constants A, B should satisfy the constraint that a triangle exists in the Euclidean plane whose sides have the lengths A, B, 1 . One could perhaps use the six quantities a, b, c, d, A, B as basic quantities, and assume a Gaussian distribution for some simple functions of them.
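The parametrization of equation 126 is easy to assemble numerically. In the following sketch the function names and the sample values are assumptions, and the triangle condition is implemented with strict inequalities (excluding degenerate triangles), which is an interpretation of the constraint rather than something stated in the text.

```python
import numpy as np

def triangle_exists(A, B):
    """True if sides A, B, 1 satisfy the (non-degenerate) triangle inequalities."""
    return A + B > 1 and A + 1 > B and B + 1 > A

def metric_from_parameters(a, b, c, d, A, B):
    """Assemble the zero-diagonal contravariant metric of equation 126."""
    if min(a, b, c, d) <= 0 or not triangle_exists(A, B):
        raise ValueError("parameters violate the constraints stated after equation 126")
    D = np.diag([a, b, c, d])
    M = np.array([[0, A, B, 1], [A, 0, 1, B], [B, 1, 0, A], [1, B, A, 0]], dtype=float)
    return D @ M @ D

g = metric_from_parameters(1.0, 2.0, 0.5, 1.5, A=0.8, B=0.7)
```

Only six numbers are needed per space-time point, which is the reduction in degrees of freedom that makes this parametrization attractive as a set of basic quantities.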

We do not yet explore this possibility. Also, it is very likely that the basic variable to be used in the optimization problem is not the metric $g^{\alpha\beta}$ , but the logarithmic metric. This point is, for the time being, not examined.

We do not try to be more specific at this point; we simply assume that some covariance function

Cαβµν(τ1, τ2, τ3, τ4,σ1,σ2,σ3,σ4) (127)

is chosen. The inverse W = C−1 of the covariance operator (a distribution) has the kernel

Wαβµν(τ1, τ2, τ3, τ4;σ1,σ2,σ3,σ4) . (128)

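By definition (formally),

$$
\begin{aligned}
\int dv(\rho^1,\rho^2,\rho^3,\rho^4)\;
& W_{\alpha\beta\rho\sigma}(\tau^1,\tau^2,\tau^3,\tau^4;\rho^1,\rho^2,\rho^3,\rho^4) \\
\times\; & C^{\rho\sigma\mu\nu}(\rho^1,\rho^2,\rho^3,\rho^4;\sigma^1,\sigma^2,\sigma^3,\sigma^4) \\
=\; & \delta^{\mu}_{\alpha}\,\delta^{\nu}_{\beta}\;
\delta(\tau^1-\sigma^1)\,\delta(\tau^2-\sigma^2)\,\delta(\tau^3-\sigma^3)\,\delta(\tau^4-\sigma^4) ,
\end{aligned}
\tag{129}
$$

where

$$
dv(\rho^1,\rho^2,\rho^3,\rho^4) =
\sqrt{-\det g_{\rm prior}(\rho^1,\rho^2,\rho^3,\rho^4)}\;
d\rho^1\, d\rho^2\, d\rho^3\, d\rho^4 .
\tag{130}
$$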

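The operators C(g) and W(g), being symmetric and positive definite, define a bijection between G, the space of metric field perturbations, and its dual, G∗. We shall write

$$
\widehat{\delta g} = W\,\delta g \; ; \qquad \delta g = C\,\widehat{\delta g} ,
\tag{131}
$$

where $\widehat{\delta g}$ denotes the element of G∗ associated with the perturbation $\delta g$. Explicitly,

$$
\widehat{\delta g}_{\alpha\beta}(\tau^1,\tau^2,\tau^3,\tau^4) =
\int dv(\sigma^1,\sigma^2,\sigma^3,\sigma^4)\;
W_{\alpha\beta\mu\nu}(\tau^1,\tau^2,\tau^3,\tau^4;\sigma^1,\sigma^2,\sigma^3,\sigma^4)\;
\delta g^{\mu\nu}(\sigma^1,\sigma^2,\sigma^3,\sigma^4)
\tag{132}
$$

and

$$
\delta g^{\alpha\beta}(\tau^1,\tau^2,\tau^3,\tau^4) =
\int dv(\sigma^1,\sigma^2,\sigma^3,\sigma^4)\;
C^{\alpha\beta\mu\nu}(\tau^1,\tau^2,\tau^3,\tau^4;\sigma^1,\sigma^2,\sigma^3,\sigma^4)\;
\widehat{\delta g}_{\mu\nu}(\sigma^1,\sigma^2,\sigma^3,\sigma^4) .
\tag{133}
$$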


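The duality product of a dual metric field perturbation $\widehat{\delta g}$ by a metric field perturbation $\delta\gamma$ is defined as

$$
\langle\, \widehat{\delta g} \,,\, \delta\gamma \,\rangle =
\int dv(\tau^1,\tau^2,\tau^3,\tau^4)\;
\widehat{\delta g}_{\alpha\beta}(\tau^1,\tau^2,\tau^3,\tau^4)\;
\delta\gamma^{\alpha\beta}(\tau^1,\tau^2,\tau^3,\tau^4) ,
\tag{134}
$$

the scalar product of two metric field perturbations is

$$
\delta g_1 \cdot \delta g_2 = \langle\, W\,\delta g_1 \,,\, \delta g_2 \,\rangle ,
\tag{135}
$$

and the norm of a metric field perturbation is

$$
\|\, \delta g \,\|_{C_g} = \sqrt{\delta g \cdot \delta g} .
\tag{136}
$$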

Denoting by $g_{\rm prior}$ the a priori metric and by $g$ our estimation of the actual metric field, we are later going to impose that the squared norm

$$
2\, S_g(g) = \|\, g - g_{\rm prior} \,\|^2_{C_g}
\tag{137}
$$

is small.

6.4 Kalman Filter

Assume that some linear model allows us to make a preliminary prediction of the state of the system at time k in terms of the state of the system at time k − 1 (we retain here the notation of Grewal et al. (2001)):

$$
x_k^- = \Phi_k\, x_{k-1}^+ .
\tag{138}
$$

If the uncertainties we had on $x_{k-1}^+$ are represented by the covariance matrix $P_{k-1}^+$ and if the prediction by the linear model $\Phi_k$ has uncertainties described by the covariance matrix $Q_{k-1}$, the uncertainty we have on $x_k^-$ is represented by the covariance matrix

$$
P_k^- = \Phi_k\, P_{k-1}^+\, \Phi_k^t + Q_{k-1} .
\tag{139}
$$

So we have the prior value $x_k^-$ with uncertainties described by the prior covariance matrix $P_k^-$. To pass from the preliminary estimate $x_k^-$ to the actual estimate $x_k^+$ we now use some observed data $z_k$ that are assumed to be related to $x_k^+$ via a linear relation $z_k \approx H_k\, x_k^+$, with uncertainties described by the covariance matrix $R_k$. The standard theory of linear least-squares then provides the posterior estimate as

$$
x_k^+ = x_k^- + P_k^- H_k^t \left( H_k P_k^- H_k^t + R_k \right)^{-1} \left( z_k - H_k\, x_k^- \right) ,
\tag{140}
$$

that has uncertainties represented by the posterior covariance matrix

$$
P_k^+ = P_k^- - P_k^- H_k^t \left( H_k P_k^- H_k^t + R_k \right)^{-1} H_k P_k^- .
\tag{141}
$$
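One step of the recursion in equations 138–141 can be sketched in a few lines. This is an illustrative sketch using only the Python standard library, with matrices stored as nested lists and vectors as single-column matrices; all function names are hypothetical. The covariance prediction is written in the standard form Φ P Φᵗ + Q, and the small Gauss-Jordan inverse is only meant for low-dimensional cases.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(A)
    M = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-300:
            raise ValueError("singular matrix")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def kalman_step(x_post, P_post, Phi, Q, z, H, R):
    """Return (x_k^+, P_k^+) from (x_{k-1}^+, P_{k-1}^+) and the data z_k."""
    # Prediction of the state and of its covariance.
    x_prior = matmul(Phi, x_post)
    P_prior = add(matmul(matmul(Phi, P_post), transpose(Phi)), Q)
    # Gain P^- H^t (H P^- H^t + R)^{-1}, common to both update equations.
    Ht = transpose(H)
    S = add(matmul(matmul(H, P_prior), Ht), R)
    K = matmul(matmul(P_prior, Ht), inverse(S))
    # Least-squares update of the state and of its covariance.
    x_new = add(x_prior, matmul(K, sub(z, matmul(H, x_prior))))
    P_new = sub(P_prior, matmul(matmul(K, H), P_prior))
    return x_new, P_new


# Scalar illustration: unit prior variance, unit data variance, datum 2.
x1, P1 = kalman_step([[0.0]], [[1.0]], [[1.0]], [[0.0]],
                     [[2.0]], [[1.0]], [[1.0]])
```

In the scalar illustration the prior and the datum are equally uncertain, so the posterior estimate lies halfway between them and the posterior variance is half the prior variance.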

Then, if at each time step we input $\Phi_k$, $Q_{k-1}$, $z_k$, $H_k$, and $R_k$, equations 138–141 allow us to keep a continuously updated estimate of the state of the system, $x_k^+$, together with an estimate of the uncertainties, $P_k^+$.

The reader may recognize that equations 140–141 are identical to the standard equations of linear least-squares theory (see equations 3.37 and 3.38 in Tarantola (2004)). The matrix

$$
K_k = P_k^- H_k^t \left( H_k P_k^- H_k^t + R_k \right)^{-1} ,
\tag{142}
$$

that appears in the two equations 140–141 is called the ‘Kalman gain matrix’.

Example. As a simple example, consider, in non-relativistic physics, the trajectory of a mass that has been equipped with some sensors. We can choose to represent the state of the system at any time by a 9-dimensional vector $x$ that contains the three values of the position, the three values of the velocity, and the three values of the acceleration. Assume that, as a result of the previous iteration, at some moment we have the estimate $x_{k-1}^+$ with uncertainties $P_{k-1}^+$. Equation 138 may simply correspond to using the velocity to extrapolate the position one step in time, using the acceleration to extrapolate the velocity, and keeping the acceleration unchanged. This perfectly characterizes the matrix $\Phi_k$. Equation 139 is then used to update the estimate of the uncertainty, where we can take for $Q_{k-1}$ something as simple as a zero matrix except for the three diagonal elements associated with the acceleration, where a small variance takes into account that our extrapolation of the acceleration is uncertain. The data $z_k$ may consist of the output of some sensors, like accelerometers or GPS receivers. The relation $z_k = H_k\, x_k^-$ would correspond to the theoretical calculation of the data $z_k$ given the state $x_k^-$. This is not a linear relation, and the theory should be developed to account for this directly, but if the time steps are small enough, we can always linearize the theory, which then defines the matrix $H_k$. Denoting now by $z_k$ the actual output of the sensors, and by $R_k$ the experimental uncertainties, we use equation 140 to obtain our second estimate of the state of the system, with equation 141 providing the associated uncertainties.
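The prediction part of this example can be written out explicitly. The sketch below builds the 9×9 transition matrix and the process-noise matrix described in the text and applies the prediction of the state and of its covariance, the latter in the standard form Φ P Φᵗ + Q. The ordering of the state vector (position, velocity, acceleration), the time step, the variances, and all function names are assumptions made for illustration.

```python
def transition_matrix(dt):
    """9x9 Phi for the state [position(3), velocity(3), acceleration(3)]."""
    Phi = [[1.0 if i == j else 0.0 for j in range(9)] for i in range(9)]
    for axis in range(3):
        Phi[axis][3 + axis] = dt      # position extrapolated with velocity
        Phi[3 + axis][6 + axis] = dt  # velocity extrapolated with acceleration
    return Phi                        # acceleration rows stay the identity


def process_noise(acceleration_variance):
    """Zero matrix except for the three acceleration diagonal elements."""
    Q = [[0.0] * 9 for _ in range(9)]
    for axis in range(3):
        Q[6 + axis][6 + axis] = acceleration_variance
    return Q


def predict(x_post, P_post, Phi, Q):
    """Prediction step: x^- = Phi x^+ and P^- = Phi P^+ Phi^t + Q."""
    n = len(x_post)
    x_prior = [sum(Phi[i][j] * x_post[j] for j in range(n)) for i in range(n)]
    PhiP = [[sum(Phi[i][k] * P_post[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]
    P_prior = [[sum(PhiP[i][k] * Phi[j][k] for k in range(n)) + Q[i][j]
                for j in range(n)] for i in range(n)]
    return x_prior, P_prior


# Illustrative numbers only: time step of 2 units, mass initially at the origin.
Phi = transition_matrix(2.0)
Q = process_noise(1e-4)
x_post = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 0.5, 0.0, 0.0]
P_post = [[0.01 if i == j else 0.0 for j in range(9)] for i in range(9)]
x_prior, P_prior = predict(x_post, P_post, Phi, Q)
```

Following the text, the position is extrapolated with the velocity only; a second-order term in the acceleration could be added to the position rows of the transition matrix, but that would be a different model from the one described here. The prediction inflates the position variance by the velocity variance times the squared time step, which is how uncertainty accumulates between measurements.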
