Interpretation of Quantum Mechanics -...
Transcript of Interpretation of Quantum Mechanics -...
1
Interpretation of Quantum Mechanics
using a density matrix formalism. by Marcel Nooijen
Chemistry Department
Princeton University
One arrives at very implausible theoretical conceptions, if one attempts to maintain the
thesis that the statistical quantum theory is in principle capable of producing a complete
description of an individual physical system. On the other hand, those difficulties of
theoretical interpretation disappear, if one views the quantum mechanical description as
the description of ensembles of systems. Albert Einstein (1949).
Quantum mechanics has given rise to a lot of discussion since its conception, as some
things are so weird. I will try to clarify some of the issues here. One thing is very
important from the outset. Quantum mechanics is a statistical theory. It tells us the
various possible outcomes of experiments and the corresponding probablilities if we
would do a large number of identical experiments on individual quantum systems.
Identical experiments are necessarily idealizations, but this is not much of a restriction in
practice, as many variables (e.g. what's is going on in Sidney or on the next bench in the
lab) are irrelevant. If we take this view that quantum mechanics provides a statistical
description of a large number of identical experiments, but cannot say much (unless
probabilities are unity or zero) about the outcome of a single experiment, a lot of the
difficulties dissappear. In this context taking a spectrum of a sample in the gas phase
appears to be a single experiment but in our view it amounts to doing measurements on
many individual quantum systems. The systems are not all identical but this is the same
type of fluctuation that occurs in classical statistical descriptions. At first sight the
situation may not appear very different from the description provided by classical
statistical mechanics. In that case however, we have a description (classical mechanics)
that provides a complete description of the world, which is far too complex, however, to
2
be of use. Think of the microscopic description of a gas of classical particles for example.
In addition there are issues with classical chaotic behavior that make the premise of an
"in principle complete classical description" a little shaky. In quantum mechanics we do
have the statistics, but there appears to be no underlying more complete level of
description. For instance in quantum mechanics it is in principle impossible to say when a
particular nucleus will decay, or if an electron will make a left or a right turn in a Stern-
Gerlach experiment. The information we can obtain is inherently statistical, and there is
no indication at present that we will be able to do more than give a statistical description
of experiments. It also appears to be sufficient in practice. In particular there is no
contradiction in that quantum mechanics can be used to derive classical mechanics, as at
a microscopic level there are always many nearly identical 'experiments' leading to
deterministic probability distributions that can be used to formulate classical mechanics.
A detailed analysis of this is rather involved and certainly beyond the scope of these
notes. Let me emphasize however that statistical arguments based on a larger ensemble of
some sort will always need to be invoked in Quantum Mechanical explanations.
These notes consist of four parts. The first part gives a brief discussion of the connection
between theory and experiment and gives a brief description of the conventional
formulation of quantum mechanics, but phrased in terms of density matrices and
projectors, rather than wave functions. The wave function formulation itself can be found
in many text books and I refer you to the excellent textbook by Cohen-Tannoudji, Diu,
and Laloë. In the second part I will discuss a childishly simple example of a classical
measurement and indicate which aspects are different in quantum mechanics. They may
not make sense - which is the point I wish to drive home - but they do describe the factual
experimental situation. This example includes in essence a discussion of the uncertainty
relations, the so-called non-separability problem, Bell's inequalities and Aspects
experiments in the eighties and my reading of it. In the third part we will go beyond the
conventional presentation of QM and discuss in more detail the act of measurement and
an elegant formulation in terms of reduced density operators. This treatment is essential
to describe many common experiments as the conventional formulation of quantum
mechanics is too restricted. In particular we need to be able to describe a sequence of
measurements starting from a single ensemble. In the conventional treatment we always
3
continue with only the branch that yielded a particular outcome to the first experiment,
and so forth. This is seldom the case in practice. In the fourth part I will discuss that the
essence of measurement: decoherence of the wave packet and evolution into a statistical
mixture does not require a macroscopic apparatus. This provides a basis to derive
statistical mechanics from quantum mechanics using collisons of microscopic systems to
create density matrices that are diagonal in the energy representation. This is very much
the same as the classical treatment and Boltzmann's H-theorem. In a way these notes
extend far beyond interpretation issues, but that is how they started to evolve.
I. Elementary introduction.
Every description of an experiment on a microscopic system (even single molecule
spectroscopy) is essentially statistical. Typically one performs an experiment on a sample
consisting of similar microscopic systems. In an idealized theoretical description we view
such an experiment as equivalent to performing a sequence of measurements on each
(now supposedly identical) microscopic system in isolation. This generates a definite
result for each individual experiment, and the statistics of the distribution of results is
described by quantum mechanics. This distribution of results may not quite agree with
the experimental result for a variety of reasons. The actual experimental sample will
contain a distribution of different microsystems, it will involve some interaction between
different microsystems or the environment, and so forth. Some of these effects can be
taken into account by using statistical mechanics. This is beyond were I want to go,
however. Let us simply assume that the above quantum description would agree very
well with the experimental result.
You may have noticed that the above is quite an abstraction from the actual experimental
situation. Theory describes the outcome of experiment, but it does not even try to
describe the actual physical situation or reality. Taking this minimalistic point of view on
the theoretical descriptions will take the angle out of many issues on interpretation of
quantum mechanics. The agreement with experiment is what we can verify. All the rest is
speculation. I do not say that the rest is not important for science. As scientists we need
some visualization of reality in order to think creatively about new possibilities. And it is
4
even perfectly valid to teach these views as it helps to progress science. However, these
views may be personal and erroneous. We may argue a lot about these things, but at the
end of the day who is to say. Therefore I will restrict the discussion as much as possible
to things we do 'know'. Or, according to Popper: "a statement can only be scientific if it is
possible, in principle, to do an experiment to prove it wrong".
---- Discussion of postulates ----- See handout Cohen-Tannoudji -----
Discussion of postulates using density matrices and projectors.
The formulation of quantum mechanics can be phrased a little more compactly and
elegantly using density matrices and projectors. This avoids distinguishing between cases
where eigenvalues are degenerate and also the overall phase of the wave function is
irrelevant. It will pave the way for subsequent discussion.
Let us denote by a t i t t ni i, , , ,≡ = 1 a set of orthonormal eigenvectors of the operator A
that correspond to the same ni -fold degenerate eigenvalue ai that hence span the
corresponding subspace. Then we can define the orthogonal projector on the eigenspace
by ( ) , ,P a i t i tit
ni
==∑
1
, with matrix representation P a p i t i t qpq it
ni
( ) , ,==∑
1
. An
orthogonal projector has the important properties ,†P P P P= =2 (idempotency) as you
can verify for yourself. Moreover the only eigenvalues of ( )P ai are 0 or 1. Any vector
(completely) within the subspace corresponds to eigenvalue 1, while vectors orthogonal
to the subspace have eigenvalue 0. ( )P ai acts as the identity operator within the
subspace. The operator ( )P ai is independent of the precise definition of the eigenvectors
i t, . Another orthonormal set of vectors i x, that span the subspace would do just as
well: ( ) , ,P a i x i xix
ni
==∑
1
would give the same matrix repesentation P apq i( ) (verify).
Finally ( ) ( ) ,P a P a a ai j i j= ≠0 , because the eigenvectors corresponding to different
5
eigenvalues are orthogonal. The operator A can be represented as ( )A a P aii
i= ∑ , from
which follows immediately
( ) ( ) ( )
( ) ( ) ( )
A a P a a P a a P a
f A f a P a
ii
i jj
j ii
i
i ii
2 2= =
=
∑ ∑ ∑
∑
The probability to measure an eigenvalue ai in a state Ψ is given by
Ψ Ψ Ψ Ψi t i t P at
n
i
i
, , ( )=∑ =
1
. Moreover the (unnormalized) state after the
measurement is given by ( ) , ,P a i t i tit
ni
Ψ Ψ==∑
1
. In short projectors are a convenient
way to deal with degenerate states. Only the subspaces are relevant and this is precisely
the focus of the projectors.
We can go one step further and associate a projector with the state Ψ itself. This is
called the density operator and is denoted D = Ψ Ψ . In this case the density operator
corresponds to a pure state and is a projector. Moreover
Tr D p p p pp p
( ) = = = =∑ ∑Ψ Ψ Ψ Ψ Ψ Ψ 1
The density D completely characterizes the system and is independent of the overall
phase of Ψ . The probability to measure ai on a system described by D is given by
p a Tr P a D p i t i t p
p p i t i t i t i t
i ip t
n
p t
n
t
n
i
i i
( ) ( ( ) ) , ,
, , , ,
= = =
= =
∑ ∑
∑ ∑ ∑
=
= =
1
1 1
Ψ Ψ
Ψ Ψ Ψ Ψ
which agrees with the postulates. The system after measurement of eigenvalue ai
(without normalization) would be given by
( ) ( ) , , , ,,
P a DP a i t i t i s i si it s
= ∑ Ψ Ψ
The density as given above is normalized to
6
Tr P a DP a Tr P a P a D Tr P a D p ai i i i i i( ( ) ( )) ( ( ) ( ) ) ( ( ) ) ( )= = =
Later on we will see that the complete ensemble after the measurement of A (at time ta )
can be represented as ( ) ( ) ( )D t P a DP aa ii
i= ∑ with normalization p aii
( ) =∑ 1. This
density is not idempotent in general and is not a projector. It would not correspond to a
pure state but to a mixture ( ) ( )D t p aa ii
i i= ∑ Ψ Ψ . We will discuss this later on.
The time dependence of the density operator (in general, pure state or mixture) is
given by − ∂∂
=i Dt
D H, . This is discussed in section of Cohen-Tannoudji that
accompany these notes, and I have little to add. We will use this in the excersises.
Let us finally consider two hermitean operators A and B having the respective
eigenspace projectors ( )P ai and ( )P bj . If A and B commute they have a complete set of
common eigenvectors. It can be shown that in this case the projectors on the respective
eigenspaces commute ( ), ( ) ,P a P b i ji j = ∀0 . The proof runs as follows. Let
( ); ( )A a P a B b P bii
i jj
j= =∑ ∑
and
, ( ), ( ),
A B a b P a P bi j i ji j
= =∑ 0
Each individual term in the sum should equal zero as the operator parts are independent
(projectors on different subspaces). Therefore either ai = 0, bj = 0 or the individual
projectors commute. The special cases require some extra work. Since all projectors not
corresponding to zero eigenvalues necessarily commute, we know that
( ), ( ( )), ( ), ( )P a B P a B P a b P bia
j jji≠
∑ ∑LNM
OQP= = − = = − =
LNM
OQP=
00 00 1 0 0 0
Hence the "null-projector" for A commutes with all non-null-projectors for B and
therefore also with ( ( )) ( )1 00
− = =≠∑P b P bj jbj
, which completes the proof. This result is
completely equivalent to the statement that A and B have a complete set of common
7
eigenfunctions. We will use this result later on. The projectors ( ) ( )P a P bi j would project
on the subspace spanned by those eigenvectors a b ti j, , that all have the same
eigenvalues ai and bj .
II. Measurement of non-commuting observables.
The most famous example of two non-commuting observables are position and
momentum. The properties of these operators are a little complicated because their
spectra are continous. It is easier to consider the case of measuring angular momentum or
even better the spin of an S=1/2 system. The three cartesian components of S do not
commute and we have the commutation relations ,S S i Sx y z= . However we can very
well measure any of these individual quantities and we can also perform a sequence of
measurements and analyse the results. In the absence of magnetic interactions in the
hamiltonian the resulting state vectors after the measurement are independent of time,
which is another simplification. In fact to discuss the results of quantum mechanics let us
not use any mathematics at all. Let us analyse the real content first and then venture into
mathematical formulations.
As our ensemble we take a class of schoolkids. Each of these kids has a lunchpacket that
consists of three items. They all have a turkey or roastbeef sandwich (t or r ), a coke or a
sprite to drink (c or s ) and an apple or an orange (a or o ) for desert. Our measurement
consists of asking a kid what is in the lunchbag, and getting statistics on the ensemble
(the class). However, we can ask only one question at a time. For example: "everybody
with a turkey sandwich stand to the right". But not: "All that have an orange and a coke
please stand on the left". That is asking two questions at once, and in the anology with the
spin system reflects the impossibility to simultaneously measure non-commuting
observables. In fact any 'measurement' we do should obey the laws of quantum
mechanics. Our goal is to characterize the distribution of lunchbags (e.g how many
tca rca tsa rco, , , ,...) etc, are there. Can we do this? If things behaved classically, easily.
But not in the quantum world. Let us try. We would first ask all kids who has turkey and
8
who has roast beef, and partition them into two groups. Then we would ask the turkey
group who has a coke and set them apart. Fine. we already have an ensemble that has
both a turkey and a coke, right? Let us check, and ask again. Who has a coke? Everybody
has a coke. Now, who has a turkey sandwich? Oops. This doesn't work. Only about half
of them has turkey. Asking the coke question destroyed the information we had on the
turkey. In the quantum world it is impossible to isolate a group where everybody has both
a coke and turkey. Asking the question changes the ensemble. This is fairly easy to
understand mathematically, describing an ensemble as a vector in Hilbert space, that
rotates under measurement, but it certainly does not make much sense when asking about
lunch bags.
The above is a representation in as simple a language as possible of some puzzling
properties of quantum mechanics. The essence is that according to quantum mechanics
(sometimes) we cannot create an ensemble that for sure will yield definite values for two
non-commuting observables. This is the content of the Heisenberg uncertainty principle.
The precise formulation would be
∆ ∆A B A B≥12
, .
For a proof and discussion see Cohen-Tannoudhji pages 286-289.
It is often stated as "one cannot measure the precise value of A and B simultaneously".
This is a very incomplete statement of the principle and it has led to all kinds of
ingenious constructions to violate the principle. It is much easier and complete to
interpret the principle in a different way. There is no problem to measure A or B , and for
each measurement (either A or B , but not both) on an individual system you get definite
results. However, for certain pairs of eigenvalues of A and B , (a bi j, ) say, it is in
principle impossible (according to QM) to prepare an ensemble such that all of the
measurements on this ensemble yield precisely the result ai if you measure A and bj if
you would measure B . In contrast there is no problem in preparing an ensemble such that
every member would yield ai if you measure A. You might put in some effort to
appreciate the precise translation of the mathematical formulation of the uncertainty
9
principle into words. It is a little easier if the commutator ,A B is a constant, since then
no ensemble will yield the same value for A and B for all elements in the ensemble. So
necessarily there is a spread, and the minimum spread depends on the commutator. In the
general formulation the mimimum spread depends on an expectation value and hence on
the state under consideration. Note that quantum mechanics actually does not preclude
that individual systems have definite values for all observables. It does say that within the
realm of quantum mechanics you cannot create an ensemble to prove it. Also note that it
is impossible to discuss the uncertainty principle using a single system. It is perfectly
possible to have an experiment where you measure A then B then A then B and find
nothing weird: measurement of A yields ai twice, while the measurement of B yields bj
in both cases. This is quite a possible outcome of this experiment. But beforehand you
cannot be certain that it will happen that way. It is impossible to create an ensemble
where all elements necessarily behave in this fashion. Of course you might be lucky and
by chance, using small enough ensembles one can easily violate the Heisenberg
uncertainty principle. That is all part of statistics.
Let us discuss another hair raising situation. A long standing controversy is the so-called
Einstein-Podolski-Rosen Paradox (EPR). EPR sought for the properties of individual
systems obeying the laws of quantum mechanics. In essence all parties can agree on the
fact that a measurement can change the system. So in the example above if I ask Mary if
she has a coke, afterwards she might no longer have the roastbeef sandwich that she
started out with. However, the issue at hand is different. EPR thought it would be
possible that each lunchbag has a definite content before measurement, and we are simply
looking what is in it. By looking at one piece of information we might, in the convoluted
act of measurement, change another piece of information in ways that are hard to predict.
This would then be the reason that one cannot prepare well specified ensembles, which
are themselves prepared by measurements. It might be that we simply have too little
control over the act of measurement (at present ?). Quantum afficionados tend to think
differently about what happens during a measurement on an individual system. Their idea
is that by measuring you force the microsystem to take a position. It is like flipping a coin
10
at the moment of measurement. "Choose my dear electron! Up or Down?" By the act of
measurement you force the system into an eigenstate of the corresponding observable,
and it does so with probabilities predicted by quantum mechanics. The precise outcome
of an individual experiment is unpredictable in principle. If one reads initial accounts of
the Heisenberg uncertainty principle, they very much reflect the viewpoint of EPR.
Heisenberg himself for example discusses how measuring position necessarily changes
the momentum of an electron. The later accepted viewpoint according to the so-called
Kopenhagen interpretation is rather convoluted in that they use classical mechanics to
describe the measuring aparatus and so there is a mysterious connection between the
quantum and classical system. However, I think that the above stated position of the
quantum afficionado reflects the attitude of many scientists in the field. It was my
position until I wrote these notes.
Let us adjust our lunchbag parabel a little so that we can describe the EPR line of thought
in trivial terms. What if we could gain information about what is in a lunchbag without
asking a question? Let us set up the experiment in a tricky way. Say we know that the
lunchbags are handed out in complementary pairs. Each pair contains both turkey and
roastbeef, an apple and an orange, a coke and a sprite. So a pair of lunchbags might
consist of tca rso& or rso tca& and so forth. We look when the lunchbags are handed out
and keep track of the corresponding pairing of the kids. The actual quantum experiment
consists for example of two spin 1/2 atoms in an overall S=0 state. You will discuss it
yourself later on, working through a set of questions... Back to the kids. Let's say, Lois
and Clark form a pair. Now we take Clark out on the playground and ask him about his
sandwich. "Turkey he says. I would like salami!" Lois doesn't even know we asked, but
we now know that she has roastbeef without asking her (or perturbing her lunchbag). If
we would ask her she would say roastbeef 100% of the time. However, we don't need to
ask her about her sandwich as we know already. Instead we ask Lois about her drink. "I
have a coke she says". After we ask the coke question she might no longer have
roastbeef, but if we assume she has something definite in her lunchbag, before the coke
question it was most definitely roastbeef and a coke. So this is a smart measurement that
shows it makes perfect sense that every lunchbag has something definite in it and by
11
measuring we simply find out what it is. Only, by asking one specific question we might
change the content of the lunchbag in other respects, and in unpredictable ways. At the
time EPR wrote their paper this interpretation was in no conflict with any piece of data
whatsoever. It was just an interpretation that should have appealed as something far more
rational than flipping a coin at the time of measurement. If we take the alternative
quantum interpretation about what actually happens, the EPR experiment is seen to take
on all of its weirdness. Asking Clark what is in his lunchpacket forces him to take a
position. Clark flips a coin to make a decision. "Turkey". If we now would ask Lois about
her sandwich she will say roastbeef for sure. So she flips her coin too, but it always yields
the same result. If we wouldn't have asked Clark it would give a fifty-fifty result, but now
it yields a 100%. Now Lois nor her coin knows anything about our asking Clark. To put it
in the extreme: flipping a coin in Tokyo determines the outcome of the flipping of the
coin in New York. That doesn't make sense. The EPR interpretation is far more
reasonable: if we assume there is something definite in the lunch bag, there is nothing
strange about us knowing what is in Lois's lunch bag if we know what Clark has, given
they form a perfect pair.
However, EPR did something more. They claimed that physical theories should describe
'reality', which means that quantum mechanics should allow for ensembles of completely
specified lunchbags. This it did not, and therefore the theory was not quite up to par.
Quantum theory was incomplete. In order to describe ensembles of well specified
, ,S S Sx y z the structure of the theory needs to be changed completely. If we use the
concept of Hilbert space, operators and eigenvalues it can not accommodate EPR's
reality. Quantum theory was too successfull to discard it, just because of a difference in
interpretation that had apparently no measurable consequences. Glad we didn't. In my
opinion the more reasonable thing to do would have been to adopt the EPR interpretation
but live with the fact that quantum mechanics only describes ensembles that can actually
be prepared by measurements. Measurements unfortunately tend to perturb the system
such that no fully specified ensembles can be prepared. Or perhaps they could and one
might make further advances, necessarily leading to a new theory. In essence this would
mean to say EPR might be right, but quantum mechanics seemed to do the job in practice.
12
It would have kept the search for alternative theories alive but they would necessarily
have the same statistics as quantum mechanics, which has served us very well.
This was the situation until John Bell came around. He showed that the EPR
interpretation might lead to different results from the usual quantum theory for some
experiments. And he used the EPR experiment to show it. This is how it works in terms
of lunchbags. If EPR's postion is right then in fact I can construct what was in Lois's
lunchpacket from the pairing experiment. From Clark's answer I know she had roastbeef,
and by our question we also know she has a coke. We are simply assuming that the
question to Clark could not possibly have affected Lois's lunch box. There is no
unpredictable act of measurement that has a range from New York to Tokyo. Let us
assume therefore for the sake of argument that EPR are right. Every lunchbox has a
definite content and by doing the pairing experiment I can determine two items in a
luchbox. Now we take our whole class and do three types of experiment starting from
identical ensembles in each experiment. In the first experiment we use the pairing
experiment to determine if somebody has a turkey sandwich and a coke. By assumption
she would then have either an orange or an apple as the third item. If we do this for the
whole first ensemble we can write n t c n t c o n t c a[ , ] [ , , ] [ , , ]= +
where n t c[ , ] denotes the number of kids in the enesemble that have both a turkey and a
coke, and so forth. In the next group we determine the number that has a sprite and an
orange, in the third group turkey and orange. In total we would then have the following
relations, assuming the minimal EPR conditions
n t c n t c o n t c an s o n t s o n r s on t o n t c o n t s o
[ , ] [ , , ] [ , , ][ , ] [ , , ] [ , , ][ , ] [ , , ] [ , , ]
= += += +
From this we can derive the so-called Bell inequality: n t c n s o n t c o n t s o n t o[ , ] [ , ] [ , , ] [ , , ] [ , ]+ ≥ + =
This is an inequality that one can test in an actual experiment, as one can make a spin
zero pair, let it fly apart and measure the spin in different directions for the particle in
Tokyo and the particle in New York. We will discuss the full details of the precise
13
quantum treatment later on. But the outcome is that the usual treatment of quantum
mechanics is in conflict with the above analysis based on the assumptions of EPR.
Quantum mechanics violates Bell's inequalities. At the time of the EPR paper (1935)
people couln't really say if EPR was right or the standard Kopenhagen interpretation was
right. Neither did contradict any experiment. It appeared simply a matter of
interpretation. With Bell however, there was a testable hypothesis. Experiments were
done in the seventies, and the experiments by Alain Aspect are perhaps best known
(though not the first). The technical details and fine print are rather involved, but the
conclusion was that the traditional laws of quantum mechanics are correct. So one cannot
assume that individual particles actually have definite values for , ,S S Sx y z and we are
simply determining what they are, although perturbing these values in the process.
Does this mean that we have to accept the alternative interpretation? Flipping a coin in
Tokyo determines the outcome of the flipping of a coin in New York? Not in my opinion.
This 'making a choice during the measurement' aspect appears to be an act of human
imagination. Us trying to understand what we cannot grasp. I think it is better to take a
very mundane position. Quantum mechanics describes the statistical outcomes of
complete experiments. In doing the measurement in Tokyo I am preparing a specific
ensemble. The subsequent measurement in New York is described using this new
ensemble. From the perspective of quantum mechanics the pairing experiment is no
different from first asking Lois if she has roastbeef and then asking if she has a coke. A
measurement is a measurement, and if a measurement in Tokyo tells you something
about the situation in New York, you have to adjust the ensemble accordingly. Of course
this is nothing more than using the laws of quantum mechanics which are very definite
for this type of experiment. What is hard to understand is how there can be such a strong
correlation between two distant particles, which cannot be assumed to individually have
definite properties, while as a pair they do. Quantum mechanics gives us the
mathematical prescription but it does go against common sense, and this is illustrated
very vividly by the flipping the coin at the time of measurement picture. This
phenomenon is called entanglement in the literature. I have a set of excersises that has
you work out the quantum mechanics of Bell's experiment.
14
III. Measurements.
Below I will discuss a simple model of measurement, treating everything quantum
mechanically. Many things will be seen to fall into place, as long as the quantum
mechanics is treated only as a statistical theory and NOT as a description of individual
quantum systems. This includes interference effects that can be washed out by
measurements, 'reduction of the wave packet', Schrödinger's cat paradox, and so forth.
Let us start from the basic postulates of quantum mechanics and build a model for
measurements that is consistent with these postulates. Let us consider a two-level
quantum system described by a state Ψ and an observable B with two non-degenerate
eigenfunctions b b1 2, having eigenvalues b1 and b2 respectively. The observable B is
measured by a perfect measuring apparatus that can be in one of three possible states
B B B0 1 2, , , where B0 indicates that nothing has been measured, while B1 and B2
indicate two different values for the pointer on the apparatus to be associated with the
two eigenvalues of the observable B . We want to describe everything quantum
mechanically, so at time t0 (long before measurement) the overall state of the world W
is given by
W t B( )0 0= Ψ
Let us be perfectly clear about the notation. The Hilbert space that contains the state of
the world is a direct product space
H H HW B Q= ⊗
The fact that W t( )0 is a product function (rather than some linear combination) indicates
that the system is initially non-interacting. The overall Hamiltonian is given by
int intH H H H H H HW B Q B Q B Q= ⊗ + ⊗ + = ⊕ ⊕1 1 .
The interaction part of the Hamiltonian that is responsible for the actual measurement is
supposed to be short range (think of a particle flying through a detector as in a Stern-
Gerlach experiment). The system evolves in time and the Hamiltonian of interaction
15
between quantum system and measuring apparatus acts in such a fashion that shortly after
the measurement (time tb say) the system is described by
W t B b b B b bb( ) = +1 1 1 2 2 2Ψ Ψ
Beyond this time the effect of intH is negligible again, and both quantum system and
measuring apparatus evolve independently. We are making some simplications here.
There is no time evolution under the influence of the zeroth order Hamiltonian H HB Q+ ,
implying we are using the interaction picture (to be discussed later). We also completely
leave open the act of measurement or a discussion of intH as it does not enter the
postulates. Our aim is to find a model system that is completely described by quantum
mechanics, and which agrees with the postulates. A better understanding of
measurements will demand a more complete description. For the current goals of
interpretation the above is sufficient, however.
The data to be abstracted from the above wave function is as follows: "If we would repeat
this experiment many times the pointer would read B1 with a probability p b b( )1 12
= Ψ
and B2 with a probability p b b( )2 22
= Ψ ". This is in perfect agreement with the
postulates. Note that the wave function of the world is a superposition of macroscopic
states. This is not a problem as this just reflects the statistics of the situation and not a
"true state of affairs" (whatever that means). The paradox of Schrödinger's cat does not
appear, as Schrödinger discussed only a single experiment (one cat). A quantum
treatment and interpretation necessarily involves a large number of identical experiments.
In this case we do have a valid repeatable experiment. Now you may well ask, "why is
this a measurement? It looks like ordinary quantum mechanics of a compound system."
Very true. There is no reduction of the wave function and there appears to be nothing too
special about measurements. There are no quantum jumps, just a special type of time-
evolution, described by the world-Hamiltonian in the world-Hilbert space. There is
something special about the macroscopic states Bi however. They are so different that
the two components of the wavefunction can never show interference effects anymore.
Mathematically B O B1 2 0= for any physical operator O (that consists of one- and two-
particle interactions between elementary particles). This is the essence of the act of
16
measurement: The quantum wave function is decomposed into its eigenvector
components and each of these components becomes correlated with a macroscopic state,
destroying interference. At that point the various components of the wave function start
behaving as a classical ensemble (in particular no interference effects). This means that I
can think of my ensemble of being partitioned into two subsets. All particles in subset 1
have property b1, while the elements of subset 2 have property b2. Everything I can derive
from the wave function of the world would agree with such an interpretation. Finally an
element in the ensemble that has property b1 has the measurement apparatus in state B1
and this means that the pointer on the apparatus refers to this particular value. It is clear
that this discussion is not altogether clearcut or comfortable. The idea of measurement is
very difficult, and in the end we might not be able to say more than that we are linking
different entities together by introducing correlations. Ok, let us not make this into a
philosophy class and discuss what we can learn from the wave function of the world.
Next consider the following situation. Suppose that before we measure B we measure an
observable A. We have to adjust the wave function of the world and the Hilbert space
accordingly to incorporate the new situation. Using analogous notation, the wave
function before any measurement would look like
W t A B( )0 0 0= Ψ
Shortly after measuring A (time tA say) we have the wave function
W t A B a a A B a aA( ) = +1 0 1 1 2 0 2 2Ψ Ψ
while if we subsequently measure B we obtain
W t A B b b a a A B b b a a
A B b b a a A B b b a aB( ) = +
+ +1 1 1 1 1 1 1 2 2 2 1 1
2 1 1 1 2 2 2 2 2 2 2 2
Ψ Ψ
Ψ Ψ
There are now four distinct events with their individual probability amplitudes. They are
most easily collected in a matrix that reflect the pointers on apparata A and B
A A
B b a a b a aB b a a b a a
1 2
1 1 1 1 1 2 2
2 2 1 1 2 2 2
Ψ ΨΨ Ψ
17
The probability to measure b1 irrespective of the value of A is now given by the sum of
the squares in the first row
p b b a a b a a
b a p a b a p a
( )
( ) ( )
1 1 12
12
1 22
22
1 12
1 1 22
2
= +
= +
Ψ Ψ
This may be compared to the probability to measure B1 if we wouldn't measure A first,
which was given by
p b b b a a b a a
b a a b a a b a a b a a
( )
Re( )* *
1 12
1 1 1 1 2 22
1 12
12
1 22
22
1 1 1 1 2 22
= = +
= + +
Ψ Ψ Ψ
Ψ Ψ Ψ Ψ
It is seen that the two results differ by the so-called interference term in the B -only
measurement. Measuring A before B affects the results of the measurement of B unless
the interference term happens to be zero! This is the origin of the statement that
according to quantum mechanics measurements necessarily perturb the system. Please
note that the first result can be interpreted as that a fraction p a( )1 is in the state a1 and
another fraction p a( )2 is in the state a2 . The measurement of A has effectively
partitioned the ensemble into two subensembles. The above has immediate bearing on for
example the two-slit experiment: measuring through which slit the electron goes destroys
the interference pattern. The above discussion indicates that interference is the normal
case, when dealing with waves or wave functions. The source for wonder should be that
the act of measurement can destroy this interference pattern. According to quantum
mechanics it does not depend in any way on the details of the measurement (e.g. the
precise form of the interaction Hamiltonian). We will investigate in a minute what
happens if A and B commute and one might expect that the measurements do not
influence each other then.
Let me mention here another way to calculate the probabilities in the A B+
measurement. The wave function of the world corresponding to the eigenvalue b1 is two-
fold degenerate: A B b1 1 1 and A B b2 1 1 are two orthogonal eigenfunctions. The
18
probability to find the eigenvalue b1 is then the length of W along this direction which is
given by b a p a b a p a1 12
1 1 22
2( ) ( )+ as before.
The above notation is rather cumbersome as we have in principle to keep track of all
measurements to understand what is going on. Life can be simplified by using the
concept of density matrices and in particular so-called reduced density matrices. By
definition we define the density matrix of a pure quantum state Ψ as the operator
D Ψ Ψ Ψ=
It is easily verified that the density operator for a pure state is a projection operator
, †D D D D2 = = . In addition Tr D[ ]=1. Similarly, measurements are best described in
terms of projection operators, in case there are degeneracies. In the above model we
assumed no degeneracies, and we will continue to do so. However, using projection
operators it makes no difference, which is one reason to use them. So let us define
( ) , ( )P a a a P A A AQ A1 1 1 1 1 1= = , etc.
Please note that the projectors act on different Hilbert spaces. We can construct operators
that act in the complete space HW as before, e.g. ( )1 1A QP a⊗ . None of this poses any
essential difficulties. It is merely notation. The superscripts on the projection operators
are redundant and we will omit them from now on.
Let us return to a description of the measurement of an observable A using a description
in terms of density matrices. The initial density matrix of the world is given by
( )D t A AW0 0 0= Ψ Ψ
and it evolves under action of the measurement Hamiltonian into
( ) ( )( )D t A a a A a a a a A a a AWa = + +1 1 1 2 2 2 1 1 1 2 2 2Ψ Ψ Ψ Ψ
The density matrix ( )D tW still corresponds to a pure state, and we only used the time-
evolution of ΨW to derive the above equation. Later on we will discuss the equation of
motion for density matrices (the Liouville equation). At this point we have collected all
of the information concerning the measurement on A. In order to get rid of the
19
redundancies in the notation we can integrate out the inessential variables concerning the
measuring apparatus A. To this end we define a reduced density operator
( ) ( )
( ) ( )
( ) ( )
R t A D t A
a a a a a a a a
p a a a p a a a
p a P a
Qa i
i
WA i
ii
i
= =
+
= +
=
∑
∑
1 1 1 1 2 2 2 2
1 1 1 2 2 2
Ψ Ψ Ψ Ψ
The reduced density operator acts solely in the 'quantum' Hilbert space. It is seen that all
potential cross terms disappear due to taking the trace. This reduced density matrix fully
describes the quantum system after measurement of A. The weights p ai( ) are precisely
the probabilities to measure Ai.
Suppose now we decide to also measure B at a later time tb . We should go back and
redefine our initial ( )D tW . The state B0 carries through the whole analysis and after
taking the trace over the states in H A , we obtain for the density operator before
measurement of B (but after measurement of A):
( ) ( ) ( )R t p a B a a B p a B a a BBQa = +1 0 1 1 0 2 0 2 2 0
Upon measurement of B this operator evolves into
( ) ( )( )( )
( )( )( )
R t p a B b b a B b b a a b b B a b b B
p a B b b a B b b a a b b B a b b B
BQb = + +
+ +1 1 1 1 1 2 2 2 1 1 1 1 1 1 2 2 2
2 1 1 1 2 2 2 2 2 2 1 1 1 2 2 2 2
Upon taking the trace with respect to H B we obtain the reduced quantum density operator
( ) ( )( )
( )( )
R t p a b b a b b b a b
p a b b a b b b a b
Qb = +
+ +
1 1 1 12
1 2 2 12
2
2 1 1 22
1 2 2 22
2
Overall the probability to find the measured value b1 equals
p b p a b a p a b a( ) ( ) ( )1 1 1 12
2 1 22
= +
the same result as obtained before. At this point the usage of the reduced density matrices
becomes clear. One never really has to explicitly consider the measuring apparatus. For
good reasons it does not enter the postulates. The final measurement dictates what a
convenient representation of RQ looks like. It is diagonal in the eigenvector basis
20
corresponding to the final measurement, and in general it is not a pure state, but a
mixture, meaning R R2 ≠ in general.
The general formulation uses projection operators, which are independent of a choice of
basis. So if an initial (reduced) density matrix is given by ( )R t0 , the reduced density
matrix after measurement of an operator A would be given by
( ) ( ) ( ) ( )R t P a R t P aa i ii
= ∑ 0
which is independent of the basis set. The probability to find the eigenvalue ak is given
by
p a Tr P a R t
Tr P a P a R t P a
Tr P a R t
k k a
k ii
i
k
( ) [ ( ) ( )]
[ ( ) ( ) ( ) ( )]
[ ( ) ( )]
=
=
=
∑ 0
0
You can easily verify that Tr R t Tr R ta[ ( )] [ ( )]= =0 1 for any measurement. This does not
require any renormalization of any kind. It is clear that the use of reduced density
matrices has enormous advantages. We will need to spend some more time to find the
properties and practicing with the above general formulation. Something else that has
been neglected at this point is the time-dependence of the system due to H Q . This will
need to be clarified, but it is not very essential for the interpretation of Quantum
Mechanics.
The nature of the reduction postulate is also becoming clear at present. In the case that we
consider the measurement of A followed by B one considers what can happen next to a
system, given that at time ta one has actually measured ai . Therefore one is interested in
finding the so-called conditional probability to find bj , given the result ai upon
measuring A. This conditional probability is given by P A B P Ai j i( & ) / ( ), and is simply
given by b aj i
2 in the non-degenerate case. This is the reason to redefine the wave
function to be Ψ = ai if we want the conditional probability. In general:
21
Measuring A for a density matrix R yields
Tr P a R p ai i( ( ) ) ( )=
( ) ( ) ( )R t P a R t P aa ii
ib g =∑ 0
Subsequent measurement of B :
p b Tr P b P a RP aj j ii
i( ) [ ( ) ( ) ( )]= ∑
p b a Tr P b P a R t P aj i j i i( & ) [ ( ) ( ) ( ) ( )]= 0
p b a p aj i i( & ) / ( ) ??=
Nothing simple emerges. The above result would agree with the conditional probability
amplitude that is described by the usual postulates of quantum mechanics. However it
depends on the initial state in the non-degenerate case, and we cannot simplify. We will
do an excersise to make this clear. The fact remains that a reduction of the wave function
is done more out of convenience than anything else. Using reduced density matrices there
is little reason to proceed like this. The density matrix formalism is capable of describing
actual experiments that contain sequential measurements on a complete ensemble.
Moreover everything is described in the language of quantum mechanics. There is no
reference to the fact that the measuring apparatus should be described by classical
mechanics, as in the Kopenhagen representation of quantum mechanics.
Let us also consider the case of commuting observables. They are alternatively
characterized by the fact that the projectors on the respective eigenspaces commute
( ), ( ) ,P a P b i ji j = ∀0 as discussed before. Using this result we can easily show that the
measurement of A before B yields the same result as measurement of B alone, provided
A and B commute.
Starting from a density matrix R the density matrix after measuring A is given by
( ) ( )P a RP ai ii∑
Subsequently upon measuring B the probability to measure the eigenvalue bk is given by
22
Tr P b P a RP a Tr P a P b RP a
Tr P a P b R Tr P a P b R Tr P b R
k ii
i ii
k i
ii
k ii
k k
[ ( ) ( ) ( )] [ ( ) ( ) ( )]
[ ( ) ( ) ] [ ( ) ( ) ] [ ( ) ]
∑ ∑
∑ ∑
=
= = =2 1
which is the probability to measure bk directly. Therefore the measurements of A and B
do not influence each other. Both can be measured precisely, and a system can be
prepared in a common eigenstate of A and B . You may wish to return to the more
elementary discussion on page 13 and 14 to see what happens if A and B commute.
Finally it is easy to include the time evolution in the formalism. If we assume a time
independent hamiltonian for the quantum system we have the equation of motion
i D t
tHD t D t H D t H D t t D t
D t e D t eiH t t iH t t
∂∂
= − = = = →
= − − −
( ) ( ) ( ) ( ), ; ( ) ( )
( ) ( )( )/ ( )/
0 0
00 0
Following the discussion in Cohen-Tannoudji we can also use the evolution operator
U t t e e p piH t t i tQ p( , ) ( )/0
0= =− − − ω where the basis p are the eigenfunctions of the time-
independent HQ with eigenvalue Ep p= ω . Using the time evolution operator the time
dependent density matrix can be written as
( ) ( , ) ( ) ( , ) ( , ) ( ) ( , )†D t U t t D t U t t U t t D t U t t= =0 0 0 0 0 0 ,
and this expression is even valid if the Hamiltonian would be time-dependent. Please
notice the nice matrix-like ordering of the time arguments in this formulation. The time
evolution using U would take place until the time of (instantaneous) measurement, then
the projection operators corresponding to measurement are inserted and the time
evolution is continued. For example measuring A first at time ta and then B at time tb
would yield the probablities (in full glory)
p b t t t Tr P b U t t P a U t t D t U t t P a U t tj b ai
j b a i a a i a b( )[ , , ] ( ( ) ( , ) ( ) ( , ) ( ) ( , ) ( ) ( , )]0 0 0 0= ∑
Simplicfications occur if for example , ,H A = 0 and hence ( ), ( , )P a U t ti 0 0= . the
above expression would then no longer depend on ta as using the commutation and the
fact that U t t U t t U t tb a a b( , ) ( , ) ( , )0 0= we can write
23
p b t t Tr P b U t t P a D t P a U t t
Tr P b P a U t t D t U t t P a Tr P a P b P a D t
j bi
j b i i b
ij i b b i
ii j i b
( )[ , ] ( ( ) ( , ) ( ) ( ) ( ) ( , )]
( ( ) ( ) ( , ) ( ) ( , ) ( )] ( ( ) ( ) ( ) ( )]
0 0 0 0
0 0 0
=
= =
∑
∑ ∑
Using the cyclic invariance of the trace operation many different formulations can be
achieved, all yielding precisely the same results. In the excersises you will be asked to
evalutate various probabilities and their time dependence depending on the mutual
comutation relations of ,A B and H using a 4-dimensional Hilbert space.
That's it for now. We will work through some excersises to help you digest this material.