ELEMENTS OF QUANTUM MECHANICS
David J. Jeffery
2012 January 1
1. INTRODUCTION
The topic of Speakable Quantum Mechanics (SQM) is introductory non-relativistic quan-
tum mechanics which we will usually just call quantum mechanics, unless we need to be more
specific. Relativistic quantum mechanics is usually referred to as quantum field theory—it
is beyond our scope, except for occasional bits and pieces.
Non-relativistic quantum mechanics is necessarily an approximation to relativistic quan-
tum mechanics. It is valid in the regime where kinetic energies and potential energies (more
generally field energies) are much less than the rest mass energies of particles. Following the
philosophy of Laughlin (2005), I prefer to think of non-relativistic quantum mechanics as an
emergent theory: one that is exactly true in the limit of low kinetic and potential energies.
Non-relativistic quantum mechanics is such a well verified theory—all modern electronics
testifies to it—that it is hard to believe there is not a limit in which it is exact. If there were
not such a limit—let’s call this limit the non-relativistic-quantum-mechanics limit (NRQM
limit)—then I think non-relativistic quantum mechanics would somehow be a crude theory
or model that is sometimes unreliable—which it is not.
In this chapter, we develop the elements of quantum mechanics. The approach taken is
a bit more abstract and formal than in most introductory quantum textbooks. But yours
truly thinks it best to get the full formalism first and then derive results for systems rather
than derive results with only part of the formalism and then have to rethink the systems and
results in terms of the full formalism. I just think it is the easier way as long as the readers
can bear formalism that looks a long way from physics—and a lot like not very rigorous
math.
2. AXIOMS
Quantum mechanics can be developed in an axiomatic way, but there are many different
ways of formulating the axioms and explicating them. For example, Cohen-Tannoudji et al.
(1977, p. 211ff) gives 6 axioms and an elaborate explication. The axiomatic path absolutely
has its value. But I think it is not a terribly memorable path in pure form. Also in thinking
about quantum mechanics and in solving quantum mechanics problems, the mind swims in
the sea of axioms and results without placing them in a definite hierarchy at every moment.
Yours truly will compromise and present a shorthand set of 5 axioms here (mainly following
Zurek (2009)). The shorthand set allows for easy remembering and cogitation. The axioms
require elaborate explication. We give much of that in this chapter and more as we go
along in SQM.
The 5 axioms are:
1. State Vector Axiom: States are vectors in Hilbert spaces, and thus the state vector
contains all information about the state of system. For further explication see § 3.
2. Schrodinger’s Equation Axiom: Evolutions of states are deterministic when determined
(in non-relativistic quantum mechanics) by Schrodinger’s equation:
$$ H|\Psi\rangle = i\hbar\,\frac{\partial}{\partial t}|\Psi\rangle , \tag{1} $$
where H is the Hamiltonian (which is the energy operator) and Ψ is the conventional state
label. The partial derivative symbol is actually conventional since there is no implicit
time dependence. The full time derivative (i.e., d/dt) could be used as well (e.g.,
Cohen-Tannoudji et al. 1977, p. 222). Schrodinger’s equation is referenced to inertial frames
unless non-inertial frame effects are accounted for or approximated as negligible. For
further explication see § 6.
3. Repetition Axiom: The immediate repetition of a measurement yields the same outcome
as the first performance. The meaning of the term measurement in this context is
discussed in § 9. Actually, testing this axiom directly is difficult in many cases, but it
seems essential to the overall validity of quantum mechanics (Zurek 2009).
4. Wave Function Collapse Axiom: The outcomes of measurements are eigenvalues of the
measured observable and the measurement collapses the state to the corresponding
eigenstate of the observable. The axiom actually applies to states not describable by
wave functions too. However, the term wave function collapse is used generally for
the collapse event. The terms “observable” and “measurement” need further explication.
However, we note here that “measurement” is conventional and does not necessarily
imply human measurement. It refers to a strong interaction of the system with the
environment. For further explication see § 9.
5. Born’s Rule Axiom: The probability of a measurement outcome (i.e., collapse to an
eigenstate of an observable) is the magnitude squared of the state amplitude of the eigenstate
in the expansion of the state in the complete set of eigenstates of the observable. State
amplitude is conventionally called probability amplitude, but SQM prefers state amplitude
for reasons given in § 9. For the position representation, the state amplitude is
the wave function Ψ and the probability density is its magnitude squared |Ψ|^2.
After reading this chapter, one hopes these axioms largely click into place in the minds of
readers.
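As a small numerical illustration of the Born’s rule axiom, the following Python sketch computes outcome probabilities from a list of state amplitudes. The function name born_probabilities and the 3-state example amplitudes are hypothetical choices made for this illustration only; the sketch assumes a discrete orthonormal eigenbasis and normalizes the state before squaring.

```python
def born_probabilities(amplitudes):
    """Return the probabilities |c_n|^2 for a state expanded in an orthonormal eigenbasis.

    The amplitudes are normalized first, so the returned probabilities sum to 1.
    """
    norm_sq = sum(abs(c) ** 2 for c in amplitudes)
    if norm_sq == 0:
        raise ValueError("the zero vector is not a valid state")
    return [abs(c) ** 2 / norm_sq for c in amplitudes]


# Hypothetical unnormalized state amplitudes c_n for a 3-state system.
amplitudes = [1 + 0j, 1j, 1 - 1j]
probs = born_probabilities(amplitudes)
print(probs)
```

Note that the complex phases of the amplitudes do not affect the probabilities for this basis; they matter only when the state is re-expanded in a different basis.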
Actually quantum mechanics has more axioms to deal with actual physical entities.
There is such a thing as spin (see the chapter Spin) whose existence is an axiom. The
symmetrization principle is an axiom needed to deal with identical particles (see the chapter
The Symmetrization Principle). Moreover, it is my view that what I call micro-axioms
frequently get added as quantum mechanics is developed without bothering to tabulate them:
the practitioner just assimilates them. For example, the eigenstates of the position operator for
the Hilbert space (§ 4) of ordinary 3-dimensional space are Dirac delta
functions even though Dirac delta functions are not true functions (§ 5). That we can
put the Dirac delta functions on the Procrustean bed of Hilbert space eigenstates seems to
me to be a micro-axiom.
We need to further remark that quantum mechanics is at the same time profoundly like
and unlike classical mechanics. We will just let those likenesses and unlikenesses emerge as we
go along in SQM. The likenesses certainly guided the development of quantum mechanics, but
new axioms were needed and quantum mechanics cannot be derived from classical mechanics.
It can, of course, be derived as an emergent theory from quantum field theory in the NRQM
limit—but we won’t do that.
3. STATE AS VECTOR
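From § 2, the State Vector Axiom reads: states are vectors in Hilbert spaces, and thus the
state vector contains all information about the state of the system.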
This axiom has been contested by some (notably Einstein) since the dawn
of quantum mechanics. The contesters believe that the state vector of quantum
mechanics theory has only statistical information about the state, and so is not
identical with the state. With our current knowledge, there is no practical difference
in the two views since quantum mechanics has never been falsified—to use the
jargon of falsifiability (e.g., Wikipedia: Falsifiability). The theoretical difference is
profound since the statistical view implies there is a more fundamental theory than
quantum mechanics. There is a notable proof by Pusey et al. (2011) that quantum
mechanics cannot be interpreted statistically—this argument supports the axiom
as stated. The validity of this proof is still under debate.
What does the state being a state vector mean? We cannot say
everything possible here. We can say it means that it takes many numbers
to describe the state in general. From a classical point of view, no surprise.
That it takes a continuum infinity (i.e., an uncountable infinity) of complex
numbers to describe the state of a single particle is a surprise.
This continuum infinity is called the wave function.
Also in classical physics, the numbers used in describing the state depend on the spatial
coordinate system or basis system. This is true even for kinetic energy which is a scalar:
the velocity used to evaluate the kinetic energy depends on the motion of the axes. In
quantum mechanics, the bases of Hilbert space are not spatial coordinates although spatial
coordinates are needed too. The Hilbert space bases include most importantly the position,
wavenumber (AKA momentum) and energy bases. We will elucidate these and other bases
later. Here, just as a key example, the wave function can be expanded in the position basis: this
means that there is a complex number (which is a basis vector coefficient) for every point
in space. If one changes to the wavenumber basis expansion for the wave function, there
is a complex number for every point in wavenumber space. The position and wavenumber
bases have a continuum infinity of basis vectors. The energy basis for the same system can
have a continuum infinity of basis vectors for unbound and therefore unquantized states or a
discrete infinity (i.e., a countable infinity) of basis vectors for bound and therefore quantized
states. Of course, there are systems which have both unbound and bound states, and therefore
both a continuum infinity and a discrete infinity in the set of basis vectors. We discuss state
vectors further below in this section (§ 3).
So to make a start, the state of a system in quantum mechanics is (or is described by if
you prefer) an abstract state vector in a Hilbert space which represents the system. We will
explicate Hilbert spaces below in § 4. The conventional symbolization for the vector is
|Ψ〉 . (2)
The form |〉 is the ket vector form in Dirac’s bra-ket notation and the Greek Psi Ψ is the
conventional state symbol. Of course, other state symbols are introduced as needed. State
vector, state, and wave function (because it turns out to be the most basic basis expansion
of the state vector) are effectively synonyms and we will use them as such on most occasions.
Note that by common physics convention, the word system can also mean the state of the
system. Context decides whether “system” means system or state of the system.
We need to remark that a vector, except a trivial one, is not specified by a single
number. It is specified by a set of coefficients for unit vectors for the space it is embedded
in. In quantum mechanics, those coefficients are called amplitudes—at least in SQM—and
we will get to discussing them in § ?? and the remainder of this chapter—so readers hold
your horses of anxiety.
In quantum mechanics as in classical mechanics, everything in the universe is connected
to everything else. So the exact system is the universe. Fortunately, as in classical mechanics,
it is possible to understand a part of the universe by idealizing that part as the system and
the rest of the universe as the environment. In the limit that the environment has no effect,
the system is the exactly correct system. Otherwise, the system is an approximation whose
limitations can be reduced by including more of the universe in the system. In quantum
mechanics, a more exact system is a bigger or more comprehensive Hilbert space. However,
there is an immense and subtle issue called decoherence theory (e.g., Zurek 2003) to be
introduced in regard to system and environment. The time evolution of systems involves
decoherence and decoherence is caused by the environment. So in many quantum mechanics
calculations, the environment must be included. Decoherence theory has grown up since
about 1970 (e.g., Zurek 2003, p. 5) and is still a work in progress, but the process of de-
coherence has had to be dealt with since the early days of quantum mechanics—without
being identified as decoherence. It was dealt with via Born’s rule and wave function collapse
which in traditional quantum mechanics are extra axioms. We will take up the subjects of
the Born rule and wave function collapse and decoherence in § 9. In our first developments,
the system will be a single particle in a force field described by a potential energy, which in
quantum mechanics jargon is called a potential (not to be confused with the potential
in classical electromagnetism, which is potential energy per unit charge).
The state vector is normalized (i.e., its length is 1) and its direction in the Hilbert
space gives all the information about the state that one can know. How that information is
extracted we elucidate below in § 4. The evolution of the direction in the Hilbert space is
the evolution of the system. Actually, one seldom if ever describes the evolution as evolution
of direction, but it is a valid perspective.
The time evolution of the state is determined by the Schrodinger equation (AKA Schrodinger’s
equation) which we introduce below in § 6. It is the general equation of motion of
non-relativistic quantum mechanics and is the quantum mechanical analog to Newton’s 2nd law
(AKA $\vec{F}_{\rm net} = m\vec{a}$). Like Newton’s 2nd law, Schrodinger’s equation can be loosely described
as an imbalance or disequilibrium between energy terms that determines time evolution. (I
am thinking of classical forces as field-energy-structure-derived entities.) A special case is
when there is no imbalance. Classically, “no imbalance” gives an equilibrium state that is
a static state in the rest frame of the system. In quantum mechanics, “no imbalance” gives an
elementary time dependence and we call the state a stationary state.
The idea of representing the state of a system by a vector is not unknown in classical
mechanics. The three position variables and three momentum variables of a particle together
constitute a position or displacement vector in 6-dimensional phase space. But there is a
profound difference. The components of the phase space vector are definite characteristics of the
particle. In quantum mechanics, the components of the state vector with respect to a basis
(i.e., a set of unit vectors that span the Hilbert space) are the amplitudes for that basis.
We explicate the terms basis and amplitude below in § 4. The fact that the components are
amplitudes leads to the requirement that the state vector be normalized as we will also see
in § 4.
4. HILBERT SPACES AND OBSERVABLES
5. THE WAVE FUNCTION
6. SCHROEDINGER’S EQUATION
Schrodinger’s equation is
$$ i\hbar\,\frac{d|\Psi\rangle}{dt} = H|\Psi\rangle . \tag{3} $$
Schrodinger’s equation is often written with a partial time derivative rather than a full
time derivative. The full time derivative seems to be the more correct form (e.g.,
Cohen-Tannoudji et al. 1977, p. 222), but there is no distinction in most cases since there usually
will be no implicit time dependence in the independent arguments of |Ψ〉.
6.1. A Natural Path to Schrodinger’s Equation
Schrodinger’s equation cannot be derived from classical physics. Counterfactually, if it
could, then we would be partially along the path to finding that quantum mechanics is an
emergent theory from classical physics. In fact, at the moment it seems the other way around.
But, of course, both may be emergent theories from some more general physics that is as yet
unknown.
It is possible to follow a natural path to Schrodinger’s equation that makes use of
classical and quantum mechanics concepts. We call it a natural path since it is one that could
have been used in the discovery of Schrodinger’s equation by Erwin Schrodinger (1887–1961)
(Wikipedia: Erwin Schrodinger). The actual historical path was probably more convoluted,
but we leave the history path to the history of science. It is pedagogically useful to follow
the natural path since it helps to understand quantum mechanics and to understand how
theorizing works.
In the early 1920s, de Broglie hypothesized that massive particles, like photons (which
were already widely believed to exist), would obey the relations (which we now call the de
Broglie relations)
$$ \lambda = \frac{h}{p} \quad\text{and}\quad E = h\nu , \tag{4} $$
where λ is the particle’s wavelength, p is its momentum, E is its energy, ν is its frequency, and
h is Planck’s constant. The formula for λ is conventionally called the de Broglie wavelength.
De Broglie’s hypothesis included the idea that the wavelength and frequency were meaningful
for a massive particle. De Broglie’s relations are quantum mechanical concepts. We assume
that E is the mechanical energy: i.e., the sum of kinetic energy T and potential energy V
(which we will just call potential hereafter following the convention of quantum mechanics).
Thus,
$$ E = T + V = \frac{p^2}{2m} + V , \tag{5} $$
where p is momentum again, m is the particle mass, and we have assumed the non-relativistic
limit.
We now invoke classical concepts. If there are matter waves, then we can imagine that
there must be a wave function that describes oscillation of whatever is oscillating as a function
of space and time. There should also be a wave equation that is the equation of motion for
the matter waves: i.e., a differential equation that is the dynamical law that governs their
evolution. We will consider just a system with one spatial dimension for our wave function and
wave equation development.
For a wave function, we ansatz the traveling wave function
$$ \Psi = c\,e^{i(kx-\omega t)} , \tag{6} $$
where c is an amplitude we leave unspecified for the moment, k = 2π/λ is the wavenumber,
x is the spatial coordinate, ω = 2πν is the angular frequency, t is time, and i is, of course, the
imaginary unit. We note using equation (4) that
$$ p = \frac{h}{\lambda} = \frac{hk}{2\pi} = \hbar k \quad\text{and}\quad E = h\nu = \frac{h\omega}{2\pi} = \hbar\omega . \tag{7} $$
What is Ψ? For the moment, we will just say that it is a function that somehow tells us
where the particle is or is spread out in space. Our traveling wave function is, in fact,
completely delocalized. Aside from the periodic oscillation itself, it is uniform for all x. But
we will not let that stop us.
Now that we have a wave function, can we deduce the wave equation from which it
follows? Well first let’s assume an energy conserving system (i.e., E constant) with a con-
stant potential V , and therefore constant kinetic energy T . We have the equation, so far
uninteresting,
T + V = E . (8)
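From our wave function, we see that we can extract the T value and the E value using,
respectively, the operators
$$ T_{\rm op} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} \quad\text{and}\quad t_{\rm op} = i\hbar\,\frac{\partial}{\partial t} , \tag{9} $$
where the first is the kinetic energy operator and the second is the time operator (so named
here although what it extracts from the wave function is the energy E = ħω). Using
these operators and equation (8), we obtain
$$ T_{\rm op}\Psi + V\Psi = t_{\rm op}\Psi \quad\text{or}\quad H\Psi = i\hbar\,\frac{\partial\Psi}{\partial t} , \tag{10} $$
where H = Top + V is the quantum mechanical Hamiltonian that we have seen above.
Equation (10), which is satisfied by our wave function, is Schrodinger’s equation.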
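The claim that the traveling wave satisfies the wave equation just obtained can be checked numerically. The following Python sketch is only an illustration under stated assumptions: units with ħ = m = 1, arbitrary illustrative values for the constant potential and the wavenumber, and central finite differences in place of exact derivatives. The names psi and schrodinger_residual are hypothetical helpers, not part of the source.

```python
import cmath

# Illustrative values only (assumptions, not from the source text).
HBAR = 1.0   # units with hbar = 1
MASS = 1.0   # units with m = 1
V0 = 0.3     # constant potential V
K = 1.7      # wavenumber k

# hbar*omega = E = T + V with T = (hbar k)^2 / (2 m).
OMEGA = (HBAR * K ** 2 / (2 * MASS)) + V0 / HBAR


def psi(x, t, c=1.0):
    """Traveling wave c * exp[i(kx - omega t)]."""
    return c * cmath.exp(1j * (K * x - OMEGA * t))


def schrodinger_residual(x, t, h=1e-3):
    """Return |T_op psi + V psi - i hbar dpsi/dt| using central finite differences."""
    d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2
    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    lhs = -(HBAR ** 2) / (2 * MASS) * d2psi_dx2 + V0 * psi(x, t)
    rhs = 1j * HBAR * dpsi_dt
    return abs(lhs - rhs)


residual = schrodinger_residual(x=0.4, t=1.2)
print(residual)  # small; limited by the finite-difference step h
```

The residual is not exactly zero because of finite-difference truncation error; it shrinks as the step h is reduced until rounding error takes over.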
Following the natural path, we assume that equation (10) generalizes to all cases, including
three dimensions, space- and time-dependent potentials, and multiple particles. Then
we ansatz the standard interpretation of the wave function as a state amplitude and
|Ψ|^2 as a state density, and ansatz Born’s rule. That completes the natural path.
Of course, all the history of quantum mechanics confirms that Schrodinger’s equation
is the fundamental equation of motion for non-relativistic quantum mechanics.
6.2. Continuity Conditions
6.3. Free Particle Case
7. THE UNCERTAINTY PRINCIPLE
The uncertainty principle is an immensely important result in quantum mechanics both
as a conceptual aid to understanding quantum mechanics and as a calculation tool in making
estimates. The word “principle” is for historical reasons. The uncertainty principle is, in
fact, a result of the vector formalism of quantum mechanics. The word “uncertainty” is
a bit of a misnomer. The modern uncertainty principle is not directly about measurement
uncertainty, but is a relationship between the widths (more precisely standard deviations)
of the superposition distribution of a state for two different eigenstate bases.
The general version of the uncertainty principle is
$$ \sigma_A \sigma_B \ge \frac{1}{2}\left|\langle i[A,B]\rangle\right| , \tag{11} $$
where A and B are general observables, and σA and σB are the standard deviations for the
observables for a general state |α〉. There is actually a limitation on the generality of the operators
and states. The vectors A|α〉 and B|α〉 should, in general, still be ones for which A and B are
Hermitian operators. There are tricky cases where this is not so: we will consider two examples
in § 7.4. For these tricky cases, the uncertainty principle does not hold.
We should also note that i[A,B] is a Hermitian operator (when the uncertainty principle
holds), and 〈i[A,B]〉 is a real number. We should prove this. From § ??, we know that
$$ [A,B]^\dagger = -[A^\dagger, B^\dagger] \tag{12} $$
for general operators A and B. If the operators are, in fact, Hermitian, then
$$ (i[A,B])^\dagger = i^*(-[A,B]) = i[A,B] \tag{13} $$
and that’s QED since the Hermitian conjugate of i[A,B] equals i[A,B]: i.e.,
$$ (i[A,B])^\dagger = i[A,B] . \tag{14} $$
Another comment to make is in regard to the time-energy uncertainty principle. It is
not a special case of the general uncertainty principle equation (11). The general uncertainty
principle is evaluated all at one instant in time. There is no precise meaning to attach
to the idea of the state distributed into eigenstates of time. The time-energy uncertainty
principle provides information about the evolution of the system. To include time evolution in
the formalism, we need the equation of motion for the system: i.e., Schrodinger’s equation
(§ 6). Making use of Ehrenfest’s theorem (which is derived using Schrodinger’s equation:
§ 8) and the general uncertainty principle, the time-energy uncertainty principle is derived
(§ ??). The time-energy uncertainty principle is called a principle for historical reasons even
though it is simply a result.
Now for the main event: the proof of the uncertainty principle.
7.1. Proof
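First, we note the variance for A is
$$ \begin{aligned}
\sigma_A^2 &= \langle\alpha|(A-\langle A\rangle)^2|\alpha\rangle = \langle\alpha|(A-\langle A\rangle)(A-\langle A\rangle)|\alpha\rangle = \langle\alpha|(A-\langle A\rangle)^\dagger(A-\langle A\rangle)|\alpha\rangle \\
&= \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle ,
\end{aligned} \tag{15} $$
where we have used the facts that A is Hermitian for |α〉 and that the expectation value 〈A〉 is
a pure real c-number (since A is Hermitian for |α〉), and so is a trivial Hermitian operator
too. Similarly, the variance for B is
$$ \sigma_B^2 = \langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle . \tag{16} $$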
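Second, we note
$$ \sigma_A^2\sigma_B^2 = \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle\,\langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle \ge \left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2 , \tag{17} $$
where we have used the Schwarz inequality (§ ??). Now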
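$$ \begin{aligned}
\left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2
&= \left|\langle\alpha|(A-\langle A\rangle)^\dagger(B-\langle B\rangle)|\alpha\rangle\right|^2 \\
&= \left|\langle\alpha|(A^\dagger B - A^\dagger\langle B\rangle - \langle A\rangle B + \langle A\rangle\langle B\rangle)|\alpha\rangle\right|^2 \\
&= \left|\langle\alpha|(A^\dagger B - A\langle B\rangle - \langle A\rangle B + \langle A\rangle\langle B\rangle)|\alpha\rangle\right|^2 \\
&= \left|\langle A^\dagger B\rangle - \langle A\rangle\langle B\rangle\right|^2 \\
&= \left(\mathrm{Re}[\langle A^\dagger B\rangle] - \langle A\rangle\langle B\rangle\right)^2 + \mathrm{Im}[\langle A^\dagger B\rangle]^2 \\
&= \left[\frac{1}{2}\left(\langle A^\dagger B\rangle + \langle B^\dagger A\rangle\right) - \langle A\rangle\langle B\rangle\right]^2
 + \left[-\frac{i}{2}\left(\langle A^\dagger B\rangle - \langle B^\dagger A\rangle\right)\right]^2 ,
\end{aligned} \tag{18} $$
where we have not assumed that A is Hermitian for B|α〉 nor that B is Hermitian for A|α〉
and where Re[] and Im[] are functions that evaluate, respectively, to the real and imaginary
parts of their arguments. Now we can write the rather general inequality
$$ \sigma_A^2\sigma_B^2 \ge \left[\frac{1}{2}\left(\langle A^\dagger B\rangle + \langle B^\dagger A\rangle\right) - \langle A\rangle\langle B\rangle\right]^2
 + \left[\frac{i}{2}\left(\langle A^\dagger B\rangle - \langle B^\dagger A\rangle\right)\right]^2 . \tag{19} $$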
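Third, we now do assume that A is Hermitian for B|α〉 and that B is Hermitian for
A|α〉 and our inequality reduces to
$$ \sigma_A^2\sigma_B^2 \ge \left(\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle\right)^2 + \left(\frac{1}{2}\langle i[A,B]\rangle\right)^2 , \tag{20} $$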
where we have made use of the commutator and the anticommutator (§ ??). The first term
on the right-hand side is the classical analog term. If A and B were commuting functions
and squared amplitude was a set of probabilities or a probability density, then this term
would become the square of the covariance 〈AB〉− 〈A〉〈B〉 of the functions A and B for the
probability distribution (e.g. Bevington 1969, p. 64). The second term is purely quantum
mechanical. It is non-zero in general if A and B do not commute (i.e., are incompatible
observables) and therefore have no common basis. It may be zero for particular states |α〉
for incompatible observables, if the expectation value 〈i[A,B]〉 happens to be zero.
Finally, dropping the classical analog term, we have the uncertainty principle (eq. 11)
as given above:
$$ \sigma_A \sigma_B \ge \frac{1}{2}\left|\langle i[A,B]\rangle\right| . \tag{21} $$
The uncertainty principle shows neither σA nor σB can be zero if 〈i[A,B]〉 ≠ 0.
The uncertainty principle can be used to estimate one of σA or σB if the other can
be estimated and (1/2)|〈i[A,B]〉| can be estimated. For example, if some actual experimental
measurement confines (by wave function collapse) the distribution of the state among the A
eigenstates and (1/2)|〈i[A,B]〉| is estimated, then we can obtain a lower bound on the range
of B eigenvalues obtainable from a second measurement that collapses the wave function.
So the uncertainty principle does have use in making estimates of quantities that might be
very hard to calculate or measure exactly.
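The general uncertainty principle can be spot-checked numerically for a small system. The Python sketch below is an illustration under assumptions not taken from the text: it uses a 2-dimensional Hilbert space, takes the Pauli matrices σx and σy as the observables A and B, and picks an arbitrary normalized state |α〉. All helper names (mat_mul, expectation, std_dev, commutator) are hypothetical.

```python
import cmath
import math


def mat_vec(op, vec):
    """Apply a square matrix (list of rows) to a vector."""
    n = len(vec)
    return [sum(op[i][j] * vec[j] for j in range(n)) for i in range(n)]


def mat_mul(a, b):
    """Product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def expectation(op, state):
    """<state|op|state> for a normalized state."""
    return sum(s.conjugate() * t for s, t in zip(state, mat_vec(op, state)))


def std_dev(op, state):
    """Standard deviation of a Hermitian operator for the given state."""
    mean = expectation(op, state).real
    mean_sq = expectation(mat_mul(op, op), state).real
    return math.sqrt(max(mean_sq - mean * mean, 0.0))


def commutator(a, b):
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


# Assumed example observables: Pauli matrices sigma_x and sigma_y.
A = [[0j, 1 + 0j], [1 + 0j, 0j]]
B = [[0j, -1j], [1j, 0j]]
# Assumed example state, already normalized (0.36 + 0.64 = 1).
alpha = [0.6 + 0j, 0.8 * cmath.exp(1j * math.pi / 4)]

i_comm = [[1j * entry for entry in row] for row in commutator(A, B)]
lhs = std_dev(A, alpha) * std_dev(B, alpha)   # sigma_A * sigma_B
rhs = 0.5 * abs(expectation(i_comm, alpha))   # (1/2) |<i[A,B]>|
print(lhs, rhs, lhs >= rhs)
```

For this particular state the inequality is strict because the classical analog (covariance) term of equation (20) is non-zero; states for which that term vanishes saturate the bound.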
7.2. The Heisenberg Uncertainty Principle
The uncertainty principle’s most famous special case is for the position and wavenumber
observables. This special case is called the Heisenberg uncertainty principle.
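Consider the position observable xop,i = xi and the wavenumber observable
$$ k_{{\rm op},j} = \frac{1}{i}\frac{\partial}{\partial x_j} , \tag{22} $$
where i and j are general coordinate indices which could be any of 1, 2, 3 (§ 5). Evaluating
the commutator of these gives
$$ [x_{{\rm op},i}, k_{{\rm op},j}] = x_i k_{{\rm op},j} - k_{{\rm op},j} x_i = x_i k_{{\rm op},j} - \frac{1}{i}\delta_{ij} - x_i k_{{\rm op},j} = i\delta_{ij} . \tag{23} $$
Thus, we have commutators
$$ [x_{{\rm op},i}, k_{{\rm op},j}] = i\delta_{ij} \quad\text{and}\quad [x_{{\rm op},i}, p_{{\rm op},j}] = i\hbar\,\delta_{ij} \tag{24} $$
and the special uncertainty principle cases
$$ \sigma_{x_{{\rm op},i}}\sigma_{k_{{\rm op},j}} \ge \frac{1}{2}\delta_{ij} \quad\text{and}\quad \sigma_{x_{{\rm op},i}}\sigma_{p_{{\rm op},j}} \ge \frac{\hbar}{2}\delta_{ij} . \tag{25} $$
Either of these can be called the Heisenberg uncertainty principle, but that term is often
reserved just for
$$ \sigma_{x_{{\rm op},i}}\sigma_{p_{{\rm op},i}} \ge \frac{\hbar}{2} . \tag{26} $$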
We emphasize again that the uncertainty principle is not directly about measurement
uncertainty, but is a relationship between the widths (more precisely standard deviations)
of the superposition distribution of a state for two different eigenstate bases. But it can be
used to make estimates of the range of experimental results as discussed in § 7.
In the Heisenberg uncertainty principle, the lower limit on the values of $\sigma_{x_{{\rm op},i}}\sigma_{k_{{\rm op},j}}$ and $\sigma_{x_{{\rm op},i}}\sigma_{p_{{\rm op},j}}$ is
independent of the particular state. This makes using the Heisenberg uncertainty principle
for estimates of the ranges of experimental results particularly easy. Since position and
wavenumber/momentum are very important dynamical variables, the Heisenberg uncertainty
principle has great utility.
For example, say that you have experimentally confined a particle to a range ∼ ∆x in
the x-direction. Then the range in particle velocities ∆v (for the x-direction) on a subsequent
measurement would satisfy
$$ \Delta v \gtrsim \frac{\hbar}{2m\,\Delta x} . \tag{27} $$
For example, say we have an electron (m = 9.109 . . . × 10^{−31} kg) confined to about 1 Å =
10^{−10} m. We find
$$ \Delta v \gtrsim \frac{1}{2}\times 10^{6}\ {\rm m/s} , \tag{28} $$
where the 1/2 factor may be insignificant depending on the particular case. The velocity range
lower limit is high, but does not imply the velocity values themselves will be relativistic (in
which case our non-relativistic treatment might fail). The result of equation (28)
is only a lower limit on the range of velocities. In actual cases
where the expectation value for velocity is known to be zero, the lower limit on the range is
often a good estimate of the order of magnitude of the velocity.
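The arithmetic behind the electron estimate can be reproduced in a few lines of Python. This is only a sketch of the evaluation of ħ/(2m∆x); the constant values are the standard SI values for ħ and the electron mass, and the variable names are chosen here for illustration.

```python
# Standard SI values (rounded) for the constants used in the estimate.
HBAR = 1.054571817e-34         # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837e-31  # electron mass, kg

delta_x = 1.0e-10  # confinement range of about 1 angstrom, in m

# Lower limit on the velocity range: hbar / (2 m delta_x).
delta_v_min = HBAR / (2.0 * ELECTRON_MASS * delta_x)
print(f"{delta_v_min:.3e} m/s")
```

The result is a bit under 6 × 10^5 m/s, i.e., of order (1/2) × 10^6 m/s and well below the speed of light, so the non-relativistic treatment is self-consistent for this estimate.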
Actually, Heisenberg derived his eponymic uncertainty principle for actual measure-
ments, where its interpretation is different than that of the modern uncertainty principle.
The uncertainty quantities represented actual possible measurement uncertainties, whereas
in the modern formulation they are standard deviations of the state for observables.
Heisenberg’s interpretation is now believed to be not completely valid. A modern experimental
uncertainty relation has been derived by Ozawa (2012, and references therein) which has
been experimentally verified so far (Erhart et al. 2012). It is beyond our scope to go further
into the subject of the experimental uncertainty relation.
7.3. The Minimum Uncertainty State
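In proving the uncertainty principle, we had as a step equation (17):
$$ \sigma_A^2\sigma_B^2 = \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle\,\langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle \ge \left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2 . \tag{29} $$
We define a minimum uncertainty state to be one where the equality holds for this expression.
Thus, a minimum uncertainty state satisfies
$$ \sigma_A^2\sigma_B^2 = \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle\,\langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle = \left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2 . \tag{30} $$
Now from the Schwarz inequality (§ ??), we know that the last equality is equivalent to
having
$$ (A-\langle A\rangle)|\alpha\rangle = c\,(B-\langle B\rangle)|\alpha\rangle \quad\text{or}\quad (A-cB)|\alpha\rangle = (\langle A\rangle - c\langle B\rangle)|\alpha\rangle , \tag{31} $$
where c is some c-number. The last expression is an eigenvalue problem, but not necessarily
a Hermitian operator eigenvalue problem. The Hermitian conjugate of (A− cB) is
$$ (A-cB)^\dagger = A^\dagger - c^* B^\dagger = A - c^* B , \tag{32} $$
which is Hermitian only for the case of c being pure real.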
We are not completely free to choose c. The fact that the minimum uncertainty state must be
normalizable puts a constraint on c. Still there can be some freedom in choosing c, and thus
in making σAσB small. We might guess that choosing c to be pure imaginary would help minimize
σAσB. If c were pure imaginary,
$$ \langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle = c^*\langle\alpha|(B-\langle B\rangle)^2|\alpha\rangle \tag{33} $$
would be pure imaginary since (B−〈B〉)^2 is a Hermitian operator and this would mean that
the classical term in equation (20) would be zero. Thus, we find that
$$ \sigma_A\sigma_B = \frac{1}{2}\left|\langle i[A,B]\rangle\right| . \tag{34} $$
But it is not clear in general that c can be chosen to be pure imaginary nor that doing
so necessarily leads to the smallest possible value of σAσB. In fact, choosing c to be pure
imaginary does lead to the smallest possible σAσB—but we only know this after we have
shown it.
We need to remark that the state obtained by solving equation (31) is just the state
at one instant in time in general. There is no guarantee that it is a stationary state. So in
general it will evolve and cease to be a minimum uncertainty state after that one instant.
We cannot go any further in general, and so will specialize equation (31) to the case of
the Heisenberg uncertainty principle in the position basis for one dimension:
$$ \left(\frac{1}{i}\frac{\partial}{\partial x} - cx\right)\Psi = (\langle k\rangle - c\langle x\rangle)\Psi , \tag{35} $$
where we have chosen A = kop = (1/i)∂/∂x and B = xop = x. The solution of this differential
equation is straightforward: one obtains
$$ \Psi = C\exp\left[i(\langle k\rangle - c\langle x\rangle)x + \frac{ic}{2}x^2\right] , \tag{36} $$
where C is a normalization constant. For this wave function to be normalizable, c must have
a non-zero positive imaginary part. Let us write c = a+ ib where b > 0. Now we have
$$ \begin{aligned}
\Psi &= C\exp\left[i(\langle k\rangle - a\langle x\rangle)x + \frac{ia}{2}x^2 - \frac{b}{2}x^2 + b\langle x\rangle x\right] \\
&= D\exp\left[i(\langle k\rangle - a\langle x\rangle)x + \frac{ia}{2}x^2 - \frac{b}{2}(x-\langle x\rangle)^2\right] \\
&= \frac{1}{\sqrt{\sigma_x\sqrt{2\pi}}}\exp\left[i(\langle k\rangle - a\langle x\rangle)x + \frac{ia}{2}x^2 - \frac{1}{2(2\sigma_x^2)}(x-\langle x\rangle)^2\right] ,
\end{aligned} \tag{37} $$
where we have completed the square for the x quadratic and adjusted the normalization constant
accordingly, identified the wave function as a Gaussian, and recognized that the standard
deviation of the squared amplitude (which is also a Gaussian) satisfies b = 1/(2σ_x^2) (e.g.,
Bevington 1969, p. 53). Now we find
$$ \langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle = c^*\langle\alpha|(B-\langle B\rangle)^2|\alpha\rangle = c^*\sigma_x^2 = \frac{a-ib}{2b} . \tag{38} $$
Thus we find
$$ \sigma_x\sigma_k = \left|\frac{a-ib}{2b}\right| = \frac{\sqrt{a^2+b^2}}{2b} . \tag{39} $$
Obviously, we minimize σxσk by choosing a = 0. As we anticipated, the minimum wave function
is obtained for pure imaginary c. Since the b cancels out of the result, it is indeterminate.
Thus, our minimum uncertainty Heisenberg wave function is
$$ \Psi = \frac{1}{\sqrt{\sigma_x\sqrt{2\pi}}}\exp\left[i\langle k\rangle x - \frac{1}{2(2\sigma_x^2)}(x-\langle x\rangle)^2\right] \tag{40} $$
and this gives
$$ \sigma_x\sigma_k = \frac{1}{2} \quad\text{and}\quad \sigma_x\sigma_p = \frac{\hbar}{2} . \tag{41} $$
The minimum Heisenberg uncertainty wave function can probably evolve in many systems,
but then exists only for an instant in time in general. However, the ground eigenstate
(i.e., ground stationary state) of the 1-dimensional harmonic oscillator is a Gaussian wave
packet (with a = 0), and so is a minimum Heisenberg uncertainty wave function that does not
evolve in time. Another wave function that remains a Gaussian for more than an instant is
the free particle wave packet that is initially a Gaussian (e.g., Griffiths 2005, p. 67). Its standard
deviation increases with time, but it remains Gaussian. We consider both the 1-dimensional
harmonic oscillator and the 1-dimensional wave packet in the chapter 1-Dimensional Systems.
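The dependence of σxσk on c = a + ib can be checked by direct numerical integration. The Python sketch below is an illustration under assumptions: it takes the Gaussian wave function in the form Ψ ∝ exp[i〈k〉x + (ic/2)(x − 〈x〉)^2] (the Gaussian with quadratic phase that follows from the proportionality condition for A = kop and B = xop, written up to a constant factor), uses a simple uniform-grid sum for the integrals, and uses the analytic derivative of Ψ. The function name uncertainty_product and the default values of k0, x0, and n are hypothetical.

```python
import cmath
import math


def uncertainty_product(a, b, k0=2.0, x0=0.5, n=4001):
    """Return sigma_x * sigma_k for psi ~ exp[i k0 x + (i/2)(a + i b)(x - x0)^2], b > 0.

    Integrals are approximated by uniform-grid sums over +-10/sqrt(b) about x0;
    the grid spacing cancels in the normalized averages.
    """
    half_width = 10.0 / math.sqrt(b)
    dx = 2.0 * half_width / (n - 1)
    norm = mean_x = mean_x2 = mean_k = mean_k2 = 0.0
    for j in range(n):
        x = x0 - half_width + j * dx
        u = x - x0
        psi = cmath.exp(1j * k0 * x + 0.5 * (1j * a - b) * u * u)
        dpsi = (1j * k0 + (1j * a - b) * u) * psi       # analytic d(psi)/dx
        weight = abs(psi) ** 2
        norm += weight
        mean_x += x * weight
        mean_x2 += x * x * weight
        mean_k += (psi.conjugate() * dpsi / 1j).real    # psi* (1/i) dpsi/dx
        mean_k2 += abs(dpsi) ** 2                       # |k_op psi|^2
    mean_x /= norm
    mean_x2 /= norm
    mean_k /= norm
    mean_k2 /= norm
    sigma_x = math.sqrt(mean_x2 - mean_x ** 2)
    sigma_k = math.sqrt(mean_k2 - mean_k ** 2)
    return sigma_x * sigma_k


print(uncertainty_product(a=0.0, b=1.5))  # pure imaginary c
print(uncertainty_product(a=2.0, b=1.5))  # c with a real part
```

Within the accuracy of the grid sum, the two values agree with sqrt(a^2 + b^2)/(2b), so the product is smallest (1/2) when a = 0 and is independent of b in that case.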
7.4. Cases Where the Uncertainty Principle Does Not Apply
There is actually a limitation on generality of the operators and states for the uncertainty
principle as mentioned in § 7. The vectors A|α〉 and B|α〉 should, in general, still be ones for
which A and B are Hermitian operators for the uncertainty principle to apply.
There are tricky cases (which do turn up) where the uncertainty principle does not apply.
Discussing them actually will get us a bit ahead of the topics in this chapter, but so be it.
Consider a 1-dimensional system of length L with zero potential (or constant potential chosen
to have value zero) containing a particle of mass m. We impose periodic boundary conditions:
i.e.,
ψ(0) = ψ(L) . (42)
These boundary conditions cannot be exactly valid for any Euclidean 1-dimensional space,
but there are cases where they are a valid approximation, as we will discuss in the chapter Solids:
one can approximate a finite crystal using periodic boundary conditions. In this case, the
normalized solutions of the time-independent Schrodinger equation are
$$
\psi_k(x) = \frac{e^{ikx}}{\sqrt{L}} \,,
\tag{43}
$$
which are evidently both energy and wavenumber eigenstates, just as for the free particle case
briefly discussed in § 6.3. Quantization is imposed on the eigenstates by the need to meet the
boundary conditions:
$$
kL = 2\pi n \quad\text{or}\quad k = \frac{2\pi}{L}\,n \,,
\tag{44}
$$
where n must be a general integer. The integer n is a quantum number in the jargon of quantum
mechanics: a dimensionless number (not always an integer) that indexes the eigenvalues
of an observable: the eigenvalues are functions of the quantum numbers. The eigen-energies
are also quantized, of course:
$$
E = \frac{\hbar^2k^2}{2m} = \frac{\hbar^2}{2m}\left(\frac{2\pi}{L}\right)^2 n^2 \,.
\tag{45}
$$
We can see the energy values have a double-degeneracy: each energy (except the one for n = 0)
is shared by two wavenumber states (those for n and −n).
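The quantization and degeneracy just described can be tabulated with a short sketch. The helper names and the choice of units (ħ = 1, L = 1, m = 1) are assumptions made only for the illustration.

```python
import math
from collections import defaultdict

HBAR = 1.0  # units with hbar = 1 (an assumption for the illustration)

def wavenumber(n, length):
    """Allowed wavenumber k = 2*pi*n/L from the periodic boundary conditions."""
    return 2.0 * math.pi * n / length

def energy(n, length, mass):
    """Eigen-energy E = hbar^2 k^2 / (2m) for quantum number n."""
    return (HBAR * wavenumber(n, length)) ** 2 / (2.0 * mass)

def degeneracies(n_max, length=1.0, mass=1.0):
    """Group the quantum numbers -n_max..n_max by their energy value."""
    levels = defaultdict(list)
    for n in range(-n_max, n_max + 1):
        levels[round(energy(n, length, mass), 12)].append(n)
    return dict(levels)

for e_value, ns in sorted(degeneracies(3).items()):
    print(e_value, ns)
```

Each printed energy lists the quantum numbers that share it: the pair n and −n for every level except n = 0, which stands alone.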
The wavenumber states by quantum mechanics axiom (§ 4) should form a basis for the
system space [0, L]. In this case, the Sturm-Liouville theory of mathematics (e.g., Arfken 1970,
p. 442ff) verifies that the wavenumber states are a complete set (i.e., are a basis) in a Sturm-
Liouville theory sense. This means any piecewise continuous function in the interval [0, L]
can be expanded in the basis with the squared discrepancy between function and expansion
integrated over the interval vanishing. This means there are Hilbert space vectors that cannot
be physical states. Recall we require wave functions and their 1st derivatives to be continuous
(§ 6.2). This rules out vectors that are only piecewise continuous. We also rule out vectors that
do not satisfy the boundary conditions: they are discontinuous at the boundary.
The wavenumber basis is, in fact, the exponential Fourier series scaled to the interval [0, L]
(e.g., Arfken 1970, p. 643).
If the particle is in a wavenumber eigenstate, the standard deviation of the wavenumber
observable is $\sigma_k = 0$. The standard deviation of the position observable $x_{\rm op} = x$ is finite
since the whole system only extends from x = 0 to x = L. So we have
$$
\sigma_x\sigma_k = 0 < \frac{1}{2}
\tag{46}
$$
in apparent violation of the Heisenberg uncertainty principle (§ 7.2).
The paradox of apparent violation is resolved by noting that
$$
x\psi_k(x) = \frac{x\,e^{ikx}}{\sqrt{L}}
\tag{47}
$$
is not a vector for which kop is a Hermitian operator. Behold:
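$$
\begin{aligned}
\langle\psi_{k'}|k_{\rm op}|x\psi_k\rangle
&= \int_0^L \frac{e^{-ik'x}}{\sqrt{L}}\,k_{\rm op}\left(\frac{x\,e^{ikx}}{\sqrt{L}}\right)dx \\
&= \left.\frac{e^{-ik'x}}{\sqrt{L}}\,\frac{1}{i}\left(\frac{x\,e^{ikx}}{\sqrt{L}}\right)\right|_0^L
 - \int_0^L \frac{x\,e^{ikx}}{\sqrt{L}}\,\frac{1}{i}\frac{\partial}{\partial x}\left(\frac{e^{-ik'x}}{\sqrt{L}}\right)dx \\
&= -i + \langle x\psi_k|k_{\rm op}|\psi_{k'}\rangle^* \,.
\end{aligned}
\tag{48}
$$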
The −i term in the last expression shows that $k_{\rm op}$ is not Hermitian (i.e., $k_{\rm op} \neq k_{\rm op}^\dagger$) for the
vector of equation (47). For any vector matching the periodic boundary conditions, $k_{\rm op}$ is
Hermitian.
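The −i boundary term can be checked numerically. The sketch below assumes $k_{\rm op} = (1/i)\,\partial/\partial x$ acting on the periodic-box states of equation (43) and on the vector of equation (47); the derivatives are taken analytically and the integrals are done with Simpson's rule. The function names and the box length are assumptions for the illustration.

```python
import cmath
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule for a complex-valued integrand (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def hermiticity_defect(n, n_prime, length=2.0):
    """<psi_k'|k_op|x psi_k> minus the complex conjugate of <x psi_k|k_op|psi_k'>."""
    k = 2.0 * math.pi * n / length
    kp = 2.0 * math.pi * n_prime / length

    def psi(x, q):
        return cmath.exp(1j * q * x) / math.sqrt(length)

    def k_on_xpsi(x):
        # (1/i) d/dx [x psi_k] = (k x - i) psi_k
        return (k * x - 1j) * psi(x, k)

    def k_on_psip(x):
        # (1/i) d/dx [psi_k'] = k' psi_k'
        return kp * psi(x, kp)

    left = simpson(lambda x: psi(x, kp).conjugate() * k_on_xpsi(x), 0.0, length)
    right = simpson(lambda x: (x * psi(x, k)).conjugate() * k_on_psip(x), 0.0, length)
    return left - right.conjugate()

print(hermiticity_defect(1, 1), hermiticity_defect(2, -1))
```

For a Hermitian action the defect would vanish; here it should come out close to −i for any pair of allowed wavenumbers.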
In the derivation of the uncertainty principle, we relied on each observable being Hermitian
when acting on the vector formed from the other observable and the general state of the derivation.
Without this condition (which we do not have for $x\psi_k(x)$), the uncertainty principle does
not apply. If we look at the intermediate equation (18) in the proof of the uncertainty
principle, before making use of the general Hermiticity of operators A and B, we have the
inequality
$$
\sigma_A^2\sigma_B^2 \geq |\langle A^\dagger B\rangle - \langle A\rangle\langle B\rangle|^2 \,.
\tag{49}
$$
Choosing $A = x_{\rm op}$ (which is always Hermitian), $B = k_{\rm op}$, and $\psi_k$ as our general state, we
find
$$
\sigma_A^2\sigma_B^2 \geq \left|\frac{L}{2}k - \frac{L}{2}k\right|^2 = 0
\tag{50}
$$
which is exactly correct for our periodic boundary condition case.
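The numbers entering equations (46) and (50) can be reproduced with a short sketch. It uses the fact that $|\psi_k|^2 = 1/L$ is uniform on [0, L] and that $\psi_k$ is an eigenstate of the wavenumber operator; the function name, the box length, and the sample count are assumptions for the illustration.

```python
import math

def box_statistics(n, length=2.0, samples=20000):
    """Return (sigma_x, sigma_k, bound) for psi_k = exp(ikx)/sqrt(L) on [0, L]."""
    k = 2.0 * math.pi * n / length
    dx = length / samples
    xs = [(i + 0.5) * dx for i in range(samples)]   # midpoint grid
    density = 1.0 / length                          # |psi_k|^2 is uniform
    mean_x = sum(x * density for x in xs) * dx
    var_x = sum((x - mean_x) ** 2 * density for x in xs) * dx
    mean_k, var_k = k, 0.0                          # psi_k is a k_op eigenstate
    mean_xk = sum(x * k * density for x in xs) * dx  # <x k_op> = k <x>
    bound = abs(mean_xk - mean_x * mean_k) ** 2     # right-hand side of the inequality
    return math.sqrt(var_x), math.sqrt(var_k), bound

sigma_x, sigma_k, bound = box_statistics(3)
print(sigma_x, sigma_k, bound)
```

The position spread is finite (it equals L/√12 for a uniform density), the wavenumber spread is zero, and the lower bound is zero as well, so the inequality before the use of Hermiticity is satisfied.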
As mentioned above, periodic boundary conditions cannot be exactly valid for any Euclidean
1-dimensional space. However, they are exact for the azimuthal angle space (the interval
[0, 2π]) of spherical polar coordinates, which must have periodic boundary conditions. The position
operator for this space is $\phi_{\rm op} = \phi$ (the azimuthal angle itself) and the z component angular
momentum observable (which determines the z component angular momentum eigenstates)
is
$$
L_z = \frac{\hbar}{i}\frac{\partial}{\partial\phi}
\tag{51}
$$
(see the chapter Angular Momentum). This angular case is exactly like the periodic boundary
case we have just analyzed, mutatis mutandis, in regard to the uncertainty principle. So the
uncertainty principle does not apply to it, and the fact that $\phi_{\rm op}$ and $L_z$ do not commute does
not require the product $\sigma_\phi\sigma_{L_z}$ to be greater than zero: it can be zero.
8. EHRENFEST’S THEOREM
Ehrenfest’s theorem (e.g., Wikipedia: Ehrenfest theorem; Griffiths 2005, p. 115) is one
of the most basic results of quantum mechanics. The theorem is
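$$
\frac{d\langle A\rangle}{dt} = \frac{1}{i\hbar}\langle\Psi|[A,H]|\Psi\rangle
+ \left\langle\frac{\partial A}{\partial t}\right\rangle \,,
\tag{52}
$$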
where A is a very general operator, H is the Hamiltonian, [A,H] is the commutator of A and
H, Ψ is a general state, and ∂A/∂t is the partial time derivative of operator A, which is
zero in many cases. We require of A only that 〈A〉 exist (i.e., not diverge). It does not have
to be an observable. However, the physical interpretation of 〈A〉 and d〈A〉/dt if A is not an
observable has to be elucidated. If A is an observable, then [A,H] is not an observable, but
(1/i)[A,H] is (see Appendix ??).
Ehrenfest’s theorem gives the time derivative of the expectation value of A. If d〈A〉/dt
is zero, A is called a constant of the motion (e.g., Cohen-Tannoudji et al. 1977, p. 247)
and expectation value 〈A〉 is conserved. We see that if [A,H] = 0 (i.e., A and H commute) and
∂A/∂t = 0 (i.e., A has no explicit time dependence), then A is a constant of the motion. It
seems odd to call an operator a constant of the motion, but that is the accepted jargon. One
can probably call 〈A〉 a constant of the motion too if A is a constant of the motion.
An immediate basic result of Ehrenfest’s theorem is that if H is time-independent, then
H is a constant of the motion and 〈H〉 is conserved. This is because all operators commute
with themselves. Similarly, f(H) is a constant of the motion where f is any power-series
expansion in its argument.
Ehrenfest’s theorem is derived straight from Schrodinger’s equation. From it many other
basic and interesting results can be easily derived. Here we will derive Ehrenfest’s theorem
and some of these other basic results.
8.1. Derivation of Ehrenfest’s Theorem
Consider a very general operator A. We require of A only that
〈A〉 = 〈Ψ|A|Ψ〉 (53)
exist (i.e., not diverge to infinity) for a general state |Ψ〉.
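Now
$$
\begin{aligned}
\frac{d\langle A\rangle}{dt} = \frac{d\langle\Psi|A|\Psi\rangle}{dt}
&= \frac{d\langle\Psi|}{dt}A|\Psi\rangle
 + \langle\Psi|\frac{\partial A}{\partial t}|\Psi\rangle
 + \langle\Psi|A\frac{d|\Psi\rangle}{dt} \\
&= -\frac{1}{i\hbar}\langle\Psi|H^\dagger A|\Psi\rangle
 + \frac{1}{i\hbar}\langle\Psi|AH|\Psi\rangle
 + \langle\Psi|\frac{\partial A}{\partial t}|\Psi\rangle \\
&= \frac{1}{i\hbar}\langle\Psi|[A,H]|\Psi\rangle
 + \left\langle\frac{\partial A}{\partial t}\right\rangle \,,
\end{aligned}
\tag{54}
$$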
where we make use of the product rule, Schrodinger’s equation and its Hermitian conjugate,
$$
i\hbar\frac{d|\Psi\rangle}{dt} = H|\Psi\rangle \quad\text{and}\quad
-i\hbar\frac{d\langle\Psi|}{dt} = \langle\Psi|H^\dagger = \langle\Psi|H \,,
\tag{55}
$$
and results from Appendix ??. Note ∂A/∂t = dA/dt since we can assume operators have no
implicit time dependence as a usual rule.
Equation (54) is the complete proof.
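Ehrenfest's theorem can be checked numerically on the simplest possible case. The sketch below assumes a two-level system with a diagonal, time-independent Hamiltonian, a time-independent operator A, and units with ħ = 1; all names and numerical values are assumptions for the illustration. It compares a finite-difference estimate of d〈A〉/dt with the commutator expression.

```python
import cmath

HBAR = 1.0  # natural units, an assumption made for the illustration

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def expectation(state, op):
    """<Psi|op|Psi> for a state given as a list of complex amplitudes."""
    return sum(state[i].conjugate() * op[i][j] * state[j]
               for i in range(len(state)) for j in range(len(state)))

def evolve(c0, energies, t):
    """Exact time evolution for a diagonal, time-independent Hamiltonian."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(c0, energies)]

def ehrenfest_sides(t, dt=1e-5):
    """Return (finite-difference d<A>/dt, commutator expression) at time t."""
    energies = [0.5, 2.0]
    H = [[energies[0], 0.0], [0.0, energies[1]]]
    A = [[0.0, 1.0], [1.0, 0.0]]       # time-independent, so dA/dt contributes zero
    c0 = [0.6 + 0j, 0.8j]              # normalized initial amplitudes
    psi = evolve(c0, energies, t)
    AH, HA = matmul(A, H), matmul(H, A)
    comm = [[AH[i][j] - HA[i][j] for j in range(2)] for i in range(2)]
    rhs = expectation(psi, comm) / (1j * HBAR)
    lhs = (expectation(evolve(c0, energies, t + dt), A)
           - expectation(evolve(c0, energies, t - dt), A)) / (2.0 * dt)
    return lhs, rhs

print(ehrenfest_sides(0.7))
```

The two numbers should agree to within the finite-difference error, and the commutator expression should be real, as expected when A is an observable.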
9. WAVE FUNCTION COLLAPSE AND BORN’S RULE
Wave function collapse is the most mysterious and controversial aspect of quantum
mechanics and has been since the beginning of quantum mechanics in 1925–1926. It is
sometimes referred to as the wave function reduction or fundamental perturbation (e.g., Cohen-
Tannoudji et al. 1977, p. 220, 226). These terms seem like euphemisms to me. Older textbooks
often seem not to mention wave function collapse at all or refer to it only fleetingly. Let’s elucidate
wave function collapse insofar as we can.
First, we note that Schrodinger’s equation is completely deterministic. Given the initial
conditions and the time evolution of any external potential, the future evolution and the
past evolution too are completely determined.
However, when an ideal “measurement” of a dynamical variable is made, the wave
function collapses and
For the position representation, the probability density is the magnitude squared of the wave
function, $|\Psi|^2$.
Decoherence theory dispenses with this axiom by showing how wave function
collapse is, in fact, a decoherence event or effective wave function collapse that
follows from quantum mechanics based on the first three axioms. If decoherence
theory is accepted, then the wave function collapse axiom becomes a faux axiom that
is just a highly useful rule for predicting the outcomes of measurements.
SQM is neutral about decoherence theory. It is a theory which has gained
traction since its early formulation in the 1970s and 1980s (e.g., Zurek 2003). But
it is not universally accepted (e.g., Schlosshauer et al. 2013). In fact, decoherence
with effective wave function collapse and
It is still a possibility that there are true wave function collapses. Appendix C.
Decoherence theory dispenses with this axiom too by proving how Born’s
rule follows from quantum mechanics based on the first three axioms. SQM accepts
decoherence theory, and so for us the axiom is a faux axiom. However, it is a
traditional quantum mechanics axiom and it remains highly useful in practical
applications of quantum mechanics. And so we state it here.
In Appendix B, we give a proof of Born’s rule in a special case for pedagogical
reasons.
10. ENTANGLEMENT
11. FOUNDATIONAL ISSUES
Quantum mechanics has been riven by foundational issues ever since its formulation in
1925–1926. We have alluded to the issues in earlier sections. These issues are not at all resolved
despite decades of experiment, application, theorizing, and argument (e.g., Schlosshauer et
al. 2013; Norsen & Nelson 2013).
Let’s summarize and discuss the main ones now insofar as yours truly understands them.
1. Is quantum mechanics ontic or epistemic, in the jargon that is currently in vogue (e.g.,
Schlosshauer et al. 2013; Norsen & Nelson 2013)?
“Ontic” means quantum mechanics deals with real objects, in particular that the
wave function or state is a real thing. “Epistemic” on the other hand means that quantum
mechanics is just informational like a probability distribution (e.g., for a coin toss).
If quantum mechanics is epistemic, then necessarily there is a deeper theory that
is ontic. On the other hand, if quantum mechanics is ontic, there may or may not be
a deeper theory.
ACKNOWLEDGMENTS
Support for this work has been provided by the Department of Physics of the University
of Evansville.
A. GRAM-SCHMIDT PROCESS
The Gram-Schmidt process (e.g., Wikipedia: Gram-Schmidt process) is used for creating
a set of orthonormal vectors out of a set of non-orthonormal vectors: i.e., orthonormalizing
a set of basis vectors in shorthand. The process is actually tedious, particularly for a space of
infinite dimension. Fortunately, one seldom has to do it, useful orthonormal sets often being
obtained by other means: e.g., by solution to eigenvalue problems for Hermitian operators (§ 4).
Here we describe the Gram-Schmidt process for completeness and for the insight the description gives.
Say we have a non-orthonormal set of independent vectors {|φi〉} (where the curly
brackets mean set of vectors and the i is an index for the set). The indices run 1, 2, 3, etc.
We assume that the dimension of the space is finite or a countable infinity. A space with a
dimension of uncountable infinity is, in general, beyond the knowledge of yours truly. How
the uncountably infinite position and wavenumber spaces are treated in quantum mechanics
is described in § 5.
We will construct the orthonormal set in order of index i. We assert the ith orthonormalized
vector is given by
$$
|\hat\phi_i\rangle = \frac{|\phi_i\rangle - \sum_{j=1}^{i-1}|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle}
{\text{normalization factor}} \,,
\tag{A1}
$$
where we use $|\hat\phi_i\rangle$ for the normalized replacement for $|\phi_i\rangle$, recall that
$\||\alpha\rangle\| = \sqrt{\langle\alpha|\alpha\rangle}$ is the norm of |α〉, and for i = 1 the summation
is assigned the value zero. This orthonormalized
vector is orthogonal to all processed vectors of index less than i. The normalization is proven
by inspection. The proof of orthogonality is by induction.
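1. We first prove that the subset $\{|\hat\phi_j\rangle\}_2$ (the subset of processed vectors up to index 2)
is orthonormal. First note that
$$
|\hat\phi_1\rangle = \frac{|\phi_1\rangle}{\||\phi_1\rangle\|} \,,\qquad
|\hat\phi_2\rangle = \frac{|\phi_2\rangle - |\hat\phi_1\rangle\langle\hat\phi_1|\phi_2\rangle}
{\text{normalization factor}} \,.
\tag{A2}
$$
Taking the inner product of these two vectors gives
$$
\langle\hat\phi_1|\hat\phi_2\rangle
= \frac{\langle\hat\phi_1|\phi_2\rangle - \langle\hat\phi_1|\hat\phi_1\rangle\langle\hat\phi_1|\phi_2\rangle}
{\text{normalization factor}}
= \frac{\langle\hat\phi_1|\phi_2\rangle - \langle\hat\phi_1|\phi_2\rangle}{\text{normalization factor}}
= 0 \,.
\tag{A3}
$$
Thus, the subset is orthonormal.
2. We assume the subset $\{|\hat\phi_j\rangle\}_{i-1}$ (the subset of processed vectors up to index i − 1) is
an orthonormal subset.
3. Now we prove that $|\hat\phi_i\rangle$ is orthogonal to all of subset $\{|\hat\phi_j\rangle\}_{i-1}$. For k < i, we find
$$
\begin{aligned}
\langle\hat\phi_k|\hat\phi_i\rangle
&= \frac{\langle\hat\phi_k|\phi_i\rangle - \sum_{j=1}^{i-1}\langle\hat\phi_k|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle}
{\text{normalization factor}}
= \frac{\langle\hat\phi_k|\phi_i\rangle - \sum_{j=1}^{i-1}\delta_{kj}\langle\hat\phi_j|\phi_i\rangle}
{\text{normalization factor}} \\
&= \frac{\langle\hat\phi_k|\phi_i\rangle - \langle\hat\phi_k|\phi_i\rangle}{\text{normalization factor}} = 0 \,.
\end{aligned}
\tag{A4}
$$
Thus, the subset $\{|\hat\phi_j\rangle\}_i$ is an orthonormal subset.
4. The proof by induction is complete, since if $\{|\hat\phi_j\rangle\}_2$ is orthonormal (and it is), so is
$\{|\hat\phi_j\rangle\}_3$ and then so is $\{|\hat\phi_j\rangle\}_4$, and so on.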
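The construction proven above can be sketched in a few lines for finite-dimensional complex vectors. The function names and the three sample vectors are assumptions for the illustration; the projection loop subtracts the components along the already processed vectors exactly as in equation (A1).

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors in index order."""
    ortho = []
    for v in vectors:
        w = list(v)
        for e in ortho:
            coeff = inner(e, v)                    # component along a processed vector
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(w, w).real)         # the normalization factor
        if norm < 1e-12:
            raise ValueError("vectors are not linearly independent")
        ortho.append([wi / norm for wi in w])
    return ortho

# Three independent, non-orthogonal vectors in a 3-dimensional complex space.
phis = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
basis = gram_schmidt(phis)
for i in range(3):
    print([round(abs(inner(basis[i], basis[j])), 12) for j in range(3)])
```

The printed table of inner-product magnitudes should be the unit matrix, which is the orthonormality established by the induction.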
There is a continuum infinity of sets of orthonormal vectors in general. This is proven
just by noting that the 3-dimensional Cartesian unit vectors of 3-dimensional Euclidean
space have a continuum infinity of possible orientations. So it is not surprising that the
Gram-Schmidt process does not give a unique orthonormal set. In fact, the orthonormal set
obtained depends on the order of the states taken in the process in general. For a partial proof,
imagine that the initial set {|φi〉} contains no orthogonal pairs of vectors at all. Whichever
vector you start with becomes the first member of the orthonormalized set (aside from being
normalized if it is not so already). All the other members of the orthonormalized set are not
members of the original set since they are orthogonal to the first member. Thus, whatever
vector you start with gives a different orthonormal set containing that original set member
and no other original vectors. Each orthonormal set is unique.
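The order dependence in the partial proof can be seen in the smallest possible example: two non-orthogonal real vectors in a plane, processed in both orders. The function name and the two sample vectors are assumptions for the illustration.

```python
import math

def gram_schmidt_real(vectors):
    """Gram-Schmidt for real vectors, processed in the order given."""
    ortho = []
    for v in vectors:
        w = list(v)
        for e in ortho:
            coeff = sum(ei * vi for ei, vi in zip(e, v))
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        ortho.append([wi / norm for wi in w])
    return ortho

a, b = [1.0, 0.0], [1.0, 1.0]            # independent but not orthogonal
set_ab = gram_schmidt_real([a, b])       # start with a
set_ba = gram_schmidt_real([b, a])       # start with b
print(set_ab)
print(set_ba)
```

Starting with a gives the Cartesian unit vectors, while starting with b gives a set rotated by 45 degrees; each set contains only its own starting vector (normalized) from the original pair.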
What about a general proof of order dependence for the orthonormal set obtained? It
is a bit tedious, but for reference and pedagogical reasons, we give it in § A.1 just below.
A.1. General Proof
Here we give a general proof of the order dependence for the orthonormal set obtained
from the Gram-Schmidt process. Say that you start selecting vectors for two Gram-Schmidt
processes identically up to the ith vector. You now make a different choice for the ith vector
and afterward make general choices for both processes. The two ith vectors are
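$$
|\hat\phi_i\rangle = \frac{|\phi_i\rangle - \sum_{j=1}^{i-1}|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle}
{\text{normalization factor}}
\tag{A5}
$$
$$
|\hat\phi'_i\rangle = \frac{|\phi'_i\rangle - \sum_{j=1}^{i-1}|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle}
{\text{normalization factor}}
\tag{A6}
$$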
where we use primes to label the second process vectors. Note that the subsets $\{|\hat\phi_j\rangle\}_{i-1}$ and
$\{|\hat\phi'_j\rangle\}_{i-1}$ are identical, and so we suppress the primes on the latter. Since $|\phi_i\rangle$ and $|\phi'_i\rangle$ are
independent by hypothesis (i.e., not aligned), $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$ must be independent (i.e., not
aligned). Let us take the inner product of $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$, but suppressing the denominators
for clarity:
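$$
\begin{aligned}
\langle\hat\phi_i|\hat\phi'_i\rangle
&\propto \langle\phi_i|\phi'_i\rangle
 - 2\sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle
 + \sum_{j=1}^{i-1}\sum_{k=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\hat\phi_k\rangle\langle\hat\phi_k|\phi'_i\rangle \\
&\propto \langle\phi_i|\phi'_i\rangle
 - 2\sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle
 + \sum_{j=1}^{i-1}\sum_{k=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\,\delta_{jk}\,\langle\hat\phi_k|\phi'_i\rangle \\
&\propto \langle\phi_i|\phi'_i\rangle
 - 2\sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle
 + \sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle \\
&\propto \langle\phi_i|\phi'_i\rangle
 - \sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle \\
&\propto \sum_{j=i}^{N}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle \,,
\end{aligned}
\tag{A7}
$$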
where N is the total number of independent vectors and for the last step we have used
the fact that $\{|\hat\phi_j\rangle\}$ is a complete orthonormal set. By the hypothesis that $|\phi_i\rangle$ and $|\phi'_i\rangle$ are
independent, we must have i < N.
In special cases, either or both of $|\phi_i\rangle$ and $|\phi'_i\rangle$ can be orthogonal to all vectors
in the orthonormal subset $\{|\hat\phi_j\rangle\}_{i-1}$ and be non-orthogonal relative to each other. In these
cases, the members of the set $\{|\hat\phi_j\rangle\}$ from index i on form a complete set as far as either or
both of $|\phi_i\rangle$ and $|\phi'_i\rangle$ are concerned. Thus, in our special cases, we can have either or both
of the expansions
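$$
|\phi_i\rangle = \sum_{j=i}^{N}|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle
\tag{A8}
$$
$$
|\phi'_i\rangle = \sum_{j=i}^{N}|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle
\tag{A9}
$$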
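And thus, in our special cases,
$$
\sum_{j=i}^{N}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle = \langle\phi_i|\phi'_i\rangle \neq 0 \,.
\tag{A10}
$$
And finally thus, in our special cases,
$$
\langle\hat\phi_i|\hat\phi'_i\rangle \neq 0 \,,
\tag{A11}
$$
or in other words $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$ are not orthogonal.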
Now if the two Gram-Schmidt processes yield identical orthonormal sets and $|\hat\phi_i\rangle$ and
$|\hat\phi'_i\rangle$ are independent, they must be orthogonal. But we have just shown that there are
special cases where they are not orthogonal. So after mucho labor, we have proven that the
orthonormal set obtained depends in general (though not in every special case) on the order of
selection, no matter how far one goes in two Gram-Schmidt processes selecting the same states
for orthonormalization before branching into different selection paths.
B. ENVARIANCE AND BORN’S RULE
Born’s rule can be derived in decoherence theory using envariance (e.g., Zurek 2005).
As stated in § 9, decoherence theory is widely favored, but not universally accepted.
The derivation of Born’s rule is not, however, dependent on the full decoherence theory and its
validity must be judged separately. Yours truly is not able to give an authoritative judgment
and will just rely on the authority of Zurek (2005). However, yours truly thinks it important
to give a special case derivation of Born’s rule (for an expansion in two eigenstates with
equal amplitudes) to give the reader some insight into an important element of decoherence
theory. The derivation follows those of Zurek (2005, 2009). It is somewhat belabored for
pedagogical reasons.
Envariance stands for environment induced invariance. Envariance is not a new axiom. It
is a special symmetry that leads to Born’s rule given the first 4 axioms of quantum mechanics
(§ 1).
Consider the following normalized state:
$$
|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|a\rangle_S|b\rangle_E + |c\rangle_S|d\rangle_E\right) \,,
\tag{B1}
$$
where S stands for system, E stands for the environment the system is embedded in, $|a\rangle_S$ and
$|c\rangle_S$ are distinct orthonormal eigenstates of the system for a system observable $Q_S$, $|b\rangle_E$ and
$|d\rangle_E$ are distinct orthonormal eigenstates of the environment for some environment observable $Q_E$,
and any phase factors have been absorbed into the system or environment eigenstates. For
pedagogical reasons, we will verify that the overall state is normalized. Behold:
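$$
\begin{aligned}
\langle\Psi|\Psi\rangle
&= \frac{1}{2}\left(\langle a|a\rangle_S\langle b|b\rangle_E + \langle a|c\rangle_S\langle b|d\rangle_E
 + \langle c|a\rangle_S\langle d|b\rangle_E + \langle c|c\rangle_S\langle d|d\rangle_E\right) \\
&= \frac{1}{2}\left(1\times 1 + 0 + 0 + 1\times 1\right) = 1 \,,
\end{aligned}
\tag{B2}
$$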
where we have used the orthonormality of the eigenstates.
Now $|a\rangle_S|b\rangle_E$ and $|c\rangle_S|d\rangle_E$ are both eigenstates of the joint system-environment
system. The overall state is a superposition of these joint eigenstates with equal amplitudes (i.e.,
equal coefficients). By the wave function collapse axiom or alternatively by decoherence
theory, a decoherence event (which could be either an actual observer measurement or a
natural decoherence event) for the system observable collapses the system into either
$|a\rangle_S$ (with probability $p_a$) or $|c\rangle_S$ (with probability $p_c$) with no superposition. Similarly, a
decoherence event for the environment observable collapses the environment into either $|b\rangle_E$ (with
probability $p_b$) or $|d\rangle_E$ (with probability $p_d$) with no superposition.
Logically, $p_a + p_c = 1$ and $p_b + p_d = 1$. It seems reasonable to believe that $p_a = p_c = 1/2$
and $p_b = p_d = 1/2$. But that is precisely what we are going to prove. First, we assert that
$p_a = p_b$ and $p_c = p_d$. This product-state probability rule seems reasonable since the equalities
certainly hold (and the probabilities are equal to 1) when the overall system is in either
$|a\rangle_S|b\rangle_E$ or $|c\rangle_S|d\rangle_E$. We leave the full justification to Zurek (2005, 2009).
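Second, we now posit a unitary swap transformation on the system:
$$
U_S = |c\rangle_S\langle a|_S + |a\rangle_S\langle c|_S \,.
\tag{B3}
$$
We can prove this transformation is unitary. Note
$$
\begin{aligned}
U_SU_S &= |c\rangle_S\langle a|c\rangle_S\langle a|_S + |c\rangle_S\langle a|a\rangle_S\langle c|_S
 + |a\rangle_S\langle c|c\rangle_S\langle a|_S + |a\rangle_S\langle c|a\rangle_S\langle c|_S \\
&= |c\rangle_S\,\delta_{ac}\,\langle a|_S + |c\rangle_S\langle c|_S + |a\rangle_S\langle a|_S + |a\rangle_S\,\delta_{ac}\,\langle c|_S \\
&= |c\rangle_S\langle c|_S + |a\rangle_S\langle a|_S = 1_{\rm op} \,,
\end{aligned}
\tag{B4}
$$
where we have used orthonormality for distinct states and where the penultimate expression is
clearly the unit operator $1_{\rm op}$. Clearly $U_S = U_S^{-1}$: i.e., $U_S$ is its own inverse. Now consider
general states |α〉, |β〉, |i〉 and |j〉. We form the operator
$$
Q = |i\rangle\langle j| \,.
\tag{B5}
$$
Now
$$
\langle\alpha|Q^\dagger|\beta\rangle = \langle\beta|Q|\alpha\rangle^*
= (\langle\beta|i\rangle\langle j|\alpha\rangle)^*
= \langle\beta|i\rangle^*\langle j|\alpha\rangle^*
= \langle i|\beta\rangle\langle\alpha|j\rangle
= \langle\alpha|j\rangle\langle i|\beta\rangle \,,
\tag{B6}
$$
and thus
$$
Q^\dagger = |j\rangle\langle i| \,.
\tag{B7}
$$
Now since Hermitian conjugation distributes over addition (§ 4), $U_S^\dagger = U_S = U_S^{-1}$, and thus
$U_S^\dagger = U_S^{-1}$, which means $U_S$ is unitary. Since the swap $U_S$ is unitary, one can always imagine that
there is some physical process that could actually bring the swap about.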
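The unitarity argument can be checked with explicit 2 × 2 matrices. The sketch below assumes the system space is spanned by the two orthonormal states |a〉 and |c〉, represented as the column vectors (1, 0) and (0, 1); the helper names are assumptions for the illustration.

```python
def outer(ket, bra):
    """Matrix of |ket><bra|; the entries of the bra vector are conjugated."""
    return [[k * b.conjugate() for b in bra] for k in ket]

def add(m1, m2):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(m):
    """Hermitian conjugate (conjugate transpose)."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

ket_a = [1 + 0j, 0j]
ket_c = [0j, 1 + 0j]
U_S = add(outer(ket_c, ket_a), outer(ket_a, ket_c))   # the swap of equation (B3)
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

print(matmul(U_S, U_S) == identity)          # the swap is its own inverse
print(dagger(U_S) == U_S)                    # the swap is Hermitian
print(matmul(dagger(U_S), U_S) == identity)  # hence it is unitary
```

All three comparisons should print True, matching the three steps of the argument: self-inverse, self-adjoint, and therefore unitary.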
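Third, applying $U_S$ to |Ψ〉 gives
$$
U_S|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|c\rangle_S|b\rangle_E + |a\rangle_S|d\rangle_E\right) \,.
\tag{B8}
$$
This state is clearly different from |Ψ〉. A joint system-environment collapse of |Ψ〉 gives
$|a\rangle_S|b\rangle_E$ or $|c\rangle_S|d\rangle_E$, whereas a joint system-environment collapse of $U_S|\Psi\rangle$ gives
$|c\rangle_S|b\rangle_E$ or $|a\rangle_S|d\rangle_E$.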
Fourth, we will use primes to indicate the probabilities of collapses to states after swaps.
Now making use of the product-state probability rule, we find that
$$
p'_c = p_b = p_a \quad\text{and}\quad p'_a = p_d = p_c \,.
\tag{B9}
$$
Fifth, we posit a unitary swap transformation on the environment:
$$
U_E = |d\rangle_E\langle b|_E + |b\rangle_E\langle d|_E \,.
\tag{B10}
$$
The proof that the swap $U_E$ is unitary is the same as for $U_S$, mutatis mutandis.
Sixth, applying $U_E$ to $U_S|\Psi\rangle$ gives
$$
U_EU_S|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|a\rangle_S|b\rangle_E + |c\rangle_S|d\rangle_E\right) = |\Psi\rangle
\tag{B11}
$$
which is the original state. Now making use of the product-state probability rule again, we
find that
$$
p'_b = p'_a = p_d = p_c \quad\text{and}\quad p'_d = p'_c = p_b = p_a \,.
\tag{B12}
$$
But since the state is the original state, $p'_b = p_b = p_a$ and $p'_d = p_d = p_c$. Now we have
$p_a = p_c$. Making use of the product-state probability rule and the fact that the sum of
probabilities must be 1, we finally obtain
$$
p_a = p_c = \frac{1}{2} \quad\text{and}\quad p_b = p_d = \frac{1}{2} \,.
\tag{B13}
$$
This last result is just what Born’s rule predicts: the probability of collapse is equal to
the squared magnitude of the amplitude coefficients, both of which are $1/\sqrt{2}$. Zurek (2005)
generalizes the proof for a general number of terms in the expansion and general amplitudes.
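The two swaps of the derivation can be followed with a small bookkeeping sketch. It assumes the joint state is stored as a dictionary from (system label, environment label) pairs to amplitudes, and that the system and environment spaces are spanned by the two labels being swapped, so each swap acts as a relabeling; the function name and this representation are assumptions for the illustration.

```python
import math

def swap_labels(state, position, x, y):
    """Apply the swap |x><y| + |y><x| to the system (0) or environment (1) label."""
    swapped = {}
    for labels, amplitude in state.items():
        labels = list(labels)
        if labels[position] == x:
            labels[position] = y
        elif labels[position] == y:
            labels[position] = x
        swapped[tuple(labels)] = amplitude
    return swapped

amp = 1.0 / math.sqrt(2.0)
psi = {("a", "b"): amp, ("c", "d"): amp}                          # the state of (B1)
after_system_swap = swap_labels(psi, 0, "a", "c")                 # the state of (B8)
after_both_swaps = swap_labels(after_system_swap, 1, "b", "d")    # the state of (B11)
born_probabilities = {labels: abs(a) ** 2 for labels, a in psi.items()}

print(after_system_swap)
print(after_both_swaps == psi)
print(born_probabilities)
```

The system swap alone changes the state, the environment swap undoes the change, and the squared magnitudes of the two equal amplitudes are each 1/2 (to rounding), in agreement with equation (B13).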
It is certainly remarkable that Born’s rule becomes a result of quantum mechanics in
decoherence theory instead of an axiom as in conventional quantum mechanics. It is hard to
imagine that a theory of true wave function collapse that included Born’s rule as a result
or an axiom could exist. Nature would have two mechanisms for producing Born’s rule,
which seems to be an unlikely coincidence: it is extremely unparsimonious. Unless, of course,
Born’s rule is some common emergent property itself.
C. DETERMINISM, PROBABILISM, AND FREE WILL
In this appendix, I will take up the subjects of determinism, probabilism, and free will—
all without much awareness of the enormous literature on these subjects. My opinions and
conjectures may not be worth much, but they are what I have at the present.
In pure decoherence theory, there is no wave function collapse in a system (e.g., Paz
& Zurek 2001; Zurek 2003). Einselection extremely rapidly damps out all but the pointer
states. The pointer states then evolve effectively independently of each other: they are the
initial conditions for the rest of the evolution of the system and the universe insofar as it
is affected by the system. The upshot is that the universe is constantly bifurcating into
independent or uncorrelated paths. One gets a quasi-infinity of parallel worlds: the many
worlds interpretation of quantum mechanics (e.g., Zurek 2003, p. 5) is actually correct.
Furthermore, the evolution of the universe of parallel worlds is completely deterministic
just as quantum mechanics without the wave function collapse axiom predicts. This is as
it should be, since decoherence theory is an emergent theory based on quantum mechanics
without the wave function collapse axiom. In this case, it seems if one knew exactly all the
physical conditions of the universe at any one instant, the past and future would be totally
predictable just as Laplace thought should be the case for classical mechanics (e.g., Wikipedia:
Laplace’s demon). The universe obeys determinism.
However, there are many who cannot believe in a quasi-infinity of parallel worlds of which
only one can be detected: the one we are in. It seems to them (maybe to us all) extremely
unparsimonious. So some maintain that true wave function collapse must still happen and
that only one eigenstate or, in the decoherence theory view, one pointer state in any system
eventually leads to the future. So there is only one macroscopic world. Given that decoherence
theory is valid (which is the assumption of SQM), true wave function collapse would follow,
compete, or precede and preclude decoherence. The simplest wave function collapse theory
would be one where the Born rule is fundamental: quantum mechanics is then intrinsically
probabilistic and the universe obeys probabilism. But it seems extremely unparsimonious to
have a true wave function collapse theory that, aside from the true wave function collapse,
mimics and/or complements decoherence theory (assuming its validity as we do). We are
faced with battling unparsimonies: Occam’s razor against itself.
Can the fundamental probabilism of a true wave function collapse theory affect the
macroscopic world? Well obviously. Many experimental results are predicted by Born’s rule,
and so from these at least a microscopic probabilism is amplified to the macroscopic world.
In fact, there must be many amplifications in nature not just in human experiments. It
may be that some of those occur in our brains and lead to intrinsically random thoughts—
Heisenberg, Martin Heisenberg, seems to think so (Heisenberg 2009). The butterfly effect
(e.g., Wikipedia: Butterfly effect) ensures that microscopic intrinsic randomness leaks out
everywhere in the macroscopic universe particularly in biospheres and probably affects even
very large scale evolutions like those of galaxies.
But the trouble with the wave function collapse hypothesis is that no adequate consensus
theory of it exists. As discussed in § 9, the point where one breaks off a quantum mechanical
calculation and uses Born’s rule to calculate the probabilities of experimental outcomes
seems to have been chosen empirically (e.g., Greene 2004, p. 119) or in more recent times
from decoherence theory where no wave function collapse happens and one is just choosing
a good time when the pointer states have come to dominate a system. However, one can say
that if instantaneous wave function collapse is required, it does not seem to be precluded
by quantum mechanics. The non-locality of Bell state interactions of entangled states allows
that. Of course, it seems to be unspeakable as to what frame instantaneous collapse happens
in. Greenstein & Zajonc (1997, p. 184) blandly inform us in a footnote that the question
of frame has been shown not to be a problem. I would guess the collapse is instantaneous
according to cosmic time (e.g., Wikipedia: Cosmic time): the time scale of the expansion of
the universe. What other time would be appropriate for Bell states that are spread over
cosmological distances? We are mixing quantum mechanics and general relativity, but that
has been done before (e.g., for Hawking radiation). If two formally incompatible theories are
both on the right path, then a mixed approach may be valid.
In any case, the supporters of parallel worlds (the determinists) can retort to the sup-
porters of wave function collapse (the probabilists) that parallel worlds are not a theory,
they are a prediction of quantum mechanics—a highly verified theory—and the probabilists
have to invoke an ad hoc hypothesis—but a venerable one—to explain away parallel worlds.
I am neutral in this debate: determinism versus probabilism. I would like to see it
resolved, but I have no favorite. But it is interesting that there may be no way in principle
to tell between the two positions. If a testable theory of wave function collapse arises, that
might tell. But otherwise we might be left in the lurch. If there is in principle no way to tell,
then the world as we know it empirically is indifferent to the two choices. And even if there
is a way, we could imagine a universe in which there is not. The upshot is that the decision
between determinism and probabilism may have no implications for some human concerns
like ethics.
Either way, what about free will? With determinism, every human decision is predestined.
With probabilism, every human decision is a mixture of eternal causes and random
events since forever. I will offer my opinion. A conscious being makes decisions based on
many factors. The result may be purely determined or the mixture, but as we suggested above
that may make no difference to the decision at all. Either way, the conscious being doesn’t
know the outcome of the decision until it is made, or maybe a bit later, since there is evidence
that humans are not aware of having made a decision until a bit later. Making a
decision whose outcome is often unknown to everyone but omniscient beings (and not even
to them in a probabilistic world) is free will as I call it, while acknowledging that others may
have different views.
The process of conscious beings making decisions is very complex and creative. The
complexity is probably essential to the creativity. In particular, random factors going into
the process from inside or outside probably vastly aid creativity. That creativity allows
scientific and artistic leaps. The universe would be very different without conscious-being
decision making (i.e., free will). Oh, the galaxies would look much the same, but those parts
particularly interesting to us would be very different and very impoverished. The human
world would be very different.
But with “my” free will, what of ethics or as Vincent Price (1911-1993) (Wikipedia:
Vincent Price) once said “Is there no morality left?” (Wikipedia: The Comedy of Terrors
(1964)) Well, consciousness likes consciousness or else it would likely die out quickly or never
develop. Consciousness liking consciousness is an emergent principle (an iron law) of
consciousness, which is itself an emergent entity. I posit it as an axiom that the consciousness-
liking-consciousness principle is the seed from which all ethics flow. Specific ethical systems
and specific ethical decisions are extremely contingent on all kinds of other things, but ultimately
must include at their base, explicitly or implicitly, the consciousness-liking-consciousness
principle. But at this moment I am writing oracularly and not about to discuss whether “my”
axiom is falsifiable.
Now what is consciousness and does God exist?—well some more thought is needed.
REFERENCES
Arfken, G. 1970, Mathematical Methods for Physicists (New York: Academic Press)
Bevington, P. R. 1969, Data Reduction and Error Analysis for the Physical Sciences (New
York: McGraw-Hill Book Company)
Cohen-Tannoudji, C., Diu, B., & Laloe, F. 1977, Quantum Mechanics (New York: John Wiley
& Sons)
Erhart, J., Sponar, S., Sulyok, G., Badurek, G., Ozawa, M., & Hasegawa, Y. 2012, Nature
Physics, doi:10.1038/nphys2194
Ghirardi, G. C., Rimini, A, & Weber, T. 1986, Physical Review D, 34, 470
Greene, B. 2004, The Fabric of the Cosmos (New York: Vintage Books)
Greenstein, G., & Zajonc, A. G. 1997, The Quantum Challenge: Modern Research on the
Foundations of Quantum Mechanics (Sudbury, Massachusetts: Jones and Bartlett
Publishers)
Griffiths, D. J. 2005, Introduction to Quantum Mechanics (Upper Saddle River, New Jersey:
Pearson/Prentice Hall)
Heisenberg, M. 2009, Nature, 459, 164
Laughlin, R. B. 2005, A Different Universe: Reinventing Physics from the Bottom Down
(New York: Basic Books)
Norsen, T., & Nelson, S. 2013, arXiv:1306.4646
Ozawa, M. 2012, arXiv:1201.5334v1
Paz, J. P., & Zurek, W. H. 2001, in Coherent Atomic Matter Waves: Session LXXII of
the Les Houches Ecole d’Ete de Physique Theorique, ed. R. Kaiser, C. Westbrook,
& F. David (Berlin: Springer), 533, arXiv:quant-ph/0010011v1
Pusey, M. F., Barrett, J., & Rudolph, T. 2011, arXiv:1111.3328v1
Schlosshauer, M., Kofler, J., & Zeilinger, A. 2013, arXiv:1301.1069
Wikipedia, http://en.wikipedia.org/wiki/Main_Page
Zurek, W. H. 2003, arXiv:quant-ph/0306072v1