
Introduction to Quantum Information and Computation

Steven M. Girvin © 2019, 2020

[Compiled: May 2, 2020]

Contents

1 Introduction
1.1 Two-Slit Experiment, Interference and Measurements
1.2 Bits and Qubits
1.3 Stern-Gerlach experiment: the first qubit

2 Introduction to Hilbert Space
2.1 Linear Operators on Hilbert Space
2.2 Dirac Notation for Operators
2.3 Orthonormal bases for electron spin states
2.4 Rotations in Hilbert Space
2.5 Hilbert Space and Operators for Multiple Spins

3 Two-Qubit Gates and Entanglement
3.1 Introduction
3.2 The CNOT Gate
3.3 Bell Inequalities
3.4 Quantum Dense Coding
3.5 No-Cloning Theorem Revisited
3.6 Quantum Teleportation
3.7 YET TO DO:

4 Quantum Error Correction
4.1 An advanced topic for the experts

5 Yet To Do


Chapter 1

Introduction

By 1895 there seemed to be nothing left to do in physics except fill out a few details. Maxwell had unified electricity, magnetism and optics with his theory of electromagnetic waves. Thermodynamics was hugely successful: well before there was any deep understanding of the properties of atoms (or even certainty about their existence!), it could make accurate predictions about the efficiency of the steam engines powering the industrial revolution. By 1895, statistical mechanics was well on its way to providing a microscopic basis in terms of the random motions of atoms to explain the macroscopic predictions of thermodynamics.

However, over the next decade, a few careful observers (e.g. Planck and especially Einstein) began to realize that the very foundations were crumbling. Maxwell correctly taught us that accelerating charges radiate away energy in the form of electromagnetic waves. Why then do electrons orbiting the positively charged nuclei in atoms not radiate away all their energy and fall into the nucleus? This doesn’t happen, and in fact, physicists soon discovered that the lone electron in the hydrogen atom seems to have only a discrete set of allowed orbits corresponding to certain well-defined energies. Electrons could in fact emit electromagnetic waves, but only in discrete lumps of energy (‘quanta’ known as ‘photons’) that matched the change in energy as the electron spontaneously jumped from one allowed energy level to a lower allowed energy level. Once the electron reached the lowest allowed energy level (the ‘ground state’), it could fall no further (despite the fact that it was still accelerating while orbiting the nucleus).

Maxwell also taught us that the modes of the electromagnetic field are harmonic oscillators. Why then does electromagnetic radiation not obey the laws of statistical mechanics developed for harmonic oscillators? According to classical statistical mechanics, the average energy stored in the kinetic and potential energy of a harmonic oscillator is $k_B T$, where $k_B$ is Boltzmann’s constant and $T$ is the absolute temperature. Since electromagnetic radiation comes in a wide range of frequencies running all the way up to infinity, classical statistical mechanics predicts an ‘ultraviolet catastrophe,’ in which the energy density of blackbody radiation (i.e. radiation in thermal equilibrium) should be $k_B T \times \infty$.

Even venerable thermodynamics, which makes predictions of a general nature independent of microscopic details, gave some hints of problems. The thermodynamic entropy of a gas obeying the ideal gas law turns negative at low temperatures. (The entropy is a measure of our ignorance of the microscopic details of the state of the system or, equivalently, a measure (the logarithm) of the number of different microscopic states that are consistent with the observed macroscopic properties.) In statistical physics, Josiah Willard Gibbs pointed out the now famous ‘Gibbs paradox.’ He noted that if a partition is removed from a box allowing the gas atoms on each side to mix, the entropy should irreversibly increase, and yet nothing macroscopically observable actually happens. The classical way of counting up the number of different configurations of a bunch of atoms was somehow wrong. The Gibbs paradox could be resolved only if exchanging the positions of two particles (say particle 17 and particle 1,230) did not result in a new state, as if the atoms were not merely the same species but truly indistinguishable.

1.1 Two-Slit Experiment, Interference and Measurements

TO BE WRITTEN: Wave-like interference and Superposition Principle. Particle-like detection. Which-path information and state collapse upon measurement.


1.2 Bits and Qubits

A bit is the smallest unit of information. The word ‘bit’ is short for binary digit and can be represented by 0 or 1 in the binary numbering system (base 2 numbers). It is the amount of information contained in the answer to a yes/no or true/false question (assuming you have no prior knowledge of the answer and the two answers are equally likely as far as you know). If Alice gives Bob either a 0 or a 1 drawn randomly with equal probability, then Bob receives one bit of information.¹ A register of $N$ classical bits can be in one of $2^N$ distinct states. For example 3 classical bits can be in 8 states: (000), (001), (010), (011), (100), (101), (110), (111), corresponding to the 8 binary numbers whose (base 10) values run from 0 to 7. This is a remarkably efficient encoding: with an ‘alphabet’ of only two symbols (0 and 1) Alice can send Bob one of $2^{32} = 4{,}294{,}967{,}296$ possible messages at a cost of only having to transmit 32 bits (or 4 ‘bytes’). In fact, if all the messages are equally likely to be chosen, this is provably the most efficient possible encoding.
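As a hedged illustration of this counting, the short Python sketch below lists the states of a 3-bit register and evaluates Shannon's measure of information for a two-outcome source, $H = -[p_0 \log_2 p_0 + p_1 \log_2 p_1]$, which reduces to one bit when both outcomes are equally likely. The function name `binary_entropy` is an assumption made for this example, not notation from these notes.

```python
import math
from itertools import product


def binary_entropy(p0: float) -> float:
    """Shannon information (in bits) of a source that emits 0 with probability p0."""
    h = 0.0
    for p in (p0, 1.0 - p0):
        if p > 0.0:  # p * log2(p) tends to 0 as p tends to 0
            h -= p * math.log2(p)
    return h


# All 2**3 = 8 states of a 3-bit register, in counting order.
register_states = ["".join(bits) for bits in product("01", repeat=3)]
n_messages = 2**32  # number of distinct messages in 32 bits

print(register_states)
print(n_messages)
print(binary_entropy(0.5))    # equally likely answers
print(binary_entropy(0.999))  # nearly certain answer
```

When Bob is almost sure of the answer in advance (for example $p_0 = 0.999$), the computed value is far below one bit.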

Information is physical and is stored in the state of a physical system. In an ordinary computer, bits are stored in the states of switches (typically these days transistors) which can be open/closed or on/off. Information can also be stored in small domains of magnetism in disk drives. Each magnetic domain acts like a small bar magnet whose magnetic moment (a vector quantity pointing in the direction running from the south pole to the north pole of the bar magnet) can only point up or down. Ordinarily a bar magnet can point in any direction in space, but the material properties of the disk drive are intentionally chosen to be anisotropic so that only two directions are possible.

¹ Information is ‘news’. If Bob knew ahead of time that Alice would choose 0 with probability $p_0 = 0.999$, and 1 with probability $p_1 = 1 - p_0$, then he is not surprised when he receives a 0 because he already was pretty sure he ‘knew the answer to the question’ ahead of time. The mathematical theory of information quantifies the information that Bob receives and shows that it is much less than one bit of information. Claude Shannon demonstrated that in this situation Bob receives information $H = -[p_0 \log_2 p_0 + p_1 \log_2 p_1]$, where $\log_2$ means log base 2. $H$ has a maximum value of 1 at $p_0 = p_1 = \frac{1}{2}$ and approaches zero when either of the probabilities approaches unity.

A quantum bit (‘qubit’) is information stored in a quantum system that can be in one of two possible states. Using a quantum state notation developed by Paul A.M. Dirac, state |0⟩ might denote an atom in its lowest energy state (the ground state) and state |1⟩ might denote the first excited state of the atom (or some other selected excited state). However, quantum systems behave strangely: particles can act like waves and waves sometimes act like particles. Two-state systems have wave-like properties in the sense that they can be in superposition states represented mathematically by

|ψ⟩ = α|0⟩+ β|1⟩ (1.1)

where α and β are (complex) probability amplitudes. If one measures whether the qubit has value 0 or 1, the answer is random. The measurement yields 0 with probability² $|\alpha|^2$ and 1 with probability $|\beta|^2$. Thus we have the important constraint that the probabilities must add up to unity

$$|\alpha|^2 + |\beta|^2 = 1. \tag{1.2}$$

We say that the state must be ‘normalized.’ Furthermore, we see that an overall ‘global phase’ of the form

$$e^{i\chi}|\psi\rangle = e^{i\chi}\alpha|0\rangle + e^{i\chi}\beta|1\rangle \tag{1.3}$$

(where χ is an arbitrary real number) can have no measurable effect because the probabilities for different measurement results depend only on the magnitude squared of the state coefficients. Hence, without loss of generality we can choose the phase angle χ to make α real. The complex number $\beta = a + ib$ is parametrized by two real numbers $a$ and $b$. However the normalization constraint removes one of these degrees of freedom. In the end we are left with two real parameters to specify the state of a single qubit. A standard parametrization in terms of two angles θ and φ is the following

$$|\psi\rangle = \cos\frac{\theta}{2}\,|1\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|0\rangle. \tag{1.4}$$

The two angles θ and φ can be visualized as the polar and azimuthal angles of a vector pointing from the origin to the surface of the unit sphere (known as the Bloch sphere) as illustrated in Fig. 1.1. We will later describe the meaning of the unit vector on the Bloch sphere in relation to the properties of the corresponding quantum state and will also explain why half angles appear in the parametrization of the state in Eq. (1.4). We simply note here that the two states of a classical bit are represented by the Bloch sphere vectors corresponding to the north pole (0, 0, 1) and the south pole (0, 0, −1). Quantum bits can be in states corresponding to an arbitrary point on the sphere.
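The parametrization in Eq. (1.4) can be explored numerically. The following is a minimal sketch, assuming only the Python standard library; the helper names `qubit_amplitudes` and `bloch_vector` are hypothetical and chosen for this example. It returns the two amplitudes for given angles, checks normalization, and gives the corresponding point on the unit sphere.

```python
import cmath
import math


def qubit_amplitudes(theta: float, phi: float) -> tuple[complex, complex]:
    """Amplitudes (amp_1, amp_0) of |1> and |0> following Eq. (1.4)."""
    amp_1 = complex(math.cos(theta / 2))
    amp_0 = cmath.exp(1j * phi) * math.sin(theta / 2)
    return amp_1, amp_0


def bloch_vector(theta: float, phi: float) -> tuple[float, float, float]:
    """Cartesian coordinates of the point on the unit sphere with polar angle
    theta and azimuthal angle phi."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


theta, phi = math.pi / 3, math.pi / 4
amp_1, amp_0 = qubit_amplitudes(theta, phi)
prob_1 = abs(amp_1) ** 2  # cos^2(theta/2)
prob_0 = abs(amp_0) ** 2  # sin^2(theta/2), independent of phi

print(prob_1, prob_0, prob_1 + prob_0)
print(bloch_vector(theta, phi))
print(bloch_vector(0.0, 0.0))  # north pole
```

Note that the azimuthal angle φ changes the relative phase of the two amplitudes but not the probabilities of the two measurement outcomes.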

² We are using the following standard notation. If we have a complex number $z = x + iy$, we define complex conjugation as $z^* = x - iy$ and the magnitude (squared) of the number as $|z|^2 = z^* z = x^2 + y^2$.


[Figure: Bloch sphere with x, y and z axes; the poles are labeled |1⟩ and |0⟩ and points on the equator are labeled by superpositions of |0⟩ and |1⟩.]

Figure 1.1: Unit vector corresponding to a point on the Bloch sphere with cartesian coordinates (sin θ cos φ, sin θ sin φ, cos θ). The orientation of this 3D unit vector can be obtained by starting with the unit vector pointing to the ‘north pole’ of the sphere and then rotating it in the xz plane by an angle θ around the y axis and then rotating by an angle φ around the z axis. This unit vector corresponds to the parametrization of the qubit state given in Eq. (1.4).

Box 1.1. A word about notation. Since |0⟩ and |1⟩ might represent the ground and first excited states of a quantum system, we will sometimes denote them |g⟩ and |e⟩ respectively. Similarly, we may use the orientation on the Bloch sphere to label the same states |↑⟩ and |↓⟩. The state corresponding to (1, 0, 0) on the Bloch sphere might be written

$$|{\rightarrow}\rangle = |{+X}\rangle = \frac{1}{\sqrt{2}}\big[|0\rangle + |1\rangle\big] = \frac{1}{\sqrt{2}}\big[|{\downarrow}\rangle + |{\uparrow}\rangle\big].$$

It is a weird feature of quantum spins that a coherent superposition of up and down points sideways!

It is very important to understand that the randomness in quantum mechanics does not arise from our lack of knowledge of all the variables in the experimental set up. It is not true that the qubit has a certain value determined by some ‘hidden’ variable that you do not know about. You can prepare two qubits in exactly the same state and still get different measurement results! In fact, the qubit does not have a ‘value’ before it is measured. We will see later that assuming that the qubit has a value before it is measured leads to contradiction with experiment. One can actually prove experimentally that there are no hidden variables. This leads to the following clever statement (due to Alexander Korotkov): ‘In quantum mechanics you do not see what you get, you get what you see.’

Curiously, if (using an idealized apparatus) you measure the same qubit again, you will get the same value for the second (and all subsequent) measurement results. This means that the first measurement has ‘collapsed’ the state. If you start with the state α|0⟩ + β|1⟩ and the first measurement result happens to be 1, then the state collapses to |1⟩ and stays there for all subsequent measurements. Conversely, if the first measurement happens to yield the value 0, then the state collapses to |0⟩ and stays there.
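The behavior just described can be mimicked with a small classical simulation. This is only a sketch of an idealized measurement, assuming a pseudo-random number generator stands in for quantum randomness; the names `measure`, `ZERO` and `ONE` and the example amplitudes are assumptions made for this illustration.

```python
import random

ZERO = (1.0 + 0j, 0j)  # amplitudes (alpha, beta) of the state |0>
ONE = (0j, 1.0 + 0j)   # amplitudes (alpha, beta) of the state |1>


def measure(state, rng):
    """Idealized measurement of alpha|0> + beta|1>.

    Returns (result, collapsed_state): result is 0 with probability |alpha|^2
    and 1 otherwise, and the state collapses onto the observed value.
    """
    alpha, _beta = state
    if rng.random() < abs(alpha) ** 2:
        return 0, ZERO
    return 1, ONE


rng = random.Random(2020)    # fixed seed so the run is reproducible
prepared = (0.6 + 0j, 0.8j)  # |alpha|^2 = 0.36, |beta|^2 = 0.64

# Measure once, then keep measuring the collapsed state.
first_result, collapsed = measure(prepared, rng)
repeat_results = [measure(collapsed, rng)[0] for _ in range(10)]
print(first_result, repeat_results)

# Fresh copies of the same prepared state still give random first results.
trials = 10_000
zeros = sum(1 for _ in range(trials) if measure(prepared, rng)[0] == 0)
fraction_zero = zeros / trials
print(fraction_zero)
```

Repeated measurements of the collapsed state all agree with the first result, while identically prepared copies give 0 in roughly a fraction $|\alpha|^2$ of the trials.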

The above results illustrate a curious asymmetry in quantum information. The state of a qubit requires specification of two real numbers (the Bloch sphere angles, θ and φ). In principle this means it takes an infinite number of bits to specify the state. And yet, when we make a measurement, the state collapses to |0⟩ or |1⟩ and we learn only one bit of information. This asymmetry has far-reaching implications as we will see later. In particular it will play an important role in the no-cloning theorem (which tells us that unknown quantum states cannot be copied). This in turn is the basis of quantum cryptography.

One might wonder why one would want to build a computer out of qubits whose measured values seem to be random. The answer (somehow) partly lies in the ability of a collection of qubits to explore a huge number of configurations simultaneously. Thus a collection of 3 qubits can be in a simultaneous superposition of $2^3$ states of the form

$$|\psi\rangle = \alpha_0|000\rangle + \alpha_1|001\rangle + \ldots + \alpha_7|111\rangle. \tag{1.5}$$

Thus if we build a quantum computer with an input register of $N$ qubits, we can give the computer a single giant superposition of all the possible inputs we could ever give it. Because the Schrödinger equation describing the time evolution of the quantum state of the computer is linear, the computer will carry out its calculation on every possible input simultaneously without difficulty. Thus we have what appears to be exponentially powerful ‘quantum parallelism’ in which the computer carries out $2^N$ computations at once. Unfortunately, the output register contains a huge superposition of all $2^N$ answers (results of the calculation). When we measure the output register the state randomly collapses to only one of the $2^N$ answers and it seems we lose all the quantum advantage of the parallelism.

It turns out however that all is not necessarily lost. There exist certain classes of problems which have simple answers and one can use the wave-like properties of the superpositions to yield constructive interference which ‘focuses’ the probability amplitudes in the output register onto the desired answer. The classic example of this is the Shor algorithm for finding the prime factors of a composite number. The input to the quantum computer is very simple: the single number to be factored. The output is also very simple: one of the factors. The algorithm almost perfectly focuses the output amplitude onto one of the factors so that measurement of the output register gives the correct answer with relatively high probability. Since factoring is a ‘one-way problem’ that is hard to solve but the solution is easy to check, one simply runs the algorithm a small number of times and checks the answer each time. Within a relatively small number of tries, you will be able to verify that you have found a factor. In this case, where the inputs and outputs are relatively simple, the power of the quantum computer (presumably) comes from its ability to explore an exponentially large number of intermediate states during the course of the calculation.

The Shor algorithm is exponentially faster than the best known classical algorithm for factoring. (Factoring is believed to be exponentially hard classically, but this has not actually been rigorously proven.) Other algorithms offer only polynomial speed up. For example the Grover search algorithm can find a given entry in an unordered database of size $N$ in about $\sqrt{N}$ queries to the (quantum) database. Consider for example a telephone directory which has the entries in random rather than alphabetical order. To find your friend’s name in this database classically would require examining on average $N/2$ out of the $N$ total entries. For this problem, the quantum algorithm is not able to focus all the output amplitude onto the desired answer, but rather can partially focus the output probability amplitudes so that they are spread over only $\sqrt{N}$ output states, rather than all $N$. Thus measurement of the output register has probability $p \sim 1/\sqrt{N}$ of collapsing onto the desired state. One thus expects to find the correct answer in about $\sqrt{N}$ attempts rather than $N/2$ attempts as in the classical case. This is not exponential speed up but it is still an interesting (and surprising) quantum advantage.
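To get a feel for the size of this polynomial advantage, the sketch below compares the average number of lookups in a classical linear scan with the $\sqrt{N}$ scaling quoted above. It does not simulate the Grover algorithm: the function name `classical_expected_queries` is an assumption for this example, and $\sqrt{N}$ is used only as a scaling estimate that ignores constant factors.

```python
import math


def classical_expected_queries(n: int) -> float:
    """Average number of entries examined by a linear scan when the target is
    equally likely to sit at any of the n positions."""
    return sum(range(1, n + 1)) / n  # equals (n + 1) / 2, roughly n / 2


comparison = {}
for n in (100, 10_000, 1_000_000):
    comparison[n] = (classical_expected_queries(n), math.isqrt(n))
    print(n, comparison[n])
```

For a directory of one million entries the classical average is about half a million lookups, while $\sqrt{N}$ is only one thousand.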

The source of the power of quantum computers lies partly in the existence of superpositions but also in another subtle concept called entanglement that gives qubits ‘spooky’ correlations that can be stronger than is allowed for classical bits. We will look into this mystery in a later section.

Before delving into the mathematical description of these two-state systems and the information they contain, it is useful to start with experiment.


1.3 Stern-Gerlach experiment: the first qubit

The first qubit was a silver atom. In Frankfurt, Germany in February 1922, physicists Otto Stern and Walther Gerlach carried out an experiment on a beam of silver atoms that had momentous import for the development of quantum mechanics and in fact constituted the first measurement of the state of a qubit. [See Physics Today article in the files section on Canvas.] They evaporated atoms from a bar of silver placed in an oven and sent them through a magnetic field that had a strong gradient as illustrated in Fig. 1.2.

Stern and Gerlach did not realize it at the time, but it turns out that their experiment showed that electrons carry intrinsic angular momentum called ‘spin.’ Thinking classically we can imagine the electron as a small spherical ball of charge spinning around some axis. The resulting current flow creates a magnetic dipole: the electron acts like a little bar magnet. The magnetic moment is a vector whose direction runs from the south pole to the north pole of the bar magnet and whose magnitude is proportional to the strength of the magnetic field produced by the bar magnet. Because electrons have negative charge (thank you, Mr. Franklin), the magnetic moment vector is antiparallel to the angular momentum vector of the spinning electron. This classical argument cannot really be applied to the quantum domain where the electron is a point particle that acts like a wave. However Dirac showed that his relativistic wave equation for the electron very naturally predicted the existence of the electron spin and predicted the electron magnetic moment nearly exactly

$$\boldsymbol{\mu} = -\frac{e}{m_e}\,\frac{g}{2}\,\mathbf{S}, \tag{1.6}$$

where $m_e$ is the electron mass, $-e$ is the electron charge and $\mathbf{S}$ is the spin angular momentum of the electron, which Dirac showed obeyed $|\mathbf{S}| = \frac{\hbar}{2}$. The Dirac equation predicts $g = 2$ but the experimental value is very slightly larger (see discussion box below). Defining a dimensionless spin angular momentum vector $\boldsymbol{\sigma}$ via $\mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma}$, we have

$$\boldsymbol{\mu} = -\frac{g}{2}\,\mu_B\,\boldsymbol{\sigma}, \tag{1.7}$$

where the Bohr magneton is defined by $\mu_B = \frac{e\hbar}{2m_e}$. The energy of a bar magnet in a magnetic field $\mathbf{B}$ depends on the orientation of the bar magnet relative to the field

$$U = -\boldsymbol{\mu}\cdot\mathbf{B} = \frac{g}{2}\,\mu_B\,\mathbf{B}\cdot\boldsymbol{\sigma}.$$

Figure 1.2: Plaque memorializing the 1922 Stern-Gerlach experiment. From Bretislav Friedrich and Dudley Herschbach, Physics Today, December, 2003.

The magnetic field supplies a torque that tries to align the magnetic moment with the B field. Because of the negative charge of the electron, its angular momentum wants to anti-align with the magnetic field.

Box 1.2. Quantum fluctuation corrections to the electron magnetic moment. Dirac’s equation yields $g = 2$ exactly, but experimentally $g$ differs slightly from that value due to subtle quantum effects called ‘vacuum fluctuations,’ in which the electron emits and quickly reabsorbs photons. The current experimental estimate [Hanneke et al., Physical Review A 83, 052122 (2011)] for the value is $\frac{g}{2} \approx 1.00115965218073$ with an uncertainty of about 0.28 parts per trillion. The fact that $g - 2$ can be measured so accurately and that it agrees with theoretical predictions of quantum electrodynamics to many significant digits is one of the great triumphs of the quantum theory.

The spin angular momentum of the electron makes it behave like a gyroscope. Recall that gyroscopes respond to torques by precessing. In the case of an electron in a magnetic field, the angular momentum vector of the electron precesses around the direction defined by the magnetic field. Thus if the magnetic field points in the $z$ direction, $S_z$ remains constant, and $S_x$ and $S_y$ oscillate sinusoidally (and 90 degrees out of phase with each other) at frequency $\omega_B = g\,\frac{\mu_B}{\hbar}\,B$. For a field of 1 Tesla, this corresponds to an angular frequency $\omega_B \approx (2\pi)\, 28.025 \times 10^{9}$ Hz.
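The quoted number can be reproduced from $\omega_B = g\,\mu_B B/\hbar$ with $\mu_B = e\hbar/(2m_e)$. The sketch below assumes CODATA 2018 values for $e$, $\hbar$ and $m_e$ (these constants are not quoted in these notes) together with the value of $g/2$ from Box 1.2; the function names are hypothetical.

```python
import math

# Assumed CODATA 2018 values in SI units (not quoted in the notes themselves).
E_CHARGE = 1.602176634e-19     # elementary charge e, in C
HBAR = 1.054571817e-34         # reduced Planck constant, in J s
M_ELECTRON = 9.1093837015e-31  # electron mass m_e, in kg
G_OVER_2 = 1.00115965218073    # g/2 as quoted in Box 1.2


def bohr_magneton() -> float:
    """mu_B = e * hbar / (2 * m_e), in J/T."""
    return E_CHARGE * HBAR / (2.0 * M_ELECTRON)


def precession_angular_frequency(b_tesla: float) -> float:
    """omega_B = g * mu_B * B / hbar, in rad/s."""
    g = 2.0 * G_OVER_2
    return g * bohr_magneton() * b_tesla / HBAR


omega = precession_angular_frequency(1.0)
frequency_ghz = omega / (2.0 * math.pi) / 1e9

print(f"mu_B = {bohr_magneton():.6e} J/T")
print(f"omega_B / (2 pi) = {frequency_ghz:.3f} GHz at 1 T")
```

The result scales linearly with the field, so a 0.1 Tesla field gives a precession frequency ten times smaller.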

Box 1.3. Precession of a gyroscope. Consider the gyroscope shown in Fig. 1.3 consisting of a disk of mass $M$ with angular momentum $\mathbf{L}$ spinning on an axle. The axle is supported at the origin and the force of gravity $\mathbf{F} = -Mg\hat{z}$ acts on the center of mass of the disk located at point $\mathbf{R}$, producing a torque $\boldsymbol{\tau} = \mathbf{R} \times \mathbf{F} = RF\sin\theta\,\hat{\varphi}$ in the azimuthal direction $\hat{\varphi}$. Recognizing that the gravitational potential energy has the form $U(\theta) = MgR\cos\theta$, we can write the torque in the form $\boldsymbol{\tau} = -\frac{dU}{d\theta}\hat{\varphi}$. We see that this torque is directed perpendicular to the angular momentum and so causes the angular momentum vector to precess in time (i.e. to rotate around the $z$ axis).

For an electron spin in a magnetic field, the magnetostatic energy is

$$U(\theta) = \frac{g}{2}\,\frac{e}{m_e}\,\mathbf{B}\cdot\mathbf{S} = \frac{g}{2}\,\mu_B B\cos\theta$$

so the torque is $\boldsymbol{\tau} = \frac{g}{2}\mu_B B\sin\theta\,\hat{\varphi}$. The spin angular momentum precesses around the $z$ axis according to $\frac{d\mathbf{S}}{dt} = \omega_B\,\hat{z} \times \mathbf{S}$. Note that, unlike a macroscopic gyroscope, the electron spin does not undergo nutation. Can you think why this might be? It is related to the fact that the moment of inertia is such that the angular momentum vector of the gyroscope is not parallel to its axle (i.e., not solely due to the spin) if $\theta \neq 0$. For a point particle like the electron, this cannot occur.

This rapid precession is an interesting phenomenon in its own right, but the details are not important for the Stern-Gerlach experiment. What counts is that because of the precession, the component of the spin parallel to the magnetic field is a constant of the motion, and so the energy is $\mu_B B\cos(\theta)$. Recall from basic chemistry that the allowed orbitals for electrons in an atom are generally doubly occupied by pairs of electrons with opposite spin. This is true for silver which has 46 paired electrons and one unpaired electron in the 5s orbital. (s means the orbital angular momentum is zero so we don’t have to worry about orbital angular momentum affecting the magnetic moment and are free to concentrate on the spin.)

[Figure: gyroscope with angular momentum $\mathbf{L}$ along its axle, center of mass at $\mathbf{R}$, and gravitational force $\mathbf{F} = -Mg\hat{z}$.]

Figure 1.3: Gyroscope precessing under a torque produced by gravity. The torque is into the page and the precession of the angular momentum vector occurs as a slow rotation at rate $\Omega\hat{z}$ that causes $\frac{d\mathbf{L}}{dt} = \Omega\hat{z} \times \mathbf{L} = \boldsymbol{\tau} = -Mg\,\mathbf{R} \times \hat{z}$. Using $\mathbf{R} = \mathbf{L}(R/L)$ yields $\Omega = MgR/L$. Notice that the precession rate is independent of the polar angle θ between $\mathbf{R}$ and the vertical direction $\hat{z}$. The same is true for electron spins in a magnetic field.

There is no reason to believe that the silver atoms exiting the oven in the Stern-Gerlach experiment should have any preferred orientation for their spin angular momentum or magnetic moment, so it is reasonable to assume that $\cos\theta$ is randomly and uniformly distributed between −1 and +1. Thus the energy of the atoms in the magnetic field should be uniformly distributed between $\pm\frac{g}{2}\mu_B B$. Very surprisingly however, as shown in Fig. 1.4, one finds that the energy (known as the Zeeman energy) only takes on the two extremal values and never any value in between. Stern and Gerlach obtained this surprising result by sending the silver atoms through a non-uniform magnetic field whose magnitude had a strong gradient in the $z$ direction³

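$$\nabla|\mathbf{B}| = \lambda\hat{z}. \tag{1.8}$$

This caused the Zeeman energy to vary with height and produced a spin-dependent force

$$\mathbf{F} = -\nabla U = -\frac{g}{2}\,\mu_B\,\nabla\big(|\mathbf{B}|\cos\theta\big) = -\frac{g}{2}\,\mu_B\,\lambda\cos\theta\,\hat{z}, \tag{1.9}$$

that deflected the beam maximally upward or downward as if the only allowed values of $\cos\theta$ were ±1.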

This surprising result indicated that if you ask the question, “Is the spin pointing up or down on the Bloch sphere?,” the answer is always ‘Yes, the $z$ component of the (dimensionless) spin angular momentum is either $\sigma_z = +1$ or $\sigma_z = -1$ and never anything else.’ For simplicity, let us refer to this version of the Stern-Gerlach experiment as ‘asking the Z question.’ By rotating the Stern-Gerlach magnet one could put the field gradient in another direction, say X. Now one finds that the component of the angular momentum in the direction of the field gradient is $\sigma_x = \pm 1$ as if the spin can only be parallel or antiparallel to the $x$ direction! It seems that what state you find depends on what you decide to measure. Once again, ‘we get what we see, not see what we get.’

³ Recall from Maxwell’s equations that $\nabla \cdot \mathbf{B} = 0$. This constraint forces the magnetic field to have a form like $\mathbf{B} = B_0(0, -\lambda y, 1 + \lambda z)$. This is a minor complication that does not significantly affect the results of the experiment for atoms lying close to the beam axis $(x, 0, 0)$.

[Figure: energy $E$ versus magnetic field $B$, with the two extremal branches labeled $\pm\frac{g}{2}\mu_B B$.]

Figure 1.4: Zeeman energy of an electron in a magnetic field. Classically the energy can take on any value in the interval $\frac{g}{2}\mu_B B\,[-1, 1]$, but quantum experiments show that only the extremal values $\pm\frac{g}{2}\mu_B B$ are ever observed.

To illustrate just how peculiar this is, consider the setup shown in Fig. 1.5. Because the oven presumably evaporates silver atoms with no preferred spin orientation, the first Z measurement randomly produces $\sigma_z = \pm 1$ with equal probabilities. However what happens next depends on what you measure! It is impossible for all three components of the spin vector to have definite values because they are incompatible observables. If you measure $\sigma_x$ (say) you always get ±1 and you have no idea what the state was before you measured it. Afterwards you know the state has collapsed onto a unique state compatible with your measurement result, because if you measure the same component again you always get the same result.

[Figure: two panels. (a) A Z measurement (50% / 50%) followed by a second Z measurement (100% / 0%). (b) A Z measurement (50% / 50%) followed by an X measurement (50% / 50%).]

Figure 1.5: A silver atom evaporated in the oven flies into a Stern-Gerlach measurement apparatus. A Z measurement randomly produces $\sigma_z = \pm 1$ with equal probability. (a) If we select those atoms with $\sigma_z = +1$, a second Z measurement yields $\sigma_z = +1$ 100% of the time. (b) If instead we follow the Z measurement with an X measurement, we randomly obtain $\sigma_x = \pm 1$ with equal 50% probabilities. If we were to add a second measurement of $\sigma_x$ it would agree with the first. However if we followed the X measurement with another Z measurement, the result would be completely random even though we had selected $\sigma_z = +1$ atoms in the first measurement. This illustrates the fact that Z and X measurements are fundamentally incompatible.

All of these peculiar results can be explained by the ansatz in Eq. (1.4) that the state with spin angular momentum pointing in the direction $\mathbf{n} = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$ on the Bloch sphere is a superposition of the up and down states

$$|\psi\rangle = |\mathbf{n}\rangle = \cos\frac{\theta}{2}\,|{\uparrow}\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|{\downarrow}\rangle. \tag{1.10}$$

To see why this is so, consider the state with spin pointing in the +x direction (corresponding to θ = π/2, φ = 0)

$$|{+X}\rangle = |{\rightarrow}\rangle = \frac{1}{\sqrt{2}}\big[|{\uparrow}\rangle + |{\downarrow}\rangle\big]. \tag{1.11}$$

This is a coherent superposition of up and down with equal probability amplitudes $1/\sqrt{2}$. Hence if we were to measure Z we would randomly collapse the state to |↑⟩ with probability $p_\uparrow = \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2}$ and to state |↓⟩ with probability $p_\downarrow = \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2}$. If the measurement result happens to be $\sigma_z = +1$, then the state collapses to |↑⟩, and a second Z measurement would also yield $\sigma_z = +1$ in agreement with the first.


Similarly, we could consider the state with spin pointing in the −x direction (corresponding to θ = π/2, φ = π)

$$|{-X}\rangle = |{\leftarrow}\rangle = \frac{1}{\sqrt{2}}\big[|{\uparrow}\rangle - |{\downarrow}\rangle\big]. \tag{1.12}$$

A Z measurement on this state also randomly yields $\sigma_z = \pm 1$ with equal probabilities.

Suppose that the first Z measurement yields the value $\sigma_z = +1$ as shown in Fig. 1.5b. Then the state has collapsed to |↑⟩. How do we predict what will happen if we were to make an X measurement? Using Eqs. (1.11-1.12) we see that we can write the state |↑⟩ as the following superposition

$$|{\uparrow}\rangle = \frac{1}{\sqrt{2}}\big[|{\rightarrow}\rangle + |{\leftarrow}\rangle\big]. \tag{1.13}$$

Thus we see that if we start with the state |↑⟩ and make an X measurement we will randomly obtain $\sigma_x = \pm 1$ with equal probability just as illustrated in Fig. 1.5b. Similarly we can write

$$|{\downarrow}\rangle = \frac{1}{\sqrt{2}}\big[|{\rightarrow}\rangle - |{\leftarrow}\rangle\big] \tag{1.14}$$

and again find that an X measurement will give a random result $\sigma_x = \pm 1$ with equal probabilities.
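As a quick check, the expressions for |↑⟩ and |↓⟩ in the X basis can be verified by substituting the definitions of |→⟩ and |←⟩ from Eqs. (1.11-1.12) into their right-hand sides:

$$\frac{1}{\sqrt{2}}\big[|{\rightarrow}\rangle + |{\leftarrow}\rangle\big] = \frac{1}{2}\Big[\big(|{\uparrow}\rangle + |{\downarrow}\rangle\big) + \big(|{\uparrow}\rangle - |{\downarrow}\rangle\big)\Big] = |{\uparrow}\rangle,$$

$$\frac{1}{\sqrt{2}}\big[|{\rightarrow}\rangle - |{\leftarrow}\rangle\big] = \frac{1}{2}\Big[\big(|{\uparrow}\rangle + |{\downarrow}\rangle\big) - \big(|{\uparrow}\rangle - |{\downarrow}\rangle\big)\Big] = |{\downarrow}\rangle.$$

In both cases the amplitudes of |→⟩ and |←⟩ have magnitude $1/\sqrt{2}$, so each X outcome occurs with probability $\left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2}$. The relative minus sign in the second expression does not affect these probabilities.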

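For completeness we also note that states with spin aligned along the ±y direction are given by Eq. (1.10) with θ = π/2, φ = ±π/2

$$|{+Y}\rangle = \frac{1}{\sqrt{2}}\big[|{\uparrow}\rangle + i|{\downarrow}\rangle\big] \tag{1.15}$$

$$|{-Y}\rangle = \frac{1}{\sqrt{2}}\big[|{\uparrow}\rangle - i|{\downarrow}\rangle\big]. \tag{1.16}$$

We can express the Z states in terms of these as

$$|{\uparrow}\rangle = \frac{1}{\sqrt{2}}\big[|{+Y}\rangle + |{-Y}\rangle\big] \tag{1.17}$$

$$|{\downarrow}\rangle = \frac{-i}{\sqrt{2}}\big[|{+Y}\rangle - |{-Y}\rangle\big]. \tag{1.18}$$

This tells us that, just as Z and X measurements are incompatible, so are Z and Y measurements.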


There is a general method for expressing quantum states in terms of different orthogonal basis states which we will learn about in the next section. This method makes it very easy to derive expressions like those in Eqs. (1.17-1.18).

Box 1.4. Predicting Measurement Outcomes. In order to correctly predict the probability of different measurement outcomes, you must express the given state |ψ⟩ in the basis appropriate to the measurement you wish to make. Thus to find the probability that a measurement of $\sigma_y$ yields +1 or −1, you must express |ψ⟩ in the basis of |±Y⟩ and then apply the Born rule to find the probabilities from the probability amplitudes in the Y basis. We will shortly learn more formally how to do this.
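The recipe in Box 1.4 can be sketched numerically. The example below is a hedged illustration that anticipates the inner product introduced in Chapter 2: it assumes that the amplitude of a state along a basis vector is given by their inner product, and the helper names `inner` and `outcome_probabilities` are hypothetical. States are stored as pairs of (up, down) amplitudes.

```python
import math


def inner(a, b):
    """Inner product <a|b> of two states given as (up, down) amplitude pairs."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]


S = 1.0 / math.sqrt(2.0)
UP = (1.0 + 0j, 0j)
PLUS_Y = (S + 0j, 1j * S)    # Eq. (1.15)
MINUS_Y = (S + 0j, -1j * S)  # Eq. (1.16)
Y_BASIS = {+1: PLUS_Y, -1: MINUS_Y}


def outcome_probabilities(state, basis):
    """Born rule: probability of each outcome is |amplitude|^2 in that basis."""
    return {outcome: abs(inner(vec, state)) ** 2 for outcome, vec in basis.items()}


p_up = outcome_probabilities(UP, Y_BASIS)          # spin up, then measure Y
p_plus_y = outcome_probabilities(PLUS_Y, Y_BASIS)  # spin along +y, then measure Y

print(p_up)
print(p_plus_y)
```

The first result reproduces the equal weights seen in Eq. (1.17), while a state already aligned along +y gives the outcome +1 with certainty.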

Exercise 1.1. Express the states |±Y⟩ in terms of |±X⟩. If you measure $\sigma_y = +1$, what is the probability that a subsequent X measurement will yield $\sigma_x = -1$? Are X and Y measurements compatible?

Chapter 2

Introduction to Hilbert Space

We have seen that the state of an electron whose spin points in a specified direction can be written as a linear combination of a pair of ‘basis states.’ These basis states could for example be |↑⟩ and |↓⟩

$$|\psi\rangle = \alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle, \tag{2.1}$$

where α and β are complex coefficients. The mathematical space of allowed states for an electron spin is a vector space over the field of complex numbers and is referred to as the system’s Hilbert space. We are used to writing ordinary vectors as ‘row vectors’ (x, y) or (x, y, z). It is traditional however in quantum mechanics to represent vectors as ‘column vectors.’ Thus for example the basis states corresponding to Z measurements can be represented as

$$|{\uparrow}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{2.2}$$

$$|{\downarrow}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{2.3}$$

which gives

$$|\psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \tag{2.4}$$

It will be useful to define an ‘inner product’ between pairs of vectors in Hilbert space that is analogous to the ordinary dot product. The inner product between $|\psi\rangle$ and a second state

$$|\psi'\rangle = \alpha'|{\uparrow}\rangle + \beta'|{\downarrow}\rangle = \begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} \tag{2.5}$$

is defined to be (using the Dirac notation to be explained further shortly)

$$\langle\psi'|\psi\rangle = \alpha'^{*}\alpha + \beta'^{*}\beta. \tag{2.6}$$

For ordinary vectors $\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$, but notice that for a complex vector space we have to be careful about the order since

$$\langle\psi'|\psi\rangle = \big(\langle\psi|\psi'\rangle\big)^{*}. \tag{2.7}$$

The complex conjugation means that the inner product of any vector with itself is real

$$\langle\psi|\psi\rangle = |\alpha|^2 + |\beta|^2 = 1, \tag{2.8}$$

where the second equality follows from the Born interpretation of the magnitude squared of probability amplitudes as measurement outcome probabilities.

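Dirac referred to his notation for state vectors as the ‘bracket’ notation. A quantum state is represented by a ‘ket’ |ψ⟩. Associated with this vector is a dual ‘bra’ vector defined to be the following row vector

$$ \langle\psi| = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}^{\dagger} = \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix}, \tag{2.9} $$

where † indicates the complex conjugate of the transpose. Thus

$$ \langle\psi|\psi'\rangle = \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix} \begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} = \alpha^*\alpha' + \beta^*\beta', \tag{2.10} $$

where the last equality follows from the ordinary rules of matrix multiplication (here applied to non-square matrices).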
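The inner product rules in Eqs. (2.6)-(2.8) can be checked numerically. The following is a minimal sketch, assuming kets are stored as Python lists of complex amplitudes in the Z basis; the helper name `inner` and the sample amplitudes are illustrative choices, not part of the text.

```python
# Minimal sketch: kets are lists [alpha, beta] of complex amplitudes in the Z basis.
# The helper name `inner` and the sample states are illustrative assumptions.

def inner(bra_of, ket):
    """Return <bra_of|ket>: conjugate the first vector, multiply, and sum."""
    return sum(a.conjugate() * b for a, b in zip(bra_of, ket))

psi = [complex(0.6), 0.8j]                         # alpha = 0.6, beta = 0.8i
psi_prime = [complex(2 ** -0.5), 1j * 2 ** -0.5]   # alpha' = 1/sqrt(2), beta' = i/sqrt(2)

overlap = inner(psi_prime, psi)   # <psi'|psi>
reverse = inner(psi, psi_prime)   # <psi|psi'>
norm = inner(psi, psi)            # <psi|psi>

print(overlap, reverse.conjugate())  # the two agree, as in Eq. (2.7)
print(norm)                          # real and equal to 1 up to rounding
```

Swapping the two arguments of `inner` conjugates the result, which is the order dependence noted in Eq. (2.7).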

Even though the inner product can be complex, we can still think of it as telling us something about the angle between two vectors in Hilbert space. Thus for example

⟨↑|↓⟩ = ⟨+X|−X⟩ = 0. (2.11)

These pairs of vectors are orthogonal even though their corresponding spin vectors are co-linear (anti-parallel). Thus opposite points on the Bloch sphere correspond to orthogonal vectors in Hilbert space. [This is closely tied to the half angles that appear in Eq. (1.10).] If two state vectors are orthogonal then the states are completely distinguishable. A system prepared in one of the states can be measured to uniquely determine which of the two states the system is in. Of course if the system is in a superposition of two orthogonal states, the measurement result can be random. Conversely, if a system is in one of two states that are not orthogonal, it is not possible to reliably determine by measurement which state the system is in.

2.1 Linear Operators on Hilbert Space

An operator defines a mapping from the Hilbert space back into the same Hilbert space. In quantum mechanics we will deal with linear operators, which for the case of a single spin are simply 2 × 2 matrices that, when multiplied into a column vector representing a state, produce another column vector representing another state. It turns out that physical observables in quantum mechanics are represented by Hermitian matrices. We cannot use numbers to represent physical observables because observables do not have values before we measure them. Instead observables have the potential to have different values upon measurement. It turns out that matrices can be used to represent this peculiar situation. This takes some getting used to, so we will proceed with some examples.

It turns out that the z component of the spin angular momentum is represented in the following way

$$ \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.12} $$


Now notice the following interesting properties of this matrix

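$$ \sigma^z|\uparrow\rangle = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = (+1)|\uparrow\rangle \tag{2.13} $$

$$ \sigma^z|\downarrow\rangle = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix} = (-1)|\downarrow\rangle. \tag{2.14} $$

These equations tell us that |↑⟩ is an eigenstate of σz with eigenvalue +1 and that |↓⟩ is an eigenstate of σz with eigenvalue −1. The eigenvectors of an operator are those vectors that are mapped into themselves under the operation, up to a constant factor called the eigenvalue.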

It turns out that the possible measurement results for some physical observable are given by the eigenvalues of the operator corresponding to that observable. Hence, we see that for the case of σz the possible measurement results must be ±1, in agreement with experiment. Furthermore, it turns out that if a given quantum state is an eigenstate of a particular operator with eigenvalue λ then the measured value of the observable will not be random, but in fact will always be exactly λ. Let us consider what happens when a state is not an eigenstate of an observable. Consider for example

|ψ⟩ = α|↓⟩ + β|↑⟩ (2.15)

σz|ψ⟩ = −α|↓⟩ + β|↑⟩. (2.16)

If α and β are both non-zero, this is clearly not an eigenvector. But this is consistent with the fact that the measurement result will be random with probability p↑ = |β|² of being +1 and probability p↓ = |α|² of being −1.


Box 2.1. Eigenvalues, Eigenvectors and Measurements It is important to note that the act of measurement of an observable is not simply represented by multiplying the quantum state |ψ⟩ by the operator corresponding to the observable being measured (e.g. σz). In particular, σz|ψ⟩ is not the state of the system after the measurement of σz. The state always randomly collapses to one of the eigenstates of the operator being measured. Furthermore, the measurement result is always one of the eigenvalues of the operator being measured and the particular state to which the system collapses is the eigenvector corresponding to the eigenvalue that is the measurement result. Thus if the result of the measurement of σz is +1, the state collapses to |↑⟩. If the result of the measurement is −1, the state collapses to |↓⟩. From the Born rule, the probabilities that the state collapses to |↑⟩ and |↓⟩ are respectively

p↑ = ⟨ψ| ↑⟩⟨↑ |ψ⟩ (2.17)

p↓ = ⟨ψ| ↓⟩⟨↓ |ψ⟩. (2.18)

How do we know that measurement collapses the state onto one of the eigenvectors of the operator being measured? This follows from the experimental fact that a Stern-Gerlach Z measurement (say) that gives a certain result will give exactly the same result if we make a second Z measurement. The most general possible state is a superposition of the two eigenstates of σz of the form |ψ⟩ = α|↑⟩ + β|↓⟩. In order for the second Z measurement to not be random, we must have (from the Born rule) that |α|² and |β|² cannot both be non-zero. We have to be in one eigenstate or the other so that the probability of getting the same measurement result is 100%. Thus the first Z measurement has to collapse the state onto one of the eigenstates of Z.

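Recall that measurement collapses the state, so if we want to know the average measurement result for the state we have to prepare many copies and measure each copy only once. For the state in Eq. (2.15), the average measurement result will of course be (+1)p↑ + (−1)p↓ = |β|² − |α|². Let us compare this quantity to the so-called ‘expectation value’ of the operator σz in the state |ψ⟩, which is defined by the expression ⟨ψ|σz|ψ⟩. What we mean by this expression is: compute the state σz|ψ⟩ [Eq. (2.16)] and then take its inner product (‘overlap’) with |ψ⟩:

$$ \begin{aligned} \langle\psi|\sigma^z|\psi\rangle &= (\alpha^*\langle\downarrow| + \beta^*\langle\uparrow|)(-\alpha|\downarrow\rangle + \beta|\uparrow\rangle) \\ &= -|\alpha|^2\langle\downarrow|\downarrow\rangle + \alpha^*\beta\langle\downarrow|\uparrow\rangle - \beta^*\alpha\langle\uparrow|\downarrow\rangle + |\beta|^2\langle\uparrow|\uparrow\rangle \\ &= |\beta|^2 - |\alpha|^2. \end{aligned} \tag{2.19} $$

Or equivalently in matrix notation, with ψ↑ = β and ψ↓ = α,

$$ \langle\psi|\sigma^z|\psi\rangle = \begin{pmatrix} \psi_\uparrow^* & \psi_\downarrow^* \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \psi_\uparrow \\ \psi_\downarrow \end{pmatrix} \tag{2.20} $$

$$ = |\psi_\uparrow|^2 - |\psi_\downarrow|^2 = |\beta|^2 - |\alpha|^2. \tag{2.21} $$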

Thus we have the nice result that the average measurement result for some observable is simply the expectation value of the observable (operator) in the quantum state being measured.
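The equality between the Born-rule average and the expectation value can be verified for any sample state. This is a minimal sketch under the assumption that a ket is stored as the list [ψ↑, ψ↓]; the helper names `inner` and `apply` and the example amplitudes are illustrative.

```python
# Minimal sketch: compare the Born-rule average of Z measurements with <psi|sigma_z|psi>.
# Kets are lists [psi_up, psi_down]; helper names are illustrative assumptions.

def inner(bra_of, ket):
    """Return <bra_of|ket>."""
    return sum(a.conjugate() * b for a, b in zip(bra_of, ket))

def apply(matrix, ket):
    """Multiply a 2x2 matrix (list of rows) into a column vector."""
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

sigma_z = [[1, 0], [0, -1]]
psi = [complex(0.6), 0.8j]        # an arbitrary normalized example state

p_up = abs(psi[0]) ** 2           # probability of the result +1
p_down = abs(psi[1]) ** 2         # probability of the result -1
average = (+1) * p_up + (-1) * p_down

expectation = inner(psi, apply(sigma_z, psi))

print(average, expectation)       # the two numbers agree up to rounding
```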

Can we now say something about how random the measurement result will be? Let us begin by reminding ourselves about some basic facts about classical random variables. Suppose some random variable ξ takes on values from the set {v1, v2, . . . , vM} and value vj occurs with probability pj. Then the mean value is given by

$$ \bar{\xi} = \sum_{j=1}^{M} p_j v_j, \tag{2.22} $$

where the overbar indicates statistical average. One measure of how random ξ is, is the variance defined by

$$ \mathrm{Var}(\xi) \equiv \overline{(\xi - \bar{\xi})^2} = \overline{\xi^2} - \bar{\xi}^2 \tag{2.23} $$

$$ = \sum_{j=1}^{M} p_j v_j^2 - \left[\sum_{j=1}^{M} p_j v_j\right]^2. \tag{2.24} $$

We see that the variance is the mean square deviation from the average and so is a measure of the size of the random fluctuations in ξ.

Exercise 2.1. Statistical Variance

a) Derive Eq. (2.24).

b) Assuming that all the eigenvalues vj are distinct (i.e. non-degenerate), prove that the variance vanishes if and only if one of the pj’s is unity and the rest vanish (so that ξ is not random). If one of the eigenvalues is m-fold degenerate, then the corresponding m probabilities must add up to unity and the rest must vanish.


Let us now turn to the quantum case. Let observable Q have eigenvalues v1, v2, . . . , vM. (So far we have only considered a single spin for which M = 2. In more general cases (e.g. multiple spins), the Hilbert space dimension M can be larger.) The analog of Eq. (2.24) for the variance of the measurement results for Q in state |ψ⟩ is

$$ \mathrm{Var}(Q) = \langle\psi|Q^2|\psi\rangle - \langle\psi|Q|\psi\rangle^2. \tag{2.25} $$

Now notice that if |ψ⟩ is an eigenvector of Q (say the mth eigenvector)

Q|ψ⟩ = vm|ψ⟩, (2.26)

then

$$ \mathrm{Var}(Q) = v_m^2\langle\psi|\psi\rangle - [v_m\langle\psi|\psi\rangle]^2 = 0. \tag{2.27} $$

Thus only superpositions of states (in the basis appropriate to the measurement!) with different eigenvalues will have randomness in the measurement results. This is entirely consistent with the experimental facts described above for a single spin.
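Eq. (2.25) is easy to evaluate for small matrices. The sketch below, with hypothetical helper names `inner`, `apply` and `variance`, compares an eigenstate of σz with an equal superposition; it is an illustration under those assumptions rather than a general-purpose routine.

```python
# Minimal sketch of Eq. (2.25): Var(Q) = <psi|Q^2|psi> - <psi|Q|psi>^2.
# Helper names and example states are illustrative assumptions.

def inner(bra_of, ket):
    return sum(a.conjugate() * b for a, b in zip(bra_of, ket))

def apply(matrix, ket):
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

def variance(matrix, ket):
    """Variance of measurement results of a Hermitian matrix in a normalized state."""
    q_ket = apply(matrix, ket)                      # Q|psi>
    mean = inner(ket, q_ket).real                   # <psi|Q|psi>
    mean_square = inner(ket, apply(matrix, q_ket)).real  # <psi|Q^2|psi>
    return mean_square - mean ** 2

sigma_z = [[1, 0], [0, -1]]
up = [1 + 0j, 0j]                                   # eigenstate of sigma_z
plus_x = [complex(2 ** -0.5), complex(2 ** -0.5)]   # equal superposition

var_eigenstate = variance(sigma_z, up)
var_superposition = variance(sigma_z, plus_x)

print(var_eigenstate)     # no randomness for an eigenstate
print(var_superposition)  # maximal randomness for the equal superposition
```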

Exercise 2.2. Since the eigenvalues of σz are ±1, we expect that (σz)² has both of its eigenvalues equal to unity. From the matrix form of σz prove that its square is the identity matrix

$$ (\sigma^z)^2 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. $$


Box 2.2. Observables as Hermitian operators Hermitian matrices (that is, matrices obeying M† = M, where again † indicates transpose followed by complex conjugation) are guaranteed to have real eigenvalues, and their eigenvectors form a complete basis set that spans the Hilbert space (more on what this means later). Since the measured values of physical observables are always real, physical observables are always represented by Hermitian matrices.

Let us now demonstrate that Hermitian matrices have real eigenvalues and that their eigenvectors are orthogonal. Consider the jth and kth (normalized) eigenvectors

H|ψj⟩ = λj|ψj⟩, (2.28)

H|ψk⟩ = λk|ψk⟩. (2.29)

Taking the adjoint of the second equation we obtain

⟨ψk|H† = λ∗k⟨ψk|. (2.30)

NOTE THE ERROR IN SUSSKIND. HE WRITES λk INSTEAD OF λ∗k.

Assuming H† = H, we can now use Eq. (2.28) and Eq. (2.30) to obtain

⟨ψk|H|ψj⟩ = λj⟨ψk|ψj⟩, (2.31)

⟨ψk|H|ψj⟩ = λ∗k⟨ψk|ψj⟩. (2.32)

Subtracting these yields

0 = (λj − λ∗k)⟨ψk|ψj⟩. (2.33)

For the case j = k, we know that (by construction) ⟨ψj|ψj⟩ = 1, and hence the imaginary part of λj must vanish. Thus all the eigenvalues of an Hermitian operator are real. If for j ≠ k the eigenvalues are non-degenerate (i.e., λj ≠ λk), then Eq. (2.33) requires ⟨ψk|ψj⟩ = 0 and so the two eigenvectors must be orthogonal. Thus if the full spectrum is non-degenerate, the set of eigenvectors is orthonormal: ⟨ψk|ψj⟩ = δkj.

If M eigenvalues are degenerate, then any superposition of the M eigenvectors is also an eigenvector. If they are not orthogonal, they can be made orthogonal by taking appropriate linear combinations of the set of M eigenvectors (via the so-called Gram-Schmidt procedure).


2.2 Dirac Notation for Operators

We have seen that in Dirac’s bracket notation, the inner product ⟨Φ|Ψ⟩ is a (possibly complex) number. The outer product G = |Ψ⟩⟨Φ| turns out to be an operator on the space. To see why, simply operate with G on an arbitrary state |χ⟩

G|χ⟩ = |Ψ⟩⟨Φ|χ⟩ = g|Ψ⟩, (2.34)

where g is the complex number representing the inner product of |Φ⟩ and |χ⟩,

g = ⟨Φ|χ⟩. (2.35)

Hence when applied to any vector in the Hilbert space, G returns another vector in the Hilbert space. Thus it is an operator and indeed it is a linear operator since

G(α|χ⟩+ β|ϕ⟩) = αG|χ⟩+ βG|ϕ⟩. (2.36)

As a specific example, consider the operator

G = | ↑⟩⟨↑ | − | ↓⟩⟨↓ |. (2.37)

Clearly

G| ↑⟩ = (+1)| ↑⟩ (2.38)

G| ↓⟩ = (−1)| ↓⟩, (2.39)

so it must be that G = σz. To confirm this let us write G out in matrix form

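$$ G = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \end{pmatrix} \tag{2.40} $$

$$ = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{2.41} $$

where the last result follows from the rules of matrix multiplication and agrees with Eq. (2.12).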
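The matrix form of an outer product can be built entry by entry, since the (r, c) element of |Ψ⟩⟨Φ| is Ψ_r Φ_c*. A minimal sketch, with the helper name `outer` chosen here for illustration:

```python
# Minimal sketch: build G = |up><up| - |down><down| from outer products.
# The helper name `outer` is an illustrative assumption.

def outer(ket, bra_of):
    """Matrix of the operator |ket><bra_of| as a list of rows."""
    return [[k * b.conjugate() for b in bra_of] for k in ket]

up = [1 + 0j, 0j]
down = [0j, 1 + 0j]

up_up = outer(up, up)
down_down = outer(down, down)
G = [[up_up[r][c] - down_down[r][c] for c in range(2)] for r in range(2)]

print(G)  # same entries as the sigma_z matrix in Eq. (2.12)
```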

The above result is a particular case of the general fact that in Dirac’s notation, Hermitian operators take on a very simple form

$$ V = \sum_{j=1}^{M} v_j|\psi_j\rangle\langle\psi_j|, \tag{2.42} $$

where |ψj⟩ is the jth eigenvector of V with eigenvalue vj.

Exercise 2.3. Show that the representation of the operator V in Eq. (2.42) correctly reproduces the required property of V that

V |ψm⟩ = vm|ψm⟩,

for every eigenvector |ψm⟩ of V .

Exercise 2.4. The three components of the (dimensionless) spin angular momentum vector σ = (σx, σy, σz) are each physical observables. The 2 × 2 matrices representing these operators are known as the Pauli matrices. We have already seen the matrix representation for σz in the up/down (Z) basis

$$ \sigma^z = \begin{pmatrix} +1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.43} $$

Using the known form of the states |±X⟩ from Eq. (1.11) and Eq. (1.12) and |±Y⟩ in Eqs. (1.15-1.16), show that in the Z basis

$$ \sigma^x = \begin{pmatrix} 0 & +1 \\ +1 & 0 \end{pmatrix} \tag{2.44} $$

$$ \sigma^y = \begin{pmatrix} 0 & -i \\ +i & 0 \end{pmatrix}. \tag{2.45} $$

2.3 Orthonormal bases for electron spin states

It is possible to show that the quantum states |±n⟩ corresponding to any pair of oppositely directed unit vectors ±n on the Bloch sphere are orthogonal

⟨−n|+n⟩ = 0. (2.46)

Since we are dealing with a Hilbert space of only two dimensions and we have an orthonormal pair of states in the Hilbert space, they constitute a complete basis for expressing any vector in the Hilbert space.

It is straightforward to derive the completeness relation

|+ n⟩⟨+n|+ | − n⟩⟨−n| = I , (2.47)


where I is the identity operator whose matrix representation is

$$ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{2.48} $$

This is a powerful result because it allows us to express any state |ψ⟩ in any basis by simply writing

|ψ⟩ = I|ψ⟩ = |+n⟩⟨+n|ψ⟩ + |−n⟩⟨−n|ψ⟩. (2.49)

We see immediately that the coefficients in the expansion of the state |ψ⟩ in the basis are simply computed as the inner products of the basis vectors with the state |ψ⟩.
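The completeness relation and the expansion in Eq. (2.49) can be checked numerically for a particular direction. The sketch below assumes the standard form cos(θ/2)|↑⟩ + sin(θ/2)e^{iφ}|↓⟩ for |+n⟩ and obtains |−n⟩ from θ → π − θ, φ → φ + π; the function names and the sample angles are illustrative.

```python
# Minimal sketch: check |+n><+n| + |-n><-n| = I and rebuild a state from its overlaps.
# ket_n uses the assumed standard form cos(theta/2)|up> + sin(theta/2) e^{i phi}|down>.
import cmath
import math

def inner(bra_of, ket):
    return sum(a.conjugate() * b for a, b in zip(bra_of, ket))

def outer(ket, bra_of):
    return [[k * b.conjugate() for b in bra_of] for k in ket]

def ket_n(theta, phi):
    return [complex(math.cos(theta / 2)), math.sin(theta / 2) * cmath.exp(1j * phi)]

theta, phi = 1.1, 0.7                      # arbitrary direction on the Bloch sphere
plus_n = ket_n(theta, phi)
minus_n = ket_n(math.pi - theta, phi + math.pi)

p_plus, p_minus = outer(plus_n, plus_n), outer(minus_n, minus_n)
identity = [[p_plus[r][c] + p_minus[r][c] for c in range(2)] for r in range(2)]

psi = [complex(0.6), 0.8j]                 # arbitrary state to expand
c_plus, c_minus = inner(plus_n, psi), inner(minus_n, psi)
rebuilt = [c_plus * p + c_minus * m for p, m in zip(plus_n, minus_n)]

print(identity)   # approximately the 2x2 identity matrix
print(rebuilt)    # approximately equal to psi
```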

The results above are very reminiscent of how we find the representation of an ordinary vector, say a position vector in 2D. We may have some orthogonal basis vectors for our coordinate system, for example x̂, ŷ, or a rotated set î, ĵ. We can represent any vector as

$$ \vec{r} = (r_x, r_y) = \hat{x}(\hat{x}\cdot\vec{r}) + \hat{y}(\hat{y}\cdot\vec{r}) \tag{2.50} $$

$$ = \hat{\imath}(\hat{\imath}\cdot\vec{r}) + \hat{\jmath}(\hat{\jmath}\cdot\vec{r}). \tag{2.51} $$

This suggests we can think of the identity transformation as

$$ I = \hat{x}\hat{x} + \hat{y}\hat{y} = \hat{\imath}\hat{\imath} + \hat{\jmath}\hat{\jmath}, \tag{2.52} $$

where we interpret x̂x̂ applied to a vector V to mean x̂(x̂ · V). Note the similarity between Eq. (2.52) and Eq. (2.47).

It is useful at this point to discuss the concepts of projections and projectors. Recall that when we project an ordinary vector r onto the x axis, we simply remove all components of r other than the x component. Thus the projection of r onto the x axis is the vector x̂ r_x = x̂(x̂ · r). We can think of Px = x̂x̂ as a projection operator (or projector) because, for any vector V,

$$ P_x\vec{V} = (\hat{x}\hat{x})\vec{V} = \hat{x}(\hat{x}\cdot\vec{V}). \tag{2.53} $$

Notice that Px satisfies the defining characteristic of projection operators

(Px)² = Px. (2.54)

This has a simple interpretation: once the vector is projected onto the x axis, further projection doesn’t do anything.


We can think of the shadow of an object cast onto the ground as the projection of the object onto the horizontal (xy) plane. This is accomplished by the projection operator

$$ P_{xy} = \hat{x}\hat{x} + \hat{y}\hat{y} = I - \hat{z}\hat{z}, \tag{2.55} $$

where I = x̂x̂ + ŷŷ + ẑẑ is the identity for the 3D case. This shows us that projection onto the xy plane simply removes the component of the vector normal to the plane. Despite this more complicated form, it is straightforward to show that (Pxy)² = Pxy as required. The key to this is that the basis vectors x̂, ŷ, ẑ are orthonormal.

The analogous projector onto the vector |n⟩ in the Hilbert space describing the spin of an electron is simply

Pn = |n⟩⟨n|. (2.56)

It is straightforward to show that this is idempotent, PnPn = Pn, as required.

Exercise 2.5. Derive Eq. (2.46).

Exercise 2.6. Derive Eq. (2.47).

Eq. (2.47) along with the Born rule tells us that, given an arbitrary state |ψ⟩, a measurement asking the question ‘Is this the state |n⟩?’ will be answered ‘yes’ with probability given by ⟨ψ|Pn|ψ⟩ = |⟨n|ψ⟩|².

Exercise 2.7. Show that | ± n⟩ is an eigenvector of

H = n · σ = (n · x)σx + (n · y)σy + (n · z)σz

with eigenvalue ±1

H|+ n⟩ = (+1)|+ n⟩ (2.57)

H| − n⟩ = (−1)| − n⟩ (2.58)

This is simply a reflection of the rotation invariance of space. We can align our Stern-Gerlach magnet in the arbitrary direction n and the measured state of the spin will be either |+n⟩ or |−n⟩.


2.3.1 Gauge Invariance

We have seen that any two states at opposite points on the Bloch sphere form an orthonormal basis that can be used to represent any state in the 2D Hilbert space of a single spin. One confusing aspect of all this that we have not yet discussed is the following. The states |±n⟩ in the Hilbert space corresponding to the Bloch sphere unit vectors ±n are not unique. Each basis vector can be multiplied by a different arbitrary phase factor, |v+⟩ = e^{iξ+}|+n⟩ and |v−⟩ = e^{iξ−}|−n⟩, and we would still have a perfectly good orthonormal basis obeying

⟨v+|v+⟩ = ⟨v−|v−⟩ = 1, (2.59)

⟨v−|v+⟩ = 0, (2.60)

and the completeness relation

I = |v+⟩⟨v+| + |v−⟩⟨v−|. (2.61)

The particular choice of phase factors cancels out in the above expressions. The specific representation of one other operator (besides the identity) is also independent of the ‘gauge’ choice that we make by picking particular phase factors, namely the (diagonal) operator that is measured by a Stern-Gerlach magnet oriented in the n direction. The operator

$$ n\cdot\sigma = |v_+\rangle\langle v_+| - |v_-\rangle\langle v_-| \tag{2.62} $$

$$ = |+n\rangle\langle +n| - |-n\rangle\langle -n|, \tag{2.63} $$

is invariant and the probability of the two measurement results is unaffected by the gauge choice. However non-diagonal operators such as the spin flip operator

$$ Q = |v_+\rangle\langle v_-| + |v_-\rangle\langle v_+| \tag{2.64} $$

$$ = e^{+i(\xi_+ - \xi_-)}|+n\rangle\langle -n| + e^{-i(\xi_+ - \xi_-)}|-n\rangle\langle +n| \tag{2.65} $$

are gauge dependent when expressed in the |±n⟩ basis. Of course the state vectors also become gauge dependent

$$ |\psi\rangle = \alpha|v_+\rangle + \beta|v_-\rangle \tag{2.66} $$

$$ = \alpha e^{i\xi_+}|+n\rangle + \beta e^{i\xi_-}|-n\rangle, \tag{2.67} $$

and so nothing changes in the physics.
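The gauge statements above can be illustrated for the simplest case n = z, where |+n⟩ = |↑⟩ and |−n⟩ = |↓⟩. This is a sketch with arbitrarily chosen phases ξ+ and ξ−; the helper `outer` and the variable names are assumptions made for the example.

```python
# Minimal sketch for n = z: rephase the basis vectors and compare operators.
# The phases xi_plus, xi_minus and all helper names are illustrative assumptions.
import cmath

def outer(ket, bra_of):
    return [[k * b.conjugate() for b in bra_of] for k in ket]

up, down = [1 + 0j, 0j], [0j, 1 + 0j]
xi_plus, xi_minus = 0.3, 1.2

v_plus = [cmath.exp(1j * xi_plus) * a for a in up]       # e^{i xi+}|+n>
v_minus = [cmath.exp(1j * xi_minus) * a for a in down]   # e^{i xi-}|-n>

pp, mm = outer(v_plus, v_plus), outer(v_minus, v_minus)
pm, mp = outer(v_plus, v_minus), outer(v_minus, v_plus)

n_dot_sigma = [[pp[r][c] - mm[r][c] for c in range(2)] for r in range(2)]  # Eq. (2.62)
Q = [[pm[r][c] + mp[r][c] for c in range(2)] for r in range(2)]            # Eq. (2.64)

print(n_dot_sigma)  # diag(1, -1): the phases cancel
print(Q)            # off-diagonal entries carry e^{+/- i(xi+ - xi-)}
```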


Interestingly, even the apparently fixed definition for |+n⟩ in Eq. (1.10) is ambiguous when it comes to writing down the other basis vector |−n⟩. If n = (sin θ cos φ, sin θ sin φ, cos θ), then we can arrive at the oppositely directed unit vector −n via two different routes: i) θ → θ + π, φ → φ, or ii) θ → π − θ, φ → φ + π. Consider for example the case θ = π/2, φ = 0 which gives

$$ |n\rangle = |+X\rangle = \frac{1}{\sqrt{2}}\left[|\uparrow\rangle + |\downarrow\rangle\right]. \tag{2.68} $$

Both transformations (i) and (ii) take n to −n on the Bloch sphere but the two resulting states in the Hilbert space differ by a gauge choice (i.e., a phase)

$$ |-n\rangle = |-X\rangle = -\frac{1}{\sqrt{2}}\left[|\uparrow\rangle - |\downarrow\rangle\right] \quad \text{Method (i)} \tag{2.69} $$

$$ = +\frac{1}{\sqrt{2}}\left[|\uparrow\rangle - |\downarrow\rangle\right] \quad \text{Method (ii)} \tag{2.70} $$

Method (ii) self-consistently keeps θ in the range 0 ≤ θ ≤ π and is the standard gauge choice. Going back to Eq. (1.12) and Eq. (1.16), we see that Method (ii) was used to define |−X⟩ and |−Y⟩ and yields the standard form of the Pauli matrices displayed in Ex. 2.4.

2.4 Rotations in Hilbert Space

In order to build a quantum computer we need to have complete control over the quantum states of a system of many qubits. For the moment however let us consider how we might create an arbitrary state of a single qubit starting from some fiducial state, typically taken to be either |↑⟩ or |↓⟩ in the standard basis. Since the most general state is a superposition state |n⟩ corresponding to an arbitrary point n on the Bloch sphere, we need to be able to perform rotations in Hilbert space that correspond to the rotation of the Bloch vector from some initial position to any desired position on the Bloch sphere. Being able to prepare arbitrary states is a requirement for carrying out a quantum computation.

Because every state in the single- or multi-qubit Hilbert space has the same length ⟨ψ|ψ⟩ = 1, rotations are the only operations we need to move throughout the entire Hilbert space. There is no concept of ‘translations’ in Hilbert space. If we add a vector |ϕ⟩ to the vector |ψ⟩ we of course obtain another element of the complex vector space, but generically |ψ⟩ + |ϕ⟩ is not properly normalized to unity. One can show that the normalized version of the state

$$ |\Lambda\rangle = \frac{1}{\sqrt{W}}\left(|\psi\rangle + |\phi\rangle\right), \tag{2.71} $$

with

$$ W = 2 + \langle\psi|\phi\rangle + \langle\phi|\psi\rangle, \tag{2.72} $$

can be obtained from the initial state |ψ⟩ via a rotation. The idea is illustrated schematically for ordinary 2D vectors in Fig. (2.1).

Figure 2.1: Two unit vectors v1, v2 whose end points lie on the unit circle in 2D. The sum of the vectors v1 + v2 (generically) has length squared W = 2 + 2(v1 · v2) ≠ 1. The normalized vector v3 = (1/√W)(v1 + v2) does lie on the unit circle and can be obtained by a rotation applied to either v1 or v2.

It turns out that rotations in Hilbert space are executed via unitary operations. A unitary matrix U has the defining property

U†U = I, (2.73)

where I is the identity matrix. Thus the inverse of U is simply the adjoint, U⁻¹ = U†. It turns out that unitary transformations preserve the inner products between vectors in Hilbert space, just as the more familiar orthogonal rotation matrices preserve the angles (dot products) between ordinary vectors.

To better understand rotations consider the representation of an arbitrary state vector in terms of some orthonormal basis |j⟩; j = 1, 2, 3, . . . , N spanning the N-dimensional Hilbert space

$$ |\psi\rangle = \sum_{j=1}^{N}\psi_j|j\rangle. \tag{2.74} $$


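From the Born rule, the probability of measuring the system to be in the basis state |j⟩ is given by pj = |ψj|². The requirement that the total probability be unity gives us the normalization requirement on the state vector

$$ \sum_{j=1}^{N}|\psi_j|^2 = 1. \tag{2.75} $$

Now consider a linear operation U that preserves the length of every vector in the Hilbert space.

$$ |\psi'\rangle = U|\psi\rangle = \sum_{j=1}^{N}\psi_j U|j\rangle, \tag{2.76} $$

$$ \langle\psi'|\psi'\rangle = \langle\psi|U^{\dagger}U|\psi\rangle \tag{2.77} $$

$$ = \sum_{j,k=1}^{N}\psi_k^*\psi_j\langle k|U^{\dagger}U|j\rangle \tag{2.78} $$

$$ = \sum_{j=1}^{N}|\psi_j|^2\langle j|U^{\dagger}U|j\rangle + \sum_{j\neq k}^{N}\psi_k^*\psi_j\langle k|U^{\dagger}U|j\rangle. \tag{2.79} $$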

Since U preserves the length of every vector we have

⟨ψ|U †U |ψ⟩ = 1 (2.80)

and

⟨j|U †U |j⟩ = 1 (2.81)

for every basis vector |j⟩. Thus we must have

$$ \sum_{j\neq k}^{N}\psi_k^*\psi_j\langle k|U^{\dagger}U|j\rangle = 0 \tag{2.82} $$

for all possible choices of the set of amplitudes ψj; j = 1, 2, 3, . . . , N. Thus it must be that

⟨k|U†U|j⟩ = δkj, (2.83)

where the Kronecker delta symbol δjk = 1 for j = k and 0 for j ≠ k. Equivalently, this means

U†U = I. (2.84)


Thus the only linear operations that conserve probability for all states are unitary operations. It follows that unitary transformations preserve not only the inner product of states with themselves but also the inner products between any pair of states

|ϕ′⟩ = U|ϕ⟩ (2.85)

|ψ′⟩ = U|ψ⟩ (2.86)

⟨ϕ′|ψ′⟩ = ⟨ϕ|U†U|ψ⟩ = ⟨ϕ|ψ⟩. (2.87)
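Eq. (2.87) can be spot-checked with any unitary matrix. In the sketch below the example U is built, purely for illustration, as a diagonal matrix of phases times a real 2 × 2 rotation; the helper names and sample states are assumptions.

```python
# Minimal sketch: a unitary matrix leaves the inner product of two states unchanged.
# The example U (phases times a real rotation) and helper names are illustrative.
import cmath
import math

def inner(bra_of, ket):
    return sum(a.conjugate() * b for a, b in zip(bra_of, ket))

def apply(matrix, ket):
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

chi, angle = 0.9, 0.4
c, s = math.cos(angle), math.sin(angle)
d0, d1 = cmath.exp(-1j * chi / 2), cmath.exp(+1j * chi / 2)
U = [[d0 * c, -d0 * s],
     [d1 * s,  d1 * c]]

phi_state = [complex(0.6), 0.8j]
psi_state = [complex(2 ** -0.5), -1j * 2 ** -0.5]

before = inner(phi_state, psi_state)                       # <phi|psi>
after = inner(apply(U, phi_state), apply(U, psi_state))    # <phi'|psi'>

print(before, after)  # equal up to rounding
```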

KEEP WORKING ON THIS.

It turns out that in an ideal, dissipationless closed quantum system, the evolution of the system from its initial state at time 0 to its final state at time t is described by a unitary transformation. We can control the time evolution, and thus create different unitary operations, by applying control signals to our quantum system. The specifics of how this is done for different systems using laser beams, microwave pulses, magnetic fields, etc. will not concern us at the moment. We will for now simply postulate that the only operations available to us to control the quantum system are multiplication of the starting state by a unitary matrix to effect a rotation in Hilbert space.

Being able to rotate states in Hilbert space can also be very useful for the purposes of measurement. If we literally had a Stern-Gerlach magnet that could be oriented in every possible direction, we could measure any component of the spin vector σ. In general however our qubit will not be (say) a silver atom whose magnetic moment we can measure with a Stern-Gerlach magnet. The two states of our qubit might be the ground and first excited electronic orbital states of an atom, or might be the ground and first excited state of a superconducting circuit. In these cases it often happens that the energy eigenstates of the system (i.e., of the Hamiltonian operator) constitute the only basis in which measurements can be made. Since we still have a two-level system, we have the same Hilbert space and the same operators, so the mathematics is the same, but the ‘spin’ should be thought of as a ‘pseudospin.’ The direction of the pseudospin vector on the Bloch sphere tells us about the superposition state of the qubit but has nothing to do with angular momentum or magnetic moment.

Typically we represent the energy eigenstates in terms of the standard basis states |↑⟩ and |↓⟩. Thus the Hamiltonian (energy operator) is

$$ H = \frac{E_0 + E_1}{2}I + \frac{E_1 - E_0}{2}\sigma^z, \tag{2.88} $$

which has eigenvalues E0 and E1. If we choose the zero of energy to be half way between the ground and excited state energies then E0 + E1 = 0 and we can drop the first term. If we are able to measure the energy (say) then the preferred measurement operator to which we have access is σz. If the qubit is in the state

|ψ⟩ = α|↑⟩ + β|↓⟩, (2.89)

and we are able to prepare many copies of this state, a histogram of the measurement results Z = ±1 plus the Born rule allows us to deduce the values of |α|² and |β|². We cannot however deduce the relative complex phase of α and β. (Recall that WLOG we can take α to be real.) To fully determine the state we need (in general) to be able to measure all three components of σ. If we only have access to measurements of σz, then full state ‘tomography’ seems to be impossible. However if we prepend certain selected rotations of the state before making the Z measurement we can achieve our goal. For example a rotation by π/2 around the y axis takes |+X⟩ to |−Z⟩ and |−X⟩ to |+Z⟩. Similarly a rotation by π/2 around the x axis takes |+Y⟩ to |+Z⟩ and |−Y⟩ to |−Z⟩. Thus we can measure all three components of the spin. In fact, we can rotate any state |n⟩ into |+Z⟩ and thus measure the operator n · σ.
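The tomography idea can be sketched numerically: apply a π/2 rotation, then use only Z-measurement probabilities. The code below assumes the rotation matrices cos(π/4)I − i sin(π/4)σy and cos(π/4)I − i sin(π/4)σx (consistent with the mappings quoted above) and the standard state cos(θ/2)|↑⟩ + sin(θ/2)e^{iφ}|↓⟩; the names are illustrative.

```python
# Minimal sketch: recover all three spin components using only Z-basis statistics.
# U_y and U_x are pi/2 rotations about y and x; names and the sample state are assumptions.
import cmath
import math

def apply(matrix, ket):
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

def z_average(ket):
    """Average Z result from the Born rule: p_up - p_down."""
    return abs(ket[0]) ** 2 - abs(ket[1]) ** 2

r = 1 / math.sqrt(2)
U_y = [[r, -r], [r, r]]                  # takes |+X> to |-Z>, so Z then reads -sigma_x
U_x = [[r, -1j * r], [-1j * r, r]]       # takes |+Y> to |+Z>, so Z then reads +sigma_y

theta, phi = 1.1, 0.7
psi = [complex(math.cos(theta / 2)), math.sin(theta / 2) * cmath.exp(1j * phi)]

x_component = -z_average(apply(U_y, psi))
y_component = +z_average(apply(U_x, psi))
z_component = z_average(psi)

print(x_component, y_component, z_component)
```

For this sample state the three numbers reproduce the Bloch vector (sin θ cos φ, sin θ sin φ, cos θ), including the sign flip that comes from |+X⟩ being mapped to |−Z⟩.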

Before we learn how to rotate states in Hilbert space, let us review the familiar concept of rotations in ordinary space. For example if we start with an ordinary 3D unit vector on the Bloch sphere

$$ \hat{n} = x\hat{x} + y\hat{y} + z\hat{z} = \left[\sin\theta\cos\varphi\,\hat{x} + \sin\theta\sin\varphi\,\hat{y} + \cos\theta\,\hat{z}\right], \tag{2.90} $$

we can rotate it through an angle χ around the z axis to yield a new vector

$$ \hat{n}' = \left[\sin\theta\cos(\varphi+\chi)\,\hat{x} + \sin\theta\sin(\varphi+\chi)\,\hat{y} + \cos\theta\,\hat{z}\right], \tag{2.91} $$

which simply corresponds to a transformation of the polar coordinates θ → θ, φ → φ + χ. If we choose to represent n̂ as a column vector

$$ \hat{n} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \sin\theta\cos\varphi \\ \sin\theta\sin\varphi \\ \cos\theta \end{pmatrix}, \tag{2.92} $$

then, using the trigonometric identities cos(φ + χ) = cos(φ) cos(χ) − sin(φ) sin(χ) and sin(φ + χ) = sin(φ) cos(χ) + cos(φ) sin(χ), it is straightforward to show that the new vector n̂′ is represented by

$$ \hat{n}' = \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = R_z(\chi)\begin{pmatrix} x \\ y \\ z \end{pmatrix}, \tag{2.93} $$

where Rz(χ) is a 3 × 3 ‘rotation matrix’

$$ R_z(\chi) = \begin{pmatrix} \cos\chi & -\sin\chi & 0 \\ \sin\chi & \cos\chi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{2.94} $$

Of course, if we follow this by a rotation through angle −χ we must return the vector to its original orientation. Hence Rz(−χ)Rz(χ) = I or equivalently Rz(−χ) = Rz⁻¹(χ). From the fact that sin and cos are respectively odd and even in their arguments, it is straightforward to show that

$$ R_z^{-1}(\chi) = R_z^{T}(\chi). \tag{2.95} $$

Matrices whose transpose is equal to their inverse are called orthogonal and it turns out that rotations (for ordinary vectors) are always represented by orthogonal matrices.

Two natural properties of rotations are that they preserve the length of vectors and they preserve the angle between vectors. These facts can be summarized in the single statement that if r1′ = Rz(χ)r1 and r2′ = Rz(χ)r2, then r1′ · r2′ = r1 · r2. That is, the dot product between any two vectors (including the case of the dot product of a vector with itself) is invariant under rotations. The mathematical source of the preservation of lengths can be traced back to the orthogonality condition RᵀR = I; in addition, the determinant of a rotation matrix is unity.
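The invariance of the dot product under Rz(χ) can be confirmed directly from Eq. (2.94). A minimal sketch with illustrative helper names and arbitrary sample vectors:

```python
# Minimal sketch: R_z(chi) from Eq. (2.94) preserves dot products, and R_z(-chi) undoes it.
# Helper names and the sample vectors are illustrative assumptions.
import math

def rot_z(chi):
    c, s = math.cos(chi), math.sin(chi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

chi = 0.8
r1 = [1.0, 2.0, 3.0]
r2 = [-0.5, 0.25, 4.0]

r1_rot = mat_vec(rot_z(chi), r1)
r2_rot = mat_vec(rot_z(chi), r2)
round_trip = mat_vec(rot_z(-chi), r1_rot)   # rotate back by -chi

print(dot(r1, r2), dot(r1_rot, r2_rot))     # equal up to rounding
print(round_trip)                           # approximately r1
```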

Let us now turn to rotations in Hilbert space. There must be some connection to ordinary rotations, because rotation of the spin vector on the Bloch sphere is an ordinary (3D vector) rotation. However the Hilbert space is only two-dimensional, and unlike the example above where the standard basis vectors were x̂, ŷ and ẑ, the standard basis vectors in the Hilbert space correspond to the orthogonal states |+Z⟩ = |↑⟩ and |−Z⟩ = |↓⟩. Furthermore, the inner product in Hilbert space involves complex conjugation, unlike the case of the dot product for ordinary vectors. Hence we expect that rotation operations in Hilbert space will not look like 3D rotations of the unit vectors on the Bloch sphere.


Let us begin with rotations around the z axis. We know from the example above that rotation of the 3D vector n̂ on the Bloch sphere by an angle χ around the z axis simply corresponds to the transformation φ → φ + χ. From the standard quantum state representation in Eq. (1.10), we see that this simply changes the relative phase of the coefficients of |↑⟩ and |↓⟩. Let us therefore consider the following operator on the Hilbert space which does something very similar

$$ U_z(\chi) = e^{-i\frac{\chi}{2}\sigma^z}. \tag{2.96} $$

What do we mean by the exponential of a matrix? One way to interpret this expression is via a power series expansion

$$ U_z(\chi) = \sum_{n=0}^{\infty}\frac{1}{n!}\left[-i\frac{\chi}{2}\right]^n[\sigma^z]^n. \tag{2.97} $$

The series should converge provided that (|χ|/2)|σz| lies within the radius of convergence of the series expansion. Here |σz| means the absolute value of the largest eigenvalue of the matrix. Since |σz| = 1 and since the exponential is an entire function (it is analytic everywhere in the complex plane and hence the series has infinite radius of convergence), the series does converge.¹

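Using the result of Ex. 2.2, we can rewrite Eq. (2.97) as

$$ \begin{aligned} U_z(\chi) &= \sum_{n\ \mathrm{even}}\frac{1}{n!}\left[-i\frac{\chi}{2}\right]^n I + \sum_{n\ \mathrm{odd}}\frac{1}{n!}\left[-i\frac{\chi}{2}\right]^n\sigma^z \\ &= \cos\frac{\chi}{2}\,I - i\sin\frac{\chi}{2}\,\sigma^z. \end{aligned} \tag{2.98} $$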
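The power series in Eq. (2.97) can be summed numerically and compared with the closed form in Eq. (2.98). The following is a sketch only: the truncation at 30 terms and the helper names are assumptions chosen for illustration.

```python
# Minimal sketch: sum the series for exp(-i chi/2 sigma_z) and compare with
# cos(chi/2) I - i sin(chi/2) sigma_z. Truncation order and names are assumptions.
import math

def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def exp_series(matrix, terms=30):
    """Truncated power series sum_{n < terms} matrix^n / n!."""
    size = len(matrix)
    result = [[complex(i == j) for j in range(size)] for i in range(size)]  # n = 0 term
    power = [row[:] for row in result]
    for order in range(1, terms):
        power = mat_mul(power, matrix)
        factor = 1 / math.factorial(order)
        result = [[result[i][j] + factor * power[i][j] for j in range(size)]
                  for i in range(size)]
    return result

chi = 1.3
generator = [[-1j * chi / 2, 0], [0, +1j * chi / 2]]   # -i (chi/2) sigma_z
series = exp_series(generator)

c, s = math.cos(chi / 2), math.sin(chi / 2)
closed = [[c - 1j * s, 0], [0, c + 1j * s]]            # cos I - i sin sigma_z

print(series)
print(closed)
```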

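Using the fact that σz|↑⟩ = (+1)|↑⟩ and σz|↓⟩ = (−1)|↓⟩, yields

$$ U_z(\chi)|\uparrow\rangle = e^{-i\frac{\chi}{2}}|\uparrow\rangle \tag{2.99} $$

$$ U_z(\chi)|\downarrow\rangle = e^{+i\frac{\chi}{2}}|\downarrow\rangle. \tag{2.100} $$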

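¹ For infinite-dimensional matrices with unbounded eigenvalues, one has to be more careful with this analysis.

Applying this to the standard state |n⟩, and using the fact that σz|↑⟩ = (+1)|↑⟩ and σz|↓⟩ = (−1)|↓⟩, yields

$$ U_z(\chi)|n\rangle = U_z(\chi)\left[\cos\frac{\theta}{2}|\uparrow\rangle + \sin\frac{\theta}{2}e^{i\varphi}|\downarrow\rangle\right] \tag{2.101} $$

$$ = \left[\cos\frac{\theta}{2}e^{-i\frac{\chi}{2}}|\uparrow\rangle + \sin\frac{\theta}{2}e^{i(\varphi+\frac{\chi}{2})}|\downarrow\rangle\right] \tag{2.102} $$

$$ = e^{-i\frac{\chi}{2}}\left[\cos\frac{\theta}{2}|\uparrow\rangle + \sin\frac{\theta}{2}e^{i(\varphi+\chi)}|\downarrow\rangle\right] \tag{2.103} $$

$$ = e^{-i\frac{\chi}{2}}|R_z(\chi)n\rangle \tag{2.104} $$

$$ = e^{-i\frac{\chi}{2}}|n'\rangle. \tag{2.105} $$

Thus the 2 × 2 complex matrix Uz(χ) correctly rotates the quantum state by an angle χ around the z axis of the Bloch sphere. Notice however that the resulting state differs from the standard state by an (irrelevant) global phase factor corresponding to a gauge change.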
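The chain of equalities (2.101)-(2.105) can be verified for sample angles. The sketch assumes the standard state of Eq. (2.101) and reads off the Bloch vector as (⟨σx⟩, ⟨σy⟩, ⟨σz⟩); the function names `ket_n`, `bloch_vector` and `u_z` are illustrative.

```python
# Minimal sketch: U_z(chi) shifts phi by chi on the Bloch sphere, up to a global phase.
# Function names and sample angles are illustrative assumptions.
import cmath
import math

def ket_n(theta, phi):
    """Standard state cos(theta/2)|up> + sin(theta/2) e^{i phi}|down>."""
    return [complex(math.cos(theta / 2)), math.sin(theta / 2) * cmath.exp(1j * phi)]

def bloch_vector(ket):
    """Expectation values of (sigma_x, sigma_y, sigma_z) in a normalized state."""
    a, b = ket
    cross = a.conjugate() * b
    return (2 * cross.real, 2 * cross.imag, abs(a) ** 2 - abs(b) ** 2)

def u_z(chi, ket):
    """Apply U_z(chi) = diag(e^{-i chi/2}, e^{+i chi/2})."""
    return [cmath.exp(-1j * chi / 2) * ket[0], cmath.exp(+1j * chi / 2) * ket[1]]

theta, phi, chi = 1.1, 0.7, 0.5
rotated = u_z(chi, ket_n(theta, phi))
expected = [cmath.exp(-1j * chi / 2) * amp for amp in ket_n(theta, phi + chi)]

print(bloch_vector(rotated))  # (sin t cos(p + chi), sin t sin(p + chi), cos t)
```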

Notice that Uz(−χ) = Uz⁻¹(χ) = Uz†(χ). Hence Uz(χ) is unitary, meaning that

$$ U_z^{\dagger}U_z = I. \tag{2.106} $$

All rotations in Hilbert space are unitary and it is straightforward to show that any unitary transformation U preserves the inner product in the Hilbert space. Consider two arbitrary states |Ψ⟩ and |Φ⟩ to which U is applied

|Ψ′⟩ = U |Ψ⟩ (2.107)

|Φ′⟩ = U |Φ⟩ (2.108)

⟨Φ′| = ⟨Φ|U † (2.109)

⟨Φ′|Ψ′⟩ = ⟨Φ|U †U |Ψ⟩ = ⟨Φ|Ψ⟩. (2.110)

Thus unitary matrices for complex vector spaces preserve inner products and hence are the analog of orthogonal matrices for real vector spaces.

Exercise 2.8. Prove that the Pauli matrices σx, σy, σz are both Hermitian and unitary.

Exercise 2.9. a) Prove that every operator that is both Hermitian and unitary squares to the identity. Using this, prove that if such operators are traceless, their spectrum contains only +1 and −1 and has equal numbers of each.

b) Conversely, prove the following lemma: Every operator that both squares to the identity and is Hermitian, is necessarily unitary.


Exercise 2.10. Every N × N unitary matrix can be written in the form

$$ U = e^{i\theta M}, \tag{2.111} $$

where θ is a real parameter and M is an N × N Hermitian matrix.

a) Using the fact that the eigenvectors of an Hermitian matrix form a complete basis, find the spectrum (set of eigenvalues) and eigenvectors of U in terms of the eigenvalues and eigenvectors of M.

b) Using this result, prove that U is unitary.

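So far we have only considered rotations around the z axis of the Bloch sphere. By analogy with Eq. (2.96), we can rotate the state around the x axis through an angle χ via the unitary

$$ U_x(\chi) = e^{-i\frac{\chi}{2}\sigma^x}, \tag{2.112} $$

around the y axis via

$$ U_y(\chi) = e^{-i\frac{\chi}{2}\sigma^y}, \tag{2.113} $$

and around an arbitrary ω axis via

$$ U_{\omega}(\chi) = e^{-i\frac{\chi}{2}\,\omega\cdot\sigma}. \tag{2.114} $$

Mathematically, the three components of the dimensionless spin angular momentum S = ½(σx, σy, σz) are the generators of (the Lie group of) rotations in spin space. As an example, let us consider a rotation by angle π/2 around the y axis to see how the cardinal points on the Bloch sphere transform into each other as expected:

$$ U_y\!\left(\tfrac{\pi}{2}\right)|+Z\rangle = \left[\cos\frac{\pi}{4}\,I - i\sin\frac{\pi}{4}\,\sigma^y\right]|\uparrow\rangle = \frac{1}{\sqrt{2}}\left[|\uparrow\rangle + |\downarrow\rangle\right] $$

$$ U_y\!\left(\tfrac{\pi}{2}\right)|+Z\rangle = +|+X\rangle, \tag{2.115} $$

$$ U_y\!\left(\tfrac{\pi}{2}\right)|-Z\rangle = -|-X\rangle \tag{2.116} $$

$$ U_y\!\left(\tfrac{\pi}{2}\right)|+X\rangle = +|-Z\rangle \tag{2.117} $$

$$ U_y\!\left(\tfrac{\pi}{2}\right)|-X\rangle = +|+Z\rangle \tag{2.118} $$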


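where we have used the fact that

$$ -i\sigma^y|\uparrow\rangle = -i\begin{pmatrix} 0 & -i \\ +i & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |\downarrow\rangle \tag{2.119} $$

$$ -i\sigma^y|\downarrow\rangle = -i\begin{pmatrix} 0 & -i \\ +i & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\begin{pmatrix} 1 \\ 0 \end{pmatrix} = -|\uparrow\rangle \tag{2.120} $$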
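Eqs. (2.115)-(2.118) can be checked by direct matrix multiplication. The sketch uses Uy(π/2) = cos(π/4)I − i sin(π/4)σy, which is a real matrix because −iσy is real by Eqs. (2.119)-(2.120); the helper names and the tolerance are illustrative assumptions.

```python
# Minimal sketch: apply U_y(pi/2) to the four cardinal states and compare with
# Eqs. (2.115)-(2.118). Helper names and the tolerance are illustrative assumptions.

def apply(matrix, ket):
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

r = 2 ** -0.5
U_y = [[r, -r], [r, r]]          # cos(pi/4) I - i sin(pi/4) sigma_y

plus_z, minus_z = [1.0, 0.0], [0.0, 1.0]
plus_x, minus_x = [r, r], [r, -r]

images = {
    "+Z": apply(U_y, plus_z),    # expect +|+X>
    "-Z": apply(U_y, minus_z),   # expect -|-X>
    "+X": apply(U_y, plus_x),    # expect +|-Z>
    "-X": apply(U_y, minus_x),   # expect +|+Z>
}
for label, ket in images.items():
    print(label, ket)
```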

Exercise 2.11. Prove that

$$ U_x\!\left(\tfrac{\pi}{2}\right)|+Z\rangle = |-Y\rangle \tag{2.121} $$

$$ U_x\!\left(\tfrac{\pi}{2}\right)|-Z\rangle = -i|+Y\rangle. \tag{2.122} $$

2.4.1 The Hadamard Gate

We see from Eqs. (2.115-2.118) that a rotation by π/2 around the y axis interchanges the states ±Z and ±X, but not in a simple way. In order to swap the x and z components of the spin vector without the inconvenience of the various minus signs in Eqs. (2.116-2.118), quantum computer scientists like to invoke the Hadamard gate

$$ H = |+X\rangle\langle +Z| + |-X\rangle\langle -Z| \tag{2.123} $$

$$ = \frac{1}{\sqrt{2}}(\sigma^x + \sigma^z) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \tag{2.124} $$

Exercise 2.12. Using Eq. (2.123), prove that the Hadamard gate obeys

a) $H = |{+Z}\rangle\langle{+X}| + |{-Z}\rangle\langle{-X}|$,

b) $H^2 = I$,

c) H is unitary.

2.5 Hilbert Space and Operators for Multiple Spins

So far we have been focused on the two-dimensional complex vector space describing the quantum states of a single spin. We turn now to the study of how to extend this to two spins and ultimately to N spins. If we have a single spin, the standard basis has two states, |↑⟩ and |↓⟩, and the spin can be in any linear superposition of those two basis states. If we have two spins, we need four basis states: |↓↓⟩, |↓↑⟩, |↑↓⟩, |↑↑⟩. Replacing ↓ by 0 and ↑ by 1, we see that the states are labelled by the binary numbers corresponding to 0, 1, 2, 3 just as in a classical computer memory that contains only two bits. Our quantum bits can however be in a superposition of all four states of the form

$$|\psi\rangle = \psi_{\downarrow\downarrow}|{\downarrow\downarrow}\rangle + \psi_{\downarrow\uparrow}|{\downarrow\uparrow}\rangle + \psi_{\uparrow\downarrow}|{\uparrow\downarrow}\rangle + \psi_{\uparrow\uparrow}|{\uparrow\uparrow}\rangle, \tag{2.125}$$

or equivalently a column vector of length four

$$|\psi\rangle = \begin{pmatrix} \psi_{\uparrow\uparrow} \\ \psi_{\uparrow\downarrow} \\ \psi_{\downarrow\uparrow} \\ \psi_{\downarrow\downarrow} \end{pmatrix}. \tag{2.126}$$

Note that the choice of how to order the entries in the column is completely arbitrary, but once you make a choice you must stick to it for all your calculations. For the case of N spins, all of the above generalizes to a vector space of dimension $2^N$.

We have so far seen two kinds of products for vectors: the inner product ⟨ψ|ϕ⟩, which is a scalar (complex number), and the outer product |ϕ⟩⟨ψ|, which is an operator that has a matrix representation. When it comes to thinking about the quantum states of a composite physical system consisting of multiple qubits, we have to consider yet another kind of product, the tensor product, sometimes called the direct product. The tensor product of the Hilbert space $\mathcal{H}_1$ of the first qubit with that of a second qubit $\mathcal{H}_2$ yields a new Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$ whose dimension is the product of the dimensions of the two individual Hilbert spaces. The two-qubit basis states in Eq. (2.125) can be thought of as direct products of individual vectors,

$$|{\uparrow\downarrow}\rangle = |{\uparrow}\rangle \otimes |{\downarrow}\rangle. \tag{2.127}$$

The direct product of two column vectors of length $n_1$ and $n_2$ is a column vector of length $n_1 n_2$ (for two qubits, 2 × 2 = 4).

How do we represent operators in this multi-qubit space? First we have to put labels on the operators to denote which spin they are acting on. For example, the z component of the spin vector of the jth spin is denoted $\sigma^z_j$, with j = 1, 2. Strictly speaking, we should write the tensor product $\sigma^z_1 \otimes \sigma^0_2$, where $\sigma^0_2$ is the identity operator acting on the second spin, but many authors take a shortcut in the notation by not explicitly writing out the identity operators. The tensor product of a matrix with dimension $n_1 \times n_1$ with a matrix with dimension $n_2 \times n_2$ is a larger matrix of dimension $(n_1 n_2) \times (n_1 n_2)$. This is of course consistent with the fact that the dimension of the direct product Hilbert space is $n_1 n_2$. As an example consider the matrix representation of the operator

$$\sigma^z_1 \otimes \sigma^0_2 = \begin{pmatrix} +1 & 0 \\ 0 & -1 \end{pmatrix} \otimes \begin{pmatrix} +1 & 0 \\ 0 & +1 \end{pmatrix}. \tag{2.128}$$

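The rule for how you write out the entries of this 4 × 4 matrix depends on exactly how you order the terms in the column vector in Eq. (2.126). It is clear however that we want the following to be true (assuming we number the spins in |↑↓⟩ from left to right inside the Dirac ket)

$$\sigma^z_1 \otimes \sigma^0_2\,|{\uparrow\uparrow}\rangle = (+1)|{\uparrow\uparrow}\rangle \tag{2.129}$$

$$\sigma^z_1 \otimes \sigma^0_2\,|{\uparrow\downarrow}\rangle = (+1)|{\uparrow\downarrow}\rangle \tag{2.130}$$

$$\sigma^z_1 \otimes \sigma^0_2\,|{\downarrow\uparrow}\rangle = (-1)|{\downarrow\uparrow}\rangle \tag{2.131}$$

$$\sigma^z_1 \otimes \sigma^0_2\,|{\downarrow\downarrow}\rangle = (-1)|{\downarrow\downarrow}\rangle, \tag{2.132}$$

and

$$\sigma^0_1 \otimes \sigma^z_2\,|{\uparrow\uparrow}\rangle = (+1)|{\uparrow\uparrow}\rangle \tag{2.133}$$

$$\sigma^0_1 \otimes \sigma^z_2\,|{\uparrow\downarrow}\rangle = (-1)|{\uparrow\downarrow}\rangle \tag{2.134}$$

$$\sigma^0_1 \otimes \sigma^z_2\,|{\downarrow\uparrow}\rangle = (+1)|{\downarrow\uparrow}\rangle \tag{2.135}$$

$$\sigma^0_1 \otimes \sigma^z_2\,|{\downarrow\downarrow}\rangle = (-1)|{\downarrow\downarrow}\rangle. \tag{2.136}$$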

Using the basis implicit in Eq. (2.126), the tensor product in Eq. (2.128) is represented by the matrix

$$\sigma^z_1 \otimes \sigma^0_2 = \begin{pmatrix} +1 & 0 & 0 & 0 \\ 0 & +1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2.137}$$

We can think of the direct product as producing a 2 × 2 array of 2 × 2 identity matrices, each multiplied by one of the four entries in $\sigma^z_1$. Reversing the order of the operators gives

$$\sigma^0_1 \otimes \sigma^z_2 = \begin{pmatrix} +1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & +1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2.138}$$

We can think of this as four copies of the $\sigma^z$ matrix, each multiplied by the appropriate entry in the identity matrix.

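We can also consider the direct sum of two operators. For example, the z component of the total spin angular momentum, given by

$$S^z \equiv \sigma^z_1 \oplus \sigma^z_2, \tag{2.139}$$

is represented in matrix form by

$$S^z = \sigma^z_1 \otimes \sigma^0_2 + \sigma^0_1 \otimes \sigma^z_2 = \begin{pmatrix} +2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -2 \end{pmatrix}. \tag{2.140}$$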

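In Wolfram Mathematica®, the following commands will produce these direct products (called outer products in Mathematica®)

(* 2x2 identity matrix *)
II = IdentityMatrix[2];
MatrixForm[II]
(* Pauli Z matrix, entered as a list of rows *)
Z = {{1, 0}, {0, -1}};
MatrixForm[Z]
(* Z on spin 1, identity on spin 2; ArrayFlatten turns the 2x2 array of 2x2 blocks into a 4x4 matrix *)
ZI = ArrayFlatten[Outer[Times, Z, II]];
MatrixForm[ZI]
(* identity on spin 1, Z on spin 2 *)
IZ = ArrayFlatten[Outer[Times, II, Z]];
MatrixForm[IZ]
(* the sum of Eq. (2.140) *)
MatrixForm[IZ + ZI]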
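For readers without access to Mathematica, the same matrices can be produced with a few lines of plain Python. This is only a sketch: `kron` and `add` are small helper functions written here for illustration (lists of rows stand in for matrices), and the basis ordering is the one of Eq. (2.126).

```python
def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def add(a, b):
    """Entry-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


I2 = [[1, 0], [0, 1]]    # sigma^0, the 2x2 identity
Z = [[1, 0], [0, -1]]    # sigma^z

ZI = kron(Z, I2)         # sigma^z on spin 1, identity on spin 2, Eq. (2.137)
IZ = kron(I2, Z)         # identity on spin 1, sigma^z on spin 2, Eq. (2.138)
SZ = add(ZI, IZ)         # the sum in Eq. (2.140)

for name, matrix in (("ZI", ZI), ("IZ", IZ), ("ZI + IZ", SZ)):
    print(name)
    for row in matrix:
        print("  ", row)
```

The nested comprehension in `kron` makes the block structure explicit: each entry of the first matrix multiplies a full copy of the second.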

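Exercise 2.13. Write out the matrix representation of the following two-qubit Pauli matrices

a) $\sigma^x_1 \otimes \sigma^x_2$.

b) $\sigma^x_1 \oplus \sigma^x_2$.

c) $\left(\sigma^0_1 + \sigma^z_1\right) \oplus \frac{1}{2}\left(\sigma^0_2 + \sigma^z_2\right)$.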


Box 2.3. The No-cloning Theorem

The no-cloning theorem states that it is impossible to make a copy of an unknown quantum state.

The essential idea of the no-cloning theorem is that in order to make a copy of an unknown quantum state, you would have to measure it to see what the state is and then use that knowledge to make the copy. However measurement of the state produces random back action (state collapse) and it is not possible to fully determine the state. This is a reflection of the fact that measurement of a qubit yields one classical bit of information, which is not enough in general to fully specify the state via its latitude and longitude on the Bloch sphere. Of course if you have prior knowledge, such as the fact that the state is an eigenstate of σx, then a measurement of σx tells you the eigenvalue and hence the state. The measurement gives you one additional classical bit of information, which is all you need to have complete knowledge of the state.

A more formal statement of the no-cloning theorem is the following. Given an unknown state |ψ⟩ = α|↑⟩ + β|↓⟩ and an ancilla qubit initially prepared in a definite state (e.g. |↓⟩), there does not exist a unitary operation U that will take the initial state

$$|\Phi\rangle = \left[\alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle\right] \otimes |{\downarrow}\rangle \tag{2.141}$$

to the final state

$$U|\Phi\rangle = \left[\alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle\right] \otimes \left[\alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle\right] = \alpha^2|{\uparrow\uparrow}\rangle + \alpha\beta\left[|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\right] + \beta^2|{\downarrow\downarrow}\rangle \tag{2.142}$$

unless U depends on α and β.

The proof is straightforward. The RHS of Eq. (2.141) is linear in α and β, whereas the RHS of Eq. (2.142) is quadratic. This is impossible unless U depends on α and β. For an unknown state, we do not know α and β and therefore cannot construct U.
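The linearity argument can be made concrete with a short worked example. Suppose, purely as an assumption for illustration, that some fixed unitary U succeeds in copying the two basis states onto the ancilla. Linearity then dictates what U does to the superposition in Eq. (2.141), and the result is not the cloned state of Eq. (2.142):

```latex
% Assumption: a fixed U copies each basis state onto the ancilla
U\,|{\uparrow}\rangle\otimes|{\downarrow}\rangle = |{\uparrow\uparrow}\rangle,
\qquad
U\,|{\downarrow}\rangle\otimes|{\downarrow}\rangle = |{\downarrow\downarrow}\rangle .

% Linearity fixes the action on the superposition of Eq. (2.141)
U|\Phi\rangle
  = \alpha\,|{\uparrow\uparrow}\rangle + \beta\,|{\downarrow\downarrow}\rangle .

% Compare with the desired cloned state of Eq. (2.142)
\alpha\,|{\uparrow\uparrow}\rangle + \beta\,|{\downarrow\downarrow}\rangle
  \;\neq\;
  \alpha^2|{\uparrow\uparrow}\rangle
  + \alpha\beta\left[|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\right]
  + \beta^2|{\downarrow\downarrow}\rangle
  \quad\text{whenever } \alpha\beta \neq 0 .
```

The two sides agree (up to a global phase) only when αβ = 0, that is, only when the input was already one of the two basis states that U was designed to copy.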

Exercise 2.14. Assuming you have knowledge of α and β, construct an explicit unitary U(α, β) that carries out the cloning operation in Eq. (2.142).

Chapter 3

Two-Qubit Gates and Entanglement

3.1 Introduction

We learned in Chap. 2 that the physical operations we are allowed to carry out (besides measurement) are represented by unitary matrices acting on state vectors. These operations are rotations in Hilbert space and transform one state into another. Indeed, this is all a quantum computer can do. You load a quantum state into the N qubits of the input register, carry out a unitary rotation in the $2^N$-dimensional Hilbert space, and place the resulting state into the N qubits of the output register. Measure the state of the output register and you acquire N classical bits of information.¹ Unitary operations can be combined with measurements in interesting ways. After carrying out a unitary $U_0$, one could measure a subset of say m qubits. The measurement will yield m classical bits of information corresponding to one instance out of $2^m$ possible results. Conditioned on this, one can choose which of $2^m$ subsequent unitaries $U_j$ from the set $U_1, U_2, \ldots, U_{2^m}$ to apply. The net effect of these three steps is the most general possible quantum operation and is not equivalent to a unitary transformation. Such quantum operations can be more powerful for certain purposes (e.g. quantum error correction).

¹ As noted earlier, N classical bits is far less information than it takes to specify a generic N-qubit state. Thus generically the measurements give random results as the measured state collapses. However certain algorithms, e.g. Shor's factoring algorithm, use quantum interference to 'focus' the amplitudes in the output state so that it consists of a superposition of only a small number of states in the standard basis, and thus the result is not very random. Repeating the algorithm a small number of times yields the desired prime factors with high probability.

For now however, let us concentrate on two-qubit gates (unitary operations). We saw in Chap. 2 that the Hilbert space consisting of the direct product of the Hilbert spaces for two single qubits is four-dimensional and that operators on that space are represented by 4 × 4 matrices. It does not seem that stepping up from one qubit to two qubits is going to lead to anything dramatically new. However, it does: we will explore the fascinating concept of entanglement. Entanglement was deeply disturbing to the founders of quantum mechanics, especially to Einstein, who understood and pointed out its spooky implications. The power of entanglement as a quantum resource has come to be appreciated in recent decades and underlies certain key protocols for quantum communication and for quantum computation.

3.2 The CNOT Gate

The NOT gate in a classical computer simply flips the bit, taking 0 to 1 and 1 to 0. The so-called 'controlled-NOT' or CNOT gate allows one to do 'if-then-else' logic with a computer. It is a two-bit gate that flips the state of the target bit conditioned on the state of the control bit. It has the 'truth table' shown in Table 3.1, from which we see that the target bit is flipped if and only if the control bit is in the 1 state.

CNOT Truth Table

Control In    Target In    Control Out    Target Out
    0             0             0              0
    0             1             0              1
    1             0             1              1
    1             1             1              0

Table 3.1: Action of the two-bit CNOT gate in the standard basis.

In the classical context one can imagine measuring the state of the control bit and then using that information to control the flipping of the target bit. However in the quantum context, it is crucially important to emphasize that measuring the control qubit would collapse its state. We must therefore avoid any measurements and seek a unitary gate which works correctly when the control qubit is in |0⟩ and |1⟩ and even when it is in a superposition of both possibilities α|0⟩ + β|1⟩. As we will soon see, it is this latter situation which will allow us to generate entanglement. When the control qubit is in a superposition state, the CNOT gate causes the target qubit to be both flipped and not flipped in a manner that is correlated with the state of the control qubit. As we will see, these are not ordinary classical statistical correlations (e.g. clouds are correlated with rain), but rather special (and powerful) quantum correlations resulting from entanglement.

Box 3.1. CNOT without Measurement

How can we possibly flip the state of the target qubit conditioned on the state of the control qubit without measuring the control and hence collapsing its state? We know that in many systems we can cause a transition between two quantum levels separated in energy by an amount ħω by applying an electromagnetic wave of frequency ω for a certain precise period of time. The way to make this bit flip of the target be conditioned on the state of the control is to have an interaction between the two qubits that causes the transition energy of the target to depend on the state of the control. For example, consider the Hamiltonian (the energy operator)

$$H = \frac{\hbar\omega_1}{2} Z_1 + \frac{\hbar\omega_2}{2} Z_2 + \hbar g Z_1 Z_2. \tag{3.1}$$

The energy change to flip qubit 2 from |↓⟩ to |↑⟩ (that is, $Z_2$ from −1 to +1) is $\Delta E = \hbar(\omega_2 + 2gZ_1)$. Thus if we use light of frequency $\omega_2 + 2g$, qubit 2 will flip when qubit 1 is in |↑⟩ because that matches the transition frequency. However if qubit 1 is in the state |↓⟩, the transition frequency for qubit 2 is shifted to $\omega_2 - 2g$ and the light has the wrong frequency to cause the transition. See Fig. 3.1 for an illustration of the level scheme.
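The dependence of the target's transition frequency on the control can be read off by evaluating Eq. (3.1) on the four basis states. The sketch below works in units of ħ and uses made-up parameter values (`OMEGA1`, `OMEGA2`, `G` are illustrative, not taken from the text). Since flipping qubit 2 changes $Z_2$ by 2, the energy difference comes out as ω2 + 2g Z1, which matches the frequencies ω2 ± 2g quoted in the caption of Fig. 3.1.

```python
def energy(z1, z2, omega1, omega2, g):
    """Eigenvalue of Eq. (3.1) in units of hbar, for Z1 = z1 and Z2 = z2 (each +1 or -1)."""
    return 0.5 * omega1 * z1 + 0.5 * omega2 * z2 + g * z1 * z2


def flip_frequency_qubit2(z1, omega1, omega2, g):
    """Energy cost (units of hbar) of flipping qubit 2 from down to up at fixed Z1 = z1."""
    return energy(z1, +1, omega1, omega2, g) - energy(z1, -1, omega1, omega2, g)


# Illustrative numbers only (arbitrary units); they do not come from the text.
OMEGA1, OMEGA2, G = 5.0, 6.0, 0.1

control_up = flip_frequency_qubit2(+1, OMEGA1, OMEGA2, G)     # omega2 + 2g
control_down = flip_frequency_qubit2(-1, OMEGA1, OMEGA2, G)   # omega2 - 2g
print(control_up, control_down)
```

Setting `G = 0` makes the two frequencies coincide, which is the statement that conditional operations require a nonzero interaction.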

From the classical truth table we can attempt to construct the appropriate quantum operator by putting the dual of the initial state ket in the bra and the desired final state in the ket

$$\mathrm{CNOT} = |00\rangle\langle 00| + |01\rangle\langle 01| + |11\rangle\langle 10| + |10\rangle\langle 11|. \tag{3.2}$$

We see that this produces all the desired transformations, for example

$$\mathrm{CNOT}|11\rangle = |10\rangle. \tag{3.3}$$


[Figure: energy-level diagram of the four two-qubit states, with transition labels involving ω1, ω2 and 2g.]

Figure 3.1: Energy levels for two interacting qubits whose Hamiltonian is given by Eq. (3.1). Solid lines correspond to transitions in which qubit 1 is flipped. Dashed lines correspond to transitions in which qubit 2 is flipped. By choosing ω1, ω2 and g appropriately, all four single-spin transition frequencies can be made unique, thereby allowing a flip of one qubit conditioned on the state of the other. If qubit 1 is the control and qubit 2 is the target, then illumination at frequency ω2 + 2g executes the CNOT. Illumination at frequency ω1 + 2g executes a CNOT with qubit 2 being the control and qubit 1 being the target. If g = 0, then the transition energy for one qubit is not dependent on the state of the other and conditional operations are not possible.

This however is not enough. We have to prove that the desired transformations are legal. That is, we must show that CNOT is unitary. It is clear by inspection that CNOT is Hermitian. A straightforward calculation shows that $(\mathrm{CNOT})^2 = I$. Hence by the lemma in Ex. 2.9b, CNOT is unitary.

It is also instructive to write the gate in the following manner (dispensing with the ⊗ symbol and numbering the qubits 1 and 2 from left to right)

$$\mathrm{CNOT} = \left(\frac{I_1 + Z_1}{2}\right) X_2 + \left(\frac{I_1 - Z_1}{2}\right) I_2. \tag{3.4}$$

The first parentheses enclose the projector onto the up state for the control qubit (qubit 1). The second parentheses enclose the projector onto the down state for the control qubit. Thus if qubit 1 is in |↑⟩ = |1⟩, then $X_2$ flips the second qubit while the remaining term vanishes. Conversely when qubit 1 is in |↓⟩ = |0⟩, the coefficient of $X_2$ vanishes and only the identity in the second term acts.

Box 3.2. Some desired operations are not unitary.

For example, one important task in a quantum computer is to reset all the bits to some standard state before starting a new computation. Let us take the standard state to be |0⟩. We want to map every initial state to |0⟩, which can be done with the following operator

$$R = |0\rangle\langle 0| + |0\rangle\langle 1| = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}. \tag{3.5}$$

We see immediately that R cannot be unitary because a unitary operation preserves inner products between states. The two initial states |0⟩ and |1⟩ are orthogonal and yet they both map to the state |0⟩, which means that the states under the mapping are no longer orthogonal. Equivalently, we see that R is not invertible because we don't know if $R^{-1}$ is supposed to take |0⟩ to |0⟩ or to |1⟩. One can also mechanically check that R is not unitary by writing

$$R^\dagger = |0\rangle\langle 0| + |1\rangle\langle 0|, \tag{3.6}$$

$$R^\dagger R = \left(|0\rangle\langle 0| + |1\rangle\langle 0|\right)\left(|0\rangle\langle 0| + |0\rangle\langle 1|\right) \tag{3.7}$$

$$= \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = I + \sigma^x \neq I. \tag{3.8}$$

Thus if we only have access to unitary rotations, we cannot reset an unknown state. (Of course if we knew the state, we could find a rotation that would take it back to |0⟩.) There is however a simple fix for this problem. If we measure σz (say), the unknown state will randomly collapse to either |0⟩ or |1⟩. If it collapses to |0⟩, apply the identity operation. If it collapses to |1⟩, then apply the unitary operation σx to flip the state from |1⟩ to |0⟩. We see that applying one of two different unitaries (I or σx) conditioned on the outcome of a measurement gives us a new capability beyond that of unitary operations.

In the standard two-qubit basis defined in Eq. (2.126), the operator in Eq. (3.2) and Eq. (3.4) has the matrix representation

$$\mathrm{CNOT} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{3.9}$$

$$= \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.10}$$
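The step from the projector form of Eq. (3.4) to the 4 × 4 matrix of Eq. (3.10), and the unitarity argument (Hermitian and squaring to the identity), can be checked mechanically. The following plain-Python sketch uses helper functions (`kron`, `add`, `matmul`, `transpose`) that are written here for illustration only; the basis ordering is that of Eq. (2.126), with |↑⟩ = |1⟩ listed first.

```python
def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def add(a, b):
    """Entry-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Matrix transpose (equal to the Hermitian conjugate for real matrices)."""
    return [list(col) for col in zip(*a)]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
P_UP = [[1, 0], [0, 0]]      # (I + Z)/2, projector onto |up> = |1>
P_DOWN = [[0, 0], [0, 1]]    # (I - Z)/2, projector onto |down> = |0>

# Eq. (3.9): flip the target when the control is up, do nothing otherwise.
CNOT = add(kron(P_UP, X), kron(P_DOWN, I2))
I4 = kron(I2, I2)

for row in CNOT:
    print(row)
print("Hermitian:", CNOT == transpose(CNOT))
print("squares to identity:", matmul(CNOT, CNOT) == I4)
```

Because every entry is an integer, the equality checks are exact rather than approximate.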

The CNOT unitary is represented in 'quantum circuit' notation by the construction illustrated in Fig. 3.2.

[Figure: two horizontal qubit lines labeled q1 and q2, joined by the CNOT symbol.]

Figure 3.2: Quantum circuit representation of the CNOT operation. The filled circle denotes the control qubit (in this case, q1) and the symbol ⊕ denotes the target qubit (in this case, q2) for the gate. In quantum circuit notation the order in which gates are applied ('time') runs from left to right. If the circuit has GATE1 followed by GATE2, reading from left to right, this corresponds to the (right-to-left) sequence GATE2 GATE1 |INPUT STATE⟩.

Exercise 3.1. Consider the reset operator R defined in Box 3.2. Find a state whose norm is not preserved under R. This is further proof that R is not a legal unitary operator and requires the assistance of a measurement operation.

Now that we have established the CNOT unitary and how it acts on the standard basis, let us consider what happens when the control qubit is in a superposition state. We begin with a so-called product state or separable state, which can be written as the product of a state for the first qubit and a state for the second qubit

$$|\Psi\rangle = \left[\alpha_1|0\rangle + \beta_1|1\rangle\right] \otimes \left[\alpha_2|0\rangle + \beta_2|1\rangle\right] = \alpha_1\alpha_2|00\rangle + \alpha_1\beta_2|01\rangle + \beta_1\alpha_2|10\rangle + \beta_1\beta_2|11\rangle. \tag{3.11}$$


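For example consider the following separable initial state

$$|\psi\rangle = |{-X}\rangle|{+Z}\rangle = |{\leftarrow}\rangle|{\uparrow}\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}\rangle - |{\downarrow}\rangle\right)|{\uparrow}\rangle \tag{3.12}$$

$$= \frac{1}{\sqrt{2}}\left(|{\uparrow\uparrow}\rangle - |{\downarrow\uparrow}\rangle\right). \tag{3.13}$$

When we apply the CNOT gate to this state, the target qubit both flips and doesn't flip! The two qubits become 'entangled'

$$|B_0\rangle = \mathrm{CNOT}|\psi\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\right). \tag{3.14}$$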

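An entangled state of two qubits² is any state which cannot be written in separable form. A convenient basis in which to represent entangled states is the so-called Bell basis³

$$|B_0\rangle = \frac{1}{\sqrt{2}}\left[|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\right] \tag{3.15}$$

$$|B_1\rangle = \frac{1}{\sqrt{2}}\left[|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\right] \tag{3.16}$$

$$|B_2\rangle = \frac{1}{\sqrt{2}}\left[|{\uparrow\uparrow}\rangle - |{\downarrow\downarrow}\rangle\right] \tag{3.17}$$

$$|B_3\rangle = \frac{1}{\sqrt{2}}\left[|{\uparrow\uparrow}\rangle + |{\downarrow\downarrow}\rangle\right]. \tag{3.18}$$

Each of these four states is a 'maximally entangled' state, but they are mutually orthogonal and therefore must span the full four-dimensional Hilbert space. Thus linear superpositions of them can represent any state, including even product states. For example,

$$|{\uparrow}\rangle|{\uparrow}\rangle = |{\uparrow\uparrow}\rangle = \frac{1}{\sqrt{2}}\left[|B_2\rangle + |B_3\rangle\right]. \tag{3.19}$$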
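A short numerical sketch makes the step from the product state of Eq. (3.13) to the Bell state of Eq. (3.14) explicit. To test separability it uses a standard criterion that is not stated in the text: a two-qubit state with amplitudes (c↑↑, c↑↓, c↓↑, c↓↓) is a product state exactly when c↑↑ c↓↓ − c↑↓ c↓↑ = 0. The function names are illustrative.

```python
import math

R = 1 / math.sqrt(2)

# Eq. (3.10), in the basis ordering (uu, ud, du, dd) of Eq. (2.126).
CNOT = [[0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]]


def kron_vec(u, v):
    """Tensor product of two state vectors."""
    return [a * b for a in u for b in v]


def apply(matrix, vec):
    """Matrix-vector product for small dense matrices."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def product_state_determinant(psi):
    """c_uu * c_dd - c_ud * c_du; zero exactly when psi is a product state."""
    return psi[0] * psi[3] - psi[1] * psi[2]


MINUS_X = [R, -R]    # |-X> = (|up> - |down>)/sqrt(2)
UP = [1, 0]          # |+Z> = |up>

psi = kron_vec(MINUS_X, UP)    # Eq. (3.13)
b0 = apply(CNOT, psi)          # Eq. (3.14)

print(psi, product_state_determinant(psi))
print(b0, product_state_determinant(b0))
```

The determinant vanishes for the input state and is nonzero for the output, so the CNOT has turned a separable state into an entangled one.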

² The concept of entanglement is more difficult to uniquely define and quantify for N > 2 qubits.

³ Named after John Bell, the physicist who in the 1960's developed deep insights into the issues surrounding the concept of entanglement that so bothered Einstein. Bell proposed a rigorous experimental test of the idea that the randomness in quantum experiments is due to our inability to access hidden classical variables. At the time this was a theorist's 'gedanken' experiment, but today the 'Bell test' has rigorously ruled out the possibility of hidden variables. Indeed, the Bell test is now a routine engineering test to make sure that your quantum computer really is a quantum computer, not a classical computer.


Entanglement is very mysterious and entangled states have many peculiar and counter-intuitive properties. In an entangled state the individual spin components have zero expectation value and yet the spins are strongly correlated. For example,

$$\langle B_0|\vec\sigma_1|B_0\rangle = \vec 0 \tag{3.20}$$

$$\langle B_0|\vec\sigma_2|B_0\rangle = \vec 0 \tag{3.21}$$

$$\langle B_0|\sigma^x_1\sigma^x_2|B_0\rangle = -1 \tag{3.22}$$

$$\langle B_0|\sigma^y_1\sigma^y_2|B_0\rangle = -1 \tag{3.23}$$

$$\langle B_0|\sigma^z_1\sigma^z_2|B_0\rangle = -1. \tag{3.24}$$

This means that the spins have quantum correlations which are stronger than is possible classically. In particular,

$$\langle B_0|\vec\sigma_1\cdot\vec\sigma_2|B_0\rangle = -3 \tag{3.25}$$

despite the fact that in any single-spin (or product) state $|\langle\vec\sigma\rangle| = 1$.

We can obtain a useful picture of the Bell states by examining all of the possible two-spin correlators in the so-called 'Pauli bar plot' for the state |B0⟩ shown in Fig. 3.3. We see that all the single-spin operator (e.g. IX and ZI) expectation values vanish. Because of the entanglement, each spin is, on average, totally unpolarized. Yet three of the two-spin correlators, XX, YY, ZZ, are all −1, indicating that the two spins are pointing in exactly opposite directions. This is because |B0⟩ is the rotationally invariant 'spin-singlet' state.

[Figure: bar chart. Horizontal axis: II, IX, IY, IZ, XI, YI, ZI, XX, XY, XZ, YX, YY, YZ, ZX, ZY, ZZ. Vertical axis: expectation value from −1.0 to 1.0.]

Figure 3.3: 'Pauli bar plot' of one and two spin correlators in the Bell state |B0⟩.
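The sixteen bars of such a plot are simply the expectation values of all tensor products of I, X, Y, Z in the state |B0⟩. As a rough sketch (plain Python, illustrative helper names, basis ordering of Eq. (2.126)), they can be tabulated as follows.

```python
import math


def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def apply(matrix, vec):
    """Matrix-vector product for small dense matrices."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def expectation(op, psi):
    """<psi|op|psi>; the real part is returned because op is Hermitian."""
    return sum(a.conjugate() * b for a, b in zip(psi, apply(op, psi))).real


PAULI = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

R = 1 / math.sqrt(2)
B0 = [0, R, -R, 0]    # (|ud> - |du>)/sqrt(2), Eq. (3.15)

# Label "XZ" means X on spin 1 and Z on spin 2.
correlators = {p + q: expectation(kron(PAULI[p], PAULI[q]), B0)
               for p in "IXYZ" for q in "IXYZ"}

for label, value in correlators.items():
    print(f"{label}: {value:+.3f}")
```

Only II, XX, YY and ZZ come out nonzero, in agreement with Eqs. (3.20-3.24); replacing `B0` by one of the other Bell states gives the corresponding plot.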


In maximally entangled Bell states, the individual spins do not have their own 'state' in the sense that $\langle B|\vec\sigma_1|B\rangle = \vec 0$, rather than $\hat n$, where $\hat n$ is some unit vector. We can only say for |B0⟩, for example, that the z components of the two spins will always be measured to be exactly opposite each other and have product −1. Remarkably, the same is true for the x and y components as well. Classical vectors of length 1 drawn from a random probability distribution simply cannot have this property. Thus the correlations of the spins in entangled states are intrinsically quantum in nature and stronger than is possible in any classical (or 'hidden variable' quantum) model.

Exercise 3.2. Derive Eqs. (3.20-3.25)

3.3 Bell Inequalities

We turn now to further consideration of the 'spooky' correlations in entangled states. We have already seen for the Bell state |B0⟩ that the components of the two spins are perfectly anti-correlated. Suppose now that Alice prepares two qubits in this Bell state and then sends one of the two qubits to Bob, who is far away (say one light-year). Alice now chooses to make a measurement of her qubit projection along some arbitrary axis $\hat n$. For simplicity let us say that she chooses the z axis. Let us further say that her measurement result is −1. Then she immediately knows that if Bob chooses to measure his qubit along the same axis, his measurement result will be the opposite, +1. It seems that Alice's measurement has collapsed the state of the spins from |B0⟩ to |↓↑⟩. This 'spooky action at a distance', in which Alice's measurement seems to instantaneously change Bob's distant qubit, was deeply troubling to Einstein [2].

Upon reflection one can see that this effect cannot be used for superluminal communication (in violation of special relativity). Even if Alice and Bob had agreed in advance on what axis to use for measurements, Alice has no control over her measurement result and so cannot use it to signal Bob. It is true that Bob can immediately find out what Alice's measurement result was, but this does not give Alice the ability to send a message. In fact, suppose that Bob's clock were off and he accidentally made his measurement slightly before Alice. Would either he or Alice be able to tell? The answer is no, because each would see a random result just as expected. This must be so because in special relativity, the very concept of simultaneity is frame-dependent and not universal.

Things get more interesting when Alice and Bob choose different measurement axes. Einstein felt that quantum mechanics must somehow not be a complete description of reality and that there might be 'hidden variables' which, if they could be measured, would remove the probabilistic nature of the quantum description. However in 1964 John S. Bell proved a remarkable inequality [3] showing that when Alice and Bob use certain different measurement axes, the correlations between their measurement results are stronger than any possible classical correlation that could be induced by (local) hidden variables. Experimental violation of the Bell inequality proves that it is not true that quantum observables have values (determined by hidden classical variables) before we measure them. Precision experimental measurements which violate the Bell inequalities are now the strongest proof that quantum mechanics is correct and that local hidden variable theories are excluded.

Perhaps the simplest way to understand this result is to consider the CHSH inequality developed by Clauser, Horne, Shimony and Holt [4] following Bell's ideas. Consider the measurement axes shown in Fig. 3.4. The experiment consists of many trials of the following protocol. Alice and Bob share a pair of qubits in an entangled state. Alice randomly chooses to measure the first qubit using X or Z while Bob randomly chooses to measure the second qubit using X′ or Z′, which are rotated 45 degrees relative to Alice's axes. After many trials (each starting with a fresh copy of the entangled state), Alice and Bob compare notes (via classical subluminal communication) on their measurement results and compute the following correlation function

$$S = \langle XX'\rangle + \langle ZZ'\rangle - \langle XZ'\rangle + \langle ZX'\rangle. \tag{3.26}$$

By correlation function of two measurement results we mean (for example)

$$\langle XZ'\rangle = \langle B_0|XZ'|B_0\rangle. \tag{3.27}$$

Experimentally this is measured as follows. For the XZ′ correlator, Alice and Bob select those instances in which Alice happens to have chosen to measure X and Bob happens to have chosen to measure Z′. (This occurs say N times, corresponding to 25% of the total number of trials on average.) Let $x_j = \pm 1$ and $z'_j = \pm 1$ be respectively Alice's random measurement result for X in the jth trial and Bob's random measurement result for Z′ in the jth trial. Then Alice and Bob's comparison of measurement results can be used to create the following unbiased (but noisy) estimator of the correlation

$$\langle B_0|XZ'|B_0\rangle \approx \frac{1}{N}\sum_{j=1}^{N} x_j z'_j. \tag{3.28}$$

In the limit of large N the statistical uncertainty in the estimator goes to zero and we obtain an arbitrarily accurate estimate of the correlation. The four possible combinations of the two measurement results are illustrated in Fig. 3.5. The correlator is given by the probabilities of the four outcomes

$$\langle B_0|XZ'|B_0\rangle = P_{++} - P_{-+} + P_{--} - P_{+-}. \tag{3.29}$$

If the measurements are perfectly correlated ($x_j = z'_j$ every time) then the correlator will be +1. If perfectly anticorrelated ($x_j = -z'_j$ every time), then the correlator will be −1. If the measurements are uncorrelated, then all four measurement outcomes will be equally likely and $x_j z'_j$ will be fully random (±1 with equal probability) and the correlator will vanish (on average).
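The estimator of Eq. (3.28) can be illustrated with a small simulation. The sketch below does not simulate the qubits themselves. It assumes each single-spin result is unbiased (as it is for a Bell state), so that a target correlator E fixes the outcome probabilities as P++ = P−− = (1 + E)/4 and P+− = P−+ = (1 − E)/4, consistent with Eq. (3.29). The value E = 1/√2 used for ⟨B0|XZ′|B0⟩ follows from Eqs. (3.22) and (3.33); the function name, seed and trial count are arbitrary choices.

```python
import math
import random


def estimate_correlator(correlator, n_trials, rng):
    """Sample n_trials outcome pairs (x_j, z'_j) and return (1/N) * sum_j x_j * z'_j.

    Assumes unbiased single-spin results, so that
    P++ = P-- = (1 + E)/4 and P+- = P-+ = (1 - E)/4 for a correlator E.
    """
    same = (1 + correlator) / 4
    different = (1 - correlator) / 4
    outcomes = [(+1, +1), (-1, -1), (+1, -1), (-1, +1)]
    samples = rng.choices(outcomes, weights=[same, same, different, different],
                          k=n_trials)
    return sum(x * z for x, z in samples) / n_trials


rng = random.Random(2020)       # fixed seed so that the run is repeatable
exact = 1 / math.sqrt(2)        # <B0|XZ'|B0>
for n in (100, 10_000, 1_000_000):
    print(n, estimate_correlator(exact, n, rng))
print("exact", exact)
```

The scatter of the estimate around the exact value shrinks roughly as 1/√N, which is the sense in which the estimator is unbiased but noisy.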

[Figure: Alice's measurement axes X and Z and Bob's axes X′ and Z′, rotated by 45° relative to Alice's.]

Figure 3.4: Measurement axes used by Alice (solid lines) and Bob (dashed lines) in establishing the Clauser-Horne-Shimony-Holt (CHSH) inequality.

Eq. (3.26) can be rewritten

$$S = \langle (X + Z)X'\rangle - \langle (X - Z)Z'\rangle. \tag{3.30}$$

Alice and Bob note that their measurement results are random variables which are always equal to either +1 or −1. In a particular trial Alice chooses randomly to measure either X or Z. If you believe in the hidden variable theory, then surely, the quantities not measured still have a value of either +1 or −1 (because when we do measure them, they always are either +1 or −1). If this is true, then either X = Z or X = −Z. Thus either X + Z vanishes or X − Z vanishes in any given realization of the random variables. The combination that does not vanish is either +2 or −2. Hence it follows immediately that S is bounded by the CHSH inequality

[Figure: the four possible outcome pairs (±1, ±1) plotted with x_j and z′_j as axes.]

Figure 3.5: Illustration of the four possible measurement outcomes in the jth run of an experiment in which Alice measures X and Bob measures Z′. The net correlator of the two measurements is given by Eq. (3.29).

−2 ≤ S ≤ +2. (3.31)
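The counting argument behind the bound can be checked by brute force. In a hidden variable picture every trial carries definite values x, z, x′, z′ = ±1, so the single-trial combination appearing in Eq. (3.26) can be enumerated over all 16 assignments. This is a minimal sketch; the function name is illustrative.

```python
from itertools import product


def chsh_value(x, z, x_prime, z_prime):
    """Single-trial value of XX' + ZZ' - XZ' + ZX' for definite outcomes +1 or -1."""
    return x * x_prime + z * z_prime - x * z_prime + z * x_prime


# Enumerate every assignment of +1/-1 to (X, Z, X', Z').
values = {chsh_value(*assignment) for assignment in product((+1, -1), repeat=4)}
print(sorted(values))   # [-2, 2]
```

Since each trial contributes either −2 or +2, an average over any probability distribution of hidden variables must lie between −2 and +2, which is the CHSH bound.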

It turns out however that the quantum correlations between the two spins in the entangled pair violate this classical bound. They are stronger than can ever be possible in any classical local hidden variable theory. To see this, note that because $\vec\sigma$ is a vector, we can resolve its form in one basis in terms of the other via

$$\sigma^{x'} = \frac{1}{\sqrt{2}}\left[\sigma^z + \sigma^x\right] \tag{3.32}$$

$$\sigma^{z'} = \frac{1}{\sqrt{2}}\left[\sigma^z - \sigma^x\right]. \tag{3.33}$$

Thus we can express S in terms of the 'Pauli bar' correlations

$$S = \frac{1}{\sqrt{2}}\left[\langle XX + XZ + ZX + ZZ\rangle - \langle XZ - XX - ZZ + ZX\rangle\right]. \tag{3.34}$$

For Bell state |B0⟩, these correlations are shown in Fig. 3.3 and yield

$$S = -2\sqrt{2}, \tag{3.35}$$

in clear violation of the CHSH inequality. Strong violations of the CHSH inequality are routine in modern experiments.⁴ This teaches us that in our quantum world, observables do not have values if you do not measure them. There are no hidden variables which determine the random values. The unobserved spin components simply do not have values. Recall that σx and σz are incompatible observables and when we choose to measure one, the other remains not merely unknown, but unknowable.
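The value S = −2√2 can be reproduced directly from the definitions, without going through the Pauli bar plot. The sketch below builds Bob's rotated operators from Eqs. (3.32) and (3.33), evaluates the four correlators of Eq. (3.26) in the state |B0⟩, and combines them. It uses plain Python with illustrative helper names and the basis ordering of Eq. (2.126).

```python
import math


def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def apply(matrix, vec):
    """Matrix-vector product for small dense matrices."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


R = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

# Bob's axes, rotated by 45 degrees: Eqs. (3.32) and (3.33).
X_PRIME = [[R * (Z[i][j] + X[i][j]) for j in range(2)] for i in range(2)]
Z_PRIME = [[R * (Z[i][j] - X[i][j]) for j in range(2)] for i in range(2)]

B0 = [0, R, -R, 0]    # Eq. (3.15); all amplitudes are real


def correlator(alice_op, bob_op):
    """<B0| alice_op (x) bob_op |B0> for real operators and a real state."""
    return sum(a * b for a, b in zip(B0, apply(kron(alice_op, bob_op), B0)))


S = (correlator(X, X_PRIME) + correlator(Z, Z_PRIME)
     - correlator(X, Z_PRIME) + correlator(Z, X_PRIME))
print(S, -2 * math.sqrt(2))
```

Each of the four terms contributes −1/√2 to the sum, so the quantum value lies outside the classical range [−2, +2].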

It is ironic that Einstein's arguments that quantum mechanics must be incomplete because of the spooky properties of entanglement have led to the strongest experimental tests verifying the quantum theory and falsifying all local hidden variable theories. We are forced to give up the idea that physical observables have values before they are observed.

Exercise 3.3. Other Bell inequalities.

1. Work out the 'Pauli bar plots' (analogous to Fig. 3.3) for each of the Bell states B1, B2, B3.

2. Using the same quantization axes as in Fig. 3.4, find the analog of Eq. (3.26) for the correlators that should be measured to achieve violation of the Bell inequality for these other Bell states.

3.4 Quantum Dense Coding

Now that we have learned about Bell states describing the entanglement of a pair of qubits, we will show that entanglement is a powerful resource for quantum information tasks. In this section we will see how quantum entanglement can be used as a resource to help in the task of quantum communication.

⁴ Although strictly speaking, in many experiments there are loopholes associated with imperfections in the detectors and the fact that Alice and Bob typically do not have a space-like separation. However recent experiments have closed all these loopholes. Note that all measured correlators have statistical uncertainty in them when the number of measurements N is finite. However modern experiments can achieve results that exceed the Bell bound by many standard deviations.

We saw above that quantum correlations are strong enough to violate certain classical bounds, but the spooky action at a distance seemed unable to help us send signals. Actually it turns out that by using a special 'quantum dense coding' protocol, we can use entanglement to help us transmit information in a way that is impossible classically [1]. As background let us recall that for a single qubit there are only two orthogonal states, which are connected by a rotation of the spin through an angle of π. For rotation by π around the y axis the unitary rotation operator is

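$$R^y_\pi = e^{-i\frac{\pi}{2}\sigma^y} = -i\sigma^y, \tag{3.36}$$

and we map between the two orthogonal states

$$R^y_\pi|{\uparrow}\rangle = +|{\downarrow}\rangle \tag{3.37}$$

$$R^y_\pi|{\downarrow}\rangle = -|{\uparrow}\rangle \tag{3.38}$$

while any other rotation angle simply produces general linear combinations of the two basis states. For example, for rotation by π/2 we have

$$R^y_{\frac{\pi}{2}} = e^{-i\frac{\pi}{4}\sigma^y} = \frac{1}{\sqrt{2}}\left[1 - i\sigma^y\right], \tag{3.39}$$

and we have

$$R^y_{\frac{\pi}{2}}|{\uparrow}\rangle = \frac{1}{\sqrt{2}}\left[|{\uparrow}\rangle + |{\downarrow}\rangle\right] = |{\rightarrow}\rangle \tag{3.40}$$

$$R^y_{\frac{\pi}{2}}|{\downarrow}\rangle = -\frac{1}{\sqrt{2}}\left[|{\uparrow}\rangle - |{\downarrow}\rangle\right] = -|{\leftarrow}\rangle. \tag{3.41}$$

The rotated states are linearly independent of the starting state but never orthogonal to it except for the special case of rotation by π.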

The situation is very different for two-qubit entangled states. We take as our basis the four orthogonal Bell states. Suppose that Alice prepares the Bell state |B0⟩ and sends one of the qubits to Bob, who is in a distant location. Using a remarkable protocol called quantum dense coding [1], Alice can now send Bob two classical bits of information by sending him the remaining qubit. The protocol relies on the amazing fact that Alice can transform the initial Bell state into any of the others by purely local operations on her remaining qubit without communicating with Bob. The four possible unitary operations Alice should perform are $I_1, -X_1, -iY_1, Z_1$, which yield⁵

$$I_1|B_0\rangle = |B_0\rangle \tag{3.42}$$

$$Z_1|B_0\rangle = |B_1\rangle \tag{3.43}$$

$$-X_1|B_0\rangle = |B_2\rangle \tag{3.44}$$

$$-iY_1|B_0\rangle = |B_3\rangle. \tag{3.45}$$
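Eqs. (3.42-3.45) are easy to verify numerically by applying each single-qubit operator to the first qubit only, that is, by forming its tensor product with the identity on Bob's qubit. The following is a sketch in plain Python with illustrative helper names; the Bell states are written in the basis ordering of Eq. (2.126).

```python
import math


def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def scaled(factor, matrix):
    """Multiply every entry of a matrix by a (possibly complex) factor."""
    return [[factor * entry for entry in row] for row in matrix]


def apply(matrix, vec):
    """Matrix-vector product for small dense matrices."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def close(u, v, tol=1e-12):
    """True when two vectors agree component by component."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


R = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

BELL = {
    "B0": [0, R, -R, 0],   # Eq. (3.15)
    "B1": [0, R, R, 0],    # Eq. (3.16)
    "B2": [R, 0, 0, -R],   # Eq. (3.17)
    "B3": [R, 0, 0, R],    # Eq. (3.18)
}

# Alice's local operations: act on qubit 1, identity on Bob's qubit 2.
ALICE_OPS = {
    "B0": kron(I2, I2),                 # I_1
    "B1": kron(Z, I2),                  # Z_1
    "B2": scaled(-1, kron(X, I2)),      # -X_1
    "B3": scaled(-1j, kron(Y, I2)),     # -i Y_1
}

for target, op in ALICE_OPS.items():
    print(target, close(apply(op, BELL["B0"]), BELL[target]))
```

All four lines should print `True`, confirming that Alice reaches every Bell state without touching Bob's qubit.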

It seems somehow miraculous that without touching Bob's qubit, Alice can reach all four orthogonal states by merely rotating her own qubit. However the fact that this is possible follows immediately from the Pauli bar plot in Fig. 3.3 and the corresponding plots for the other Bell states. In every Bell state, the expectation value of every component of $\vec\sigma_1$ (and $\vec\sigma_2$) vanishes. Thus for example

$$\langle B_0|\sigma^x_1|B_0\rangle = \langle B_0|X_1|B_0\rangle = 0. \tag{3.46}$$

But this can only occur if the state $\sigma^x_1|B_0\rangle$ is orthogonal to $|B_0\rangle$! This in turn means that there are four possible two-bit messages Alice can send by associating each with one of the four operations $I_1, Z_1, -X_1, -iY_1$ according to the following⁶

message    operation
  00         −X1
  01         I1
  10         −iY1
  11         Z1

After encoding her message by carrying out the appropriate operation on her qubit, Alice physically transmits her qubit to Bob. Bob then makes a joint measurement (details provided further below) on the two qubits which tells him which of the four Bell states he has and thus recovers two classical bits of information even though Alice sent him only one quantum bit after deciding what the message was. The pre-positioning of the entangled pair has given them a resource which doubles the number of classical bits that can be transmitted with one (subsequent) quantum bit. Of course in total Alice transmitted two qubits to Bob. The key point is that the first one was sent in advance of deciding what the message was. How weird is that!?

⁵ As usual we are simplifying the notation. For example Z1 stands for the more mathematically formal Z ⊗ I since it applies Z to the first (Alice’s) qubit and the identity to the second (Bob’s) qubit. Note that the phase factors −1 in front of X1 and −i in front of Y1 are irrelevant global phases.

⁶ The particular operator associated with each of the four binary numbers is somewhat arbitrary and was chosen in this case to correspond to a particular choice of Bob’s decoding circuit which will be described later.
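The whole round trip can be sketched in a few lines. This is a toy model under stated assumptions: the Bell states are written in the basis |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩ with the forms implied by Eqs. (3.42)-(3.45) and (3.53), and Bob's joint measurement is modeled directly as a projection onto the four Bell states rather than by any particular decoding circuit. The names `on_alice`, `send` and `receive` are hypothetical.

```python
import math

R = 1 / math.sqrt(2)
# Assumed amplitudes in the order |uu>, |ud>, |du>, |dd> (Alice's qubit first).
BELL = {
    "B0": [0, R, -R, 0],
    "B1": [0, R, R, 0],
    "B2": [R, 0, 0, -R],
    "B3": [R, 0, 0, R],
}

# Message -> Alice's local 2x2 operation, following the message table.
ENCODE = {
    "00": [[0, -1], [-1, 0]],   # -X1
    "01": [[1, 0], [0, 1]],     # I1
    "10": [[0, -1], [1, 0]],    # -iY1 (a real matrix)
    "11": [[1, 0], [0, -1]],    # Z1
}
# Bell state produced by each operation (Eqs. 3.42-3.45), read backwards.
DECODE = {"B2": "00", "B0": "01", "B3": "10", "B1": "11"}

def on_alice(u, state):
    """Apply the 2x2 matrix u to Alice's qubit only, i.e. u tensor identity."""
    out = [0j] * 4
    for a in range(2):
        for b in range(2):
            for a_new in range(2):
                out[2 * a_new + b] += u[a_new][a] * state[2 * a + b]
    return out

def send(message):
    """Alice encodes two classical bits by acting on her half of |B0>."""
    return on_alice(ENCODE[message], BELL["B0"])

def receive(state):
    """Bob's Bell-basis measurement: Born-rule probability of each outcome."""
    probs = {name: abs(sum(x * y for x, y in zip(b, state))) ** 2
             for name, b in BELL.items()}
    winner = max(probs, key=probs.get)
    return DECODE[winner], probs

for msg in ENCODE:
    decoded, probs = receive(send(msg))
    print(msg, "->", decoded)
```

Because the four encoded states are mutually orthogonal, exactly one outcome has probability one for each message, so Bob's result is deterministic.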

This remarkable protocol sheds considerable light on the concerns that Einstein raised in the EPR paradox [2]. It shows that the special correlations in Bell states can be used to communicate information in a novel and efficient way by ‘prepositioning’ entangled pairs shared by Alice and Bob. However causality and the laws of special relativity are not violated because Alice still has to physically transmit her qubit(s) to Bob in order to send the information.

The above protocol requires Bob to determine which of the four Bell states he has. The quantum circuit shown in Fig. (3.6) uniquely maps each of the Bell states onto one of the four computational basis states (eigenstates of Z1 and Z2). The first symbol indicates the CNOT gate which flips the target qubit (in this case qubit 2) if and only if the control qubit (qubit 1) is in the excited state. The second gate in the circuit (denoted H) is the Hadamard gate that was introduced in Sec. 2.4.1 acting on the first qubit.

Figure 3.6: Bell measurement circuit with a CNOT and Hadamard gate. This circuit permits measurement of which Bell state a pair of qubits is in by mapping the states to the basis of eigenstates of σz1 and σz2 (that is, of Z1 and Z2).

Exercise 3.4. Prove the following identities for the circuit shown in Fig. (3.6) where qubit 1 is the control and qubit 2 is the target

H1CNOT12|B0⟩ = +|01⟩ (3.47)

H1CNOT12|B1⟩ = +|11⟩ (3.48)

H1CNOT12|B2⟩ = +|00⟩ (3.49)

H1CNOT12|B3⟩ = +|10⟩ (3.50)

Once Bob has mapped the Bell states onto unique computational basis states, he measures Z1 and Z2 separately, thereby gaining two bits of classical information and effectively reading the message Alice has sent. Note that the overall sign in front of the basis states produced by the circuit is irrelevant and not observable in the measurement process. Also note that to create Bell states in the first place, Alice can simply run the circuit in Fig. (3.6) backwards.

Exercise 3.5. Construct explicit quantum circuits that take the starting state |00⟩ to each of the four Bell states.

3.5 No-Cloning Theorem Revisited

Now that we have understood the EPR paradox and communication via quantum dense coding, we can gain some new insight into the no-cloning theorem. It turns out that if cloning of an unknown quantum state were possible, then we could use an entangled pair to communicate information at superluminal speeds in violation of special relativity. Consider the following protocol. Alice and Bob share an entangled Bell pair in state |B0⟩. Alice chooses to measure her qubit in either the Z basis or the X basis. The choice she makes defines one classical bit of information. The result of the measurement collapses the entangled pair into a simple product state. If Alice chooses the Z basis for her measurement, then Bob’s qubit will be either | ↑⟩ or | ↓⟩. If Alice chooses to measure in the X basis, then Bob’s qubit will be either | →⟩ or | ←⟩. Bob can distinguish these cases by cloning his qubit to make many copies. If he measures a large number of copies in the Z basis and always gets the same answer, he knows that his qubit is almost certainly in a Z eigenstate. If even a single measurement result is different from the first, he knows his qubit cannot be in a Z eigenstate and so must be in an X eigenstate. (Of course he could also measure a bunch of copies in the X basis and gain the same information.) This superluminal communication would violate special relativity and hence cloning must be impossible.
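The phrase ‘almost certainly’ can be quantified. If Bob’s qubit is an X eigenstate, each Z measurement on a copy is a fair coin flip, so the chance that N copies all agree (and fool him) is 2 × (1/2)^N. The sketch below simply evaluates this; the function names are invented for illustration and the (hypothetical) cloner is taken for granted.

```python
from fractions import Fraction

def prob_all_agree(n_copies, is_z_eigenstate):
    """Probability that n Z-basis measurements on identical copies all agree."""
    if is_z_eigenstate:
        return Fraction(1)                            # a Z eigenstate always gives the same answer
    return Fraction(2) * Fraction(1, 2) ** n_copies   # all +1 or all -1 by chance

def guess_basis(results):
    """Bob's decision rule: any disagreement rules out a Z eigenstate."""
    return "Z" if len(set(results)) == 1 else "X"

for n in (2, 10, 20):
    print(n, prob_all_agree(n, is_z_eigenstate=False))
```

With only 20 copies the chance of mistaking an X eigenstate for a Z eigenstate is already below two parts in a million.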

In fact, cloning would make it possible for Alice to transmit an unlimited number of classical bits using only a single Bell pair. Alice could choose an arbitrary measurement axis n. The specification of n requires two real numbers (the polar and azimuthal angles). It would take a very large number of bits to represent these real numbers to some high accuracy. Now if Bob can make an enormous number of copies of his qubit, he can divide the copies into three groups (one for each Cartesian component) and measure the vector spin polarization ⟨σ⟩ to arbitrary accuracy. From this he knows the polarization axis n = ±⟨σ⟩ Alice chose (up to an unknown sign since he does not know the sign of Alice’s measurement result for n · σ). Hence Bob has learned a large number of classical bits of information. The accuracy (and hence the number of bits) is limited only by the statistical uncertainties resulting from the fact that his individual measurement results can be random, but these can be reduced to an arbitrarily low level with a sufficiently large number of copies of the state.
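A small Monte Carlo sketch illustrates how the statistical error shrinks with the number of (hypothetically) cloned copies. It assumes Bob’s qubit is in the state |±n⟩, so that a measurement of any Cartesian component c of σ returns +1 with probability (1 ± n_c)/2; the function name, the chosen angles and the random seed are all arbitrary.

```python
import math
import random

def estimate_polarization(theta, phi, copies_per_group, rng, sign=+1):
    """Estimate <sigma> for the state |sign * n> from three groups of copies."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    estimate = []
    for component in n:
        p_plus = (1 + sign * component) / 2      # Born rule for outcome +1
        total = sum(1 if rng.random() < p_plus else -1
                    for _ in range(copies_per_group))
        estimate.append(total / copies_per_group)
    return n, estimate

rng = random.Random(0)   # fixed seed so the sketch is reproducible
axis, estimate = estimate_polarization(theta=1.0, phi=2.0,
                                       copies_per_group=20000, rng=rng)
worst_error = max(abs(a - e) for a, e in zip(axis, estimate))
print("true axis:", axis)
print("estimate :", estimate)
```

The error in each component scales like 1/√N for N copies per group, so each additional bit of accuracy costs roughly four times as many copies.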

3.6 Quantum Teleportation

[Not covered in class in 2019]

Even though it is impossible to clone an unknown quantum state, it is possible for Alice to ‘teleport’ an unknown state to Bob as long as her copy of the original is destroyed in the process [5]. Just as for quantum dense coding, teleportation protocols take advantage of the power of ‘pre-positioned’ entangled pairs. However unlike quantum dense coding where Alice ultimately sends her qubit to Bob, teleportation only requires Alice to send two classical bits to Bob. She does not have to send any additional quantum bits to Bob after pre-positioning the initial Bell pair. A simple protocol is as follows: Alice creates a |B0⟩ Bell state and sends one of the qubits to Bob. Alice has in her possession an additional qubit in an unknown state

|ψ⟩ = α| ↑⟩+ β| ↓⟩ (3.51)

which she wishes Bob to be able to obtain without her sending it to him. Again, this must be done at the expense of destroying the state of her copy of the qubit because of the no-cloning theorem.

Alice applies the Bell state determination protocol illustrated in Fig. (3.6) to determine the joint state of the unknown qubit and her half of the Bell pair she shares with Bob. She then transmits two classical bits giving her measurement results to Bob. To see how Bob is able to reconstruct the initial state, note that we can rewrite the initial state of the three qubits in the basis of Bell states for the two qubits that Alice will be measuring as follows

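$$
\begin{aligned}
|\Psi\rangle &= [\alpha|\uparrow\rangle + \beta|\downarrow\rangle]\,|B_0\rangle && (3.52)\\
&= \frac{\alpha}{\sqrt{2}}\,[|\uparrow\uparrow\downarrow\rangle - |\uparrow\downarrow\uparrow\rangle] + \frac{\beta}{\sqrt{2}}\,[|\downarrow\uparrow\downarrow\rangle - |\downarrow\downarrow\uparrow\rangle] && (3.53)\\
&= \frac{1}{2}|B_0\rangle\,[-\alpha|\uparrow\rangle - \beta|\downarrow\rangle]\\
&\quad + \frac{1}{2}|B_1\rangle\,[-\alpha|\uparrow\rangle + \beta|\downarrow\rangle]\\
&\quad + \frac{1}{2}|B_2\rangle\,[+\beta|\uparrow\rangle + \alpha|\downarrow\rangle]\\
&\quad + \frac{1}{2}|B_3\rangle\,[-\beta|\uparrow\rangle + \alpha|\downarrow\rangle] && (3.54)
\end{aligned}
$$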

From this representation we see that when Alice tells Bob which Bell state she found, Bob can find a local unitary operation to perform on his qubit to recover the original unknown state (up to an irrelevant overall phase). The appropriate operations are

Alice’s Bell state    Bob’s operation
|B0⟩                  I
|B1⟩                  Z
|B2⟩                  X
|B3⟩                  Y

Algebra leading to this table needs to be checked.
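One way to check the table is to do the algebra numerically. The sketch below builds the three-qubit state |ψ⟩ ⊗ |B0⟩, projects qubits 1 and 2 onto each Bell state, applies Bob’s listed operation, and computes the fidelity with the original |ψ⟩. It assumes |↑⟩ = (1, 0), the standard Pauli matrices, and Bell states of the form implied by Eqs. (3.42)-(3.45) and (3.53); `teleport` is a name invented for this illustration.

```python
import math

R = 1 / math.sqrt(2)
# |B0>..|B3> as amplitudes of |uu>, |ud>, |du>, |dd> (assumed forms).
BELL = [[0, R, -R, 0], [0, R, R, 0], [R, 0, 0, -R], [R, 0, 0, R]]
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
BOB_OP = [I2, Z, X, Y]        # Bob's operation for Alice's outcome B0..B3

def teleport(alpha, beta):
    """Return (probability, fidelity) for each of Alice's four outcomes."""
    psi = [complex(alpha), complex(beta)]
    # Amplitude of |q1 q2 q3> in |psi>_1 |B0>_23, stored at index 4*q1 + 2*q2 + q3.
    state = [psi[q1] * BELL[0][2 * q2 + q3]
             for q1 in range(2) for q2 in range(2) for q3 in range(2)]
    results = []
    for k in range(4):
        # Bob's unnormalized state when qubits 1,2 are found in |Bk> (BELL is real).
        bob = [sum(BELL[k][j] * state[2 * j + q3] for j in range(4))
               for q3 in range(2)]
        prob = sum(abs(a) ** 2 for a in bob)
        fixed = [sum(BOB_OP[k][i][j] * bob[j] for j in range(2)) for i in range(2)]
        amp = psi[0].conjugate() * fixed[0] + psi[1].conjugate() * fixed[1]
        results.append((prob, abs(amp) ** 2 / prob))
    return results

for k, (prob, fidelity) in enumerate(teleport(0.6, 0.8j)):
    print(f"B{k}: probability {prob:.3f}, fidelity after correction {fidelity:.6f}")
```

Each outcome occurs with probability 1/4 and the corrected state has unit fidelity with |ψ⟩, which supports the table up to global phases.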

3.7 YET TO DO:

1. Explain how Hadamard gates interchange control and target in the CNOT.

2. Show that there is NO state for which Alice can change the probability of the outcomes for Bob’s Z measurement by applying any unitary to her qubit. (Because operators on different qubits commute.) However Alice can change Bob’s probability by making a measurement of her qubit. But, she can’t control the outcome of the measurement, so there is no superluminal communication.


3. The value measured for the observable is always one of the eigenvalues of that observable and the state always collapses to the corresponding eigenvector. (If two or more of the eigenvalues are degenerate, then the situation is slightly more subtle. The state is projected onto the degenerate subspace and then normalized.) Thus if the measurement result is the non-degenerate eigenvalue λj then the state |ψ⟩ collapses to

$$|\psi\rangle \rightarrow \frac{P_j|\psi\rangle}{\sqrt{\langle\psi|P_j|\psi\rangle}},$$

where Pj is the projector onto the eigenvector with eigenvalue λj and the square root in the denominator simply normalizes the collapsed state. Introduce this projector idea for measurements to avoid the confusion that some students think measurement of X is represented by multiplying by X. Discuss further the confusion that Pauli operators are both unitary operations and Hermitian observables.

If the Stern-Gerlach magnet measures σ · n (i.e. asks ‘Are you in state | + n⟩ or state | − n⟩?’), then from the Born rule, the probability that the measurement result is ±1 is |⟨±n|ψ⟩|². If you make a measurement of a product state, say | ↑↑⟩, that asks ‘Which entangled Bell state are you in?’, the state will collapse to one of the Bell states and the collapse (aka ‘measurement back action’) leaves the state entangled.
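The projector rule in item 3 can be illustrated with the example just mentioned. The sketch assumes rank-one projectors P = |b⟩⟨b| onto the Bell states (written in the basis |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩ with the forms implied by Eqs. (3.42)-(3.45) and (3.53)); the function names are invented here.

```python
import math

def inner(u, v):
    """Inner product <u|v>."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def outcome_probabilities(state, basis):
    """Born rule: probability |<b|psi>|^2 for each basis vector b."""
    return [abs(inner(b, state)) ** 2 for b in basis]

def collapse(state, b):
    """Apply P = |b><b| and renormalize: P|psi> / sqrt(<psi|P|psi>)."""
    amp = inner(b, state)
    norm = abs(amp)                    # sqrt(<psi|P|psi>) for a rank-one projector
    return [amp * x / norm for x in b]

R = 1 / math.sqrt(2)
BELL = [[0, R, -R, 0], [0, R, R, 0], [R, 0, 0, -R], [R, 0, 0, R]]   # B0..B3 (assumed forms)
UP_UP = [1, 0, 0, 0]                                                # product state |uu>

probs = outcome_probabilities(UP_UP, BELL)
after = collapse(UP_UP, BELL[2])       # the state if the outcome happens to be B2
print(probs)
print(after)
```

The product state yields B2 or B3 with probability 1/2 each, and the post-measurement state is the entangled Bell state itself. Note that nothing here multiplies the state by the observable being measured.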

Chapter 4

Quantum Error Correction

Now that we understand entanglement, we are in a position to tackle quantum error correction.

To overcome the deleterious effects of electrical noise, cosmic rays and other hazards, modern digital computers rely heavily on error correcting codes to store and correctly retrieve vast quantities of data. Classical error correction works by introducing extra bits which provide redundant encoding of the information. Error correction proceeds by measuring the bits and comparing them to the redundant information in the auxiliary bits. Another benefit of the representation of information as discrete bits (with 0 and 1 corresponding to a voltage for example) is that one can ignore small noise voltages. That is, V = 0.99 volts can be safely assumed to represent 1 and not 0.

All classical (and quantum) error correction codes are based on the assumption that the hardware is good enough that errors are rare. The goal is to make them even rarer. For classical bits there is only one kind of error, namely the bit flip which maps 0 to 1 or vice versa. We will assume for simplicity that there is probability p ≪ 1 that a bit flip error occurs, and that error occurrences are uncorrelated among different bits. One of the simplest classical error correction codes to understand involves repetition and majority rule. Suppose we have a classical bit carrying the information we wish to protect from error and we have available two ancilla bits (also subject to errors). The procedure consists of copying the state of the first bit into the two ancilla bits. Thus a ‘logical’ 1 is represented by three ‘physical’ bits in state 111, and a ‘logical’ 0 is represented by three ‘physical’ bits in state 000. Any other physical state is an error state outside of the logical state space. Suppose now that one of the three physical bits suffers an error. By examining the state of each bit it is a simple matter to identify the bit which has flipped and is not in agreement with the ‘majority.’ We then simply flip the minority bit so that it again agrees with the majority. This procedure succeeds if the number of errors is zero or one, but it fails if there is more than one error. Of course since we have replaced one imperfect bit with three imperfect bits, this means that the probability of an error occurring has increased considerably. For three bits the probability Pn of n errors is given by

$$P_0 = (1-p)^3 \tag{4.1}$$

$$P_1 = 3p(1-p)^2 \tag{4.2}$$

$$P_2 = 3p^2(1-p) \tag{4.3}$$

$$P_3 = p^3. \tag{4.4}$$

Because our error correction code only fails for two or more physical bit errors, the error probability for our logical bit is

$$p_{\rm logical} = P_2 + P_3 = 3p^2 - 2p^3. \tag{4.5}$$

If p < 1/2, then the error correction scheme reduces the error rate (instead of making it worse). If for example p = 10⁻⁶, then p_logical ∼ 3 × 10⁻¹². Thus the lower the raw error rate, the greater the improvement. Note however that even at this low error rate, a petabyte (8 × 10¹⁵ bit) storage system would have on average 24,000 errors. Furthermore, one would have to buy three petabytes of storage since 2/3 of the disk would be taken up with ancilla bits!
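The numbers quoted above are easy to reproduce. The following sketch evaluates Eqs. (4.1)-(4.5) and implements the majority-vote decoder; the function names are illustrative only.

```python
from math import comb

def p_n_errors(n, p, bits=3):
    """Probability of exactly n independent bit flips among `bits` physical bits."""
    return comb(bits, n) * p**n * (1 - p) ** (bits - n)

def p_logical(p):
    """Majority voting fails only for two or three flips: P2 + P3 = 3p^2 - 2p^3."""
    return p_n_errors(2, p) + p_n_errors(3, p)

def encode(bit):
    """Repetition code: copy the bit into two ancilla bits."""
    return [bit, bit, bit]

def majority_decode(bits):
    """Return the value held by at least two of the three physical bits."""
    return 1 if sum(bits) >= 2 else 0

p = 1e-6
petabyte_bits = 8e15
print(p_logical(p))                   # close to 3e-12
print(petabyte_bits * p_logical(p))   # close to 24,000 expected logical errors
print(majority_decode([1, 0, 1]))     # a single flip of 111 is corrected
```

Evaluating `p_logical` on both sides of p = 1/2 shows the crossover: below it the logical error rate is smaller than p, above it the code makes things worse.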

We are now ready to enter the remarkable and magic world of quantum error correction. Without quantum error correction, quantum computation would be impossible and there is a sense in which the fact that error correction is possible is even more amazing and counterintuitive than the fact of quantum computation itself. Naively, it would seem that quantum error correction is completely impossible. The no-cloning theorem (see Box 2.3) does not allow us to copy an unknown state of a qubit onto ancilla qubits. Furthermore, in order to determine if an error has occurred, we would have to make a measurement, and the back action (state collapse) from that measurement would itself produce random unrecoverable errors.

Part of the power of a quantum computer derives from its analog character: quantum states are described by continuous real (or complex) variables. This raises the specter that noise and small errors will destroy the extra power of the computer just as they do for classical analog computers. Remarkably, this is not the case! This is because the quantum computer also has characteristics that are digital. Recall that any measurement of the state of a qubit always yields a binary result. Because of this, quantum errors are continuous, but measured quantum errors are discrete. Amazingly this makes it possible to perform quantum error correction and keep the calculation running even on an imperfect and noisy computer. In many ways, this discovery by Peter Shor in 1995 and Andrew Steane in 1996 is even more profound and unexpected than the discovery of efficient quantum algorithms that work on ideal computers.

It would seem obvious that quantum error correction is impossible because the act of measurement to check if there is an error would collapse the state, destroying any possible quantum superposition information. Remarkably however, one can encode the information in such a way that the presence of an error can be detected by measurement, and if the code is sufficiently sophisticated, the error can be corrected, just as in classical computation. Classically, the only error that exists is the bit flip. Quantum mechanically there are other types of errors (e.g. phase flip, energy decay, erasure channels, etc.). However codes have been developed (using a minimum of 5 qubits) which will correct all possible quantum errors. By concatenating these codes to higher levels of redundancy, even small imperfections in the error correction process itself can be corrected. Thus quantum superpositions can in principle be made to last essentially forever even in an imperfect noisy system. It is this remarkable insight that makes quantum computation possible.

As an entrée to this rich field, we will consider a simplified example of one qubit in some state α|0⟩ + β|1⟩ plus two ancillary qubits in state |0⟩ which we would like to use to protect the quantum information in the first qubit. As already noted, the simplest classical error correction code simply replicates the first bit twice and then uses majority voting to correct for (single) bit flip errors. This procedure fails in the quantum case because the no-cloning theorem (see Box 2.3) prevents replication of an unknown qubit state. Thus there does not exist a unitary transformation which takes

$$[\alpha|0\rangle + \beta|1\rangle] \otimes |00\rangle \longrightarrow [\alpha|0\rangle + \beta|1\rangle]^{\otimes 3}. \tag{4.6}$$

As was mentioned earlier, this is clear from the fact that the above transformation is not linear in the amplitudes α and β and quantum mechanics is linear. One can however perform the repetition code transformation:

[α|0⟩+ β|1⟩]⊗ |00⟩ −→ [α|000⟩+ β|111⟩], (4.7)

since this is in fact a unitary transformation. Just as in the classical case, these three physical qubits form a single logical qubit. The two logical basis states are

$$|0\rangle_{\rm log} = |000\rangle, \qquad |1\rangle_{\rm log} = |111\rangle. \tag{4.8}$$
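The notes do not specify a circuit for Eq. (4.7); one standard choice (an assumption here, not taken from the text) is two CNOT gates controlled by the first qubit. The sketch below stores a state as a dictionary from bit strings to amplitudes and contrasts the resulting encoded state with the amplitudes a hypothetical cloner would have to produce for Eq. (4.6).

```python
def cnot(state, control, target):
    """Apply a CNOT to a state stored as {bitstring: amplitude}."""
    out = {}
    for bits, amp in state.items():
        b = list(bits)
        if b[control] == "1":
            b[target] = "1" if b[target] == "0" else "0"
        key = "".join(b)
        out[key] = out.get(key, 0) + amp
    return out

def encode(alpha, beta):
    """(alpha|0> + beta|1>)|00>  ->  alpha|000> + beta|111> via two CNOTs."""
    state = {"000": alpha, "100": beta}
    state = cnot(state, control=0, target=1)
    state = cnot(state, control=0, target=2)
    return state

def cloned(alpha, beta):
    """Amplitudes of the forbidden product state (alpha|0> + beta|1>) three times over."""
    amps = {}
    for i in range(8):
        bits = format(i, "03b")
        amps[bits] = alpha ** bits.count("0") * beta ** bits.count("1")
    return amps

print(encode(0.6, 0.8))    # amplitudes are linear in alpha and beta
print(cloned(0.6, 0.8))    # amplitudes are cubic, e.g. alpha^2 * beta on |001>
```

The encoded amplitudes are just α and β, whereas the cloned state would require products such as α²β, which no linear map acting on α and β can generate.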

The analogs of the single-qubit Pauli operators for this logical qubit are readily seen to be

Xlog = X1X2X3

Ylog = iXlogZlog

Zlog = Z1Z2Z3. (4.9)

We see that this encoding complicates things considerably because now to do even a simple single logical qubit rotation we have to perform some rather non-trivial three-qubit joint operations. It is not always easy to achieve an effective Hamiltonian that can produce such joint operations, but this is an essential price we must pay in order to carry out quantum error correction.

It turns out that this simple code cannot correct all possible quantum errors, but only a single type. For specificity, let us take the error operating on our system to be a single bit flip, either X1, X2, or X3. These three together with the identity operator, I, constitute the set of operators that produce the four possible error states of the system we will be able to correctly deal with. Following the formalism developed by Daniel Gottesman, let us define two stabilizer operators

S1 = Z1Z2 (4.10)

S2 = Z2Z3. (4.11)

These have the nice property that they commute both with each other (i.e., [S1, S2] = S1S2 − S2S1 = 0) and with all three of the logical qubit operators listed in Eq. (4.9). This means that they can both be measured simultaneously and that the act of measurement does not destroy the quantum information stored in any superposition of the two logical qubit states. Furthermore they each commute or anticommute with the four error operators in such a way that we can uniquely identify what error (if any) has occurred. Each of the four possible error states (including no error) is an eigenstate of both stabilizers with the eigenvalues listed in the table below

error    S1    S2
I        +1    +1
X1       −1    +1
X2       −1    −1
X3       +1    −1

Thus measurement of the two stabilizers yields two bits of classical information (called the ‘error syndrome’) which uniquely identify which of the four possible error states the system is in and allows the experimenter to correct the situation by applying the appropriate error operator, I, X1, X2, X3, to the system to cancel the original error.
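The syndrome table can be regenerated from commutation relations alone. If an error E commutes with a stabilizer S then S E|Ψ0⟩ = E S|Ψ0⟩ = +E|Ψ0⟩, and if it anticommutes the eigenvalue is −1. The sketch below represents each operator as a string of single-qubit Paulis (a representation chosen here for convenience) and uses the fact that two such strings commute when they clash on an even number of sites.

```python
def commutes(p, q):
    """Pauli strings commute iff they hold different non-identity Paulis on an even number of sites."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

STABILIZERS = {"S1": "ZZI", "S2": "IZZ"}
# Ylog = i Xlog Zlog is proportional to Y1 Y2 Y3; the phase does not affect commutation.
LOGICAL = {"Xlog": "XXX", "Ylog": "YYY", "Zlog": "ZZZ"}
ERRORS = {"I": "III", "X1": "XII", "X2": "IXI", "X3": "IIX"}

def syndrome(error):
    """Eigenvalues (S1, S2) of the state obtained by applying `error` to a code state."""
    return tuple(+1 if commutes(error, s) else -1 for s in STABILIZERS.values())

TABLE = {name: syndrome(e) for name, e in ERRORS.items()}
CORRECTION = {s: name for name, s in TABLE.items()}   # syndrome -> operator to apply

for name, s in TABLE.items():
    print(f"{name:3s} {s[0]:+d} {s[1]:+d}")
```

Since all four syndromes are distinct, `CORRECTION` is a well-defined lookup table, and because each Xk squares to the identity the correction is simply to apply the same operator again.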

We now have our first taste of the fantastic power of quantum error correction. We have however glossed over some important details by assuming that either an error has occurred or it hasn’t (that is, we have been assuming we are in a definite error state). At the next level of sophistication we have to recognize that we need to be able to handle the possibility of a quantum superposition of an error and no error. After all, in a system described by smoothly evolving superposition amplitudes, errors can develop continuously. Suppose for example that the correct state of the three physical qubits is

|Ψ0⟩ = α|000⟩+ β|111⟩, (4.12)

and that there is some perturbation to the Hamiltonian such that after some time there is a small amplitude ϵ that error X2 has occurred. Then the state of the system is

$$|\Psi\rangle = \left[\sqrt{1-|\epsilon|^2}\; I + \epsilon X_2\right]|\Psi_0\rangle. \tag{4.13}$$

(The reader may find it instructive to verify that the normalization is correct.)

What happens if we apply our error correction scheme to this state? The measurement of each stabilizer will always yield a binary result, thus illustrating the dual digital/analog nature of quantum information processing. With probability P0 = 1 − |ϵ|², the measurement result will be S1 = S2 = +1. In this case the state collapses back to the original ideal one and the error is removed! Indeed, the experimenter has no idea whether ϵ had ever even developed a non-zero value. All she knows is that if there was an error, it is now gone. This is the essence of the Zeno effect in quantum mechanics: repeated observation can stop dynamical evolution. (It is also, once again, a clear illustration of the maxim that in quantum mechanics ‘You get what you see.’) Rarely however (with probability P1 = |ϵ|²) the measurement result will be S1 = S2 = −1 heralding the presence of an X2 error. The correction protocol then proceeds as originally described above. Thus error correction still works for superpositions of no error and one error. A simple extension of this argument shows that it works for an arbitrary superposition of all four error states.
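The collapse argument can be followed step by step in a small state-vector sketch. States are stored as dictionaries from bit strings to amplitudes; the values of α, β and ϵ are arbitrary, qubits are indexed from 0 in the code (so X2 acts on index 1), and the function names are invented for this illustration.

```python
import math

def x_flip(state, k):
    """Apply X to qubit k (0-based) of a state stored as {bitstring: amplitude}."""
    flip = {"0": "1", "1": "0"}
    return {b[:k] + flip[b[k]] + b[k + 1:]: a for b, a in state.items()}

def zz(bits, i, j):
    """Eigenvalue of Z_i Z_j on a computational basis state."""
    return +1 if bits[i] == bits[j] else -1

def syndrome_branches(state):
    """Group amplitudes by (S1, S2); return {syndrome: (probability, collapsed state)}."""
    parts = {}
    for bits, amp in state.items():
        parts.setdefault((zz(bits, 0, 1), zz(bits, 1, 2)), {})[bits] = amp
    out = {}
    for s, part in parts.items():
        prob = sum(abs(a) ** 2 for a in part.values())
        if prob > 0:
            out[s] = (prob, {b: a / math.sqrt(prob) for b, a in part.items()})
    return out

# Syndrome -> index of the qubit to flip back (None means do nothing).
RECOVERY = {(1, 1): None, (-1, 1): 0, (-1, -1): 1, (1, -1): 2}

alpha, beta, eps = 0.6, 0.8, 0.1
psi0 = {"000": alpha, "111": beta}
state = {b: math.sqrt(1 - eps**2) * a for b, a in psi0.items()}
state.update({b: eps * a for b, a in x_flip(psi0, 1).items()})   # Eq. (4.13)

recovered = {}
for s, (prob, collapsed) in syndrome_branches(state).items():
    k = RECOVERY[s]
    recovered[s] = (prob, collapsed if k is None else x_flip(collapsed, k))
    print(s, f"probability {prob:.4f}", recovered[s][1])
```

Both branches end in the ideal state α|000⟩ + β|111⟩: the frequent one (probability 1 − ϵ²) needs no action and the rare one (probability ϵ²) needs a single X2.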

4.1 An advanced topic for the experts

Not covered in 2019. There remains however one more level of subtlety we have been ignoring. The above discussion assumed a classical noise source modulating the Hamiltonian parameters. However in reality, a typical source of error is that one of the physical qubits becomes entangled with its environment. We generally have no access to the bath degrees of freedom and so for all intents and purposes, we can trace out the bath and work with the reduced density matrix of the logical qubit. Clearly this is generically not a pure state. How can we possibly go from an impure state (containing the entropy of entanglement with the bath) to the desired pure (zero entropy) state? Ordinary unitary operations on the logical qubit preserve the entropy so clearly will not work. Fortunately our error correction protocol involves applying one of four possible unitary operations conditioned on the outcome of the measurement of the stabilizers. The wave function collapse associated with the measurement gives us just the non-unitarity we need and the error correction protocol works even in this case. Effectively we have a Maxwell demon which uses Shannon information entropy (from the measurement results) to remove an equivalent amount of von Neumann entropy from the logical qubit!

To see that the protocol still works, we generalize Eq. (4.13) to include the bath

$$|\Psi\rangle = \sqrt{1-|\epsilon|^2}\,|\Psi_0, \mathrm{Bath}_0\rangle + \epsilon X_2|\Psi_0, \mathrm{Bath}_2\rangle. \tag{4.14}$$

For example, the error could be caused by the second qubit having a coupling to a bath operator O2 of the form

$$V_2 = g\,X_2 O_2, \tag{4.15}$$

acting for a short time ϵħ/g so that

$$|\mathrm{Bath}_2\rangle \approx O_2|\mathrm{Bath}_0\rangle. \tag{4.16}$$

Notice that once the stabilizers have been measured, then either the experimenter obtained the result S1 = S2 = +1 and the state of the system plus bath collapses to

|Ψ⟩ = |Ψ0,Bath0⟩, (4.17)

or the experimenter obtained the result S1 = S2 = −1 and the state collapses to

|Ψ⟩ = X2|Ψ0,Bath2⟩. (4.18)

Both results yield a product state in which the logical qubit is unentangled with the bath. Hence the algorithm can simply proceed as before and will work.
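The entropy bookkeeping can be made concrete by tracking the purity Tr ρ² of the logical qubit’s reduced density matrix. The sketch below makes a simplifying assumption that is not in the text: the two bath states are taken to be exactly orthonormal labels, `"bath0"` and `"bath2"`. With ρ = Σ_b |ψ_b⟩⟨ψ_b|, where |ψ_b⟩ is the (unnormalized) system state accompanying bath state b, the purity is Σ_{b,b'} |⟨ψ_b|ψ_b'⟩|².

```python
import math

def purity(state):
    """Tr(rho^2) of the system after tracing out the bath.

    `state` maps (system bitstring, bath label) -> amplitude.
    """
    by_bath = {}
    for (bits, bath), amp in state.items():
        by_bath.setdefault(bath, {})[bits] = amp

    def ov(u, v):
        return sum(complex(u[k]).conjugate() * v[k] for k in u if k in v)

    return sum(abs(ov(u, v)) ** 2 for u in by_bath.values() for v in by_bath.values())

def syndrome(bits):
    """Eigenvalues of S1 = Z1 Z2 and S2 = Z2 Z3 on a basis state."""
    return (1 if bits[0] == bits[1] else -1, 1 if bits[1] == bits[2] else -1)

def collapse(state, s):
    """Project onto the syndrome-s subspace and renormalize."""
    part = {k: a for k, a in state.items() if syndrome(k[0]) == s}
    prob = sum(abs(a) ** 2 for a in part.values())
    return prob, {k: a / math.sqrt(prob) for k, a in part.items()}

alpha, beta, eps = 0.6, 0.8, 0.1
c = math.sqrt(1 - eps**2)
state = {("000", "bath0"): c * alpha, ("111", "bath0"): c * beta,   # no error
         ("010", "bath2"): eps * alpha, ("101", "bath2"): eps * beta}  # X2 error

before = purity(state)                   # (1 - eps^2)^2 + eps^4, less than 1
p_ok, ok = collapse(state, (1, 1))
p_err, bad = collapse(state, (-1, -1))
print(before, purity(ok), purity(bad))
```

Before the measurement the purity is 0.9802 for these values, and after either measurement outcome it is exactly one, in line with the product states of Eqs. (4.17) and (4.18).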

Finally, there is one more twist in this plot. We have so far described a measurement-based protocol for removing the entropy associated with errors. There exists another route to the same goal in which purely unitary multi-qubit operations are used to move the entropy from the logical qubit to some ancillae, and then the ancillae are reset to the ground state to remove the entropy. The reset operation could consist, for example, of putting the ancillae in contact with a cold bath and allowing the qubits to spontaneously and irreversibly decay into the bath. Because the ancillae are in a mixed state with some probability to be in the excited state and some to be in the ground state, the bath ends up in a mixed state containing (or not containing) photons resulting from the decay. Thus the entropy ends up in the bath. It is important for this process to work that the bath be cold so that the qubits always relax to the ground state and are never driven to the excited state. We could if we wished, measure the state of the bath and determine which error (if any) occurred, but in this protocol, no actions conditioned on the outcome of such a measurement are required.

Quantum error correction is extremely challenging to carry out in practice. In fact the first error correction protocol to actually reach the break-even point (where carrying out the protocol extends rather than shortens the lifetime of the quantum information) was achieved by the Yale group in 2016. This was done not using two-level systems as qubits but rather by storing the quantum information in the states of a harmonic oscillator (a superconducting microwave resonator containing superpositions of 0, 1, 2, . . . photons).

Chapter 5

Yet To Do

1. Insert 2019 homework as exercises in text

2. Did I insert proof that length preservation implies general inner product conservation which implies unitary? Proof could be improved.

3. Appendix on linear algebra, outer products, direct sums, eigenvalues, degeneracies, determinants, traces, hermitian matrices producing bases, etc.

4. 3 CNOTS make a SWAP but apply to a general state or to the X basis

5. effect of Hadamards on CNOT

6. quantum teleportation

7. Shor code as next step beyond 3 qubit repetition code

8. matrix mechanics for harmonic oscillator?


Bibliography

[1] C. H. Bennett and S. J. Wiesner, ‘Communication via one- and two-particle operators on Einstein-Podolsky-Rosen states,’ Phys. Rev. Lett. 69, 2881–2884 (1992).

[2] A. Einstein, B. Podolsky, and N. Rosen, ‘Can quantum-mechanical description of physical reality be considered complete?,’ Phys. Rev. 47, 777–780 (1935).

[3] J. S. Bell, ‘On the Einstein Podolsky Rosen paradox,’ Physics 1, 195–200 (1964).

[4] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, ‘Proposed experiment to test local hidden-variable theories,’ Phys. Rev. Lett. 23, 880–884 (1969).

[5] C. H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A. Peres, and W. K. Wootters, ‘Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels,’ Phys. Rev. Lett. 70, 1895–1899 (1993).
