Theoretical Materials Science - TU Berlin · 2017. 8. 31. · The complexity of the quantum...

Theoretical Materials Science

Prof. Matthias Scheffler and Dr. Christian Carbogno
Theory Department
Fritz Haber Institute of the Max Planck Society
Faradayweg 4-6
14195 Berlin, Germany

Version: August 31, 2017

Table of Contents

0 Foreword
  0.1 Introductory Remarks
  0.2 Literature for This Lecture
  0.3 Symbols and Terms Used in This Lecture
  0.4 Atomic Units

1 Introduction
  1.1 Many-Body Hamilton Operator
  1.2 Separation of Dynamics of Electrons and Nuclei
    1.2.1 Adiabatic Approximation or Born-Oppenheimer Approximation
    1.2.2 Static Approximation
    1.2.3 Examples
      1.2.3.1 Structure, Lattice Constant, and Elastic Properties of Perfect Crystals
      1.2.3.2 Lattice Waves (Phonons)
  1.3 The Ewald Method

2 Some Definitions and Reminders, incl. Fermi Statistics of Electrons
  2.1 Statistical Mechanics
  2.2 Fermi Statistics of the Electrons
  2.3 Some Definitions

3 Electron-Electron Interaction
  3.1 Electron-Electron Interaction
  3.2 Hartree Approximation
  3.3 Hartree-Fock Approximation
  3.4 Exchange Interaction
  3.5 Koopmans’ Theorem
  3.6 The Xα Method
  3.7 Thomas-Fermi Theory and the Concept of Screening
  3.8 Density-Functional Theory
    3.8.1 Meaning of the Kohn-Sham Single-Particle Energies ϵ_i
    3.8.2 Spin Polarization
    3.8.3 Two Examples
  3.9 Summary (Electron-Electron Interaction)

4 Lattice Periodicity
  4.1 Lattice Periodicity
  4.2 The Bloch Theorem
  4.3 The Reciprocal Lattice

5 The Band Structure of the Electrons
  5.1 Introduction
    5.1.1 What Can We Learn from a Band Structure?
  5.2 General Properties of ϵ_n(k)
    5.2.1 Continuity of ϵ_n(k) and Meaning of the First and Second Derivatives of ϵ_n(k)
    5.2.2 Time Reversal Symmetry
    5.2.3 The Fermi Surface
  5.3 The LCAO (linear combination of atomic orbitals) Method
    5.3.1 Band Structure and Analysis of the Contributions to Chemical Bonding
  5.4 The Density of States, N(ϵ)
  5.5 Other Methods for Solving the Kohn-Sham Equations of Periodic Crystals
    5.5.1 The Pseudopotential Method
    5.5.2 APW and LAPW
    5.5.3 KKR, LMTO, and ASW
  5.6 Many-Body Perturbation Theory (beyond DFT)

6 Cohesion (Bonding) in Solids
  6.1 Introduction
  6.2 Van der Waals Bonding
  6.3 Ionic bonding
  6.4 Covalent bonding
    6.4.1 Hybridization
  6.5 Metallic bonding
  6.6 Hydrogen bonding
    6.6.1 Some Properties of Hydrogen bonds
    6.6.2 Some Physics of Hydrogen bonds
  6.7 Summary

7 Lattice Vibrations
  7.1 Introduction
  7.2 Vibrations of a Classical Lattice
    7.2.1 Adiabatic (Born-Oppenheimer) Approximation
    7.2.2 Harmonic Approximation
    7.2.3 Classical Equations of Motion
    7.2.4 Comparison between ϵ_n(k) and ω_i(q)
    7.2.5 Simple One-dimensional Examples
      7.2.5.1 Linear Chain with One Atomic Species
      7.2.5.2 Linear Chain with Two Atomic Species
    7.2.6 Phonon Band Structures
  7.3 Quantum Theory of the Harmonic Crystal
    7.3.1 One-dimensional Quantum Harmonic Oscillator
    7.3.2 Three-dimensional Quantum Harmonic Crystal
    7.3.3 Lattice Energy at Finite Temperatures
    7.3.4 Phonon Specific Heat
      7.3.4.1 High-Temperature Limit (Dulong-Petit Law)
      7.3.4.2 Intermediate Temperature Range (Einstein Approximation, 1907)
      7.3.4.3 Low Temperature Limit (Debye Approximation, 1912)
  7.4 Anharmonic Effects in Crystals
    7.4.1 Thermal Expansion
    7.4.2 Heat Transport

8 Magnetism
  8.1 Introduction
  8.2 Macroscopic Electrodynamics
  8.3 Magnetism of Atoms and Free Electrons
    8.3.1 Total Angular Momentum of Atoms
    8.3.2 General derivation of atomic susceptibilities
      8.3.2.1 ∆E_LL: Larmor/Langevin Diamagnetism
      8.3.2.2 ∆E_VV: Van Vleck Paramagnetism
      8.3.2.3 ∆E_Para: Paramagnetism
      8.3.2.4 Paramagnetic Susceptibility: Curie’s Law
    8.3.3 Susceptibility of the Free Electron Gas
    8.3.4 Atomic magnetism in solids
  8.4 Magnetic Order: Permanent Magnets
    8.4.1 Ferro-, Antiferro- and Ferrimagnetism
    8.4.2 Interaction versus Thermal Disorder: Curie-Weiss Law
    8.4.3 Phenomenological Theories of Ferromagnetism
      8.4.3.1 Molecular (mean) field theory
      8.4.3.2 Heisenberg and Ising Hamiltonian
    8.4.4 Microscopic origin of magnetic interaction
      8.4.4.1 Exchange interaction between two localized moments
      8.4.4.2 From the Schrödinger equation to the Heisenberg Hamiltonian
      8.4.4.3 Exchange interaction in the homogeneous electron gas
    8.4.5 Band consideration of ferromagnetism
      8.4.5.1 Stoner model of itinerant ferromagnetism
  8.5 Magnetic domain structure

9 Transport Properties of Solids
  9.1 Introduction
  9.2 The Definition of Transport Coefficients
  9.3 Semiclassical Theory of Transport
  9.4 Boltzmann Transport Theory
  9.5 Superconductivity
  9.6 Meissner effect
  9.7 London theory
  9.8 Flux quantization
  9.9 Ogg’s pairs
  9.10 Microscopic theory - BCS theory
    9.10.1 Cooper pairs
  9.11 Bardeen-Cooper-Schrieffer (BCS) Theory
  9.12 Outlook

0 Foreword

0.1 Introductory Remarks

A solid or, more generally, condensed matter is a complex quantum-mechanical many-body system consisting of ∼10²³ electrons and nuclei per cm³. The most important foundations of its theoretical description are electronic-structure theory and statistical mechanics. Due to the complexity of the quantum-mechanical many-body problem there is a large variety of phenomena and properties. Their description and understanding is the purpose of this lecture course on “condensed-matter theory”. Keywords are, e.g., crystal structure, hardness, magnetism, electrical conductivity, thermal conductivity, superconductivity, etc. The name of this lecture (Theoretical Materials Science) indicates that we intend to go a step further, i.e., “condensed matter” is replaced by “materials”. This is a small, but nevertheless important generalization. When we talk about materials, then, in addition to phenomena and properties, we also think of potential applications, i.e., the possible function of materials, as in electronic, magnetic, and optical devices (solar cells or light emitters), sensor technology, catalysts, lubrication, and surface coatings (e.g., for protection against corrosion, heat, and mechanical scratching). It is obvious that these functions, which are determined to a large extent by properties on a nanometer length scale, play an important role in many technologies on which our lifestyle and the wealth of our society are based. In fact, every new commercial product is based on improved or even novel materials. Thus, the identification of new materials and their property profiles can open new opportunities in fields such as energy, transport, safety, information, and health.

About 200,000 different inorganic materials are known to exist, but the properties, e.g., hardness, elastic constants, thermal or electrical conductivity, are known for very few of them. Considering that there are about 100 different atoms in the periodic table and that unit cells of crystals can have sizes anywhere between one and a hundred atoms (or even more), and considering surfaces, nanostructures, heterostructures, organic materials, and organic/inorganic hybrids, the number of different possible materials is practically infinite. Thus, it is likely that there are hitherto unknown or unexplored materials with unknown but immensely useful property profiles. Finding structure in this huge data space of different materials and reaching a better understanding of materials properties is a fascinating scientific goal of fundamental research. And there is no doubt that identifying new materials can have truly significant socio-economic impact.
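To get a feeling for the phrase "practically infinite", the following minimal sketch counts only the distinct chemical compositions of a unit cell, i.e., the number of ways to pick n atoms from about 100 elements when the order does not matter. This is an illustrative assumption, not a count of real materials: it ignores the atomic arrangement, stability, surfaces, and everything else mentioned above, and the function names are hypothetical.

```python
from math import comb

N_ELEMENTS = 100  # "about 100 different atoms in the periodic table"


def count_compositions(n_atoms: int, n_elements: int = N_ELEMENTS) -> int:
    """Number of element multisets for a unit cell holding n_atoms atoms.

    Choosing n_atoms from n_elements with repetition, order ignored,
    gives the binomial coefficient C(n_elements + n_atoms - 1, n_atoms).
    """
    return comb(n_elements + n_atoms - 1, n_atoms)


def count_up_to(max_atoms: int, n_elements: int = N_ELEMENTS) -> int:
    """Total number of compositions for cells with 1..max_atoms atoms."""
    return sum(count_compositions(n, n_elements) for n in range(1, max_atoms + 1))


if __name__ == "__main__":
    for n in (1, 2, 4, 10, 100):
        print(f"{n:3d} atoms per cell: {count_compositions(n):.3e} compositions")
    print(f"cells with 1..100 atoms: {count_up_to(100):.3e} compositions")
```

Even this crude count exceeds the roughly 200,000 known inorganic materials once the unit cell holds more than a few atoms, which is the point of the paragraph above.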

This role of materials science has been recognized by at least two influential “science and engineering initiatives” of US presidents: the “national nanotechnology initiative” by B. Clinton (2000) and the “materials genome initiative” by B. Obama (2011). Such initiatives come with no money. They just identify and describe a promising new route. In the two examples mentioned, these initiatives were worked out convincingly, and various funding agencies (from basic research to engineering to military) got together and created significant programs. This has changed the research landscape not just in the US but in the world. In both cases our research group had already been on these routes and fully profited from the developments without changing course.

The field of electronic-structure theory, applied to materials-science problems, is in an important, active phase with rapid developments in

• the underlying theory,

• new methods,

• new algorithms, and

• new computer codes.

For several years now, a theory has been evolving that, taking advantage of high- and highest-performance computers, allows an atomistic modeling of complex systems with predictive power, starting from the fundamental equations of many-body quantum mechanics. The two central ingredients of such ab initio theories are the reliable description of the underlying elementary processes (e.g., breaking and forming of chemical bonds) and a correct treatment of their dynamics and statistical mechanics.

Because of the importance of quantum-mechanical many-body effects in the description of the interactions in polyatomic systems, a systematic treatment has rarely been possible up to now. The complexity of the quantum-mechanical many-body problem required the introduction of approximations, which often were not obvious. The development of the underlying theory (density-functional theory, DFT) started in 1964¹. But “only” since 1978 (or 1982)² have reliable calculations for solids been carried out. Only these make it possible to check the validity of possibly reasonable (and often necessary) approximations and to give the reasons for their success, or to demonstrate which approximations have to be abandoned. Today this development has reached a feasible level for many types of problems, but it is not yet complete.

With these DFT-based (and related) computational methods, the approximations that are introduced ad hoc in many existing textbooks can be inspected. Further, it is possible to make quantitative predictions, e.g., for the properties of new materials. Still, theoretical condensed-matter physics or theoretical materials science is in an active state of development. Phenomena that are still only badly understood (or not at all) include phase transitions, disorder, catalysis, defects (meta-, bi-stabilities), properties of quantum dots, crystal growth, crystal structure prediction, systems containing f-electrons, high-temperature superconductivity, electronic excitation, and electrical and thermal transport.

¹ To some extent, earlier concepts by Thomas, Fermi, Hartree, and Slater may be considered as preparation for this route.

² V.L. Moruzzi, J.F. Janak, and A.R. Williams, Calculated Electronic Properties of Metals, Pergamon Press (1978) ISBN 0-08-022705-8; and M.T. Yin and M.L. Cohen, Theory of static structural properties, crystal stability, and phase transformations: Application to Si and Ge, Phys. Rev. B 26, 5668 (1982).

Modern theoretical materials science is facing two main challenges:

1. Explain experimentally found properties and phenomena and place them in a bigger context. This is done by developing models, i.e., by a reduction to the key physical processes. This enables a qualitative or semi-quantitative understanding. As mentioned before, there are examples for which these tasks have not yet been accomplished.

2. Predict properties of systems that have not been investigated experimentally so far, or of situations that cannot be investigated by experiments directly. The latter include conditions of very high pressure or conditions that are chemically or radioactively harsh. A recent development is concerned with building a “library of hitherto unknown materials”. We will get back to this project at the end of this lecture.

The latter point shall be illustrated by the following examples:

1) Until recently it was not possible to study the viscosity and the melting temperature of iron at the pressures that exist in the Earth's core, but these have been calculated since about 1999³.

2) The element carbon exists in three solid phases: i) as an amorphous solid, and in crystalline form ii) as graphite and iii) as diamond⁴,⁵. Graphite is the most stable phase, i.e., the one with the lowest internal energy. At normal conditions, diamond is only a metastable state, but with a rather long lifetime. Usually, when carbon atoms are brought together, graphite or an amorphous phase is formed. Only under certain conditions (pressure, temperature), as present in the Earth's mantle, can diamonds be formed. Therefore, in a certain sense, diamonds are known only accidentally. It cannot be excluded that other elements (Si, Ge, Ag, Au, etc.) can also exist in other, yet unknown modifications (polymorphs). Examples of new, artificially created materials are semiconductor quantum dots, quantum wire systems, or epitaxial magnetic layers⁶. Once the interactions between atoms are understood, materials with completely new physical properties could be predicted theoretically. As an example, the theoretical investigation of semiconductor and/or metal heterostructures is currently used to attempt the prediction of new materials for light-emitting diodes (LEDs) and of new magnetic memory devices. Likewise, researchers hope to find new catalysts by the theoretical investigation of alloys and surface alloys with new compositions (for which no bulk analogue exists). Although this sounds very close to practical application, it should be noted that applications of such theoretical predictions cannot be expected in the very near future, because in industry many practical details (concerning the technical processes, cost optimization, etc.) are crucial in deciding whether a physical effect will be used in real devices or not.

³ D. Alfè, M.J. Gillan, and G.D. Price, Nature 401, 462 (1999).

⁴ Diamond is the hardest known material, i.e., it has the highest bulk modulus. Diamonds without defects are most transparent to light. At room temperature their thermal conductivity is better than that of any other material.

⁵ This statement is slightly oversimplified: Fullerenes (e.g., C60) have been known since 1985 and carbon nanotubes since 1991. From the latter, “soft matter” can be formed, and they can also be used directly as nanomaterials (for example as nanotube transistors). These systems will be discussed in more detail later.

⁶ In 1988, Albert Fert and Peter Grünberg independently discovered that an increased magnetoresistive effect (hence dubbed “giant magnetoresistance” or GMR) can be obtained in magnetic multilayers. These systems essentially consist of an alternating stack of ferromagnetic (e.g., Fe, Co, Ni, and their alloys) and non-ferromagnetic (e.g., Cr, Cu, Ru, etc.) metallic layers. It is unusual that a basic effect like GMR leads to commercial applications in less than a decade after its discovery: Magnetic field sensors based on GMR were introduced into the market as early as 1996, and by now, e.g., all read heads for hard discs are built that way. In 2007 Albert Fert and Peter Grünberg were awarded the Nobel Prize in physics.

It is also interesting to note that 38 Nobel prizes have been awarded since 1980 for work in, or related to, the field of materials science. They are listed in the following. More details, including review papers describing the work behind each prize, can be found here: http://www.nobelprize.org/

1981 Physics: Nicolaas Bloembergen and Arthur L. Schawlow “for their contribution to the development of laser spectroscopy”; Kai M. Siegbahn “for his contribution to the development of high-resolution electron spectroscopy”

1981 Chemistry: Kenichi Fukui and Roald Hoffmann “for their theories, developed independently, concerning the course of chemical reactions”

1982 Physics: Kenneth G. Wilson “for his theory for critical phenomena in connection with phase transitions”

1982 Chemistry: Aaron Klug “for his development of crystallographic electron microscopy and his structural elucidation of biologically important nucleic acid-protein complexes”

1983 Chemistry: Henry Taube “for his work on the mechanisms of electron transfer reactions, especially in metal complexes”

1984 Chemistry: Robert Bruce Merrifield “for his development of methodology for chemical synthesis on a solid matrix”

1985 Physics: Klaus von Klitzing “for the discovery of the quantized Hall effect”

1985 Chemistry: Herbert A. Hauptman and Jerome Karle “for their outstanding achievements in the development of direct methods for the determination of crystal structures”

1986 Physics: Ernst Ruska “for his fundamental work in electron optics and for the design of the first electron microscope”; Gerd Binnig and Heinrich Rohrer “for their design of the scanning tunneling microscope”

1986 Chemistry: Dudley R. Herschbach, Yuan T. Lee, and John C. Polanyi “for their contributions concerning the dynamics of chemical elementary processes”


1987 Physics: J. Georg Bednorz and K. Alexander Müller “for their important breakthrough in the discovery of superconductivity in ceramic materials”

1987 Chemistry: Donald J. Cram, Jean-Marie Lehn, and Charles J. Pedersen “for their development and use of molecules with structure-specific interactions of high selectivity”

1988 Chemistry: Johann Deisenhofer, Robert Huber, and Hartmut Michel “for the determination of the three-dimensional structure of a photosynthetic reaction centre”

1991 Physics: Pierre-Gilles de Gennes “for discovering that methods developed for studying order phenomena in simple systems can be generalized to more complex forms of matter, in particular to liquid crystals and polymers”

1991 Chemistry: Richard R. Ernst “for his contributions to the development of the methodology of high resolution nuclear magnetic resonance (NMR) spectroscopy”

1992 Chemistry: Rudolph A. Marcus “for his contributions to the theory of electron transfer reactions in chemical systems”

1994 Physics: Bertram N. Brockhouse “for the development of neutron spectroscopy”, and Clifford G. Shull “for the development of the neutron diffraction technique”

1996 Physics: David M. Lee, Douglas D. Osheroff, and Robert C. Richardson “for their discovery of superfluidity in Helium-3”

1996 Chemistry: Robert F. Curl Jr., Sir Harold W. Kroto, and Richard E. Smalley “for their discovery of fullerenes”

1997 Physics: Steven Chu, Claude Cohen-Tannoudji, and William D. Phillips “for development of methods to cool and trap atoms with laser light”

1998 Physics: Robert B. Laughlin, Horst L. Störmer, and Daniel C. Tsui “for their discovery of a new form of quantum fluid with fractionally charged excitations”

1998 Chemistry: Walter Kohn “for his development of the density-functional theory”; John A. Pople “for his development of computational methods in quantum chemistry”

1999 Chemistry: Ahmed H. Zewail “for his studies of the transition states of chemical reactions using femtosecond spectroscopy”

2000 Physics: Zhores I. Alferov and Herbert Kroemer “for developing semiconductor heterostructures used in high-speed- and opto-electronics”, and Jack S. Kilby “for his part in the invention of the integrated circuit”

2000 Chemistry: Alan J. Heeger, Alan G. MacDiarmid, and Hideki Shirakawa “for the discovery and development of conductive polymers”

2001 Physics: Eric A. Cornell, Wolfgang Ketterle, and Carl E. Wieman “for the achievement of Bose-Einstein condensation in dilute gases of alkali atoms, and for early fundamental studies of the properties of the condensates”


2003 Physics: Alexei A. Abrikosov, Vitaly L. Ginzburg, and Anthony J. Leggett “for pioneering contributions to the theory of superconductors and superfluids”

2005 Physics: Roy J. Glauber “for his contribution to the quantum theory of optical coherence”, and John L. Hall and Theodor Hänsch “for their contributions to the development of laser-based precision spectroscopy, including the optical comb technique”

2007 Physics: Albert Fert and Peter Grünberg “for their discovery of Giant Magnetoresistance”

2007 Chemistry: Gerhard Ertl “for his studies of chemical processes on solid surfaces”

2009 Physics: Charles K. Kao “for groundbreaking achievements concerning the transmission of light in fibers for optical communication”; Willard S. Boyle and George E. Smith “for the invention of an imaging semiconductor circuit - the CCD sensor”

2010 Physics: Andre Geim and Konstantin Novoselov “for groundbreaking experiments regarding the two-dimensional material graphene”

2011 Chemistry: Dan Shechtman “for the discovery of quasicrystals”

2013 Chemistry: Martin Karplus, Michael Levitt, and Arieh Warshel “for the development of multiscale models for complex chemical systems”

2014 Physics: Isamu Akasaki, Hiroshi Amano, and Shuji Nakamura “for the invention of efficient blue light-emitting diodes which has enabled bright and energy-saving white light sources”

2014 Chemistry: Eric Betzig, Stefan W. Hell, and William E. Moerner “for the development of super-resolved fluorescence microscopy”

2016 Physics: David J. Thouless, F. Duncan M. Haldane, and J. Michael Kosterlitz “for theoretical discoveries of topological phase transitions and topological phases of matter”

2016 Chemistry: Jean-Pierre Sauvage, Sir J. Fraser Stoddart, and Bernard L. Feringa “for the design and synthesis of molecular machines”

In the above list I ignored Nobel prizes in the field of biophysics, though some developments in this area are now becoming part of condensed-matter physics. The length of this list reflects the fact that materials science is an enormously active field of research and an important one for our society as well.

Some remarks on the above list: The quantum Hall effect (Nobel prize 1985) is understood these days, but for its “variant”, the “fractional quantum Hall effect” (Nobel prize 1998), this is true only in a limited way. The latter is based on the strong correlation of the electrons, and even these days unexpected results are found.


The theory of high-Tc superconductivity (Nobel prize 1987) is still unclear. High-Tc superconductors seem to feature Cooper pairs with a momentum signature of different symmetry than conventional BCS superconductors. Typically, high-Tc superconductors have a complex atomic structure and consist of at least four elements (e.g., La, Ba, Cu, O). However, in recent years simpler systems with rather high Tc have also been found, e.g., MgB₂ (Tc = 39 K), and in 2015 hydrogen sulfide (H₂S) was found to undergo a superconducting transition near 203 K at high pressure (around 150 GPa), the highest superconducting transition temperature known to date. The latter two examples are materials that have been known for many decades, but their properties were investigated only recently, and the noted results were not expected.

In this lecture:

1. Equations will not fall down from heaven, but we will derive them from first principles;

2. we will not only give the mathematical derivation, but also, and in particular, we will develop a physical feeling, i.e., we will spend a noticeable amount of time in interpreting equations;

3. we will give the reasons for approximations and clarify their physical meaning and the range of validity (as much as this is possible).

In contrast to most textbooks we will start with the “adiabatic principle” and subsequently discuss the quantum-mechanical nature of the electron-electron interaction. In most textbooks both are introduced only in the middle or at the end.

In the first part of the lecture we will restrict ourselves, unless stated otherwise, to T ≈ 0 K. Sometimes an extrapolation to T ≠ 0 K is unproblematic. Still, one should keep in mind that for T ≠ 0 K important changes and new effects can occur (e.g., due to entropy).

0.2 Literature for This Lecture

Author: Ashcroft, Neil W. and Mermin, N. David
Title: Solid state physics
Place: Philadelphia, PA
Year: 1981
Publisher: Saunders College Publishing
ISBN: 0-03-083993-9 = 0-03-049346-3

Author: Kittel, Charles
Title: Quantum theory of solids
Place: Hoboken, NJ
Year: 1963
Publisher: John Wiley & Sons, Inc.

Author: Ziman, John M.
Title: Principles of the theory of solids
Place: Cambridge
Year: 1964
Publisher: Cambridge University Press

Author: Madelung, Otfried
Title: Festkörpertheorie, 3 Bände
Place: Berlin
Year: 1972
Publisher: Springer

Author: Dreizler, Reiner M. and Gross, Eberhard K. U.
Title: Density functional theory: an approach to the quantum many-body problem
Place: Berlin
Year: 1990
Publisher: Springer
ISBN: 3-540-51993-9 = 0-387-51993-9

Author: Parr, Robert G. and Yang, Weitao
Title: Density-functional theory of atoms and molecules
Place: Oxford
Year: 1994
Publisher: Oxford University Press
ISBN: 0-19-509276-7

Author: Marder, Michael P.
Title: Condensed matter physics
Place: New York
Year: 2000
Publisher: John Wiley & Sons, Inc.
ISBN: 0-471-17779-2

Author: Martin, Richard M.
Title: Electronic Structure
Place: Cambridge
Year: 2004
Publisher: Cambridge University Press

Author: Cohen, Marvin L. and Louie, Steven G.
Title: Fundamentals of Condensed Matter Physics
Place: Cambridge
Year: 2016
Publisher: Cambridge University Press


0.3 Symbols and Terms Used in This Lecture

−e          charge of the electron
+e          charge of the proton
m           mass of the electron
r_k         position of electron k
σ_k         spin of electron k
Z_K         nuclear charge number (atomic number) of atom K
Z_vK        valence of atom K
M_K         mass of nucleus K
R_K         position of nucleus K
φ           electric field
Ψ           many-body wave function of the electrons and nuclei
Λ           nuclear wave function
Φ           many-body wave function of the electrons
ϕ           single-particle wave function of the electrons
χ           spin wave function
{R_I} ≡ R_1, ..., R_M                  atomic positions
{r_i σ_i} ≡ r_1 σ_1, ..., r_N σ_N      electron coordinates (position and spin)
ε₀          dielectric constant of the vacuum
ϵ_i         single-particle energy of electron i
V_g         volume of the base region
Ω           volume of a primitive cell
v_I(r_k)    potential created by nucleus I at the position of the k-th electron, r_k
V^BO        potential-energy surface (PES) or Born-Oppenheimer energy surface


0.4 Atomic Units

All through this lecture I will use SI-units (Système International d’Unités).

Often, however, so-called atomic units (a.u.) are introduced in order to simplify the notation in quantum mechanics. For historical reasons two slightly different conventions exist. For both we have:

\[
\text{length:} \quad \frac{4\pi\varepsilon_0\hbar^2}{m e^2} = 1~\text{bohr} = 0.529177~\text{\AA} = 0.0529177~\text{nm} , \tag{0.1}
\]

but then two different options have been established: Rydberg atomic units and Hartree atomic units.

               e²/(4πε₀)   ħ   m     energy unit        ħ²/(2m)   Hamilton operator for the hydrogen atom
Rydberg a.u.   2           1   0.5   1 Ry = 13.606 eV   1         −∇² − 2/r
Hartree a.u.   1           1   1     1 Ha = 27.211 eV   0.5       −(1/2)∇² − 1/r

Here 1 Ry = ħ²/(2m a_B²), where a_B = 1 bohr is the unit of length of Eq. (0.1), and 1 Ha = 2 Ry.
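The numbers in this section can be checked with a few lines of code. The following is a minimal sketch, assuming CODATA 2018 values for the SI constants (these values and all function names are not part of the lecture notes): it evaluates the length unit of Eq. (0.1), the Rydberg energy ħ²/(2m a_B²), and the Hartree energy e²/(4πε₀ a_B).

```python
import math

# SI constants (CODATA 2018 values, assumed here for illustration)
HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m


def bohr_radius_m() -> float:
    """Length unit of Eq. (0.1): 4*pi*eps0*hbar^2 / (m e^2), in meters."""
    return 4.0 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)


def rydberg_ev() -> float:
    """Rydberg energy hbar^2 / (2 m a_B^2), converted from joule to eV."""
    a_b = bohr_radius_m()
    return HBAR**2 / (2.0 * M_E * a_b**2) / E_CHARGE


def hartree_ev() -> float:
    """Hartree energy e^2 / (4 pi eps0 a_B), converted from joule to eV."""
    a_b = bohr_radius_m()
    return E_CHARGE / (4.0 * math.pi * EPS0 * a_b)


if __name__ == "__main__":
    print(f"1 bohr = {bohr_radius_m() * 1e10:.6f} Angstrom")
    print(f"1 Ry   = {rydberg_ev():.3f} eV")
    print(f"1 Ha   = {hartree_ev():.3f} eV")
```

The two energy units are computed from different expressions on purpose, so that the factor of two between Hartree and Rydberg comes out of the arithmetic rather than being put in by hand.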


1 Introduction

1.1 Many-Body Hamilton Operator

The starting point of a quantitative theoretical investigation of the properties of solids is the many-body Schrödinger equation

\[
H\Psi = E\Psi , \quad \text{with} \quad \Psi = \Psi(\{R_I\}, \{r_k, \sigma_k\}) . \tag{1.1}
\]

Here, the many-body wave function depends on the coordinates of all the atoms, RI,and on the space and spin coordinates of all electrons. In general, this wave function willnot separate into RI- and (rk, σk)-dependent components. This should be kept in mindwhen we will introduce such a separation below and use it in most parts of this lecture.Of course, this separation is an approximation, but an extremely useful one and we willdiscuss its range of validity.

The properties of condensed matter are determined by the electrons and nuclei and, in particular, by their interaction (∼ 10^23 particles per cm³). For many quantum-mechanical investigations it is useful to start with an approximation that is called the “frozen-core approximation”. This approximation is often helpful or convenient, but it is not necessary, i.e., the many-body problem can also be solved without introducing it. While most theoretical work still introduces this approximation, at the FHI we developed a methodology that treats the full problem with the same efficiency (in terms of CPU time and memory) as the “frozen-core approximation”. For qualitatively discussing and understanding numerical results, such as the inter-atomic interactions, it is nevertheless useful to focus on the valence electrons.

The frozen-core approximation assumes that, when condensed matter is formed from free atoms, only the valence electrons contribute to the interaction between atoms. Indeed, the electrons close to the nuclei (core electrons), which are in closed shells, will typically only have a small influence on the properties of solids. Exceptions are situations at very high pressure (small inter-atomic distances) and experiments that more or less directly measure the core electrons or the region close to the nuclei [e.g. X-ray photoemission (XPS), electron spin resonance (ESR)]. Therefore it is reasonable to introduce the following separation already in the atom, before turning to solids: Nucleus and core electrons shall be regarded as a unit, i.e., the neutral atom consists of a positive, spherically symmetric ion of charge $Z_v e$ and of $Z_v$ valence electrons.

This ion acts on each valence electron with a potential that looks like that shown in Fig. 1.1. The symbols have the following meaning:


Ze: nuclear charge of the atom

Zve: charge of the ion

Rc: radial extension of the core electrons

Figure 1.1: Potential $v^{Ion}(r)$ of a positive ion (full line), where all valence electrons have been removed, i.e., only the electrons in closed shells are kept. The dashed curves show the asymptotic behavior for small and large distances, $-Ze/(4\pi\varepsilon_0 r)$ and $-Z_v e/(4\pi\varepsilon_0 r)$, respectively; the points $\eta$ and $R_c$ are marked on the $r$ axis.

Thus, the number of core electrons is $Z - Z_v$, and the solid is considered to be composed of these ions and the valence electrons.

The frozen-core approximation is conceptually appropriate, because it corresponds to the nature of the interaction. In Table 1.1, I give the electronic configuration and the ionic potentials for four examples. Here (and in Fig. 1.1), $\eta$ is a small number roughly of the order of $R_c/(100Z)$. The question marks in the range $\eta \le r \le R_c$ indicate that in this range no analytic form of the potential can be given.

Though the field frequently uses a language (and often also a theory) in terms of ions and valence electrons, we will continue in this lecture by talking about nuclei and electrons. For the construction of the Hamilton operator of the many-body Schrödinger equation of a solid, we start with the classical Hamilton function and subsequently utilize the correspondence principle by replacing the classical momentum $\mathbf{p}$ with the operator $(\hbar/i)\nabla$.

The many-body Hamilton operator of the solid has the following contributions:

1) Kinetic energy of the electrons

$$ T^e = \sum_{k=1}^{N} \frac{\mathbf{p}_k^2}{2m} . \tag{1.2} $$

Table 1.1: Electronic configuration and ionic (frozen-core) potentials for different atoms.

atom   electronic configuration     Z    Zv   Rc (bohr)   vIon(r) (Ry)
H      1s¹                          1    1    0           −2/r
He     1s²                          2    2    0           −4/r
C      [1s²] 2s² 2p²                6    4    0.7         r ≥ Rc: −8/r;   η ≤ r ≤ Rc: ?;   r < η: −12/r
Si     [1s² 2s² 2p⁶] 3s² 3p²        14   4    1.7         r ≥ Rc: −8/r;   η ≤ r ≤ Rc: ?;   r < η: −28/r

2) Kinetic energy of the nuclei

$$ T^{Nuc} = \sum_{I=1}^{M} \frac{\mathbf{P}_I^2}{2M_I} . \tag{1.3} $$

If the solid is neutral and contains only one type of atom, then $N = ZM$, where $Z$ is the nuclear charge number of the atoms.

3) Electron-electron interaction

$$ V^{e-e}(\{\mathbf{r}_k \sigma_k\}) \approx \frac{1}{2} \frac{1}{4\pi\varepsilon_0} \sum_{\substack{k,k' \\ k \neq k'}}^{N,N} \frac{e^2}{|\mathbf{r}_k - \mathbf{r}_{k'}|} . \tag{1.4} $$

We use $\{\mathbf{r}_k \sigma_k\}$ as a short-hand notation for all position and spin coordinates of the electrons: $\mathbf{r}_1, \sigma_1, \mathbf{r}_2, \sigma_2, \mathbf{r}_3, \sigma_3, \ldots, \mathbf{r}_N, \sigma_N$. Here we have considered only the electrostatic interaction. In general, also the spin of the electrons and the magnetic interaction should and could be taken into account. Spin and magnetism in general require solving the Dirac equation. Often, however, a scalar-relativistic treatment is sufficient. We will get back to this in Chapter 3 (Electron-Electron Interaction).

4) Interaction between the nuclei

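$$ V^{Nuc-Nuc}(\{\mathbf{R}_I\}) \approx \frac{1}{2} \frac{1}{4\pi\varepsilon_0} \sum_{\substack{I,J \\ I \neq J}}^{M,M} \frac{e^2}{|\mathbf{R}_I - \mathbf{R}_J|} Z_I Z_J . \tag{1.5} $$

Also here (even better justified than for the electrons) we did not consider the spin of the particles.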

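5) Electron-nucleus interaction (without nuclear spin)

$$ V^{e-Nuc}(\{\mathbf{r}_k \sigma_k\}; \{\mathbf{R}_I\}) \approx -\frac{1}{4\pi\varepsilon_0} \sum_{I=1}^{M} \sum_{k=1}^{N} \frac{e^2}{|\mathbf{R}_I - \mathbf{r}_k|} Z_I , \tag{1.6} $$

where the contribution acting on the $k$-th electron is often summed up as:

$$ -\frac{1}{4\pi\varepsilon_0} \sum_{I=1}^{M} \frac{e^2}{|\mathbf{R}_I - \mathbf{r}_k|} Z_I = v(\mathbf{r}_k) . \tag{1.7} $$

Here, $v(\mathbf{r}_k)$ is the potential due to all the nuclei.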

Consequently, the many-body Hamilton operator of the solid reads

$$ H = T^e + T^{Nuc} + V^{e-e} + V^{e-Nuc} + V^{Nuc-Nuc} . \tag{1.8} $$

1.2 Separation of Dynamics of Electrons and Nuclei

1.2.1 Adiabatic Approximation or Born-Oppenheimer Approximation

The dynamics of the electrons and nuclei is described by the time-dependent Schrödinger equation

$$ i\hbar \frac{\partial \Psi(t)}{\partial t} = H \Psi(t) , \tag{1.9} $$

where $H$ is defined in Eq. (1.8). Thus

$$ \Psi(t) = e^{-iH (t - t_0)/\hbar}\, \Psi(t_0) . \tag{1.10} $$

In order to solve this equation, we have to bring it into a more tractable form. How can we split things up? How can we divide the problem into smaller and tractable pieces in order to conquer the whole?

Before we start the mathematical discussion, let me give an initial remark to motivate the goal: Considering the (inert) masses of nuclei and electrons, it seems plausible that electrons react to an external perturbation much faster than nuclei. This is reflected in the ratio of the masses, e.g.:

$M_{H}/m \approx 1{,}840$ , $M_{Si}/m \approx 51{,}200$ , $M_{Ag}/m \approx 196{,}600$ .

Thus, it seems reasonable to assume that electrons adjust without noticeable delay to the current positions of the atoms $\{\mathbf{R}_I\}$. Formulated more precisely, it can be said that electrons in general¹ react to a perturbation on a time scale of femtoseconds ($10^{-15}$ s), while nuclei require times of the order of picoseconds ($10^{-12}$ s). Thus, we may assume that from the electrons' point of view the nuclei do not move (or move sufficiently slowly).
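The mass ratios can be recomputed from tabulated atomic weights. The sketch below does this for the three elements mentioned; the rounded atomic weights, the conversion factor, and the function name `mass_ratio` are assumptions made for this illustration.

```python
# Atomic mass unit expressed in electron masses (CODATA 2018 value of u / m_e)
U_IN_ME = 1822.888486

# Standard atomic weights in u (rounded; assumed here for illustration)
ATOMIC_WEIGHT_U = {"H": 1.008, "Si": 28.085, "Ag": 107.868}


def mass_ratio(symbol: str) -> float:
    """Return M/m, the atomic mass in units of the electron mass."""
    return ATOMIC_WEIGHT_U[symbol] * U_IN_ME


for symbol in ATOMIC_WEIGHT_U:
    ratio = mass_ratio(symbol)
    # m/M is the small parameter that controls the separation of time scales
    print(f"{symbol:2s}: M/m = {ratio:9.0f}, m/M = {1.0 / ratio:.1e}")
```

For all but the lightest elements the inverse ratio m/M comes out at 10^-4 or below, which is the small parameter behind the separation of electronic and nuclear time scales.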

¹ The term “in general” implies that exceptions are possible. We will get back to such situations later in this lecture.

To check (and justify) this decoupling of the motion of the electrons and the nuclei, we now give an (initially) exact discussion.

We define an operator $H^e$ in order to use its eigenfunctions as a basis set:

$$ H^e(\{\mathbf{R}_I\})\, \Phi_\nu(\{\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\}) = E^e_\nu \Phi_\nu , \tag{1.11} $$

with [terms defined in Eqs. (1.2), (1.4), (1.6)]

$$ H^e = T^e + V^{e-Nuc} + V^{e-e} . \tag{1.12} $$

If the kinetic energy of the nuclei were zero (or $M_I/m \to \infty$), the electrons could be described by this equation. But strictly speaking, the meaning of the functions $\Phi_\nu$ defined by Eq. (1.11) is only that of basis functions. The arguments $\{\mathbf{R}_I\}$ in the electronic wave functions should not be interpreted as variables of the wave function, but as parameters that classify the Hamilton operator $H^e$ (similar to the nuclear charge numbers $Z_I$).

The following statement is exact: The solutions of Eq. (1.1) with the Hamiltonian defined by Eq. (1.8) can be expanded in terms of the functions $\Phi_\nu$ [the eigenfunctions of Eq. (1.11)]

$$ \Psi = \sum_\nu \Lambda_\nu(\{\mathbf{R}_I\})\, \Phi_\nu(\{\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\}) . \tag{1.13} $$

The meaning of Eqs. (1.11) and (1.13) can also be expressed as: The eigenfunctions of $H^e$ for each atomic configuration $\{\mathbf{R}_I\}$ form a complete set of functions. Strictly speaking, the eigenfunctions of one atomic configuration $\{\mathbf{R}_I\}$ are already complete (with respect to the electronic coordinates), i.e., the Hilbert spaces of different atomic configurations $\{\mathbf{R}_I\}$ are the same. Still it is reasonable (here) to consider the functions $\Phi_\nu(\{\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\})$ as being dependent on $\{\mathbf{R}_I\}$. Mathematically it would also be correct to take the $H^e$ and the $\Phi_\nu$ of a certain configuration $\{\mathbf{R}_I^0\}$ and to consider the dependence on $\{\mathbf{R}_I\}$ only by the coefficients $\Lambda_\nu(\{\mathbf{R}_I\})$. This will be discussed in Section 1.2.2. Now, we investigate the equation $H\Psi = E\Psi$, representing $\Psi$ by Eq. (1.13) and

$$ H = H^e + T^{Nuc} + V^{Nuc-Nuc} . \tag{1.14} $$

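Obviously, for the operator $H^e$ we have

$$ H^e \Lambda_\nu \Phi_\nu = \Lambda_\nu H^e \Phi_\nu = \Lambda_\nu E^e_\nu \Phi_\nu . \tag{1.15} $$

Considering the full Schrödinger equation (1.1) together with Eqs. (1.8) and (1.13), $V^{Nuc-Nuc}$ can be interchanged with $\Lambda_\nu$, but not $T^{Nuc}$. Applying the product rule we obtain

$$ \nabla^2_{\mathbf{R}_I} (\Lambda_\nu \Phi_\nu) = \Lambda_\nu (\nabla^2_{\mathbf{R}_I} \Phi_\nu) + 2 (\nabla_{\mathbf{R}_I} \Lambda_\nu)(\nabla_{\mathbf{R}_I} \Phi_\nu) + (\nabla^2_{\mathbf{R}_I} \Lambda_\nu) \Phi_\nu . \tag{1.16} $$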

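Now Eq. (1.1) for the ground state $E_0, \Psi_0$ is multiplied from the left by $\Phi_\mu^*$ and integrated over the electronic coordinates. Using Eq. (1.11), the equation used to determine the “wave function of the electrons” $\Phi_\nu$, we obtain

$$ \langle \Phi_\mu | H | \Psi_0 \rangle = E_0 \Lambda_\mu = (E^e_\mu + T^{Nuc} + V^{Nuc-Nuc}) \Lambda_\mu + \sum_\nu \sum_{I=1}^{M} \left( -\frac{\hbar^2}{2M_I} \right) \left[ \langle \Phi_\mu | \nabla^2_{\mathbf{R}_I} | \Phi_\nu \rangle \Lambda_\nu + 2 \langle \Phi_\mu | \nabla_{\mathbf{R}_I} | \Phi_\nu \rangle (\nabla_{\mathbf{R}_I} \Lambda_\nu) \right] . \tag{1.17} $$

For each electronic state $\Phi_\mu, E^e_\mu$ there is one such equation. The difficult part in solving Eq. (1.17) is the terms coupling $\Phi_\mu$ and $\Phi_\nu$. They describe that the dynamics of the lattice atoms (the $\nabla_{\mathbf{R}_I}$ and $\nabla^2_{\mathbf{R}_I}$ operators) couples different electronic states. These terms are called electron-phonon (or electron-vibrational) coupling.²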

The electron-phonon coupling can be calculated. Then one often finds that for many properties of solids it is not very important and can be neglected. However, for the “standard superconductivity” (BCS theory) it is essential. Initially, for the new superconductors it was believed that the electron-phonon interaction is not the main origin of superconductivity. Nowadays, this is not generally accepted anymore, and it is still debated what the actuating mechanism is. For some solids electron-phonon coupling has a measurable influence on the spectrum of the lattice vibrations, and for some situations the electron-phonon coupling is responsible for structural instabilities. Keywords are, for example, Kohn anomaly, Jahn-Teller effect, and Peierls instability. These mechanisms become important because of certain aspects of the electronic structure. They will be discussed later, when we discuss defects and surfaces. Electron-phonon coupling is also relevant for and noticeable in electron spectroscopy of solids, e.g., when measuring the band gaps of semiconductors. The corresponding theory and actual calculations represent the current state of the art in this field.

Up to this point our derivation is exact. More general statements concerning the importance of the electron-phonon coupling are usually not possible. Now we will introduce two approximations:

1) We assume that at each time, i.e., for each lattice geometry $\{\mathbf{R}_I\}$, the electrons are in an eigenstate of $H^e$. Thus, we assume that the motion of the lattice does not induce transitions from $\Phi_\mu$ to $\Phi_\nu$. The reasoning behind this assumption is that the electrons (typically) react fast and therefore follow the nuclear motion instantaneously. One may also say that the electrons do not feel the nuclear motion and are always in the electronic ground state. In this case, the matrix elements $\langle \Phi_\mu | \nabla^2_{\mathbf{R}_I} | \Phi_\nu \rangle$ and $\langle \Phi_\mu | \nabla_{\mathbf{R}_I} | \Phi_\nu \rangle$ in Eq. (1.17) are zero for $\mu \neq \nu$.

This is called the adiabatic principle or Born-Oppenheimer approximation. In general, its quantitative validity, i.e., the importance of the off-diagonal elements, is hard to evaluate.

² Later in this lecture there will be a full chapter on lattice vibrations (phonons).

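2) The magnitude of the diagonal elements of the electron-phonon interaction can be estimated:

a) The term $\langle \Phi_\mu | \nabla_{\mathbf{R}_I} | \Phi_\mu \rangle = \frac{1}{2} \nabla_{\mathbf{R}_I} \langle \Phi_\mu | \Phi_\mu \rangle$ vanishes exactly, because $\langle \Phi_\mu | \Phi_\mu \rangle = 1$, i.e., it is constant. The derivative of a constant is zero.

b) For the term

$$ -\frac{\hbar^2}{2M_I} \langle \Phi_\mu | \nabla^2_{\mathbf{R}_I} | \Phi_\mu \rangle \tag{1.18} $$

the strongest imaginable dependence would exist if the electrons followed the atoms without any delay and distortion. This means

$$ |\langle \Phi_\mu | \nabla^2_{\mathbf{R}_I} | \Phi_\mu \rangle| \lesssim |\langle \Phi_\mu | \nabla^2_{\mathbf{r}_k} | \Phi_\mu \rangle| \tag{1.19} $$

and further

$$ \left| -\frac{\hbar^2}{2M_I} \langle \Phi_\mu | \nabla^2_{\mathbf{R}_I} | \Phi_\mu \rangle \right| \lesssim \frac{m}{M_I} \left| -\frac{\hbar^2}{2m} \langle \Phi_\mu | \nabla^2_{\mathbf{r}_k} | \Phi_\mu \rangle \right| \approx 10^{-4} \times \text{kinetic energy of an electron} . \tag{1.20} $$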

Thus, for the diagonal elements $\mu = \nu$, but unfortunately only for these, a rough estimation is possible.

With Eq. (1.20) and the adiabatic approximation we obtain the Schrödinger equation for “the wave functions of the nuclei”:

$$ (T^{Nuc} + V^{Nuc-Nuc} + E^e_\mu) \Lambda_\mu = E_0 \Lambda_\mu . \tag{1.21} $$

For the energetically lowest electronic state we will often write

$$ V^{Nuc-Nuc} + E^e_{\mu=0} = V^{BO} , \tag{1.22} $$

and $V^{BO}$ is called “potential-energy surface” (PES) or “Born-Oppenheimer energy surface”. The PES is the energy surface the nuclei are moving on according to Eq. (1.21).

When neglecting the coupling terms $\langle \Phi_\mu | \ldots | \Phi_\nu \rangle$ in Eq. (1.17), the eigenfunction of the ground state of $H$ has the form:

$$ \Psi \to \Psi^{BO} = \Lambda_0(\{\mathbf{R}_I\})\, \Phi_0(\{\mathbf{R}_I\}, \{\mathbf{r}_k, \sigma_k\}) , \tag{1.23} $$

where $\Phi_0$ is determined by Eq. (1.11) and $\Lambda_0$ by Eq. (1.21). Equations (1.11) and (1.21) can be solved reliably using modern methods.

Strictly, the motion of the nuclei on the Born-Oppenheimer surface would have to be described quantum mechanically. When the respective equations are solved, one finds that they can almost always be replaced by the classical (Newton) equations of motion. Quantum-mechanical effects like zero-point vibrations and tunneling only rarely play an important role. Hydrogen, as the lightest element, is an exception, but already for the deuteron a classical treatment is sufficient in most cases.

In general, the functions $\Lambda_0$ are narrowly peaked and centered at the atomic sites $\{\mathbf{R}_I\}$. Consequently, for the ground state of Eq. (1.21) we have:

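$$ E_0 = E^e_0(\{\mathbf{R}_I^0\}) + \frac{1}{4\pi\varepsilon_0} \frac{1}{2} \sum_{\substack{I,J \\ I \neq J}}^{M,M} \frac{e^2}{|\mathbf{R}_I^0 - \mathbf{R}_J^0|} Z_I Z_J + \langle \Lambda_0 | T^{Nuc} + V^{BO}(\{\mathbf{R}_I^0 - \mathbf{R}_I\}) | \Lambda_0 \rangle . \tag{1.24} $$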

The last term describes the zero-point vibrations. Equation (1.24) forms the basis of the ab initio calculation of the electronic, structural, elastic, and vibrational properties of solids. $E_0$ is often called total energy (or structural energy). It is based on just one approximation, the Born-Oppenheimer (or adiabatic) approximation.

The Born-Oppenheimer potential $V^{BO}(\{\mathbf{R}_I\})$ refers to the actual position of the nuclei (which typically changes with time), and it assumes that the functions $\Phi_\mu(\{\mathbf{R}_I\}, \{\mathbf{r}_k, \sigma_k\})$ refer to exactly these positions. A hard proof of the validity of the Born-Oppenheimer approximation is not possible and, in fact, the quality of this approximation depends on the actual problem, because there might be situations in which the electrons react slower than assumed above, and then they will not be able to follow the motion of the nuclei instantaneously, but only with some delay and distortion. The latter means that there are electronic transitions between the electronic ground state and excited states. Systematic studies of electron-vibrational coupling are a timely and increasingly important research topic.

The derivation in this section was reasonable in order to show the form of the matrix elements of the electron-phonon interaction. Further, we wanted to estimate the order of magnitude of the matrix elements. In principle, for each calculation the validity of the Born-Oppenheimer approximation should be checked by an explicit calculation of the matrix elements in Eq. (1.17) or at least by an estimation.

1.2.2 Static Approximation

We now briefly give an alternative derivation, which is often called the “static approximation”. The nuclei are always in motion, but in many cases they will just vibrate around a position that represents a minimum of the Born-Oppenheimer potential energy, $\{\mathbf{R}_I^0\}$. We now investigate the Hamilton operator $H^e(\{\mathbf{R}_I^0\})$, which yields the wave functions of the electrons $\Phi_\nu(\{\mathbf{R}_I^0\}, \{\mathbf{r}_k, \sigma_k\})$ at these equilibrium positions $\{\mathbf{R}_I^0\}$ [cf. Eqs. (1.11), (1.12)]. Again, this Hamiltonian $H^e(\{\mathbf{R}_I^0\})$ defines (by its eigenvectors) a complete set of functions, which we can use as a basis set for the general problem. Though this basis now refers to a fixed (static) geometry, the treatment is as general as that of Sec. 1.2.1 above. Although the final equations look different, they describe the same physics.

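In the present treatment the wave function of the solid is

$$ \Psi(\{\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\}) = \sum_\nu \hat{\Lambda}_\nu(\{\mathbf{R}_I\})\, \Phi_\nu(\{\mathbf{R}_I^0\}, \{\mathbf{r}_k \sigma_k\}) . \tag{1.25} $$

This equation is (so far) exact, too. But the expansion coefficients are different; therefore we added a “hat” to the $\Lambda$. Using $\Delta\mathbf{R}_I = \mathbf{R}_I - \mathbf{R}_I^0$, the components of the Hamilton operator containing the nucleus-nucleus and the electron-nucleus interaction [cf. Eqs. (1.5) and (1.6)] can be formally rewritten as:

$$ V^{Nuc-Nuc}(\{\mathbf{R}_I\}) = V^{Nuc-Nuc}(\{\mathbf{R}_I^0\}) + V^{ph}(\{\Delta\mathbf{R}_I\}) , \tag{1.26} $$

$$ V^{e-Nuc}(\{\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\}) = V^{e-Nuc}(\{\mathbf{R}_I^0\}, \{\mathbf{r}_k \sigma_k\}) + V^{e-ph}(\{\Delta\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\}) . \tag{1.27} $$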

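This defines two terms: The phonon potential energy $V^{ph}(\{\Delta\mathbf{R}_I\})$ and the electron-phonon coupling $V^{e-ph}(\{\Delta\mathbf{R}_I\}, \{\mathbf{r}_k \sigma_k\})$. From this we obtain the equation that determines the coefficients of the expansion and the energy eigenvalues of the solid

$$ E^e_\mu \hat{\Lambda}_\mu + T^{Nuc} \hat{\Lambda}_\mu + V^{Nuc-Nuc}(\{\mathbf{R}_I^0\}) \hat{\Lambda}_\mu + \sum_\nu \left\langle \Phi_\mu(\{\mathbf{R}_I^0\}, \{\mathbf{r}_k \sigma_k\}) \left| V^{e-ph} + V^{ph} \right| \Phi_\nu(\{\mathbf{R}_I^0\}, \{\mathbf{r}_k \sigma_k\}) \right\rangle \hat{\Lambda}_\nu = E_0 \hat{\Lambda}_\mu . \tag{1.28} $$

The difference between this equation and Eq. (1.17) is that the coupling term (the sum over $\nu$) now contains no differential operator. Also, $E^e_\mu$ is no longer a function of $\{\mathbf{R}_I\}$, but it is evaluated at the fixed geometry $\{\mathbf{R}_I^0\}$.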

If we neglect the coupling terms $\langle \Phi_\mu | \ldots | \Phi_\nu \rangle$ in Eq. (1.28), we obtain for the general wave function of the ground state

$$ \Psi \to \Psi^{static} = \hat{\Lambda}_0 \Phi_0 . $$

Here $\Phi_0$ is the ground-state eigenfunction of $H^e(\{\mathbf{R}_I^0\})$ and $\hat{\Lambda}_0$ is the solution of $[T^{Nuc} + E^e_0(\{\mathbf{R}_I^0\}) + V^{Nuc-Nuc}(\{\mathbf{R}_I^0\})] \hat{\Lambda}_0 = E_0 \hat{\Lambda}_0$. The error that is introduced by this ansatz and by neglecting the coupling of $\Phi_\mu$ and $\Phi_\nu$ in Eq. (1.28) is typically small and in fact vanishes in a first-order perturbative treatment.

1.2.3 Examples

What can we learn from the Born-Oppenheimer PES or, slightly more general, from

E0 = V BO + quantum mechanical corrections for lattice vibrations ? (1.29)

The difficulty, i.e., getting to know $E^e_0$ (solving the Schrödinger equation of the electrons), will be ignored for the moment. It will be addressed later (in Chapter 3). Here we simply assume that $E^e_0(\{\mathbf{R}_I\})$ and $V^{BO}(\{\mathbf{R}_I\})$ are known.

Figure 1.2: The total energy per atom, $V^{BO}/M$ (without zero-point vibrations), as a function of the inter-atomic distance $a$. The minimum of the curve, at $a_0$, determines the stable equilibrium geometry; its depth is the cohesive energy (≈ 5 eV), and the zero-point energy lies slightly above the minimum. The lattice constant, as measured, does not correspond to the minimum of the curve, but to the average over the zero-point vibrations: $a_0 + \Delta a$.

1.2.3.1 Structure, Lattice Constant, and Elastic Properties of Perfect Crystals

For a cubic crystal, due to the periodicity, the dependence of the total energy $E_0$ in Eq. (1.29) on the $\{\mathbf{R}_I\}$ is reduced to a single variable $a$, which determines the inter-atomic distances in a crystal. In Fig. 1.2 the total energy per atom is shown schematically: The cohesive energy is the energy that is gained by the formation of the crystal from the individual atoms.

The minimum of the total energy determines the equilibrium position and therefore the lattice constant $a_0$ of the crystal. The “bulk modulus” $B_0$, which describes the dependence of the equilibrium geometry on the external pressure, can be determined from the energy curve $E(a)$. It is defined as the volume $V$ times the second derivative (curvature) of the energy with respect to the volume, evaluated at the equilibrium distance $a_0$:

$$ B_0 = \frac{1}{K} = V \left. \frac{\partial^2 V^{BO}(V)}{\partial V^2} \right|_{a = a_0} , \tag{1.30} $$

where $V$ is the volume per atom (for a cubic crystal $V = a^3$) and $K$ is the compressibility.

Figure 1.2 shows the typical behavior of the binding energy of polyatomic systems as a function of the inter-atomic distance, and this form is often called “equation of state”.

Typically $V^{BO}$ is calculated for about 10 geometries and then the curve is represented by an analytical fit.
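The step from an energy curve to the bulk modulus of Eq. (1.30) can be sketched numerically. The analytic curve `energy_per_atom` below is a hypothetical stand-in for calculated $V^{BO}$ values (its parameters, 5 eV and 20 Å³, are assumed for illustration), and the second derivative is taken by a central finite difference instead of the analytical fit mentioned above.

```python
def energy_per_atom(volume, e_coh=5.0, v0=20.0):
    """Toy energy curve in eV with minimum -e_coh at volume v0 (in A^3).

    This analytic form is a hypothetical stand-in for calculated V^BO values.
    """
    x = v0 / volume
    return e_coh * (x * x - 2.0 * x)


def bulk_modulus(energy, v0, h=1e-2):
    """B0 = V d^2E/dV^2 at V = v0, via a central finite difference."""
    second_derivative = (energy(v0 + h) - 2.0 * energy(v0) + energy(v0 - h)) / h**2
    return v0 * second_derivative


EV_PER_A3_IN_GPA = 160.21766  # 1 eV/A^3 expressed in GPa

b0 = bulk_modulus(energy_per_atom, v0=20.0)
print(f"B0 = {b0:.4f} eV/A^3 = {b0 * EV_PER_A3_IN_GPA:.1f} GPa")
```

For this toy curve the exact result is $B_0 = 2 E_{coh}/V_0$, which gives a simple check on the finite-difference step `h`.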

The minimum of the “equation of state” is close to, but not exactly at, the lattice constant of the solid, because the “equation of state” shows a clear asymmetry.

The order of magnitude of the zero-point vibrations can be estimated from the uncertainty relation:

$$ \Delta P\, \Delta X \geq \hbar/2 . $$

If $\Delta X = 0.1$ bohr (a typical inter-atomic distance is 5 bohr ≈ 2.5 Å), it follows

$$ \frac{P^2}{2M} \approx 0.02\ \text{eV/atom} . $$

In comparison to the binding energy of the solid (cohesive energy) this is a small number ($\approx \frac{1}{200} E_{coh}$), but still, the zero-point vibrations do have a measurable effect: ≈ 0.1% increase of the lattice constant compared to a neglect of $\langle \Lambda_0 | T^{Nuc} + V^{BO}(\{\Delta\mathbf{R}_I\}) | \Lambda_0 \rangle$.
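The uncertainty-relation estimate can be evaluated for specific nuclear masses. The following sketch works in Hartree atomic units ($\hbar = 1$) and takes $\Delta P = \hbar/(2\Delta X)$ as the momentum scale; the choice of elements, the rounded atomic weights, and the function name are assumptions of this illustration.

```python
HARTREE_EV = 27.2114   # 1 Ha in eV
U_IN_ME = 1822.888486  # atomic mass unit in electron masses


def zero_point_kinetic_energy_eV(mass_u, delta_x_bohr=0.1):
    """Estimate P^2/(2M) from Delta P = hbar / (2 Delta X), in Hartree a.u. (hbar = 1)."""
    delta_p = 1.0 / (2.0 * delta_x_bohr)
    mass_me = mass_u * U_IN_ME
    return delta_p**2 / (2.0 * mass_me) * HARTREE_EV


for name, mass_u in [("H", 1.008), ("C", 12.011), ("Si", 28.085)]:
    print(f"{name:2s}: {zero_point_kinetic_energy_eV(mass_u):.4f} eV/atom")
```

With these assumptions the estimate for a light atom such as carbon is of the same order as the 0.02 eV/atom quoted above; it is larger for hydrogen and smaller for heavier atoms, since it scales as $1/M$.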

In Chapter 6 (cohesion) we will return to the “equation of state” and there we will compare crystal structures.

Obviously, when higher-energy vibrations are excited (by higher temperatures) the lattice constant increases³. This is due to the non-harmonic behavior of $V^{BO}$ around its minimum (cf. Fig. 1.2): For a value smaller than $a_0$ the potential energy increases steeply due to Pauli repulsion. All solids with one atom per unit cell show such thermal expansion.

1.2.3.2 Lattice Waves (Phonons)

When intending to calculate the energy of lattice waves (phonons), $E_0$ has to be investigated as a function of the wavelength $\lambda$ and the direction of the lattice wave. Figure 1.3 shows the example of a “frozen phonon”.⁴ The magnitude of $\eta$ “tells” how many phonons of wavelength $\lambda$ are excited. The energy of this lattice wave follows from the energy difference $E_0(\{\mathbf{R}_I\}) - E_0(\{\mathbf{R}_I^0\})$, where $\{\mathbf{R}_I^0\}$ is the geometry of the minimum of the PES (cf. Fig. 1.2), and $\{\mathbf{R}_I\}$ is the periodically distorted geometry sketched in Fig. 1.3.
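A minimal sketch of the frozen-phonon idea for a one-dimensional chain: the atoms are displaced by alternating $\pm\eta$ (wavelength $2a_0$), the energy difference to the undistorted chain is evaluated, and a frequency is extracted from $\Delta E/\text{atom} = \frac{1}{2} M \omega^2 \eta^2$. The harmonic nearest-neighbor spring model, its parameters, and the function names are assumptions standing in for a real PES.

```python
import math


def chain_energy(eta, n_atoms=8, a0=1.0, k=1.0):
    """Energy of a ring of atoms joined by harmonic nearest-neighbor springs.

    Atoms are displaced by +eta, -eta, +eta, ..., i.e. the pattern with
    wavelength 2*a0 (n_atoms must be even). The spring model is an assumed toy PES.
    """
    positions = [i * a0 + eta * (-1) ** i for i in range(n_atoms)]
    energy = 0.0
    for i in range(n_atoms):
        j = (i + 1) % n_atoms
        # bond length, using the periodic image for the bond that closes the ring
        length = positions[j] - positions[i] + (n_atoms * a0 if j == 0 else 0.0)
        energy += 0.5 * k * (length - a0) ** 2
    return energy


def frozen_phonon_frequency(eta=1e-3, n_atoms=8, mass=1.0, k=1.0):
    """omega from Delta E per atom = (1/2) M omega^2 eta^2."""
    delta_e = chain_energy(eta, n_atoms, k=k) - chain_energy(0.0, n_atoms, k=k)
    return math.sqrt(2.0 * delta_e / (n_atoms * mass * eta**2))


omega = frozen_phonon_frequency()
print(f"frozen-phonon omega = {omega:.6f}, analytic 2*sqrt(k/M) = {2.0:.6f}")
```

For this toy model the result can be compared with the zone-boundary frequency $2\sqrt{k/M}$ of the harmonic chain; with an anharmonic PES the extracted value would depend on $\eta$.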

From the energy of the lattice wave we can, for example, obtain the specific heat of the lattice (cf. Ashcroft and Mermin, p. 452-454) and the thermal expansion.

³ Some materials exhibit a “negative thermal expansion”, i.e., they shrink with increasing temperature. This is actuated by entropy effects or by a change in magnetism. We will get back to such cases later during the lecture.

⁴ The detailed introduction and discussion of phonons will be given later in a full chapter.

Figure 1.3: Schematic picture of a snapshot of a lattice wave (frozen phonon). The arrows give the direction of the distortion of the atoms. The wavelength is $\lambda = 2a_0$, and $a_0$ is the equilibrium distance of the atoms. The amplitude of the distortion is $\eta$.

1.3 The Ewald Method

(Reference: J.C. Slater, Insulators, Semiconductors and Metals, Quantum Theory of Molecules and Solids, Vol. 3, McGraw-Hill, 1967, pp. 215-220)

The sum appearing in Eq. (1.24), now evaluated per atom,

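$$ \frac{E^{Nuc-Nuc}}{M} = \frac{1}{M} \frac{1}{4\pi\varepsilon_0} \frac{1}{2} \sum_{\substack{I,J \\ I \neq J}}^{M,M} \frac{e^2}{|\mathbf{R}_I - \mathbf{R}_J|} Z_I Z_J , \tag{1.31} $$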

exhibits an interesting (and often important) aspect. It cannot be calculated directly, because it converges very slowly or not at all. Even for a very large number of atoms $M$ the result depends on the order of summation, or on the shape of the surface that includes the already summed up terms.

In general, the sum in Eq. (1.31) is an important contribution to the total energy. And for ionic crystals (e.g. if one considers NaCl as being composed of Na⁺ and Cl⁻ ions) it is even the dominating part of the total energy. The poor convergence of the sum in Eq. (1.31) is therefore somewhat unexpected. The reason is the long range of the Coulomb interaction. The methodical treatment for evaluating Eq. (1.31) is important for actual calculations, but it is also interesting, because it clarifies a methodical approach that in analogous ways is also helpful for other problems. In general the procedure can be described as follows: If there is an apparently unsolvable problem, first a similar (and possibly uninteresting) problem is solved and then the difference of the two systems is investigated.

The problem in the calculation of the sum in Eq. (1.31), i.e., the poor convergence, originates from the fact that the number of atoms that are at the same distance from a chosen center is growing with this distance. We discuss the example of a periodic solid with only one atom per unit cell⁵ (and therefore only one atom type) located at $\mathbf{R}_1$. It follows

$$ \frac{E^{Nuc-Nuc}}{M} = \frac{1}{4\pi\varepsilon_0} \frac{1}{2} \sum_{I=2}^{M} \frac{Z^2 e^2}{|\mathbf{R}_I - \mathbf{R}_1|} = \frac{1}{2} Z e\, \phi(\mathbf{R}_1) . \tag{1.32} $$

Here, $\phi(\mathbf{R}_1)$ is the electrostatic potential generated at the position $\mathbf{R}_1$ of the $I = 1$ ion by all other ions $I = 2, 3, \ldots, M$, i.e., the periodic images of $I = 1$. Because of this periodicity we have $\phi(\mathbf{R}_1) = \phi(\mathbf{R}_2) = \phi(\mathbf{R}_3)$, etc. The electrostatic potential can be calculated from the Poisson equation. If $e\, n^+(\mathbf{r})$ is the charge density of the nuclei, we have

$$ \nabla^2 \phi(\mathbf{r}) = -\frac{e}{\varepsilon_0} n^+(\mathbf{r}) . \tag{1.33} $$

Initially, we take into account all atoms in this equation, including $I = 1$. Later we will remove the contribution of $I = 1$, as it should not appear in Eq. (1.32). We have:

φ(R1) = φ(R1)− contribution of the charge density of the nucleus #1 . (1.34)

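First, the reason for the difficulties in evaluating Eq. (1.33) will be pointed out. Because in a periodic crystal $n^+(\mathbf{r})$ is periodic, we have

$$ n^+(\mathbf{r}) = \sum_{\mathbf{G}_n} n^+(\mathbf{G}_n)\, e^{i \mathbf{G}_n \mathbf{r}} \tag{1.35} $$

with

$$ n^+(\mathbf{G}_n) = \frac{1}{\Omega} \int_\Omega n^+(\mathbf{r})\, e^{-i \mathbf{G}_n \mathbf{r}}\, d^3r \tag{1.36} $$

and $\mathbf{R}_n \mathbf{G}_n = 2\pi\Gamma$, where $\Gamma$ is an integer number. $\Omega$ is the volume of the unit cell⁵ (cf. Chapter 4). We have

$$ n^+(\mathbf{G}_n = 0) = \frac{1}{\Omega} Z ; \tag{1.37} $$

and because of the Poisson equation (1.33) it follows for the electrostatic potential $\phi$

$$ \nabla^2 \phi(\mathbf{r}) = -\frac{e}{\varepsilon_0} \sum_{\mathbf{G}_n} n^+(\mathbf{G}_n)\, e^{i \mathbf{G}_n \mathbf{r}} \tag{1.38} $$

and therefore

$$ \phi(\mathbf{r}) = \frac{e}{\varepsilon_0} \sum_{\mathbf{G}_n} n^+(\mathbf{G}_n) \frac{e^{i \mathbf{G}_n \mathbf{r}}}{|\mathbf{G}_n|^2} + C , \tag{1.39} $$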

which is easily verified by evaluating $\nabla^2 \phi(\mathbf{r})$. We point out that the singularity of Eq. (1.39) for $\mathbf{G}_n = 0$ does not play a role. It is cancelled by a corresponding singularity in the other contributions to the total energy, which appear in $V^{e-e}$ and $V^{e-Nuc}$. This is reasonable, because for a neutral system the term $\mathbf{G}_n = 0$ has to disappear after all. The singularity for $\mathbf{G}_n = 0$ in Eq. (1.39) will therefore be ignored.

⁵ The unit cell is the smallest unit which can be used to construct a periodic solid.

Figure 1.4: Gaussian function $e^{-\alpha r^2}$. The width (at value $e^{-\alpha r^2} = 1/e = 0.368$) is $2/\sqrt{\alpha}$.

If $e\, n^+(\mathbf{r})$ contained only the nuclear charges, i.e., δ-functions, then for all terms we have $n^+(\mathbf{G}_n) = Z/\Omega$. We note for Eqs. (1.32) and (1.39) that the potential of δ-shaped charge densities converges poorly in real space and in Fourier space.

On the other hand, in Eq. (1.39) one recognizes that the convergence of the series would be better if we had a charge density $n^+(\mathbf{r})$ for which $n^+(\mathbf{G}_n)$ decreases with increasing $|\mathbf{G}_n|^2$. An exponential decay would be best. It follows that a sum of Gaussian functions has to converge nicely. Thus we first investigate such Gaussian-shaped charge densities, although this does not directly correspond to what we are interested in:

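$$ n^+_{Gauss}(\mathbf{r}) = Z \sum_{I=1}^{M} \left( \frac{\alpha}{\pi} \right)^{3/2} e^{-\alpha |\mathbf{r} - \mathbf{R}_I|^2} . \tag{1.40} $$

We have normalized the individual Gaussian functions, i.e., we have

$$ \int \left( \frac{\alpha}{\pi} \right)^{3/2} e^{-\alpha |\mathbf{r} - \mathbf{R}_I|^2}\, d^3r = 1 , \tag{1.41} $$

and $2/\sqrt{\alpha}$ is the width of the individual Gaussians (cf. Fig. 1.4). The Fourier representation of $n^+_{Gauss}(\mathbf{r})$ has the form:

$$ n^+_{Gauss}(\mathbf{G}_n) = \frac{Z}{\Omega} \sum_{I=1}^{M} \int_\Omega \left( \frac{\alpha}{\pi} \right)^{3/2} e^{-\alpha |\mathbf{r} - \mathbf{R}_I|^2} e^{-i \mathbf{G}_n \mathbf{r}}\, d^3r = \frac{Z}{\Omega}\, e^{-G_n^2/(4\alpha)} , \tag{1.42} $$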

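Figure 1.5: The electrostatic potential $\phi_{Gauss}(r) = \frac{Z_v e}{4\pi\varepsilon_0} \frac{\mathrm{erf}(\sqrt{\alpha}\, r)}{r}$ of a Gaussian-shaped charge density of charge $e Z_v$, compared with the point-charge potential $\frac{Z_v e}{4\pi\varepsilon_0 r}$. At $r = 0$ the Gaussian potential stays finite. For large $r$ ($r > 1/\sqrt{\alpha}$) the function approaches $\frac{Z_v e}{4\pi\varepsilon_0 r}$.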

and for the corresponding electrostatic potential we obtain from the Poisson equation

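$$ \phi_{Gauss}(\mathbf{r}) = \frac{Z}{\Omega} \frac{e}{\varepsilon_0} \sum_{\mathbf{G}_n} e^{-G_n^2/(4\alpha)} \frac{e^{i \mathbf{G}_n \mathbf{r}}}{|\mathbf{G}_n|^2} + C \tag{1.43} $$

$$ = \frac{Z e}{4\pi\varepsilon_0} \sum_{I=1}^{M} \frac{\mathrm{erf}(\sqrt{\alpha}\, |\mathbf{r} - \mathbf{R}_I|)}{|\mathbf{r} - \mathbf{R}_I|} + C . \tag{1.44} $$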

Here erf(x) is the error function

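$$ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-x'^2}\, dx' . \tag{1.45} $$

This means,

$$ \frac{Z e}{4\pi\varepsilon_0} \times \frac{\mathrm{erf}(\sqrt{\alpha}\, |\mathbf{r} - \mathbf{R}_I|)}{|\mathbf{r} - \mathbf{R}_I|} \tag{1.46} $$

is the electrostatic potential that is created by a Gaussian charge density cloud centered at position $\mathbf{R}_I$ (cf. Fig. 1.5).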

In contrast to $\phi(\mathbf{r})$ in Eq. (1.39), this sum [cf. Eqs. (1.42) and (1.44)] converges excellently. This is because of the factor $\exp(-G_n^2/4\alpha)$. The smaller $\alpha$, the wider are the Gaussians, the smoother is $\phi_{Gauss}(\mathbf{r})$, and the better is the convergence with respect to $\mathbf{G}_n$.

Because we are not interested in Gaussian clouds, we write for the density of interest

$$ n^+(\mathbf{r}) = \left\{ n^+(\mathbf{r}) - n^+_{Gauss}(\mathbf{r}) \right\} + n^+_{Gauss}(\mathbf{r}) . \tag{1.47} $$

Figure 1.6: Charge distribution $n(\mathbf{r})$ of point charges and surrounding Gaussian charge densities at the positions $\mathbf{R}_1, \ldots, \mathbf{R}_5$. Delta-functions are represented by arrows.

Together, the first two components describe a neutral charge, i.e., δ-shaped point charges, which are surrounded by oppositely charged Gaussian clouds and therefore are screened. It is therefore obvious that the sum over such neutral objects converges rapidly in real space. The charge distribution is shown in Fig. 1.6. The last term in Eq. (1.47) converges, as discussed above, very nicely in Fourier space.

We recognize that the contributions centered at different positions now do not interact (or interact only weakly). The electrostatic potential of these two components is:

$$ \phi_{1,2}(\mathbf{r}) = \frac{Z e}{4\pi\varepsilon_0} \sum_{I=1}^{M} \left\{ \frac{1 - \mathrm{erf}(\sqrt{\alpha}\, |\mathbf{r} - \mathbf{R}_I|)}{|\mathbf{r} - \mathbf{R}_I|} \right\} + C . \tag{1.48} $$

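The term in the curly brackets of Eq. (1.48) vanishes with increasing distance to the nucleus at position $\mathbf{R}_I$. Consequently, the sum converges rapidly. Only a few atomic positions in the neighborhood of $\mathbf{r}$ have to be taken into account. The third contribution to $n^+(\mathbf{r})$ in Eq. (1.47), the Gaussian clouds, we describe in the Fourier representation of Eq. (1.43). For the potential that we want to calculate, we thus obtain

$$ \phi(\mathbf{r}) = \frac{Z e}{4\pi\varepsilon_0} \sum_{I=1}^{M} \frac{1 - \mathrm{erf}(\sqrt{\alpha}\, |\mathbf{r} - \mathbf{R}_I|)}{|\mathbf{r} - \mathbf{R}_I|} + \frac{Z}{\Omega} \frac{e}{\varepsilon_0} \sum_{\mathbf{G}_n} e^{-G_n^2/(4\alpha)} \frac{e^{i \mathbf{G}_n \mathbf{r}}}{|\mathbf{G}_n|^2} - \frac{Z e}{4\pi\varepsilon_0} \frac{1}{|\mathbf{r} - \mathbf{R}_1|} + C . \tag{1.49} $$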

Here we have now removed also the RI = R1 contribution (cf. Eq. (1.32)).

Finally, I will briefly discuss the integration constant $C$. The electrostatic potential $\phi(\mathbf{r})$, as written down in Eq. (1.49), depends on $\alpha$. Of course this is unphysical and unwanted. The reason for this dependence is that we have not yet determined the integration constant $C$, which appeared in the solution of the Poisson equation for $\phi_{Gauss}(\mathbf{r})$.

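From the condition that $\phi$ should not depend on $\alpha$ we obtain the integration constant $C$. The following condition has to be fulfilled:

$$ \frac{d\phi(\mathbf{r})}{d\alpha} = 0 . \tag{1.50} $$

The calculation yields:

$$ C = -\frac{\pi Z e}{\Omega \alpha} \times \frac{1}{4\pi\varepsilon_0} . \tag{1.51} $$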
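The α-independence demanded by Eq. (1.50) can be checked numerically. The sketch below evaluates Eq. (1.49) in the limit $\mathbf{r} \to \mathbf{R}_1$ for a simple cubic lattice with one ion per cell, in units of $Ze/(4\pi\varepsilon_0)$ and with the lattice constant set to 1. The choice of lattice, the cutoff `n_max`, and the function name are assumptions of this illustration; the $\mathbf{G}_n = 0$ term is dropped as discussed above, and $C$ is taken from Eq. (1.51).

```python
import math


def ewald_site_potential(alpha, box=1.0, n_max=5):
    """Potential at an ion site of a simple cubic lattice, Eq. (1.49) for r -> R_1.

    Returned in units of Z e / (4 pi eps0); the Gn = 0 term is dropped
    and C is taken from Eq. (1.51).
    """
    volume = box**3
    shells = range(-n_max, n_max + 1)
    real_sum = 0.0
    recip_sum = 0.0
    for nx in shells:
        for ny in shells:
            for nz in shells:
                n2 = nx * nx + ny * ny + nz * nz
                if n2 == 0:
                    continue
                # screened point charges: [1 - erf(sqrt(alpha) R)] / R
                dist = box * math.sqrt(n2)
                real_sum += math.erfc(math.sqrt(alpha) * dist) / dist
                # Gaussian clouds in Fourier space: (4 pi / Omega) exp(-G^2 / 4 alpha) / G^2
                g2 = (2.0 * math.pi / box) ** 2 * n2
                recip_sum += 4.0 * math.pi / volume * math.exp(-g2 / (4.0 * alpha)) / g2
    # I = 1 term: [1 - erf(sqrt(alpha) r)]/r - 1/r -> -2 sqrt(alpha/pi) for r -> 0
    self_term = -2.0 * math.sqrt(alpha / math.pi)
    constant = -math.pi / (volume * alpha)
    return real_sum + recip_sum + self_term + constant


for alpha in (2.0, 4.0, 8.0):
    print(f"alpha = {alpha:4.1f}: phi(R_1) = {ewald_site_potential(alpha):.6f}")
```

If the constant $C$ is omitted, the printed values differ from one α to the next; with it, they should coincide to within the truncation error of the two sums.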


2 Some Definitions and Reminders, incl. Fermi Statistics of Electrons

2.1 Statistical Mechanics

We introduced the Hamiltonian of the electrons $H^e$ in Eqs. (1.11) and (1.12). We now like to elaborate on the physical (thermodynamic) meaning of its eigenvalues. At finite temperatures a solid is not just in its electronic ground state $(E^e_0, \Phi_0(\{\mathbf{r}_k \sigma_k\}))$, but higher-energy states $(E^e_\nu, \Phi_\nu(\{\mathbf{r}_k \sigma_k\}))_{\nu \neq 0}$ are thermally excited with a certain probability. In principle, there are also vibrations of the nuclei above the zero-point level, but those we will consider later [see the discussion of Eq. (2.6) below]. For the moment, we solely focus on electronic electron-hole excitations. Assuming that the volume, the number of particles in this volume, and the temperature are fixed, we have a canonical ensemble, and the probability $P(E^e_\nu, T)$ for the occurrence of state $(E^e_\nu, \Phi_\nu(\{\mathbf{r}_k \sigma_k\}))$ is proportional to $\exp(-E^e_\nu / k_B T)$. Here, $k_B$ is the Boltzmann constant and $T$ the absolute temperature.

The ensemble is described by the density operator

$$ \rho = \sum_\nu P(E^e_\nu, T)\, |\Phi_\nu\rangle \langle\Phi_\nu| . \tag{2.1} $$

Of course, the ensemble of all states has to be normalized to 1. Therefore we have

$$ \sum_\nu P(E^e_\nu, T) = 1 \equiv \frac{1}{Z^e} \sum_\nu \exp(-E^e_\nu / k_B T) . \tag{2.2} $$

This yields the partition function of the electrons $Z^e$:

$$ Z^e = \sum_\nu \exp(-E^e_\nu / k_B T) = \mathrm{Tr}\left( \exp(-H^e / k_B T) \right) . \tag{2.3} $$

$Z^e$ is directly related to the Helmholtz free energy $F^e$:

$$ -k_B T \ln Z^e = F^e = U^e - T S^e , \tag{2.4} $$

where $U^e$ and $S^e$ are the internal energy and the entropy of the electronic system, respectively. Consequently, the probability of a thermal excitation of a certain state $(E^e_\nu, \Phi_\nu)$ is

$$ P(E^e_\nu, T) = \frac{1}{Z^e} \exp(-E^e_\nu / k_B T) = \exp[(F^e - E^e_\nu)/k_B T] . \tag{2.5} $$

At finite temperatures, we thus need to know the full energy spectrum of the many-body Hamilton operator, i.e., $E^e_\nu$ for all values of $\nu$. Then we can calculate the partition function and the free energy defined in Eqs. (2.3) and (2.4), respectively.
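For a finite list of energy levels, Eqs. (2.2) to (2.5) translate directly into a few lines of code. The sketch below uses a hypothetical three-level spectrum (the numbers are not taken from any material), computes the entropy as $-k_B \sum_\nu P_\nu \ln P_\nu$, and lets one verify $F^e = U^e - T S^e$; here $U^e$ contains only the electronic energies $E^e_\nu$.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def canonical_ensemble(energies_eV, temperature_K):
    """Return (Z, probabilities, F, U, S) for a list of energy levels, Eqs. (2.2)-(2.5)."""
    kt = K_B * temperature_K
    # shift by the lowest level to avoid overflow; the shift is undone in F and Z
    e_min = min(energies_eV)
    weights = [math.exp(-(e - e_min) / kt) for e in energies_eV]
    z_shifted = sum(weights)
    probabilities = [w / z_shifted for w in weights]
    free_energy = e_min - kt * math.log(z_shifted)
    internal_energy = sum(p * e for p, e in zip(probabilities, energies_eV))
    entropy = -K_B * sum(p * math.log(p) for p in probabilities if p > 0.0)
    partition_function = z_shifted * math.exp(-e_min / kt)
    return partition_function, probabilities, free_energy, internal_energy, entropy


# hypothetical three-level spectrum (eV), not taken from any real material
levels = [0.0, 0.05, 0.20]
z, probs, f, u, s = canonical_ensemble(levels, temperature_K=300.0)
print(f"Z = {z:.4f}, F = {f:.5f} eV, U - T S = {u - 300.0 * s:.5f} eV")
```

The two energies printed at the end should agree, which is the content of Eq. (2.4).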

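Let us now briefly discuss how the internal energy and the entropy can be determined.¹ The internal energy is what is often also called the “total energy at finite temperature”:

$$U^e(\{\mathbf{R}_I\}, T) = \sum_\nu E^e_\nu(\{\mathbf{R}_I\}, T)\, P(E^e_\nu, T) + \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2} \sum_{\substack{I,J=1\\ I\neq J}}^{M,M} \frac{e^2 Z_I Z_J}{|\mathbf{R}_I - \mathbf{R}_J|} . \tag{2.6}$$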

It is important to keep in mind that in the general case, i.e., when also atomic vibrations are excited, we have $U = U^e + U^{\rm vib}$, not just $U^e$.

From Eq. (2.4) we obtain the specific heat

$$c_v = \frac{1}{V}\left(\frac{\partial U}{\partial T}\right)_V = \frac{T}{V}\left(\frac{\partial S}{\partial T}\right)_V . \tag{2.7}$$

We removed the superscript $e$ and in fact mean $U = U^e + U^{\rm vib}$ and $S = S^e + S^{\rm vib}$. The calculation of $c_v$ for metals (cf. Ashcroft-Mermin p. 43, 47, 54) is an instructive example of the importance of the Fermi statistics for electrons that will be introduced below.

2.2 Fermi Statistics of the Electrons

To inspect the statistical mechanics of electrons, we have to introduce some concepts that will be justified and discussed in a more formal way in Chap. 3. To start, we note that the full many-body problem for $N$ electrons is given by Eqs. (1.11) and (1.12):

$$H^e(\{\mathbf{R}_I\})\,\Phi_\nu(\{\mathbf{R}_I\}, \{\mathbf{r}_k\sigma_k\}) = E^e_\nu \Phi_\nu \quad\text{with}\quad H^e = T^e + V^{e-{\rm Nuc}} + V^{e-e} . \tag{2.8}$$

If the electron-electron interaction $V^{e-e}$ is neglected by assuming $H^{{\rm ind},e} = T^e + V^{e-{\rm Nuc}}$, the problem separates and one gets $N$ independent Hamiltonians $h_j$:

$$H^{{\rm ind},e} = \sum_{j=1}^{N} h_j \quad\text{with}\quad h_j = -\frac{\hbar^2}{2m}\nabla_j^2 + V^{e-{\rm Nuc}}(\mathbf{r}_j) . \tag{2.9}$$

Since the Hamiltonians $h = h_j = h_i$ are identical for all electrons, it is sufficient to solve

$$h\,\varphi_i(\mathbf{r}) = \epsilon_i\, \varphi_i(\mathbf{r}) \tag{2.10}$$

once. Each of these single-particle levels can be occupied with two electrons at most (one electron with spin up and one electron with spin down) due to the Pauli principle. In the absence of thermal excitations, i.e., at $T = 0\,$K, only the lowest $N$ energy levels $\epsilon_i$ are occupied, so that the ground-state total energy (internal energy at 0 K) of this electronic system is given by:

$$U^{e,{\rm ind}}(T = 0\,{\rm K}) = E^{e,{\rm ind}}_0 = \sum_{i=1}^{N} \epsilon_i . \tag{2.11}$$

¹ cf. e.g. N.D. Mermin, Phys. Rev. 137, A1441 (1965); M. Weinert and J.W. Davenport, Phys. Rev. B 45, 13709 (1992); M.G. Gillan, J. Phys. Condens. Matter 1, 689 (1989); J. Neugebauer and M. Scheffler, Phys. Rev. B 46, 16067 (1992); F. Wagner, T. Laloyaux, and M. Scheffler, Phys. Rev. B 57, 2102 (1998).

This independent-electron approximation is very drastic and thus not particularly useful by itself. However, it is very instructive: As will be discussed in Chap. 3, the full many-body problem can also be rewritten in terms of an effective single-particle Hamiltonian

$$h = -\frac{\hbar^2}{2m}\nabla^2 + V^{\rm eff}(\mathbf{r}) \tag{2.12}$$

with effective “single-particle” levels² given by

$$h\,\varphi_i(\mathbf{r}) = \epsilon_i\, \varphi_i(\mathbf{r}) . \tag{2.13}$$

Please note that in this formalism the electrons are only described by an effective single-particle Hamiltonian, but they are in fact not independent (see Chapter 3). As a consequence, the ground-state total energy (internal energy at 0 K) is given by

$$U^e(T = 0\,{\rm K}) = E^e_0 = \sum_{i=1}^{N} \epsilon_i + \Delta , \tag{2.14}$$

where $\Delta$ is a correction accounting for the electron-electron interaction. For independent particles $\Delta$ is zero, but for the true, fully interacting many-body problem it is very important (see Chapter 3).

We now apply the density matrix formalism introduced in the previous section to the electronic system described by Eq. (2.12), see for instance Marder, Chapter 6.4 or Landau-Lifshitz, Vol. IV. In this case, the lowest internal energy that is compatible with the Pauli principle at finite temperatures is given by

$$U^e(T) = \sum_{i=1}^{\infty} \epsilon_i f(\epsilon_i, T) + \Delta(T) + U^{\rm vib}(T) . \tag{2.15}$$

The index $i$ runs over all single-particle states, and the occupation probability of the $i$th single-particle level $\epsilon_i$ is given by the Fermi function (cf. e.g. Ashcroft-Mermin, Eqs. (2.41)-(2.49) or Marder, Chapter 6.4, and the plot in Fig. 2.1):

$$f(\epsilon_i, T) = \frac{1}{\exp[(\epsilon_i - \mu)/k_{\rm B}T] + 1} . \tag{2.16}$$

Here, $k_{\rm B}$ is again the Boltzmann constant and $\mu$ is the chemical potential of the electrons, i.e., the lowest energy required to remove an electron from the system. At $T = 0\,$K this is

$$-\mu = E^e_0(N-1) - E^e_0(N) . \tag{2.17}$$

At finite temperature this should be replaced by the difference of the corresponding free energies.

² The justification and meaning of the term “single-particle level” will be discussed in Chapter 3.

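[Figure 2.1: The Fermi-distribution function [Eq. (2.16)], plotted as $f(\epsilon)$ (vertical axis, from 0.0 to 1.0) versus $\epsilon$ for $T = 0$ and $T \neq 0$; the chemical potential $\mu$ is marked on the $\epsilon$ axis.]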

Obviously, the number of electrons $N$ is independent of temperature. Therefore, we have

$$N = \sum_{i=1}^{\infty} f(\epsilon_i, T; \mu) . \tag{2.18}$$

For a given temperature this equation contains only one unknown quantity, the chemical potential $\mu$. Thus, if all $\epsilon_i$ are known, $\mu(T)$ can be calculated.
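A minimal sketch of how Eq. (2.18) can be solved for $\mu$ at a given temperature is shown below. It uses plain bisection, which works because the total occupation grows monotonically with $\mu$. The level list, the helper names `fermi` and `chemical_potential`, and the convention that every entry of `eps` holds at most one electron are assumptions of this example.

```python
import numpy as np


def fermi(eps, mu, kT):
    """Fermi function, Eq. (2.16), written via tanh to avoid overflow."""
    x = (np.asarray(eps, dtype=float) - mu) / kT
    return 0.5 * (1.0 - np.tanh(0.5 * x))


def chemical_potential(eps, n_electrons, kT, tol=1e-12):
    """Solve N = sum_i f(eps_i, T; mu) for mu by bisection, Eq. (2.18)."""
    eps = np.asarray(eps, dtype=float)
    lo = eps.min() - 50.0 * kT   # essentially all levels empty
    hi = eps.max() + 50.0 * kT   # essentially all levels filled
    for _ in range(200):
        mu = 0.5 * (lo + hi)
        if fermi(eps, mu, kT).sum() < n_electrons:
            lo = mu
        else:
            hi = mu
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical equidistant single-particle levels (arbitrary energy units).
eps = np.linspace(0.0, 10.0, 101)
n_electrons = 40
kT = 0.05
mu = chemical_potential(eps, n_electrons, kT)
print("mu =", mu, " N =", fermi(eps, mu, kT).sum())
```

For this equidistant toy spectrum, $\mu$ lies between the highest occupied and the lowest unoccupied level of the $T = 0$ K configuration.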

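By introducing the energy and entropy per unit volume ($u = U/V$, $s = S/V$) and using the laws of thermodynamics [$(\partial u/\partial T)_V = T(\partial s/\partial T)_V$, and $s \to 0$ if $T \to 0$], we obtain

$$s^e = \frac{S^e}{V} = -\frac{k_{\rm B}}{V} \sum_i \Big[ f(\epsilon_i, T) \ln f(\epsilon_i, T) + \big(1 - f(\epsilon_i, T)\big) \ln \big(1 - f(\epsilon_i, T)\big) \Big] . \tag{2.19}$$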

The derivation of Eq. (2.19) is particularly simple if one is dealing with independent particles as described by Eq. (2.15) with $\Delta(T) = 0$.
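The sum in Eq. (2.19) is easy to evaluate once the occupations are known. The sketch below computes the total electronic entropy $S^e$ from a list of occupation numbers; dividing by the volume gives $s^e$. The function name `electronic_entropy` and the default $k_{\rm B} = 1$ are assumptions of this example.

```python
import numpy as np


def electronic_entropy(occupations, k_b=1.0):
    """Total entropy S^e = -k_B sum_i [f ln f + (1 - f) ln(1 - f)]."""
    f = np.asarray(occupations, dtype=float)
    # Levels with f = 0 or f = 1 contribute nothing (x ln x -> 0 for x -> 0),
    # so they are dropped to avoid evaluating log(0).
    f = f[(f > 0.0) & (f < 1.0)]
    return -k_b * np.sum(f * np.log(f) + (1.0 - f) * np.log(1.0 - f))


# At T = 0 K all occupations are 0 or 1 and the entropy vanishes.
print(electronic_entropy([1.0, 1.0, 0.0, 0.0]))
# A single half-filled level carries the maximal entropy k_B ln 2.
print(electronic_entropy([0.5]))
```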

2.3 Some Definitions

We will now introduce some definitions and constraints that are particularly useful in the description of solids. First, we consider a system without spin-orbit interaction. Thus spin and position coordinates separate:

$$\Phi(\{\mathbf{r}_k\sigma_k\}) = \Phi(\{\mathbf{r}_k\})\, \chi(\{\sigma_k\}) . \tag{2.20}$$

Second, we simplify the Hamiltonian $H^e$: One possibility consists in approximating the electron-nuclear interaction with a constant $C^{\rm jellium}$:

$$H^e = T^e + V^{e-{\rm Nuc}} + V^{e-e} \approx T^e + C^{\rm jellium} + V^{e-e} . \tag{2.21}$$

In this so-called “jellium” model, the potential of the nuclei is smeared out to a constant value. We note in passing that for different electron numbers $N$ this constant is different and that at not too low electron densities, this is an exact description (see Sec. 3.8 for details).³

Since the “jellium” model defined in Eq. (2.21) is not tractable analytically, we switch to an even cruder approximation, in which the electron-electron interaction is neglected as well:

$$H^e \approx T^e + C^{\rm jellium} . \tag{2.22}$$

For $C^{\rm jellium} = 0$, this is the free-electron problem that will be analysed in detail below. At this stage, you might ask yourself why it is helpful to discuss such a harsh approximation at all. As already mentioned in the discussion of Eq. (2.12) and discussed in detail in Chap. 3, the full many-body Schrödinger equation can be addressed using Density-Functional Theory (see Sec. 3.8) via an effective single-particle equation

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V^{\rm eff}(\mathbf{r})\right)\varphi_j(\mathbf{r}) = \epsilon_j\, \varphi_j(\mathbf{r}) . \tag{2.23}$$

In such a formalism, the “jellium” and the “free-electron” problem would just differ in the effective potential $V^{\rm eff}(\mathbf{r})$: For the “free-electron” problem, $V^{\rm eff}(\mathbf{r})$ is zero. For the “jellium” model in the limit of not too low densities, $V^{\rm eff}(\mathbf{r})$ becomes a constant that depends on $C^{\rm jellium}$ in a non-trivial way. We will discuss the details in Sec. 3.8. In the following, we will thus discuss the free-electron problem and then use the derived formalism to find interpretations of the more complex theories discussed in the later chapters.

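The eigenfunctions of Eq. (2.23) are plane waves

$$\varphi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} , \tag{2.24}$$

and choosing the energy zero such that $C^{\rm jellium} = 0$ the energy eigenvalues are

$$\epsilon(\mathbf{k}) = \frac{\hbar^2 k^2}{2m} . \tag{2.25}$$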

The vectors $\mathbf{k}$, or their components $k_x, k_y, k_z$, are interpreted as quantum numbers. Thus, they replace what so far was noted as index $j$. Including the spin, the quantum numbers are $\mathbf{k}$ and $s$, where the latter can assume one of the two values $+1/2$ and $-1/2$ (up and down).

The wave functions in Eq. (2.24) are not normalized (or they are normalized with respect to $\delta$ functions). In order to obtain a simpler mathematical description, it is often useful and helpful to constrain the electrons to a finite volume. This volume is called the base region $V_g$ and it shall be large enough, so that the obtained results are independent of its size and shape.⁴ The base region $V_g$ shall contain $N$ valence electrons and $M$ ions, and for simplicity we choose a box of the dimensions $L_x, L_y, L_z$ (cf. Ashcroft, Mermin: Exercise for more complex shapes). For the wave function we could choose an almost arbitrary constraint (because $V_g$ shall be large enough). It is advantageous to use periodic boundary conditions

$$\varphi(\mathbf{r}) = \varphi(\mathbf{r} + L_x\mathbf{e}_x) = \varphi(\mathbf{r} + L_y\mathbf{e}_y) = \varphi(\mathbf{r} + L_z\mathbf{e}_z) . \tag{2.26}$$

Here $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$ are the unit vectors in the three Cartesian directions. This is also called the Born-von Kármán boundary condition. As long as $L_x, L_y, L_z$, and thus also $V_g$, are large enough, the physical results do not depend on this treatment. Sometimes anti-cyclic boundary conditions are chosen as well in order to check that the results are independent of the chosen base region.

³ For systems with very low densities, electrons will localize themselves at $T = 0\,$K due to the Coulomb repulsion. This is called Wigner crystallization and was predicted in 1934.

⁴ For external magnetic fields the introduction of a base region can give rise to difficulties, because then physical effects often depend significantly on the border.

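By introducing such a base region, the wave functions can be normalized to the Kronecker delta. Using Eq. (2.26) and the normalization condition

$$\int_{V_g} \varphi^*_{\mathbf{k}}(\mathbf{r})\, \varphi_{\mathbf{k}'}(\mathbf{r})\, d^3r = \delta_{\mathbf{k},\mathbf{k}'} \tag{2.27}$$

we obtain

$$\varphi_{\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{V_g}}\, e^{i\mathbf{k}\cdot\mathbf{r}} . \tag{2.28}$$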

Because of Eq. (2.26), i.e., because of the periodicity, only discrete values are allowed for the quantum numbers $\mathbf{k}$, i.e., $\mathbf{k} \cdot L_i\mathbf{e}_i = 2\pi n_i$ and therefore

$$\mathbf{k} = \left(\frac{2\pi n_x}{L_x}, \frac{2\pi n_y}{L_y}, \frac{2\pi n_z}{L_z}\right) , \tag{2.29}$$

with $n_i$ being arbitrary integer numbers. Thus, the number of vectors $\mathbf{k}$ is countable, and each $\mathbf{k}$ point is associated with the volume $(2\pi)^3/V_g$.

Each state $\varphi_{\mathbf{k}}(\mathbf{r})$ can be occupied by two electrons. In the ground state at $T = 0\,$K the $N/2$ $\mathbf{k}$ points of lowest energy are occupied by two electrons each. Because $\epsilon$ depends only on the absolute value of $\mathbf{k}$, these points fill (for non-interacting electrons) a sphere in $\mathbf{k}$-space (the “Fermi sphere”) of radius $k_{\rm F}$ (the “Fermi radius”). We have

$$N = 2\, \frac{4}{3}\pi k_{\rm F}^3\, \frac{V_g}{(2\pi)^3} = \frac{1}{3\pi^2}\, k_{\rm F}^3\, V_g . \tag{2.30}$$

Here the spin of the electron (factor 2) has been taken into account, and $V_g/(2\pi)^3$ is the density of the $\mathbf{k}$-points [cf. Eq. (2.25)]. The particle density of the electrons in jellium is constant:

$$n(\mathbf{r}) = n = \frac{N}{V_g} = \frac{1}{3\pi^2}\, k_{\rm F}^3 , \tag{2.31}$$

the charge density of the electrons is $-en$, and the highest occupied state has the “quantum number”, or $\mathbf{k}$-vector, with radius $k_{\rm F} = \sqrt[3]{3\pi^2 n}$.
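The counting argument behind Eq. (2.30) can be checked numerically by enumerating the allowed $\mathbf{k}$ points of Eq. (2.29) in a cubic base region and counting those inside the Fermi sphere. The sketch below does this in units where lengths are measured in bohr; the box length, the value of $k_{\rm F}$, and the function name are assumptions chosen for illustration.

```python
import numpy as np


def count_electrons_in_fermi_sphere(k_f, box_length):
    """Count 2 electrons per allowed k point with |k| <= k_F in a cubic box."""
    dk = 2.0 * np.pi / box_length                 # spacing of k points, Eq. (2.29)
    n_max = int(np.floor(k_f / dk))
    n = np.arange(-n_max, n_max + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    k_squared = dk**2 * (nx**2 + ny**2 + nz**2)
    return 2 * np.count_nonzero(k_squared <= k_f**2)


k_f = 1.0            # bohr^-1, a typical metallic value
box_length = 120.0   # bohr, hypothetical edge length of the cubic base region
volume = box_length**3

n_counted = count_electrons_in_fermi_sphere(k_f, box_length)
n_formula = k_f**3 * volume / (3.0 * np.pi**2)    # Eq. (2.30)
print(n_counted, n_formula)
```

The two numbers differ only by a surface contribution that becomes negligible relative to $N$ as the base region grows, which is the sense in which results are independent of the size of $V_g$.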

The energy of this single-particle state $|\mathbf{k}_{\rm F}\rangle$ is

$$\epsilon_{\rm F} = \frac{\hbar^2}{2m}\, k_{\rm F}^2 = \frac{\hbar^2}{2m}\, (3\pi^2 n)^{2/3} . \tag{2.32}$$

For jellium-like systems (e.g., metals), the electron density is often noted in terms of the density parameter $r_s$. This is defined by a sphere of volume $\frac{4\pi}{3} r_s^3$, which contains exactly one electron. One obtains

$$\frac{4\pi}{3}\, r_s^3 = V_g/N = 1/n . \tag{2.33}$$

The density parameter $r_s$ is typically given in bohr units. For the valence electrons of metals $r_s$ is typically around 2 bohr, and therefore $k_{\rm F}$ is approximately 1 bohr$^{-1}$, or 2 Å$^{-1}$, respectively.
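The relations (2.31)-(2.33) can be combined into a small helper that converts $r_s$ into $n$, $k_{\rm F}$, and $\epsilon_{\rm F}$. The sketch below works in Hartree atomic units ($\hbar = m = 1$, lengths in bohr, energies in hartree); the function name and the unit-conversion constants are additions of this example.

```python
import math

BOHR_IN_ANGSTROM = 0.529177210903
HARTREE_IN_EV = 27.211386245988


def free_electron_parameters(r_s):
    """Return (n, k_F, eps_F) in atomic units for a density parameter r_s in bohr."""
    n = 3.0 / (4.0 * math.pi * r_s**3)            # Eq. (2.33)
    k_f = (3.0 * math.pi**2 * n) ** (1.0 / 3.0)   # from Eq. (2.31)
    eps_f = 0.5 * k_f**2                          # Eq. (2.32) with hbar = m = 1
    return n, k_f, eps_f


r_s = 2.0
n, k_f, eps_f = free_electron_parameters(r_s)
print("n     =", n, "bohr^-3")
print("k_F   =", k_f, "bohr^-1 =", k_f / BOHR_IN_ANGSTROM, "Angstrom^-1")
print("eps_F =", eps_f, "hartree =", eps_f * HARTREE_IN_EV, "eV")
```

Eliminating $n$ gives the compact relation $k_{\rm F}\, r_s = (9\pi/4)^{1/3} \approx 1.92$, so $r_s = 2$ bohr indeed corresponds to $k_{\rm F}$ close to 1 bohr$^{-1}$.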

Now we introduce the (electronic) density of states:

$$N(\epsilon)\, d\epsilon = \text{number of states in the energy interval } [\epsilon, \epsilon + d\epsilon] .$$

For the total number of electrons in the base region we have:

$$N = \int_{-\infty}^{+\infty} N(\epsilon)\, f(\epsilon, T)\, d\epsilon . \tag{2.34}$$

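For free electrons (jellium), we get the density of states:

$$\begin{aligned}
N(\epsilon) &= \frac{2V_g}{(2\pi)^3} \int d^3k\; \delta(\epsilon - \epsilon_k) \\
&= \frac{2V_g\, 4\pi}{(2\pi)^3} \int k^2\, dk\; \delta(\epsilon - \epsilon_k) \\
&= \frac{V_g}{\pi^2} \int \frac{d\epsilon_k}{(d\epsilon_k/dk)}\; \frac{2m\epsilon_k}{\hbar^2}\; \delta(\epsilon - \epsilon_k) \\
&= \frac{V_g}{\pi^2} \int d\epsilon_k\; \frac{m}{\hbar^2 k}\; \frac{2m\epsilon_k}{\hbar^2}\; \delta(\epsilon - \epsilon_k) \\
&= \frac{m V_g}{\pi^2 \hbar^3}\, \sqrt{2m\epsilon} .
\end{aligned} \tag{2.35}$$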

For $\epsilon < 0$ we have $N(\epsilon) = 0$. The density of states for two- and one-dimensional systems is discussed in the exercises (cf. also Marder).
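As a consistency check of Eq. (2.35), one can integrate the free-electron density of states up to $\epsilon_{\rm F}$ and compare with the electron number from Eq. (2.30). The sketch below does this with a simple trapezoidal rule in Hartree atomic units ($\hbar = m = 1$); the volume, the grid size, and the function names are assumptions of this example.

```python
import numpy as np


def dos_free_electrons(eps, volume):
    """N(eps) = V_g / pi^2 * sqrt(2 eps) for eps > 0, zero otherwise (hbar = m = 1)."""
    eps = np.asarray(eps, dtype=float)
    return np.where(eps > 0.0, volume / np.pi**2 * np.sqrt(2.0 * np.clip(eps, 0.0, None)), 0.0)


def electrons_at_zero_temperature(eps_f, volume, n_points=200001):
    """Integrate N(eps) from 0 to eps_F, i.e. Eq. (2.34) with a step-function f."""
    eps = np.linspace(0.0, eps_f, n_points)
    g = dos_free_electrons(eps, volume)
    return np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(eps))   # trapezoidal rule


k_f = 1.0          # bohr^-1
volume = 1000.0    # bohr^3, hypothetical base region
eps_f = 0.5 * k_f**2

n_integrated = electrons_at_zero_temperature(eps_f, volume)
n_formula = k_f**3 * volume / (3.0 * np.pi**2)              # Eq. (2.30)
print(n_integrated, n_formula)
```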

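Figure 2.2 shows the density of states and the occupation at $T = 0\,$K and at finite temperature. The density of states at the Fermi level is

$$\frac{N(\epsilon_{\rm F})}{V_g} = \frac{3}{2}\, \frac{N}{V_g}\, \frac{1}{\epsilon_{\rm F}} = \frac{m}{\hbar^2\pi^2}\, k_{\rm F} . \tag{2.36}$$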

The figure shows that at finite temperatures holes below $\mu$ and electrons above $\mu$ are generated.

Later, we will often apply Eqs. (2.32)-(2.35) and the underlying concepts, because some formulas can be presented and interpreted more easily if $\epsilon_{\rm F}$, $k_{\rm F}$, and $n(\mathbf{r})$ are expressed in this way.

[Figure 2.2: Density of states of free electrons, $\sqrt{\epsilon}\, f(\epsilon, T)$, and the separation into occupied and unoccupied states for two temperatures ($T = 0$ K and $T = 10{,}000$ K).]

3 Electron-Electron Interaction

3.1 Electron-Electron Interaction

In the adiabatic approximation, the dynamics of the electrons and nuclei is decoupled. Still, equations (1.11) and (1.17) describe systems containing $10^{23}$ particles. In the following, we will discuss methods that enable us to obtain solutions to the many-body Schrödinger equation defined in Eq. (1.11).

The wave function $\Phi_\nu(\{\mathbf{r}_i\sigma_i\})$ and its energy $E^e_\nu$ are determined by the equations (1.11) and (1.12). We write down Eq. (1.12) once more

$$H^e = \sum_{k=1}^{N} -\frac{\hbar^2}{2m}\nabla^2_{\mathbf{r}_k} + \sum_{k=1}^{N} v(\mathbf{r}_k) + \frac{1}{2}\, \frac{1}{4\pi\varepsilon_0} \sum_{\substack{k,k'=1\\ k\neq k'}}^{N,N} \frac{e^2}{|\mathbf{r}_k - \mathbf{r}_{k'}|} . \tag{3.1}$$

Here, $v(\mathbf{r}_k)$ is the potential of the lattice components (ions or nuclei) of the solid (cf. Eq. (1.7)). Often it is called the “external potential”.

At this point we briefly recall the meaning of the many-body wave function. It depends on $3N$ spatial coordinates and $N$ spin coordinates. Because the spatial coordinates are all coupled by the operator $V^{e-e}$, they cannot be dealt with separately in general. In a certain sense, this is already the case in the single-particle problem: Neglecting spin for the moment, the single-particle wave function $\varphi(x, y, z)$ depends on three spatial coordinates, and the motion in $x$-direction is usually not independent of that in $y$-direction. The same is true for $x$ and $z$, and for $y$ and $z$. In other words, $\varphi(x, y, z)$ does not describe 3 independent one-dimensional particles, but 1 particle with 3 spatial coordinates. In the same way the $N$-particle Schrödinger equation has to be treated as a many-body equation with $3N$ spatial coordinates. One can say that the total of all the electrons is like a glue, or a mush, and not like $3N$ independent particles.

If the electron-electron interaction were negligible or if it could be described as

$$V^{e-e} = \sum_{k=1}^{N} v^{e-e}(\mathbf{r}_k) , \tag{3.2}$$

i.e., if the potential at position $\mathbf{r}_k$ did not explicitly depend on the positions of the other electrons, then it would not be a major problem to solve the many-body Schrödinger equation, since it would decouple into independent-particle equations. Unfortunately, such a neglect cannot be justified: The electron-electron interaction is Coulombic; it has an infinite range and for small separations it becomes very strong. Indeed, learning about this electron-electron interaction is the most interesting part of solid-state theory, since it is responsible for a wide range of fascinating phenomena displayed in solids. Thus, we will now describe methods that enable us to take into account the electron-electron interaction in an appropriate manner. There are four methods (or concepts) that are being used:

1. Many-body perturbation theory (also called Green-function theory): This method is very general, but not practical in most cases. One constructs a series expansion in interactions and necessarily many terms have to be neglected, which are believed to be unimportant. Keywords are Green functions, self energies, vertex corrections, Feynman diagrams. We will come back to this method in a later part of this lecture, because it allows for the calculation of excited states.

2. Methods that are based on Hartree-Fock theory and then treat correlation analogously to state-of-the-art quantum chemistry approaches. Keywords are Møller-Plesset perturbation theory, coupled-cluster theory, and configuration interaction. These are very systematic concepts, but so far they do not play a role in materials science, as they are numerically too inefficient. This may well change in the next years if new algorithms are developed. It is also discussed that these equations are suitable for a quantum computer, and in 5-10 years such machines may well exist.

3. Effective single-particle theories: Here, we will emphasize in particular density-functional theory (DFT).¹ Primarily, DFT refers to the ground state, $E^e_0$. In this chapter we will discuss DFT (and its precursors) with respect to the ground state, and we will also briefly describe the calculation of excited states and time-dependent DFT (TD-DFT).

4. The quantum Monte Carlo method: Here the expectation value of the many-body Hamilton operator $H^e$ is calculated using a very general ansatz of many-body wave functions in the high-dimensional configurational space. Then the wave functions are varied and the minimum of the expectation value $\langle\Phi|H^e|\Phi\rangle/\langle\Phi|\Phi\rangle$ is determined. Due to the availability of fast computers and several methodological developments in recent years, this method has gained in importance. It will be discussed towards the end of this lecture.

Now we will discuss density-functional theory in detail. This will be done step by step in order to clarify the physical content of the theory. Thus, we begin with the Hartree and Hartree-Fock theory and then proceed, via Thomas-Fermi theory, to density-functional theory.

3.2 Hartree Approximation

Besides its scientific value, the ansatz introduced by Hartree also serves here as a more general example of how a theory can evolve. Often, initially an intuitive feeling is present. Only in a second step does one attempt to derive things in a mathematical way. Hartree (Proc. Camb. Phil. Soc. 24, 89, 111, 426 (1928)) started from the following idea: The effect of the electron-electron interaction on a certain electron at position $\mathbf{r}$ should approximately be given by the electrostatic potential, which is generated by all other electrons on average at position $\mathbf{r}$, i.e., it should approximately be possible to replace the potential $V^{e-e}(\{\mathbf{r}_i\})$ of the many-body Schrödinger equation by

$$V^{e-e}(\{\mathbf{r}_i\}) \overset{!}{\approx} \sum_{k=1}^{N} v^{\rm Hartree}(\mathbf{r}_k) \tag{3.3}$$

with

$$v^{\rm Hartree}(\mathbf{r}) = \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' . \tag{3.4}$$

¹ Walter Kohn was awarded the Nobel prize in chemistry for the development of density-functional theory in 1998.

The ansatz of Eq. (3.3) is called the mean-field approximation. As a consequence, the many-body Hamilton operator decomposes into $N$ single-particle operators

$$H^e = \sum_{k=1}^{N} h(\mathbf{r}_k) . \tag{3.5}$$

Each electron can then be described by a single-particle Schrödinger equation with a Hamilton operator

$$h = -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) . \tag{3.6}$$

The validity of Eqs. (3.3)-(3.6) may seem to be reasonable. However, for nearly all situations this approach represents a too drastic approximation. Only a more systematic treatment can show how problematic this approximation is, and this shall be done now.

Mathematical Derivation of the Hartree Equations

Starting from the general many-body equation, Eqs. (3.3)-(3.6) shall be derived. At first we note that $H^e$ does not contain the spin of the electrons explicitly, and therefore, spin and position are not coupled. Thus, for the eigenfunctions of $H^e$ we have the factorization

$$\Phi_\nu(\{\mathbf{r}_i\sigma_i\}) = \Phi_\nu(\{\mathbf{r}_i\})\, \chi_\nu(\{\sigma_i\}) . \tag{3.7}$$

To take into account orbital as well as spin quantum numbers, we will label the set of quantum numbers as follows

$$\nu \equiv o_\nu s_\nu ,$$

with $o_\nu$ representing the orbital and $s_\nu$ the spin quantum numbers of set $\nu$. In the free-electron system, $o_\nu$ represents all possible values of the three numbers $k_x, k_y, k_z$, and for each state, $s_\nu$ is either $\uparrow$ or $\downarrow$ (cf. Chapter 2).

Because $H^e$ does not contain spin-orbit and spin-spin coupling, the spin component factorizes and we find

$$\chi_\nu(\{\sigma_i\}) = \chi_{s_1}(\sigma_1)\, \chi_{s_2}(\sigma_2) \dots \chi_{s_N}(\sigma_N) . \tag{3.8}$$

Here, the “functions” are $\chi_\uparrow(\sigma) \equiv \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\chi_\downarrow(\sigma) \equiv \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, i.e., $\sigma$ refers to the elements of a two-component vector. For the spatial part such a product ansatz is not possible, because $H^e$ couples the positions of “different” electrons due to the electron-electron interaction. We used quotes because speaking of different electrons is no longer appropriate. One should better talk about a many-electron cloud or an electron gas. For the lowest eigenvalue of the Schrödinger equation the variational principle holds, i.e., for the ground state of $H^e$ we have

$$E^e_0 \le \frac{\langle\Phi|H^e|\Phi\rangle}{\langle\Phi|\Phi\rangle} , \tag{3.9}$$

[Figure 3.1: Schematic representation of the expectation value of the Hamilton operator as a “function” of the vectors of the Hilbert space of $H^e$. The tick marks at the $\Phi$-axis label the wave functions of type $\Phi^{\rm Hartree}$. They definitely cannot reach the function $\Phi_0$.]

where the $\Phi$ are arbitrary functions of the $N$-particle Hilbert space, which can be differentiated twice and can be normalized. Let us now constrain the set of functions $\Phi$ further to achieve mathematical convenience. For example, we could allow only many-body functions that can be written as a product of single-particle functions

$$\Phi(\{\mathbf{r}_i\}) \approx \Phi^{\rm Hartree}(\{\mathbf{r}_i\}) = \varphi_{o_1}(\mathbf{r}_1)\, \varphi_{o_2}(\mathbf{r}_2) \dots \varphi_{o_N}(\mathbf{r}_N) , \tag{3.10}$$

where we clearly introduce an approximation. Functions that can be written as in Eq. (3.10) do not span the full Hilbert space of the functions $\Phi(\{\mathbf{r}_i\})$, and most probably we will not obtain $E^e_0$ exactly. However, a certain restriction on the set of the allowed functions is justifiable when we are not interested in the exact value of $E^e_0$, but accept an error of, say, 0.1 eV. As a word of caution, we note that the quality of such a variational estimate is often overrated and in general difficult to assess. Schematically, the variational principle can be described by Fig. 3.1. Because the Hartree ansatz [Eq. (3.10)] is certainly an approximation, we have

$$E^e_0 < \frac{\langle\Phi^{\rm Hartree}|H^e|\Phi^{\rm Hartree}\rangle}{\langle\Phi^{\rm Hartree}|\Phi^{\rm Hartree}\rangle} , \tag{3.11}$$

and $(\Phi_0, E^e_0)$ will not be reached exactly.
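The content of the variational principle, Eq. (3.9), and the effect of restricting the trial space can be illustrated with a finite-dimensional toy model. In the sketch below a random real symmetric matrix stands in for $H^e$; this matrix, its dimension, and the choice of a 10-dimensional subspace are purely illustrative assumptions and have nothing to do with a real electronic Hamiltonian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "Hamiltonian": a random real symmetric 50 x 50 matrix.
a = rng.normal(size=(50, 50))
h = 0.5 * (a + a.T)
e0 = np.linalg.eigvalsh(h)[0]          # exact lowest eigenvalue


def rayleigh_quotient(h, phi):
    """Expectation value <phi|h|phi> / <phi|phi> for a real trial vector."""
    return (phi @ h @ phi) / (phi @ phi)


# Arbitrary trial vectors never give an energy below e0.
trial_energies = [rayleigh_quotient(h, rng.normal(size=50)) for _ in range(1000)]

# Restricting the trial vectors to a subspace (here: the first 10 basis
# vectors) still gives an upper bound; its minimum is the lowest eigenvalue
# of the projected matrix.
e0_restricted = np.linalg.eigvalsh(h[:10, :10])[0]

print("exact e0          :", e0)
print("best random trial :", min(trial_energies))
print("restricted minimum:", e0_restricted)
```

The restricted minimum plays the role of the Hartree energy in Fig. 3.1: it is the best value attainable within the chosen subset, and it generally lies above the true ground-state energy.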

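Due to the normalization condition we have

$$\langle\Phi^{\rm Hartree}|\Phi^{\rm Hartree}\rangle = \int \dots \int |\Phi^{\rm Hartree}(\{\mathbf{r}_i\})|^2\, d^3r_1 \dots d^3r_N = 1 , \tag{3.12}$$

$$\langle\varphi_{o_i}|\varphi_{o_i}\rangle = \int |\varphi_{o_i}(\mathbf{r})|^2\, d^3r = 1 . \tag{3.13}$$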

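Orthogonality of the different $\varphi_{o_i}$ is not required explicitly, but we will find that it is fulfilled automatically. With the Hartree ansatz given in Eq. (3.10), the expectation value of the energy is found to be

$$\begin{aligned}
\langle\Phi^{\rm Hartree}|H^e|\Phi^{\rm Hartree}\rangle
&= \int \varphi^*_{o_1}(\mathbf{r}_1)\, \varphi^*_{o_2}(\mathbf{r}_2) \dots \varphi^*_{o_N}(\mathbf{r}_N) \left[ \sum_{k=1}^{N} -\frac{\hbar^2}{2m}\nabla^2_{\mathbf{r}_k} + v(\mathbf{r}_k) \right] \\
&\qquad \times\, \varphi_{o_1}(\mathbf{r}_1)\, \varphi_{o_2}(\mathbf{r}_2) \dots \varphi_{o_N}(\mathbf{r}_N)\, d^3r_1 \dots d^3r_N \\
&\quad + \frac{1}{2}\, \frac{e^2}{4\pi\varepsilon_0} \int \varphi^*_{o_1}(\mathbf{r}_1)\, \varphi^*_{o_2}(\mathbf{r}_2) \dots \varphi^*_{o_N}(\mathbf{r}_N) \sum_{\substack{k,k'=1\\ k\neq k'}}^{N,N} \frac{1}{|\mathbf{r}_k - \mathbf{r}_{k'}|} \\
&\qquad \times\, \varphi_{o_1}(\mathbf{r}_1)\, \varphi_{o_2}(\mathbf{r}_2) \dots \varphi_{o_N}(\mathbf{r}_N)\, d^3r_1 \dots d^3r_N \\
&= \sum_{k=1}^{N} \int \varphi^*_{o_k}(\mathbf{r}_k) \left[ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}_k) \right] \varphi_{o_k}(\mathbf{r}_k)\, d^3r_k \\
&\quad + \frac{1}{2}\, \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{k,k'=1\\ k\neq k'}}^{N,N} \iint \frac{\varphi^*_{o_k}(\mathbf{r}_k)\, \varphi^*_{o_{k'}}(\mathbf{r}_{k'})\, \varphi_{o_{k'}}(\mathbf{r}_{k'})\, \varphi_{o_k}(\mathbf{r}_k)}{|\mathbf{r}_k - \mathbf{r}_{k'}|}\, d^3r_k\, d^3r_{k'} .
\end{aligned} \tag{3.14}$$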

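Equation (3.14) can be understood as an energy functional:

$$\langle\Phi^{\rm Hartree}|H^e|\Phi^{\rm Hartree}\rangle \overset{!}{=} E^{\rm Hartree}\left[\varphi_{o_1}, \varphi_{o_2} \dots \varphi_{o_N}, \varphi^*_{o_1}, \varphi^*_{o_2} \dots \varphi^*_{o_N}\right] . \tag{3.15}$$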

Here $\varphi$ and $\varphi^*$ are considered as two independent functions. Alternatively, the real and the imaginary part of $\varphi$ could be considered as separate variables.

The ansatz used in Eq. (3.10) significantly restricts the possible eigenfunctions of $H^e$. Still, we will continue and determine “the best” single-particle functions from this set of functions, i.e., those single-particle functions that minimize $E^{\rm Hartree}[\varphi_{o_1} \dots \varphi^*_{o_N}]$. The hope is that the minimum of $\langle\Phi^{\rm Hartree}|H^e|\Phi^{\rm Hartree}\rangle$ will be rather close to the true ground-state energy $E^e_0$ (compare Fig. 3.1). Thus, we vary the expression in Eq. (3.14) with respect to the functions $\varphi^*_{o_i}(\mathbf{r})$ and $\varphi_{o_i}(\mathbf{r})$. The variation is not fully free, because only those functions should be considered that can be normalized to one. The constraint given in Eq. (3.13) can be taken into account in the variational problem by using the method of Lagrange multipliers. Then we obtain an equation to determine the best $\varphi_{o_i}(\mathbf{r})$:

$$Q[\varphi_{o_1}, \dots, \varphi_{o_N}, \varphi^*_{o_1}, \dots, \varphi^*_{o_N}] = E^{\rm Hartree}[\varphi_{o_1} \dots \varphi_{o_N}, \varphi^*_{o_1} \dots \varphi^*_{o_N}] + \sum_{k=1}^{N} \epsilon_{o_k} \left(1 - \langle\varphi_{o_k}|\varphi_{o_k}\rangle\right) \equiv \text{minimum} , \tag{3.16}$$

where the ǫok are the Lagrange-multipliers.

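Equation (3.16) can be rewritten as follows: We search for the minimum of the functional $Q$, which means that the functions $\varphi_{o_i}$, $\varphi^*_{o_i}$ have to fulfill the stationarity condition

$$\begin{aligned}
\delta Q &= Q\left[\varphi_{o_1}, \varphi_{o_2}, \dots, \varphi_{o_N}, \varphi^*_{o_1}, \varphi^*_{o_2}, \dots, \varphi^*_{o_i} + \delta\varphi^*_{o_i}, \dots, \varphi^*_{o_N}\right] \\
&\quad - Q\left[\varphi_{o_1}, \dots, \varphi_{o_N}, \varphi^*_{o_1}, \dots, \varphi^*_{o_i}, \dots, \varphi^*_{o_N}\right] = 0 ,
\end{aligned} \tag{3.17}$$

for an arbitrary variation $\delta\varphi^*_{o_i}$, or $\delta\varphi_{o_i}$, where $i = 1 \dots N$.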

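Therefore, if we vary $\varphi^*_{o_i}(\mathbf{r})$, with Eq. (3.16) we obtain

$$\left\langle \delta\varphi_{o_i} \left| -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) \right| \varphi_{o_i} \right\rangle + \sum_{\substack{k=1\\ k\neq i}}^{N} \frac{e^2}{4\pi\varepsilon_0} \left\langle \delta\varphi_{o_i} \varphi_{o_k} \left| \frac{1}{|\mathbf{r}_k - \mathbf{r}_i|} \right| \varphi_{o_k} \varphi_{o_i} \right\rangle = \epsilon_{o_i} \langle\delta\varphi_{o_i}|\varphi_{o_i}\rangle . \tag{3.18}$$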

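Because the constraint (normalization of the $\varphi_{o_i}$) is taken into account by the method of Lagrange multipliers, this equation is valid for arbitrary variations $\delta\varphi_{o_i}$. Thus, the equation used to determine the functions $\varphi_{o_i}(\mathbf{r})$ is

$$\left[ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) \right] \varphi_{o_i}(\mathbf{r}) + \sum_{\substack{k=1\\ k\neq i}}^{N} \frac{e^2}{4\pi\varepsilon_0} \left\langle \varphi_{o_k} \left| \frac{1}{|\mathbf{r}_k - \mathbf{r}|} \right| \varphi_{o_k} \right\rangle \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r}) . \tag{3.19}$$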

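We rewrite this equation and obtain

$$\left[ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) + v^{\rm SIC}_{o_i}(\mathbf{r}) \right] \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r}) , \tag{3.20}$$

where

$$v^{\rm Hartree}(\mathbf{r}) = \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' \tag{3.21}$$

with

$$n(\mathbf{r}) = \langle\Phi| \sum_{k=1}^{N} \delta(\mathbf{r} - \mathbf{r}_k) |\Phi\rangle = \sum_{k=1}^{N} |\varphi_{o_k}(\mathbf{r})|^2 , \tag{3.22}$$

and

$$v^{\rm SIC}_{o_i}(\mathbf{r}) = -\frac{e^2}{4\pi\varepsilon_0} \int \frac{|\varphi_{o_i}(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' . \tag{3.23}$$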

$n(\mathbf{r})$ is the particle density of all electrons and $-en(\mathbf{r})$ is the charge density of all electrons. The first equals sign in Eq. (3.22) holds in general, i.e., this is the quantum mechanical definition of the electron density. The second equals sign is valid only for independent particles and for the Hartree approximation.

The term $v^{\rm Hartree}(\mathbf{r})$, generally called the Hartree potential, can also be expressed in the differential form of electrodynamics (Poisson equation):

$$\nabla^2 v^{\rm Hartree}(\mathbf{r}) = -\frac{e^2}{\varepsilon_0}\, n(\mathbf{r}) . \tag{3.24}$$

The potential $v^{\rm SIC}_{o_i}(\mathbf{r})$ is the self-interaction correction to the Hartree potential. It takes into account that an electron in orbital $\varphi_{o_i}(\mathbf{r})$ shall not interact with itself, but only with the $(N-1)$ other electrons of the system. Equation (3.20) is now “sufficiently simple” to be solved using modern numerical methods. Often the potential $v^{\rm SIC}_{o_i}(\mathbf{r})$ is then neglected. However, in recent years it became clear that this additional approximation is typically not justified. If the functions $\varphi_{o_i}(\mathbf{r})$ were known, we could calculate the total energy from Eq. (3.14) or the electron density using Eq. (3.22). The latter can be compared to measurements (e.g. X-ray diffraction). Additionally, by inspecting the charge distribution of the electrons, the nature of the forces that stabilize the solid can be understood. One finds that several quantities obtained from the Hartree approximation agree well with experimental data. Further, we find that the negative values of the Lagrange parameters $\epsilon_{o_i}$, introduced due to the normalization, agree approximately with measured ionization energies. An exact discussion of the physical meaning of the Lagrange parameters will be given later (in the context of the Hartree-Fock theory).

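Obviously, the Hartree Eq. (3.20) is no ordinary single-particle equation. Formally, it can be written as a single-particle equation,

$$\left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \right] \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r}) . \tag{3.25}$$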

However, the “effective potential” itself depends on the solutions $\varphi_{o_i}(\mathbf{r})$. Therefore, Eq. (3.25) is an effective (but not a true) single-particle equation. This, for example, implies that the total energy is not equal to the sum of the $\epsilon_{o_i}$, which would be the case for non-interacting particles in the single-particle states $\varphi_{o_i}(\mathbf{r})$. Nevertheless, the total energy can be obtained from the $\varphi_{o_i}$ via Eq. (3.14).

How can an equation like Eq. (3.25) be solved, if the potential

$$v^{\rm eff}(\mathbf{r}) = v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) + v^{\rm SIC}_{o_i}(\mathbf{r}) \tag{3.26}$$

is initially unknown, because the single-particle wave functions (and also $n(\mathbf{r})$) are not known? For this purpose the so-called self-consistent field method (SCF) is applied. First, one starts with a reasonable guess, i.e., one estimates $n(\mathbf{r})$. Then the density is improved step by step until the correct result is obtained.

A first crude approximation for $n(\mathbf{r})$ in a solid is obtained by assuming that it can be written as a simple superposition of the electron densities of the individual atoms

$$n^{\rm start}(\mathbf{r}) = \sum_{I=1}^{M} n^{\rm Atom}_I(|\mathbf{r} - \mathbf{R}_I|) . \tag{3.27}$$

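This is correct for large interatomic distances, but it is a severe approximation when the electron densities of different atoms overlap. Nevertheless, Eq. (3.27) is a possible and not bad “zeroth approximation”. Once a “zeroth approximation” has been made, one proceeds as shown in Fig. 3.2. At the end of such a calculation the wave functions $\varphi_{o_k}(\mathbf{r})$ and the potential $v^{\rm eff}(\mathbf{r})$ are self-consistent, i.e., their differences in two subsequent iterations are arbitrarily small. Typically, between 5 and 20 iterations are required.

[Figure 3.2: Scheme of the self-consistent field method for the solution of the Hartree equation. Here, $v^{\rm SIC}$ has been neglected. Analogous diagrams exist for Hartree-Fock and density-functional theory. The flowchart consists of the following steps:
(1) zeroth approximation (“guess”): $n(\mathbf{r})$ or $v^{{\rm eff},0}(\mathbf{r})$;
(2) Schrödinger equation: $\left[-\frac{\hbar^2}{2m}\nabla^2 + v^{{\rm eff},i}(\mathbf{r})\right]\varphi_{o_k}(\mathbf{r}) = \epsilon_{o_k}\varphi_{o_k}(\mathbf{r})$;
(3) improved particle density of the electrons: $n(\mathbf{r}) = \sum_{k=1}^{N} |\varphi_{o_k}(\mathbf{r})|^2$;
(4) electrostatics: $v^{\rm Hartree}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int \frac{e^2 n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r'$ or $\nabla^2 v^{\rm Hartree}(\mathbf{r}) = -\frac{e^2}{\varepsilon_0}\, n(\mathbf{r})$;
(5) $v^{\rm eff,new}(\mathbf{r}) = v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r})$;
(6) comparison, if $v^{\rm eff,new}(\mathbf{r})$ and $v^{\rm eff,old}(\mathbf{r}) = v^{{\rm eff},i}(\mathbf{r})$ differ: if no, end; if yes, set $i \to i+1$, apply the mixing $v^{{\rm eff},i+1}(\mathbf{r}) = \alpha\, v^{{\rm eff},i}(\mathbf{r}) + (1-\alpha)\, v^{\rm eff,new}(\mathbf{r})$ with $1 > \alpha > 0.9$, and return to step (2).]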
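The loop of Fig. 3.2 can be sketched for a deliberately simple toy model. The example below is not a solid-state calculation: it assumes a one-dimensional grid, a harmonic external potential, a softened Coulomb kernel $1/\sqrt{(x-x')^2+1}$ in place of the true interaction, atomic units, doubly occupied orbitals, and neglect of $v^{\rm SIC}$. All names and parameter values are hypothetical. The mixing step follows the scheme of the figure, with the old potential weighted by $\alpha$.

```python
import numpy as np


def scf_hartree_1d(n_elec=2, n_grid=201, box=20.0, alpha=0.9,
                   tol=1e-6, max_iter=1000):
    """Toy Hartree SCF loop (even n_elec, each orbital doubly occupied)."""
    x = np.linspace(-box / 2, box / 2, n_grid)
    dx = x[1] - x[0]
    # Kinetic energy -1/2 d^2/dx^2 by second-order finite differences.
    lap = (np.diag(np.full(n_grid, -2.0))
           + np.diag(np.ones(n_grid - 1), 1)
           + np.diag(np.ones(n_grid - 1), -1)) / dx**2
    kinetic = -0.5 * lap
    v_ext = 0.5 * x**2                                   # model external potential
    kernel = 1.0 / np.sqrt((x[:, None] - x[None, :])**2 + 1.0)
    n_occ = n_elec // 2

    v_eff = v_ext.copy()                                 # zeroth approximation
    for iteration in range(1, max_iter + 1):
        # Schroedinger equation with the current effective potential.
        eps, phi = np.linalg.eigh(kinetic + np.diag(v_eff))
        phi = phi / np.sqrt(dx)                          # sum |phi|^2 dx = 1
        # Improved particle density of the electrons.
        n = 2.0 * np.sum(phi[:, :n_occ]**2, axis=1)
        # Electrostatics: Hartree potential from the density.
        v_new = v_ext + (kernel @ n) * dx
        # Comparison of new and old effective potential.
        if np.max(np.abs(v_new - v_eff)) < tol:
            return x, n, v_eff, eps[:n_occ], iteration
        # Mixing: keep a fraction alpha of the old potential.
        v_eff = alpha * v_eff + (1.0 - alpha) * v_new
    raise RuntimeError("SCF loop did not converge")


x, n, v_eff, eps, iterations = scf_hartree_1d()
print("converged after", iterations, "iterations; lowest level:", eps[0])
```

With such a strongly damped mixing the loop is robust but slow; the number of iterations needed in this toy model is therefore not representative of production calculations, which use more elaborate mixing schemes.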

3.3 Hartree-Fock Approximation

The Hartree wave function [Eq. (3.10)] has a crucial disadvantage: It does not satisfy the Pauli principle. Due to the variational principle, this does not necessarily have a drastic impact on the calculated energy, but it would certainly be better to remove or reduce this disadvantage. In a many-electron system there is an important interaction between the particles, which in the single-particle picture is formulated in the following way: A single-particle wave function can be occupied by only one electron. More generally, in the many-body picture, the Pauli principle states that the $N$-particle wave function of fermions has to be antisymmetric with respect to the interchange of all coordinates (spatial and spin) of two particles.

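In order to fulfill the Pauli principle, Fock suggested to replace the wave function used in Hartree theory by a suitable linear combination, a so-called Slater determinant:

$$\Phi^{\rm HF}(\{\mathbf{r}_i\sigma_i\}) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\varphi_{o_1s_1}(\mathbf{r}_1\sigma_1) & \dots & \varphi_{o_Ns_N}(\mathbf{r}_1\sigma_1) \\
\varphi_{o_1s_1}(\mathbf{r}_2\sigma_2) & \dots & \varphi_{o_Ns_N}(\mathbf{r}_2\sigma_2) \\
\vdots & & \vdots \\
\varphi_{o_1s_1}(\mathbf{r}_N\sigma_N) & \dots & \varphi_{o_Ns_N}(\mathbf{r}_N\sigma_N)
\end{vmatrix} . \tag{3.28}$$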

The factor $1/\sqrt{N!}$ ensures the normalization of the many-body wave function. The many-body wave function as given in Eq. (3.28) changes sign on interchange of the coordinates (spatial and spin) of two particles. For a determinant this is obvious, because the interchange of two particles corresponds to the interchange of two rows of the determinant. The Pauli exclusion principle is thus fulfilled.
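The antisymmetry of Eq. (3.28) is easy to verify numerically. The sketch below builds a three-particle Slater determinant from three hypothetical one-dimensional orbitals, where a single number $x$ stands in for the combined coordinate $(\mathbf{r}, \sigma)$. The orbitals are not normalized, which does not matter for the sign change under particle exchange; all names are assumptions of this example.

```python
import math
import numpy as np


def slater_determinant(orbitals, coords):
    """Evaluate a Slater determinant: rows are particles, columns are orbitals."""
    n = len(orbitals)
    matrix = np.array([[orb(x) for orb in orbitals] for x in coords])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))


# Three linearly independent toy orbitals (illustrative, not normalized).
orbitals = [
    lambda x: np.exp(-x**2),
    lambda x: x * np.exp(-x**2),
    lambda x: (2.0 * x**2 - 1.0) * np.exp(-x**2),
]

psi = slater_determinant(orbitals, [0.3, -0.7, 1.1])
psi_swapped = slater_determinant(orbitals, [-0.7, 0.3, 1.1])   # particles 1 and 2 exchanged
psi_coincident = slater_determinant(orbitals, [0.3, 0.3, 1.1])  # two particles at the same coordinate

print(psi, psi_swapped, psi_coincident)
```

Exchanging two particles swaps two rows and flips the sign, and two particles with identical coordinates give two identical rows, so the wave function vanishes there.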

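For two-electron systems (e.g. H$^-$ or He) the wave function reads

$$\Phi^{\rm HF} = \frac{1}{\sqrt{1 \times 2}} \left( \varphi_1(\mathbf{r}_1\sigma_1)\, \varphi_2(\mathbf{r}_2\sigma_2) - \varphi_2(\mathbf{r}_1\sigma_1)\, \varphi_1(\mathbf{r}_2\sigma_2) \right) . \tag{3.29}$$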

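The normalization is also fulfilled:

$$\begin{aligned}
\langle\Phi^{\rm HF}|\Phi^{\rm HF}\rangle = \frac{1}{2!} \sum_{\sigma_1,\sigma_2} \iint \Big[
&\ \varphi^*_1(\mathbf{r}_1\sigma_1)\, \varphi^*_2(\mathbf{r}_2\sigma_2)\, \varphi_1(\mathbf{r}_1\sigma_1)\, \varphi_2(\mathbf{r}_2\sigma_2) \\
&- \varphi^*_1(\mathbf{r}_1\sigma_1)\, \varphi^*_2(\mathbf{r}_2\sigma_2)\, \varphi_1(\mathbf{r}_2\sigma_2)\, \varphi_2(\mathbf{r}_1\sigma_1) \\
&- \varphi^*_1(\mathbf{r}_2\sigma_2)\, \varphi^*_2(\mathbf{r}_1\sigma_1)\, \varphi_1(\mathbf{r}_1\sigma_1)\, \varphi_2(\mathbf{r}_2\sigma_2) \\
&+ \varphi^*_1(\mathbf{r}_2\sigma_2)\, \varphi^*_2(\mathbf{r}_1\sigma_1)\, \varphi_1(\mathbf{r}_2\sigma_2)\, \varphi_2(\mathbf{r}_1\sigma_1)
\Big]\, d^3r_1\, d^3r_2
= \frac{1}{2}(1 + 0 + 0 + 1) = 1 ,
\end{aligned} \tag{3.30}$$

when the single-particle functions are normalized and orthogonal,

$$\langle\varphi_{o_is_i}|\varphi_{o_js_j}\rangle = \delta_{o_i,o_j}\, \delta_{s_i,s_j} \quad\text{with}\quad \varphi_{o_is_i}(\mathbf{r}, \sigma) = \varphi_{o_is_i}(\mathbf{r})\, \chi_{s_i}(\sigma) . \tag{3.31}$$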

Please note the index $s_i$ in the spatial part of the wave function, $\varphi_{o_is_i}(\mathbf{r})$. In Hartree theory, the spatial functions were assumed to be independent of the spin states so that the spin quantum numbers $s_i$ could be omitted. In Hartree-Fock theory, this is not the case anymore.

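Now we proceed in the same way as in the Hartree theory, i.e., we will make use of the variational principle $\langle\Phi^{\rm HF}|H^e|\Phi^{\rm HF}\rangle = E[\Phi^{\rm HF}] \ge E^e_0$. It should be noted that the ansatz for $\Phi^{\rm HF}$ [Eq. (3.28)] is still not general. It allows for an infinite number of possibilities, because the $\varphi_{o_is_i}(\mathbf{r}, \sigma)$ are arbitrary functions (which can be normalized), but the set of all vectors that can be written like Eq. (3.28) still only forms a subset of the Hilbert space of the full $N$-particle problem. Still, this is a “more complete” subset than the one used in the Hartree theory.² Consequently, again we will not necessarily obtain the ground-state energy exactly, but an approximation. However, this approximation will be better than the approximation obtained in Hartree theory, because now the offered many-body wave functions fulfill one additional physical constraint that was violated by the Hartree approximation: the Pauli principle of the electrons. Similarly to the Hartree theory we now obtain the expectation value of the energy by using Eq. (3.28):

$$\begin{aligned}
E^{\rm HF}[\Phi^{\rm HF}] &= \langle\Phi^{\rm HF}|H^e|\Phi^{\rm HF}\rangle \\
&= \sum_{i=1}^{N} \sum_{\sigma} \int \varphi^*_{o_is_i}(\mathbf{r}, \sigma) \left[ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) \right] \varphi_{o_is_i}(\mathbf{r}, \sigma)\, d^3r \\
&\quad + \frac{1}{2}\, \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i,j\\ i\neq j}}^{N,N} \sum_{\sigma,\sigma'} \iint \frac{\varphi^*_{o_is_i}(\mathbf{r}', \sigma')\, \varphi^*_{o_js_j}(\mathbf{r}, \sigma)\, \varphi_{o_is_i}(\mathbf{r}', \sigma')\, \varphi_{o_js_j}(\mathbf{r}, \sigma)}{|\mathbf{r} - \mathbf{r}'|}\, d^3r\, d^3r' \\
&\quad - \frac{1}{2}\, \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i,j\\ i\neq j}}^{N,N} \sum_{\sigma,\sigma'} \iint \frac{\varphi^*_{o_js_j}(\mathbf{r}', \sigma')\, \varphi^*_{o_is_i}(\mathbf{r}, \sigma)\, \varphi_{o_is_i}(\mathbf{r}', \sigma')\, \varphi_{o_js_j}(\mathbf{r}, \sigma)}{|\mathbf{r} - \mathbf{r}'|}\, d^3r\, d^3r' .
\end{aligned} \tag{3.32}$$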

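The sums over the spins can be performed analytically, because $\varphi_{o_i s_i}(\mathbf{r},\sigma) = \varphi_{o_i s_i}(\mathbf{r})\,\chi_{s_i}(\sigma)$ and

\[
\sum_{\sigma} \chi^*_{s_i}(\sigma)\,\chi_{s_j}(\sigma) = \delta_{s_i,s_j} . \tag{3.33}
\]

Accordingly, we obtain

\[
\sum_{\sigma} \chi^*_{s_i}(\sigma)\,\chi_{s_i}(\sigma) = 1 \tag{3.34}
\]

in the first line of Eq. (3.32). In the second line, we obtain

\[
\sum_{\sigma,\sigma'} \chi^*_{s_i}(\sigma')\,\chi_{s_i}(\sigma')\,\chi^*_{s_j}(\sigma)\,\chi_{s_j}(\sigma) = 1 , \tag{3.35}
\]

and in the third line we obtain

\[
\sum_{\sigma,\sigma'} \chi^*_{s_j}(\sigma')\,\chi^*_{s_i}(\sigma)\,\chi_{s_i}(\sigma')\,\chi_{s_j}(\sigma) = \sum_{\sigma} \delta_{s_i,s_j}\,\chi^*_{s_i}(\sigma)\,\chi_{s_j}(\sigma) = \delta_{s_i,s_j} . \tag{3.36}
\]

$^2$ In order to further complete the Hilbert space, linear combinations of Slater determinants are also included in more advanced quantum-chemistry approaches.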

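This means that in Eq. (3.32) all sums $\sum_{\sigma,\sigma'}$ can be removed. Only in the last line does this sum have to be replaced by a Kronecker symbol. A comparison of Eq. (3.32) with the corresponding equation of Hartree theory (Eq. (3.14)) shows that we have now obtained an additional term

\[
E^{x}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] = -\frac{1}{2} \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i,j \\ i \neq j}}^{N,N} \delta_{s_i,s_j} \iint \frac{\varphi^*_{o_i s_i}(\mathbf{r})\,\varphi^*_{o_j s_j}(\mathbf{r}')\,\varphi_{o_i s_i}(\mathbf{r}')\,\varphi_{o_j s_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' . \tag{3.37}
\]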

This term is called the exchange interaction and enters because of the consideration of the Pauli principle. Thus, the letter "x" (an abbreviation for the word "exchange") is used to tag the respective energy.

This term has a negative sign. Compared to Hartree theory it therefore lowers the energy, which reflects that Hartree-Fock theory is a better approximation than Hartree theory. In Eq. (3.32) the condition $i \neq j$ in both sums can be omitted, because for $i = j$ the last two terms cancel each other. By summing over the spins we obtain

\[
\begin{aligned}
E^{\rm HF}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] &= T_s[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] + E^{e-{\rm Nuc}}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] \\
&\quad + E^{\rm Hartree}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] + E^{x}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] .
\end{aligned}
\tag{3.38}
\]

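Here, the functional for the kinetic energy of non-interacting electrons in the single-particle states $\varphi_{o_i s_i}(\mathbf{r})$ is

\[
T_s[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] = \sum_{i=1}^{N} \int \varphi^*_{o_i s_i}(\mathbf{r}) \left\{ -\frac{\hbar^2}{2m}\nabla^2 \right\} \varphi_{o_i s_i}(\mathbf{r}) \, d^3r . \tag{3.39}
\]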

The other quantities are:

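\[
E^{e-{\rm Nuc}}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] = \int v(\mathbf{r})\,n(\mathbf{r}) \, d^3r , \tag{3.40}
\]

\[
E^{\rm Hartree}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] = \frac{1}{2} \frac{e^2}{4\pi\varepsilon_0} \iint \frac{n(\mathbf{r})\,n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' , \tag{3.41}
\]

and

\[
E^{x}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] = -\frac{1}{2} \frac{e^2}{4\pi\varepsilon_0} \sum_{i,j}^{N,N} \delta_{s_i,s_j} \iint \frac{\varphi^*_{o_i s_i}(\mathbf{r})\,\varphi^*_{o_j s_j}(\mathbf{r}')\,\varphi_{o_i s_i}(\mathbf{r}')\,\varphi_{o_j s_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' . \tag{3.42}
\]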

The electron density is

\[
n(\mathbf{r}) = \sum_{i=1}^{N} |\varphi_{o_i s_i}(\mathbf{r})|^2 . \tag{3.43}
\]

Please note that, in comparison to the Hartree theory, no self-interaction correction appears in Eq. (3.38), since the term with $i = j$, which would give rise to self-interaction, cancels out exactly in Eq. (3.38). For this reason, the term in Eq. (3.42) is written as $E^x$ without a tilde (and not as $\tilde{E}^x = E^x - E^{\rm SIC}$). Thus, $E^{\rm Hartree}$ contains the self-interaction of the electrons, while $E^x$ takes into account the correction of this self-interaction and additionally makes sure that the Pauli principle is fulfilled.

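The "best" functions $\varphi_{o_i s_i}(\mathbf{r})$, i.e., the functions yielding the lowest energy, are obtained by varying Eq. (3.38) with respect to $\varphi^*_{o_i s_i}(\mathbf{r})$ or $\varphi_{o_i s_i}(\mathbf{r})$, respectively. This again has to be done taking into account the normalization and the orthogonality of the $\varphi_{o_i s_i}(\mathbf{r})$. Both have been used already in the construction of Eqs. (3.32) and (3.38). We have

\[
\delta \left\{ E^{\rm HF}[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] + \sum_{i,j}^{N,N} \lambda_{o_i s_i, o_j s_j} \left[ \delta_{o_i s_i, o_j s_j} - \langle \varphi_{o_i s_i} | \varphi_{o_j s_j} \rangle \right] \right\} = 0 . \tag{3.44}
\]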

In the derivation of the Hartree theory only the normalization has been considered as a constraint. This is sufficient here as well, but mathematically it is easier to consider both orthogonality and normalization as constraints (cf. Slater "The Self-consistent Field ...", Vol. 2, Chapter 17). We denote the functional in the curly brackets in Eq. (3.44) as $Q[\varphi^*_{o_1 s_1}, \ldots, \varphi^*_{o_N s_N}, \varphi_{o_1 s_1}, \ldots, \varphi_{o_N s_N}]$. If we take the functional derivative$^3$ of the functional $Q$ and take into account that $\varphi^*_{o_i s_i}$ and $\varphi_{o_i s_i}$ have to be treated as independent variables (which are given by their real and imaginary parts, i.e., by two functions), for the variation of $\varphi^*_{o_i s_i}$,

\[
\frac{\delta}{\delta \varphi^*_{o_k s_k}(\mathbf{r})} Q[\varphi^*_{o_i s_i}, \varphi_{o_i s_i}] = 0 \quad \text{for } k = 1 \ldots N , \tag{3.45}
\]

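we obtain a set of equations,

\[
\left\{ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) \right\} \varphi_{o_k s_k}(\mathbf{r}) - \frac{e^2}{4\pi\varepsilon_0} \sum_{j=1}^{N} \delta_{s_k,s_j} \int \frac{\varphi^*_{o_j s_j}(\mathbf{r}')\,\varphi_{o_k s_k}(\mathbf{r}')\,\varphi_{o_j s_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' = \sum_{j=1}^{N} \lambda_{o_k s_k, o_j s_j}\,\varphi_{o_j s_j}(\mathbf{r}) , \tag{3.46}
\]

with

\[
v^{\rm Hartree}(\mathbf{r}) = \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' . \tag{3.47}
\]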

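At first glance, this equation looks unfamiliar. Due to the exchange term and due to the right-hand side it does not look like a Schrödinger equation. However, it can easily be shown that it can be written in a familiar form. To show this we investigate the matrix $\lambda_{o_j s_j, o_i s_i}$. By multiplying Eq. (3.46) from the left by $\varphi^*_{o_l s_l}(\mathbf{r})$ and integrating over $\mathbf{r}$, we obtain

\[
A_{lk} + B_{lk} = \lambda_{o_l s_l, o_k s_k} , \tag{3.48}
\]

where

\[
A_{lk} = \int \varphi^*_{o_l s_l}(\mathbf{r}) \left\{ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) \right\} \varphi_{o_k s_k}(\mathbf{r}) \, d^3r \tag{3.49}
\]

and

\[
B_{lk} = -\frac{e^2}{4\pi\varepsilon_0} \sum_{j=1}^{N} \delta_{s_k,s_j} \iint \frac{\varphi^*_{o_l s_l}(\mathbf{r})\,\varphi^*_{o_j s_j}(\mathbf{r}')\,\varphi_{o_k s_k}(\mathbf{r}')\,\varphi_{o_j s_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' . \tag{3.50}
\]

$^3$ Note: $\frac{\delta}{\delta f(x)} \int f(x')\,g(x') \, dx' = g(x)$.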

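The matrix $A$ defined this way is obviously Hermitian, i.e., we have $A_{lk} = A^*_{kl}$. If we take the complex conjugate of $B$, we obtain

\[
B^*_{kl} = -\frac{e^2}{4\pi\varepsilon_0} \sum_{j=1}^{N} \delta_{s_j,s_k} \iint \frac{\varphi_{o_k s_k}(\mathbf{r})\,\varphi_{o_j s_j}(\mathbf{r}')\,\varphi^*_{o_l s_l}(\mathbf{r}')\,\varphi^*_{o_j s_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' = B_{lk} . \tag{3.51}
\]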

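Therefore, $B$ is Hermitian as well, and thus the hermiticity of $\lambda_{o_l s_l, o_k s_k}$ follows. In turn, there must be a unitary transformation

\[
\sum_{o_l} U_{o_k,o_l}\,\varphi_{o_l}(\mathbf{r}) = \tilde{\varphi}_{o_k}(\mathbf{r}) \tag{3.52}
\]

and

\[
\sum_{o_m,o_n} U^{\dagger}_{o_k,o_m}\,\lambda_{o_m s_l, o_n s_k}\,U_{o_n,o_l} = \epsilon_{o_k s_k}\,\delta_{o_l s_l, o_k s_k} , \tag{3.53}
\]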

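so that the matrix $\lambda$ is transformed into a real, diagonal matrix. We use the unitary transformation in Eq. (3.46) to determine $\tilde{\varphi}_{o_k s_k}(\mathbf{r})$:

\[
\left\{ -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) + v^{x}_{k}(\mathbf{r}) \right\} \tilde{\varphi}_{o_k s_k}(\mathbf{r}) = \epsilon_{o_k s_k}\,\tilde{\varphi}_{o_k s_k}(\mathbf{r}) \tag{3.54}
\]

with

\[
v^{x}_{k}(\mathbf{r})\,\tilde{\varphi}_{o_k s_k}(\mathbf{r}) = -\frac{e^2}{4\pi\varepsilon_0} \sum_{i=1}^{N} \delta_{s_i,s_k} \int \frac{\tilde{\varphi}^*_{o_i s_i}(\mathbf{r}')\,\tilde{\varphi}_{o_k s_k}(\mathbf{r}')\,\tilde{\varphi}_{o_i s_i}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' . \tag{3.55}
\]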

Equation (3.54) is called the Hartree-Fock (effective single-particle) equation. Once the single-particle wave functions $\varphi_{o_i s_i}$ have been obtained using Eq. (3.54), the total energy can be calculated in the next step using Eq. (3.38).

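The exchange term $v^{x}_{k}(\mathbf{r})$ does not look like a normal potential, because it is an integral operator. In 1951, Slater pointed out that it can be written in a familiar form. For this purpose, Eq. (3.55) is multiplied by

\[
\frac{\varphi_{o_k s_k}(\mathbf{r})}{\varphi_{o_k s_k}(\mathbf{r})} = 1 . \tag{3.56}
\]

By using the definition

\[
n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}') = \sum_{i=1}^{N} \delta_{s_i,s_k} \frac{\varphi^*_{o_i s_i}(\mathbf{r}')\,\varphi_{o_k s_k}(\mathbf{r}')\,\varphi_{o_i s_i}(\mathbf{r})}{\varphi_{o_k s_k}(\mathbf{r})} \tag{3.57}
\]

for the exchange particle density, the exchange potential can be written in the form

\[
v^{x}_{k}(\mathbf{r}) = -\frac{e^2}{4\pi\varepsilon_0} \int \frac{n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' . \tag{3.58}
\]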

Now, the potential $v^{x}_{k}$ is almost a usual multiplicative operator, but it is different for each particle. The sum $v^{\rm Hartree} + v^{x}_{k}$ describes the interaction of electron $k$ with the other electrons. The interaction $v^{x}_{k}$ is present only for those electrons that have the same spin, i.e., $s_i = s_k$. In principle, the expression in Eq. (3.57) is not defined at the points where $\varphi_{o_k s_k}(\mathbf{r})$ is zero, but this is not a problem, because $n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')$ can be extended continuously. From now on, when talking about the Hartree-Fock equation, we mean Eq. (3.54), and we will omit the tilde on the single-particle orbitals $\varphi_{o_k s_k}(\mathbf{r})$.

3.4 Exchange Interaction

In this section we will investigate the physical meaning of the Hartree and the Hartree-Fock equations. In particular, the meaning of the exchange energy will be explained, and the problems of Hartree-Fock theory will be discussed, i.e., which physical many-body effects are missing in this theory. We start by recalling the effective single-particle Hamilton operator of the Hartree and of the Hartree-Fock equations:

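\[
h = -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) + v^{\rm Hartree}(\mathbf{r}) +
\begin{cases}
\displaystyle -\frac{e^2}{4\pi\varepsilon_0} \int \frac{n^{\rm H}_{k}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' = v^{\rm SIC}_{k}(\mathbf{r}) & \text{Hartree,} \\[2ex]
\displaystyle -\frac{e^2}{4\pi\varepsilon_0} \int \frac{n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' = v^{x}_{k}(\mathbf{r}) & \text{Hartree-Fock,}
\end{cases}
\tag{3.59}
\]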

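with

\[
n^{\rm H}_{k}(\mathbf{r}') = |\varphi_{o_k}(\mathbf{r}')|^2 \tag{3.60}
\]

and $n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')$ from Eq. (3.57). Nowadays, $v^{\rm SIC}_{k}$ is often neglected, and the ansatz $v^{\rm eff} = v + v^{\rm Hartree}$ is termed the Hartree approximation. We will not introduce this additional approximation here.

We have

\[
\int n^{\rm H}_{k}(\mathbf{r}') \, d^3r' = 1 , \tag{3.61}
\]

\[
\int n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}') \, d^3r' = 1 . \tag{3.62}
\]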

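[Figure 3.3: plot of $\left( n(\mathbf{r}') - n^{\rm H}_{k}(\mathbf{r}') \right) V_g / N$ (vertical axis, 0 to 1) versus $|\mathbf{r}-\mathbf{r}'| / r_s$ (horizontal axis, 0 to 2); the Hartree curve is constant at $1 - \frac{1}{N}$.]

Figure 3.3: Distribution of the $(N-1)$ electrons seen by particle $\varphi_k(\mathbf{r})$, which is located at $\mathbf{r} = 0$ (in the Hartree approximation and for jellium).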

Both densities $n^{\rm H}_{k}(\mathbf{r}')$ and $n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')$ represent one electron, i.e., in both cases the self-interaction included in $v^{\rm Hartree}$ is removed. Equation (3.62) is an important sum rule, which also holds for the exact theory. However, $n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')$ contains more than just the correction of the self-interaction. We will now exemplify this for a simple system. Because we want to address the electron-electron interaction (and not the electron-ion interaction), we investigate a system in which the potential of the lattice components (i.e., of the ions) varies only weakly. Thus we set $v(\mathbf{r}) = v \equiv$ constant. We want to use this to demonstrate the meaning of the exchange interaction. For such a jellium system the Hartree-Fock equations are solved by plane waves. This will be the result of our discussion (but we note in passing that there are also more complex solutions: "spin density waves"). Therefore, the single-particle wave functions are

\[
\varphi_{o_i s_i}(\mathbf{r}) = \frac{1}{\sqrt{V_g}}\, e^{i \mathbf{k}_i \mathbf{r}} . \tag{3.63}
\]

With these wave functions we obtain:

\[
n^{\rm H}_{k}(\mathbf{r}') = \frac{1}{V_g} = \text{constant} , \tag{3.64}
\]

i.e., the density $n^{\rm H}_{k}$, and thus the electron $k$, is smeared out uniformly over the whole volume. The particle density interacting with the Hartree particle $k$ is

\[
n(\mathbf{r}') - n^{\rm H}_{k}(\mathbf{r}') = \frac{N}{V_g} - \frac{1}{V_g} . \tag{3.65}
\]

If particle $k$ is located at $\mathbf{r} = 0$, the distribution of the other electrons is as shown in Fig. 3.3. Strictly speaking, the line in Fig. 3.3 is at $(1 - \frac{1}{N})$. As $N$ is truly large, this cannot be distinguished from 1. Therefore, $v^{\rm SIC}_{k}(\mathbf{r})$ is negligible for extended wave functions.

The corresponding particle density interacting with an electron in single-particle state $\varphi_{o_k s_k}$ in the Hartree-Fock theory is

\[
n(\mathbf{r}') - n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}') . \tag{3.66}
\]

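If particle $k$ is at position $\mathbf{r}$, then the $(N-1)$ other particles have a distribution that is given by Eq. (3.66). As in the Hartree approximation, Eq. (3.66) also contains exactly $N-1$ particles. What does this density look like?

\[
\begin{aligned}
n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}') &= \sum_{i=1}^{N} \delta_{s_i,s_k} \frac{\varphi^*_{o_i s_i}(\mathbf{r}')\,\varphi_{o_k s_k}(\mathbf{r}')\,\varphi_{o_i s_i}(\mathbf{r})}{\varphi_{o_k s_k}(\mathbf{r})} && (3.67) \\
&= \frac{1}{V_g} \sum_{\mathbf{k}_i}^{N/2} e^{-i\mathbf{k}_i \mathbf{r}'}\, e^{i\mathbf{k}_k \mathbf{r}'}\, e^{i\mathbf{k}_i \mathbf{r}}\, e^{-i\mathbf{k}_k \mathbf{r}} && (3.68) \\
&= \frac{1}{V_g} \sum_{\mathbf{k}_i}^{N/2} e^{i(\mathbf{k}_i - \mathbf{k}_k)(\mathbf{r}-\mathbf{r}')} . && (3.69)
\end{aligned}
\]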

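In order to simplify the representation, we have assumed here that the system is nonmagnetic and that each spatial state $\varphi_{o_i}(\mathbf{r})$ is therefore occupied by two electrons. Because $n^{\rm HF}_{k}$ is different for each state, one obtains a better impression of the meaning of Eq. (3.69) if one averages over all electrons by writing

\[
\begin{aligned}
n^{\rm HF}(\mathbf{r},\mathbf{r}') &\overset{!}{=} \sum_{k=1}^{N} \frac{\varphi^*_{o_k}(\mathbf{r})\,\varphi_{o_k}(\mathbf{r})\,n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')}{n(\mathbf{r})} && (3.70) \\
&= 2\,\frac{V_g}{N}\,\frac{1}{V_g}\,\frac{1}{V_g} \sum_{\mathbf{k}_k}^{N/2} e^{-i\mathbf{k}_k(\mathbf{r}-\mathbf{r}')} \sum_{\mathbf{k}_i}^{N/2} e^{i\mathbf{k}_i(\mathbf{r}-\mathbf{r}')} . && (3.71)
\end{aligned}
\]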

The sums over the vectors $\mathbf{k}_i$ and $\mathbf{k}_k$ can be evaluated easily if one changes from a discrete to a continuous representation (cf. the discussion following Eq. (2.18)):

\[
\sum_{\mathbf{k}_i}^{N/2} \;\rightarrow\; \int_0^{k_{\rm F}} \frac{V_g}{(2\pi)^3} \, d^3k . \tag{3.72}
\]

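We obtain

\[
\frac{V_g}{(2\pi)^3} \int_0^{k_{\rm F}} e^{i\mathbf{k}(\mathbf{r}-\mathbf{r}')} \, d^3k = \frac{3}{2} N\, \frac{\sin(k_{\rm F} r) - (k_{\rm F} r)\cos(k_{\rm F} r)}{(k_{\rm F} r)^3} , \tag{3.73}
\]

with $r = |\mathbf{r}-\mathbf{r}'|$. The sign is fixed by the limit $r \rightarrow 0$, where the left-hand side counts the $N/2$ occupied spatial states.

It follows:

\[
n^{\rm HF}(r) = \frac{9}{2}\,\frac{N}{V_g} \left( \frac{\sin(k_{\rm F} r) - (k_{\rm F} r)\cos(k_{\rm F} r)}{(k_{\rm F} r)^3} \right)^2 , \tag{3.74}
\]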

$n^{\rm HF}$ is therefore spherically symmetric. The particle density of the other electrons felt by an averaged Hartree-Fock particle is shown in Fig. 3.4. The figure demonstrates that the concentration of electrons of like spin is lowered in the neighborhood of the investigated electron. This is formulated in the following way: An electron is surrounded by its exchange hole. The quantity $n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')$ is called the particle density (with respect to $\mathbf{r}'$) of the exchange hole of an electron at position $\mathbf{r}$.
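The averaged exchange hole of Eq. (3.74) can be evaluated numerically to check two of its properties: at $r = 0$ it equals half the total density (all like-spin electrons are displaced), and it integrates to one electron, in line with the sum rule of Eq. (3.62). The sketch below assumes $n = N/V_g = k_{\rm F}^3/(3\pi^2)$ and works in the dimensionless variable $x = k_{\rm F} r$; the grid and cutoff are arbitrary choices.

```python
import numpy as np

def exchange_hole(x):
    """n^HF(r) / n as a function of x = k_F r, cf. Eq. (3.74)."""
    x = np.asarray(x, dtype=float)
    g = np.full_like(x, 1.0 / 3.0)   # limit of (sin x - x cos x) / x^3 for x -> 0
    big = x > 1e-3
    g[big] = (np.sin(x[big]) - x[big] * np.cos(x[big])) / x[big] ** 3
    return 4.5 * g ** 2

# Sum rule: the hole should contain one electron, using n / k_F^3 = 1 / (3 pi^2).
x = np.linspace(0.0, 2000.0, 2_000_001)
integrand = 4.0 * np.pi * x ** 2 * exchange_hole(x) / (3.0 * np.pi ** 2)
dx = x[1] - x[0]
charge = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx  # trapezoidal rule

hole_at_origin = exchange_hole(np.array([0.0]))[0]
print(hole_at_origin, charge)
```

The integrand decays only like $\cos^2(x)/x^2$, so the finite cutoff leaves a small deficit in `charge`; a larger cutoff brings it closer to 1.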


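[Figure 3.4: plot of $\left( n(\mathbf{r}') - n^{\rm H}_{k}(\mathbf{r}') \right) V_g / N$ (Hartree) and $\left( n(\mathbf{r}') - n^{\rm HF}(\mathbf{r},\mathbf{r}') \right) V_g / N$ (Hartree-Fock) on the vertical axis (0 to 1) versus $|\mathbf{r}-\mathbf{r}'| / r_s$ on the horizontal axis (0 to 2).]

Figure 3.4: Distribution (averaged) of the $(N-1)$ other electrons with respect to an electron that is located at position $\mathbf{r} = 0$ (in the Hartree-Fock approximation and for jellium).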

The difference between the Hartree and the Hartree-Fock approximation (cf. Fig. 3.4, or Eqs. (3.64) and (3.74)) is that $n^{\rm H}_{k}(\mathbf{r}')$ only depends on $\mathbf{r}'$, i.e., it is the same for each position $\mathbf{r}$ of the observed particle. But $n^{\rm HF}_{k}(\mathbf{r},\mathbf{r}')$ and $n^{\rm HF}(\mathbf{r},\mathbf{r}')$ depend on the position of the particle, i.e., on the position of the particle for which we are actually solving the Hartree-Fock equation using $n^{\rm HF}_{k}$. Furthermore, $n^{\rm HF}_{k}$ fulfills the Pauli principle: If the investigated electron $k$ is at position $\mathbf{r}$, then all other electrons of like spin are displaced from position $\mathbf{r}$. Due to the Pauli principle, the electrons of like spin do not move independently of each other, but their motion is correlated, because an electron displaces the other electrons in its neighborhood. Although we have solved the time-independent Schrödinger equation, this dynamic Pauli correlation is taken into account. This is because the Pauli interaction is not explicitly included in the Hamilton operator, but is taken into account via the constraint of antisymmetry of the many-body wave function. Another correlation should appear (in an exact theory) due to the Coulomb repulsion for all electrons, i.e., there must also be a displacement of electrons of unlike spin. But this Coulomb repulsion is included only in an averaged way in Hartree as well as in Hartree-Fock theory, so that the correlation resulting from the Coulomb repulsion is missing in both theories. Hartree-Fock therefore contains a part of the correlation, the so-called Pauli correlation. Nevertheless, it is commonly agreed that the term correlation is used for all that is missing in Hartree-Fock theory. This usage is not very fortunate, but it is at least well defined and has become generally accepted.$^4$

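$^4$ In the field of density-functional theory the definition is modified.

For the exchange potential (cf. Eq. (3.58)) of a jellium system we obtain (with plane waves as eigenfunctions) for the dependence of $v^{x}$ on the state $\mathbf{k}$ the result shown in Fig. 3.5.

[Figure 3.5: plot of $-v^{x}_{k}\, \frac{\pi}{2 k_{\rm F}}\, \frac{4\pi\varepsilon_0}{e^2} = F\!\left(\frac{k}{k_{\rm F}}\right)$ (vertical axis, 0 to 1) versus $k = |\mathbf{k}|$ (horizontal axis, 0 to $k_{\rm F}$).]

Figure 3.5: The exchange potential as a function of state $k$ for a jellium system.

The calculation goes as follows:

\[
\begin{aligned}
v^{x}_{k}(\mathbf{r}) &= -\frac{e^2}{4\pi\varepsilon_0} \int \frac{n^{\rm HF}_{k}(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' \\
&= -\frac{e^2}{4\pi\varepsilon_0}\, \frac{1}{(2\pi)^3} \int_0^{k_{\rm F}} \int \frac{e^{i(\mathbf{k}'-\mathbf{k})\tilde{\mathbf{r}}}}{|\tilde{\mathbf{r}}|} \, d^3\tilde{r} \, d^3k' ,
\end{aligned}
\tag{3.75}
\]

where $\tilde{\mathbf{r}} = \mathbf{r}-\mathbf{r}'$. We have

\[
\frac{1}{|\mathbf{r}-\mathbf{r}'|} = 4\pi\, \frac{1}{(2\pi)^3} \int \frac{e^{i\mathbf{q}(\mathbf{r}-\mathbf{r}')}}{q^2} \, d^3q . \tag{3.76}
\]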

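Thus, we obtain

\[
v^{x}_{k} = -\frac{e^2}{4\pi\varepsilon_0}\, \frac{4\pi}{(2\pi)^3} \int_0^{k_{\rm F}} \frac{1}{|\mathbf{k}-\mathbf{k}'|^2} \, d^3k' , \tag{3.77}
\]

where we have used

\[
\int e^{i(\mathbf{q}-\mathbf{k}+\mathbf{k}')\tilde{\mathbf{r}}} \, d^3\tilde{r} = (2\pi)^3\, \delta(\mathbf{q}+\mathbf{k}'-\mathbf{k}) . \tag{3.78}
\]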

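After integration, one obtains

\[
v^{x}_{k} = -\frac{e^2}{4\pi\varepsilon_0}\, \frac{2 k_{\rm F}}{\pi}\, F\!\left( \frac{k}{k_{\rm F}} \right) , \tag{3.79}
\]

where

\[
F(x) = \frac{1}{2} + \frac{1-x^2}{4x} \ln \left| \frac{1+x}{1-x} \right| . \tag{3.80}
\]

The function $F\!\left(\frac{k}{k_{\rm F}}\right)$ is shown in Fig. 3.5.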
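Equations (3.79) and (3.80) are simple to evaluate. The following sketch uses Hartree atomic units ($e^2/4\pi\varepsilon_0 = 1$, an assumption made here for brevity) and fills in the limits $F(0) = 1$ and $F(1) = 1/2$ explicitly, because the closed form is a 0/0 or $0 \cdot \infty$ expression there. The function names are illustrative.

```python
import numpy as np

def F(x):
    """F(x) of Eq. (3.80); the limits x -> 0 and x -> 1 are inserted by hand."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    small = np.abs(x) < 1e-6
    edge = np.isclose(x, 1.0)
    regular = ~(small | edge)
    out[small] = 1.0   # F(0) = 1
    out[edge] = 0.5    # F(1) = 1/2: the factor (1 - x^2) cancels the logarithm
    xr = x[regular]
    out[regular] = 0.5 + (1.0 - xr**2) / (4.0 * xr) * np.log(np.abs((1.0 + xr) / (1.0 - xr)))
    return out

def v_x(k, k_F):
    """Exchange potential of Eq. (3.79) in hartree, with k and k_F in bohr^-1."""
    return -(2.0 * k_F / np.pi) * F(np.asarray(k, dtype=float) / k_F)

print(F([0.0, 0.5, 1.0, 1.5]))
```

The values decrease monotonically from 1 at $k = 0$ through 1/2 at $k = k_{\rm F}$, which is the shape described for Fig. 3.5.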

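For the further discussion of Hartree-Fock theory we have a look at the eigenvalues of the Hartree-Fock equation. In the jellium approximation it follows:

\[
\langle \varphi_{k} | h | \varphi_{k} \rangle = \epsilon(k) = \frac{\hbar^2 k^2}{2m} + \langle \varphi_{k} | v^{x}_{k} | \varphi_{k} \rangle . \tag{3.81}
\]

Here, the zero point of the energy was set to the average electrostatic potential:

\[
v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' = 0 . \tag{3.82}
\]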

[Figure 3.6: plot of $\epsilon(k) \big/ \frac{\hbar^2}{2m} k_{\rm F}^2$ (vertical axis, -2 to 2) versus $k/k_{\rm F}$ (horizontal axis, 0.5 to 1.5) for free electrons and for interacting Hartree-Fock particles; the band widths of the occupied states of non-interacting electrons and of Hartree-Fock particles are marked.]

Figure 3.6: Dispersion of the single-particle energies in jellium for free electrons (or Hartree particles) and for Hartree-Fock particles (Eq. (3.83)), shown here for the density parameter $r_s = 4$ bohr.

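The expectation value of the exchange potential is given by Eq. (3.77):

\[
\langle \varphi_{k} | v^{x}_{k} | \varphi_{k} \rangle = v^{x}_{k} \langle \varphi_{k} | \varphi_{k} \rangle = -\frac{e^2}{4\pi\varepsilon_0}\, \frac{2 k_{\rm F}}{\pi}\, F\!\left( \frac{k}{k_{\rm F}} \right) , \tag{3.83}
\]

and for Eq. (3.81) we obtain:

\[
\epsilon(k) = \frac{\hbar^2 k^2}{2m} - \frac{e^2}{4\pi\varepsilon_0}\, \frac{2 k_{\rm F}}{\pi}\, F\!\left( \frac{k}{k_{\rm F}} \right) \tag{3.84}
\]

with $F(x)$ from Eq. (3.80).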

This derivation shows that plane waves are in fact eigenfunctions of the Hartree-Fock Hamilton operator, i.e., they diagonalize the Hartree-Fock operator. However, we no longer have the dispersion relation of free electrons; there is an additional term, which depends on $k$ and which gives rise to a lowering of the single-particle energies. If we now compare the relation (Eq. (3.83)) of Hartree-Fock theory with the one of free electrons ($v^{\rm eff}(\mathbf{r}) =$ constant) or with Hartree theory, we obtain Fig. 3.6. In Fig. 3.6 it can be seen:

1. For very small $k$ the dispersion of Hartree-Fock particles is parabolic. The curvature of this parabola is different from the one of free electrons. If $k$ is very small we obtain

\[
\epsilon(k) = \frac{\hbar^2 k^2}{2m^*} + C \quad \text{for } |\mathbf{k}| \rightarrow 0 . \tag{3.85}
\]

For the effective mass of the Hartree-Fock particles we have:

\[
\frac{m^*}{m} = \frac{1}{1 + 0.22\,(r_s/a_{\rm B})} \quad \text{for } |\mathbf{k}| \rightarrow 0 , \tag{3.86}
\]

i.e., for $k \rightarrow 0$, $m^*$ is smaller than $m$ ($r_s$ is typically between 2 and 3 bohr).

2. The band width of the occupied states is significantly larger for Hartree-Fock particles than for Hartree particles, by a factor of 2.33 in Fig. 3.6.

3. For $k = k_{\rm F}$ there is an obviously unphysical result: The derivative

\[
\frac{\partial \epsilon(k)}{\partial k} \rightarrow \infty \quad \text{for } k \rightarrow k_{\rm F} \tag{3.87}
\]

diverges logarithmically. This has consequences for metallic properties and for the heat capacity. Both are described mainly by the electrons close to the Fermi energy. The reason for the singularity lies in the long-ranged $\frac{1}{|\mathbf{r}-\mathbf{r}'|}$ behavior of the electron-electron interaction. If the interaction were screened, e.g.

\[
\frac{e^{-\lambda |\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|} , \tag{3.88}
\]

the singularity would not exist (cf. Ashcroft-Mermin, p. 337).

4. The potential $v^{x}_{k}$ is typically in the range of 5-15 eV. Because many phenomena in solid-state physics are determined by energy differences of the order of 0.1-0.5 eV, an exact treatment of the exchange interaction is crucial.
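The numbers quoted in items 1 and 2 follow directly from Eq. (3.84). The sketch below works in Hartree atomic units and assumes the usual jellium relation $k_{\rm F} = (9\pi/4)^{1/3}/r_s$, which follows from $k_{\rm F} = (3\pi^2 n)^{1/3}$ with $n = 3/(4\pi r_s^3)$. It uses $F(0) = 1$, $F(1) = 1/2$ and the small-$x$ expansion $F(x) \approx 1 - x^2/3$ of Eq. (3.80); the function names are illustrative.

```python
import numpy as np

def k_fermi(r_s):
    """Fermi wave vector in bohr^-1 for a density parameter r_s in bohr."""
    return (9.0 * np.pi / 4.0) ** (1.0 / 3.0) / r_s

def bandwidth_ratio(r_s):
    """Occupied band width of Eq. (3.84) divided by the free-electron value k_F^2 / 2."""
    k_F = k_fermi(r_s)
    free = 0.5 * k_F**2
    # epsilon(k_F) - epsilon(0) = k_F^2/2 + (2 k_F / pi) * (F(0) - F(1))
    hartree_fock = free + (2.0 * k_F / np.pi) * (1.0 - 0.5)
    return hartree_fock / free

def mass_ratio(r_s):
    """m*/m for k -> 0, from F(x) ~ 1 - x^2/3 inserted into Eq. (3.84)."""
    return 1.0 / (1.0 + 4.0 / (3.0 * np.pi * k_fermi(r_s)))

print(bandwidth_ratio(4.0))          # ratio for the density of Fig. 3.6
print(1.0 / mass_ratio(1.0) - 1.0)   # coefficient of r_s in Eq. (3.86)
```

For $r_s = 4$ bohr the ratio comes out close to the factor 2.33 of item 2, and the coefficient of $r_s$ reproduces the value 0.22 of Eq. (3.86).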

3.5 Koopmans’ Theorem

In the discussion of the Hartree equation we pointed out that the Lagrange parameters $\epsilon_{o_i}$ seem to correspond (at least approximately) to ionization energies. This statement will now be investigated in more detail for the Hartree-Fock equation.

The energy required to remove an electron from state k in an N electron system is

\[
I_k = E^{N-1}_{k} - E^{N} = \left\langle \Phi^{N-1}_{k} \left| H^{e,N-1} \right| \Phi^{N-1}_{k} \right\rangle - \left\langle \Phi^{N} \left| H^{e,N} \right| \Phi^{N} \right\rangle . \tag{3.89}
\]

Here we assume that the removed electron is excited to the vacuum level, where it is a free particle with zero kinetic energy. $\Phi^{N}$ is the ground-state wave function of the $N$-electron system and $\Phi^{N-1}_{k}$ the wave function of the $(N-1)$-electron system, with the state $k$ being unoccupied. If state $k$ is the highest occupied state of the $N$-electron system (i.e., state $N$), then $I_k$ is the ionization energy. In order to investigate Eq. (3.89) we make the following assumptions:

1) Removing an electron has no influence on the lattice geometry, or more precisely, we assume that the electron removal happens very fast (e.g. by optical excitation), so that the lattice components move only after the ionization (Franck-Condon principle).

2) The many-body wave functions shall be single Slater determinants (i.e., the discussion refers to the Hartree-Fock approximation).

3) Removing the $k$-th electron does not affect the single-particle wave functions of the other electrons. This assumption is reasonable as long as the number of electrons is large and the density $|\varphi_{o_k}(\mathbf{r})|^2$ of the electron being removed is rather extended. Then $v^{\rm Hartree}(\mathbf{r})$ and $n^{\rm HF}_{j}(\mathbf{r},\mathbf{r}')$ remain basically unchanged, and consequently all $\varphi_{o_i s_i}$ also remain essentially unchanged.

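In particular assumptions 2) and 3) are rather drastic. These two assumptions mean that the wave function $\Phi^{N-1}$ is derived from the wave function $\Phi^{N}$ by deleting line $k$ and column $k$ in the Slater determinant $\Phi^{N}$. Then we obtain, using Eqs. (3.89), (3.32), and (3.54),

\[
\begin{aligned}
I_k &= \Big\langle \Phi^{N-1}_{k} \Big| \sum_{\substack{i=1 \\ i \neq k}}^{N} \left\{ -\frac{\hbar^2}{2m}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i) \right\} \Big| \Phi^{N-1}_{k} \Big\rangle
 + \frac{1}{2} \frac{e^2}{4\pi\varepsilon_0} \Big\langle \Phi^{N-1}_{k} \Big| \sum_{\substack{i,j=1 \\ i \neq j \\ i,j \neq k}}^{N,N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \Big| \Phi^{N-1}_{k} \Big\rangle \\
&\quad - \Big\langle \Phi^{N} \Big| \sum_{i=1}^{N} \left\{ -\frac{\hbar^2}{2m}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i) \right\} \Big| \Phi^{N} \Big\rangle
 - \frac{1}{2} \frac{e^2}{4\pi\varepsilon_0} \Big\langle \Phi^{N} \Big| \sum_{\substack{i,j=1 \\ i \neq j}}^{N,N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \Big| \Phi^{N} \Big\rangle \\
&\approx -\Big\langle \varphi_{o_k s_k} \Big| -\frac{\hbar^2}{2m}\nabla^2 + v(\mathbf{r}) \Big| \varphi_{o_k s_k} \Big\rangle \\
&\quad - \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i=1 \\ i \neq k}}^{N} \iint \frac{\varphi^*_{o_k s_k}(\mathbf{r}')\,\varphi^*_{o_i s_i}(\mathbf{r})\,\varphi_{o_i s_i}(\mathbf{r})\,\varphi_{o_k s_k}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' \\
&\quad + \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i=1 \\ i \neq k}}^{N} \delta_{s_k,s_i} \iint \frac{\varphi^*_{o_k s_k}(\mathbf{r})\,\varphi^*_{o_i s_i}(\mathbf{r}')\,\varphi_{o_i s_i}(\mathbf{r})\,\varphi_{o_k s_k}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' \\
&= -\langle \varphi_{o_k s_k} | h^{\rm HF} | \varphi_{o_k s_k} \rangle \\
&= -\epsilon_{o_k s_k} .
\end{aligned}
\tag{3.90}
\]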

Thus, we have shown what we have assumed before: The meaning of the quantities $\epsilon_{o_k}$ or $\epsilon_{o_k s_k}$, initially introduced as Lagrange parameters, is (approximately) the negative value of the energy that is required to remove an electron from orbital $k$. It follows that excitation energies, i.e., the energies of transitions from an occupied state $i$ to an unoccupied state $j$, are also (approximately) determined by the $\epsilon_{o_k}$:

\[
\Delta E_{i \rightarrow j} \approx \epsilon_j - \epsilon_i . \tag{3.91}
\]

These statements (Eq. (3.90) and Eq. (3.91)) are called Koopmans' theorem. They are valid approximately for the valence electrons of atoms and for extended states of solids. However, when an electron is excited from a localized state in a solid, one does not obtain just a single discrete line, but (due to many-body effects like electronic relaxation and excitations) several peaks. Then Koopmans' theorem (approximately) gives the center of mass of the function $I_k(\epsilon)$.

3.6 The Xα Method

(Hartree-Fock-Slater Method)

Initially, the Hartree-Fock method was applied to atoms without major difficulties. For solids, however, it was realized that it is very elaborate$^5$ due to the exchange term. Because of these difficulties, Slater suggested in 1951 (Phys. Rev. 81, 385 (1951); 82, 538 (1951)) to simplify this term. This simplification, although introduced ad hoc, was very successful. In fact, it gave better results (compared to experimental data) than the clean Hartree-Fock approach. Later, i.e., by density-functional theory, it was realized that the treatment introduced by Slater does in fact correspond to an important physical theorem and is not an approximation but even an improvement of Hartree-Fock theory.

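The difficulty for the practical use of Hartree-Fock theory is that the exchange potential is state-dependent. For jellium the expectation value of the exchange potential $v^{x}_{k}$ is given by Eq. (3.79) and it reads

\[
v^{x}_{k} = -\frac{e^2}{4\pi\varepsilon_0}\, \frac{2 k_{\rm F}}{\pi}\, F\!\left( \frac{k}{k_{\rm F}} \right) \quad \text{with} \quad k_{\rm F} = \sqrt[3]{3\pi^2 n} . \tag{3.92}
\]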

If one averages over all occupied states, the function $F(k/k_{\rm F})$ becomes $F = 0.75$. But one could also claim that only the states at $k_{\rm F}$ are important: Electrons can react to external perturbations only by transitions to excited states, and the energetically lowest excitations happen at $\epsilon_{\rm F}$ (or $k_{\rm F}$). At $k_{\rm F}$, $F$ has the value 0.5. Exactly this value ($F = 0.5$) is also obtained if one averages over all states in the expression for the total energy (Eq. (3.42)) before the functional derivative $\delta E^{\rm HF}/\delta\varphi$ is calculated. This is because a variation after averaging mainly refers to the Fermi edge. One obtains:

\[
v^{x}_{F=\frac{3}{4}}(\mathbf{r}) = -\frac{3}{2\pi}\, \frac{e^2}{4\pi\varepsilon_0}\, \sqrt[3]{3\pi^2 n(\mathbf{r})} \tag{3.93}
\]

when averaging over all occupied states of the Hartree-Fock equation ($F = 0.75$), and

\[
v^{x}_{F=\frac{1}{2}}(\mathbf{r}) = -\frac{1}{\pi}\, \frac{e^2}{4\pi\varepsilon_0}\, \sqrt[3]{3\pi^2 n(\mathbf{r})} \tag{3.94}
\]

when averaging in the expression of the total energy or when taking into account only the states at $k_{\rm F}$ of the Hartree-Fock equation ($F = 0.5$).

$^5$ In recent years there have been several better Hartree-Fock calculations for solids: Stollhoff (1987), Gygi-Baldereschi (1987), Louie (1988). In particular Dovesi et al. have developed a Hartree-Fock program ("CRYSTAL") for solids. The next step is to take into account "exact exchange" in the context of density-functional theory.

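[Figure 3.7: plot of $-v^{x\alpha}[n]$ in eV (vertical axis, 0 to 20) versus $n(\mathbf{r}) \times 10^2$ in bohr$^{-3}$ (horizontal axis, 0 to 10), with curves for $\alpha = \frac{2}{3}$ (or $F = \frac{1}{2}$), $\alpha = \frac{3}{4}$ (or $F = \frac{9}{16}$), and $\alpha = 1$ (or $F = \frac{3}{4}$); the density range of many metals is marked.]

Figure 3.7: The X$\alpha$-potential as a function of the density $n$ (atomic units). The density, as usual, depends on the position; the value $n$ is different for each $\mathbf{r}$. For this value of $n$, the figure gives $v^{x\alpha}(\mathbf{r})$.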

Now, in $n(\mathbf{r})$, we have again written the $\mathbf{r}$ dependence, to indicate that the derivation is also valid for slowly varying densities. Because Hartree-Fock theory is not exact anyway (Coulomb correlation is missing), Slater suggested replacing the impractical exchange potential by the following quantity:

\[
v^{x\alpha}(\mathbf{r}) = -\alpha\, \frac{3}{2\pi}\, \frac{e^2}{4\pi\varepsilon_0}\, \sqrt[3]{3\pi^2 n(\mathbf{r})} , \tag{3.95}
\]

where $\alpha$ is a parameter of value

\[
\frac{2}{3} < \alpha < 1 . \tag{3.96}
\]

The higher value (i.e., $\alpha = 1$) corresponds to the exchange potential originally derived by Slater ($F = 0.75$). Today, from the point of view of density-functional theory, the potential for $\alpha = 2/3$ (for $F = 0.5$) would be called the exchange potential. In the seventies $\alpha$ was determined for atomic calculations and then transferred to solids and molecules. If $\alpha$ is chosen to obtain good agreement with experimental total energies, one finds that $\alpha$ should have a value of $\approx \frac{2}{3}$. Slater believed that the obtained $\alpha \approx \frac{2}{3}$ exchange potential represents an improvement, because the Coulomb correlation is taken into account semi-empirically. Today, this view is largely confirmed by density-functional theory [cf. Kohn, Sham, Phys. Rev. 140, A 1193 (1965), Gaspar, Acta Phys. Acad. Sci. Hung. 3, 263 (1954)]. From a puristic theoretician's point of view, Slater's empirical treatment was not satisfying, because the derivation was not really justified. Still, this treatment had impressive success. The X$\alpha$-potential is illustrated in Fig. 3.7. It gives an impression of the dependence on the density and of the strength of the exchange interaction. Most metals have a density of $r_s = 2$ bohr, or $n = 0.03$ bohr$^{-3}$. Here, the exchange potential has a value of about 7 eV, and consequently one recognizes that for an accurate theoretical treatment it will in general be very important to take into account the exchange potential precisely.
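Equation (3.95) can be evaluated directly for a given density. The following is a minimal sketch in Hartree atomic units ($e^2/4\pi\varepsilon_0 = 1$); the conversion factor to eV and the function name are assumptions of this example, not part of the lecture.

```python
import numpy as np

HARTREE_EV = 27.211386  # 1 hartree in eV

def v_x_alpha(n, alpha=2.0 / 3.0):
    """X-alpha potential of Eq. (3.95) in hartree for a density n in bohr^-3."""
    n = np.asarray(n, dtype=float)
    return -alpha * 3.0 / (2.0 * np.pi) * np.cbrt(3.0 * np.pi**2 * n)

n_metal = 0.03  # bohr^-3, the density quoted above for r_s = 2 bohr
for alpha in (2.0 / 3.0, 0.75, 1.0):
    print(alpha, v_x_alpha(n_metal, alpha) * HARTREE_EV)
```

Since $v^{x\alpha}$ is linear in $\alpha$ and scales with $n^{1/3}$, the three curves of Fig. 3.7 differ only by a constant factor.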

3.7 Thomas-Fermi Theory and the Concept of Screening

A main interest in the solution of the many-body Schrödinger equation of solids is to learn something about the distribution of the electrons in the many-body ground state $\Phi$, i.e., about the electron density:

\[
n(\mathbf{r}) = \left\langle \Phi \left| \sum_{i=1}^{N} \delta(\mathbf{r}-\mathbf{r}_i) \right| \Phi \right\rangle . \tag{3.97}
\]

When thinking of the approaches of Hartree, Hartree-Fock, and Hartree-Fock-Slater, one could ask if one really has to calculate the many-body wave function $\Phi$, i.e., $\sim 10^{23}$ single-particle wave functions $\varphi_{o_i}(\mathbf{r})$, in order to calculate

\[
n(\mathbf{r}) = \sum_{i=1}^{N} |\varphi_{o_i s_i}(\mathbf{r})|^2 , \tag{3.98}
\]

or if $n(\mathbf{r})$ can be calculated directly. We start from the effective single-particle equation

\[
\left\{ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \right\} \varphi_i(\mathbf{r}) = \epsilon_i\,\varphi_i(\mathbf{r}) , \tag{3.99}
\]

where the effective potential is taken, e.g., from the Hartree or Hartree-Fock-Slater theory. We ask ourselves if it is possible to calculate the density directly from the potential $v(\mathbf{r})$, without calculating the wave function and without solving the Schrödinger equation. In this context, we note that $n(\mathbf{r})$ is a functional of the external potential $v(\mathbf{r})$: Since $v(\mathbf{r})$ defines the many-body Hamilton operator and the latter determines everything, $n(\mathbf{r})$ is also completely determined by $v(\mathbf{r})$. It is unclear, however, if the particle density $n(\mathbf{r})$ can be expressed explicitly as a functional of the potential of the ions (lattice components). To continue, we start with the jellium model: We have seen that by neglecting explicit electron-electron interactions and by introducing a constant effective potential representing the positive ionic background, the single-particle wave functions are plane waves, and the expectation values of the single-particle Hamiltonian are simply given by

\[
\epsilon_i = \frac{\hbar^2}{2m} k_i^2 + v^{\rm eff} . \tag{3.100}
\]

For the highest occupied state, i.e., for the weakest bound electron, we have:

\[
\frac{\hbar^2}{2m} k_{\rm F}^2 + v^{\rm eff} = \epsilon_N = \mu . \tag{3.101}
\]

Here we anticipate that $\epsilon_N$ equals the electron chemical potential $\mu$, which will be shown below. For a non-jellium system, as long as the potential $v^{\rm eff}(\mathbf{r})$ varies slowly in $\mathbf{r}$, Eq. (3.101) is still valid, but only approximately. Actually, one should replace $v^{\rm eff}(\mathbf{r})$ by $\langle \varphi_{o_N} | v^{\rm eff} | \varphi_{o_N} \rangle$. Because this is not done here, the following derivation should be considered as a semi-classical approximation.

For the jellium system (and for slowly varying densities), we can define a (position-dependent) Fermi-k-vector (cf. Eq. (2.31)),

\[
k_{\rm F}(\mathbf{r}) = \sqrt[3]{3\pi^2 n(\mathbf{r})} , \tag{3.102}
\]

and obtain the following equation for the electron of the highest energy:

\[
\frac{\hbar^2}{2m} \left( 3\pi^2 n(\mathbf{r}) \right)^{2/3} + v^{\rm eff}(\mathbf{r}) = \mu . \tag{3.103}
\]

Since $\mu$ is the energy of the weakest bound electron, it has to be spatially constant. The first term on the left side of Eq. (3.103) is the kinetic energy, the second the potential energy. This equation enables us to calculate the density $n(\mathbf{r})$ for a given $\mu$ from $v(\mathbf{r})$ without solving a Schrödinger equation. The $\sim 10^{23}$ particles do not appear explicitly as individual particles. Equation (3.103) tells us that there is a direct relation between $v^{\rm eff}(\mathbf{r})$ and $n(\mathbf{r})$, at least for jellium or close-to-jellium systems.

Equation (3.103) is called the Thomas-Fermi equation. For $v^{\rm eff} = v + v^{\rm Hartree}$ it is equivalent to the Hartree equation, and for $v^{\rm eff} = v + v^{\rm Hartree} + v^{x\alpha}$ it is equivalent to the Hartree-Fock-Slater equation.

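The Thomas-Fermi equation (3.103) can also be derived from a variational principle. This shall be done here for the Hartree theory, where we have:

\[
\begin{aligned}
E^{e} &= \langle \Phi | H^{e} | \Phi \rangle \\
&= \sum_{i=1}^{N} \left\langle \varphi_{o_i} \left| -\frac{\hbar^2}{2m}\nabla^2 \right| \varphi_{o_i} \right\rangle + \int v(\mathbf{r})\,n(\mathbf{r}) \, d^3r + \frac{e^2}{4\pi\varepsilon_0}\, \frac{1}{2} \iint \frac{n(\mathbf{r})\,n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r \, d^3r' .
\end{aligned}
\tag{3.104}
\]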

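For jellium we find, per unit volume $V_g$:

\[
\begin{aligned}
\frac{1}{V_g} \sum_{i=1}^{N} \left\langle \varphi_{o_i} \left| -\frac{\hbar^2}{2m}\nabla^2 \right| \varphi_{o_i} \right\rangle
&= \frac{1}{V_g}\, \frac{V_g}{(2\pi)^3}\, 2 \int_0^{k_{\rm F}} \frac{\hbar^2}{2m} k^2 \, d^3k \\
&= 4\pi\, \frac{2}{(2\pi)^3}\, \frac{\hbar^2}{2m} \int_0^{k_{\rm F}} k^4 \, dk = \frac{1}{5\pi^2}\, \frac{\hbar^2}{2m}\, k_{\rm F}^5 \\
&= \frac{1}{5\pi^2}\, \frac{\hbar^2}{2m}\, (3\pi^2 n)^{5/3} = T^{\rm Jellium}_{s}[n] .
\end{aligned}
\tag{3.105}
\]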

This is the functional of the kinetic energy of non-interacting particles for a jellium system, i.e., if $n(\mathbf{r})$ is constant, or nearly constant. Then for jellium we obtain

\[
\frac{\delta T^{\rm Jellium}_{s}[n]}{\delta n(\mathbf{r})} = \frac{\hbar^2}{2m} \left( 3\pi^2 n(\mathbf{r}) \right)^{2/3} . \tag{3.106}
\]
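The step from Eq. (3.105) to Eq. (3.106) can be checked with a finite difference. The sketch below treats the final expression of Eq. (3.105) as a kinetic energy per unit volume (an assumption made here so that the functional derivative reduces to an ordinary derivative with respect to $n$) and uses atomic units with $\hbar = m = 1$; the function names are illustrative.

```python
import numpy as np

def t_jellium(n):
    """Kinetic energy density from Eq. (3.105), atomic units (hbar = m = 1)."""
    return (1.0 / (5.0 * np.pi**2)) * 0.5 * (3.0 * np.pi**2 * n) ** (5.0 / 3.0)

def dt_dn(n):
    """Right-hand side of Eq. (3.106), atomic units."""
    return 0.5 * (3.0 * np.pi**2 * n) ** (2.0 / 3.0)

n0, h = 0.03, 1e-6  # density in bohr^-3 and finite-difference step
numeric = (t_jellium(n0 + h) - t_jellium(n0 - h)) / (2.0 * h)
print(numeric, dt_dn(n0))
```

Both numbers agree to the accuracy of the central difference, which confirms that the factor $\frac{1}{5\pi^2} \cdot \frac{5}{3} \cdot 3\pi^2 = 1$ drops out as stated.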


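For jellium and probably also for slowly and weakly varying densities $n(\mathbf{r})$, the expectation value $\langle \Phi | H^{e} | \Phi \rangle$ can be written as a functional of the density (at least in the Hartree approximation). Because the total energy of the ground state is $E^{e}_{0} = \mathrm{Min}_{\Phi^{N}} \langle \Phi^{N} | H^{e} | \Phi^{N} \rangle$, we write

\[
E^{e}_{0} = \mathrm{Min}_{n(\mathbf{r})}\, E^{e}[n] . \tag{3.107}
\]

With the assumption that the total number of particles remains constant, the minimization is written as

\[
\frac{\delta \left\{ E^{e}[n] - \mu \left( \int n(\mathbf{r}) \, d^3r - N \right) \right\}}{\delta n(\mathbf{r})} = 0 . \tag{3.108}
\]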

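Here, $\mu$ is the Lagrange parameter taking care of the constraint that the total number of electrons is $N$. Apparently, Eq. (3.108) is a restatement of the Thomas-Fermi equation

\[
\mu = \frac{\hbar^2}{2m} \left( 3\pi^2 n(\mathbf{r}) \right)^{2/3} + v^{\rm eff}(\mathbf{r}) , \tag{3.109}
\]

with

\[
v^{\rm eff}(\mathbf{r}) = v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d^3r' . \tag{3.110}
\]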

We did not prove that the variational principle (Eq. (3.108)) is really valid, but we have "simply" assumed its validity. Further, it should be mentioned that in addition to the conservation of the particle number other constraints would actually be important as well, namely that only densities $n(\mathbf{r})$ that are physically meaningful must be considered, e.g., $n(\mathbf{r})$ must be real and always positive.

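Now we want to investigate the meaning of Eqs. (3.103) and (3.109). The Thomas-Fermi equation is valid for arbitrary changes in $n(\mathbf{r})$, because the conservation of the particle number is taken into account by the method of Lagrange parameters. Therefore, the equation also holds for

\[
n(\mathbf{r}) \longrightarrow n(\mathbf{r}) + \frac{\Delta N}{V_g} . \tag{3.111}
\]

By using this in Eq. (3.108), we find

\[
\frac{E\!\left[ n + \frac{\Delta N}{V_g} \right] - E[n]}{\Delta N} = \mu \cdot \frac{\int \left( \frac{\Delta N}{V_g} \right) d^3r}{\Delta N} = \mu , \tag{3.112}
\]

\[
\frac{dE^{e}}{dN} = \mu . \tag{3.113}
\]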

Consequently, $\mu$ is the chemical potential, i.e., the energy required to change the particle number.

The Concept of Screening


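Equation (3.103) shows that $n(\mathbf{r})$ is a functional of $v^{\rm eff}(\mathbf{r})$:

\[
\begin{aligned}
n(\mathbf{r}) &= F_1[v^{\rm eff}(\mathbf{r}); \mu] && (3.114) \\
&= \frac{1}{3\pi^2} \left[ \frac{2m}{\hbar^2} \left( \mu - v^{\rm eff}(\mathbf{r}) \right) \right]^{3/2} . && (3.115)
\end{aligned}
\]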

Now we introduce a small perturbation

\[
v^{\rm eff}(\mathbf{r}) \longrightarrow v^{\rm eff}(\mathbf{r}) + \Delta v^{\rm eff}(\mathbf{r}) . \tag{3.116}
\]

Here, we assume that $\mu$ remains unchanged, i.e., the energy that is required to remove the weakest bound electron shall stay the same as it was without perturbation, because the perturbation is small. This will hold, e.g., when $\Delta v^{\rm eff}$ is a spatially localized perturbation of the potential in a macroscopic system, e.g. a defect atom in a semiconductor. Now we ask how this affects the particle density:

\[
n(\mathbf{r}) \longrightarrow \tilde{n}(\mathbf{r}) = n(\mathbf{r}) + \Delta n(\mathbf{r}) . \tag{3.117}
\]

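For the perturbed system, we have the Thomas-Fermi equation for $\tilde{n}$,

\[
\mu = \frac{\hbar^2}{2m} \left[ 3\pi^2 \tilde{n}(\mathbf{r}) \right]^{2/3} + v^{\rm eff}(\mathbf{r}) + \Delta v^{\rm eff}(\mathbf{r}) , \tag{3.118}
\]

and the comparison with Eq. (3.115) yields

\[
\tilde{n}(\mathbf{r}) = F_1[v^{\rm eff} + \Delta v^{\rm eff}(\mathbf{r}); \mu] = F_1[v^{\rm eff}(\mathbf{r}); \mu - \Delta v^{\rm eff}(\mathbf{r})] . \tag{3.119}
\]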

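Now we denote α = µ − Δveff(r) and expand Eq. (3.119) in a Taylor series in the vicinity of α = µ. This yields

\[ \tilde n(\mathbf{r}) = \left. F_1[v^{\mathrm{eff}}(\mathbf{r});\alpha]\right|_{\alpha=\mu} - \left.\frac{\partial F_1[v^{\mathrm{eff}}(\mathbf{r});\alpha]}{\partial\alpha}\right|_{\alpha=\mu} \cdot \Delta v^{\mathrm{eff}}(\mathbf{r}) + O\!\left([\Delta v^{\mathrm{eff}}]^2\right) . \tag{3.120} \]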

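For the change in the particle density Δn, it follows that

\[ \Delta n(\mathbf{r}) = - \left.\frac{\partial F_1[v^{\mathrm{eff}}(\mathbf{r});\alpha]}{\partial\alpha}\right|_{\alpha=\mu} \cdot \Delta v^{\mathrm{eff}}(\mathbf{r}) + O\!\left([\Delta v^{\mathrm{eff}}]^2\right) . \tag{3.121} \]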
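The quality of the linearization in Eq. (3.121) can be checked numerically. The following is a minimal sketch, assuming Hartree atomic units (ħ = m = 1) and illustrative values µ = 0.5 hartree and veff = 0; the function names `tf_density` and `tf_response` are hypothetical helpers introduced only for this example.

```python
import math

def tf_density(mu, v_eff):
    """Thomas-Fermi density of Eq. (3.115) in Hartree atomic units (hbar = m = 1)."""
    return (2.0 * (mu - v_eff)) ** 1.5 / (3.0 * math.pi ** 2)

def tf_response(mu, v_eff):
    """Derivative dF1/dalpha at alpha = mu, which equals k_F / pi^2 in atomic units."""
    return math.sqrt(2.0 * (mu - v_eff)) / math.pi ** 2

mu, v_eff = 0.5, 0.0   # illustrative values (hartree), corresponding to k_F = 1 bohr^-1
n0 = tf_density(mu, v_eff)

for dv in (0.1, 0.01, 0.001):
    exact = tf_density(mu, v_eff + dv) - n0    # full change of the density
    linear = -tf_response(mu, v_eff) * dv      # first-order term of Eq. (3.121)
    print(f"dv = {dv:6.3f}  exact = {exact:.6e}  linear = {linear:.6e}")
```

The relative deviation between the two columns shrinks in proportion to the size of the perturbation, as expected for a first-order expansion.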

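Actually we are interested in the relation between v(r) and n(r), or between Δv(r) and Δn(r). To obtain this relation, we consider the Hartree approximation. Then we have

\[ \Delta v^{\mathrm{eff}}(\mathbf{r}) = \Delta v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0}\int \frac{\Delta n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' , \tag{3.122} \]

or, with the Poisson equation,

\[ \nabla^2\!\left(\Delta v^{\mathrm{eff}}(\mathbf{r})\right) = \nabla^2\!\left(\Delta v(\mathbf{r})\right) - \frac{e^2}{\varepsilon_0}\,\Delta n(\mathbf{r}) . \tag{3.123} \]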

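In the Fourier representation, we have

\[ \Delta n(\mathbf{k}) = \frac{\varepsilon_0}{e^2}\, k^2 \left[\Delta v^{\mathrm{eff}}(\mathbf{k}) - \Delta v(\mathbf{k})\right] = - \left.\frac{\partial F_1[v^{\mathrm{eff}}(\mathbf{r});\alpha]}{\partial\alpha}\right|_{\alpha=\mu} \cdot \Delta v^{\mathrm{eff}}(\mathbf{k}) . \tag{3.124} \]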

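If we set $\left.\frac{\partial F_1}{\partial\alpha}\right|_{\alpha=\mu} = \frac{\varepsilon_0}{e^2}\, k_0^2$, we obtain

\[ \frac{k^2 + k_0^2}{k^2}\, \Delta v^{\mathrm{eff}}(\mathbf{k}) = \Delta v(\mathbf{k}) . \tag{3.125} \]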

The quantity k0 is called the Thomas-Fermi wave vector. We have now derived an equation which (in the Hartree approximation and for a jellium-type system) describes the relation between the origin of the perturbation (the change of the potential of the ions) and the potential (the effective potential) acting on the single-particle wave functions. This equation corresponds closely to a description known from electrodynamics: the relation between the strength of an electric field E and the dielectric displacement D is given by

D = εE , (3.126)

where ε is the dielectric constant (generally a tensor). Therefore, we continue our investigation by starting from the relation Δv(r) ↔ Δveff(r) and consider it in terms of a microscopic materials equation of electrodynamics. In the context of Thomas-Fermi theory in k-space, we write (cf. Eq. (3.125)):

\[ \Delta v(\mathbf{k}) = \varepsilon(\mathbf{k})\, \Delta v^{\mathrm{eff}}(\mathbf{k}) , \tag{3.127} \]

with the Thomas-Fermi dielectric constant

\[ \varepsilon(\mathbf{k}) = \frac{k^2 + k_0^2}{k^2} . \tag{3.128} \]
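The dielectric constant of Eq. (3.128) can also be viewed as the fixed point of a self-consistency loop: combining the two expressions for Δn(k) in Eq. (3.124) gives Δveff(k) = Δv(k) − (k0²/k²) Δveff(k), which may be iterated for each k. The sketch below is only an illustration under stated assumptions: the function name `screened_potential`, the linear mixing parameter `beta`, and the chosen k values are hypothetical and not part of the lecture.

```python
def screened_potential(dv_bare, k, k0, beta=0.1, n_iter=500):
    """Iterate dv_eff = dv_bare - (k0^2 / k^2) * dv_eff with linear (damped) mixing."""
    dv_eff = dv_bare                                     # start from the unscreened potential
    for _ in range(n_iter):
        dv_out = dv_bare - (k0 ** 2 / k ** 2) * dv_eff   # response to the current guess
        dv_eff = (1.0 - beta) * dv_eff + beta * dv_out   # damped update
    return dv_eff

k0 = 1.1   # bohr^-1, the value quoted for Fig. 3.8
for k in (0.5, 1.0, 2.0):
    dv_eff = screened_potential(1.0, k, k0)
    eps_iter = 1.0 / dv_eff                  # epsilon(k) = dv / dv_eff, Eq. (3.127)
    eps_tf = (k ** 2 + k0 ** 2) / k ** 2     # Eq. (3.128)
    print(f"k = {k:.1f}  eps(iterated) = {eps_iter:.6f}  eps(Eq. 3.128) = {eps_tf:.6f}")
```

Without damping (`beta = 1`) the iteration diverges for k < k0, because the response factor k0²/k² then exceeds one. This is a simple instance of why self-consistent calculations usually mix input and output quantities.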

For small values of k², i.e., large wavelengths, the jellium approximation is well justified, and then Eqs. (3.127) and (3.128) are equivalent to the Hartree theory. Still, k0 is unknown, but it will be determined below (Eq. (3.139)). In real space we obtain

\[ \Delta v(\mathbf{r}) = \int \varepsilon(\mathbf{r},\mathbf{r}')\, \Delta v^{\mathrm{eff}}(\mathbf{r}')\, d^3r' . \tag{3.129} \]

All the many-body quantum mechanics is now hidden in the dielectric constant. If we are dealing with a spatially isotropic and homogeneous system, we have:

ε(r, r′) = ε(|r− r′|) . (3.130)

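To demonstrate the scientific content of the Thomas-Fermi theory, we will now discuss the screening in the neighborhood of a point charge in a solid (e.g., a defect in a crystal). We start by describing the bare electrostatic potential of the point defect by

\[ \Delta v(\mathbf{r}) = \frac{-e^2}{4\pi\varepsilon_0}\, \frac{Z}{r} , \tag{3.131} \]

which, in k-space, equals

\[ \Delta v(\mathbf{k}) = \frac{-e^2}{4\pi\varepsilon_0}\, \frac{Z}{k^2} . \tag{3.132} \]

For the effective potential we obtain

\[ \Delta v^{\mathrm{eff}}(\mathbf{k}) = \frac{k^2}{k^2 + k_0^2}\, \Delta v(\mathbf{k}) = \frac{-e^2}{4\pi\varepsilon_0}\, \frac{Z}{k^2 + k_0^2} . \tag{3.133} \]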

[Figure 3.8 (plot): Δv(r) and Δveff(r) in Ryd (axis range 0 to −20) versus r in bohr (axis range 0 to 3).]

Figure 3.8: Change of the effective potential induced by a point charge at r = 0 with Z = 1, in the Thomas-Fermi approximation with k0 = 1.1 bohr⁻¹, i.e., for rs ≃ 2 bohr.

By transforming back to real space, we obtain

\[ \Delta v^{\mathrm{eff}}(\mathbf{r}) = \frac{-e^2}{4\pi\varepsilon_0}\, \frac{Z}{r}\, e^{-k_0 r} . \tag{3.134} \]

Now it is clear that it is not the “external” perturbation charge Z · e with the potential $\frac{-e^2}{4\pi\varepsilon_0}\frac{Z}{r}$ that acts on the electron, but a screened Coulomb potential as described in Eq. (3.134). This potential is also called the Yukawa potential (the name originates from the theory of mesons, where the potential is modified in a similar fashion). The Thomas-Fermi wave vector k0 determines the strength of the screening. When 1/k0 becomes infinite, there is no screening. The effective potential of Eq. (3.134) is shown in Fig. 3.8.
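To get a feeling for the numbers behind Eqs. (3.131) and (3.134), the two potentials can be tabulated directly. This is a minimal sketch, assuming Hartree atomic units (e²/4πε0 = 1, energies in hartree, lengths in bohr) and the value k0 = 1.1 bohr⁻¹ quoted for Fig. 3.8; the function names are hypothetical.

```python
import math

def bare_potential(r, Z=1.0):
    """Eq. (3.131) in Hartree atomic units: -Z / r."""
    return -Z / r

def yukawa_potential(r, k0, Z=1.0):
    """Eq. (3.134) in Hartree atomic units: -Z exp(-k0 r) / r."""
    return -Z * math.exp(-k0 * r) / r

k0 = 1.1   # bohr^-1
for r in (0.5, 1.0, 2.0, 3.0):
    v_bare = bare_potential(r)
    v_scr = yukawa_potential(r, k0)
    print(f"r = {r:.1f} bohr  bare = {v_bare:8.4f}  screened = {v_scr:8.4f}  ratio = {v_scr / v_bare:.4f}")
```

The ratio of the screened to the bare potential is exp(−k0 r), so at r = 1/k0 the perturbation is already reduced to 1/e of its unscreened value.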

Physically this result means that a positive perturbation charge at r = 0 attracts the valence electrons of the solid. This increases the negative charge density in the neighborhood of r = 0, and the perturbation is screened. One says that the “valence charge density is polarized”. This polarization charge density Δn(r), i.e., the change of the electron density induced by the perturbation, is, according to Eq. (3.124) with (3.133):

\[ \Delta n(\mathbf{k}) = -\frac{\varepsilon_0}{e^2}\, k_0^2\, \Delta v^{\mathrm{eff}}(\mathbf{k}) = Z\, \frac{k_0^2}{k^2 + k_0^2} \cdot \frac{1}{4\pi} . \tag{3.135} \]

In real space we obtain the result

\[ \Delta n(\mathbf{r}) = Z\, \frac{k_0^2}{4\pi}\, \frac{e^{-k_0 r}}{r} , \tag{3.136} \]

which is again of Yukawa form and displayed in Fig. 3.9.

[Figure 3.9 (plot): Δn(r) in bohr⁻³ (axis range 0 to 6) versus r in bohr (axis range 0 to 1).]

Figure 3.9: Change of the electron density induced by a positive point charge at r = 0 with Z = 1, in the Thomas-Fermi approximation with k0 = 1.1 bohr⁻¹ and rs ≃ 2 bohr.
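The induced density of Eq. (3.136) integrates to the perturbation charge Z. A quick numerical check of this sum rule is sketched below, assuming atomic units and k0 = 1.1 bohr⁻¹; the midpoint-rule integrator `enclosed_charge` is a hypothetical helper written only for this illustration.

```python
import math

def delta_n(r, k0, Z=1.0):
    """Eq. (3.136): Z k0^2 exp(-k0 r) / (4 pi r), in bohr^-3."""
    return Z * k0 ** 2 * math.exp(-k0 * r) / (4.0 * math.pi * r)

def enclosed_charge(R, k0, Z=1.0, steps=20000):
    """Integrate 4 pi r^2 delta_n(r) from 0 to R with the midpoint rule."""
    h = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h          # midpoints avoid the 1/r singularity at r = 0
        total += 4.0 * math.pi * r ** 2 * delta_n(r, k0, Z) * h
    return total

k0 = 1.1   # bohr^-1
for R in (1.0 / k0, 3.0 / k0, 30.0):
    print(f"R = {R:6.3f} bohr  enclosed screening charge = {enclosed_charge(R, k0):.6f}")
```

The analytic result for the charge inside a sphere of radius R is Z[1 − (1 + k0 R) exp(−k0 R)], so only about a quarter of the screening charge sits within one screening length 1/k0, while the total approaches Z for large R.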

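The integral over the density change equals Z = ∫ Δn(r) d³r, i.e., the charge belonging to Δn(r) is exactly equal to the perturbation charge, but with opposite sign. How important, or how efficient, is this screening, i.e., how large is k0 for realistic systems? By writing

\begin{align}
\frac{\varepsilon_0}{e^2}\, k_0^2 &= \left.\frac{\partial F_1[v^{\mathrm{eff}}(\mathbf{r});\alpha]}{\partial\alpha}\right|_{\alpha=\mu}
= \left.\frac{\partial}{\partial\alpha}\, \frac{1}{3\pi^2}\left[\frac{2m}{\hbar^2}\left(\alpha - v^{\mathrm{eff}}\right)\right]^{3/2}\right|_{\alpha=\mu} \nonumber \\
&= \left.\frac{3}{2}\, \frac{2m}{\hbar^2\, 3\pi^2}\left[\frac{2m}{\hbar^2}\left(\alpha - v^{\mathrm{eff}}\right)\right]^{1/2}\right|_{\alpha=\mu}
= \frac{m}{\hbar^2\pi^2}\, k_F , \tag{3.137}
\end{align}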

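we find:

\[ k_0^2 = \frac{e^2}{\varepsilon_0}\, \frac{m}{\hbar^2\pi^2}\, k_F = \frac{e^2}{\varepsilon_0}\, \frac{m}{\hbar^2\pi^2}\left[3\pi^2 n\right]^{1/3} . \tag{3.138} \]

If we express the density by the density parameter rs, we have

\[ k_0 = \frac{2.95}{\sqrt{r_s/a_B}}\ \text{Å}^{-1} . \tag{3.139} \]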
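The prefactor in Eq. (3.139) follows from Eq. (3.138) once the density is written as n = 3/(4π rs³). The sketch below reproduces it, assuming Hartree atomic units, in which e²/ε0 = 4π and therefore k0² = 4 kF/π; the function names and the rounded Bohr radius are choices made for this example only.

```python
import math

BOHR_IN_ANGSTROM = 0.529177   # Bohr radius a_B in Angstrom (rounded)

def fermi_wave_vector(rs):
    """k_F in bohr^-1 for a density parameter rs in bohr: (9 pi / 4)^(1/3) / rs."""
    return (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / rs

def thomas_fermi_wave_vector(rs):
    """k0 in bohr^-1 from Eq. (3.138) in atomic units: k0^2 = 4 k_F / pi."""
    return math.sqrt(4.0 * fermi_wave_vector(rs) / math.pi)

for rs in (1.0, 2.0, 4.0, 6.0):
    k0_bohr = thomas_fermi_wave_vector(rs)
    k0_ang = k0_bohr / BOHR_IN_ANGSTROM
    print(f"rs = {rs:.0f} bohr  k0 = {k0_bohr:.3f} bohr^-1 = {k0_ang:.3f} A^-1  1/k0 = {1.0 / k0_ang:.2f} A")
```

For rs = 1 bohr this gives k0 ≈ 2.95 Å⁻¹, the prefactor of Eq. (3.139), and for rs = 2 bohr it gives k0 ≈ 1.1 bohr⁻¹, the value used in Figs. 3.8 and 3.9.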

Because rs is generally in the range 1 . . . 6 bohr, the screening takes place over a very short distance, i.e., on a length scale of 1/k0 ≈ 0.5 Å. This length is smaller than the typical distance between the atoms in a crystal (typically 2–3 Å). A more accurate calculation yields qualitatively the same result, but there are also significant differences. Such a calculation (which is significantly more complicated) is shown in Fig. 3.10. The reasons for the differences to Fig. 3.9 are that now not T^Jellium[n] but the correct kinetic energy has been used, that no semiclassical approximation for veff has been assumed, that no Taylor-series expansion and approximation for F1[veff; α] has been used, and that the exchange interaction has been taken into account. The correct kinetic energy, T, yields oscillations, which are called Friedel oscillations in the literature.

Figure 3.10: Change in the charge density induced by a defect atom (As) in a silicon crystal. The top figure shows a contour plot in the (110) plane, and the bottom shows the density change along the [111] direction.

The basic idea of Thomas-Fermi theory, to calculate n(r) directly from v(r), is certainly interesting. However, the approximations mentioned above are too drastic for most applications. Improvements of the kinetic energy term (cf. Eq. (3.106)) have been suggested by C. F. von Weizsäcker. The correction term is proportional to |∇n(r)|²/n(r).

3.8 Density-Functional Theory

References: P. Hohenberg, W. Kohn, Phys. Rev. 136, B864 (1964); W. Kohn, L. Sham, Phys. Rev. 140, A1133 (1965); M. Levy, Proc. Natl. Acad. Sci. USA 76, 6062 (1979); R.M. Dreizler, E.K.U. Gross, Density Functional Theory (Springer, 1990); R.G. Parr, W. Yang, Density-Functional Theory of Atoms and Molecules (Oxford University Press, 1994); R.O. Jones and O. Gunnarsson, Rev. Mod. Phys. 61, 689 (1989).

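We are still interested in the properties of a solid, which is described by the many-body Hamilton operator

\[ H^e = \sum_{i=1}^{N}\left( -\frac{\hbar^2}{2m}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i) \right) + \frac{1}{2}\, \frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i,j \\ i\neq j}}^{N,N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} . \tag{3.140} \]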

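We will assume that the ground state of

\[ H^e \Phi = E^e \Phi \tag{3.141} \]

is non-degenerate. Further, we will assume that the system is non-magnetic, i.e., the particle densities of the spin-up and spin-down electrons shall be the same:

\[ n_\uparrow(\mathbf{r}) = n_\downarrow(\mathbf{r}) , \tag{3.142} \]

where for the total particle density we have, as usual,

\begin{align}
n(\mathbf{r}) &= n_\uparrow(\mathbf{r}) + n_\downarrow(\mathbf{r}) \tag{3.143} \\
&= \left\langle \Phi \left| \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{r}_i) \right| \Phi \right\rangle . \tag{3.144}
\end{align}

From (3.142) and (3.143) we then obtain

\[ n_\uparrow(\mathbf{r}) = n_\downarrow(\mathbf{r}) = \frac{n(\mathbf{r})}{2} . \tag{3.145} \]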

The assumptions of a “non-degenerate ground state” and Eqs. (3.143) and (3.145) can also be omitted, but the following discussion is simpler when they are made. First, we state the theorem of Hohenberg and Kohn, and subsequently we will prove its validity:

The expectation value of He is a functional of the particle density n(r):

\[ \langle \Phi | H^e | \Phi \rangle = E_v[n] = \int v(\mathbf{r})\, n(\mathbf{r})\, d^3r + F[n] , \tag{3.146} \]

where the functional F [n] does not depend explicitly on v(r).

Proof of this statement: It is immediately clear that $F = \langle \Phi | T^e + V^{e-e} | \Phi \rangle$ is a functional of Φ, but initially it is surprising that it is supposed to be a functional of n(r). Now we will show that Φ is a functional of n(r), as long as we constrain ourselves to functions which are defined according to Eq. (3.146), and Φ is the ground state wave function of an arbitrary N-particle problem. Because of the word “arbitrary”, i.e., v(r) is arbitrary, this is not a constraint of physical relevance. Mathematically, however, this is a noticeable restriction.

The converse of what we want to show is known: n(r) is a functional of Φ:

\[ n(\mathbf{r}) = \left\langle \Phi \left| \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{r}_i) \right| \Phi \right\rangle . \tag{3.147} \]

The question to be answered is: Is the mapping of Eq. (3.147) invertible, i.e., one-to-one (cf. Fig. 3.11)?

The proof of the theorem of Hohenberg and Kohn and of the statement Φ = Φ[n] follows the principle of “reductio ad absurdum”:

[Figure 3.11 (schematic): two sets connected by arrows. One is the set of non-degenerate ground state wave functions Φ(1), …, Φ(13) of arbitrary N-particle Hamilton operators of the type of Eq. (3.1) or Eq. (3.140). The other is the set of the particle densities n(1), …, n(13) that belong to non-degenerate ground states of the N-particle problem.]

Figure 3.11: Relation between wave functions and particle densities. The Hohenberg-Kohn theorem states that the dashed case does not exist, i.e., two different many-body wave functions have to yield different densities.

Starting point: Let $v(\mathbf{r})$ and $\tilde v(\mathbf{r})$ be two physically different potentials, i.e., we have

\[ v(\mathbf{r}) - \tilde v(\mathbf{r}) \neq \text{constant} . \tag{3.148} \]

These two potentials define two Hamilton operators $H^e$ and $\tilde H^e$. As noted above, for simplicity we constrain ourselves to operators that have a non-degenerate ground state. For the more general discussion we refer to the work of Levy.

Assumption 1: Both Hamilton operators have the same ground state wave function Φ0. It follows that

\[ (H^e - \tilde H^e)\, \Phi_0 = \sum_{i=1}^{N} \left\{ v(\mathbf{r}_i) - \tilde v(\mathbf{r}_i) \right\} \Phi_0 = (E^e_0 - \tilde E^e_0)\, \Phi_0 , \tag{3.149} \]

and from this one obtains (with the exception of a discrete number of points, for which Φ0 is zero)

\[ \sum_{i=1}^{N} \left\{ v(\mathbf{r}_i) - \tilde v(\mathbf{r}_i) \right\} = E^e_0 - \tilde E^e_0 . \tag{3.150} \]

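This means that $v(\mathbf{r}) - \tilde v(\mathbf{r})$ is a mere constant, which is in contradiction to the starting point. It follows that our assumption 1, i.e., that $H^e$ and $\tilde H^e$ have the same eigenfunction, is wrong. Therefore, $\Phi_0$ and $\tilde\Phi_0$ are different.

Assumption 2: We assume that both $\Phi_0$ and $\tilde\Phi_0$ can give rise to the same particle density n(r), even though $\Phi_0 \neq \tilde\Phi_0$. This corresponds to the dashed arrows in Fig. 3.11. Because the ground state of $H^e$ is non-degenerate, any other state gives a strictly higher expectation value, and we get

\[ E^e_0 = \langle \Phi_0 | H^e | \Phi_0 \rangle < \langle \tilde\Phi_0 | H^e | \tilde\Phi_0 \rangle = \left\langle \tilde\Phi_0 \left| \tilde H^e - \sum_{i=1}^{N} \tilde v(\mathbf{r}_i) + \sum_{i=1}^{N} v(\mathbf{r}_i) \right| \tilde\Phi_0 \right\rangle , \tag{3.151} \]

therefore,

\[ E^e_0 < \tilde E^e_0 + \left\langle \tilde\Phi_0 \left| \sum_{i=1}^{N} \left\{ v(\mathbf{r}_i) - \tilde v(\mathbf{r}_i) \right\} \right| \tilde\Phi_0 \right\rangle \tag{3.152} \]

and

\[ E^e_0 < \tilde E^e_0 + \int \left\{ v(\mathbf{r}) - \tilde v(\mathbf{r}) \right\} n(\mathbf{r})\, d^3r . \tag{3.153} \]

Similarly, we obtain for $\tilde E^e_0 = \langle \tilde\Phi_0 | \tilde H^e | \tilde\Phi_0 \rangle$:

\[ \tilde E^e_0 < E^e_0 - \int \left\{ v(\mathbf{r}) - \tilde v(\mathbf{r}) \right\} n(\mathbf{r})\, d^3r . \tag{3.154} \]

If we add Eqs. (3.153) and (3.154), we obtain

\[ E^e_0 + \tilde E^e_0 < E^e_0 + \tilde E^e_0 , \tag{3.155} \]

which is a contradiction. This means that assumption 2 is wrong. Thus we have proven:

Two different ground states $\Phi_0$ and $\tilde\Phi_0$ must yield two different particle densities $n(\mathbf{r})$ and $\tilde n(\mathbf{r})$.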

This has the following consequences:

a) $E_v[n] \overset{!}{=} \langle \Phi | H^e | \Phi \rangle$ is a functional of n(r). In fact, what we have shown is Φ = Φ[n]. Here, the functionals are only defined for the set of particle densities that can be constructed from a ground state wave function of an arbitrary N-particle Hamilton operator He, where v(r) is an arbitrary function (cf. Fig. 3.11).

b) In the expression

\[ E_v[n] = \int n(\mathbf{r})\, v(\mathbf{r})\, d^3r + F[n] , \tag{3.156} \]

F[n] is a universal functional of n(r). This means that the functional F is independent of the “external” potential v(r).

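[Figure 3.12 (schematic): two panels. Panel “Schrödinger”: ⟨Φ|He|Φ⟩ as a function of Φ({ri}), which depends on ∼ 10²³ variables, with minimum E0 at Φ0. Panel “DFT”: Ev[n] as a function of n(r), which depends on 3 variables, with minimum E0 at n0.]

Figure 3.12: Schematic figure for the variational principle of ⟨Φ|He|Φ⟩ and Ev[n].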

c) Under the constraint

\[ \int n(\mathbf{r})\, d^3r = N , \tag{3.157} \]

the correct particle density minimizes Ev[n]. This minimum defines the electron density in the ground state and the ground state energy

\[ E^e_0 = \min_{n(\mathbf{r})} E_v[n] . \tag{3.158} \]

Thus, the variational principle for ⟨Φ|He|Φ⟩ can be reformulated exactly in terms of a variational principle for Ev[n]. The new variational principle is

\[ \delta \left\{ E_v[n] - \mu \left( \int n(\mathbf{r})\, d^3r - N \right) \right\} = 0 , \tag{3.159} \]

or

\[ \frac{\delta E_v[n]}{\delta n(\mathbf{r})} = \mu . \tag{3.160} \]

Here the constraint of a constant total number of particles equal to N is taken into account by the method of Lagrange multipliers, i.e., we have included the condition

\[ \int n(\mathbf{r})\, d^3r = N . \tag{3.161} \]

Still, some physically important conditions are missing, e.g., that n(r) ≥ 0 and that n(r) has to be continuous. These are necessary conditions which have to be fulfilled by the functions in the domain of Ev[n], and we will have to take them into account when performing the variation in an actual calculation.

Compared to Hartree and Hartree-Fock theory, we have achieved a significant simplification: Earlier we had to insert a wave function depending on 10²³ coordinates into the functional to be minimized. This treatment led to obvious difficulties and approximations, which were introduced before the actual variation was performed. Now we have to insert functions depending on only three coordinates into the functional, and, up to now, i.e., up to Eq. (3.160), we have introduced no approximation. So far, we have shown that the functional Ev[n] does exist. However, we have not shown what it looks like.

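For the actual variation, Kohn and Sham suggested the following procedure. We write:

\[ E_v[n] = T_s[n] + \int v(\mathbf{r})\, n(\mathbf{r})\, d^3r + E^{\mathrm{Hartree}}[n] + E^{\mathrm{xc}}[n] \tag{3.162} \]

with

\[ E^{\mathrm{Hartree}}[n] = \frac{1}{2}\, \frac{e^2}{4\pi\varepsilon_0} \iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r\, d^3r' . \tag{3.163} \]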

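Ts[n] is the kinetic energy functional of non-interacting electrons. Although it is generally not known explicitly as a functional of n(r), it is introduced here. For the exchange-correlation functional it follows that

\begin{align}
E^{\mathrm{xc}}[n] &= \langle \Phi | H^e | \Phi \rangle - \int v(\mathbf{r})\, n(\mathbf{r})\, d^3r - T_s[n] - E^{\mathrm{Hartree}}[n] \nonumber \\
&= F[n] - T_s[n] - E^{\mathrm{Hartree}}[n] . \tag{3.164}
\end{align}

The variational principle, Eq. (3.160), then yields

\[ \frac{\delta T_s[n]}{\delta n(\mathbf{r})} + v^{\mathrm{eff}}(\mathbf{r}) = \mu \tag{3.165} \]

with

\begin{align}
v^{\mathrm{eff}}(\mathbf{r}) &= \frac{\delta \left\{ \int v(\mathbf{r})\, n(\mathbf{r})\, d^3r + E^{\mathrm{Hartree}}[n] + E^{\mathrm{xc}}[n] \right\}}{\delta n(\mathbf{r})} \nonumber \\
&= v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' + \frac{\delta E^{\mathrm{xc}}[n]}{\delta n(\mathbf{r})} . \tag{3.166}
\end{align}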

Equation (3.165) is formally an equation for non-interacting particles moving in the potential veff(r), because by definition Ts[n] is the kinetic energy of non-interacting particles of density n(r), i.e., it is defined via

\[ n(\mathbf{r}) = \sum_{i=1}^{N} |\varphi_{o_i}(\mathbf{r})|^2 . \tag{3.167} \]

It is not clear whether the functional Ts[n] exists for all particle densities, i.e., also for those particle densities that cannot be expressed via Eq. (3.167). Here this is just assumed. It seems plausible that the set of densities defined this way covers all physically reasonable densities, or at least comes arbitrarily close. As long as Ts[n] is a well-behaved functional, the assumption of “being arbitrarily close” should be sufficient. But this point has still not been discussed conclusively in the literature.

From Eq. (3.165) we obtain the single-particle Schrödinger equation

\[ \left\{ -\frac{\hbar^2}{2m}\nabla^2 + v^{\mathrm{eff}}(\mathbf{r}) \right\} \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r}) . \tag{3.168} \]

Obviously, this is an effective single-particle equation, because veff depends on the solutions that we are seeking.

Equation (3.168) together with Eqs. (3.167) and (3.166) is called the Kohn-Sham equation. It is solved using a self-consistent field (SCF) procedure. Although the functional Ts[n] is not explicitly known as a functional of n, we can still treat it exactly by replacing Eq. (3.165) by the equivalent Eq. (3.168). This has the technical disadvantage that we end up with the N single-particle functions, which we wanted to avoid. For the evaluation of the total energy Ev[n] we need Ts[n]. The calculation of Ts[n] is done using one of the two following equations. Generally, for non-interacting particles we have:

\begin{align}
T_s[n] &= \sum_{i=1}^{N} \left\langle \varphi_{o_i} \left| -\frac{\hbar^2}{2m}\nabla^2 \right| \varphi_{o_i} \right\rangle \tag{3.169} \\
&= \sum_{i=1}^{N} \epsilon_{o_i} - \int v^{\mathrm{eff}}[n^{\mathrm{in}}](\mathbf{r})\, n(\mathbf{r})\, d^3r . \tag{3.170}
\end{align}

Here veff(r) is determined from Eq. (3.166), and εoi is obtained from Eq. (3.168). Thus, we have shown that Ts is a functional of n. Here veff has to be calculated from an input density n^in(r), because if we interpret Ts[n] as a functional, veff has to be exactly the potential that generates the εok, the ϕok(r), and n(r). Generally, n^in(r), which is used for the calculation of veff(r), and n(r) are different. Only at the end of the SCF cycle are both densities the same.

As a side remark we note that the self-consistent solution of the variational principle would not be changed if, instead of Ts[n], we used, e.g.,

\[ T[n] = \sum_{k=1}^{N} \epsilon_{o_k} - \int v^{\mathrm{eff}}[n](\mathbf{r})\, n(\mathbf{r})\, d^3r , \tag{3.171} \]

or other expressions that differ from Ts[n] only by O(n^in − n). But here we continue using Ts[n].

Up to now no approximation has been introduced (apart from the reasonable assumption described by Eq. (3.167)). Therefore, in contrast to Hartree and Hartree-Fock theory, we have first made use of the variational principle of the ground state, and only now will we start to think about approximations. In Hartree and Hartree-Fock theory, an approximation (the ansatz for the wave function) was introduced first, and then the expectation value of He was investigated. Experience shows that it is particularly important to treat Ts[n] as accurately as possible in order to obtain, e.g., the shell structure of the electrons in atoms (s-, p-, d-electrons), which cannot be described with Thomas-Fermi theory. The Kohn-Sham ansatz permits one to treat Ts[n] exactly.

Using Eq. (3.169) or (3.170) we can evaluate Ts[n] without knowing the functional explicitly. Just one thing remains unknown: the exchange-correlation functional Exc[n] and its associated potential vxc(r) = δExc[n]/δn(r). We know that Exc[n] is a universal functional⁶, i.e., the functional does not depend on the system: the hydrogen atom, the diamond crystal, etc. are described by the same functional. Unfortunately, we do not know the exact form of Exc[n]. It is also not clear whether the functional can be given in a simple, closed form at all, or whether Exc is similar to Ts.

⁶ Strictly, F[n] is a universal functional of n. Because F[n] = Ts[n] + EHartree[n] + Exc[n], cf. Eq. (3.164), this is also valid for Exc[n].

To shed more light on the exchange-correlation functional Exc[n], we consider a series expansion in the density n starting from jellium, the homogeneous, interacting electron gas, where we have v(r) = constant and n(r) = constant, so that

\[ E^{\mathrm{xc}}[n] = E^{\mathrm{xc-LDA}}[n] + O[\nabla n] . \tag{3.172} \]

We rewrite this as

\[ E^{\mathrm{xc}}[n] = \int \epsilon^{\mathrm{xc}}[n]\, n(\mathbf{r})\, d^3r , \tag{3.173} \]

and

\[ E^{\mathrm{xc-LDA}}[n] = \int \epsilon^{\mathrm{xc-LDA}}[n]\, n(\mathbf{r})\, d^3r . \tag{3.174} \]

Here, ε^xc-LDA[n] is the exchange-correlation energy per particle in a jellium system of constant density n. Because n(r) is constant, i.e., n is just a number, ε^xc-LDA then is a function of the density: ε^xc-LDA(n). We generalize this expression to the following statement: For systems with a slowly varying density, Exc[n] can be replaced by

\[ E^{\mathrm{xc-LDA}}[n] = \int n(\mathbf{r})\, \epsilon^{\mathrm{xc-LDA}}(n(\mathbf{r}))\, d^3r . \tag{3.175} \]

Here n is the local density, i.e., the density at position r.

“Slowly varying” means that the system can be regarded as a collection of jellium systems, where neighboring systems have only slightly different densities. Therefore, in a strict sense, n(r) must change only marginally on a scale of 2π/kF, which is the shortest wavelength appearing in the occupied states of a jellium system. Generally, for real systems this “mathematical requirement” for n(r) is not fulfilled, i.e., 2π/kF ≈ 5 Å is of the same order as the interatomic distances. Still, experience shows that the ansatz Eq. (3.175) works surprisingly well. This will be explained later. As a side remark we note that for Ts[n] a local-density approximation is very poor, but for Exc[n] such an approximation is apparently acceptable.

The approximation (3.175) is called the local-density approximation (LDA). In this approximation, each point in space is treated like a jellium system, but only for the evaluation of Exc[n]. To evaluate Eq. (3.175), the function ε^xc-LDA(n) is required. What does ε^xc-LDA(n) look like? Already in 1938 Wigner determined εxc for jellium in the limit of small densities (Trans. Faraday Soc. 34, 678 (1938)). In 1957 Gell-Mann and Brueckner discussed the limit of high densities using many-body theories (Phys. Rev. 106, 364 (1957)). In 1980, it became possible to address also the range in between using computer calculations (Ceperley, Alder, Phys. Rev. Lett. 45, 566 (1980)). Nowadays, the function ε^xc-LDA(n), or ε^xc-LDA(rs) with rs = (3/(4πn))^(1/3), is numerically well known. It is shown in Fig. 3.13.

Table 3.1 gives the average electron density and the corresponding rs parameter for some metals.

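[Figure 3.13 (plot): horizontal axis n (10⁻² Å⁻³, range 0 to 30); vertical axis labeled −ε^xc-jellium(n) (eV), with ticks 0, −5, −10. Annotations mark the low-density limit of Wigner (1938), the intermediate range of Ceperley-Alder (1980), and the high-density limit of Gell-Mann-Brueckner (1957).]

Figure 3.13: The exchange-correlation energy per particle for jellium systems of density n.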

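If we know εxc[n], the exchange-correlation potential of the Kohn-Sham equation can be calculated:

\begin{align}
v^{\mathrm{xc-LDA}}(\mathbf{r}) = \frac{\delta E^{\mathrm{xc-LDA}}[n]}{\delta n(\mathbf{r})} &= \left.\frac{\partial}{\partial n}\Big( n\, \epsilon^{\mathrm{xc-LDA}}(n) \Big)\right|_{n=n(\mathbf{r})} \nonumber \\
&= \left.\epsilon^{\mathrm{xc-LDA}}(n) + n\, \frac{\partial \epsilon^{\mathrm{xc-LDA}}(n)}{\partial n}\right|_{n=n(\mathbf{r})} . \tag{3.176}
\end{align}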
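Equation (3.176) can be illustrated with the exchange part alone, for which the jellium result is known in closed form: εx(n) = −(3/4)(3/π)^(1/3) n^(1/3) in Hartree atomic units. The sketch below is an exchange-only illustration under that assumption; the correlation contribution (e.g., from the Ceperley-Alder data) is deliberately left out, and the function names are hypothetical.

```python
import math

def eps_x(n):
    """Exchange energy per particle of jellium in hartree, n in bohr^-3."""
    return -0.75 * (3.0 / math.pi) ** (1.0 / 3.0) * n ** (1.0 / 3.0)

def potential_from_eps(eps, n, h=1e-6):
    """Eq. (3.176): d(n * eps(n))/dn, evaluated by a central finite difference."""
    dn = h * n
    return ((n + dn) * eps(n + dn) - (n - dn) * eps(n - dn)) / (2.0 * dn)

for n in (0.001, 0.01, 0.1):
    v_numeric = potential_from_eps(eps_x, n)
    v_analytic = 4.0 / 3.0 * eps_x(n)        # closed form: -(3 n / pi)^(1/3)
    print(f"n = {n:5.3f} bohr^-3  eps_x = {eps_x(n):8.5f}  v_x = {v_numeric:8.5f}  (4/3) eps_x = {v_analytic:8.5f}")
```

Because εx scales as n^(1/3), the second term of Eq. (3.176) contributes one third of εx, so the exchange potential is 4/3 times the exchange energy per particle. The same finite-difference helper would work for any tabulated or parametrized ε^xc-LDA(n).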

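In order to interpret the exchange-correlation potential of DFT-LDA theory, we proceed as for Hartree and Hartree-Fock theory (cf. the discussion of Fig. 3.4). We write:

\[ v^{\mathrm{Hartree}}(\mathbf{r}) + v^{\mathrm{xc}}(\mathbf{r}) = \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}') - n^{\mathrm{xc}}(\mathbf{r},\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' . \tag{3.177} \]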

The density n − nxc is shown in Fig. 3.15. The interpretation is to view the exchange-correlation energy as a correction of the Coulomb interaction of the Hartree term, i.e., a particle at position r does not interact with particles distributed like n(r′), but it feels a particle distribution n(r′) − nxc(r, r′). In the neighborhood of a particle the density is reduced. The origins of this reduction are the Pauli principle and the Coulomb repulsion. Strictly, this reduction results from the dynamics of the electrons and would require a dynamical treatment. This is the origin of the term correlation (of the motion). But in a time-independent theory dynamic correlation can also be described as mentioned above.

We summarize: Hartree theory does not include correlation, i.e., a Hartree particle sees a distribution of the other particles that is independent of its position. Hartree-Fock theory includes the correlation of electrons of like spin originating from the Pauli principle. This “Pauli correlation” is called exchange interaction. In principle, density-functional theory is exact (and for jellium it can be carried out numerically exactly). It contains exchange as well as the correlation caused by the Coulomb repulsion. But since the functional Exc[n] in its general form is unknown, DFT combined with the LDA is accurate only for interacting electronic systems of slowly varying densities. For inhomogeneous systems, the LDA is an approximation, but a surprisingly good one! Hartree theory is exact for no interacting many-body problem, and the Hartree-Fock approximation is exact only for those systems whose ground state is a single Slater determinant.

[Figure 3.14 (plot): vx and vxc-LDA in eV (axis range 0 to −10) versus n (10⁻² Å⁻³, range 0 to 30).]

Figure 3.14: Exchange potential (F = 0.5 or α = 2/3, cf. Eq. (3.85)) and the exchange-correlation potential as a function of the electron density. The difference of both is the correlation potential.

[Figure 3.15 (plot): curves labeled n(r′) − n^HF_k(r′, r), n(r′) − n^H_k(r′), and n(r′) − nxc(r′, r), each scaled by Vg/N (axis range 0 to 1), versus |r − r′| (axis range 0 to 6/kF).]

Figure 3.15: The exchange-correlation hole for jellium. nxc(r′, r) is shown schematically.

atom   valence electrons   atoms per primitive   lattice constant   average density   rs
       in the atom         unit cell             (Å)                (10²² cm⁻³)       (bohr)
Li     1                   1                     3.49               4.70              3.25
Na     1                   1                     4.23               2.65              3.93
K      1                   1                     5.23               1.40              4.86
Cs     1                   1                     6.05               0.91              5.62
Cu     1                   1                     3.61               8.47              2.67
Ag     1                   1                     4.09               5.86              3.02
Al     3                   1                     4.05               18.1              2.07
Ga     3                   1                     4.51               15.4              2.19

Table 3.1: Average electron density of metals. For Cu and Ag the electrons of the 3d- and 4d-shells have not been counted as valence electrons. For several problems this approximation is too crude.
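The rs column of Table 3.1 follows from the density column through rs = (3/(4πn))^(1/3), and the same numbers give the Fermi wavelength 2π/kF that sets the scale for “slowly varying” densities. The following sketch recomputes both; the rounded Bohr radius and the helper names are assumptions made for this example.

```python
import math

BOHR_IN_CM = 0.529177e-8        # Bohr radius in cm (rounded)
BOHR_IN_ANGSTROM = 0.529177     # Bohr radius in Angstrom (rounded)

# Average valence-electron densities from Table 3.1, in units of 10^22 cm^-3.
TABLE_DENSITIES = {"Li": 4.70, "Na": 2.65, "K": 1.40, "Cs": 0.91,
                   "Cu": 8.47, "Ag": 5.86, "Al": 18.1, "Ga": 15.4}

def density_parameter(n_cm3):
    """rs in bohr for a density given in cm^-3."""
    n_bohr3 = n_cm3 * BOHR_IN_CM ** 3
    return (3.0 / (4.0 * math.pi * n_bohr3)) ** (1.0 / 3.0)

def fermi_wavelength_angstrom(rs):
    """2 pi / k_F in Angstrom, with k_F = (9 pi / 4)^(1/3) / rs."""
    k_f = (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / rs
    return 2.0 * math.pi / k_f * BOHR_IN_ANGSTROM

for atom, n22 in TABLE_DENSITIES.items():
    rs = density_parameter(n22 * 1e22)
    print(f"{atom:2s}  rs = {rs:.2f} bohr  2pi/k_F = {fermi_wavelength_angstrom(rs):.1f} A")
```

The computed rs values agree with the table to within rounding, and the Fermi wavelengths come out at a few Å, i.e., comparable to interatomic distances.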

Surprisingly, it turned out that DFT-LDA yields rather reliable results also for systems of strongly varying densities. This can be made plausible, but no formal explanation can be given in terms of Eq. (3.177). The LDA can be understood as an approximation for the shape of the exchange-correlation hole, and because in vxc and Exc we integrate over nxc, the errors in the shape of the xc-hole cancel to some extent⁷.

3.8.1 Meaning of the Kohn-Sham Single-Particle Energies εi

Often one reads that in DFT Koopmans’ theorem is not valid and that therefore the single-particle energies of the Kohn-Sham equation have no physical meaning. While the latter is right, the first statement is wrong. In DFT there is a theorem that is practically equivalent to Koopmans’ theorem.

First, the Kohn-Sham single-particle energies do not seem to have a direct physical meaning. The Kohn-Sham equation has been derived as an equation having the same charge density as the many-body problem, nothing more. But we found that the εoi are nevertheless required, namely in the expression for the kinetic energy when the total energy has to be calculated (cf. Eq. (3.170)). Only the highest occupied level has a direct physical meaning: the highest occupied level (of the exact DFT) equals the negative of the ionization energy (Almbladh, v. Barth, Phys. Rev. B 31, 3231 (1985); a more recent, controversial discussion can be found in Manoj K. Harbola, Phys. Rev. B 60, 4545 (1999)). Here we give the proof only for metals, for which we want to show that the following is true:

\[ \frac{\delta E_v[n]}{\delta n(\mathbf{r})} = \mu = \epsilon_F = \text{highest occupied single-particle level} . \tag{3.178} \]

⁷ For a detailed discussion cf. Barth, Williams, in “The inhomogeneous electron gas”, also R.O. Jones and O. Gunnarsson, Rev. Mod. Phys. 61, 689 (1989).

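For the proof we assume that εF shall be m-fold degenerate and that the corresponding wave functions shall be extended. Physically this means that we are dealing with metals. We choose δn(r) as

\[ \delta n(\mathbf{r}) \approx \frac{m}{V_g} \ll \frac{N}{V_g} . \tag{3.179} \]

This means that δn(r) changes the particle number:

\begin{align}
N &\to N + m \quad \text{with } m \ll N , \tag{3.180} \\
n(\mathbf{r}) &\to n(\mathbf{r}) + \delta n(\mathbf{r}) , \tag{3.181}
\end{align}

where

\[ \delta n(\mathbf{r}) = \sum_{i=N+1}^{N+m} |\varphi_{o_i}(\mathbf{r})|^2 . \tag{3.182} \]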

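For the change of the total energy we obtain

\begin{align}
\Delta E_v &= T_s[n + \delta n] - T_s[n] + \int \delta n(\mathbf{r}) \left\{ v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0} \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' + \frac{\delta E^{\mathrm{xc}}[n]}{\delta n(\mathbf{r})} \right\} d^3r \tag{3.183} \\
&= \sum_{i=1}^{N+m} \epsilon_{o_i} - \sum_{i=1}^{N} \epsilon_{o_i} . \tag{3.184}
\end{align}

For the case of self-consistency we therefore obtain

\[ \Delta E_v = m\, \epsilon_F = \int \delta n(\mathbf{r})\, d^3r \times \epsilon_F . \tag{3.185} \]

Since δn(r) describes a change of the particle number N, we have:

\[ \frac{\Delta E_v}{m} = \frac{\partial E_v}{\partial N} = \epsilon_F . \tag{3.186} \]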

Thus the highest occupied level is equal to the chemical potential ∂Ev/∂N. Now, generally we have

\[ \frac{\delta E_v[n]}{\delta n(\mathbf{r})} = \mu . \tag{3.187} \]

Of course, this includes the considered case δn(r). We obtain:

\[ \mu = \epsilon_F . \tag{3.188} \]

The Lagrange parameter of the Hohenberg-Kohn equation is equal to the highest occupied level of the Kohn-Sham equation. This level, i.e., the chemical potential (the Fermi level), is a measurable quantity.

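In the functional Ev[n], we could introduce occupation numbers foi by writing

\begin{align}
n(\mathbf{r}) &= \sum_{i=1}^{N} |\varphi_{o_i}(\mathbf{r})|^2 \tag{3.189} \\
&= \sum_{i=1}^{\infty} f_{o_i}\, |\varphi_{o_i}(\mathbf{r})|^2 . \tag{3.190}
\end{align}

At zero temperature we have

\[ f_{o_i} = \begin{cases} 1 & \text{for } i = 1 \ldots N \\ 0 & \text{otherwise} , \end{cases} \tag{3.191} \]

and

\[ T_s[n] = \sum_{i=1}^{\infty} f_{o_i}\, \epsilon_{o_i} - \int v^{\mathrm{eff}}(\mathbf{r})\, n(\mathbf{r})\, d^3r . \tag{3.192} \]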

Strictly, Ev[n] is defined only for foi = 0 or 1. Now we will assume that the range of the foi can be extended to non-integer occupations. For example, for a finite-temperature description the occupation numbers would be given by the Fermi function. This extension of the foi is obviously not a problem for any part of the energy functional except Exc. Since we do not know Exc exactly, it is possible that the exact functional depends on n(r) in a way that makes non-integer values of the occupation numbers foi problematic. However, for all known approximations of Exc this extension of the range of permissible occupation numbers foi is unproblematic, and we have:

\[ \frac{\partial E_v[n]}{\partial f_k} = \epsilon_k , \tag{3.193} \]

i.e., the change of the total energy due to removal or addition of an electron is equal to the orbital energy of the corresponding state. We prove the statement by noting that

\[ \frac{\partial E_v[n]}{\partial f_{o_k}} = \int \frac{\delta E_v[n]}{\delta n(\mathbf{r})}\, \frac{\partial n(\mathbf{r})}{\partial f_{o_k}}\, d^3r . \tag{3.194} \]

Further we have

\[ \frac{\partial n(\mathbf{r})}{\partial f_{o_k}} = |\varphi_{o_k}(\mathbf{r})|^2 , \tag{3.195} \]

because the ϕok do not depend explicitly on the occupation numbers for a given veff. We do not want to (and cannot) assume here that the variational principle δEv[n]/δn(r) = µ is fulfilled! Therefore δEv[n]/δn(r) has to be calculated. With

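\begin{align}
\frac{\delta E_v[n]}{\delta n(\mathbf{r})} &= \frac{\delta T_s[n]}{\delta n(\mathbf{r})} + v^{\mathrm{Hartree}}(\mathbf{r}) + v^{\mathrm{xc}}(\mathbf{r}) + v(\mathbf{r}) \tag{3.196} \\
&= \frac{\delta T_s[n]}{\delta n(\mathbf{r})} + v^{\mathrm{eff}}(\mathbf{r}) \tag{3.197}
\end{align}

we obtain

\[ \frac{\partial E_v[n]}{\partial f_{o_k}} = \int \frac{\delta T_s[n]}{\delta n(\mathbf{r})}\, |\varphi_{o_k}(\mathbf{r})|^2\, d^3r + \int v^{\mathrm{eff}}(\mathbf{r})\, |\varphi_{o_k}(\mathbf{r})|^2\, d^3r . \tag{3.198} \]

Because

\[ \int \frac{\delta T_s[n]}{\delta n(\mathbf{r})}\, |\varphi_{o_k}(\mathbf{r})|^2\, d^3r = \frac{\partial T_s[n]}{\partial f_{o_k}} , \tag{3.199} \]

and with Eq. (3.192),

\[ \frac{\partial T_s}{\partial f_{o_k}} = \epsilon_{o_k} - \int v^{\mathrm{eff}}(\mathbf{r})\, |\varphi_{o_k}(\mathbf{r})|^2\, d^3r , \tag{3.200} \]

we obtain the result

\[ \frac{\partial E_v[n]}{\partial f_{o_k}} = \epsilon_{o_k} . \tag{3.201} \]

This equation also holds for the highest occupied state, k = N, whose energy, at least in metals, is called the Fermi energy:

\[ \frac{\partial E_v[n]}{\partial f_{o_N}} = \epsilon_{o_N} = \epsilon_F . \tag{3.202} \]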

Figure 3.16 shows an example for εok as a function of the occupation number. However, here the local spin-density approximation for Exc has been used. This represents an improvement over the ansatz n↑ = n↓, and it will be discussed in Section 3.8.2 below. Due to the kinetic energy, the functions εok(f) are not differentiable at some points. These are points where a new spin channel or a new shell is added.

Now we have a look at an ionization event, i.e., the transition from the ground state Φ^N to the state Φ^(N−1)_k plus a free electron of zero energy. The index k of the wave function indicates that the level k is no longer occupied. Strictly speaking, the term ionization applies only if k = N, i.e., if the highest occupied level is affected. The ionization energy is

\begin{align}
I_k &= E^{N-1}_k - E^N \tag{3.203} \\
&= -\int_0^1 \frac{\partial E_v[n]}{\partial f_{o_k}}\, df_{o_k} \tag{3.204} \\
&= -\int_0^1 \epsilon_{o_k}(f_{o_k})\, df_{o_k} . \tag{3.205}
\end{align}

Here we assume that the geometry of the lattice is not changed by the ionization. For the ionization from extended levels of a solid this assumption is justified. In general it represents the Franck-Condon principle, which states that the displacement of the nuclei follows the electronic excitation (or the ionization) with some delay. Using the mean value theorem of calculus and taking the intermediate point at half occupation, we obtain:

\[ I_k \approx -\epsilon_k(0.5) . \tag{3.206} \]

This expression is called the Slater-Janak “transition state”. When a calculation is carried out with the level k occupied by only half an electron, the energy −εk is roughly the ionization energy. This procedure illustrates the meaning of the εk, but there is also a practical application: often it is numerically more precise to calculate the ionization energies from εk(0.5), and not as the difference of two large numbers, E^(N−1)_k − E^N.

[Figure 3.16 (plot): eigenvalues εok in eV (axis range 0 to −20) versus the electron number N, with curves ε2s↓, ε2s↑, and ε2p↓. The three ranges Z − 2 (Be²⁺) to Z − 1 (Be⁺), Z − 1 (Be⁺) to Z (Be), and beyond Z (Be) are labeled with the configurations (2s↓)^f (2s↑)^0 (2p↓)^0, (2s↓)^1 (2s↑)^f (2p↓)^0, and (2s↓)^1 (2s↑)^1 (2p↓)^f.]

Figure 3.16: The function εok(N) for the 2s and 2p states of the Be atom (Z = 4) as a function of the electron number (from N = 2 to N = 4.4). The neutral Be atom has the configuration 1s² 2s². At first the occupation of the 2s↓ level is changed from zero to one (from left to right). Then the occupation of the 2s↑ level is changed, and then that of the 2p↓ level. The local spin-density approximation is employed.

If the functions ϕk(r) are extended, n(r), and therefore also veff(r) and the values of εk, will not change significantly when the occupation of a level, fk, is changed. Then we have εk(fk = 1) ≈ εk(fk = 0.5), and the single-particle levels correspond to the ionization energies. This statement is practically identical to Koopmans’ theorem of Hartree-Fock theory.
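The transition-state idea of Eqs. (3.205) and (3.206) can be made concrete with a toy model for εk(f). The two eigenvalue curves below are hypothetical functions chosen only for illustration (they are not results of a DFT calculation); the integral of Eq. (3.205) is evaluated with a simple midpoint rule.

```python
def ionization_energy(eps, steps=1000):
    """I_k = -integral_0^1 eps(f) df, Eq. (3.205), by the midpoint rule."""
    h = 1.0 / steps
    return -sum(eps((i + 0.5) * h) for i in range(steps)) * h

# Hypothetical model eigenvalue curves in eV, for illustration only.
def eps_linear(f):
    return -18.4 + 12.0 * f

def eps_curved(f):
    return -18.4 + 12.0 * f + 1.5 * f * (1.0 - f)

for name, eps in (("linear", eps_linear), ("curved", eps_curved)):
    print(f"{name}: I = {ionization_energy(eps):.3f} eV, "
          f"-eps(0.5) = {-eps(0.5):.3f} eV, -eps(1) = {-eps(1.0):.3f} eV")
```

For an eigenvalue that is linear in the occupation, −εk(0.5) reproduces the integral exactly; for the mildly curved model it is off by about 0.1 eV, whereas the eigenvalue at full occupation, −εk(1), misses the ionization energy by several eV. This is the situation of localized states, where εk depends strongly on fk.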

Experience shows that although Ee0 = minn(r)Ev[n] is affected by the approximation to

the xc-functional, the ionization energy I (as an energy difference) is often rather accurate;the errors of the LDA partly cancel in the calculation. This is reasonable but not proven.We now exemplify this relation for the (extreme) example of the hydrogen atom. Thereis hardly any similarity between the hydrogen atom and the many-body problem of anextended solid. But it points towards the problems of the LDA and error compensation.In density-functional theory we have:

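$$ \left\{ -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\varepsilon_0}\frac{1}{r} + \frac{e^2}{4\pi\varepsilon_0}\int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' + v^{\rm xc}(\mathbf{r}) \right\} \varphi(\mathbf{r}) = \epsilon\,\varphi(\mathbf{r}) . \tag{3.207} $$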

We know that in the exact calculation for the ground state of the hydrogen atom we have

$$ \frac{e^2}{4\pi\varepsilon_0}\int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' + v^{\rm xc}(\mathbf{r}) = 0 , \tag{3.208} $$

because in a one-electron system there is no electron-electron interaction. The corresponding (exact) lowest energy value then is $\epsilon_1 = -13.6$ eV, that is minus the Rydberg energy. In the LDA, however, the terms in Eq. (3.208) do not cancel exactly. The eigenvalue obtained from the LDA therefore lies significantly above the exact eigenvalue: $\epsilon_1^{\rm LDA} = -6.4$ eV. Thus, for the hydrogen atom the jellium approximation for $v^{\rm xc}$ is very bad. Still, even here we see that differences of the total energies are rather good: $I \approx -\epsilon_1^{\rm LDA}(\tfrac{1}{2}) = +12.4$ eV.

Generally we have: Due to the poorly corrected self-interaction in the LDA, the eigenvalues $\epsilon_k^{\rm LDA}$ are too high, i.e., they correspond to ionization energies that are too small. The more localized a state is, the stronger this effect. Consequently, for localized states the $\epsilon_k$ are not a good approximation for ionization energies, but in general $\epsilon_k(0.5)$ is a rather good approximation.8

8 Taking into account the spin in the LSDA slightly improves the ionization energy of the hydrogen atom obtained at the “transition state” of half occupation: $I \approx -\epsilon_1^{\rm LSDA}(\tfrac{1}{2}) = 13.35$ eV, $\epsilon_1(1) = -7.32$ eV and $\Delta E^e = 13.1$ eV, for the H-atom.
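The mean-value argument behind Eq. (3.206) can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the eigenvalue varies linearly with the occupation $f_k$; the two anchor values are the LDA numbers for hydrogen quoted above, and all function names are hypothetical. With $\partial E/\partial f_k = \epsilon_k(f_k)$ (Janak's theorem), the ionization energy is $I = -\int_0^1 \epsilon_k(f)\,df$, which for a linear $\epsilon_k(f)$ equals $-\epsilon_k(0.5)$.

```python
import numpy as np

def eigenvalue(f, eps_full=-6.4, eps_half=-12.4):
    """Assumed linear model eps_k(f) in eV, fixed by eps_k(1) and eps_k(0.5)."""
    slope = (eps_full - eps_half) / 0.5
    return eps_half + slope * (f - 0.5)

def ionization_energy(n_steps=1000):
    """I = E(N-1) - E(N) = -integral_0^1 eps_k(f) df, evaluated as a midpoint sum."""
    f = (np.arange(n_steps) + 0.5) / n_steps   # midpoints of the sub-intervals
    return -np.mean(eigenvalue(f))

i_integrated = ionization_energy()          # integral over the occupation
i_transition_state = -eigenvalue(0.5)       # Slater-Janak transition state
i_full_occupation = -eigenvalue(1.0)        # eigenvalue of the neutral atom

print(i_integrated, i_transition_state, i_full_occupation)
```

In this linear model the transition-state value reproduces the integrated ionization energy, whereas the eigenvalue at full occupation underestimates it by roughly a factor of two.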

3.8.2 Spin Polarization

References: Kohn and Sham (1965), Barth, Hedin (1972), Rajagopal, Callaway (1973), Levy.

Strictly, the ground state density alone defines the full problem: The ground state density determines the many-body Hamilton operator, which determines everything. However, the dependence of the total energy on the density is very complicated, and possibly cannot be represented in a closed mathematical form. Therefore, it is reasonable to soften the puristic approach and to start with densities for spin up and spin down electrons as independent variables, especially for magnetic systems. This establishes the spin-density-functional theory (SDFT) and is an important and simple improvement of DFT with respect to the practicability and accuracy of real calculations. In SDFT, magnetic effects can also be described, whereas SDFT and DFT are identical for non-magnetic systems. In the spin-density-functional theory, the density matrix is used as the basic variable:

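$$ n_{s,s'}(\mathbf{r}) = \langle\Phi|\Psi_s^{+}(\mathbf{r})\,\Psi_{s'}(\mathbf{r})|\Phi\rangle . \tag{3.209} $$

Here, $s$ and $s'$ represent the spin orientations ↑ or ↓ of individual particles. $\Psi_s^{+}(\mathbf{r})$ and $\Psi_{s'}(\mathbf{r})$ are field operators associated with the creation of a particle at position $\mathbf{r}$ with spin $s$ and the annihilation of a particle at position $\mathbf{r}$ with spin $s'$. $|\Phi\rangle$ denotes the ground state of the $N$ electron system. The particle density (the basic variable in DFT) is

$$ n(\mathbf{r}) = n_\uparrow(\mathbf{r}) + n_\downarrow(\mathbf{r}) , \tag{3.210} $$

and the magnetization density is

$$ m(\mathbf{r}) = \mu_B \left\{ n_\uparrow(\mathbf{r}) - n_\downarrow(\mathbf{r}) \right\} . \tag{3.211} $$

Here, instead of $n_{\uparrow\uparrow}$ I have used only $n_\uparrow$ and instead of $n_{\downarrow\downarrow}$ only $n_\downarrow$. Thus, we only need the diagonal elements here. The Bohr magneton $\mu_B$ is defined as

$$ \mu_B = \frac{e\hbar}{2mc} . \tag{3.212} $$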

Now we want to use

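$$ n_\uparrow(\mathbf{r}) = \langle\Phi|\sum_{i=1}^{N} \delta_{s_i,\uparrow}\, \delta(\mathbf{r}-\mathbf{r}_i)|\Phi\rangle \tag{3.213} $$

and

$$ n_\downarrow(\mathbf{r}) = \langle\Phi|\sum_{i=1}^{N} \delta_{s_i,\downarrow}\, \delta(\mathbf{r}-\mathbf{r}_i)|\Phi\rangle \tag{3.214} $$

as basic variables. In the same way as in standard DFT, we obtain a single-particle equation:

$$ \left\{ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}_{s_i}(\mathbf{r}) \right\} \varphi_{o_i s_i}(\mathbf{r}) = \epsilon_{o_i s_i}\, \varphi_{o_i s_i}(\mathbf{r}) . \tag{3.215} $$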

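The wave functions $\varphi_{o_i\uparrow}$ and $\varphi_{o_i\downarrow}$ are now determined by two different equations, but these equations are coupled, because the effective potential depends on both $n_\uparrow$ and $n_\downarrow$:

$$ v^{\rm eff}_{s_i}(\mathbf{r}) = v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0}\int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' + v^{\rm xc}_{s_i}(\mathbf{r}) , \tag{3.216} $$

with

$$ v^{\rm xc}_{s_i}(\mathbf{r}) = \frac{\delta E^{\rm xc}[n_\uparrow, n_\downarrow]}{\delta n_{s_i}(\mathbf{r})} , \tag{3.217} $$

and

$$ n_{s_k}(\mathbf{r}) = \sum_{i=1}^{N} \delta_{s_k,s_i}\, |\varphi_{o_i s_i}(\mathbf{r})|^2 . \tag{3.218} $$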

The exchange-correlation potential now depends on the spin orientation. For practical calculations the local spin-density approximation is introduced:

$$ E^{\rm xc-LSDA} = \int \epsilon^{\rm xc-jellium}\big(n(\mathbf{r}), m(\mathbf{r})\big)\, n(\mathbf{r})\, d^3r . \tag{3.219} $$

Here, $\epsilon^{\rm xc}(n,m)$ is the exchange-correlation energy per particle of a homogeneous electronic system of constant particle density $n$ and constant magnetization density $m$.9 The exchange-correlation potential of the LSDA depends on the local electron density in a similar way as in LDA. However, the exchange and, to a lesser extent, the correlation contribution of $v^{\rm xc-LSDA}_{s_i}(\mathbf{r})$ or $\epsilon^{\rm xc-LSDA}$ additionally depend on the spin orientation.
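How the spin dependence enters can be sketched for the exchange part alone. The following lines assume Hartree atomic units and use the known exchange energy per particle of the homogeneous electron gas, $\epsilon^{\rm x}(n) = -\tfrac{3}{4}(3n/\pi)^{1/3}$, together with its exact spin scaling; correlation is left out, and the function names and the relative polarization $\zeta = (n_\uparrow - n_\downarrow)/n = m/(\mu_B n)$ are notation introduced only for this example.

```python
import math

def to_n_zeta(n_up, n_down):
    """Convert (n_up, n_down) to the total density n and the relative polarization zeta."""
    n = n_up + n_down
    return n, (n_up - n_down) / n

def exchange_energy_per_particle(n, zeta=0.0):
    """Exchange energy per particle of the homogeneous electron gas (Hartree atomic units)."""
    eps_x_unpolarized = -0.75 * (3.0 * n / math.pi) ** (1.0 / 3.0)
    # Spin scaling of exchange: E_x[n_up, n_down] = (E_x[2 n_up] + E_x[2 n_down]) / 2
    spin_factor = 0.5 * ((1.0 + zeta) ** (4.0 / 3.0) + (1.0 - zeta) ** (4.0 / 3.0))
    return eps_x_unpolarized * spin_factor

n, zeta = to_n_zeta(0.03, 0.01)
eps_x_partial = exchange_energy_per_particle(n, zeta)
eps_x_unpolarized = exchange_energy_per_particle(n, 0.0)
eps_x_polarized = exchange_energy_per_particle(n, 1.0)
print(n, zeta, eps_x_unpolarized, eps_x_partial, eps_x_polarized)
```

At fixed total density the exchange energy becomes more negative with increasing polarization; for the fully polarized gas it is larger in magnitude by the factor $2^{1/3}$.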

Apart from spin-density theory, other generalizations have also been investigated: velocity-dependent forces, spin-orbit coupling, relativistic formulation (→ Dirac equation). These will not be discussed here and we refer to the literature (cf. Rajagopal, Callaway, Phys. Rev. B 7, 1912 (1973); MacDonald, Vosko: J. Phys. C 11, L943 (1978); Rajagopal, J. Phys. C 11, L943 (1978)).

3.8.3 Two Examples

Finally, we will demonstrate for two examples which type of information can be obtained using DFT-LDA and SDFT-LSDA calculations. Later, during the detailed discussion and explanation of the nature of cohesion of solids, we will use such calculations again.

Before 1980, systematic high-quality DFT calculations were not possible, partly due to the lack of efficient and reliable algorithms, partly due to the lack of computational power. Therefore, it was often not understood how the electron density is distributed in the crystal and how the solid is stabilized. It was, for example, not clear why silicon exists in the diamond structure or why silver has an fcc structure. Using parameter-free, self-consistent DFT calculations an initial understanding was obtained. We are still at the beginning, but with good perspectives: Compared to the 1980s, much more powerful hardware is available (many orders of magnitude more powerful) and the efficiency of state-of-the-art algorithms has also increased dramatically.

The main advantage of such parameter-free, self-consistent DFT calculations is that the results can be analyzed in detail, i.e., one can determine which ingredients are essential for the stabilization of the solid and which are not.

9 Due to Eq. (3.210) and (3.211), the variables $(n_\uparrow, n_\downarrow)$ and $(n, m)$ can be used interchangeably.

Such theoretical investigations of static and low-frequency dynamical properties usually are performed via the self-consistent solution of the Kohn-Sham equation. The self-consistent field procedure is almost identical to the Hartree approximation discussed before (cf. Fig. 3.2), but now the effective potential additionally contains exchange and correlation. The only external parameter given (by the scientist) is the nuclear charge (i.e., the choice of the material, e.g. Si or Ag). In general, the lattice geometry will be varied in order to find the most stable geometry, i.e., the lowest energy structure, and to analyze $n(\mathbf{r})$ for this structure. For these calculations, there are still several serious practical problems that are not visible in Fig. 3.2. These problems are:

1. The solution of the effective Schrödinger equation. For this purpose suitable methods have to be developed (cf. Chapter 5). Such an equation cannot be solved analytically (except for the hydrogen atom and the linear harmonic oscillator).

2. The calculation of $n(\mathbf{r})$ as an integral or the summation of the $|\varphi_{o_i}(\mathbf{r})|^2$, respectively.

3. The solution of the Poisson Eq. (3.104) for arbitrary charge densities $-e\,n(\mathbf{r})$.

4. Approximations for the exchange-correlation functional $E^{\rm xc}[n]$.
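The structure of such a self-consistent field loop can be sketched with a deliberately simple toy model. The example below is not a DFT calculation: it assumes one dimension, atomic units, a harmonic external potential, and a contact interaction $g\,n(x)$ that merely stands in for the Hartree and exchange-correlation potentials. All names and parameter values are hypothetical. It shows the cycle "density → effective potential → single-particle equation → new density" with linear mixing.

```python
import numpy as np

def scf_toy(n_elec=2, g=0.5, n_grid=201, box=10.0, mix=0.3, tol=1e-8, max_iter=500):
    """Toy self-consistent loop; returns grid, density, occupied eigenvalues, iterations."""
    x = np.linspace(-box / 2, box / 2, n_grid)
    dx = x[1] - x[0]
    # Kinetic operator -1/2 d^2/dx^2 by central finite differences.
    off = np.ones(n_grid - 1)
    kinetic = -0.5 * (np.diag(off, 1) + np.diag(off, -1) - 2.0 * np.eye(n_grid)) / dx**2
    v_ext = 0.5 * x**2                     # assumed external potential v(x)
    n = np.full(n_grid, n_elec / box)      # crude start density
    n_occ = n_elec // 2                    # number of doubly occupied orbitals
    for iteration in range(1, max_iter + 1):
        v_eff = v_ext + g * n              # toy stand-in for Hartree + xc potential
        eps, phi = np.linalg.eigh(kinetic + np.diag(v_eff))
        phi = phi / np.sqrt(dx)            # normalize so that sum |phi|^2 dx = 1
        n_new = 2.0 * np.sum(phi[:, :n_occ] ** 2, axis=1)   # density from orbitals
        change = np.max(np.abs(n_new - n))
        n = (1.0 - mix) * n + mix * n_new  # linear mixing damps oscillations
        if change < tol:
            return x, n, eps[:n_occ], iteration
    raise RuntimeError("SCF loop did not converge")

x, n, eps, n_iter = scf_toy()
dx = x[1] - x[0]
print(n_iter, eps[0], np.sum(n) * dx)
```

Mixing only a fraction of the new density into the old one is a common way to stabilize the iteration; production codes use more elaborate mixing schemes and, of course, the true Hartree and exchange-correlation potentials.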

Figure 3.17 shows a “historic figure”, namely what I consider to be the first convincing example demonstrating what type of problems can be tackled by density-functional theory calculations10. These are calculations performed in the group of Marvin Cohen in Berkeley. The figure shows the total energy for silicon as a function of the volume per atom, where the volume was normalized such that 1.0 is that of the experimentally known result for Si in the diamond structure. The results show clearly that the lowest energy of all considered structures is indeed found for the diamond structure, and the minimum of the theoretical curve is very close to the experimental result. The figure further reveals that there is a phase transition that eventually brings the system into the beta-tin structure when the volume is reduced. The slope of the common tangent of the two curves for the beta-tin and the diamond structures gives the pressure at which the phase transition sets in. This common tangent is called the Gibbs construction. Such calculations can predict and explain why solids behave as they do, and new materials of hitherto unknown structure or composition can be investigated as well.
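The common-tangent construction can be made concrete with a minimal sketch. It assumes that each E(V) curve is a parabola, $E_i(V) = e_i + a_i (V - v_i)^2$, and uses arbitrary toy numbers (not the silicon data of Fig. 3.17). Minimizing the enthalpy $H_i = E_i(V) + PV$ over $V$ gives $H_i(P) = e_i + P v_i - P^2/(4a_i)$; the transition pressure is where the two enthalpies coincide, which is equivalent to a common tangent of slope $-P$.

```python
import math

def energy(v, e0, v0, a):
    """Parabolic model E(V) = e0 + a (V - v0)^2 (arbitrary units)."""
    return e0 + a * (v - v0) ** 2

def transition_pressure(e1, v1, a1, e2, v2, a2):
    """Lowest positive pressure at which the enthalpies of both phases are equal."""
    qa = 1.0 / (4.0 * a2) - 1.0 / (4.0 * a1)
    qb = v1 - v2
    qc = e1 - e2
    if abs(qa) < 1e-14:                    # equal curvatures: linear equation
        roots = [-qc / qb]
    else:
        disc = qb * qb - 4.0 * qa * qc
        if disc < 0.0:
            raise ValueError("no common tangent for these parameters")
        sq = math.sqrt(disc)
        roots = [(-qb + sq) / (2.0 * qa), (-qb - sq) / (2.0 * qa)]
    positive = [p for p in roots if p > 0.0]
    if not positive:
        raise ValueError("no transition at positive pressure")
    return min(positive)                   # assumption: lowest root is the physical one

# Toy parameters: phase 1 is stable at zero pressure, phase 2 is denser.
e1, v1, a1 = 0.0, 1.00, 1.0
e2, v2, a2 = 0.1, 0.75, 1.5

p_t = transition_pressure(e1, v1, a1, e2, v2, a2)
va = v1 - p_t / (2.0 * a1)                 # tangent point on curve 1 (dE/dV = -P)
vb = v2 - p_t / (2.0 * a2)                 # tangent point on curve 2
print(p_t, va, vb)
```

The two tangent volumes `va` and `vb` mark the volume jump at the transition; with real data one would first fit an equation of state to the computed energies.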

In Fig. 3.18, I show a more recent example, which demonstrates how density-functional theory calculations can tell us things about geology that we cannot learn otherwise. Reliable knowledge about the earth exists only for the crust and the upper mantle. However, it is interesting and indeed important to know more about the central region of our planet, as this, for example, contains information about the origin, the development, and the future of the earth. One aspect here is also the question of what determines the earth’s magnetic field and its fluctuations and changes. The structure of the earth (left side in Figure 3.18) is known from measurements of the propagation, time delays, and phase differences of earthquake waves, as these are reflected where the composition or the aggregate state in the earth changes. The inner core of the earth is most likely solid, and it is surrounded by the outer core, which is liquid. We know the depth, and we can also estimate quite accurately the pressure that is present at the phase boundary between the solid and the liquid core. The material down there is mostly iron, probably with some fraction of O, S, Se and C. Unclear, however, is the temperature at this place.

10 Another early, impressive example of the power of density-functional theory calculations is the book by V.L. Moruzzi, J.F. Janak, and A.R. Williams, “Calculated Electronic Properties of Metals”, Pergamon Press (1978).

Figure 3.17: Density-functional theory calculations (using the local-density approximation for the exchange-correlation energy) of the total energy for various crystal structures of silicon as a function of the volume per atom. The volume axis is normalized such that the value 1.0 corresponds to the experimental result known for the diamond structure of Si. (M.T. Yin and M.L. Cohen, “Theory of static structural properties, crystal stability, and phase transformations: Application to Si and Ge”, Phys. Rev. B 26, 5668-5687 (1982)).

In fact, we don’t know at what temperature iron melts when it is put under such a high pressure of 330 GPa, and we have no idea how such a melt may behave. What is the local structure and what is the viscosity of the melt at such extreme conditions? The problem is that such pressures can hardly be reached in the laboratory. With a diamond anvil cell one gets somewhat close, i.e., to 200 GPa, but not to 330 GPa.

Density-functional theory calculations by Alfè et al. (see also http://chianti.geol.ucl.ac.uk/~dario/ and http://chianti.geol.ucl.ac.uk/~dario/resint.htm) have shown that the melting temperature of iron at 330 GPa is 6670 K (cf. Fig. 3.18, right). Thus, this must be the temperature at the interface between the inner and the outer core. In simple words one could say that DFT was used as a thermometer to determine the temperature at an inaccessible place.

Figure 3.18: A cut through the earth showing the various shells (left) and the calculated melting curve for iron (right). (D. Alfè, G.D. Price, and M.J. Gillan, “Melting curve of Iron at Earth’s core pressures from ab-initio calculations”, Nature 401, 462-464 (1999).)

Furthermore, the authors studied the viscosity. The previously existing experimental estimates differed by many orders of magnitude. The DFT work showed that liquid iron in the outer core should have a local coordination similar to that of the hcp structure, and that the viscosity is higher than that of liquid iron at standard pressure by only a factor of 10. This is actually at the lowest end of the previous experimental estimates. Of course there are also some uncertainties in the theoretical result. These arise because a somewhat small supercell was used, the exchange-correlation functional was, of course, treated approximately, and the authors studied pure iron, i.e., without the O, S, etc. fractions that must be there as well. Altogether the uncertainty of the calculated viscosity may be a factor of 3. This is still a much lower uncertainty than that of experimental studies.

It is now clear that in the outer core local circulations and turbulent convection will occur. For most of the previous, experimentally estimated values of the viscosity this would not be possible.

3.9 Summary (Electron-Electron Interaction)

Chapter 3 is only concerned with the properties of the electronic ground state, e.g. the basic equations that one has to solve to learn about the total energy (internal energy), charge density of the electrons, screening, lattice structure, lattice constant, elastic properties, lattice vibrations, and approximate electronic excitations. In the following summary of the most important equations, we assume for clarity that spin polarization is absent, i.e.,

$$ n_\uparrow(\mathbf{r}) = n_\downarrow(\mathbf{r}) = \frac{n(\mathbf{r})}{2} . \tag{3.220} $$

The Hamilton operator of the electrons is

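$$ H^e = T^e + V^{e-{\rm Nuc}} + V^{e-e} \tag{3.221} $$

$$ \phantom{H^e} = \sum_{i=1}^{N} -\frac{\hbar^2}{2m}\nabla^2_{\mathbf{r}_i} + \sum_{i=1}^{N} v(\mathbf{r}_i) + \frac{1}{2}\,\frac{e^2}{4\pi\varepsilon_0} \sum_{\substack{i,j=1 \\ i\neq j}}^{N,N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} . \tag{3.222} $$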

The total energy is given by

$$ E_0 = E_0^e + E^{\rm Nuc-Nuc} \quad \text{with} \quad E_0^e = \min_{\Phi}\, \langle\Phi|H^e|\Phi\rangle . \tag{3.223} $$

The density-functional theory of Hohenberg and Kohn means that the many-body Schrödinger equation with the Hamilton operator (Eq. (3.221)) can be transformed into a self-consistent field theory. Hohenberg and Kohn have shown that

$$ \langle\Phi|H^e|\Phi\rangle = \int v(\mathbf{r})\, n(\mathbf{r})\, d^3r + F[n] = E_v[n] , \tag{3.224} $$

with the particle density of the electrons given by

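$$ n(\mathbf{r}) = \sum_{i=1}^{N} \langle\Phi|\delta(\mathbf{r}-\mathbf{r}_i)|\Phi\rangle , \tag{3.225} $$

and

$$ F[n] = \langle\Phi|T^e + V^{e-e}|\Phi\rangle . \tag{3.226} $$

For a given external potential $v(\mathbf{r})$ and taking into account the conservation of the number of particles ($\int n(\mathbf{r})\, d^3r = N$), $E_v[n]$ assumes a minimum at the correct particle density $n(\mathbf{r})$, and its value there is $E_0^e$, the energy of the electronic ground state. We have:

$$ \frac{\delta T_s[n]}{\delta n(\mathbf{r})} + v^{\rm eff}(\mathbf{r}) = \mu . \tag{3.227} $$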

Here $T_s[n]$ is the kinetic energy functional of independent particles. It must not be confused with $T^e[n] = \langle\Phi|T^e|\Phi\rangle$, which is defined for interacting systems. We have:

$$ \frac{\delta T_s[n]}{\delta n(\mathbf{r})} = \frac{\hbar^2}{2m}\left(3\pi^2 n(\mathbf{r})\right)^{2/3} + O\left(\nabla n(\mathbf{r})\right) . \tag{3.228} $$

It should be noted, however, that this series expansion is quite inaccurate.
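For a homogeneous electron gas the gradient terms vanish, and Eq. (3.227) together with the leading term of Eq. (3.228) states that $\mu - v^{\rm eff}$ equals the Fermi energy $(\hbar^2/2m)(3\pi^2 n)^{2/3}$. The short sketch below evaluates this term in Hartree atomic units ($\hbar = m = 1$). The density parameter $r_s$ (radius of the sphere that contains one electron, in bohr) and the function names are introduced only for this example.

```python
import math

HARTREE_IN_EV = 27.211386   # conversion factor from Hartree to eV

def density_from_rs(r_s):
    """Homogeneous density n = 3 / (4 pi r_s^3), with r_s in bohr."""
    return 3.0 / (4.0 * math.pi * r_s**3)

def thomas_fermi_kinetic_potential(n):
    """Leading term of Eq. (3.228) in Hartree atomic units."""
    return 0.5 * (3.0 * math.pi**2 * n) ** (2.0 / 3.0)

fermi_energy_rs1 = thomas_fermi_kinetic_potential(density_from_rs(1.0)) * HARTREE_IN_EV
fermi_energy_rs2 = thomas_fermi_kinetic_potential(density_from_rs(2.0)) * HARTREE_IN_EV
print(fermi_energy_rs1, fermi_energy_rs2)
```

The result scales as $1/r_s^2$, i.e., as $n^{2/3}$, so halving the electron spacing quadruples the local kinetic contribution.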

For veff(r) we have:

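$$ v^{\rm eff}(\mathbf{r}) = v(\mathbf{r}) + \frac{e^2}{4\pi\varepsilon_0}\int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' + \frac{\delta E^{\rm xc}[n]}{\delta n(\mathbf{r})} . \tag{3.229} $$

$E^{\rm xc}[n]$ is the exchange-correlation functional. We have:

$$ E^{\rm xc}[n] = \langle\Phi|H^e|\Phi\rangle - \int v(\mathbf{r})\, n(\mathbf{r})\, d^3r - T_s[n] - \frac{1}{2}\,\frac{e^2}{4\pi\varepsilon_0}\iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r\, d^3r' . \tag{3.230} $$

The quantity $\mu$ introduced as a Lagrange parameter in Eq. (3.227) is the chemical potential of the electrons.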

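The particle density of Eq. (3.225) can also be determined using the Kohn-Sham equation:

$$ \left\{ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \right\} \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r}) \tag{3.231} $$

with

$$ n(\mathbf{r}) = \sum_{i=1}^{N} |\varphi_{o_i}(\mathbf{r})|^2 . \tag{3.232} $$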

Equation (3.227) means: In principle it is possible to determine $n(\mathbf{r})$ directly from $v(\mathbf{r})$, i.e., the many-body wave function $\Phi(\mathbf{r}_1\sigma_1, \ldots, \mathbf{r}_N\sigma_N)$ is not required explicitly. At this point we also mention the Hellmann-Feynman theorem, which was already derived in the late 1930s. It is also exact:

The Hellmann-Feynman theorem states that the forces on the atoms are purely due to the electrostatic interaction between the potential of the nuclei and the self-consistent electron density $n(\mathbf{r})$.

The problem with the calculation is that the functional $T_s[n]$ is not explicitly known, or that the known approximations are inaccurate or very complicated. Although the functional $T_s[n]$ cannot be given in a closed mathematical form, its value is calculated exactly when the Kohn-Sham equation is used. Experience shows that the exact calculation of $T_s[n]$ is very important.

$E^{\rm xc}[n]$ is also not known exactly. A known approximation for $E^{\rm xc}[n]$ and for $v^{\rm xc}(\mathbf{r}) = \delta E^{\rm xc}[n] / \delta n(\mathbf{r})$ is the “Local-Density Approximation” (LDA).

Equation (3.231) means that the many-body problem of the Hamilton operator of Eq. (3.221) can be brought into the form of a single-particle equation to be solved self-consistently. The potential $v^{\rm eff}(\mathbf{r})$, in which the $N$ independent quasi particles $\varphi_{o_i}(\mathbf{r})$ move, is local (i.e., it is multiplicative) and is identical for all particles. In practical calculations, the only approximation introduced concerns $E^{\rm xc}[n]$.

Approximations:

1. Local-density approximation (LDA):

$$ E^{\rm xc}[n] = \int \epsilon^{\rm xc}[n](\mathbf{r})\, d^3r \;\rightarrow\; E^{\rm xc-LDA}[n] = \int \epsilon^{\rm xc-LDA}\big(n(\mathbf{r})\big)\, n(\mathbf{r})\, d^3r \tag{3.233} $$

$\epsilon^{\rm xc-LDA}(n)$ is the exchange-correlation energy per particle of the homogeneous electron gas of density $n$. Strictly, Eq. (3.233) is valid only for slowly varying densities. Experience with this approximation for calculations of atoms, molecules and solids shows that Eq. (3.233) in general can also be applied to these systems.

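2. The Hartree-Fock approximation is obtained from Eq. (3.229) and (3.231), when

$$ \frac{\delta E^{\rm xc}[n]}{\delta n(\mathbf{r})} = v^{\rm xc}(\mathbf{r}) \tag{3.234} $$

is replaced by

$$ v^{\rm x}_k(\mathbf{r}) = -\frac{e^2}{4\pi\varepsilon_0}\int \frac{n^{\rm HF}_k(\mathbf{r},\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' . \tag{3.235} $$

$$ n^{\rm HF}_k(\mathbf{r},\mathbf{r}') = \sum_{i=1}^{N} \delta_{s_i,s_k}\, \frac{\varphi^*_{o_i s_i}(\mathbf{r}')\, \varphi_{o_k s_k}(\mathbf{r}')\, \varphi_{o_i s_i}(\mathbf{r})}{\varphi_{o_k s_k}(\mathbf{r})} \tag{3.236} $$

is called the exchange particle density. This approximation is obtained if the many-body wave function is constructed from one Slater determinant.

Problems:

a) $v^{\rm x}_k(\mathbf{r})$ depends on the index (quantum number) of the wave function to be calculated.

b) $v^{\rm x}_k(\mathbf{r})$ contains only exchange, i.e., the correlation of the electrons due to the Pauli principle. The correlation arising from the Coulomb repulsion between electrons is missing.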

3. The Hartree approximation is obtained when $E^{\rm xc}[n]$ and $v^{\rm xc}(\mathbf{r})$ are neglected. Strictly, $v^{\rm xc}(\mathbf{r})$ should be replaced by

$$ v^{\rm SIC}_k(\mathbf{r}) = -\frac{e^2}{4\pi\varepsilon_0}\int \frac{|\varphi_{o_k}(\mathbf{r}')|^2}{|\mathbf{r}-\mathbf{r}'|}\, d^3r' , \tag{3.237} $$

which is, however, typically ignored. This approximation is obtained if the many-body wave function is constructed as a simple product of single-particle functions. Problem: The Hartree approximation contains no electron correlation.

4. The Thomas-Fermi approximation is obtained from Eq. (3.227), (3.228) and (3.230), if the following approximations are introduced:

a) $O(\nabla n)$ is neglected in $T_s[n]$,

b) $O(\nabla n)$ is neglected in $E^{\rm xc}[n]$.

Problems: The approximation for $T_s[n]$ generally yields an error of 10% in the total energy. The shell structure of the atoms is not described.

4 Lattice Periodicity

4.1 Lattice Periodicity

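In part 3 we saw that the many-body problem can be reduced exactly to the self-consistent solution of effective single-particle equations:

$$ h\, \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r}) \tag{4.1} $$

with

$$ h = -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) . \tag{4.2} $$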

In the effective potential, the electrostatic potential of the nuclei depends on the atomic positions, whereas the Hartree and the exchange-correlation potential are determined by the charge density of the electrons. Since the densities depend on the positions of the atoms, the symmetry properties of $v^{\rm eff}(\mathbf{r})$ are determined by the arrangement of the lattice components (nuclei), i.e., by the symmetry of $v(\mathbf{r})$. Note that this does not always mean that the symmetry of $v^{\rm eff}(\mathbf{r})$ has to be the same as that of any given system of nuclei. In principle, the charge density (and thus, $v^{\rm eff}(\mathbf{r})$) could have a lower symmetry than a given arrangement of nuclei.1 However, this will typically lead to residual forces on the nuclei, “pulling” them into the same (lower) symmetry state as $v^{\rm eff}(\mathbf{r})$. In general, the symmetries of $v(\mathbf{r})$, $n(\mathbf{r})$ and $v^{\rm eff}(\mathbf{r})$ will thus be consistent with one another when the nuclei are at their equilibrium positions. Since the operator $\nabla^2$ is invariant with respect to translation, rotation and inversion in real space, the symmetry of $h$ is determined only by $v^{\rm eff}(\mathbf{r})$. Now we will see what we can learn from such investigations of the symmetry.

In order to study the properties depending on the periodic arrangement of the atoms, we first have to introduce several definitions. The fundamental property of a crystal or a crystalline solid is the regular arrangement of its constituents, i.e., the nuclei. “Periodicity” and “order” are not synonyms, and the most recent definition by the “International Union of Crystallography” therefore reads: “A crystal is a solid having an essentially discrete diffraction pattern.” Periodic crystals form a subset. At this point we note that in nature crystals are more frequent than expected: Not only diamond and quartz are crystals. Metals, too, often have a crystalline structure, although their outer shape usually is not as pronounced as, e.g., for salts or for minerals.

1 So-called spin or charge density waves in periodic crystals are an example for cases where the nuclei may have a different (higher) translational symmetry (see below) than the resulting $v^{\rm eff}(\mathbf{r})$. Examples are the so-called Peierls instability, or the magnetic ground state of Cr, where the periodicity of the electronic spin density extends over many unit cells of the actual nuclear subsystem.

Figure 4.1: A two-dimensional Bravais lattice with primitive vectors $\mathbf{a}_1$, $\mathbf{a}_2$. The choice of the primitive vectors is not unique.

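A periodic crystal is characterized by the fact that it is mapped onto itself by certain translations. A translation is defined by a vector

$$ \mathbf{R}_n = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3 , \tag{4.3} $$

where $n_i \in \mathbb{Z}$ and the vectors $\mathbf{a}_i$ are linearly independent. For the translation operator we have

$$ T_{\mathbf{R}_n} f(\mathbf{r}) = f(\mathbf{r} + \mathbf{R}_n) , \tag{4.4} $$

where $f(\mathbf{r})$ is an arbitrary function. We have

$$ T_{\mathbf{R}_n} v^{\rm eff}(\mathbf{r}) = v^{\rm eff}(\mathbf{r} + \mathbf{R}_n) = v^{\rm eff}(\mathbf{r}) \tag{4.5} $$

and

$$ T_{\mathbf{R}_n} \nabla^2 f(\mathbf{r}) = \nabla^2 f(\mathbf{r} + \mathbf{R}_n) = \nabla^2 T_{\mathbf{R}_n} f(\mathbf{r}) . \tag{4.6} $$

This means that $T_{\mathbf{R}_n}$ and $h$ commute. Then for the Hamilton operator $h$ we have

$$ T_{\mathbf{R}_n} h\, \varphi_{o_i}(\mathbf{r}) = h\, \varphi_{o_i}(\mathbf{r} + \mathbf{R}_n) = h\, T_{\mathbf{R}_n} \varphi_{o_i}(\mathbf{r}) = \epsilon_{o_i}\, \varphi_{o_i}(\mathbf{r} + \mathbf{R}_n) . \tag{4.7} $$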
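The commutation of $T_{\mathbf{R}_n}$ and $h$ can be checked numerically. The sketch below assumes a one-dimensional periodic grid, atomic units, a finite-difference Laplacian, and a cosine potential with the lattice period; all names and parameters are hypothetical choices for illustration. A shift by one lattice constant commutes with $h$, a shift by half a lattice constant does not.

```python
import numpy as np

n_cells, points_per_cell = 4, 16
n_grid = n_cells * points_per_cell
a = 1.0                                    # lattice constant
dx = a / points_per_cell
x = np.arange(n_grid) * dx

identity = np.eye(n_grid)
# Periodic finite-difference Laplacian.
laplacian = (np.roll(identity, 1, axis=0) + np.roll(identity, -1, axis=0)
             - 2.0 * identity) / dx**2
v_eff = np.cos(2.0 * np.pi * x / a)        # assumed lattice-periodic potential
h = -0.5 * laplacian + np.diag(v_eff)

def translation(shift_points):
    """Matrix of T with (T f)(x_j) = f(x_j + shift_points * dx), cf. Eq. (4.4)."""
    return np.roll(identity, -shift_points, axis=0)

t_lattice = translation(points_per_cell)        # shift by a lattice vector
t_half = translation(points_per_cell // 2)      # shift by half a lattice vector

commutator_lattice = np.max(np.abs(t_lattice @ h - h @ t_lattice))
commutator_half = np.max(np.abs(t_half @ h - h @ t_half))
print(commutator_lattice, commutator_half)
```

The kinetic part commutes with every shift; the non-vanishing commutator for the half shift stems entirely from the potential, in line with the statement that the symmetry of $h$ is determined by $v^{\rm eff}$.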

The vectors $\mathbf{a}_i$ introduced above are called primitive vectors. The set of points defined by $\mathbf{R}_n$ is called a Bravais lattice (cf. Fig. 4.1). For obvious reasons the term Bravais lattice is often also used for the set of vectors $\mathbf{R}_n$. The choice of the primitive vectors is not unique; generally the shortest primitive translations are chosen. The points of the Bravais lattice do not need to correspond to the positions of individual atoms. As a warning we mention that not every apparently symmetric set of points constitutes a Bravais lattice (cf. the example in Fig. 4.2). Apart from translations, which shift all points in space, generally the structure of a crystal is also invariant with respect to symmetry operations that keep at least one point fixed, so-called point symmetries (details will be given later).
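That the honeycomb structure is not a Bravais lattice can be verified with a few lines. The sketch assumes a two-dimensional hexagonal Bravais lattice with lattice constant 1 and a two-atom basis at $\mathbf{0}$ and $(\mathbf{a}_1 + \mathbf{a}_2)/3$; the helper names are hypothetical. In a Bravais lattice, if $\mathbf{d}$ connects two points, then $2\mathbf{d}$ starting from a lattice point must again end on a lattice point. For the honeycomb sites this fails.

```python
import numpy as np

a1 = np.array([1.0, 0.0])
a2 = np.array([0.5, np.sqrt(3.0) / 2.0])
basis = [np.zeros(2), (a1 + a2) / 3.0]     # two atoms per primitive cell

def lattice_coordinates(r):
    """Solve r = n1 a1 + n2 a2 for (n1, n2)."""
    return np.linalg.solve(np.column_stack((a1, a2)), r)

def is_honeycomb_site(r, tol=1e-9):
    """True if r coincides with an atom of the honeycomb structure."""
    for b in basis:
        c = lattice_coordinates(r - b)
        if np.all(np.abs(c - np.round(c)) < tol):
            return True
    return False

d = basis[1]                               # vector between the two basis atoms
print(is_honeycomb_site(d))                # a nearest-neighbor atom
print(is_honeycomb_site(2.0 * d))          # repeating the same step leaves the structure
print(is_honeycomb_site(a1 + 2.0 * a2))    # Bravais lattice vectors always work
```

Only the translations $\mathbf{R}_n$ of the underlying hexagonal lattice map the structure onto itself; the step between the two basis atoms does not.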

Figure 4.2: The crossing points of the honeycomb structure do not form a Bravais lattice, but the centers of the dumbbells do (primitive vectors $\mathbf{a}_1$, $\mathbf{a}_2$). Thus, the honeycomb structure is described as a Bravais lattice with basis, i.e., for each point of the Bravais lattice there are in this case two atoms, which in this context are called the basis.

The smallest structural unit of a crystal is called the primitive cell or the primitive unit cell. If the primitive unit cell is shifted by all vectors of the Bravais lattice, the full space is filled without gaps or overlap. Similar to the definition of primitive vectors, the definition of the primitive unit cell is not unique. A primitive unit cell contains exactly one point of the Bravais lattice. Thus, a possible choice for the primitive unit cell would be the body spanned by the shortest primitive vectors. This choice has the disadvantage that the primitive unit cell defined this way often does not have the same symmetry (point symmetry) as the Bravais lattice. But there is always a primitive unit cell which has the same symmetry with respect to reflections, rotations and inversion as the Bravais lattice. This is the Wigner-Seitz cell: It consists of the region which is closer to a certain Bravais lattice point than to all other Bravais lattice points. For the construction of the Wigner-Seitz cell, one starts with an arbitrary point of the Bravais lattice and connects this lattice point with its neighbors. In the middle of each connecting line a plane perpendicular to this line is constructed. The smallest region around the starting point that is enclosed by these planes is the Wigner-Seitz cell. A two-dimensional example is shown in Fig. 4.3, and some three-dimensional examples are shown in Fig. 4.4.
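The defining property of the Wigner-Seitz cell translates directly into a membership test. The following sketch is an illustration with hypothetical names; it assumes that checking lattice points with $|n_i| \le 2$ is sufficient, which holds for the compact cells used here but not for arbitrarily skewed primitive vectors.

```python
import itertools
import numpy as np

def in_wigner_seitz_cell(r, primitive_vectors, n_max=2, tol=1e-12):
    """True if r is at least as close to the origin as to any other lattice point."""
    r = np.asarray(r, dtype=float)
    vectors = np.asarray(primitive_vectors, dtype=float)
    for n in itertools.product(range(-n_max, n_max + 1), repeat=len(vectors)):
        if not any(n):
            continue                        # skip the origin itself
        lattice_point = np.dot(n, vectors)  # R_n = sum_i n_i a_i
        if np.linalg.norm(r) > np.linalg.norm(r - lattice_point) + tol:
            return False
    return True

square = [[1.0, 0.0], [0.0, 1.0]]
hexagonal = [[1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0]]

print(in_wigner_seitz_cell([0.4, 0.4], square))      # inside the square cell
print(in_wigner_seitz_cell([0.6, 0.0], square))      # closer to the point (1, 0)
print(in_wigner_seitz_cell([0.0, 0.55], hexagonal))  # towards a corner of the hexagon
print(in_wigner_seitz_cell([0.55, 0.0], hexagonal))  # beyond a face of the hexagon
```

Each comparison with a lattice point corresponds to one perpendicular bisector plane of the geometric construction.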

Often it is more illustrative to construct a crystal structure not from the primitive cell but from larger unit cells instead, so-called “conventional unit cells”. Four important examples for Bravais lattices are the sc (simple cubic), fcc (face-centered cubic), bcc (body-centered cubic) and the hexagonal Bravais lattice (cf. Fig. 4.4).

Apart from translations $T_{\mathbf{R}_n}$, there may be further symmetry operations of the crystal:

1) $R_\varphi$ rotation

1a) $C_n$ normal rotation by $\varphi = 2\pi/n$

1b) $S_n$ improper rotation

2) $\sigma$ reflection

3) $i = S_2$ inversion

4) $T R_\varphi$ screw rotation

5) $T \sigma$ glide reflection

Figure 4.3: The Wigner-Seitz cell of a square net of points.

Generally, the term “rotation” includes “normal rotations” as well as improper rotations. An improper rotation is the following combination of operations: First, rotate about a certain axis by the angle $\varphi$ and then reflect at the plane perpendicular to this axis. Screw rotation and glide reflection are combinations of rotations and reflections with non-primitive translations. An example of a glide reflection is shown in Fig. 4.5.

If we want to distinguish between proper and improper rotations, instead of the symbol $R_\varphi$ we use the following symbols: Operator of the normal rotation: $C_n$. Operator of the improper rotation: $S_n$. The letter C results from “cyclic”. The index $n$ gives the rotation angle $\varphi$ as $\varphi = 2\pi/n$.

The operation of inversion at the origin, i.e., $x \to -x$, $y \to -y$, $z \to -z$, is labeled by the letter $i$, and we have $i = S_2$. Reflections are labeled by the letter $\sigma$. They can be composed of a rotation and an improper rotation: $\sigma = C_n^{-1} \otimes S_n$.

Figure 4.4: Some Bravais lattices and the corresponding Wigner-Seitz cells.

It can easily be seen that the set of symmetry operations of a body has group properties. Therefore, we have the four laws (O,A,N,I):

1) There is an operation ⊗ :

a, b ∈ G→ a⊗ b = c ∈ G

2) The associative law is valid:

a⊗ (b⊗ c) = (a⊗ b)⊗ c

3) There is a neutral element E:

a⊗ E = E ⊗ a = a

4) For each element in G there is an inverse element:

a⊗ a−1 = E = a−1 ⊗ a

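For the elements of the Bravais lattice, i.e., for the translations, additionally we have the commutative law $a \otimes b = b \otimes a$:

$$ T_{\mathbf{R}_m} + T_{\mathbf{R}_n} = T_{\mathbf{R}_n} + T_{\mathbf{R}_m} = T_{\mathbf{R}_n + \mathbf{R}_m} \tag{4.8} $$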

Thus, the translational group is Abelian.

A subset $U$ of $G$ that is closed with respect to the operation $\otimes$ and itself has group properties is called a subgroup. The set of elements which is generated by operating all elements of $U$ on a given element $a$ of the group is called a coset (notation $a \otimes U$). The cosets themselves (apart from $U$ itself) are not groups. For non-Abelian groups one has to distinguish between right ($U \otimes a$) and left cosets ($a \otimes U$). If, in a special case, right and left cosets of a subgroup $U$ are the same, the subgroup $U$ is called a normal divisor (normal subgroup) of $G$.

Figure 4.5: The system of two atom types is mapped onto itself by a combination of a translation $T$ and a reflection $\sigma$ at a mirror plane (glide reflection symmetry). The translation $T$ and the reflection $\sigma$ alone would not be symmetry operations. In this example the glide reflection is identical to a screw rotation. For the screw rotation a translation and a rotation (here by 180°) are combined.

As an example for a point symmetry, we now investigate the point group of a cube. The group is labeled $O_h$, the letter O referring to “octahedra”. This point group is rather important. Many important crystals have this point symmetry or at least the symmetry of a subgroup of $O_h$. The sc, fcc, and bcc Bravais lattices have $O_h$ symmetry.

In Table 4.1 the symmetry operations are described. Reflections do not appear explicitly in the table, but they are included in the symmetry operations of $O_h$. We have $\sigma_v = i \otimes C_2$ and $\sigma_d = i \otimes C_2'$. The index at $\sigma$ indicates if the plane is crossing the cube vertically (v) or diagonally (d).

Another term we introduce is the class of conjugate elements (often just called class). Two symmetry operations $a$ and $b$ are part of such a class if there is an element $c$ of this group, so that we have

$$ a = c^{-1} b\, c . \tag{4.9} $$

The group elements $a$ and $b$ are then called “similar symmetry operations” or “conjugate operations”. Symmetry operations of the “same kind” are in one class. In Table 4.1 we already intuitively summarized the symmetry operations according to classes. Only in the last row we considered 5 classes together. This can easily be validated. The group $O_h$ has 10 classes. The group $O$ (this is the subgroup of $O_h$ which does not contain $i$) has 5 classes. The neutral element is always a class by itself.
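The class structure can indeed be validated by brute force. The sketch below represents the operations of the cube as the 48 signed 3×3 permutation matrices (an assumption about the representation, with the cube axes along x, y, z), checks closure, and sorts the elements into classes according to Eq. (4.9). The function names are hypothetical.

```python
import itertools
import numpy as np

def cube_operations():
    """All signed 3x3 permutation matrices: the 48 operations of O_h."""
    operations = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            m = np.zeros((3, 3), dtype=int)
            for row in range(3):
                m[row, perm[row]] = signs[row]
            operations.append(m)
    return operations

def as_key(m):
    """Hashable representation of a matrix."""
    return tuple(int(v) for v in m.ravel())

def conjugacy_classes(operations):
    """Group the operations into classes {c^-1 b c}, cf. Eq. (4.9)."""
    seen, classes = set(), []
    for b in operations:
        if as_key(b) in seen:
            continue
        # For orthogonal matrices the inverse is the transpose.
        cls = {as_key(c.T @ b @ c) for c in operations}
        seen |= cls
        classes.append(cls)
    return classes

o_h = cube_operations()
o = [m for m in o_h if round(float(np.linalg.det(m))) == 1]   # proper rotations only
keys = {as_key(m) for m in o_h}
closed = all(as_key(a @ b) in keys for a in o_h for b in o_h)

classes_o_h = conjugacy_classes(o_h)
classes_o = conjugacy_classes(o)
print(len(o_h), len(o), closed)
print(len(classes_o_h), sorted(len(c) for c in classes_o))
```

The class sizes found for $O$ can be compared with the numbers in the last column of Table 4.1.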

symbol   operation                                                                         number
E        unit operation                                                                    1
C4       rotation around the x-, −x-, y-, −y-, z- or −z-axis by 2π/4                       6
C2       rotation around the x-, y- or z-axis by 2π/2                                      3
C′2      rotation around the six axes cutting the edges of the cube in the middle by 2π/2  6
C3       rotation around the four space diagonals by ±2π/3                                 8
⊗ i      all operations given up to here ⊗ i                                               24

Table 4.1: The 48 symmetry operations of the cube, i.e., the point group $O_h$.

Figure 4.6: Operator $R_\varphi$, with $\tilde{\mathbf{r}} = R_\varphi \mathbf{r}$. Here: Rotation about the z-axis by the angle $\varphi$ (shown in the x-y plane).

The total of all symmetry operations (translations, point symmetries, and combinations of both) which map the Bravais lattice (including a possibly existing basis) onto itself form the space group of the crystal. If we label the operator of a rotation (by the angle $\varphi$) with $R_\varphi$ (cf. Fig. 4.6), an arbitrary element of the space group can be labeled by $(R_\varphi; T_{\mathbf{D}})$. We have:

$$ (R_\varphi; T_{\mathbf{D}})\, f(\mathbf{r}) = f(\tilde{\mathbf{r}} + \mathbf{D}) , \tag{4.10} $$

with $\tilde{\mathbf{r}} = R_\varphi \mathbf{r}$. Because $\mathbf{D}$ appears in a combination of rotation and translation operations, it is not necessarily an element of the Bravais lattice (cf. Fig. 4.5).

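It can be shown that the space group has the required group properties. In particular it has to be closed with respect to the product of two operations, which is defined as:

$$ (R_{\varphi_2}; T_{\mathbf{D}_2})\,(R_{\varphi_1}; T_{\mathbf{D}_1}) = (R_{\varphi_2} R_{\varphi_1};\, T_{R_{\varphi_2}\mathbf{D}_1 + \mathbf{D}_2}) . \tag{4.11} $$

For the inverse element we have:

$$ (R_\varphi; T_{\mathbf{D}})^{-1} = (R_\varphi^{-1};\, T_{-R_\varphi^{-1}\mathbf{D}}) . \tag{4.12} $$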
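Equations (4.11) and (4.12) can be tried out on points, $\mathbf{r} \to R\,\mathbf{r} + \mathbf{D}$. The sketch below assumes two dimensions and represents each operation as a pair (matrix, vector); the glide reflection used as an example (mirror $y \to -y$ combined with a shift by half a lattice constant along x) and all names are illustrative assumptions.

```python
import numpy as np

def apply(operation, r):
    """Map a point: r -> R r + D."""
    rotation, shift = operation
    return rotation @ r + shift

def compose(op2, op1):
    """Product of Eq. (4.11): first op1, then op2."""
    r2, d2 = op2
    r1, d1 = op1
    return r2 @ r1, r2 @ d1 + d2

def inverse(operation):
    """Inverse element of Eq. (4.12); for orthogonal R the inverse is the transpose."""
    rotation, shift = operation
    rotation_inv = rotation.T
    return rotation_inv, -rotation_inv @ shift

mirror = np.array([[1.0, 0.0], [0.0, -1.0]])        # reflection y -> -y
glide = (mirror, np.array([0.5, 0.0]))              # glide reflection, lattice constant 1

double_glide = compose(glide, glide)                # should be a pure lattice translation
identity_op = compose(inverse(glide), glide)

point = np.array([0.3, 0.2])
print(double_glide)
print(apply(identity_op, point))
```

Applying the glide reflection twice gives the unit matrix together with a shift by one full lattice vector, which illustrates why $\mathbf{D}$ itself need not be an element of the Bravais lattice.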

For Bravais lattices we have: The total number of the different2 symmetry operations is finite: There are, e.g., only four rotations: $C_2$, $C_3$, $C_4$, and $C_6$. In periodic solids there is no rotational axis with a 5-fold symmetry or a symmetry of higher than 6 (due to translational invariance).

2 The operations $T_{\mathbf{R}_n}$ and $N\,T_{\mathbf{R}_n}$, or $C_n$ and $2C_n$, $3C_n$, . . . are considered the same.

Proof:

The vectors of the Bravais lattice are

$$ \mathbf{R}_n = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3 . \tag{4.13} $$

Further, the crystal shall have (at least) one rotational axis $C_n$, and for the moment we leave open what $n$ may be. $C_n \mathbf{R}_n$ then also is an element of the Bravais lattice. The vectors

$$ \mathbf{R}'_n = \mathbf{R}_n - C_n \mathbf{R}_n \tag{4.14} $$

are perpendicular to the rotational axis and are of course also part of the Bravais lattice. The shortest of these vectors shall be labeled $\mathbf{a}_1$. The vectors $(kC_n)\mathbf{a}_1$ with $k = 1 \ldots n$ are then also elements of the Bravais lattice; $(kC_n)$ means that the rotation is performed $k$ times. They all have the same length as $\mathbf{a}_1$. Now we consider two vectors of different lengths:

$$ |\mathbf{a}_1 - (kC_n)\mathbf{a}_1| = L_1 \tag{4.15} $$

and

$$ |\mathbf{a}_1 + (kC_n)\mathbf{a}_1| = L_2 . \tag{4.16} $$

Because $\mathbf{a}_1$ is the shortest vector perpendicular to the rotational axis, we have

$$ L_1 \geq |\mathbf{a}_1| \tag{4.17} $$

and

$$ L_2 \geq |\mathbf{a}_1| . \tag{4.18} $$

Alternatively, it would be possible that $L_1$ or $L_2$ are zero. In Fig. 4.7 it can be seen that for $k = 1 \ldots n$

$$ \left|\sin\left(\frac{k\pi}{n}\right)\right| = \frac{L_1}{2|\mathbf{a}_1|} \tag{4.19} $$

and

$$ \left|\cos\left(\frac{k\pi}{n}\right)\right| = \frac{L_2}{2|\mathbf{a}_1|} . \tag{4.20} $$

With condition (Eq. (4.17)) follows:

$$ \left|\sin\left(\frac{k\pi}{n}\right)\right| = \frac{L_1}{2|\mathbf{a}_1|} \geq \frac{1}{2} = \sin(30^\circ) \tag{4.21} $$

for all numbers $k \leq n$ with $L_1 \neq 0$. For $k = 1$ this means that we must have $\pi/n \geq \pi/6$, i.e., $n$ must not be larger than 6.

From Eq. (4.18) we obtain

$$ \left|\cos\left(\frac{k\pi}{n}\right)\right| \geq \frac{1}{2} = \cos(60^\circ) \tag{4.22} $$

If we set k = 2 for n = 5, we obtain a contradiction, because |cos(2π/5)| ≈ 0.31 < 1/2; thus n = 5 is also impossible. Therefore, we have proven that for a Bravais lattice only point symmetries with rotations C2, C3, C4, and C6 can exist.

Figure 4.7: Visualization of Equations (4.19) and (4.20).

| G1  | A1      | A2      | ... | AN      |
|-----|---------|---------|-----|---------|
| A1  | A1 ⊗ A1 | A1 ⊗ A2 | ... | A1 ⊗ AN |
| A2  | A2 ⊗ A1 | ...     | ... | ...     |
| ... | ...     | ...     | ... | ...     |
| AN  | AN ⊗ A1 | ...     | ... | AN ⊗ AN |

Table 4.2: Multiplication table of a group G1 consisting of N elements. Each product Ai ⊗ Aj is equal to an element of the group, i.e., Ai ⊗ Aj = Ak.
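The argument of the proof can be checked numerically. The following is a minimal sketch, not part of the original lecture: the function name `rotation_allowed` and the tolerance `tol` are assumptions made for illustration. It tests, for each n, whether every length L1 and L2 from Eqs. (4.19) and (4.20) is either zero or at least |a1|.

```python
import math

def rotation_allowed(n, tol=1e-12):
    """Check conditions (4.17) and (4.18) for an n-fold axis via Eqs. (4.19), (4.20)."""
    for k in range(1, n):
        s = abs(math.sin(k * math.pi / n))  # equals L1 / (2|a1|)
        c = abs(math.cos(k * math.pi / n))  # equals L2 / (2|a1|)
        # Each length must vanish or be at least |a1|,
        # i.e. s and c must each be 0 or >= 1/2 (up to rounding).
        for x in (s, c):
            if tol < x < 0.5 - tol:
                return False
    return True

# Scan n = 2 ... 12 (the upper bound 12 is an arbitrary choice for this sketch).
allowed = [n for n in range(2, 13) if rotation_allowed(n)]
print(allowed)
```

For n = 5 the test fails at k = 2 through the cosine condition, and for n > 6 it already fails at k = 1 through the sine condition, which mirrors the two steps of the proof.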

A cell with C5 symmetry, or one with Cn and n > 6, cannot fill space completely and without overlap. This was noted already in 1619 by Johannes Kepler. However, in 1984 diffraction images of 5-fold symmetry were measured in rapidly cooled aluminium-manganese melts (Phys. Rev. Letters 53, 1951 (1984)), and 12-fold symmetries were found as well (Phys. Rev. Letters, May 1988). These systems are ordered, but they are not periodic crystals. These “new lattices” are called quasi-crystals (cf. e.g. Physikalische Blätter 1986, p. 373 and p. 368, Fig. 3). A certain two-dimensional analogy to the three-dimensional quasi-crystals are the so-called Penrose patterns (cf. e.g. Spektrum der Wissenschaft, July 1999). This effect is also known from tiling walls or floors: When tiles of five-fold symmetry are used, one also needs other tiles to fill some areas.

From the very limited number of possible rotation axes for Bravais lattices it follows: For Bravais lattices there are only 7 different point groups (7 crystal systems). We first have to explain what is meant by the term “different”, or what is meant by the term “the same”. Two groups are equivalent if they contain the same number of elements and if their multiplication tables are identical. The multiplication table of a point group is defined in Table 4.2. Instead of the term “point group of the Bravais lattice” we also use the term “crystal system” as a synonym. One of the 7 point groups, i.e., the group Oh, has already been examined.

In Table 4.3 the 7 crystal systems are listed, where we reduce the symmetry of the sample body (with the exception of the hexagonal point group) when going from row N to N+1. When considering the space groups one finds that for Bravais lattices (with mono-atomic

| crystal system | crystal axes | example (sketch of the cell) | Bravais lattices |
|---|---|---|---|
| cubic | α = β = γ = 90°, a = b = c | cell with edges a, b = a, c = a | sc, fcc, bcc |
| tetragonal | α = β = γ = 90°, a = b ≠ c | cell with edges a, b = a, c | simple tetragonal, centered tetragonal |
| orthorhombic (rhombic) | α = β = γ = 90°, a ≠ b ≠ c | cell with edges a, b, c | simple, face centered, body centered, face centered (basis: upper/lower facet) |
| monoclinic | α = γ = 90° ≠ β, a ≠ b ≠ c | cell with edges a, b, c (shown from above and from the sides) | simple monoclinic, centered monoclinic |
| triclinic | α ≠ β ≠ γ, a ≠ b ≠ c | parallelepiped (spat) with edges a, b, c and angles α, β, γ; opposite facets are parallel | simple triclinic |
| trigonal | α = β = γ ≠ 90°, a = b = c | cell with edges a, b = a, c = a | simple trigonal |
| hexagonal | α = β = 90°, γ = 120°, a = b ≠ c | cell with edges a, a, c and angle γ | simple hexagonal |

Table 4.3: The 7 Crystal Systems and 14 Bravais Lattices.

| crystal system | number of Bravais lattices | number of point groups | names of the point groups |
|---|---|---|---|
| cubic | 3 | 5 | Oh, O, Th, T, Td |
| tetragonal | 2 | 7 | C4, S4, C4h, D4, C4v, D2d, D4h |
| orthorhombic | 4 | 3 | D2, C2v, D2h |
| monoclinic | 2 | 3 | C2, CS, C2h |
| triclinic | 1 | 2 | C1, Ci |
| trigonal | 1 | 5 | C3, C3i, D3, C3v, D3d |
| hexagonal | 1 | 7 | C6, C6h, D6, C6v, D3h, D6h, C3h |
| Σ | 14 | 32 | |

Table 4.4: Bravais lattices and point groups of the crystal structures.

basis) there are only 14 different space groups. This was investigated by Frankenheim in 1842, but he made a mistake (he found 15); in 1845 Bravais found the correct number.

Figure 4.8: Layer sequence of an fct and of a bct lattice, shown as top views in the [100] and in the [110] direction; layer 3 is equivalent to layer 1 (at the left: projection onto the (001) plane). Both lattices can be represented as a ct lattice.

There are further Bravais lattices that may come to mind in analogy to the sc, fcc, and bcc lattices. They are not included in the table, because they are equivalent to the ones listed. For example, a face-centered and a body-centered tetragonal Bravais lattice are identical (cf. Fig. 4.8).

This is a first crude classification of all possible periodic crystals in the 14 Bravais lattices. For each crystal there is a Bravais lattice and a crystal system. In case each point of the Bravais lattice has an inner structure (e.g. the dumbbells of Fig. 4.2), i.e., if it has a basis, then the point symmetry of the crystal is lower than the symmetry of the Bravais lattice. Generally, we have: The point group of a crystal is a subgroup of the crystal system. For the cubic crystal system Oh, e.g., there are 5 subgroups that can be present in real crystals. These are the groups Oh; O (like Oh, but without inversion i); Td (the point group of a tetrahedron: E, 8C3, 3C2, 6σd, 6S4); T (the point group of a tetrahedron without reflection symmetry: E, 3C2, 4C3⁺, 4C3⁻); and the group Th = T ⊗ i, which in addition to T also contains the operations i, 8S6, 3σd. We have used σd = C2 ⊗ i and S4 = C4 ⊗ i. Thus, if the “inner atomic structure” of the individual points of the Bravais lattice is taken into account, one finds: There are 32 crystalline point groups that are compatible with the translational properties of a crystal (cf. Table 4.4), and there are 230 space groups.

Figure 4.9: Close-packed structures hcp (left, layer sequence ABAB...) and fcc (right, layer sequence ABCABC...).

If one thinks of the crystal as being composed of hard spheres and these spheres are close-packed, one obtains a structure with the first two layers as shown in Fig. 4.9. The first layer has a 6-fold symmetry, and each sphere has 6 neighbors within the layer. The spheres of the second layer are located in the hollow sites of the first layer. For the third layer there are two different possibilities: a) the spheres are above the spheres of the second-last layer (site a), or b) they are on top of the gaps (site b). In case a) the arrangement of the third layer is equal to the first, and we obtain a layer sequence ABABAB... . This is the hcp structure. In case b) a layer sequence ABCABC... is obtained. This is the fcc structure.

fcc and hcp structures are mostly adopted by systems without directional bonds between structural elements (in simple Bravais lattices without basis these are the atoms). It is then energetically favorable if each structural element can form bonds to as many neighbors as possible. In the fcc and hcp structures each structural element has twelve nearest neighbors.

In Fig. 4.10 some important crystal structures are listed (from Ashcroft-Mermin). Furthermore, some crystals adopting these structures and their lattice constants are given. Except for the last example, the hexagonal structure, these are all cubic crystal systems.

Figure 4.10: Some important crystal structures.
a) Simple cubic (sc) Bravais lattice (e.g. α-polonium).
b) CsCl structure (sc with a diatomic basis).
c) Body-centered cubic Bravais lattice (bcc).
d) Face-centered cubic Bravais lattice (fcc).
e) NaCl structure (fcc with a diatomic basis).
f) Diamond structure (fcc with a diatomic basis).
g) Zincblende structure (fcc with a diatomic basis of different species).
h) Hexagonal close-packed structure (hcp, hexagonal Bravais lattice with a diatomic basis). The table lists the lattice parameters for some elements with hexagonal close-packed structure (hcp), cf. Fig. 4.9.

| | Bravais lattice | crystal structure |
|---|---|---|
| point symmetry | 7 crystal systems | 32 crystalline point groups |
| point symmetry and translational symmetry | 14 Bravais lattices | 230 crystalline space groups |

Table 4.5: The space groups.

4.2 The Bloch Theorem

What can we learn from the symmetry properties of veff(r) for the solution of the Kohn-Sham equation? In this section we take into account (at first) the translational invariance of the periodic crystal. For the Kohn-Sham equation the Hamilton operator h of the crystal and the translations of the Bravais lattice commute (cf. Eq. (4.7)).

Thus, the functions $\varphi_{o_i}(\mathbf{r})$ and $T_{\mathbf{R}_n}\varphi_{o_i}(\mathbf{r})$ are both eigenfunctions of h, and they have the same eigenvalue. In order to analyze this we will distinguish two cases:

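a) The eigenvalue $\epsilon_{o_i}$ is non-degenerate.

Then the functions $\varphi_{o_i}(\mathbf{r})$ and $T_{\mathbf{R}_n}\varphi_{o_i}(\mathbf{r})$ are physically equivalent. They can differ by a phase factor only:

$$ T_{\mathbf{R}_n}\varphi_{o_i}(\mathbf{r}) = e^{i\alpha}\varphi_{o_i}(\mathbf{r}) , \qquad (4.23) $$

where α is an arbitrary real number, which can depend on $\mathbf{R}_n$.

In order to investigate the properties of the function $\alpha(\mathbf{R}_n)$ in more detail, we apply two translation operators $T_{\mathbf{R}_n}$ and $T_{\mathbf{R}_m}$ and obtain

$$ T_{\mathbf{R}_m}T_{\mathbf{R}_n}\varphi_{o_i}(\mathbf{r}) = e^{i\alpha(\mathbf{R}_m)}e^{i\alpha(\mathbf{R}_n)}\varphi_{o_i}(\mathbf{r}) . \qquad (4.24) $$

Further we have

$$ T_{\mathbf{R}_m+\mathbf{R}_n}\varphi_{o_i}(\mathbf{r}) = e^{i\alpha(\mathbf{R}_m+\mathbf{R}_n)}\varphi_{o_i}(\mathbf{r}) . \qquad (4.25) $$

For the phase function α we obtain

$$ \alpha(\mathbf{R}_m+\mathbf{R}_n) = \alpha(\mathbf{R}_m) + \alpha(\mathbf{R}_n) , \qquad (4.26) $$

and we have

$$ \alpha(j\mathbf{R}_n) = j\,\alpha(\mathbf{R}_n) \qquad (4.27) $$

with j being an arbitrary integer number. The function $\alpha(\mathbf{R}_n)$ therefore is linear in $\mathbf{R}_n$. Thus, it has the form

$$ \alpha(\mathbf{R}_n) = \mathbf{k}\mathbf{R}_n . \qquad (4.28) $$

Therefore, we have

$$ T_{\mathbf{R}_n}\varphi_{o_i}(\mathbf{r}) = \varphi_{o_i}(\mathbf{r}+\mathbf{R}_n) = e^{i\mathbf{k}\mathbf{R}_n}\varphi_{o_i}(\mathbf{r}) \qquad (4.29) $$

This is the eigenvalue equation of the translation operator. The eigenvalues of $T_{\mathbf{R}_n}$ are $e^{i\mathbf{k}\mathbf{R}_n}$. The vector k introduced here labels the eigenvalues of $T_{\mathbf{R}_n}$ and thus also the eigenfunctions $\varphi_{o_i}(\mathbf{r})$.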

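b) The eigenvalue $\epsilon_{o_i}$ is degenerate.

As a second possibility we have to investigate the case that the eigenvalue is f-fold degenerate. For the f eigenfunctions $\varphi_l$ we have: The functions $\varphi_l(\mathbf{r})$ and $T_{\mathbf{R}_n}\varphi_l(\mathbf{r})$ with l = 1 ... f have the same energy eigenvalue $\epsilon_{o_i}$. This means:

$$ T_{\mathbf{R}_n}\varphi_l(\mathbf{r}) = \sum_{m=1}^{f} \Gamma_{m,l}\,\varphi_m(\mathbf{r}) \qquad (4.30) $$

The matrices $\Gamma_{m,l}$ are called a representation of the translational group:

$$ \Gamma_{m,l} = \langle\varphi_m|T_{\mathbf{R}_n}|\varphi_l\rangle \qquad (4.31) $$

The translations $T_{\mathbf{R}_n}$ form an Abelian group. This means: Since the group of the $T_{\mathbf{R}_n}$ is Abelian, in the space of the $\varphi_l$ with l = 1 ... f there is a similarity transformation of $\varphi_l(\mathbf{r})$ to $\tilde\varphi_l(\mathbf{r})$, and thus of $\Gamma_{m,l}$ to $\tilde\Gamma_{m,l}$, such that the matrix $\tilde\Gamma_{m,l}$ is diagonal.³ This is also formulated as follows: The irreducible representations of an Abelian group are one-dimensional. Then, we can write:

$$ T_{\mathbf{R}_n}\tilde\varphi_l(\mathbf{r}) = \tilde\Gamma_{l,l}\,\tilde\varphi_l(\mathbf{r}) \qquad (4.32) $$

Formally, this equation is the same as (4.29), and it follows that

$$ \tilde\Gamma_{l,l} = e^{i\mathbf{k}\mathbf{R}_n} \qquad (4.33) $$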

We summarize:

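The translation operators of the Bravais lattice commute with h. $T_{\mathbf{R}_n}$ and h therefore have the same eigenfunctions. The eigenfunctions and eigenvalues of h can be labeled by the eigenvalues of $T_{\mathbf{R}_n}$ or, better, by the vector k: From now on we will write $\varphi_{\mathbf{k}}(\mathbf{r})$ and $\epsilon(\mathbf{k})$. k contains three quantum numbers. This labeling is not necessarily complete. The statement that for the eigenfunctions of a crystal we have

$$ T_{\mathbf{R}_n}\varphi_{\mathbf{k}}(\mathbf{r}) = \varphi_{\mathbf{k}}(\mathbf{r}+\mathbf{R}_n) = e^{i\mathbf{k}\mathbf{R}_n}\varphi_{\mathbf{k}}(\mathbf{r}) \qquad (4.34) $$

is called Bloch's theorem.⁴ In order to understand the meaning and the consequences of Bloch's theorem, we have a look at the equation

$$ \varphi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\mathbf{r}}u_{\mathbf{k}}(\mathbf{r}) \qquad (4.35) $$

At first this is a very general ansatz for the eigenfunctions of h, because we have made no assumptions for $u_{\mathbf{k}}(\mathbf{r})$. We have:

$$ T_{\mathbf{R}_n}\varphi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}(\mathbf{r}+\mathbf{R}_n)}u_{\mathbf{k}}(\mathbf{r}+\mathbf{R}_n) \qquad (4.36) $$

Because Bloch's theorem is valid for $\varphi_{\mathbf{k}}(\mathbf{r})$, we obtain

$$ T_{\mathbf{R}_n}\varphi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\mathbf{R}_n}\varphi_{\mathbf{k}}(\mathbf{r}) \qquad (4.37) $$

$$ = e^{i\mathbf{k}\mathbf{R}_n}e^{i\mathbf{k}\mathbf{r}}u_{\mathbf{k}}(\mathbf{r}) \qquad (4.38) $$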

³ For the proof of this we refer to textbooks on group theory (e.g. Tinkham).

⁴ This was found by Bloch during his PhD thesis, which he carried out in the group of Heisenberg, but initially he was not aware of the importance of this result.

From this we obtain the equation

$$ u_{\mathbf{k}}(\mathbf{r}+\mathbf{R}_n) = u_{\mathbf{k}}(\mathbf{r}) \qquad (4.39) $$

The function $u_{\mathbf{k}}(\mathbf{r})$ has the periodicity of the Bravais lattice. This can be formulated as follows: The solutions of the single-particle Schrödinger equation of a periodic crystal have the form of a plane wave that is modulated by a function with lattice periodicity. $\varphi_{\mathbf{k}}(\mathbf{r})$ generally does not have the periodicity of the lattice, but $|\varphi_{\mathbf{k}}(\mathbf{r})|^2$ does. This gives rise to a second formulation of Bloch's theorem: Due to the translational invariance of the Hamilton operator the eigenfunctions have the following form:

$$ \varphi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\mathbf{r}}u_{\mathbf{k}}(\mathbf{r}) \quad \text{with } u_{\mathbf{k}}(\mathbf{r}) \text{ having the periodicity of the Bravais lattice} \qquad (4.40) $$

This form of the eigenfunctions of h gives a hint to the physical meaning of the vectors k. When looking at the special case veff(r) = constant we know that the solutions are simple plane waves:

$$ \varphi_{\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{V_g}}e^{i\mathbf{k}\mathbf{r}} \qquad (4.41) $$

I.e., in this case the function $u_{\mathbf{k}}(\mathbf{r})$ is constant. $V_g$ is the volume of the base region. This means: When going to a constant potential (infinitesimal translational invariance) k becomes identical to the wave vector. We note that the vector k, as appearing here, is not uniquely defined for crystals. This is because different vectors k yield the same eigenvalue $e^{i\mathbf{k}\mathbf{R}_n}$ of $T_{\mathbf{R}_n}$. This will be investigated more closely now.
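Bloch's theorem can be illustrated with a one-dimensional toy function. The sketch below is only an illustration under stated assumptions: the periodic part u(x) = 1 + 0.3 cos(2πx/a) is a hypothetical choice and not the solution of any particular Kohn-Sham equation. It checks Eq. (4.34) and the lattice periodicity of |ϕk|².

```python
import cmath
import math

def bloch_function(x, k, a):
    """phi_k(x) = exp(i k x) * u(x) with a lattice-periodic u, cf. Eq. (4.35)."""
    u = 1.0 + 0.3 * math.cos(2 * math.pi * x / a)  # hypothetical u_k(x), period a
    return cmath.exp(1j * k * x) * u

a = 2.0   # lattice constant (arbitrary units)
k = 0.7   # wave vector inside the first Brillouin zone, |k| < pi/a
x = 0.37  # arbitrary position

lhs = bloch_function(x + a, k, a)                   # T_a phi_k(x) = phi_k(x + a)
rhs = cmath.exp(1j * k * a) * bloch_function(x, k, a)  # e^{ika} phi_k(x)
bloch_ok = abs(lhs - rhs) < 1e-12
density_periodic = abs(abs(lhs) ** 2 - abs(bloch_function(x, k, a)) ** 2) < 1e-12
print(bloch_ok, density_periodic)
```

The function itself changes by the phase factor exp(ika) under a lattice translation, whereas its absolute square is unchanged.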

4.3 The Reciprocal Lattice

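Since the vector k appears in a scalar product and in an exponent, it is not uniquely defined. We have

$$ e^{i\mathbf{k}'\mathbf{R}_n} = e^{i\mathbf{k}\mathbf{R}_n} \quad \text{for } \mathbf{k}' = \mathbf{k}+\mathbf{G}_m , \qquad (4.42) $$

if

$$ \mathbf{G}_m\mathbf{R}_n = 2\pi N \quad \text{with } N \text{ integer} . \qquad (4.43) $$

All vectors k′ defined by Eq. (4.42) label the same eigenvalue and the same eigenfunction of $T_{\mathbf{R}_n}$. What does the set of G-vectors defined by (4.43) look like? We define:

$$ \mathbf{b}_1 = \frac{2\pi}{\Omega}(\mathbf{a}_2\times\mathbf{a}_3) , \quad \mathbf{b}_2 = \frac{2\pi}{\Omega}(\mathbf{a}_3\times\mathbf{a}_1) , \quad \mathbf{b}_3 = \frac{2\pi}{\Omega}(\mathbf{a}_1\times\mathbf{a}_2) \qquad (4.44) $$

Here the vectors $\mathbf{a}_i$ shall be the primitive vectors, and $\Omega = \mathbf{a}_1(\mathbf{a}_2\times\mathbf{a}_3)$ is the volume of the primitive unit cell. The vectors $\mathbf{G}_m$ then are

$$ \mathbf{G}_m = m_1\mathbf{b}_1 + m_2\mathbf{b}_2 + m_3\mathbf{b}_3 \qquad (4.45) $$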

| direct lattice | reciprocal lattice |
|---|---|
| sc | sc |
| hexagonal | hexagonal |
| fcc | bcc |
| bcc | fcc |

Table 4.6: Four important Bravais lattices in direct space and the corresponding Bravais lattices in reciprocal space.

with $m_i$ being an integer number. This lattice of $\mathbf{G}_m$-vectors defined in k space is called the reciprocal lattice. (Vectors in real space have the dimension length. Vectors in reciprocal space have the dimension 1/length.)

The vectors of the reciprocal lattice satisfy the condition (4.43), and the basis vectors of the reciprocal lattice are defined by Eq. (4.44), or by

$$ \mathbf{a}_i\mathbf{b}_j = 2\pi\delta_{i,j} . \qquad (4.46) $$
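As a hedged illustration of Eqs. (4.44) and (4.46), the following sketch computes the reciprocal basis of an fcc lattice using only the Python standard library. The helper names `cross`, `dot`, and `reciprocal_basis`, and the choice a = 1, are assumptions made for this example.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def reciprocal_basis(a1, a2, a3):
    """Reciprocal basis vectors b1, b2, b3 according to Eq. (4.44)."""
    omega = dot(a1, cross(a2, a3))  # volume of the primitive unit cell
    f = 2 * math.pi / omega
    return tuple(tuple(f * c for c in cross(u, v))
                 for u, v in ((a2, a3), (a3, a1), (a1, a2)))

a = 1.0  # cubic lattice constant (arbitrary units)
fcc = ((0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0))
b = reciprocal_basis(*fcc)

# Check a_i . b_j = 2 pi delta_ij, Eq. (4.46)
orthonormal = all(
    abs(dot(fcc[i], b[j]) - (2 * math.pi if i == j else 0.0)) < 1e-12
    for i in range(3) for j in range(3)
)
print(b[0], orthonormal)
```

The resulting vectors are (2π/a)(−1, 1, 1) and its cyclic counterparts, i.e., the primitive vectors of a bcc lattice with cube edge 4π/a, in agreement with the fcc row of Table 4.6.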

The set k+Gm with an arbitrary vector Gm from reciprocal space labels the eigenfunctions and eigenvalues of the single-particle Schrödinger equation. In order to label this set we use the shortest vector of the set k+Gm. From the definition (4.45) we obtain that the reciprocal lattice is a Bravais lattice. Therefore we consider only those k-vectors that are closer to the point k = 0 (or G = 0) than to any other point of the reciprocal lattice. Such a region of the reciprocal lattice is called the “first Brillouin zone” (the corresponding region of the direct lattice is called the “Wigner-Seitz cell”). A two-dimensional example is shown in Fig. 4.11. The construction of Brillouin zones for three-dimensional Bravais lattices is somewhat more complex, but of course also just geometry.

Figure 4.11: A two-dimensional rhombic point lattice. On the left the direct lattice (basis vectors a1, a2) and the Wigner-Seitz cell are shown, and on the right the corresponding reciprocal lattice (basis vectors b1, b2) with the first Brillouin zone.

It can easily be confirmed that the relations between the direct and the reciprocal lattice noted in Table 4.6 are valid. Fig. 4.12 shows the 1st Brillouin zone of four important direct lattices: sc, fcc, bcc, and hexagonal. The point k = 0 is always called Γ. Other directions and points also have specific labels. Later we will need ε(k) for the full range of the Brillouin zone of the crystal. For this purpose it is often sufficient to investigate

Figure 4.12: Brillouin zones for the a) simple cubic, b) face-centered cubic, c) body-centered cubic, and d) hexagonal lattice. The most important symmetry points and lines and their labels are shown.

the function along certain directions or in a small part of the 1st Brillouin zone. The rest is determined by the point symmetry of the lattice. This will be discussed later. Now we will investigate the physical meaning of the reciprocal lattice and of the 1st Brillouin zone for the wave functions of the effective single-particle Schrödinger equation. Due to the translational invariance of the effective potential, veff(r+Rn) = veff(r), only the vectors of the reciprocal lattice appear in the Fourier expansion:

$$ v^{\rm eff}(\mathbf{r}) = \sum_{l} v^{\rm eff}(\mathbf{G}_l)\,e^{i\mathbf{G}_l\mathbf{r}} \qquad (4.47) $$

with $\mathbf{G}_l\mathbf{R}_n = 2\pi N$.

For the eigenfunctions of the Schrödinger equation we obtain in a similar way:

$$ \varphi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\mathbf{r}}u_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\mathbf{r}}\sum_{m} C_{\mathbf{G}_m}(\mathbf{k})\,e^{i\mathbf{G}_m\mathbf{r}} \qquad (4.48) $$

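Then, the Kohn-Sham equation in reciprocal space is:

$$ \sum_{m}\frac{\hbar^2}{2m}(\mathbf{k}+\mathbf{G}_m)^2\,C_{\mathbf{G}_m}(\mathbf{k})\,e^{i(\mathbf{k}+\mathbf{G}_m)\mathbf{r}} + \sum_{l} v^{\rm eff}(\mathbf{G}_l)\sum_{m} C_{\mathbf{G}_m}(\mathbf{k})\,e^{i(\mathbf{k}+\mathbf{G}_m+\mathbf{G}_l)\mathbf{r}} = \epsilon(\mathbf{k})\sum_{m} C_{\mathbf{G}_m}(\mathbf{k})\,e^{i(\mathbf{k}+\mathbf{G}_m)\mathbf{r}} \qquad (4.49) $$

For the $C_{\mathbf{G}_m}(\mathbf{k})$ this means:

$$ \frac{\hbar^2}{2m}(\mathbf{k}+\mathbf{G}_n)^2\,C_{\mathbf{G}_n}(\mathbf{k}) + \sum_{m} v^{\rm eff}(\mathbf{G}_n-\mathbf{G}_m)\,C_{\mathbf{G}_m}(\mathbf{k}) = \epsilon(\mathbf{k})\,C_{\mathbf{G}_n}(\mathbf{k}) \qquad (4.50) $$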

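For any chosen k vector of the 1st Brillouin zone this is a set of equations, which for a certain (given) k allows for the calculation of the expansion coefficients $C_{\mathbf{G}_n}(\mathbf{k})$. Only those coefficients $C_{\mathbf{G}_n}(\mathbf{k})$ (or plane waves $e^{i(\mathbf{k}+\mathbf{G}_n)\mathbf{r}}$) that differ by a reciprocal lattice vector are coupled by the periodic potential $v^{\rm eff}(\mathbf{G}_n-\mathbf{G}_m)$. Equation (4.48) means that a plane wave $e^{i\mathbf{k}\mathbf{r}}$ in the solid does not exist alone, but due to diffraction at the periodic potential plane waves $e^{i(\mathbf{k}+\mathbf{G}_m)\mathbf{r}}$ are added. Equation (4.50) can also be written in matrix form:

$$ \sum_{m} h_{n,m}\,C_{\mathbf{G}_m}(\mathbf{k}) = \epsilon(\mathbf{k})\,C_{\mathbf{G}_n}(\mathbf{k}) \qquad (4.51) $$

with

$$ h_{n,m} = \frac{\hbar^2}{2m}(\mathbf{k}+\mathbf{G}_n)^2\,\delta_{n,m} + v^{\rm eff}(\mathbf{G}_n-\mathbf{G}_m) \qquad (4.52) $$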

This means that for each vector k one matrix equation has to be solved, which provides a number of eigenfunctions and eigenvalues. Therefore, next to k another quantum number will be introduced, and we write: $\varphi_{n,\mathbf{k}}(\mathbf{r})$, $\epsilon_n(\mathbf{k})$. We find that Eq. (4.50), or (4.51) and (4.52), is often quite useful for real systems and can be evaluated in practice. This holds in particular if it is combined with the so-called pseudopotential theory (cf. part V). The main problem is the dimension of the matrix of Eq. (4.52), and in particular the calculation of the non-diagonal elements or the sums in Eq. (4.50). We find that $v^{\rm eff}(\mathbf{G}_l)$ rapidly decreases with increasing length of $\mathbf{G}_l$, and often only the first terms in $v^{\rm eff}(\mathbf{G}_l)$ differ from zero. If this is true, then the evaluation of Eq. (4.50), or (4.51) and (4.52), is possible, because the non-diagonal part of the matrix $h_{n,m}$ is then of finite size. We have introduced two quantum numbers: the vector k, which is limited to the first Brillouin zone, and the discrete index n. In order to illustrate this on a simple level, we examine a one-dimensional example and a very weakly varying potential. Then, the energies are

$$ \epsilon_n(k) \approx \frac{\hbar^2}{2m}(k+G_n)^2 . \qquad (4.53) $$

Figure 4.13 shows the parabola for G = 0 as a dotted line. We have found that due to periodicity it is reasonable and sufficient to constrain k to the first Brillouin zone. This is possible if we look at k+Gn, i.e., if we fold back the parts of the dotted curve that are outside the first Brillouin zone by a suitable vector Gn. The part of Fig. 4.13 in the range of the 1st Brillouin zone, i.e., the function εn(k), is called the “band structure”.
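The folding procedure can be sketched in a few lines. This is an illustrative example only: the function names `fold_to_first_bz` and `free_bands` and the unit choice ħ²/2m = 1 are assumptions of this sketch, and the one-dimensional free-electron dispersion of Eq. (4.53) is used.

```python
import math

def fold_to_first_bz(k, a):
    """Return (k_reduced, m) with k = k_reduced + m * 2*pi/a and -pi/a <= k_reduced < pi/a."""
    g = 2 * math.pi / a
    m = math.floor(k / g + 0.5)
    return k - m * g, m

def free_bands(k, a, n_bands=3):
    """Lowest free-electron energies eps_n(k) = (k + G_n)^2, in units where hbar^2/2m = 1."""
    g = 2 * math.pi / a
    energies = sorted((k + m * g) ** 2 for m in range(-n_bands, n_bands + 1))
    return energies[:n_bands]

a = 1.0
k_reduced, m = fold_to_first_bz(1.5 * math.pi, a)  # a point outside the first zone
bands_gamma = free_bands(0.0, a)            # energies at the zone center
bands_edge = free_bands(math.pi / a, a)     # energies at the zone boundary
print(k_reduced, m, bands_gamma, bands_edge)
```

At k = 0 the second and third band coincide, and at the zone boundary k = π/a the first and second band coincide. These are the degeneracies that a weak periodic potential lifts.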

For the further discussion of the importance of the reciprocal lattice we take a side view of a crystal (cf. Fig. 4.14). We can see that the Bravais point lattice can also be

Figure 4.13: First Brillouin zone of a one-dimensional lattice of lattice constant a, with ε(k) plotted for k between −2π/a and 2π/a. G1 and G2 are the shortest non-zero reciprocal lattice vectors (length 2π/a). The solid line gives the function εn(k).

regarded as a regular arrangement of planes. There is a close relation between the vectors of the reciprocal lattice and such parallel planes (the solid, the dashed-dotted, and the dashed planes in Fig. 4.14): For each family of lattice planes separated by a distance d there are reciprocal lattice vectors perpendicular to these planes. The shortest of these reciprocal lattice vectors has the length 2π/d. The converse of this statement is also true: For each reciprocal lattice vector G there is a family of lattice planes perpendicular to G. This close relation between planes of the crystal and the reciprocal lattice vectors implies that one can generally label the planes in the crystal lattice by the shortest reciprocal lattice vector perpendicular to these planes.

Figure 4.14: Side view of a crystal (i.e., of the Bravais point lattice), showing three families of lattice planes with spacings d1, d2, and d3.

These labels are called Miller indices. In general, they are defined by the coordinates of the shortest possible reciprocal lattice vector of the Bravais lattice perpendicular to this plane; by definition, they are always integer numbers and have no common factor. Note that,

usually, the Miller indices in any cubic lattice are referred to the conventional (cubic) cell, which for the fcc and bcc lattices is an sc lattice with a basis. Figure 4.15 shows three important lattice planes for cubic crystals and their labels. In fcc and bcc lattices, this convention leads to a formal disconnection between lattice planes and actual reciprocal lattice vectors: Since the primitive cells (one atom per cell) of both lattices are smaller than the conventional cell (2 atoms for bcc, 4 atoms for fcc), some of their reciprocal lattice vectors appear to be missing when written down in the “simple cubic” notation. For example, for the bcc lattice we find that only (i, j, k) with i + j + k = even number is allowed; for the fcc lattice the indices of reciprocal lattice vectors have to be either all odd or all even numbers. Thus, the fcc, bcc, and sc lattices all have (111) lattice planes as denoted by Miller indices and shown in Fig. 4.15, but the shortest corresponding reciprocal lattice vector in bcc would have the indices (222).
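The selection rules quoted above follow directly from condition (4.43) applied to the primitive vectors. The following sketch, with the hypothetical helper `is_reciprocal_vector` and primitive vectors given in units of the cubic lattice constant a, enumerates small index triples (h, k, l) of G = (2π/a)(h, k, l) and compares them with the two rules.

```python
from itertools import product

# Primitive vectors in units of the cubic lattice constant a (standard choices).
BCC = ((-0.5, 0.5, 0.5), (0.5, -0.5, 0.5), (0.5, 0.5, -0.5))
FCC = ((0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0))

def is_reciprocal_vector(hkl, primitive):
    """G = (2 pi / a)(h, k, l) is a reciprocal lattice vector if G . a_i = 2 pi * integer, Eq. (4.43)."""
    for a_i in primitive:
        s = sum(h * c for h, c in zip(hkl, a_i))  # G . a_i in units of 2 pi
        if abs(s - round(s)) > 1e-12:
            return False
    return True

triples = list(product(range(-3, 4), repeat=3))
bcc_rule_holds = all(
    is_reciprocal_vector(t, BCC) == (sum(t) % 2 == 0) for t in triples
)
fcc_rule_holds = all(
    is_reciprocal_vector(t, FCC) == (len({x % 2 for x in t}) == 1) for t in triples
)
print(bcc_rule_holds, fcc_rule_holds,
      is_reciprocal_vector((1, 1, 1), BCC), is_reciprocal_vector((2, 2, 2), BCC))
```

In this notation (1, 1, 1) is not a reciprocal lattice vector of the bcc lattice while (2, 2, 2) is, which is the example given in the text.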

Now we have a look at the origin of a reflection of electrons at (or in) a crystal, the crystal being composed of planes. At first we imagine that a plane wave of electrons or X-rays propagates with wave vector k. From the discussion of Eq. (4.48) we know that this wave is not a stationary state (eigenstate). This is obtained without solving the Schrödinger equation.

The wave is reflected at the crystal planes (cf. Fig. 4.16). We have constructive interference (Bragg reflection) if the path difference of the waves scattered at different planes is a multiple of the wavelength,

$$ 2d\sin\theta = m\lambda = m\frac{2\pi}{|\mathbf{k}|} , \qquad (4.54) $$

where m is an arbitrary integer number and λ the wavelength of the plane waves.

Figure 4.15: Important crystal planes in a cubic crystal: (100), (110), and (111).

We rewrite this condition by using that there are reciprocal lattice vectors which are perpendicular to the planes of interest and which have the following length:

$$ |\mathbf{G}_m| = m\frac{2\pi}{d} \qquad (4.55) $$

From Eq. (4.54) we obtain

$$ 2d\,\frac{\mathbf{G}_m\mathbf{k}}{|\mathbf{G}_m||\mathbf{k}|} = m\frac{2\pi}{|\mathbf{k}|} . \qquad (4.56) $$

Here, we use that $\mathbf{k}\mathbf{G}_m = |\mathbf{k}||\mathbf{G}_m|\sin\theta$.

Figure 4.16: Bragg reflection at a crystal

The condition for constructive interference (cf. Eq. (4.56)) can also be written as:

$$ 2\mathbf{k}\mathbf{G}_m = |\mathbf{G}_m|^2 \qquad (4.57) $$

or

$$ \mathbf{k}^2 = (\mathbf{k}-\mathbf{G}_m)^2 . \qquad (4.58) $$

This means: Waves with wave vectors k fulfilling the requirement (4.58) (i.e., the Bragg condition) cannot propagate in the crystal. They are reflected in other directions. The condition (4.58) is obviously fulfilled at the border of the Brillouin zone, i.e., for $\mathbf{k} = \mathbf{G}_m/2$.
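The intermediate steps between Eqs. (4.56), (4.57), and (4.58) can be written out explicitly. The following short derivation only uses Eq. (4.55) and adds nothing beyond the equations above:

```latex
% Multiply Eq. (4.56) by |k| |G_m| / (2d) and insert 2\pi m / d = |G_m| from Eq. (4.55):
\mathbf{G}_m\mathbf{k}
  = \frac{|\mathbf{G}_m|}{2}\,\frac{2\pi m}{d}
  = \frac{|\mathbf{G}_m|^2}{2}
\quad\Longrightarrow\quad
2\,\mathbf{k}\mathbf{G}_m = |\mathbf{G}_m|^2 .
% Expanding the square shows the equivalence with Eq. (4.58):
(\mathbf{k}-\mathbf{G}_m)^2
  = \mathbf{k}^2 - 2\,\mathbf{k}\mathbf{G}_m + |\mathbf{G}_m|^2
  = \mathbf{k}^2 .
% For k = G_m / 2 one has 2 k G_m = |G_m|^2, so the condition is fulfilled at the zone border.
```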

4.4 Periodic Boundary Conditions

For many questions it is reasonable (without influencing the physical results)⁵ to use periodic boundary conditions. These have been discussed already in chapter 2. From now on we will assume that the crystal is sufficiently large, so that surface effects do not play a role. Then, also the shape of the crystal is not important. The dimensions of the crystal shall be Liai, with Li being a large integer number. The volume of the base region is then

Vg = L1L2L3Ω (4.59)

with Ω, the volume of the primitive unit cell:

Ω = a1(a2 × a3) (4.60)

Periodic boundary conditions for the wave function then mean that we have:

ϕn,k(r+ Ljaj) = ϕn,k(r) . (4.61)

⁵ Problems arise if the boundary plays a role, e.g. for magnetic effects.

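Due to the Bloch condition we have:

$$ \varphi_{n,\mathbf{k}}(\mathbf{r}+L_j\mathbf{a}_j) = e^{iL_j\mathbf{a}_j\mathbf{k}}\varphi_{n,\mathbf{k}}(\mathbf{r}) . \qquad (4.62) $$

From (4.61) and (4.62) we obtain:

$$ e^{iL_j\mathbf{a}_j\mathbf{k}} = 1 \quad \text{for } j = 1, 2, 3 \qquad (4.63) $$

The possible k vectors fulfill the condition

$$ L_j\mathbf{a}_j\mathbf{k} = 2\pi M \qquad (4.64) $$

with M being an arbitrary integer number. If we now write

$$ \mathbf{k} = k_1\mathbf{b}_1 + k_2\mathbf{b}_2 + k_3\mathbf{b}_3 $$

we obtain

$$ k_j = \frac{M}{L_j} . \qquad (4.65) $$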

Lj is determined by the size of the base region. M is an arbitrary integer number. Therefore, we are dealing (because of periodic boundary conditions) with a discrete, but arbitrarily densely packed set of k vectors. In direction bj the distance of two k-points is 1/Lj in units of |bj|. The volume in k-space referring to a k-point then is

$$ V_k = \frac{\mathbf{b}_1}{L_1}\left(\frac{\mathbf{b}_2}{L_2}\times\frac{\mathbf{b}_3}{L_3}\right) . \qquad (4.66) $$

Since $\mathbf{b}_1(\mathbf{b}_2\times\mathbf{b}_3) = (2\pi)^3/\Omega$ is the volume of the first Brillouin zone, we obtain

$$ V_k = \frac{(2\pi)^3}{L_1L_2L_3\,\Omega} = \frac{(2\pi)^3}{V_g} \qquad (4.67) $$

with $V_g$ being the volume of the base region.

The density of k-points then is $V_g/(2\pi)^3$. This is the same result as we obtained earlier for free electrons (cf. Eq. (2.7) and the following). Before, we have described the wave function as a simple plane wave. Here we have made no assumptions concerning the wave functions, but we have used the fact that they satisfy Bloch's theorem.
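The counting of allowed k-points can be made concrete with a small enumeration. In this sketch the function name `k_grid` and the restriction 0 ≤ kj < 1 (one primitive cell of reciprocal space, which has the same volume as the first Brillouin zone) are choices made for illustration.

```python
from fractions import Fraction
from itertools import product

def k_grid(L1, L2, L3):
    """Allowed k-points of Eq. (4.65) as fractional coordinates (k1, k2, k3)
    with respect to b1, b2, b3, restricted to 0 <= kj < 1."""
    sizes = (L1, L2, L3)
    return [tuple(Fraction(M, L) for M, L in zip(Ms, sizes))
            for Ms in product(range(L1), range(L2), range(L3))]

points = k_grid(2, 3, 4)
n_points = len(points)
# Fraction of the Brillouin-zone volume that belongs to one k-point, cf. Eq. (4.66)
volume_fraction = Fraction(1, 2) * Fraction(1, 3) * Fraction(1, 4)
print(n_points, volume_fraction)
```

The number of distinct k-points equals L1 L2 L3, i.e., the number of primitive unit cells in the base region.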


5 The Band Structure of the Electrons

5.1 Introduction

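In the Fourier representation, i.e., in the basis of plane waves

$$ \chi_n(\mathbf{k}) = \frac{1}{\sqrt{V_g}}e^{i(\mathbf{k}+\mathbf{G}_n)\mathbf{r}} , \qquad (5.1) $$

the Hamilton operator of the Kohn-Sham equation has the form:

$$ h_{n,m} = \frac{\hbar^2}{2m}(\mathbf{k}+\mathbf{G}_n)^2\,\delta_{n,m} + v^{\rm eff}(\mathbf{G}_n-\mathbf{G}_m) . \qquad (5.2) $$

$v^{\rm eff}(\mathbf{G}_n-\mathbf{G}_m)$ decreases with increasing length $|\mathbf{G}_n-\mathbf{G}_m|$, and thus $h_{n,m}$ is basically diagonal for large $|\mathbf{G}_n-\mathbf{G}_m|$. The corresponding eigenvalue equation enables the calculation of the energies and eigenfunctions for given k.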

Figure 5.1: Fourier representation of the atomic potentials of Al, Si and Ag.

In Fig. 5.1 we show the function $v^{\rm eff}(\mathbf{G})$ for three examples: Al, Si, and Ag. In fact, the figure shows the potential¹ of atoms, and therefore $v^{\rm eff}$ is defined for continuous values of |G|. For a solid we have roughly

$$ v^{\rm eff}(\mathbf{r}) = \sum_{I=1}^{M} v^{\rm eff-atom}(\mathbf{r}-\mathbf{R}_I) , $$

and then, in the Fourier representation, $v^{\rm eff-atom}$ is only needed at discrete reciprocal lattice vectors. These are determined by the lattice structure, and we have: 2π/a = 1.55 Å⁻¹ (Al), 1.16 Å⁻¹ (Si), and 1.54 Å⁻¹ (Ag).

¹ To be precise, Fig. 5.1 shows the effective potential for atomic pseudopotentials. The definition of pseudopotentials is given in Section 5.5.1 below.

If all Fourier components $v^{\rm eff}(\mathbf{G}_l)$ are small, we obtain the dispersion of free electrons:

$$ \epsilon_n(\mathbf{k}) = \frac{\hbar^2}{2m}(\mathbf{k}+\mathbf{G}_n)^2 . \qquad (5.3) $$

εn(k) as a function of k is called the energy band n. For the one-dimensional case we have shown this dispersion before (cf. Fig. 4.13). However, the one-dimensional case is not typical, because in this case at most two-fold degeneracy can be present. Therefore, we will now discuss a two-dimensional example, which shows all the important characteristics of a band structure, even those of three-dimensional structures. We will investigate a hexagonal lattice. The 1st Brillouin zone and the labels of special k-points are shown in Fig. 5.2. If we now evaluate Eq. (5.3), we obtain the band structure of free electrons, as shown in Fig. 5.3, for

$$ \mathbf{G}_0 = (0, 0) = \Gamma , \quad \mathbf{G}_1 = \frac{2\pi}{a}(0, 1) , \quad \mathbf{G}_2 = \frac{2\pi}{a}(\cos 30^\circ, \sin 30^\circ) = \frac{2\pi}{a}\left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right) , \quad \mathbf{G}_3 = \frac{2\pi}{a}(\cos 30^\circ, -\sin 30^\circ) = \frac{2\pi}{a}\left(\frac{\sqrt{3}}{2}, -\frac{1}{2}\right) , $$

etc. Here, we restrict ourselves to the boundary of the so-called irreducible wedge, which is hatched in Fig. 5.2. By reflection and rotation of this wedge the full 1st Brillouin zone can be obtained.

Figure 5.2: Brillouin zone of a hexagonal cell. Special k-points are shown. The hatched area is the irreducible part.

The point K is at $\frac{1}{3}(\mathbf{G}_2+\mathbf{G}_3) = \frac{2\pi}{a}\left(\frac{1}{\sqrt{3}}, 0\right)$, and along the ΓK direction we have $\mathbf{k} = (k_x, 0)$.

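We obtain the results:

n = 0, i.e., $\mathbf{G}_0$:

$$ \epsilon_0(\mathbf{k}) = \frac{\hbar^2}{2m}k_x^2 , \qquad \epsilon_0(\Gamma) = 0 , \qquad \epsilon_0(K) = \frac{\hbar^2}{2m}\left(\frac{2\pi}{a}\frac{1}{\sqrt{3}}\right)^2 $$

n = 1, i.e., $\mathbf{G}_1$:

$$ \epsilon_1(\mathbf{k}) = \frac{\hbar^2}{2m}\left(k_x^2 + \left(\frac{2\pi}{a}\right)^2\right) , \qquad \epsilon_1(\Gamma) = \frac{\hbar^2}{2m}\left(\frac{2\pi}{a}\right)^2 , \qquad \epsilon_1(K) = \frac{\hbar^2}{2m}\left(\frac{1}{3}+1\right)\left(\frac{2\pi}{a}\right)^2 $$

n = 2, i.e., $\mathbf{G}_2$:

$$ \epsilon_2(\mathbf{k}) = \frac{\hbar^2}{2m}\left[\left(k_x + \frac{2\pi}{a}\frac{\sqrt{3}}{2}\right)^2 + \left(\frac{2\pi}{a}\right)^2\cdot\frac{1}{4}\right] $$

$$ \epsilon_2(\Gamma) = \frac{\hbar^2}{2m}\left(\frac{2\pi}{a}\right)^2\left(\frac{3}{4}+\frac{1}{4}\right) = \frac{\hbar^2}{2m}\left(\frac{2\pi}{a}\right)^2 = \epsilon_1(\Gamma) , \qquad \epsilon_2(K) = \frac{\hbar^2}{2m}\left(\frac{2\pi}{a}\right)^2\cdot\frac{7}{3} $$

etc.

Figure 5.3: Band structure of free electrons in a hexagonal lattice. The numbers in brackets give the degeneracy.

Figure 5.4: The band structure of nearly free electrons in a hexagonal lattice for two different systems and two different Fermi energies: $\epsilon_F^1$ for 1 electron per cell and $\epsilon_F^2$ for 2 electrons per cell.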

The result for εn(k) is shown in Fig. 5.3. Even for free electrons the band structure looks rather complicated, just because of the reduction to the 1st Brillouin zone.
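The values at Γ and K can be reproduced with a short script. This is a minimal sketch that evaluates Eq. (5.3) for the vectors G0 to G3 given above; the unit choice (energies in units of (ħ²/2m)(2π/a)², wave vectors in units of 2π/a) and the names `G`, `GAMMA`, `K`, and `eps` are assumptions made for the illustration.

```python
import math

# Reciprocal lattice vectors G_0 ... G_3 in units of 2*pi/a
G = {
    0: (0.0, 0.0),
    1: (0.0, 1.0),
    2: (math.sqrt(3) / 2, 0.5),
    3: (math.sqrt(3) / 2, -0.5),
}
GAMMA = (0.0, 0.0)
K = (1 / math.sqrt(3), 0.0)  # (1/3)(G_2 + G_3)

def eps(n, k):
    """Free-electron band energy of Eq. (5.3) in units of (hbar^2/2m)(2*pi/a)^2."""
    return (k[0] + G[n][0]) ** 2 + (k[1] + G[n][1]) ** 2

for n in range(4):
    print(n, round(eps(n, GAMMA), 6), round(eps(n, K), 6))
```

The bands n = 1, 2, 3 are degenerate at Γ, and the bands n = 2 and n = 3 are also degenerate at K.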

Which modifications can we expect if the potential, which has been zero so far, slightly differs from zero? This will be investigated in more detail now.

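Plane waves with different Gn are coupled by the crystal potential (Bragg condition, hybridization of states), i.e., wave functions and eigenvalues are different:

$$ \frac{1}{\sqrt{V_g}}e^{i(\mathbf{k}+\mathbf{G}_n)\mathbf{r}} \longrightarrow \varphi_{n,\mathbf{k}}(\mathbf{r}) = \sum_{l} c_{\mathbf{G}_n}(l)\,e^{i(\mathbf{k}+\mathbf{G}_l)\mathbf{r}} , $$

$$ \epsilon_n(\mathbf{k}) \longrightarrow \epsilon_n(\mathbf{k}) + \Delta_n(\mathbf{k}) . $$

Figure 5.5: Three possible representations for εn(k) (one-dimensional example, k between −π/a and +π/a in the 1st Brillouin zone, reciprocal lattice points Γ separated by 2π/a). Top: reduced zone scheme. Middle: repeated zone scheme. Bottom: extended zone scheme.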

For non-degenerate states, the change of the energy levels is small ($\sim v^{\rm eff}(\mathbf{G}_n)^2$), but it is larger for degenerate states² ($\sim v^{\rm eff}$):

$$ \Delta_n(\mathbf{k}) = \pm|v^{\rm eff}(\mathbf{G}_n)| . \qquad (5.4) $$

Equation (5.4) describes the band structure close to the boundary of the 1st Brillouin zone. This is illustrated in Fig. 5.4. The representation of εn(k) in the domain of the 1st Brillouin zone is called a reduced zone scheme. Due to the equivalence of k and (k + G) we can consider the bands εn(k) also as periodic functions in k-space, as shown in Fig. 5.5 for a one-dimensional example. However, the “repeated zone scheme” and the “extended zone scheme” illustrated in Fig. 5.5 are very rarely used.
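The splitting of Eq. (5.4) can be made explicit with a two-plane-wave model. The sketch below is a simplification and not the full eigenvalue problem: it keeps only the 2×2 block of Eq. (5.2) spanned by the plane waves with wave vectors k and k − G in one dimension, drops the constant term veff(0), and uses units with ħ²/2m = 1. The function name `two_band_energies` and the numerical values are hypothetical.

```python
import math

def two_band_energies(k, G, v_G):
    """Eigenvalues of the 2x2 matrix [[k^2, v_G], [conj(v_G), (k - G)^2]]
    (units hbar^2/2m = 1, constant potential term omitted)."""
    t0 = k ** 2
    t1 = (k - G) ** 2
    mean = 0.5 * (t0 + t1)
    half_diff = 0.5 * (t0 - t1)
    root = math.sqrt(half_diff ** 2 + abs(v_G) ** 2)
    return mean - root, mean + root

G = 2 * math.pi  # shortest reciprocal lattice vector for a = 1
v = 0.1          # hypothetical Fourier component v_eff(G)

lower_edge, upper_edge = two_band_energies(G / 2, G, v)  # degenerate case (zone boundary)
lower_center, _ = two_band_energies(0.0, G, v)           # non-degenerate case (zone center)
print(upper_edge - lower_edge, lower_center)
```

At the zone boundary the two levels are shifted by ±|v|, so the gap is 2|v|. At the zone center the lowest level is only shifted by approximately −|v|²/G², i.e., in second order.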

5.1.1 What Can We Learn from a Band Structure?

First we note that in an N electron system at T = 0 K, the N lowest-energy states are occupied, i.e., each state $\varphi_{n,\mathbf{k}}(\mathbf{r})$ can be filled by two electrons: ↑ and ↓.

² cf. e.g. Ashcroft/Mermin, Chapter 9 or Madelung, Chapters 18–19.

One band, i.e., a function εn(k) with fixed n but variable k (k ∈ 1st BZ), can be occupied by

$$ 2\int_{\rm 1.BZ}\frac{V_g}{(2\pi)^3}\,d^3k \qquad (5.5) $$

electrons. The factor 2 takes into account the spin.

We have:

$$ 2\int_{\rm 1.BZ}\frac{V_g}{(2\pi)^3}\,d^3k = 2\,\frac{V_g}{(2\pi)^3}\,\frac{(2\pi)^3}{\Omega} = 2\,\frac{V_g}{\Omega} = 2N . \qquad (5.6) $$

One band can be filled by 2N electrons, where N is the number of unit cells in the base volume $V_g$, and Ω is the volume of the unit cell. Thus, one band can be filled by 2 electrons per primitive unit cell.

The number of electrons per primitive unit cell and the band structure determine important electric and optical properties of a solid, and we will now consider three special systems:

• System #1: One electron per cell.
This can be, for example, alkali metals or noble metals (Cu, Ag, Au). The lowest band of the band structure is then half filled, i.e., the Fermi energy is in the middle of the band (cf. Fig. 5.4, $\epsilon_F^1$). Thus, directly above the highest occupied state there are unoccupied states. Therefore, the energy required to excite an electron and increase its kinetic energy is arbitrarily small. Such a system is an electric conductor.

• System #2: Two electrons per cell and a band structure as in Fig. 5.4, left.
If ∆ < 0, as in the left panel of Fig. 5.4, the Fermi energy is at $\epsilon_F^2$. Thus, the second band is partially filled while a fraction of the first band remains unoccupied. Also in this case the Fermi edge cuts bands, and thus this system is a metal as well.

• System #3: Two electrons per cell, and ∆ > 0 (Fig. 5.4, right).
When ∆ > 0, the lowest band is filled, and the Fermi energy is in the band gap above this band. The band occupied at T = 0 K is called valence band (VB), the unoccupied band above the band gap is called conduction band (CB). The position of the Fermi energy is then set equal to the chemical potential at T = 0 K. In this case, the electrons require an energy of at least $E_{\rm gap} = \Delta_{\rm KS} + \Delta_{\rm xc}$ for the excitation from an occupied to an unoccupied state. Here, $\Delta_{\rm KS}$ is given by the Kohn-Sham eigenvalues calculated for the N-particle ground state:

$$ \Delta_{\rm KS} = \epsilon^N_{\rm CB} - \epsilon^N_{\rm VB} . $$

The quantity $\Delta_{\rm xc}$ is introduced here because in principle $\Delta_{\rm KS}$ does not correspond to an excitation. We will come back to this point in Section tba. below. Systems with $E_{\rm gap} \neq 0$ consequently are not electric conductors but, depending on the size of the band gap $\Delta_{\rm KS} + \Delta_{\rm xc}$, they are called insulators or semiconductors.


The size of the band gap determines, for instance, the appearance of the material. The measured band gap is the difference between the ionization energy $I$ (removal of an electron from the highest level of the valence band) and the affinity $A$ (addition of an electron in the lowest level of the conduction band)³:

$$E_{\rm gap} = I - A , \tag{5.7}$$

$$A = E^{N} - E^{N+1} , \tag{5.8}$$

$$I = E^{N-1} - E^{N} . \tag{5.9}$$

From this we obtain

$$E_{\rm gap} = E^{N-1} - E^{N} + E^{N+1} - E^{N} = E^{N-1} + E^{N+1} - 2E^{N} . \tag{5.10}$$

Thus, for a correct evaluation of the band gap we need information for three different systems: the $N-1$, the $N$, and the $N+1$ particle system. The Kohn-Sham eigenvalues are typically only evaluated for the $N$-particle system.
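The arithmetic of Eqs. (5.7)-(5.10) can be sketched in a few lines. This is only an illustration: the three total energies below are hypothetical numbers chosen to show the bookkeeping, not results of any calculation.

```python
def band_gap_from_total_energies(e_nm1, e_n, e_np1):
    """Fundamental gap E_gap = I - A from three total energies, Eq. (5.10)."""
    ionization = e_nm1 - e_n   # I, Eq. (5.9): remove one electron
    affinity = e_n - e_np1     # A, Eq. (5.8): add one electron
    return ionization - affinity


# Hypothetical total energies (eV) of the N-1, N and N+1 particle systems.
E_NM1, E_N, E_NP1 = -95.0, -100.0, -104.0
gap = band_gap_from_total_energies(E_NM1, E_N, E_NP1)
print(f"I = {E_NM1 - E_N} eV, A = {E_N - E_NP1} eV, E_gap = {gap} eV")
```

The point of writing it this way is that the gap needs three separate total-energy calculations, whereas the Kohn-Sham eigenvalue difference needs only one.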

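On the other hand, for the Kohn-Sham eigenvalues we have:

$$A = E^{N} - E^{N+1} \approx -\epsilon^{N+\frac{1}{2}}_{\rm CB} , \tag{5.11}$$

$$I = E^{N-1} - E^{N} \approx -\epsilon^{N-\frac{1}{2}}_{\rm VB} , \tag{5.12}$$

where we assumed that the total energy is a continuous and differentiable function of the particle number between $N-1$ and $N$ and between $N$ and $N+1$. At integer values of $N$, this may not be the case (cf. the discussion on the Janak-Slater transition state, Eq. (3.197)).

For the band gap we then obtain

$$E_{\rm gap} \approx \epsilon^{N+\frac{1}{2}}_{\rm CB} - \epsilon^{N-\frac{1}{2}}_{\rm VB} \tag{5.13}$$

$$\phantom{E_{\rm gap}} = \epsilon^{N}_{\rm CB} - \epsilon^{N}_{\rm VB} + \Delta_{\rm xc} . \tag{5.14}$$

In the last line we used the Kohn-Sham eigenvalues only for the $N$-particle ground state and called the correction term $\Delta_{\rm xc}$.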

At this point it is still debated whether the correction $\Delta_{\rm xc}$ is present at all for the exact DFT and, if it is, how big it may be.⁴ For the known approximations of the xc functional it is quite clear, however, that the difference between the Kohn-Sham eigenvalue gap $(\epsilon^{N}_{\rm CB} - \epsilon^{N}_{\rm VB})$ and the measured experimental band gap is indeed noticeable, but much of this difference is due to the approximate treatment of xc. For example, in the LDA the Kohn-Sham band gap underestimates the experimental band gap by about 50%. At this point, the only practical way to calculate a band gap is to leave DFT and to employ many-body perturbation theory. Here, the so-called GW approximation (G is the Green function and W is the screened Coulomb interaction) is the state-of-the-art approach (see Section tba. below).

³ I and A are both defined as positive quantities.

⁴ For a recent discussion see P. Mori-Sanchez, A. J. Cohen, W. Yang, Phys. Rev. Lett. 100, 146401 (2008).

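Figure 5.6: By measuring the intensity $I(\epsilon_f, \theta, \phi, \hbar\omega, \mathbf{e})$ of the electrons emitted by an optical excitation $\hbar\omega$ ($\epsilon_f$ is the kinetic energy of the emitted electrons in vacuum; $\mathbf{e}$ and $\hbar\omega$ are the electric field vector and the energy of the light), information on the occupied states of the band structure can be obtained. [Schematic with two panels: (i) geometry of the experiment with axes x, y, z, the incoming photon $\hbar\omega$ and the emitted electron $e^-$ at angles $\theta$, $\phi$; (ii) energy versus position z, showing $v^{\rm eff}$, the occupied bands in the crystal volume, and the vacuum region, with energy scales of ≈ 12 eV and ≈ 4 eV marked.]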

The quantity $E_{\rm gap}$ determines the appearance of the material. If $E_{\rm gap}$ is lower than the energy of visible light⁵, the solid looks like a metal (e.g. Si: $E_{\rm gap} = 1.1$ eV, or GaAs: $E_{\rm gap} = 1.45$ eV). If the band gap is in the range of visible light or larger, the solid is transparent to light (e.g. GaP: $E_{\rm gap} = 2.35$ eV looks orange because blue light is absorbed; diamond is clear and colorless: $E_{\rm gap} \approx 6$ eV). For semiconductors $E_{\rm gap}$ is of the order of 0.5 ... 5 eV, so that at room temperature some electrons are excited (at 300 K we have $k_B T = 0.026$ eV). For insulators we have $E_{\rm gap} \gg k_B T$. The term “semiconductor” is not well defined: strictly speaking, a semiconductor is an insulator with a not too large band gap. In my view this is, however, not a useful definition. Much more relevant is the following: A semiconductor is an insulator that can be doped (i.e., impurity atoms can be added) to generate charge carriers in the valence band and/or in the conduction band.
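The rule of thumb above can be written down as a small sketch. The function names and the three labels are illustrative choices, not part of the lecture; the visible-light window is the one quoted in the footnote, and the gap values are the ones quoted in the text.

```python
VISIBLE_MIN_EV = 1.65        # lower edge of visible light (footnote value)
VISIBLE_MAX_EV = 3.1         # upper edge of visible light (footnote value)
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K


def appearance(e_gap_ev):
    """Crude classification of the optical appearance by the band gap."""
    if e_gap_ev < VISIBLE_MIN_EV:
        return "opaque, metal-like"      # all visible light is absorbed
    if e_gap_ev <= VISIBLE_MAX_EV:
        return "transparent, colored"    # only part of the spectrum is absorbed
    return "transparent, colorless"      # no visible light is absorbed


def thermal_energy_ev(temperature_k):
    """k_B * T in eV."""
    return K_B_EV_PER_K * temperature_k


examples = {"Si": 1.1, "GaAs": 1.45, "GaP": 2.35, "diamond": 6.0}
labels = {name: appearance(gap) for name, gap in examples.items()}
for name, label in labels.items():
    print(f"{name:8s} E_gap = {examples[name]:4.2f} eV -> {label}")
print(f"k_B T at 300 K = {thermal_energy_ev(300.0):.3f} eV")
```

Real materials are of course more subtle (indirect gaps, defects, excitons); the sketch only encodes the simple comparison of the gap with the visible window.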

At the end of this paragraph, we note that the band structure $\epsilon_n(\mathbf{k})$ can be studied experimentally. Angle-resolved photoemission (cf. Figs. 5.6 and 5.7) measures the electrons leaving the solid upon irradiation with light of energy $\hbar\omega$, mostly UV or X-rays. More precisely: One measures the kinetic energy $\epsilon_f$ of these electrons, the direction of their motion $\mathbf{k}_f$, and their number per unit time (the index “f” (“final”) refers to the final state). From this, the energy of the occupied states $\epsilon_n(\mathbf{k}) \approx \epsilon_f - \hbar\omega$ and the corresponding $\mathbf{k}$-vectors can be determined. Such experiments require a tunable light frequency and thus a synchrotron. They are carried out for example at BESSY in Berlin, and at many other “synchrotron light sources” in the world.

The theoretical band structure of Fig. 5.7 agrees very well with the experimental data. This shows that the theory captures the right physics. However, the excellent quantitative agreement is also a consequence of the fact that the theory used here was an empirical one. Ab initio calculations do slightly worse for the bands. For the band gap between VB and CB (not shown in Fig. 5.7), DFT is so far not doing well.

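5.2 General Properties of ε_n(k)

5.2.1 Continuity of ε_n(k) and Meaning of the First and Second Derivatives of ε_n(k)

With

$$\varphi_{n,\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\mathbf{r}} u_{n,\mathbf{k}}(\mathbf{r}) , \tag{5.15}$$

we have

$$\begin{aligned}
h\,\varphi_{n,\mathbf{k}}(\mathbf{r}) &= \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \right] e^{i\mathbf{k}\mathbf{r}} u_{n,\mathbf{k}}(\mathbf{r}) \\
&= e^{i\mathbf{k}\mathbf{r}} \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) + \frac{\hbar^2}{2m}k^2 - 2\,\frac{\hbar^2}{2m}\, i\mathbf{k}\nabla \right] u_{n,\mathbf{k}}(\mathbf{r}) \\
&= \epsilon_n(\mathbf{k})\, e^{i\mathbf{k}\mathbf{r}} u_{n,\mathbf{k}}(\mathbf{r}) .
\end{aligned} \tag{5.16}$$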
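The second line of Eq. (5.16) follows from applying the product rule twice. Written out as a short worked step (nothing beyond the product rule is assumed; $u$ stands for $u_{n,\mathbf{k}}(\mathbf{r})$):

```latex
\begin{aligned}
\nabla\left(e^{i\mathbf{k}\mathbf{r}}\,u\right)
  &= e^{i\mathbf{k}\mathbf{r}}\left(\nabla + i\mathbf{k}\right)u , \\
\nabla^2\left(e^{i\mathbf{k}\mathbf{r}}\,u\right)
  &= e^{i\mathbf{k}\mathbf{r}}\left(\nabla + i\mathbf{k}\right)^2 u
   = e^{i\mathbf{k}\mathbf{r}}\left(\nabla^2 + 2i\,\mathbf{k}\nabla - k^2\right)u , \\
-\frac{\hbar^2}{2m}\nabla^2\left(e^{i\mathbf{k}\mathbf{r}}\,u\right)
  &= e^{i\mathbf{k}\mathbf{r}}\left(-\frac{\hbar^2}{2m}\nabla^2
     - 2\,\frac{\hbar^2}{2m}\,i\,\mathbf{k}\nabla
     + \frac{\hbar^2}{2m}k^2\right)u .
\end{aligned}
```

The potential $v^{\rm eff}(\mathbf{r})$ is a multiplicative operator and simply commutes with the phase factor.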

⁵ Visible light is in the energy range 1.65 eV < $\hbar\omega$ < 3.1 eV.

Figure 5.7: Comparison of a calculated empirical-pseudopotential band structure for GaAs (J.C. Phillips and K.C. Pandey, Phys. Rev. Lett. 30, 787 (1973)) (dashed curve) with data measured by angle-resolved photoemission (T.C. Chiang, J.A. Knapp, M. Aono, and D.E. Eastman, Phys. Rev. B 21, 3515 (1980)). For more recent band structures based on the Kohn-Sham eigenvalues and using the LDA or the GGA the agreement is less good. In particular, the Kohn-Sham band gap between the top of the valence band and the bottom of the conduction band (not shown in the figure) is typically much smaller than the experimental one (at least when the LDA or GGA are used).

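Thus, we find

$$\epsilon_n(\mathbf{k}) = \int_{V_g} u^*_{n,\mathbf{k}}(\mathbf{r})\, h(\mathbf{k})\, u_{n,\mathbf{k}}(\mathbf{r})\, d^3r , \tag{5.17}$$

with a modified Hamiltonian

$$h(\mathbf{k}) \equiv \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) + \frac{\hbar^2}{2m}k^2 - 2\,\frac{\hbar^2}{2m}\, i\mathbf{k}\nabla \right] . \tag{5.18}$$

The Hamiltonian $h(\mathbf{k})$ determines the eigenvalue problem for given vectors $\mathbf{k} \in$ BZ:

$$h(\mathbf{k})\, u_{n,\mathbf{k}} = \epsilon_n(\mathbf{k})\, u_{n,\mathbf{k}} , \tag{5.19}$$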

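so that we only have to consider one primitive unit cell because $u_{n,\mathbf{k}}(\mathbf{r})$ is periodic. In order to investigate the analytical properties of $\epsilon_n(\mathbf{k})$, we look at the neighborhood of an arbitrary point $\mathbf{k}$. For $\mathbf{k}+\boldsymbol{\kappa}$ we then have

$$\epsilon_n(\mathbf{k}+\boldsymbol{\kappa}) = \int u^*_{n,\mathbf{k}+\boldsymbol{\kappa}}(\mathbf{r})\, h(\mathbf{k}+\boldsymbol{\kappa})\, u_{n,\mathbf{k}+\boldsymbol{\kappa}}(\mathbf{r})\, d^3r . \tag{5.20}$$

As long as $|\boldsymbol{\kappa}|$ is small, the difference between $h(\mathbf{k})$ and $h(\mathbf{k}+\boldsymbol{\kappa})$,

$$h(\mathbf{k}+\boldsymbol{\kappa}) - h(\mathbf{k}) = \frac{\hbar^2}{2m}\left(\kappa^2 + 2\mathbf{k}\boldsymbol{\kappa}\right) - \frac{\hbar^2}{2m}\, 2i\boldsymbol{\kappa}\nabla , \tag{5.21}$$

is also small. It seems reasonable to calculate the energy $\epsilon_n(\mathbf{k}+\boldsymbol{\kappa})$ by perturbation theory, i.e., by expanding the functions $u_{n,\mathbf{k}+\boldsymbol{\kappa}}(\mathbf{r})$ with respect to the functions $u_{n,\mathbf{k}}(\mathbf{r})$ of the unperturbed problem:

$$\begin{aligned}
\epsilon_n(\mathbf{k}+\boldsymbol{\kappa}) ={}& \underbrace{\int u^*_{n,\mathbf{k}}(\mathbf{r})\, h(\mathbf{k})\, u_{n,\mathbf{k}}(\mathbf{r})\, d^3r}_{\text{0th order}} \\
&+ \underbrace{\int u^*_{n,\mathbf{k}}(\mathbf{r}) \left\{ \frac{\hbar^2}{2m}\left(\kappa^2 + 2\mathbf{k}\boldsymbol{\kappa}\right) - \frac{\hbar^2}{2m}\, 2i\boldsymbol{\kappa}\nabla \right\} u_{n,\mathbf{k}}(\mathbf{r})\, d^3r}_{\text{1st order}} \\
&+ \underbrace{\sum_{m \neq n} \frac{\left| \int u^*_{n,\mathbf{k}}(\mathbf{r}) \left\{ \frac{\hbar^2}{2m}\left(\kappa^2 + 2\mathbf{k}\boldsymbol{\kappa}\right) - \frac{\hbar^2}{2m}\, 2i\boldsymbol{\kappa}\nabla \right\} u_{m,\mathbf{k}}(\mathbf{r})\, d^3r \right|^2}{\epsilon_n(\mathbf{k}) - \epsilon_m(\mathbf{k})}}_{\text{2nd order}} \\
&+ O(\kappa^3) .
\end{aligned} \tag{5.22}$$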

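For convenience, we now introduce the matrix element of the momentum operator:

$$\mathbf{p}_{nm} = \langle \varphi_{n,\mathbf{k}}(\mathbf{r}) |\, \frac{\hbar}{i}\nabla \,| \varphi_{m,\mathbf{k}}(\mathbf{r}) \rangle = \langle u_{n,\mathbf{k}}(\mathbf{r}) |\, \hbar\mathbf{k} + \frac{\hbar}{i}\nabla \,| u_{m,\mathbf{k}}(\mathbf{r}) \rangle , \tag{5.23}$$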

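where the second equals sign follows from Bloch's theorem. If we combine Eq. (5.23) and Eq. (5.22), we obtain:

$$\epsilon_n(\mathbf{k}+\boldsymbol{\kappa}) - \epsilon_n(\mathbf{k}) = \frac{\hbar}{m}\boldsymbol{\kappa}\mathbf{p}_{nn} + \frac{\hbar^2}{2m}\kappa^2 + \frac{\hbar^2}{m^2} \sum_{m \neq n} \frac{|\boldsymbol{\kappa}\mathbf{p}_{nm}|^2}{\epsilon_n(\mathbf{k}) - \epsilon_m(\mathbf{k})} + O(\kappa^3) . \tag{5.24}$$

Here the perturbation (5.21) has been written as $\frac{\hbar^2}{2m}\kappa^2 + \frac{\hbar}{m}\boldsymbol{\kappa}\left(\hbar\mathbf{k} + \frac{\hbar}{i}\nabla\right)$; its constant part does not contribute to the second-order sum because the $u_{m,\mathbf{k}}$ with $m \neq n$ are orthogonal to $u_{n,\mathbf{k}}$.

The limit $|\boldsymbol{\kappa}| \to 0$ illustrates that $\epsilon_n(\mathbf{k})$ is continuous as a function of $\mathbf{k}$, and that it is differentiable. Furthermore, we obtain the gradient $\nabla_{\mathbf{k}}$ of $\epsilon_n(\mathbf{k})$:

$$\mathbf{p}_{nn} = \frac{m}{\hbar} \nabla_{\mathbf{k}} \epsilon_n(\mathbf{k}) . \tag{5.25}$$

The expectation value of the momentum operator therefore is not proportional to $\mathbf{k}$, as for free electrons, but is given by the gradient of the function $\epsilon_n(\mathbf{k})$. This is a rather important modification and, e.g., it may occur that $\mathbf{p}_{nn}$ decreases with increasing $\mathbf{k}$.

For the second derivative of the energy $\epsilon_n(\mathbf{k})$ with respect to $\mathbf{k}$ we obtain from Eq. (5.24) in the limit $|\boldsymbol{\kappa}| \to 0$:

$$\frac{\partial^2}{\partial k_\alpha \partial k_\beta} \epsilon_n(\mathbf{k}) = \frac{\hbar^2}{m}\delta_{\alpha\beta} + \frac{2\hbar^2}{m^2} \sum_{m \neq n} \frac{{\rm Re}\left( p_{\alpha,nm}\, p^*_{\beta,nm} \right)}{\epsilon_n(\mathbf{k}) - \epsilon_m(\mathbf{k})} . \tag{5.26}$$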

For free electrons, the second term on the right side of Eq. (5.26) vanishes, and the expression reduces to $\hbar^2/m$, i.e., to $\hbar^2$ times the inverse of the inertial mass of an electron. Close to maxima or minima of the band structure, Bloch electrons behave as if they had a direction-dependent mass given by the tensor (5.26): This effective mass contains the effects due to the electron-lattice interaction and the electron-electron interaction. If we are interested in electronic states and their behavior at band extrema, the Hamilton operator can be simplified using the concept of effective mass as introduced in Eq. (5.26):

$$h = -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \;\longrightarrow\; h = -\frac{\hbar^2}{2m^*}\nabla^2 , \tag{5.27}$$

with

$$\frac{1}{m^*} = \frac{1}{\hbar^2} \frac{\partial^2 \epsilon_n(\mathbf{k})}{\partial k_\alpha \partial k_\beta} . \tag{5.28}$$

Here, $\frac{1}{m^*}$ is a tensor and the expression is reasonable only in the part of the Brillouin zone where $\epsilon_n(\mathbf{k})$ has a parabolic shape to a good approximation. This is obviously a severe limitation, but in some later parts of this lecture the concept of the effective mass will prove useful.
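As a minimal numerical sketch of Eq. (5.28), the effective mass can be estimated from a finite-difference second derivative. The one-dimensional cosine band and its parameters are hypothetical model choices, not taken from the lecture, and atomic units ($\hbar = m = 1$) are assumed.

```python
import math

HBAR = 1.0  # atomic units (assumption): hbar = m_e = 1


def band_energy(k, hopping=0.5, lattice_constant=1.0):
    """Hypothetical 1D model band eps(k) = -2 t cos(k a)."""
    return -2.0 * hopping * math.cos(k * lattice_constant)


def effective_mass(band, k0, dk=1e-3):
    """m* = hbar^2 / (d^2 eps / dk^2), central finite difference at k0."""
    curvature = (band(k0 + dk) - 2.0 * band(k0) + band(k0 - dk)) / dk**2
    return HBAR**2 / curvature


m_star_bottom = effective_mass(band_energy, 0.0)     # band minimum
m_star_top = effective_mass(band_energy, math.pi)    # band maximum
print(f"m* at band bottom: {m_star_bottom:+.4f}")
print(f"m* at band top:    {m_star_top:+.4f}")
```

For this model the curvature at the minimum is $2ta^2$, so a larger hopping (wider band) gives a lighter carrier; at the band maximum the curvature, and hence $m^*$, is negative.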

In Fig. 5.8, we show the lower edges of the conduction bands of two semiconductors. An electron at the bottom of the conduction band of silicon (left figure) has a larger effective mass than an electron at the bottom of the conduction band in GaAs (right figure). Thus, the mobility of the conduction-band electrons in GaAs is larger.

Figure 5.8: Band structure of the lower conduction band for two semiconductors, left Si and right GaAs. For the left system the curvature at the minimum is small. Thus, $1/m_{\rm eff}$ is small, i.e., $m_{\rm eff}$ is large and the mobility of the charge carriers in this state is low. For the right system the opposite is true, i.e., the mobility of the charge carriers at the minimum of the right system is high.

5.2.2 Time Reversal Symmetry

Further important properties of $\epsilon_n(\mathbf{k})$ are found when we consider the operation of time reversal $T_t$:⁶

$$T_t : t \longrightarrow -t . \tag{5.29}$$

This operator reverses the state of motion. Thus, time reversal means that

$$t_1 < t_2 \;\Rightarrow\; T_t t_1 > T_t t_2 , \tag{5.30a}$$

$$T_t (t_2 - t_1) = -(t_2 - t_1) , \tag{5.30b}$$

and

$$T_t = T_t^{-1} . \tag{5.30c}$$

By inserting this into the time-dependent Schrödinger equation

$$i\hbar \frac{\partial}{\partial t} \varphi(\mathbf{r}, t) = \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \right] \varphi(\mathbf{r}, t) , \tag{5.31}$$

we find that

$$T_t \mathbf{r} = \mathbf{r} T_t , \tag{5.32a}$$

$$T_t \mathbf{p} = T_t \frac{\hbar}{i}\nabla = -\frac{\hbar}{i}\nabla T_t , \tag{5.32b}$$

and

$$T_t \boldsymbol{\sigma} = -\boldsymbol{\sigma} T_t . \tag{5.32c}$$

In other words: $T_t$ commutes with $\mathbf{r}$ and it anticommutes with $\mathbf{p}$ and $\boldsymbol{\sigma}$. If the Hamilton operator depends on an external magnetic field, the result of $T_t$ is that the magnetic field also changes sign:

$$T_t \mathbf{B} = -\mathbf{B} . \tag{5.32d}$$

The proof of these anticommutation relations can be found in textbooks and is not repeated here.

⁶ cf. Madelung I., p. 107; Tinkham, p. 143.

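At first, we investigate a system without spin-orbit coupling and without any magnetic field, for which the Hamilton operator is real:

$$h = h^* . \tag{5.33}$$

We investigate two possibly different eigenfunctions of $h$, $\varphi_n(\mathbf{k}, \mathbf{r})$ and $\varphi^*_n(\mathbf{k}, \mathbf{r})$. Both functions are degenerate, because we have

$$h \varphi_n(\mathbf{k}, \mathbf{r}) = \epsilon_n(\mathbf{k}) \varphi_n(\mathbf{k}, \mathbf{r}) , \tag{5.34}$$

and, taking the complex conjugate and using Eq. (5.33):

$$h^* \varphi^*_n(\mathbf{k}, \mathbf{r}) = h \varphi^*_n(\mathbf{k}, \mathbf{r}) = \epsilon_n(\mathbf{k}) \varphi^*_n(\mathbf{k}, \mathbf{r}) . \tag{5.35}$$

Are $\varphi_n(\mathbf{k}, \mathbf{r})$ and $\varphi^*_n(\mathbf{k}, \mathbf{r})$ physically different eigenfunctions? To answer this question we apply the translation operator to both functions:

$$T_{\mathbf{R}_I} \varphi_n(\mathbf{k}, \mathbf{r}) = e^{i\mathbf{k}\mathbf{R}_I} \varphi_n(\mathbf{k}, \mathbf{r}) , \tag{5.36}$$

or

$$T_{\mathbf{R}_I} \varphi^*_n(\mathbf{k}, \mathbf{r}) = e^{-i\mathbf{k}\mathbf{R}_I} \varphi^*_n(\mathbf{k}, \mathbf{r}) . \tag{5.37}$$

On the other hand we have:

$$T_{\mathbf{R}_I} \varphi_n(-\mathbf{k}, \mathbf{r}) = e^{-i\mathbf{k}\mathbf{R}_I} \varphi_n(-\mathbf{k}, \mathbf{r}) . \tag{5.38}$$

For $\varphi_n(\mathbf{k}, \mathbf{r})$ we thus have the quantum numbers $n$ and $\mathbf{k}$. On the other hand, for $\varphi^*_n(\mathbf{k}, \mathbf{r})$ and for $\varphi_n(-\mathbf{k}, \mathbf{r})$ the quantum numbers are $n$ and $-\mathbf{k}$. Thus, $\varphi^*_n(\mathbf{k}, \mathbf{r})$ is identical to $\varphi_n(-\mathbf{k}, \mathbf{r})$. The energies of $\varphi_n(\mathbf{k}, \mathbf{r})$ and of $\varphi_n(-\mathbf{k}, \mathbf{r}) = \varphi^*_n(\mathbf{k}, \mathbf{r})$ are $\epsilon_n(\mathbf{k})$ and $\epsilon_n(-\mathbf{k})$, and with Eq. (5.35) we obtain:

$$\epsilon_n(\mathbf{k}) = \epsilon_n(-\mathbf{k}) . \tag{5.39}$$

Another reason for this degeneracy to occur is given when there is a spatial inversion symmetry, but what we have just found is also valid if the crystal does not have spatial inversion symmetry.

Now we investigate the Hamilton operator of a single-particle problem with spin-orbit coupling: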

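$$h = \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\rm eff}(\mathbf{r}) \right] \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \frac{\hbar}{4m^2c^2}\, \boldsymbol{\sigma} \cdot \left( \nabla v^{\rm eff}(\mathbf{r}) \times \frac{\hbar}{i}\nabla \right) , \tag{5.40}$$

where the components of $\boldsymbol{\sigma}$ have the form

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{5.41}$$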

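For the Hamilton operator without an external magnetic field we then have

$$[h, T_t] = 0 , \tag{5.42}$$

which means

$$h \varphi = \epsilon \varphi , \tag{5.43}$$

and

$$h (T_t \varphi) = \epsilon (T_t \varphi) . \tag{5.44}$$

As long as $\varphi$ and $T_t \varphi$ are linearly independent, both functions are degenerate. According to the definitions given above, $T_t$ can be defined by the following equation:

$$T_t = -i\sigma_y K , \tag{5.45}$$

where $K$ is the conjugator

$$K \varphi = \varphi^* . \tag{5.46}$$

One can easily prove that the operator $T_t$ defined this way has the required properties (note: $\sigma_y$ acts on the two components of a spinor, not on $\mathbf{r}$ or $\nabla$):

$$(-i\sigma_y K)\, \mathbf{r} = \mathbf{r}\, (-i\sigma_y K) , \tag{5.47}$$

$$(-i\sigma_y K)\, \frac{\hbar}{i}\nabla = -\frac{\hbar}{i}\nabla\, (-i\sigma_y K) , \tag{5.48}$$

$$(-i\sigma_y K)\, \sigma_x = -\sigma_x\, (-i\sigma_y K) , \tag{5.49}$$

etc.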

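In our single-particle problem, the spin state is well defined. If we investigate a wave function with “spin up” (in the limit of j-j-coupling of the many-body system), we obtain:

$$T_t \begin{bmatrix} \psi \\ 0 \end{bmatrix} = -i\sigma_y K \begin{bmatrix} \psi \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \psi^* \end{bmatrix} \tag{5.50}$$

and analogously for “spin down”

$$T_t \begin{bmatrix} 0 \\ \psi \end{bmatrix} = - \begin{bmatrix} \psi^* \\ 0 \end{bmatrix} . \tag{5.51}$$

Thus we have:

$$T_t^2\, \psi(\mathbf{r}, \sigma) = -\psi(\mathbf{r}, \sigma) . \tag{5.52}$$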
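The action of $T_t = -i\sigma_y K$ on a two-component spinor is easy to check numerically. The sketch below represents a spinor at one point in space as a pair of complex numbers; the test values are arbitrary, and the helper names are illustrative.

```python
def time_reversal(spinor):
    """Apply T_t = -i sigma_y K to a two-component spinor (up, down)."""
    up, down = spinor
    up_c, down_c = up.conjugate(), down.conjugate()   # K: complex conjugation
    # -i sigma_y = [[0, -1], [1, 0]] acts on the conjugated components
    return (-down_c, up_c)


def inner(a, b):
    """Scalar product of two spinors at a single point."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]


psi = (0.6 + 0.3j, -0.2 + 0.7j)   # arbitrary test spinor
t_psi = time_reversal(psi)
tt_psi = time_reversal(t_psi)

print("T psi   =", t_psi)
print("T^2 psi =", tt_psi)                  # equals -psi, Eq. (5.52)
print("<psi|T psi> =", inner(psi, t_psi))   # vanishes: psi and T psi are orthogonal
print("spin up ->", time_reversal((1 + 2j, 0j)))   # Eq. (5.50)
```

Because $T_t^2 = -1$, a state can never be proportional to its own time-reversed partner, which is the origin of the twofold degeneracy discussed next.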

It is clear that $\psi$ and $T_t \psi$ are orthogonal to each other: they are the wave functions for “spin up” and “spin down”, respectively. Therefore, as noted in Eqs. (5.43) and (5.44), we generally have that $\varphi_{n,\mathbf{k}}$ and $T_t \varphi_{n,\mathbf{k}}$ are linearly independent functions and that the energy level $\epsilon_n(\mathbf{k})$ is twofold degenerate.

Figure 5.9: The Fermi Surface Database: http://www.phys.ufl.edu/fermisurface

Therefore, the eigenvalue $\epsilon$ of a single-particle problem is, independent of spatial symmetry, twofold degenerate because of time-reversal symmetry:

$$\epsilon_n(\mathbf{k}, \uparrow) = \epsilon_n(-\mathbf{k}, \downarrow) . \tag{5.53}$$

As long as spin polarization can be neglected, we have further:

$$\epsilon_n(\mathbf{k}, \uparrow) = \epsilon_n(\mathbf{k}, \downarrow) = \epsilon_n(-\mathbf{k}, \uparrow) = \epsilon_n(-\mathbf{k}, \downarrow) , \tag{5.54}$$

i.e., 4-fold degeneracy!

5.2.3 The Fermi Surface

For metals at $T = 0$ K, all single-particle states with $\epsilon_n(\mathbf{k}) \le \epsilon_F$ are occupied. This is in fact the definition of $\epsilon_F$: For a finite $N$-electron system we have $\epsilon_F = \epsilon_N$. For metals (infinite and periodic) there is a Fermi surface. It is defined by the equation

$$\epsilon_n(\mathbf{k}) = \epsilon_F . \tag{5.55}$$

The Fermi surface is a surface of constant energy in $\mathbf{k}$-space. It separates the occupied from the unoccupied states. The Fermi surface consequently exists only for metals because, if $\epsilon_F$ is in a band gap, condition (5.55) is never fulfilled. Then there is no Fermi surface.

At first we investigate the Fermi surface for the example of electrons in jellium. As the band structure is given by

$$\epsilon(\mathbf{k}) = \frac{\hbar^2}{2m} k^2 , \tag{5.56}$$

the Fermi surface is the surface of a sphere of radius $|\mathbf{k}| = k_F = \sqrt{\frac{2m}{\hbar^2}\epsilon_F}$.
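For the jellium case, condition (5.55) can be checked directly. The sketch assumes atomic units ($\hbar = m = 1$, energies in hartree); the value of the Fermi energy is a hypothetical input, and the function names are illustrative.

```python
import math


def fermi_wave_vector(e_fermi):
    """k_F = sqrt(2 m eps_F / hbar^2), in atomic units (hbar = m = 1)."""
    return math.sqrt(2.0 * e_fermi)


def on_fermi_surface(k, e_fermi, tol=1e-9):
    """True if the free-electron energy eps(k) = k^2 / 2 equals eps_F."""
    energy = 0.5 * sum(component**2 for component in k)
    return abs(energy - e_fermi) < tol


E_F = 0.2                         # hypothetical Fermi energy (hartree)
k_f = fermi_wave_vector(E_F)
print(f"k_F = {k_f:.4f} bohr^-1")
print(on_fermi_surface((k_f, 0.0, 0.0), E_F))                                 # on the sphere
print(on_fermi_surface((0.0, k_f / math.sqrt(2), k_f / math.sqrt(2)), E_F))   # also on the sphere
print(on_fermi_surface((0.0, 0.0, 0.0), E_F))                                 # inside: occupied state
```

Any direction gives the same $|\mathbf{k}| = k_F$ here, which is what makes the free-electron Fermi surface a sphere; for a real band structure the same test would be applied to $\epsilon_n(\mathbf{k})$ band by band.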

For nearly-free electrons in a periodic system, the band structure looks more complicated and so does the Fermi surface. We have:

$$\epsilon_n(\mathbf{k}) = \frac{\hbar^2}{2m} (\mathbf{k} + \mathbf{G}_n)^2 \overset{!}{=} \epsilon_F . \tag{5.57}$$

Sometimes only one band index $n$ will contribute to the Fermi surface, but in general different $n$ will contribute to different sections.

For realistic materials, i.e., $v^{\rm eff}(\mathbf{r}) \neq$ const., the Fermi surfaces look again more complicated. Figure 5.9 shows some examples from “The Fermi Surface Database”. More details can be found on that webpage.

The Fermi surface is a useful quantity for predicting the thermal, electrical, magnetic, and optical properties of metals, semimetals, and doped semiconductors. It can be measured by various techniques. Traditionally, these are studies of the de Haas-van Alphen effect (an oscillation of the magnetic susceptibility) and the Shubnikov-de Haas effect (an oscillation of the resistivity). The oscillations are periodic in $1/H$. They reflect the quantization of the energy levels in the plane perpendicular to a magnetic field (Landau levels), and the period of the oscillation is related to the cross section of the Fermi surface perpendicular to the magnetic field direction, as initially shown by Onsager.

A direct experimental technique to obtain the electronic structure $\epsilon(\mathbf{k})$, and also the Fermi surface, is angle-resolved photoemission (ARPES).

For a further, very detailed discussion of Fermi surfaces I refer to chapters 9, 14 and 15 of Ashcroft-Mermin. It can hardly be presented better than there.

5.3 The LCAO (linear combination of atomic orbitals) Method

In part 5.1 we assumed that the potential of the solid, $v^{\rm eff}$, is not particularly strong, and that the band structure is only a weak modification of the dispersion relation of free electrons. This led to the band structure in the approximation of nearly free electrons. This treatment is in principle exact, but typically $v^{\rm eff}(\mathbf{r}) = \sum_{\mathbf{G}} v^{\rm eff}(\mathbf{G})\, e^{i\mathbf{G}\mathbf{r}}$ will have a very large number of Fourier components.

To get a feeling for the wave functions, energies, and for the forces that hold the solid together, another point of view is also possible. For this we now want to start with well separated atoms and investigate what happens if these atoms are brought closer together. In fact the situation present in a solid is rather in the middle between the properties of atoms or molecules and the ones describing the behavior of nearly free electrons. The various modern numerical methods for the calculation of the electronic structure of solids therefore combine both aspects in their methodology. Such methods, in particular the ab initio pseudopotential theory, the linearized muffin-tin orbital method (LMTO) and the linearized augmented plane waves method (LAPW), are discussed in this Chapter.

Figure 5.10: Schematic presentation of the atomic eigenstates of two H atoms in the H2 molecule. [Sketch along z (x = y = 0): the atomic orbitals $\varphi_{1s}(\mathbf{r}-\mathbf{R}_1)$ and $\varphi_{1s}(\mathbf{r}-\mathbf{R}_2)$, centered at H atom 1 and H atom 2.]

Figure 5.11: Schematic representation of the electronic eigenstates in the H2 molecule. [Sketch along z (x = y = 0): the “bonding” state $\varphi^b_{1s}(\mathbf{r})$ and the “antibonding” state $\varphi^a_{1s}(\mathbf{r})$.]

We start by reminding the reader about the H2 molecule: Since the Hamilton operator (and the potential) has a reflection symmetry, the eigenstates have to be either symmetric or antisymmetric with respect to this reflection plane. If we assume that the molecular eigenstates are linear combinations of the atomic 1s states $\varphi_{1s}$ (cf. Fig. 5.10), we have

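$$\varphi^b_{1s}(\mathbf{r}) = \frac{1}{\sqrt{A}} \left( \varphi_{1s}(\mathbf{r}-\mathbf{R}_1) + \varphi_{1s}(\mathbf{r}-\mathbf{R}_2) \right) , \tag{5.58}$$

$$\varphi^a_{1s}(\mathbf{r}) = \frac{1}{\sqrt{A}} \left( \varphi_{1s}(\mathbf{r}-\mathbf{R}_1) - \varphi_{1s}(\mathbf{r}-\mathbf{R}_2) \right) , \tag{5.59}$$

where $1/\sqrt{A}$ ensures the normalization of $\varphi^b$ and $\varphi^a$ to 1. This is illustrated in Fig. 5.11, and Fig. 5.12 shows the corresponding energy levels.

Figure 5.12: Schematic representation of the electronic energy levels in the H2 molecule. [Level diagram: the two atomic levels $\epsilon_{1s}$ combine into the “bonding” level $\epsilon_b$ and the “antibonding” level $\epsilon_a$.]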

Thus, when the atoms get closer to each other, so that the wave functions start to overlap, there is a splitting of the energy levels:

• $\epsilon_b$: low energy (thus favored); $\varphi_b$: (+)(+): the electron density $|\varphi_b|^2$ has a maximum between the nuclei.

• $\epsilon_a$: high energy (thus unfavored); $\varphi_a$: (+)(−): the electron density $|\varphi_a|^2$ is zero between the nuclei.
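The level splitting can be made quantitative with a minimal two-orbital model. This is a sketch under stated assumptions: the on-site energy `e_1s`, the hopping matrix element `hopping` $= \langle\varphi_{1s}(\mathbf{r}-\mathbf{R}_1)|h|\varphi_{1s}(\mathbf{r}-\mathbf{R}_2)\rangle$ and the overlap `overlap` $= \langle\varphi_{1s}(\mathbf{r}-\mathbf{R}_1)|\varphi_{1s}(\mathbf{r}-\mathbf{R}_2)\rangle$ are hypothetical parameters, not values from the lecture. With the combinations (5.58) and (5.59), the expectation values of $h$ are $(\epsilon_{1s} \pm t)/(1 \pm s)$ and the normalization constant is $A = 2(1 \pm s)$.

```python
def h2_levels(e_1s, hopping, overlap):
    """Bonding/antibonding levels of a two-orbital model with overlap.

    Returns (eps_b, eps_a, A_b, A_a), where A_b and A_a are the
    normalization constants A of Eqs. (5.58) and (5.59).
    """
    e_bonding = (e_1s + hopping) / (1.0 + overlap)
    e_antibonding = (e_1s - hopping) / (1.0 - overlap)
    norm_bonding = 2.0 * (1.0 + overlap)
    norm_antibonding = 2.0 * (1.0 - overlap)
    return e_bonding, e_antibonding, norm_bonding, norm_antibonding


# Hypothetical parameters (hartree); hopping < 0 for overlapping s-orbitals.
e_b, e_a, a_b, a_a = h2_levels(e_1s=-0.5, hopping=-0.3, overlap=0.4)
print(f"eps_b = {e_b:.4f}, eps_1s = -0.5000, eps_a = {e_a:.4f}")
print(f"A_b = {a_b:.2f}, A_a = {a_a:.2f}")
```

With a negative hopping element the bonding level lies below and the antibonding level above the atomic level, as in the list above.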

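We now use the same concept to describe a solid, i.e., we use such atom-centered functions as basis set. The LCAO basis set for the representation of the wave functions in a solid or molecule is defined by:

$$\chi_\alpha(\mathbf{k}, \mathbf{r}) = \frac{1}{\sqrt{A}} \sum_{\mathbf{R}_I}^{M} \gamma_I(\mathbf{k})\, \varphi_\alpha(\mathbf{r}-\mathbf{R}_I) \tag{5.60}$$

with

$M$: number of atoms in the base region

$\varphi_\alpha(\mathbf{r}-\mathbf{R}_I)$: atom-like function, centered at position $\mathbf{R}_I$ (e.g. numerical solution of the atomic Kohn-Sham equation, or Gaussians, or LMTOs, with $\alpha$ = 1s, 2s, 2p, ...)

From the translation invariance in a periodic crystal it follows (Bloch's theorem) that

$$\gamma_I(\mathbf{k}) = e^{i\mathbf{k}\mathbf{R}_I} . \tag{5.61}$$

Thus, there is an infinite number of phases: $e^{i\mathbf{k}\mathbf{R}_I} = +1 \ldots -1$. Here, the value $+1$ refers to $\mathbf{k} = 0$ and the value $-1$ to the edge of the Brillouin zone: $\mathbf{k} = \frac{1}{2}\mathbf{G}$.

For normalization, we choose the condition

$$\langle \varphi_\alpha(\mathbf{r}) | \varphi_\beta(\mathbf{r}) \rangle = \delta_{\alpha,\beta} , \tag{5.62}$$

and

$$A = M = \text{number of atoms} . \tag{5.63}$$

This yields that

$$\langle \chi_\alpha(\mathbf{k}, \mathbf{r}) | \chi_\beta(\mathbf{k}, \mathbf{r}) \rangle \longrightarrow \delta_{\alpha,\beta}$$

when the lattice constant goes to $\infty$.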

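The eigenfunctions of the single-particle Hamiltonian are then written as

$$\varphi_n(\mathbf{k}, \mathbf{r}) = \sum_\beta c_{n\beta}(\mathbf{k})\, \chi_\beta(\mathbf{k}, \mathbf{r}) , \tag{5.64}$$

and the matrix form of the Kohn-Sham equation is

$$\sum_\beta \left[ h_{\alpha\beta} - S_{\alpha\beta}\, \epsilon_n(\mathbf{k}) \right] c_{n\beta}(\mathbf{k}) = 0 , \tag{5.65}$$

with

$$h_{\alpha\beta} = \langle \chi_\alpha(\mathbf{k}, \mathbf{r}) | h | \chi_\beta(\mathbf{k}, \mathbf{r}) \rangle \tag{5.66}$$

$$\phantom{h_{\alpha\beta}} = \frac{1}{M} \sum_{\mathbf{R}_I, \mathbf{R}_J} e^{i\mathbf{k}(\mathbf{R}_I - \mathbf{R}_J)} \langle \varphi_\alpha(\mathbf{r}-\mathbf{R}_I) | h | \varphi_\beta(\mathbf{r}-\mathbf{R}_J) \rangle = \sum_{\mathbf{R}_I}^{M} e^{i\mathbf{k}\mathbf{R}_I} \underbrace{\langle \varphi_\alpha(\mathbf{r}-\mathbf{R}_I) | h | \varphi_\beta(\mathbf{r}) \rangle}_{\epsilon_{\alpha\beta}(\mathbf{R}_I)} , \tag{5.67}$$

$$S_{\alpha\beta} = \langle \chi_\alpha(\mathbf{k}, \mathbf{r}) | \chi_\beta(\mathbf{k}, \mathbf{r}) \rangle \tag{5.68}$$

$$\phantom{S_{\alpha\beta}} = \sum_{\mathbf{R}_I}^{M} e^{i\mathbf{k}\mathbf{R}_I} \underbrace{\langle \varphi_\alpha(\mathbf{r}-\mathbf{R}_I) | \varphi_\beta(\mathbf{r}) \rangle}_{s_{\alpha\beta}(\mathbf{R}_I)} . \tag{5.69}$$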

The advantage of the LCAO method, i.e., of using atomic or atom-like orbitals for $\varphi_\alpha(\mathbf{r})$, is that these are very localized. The quantities $\epsilon_{\alpha\beta}(\mathbf{R}_I)$ and $s_{\alpha\beta}(\mathbf{R}_I)$ thus differ from zero only for very few $\mathbf{R}_I$ (often only for $|\mathbf{R}_I| \le$ 2 or 3 interatomic distances). This results in a high numerical efficiency and good scaling with system size. Another advantage of the LCAO method is that the number of basis functions, and consequently the dimension of the matrices in Eq. (5.65), can be kept very small. For a solid of hydrogen atoms or of alkali atoms (Na, Cs) it is sufficient, as a first approximation (or for a qualitative discussion), to use only one basis function per atom:

H : 1s
Li : 2s
...
Cs : 6s
(5.70)

For C, Si, Ge, GaAs one has to use at least four orbitals per atom:

C : 2s, 2px, 2py, 2pz
Si : 3s, 3px, 3py, 3pz
...
(5.71)

The “minimum basis sets” in the examples (5.70) and (5.71) allow for a qualitative description. For a more accurate quantitative description further orbitals have to be included.
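For the simplest case, one s-like orbital per atom on a one-dimensional chain with only nearest-neighbor terms kept, Eqs. (5.65)-(5.69) reduce to a single number per $k$: $\epsilon(k) = h(k)/S(k)$. The sketch below assumes exactly this model; the on-site energy, hopping, and overlap values are hypothetical.

```python
import math


def lcao_band_1d(k, e_onsite, hopping, overlap, a=1.0):
    """One-band LCAO dispersion for a 1D chain, nearest neighbors only.

    h(k) and S(k) are the lattice sums of Eqs. (5.67) and (5.69),
    truncated to R_I = 0 and R_I = +a, -a.
    """
    h_k = e_onsite + 2.0 * hopping * math.cos(k * a)
    s_k = 1.0 + 2.0 * overlap * math.cos(k * a)
    return h_k / s_k   # 1x1 version of Eq. (5.65)


params = dict(e_onsite=-0.5, hopping=-0.1, overlap=0.05)   # hypothetical, hartree
e_gamma = lcao_band_1d(0.0, **params)       # all phases +1: bonding
e_edge = lcao_band_1d(math.pi, **params)    # alternating phases: antibonding
print(f"eps(k=0)    = {e_gamma:.4f}")
print(f"eps(k=pi/a) = {e_edge:.4f}")
print(f"band width  = {e_edge - e_gamma:.4f}")
```

Truncating the lattice sums after the first neighbors is what makes the matrices sparse in practice; with more orbitals per atom, `h_k` and `s_k` become small matrices and Eq. (5.65) becomes a generalized eigenvalue problem.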

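Figure 5.13: Schematic presentation of the energy levels of a solid as function of the interatomic distance. For large distances one obtains the energy levels of the free atoms. [Sketch: single-particle energy $\epsilon$ versus the distance between the atoms (lattice constant); at large distance the levels $\epsilon_{2s}$ ($M$-fold degenerate) and $\epsilon_{2p}$ ($3M$-fold degenerate) are sharp; the typical range for equilibrium geometries is marked.]

Figure 5.14: The two-dimensional square atomic lattice. [Two panels: the direct lattice with the vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and the Brillouin zone with the axes $k_1$, $k_2$ and the points Γ, X, M.]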

What happens when the atoms get closer to each other? Then the electronic energy levels of the atoms split. This is sketched in Fig. 5.13. For smaller distances the sharp energy levels become energy bands. The total number of all states is constant, i.e., independent of the distance of the atoms.

In the spirit of such LCAO basis sets and considering a “minimum basis” we now want to construct the band structure of a simple material.

5.3.1 Band Structure and Analysis of the Contributions to Chemical Bonding

We discuss a two-dimensional example. Although in part 5.1 we have used the hexagonal lattice, I now want to talk about the square lattice. Because of the orthogonality of the lattice vectors the discussion is somewhat simpler, cf. Fig. 5.14.

In a qualitative or semi-quantitative description of an s-band we will now just use one s-orbital per atom. An estimate of the relative energies at high-symmetry points in the Brillouin zone is compiled in Table 5.1. Knowing about the continuity of the functions $\epsilon_n(\mathbf{k})$ we can now draw the qualitative band structure (Fig. 5.15).

k-point | sign of $\varphi_s(\mathbf{k}, \mathbf{r})$ at the central atom and its four nearest neighbors | conclusion about the energy
Γ, k = (0, 0) | center +; all four neighbors + | fully bonding ⇒ minimum of the energy
X, k = (π/a, 0) | center +; neighbors along x −; neighbors along y + | half/half ⇒ mid value of the energy
M, k = (π/a, π/a) | center +; all four neighbors − | fully antibonding ⇒ maximum of the energy

Table 5.1: Schematic picture of the Bloch states of an s-band of a square lattice. Compare with Eq. (5.60) for the basis function, with $\varphi_\alpha$ = s-orbital. The Bloch state $\varphi_s(\mathbf{k}, \mathbf{r})$ is shown at one atom and its four nearest neighbors.
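The qualitative ordering of the s-band energies at Γ, X and M can be reproduced with the simplest possible model: one s-orbital per atom, nearest-neighbor hopping only, and the overlap between neighbors neglected. These are assumptions of the sketch, and the parameter values are hypothetical; the dispersion then is $\epsilon(\mathbf{k}) = \epsilon_s + 2t\,(\cos k_x a + \cos k_y a)$ with $t < 0$.

```python
import math


def s_band_square(kx, ky, e_s=0.0, hopping=-1.0, a=1.0):
    """Nearest-neighbor s-band of the square lattice (overlap neglected)."""
    return e_s + 2.0 * hopping * (math.cos(kx * a) + math.cos(ky * a))


GAMMA = (0.0, 0.0)
X = (math.pi, 0.0)        # (pi/a, 0) with a = 1
M = (math.pi, math.pi)    # (pi/a, pi/a)

e_gamma, e_x, e_m = (s_band_square(*k) for k in (GAMMA, X, M))
print(f"Gamma: {e_gamma:+.3f}  (fully bonding, minimum)")
print(f"X:     {e_x:+.3f}  (half/half, mid value)")
print(f"M:     {e_m:+.3f}  (fully antibonding, maximum)")
```

Each cosine is just the phase factor $e^{i\mathbf{k}\mathbf{R}_I}$ summed over a pair of opposite neighbors, so the three numbers are a direct count of bonding versus antibonding neighbor pairs.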

For the band structure of p-states we discuss $p_z$ (oriented perpendicular to the plane of the lattice) and $p_x$ or $p_y$ separately, because for a two-dimensional system the two types of functions are independent for symmetry reasons: $p_z$ is antisymmetric with respect to the plane of the lattice, $p_x$ and $p_y$ are symmetric with respect to the plane of the lattice. The illustrations of the wave functions of $p_z$ look qualitatively the same as for s-states (at least when looking from the top onto the lattice plane). The dispersion of $p_z$-orbitals is thus qualitatively the same as that of s-orbitals (cf. Fig. 5.15).

Figure 5.15: Band structure of s-orbitals of the square lattice (qualitative presentation). The dots mark the estimates obtained from Table 5.1. [Axes: single-particle energy $\epsilon$ along the path M-Γ-X-M.]

In Fig. 5.16 we combine the results for the $p_x$-, $p_y$-states (cf. Table 5.2), the $p_z$-states, and the lower-lying s-states into one band structure. We recognize that this band structure is similar to the result of nearly free electrons. The s- or p-band is similar to a parabola of free electrons, reduced to the 1st Brillouin zone and split at the points satisfying the Bragg condition. This can also be shown mathematically, because a plane wave can be expanded in spherical harmonics:

$$e^{i\mathbf{k}\mathbf{r}} = 4\pi \sum_{l=0}^{\infty} \sum_{m=-l}^{l} i^l \sqrt{\frac{\pi}{2kr}}\, J_{l+1/2}(kr)\, Y^*_{lm}(\Omega_{\mathbf{k}})\, Y_{lm}(\Omega_{\mathbf{r}}) , \tag{5.72}$$

with $\Omega_{\mathbf{k}}$ and $\Omega_{\mathbf{r}}$ labelling the solid angles and $J_{l+1/2}$ representing the Bessel function of index $l + \frac{1}{2}$.

Figure 5.16: Band structure of s- and p-states in the square lattice (qualitative representation; the relative position of the $p_z$-band with respect to the $p_x$-, $p_y$-band is chosen arbitrarily). [Axes: single-particle energy $\epsilon$ along the path M-Γ-X-M; curves labelled s-band, $p_x$-band, $p_y$-band, $p_z$-band.]

The above discussion and explanation of the band structure clarifies the meaning of the quantum number $\mathbf{k}$. It describes the phase difference of orbitals that are centered at different atoms, and $\lambda = \frac{2\pi}{|\mathbf{k}|}$ is the wavelength of the wave function. In a diatomic molecule there are only two phases: bonding and antibonding (or $+1$ and $-1$). In a crystalline solid there is an infinite number of phases covering the full range from “bonding” to “antibonding” with respect to the nearest-neighbor interaction. The more bonding states are occupied (compared to antibonding states), the more strongly bound (the more stable) is the material (e.g. Fe has a higher cohesive energy than Cu).

5.4 The Density of States, N(ε)

The density of states is defined as the number of states per unit volume and per unit energy at the energy $\epsilon$:

$$N(\epsilon) = \sum_n \frac{2}{(2\pi)^3} \int_{\rm 1.BZ} \delta(\epsilon - \epsilon_n(\mathbf{k}))\, d^3k . \tag{5.73}$$

We want to write $N(\epsilon)$ differently to point out its characteristic structure, which enables a relatively direct comparison between band structure and density of states.

k-point | $\varphi_{p_x}(\mathbf{k}, \mathbf{r})$ | $\varphi_{p_y}(\mathbf{k}, \mathbf{r})$
Γ, k = (0, 0) | strongly antibonding | strongly antibonding
X, k = (π/a, 0) | fully bonding | fully antibonding
M, k = (π/a, π/a) | strongly bonding | strongly bonding

Table 5.2: Schematic picture of the Bloch states of the $p_x$- and $p_y$-bands of a square lattice. Compare with Eq. (5.60) for the basis functions, with $\varphi_\alpha$ = $p_x$- or $p_y$-orbital. The Bloch states $\varphi_{p_x}(\mathbf{k}, \mathbf{r})$ and $\varphi_{p_y}(\mathbf{k}, \mathbf{r})$ are shown at one atom and its four nearest neighbors. [The sketches of the orbital lobes and their signs are not reproduced here; only the resulting bonding character is listed.]

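For this purpose we rewrite the number of states per unit volume in the energy interval $[\epsilon, \epsilon + d\epsilon]$ as follows:

$$N(\epsilon)\, d\epsilon = \sum_n N_n(\epsilon)\, d\epsilon , \tag{5.74}$$

where $n$ is the band index and $N_n(\epsilon)$ the density of states of the band $n$. Then we have:

$$N_n(\epsilon)\, d\epsilon = \frac{2}{(2\pi)^3} \int_{\rm 1.BZ} d^3k \cdot \begin{cases} 1 & \text{if } \epsilon \le \epsilon_n(\mathbf{k}) \le \epsilon + d\epsilon \\ 0 & \text{otherwise} \end{cases} . \tag{5.75}$$

This is an integral over the volume in $\mathbf{k}$-space that is enclosed by the surfaces $\epsilon_n(\mathbf{k}) = \epsilon$ and $\epsilon_n(\mathbf{k}) = \epsilon + d\epsilon$. Let $\delta k(\mathbf{k})$ be the distance between these two surfaces, measured perpendicular to the first surface, and $df$ the surface element. Then we have:

$$N_n(\epsilon)\, d\epsilon = \frac{2}{(2\pi)^3} \int_{\epsilon_n(\mathbf{k}) - \epsilon = 0} \delta k(\mathbf{k})\, df , \tag{5.76}$$

and we obtain

$$N_n(\epsilon)\, d\epsilon = \frac{2}{(2\pi)^3} \int_{\epsilon_n(\mathbf{k}) - \epsilon = 0} \frac{d\epsilon}{|\nabla_{\mathbf{k}} \epsilon_n(\mathbf{k})|}\, df , \tag{5.77}$$

because

$$\nabla_{\mathbf{k}} \epsilon_n(\mathbf{k}) = \frac{d\epsilon}{\delta k(\mathbf{k})} \cdot \mathbf{n}(\mathbf{k}) , \tag{5.78}$$

where $\mathbf{n}$ is the unit vector perpendicular to the surface $\epsilon_n(\mathbf{k}) = \epsilon$.⁷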

Characteristic structures occur at energies where the gradient in the denominator of Eq. (5.77) becomes zero. These positions are called van Hove singularities (vHS). In three-dimensional systems, these divergences in the integrand introduce non-differentiable kinks in the density of states, as depicted in Fig. 5.17. In one and two dimensions, the density of states actually diverges at vHS.
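The divergence in one dimension is easy to see numerically. The sketch below estimates the density of states of a hypothetical one-dimensional band $\epsilon(k) = -2t\cos k$ (lattice constant 1, $t = 1$) by sampling $k$ uniformly in the first Brillouin zone and histogramming the energies. It is normalized per unit cell with the spin factor 2, so it integrates to 2 states per cell; this normalization is a choice made for the illustration.

```python
import math


def dos_histogram(band, n_k=200000, n_bins=40, e_min=-2.0, e_max=2.0):
    """Histogram estimate of the density of states per unit cell (spin factor 2)."""
    width = (e_max - e_min) / n_bins
    counts = [0] * n_bins
    for i in range(n_k):
        k = -math.pi + (i + 0.5) * 2.0 * math.pi / n_k   # uniform grid in the 1st BZ
        index = int((band(k) - e_min) / width)
        counts[min(max(index, 0), n_bins - 1)] += 1      # clamp to the valid bins
    return [2.0 * c / (n_k * width) for c in counts]


def cosine_band(k, hopping=1.0):
    """Hypothetical 1D band eps(k) = -2 t cos(k)."""
    return -2.0 * hopping * math.cos(k)


dos = dos_histogram(cosine_band)
print(f"N at band bottom: {dos[0]:.3f}")
print(f"N at band center: {dos[20]:.3f}")
print(f"N at band top:    {dos[-1]:.3f}")
```

The bins at the band edges, where $d\epsilon/dk = 0$, collect far more states than the bins at the band center, where the band is steepest; refining the bins makes the edge values grow without bound, as expected for a one-dimensional van Hove singularity.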

⁷ Alternatively, one obtains the result in Eq. (5.77) directly from Eq. (5.73) by using the delta-function identity

$$\delta(f(x)) = \sum_{x_i:\, f(x_i) = 0} \frac{\delta(x - x_i)}{|f'(x_i)|}$$

with the identifications $f(x) \equiv \epsilon - \epsilon_n(\mathbf{k})$ and $|f'(x)| \equiv |\nabla_{\mathbf{k}} \epsilon_n(\mathbf{k})|$.

Figure 5.17: The density of states $N(\epsilon)$ of a band as a function of the energy $\epsilon$. Singularities in the density of states can be identified (arrows).

5.5 Other Methods for Solving the Kohn-Sham Equations of Periodic Crystals

To be completed later. For now see: http://wwwitp.physik.tu-berlin.de/ekreide/ss08/TFP/2008-05-27/pdf/lect_col.pdf

5.5.1 The Pseudopotential Method

To be completed later. For now see: http://wwwitp.physik.tu-berlin.de/ekreide/ss08/TFP/2008-05-27/pdf/lect_col.pdf

5.5.2 APW and LAPW

5.5.3 KKR, LMTO, and ASW

5.6 Many-Body Perturbation Theory (beyond DFT)





6 Cohesion (Bonding) in Solids

6.1 Introduction

Solids often adopt well-ordered crystalline structures with well-defined lattice constants. Having discussed the many-body Hamiltonian of a solid and its calculation in some detail in previous chapters, it is now natural to ask why a given element chooses a particular crystal structure, and what kind of properties are connected with it. In particular, what types and strengths of forces, i.e., what bonds, hold the solid together? This topic is called “cohesion” and it has very much to do with the nature of chemical bonding in solids.

Just like in all previous chapters, we will restrict ourselves to the situation $T \approx 0$ K, i.e., sufficiently low temperatures. This is because at higher temperatures the properties of matter do not follow from the total energy alone, but are also governed by other free-energy contributions. Differing vibrational properties of different crystalline structures can induce structural phase transitions to other configurations upon heating. In fact, most elements switch their crystal structure several times before they melt. Another issue is the configurational entropy, e.g. due to defects such as vacancies, interstitials, and impurities. At low enough temperatures, however, the cohesive properties follow predominantly from the chemical binding in a perfect lattice, i.e., from the electrostatic interaction of the electron density with the ions and the ion-ion interaction. And this is what we will study in this chapter.

The central property of low-temperature cohesion is the cohesive energy $E^{\rm coh}$, which is the energy needed to rip a sample apart into widely separated atoms. If $\{\mathbf{R}_I\}$ denotes a set of structural parameters characteristic for a crystal lattice, and $\{\mathbf{R}_0\}$ represents their value at the equilibrium crystal structure (neglecting zero-point vibrations), we thus have

$$E^{\rm coh} = -\left( \frac{E(\{\mathbf{R}_0\})}{M} - \frac{E(\{\mathbf{R}_I\} \to \infty)}{M} \right) . \tag{6.1}$$

Here $E$ is the total energy of the solid (it has a negative value), and $M$ the number of atoms in the crystal. Note that with this definition, the cohesive energy is a positive number. More generally, one could also say that the cohesive energy is the energy required to separate a solid into its elementary “building blocks”. It is usually understood that these “building blocks” are the neutral atoms, but sometimes it can be more convenient to use molecules (e.g. N2 for solid nitrogen) or ions (e.g. Na+ and Cl− for NaCl). By an appropriate correction for the molecular dissociation energy, or the ionization energy of the cation (energy to remove an electron) and the electron affinity of the anion (energy to add an electron), such numbers can always be translated into the cohesive energy with respect to neutral atoms, which is what we will use primarily. Cohesive energies of solids range from little more than a few meV per atom to just under 10 eV per atom, as can be seen from Fig. 6.1.

Figure 6.1: Experimental cohesive energy over the periodic table of elements (96 kJ/mole = 1 eV). [From Webelements].

We note in passing that, by itself, the cohesive energy is not of overriding importance for the practical strength of a material. Resistance to scratches and fractures are critical quantities as well, and these are physically distinct from the cohesive energy. The question that knowledge of the cohesive energy makes it possible to answer is which crystal structure the solid will adopt, namely the one with the highest $E^{\rm coh}$ (which is nothing else but the system achieving its lowest total energy). With the electronic structure methods discussed in the preceding chapters, the straightforward approach to cohesion would therefore simply be to compute the total energy of the crystal as a function of $\{\mathbf{R}_I\}$ for a given lattice structure. The energy lowering obtained at the minimum $\{\mathbf{R}_0\}$ then gives the cohesive energy achievable in this particular lattice structure (cf. also Fig. 2 of chapter 1). Repeating this for all sorts of lattices would finally enable us to identify the one with the highest $E^{\rm coh}$, and this will be the equilibrium crystal structure at low temperatures.

Figure 6.2: DFT-LDA total energy versus volume for W in the fcc, hcp and bcc structure. The bcc structure is the ground state with the largest cohesive energy. [from C.T. Chan et al., Phys. Rev. B 33, 7941 (1986)].

Figure 6.2 shows how this works in practice. Here, the total energy of tungsten has been computed with density-functional theory (DFT-LDA) as a function of the unit-cell volume $V$ (i.e. using this one variable to represent the set $\{R_I\}$ for these high-symmetry structures). The points are the actually calculated values for the fcc, the hcp, and the bcc crystal structure. In order to obtain reliable minimum values from these discrete sets of points (and also to reduce the inherent numerical noise), one usually fits the calculated points with so-called equation-of-state functions, which are analytical functions derived from general thermodynamic considerations about the internal energy in the vicinity of the minimum. A popular form is due to Murnaghan (F.D. Murnaghan, Proc. Natl. Acad. Sci. U.S.A. 30, 244 (1944)),

$$ \frac{E(V)}{M} - \frac{E(V_0)}{M} = \frac{B_0 V}{B_0'(B_0'-1)} \left[ B_0'\left(1-\frac{V_0}{V}\right) + \left(\frac{V_0}{V}\right)^{B_0'} - 1 \right], \tag{6.2} $$

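which involves the following quantities:

$V_0$: Volume at the energy minimum

$B_0$: Bulk modulus at $V_0$, as already defined in chapter 1, $B_0 = V \left.\frac{\partial^2 E(V)}{\partial V^2}\right|_{V=V_0}$

$B_0'$: Pressure derivative of the bulk modulus, $B_0' = \left.\frac{\partial B}{\partial p}\right|_{V=V_0}$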

Figure 6.3: Radial atomic wave functions of two neon atoms at the equilibrium interatomic distance. Since the distance is relatively large, there is hardly any overlap of the wave functions [from Ashcroft and Mermin].

Fitting $V_0$, $B_0$ and $B_0'$ to the DFT data, the solid curves in Fig. 6.2 are obtained. We see that over quite a range of volumes the fit is essentially perfect. In this particular case, the bcc structure is correctly obtained as the ground-state crystal structure of W, with an equilibrium lattice constant $a_0 = (2V_0)^{1/3} = 3.13$ Å and a bulk modulus $B_0 = 3.33$ Mbar, which compare well to the experimental values of 3.16 Å and 3.23 Mbar, respectively. Also, the agreement of the derived $E_{\rm coh} = 9.79$ eV (exp: 8.90 eV) with experiment is reasonable, though not perfect. In fact, the significant overbinding obtained (too high cohesive energy and slightly too short bond length) is typical for the employed LDA functional, and is partially corrected in present-day GGA functionals. Note also that less symmetric structures with a multi-atom basis often require a more extended set of $\{R_I\}$ than just one parameter. One then has to compute the total energy as a higher-dimensional function, e.g. of $a$ and $c$ (in-plane and out-of-plane lattice constant) for the hcp structure.
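As a small illustration of the Murnaghan form of eq. (6.2), the following sketch evaluates the curve and checks numerically that its curvature at $V_0$ reproduces the bulk modulus $B_0$. The parameters are only loosely based on the bcc W numbers quoted in the text; the value of $B_0'$, the energy zero, and all function names are assumptions made for this example.

```python
EV_PER_A3_IN_GPA = 160.21766  # 1 eV/Å^3 expressed in GPa

def murnaghan(V, E0, V0, B0, B0p):
    """Energy per atom (eV) at volume per atom V (Å^3), following eq. (6.2)."""
    x = V0 / V
    return E0 + B0 * V / (B0p * (B0p - 1.0)) * (B0p * (1.0 - x) + x**B0p - 1.0)

# Illustrative parameters (assumptions, loosely based on the bcc W values).
V0 = 3.13**3 / 2.0             # Å^3 per atom, from a0 = (2 V0)^(1/3) = 3.13 Å
B0 = 333.0 / EV_PER_A3_IN_GPA  # 3.33 Mbar = 333 GPa, converted to eV/Å^3
B0P = 4.0                      # assumed; not given in the text
E0 = -9.79                     # eV, assuming the free atom defines the energy zero

def numerical_bulk_modulus(h=1e-3):
    """B = V d^2E/dV^2 at V0, from a central finite difference."""
    d2 = (murnaghan(V0 + h, E0, V0, B0, B0P)
          - 2.0 * murnaghan(V0, E0, V0, B0, B0P)
          + murnaghan(V0 - h, E0, V0, B0, B0P)) / h**2
    return V0 * d2

if __name__ == "__main__":
    for scale in (0.90, 0.95, 1.00, 1.05, 1.10):
        V = scale * V0
        print(f"V = {V:6.2f} Å^3   E/M = {murnaghan(V, E0, V0, B0, B0P):8.4f} eV")
    print("B0 from curvature [GPa]:", numerical_bulk_modulus() * EV_PER_A3_IN_GPA)
```

In practice the three parameters $V_0$, $B_0$ and $B_0'$ (plus the energy offset) would be obtained from a least-squares fit to the computed points; the sketch only evaluates the functional form.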

A similarly precise or even more precise description of the cohesive properties as obtained for tungsten (Fig. 6.2) can be achieved by present-day DFT calculations for most elemental and compound solids. With this, we could in principle already close the chapter on cohesion. The level of accuracy we have achieved in describing the electronic interactions in solids seems enough to also fully explain the ensuing cohesive properties. Yet, although it is quite gratifying to have reached such a high degree of quantitative modeling with nowadays routinely employed electronic structure theories, this would still leave us somewhat unsatisfied, because what we have not yet gained is an understanding of why the bcc structure is actually the most favored one for W, and why the cohesive energy has roughly the value it has. Even more important, we would also like to understand why the cohesive energy and equilibrium crystal structure exhibit certain trends over the periodic system of elements, as exemplified in Fig. 6.1.

Such an understanding has typically been developed by discussing five idealized types of bonding: i) van der Waals; ii) ionic; iii) covalent; iv) metallic; and v) hydrogen bonding. Almost no real solid can be classified 100% into any one of these five categories. Nevertheless, it has proven useful to make this division to gain a qualitative understanding of the largely varying cohesive properties of solids. This is how we shall proceed: we introduce each of the five types of bond that hold solids together, focusing in particular on common examples from each category and, where possible, providing simple intuitive models by which each type of bond can be understood.

6.2 Van der Waals Bonding

Before we discuss this type of bonding we shall point out that physicists and chemists have, for the most part, two different definitions for what a van der Waals bond actually is. The International Union of Pure and Applied Chemistry (IUPAC), the authority on nomenclature, definitions, etc. in chemistry, defines a van der Waals bond as 'the attractive or repulsive forces between molecular entities (or between groups within the same molecular entity) other than those due to bond formation or to the electrostatic interaction of ions or of ionic groups with one another or with neutral molecules. This term includes: dipole-dipole, dipole-induced dipole and London (instantaneous induced dipole-induced dipole) forces.' This definition persists largely for historical reasons: it was originally used to explain the deviation of gases from ideal-gas behavior. It is not the standard definition of a van der Waals bond in physics. Specifically, in physics only the third of the three types of interaction listed, namely the induced dipole-induced dipole dispersive forces, constitutes van der Waals bonding. In this lecture we shall, of course, use the physics definition, i.e. that the dispersive forces (the instantaneous induced dipole-induced dipole forces) constitute van der Waals bonding. We are careful to stress this distinction because we will see that DFT does not describe the dispersive forces of van der Waals bonding correctly. It is, however, capable of treating the other types of interaction included in the chemical definition of van der Waals. So when it is said that DFT with state-of-the-art exchange-correlation functionals does not describe van der Waals bonding, it is only the long-range tail of the dispersion forces that is not treated correctly.

The paradigm elements that are held together by van der Waals bonds are the noble gas atoms Ne, Ar, Kr, Xe (leaving out He, which exhibits special properties due to its extremely small mass and the ensuing strong quantum-mechanical effects/zero-point vibrations). These are the elements that have filled valence shells. The two overriding features of van der Waals forces are that they are: i) non-directional; and ii) relatively weak (compared to the other types of bonding that we shall discuss).

Figure 6.3 illustrates the conceptual idea of van der Waals bonding using the example of two neon atoms. Noble gas atoms have filled shells, and there is a rather large energy gap to the lowest lying unoccupied states in the next shell. As soon as the wave functions of the two neon atoms start to overlap, electrons would need to go to these much higher states, since there are no free states left in the shell and the Pauli principle forbids two electrons in the same state (technically the wave functions need to orthogonalize, yielding new solutions with high energy). This occupation of high-lying states costs a lot of energy, the total energy goes up, or in other words we have a strongly repulsive interaction. Obviously, this Pauli repulsion will always occur when filled shells start to overlap. Ultimately, this is the mechanism responsible for the steep increase of the total energy in all bonding types at very small distances (as soon as the inner shells of the atoms start to overlap). The big difference with the rare gas atoms is that this happens already at relatively large distances, when the valence shells start to overlap, thereby preventing a closer approach of the two atoms. That there is an attractive interaction at all in this case is only due to small quantum mechanical fluctuations in the electron density of either of the two atoms. These give rise to momentarily existing dipoles, and although they average out over time, instantaneous electric fields are produced by them at any moment in time, inducing corresponding dipoles on the other atom. The interaction then results from the attraction between these two fluctuating dipoles.

Qualitatively we would therefore expect a variation of the total energy of the two atoms with distance $R$ as shown in Fig. 6.4. At large distances there is a weak attraction and at small distances there is a strong repulsion, giving rise to a weak bonding minimum in between. For the attractive part, we can even derive the rough functional form based on


Figure 6.4: Lennard-Jones potential, cf. eq. (6.4), for the pair interaction between two noble gas atoms at distance $R$. [Plot of $E_2(r)/4\varepsilon$ versus $r/\sigma$.]

| Element | $T_b$ (°C) | $\alpha$ ($10^{-24}$ cm$^3$) |
|---|---|---|
| He | −268.9 | 0.21 |
| Ne | −246.1 | 0.40 |
| Ar | −185.9 | 1.64 |
| Kr | −153.2 | 2.48 |
| Xe | −108.0 | 4.04 |
| Rn | −61.7 | 5.30 |

Table 6.1: Compilation of the boiling temperatures and (static average) polarizabilities for the noble-gas elements. Data taken from the CRC Handbook of Chemistry and Physics (78th edition).

the above sketched understanding of interacting dipoles: The electric field connected to a dipole of dipole moment $P_1$ is $E \propto P_1/R^3$. This field induces the dipole moment $P_2 = \alpha E \simeq \alpha P_1/R^3$ on the other atom, where $\alpha$ is the polarizability of the atom. The two dipoles $P_1$ and $P_2$ then have an interaction energy given by

$$ \frac{P_1 P_2}{R^3} \sim \frac{\alpha P_1^2}{R^6}, \tag{6.3} $$

i.e. we would roughly expect the attractive part to scale as $\sim -A R^{-6}$, where $A$ is a proportionality constant, and the negative sign indicates attraction.

It is also clear from eq. (6.3) that the van der Waals force between two species depends on the polarizability (the susceptibility of an atom or molecule to the formation of a dipole upon exposure to an electric field). Indeed this dependence nicely explains several trends, the most famous of which is the variation in the boiling points of the noble gases. As one moves down the noble-gas series in the periodic table the polarizability increases (because the size of the atoms increases), and so too does the boiling point of the element. See Table 6.1, for example, in which the boiling points and polarizabilities of the noble gases are listed.


Figure 6.5: Total-energy curve for the Ne2 molecule. The solid line is the result from an “exact” highest-order quantum chemistry calculation, exhibiting the correct shallow binding minimum. DFT-LDA (circles), on the other hand, significantly overestimates this binding, whereas the GGA (stars) gives a purely repulsive curve. The admixture of exact exchange to the GGA (crosses) gives results that are not too far off anymore [from J.M. Perez-Jorda and A.D. Becke, Chem. Phys. Lett. 233, 134 (1995)].

Considering now the repulsive part due to the overlap of wave functions, an appealing choice for a functional form would be an exponentially increasing term (since atomic wave functions have exponentially decaying tails). However, historically a repulsive power-law term $\sim B R^{-12}$ is used instead, leading in total to the so-called Lennard-Jones 6-12 potential shown in Fig. 6.4. Using the parameters $\sigma = (B/A)^{1/6}$ and $\varepsilon = A^2/4B$, its functional form is typically written as

$$ E_2^{\rm LJ}(R) = 4\varepsilon\left[\left(\frac{\sigma}{R}\right)^{12} - \left(\frac{\sigma}{R}\right)^{6}\right]. \tag{6.4} $$

The precise form of the repulsive term is in fact not even that important. Any term with an inverse power higher than 6 would have done to yield a steeply rising energy at small distances. And as a note aside, there are in fact other frequently employed potential forms like the Born-Mayer potential that use an exponential for the hard-core repulsion.

With the Lennard-Jones potential, the complete interaction between two noble gas atoms is described by just two parameters ($\sigma$ and $\varepsilon$). These can be obtained by fitting this curve either to experimental data from low-density gases (second virial coefficient) or to computed total-energy curves from highest-order quantum-chemistry approaches. Both yield virtually identical results, which can be almost perfectly fitted by the functional form of the Lennard-Jones curve. For Ne2 one obtains e.g. $\sigma = 2.74$ Å and $\varepsilon = 3.1$ meV [cf. e.g. N. Bernardes, Phys. Rev. 112, 1534 (1958)]. As we can see, the parameters $\sigma$ and $\varepsilon$ conveniently provide a feel for the location of the minimum and for its depth on the Lennard-Jones potential energy curve: the minimum lies at $R = 2^{1/6}\sigma \approx 1.12\,\sigma$ and has the depth $\varepsilon$.
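To make the role of the two parameters concrete, here is a minimal sketch that evaluates the Lennard-Jones form of eq. (6.4) with the Ne2 parameters quoted in the text. The function and constant names are chosen for this example only.

```python
def lennard_jones(R, sigma, eps):
    """Pair energy of eq. (6.4); R and sigma in Å, result in the units of eps."""
    s6 = (sigma / R) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

SIGMA_NE = 2.74  # Å, value quoted in the text for Ne2
EPS_NE = 3.1     # meV, value quoted in the text for Ne2

# Setting dE/dR = 0 in eq. (6.4) gives R_min = 2^(1/6) * sigma.
R_MIN = 2.0 ** (1.0 / 6.0) * SIGMA_NE

if __name__ == "__main__":
    print(f"zero crossing at R = sigma = {SIGMA_NE:.2f} Å")
    print(f"minimum at R = {R_MIN:.3f} Å, "
          f"E = {lennard_jones(R_MIN, SIGMA_NE, EPS_NE):.3f} meV")
```

The potential crosses zero at $R = \sigma$ and reaches its minimum value $-\varepsilon$ at $R = 2^{1/6}\sigma$, which is why $\sigma$ sets the length scale and $\varepsilon$ the energy scale of the bond.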

As we mentioned already, the current workhorse in electronic structure theory calculations, DFT, has some problems with such very weakly binding van der Waals systems. This is ultimately connected with the fact that the presently employed functionals (like LDA or GGAs) contain only local exchange and correlation effects by construction. The dipole-dipole fluctuations responsible for the attractive part of the Lennard-Jones curve are, on the other hand, non-local in nature. This problem is usually simply summarized by saying that the current implementations of DFT are lacking the description of the long-range


behavior of van der Waals (or dispersion) forces. This issue is nicely illustrated in Fig. 6.5, again for the Ne2 molecule. Why LDA (or, to a better degree, GGAs with exact exchange admixture) nevertheless gives a bonding minimum is still controversially discussed. Figure 6.6 illustrates how the addition of a van der Waals correction term to a commonly used GGA functional (the 'PBE' functional) yields a binding energy between two nucleobases (adenine and thymine) that is in much better agreement with experiment. Such approaches, whilst arguably of some use, essentially amount to little more than an a posteriori correction. At present the appropriate and efficient treatment of van der Waals interactions is a very active field in density-functional theory. We have seen significant developments in recent years and more is expected to come. These developments also include terms that go beyond the pairwise $\sim 1/R^6$ interactions.

Having understood the binding between two noble gas atoms, a straightforward way of describing the bonding in a noble gas solid would be to simply sum up the pairwise bonding contributions between all atoms in the solid. Since all atoms are equivalent, this corresponds to summing up the contributions from all other atoms as experienced by an arbitrary atom, which we take to be located at $R_1 = 0$. For the energy per atom we then obtain

$$ \frac{E}{M} \simeq \frac{1}{2}\sum_{I=2}^{M} E_2^{\rm LJ}(|R_I|), \tag{6.5} $$

where the factor 1/2 corrects for double-counting. We immediately stress that considering only the (purely distance-dependent) pair interaction is a gross simplification. In general, any additional particle will affect the electron density of all atoms in its neighborhood, and thereby also modify the pairwise interactions among the latter. This is exactly why one needs in principle a new self-consistent calculation for each atomic configuration.

Expanding the total energy in terms of interactions between all particles, this effect can, however, be taken into account by including so-called three-body or even higher-order many-body interactions. We will find such terms necessary in the other bonding classes, but for the noble gases the restriction to pairwise interactions is often sufficient, at least as a first step. As long as the particles do not approach each other too closely, the wave functions remain quite undisturbed from their form in the free atom. Adding more particles in the vicinity of other particles therefore does not affect the electron densities of the latter significantly, and correspondingly a pairwise sum is expected to describe the total energy quite well.

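If we insert the explicit form of the Lennard-Jones potential from eq. (6.4), we obtain for the energy per atom in the pair potential approximation

$$ \frac{E}{M} = 2\varepsilon\sum_{I=2}^{M}\left[\left(\frac{\sigma}{|R_I|}\right)^{12} - \left(\frac{\sigma}{|R_I|}\right)^{6}\right]. \tag{6.6} $$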

It is useful to rewrite this expression in a form which allows one to evaluate the sum over all atoms in a general way for a given crystal structure. We therefore introduce the dimensionless quantity $\alpha_I$,

$$ |R_I| = \alpha_I \cdot c, \tag{6.7} $$


Figure 6.6: Binding energy as a function of separation for an adenine-thymine pair. A common GGA functional such as the 'PBE' functional does not capture the van der Waals binding energy between these molecules. This can be corrected a posteriori by adding a van der Waals correction to the DFT total energy, which then yields a binding energy in qualitative agreement with experiment (solid diamond). Binding energy curves for two functional forms of the van der Waals correction (exponential and Fermi function) are shown. For more details see Q. Wu and W. Yang, J. Chem. Phys. 116, 505 (2002) [Courtesy of M. Fuchs].

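where $c$ is the distance to the nearest neighbor in the considered crystal structure. With this, the energy per atom can be written

$$ \frac{E^{\rm crystal}(c)}{M} = 2\varepsilon\left[\left(\frac{\sigma}{c}\right)^{12} A_{12}^{\rm crystal} - \left(\frac{\sigma}{c}\right)^{6} A_{6}^{\rm crystal}\right]. \tag{6.8} $$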

The energy is then just a function of the nearest-neighbor distance, and all the information about the particular crystal structure (i.e. number of neighbors in nearest, next-nearest etc. shells) is contained in the lattice sum

$$ A_n^{\rm crystal} = \sum_{I\in{\rm crystal}} \alpha_I^{-n}. \tag{6.9} $$

As an illustration let us evaluate this lattice sum for the fcc crystal structure. In this lattice type we have 12 nearest neighbors at distance $c$ ($= a_{\rm fcc}/\sqrt{2}$, where $a_{\rm fcc}$ is the lattice constant). Then we have 6 next-nearest neighbors at a distance $\sqrt{2}\,c$, and so forth. Hence, the $\alpha_I$ for the first two shells are 1 and $\sqrt{2}$, cf. eq. (6.7). Obviously, for more distant neighbor shells, the $\alpha_I$ successively become larger, and correspondingly their contribution to the lattice sum smaller (inverse power). For $A_{12}^{\rm fcc}$ we arrive therefore at

$$ A_{12}^{\rm fcc} = 12\cdot(1)^{-12} + 6\cdot(\sqrt{2})^{-12} + \ldots = 12.13. \tag{6.10} $$

The lattice sum is thus already quite well approximated by only the first nearest-neighbor shell. This is a consequence of the high power $n = 12$ considered in $A_{12}$. In fact, $A_{n\to\infty}$ is exactly given by the number of nearest neighbors (in this case the only non-zero term will


| | sc | bcc | fcc |
|---|---|---|---|
| $NN$ | 6 | 8 | 12 |
| $A_6$ | 8.40 | 12.25 | 14.45 |
| $A_{12}$ | 6.20 | 9.11 | 12.13 |
| $A_6^2/(2A_{12})$ | 5.69 | 8.24 | 8.61 |

Table 6.2: Number of nearest neighbors, $NN$, and lattice sums for the three cubic Bravais lattices. The final row is proportional to the cohesive energy of the crystal. Further lattice sum values for other $n$ can be found in Ashcroft/Mermin.

| | Ne | Ar | Kr | Xe |
|---|---|---|---|---|
| $c_0$ (theory) | 2.99 Å | 3.71 Å | 3.98 Å | 4.34 Å |
| $c_0$ (exp.) | 3.13 Å | 3.75 Å | 3.99 Å | 4.33 Å |
| $E_{\rm coh}$ (theory) | 27 meV | 89 meV | 120 meV | 172 meV |
| $E_{\rm coh}$ (exp.) | 20 meV | 80 meV | 110 meV | 170 meV |

Table 6.3: Equilibrium nearest-neighbor distance $c_0$ and cohesive energy $E_{\rm coh}$ of the noble gases, as resulting from experiment and from the pair potential approximation discussed in the text (theory). The larger deviation of $c_0$ for the lightest element Ne is due to zero-point vibrations, which are neglected in the theory [from Ashcroft and Mermin].

be the leading one). For lower $n$, on the other hand, the more distant neighbor shells contribute more significantly, as can be seen from Table 6.2.

From the general form of eq. (6.8) it is straightforward to deduce the equilibrium nearest-neighbor spacing $c_0$ and the cohesive energy for any given crystal structure,

$$ \left.\frac{dE^{\rm crystal}(c)}{dc}\right|_{c_0} = 0 \;\Rightarrow\; c_0 = \sigma\left(\frac{2A_{12}^{\rm crystal}}{A_{6}^{\rm crystal}}\right)^{1/6}, $$

$$ E^{\rm coh,crystal} = -\frac{E^{\rm crystal}(c_0)}{M} = \varepsilon\,\frac{(A_{6}^{\rm crystal})^2}{2A_{12}^{\rm crystal}}. \tag{6.11} $$

The cohesive energy for a particular element (entering only via $\varepsilon$) in a given crystal lattice therefore depends only on the lattice sum combination $(A_{6}^{\rm crystal})^2/(2A_{12}^{\rm crystal})$. The lattice maximizing this quantity will be the most stable one. Inspecting Table 6.2 we find that this is the case for the fcc structure. We have to note, however, that the hcp lattice has highly similar lattice sums (deviating only in the third digit). The crudeness of the approach does not allow us to distinguish between such subtle differences. All we can conclude therefore is that van der Waals bonding will favor close-packed lattice structures, which is ultimately a consequence of the underlying non-directional pair interaction.
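The lattice sums of eq. (6.9) can be checked by brute-force summation. The sketch below sums $\alpha_I^{-n}$ over all lattice sites inside a finite cube around the reference atom for the three cubic Bravais lattices. The integer-grid description of the lattices, the cutoff `n_max` and all names are choices made for this example; the truncation leaves an error of order $10^{-3}$ in $A_6$, while $A_{12}$ is essentially converged.

```python
from itertools import product

# Integer-grid description of each lattice:
# (selection rule for grid points, squared nearest-neighbor distance in grid units)
LATTICES = {
    "sc":  (lambda i, j, k: True, 1),                      # grid unit = a
    "bcc": (lambda i, j, k: i % 2 == j % 2 == k % 2, 3),   # grid unit = a/2
    "fcc": (lambda i, j, k: (i + j + k) % 2 == 0, 2),      # grid unit = a/2
}

def lattice_sum(power, lattice, n_max=20):
    """Direct sum of alpha_I**(-power) over all sites in a cube of half-width n_max."""
    on_lattice, nn_sq = LATTICES[lattice]
    total = 0.0
    grid = range(-n_max, n_max + 1)
    for i, j, k in product(grid, repeat=3):
        d_sq = i * i + j * j + k * k
        if d_sq == 0 or not on_lattice(i, j, k):
            continue  # skip the reference atom and grid points that are not lattice sites
        total += (d_sq / nn_sq) ** (-power / 2)  # alpha_I**(-power), alpha_I = |R_I|/c
    return total

if __name__ == "__main__":
    for name in LATTICES:
        a6, a12 = lattice_sum(6, name), lattice_sum(12, name)
        print(f"{name:3s}  A6 = {a6:6.2f}  A12 = {a12:6.2f}  A6^2/(2 A12) = {a6**2 / (2 * a12):5.2f}")
```

Within the stated truncation error the printed values can be compared with Table 6.2; the slow convergence of $A_6$ relative to $A_{12}$ reflects the larger weight of distant shells for the lower power.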

All noble gases (except He) indeed solidify into an fcc structure. Using the parameters $\sigma$ and $\varepsilon$ obtained by fitting the Lennard-Jones curve to low-density gas phase data or the quantum-chemistry calculations of the diatomic molecules, we obtain the cohesive energies and lattice constants listed in Table 6.3. Considering the simplicity of our pair-potential model, the agreement is quite good (errors are roughly at the 10% level). At


Figure 6.7: Sketch of the energy levels of a noble gas crystal, using the example of neon. In the atomic limit $a \to \infty$, all shells are filled. Even at the equilibrium lattice constant $a_0$ the interaction between the atoms is weak, and the electronic states are only broadened by a small amount. This still leaves a large energy gap between occupied and unoccupied states, and neon is an insulator.

the quite large equilibrium bond lengths obtained, the wave function overlap is indeed minimal (as anticipated). The energy levels in a noble-gas solid will therefore show only a small broadening compared to the atomic limit, as illustrated in Fig. 6.7. Due to the large energy difference between the uppermost occupied and the lowermost unoccupied band, rare gas crystals will behave as insulators. On the basis of a crude pair potential we can thus understand quite a number of fundamental cohesive (and even electronic) properties of the noble-gas solids. This is gratifying in this special case, but we will see that it is more an exception than a rule.
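The numbers for neon can be reproduced directly from eqs. (6.8) and (6.11). The sketch below uses the Ne2 Lennard-Jones parameters quoted earlier and the fcc lattice sums of Table 6.2; the function names are chosen for this example only.

```python
SIGMA_NE = 2.74   # Å, Lennard-Jones parameter for Ne quoted in the text
EPS_NE = 3.1      # meV, Lennard-Jones parameter for Ne quoted in the text
A6_FCC = 14.45    # fcc lattice sums from Table 6.2
A12_FCC = 12.13

def energy_per_atom(c, sigma, eps, a6, a12):
    """Eq. (6.8): energy per atom as a function of the nearest-neighbor distance c."""
    return 2.0 * eps * ((sigma / c) ** 12 * a12 - (sigma / c) ** 6 * a6)

def equilibrium_spacing(sigma, a6, a12):
    """Eq. (6.11): nearest-neighbor distance that minimizes eq. (6.8)."""
    return sigma * (2.0 * a12 / a6) ** (1.0 / 6.0)

def cohesive_energy(eps, a6, a12):
    """Eq. (6.11): cohesive energy per atom (positive number)."""
    return eps * a6 ** 2 / (2.0 * a12)

if __name__ == "__main__":
    c0 = equilibrium_spacing(SIGMA_NE, A6_FCC, A12_FCC)
    print(f"c0   = {c0:.2f} Å")
    print(f"Ecoh = {cohesive_energy(EPS_NE, A6_FCC, A12_FCC):.0f} meV")
```

The results, about 2.99 Å and 27 meV, are the pair-potential values for Ne; zero-point vibrations are not included in this estimate.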

6.3 Ionic bonding

Atoms and ions with closed shells are particularly stable, i.e. a lot of energy is required to excite an electron from a filled shell into an unoccupied state. The conceptual idea behind ionic bonding is therefore that electrons are exchanged in such a way that the atoms involved reach this stable closed-shell state. This is most easily realized for the so-called ionic crystals formed of alkali halides (i.e. Group I and VII elements of the periodic table). Take NaCl as an example. The conception then is that Na, with the electron configuration $[1s^2 2s^2 2p^6]3s^1$, becomes a Na+ ion, and Cl, $[1s^2 2s^2 2p^6]3s^2 3p^5$, is turned into a Cl− ion, thereby achieving the closed-shell configuration in both cases. Similar ideas would, e.g., also hold for II-VI compounds. With this electron transfer accomplished, we arrive at a situation that is quite similar to the one discussed in the last section: the interaction between two filled-shell ions. As soon as they come too close together (something that is now determined by the ionic radii rather than the atomic radii, though), hard-core repulsion will set in. Contrary to the van der Waals case, the attractive interaction is primarily due to the much stronger electrostatic attraction between the oppositely charged ions. If this electrostatic attraction is stronger than the cost of exchanging the electrons between the atoms to reach the ionic states, the crystal will hold together. Ionic bonding


Figure 6.8: Periodic variation of the first ionization energies (top panel, "Ionization Energy") and electron affinities (bottom panel, "Electron Affinity") of the elements (96 kJ/mole = 1 eV) [From Webelements].

is therefore most effective if the cost to create the ions is low, i.e. when one atom type readily gives away electrons (low ionization potential) and the other one readily takes them (high electron affinity). If one looks at Fig. 6.8 one sees that elements to the left of the periodic table, in particular in groups I and II, tend to have the lowest ionization energies, and elements to the right of the periodic table (groups VI and VII) tend to have the highest electron affinities. As a consequence, ionic bonds are most common amongst binary solids containing these elements. Note that H is a clear exception to this trend. We will have more to say about this when discussing hydrogen bonding.

In this most naive perception an ionic crystal is simply a collection of impenetrable charged spheres, glued together by electrostatic interaction. In analogy to the van der Waals case, we thus expect the interaction between two ions of opposite charge to arise out of a repulsive and an attractive part. The repulsive part is due to Pauli repulsion, and since the electrostatic interaction is much stronger than the (also existing) van der Waals forces,


| Structure | $A_{\rm Mad}$ | $NN$ |
|---|---|---|
| Cesium chloride | 1.76 | 8 |
| Sodium chloride | 1.75 | 6 |
| Wurtzite | 1.64 | 4 |
| Zincblende | 1.64 | 4 |

Table 6.4: Madelung constants $A_{\rm Mad}$ and nearest-neighbor coordination $NN$ for the most common ionic crystal structures.

the attractive part will be predominantly given by a Coulomb $1/R$ potential

$$ E_2^{\rm ionic}(R) = E^{\rm rep.} + E^{\rm attr.} = \frac{C}{R^{12}} - \frac{e^2}{4\pi\varepsilon_0 R}, \tag{6.12} $$

where we have simply taken the charge on the ions as $\pm e$, and $\varepsilon_0 = 8.85\cdot 10^{-12}$ As/Vm is the vacuum permittivity. Note that evaluating the constants in the attractive term leads to $E^{\rm attr.} = -14.4$ eV$/R$ [$R$ in Å], i.e. bringing the two ions together at a distance of 3 Å already yields about 4.8 eV electrostatic energy gain. The cost to create Na+ and Cl− (difference of ionization potential and electron affinity) is only ∼ 1.5 eV, leaving still quite a lot of energy gain to form a very stable ionic bond. Just like in the van der Waals case, one has to recognize that the $1/R^{12}$ repulsive potential is only a rough and convenient choice. One can determine the proportionality constant $C$ by fitting either to first-principles calculations or to experimental compressibility data. One then often finds that using smaller inverse powers somewhere in the range 6-10 or an exponential form (Born-Mayer potential) can fit the data even better. For the general discussion on the chemical bonding intended here, such multiparameter fits are, however, not very illuminating, and we will stick for simplicity to the $1/R^{12}$ potential already used in the van der Waals case.

Having obtained the interaction between an ion pair, we may employ the same reasoning as in the last section to determine the cohesive energy of an ionic solid. Again, we do not expect dramatic charge rearrangements in the solid compared to the case of the isolated (closed-shell) ions. A simple sum over the pairwise contributions as in eq. (6.5) should therefore already describe the energy per ion pair quite well. This leads to

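$$ \frac{E}{M} \simeq \sum_{I=2}^{2M} E_2^{\rm ionic}(|R_I|) = \sum_{I=2}^{2M}\left[\frac{C}{R_I^{12}} - \frac{(\pm 1)\,e^2}{4\pi\varepsilon_0 R_I}\right], \tag{6.13} $$

where $M$ now counts ion pairs (the crystal contains $2M$ ions), so that the double-counting factor 1/2 of eq. (6.5) is compensated by the two ions per pair, and $R_I$ is shorthand for $|R_I|$. The sign $+1$ ($-1$) applies when the ion $I$ in the sum carries a charge opposite (equal) to that of the reference ion at $R_1 = 0$. As before, we proceed by separating out the properties exclusively due to the crystal structure through the definition of the dimensionless quantity $\alpha_I$ (cf. eq. (6.7)) and arrive at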

$$ \frac{E^{\rm crystal}(c)}{M} = \frac{C}{c^{12}}\,A_{12}^{\rm crystal} - \frac{e^2}{4\pi\varepsilon_0\, c}\,A_{\rm Mad}^{\rm crystal}. \tag{6.14} $$

Similar to the lattice sums $A_n^{\rm crystal}$ defined in eq. (6.9), the complete information about the neighbor shells of ions with positive or negative charges in the particular crystal structure is now summarized in the so-called Madelung constant

$$ A_{\rm Mad}^{\rm crystal} = \sum_{I\in{\rm crystal}} \frac{\pm 1}{\alpha_I}. \tag{6.15} $$
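The sum in eq. (6.15) converges only conditionally, so a naive truncation is unreliable. As a simple illustration for the NaCl structure, the sketch below uses Evjen's method of neutral cubes, in which ions on the faces, edges and corners of the summation cube are counted with weights 1/2, 1/4 and 1/8. This is a stand-in chosen for brevity, not the Ewald technique of section 1.3; the function name and the cutoff are assumptions of this example.

```python
import math

def madelung_nacl_evjen(n_max=8):
    """Madelung constant of the NaCl structure from a neutral cube of half-width n_max.

    Sites are the integer points (i, j, k) in units of the nearest-neighbor
    distance; the ion at (i, j, k) has the opposite charge to the reference
    ion at the origin when i + j + k is odd.
    """
    total = 0.0
    grid = range(-n_max, n_max + 1)
    for i in grid:
        for j in grid:
            for k in grid:
                if i == 0 and j == 0 and k == 0:
                    continue  # skip the reference ion
                weight = 1.0
                for x in (i, j, k):
                    if abs(x) == n_max:
                        weight *= 0.5  # face 1/2, edge 1/4, corner 1/8
                sign = 1.0 if (i + j + k) % 2 else -1.0  # +1 for opposite charge
                total += weight * sign / math.sqrt(i * i + j * j + k * k)
    return total

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"n_max = {n}:  A_Mad = {madelung_nacl_evjen(n):.4f}")
```

Because every partial cube is charge neutral, the sequence settles quickly near the NaCl value listed as 1.75 in Table 6.4. For general structures the Ewald summation remains the more robust choice.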


Figure 6.9: The two most common ionic crystal lattices: (A) cesium chloride (CsCl) and (B) sodium chloride (NaCl); and (C) the less common zinc blende (ZnS) structure.

Apparently, this Madelung constant is the analog of the lattice sum $A_1^{\rm crystal}$, taking additionally into account that neighboring ions with positive or negative charge contribute differently to the electrostatic interaction in the lattice. The slow decay (i.e. the long range) of the $1/R$ potential makes the actual calculation of Madelung constants trickier than that of the lattice sums. Depending on the way the summation is carried out, any value whatsoever can be obtained (corresponding to finite crystals with differing surface charges). The method of choice to avoid such problems is the Ewald summation technique already discussed in section 1.3. With this, the Madelung constants of any crystal lattice can readily be computed, and Table 6.4 lists a few $A_{\rm Mad}$ for lattice types that will become relevant in the discussion below.

From the structure of eq. (6.14) it is obvious that again the maximum cohesive energy will be obtained by close-packed structures, which maximize both the lattice sum $A_{12}$ and the Madelung constant. As already noted in the last section, this follows simply from the non-directional bonding implied by the interionic pair potential $E_2^{\rm ionic}(R)$. Since the nearest-neighbor shell contributes most strongly to $A_{12}$ and $A_{\rm Mad}$, but only ions of opposite charge yield electrostatic attraction, ionic crystals will more specifically prefer those close-packed structures in which each ion is surrounded by a shell of ions with opposite charge. Fig. 6.9 shows the two crystal structures that fulfill these close-packing and opposite-ion-shell requirements to an optimum. The sodium chloride (rocksalt) structure consists of two interpenetrating fcc lattices, thus achieving a coordination of 6 per ion, while the cesium chloride structure can be viewed as a bcc-like arrangement in which the ion of the second type occupies the center of the cube (coordination 8).

Table 6.4 lists the Madelung constants and coordination numbers of these two, and of two less dense lattices (zincblende and wurtzite). As already discussed in the context of the lattice sums $A_n$, the Madelung constant is expected to scale with the coordination number, but not as clearly as for example $A_{12}$. The contribution of second and further neighbors is still significant, leading to highly similar Madelung constants for the 8-fold and 6-fold coordinated CsCl and NaCl lattices (only the 4-fold coordinated zincblende and wurtzite structures exhibit a noticeably lower $A_{\rm Mad}$).

Leaving aside this influence on the specific crystalline arrangement chosen, the dominant contribution to the cohesive energy comes in any case from the electrostatic interaction (also often called Madelung energy $E_{\rm Mad}$). This can be discerned by evaluating it at the


| Compound | $c_0$ (exp) | $-E_{\rm Mad}$ | $E_{\rm coh}$ (theory) | $E_{\rm coh}$ (exp) |
|---|---|---|---|---|
| LiF | 2.01 Å | 11.81 eV | 10.83 eV | 11.45 eV |
| LiCl | 2.57 Å | 9.65 eV | 8.85 eV | 8.98 eV |
| LiBr | 2.75 Å | 9.28 eV | 8.51 eV | 8.39 eV |
| LiI | 3.01 Å | 8.64 eV | 7.92 eV | 7.66 eV |
| NaF | 2.32 Å | 10.49 eV | 9.62 eV | 9.96 eV |
| NaCl | 2.82 Å | 8.32 eV | 8.18 eV | 8.18 eV |
| NaBr | 2.99 Å | 8.52 eV | 7.81 eV | 7.72 eV |
| NaI | 3.24 Å | 7.39 eV | 7.32 eV | 7.13 eV |

Table 6.5: Experimental lattice constants $c_0$, Madelung electrostatic energies $E_{\rm Mad}$, theoretical cohesive energies per (charged) ion pair $E_{\rm coh}$(theory), cf. eq. (6.16), and experimental cohesive energies per ion pair $E_{\rm coh}$(exp) for a number of alkali halides crystallizing in the sodium chloride lattice. The larger the ionic radii, the larger the lattice constant, and accordingly the lower the cohesive energy becomes. The largest part of the cohesive energy indeed comes from the electrostatic Madelung energy.

Figure 6.10: Calculated energy levels of a KCl crystal as a function of the interionic distance $d$ (measured in Bohr radii, $a_0$). The vertical line is the experimental value, and the ionic levels are indicated by arrows on the right-hand side. The valence band derives from the full Cl 3p shell, and at the experimental lattice constant a noticeable, but still small level broadening is discernible [from H. Ibach and H. Lüth, Solid State Physics, original source: L.P. Howland, Phys. Rev. 109, 1927 (1958)].


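experimentally determined lattice constant $c_0$. Using a strategy as in eqs. (6.11), one finds furthermore that the cohesive energy at this experimental lattice constant is

$$ E^{\rm coh,crystal} = -\frac{E^{\rm crystal}(c_0)}{M} = -\frac{11}{12}\,E_{\rm Mad}(c_0) = \frac{11}{12}\,\frac{A_{\rm Mad}\,e^2}{4\pi\varepsilon_0 c_0}, \tag{6.16} $$

since the equilibrium condition $dE^{\rm crystal}/dc = 0$ applied to eq. (6.14) fixes the repulsive term at 1/12 of the magnitude of the Madelung term $E_{\rm Mad}(c_0) = -A_{\rm Mad}\,e^2/(4\pi\varepsilon_0 c_0)$.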
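As a numerical check of eq. (6.16), the following sketch evaluates the Coulomb prefactor $e^2/4\pi\varepsilon_0$ in eV Å and then the Madelung and cohesive energies per ion pair for NaCl. The Madelung constant 1.7476 is the commonly quoted NaCl value with more digits than Table 6.4 (an assumption of this example), the nearest-neighbor distance is the experimental one from Table 6.5, and the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
EPS0 = 8.8541878128e-12      # vacuum permittivity in As/Vm

# e^2/(4 pi eps0) expressed in eV Å (divide by e for eV, multiply by 1e10 for Å)
COULOMB_EV_ANGSTROM = E_CHARGE / (4.0 * math.pi * EPS0) * 1e10

def madelung_energy(a_mad, c0):
    """Electrostatic energy per ion pair in eV; c0 is the nearest-neighbor distance in Å."""
    return -a_mad * COULOMB_EV_ANGSTROM / c0

def cohesive_energy(a_mad, c0):
    """Eq. (6.16): cohesive energy per ion pair in eV for a 1/R^12 repulsion."""
    return -11.0 / 12.0 * madelung_energy(a_mad, c0)

if __name__ == "__main__":
    print(f"e^2/(4 pi eps0) = {COULOMB_EV_ANGSTROM:.2f} eV Å")
    print(f"pair attraction at 3 Å = {COULOMB_EV_ANGSTROM / 3.0:.2f} eV")
    print(f"NaCl: E_Mad = {madelung_energy(1.7476, 2.82):.2f} eV, "
          f"E_coh = {cohesive_energy(1.7476, 2.82):.2f} eV")
```

The cohesive energy obtained this way, about 8.18 eV per ion pair, agrees with the theoretical NaCl entry of Table 6.5.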

Table 6.5 lists this energy and the Madelung energy for a number of alkali halides, and compares with experiment. The agreement is not as good as the one obtained in the previous section for the vdW-bonded crystals, but given the simplicity of the theoretical model, it is clear that the essential physics is contained in it. The remaining 10-20% of the binding comes from overlapping and hybridized wave functions, which is for example reflected in the noticeable broadening of the energy levels as shown for KCl in Fig. 6.10. Still, the broadening is much smaller than the separation between the individual levels, and consequently the alkali halides are insulators. Again, the only small overlap of the ionic charge distributions, and the correspondingly small charge rearrangements compared to the isolated ions, is the reason why the primitive pair potential approach works so well for these systems. A few other interesting relationships can be seen in Table 6.5: First, $|E_{\rm Mad}|$ is generally larger than $E_{\rm coh}$, which reflects the obvious existence of some repulsive energy at equilibrium. Second, $E_{\rm coh}$ is roughly inversely proportional to the lattice constant, which is what we would expect based on the $1/R$ dependence of the Madelung energy. Further discussions on these aspects can be found in J.A. Majewski and P. Vogl, Phys. Rev. Lett. 57, 1366 (1986); Phys. Rev. B 35, 9666 (1987).

This crude pair potential model can even be extended to explain semi-quantitatively some structural trends exhibited by ionic solids. In this context we introduce Pauling's so-called radius ratio rules. Very briefly, these relate the relative size of the anion and cation in a crystal (ionic radii can be determined by experiment with, for example, x-ray diffraction) to the preferred structure which is adopted. Specifically, they state the intervals within which various structures are likely to occur:

1 >R+

R− > 0.73 (CsCl structure) (6.17)

0.73 >R+

R− > 0.41 (NaCl structure) (6.18)

0.41 >R+

R− > 0.23 (ZnS structure) (6.19)

The relationships as listed here are merely based on the most efficient ways of packing different sized spheres in the various crystal structures. They can be easily verified with a few lines of algebra. The essential messages to take from these relationships are basically: (i) if the anion and cation are not very different in size, then the CsCl structure will probably be favored; (ii) if there is an extreme disparity in their size, then the ZnS structure is likely; and (iii) if in between, the NaCl structure is likely.

Of course atoms are much more than just hard spheres, but nonetheless correlations such as those predicted with these simple rules are observed in the structures of many materials. A partial understanding of why this is so can be obtained by simply plotting $E^{\rm Mad}$ as a function of the radius ratio $R^+/R^-$. This is done in Fig. 6.11. If, for example, one looks at the CsCl to NaCl transition in Fig. 6.11, it is clear that there is a discontinuity in the slope at 0.73, below which the Madelung energy of the CsCl structure remains constant. This is a consequence of the fact that below this ratio the volume of the CsCl structure is determined solely by the second nearest neighbour anion-anion interactions. Once adjacent anions come into contact, no further energy can be gained by shrinking the cation further. It would simply "rattle" around in its cavity, with the volume of the cell and thus the Madelung energy remaining constant. Further discussion on this issue and these simple rules of thumb can be found in Pettifor, Bonding and Structure of Molecules and Solids.
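The "few lines of algebra" behind the limits in eqs. (6.17)-(6.19) can be sketched numerically. The following is a minimal hard-sphere sketch, assuming that each limit is the ratio at which neighbouring anions just touch while the cation still touches its anion neighbours; the function names are illustrative and not taken from the lecture.

```python
import math

def critical_ratio_cscl():
    # CsCl (8 neighbours): anions on a simple cubic lattice touch along the
    # cube edge, a = 2 R-, while cation and anion touch along the body
    # diagonal, sqrt(3) a / 2 = R+ + R-.
    return math.sqrt(3.0) - 1.0

def critical_ratio_nacl():
    # NaCl (6 neighbours): anions on an fcc lattice touch along the face
    # diagonal, sqrt(2) a / 2 = 2 R-, while a / 2 = R+ + R-.
    return math.sqrt(2.0) - 1.0

def critical_ratio_zns():
    # Cubic ZnS (4 neighbours): anions on an fcc lattice touch along the face
    # diagonal, sqrt(2) a / 2 = 2 R-, while sqrt(3) a / 4 = R+ + R-.
    return math.sqrt(1.5) - 1.0

for name, ratio in [("CsCl", critical_ratio_cscl()),
                    ("NaCl", critical_ratio_nacl()),
                    ("ZnS", critical_ratio_zns())]:
    print(f"{name}: anions touch below R+/R- = {ratio:.3f}")
```

Within this hard-sphere picture the three values reproduce, after rounding, the lower bounds 0.73, 0.41 and 0.23 quoted above.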

Figure 6.11: The Madelung energy in ionic compounds as a function of the radius ratio $R^+/R^-$ (horizontal axis, 0.0 to 1.0) for CsCl, NaCl and cubic ZnS lattices, assuming the anion radius is held constant (based on Pettifor, Bonding and Structure of Molecules and Solids).

6.4 Covalent bonding

The ionic bonding described in the last section is based on a complete electron transfer between the atoms involved in the bond. The somewhat opposite case (still in our idealized pictures), i.e. when chemical binding arises out of electrons being more or less equally shared between the bonding partners, is called covalence. Contrary to the ionic case, where the electron density in the solid does not differ appreciably from the one of the isolated ions, covalent bonding results from a strong overlap of the atomic-like wavefunctions of the different atoms. The valence electron density is therefore increased between the atoms, in contrast to the hitherto discussed van der Waals and ionic bonding types. It is intuitive that such an overlap will also depend on the orbital character of the wavefunctions involved, i.e. in which directions the bonding partners lie. Intrinsic to covalent bonding is therefore a strong directionality, as opposed to the non-directional ionic or van der Waals bonds. From this understanding, we can immediately draw some conclusions:

• When directionality matters, the preferred crystal structures will not simply result from an optimum packing fraction (leading to fcc, hcp or CsCl, NaCl lattices). The classic examples of covalent bonding, the group IV elements (C, Si, Ge) or III-V compounds (GaAs, GaP), indeed solidify in more open structures like diamond or zincblende.

• Due to the strong directional bonds, the displacement of atoms against each other (shear etc.) will on average be more difficult (at least more difficult than in the case of metals discussed below). Covalent crystals are therefore quite brittle.


Element             a0 (Å)    Ecoh (eV/atom)    B0 (Mbar)
C       theory      3.602     7.58              4.33
        exp.        3.567     7.37              4.43
        % diff.     <1%       3%                -2%
Si      theory      5.451     4.67              0.98
        exp.        5.429     4.63              0.99
        % diff.     <1%       1%                -1%
Ge      theory      5.655     4.02              0.73
        exp.        5.652     3.85              0.77
        % diff.     <1%       4%                -5%

Table 6.6: Comparison between DFT and experimental results for structural and cohesive properties of group IV semiconductors in the diamond structure. a0: lattice constant, Ecoh: cohesive energy, and B0: bulk modulus [from M.T. Yin and M.L. Cohen, Phys. Rev. B 24, 6121 (1981)].

• Directionality cannot be described by pair potentials that depend only on distance. A theory of cohesion in covalent crystals will therefore be significantly more involved than the crude pair-potential approach that we found so successful in describing van der Waals and ionic crystals. In the language of interatomic potentials, there will be no way around introducing at least three-body, if not higher many-body terms. In fact, the common theme of interatomic potentials for covalent crystals, like the famous Stillinger-Weber potentials or the ubiquitous force fields, are three-body terms that take angular components into account. Even then, the success and value of using such potentials is completely different compared to the pair potentials of the last two sections: For the latter we found that one general form can treat quite a range of situations and elements very well. Even for the best covalent interatomic potentials currently on the market, this transferability is very much limited. Although there are parametrizations that can describe one bonding situation for one element extremely well (say Si bulk), they completely fail for another element or for the same element in a different bonding environment (say Si surface). This reflects the fact that the functional forms employed cannot embrace the changing character of the hybridizing wave functions, or in other words that one needs to treat the quantum mechanics of the electrons explicitly to achieve a trustworthy description and understanding. Interatomic potentials are nevertheless frequently (often unfortunately uncritically) employed in materials science research, and quite some effort is dedicated to developing further, improved functional forms that might exhibit a higher transferability and reliability. For our general discussion on bonding and cohesion, such refined potential approaches are, however, not very helpful.

Lacking a model of cohesive energy of comparable simplicity to those of van der Waals or ionic bonding, we have to stick to the more elaborate electronic structure theory descriptions as obtained e.g. with DFT. Fortunately, the latter does at least a remarkably good job in describing covalent crystals, as exemplified for the group IV semiconductors in Table 6.6. Recalling that there is no free parameter in the theory, the agreement is indeed quite impressive and shows that the current exchange-correlation functionals capture most of the essential physics underlying covalent bonding.

The requirement for quantitative calculations does not, however, necessarily prevent us from still attempting to gain some further conceptual understanding of the cohesive properties of covalent crystals. A useful concept for understanding some of the structures and properties of covalent materials, both in solids and molecules, is hybridization. Let’s now have a quick look at this.

6.4.1 Hybridization

The formation of hybrid orbitals (or hybridization) has proved to be an extremely helpful and instructive concept for understanding the structure and bonding in many covalent materials (solids and molecules). Here we shall introduce the basic ideas of hybrid orbital formation with one or two instructive examples.

Let’s consider carbon, which has the valence shell configuration $2s^2 2p^2$. It is possible to make linear combinations of these four valence orbitals to yield a new set of hybrid orbitals. The resulting so-called $sp^3$-hybrid functions are

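$$\phi_1 = \frac{1}{2}\,(s + p_x + p_y + p_z) \tag{6.20}$$

$$\phi_2 = \frac{1}{2}\,(s + p_x - p_y - p_z) \tag{6.21}$$

$$\phi_3 = \frac{1}{2}\,(s - p_x + p_y - p_z) \tag{6.22}$$

$$\phi_4 = \frac{1}{2}\,(s - p_x - p_y + p_z) . \tag{6.23}$$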

These correspond to the orbitals of an excited state of the atom, i.e. this set of hybrid orbitals is less stable than the original set of atomic orbitals. However, in certain circumstances it is possible for these orbitals to bond more effectively with orbitals on adjacent atoms and in the process render the composite (molecule, solid) system more stable. As shown in Fig. 6.12, the four $sp^3$ orbitals point to the four corners of a tetrahedron. This implies that $sp^3$ hybrid orbitals favor bonding in which atoms are tetrahedrally coordinated. Indeed, they are perfectly suited for the diamond structure, and the energy gain upon chemical bonding in this tetrahedral configuration outweighs the energy that is required initially to promote the s electrons to the p levels. Carbon in the diamond structure crystallizes like this, as do other elements from group IV (e.g. Si and Ge) and several III-V semiconductors. Some examples of the band structures of these covalent materials are shown in Fig. 6.13.

Although essentially just a mathematical construct to change basis functions, the concept of hybridization can be rather useful when seeking qualitative understanding of different systems. For example, hybridization can be used to explain the qualitative trend in the size of the band gap in tetrahedral (group IV) semiconductors. In C, Si, Ge and Sn, for example, the splitting between the valence s and p shells is approximately 7.5 eV in all cases. In the solid, however, the measured gaps between the valence and conduction bands are: C = 5.5 eV, Si = 1.1 eV, Ge = 0.7 eV, Sn = 0.1 eV. This trend can be understood through $sp^3$ hybrid formation in each element, which in the solid leads to $sp^3$ bonding (valence) and $sp^3$ antibonding (conduction) bands. The width of the bonding and antibonding bands, and hence the band gap, depends upon the overlap between atoms in each solid, which of course is related to the ’size’ of the individual elements. Carbon (in the diamond structure) therefore exhibits the largest band gap and Sn the smallest (see Fig. 6.14).
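The statement that the four hybrids of eqs. (6.20)-(6.23) form a valid new basis and point to the corners of a tetrahedron can be checked with a few lines of code. This is a minimal sketch, assuming each hybrid is represented by its coefficient vector in the orthonormal basis (s, px, py, pz) and that its direction is given by the p part of that vector; the helper names are illustrative.

```python
import math

# Coefficients of the sp3 hybrids in the basis (s, px, py, pz), eqs. (6.20)-(6.23).
SP3 = [
    (0.5,  0.5,  0.5,  0.5),
    (0.5,  0.5, -0.5, -0.5),
    (0.5, -0.5,  0.5, -0.5),
    (0.5, -0.5, -0.5,  0.5),
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle_deg(h1, h2):
    # The spatial direction of a hybrid is set by its (px, py, pz) part.
    p1, p2 = h1[1:], h2[1:]
    cosine = dot(p1, p2) / math.sqrt(dot(p1, p1) * dot(p2, p2))
    return math.degrees(math.acos(cosine))

# Overlap matrix: should be the 4x4 unit matrix if the hybrids are orthonormal.
overlaps = [[dot(a, b) for b in SP3] for a in SP3]

# Angles between all six pairs of hybrids.
angles = [angle_deg(SP3[i], SP3[j]) for i in range(4) for j in range(i + 1, 4)]

print("overlap matrix:", overlaps)
print("pair angles (degrees):", [round(x, 2) for x in angles])
```

All six pair angles come out equal to arccos(-1/3), the tetrahedral angle of about 109.5 degrees.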


Figure 6.12: Illustration of the formation of $sp^3$ hybrid orbitals in C (from E. Kaxiras, Atomic and Electronic Structure of Solids).

Figure 6.13: Band structures for some typical covalent materials in the diamond (C) and zinc blende (GaAs) structures. Both materials exhibit a ("hybridization") band gap (from E. Kaxiras, Atomic and Electronic Structure of Solids).


Figure 6.14: Illustration of the formation of $sp^3$ valence and conduction bands in the tetrahedral semiconductors (energy axis; stages labelled: atomic orbitals $E_s$ and $E_p$, $sp^3$ hybrid orbitals, bonding and antibonding $sp^3$ bonds, $sp^3$ bands). The inset is a schematic illustration of the opening up of the hybridization band gap $E_{\rm gap}$ in the energy bands of these solids (Sn, Ge, Si, C). As one goes from C to Si to Ge to Sn the size of the atoms increases, which increases the band widths and decreases the band gap (after Pettifor).

Whereas this picture of $sp^3$ hybrids renders the high stability of the diamond structure and its high p valence character comprehensible, there is unfortunately no simple rule that would predict which hybrid orbital set (and corresponding structure) is most preferred for each element. Hybrid orbital formation mainly provides a suitable language for describing the bonding properties of solids (and molecules). Indeed, another common set of hybrid orbitals is the $sp^2$ set, which is often used to discuss layered structures:

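$$\phi_1 = \frac{1}{\sqrt{3}}\,s + \frac{\sqrt{2}}{\sqrt{3}}\,p_x \tag{6.24}$$

$$\phi_2 = \frac{1}{\sqrt{3}}\,s - \frac{1}{\sqrt{6}}\,p_x + \frac{1}{\sqrt{2}}\,p_y \tag{6.25}$$

$$\phi_3 = \frac{1}{\sqrt{3}}\,s - \frac{1}{\sqrt{6}}\,p_x - \frac{1}{\sqrt{2}}\,p_y \tag{6.26}$$

$$\phi_4 = p_z . \tag{6.27}$$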
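For the $sp^2$ set of eqs. (6.24)-(6.27), the weight of the s orbital in each hybrid and the in-plane angles between the hybrids follow directly from the coefficients. The sketch below assumes the same coefficient-vector representation in the basis (s, px, py, pz) and uses illustrative variable names.

```python
import math

S3, S6, S2 = math.sqrt(3.0), math.sqrt(6.0), math.sqrt(2.0)

# Coefficients in the basis (s, px, py, pz), eqs. (6.24)-(6.27).
SP2 = [
    (1 / S3,  S2 / S3,  0.0,    0.0),
    (1 / S3, -1 / S6,   1 / S2, 0.0),
    (1 / S3, -1 / S6,  -1 / S2, 0.0),
    (0.0,     0.0,      0.0,    1.0),   # the unhybridized pz orbital
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# s weight (squared s coefficient) of the three in-plane hybrids.
s_weights = [h[0] ** 2 for h in SP2[:3]]

# Angles between the in-plane directions (px, py) of the three hybrids.
def in_plane_angle(h1, h2):
    p1, p2 = h1[1:3], h2[1:3]
    cosine = dot(p1, p2) / math.sqrt(dot(p1, p1) * dot(p2, p2))
    return math.degrees(math.acos(cosine))

in_plane_angles = [in_plane_angle(SP2[i], SP2[j])
                   for i in range(3) for j in range(i + 1, 3)]

# Largest deviation of the overlap matrix from the unit matrix.
max_overlap_error = max(abs(dot(SP2[i], SP2[j]) - (1.0 if i == j else 0.0))
                        for i in range(4) for j in range(4))

print("s weights:", s_weights)
print("in-plane angles (degrees):", in_plane_angles)
print("max deviation from orthonormality:", max_overlap_error)
```

Each in-plane hybrid carries one third s character, and the three hybrids are separated by 120 degrees in the xy plane, with pz perpendicular to it.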

An illustration of the $sp^2$ hybrid orbitals and their resultant energy levels for C is shown in Fig. 6.15. It can be seen that $sp^2$ hybrid formation is compatible with bonding in a trigonal arrangement (bonding in a plane with an angle of 120° between neighbours). For C in the solid state this gives rise to graphite, which contains a combination of strong in-plane σ and π $sp^2$-derived bonds and much weaker interlayer (van der Waals) bonds.

Finally, we should stress that the initially mentioned view of ionic and covalent bonding as opposite extremes already indicates that most real structures will exhibit varying degrees of both bonding types. This is nicely illustrated by the series Ge, GaAs, and ZnSe, i.e. a pure group IV, a III-V and a II-VI structure, all in the same row of the periodic system of elements. Fig. 6.16 shows the corresponding valence electron densities. While the increased


Figure 6.15: Illustration of the formation of $sp^2$ hybrid orbitals in C (from E. Kaxiras, Atomic and Electronic Structure of Solids).

Figure 6.16: Valence electron density of a) Ge (top), b) GaAs (middle), and c) ZnSe (lower) in e per unit-cell volume.


Figure 6.17: DFT-LDA cohesive energy (plotted here as a negative number) for Al as a function of the nearest neighbor coordination NN in various lattice types. The lattice types considered are the linear chain (NN = 2), graphite (NN = 3), diamond (NN = 4), two-dimensional square mesh (NN = 4), square bilayer (NN = 5), simple cubic (NN = 6), triangular mesh (NN = 6), vacancy lattice (NN = 8) and fcc (NN = 12). The solid curve is a fit to $A \cdot NN - B\sqrt{NN}$ [from V. Heine et al., Phil. Trans. Royal Soc. (London) A334, 393 (1991)].

bonding density in the purely covalent Ge solid still lies symmetrically between the atoms, the maximum shifts more and more towards the anion for the case of the III-V and II-VI compound. GaAs can still be discussed in the $sp^3$ picture, e.g. as Ga$^{(-)}$($4s^1 4p^3$) and As$^{(+)}$($4s^1 4p^3$). The range of the $sp^3$ hybrids is, however, larger for the As anion than for the Ga cation, shifting the bonding maximum and giving the bonding a slightly ionic touch. This becomes even more pronounced for the II-VI compound, and for the I-VII alkali halides discussed in section 6.2, the purely ionic bonding character is attained.

6.5 Metallic bonding

Although we have already viewed the even or completely one-sided sharing of electrons in covalent and ionic bonding as somewhat opposite extremes, they are similar in the sense that the valence electrons are still quite localized: either on the ions or in the bonds between the atoms. The conceptual idea behind metallic bonding is now complementary to this, and describes the situation when the valence electrons are highly delocalized. In other words, they are well shared by a larger number of atoms - in fact one can no longer say to which atom a valence electron really belongs: it is simply part of the “community”. Such a situation is for example most closely realized in the alkali metals, which readily give away their only weakly bound s electron in the valence shell. In most abstract terms, such metals can thus be perceived as atomic nuclei immersed in a featureless electron glue. From this understanding, we can immediately (just like in the covalent case) draw a couple of conclusions:


• A delocalized binding is not directional, and should allow for easy displacement of the individual atoms with respect to each other. Metals are therefore rather elastic and ductile.

• Delocalization is the consequence of heavy overlap between the individual valence wave functions. The bands will therefore exhibit a strong dispersion, rendering the opening up of energy gaps in the density of states (DOS) less likely. With the Fermi level cutting anywhere through this gapless valence DOS, unoccupied states will exist immediately above the highest occupied one. Indeed, this is the defining characteristic of a metal, and application of small external perturbations, e.g. an electric field, can then induce current flow, i.e. metals are electric conductors (and in turn also good thermal conductors).

• It is intuitively clear that a contribution arising from delocalized bonding between many atoms cannot be described by a sum of pair potentials. This results equally from the understanding that pair potentials are only adequate when there is a negligible distortion of the atomic electron density when the atom is added to the solid. In metals, on the other hand, the overlap between the valence wave functions is so large that the atomic character is hardly recognizable any more. That simple pair potentials will not be appropriate for the description of metallic systems is also nicely visible from plots like the one shown in Fig. 6.17. Here, the cohesive energy for Al is plotted as a function of the nearest neighbor coordination NN in various crystal lattices. If the binding arose only out of pairwise bonds with the nearest neighbors, the cohesive energy would be proportional to NN. What is instead typically obtained, as in Fig. 6.17, is that the cohesive energy scales with the coordination more like $A \cdot NN - B\sqrt{NN}$, with A and B constant. Apparently, increasing the local coordination about a given atom reduces the strength of the existing bonds, as the delocalized electrons spread more evenly between all neighbors. This phenomenon, characteristic of metallic bonding, is often called bond order conservation, while chemists refer to an unsaturated nature of the metallic bond. With pure pair potentials failing, the common theme of interatomic potential schemes used for metals is therefore to add a coordination dependent term, which reduces the linear scaling due to the pair potential for higher coordinated atoms. The most famous examples of such approaches are the so-called Embedded Atom Method (EAM) or Finnis-Sinclair/Bond-Order Potentials (BOPs). Still, the same word of caution holds here as already discussed for the covalent crystals: Although frequently employed in materials science, there is as yet no really reliable and transferable interatomic potential scheme for metals. Presumably there will never even be one, and real quantitative understanding can only come out of quantum mechanical calculations explicitly treating the electronic degrees of freedom.
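The difference between purely pairwise bonding and the scaling form $A \cdot NN - B\sqrt{NN}$ mentioned in the last item can be made concrete with a toy calculation. This is only a sketch: the constants A and B below are hypothetical values chosen for illustration (they are not the fit parameters of Fig. 6.17), and the energy is taken as negative for a bound solid, following the sign convention of that figure.

```python
import math

A = 0.1  # eV, hypothetical coefficient of the linear (pair-like) term
B = 1.0  # eV, hypothetical coefficient of the square-root (bonding) term

def energy(nn):
    """Energy per atom in the A*NN - B*sqrt(NN) form (negative = bound)."""
    return A * nn - B * math.sqrt(nn)

def energy_per_neighbour(nn):
    """Energy per atom divided by the number of nearest neighbours."""
    return energy(nn) / nn

coordinations = [2, 3, 4, 6, 8, 12]
for nn in coordinations:
    print(f"NN = {nn:2d}: E = {energy(nn):7.3f} eV, "
          f"E/NN = {energy_per_neighbour(nn):7.3f} eV")
```

For these illustrative parameters the total binding still grows with coordination, but the binding per neighbour shrinks steadily, which is the behaviour described above as bond order conservation. A purely pairwise model would instead give a constant energy per neighbour.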

As already in the case of covalent bonding, we will have to look for alternatives that will bring us a conceptual understanding of the quantitative data coming out of DFT calculations. We will do this first for the so-called simple or free-electron like metals, which comprise for example the alkali and alkaline earth metals¹ (group I and II). Characteristic data for some alkalis on which we will concentrate are listed in Table 6.7. Fig. 6.18 shows

¹ Although the alkaline earth metals have a filled valence s shell, under normal conditions they are metallic solids because of partial occupation of their p bands (sp hybridization).


Figure 6.18: Radial wavefunctions of two sodium atoms at the equilibrium interatomic distance they would have in the crystal. There is very small overlap between the 2s and 2p orbitals, but a very large overlap of the 3s valence wave functions [from Ashcroft and Mermin].

Element    a0 (Å)    Ecoh (eV/atom)    rs/aB
Li         3.49      1.63              3.27
Na         4.23      1.11              3.99
K          5.23      0.93              4.95
Rb         5.59      0.85              5.30
Cs         6.05      0.80              5.75

Table 6.7: Experimental values for structural and cohesive properties of group I alkalis in the bcc structure. a0: lattice constant, Ecoh: cohesive energy, and rs: Wigner-Seitz radius.
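The two structural columns of Table 6.7 are not independent. The sketch below assumes the usual definition of the Wigner-Seitz radius as the radius of a sphere whose volume equals the volume per conduction electron (cf. eq. (6.28)), a bcc cubic cell containing two atoms, and one valence electron per alkali atom. The value of the Bohr radius and all function names are assumptions of this sketch.

```python
import math

BOHR_IN_ANGSTROM = 0.529177  # Bohr radius a_B in Angstrom

# (lattice constant a0 in Angstrom, tabulated r_s / a_B), copied from Table 6.7
TABLE_6_7 = {
    "Li": (3.49, 3.27),
    "Na": (4.23, 3.99),
    "K":  (5.23, 4.95),
    "Rb": (5.59, 5.30),
    "Cs": (6.05, 5.75),
}

def wigner_seitz_radius_bcc(a0_angstrom, electrons_per_atom=1):
    """Return r_s / a_B for a bcc crystal with cubic lattice constant a0."""
    atoms_per_cubic_cell = 2  # bcc: corner atom plus body-centre atom
    density = atoms_per_cubic_cell * electrons_per_atom / a0_angstrom ** 3
    rs_angstrom = (3.0 / (4.0 * math.pi * density)) ** (1.0 / 3.0)
    return rs_angstrom / BOHR_IN_ANGSTROM

computed = {el: wigner_seitz_radius_bcc(a0) for el, (a0, _) in TABLE_6_7.items()}

for el, (a0, rs_table) in TABLE_6_7.items():
    print(f"{el}: a0 = {a0:.2f} A, r_s/a_B = {computed[el]:.2f} "
          f"(table: {rs_table:.2f})")
```

The computed radii agree with the tabulated ones to within a few percent. The small residual differences are presumably due to rounding or to the conditions under which the tabulated values were determined, which the table does not specify.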

Figure 6.19: DFT-LDA band structure (left) and DOS (right) for Na in the bcc structure. Note how closely the valence bands follow a free-electron like dispersion, as is also clearly visible in the parabolic shape of the DOS [from V.L. Moruzzi, J.F. Janak and A.R. Williams, Calculated electronic properties of metals, Pergamon Press (1978)].


the radial wave functions of two Na atoms at the equilibrium distance they would have in a bcc solid, from which the already mentioned strong overlap of the valence 3s states is apparent. In fact, ignoring the oscillations near the two nuclei, the charge distribution of the overlapping 3s orbitals can be seen to be practically constant. The band structure of alkalis should therefore exhibit a dispersion very similar to the one of free electrons (hence the name given to these metals), as also illustrated in Fig. 6.19 for Na. It also suggests that the simple jellium model discussed already in chapter 3 could serve as a suitable minimum model to qualitatively describe the bonding in the alkali crystals. In this jellium model, the electron density is considered to be constant over the whole solid, and in its simplest form the ion lattice is equally smeared out to a constant density exactly compensating the electronic charge. The model is then completely specified by just the electron density n = N/V, which is usually given in the form of the so-called Wigner-Seitz radius

$$r_s = \left(\frac{3}{4\pi n}\right)^{1/3} , \tag{6.28}$$

corresponding to the radius of the sphere whose volume is the volume available to one conduction electron. In chapter 3 we saw that the energy per electron can be written as

$$\begin{aligned} E/N &= T + E^{\rm ion-ion} + E^{\rm el-ion} + E^{\rm el-el} \\ &= T_s + E^{\rm ion-ion} + E^{\rm el-ion} + E^{\rm Hartree} + E^{\rm XC} , \end{aligned} \tag{6.29}$$

where $T$ ($T_s$) is the kinetic energy of the (non-interacting) electron gas, $E^{\rm ion-ion}$ and $E^{\rm el-ion}$ are the energies due to ion-ion and electron-ion interaction, and the energy due to electron-electron interaction, $E^{\rm el-el}$, has been divided into Hartree and exchange-correlation contributions. For the completely smeared out constant ion density, one finds that

$$E^{\rm Coulomb} = E^{\rm ion-ion} + E^{\rm el-ion} + E^{\rm Hartree} = 0 , \tag{6.30}$$

i.e. the Coulomb interaction due to the constant electron and ion densities cancels. This simplifies eq. (6.29), and in the exercise you will derive that one obtains for the energy in the Hartree-Fock approximation

$$(E/N)_{\rm const.\ ion} = T_s + E^{\rm XC} \overset{\rm HF}{\approx} \frac{30.1\ {\rm eV}}{(r_s/a_B)^2} - \frac{12.5\ {\rm eV}}{(r_s/a_B)} . \tag{6.31}$$

Interestingly, this energy exhibits a minimum at $r_s^0 = 4.8\,a_B$, i.e. already this crudest model of delocalized electrons leads to bonding. Before we directly proceed to analyze how this compares to the real alkalis (or how we may somewhat refine our toy model), let us first understand this quite astonishing fact. If we had treated the electron gas as independent particles, its energy would have only contained the kinetic energy $T_s$. As we can see from eq. (6.31), this first term is purely repulsive. Since in this approximation the attractive electrostatic potential from the smeared-out ionic background is exactly compensated by the average repulsive field from all the other electrons, there is no reason for the electrons to stay closer together. Adding exchange in the Hartree-Fock approximation, however, introduces the exchange hole around each electron as discussed in chapter 3. Due to this lowering of the electron density in its immediate vicinity, each electron now sees an additional attractive potential from the surrounding positive jellium background. Since


fcc        hcp        bcc        sc         diamond
1.79186    1.79175    1.79168    1.76012    1.67085

Table 6.8: Madelung constants for ion lattices immersed in a compensating constant electron density [from C.A. Sholl, Proc. Phys. Soc. 92, 434 (1967)].

the potential at the centre of a sphere of uniform charge varies inversely with the sphere radius, we expect the electron to feel an additional attractive potential proportional to $1/r_s$. This is indeed the second term in eq. (6.31), lowering the energy and leading to a binding minimum.

Having understood this, how good or bad are we actually doing with this jellium model? The alkalis crystallize in the bcc structure, which is something we cannot get out of our present model, because we have neglected the explicit form of the crystalline structure (but we will comment on the bcc structure below). The alkali atoms have one valence s electron, i.e. the number of electrons N is equal to the number of atoms M in the system. With this, we obtain for the cohesive energy

$$E^{\rm coh} = -\left(\frac{E[r_s^0]}{M} - \frac{E[r_s \to \infty]}{M}\right) = -\frac{E[r_s^0]}{N} = 1.3\ {\rm eV/atom} . \tag{6.32}$$
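The numbers quoted for the smeared-out ion model follow directly from eq. (6.31). The following minimal sketch takes the two prefactors from that equation, obtains the minimum analytically by setting the derivative with respect to $r_s$ to zero, and cross-checks it with a simple grid search; the function and variable names are illustrative.

```python
C_KIN = 30.1  # eV, prefactor of the 1/r_s^2 term in eq. (6.31)
C_X = 12.5    # eV, prefactor of the 1/r_s term in eq. (6.31)

def energy_per_electron(rs):
    """Energy per electron in eV for r_s given in units of a_B, eq. (6.31)."""
    return C_KIN / rs ** 2 - C_X / rs

# dE/dr_s = -2 C_KIN / r_s^3 + C_X / r_s^2 = 0  =>  r_s^0 = 2 C_KIN / C_X
rs0 = 2.0 * C_KIN / C_X
e_coh = -energy_per_electron(rs0)  # E[r_s -> infinity] = 0, cf. eq. (6.32)

# Independent numerical cross-check on a fine grid between 1 and 10 a_B.
grid = [1.0 + 0.001 * i for i in range(9001)]
rs0_grid = min(grid, key=energy_per_electron)

print(f"r_s^0 = {rs0:.2f} a_B (grid search: {rs0_grid:.2f} a_B)")
print(f"E_coh = {e_coh:.2f} eV/atom")
```

Both routes give a minimum close to 4.8 $a_B$ and a cohesive energy of about 1.3 eV per atom, in line with the values stated in the text.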

In the bcc lattice with one electron per atom, the Wigner-Seitz radius is related to the lattice constant by $a_{\rm bcc} \approx 1.1\,\text{Å} \cdot (r_s/a_B)$. We therefore obtain for the lattice constant $a^0_{\rm bcc} = 5.3$ Å. Comparing these two cohesive quantities with the data compiled in Table 6.7 shows that the success of this admittedly trivial model is impressive.

Precisely because we are well aware that our model is only a crude caricature of reality, we should verify that this agreement is not fortuitous. First, one should check whether the Hartree-Fock approximation really describes the major effect due to the electron-electron interaction. For the simple jellium system, one can fortunately calculate the correlation beyond Hartree-Fock rather well. From homogeneous electron gas theory we find that such further contributions lower the equilibrium Wigner-Seitz radius from $r_s^0 = 4.8\,a_B$ to $4.23\,a_B$ (and increase the cohesive energy to 2.2 eV/atom), i.e. the HF value was not too bad after all. The other drastic approximation in our model was to smear out the ion lattice to a constant value. Alternatively, one can employ a lattice of point charges with Z = e (for the alkalis) to resemble the atomic nuclei. Then, the Coulomb interaction of eq. (6.30) between electrons and ions no longer cancels. Instead one obtains (e.g. C.A. Sholl, Proc. Phys. Soc. 92, 434 (1967))

$$E^{\rm Coulomb} = -\frac{\alpha}{2}\,\frac{e^2}{r_s} , \tag{6.33}$$

where α is again a Madelung constant. This time it refers to positive ions immersed in a constant electron density. Values of α for some lattices are given in Table 6.8. Considering the ion lattice explicitly therefore leads to another term lowering the electron energy (negative sign!), which is lowest for the lattice maximizing the Madelung constant. This lattice would correspondingly result as the stable one in our model, but looking at the values listed in Table 6.8 we find that we will be unable to distinguish between fcc, hcp and bcc lattices. That the two close-packed lattices are among the most stable is no real surprise, but the high stability of the more open bcc lattice is interesting. From the non-directionality of the metallic bond, we would have intuitively expected the close-packed lattices to be most favorable. Yet, even if this was so, our analysis now shows that the bcc lattice will not be very much less favorable (and this also results from accurate DFT calculations of the alkalis). Since entropy favors more open structures, a phase transition to bcc could therefore already occur at very low temperatures. At the finite temperatures at which experiments have been carried out to date, the bcc structure is always found for all alkali metals. Whether this is really the ground state structure, or just the result of a phase transition at very low temperature, is not yet understood.

In any case, in the bcc structure the Coulomb interaction term becomes

$$E^{\rm Coulomb} = -\frac{24.4\ {\rm eV}}{(r_s/a_B)} . \tag{6.34}$$
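The prefactor in eq. (6.34) and its effect on the energy minimum can be reproduced from eq. (6.33) and the Madelung constants of Table 6.8. The sketch below assumes that eq. (6.33) is written in Gaussian units, so that $e^2/a_B$ equals one Hartree (taken here as 27.2114 eV), and it reuses the two prefactors of eq. (6.31); the names are illustrative.

```python
HARTREE_EV = 27.2114  # e^2 / a_B in eV (assumed Gaussian-unit reading of eq. 6.33)

# Madelung constants from Table 6.8
MADELUNG = {
    "fcc": 1.79186,
    "hcp": 1.79175,
    "bcc": 1.79168,
    "sc": 1.76012,
    "diamond": 1.67085,
}

C_KIN = 30.1  # eV, 1/r_s^2 prefactor of eq. (6.31)
C_X = 12.5    # eV, 1/r_s prefactor of eq. (6.31)

def coulomb_prefactor_ev(alpha):
    """Prefactor c in E_Coulomb = -c / (r_s / a_B), from eq. (6.33)."""
    return 0.5 * alpha * HARTREE_EV

def equilibrium_rs(alpha):
    """Minimum of eq. (6.31) plus the point-ion Coulomb term, in units of a_B."""
    return 2.0 * C_KIN / (C_X + coulomb_prefactor_ev(alpha))

for lattice, alpha in MADELUNG.items():
    print(f"{lattice:8s} prefactor = {coulomb_prefactor_ev(alpha):.3f} eV, "
          f"r_s^0 = {equilibrium_rs(alpha):.3f} a_B")
```

For bcc the prefactor comes out close to the 24.4 eV of eq. (6.34). The fcc, hcp and bcc values differ only in the third decimal place of an eV, which illustrates why this model cannot discriminate between these three lattices.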

Adding this to eq. (6.31), we now obtain as the minimum $r_s^0 = 1.6\,a_B$, i.e. the new energy lowering term has considerably shifted the optimum for metallic binding to higher electron densities. In fact, the shift is so large that, with the relation $a_{\rm bcc} \approx 1.1\,\text{Å} \cdot (r_s/a_B)$ used above, we now obtain a way too small lattice constant of only about 1.8 Å, cf. Table 6.7. The reason for this overshooting in the correction compared to the smeared-out ion result is also the reason why in both cases we erroneously obtain identical lattice constants and cohesive energies for all alkali metals (there is no “material dependence” in the optimum electron density). Instead of point-like ions, there will in reality be a finite core region with a high density of core electrons. Due to exchange and correlation, the valence electrons will be repelled from this region and will be confined to a smaller region left in between. This increases the average electron density in this region and thus also the kinetic energy repulsion. In parallel, the valence electrons can also not come as close to the positively charged ions as in the situation approximated by the point-like lattice. This gives a less negative electrostatic energy, i.e. in total both the electrostatic and the kinetic energy term will favor $r_s > 1.6\,a_B$ (lower densities) when finite core regions are considered. Depending on the size of the core region, one will therefore describe the alkali better with either the smeared-out ion model (approximating a large core region) or with the point-like ion model (approximating a small core region). This also explains the varying cohesive values within the alkali metal series: The small core region of Li is still very well modelled by the point-like lattice model, whereas the large core regions of the heavy Rb or Cs already approach the situation described better by our original smeared-out ion model.

This correspondence is in fact exploited semi-quantitatively by assigning so-called empty-core (or Ashcroft) pseudopotentials to each metal, such that a jellium model on the level discussed above (but with a finite impenetrable core region corresponding to the empty-core radius) fits the experimental cohesive data (lattice constant, cohesive energy) best. Although this allows one to describe quite a range of properties, ranging from phonon spectra and optical absorption to superconducting transition temperatures for all simple metals, this refinement is not very instructive for our general discussion on bonding. For us, it is primarily important that we understand that delocalized electrons can lead to metallic bonding. On the other hand, what we cannot yet understand on the basis of our crude model (in fact not even with the refinement of empty-core pseudopotentials) is why the transition metals (TMs) exhibit significantly higher cohesive energies than the simple metals (W has the highest one with 8.9 eV/atom). Furthermore, why do the cohesive properties follow roughly a parabolic pattern over one transition metal series as


Figure 6.20: Typical radial distributions of the valence 3d and 4s wavefunctions in a 3d transition metal. Compared to the s states the d states are much more contracted [from C.S. Nichols, Structure and Bonding in Condensed Matter].

Figure 6.21: Experimental heat of formation for the 3d, 4d and 5d transition metals. A parabolic variation of the cohesive properties is clearly visible for the 4d and 5d metals. For the 3d series this trend is less clear, as discussed in the text [from C.S. Nichols, Structure and Bonding in Condensed Matter].


Figure 6.22: DFT-LDA band structure for Cu in the fcc structure (upper panel). The s bands are located in the region shaded in light grey and show a dispersion highly reminiscent of free electrons (the band structure of which is shown in the lower panel for comparison). In contrast, the bands deriving from the d orbitals (lying in the dark shaded region) are rather flat and have no correspondence in the free electron band structure [from Ashcroft and Mermin].

exemplified in Fig. 6.21, which is furthermore accompanied by a systematic change of the stable crystal structure from bcc via hcp to fcc when going from early to late TMs? There must be another component in the bonding responsible for this, and it is not difficult to imagine that this has to do with what is special about the TMs, namely the partly filled d valence shell. These states add a strong covalent contribution to the metallic bonding.

In a general, but highly simplified view, the d orbitals can be regarded as relatively strongly localized compared to the s valence electrons of the simple metals. A “tight-binding” type description in the sense of atomic-like orbitals is then reasonable, even in the solid. Compared to the delocalized s bands, the d bands will correspondingly be rather flat, as can indeed be seen in the band structure of Cu shown in Fig. 6.22. The valence density of states for transition metals can therefore be schematically decomposed into two contributions: a broad, featureless and essentially parabolic part due to the valence s states (comparable to the simple alkali metals) and a relatively narrow (few eV wide) part due to the d states, as illustrated in Fig. 6.23. Since there are many more d states than s states, the d contribution dominates, and the varying (cohesive) properties over a transition metal series can essentially be understood from a differing degree of filling of the d band (“rigid band model”): the number of valence electrons increases over the TM series (e.g. Ru 8, Rh 9, Pd 10, and Ag 11 valence electrons), shifting the Fermi level more and more to the right within the d band dominated DOS. At the end of a TM series, the d band is finally completely filled and the Fermi level cuts through the again s-like part of the DOS above the d band, as can e.g. be seen in the band structure of Cu shown in


Figure 6.23: Qualitative picture of the two contributions to a TM (fcc) density of states (DOS): a wide, featureless and free-electron like s band and a narrow, structured d band. Due to the larger number of d states, the d band contribution to the DOS dominates, and the varying properties over a TM series can be understood as arising from a differing degree of filling of the d band (different Fermi level position) [from Ashcroft and Mermin].

Figure 6.24: Density of states in the rectangular d band model for transition metals.

Fig. 6.22. Such transition metals with completely filled d bands are called noble metals (Cu, Ag, Au).

The simplest model reflecting this understanding of the TM valence electronic structure as a composition of nearly free electron s bands and “tight-binding” d bands is the so-called rectangular d band model of Friedel. Here, the s states are taken as free-electron like (i.e. the jellium model discussed for the simple metals) and the d density of states as constant over a given band width W, as sketched in Fig. 6.24. Within this simple model one can analytically derive and understand surprisingly many properties of TMs, not only cohesive ones, and we will encounter it again in later chapters of this lecture. Here, we content ourselves with discussing only the salient features with respect to cohesion qualitatively. Due to the more localized nature of the d orbitals, their bonding contribution is in fact more covalent than metallic. Bringing the atoms closer together results in d wave function overlap and a splitting into bonding and antibonding states, yielding the narrow d band. Within this d DOS we therefore expect the lowest energy states to exhibit a more bonding character, followed by non-bonding states at intermediate energies, and the highest energy states to be of antibonding character.

What does this understanding now mean for the cohesive properties? Going over one TM series, we start with the early transition metals and accordingly begin to fill electrons into the lowest energy d states. These are of bonding type and we expect an increase in the cohesive energy. Since the DOS is dominated by d states, this rise in cohesion should be rather strong, too. Due to the shorter range of the d orbitals, their bonding contribution will also favor smaller lattice constants to maximize the wave function overlap. Towards the middle of the TM series the solids should therefore exhibit strongly increasing cohesive energies and decreasing lattice constants. The packing fraction and corresponding s


electron density then become higher than what corresponds to the optimum $r_s$ for the metallic bonding (e.g. compare the bcc lattice constants around 3 Å of the 5B and 6B TMs (V, Cr, Nb, Mo, Ta) with the ∼ 5 Å favored by the heavy alkalis of similar core radius). The resulting structure and cohesion therefore balance a contractive tendency from the d orbitals with a repulsive tendency from the s electron gas (often called s pressure). Once the filling reaches the non-bonding and anti-bonding higher energy states in the d band (i.e. for the middle and late TMs), the increasing number of d electrons does not yield further bonding anymore, and even diminishes the existing one. The cohesive energy will level off and decrease, while the s pressure leads to increasing lattice constants. At the noble metals, the d contribution has in this simplistic view finally cancelled completely, and we reobtain cohesive properties (very roughly only) comparable to the simple metals. Calculating through the rectangular d band model, one obtains therefore in total a parabolic shape for the cohesive energy over a TM series, and using the actually computed d band widths of the order of a few eV for the parameter W, also the absolute magnitude of the cohesive energy comes out very well.

With a very crude model we can therefore (again) understand the qualitative cohesive trend over a large number of elements. What we cannot reproduce with it yet, but which is something that comes out very well in state-of-the-art DFT calculations by the way, is the structural trend from bcc to hcp to fcc over the TM series and the strange dip at the top of the parabola in the middle of the TM series, cf. Fig. 6.21. The first point can obviously not be understood within the rectangular d band model, since there is no explicit lattice structure contained in it. When one considers the latter, say in DFT calculations, one finds that the lattice affects the sub structure within the d DOS that is apparent in Fig. 6.23, but neglected in the coarse rectangular d band model. One can understand this sub structure in the DOS directly from the band structure: The DOS results from the integration over the Brillouin zone; points that occur often and where the bands are relatively flat will thus give rise to a high density of states. For the fcc lattice, cf. Fig. 6.23, one would for example typically expect five peaks, three associated with the (eightfold occurring) L point and two with the (sixfold occurring) X point, cf. Fig. 6.22. The shape of the d DOS is in other words quite characteristic for a given lattice type, not so much for the element (which more dictates the filling of the d DOS, again in the view of the “rigid band model”). Comparing this characteristic shape for bcc, hcp and fcc structures one can discern e.g. a rather skewed form of the bcc d DOS with many low lying states. All three lattices (bcc, fcc, hcp) offer almost the same volume per atom, in which case one can show that the contribution from the single particle energies governs the final total energy [H.L. Skriver, Phys. Rev. B 31, 1909 (1985)]. If a particular lattice structure therefore offers an optimum number of bonding states for a given filling fraction (like the bcc structure for small fillings), it will result as most stable. With the characteristic DOS shapes, we therefore obtain in all three TM rows the same sequence bcc → hcp → fcc depending on the filling ratio (a more in depth discussion of this point can for example be found in D. Pettifor, Bonding and Structure of Molecules and Solids, Clarendon Press (1995)).

This leaves as the last point the dip in the middle of the TM series, cf. Fig. 6.21. The reason behind this lies in the special properties of the free atoms in the center of the TM series (recall that the cohesive energy results as the difference between the energy of the isolated atom and the solid!). In particular for Mn ($3d^5 4s^2$) and Mo ($4d^5 5s^1$), the d-d correlation is particularly important and leads to a pronounced stability of the isolated



Figure 6.25: (A) The structure of the water dimer, which illustrates the structure of a typical H bond. (B) An isosurface of constant electron density for one of the occupied molecular orbitals in the water dimer, illustrating overlap between the wavefunctions of both water molecules in the dimer. (B) is taken from http://www.lsbu.ac.uk/water/index.html, a very informative and detailed website on the properties of water.

atom. In the solid this is less important, i.e. there is no unusual stability compared to the overall TM trend, yielding in total a diminished cohesive energy for these elements and a dip in the parabola.
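The parabolic trend that the rectangular d band model yields for the cohesive energy can be made explicit with a short numerical sketch. It is only an illustration under stated assumptions that are not spelled out in the text above: the d band is taken to be centred on the atomic d level (chosen as the energy zero), it holds 10 states per atom, and the s electron contribution is ignored. The function name `friedel_bond_energy` is hypothetical.

```python
def friedel_bond_energy(n_d, width):
    """d band bond energy per atom (same units as `width`) for n_d d electrons.

    Assumes a constant d DOS of 10/width between -width/2 and +width/2,
    with the atomic d level at the band centre (energy zero).
    """
    if not 0.0 <= n_d <= 10.0:
        raise ValueError("n_d must lie between 0 and 10")
    dos = 10.0 / width                     # constant DOS: states per unit energy
    e_fermi = -0.5 * width + n_d / dos     # fill the band from the bottom up
    # Integral of E * dos from the band bottom (-width/2) up to the Fermi level.
    return 0.5 * dos * (e_fermi**2 - (0.5 * width) ** 2)


if __name__ == "__main__":
    # Scan the d band filling for an assumed band width of 5 eV.
    for n_d in range(11):
        print(n_d, friedel_bond_energy(n_d, width=5.0))
```

Carrying out the integral by hand gives E = -(W/20) N_d (10 - N_d), i.e. a parabola that vanishes for the empty and the completely filled d band and is deepest at half filling, in line with the qualitative discussion above.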

6.6 Hydrogen bonding

The final type of bonding that we shall briefly discuss is ’hydrogen bonding’. Here we shall provide little more than a basic introduction. More information can be found in, for example, Theoretical treatments of hydrogen bonding, edited by Dusan Hadzi, or An Introduction to H bonding by G.A. Jeffrey.

The importance of hydrogen bonds to the structures of materials can scarcely be overstated. Hydrogen bonds are the single most important force determining the three-dimensional structure of proteins and the structure of liquid water, and in the solid state they feature most prominently in holding the water molecules in ice together. Indeed it has been estimated that a paper related to H bonding is published on average every fifteen minutes (see G.A. Jeffrey, An Introduction to H bonding).

The concept of some special hydrogen mediated interaction has been around since 1902 when Werner examined the reaction of ammonia and water. It was not until the work of Latimer and Rodebush in 1920, however, that this interaction became known as a hydrogen bond.² Generally H bonds form when a covalently bound H atom forms a second bond to another element. Schematically the H bond is often represented as A-H...B. A will invariably be an electronegative species (for example N, O, F, Cl) and B must be an electron donor. The structure of a typical H bond, the H bond in the water dimer, is shown in Fig. 6.25.

6.6.1 Some Properties of Hydrogen bonds

Several properties of the H bond are clear:

• Although H bonds are the strongest intermolecular forces, compared to covalent or ionic bonds they are relatively weak. H bond strengths range from about 0.1 eV to 0.5 eV. Water-water H bonds in ice or the water dimer are of intermediate strength, generally around 0.25 eV.

• H bonds are directional with A-H-B angles close to 180°. Indeed the stronger the H bond the closer it will be to 180°.

• Upon formation of a H bond the AH bond (generally termed the H bond donor) is lengthened slightly, by about 0.01-0.04 Å. This leads to softened (’red-shifted’) and broadened AH vibrations, which can be observed by experiment. Indeed the lengthening of the AH bond upon formation of a H bond correlates with the magnitude of the red-shift and also correlates with the A-B distance. See Fig. 6.26 for an illustration of this effect.

• H bonds can be ’cooperative’. In general this means that the strength of H bonds amongst fragments may increase as more H bonds are formed. Indeed several H bonded chains such as the boron hydride polymer or a helix of alanine molecules show a monotonic cooperative increase. Specifically the average H bond strength between monomers in the chain increases as the chain lengthens:

$$E_2 < E_3 < E_4 < \dots < E_N < \dots < E_\infty \tag{6.35}$$

The cooperative enhancement of infinite chains is typically quantified by the dimensionless ratio $E_\infty/E_2$. Cooperative enhancements can be large; for example, recent calculations predict a cooperative enhancement factor of > 2 for the alanine chain. This is a dramatic cooperative increase in H bond strength. This cooperative behaviour of H bonds is the opposite of the more intuitive behaviour exhibited by, for example, covalent bonds, which generally decrease in strength as more bonds are formed (cf. Fig. 6.21). Cooperativity of H bonds is a crucial physical phenomenon in biology, providing additional energy to hold certain proteins together under ambient conditions.

² Huggins contests this, claiming that he proposed the H bond in 1919; see G.A. Jeffrey, An Introduction to H bonding, for an interesting discussion on the history of H bonds.

6.6.2 Some Physics of Hydrogen bonds

Although many of the structural and physical properties of H bonds are clear, the electronic make-up of H bonds is less clear and remains a matter of debate. The unique role H plays is generally attributed to the fact that the H ion core size is negligible and that H has one valence electron but, unlike the alkali metals, which also share this configuration, has a high ionization energy. Generally it is believed that H bonds are mostly mediated by electrostatic forces, i.e., stabilized by the Coulomb interaction between the (partial) positive charge on H and the (partial) negative charge on the electronegative element B. However, H bonds are not purely electrostatic. First, it has been shown that for the water dimer (the most studied prototype H bond) it is not possible to fit the energy versus separation curve to any of the traditional multipole expansions characteristic of a pure electrostatic interaction (see S. Scheiner, Hydrogen Bonding: A Theoretical Perspective, for more details). Second, first-principles calculations reveal overlap between the wavefunctions of the donor and acceptor species in the H bond. This is shown in Fig. 6.25(B), which shows one of the occupied eigenstates of the gas phase water dimer. Overlap between orbitals such as


Figure 6.26: Correlation between the A-B distance (in this case O-O distance) and OH vibrational frequency for a host of H bonded complexes. From G.A. Jeffrey, An Introduction to H bonding.


[Plot, panel (A): error of the PBE and LDA H bond energies, E (kcal/mol), axis from -1 to 10, for the complexes (HF)2, (HCl)2, (H2O)2, (OC)(HF), (ClH)(NH3), (FH)(NH3), (H2O)(NH3), and (CO)(HF). Panel (B): PBE error per H bond (kcal/mol), axis from 0 to 1.6, versus H bond angle θ (°), from 110 to 180.]

Figure 6.27: (A) Comparison between LDA and GGA error for H bond strengths in several gas phase complexes (from C. Tuma, D. Boese and N.C. Handy, Phys. Chem. Chem. Phys. 1, 3939 (1999)). (B) Correlation between PBE error and H bond angle for several gas phase H bonded complexes. See J. Ireta, J. Neugebauer and M. Scheffler, J. Phys. Chem. A 108, 5692 (2004) for more details.


that shown in Fig. 6.25(B) is characteristic of covalent bonding. Current estimates of the electrostatic contribution to a typical H bond range anywhere from 50 to 90%.

Finally, we shall end with a brief discussion on how well DFT describes H bonds. This question has been tackled in countless papers recently, particularly by focussing on the H2O dimer and other small gas phase clusters. The two most general (and basic) conclusions of these studies are:

• Predicted H bond strengths strongly depend on the exchange-correlation functional used. The LDA is not adequate for describing H bonds: it routinely predicts H bonds that are too strong by 100%. GGAs on the other hand generally predict H bond strengths that are close to (within 0.05 eV of) the corresponding experimental value. This general conclusion is summarised nicely in Fig. 6.27(A), which plots the difference between DFT (LDA and GGA (PBE)) H bond strengths and those computed from high level quantum chemical calculations (post Hartree-Fock methods such as Configuration Interaction or Coupled-Cluster, which can yield almost exact results) for several gas phase H-bonded clusters. It can be seen from this figure that LDA always overestimates the H bond strengths, by at least 4 kcal/mol (≈ 0.17 eV). PBE on the other hand is always within 1 kcal/mol (≈ 0.04 eV) of the ’exact’ value.

• The quality of the GGA (PBE) description of H bonds depends on the structure of the H bond under consideration. Specifically, it has been shown that the more linear the H bond is, the more accurate the PBE result is. This is shown in Fig. 6.27(B) for several different H bonded gas phase clusters.
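Since H bond energies are quoted both in kcal/mol (the quantum chemistry convention) and in eV (the solid state convention), a small conversion helper is handy when comparing numbers from the two communities. The sketch below assumes the thermochemical calorie (4.184 J) and the exact SI values of the Avogadro constant and the elementary charge; the function name `kcal_per_mol_to_ev` is hypothetical.

```python
AVOGADRO = 6.02214076e23             # particles per mol (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value); 1 eV = this many J
JOULE_PER_KCAL = 4184.0              # assumes the thermochemical calorie


def kcal_per_mol_to_ev(energy_kcal_per_mol):
    """Convert an energy per mole (kcal/mol) into an energy per particle (eV)."""
    joule_per_particle = energy_kcal_per_mol * JOULE_PER_KCAL / AVOGADRO
    return joule_per_particle / ELEMENTARY_CHARGE


if __name__ == "__main__":
    for value in (1.0, 4.0):
        print(value, "kcal/mol =", round(kcal_per_mol_to_ev(value), 3), "eV")
```

With these constants, 1 kcal/mol corresponds to roughly 0.043 eV per bond, which is a convenient rule of thumb for reading plots such as Fig. 6.27.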

6.7 Summary

Some of the key features of the five main types of bonding are shown in Fig. 6.28.


Type             Cohesive energy                                Examples
Van der Waals    very weak: 1-10 meV                            noble gases (Ar, Ne, etc.), organic molecules
H bonding        weak: 0.1-0.5 eV                               water, ice, proteins
Covalent         strong: ≤ 11 eV (molecules), ≤ 8 eV (solids)   C (graphite, diamond), Si, transition metals
Metallic         strong: ≤ 9 eV                                 Na, Al, transition metals
Ionic            strong: ≤ 8 eV                                 NaCl, NaBr, KI

[The "Schematic" column of the figure (sketches of the charge distribution with + and - ions and partial charges δ+ and δ-) is not reproduced.]

Figure 6.28: Very simple summary of the main types of bonding in solids.


7 Lattice Vibrations

7.1 Introduction

Up to this point in the lecture, the crystal lattice (lattice vectors & position of basis atoms) was always assumed to be immobile, i.e., atomic displacements away from the equilibrium positions of a perfect lattice were not considered. This allowed us to develop formalisms to compute the electronic ground state for an arbitrary periodic but static atomic configuration and therewith the total energy of our system. It is obvious that the assumption of a completely rigid lattice does not make a lot of sense, neither at finite temperatures nor at 0 K, where it even violates the uncertainty principle. Nonetheless, we obtained with this simplification a very good description for a wide range of solid state properties, among which were the electronic structure (band structure), the nature and strength of the chemical bonding, and low temperature cohesive properties, e.g., the stable crystal lattice and its lattice constants. The reason for this success is that the internal energy U of a solid (or similarly of a molecule) can be written as

$$U = E^{\mathrm{static}} + E^{\mathrm{vib}} , \tag{7.1}$$

where $E^{\mathrm{static}}$ is the electronic ground state energy with a fixed lattice (on which we have focused so far) and $E^{\mathrm{vib}}$ is the additional energy due to lattice motion. In general (and we will substantiate this below), $E^{\mathrm{static}} \gg E^{\mathrm{vib}}$, so that many properties are described in a qualitatively correct fashion (and often even semi-quantitatively) even when $E^{\mathrm{vib}}$ is neglected.

There is, however, also a multitude of solid state properties for which the consideration of the lattice dynamics is crucial. In this respect, “dynamics” is already quite a big word, because for many of these properties it is sufficient to take small vibrations of the atoms around their (ideal) equilibrium position into account. As we will see below, these vibrations of the crystal can be described as quasi-particles, so-called phonons. The most salient examples are:

• High-Accuracy Predictions at 0 K: Even at 0 K, the nuclei are always in motion due to quantum-nuclear effects. In turn, this alters the equilibrium properties at 0 K (lattice constants, bulk moduli, band gaps, etc.). For high-accuracy predictions, the nuclear motion thus cannot be neglected.

• Thermodynamic Equilibrium Properties for T > 0 K: In a rigid and immobile lattice, the electrons are the only degrees of freedom. In metals, only the free-electron-like conduction electrons can take up small amounts of energy (of the order of thermal energies). The specific heat capacity due to these conduction electrons was computed to vary linearly with temperature for T → 0 in exercise 4. In insulators with the same rigid perfect lattice assumption, not even these degrees of freedom are available, since thermal excitations across the band gap $E_g \gg k_B T$ exhibit a vanishingly small probability $\sim \exp(-E_g/k_B T)$. In reality, one observes, however, that for both insulators and metals the specific heat capacity varies predominantly with $T^3$ at low temperatures, in clear contradiction to the above cited rigid lattice dependency of $\sim T$. The same argument holds for the pronounced thermal expansion typically observed in experiments: Due to the negligibly small probability of electronic excitations over the gap, there is nothing left in a rigid lattice insulator that could account for it. These expansions are, on the other hand, key to many engineering applications in materials science (different pieces simply have to fit together over varying temperatures). Likewise, a rigid lattice offers no possibility to describe or explain structural phase transitions upon heating and the final melting.

• Thermodynamic Non-Equilibrium Properties for T > 0 K: In a perfectly periodic potential, Bloch electrons suffer no collisions, since they are eigenstates of the many-body Hamiltonian. Formally, this leads to infinite transport coefficients. The thermodynamic nuclear motion breaks this perfect periodicity and thus introduces scattering events and finite mean free paths, which in turn explain the actually measured, and obviously finite, electric and thermal conductivity of metals and semiconductors.¹ Computing and understanding these transport properties and other related non-equilibrium properties (propagation of sound, heat transport, optical absorption, etc.) is of fundamental importance for many real devices (like transistors, LEDs, lasers, etc.).

7.2 Vibrations of a Classical Lattice

7.2.1 Adiabatic (Born-Oppenheimer) Approximation

Throughout this chapter we assume that the electrons follow the atoms instantaneously, i.e., the electrons are always in their ground state at any possible nuclear configuration $\{\mathbf{R}_I\}$. In this case, the potential-energy surface (PES) on which the nuclei move is given by

$$V^{\mathrm{BO}}(\{\mathbf{R}_I\}) = \min_{\Psi} E(\{\mathbf{R}_I\}; \Psi) \tag{7.2}$$

for any given atomic structure described by atomic positions $\{\mathbf{R}_I\}$ (I ∈ nuclei). This is nothing else but the Born-Oppenheimer approximation, which we discussed at length in the first chapter, and that is why the so defined PES is often also called Born-Oppenheimer surface. Qualitatively, this implies that there is no “memory effect” stored in the electronic motion (e.g. due to the electron cloud lagging behind the atomic motion), since $V^{\mathrm{BO}}(\{\mathbf{R}_I\})$ is conservative, i.e., its value at a specific configuration $\{\mathbf{R}_I\}$ does not depend on how this configuration was reached by the nuclei.²

¹ Please note that defects and boundaries break the perfect symmetry of a crystal as well. However, they are not sufficient to quantitatively describe these phenomena.

² A breakdown of the Born-Oppenheimer approximation often manifests itself as friction: In that case, the actual value of $V(\{\mathbf{R}_I\})$ depends not only on $\{\mathbf{R}_I\}$, but on the whole trajectory. See for instance Wodtke, Tully, and Auerbach, Int. Rev. Phys. Chem. 23, 513 (2004).


The motion of the atoms is then governed by the Hamiltonian

$$H = T^{\mathrm{nu}} + V^{\mathrm{BO}}(\{\mathbf{R}_I\}) , \tag{7.3}$$

which does not contain the electronic degrees of freedom explicitly anymore. The complete interaction with the electrons (i.e. the chemical bonding) is instead contained implicitly in the functional form of the PES $V^{\mathrm{BO}}(\{\mathbf{R}_I\})$. For a perfect crystal, the minimum of this PES corresponds to the equilibrium configuration $\{\mathbf{R}_I^{\circ}\}$ of a Bravais lattice (lattice vectors & position of the basis atoms). In the preceding chapters, we have exclusively discussed these equilibrium configurations $\{\mathbf{R}_I^{\circ}\}$ in the rigid lattice model. Now, we will also discuss deviations from these equilibrium configurations $\{\mathbf{R}_I^{\circ}\}$: In principle, this boils down to evaluating the PES at arbitrary positions $\{\mathbf{R}_I\}$. Practically, this can rapidly become intractable both analytically and computationally. In many cases, however, we do not need to consider all arbitrary configurations $\{\mathbf{R}_I\}$, but only those that are very close to the (pronounced) minimum corresponding to the ideal lattice $\{\mathbf{R}_I^{\circ}\}$. This allows us to write

$$\mathbf{R}_I = \mathbf{R}_I^{\circ} + \mathbf{s}_I \quad \text{with} \quad |\mathbf{s}_I| \ll a \quad (a\text{: lattice constant}) . \tag{7.4}$$

For such small displacements around a minimum, the PES can be approximated by a parabola using the harmonic approximation.

7.2.2 Harmonic Approximation

The fundamental assumption of the harmonic approximation consists in limiting the description to small displacements $\mathbf{s}_I$ from the equilibrium configuration $\{\mathbf{R}_I^{\circ}\}$, i.e., to vibrations with small amplitudes. Accordingly, this is a low-temperature approximation. In this case, the potential in which the particles are moving can be expanded in a Taylor series around the equilibrium geometry, whereby only the first leading terms are considered. We illustrate this first for the one-dimensional case: The general Taylor expansion around a minimum at $x_0$ yields

$$V(x) = V(x_0) + \left[\frac{\partial}{\partial x} V(x)\right]_{x_0} s + \frac{1}{2} \left[\frac{\partial^2}{\partial x^2} V(x)\right]_{x_0} s^2 + \frac{1}{3!} \left[\frac{\partial^3}{\partial x^3} V(x)\right]_{x_0} s^3 + \dots , \tag{7.5}$$

with the displacement $s = x - x_0$. The linear term vanishes, because $x_0$ is a stationary point of the PES, i.e., the equilibrium geometry in which no forces are active. In the harmonic approximation, cubic (and higher) terms are neglected, which yields

$$V(x) \approx V(x_0) + \frac{1}{2} \left[\frac{\partial^2}{\partial x^2} V(x)\right]_{x_0} s^2 = V(x_0) + \frac{1}{2} c s^2 . \tag{7.6}$$

Here, we have introduced the spring constant $c = \left[\frac{\partial^2}{\partial x^2} V(x)\right]_{x_0}$ in the last step, which shows that we are indeed dealing with a harmonic potential that yields restoring forces

$$F = -\frac{\partial}{\partial x} V(x) = -c s \tag{7.7}$$

that are proportional to the actual displacement s.
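The one-dimensional harmonic approximation can be tried out numerically. The sketch below is an illustration only: it uses a Morse potential as a stand-in for the PES (an assumption, the lecture does not prescribe a model potential), estimates the spring constant c by a central finite difference, and compares the parabola with the full potential for a small and a large displacement. All function names and parameter values are hypothetical.

```python
import math


def morse(x, depth=4.5, a=2.0, x0=1.0):
    """Morse potential with minimum -depth at x0 (assumed model PES)."""
    return depth * (1.0 - math.exp(-a * (x - x0))) ** 2 - depth


def spring_constant(potential, x0, h=1e-3):
    """Central finite-difference estimate of c = V''(x0)."""
    return (potential(x0 + h) - 2.0 * potential(x0) + potential(x0 - h)) / h**2


def harmonic(potential, x0, s, c):
    """Harmonic approximation V(x0) + c s^2 / 2 at displacement s."""
    return potential(x0) + 0.5 * c * s**2


if __name__ == "__main__":
    x0 = 1.0
    c = spring_constant(morse, x0)
    print("spring constant:", c)           # analytic value is 2 * depth * a**2
    for s in (0.05, 0.5):
        print(s, morse(x0 + s), harmonic(morse, x0, s, c))
```

For the small displacement the parabola follows the full potential closely, while for the large one the neglected cubic and higher terms dominate, which is the sense in which the harmonic approximation is restricted to small amplitudes.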


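A corresponding Taylor expansion can also be carried out in three dimensions for the Born-Oppenheimer surface of many nuclei, which leads to

$$V^{\mathrm{BO}}(\{\mathbf{R}_I\}) = V^{\mathrm{BO}}(\{\mathbf{R}_I^{\circ}\}) + \frac{1}{2} \sum_{I,J=1}^{M} \sum_{\mu,\nu=1}^{3} s_I^{\mu} s_J^{\nu} \left[\frac{\partial^2}{\partial R_I^{\mu} \partial R_J^{\nu}} V^{\mathrm{BO}}(\{\mathbf{R}_I\})\right]_{\{\mathbf{R}_I^{\circ}\}} . \tag{7.8}$$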

Here, µ, ν = x, y, z are the three Cartesian coordinates, I and J run over the individual nuclei, and the linear term has vanished again for the exact same reasons as in the one-dimensional case. By defining the (3M × 3M) matrix, i.e., the Hessian

$$\Phi_{IJ}^{\mu\nu} = \left[\frac{\partial^2}{\partial R_I^{\mu} \partial R_J^{\nu}} V^{\mathrm{BO}}(\{\mathbf{R}_I\})\right]_{\{\mathbf{R}_I^{\circ}\}} , \tag{7.9}$$

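the Hamilton operator for the M nuclei of the solid can be written as

$$H = \sum_{I=1}^{M} \frac{\mathbf{P}_I^2}{2 M_{\mathrm{nu},I}} + V^{\mathrm{BO}}(\{\mathbf{R}_I^{\circ}\}) + \frac{1}{2} \sum_{I,J=1}^{M} \sum_{\mu,\nu=1}^{3} s_I^{\mu} \Phi_{IJ}^{\mu\nu} s_J^{\nu} , \tag{7.10}$$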

where $M_{\mathrm{nu},I}$ is the mass of nucleus I. Please note that the contributions of the N electrons are still implicitly contained in $V^{\mathrm{BO}}(\{\mathbf{R}_I^{\circ}\})$ and $\Phi_{IJ}^{\mu\nu}$. As a matter of fact, $V^{\mathrm{BO}}(\{\mathbf{R}_I^{\circ}\})$ corresponds to the energy of the rigid lattice discussed in previous chapters. Given that it is a constant, it does however not affect the motion of the nuclei, which only depends on $\Phi_{IJ}^{\mu\nu}$ in the harmonic approximation. This also sheds some light on our earlier claim that generally $V^{\mathrm{BO}}(\{\mathbf{R}_I^{\circ}\}) = E^{\mathrm{static}} \gg E^{\mathrm{vib}}$: The thermodynamically driven $E^{\mathrm{vib}}$ is typically ≪ 0.5 eV/atom, e.g., approx. 50 meV/atom at room temperature. In comparison, the static term is often much larger, especially in the case of strong (ionic or covalent) bonding, which often features cohesive energies > 1 eV/atom (see previous chapter). However, $E^{\mathrm{vib}}$ can become a decisive contribution whenever there is only weak bonding or when there are (even strongly bound) competing phases, the energy difference of which is of the order of $E^{\mathrm{vib}}$.

To evaluate the Hamiltonian H defined in Eq. (7.10) we need the $(3M)^2$ second derivatives of the PES at the equilibrium geometry with respect to the 3M coordinates $s_I^{\mu}$. Analytically, this is very involved, since the Hellmann-Feynman theorem cannot be applied here: In exercise 12, we had seen that the forces acting on a nucleus J can be generally expressed as

$$\mathbf{F}_J = -\nabla_{\mathbf{R}_J} V^{\mathrm{BO}}(\{\mathbf{R}_I\}) = -\nabla_{\mathbf{R}_J} \langle \Psi | H | \Psi \rangle \tag{7.11}$$

$$\phantom{\mathbf{F}_J} = -\langle \Psi | \nabla_{\mathbf{R}_J} H | \Psi \rangle - 2 \underbrace{\langle \Psi | H | \nabla_{\mathbf{R}_J} \Psi \rangle}_{=0} . \tag{7.12}$$

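Here, H is the full many-body Hamiltonian and $\Psi = \Psi_{\{\mathbf{R}_I\}}(\{\mathbf{r}_i\})$ is the exact electronic many-body wavefunction in the Born-Oppenheimer approximation (see Chap. 1). Accordingly, evaluating the forces at a specific configuration $\{\mathbf{R}_I\}$ only requires knowledge of the wavefunction (density) at this specific configuration, but not of its derivative. This is no longer true for higher-order derivatives of $V^{\mathrm{BO}}(\{\mathbf{R}_I\})$, e.g.,

$$\Phi_{IJ}^{\mu\nu} = -\frac{d F_I^{\mu}}{d R_J^{\nu}} \tag{7.13}$$

$$\phantom{\Phi_{IJ}^{\mu\nu}} = \left\langle \Psi \left| \frac{\partial^2 H}{\partial R_I^{\mu} \partial R_J^{\nu}} \right| \Psi \right\rangle + \left\langle \frac{\partial \Psi}{\partial R_J^{\nu}} \left| \frac{\partial H}{\partial R_I^{\mu}} \right| \Psi \right\rangle + \left\langle \Psi \left| \frac{\partial H}{\partial R_I^{\mu}} \right| \frac{\partial \Psi}{\partial R_J^{\nu}} \right\rangle . \tag{7.14}$$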


In this case, the derivative or response of the wavefunction $\partial \Psi / \partial R_J^{\nu}$ to a displacement enters explicitly. Formally, this can be generalized in the 2n + 1 theorem [Gonze and Vigneron, Phys. Rev. B 39, 13120 (1989)]: Evaluating the (2n+1)th derivative of the Born-Oppenheimer surface requires knowledge of the nth derivative of the wavefunction (density). Essentially, this problem can be tackled by means of perturbation theory [Baroni, de Gironcoli, and Dal Corso, Rev. Mod. Phys. 73, 515 (2001)]. In this lecture, we will discuss an alternative route that exploits the three-dimensional analog to Eq. (7.7), i.e.,

$$\Phi_{IJ}^{\mu\nu} \approx -\frac{F_I^{\mu}(\mathbf{R}_1^{\circ}, \dots, \mathbf{R}_J^{\circ} + \delta s_J^{\nu}, \dots, \mathbf{R}_M^{\circ}) - \overbrace{F_I^{\mu}(\mathbf{R}_1^{\circ}, \dots, \mathbf{R}_J^{\circ}, \dots, \mathbf{R}_M^{\circ})}^{=0}}{\delta s_J^{\nu}} . \tag{7.15}$$

In this finite difference approach, the second derivative can be obtained numerically by simply displacing an atom slightly from its equilibrium position and monitoring the arising forces on this and all other atoms. This approach will be discussed in more detail in the exercises.

At this stage, this force point of view highlights that many elements of the second derivative matrix will vanish: A small displacement of atom I will only lead to non-negligible forces on the atoms in its immediate vicinity. Atoms further away will not notice the displacement, and hence the second derivative between these atoms J and the initially displaced atom I will be zero:³

$$\Phi_{IJ}^{\mu\nu} = 0 \quad \text{for} \quad |\mathbf{R}_I^{\circ} - \mathbf{R}_J^{\circ}| \gg a_0 , \tag{7.16}$$

where $a_0$ is the lattice parameter. Very often, one even finds that the values of $\Phi_{IJ}^{\mu\nu}$ become negligible for $|\mathbf{R}_I^{\circ} - \mathbf{R}_J^{\circ}| > 2 a_0$. Accordingly, only a finite number of differences (7.15), i.e., only a finite number of neighbours J, need to be inspected for each atom I.

Despite these simplifications, this numerical procedure is at this stage unfortunately still intractable, because we would have to do it for each I of the M atoms in the solid (i.e. a total of 3M times). With $M \sim 10^{23}$, this is unfeasible, and we need to exploit further properties of solids to reduce the problem. Obviously, just like in the case of the electronic structure problem, a much more efficient use of the periodicity of the lattice must be made. Although the lattice vibrations break the perfect periodicity of the static crystal, we can still exploit the translational invariance of the lattice by assuming that periodic images of the unit cell behave in a similar fashion, i.e., break the local symmetry in a similar fashion. Most clearly, this can be understood and derived by inspecting the equations of motion following from the general form of Eq. (7.10).
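The finite difference route to the force constants can be illustrated on a toy system. The following sketch is not the procedure of any particular DFT code: it assumes a periodic one-dimensional chain of six atoms with nearest-neighbour Morse bonds (a hypothetical model whose forces are known analytically), displaces one atom at a time, and builds the Hessian from the change in forces, using that the force constants are the negative derivative of the forces. Function names and parameters are illustrative.

```python
import numpy as np


def chain_forces(x, cell_length, depth=1.0, alpha=1.0, r0=1.0):
    """Forces on a periodic 1D chain with nearest-neighbour Morse bonds."""
    n = len(x)
    forces = np.zeros(n)
    for i in range(n):
        j = (i + 1) % n                    # right-hand neighbour of atom i
        r = x[j] - x[i]
        if j == 0:
            r += cell_length               # bond crossing the periodic boundary
        e = np.exp(-alpha * (r - r0))
        dphi = 2.0 * depth * alpha * (1.0 - e) * e   # derivative of the pair potential
        forces[i] += dphi                  # a stretched bond pulls i towards j
        forces[j] -= dphi                  # ... and j towards i
    return forces


def force_constants(equilibrium, cell_length, delta=1e-4):
    """Hessian Phi[I, J] from one-sided finite differences of the forces."""
    x0 = np.asarray(equilibrium, dtype=float)
    f0 = chain_forces(x0, cell_length)     # vanishes at the equilibrium geometry
    phi = np.zeros((len(x0), len(x0)))
    for j in range(len(x0)):
        displaced = x0.copy()
        displaced[j] += delta              # displace atom J only
        phi[:, j] = -(chain_forces(displaced, cell_length) - f0) / delta
    return phi


if __name__ == "__main__":
    np.set_printoptions(precision=3, suppress=True)
    print(force_constants(np.arange(6.0), cell_length=6.0))
```

In this model only the displaced atom and its two neighbours feel a force, so the resulting matrix is banded, which mirrors the short range of the force constants discussed above. A one-sided difference leaves a small asymmetry of order delta in the off-diagonal elements; central differences or a symmetrization step would reduce it.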

7.2.3 Classical Equations of Motion

Before further analyzing the quantum-mechanical problem, it is instructive and more intuitive to first discuss the classical Hamilton function corresponding to Eq. (7.10). We will see that the classical treatment emphasizes more the wave-like character of the lattice vibrations, whereas a particle (corpuscular) character follows from the quantum-mechanical treatment. This is not new, but just another example of the wave-particle dualism known, e.g., from light (electromagnetic waves vs. photons).

³ In the case of polar crystals with ionic bonding and thus long-range Coulomb interactions this assumption does not hold. In that case, long-range interactions need to be explicitly accounted for, see for instance Baroni, de Gironcoli, and Dal Corso, Rev. Mod. Phys. 73, 515 (2001).

In the classical case and knowing the structure of the Hamilton function from Eq. (7.10), the total energy (as sum of kinetic and potential energy) can be written as

$$E = \sum_{I=1}^{M} \frac{M_{\mathrm{nu},I}}{2} \dot{\mathbf{s}}_I^{2} + V^{\mathrm{BO}}(\{\mathbf{R}_I^{\circ}\}) + \frac{1}{2} \sum_{I,J=1}^{M} \sum_{\mu,\nu=1}^{3} s_I^{\mu} \Phi_{IJ}^{\mu\nu} s_J^{\nu} \tag{7.17}$$

in the harmonic approximation. Accordingly, the classical equations of motion for the nuclei can be written as

$$M_{\mathrm{nu},I} \ddot{s}_I^{\mu} = F_I^{\mu} = -\sum_{J,\nu} \Phi_{IJ}^{\mu\nu} s_J^{\nu} , \tag{7.18}$$

where the right hand side represents the µ-component of the force vector acting on atom I of mass $M_{\mathrm{nu},I}$ and the double dot above s represents the second time derivative. As a first step, we will get rid of the time dependence by using the well-known ansatz

$$s_I^{\mu}(t) = u_I^{\mu} e^{i \omega t} \tag{7.19}$$

for this harmonic second-order differential equation. Inserting Eq. (7.19) into Eq. (7.18) leads to

$$M_{\mathrm{nu},I} \omega^2 u_I^{\mu} = \sum_{J,\nu} \Phi_{IJ}^{\mu\nu} u_J^{\nu} . \tag{7.20}$$

This is a system of coupled algebraic equations with 3M unknowns (still with $M \sim 10^{23}$!). As already remarked above, it is hence not directly tractable in this form, and we exploit that the solid has a periodic lattice structure in its equilibrium geometry (just like in the electronic case). Correspondingly, we decompose the coordinate $\mathbf{R}_I$ into a Bravais lattice vector $\mathbf{L}_n$ (pointing to the unit cell) and a vector $\mathbf{R}_{\alpha}$ pointing to atom α of the basis within the unit cell,

$$R_I^{\mu} = L_n^{\mu} + R_{\alpha}^{\mu} . \tag{7.21}$$

Formally, we can therefore replace the index I over all atoms in the solid by the index pair (n, α) running over the unit cells and different atoms in the basis. With this, the matrix of second derivatives takes the form

$$\Phi_{IJ}^{\mu\nu} = \Phi_{\alpha\beta}^{\mu\nu}(n, n') . \tag{7.22}$$

In this form, we can now start to exploit the periodicity of the Bravais lattice. From the translational invariance of the lattice it follows immediately that

$$\Phi_{\alpha\beta}^{\mu\nu}(n, n') = \Phi_{\alpha\beta}^{\mu\nu}(n - n') . \tag{7.23}$$

This suggests that $u_I^{\mu}$ may be describable by means of a vibration

$$u_{\alpha}^{\mu}(\mathbf{L}_n) = \frac{1}{\sqrt{M_{\mathrm{nu},\alpha}}} c_{\alpha}^{\mu} e^{i \mathbf{q} \mathbf{L}_n} \tag{7.24}$$


where c is the amplitude of displacement and q is a vector in reciprocal space.4 Using thisansatz for Eq. (7.20), we obtain

ω2cµα =∑

βν

∑

n′

1√MαMβ

Φµναβ(n− n′)eiq(Ln−Ln′ )

cνβ . (7.25)

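Note that the part in curly brackets depends only formally on the position $\mathbf{R}_I$ (through the index $n$). Since all unit cells are equivalent, $\mathbf{R}_I$ merely sets an arbitrary origin for the summation over the whole Bravais lattice. The part in curly brackets,

$$D^{\mu\nu}_{\alpha\beta}(\mathbf{q}) = \frac{1}{\sqrt{M_\alpha M_\beta}} \sum_{n'} \Phi^{\mu\nu}_{\alpha\beta}(n - n')\, e^{i\mathbf{q}\cdot(\mathbf{L}_n - \mathbf{L}_{n'})} \,, \tag{7.26}$$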

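is called the dynamical matrix, and its definition allows us to rewrite the homogeneous linear system of equations in a very compact way:

$$\omega^{2} c^{\mu}_{\alpha} = \sum_{\beta\nu} D^{\mu\nu}_{\alpha\beta}(\mathbf{q})\, c^{\nu}_{\beta} \quad\Rightarrow\quad \omega^{2}\, \mathbf{c} = D(\mathbf{q})\, \mathbf{c} \,. \tag{7.27}$$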

In the last step, we have introduced the vectorial notation $\mathbf{c} = (c^{x}_{1}, c^{y}_{1}, c^{z}_{1}, c^{x}_{2}, \cdots)$ to highlight that we are dealing with an eigenvalue problem here.

Compared to the original equation of motion, cf. Eq. (7.20), a dramatic simplification has been reached. Due to the lattice symmetry, we have managed to reduce the system of $3M$ coupled equations to $3M_p$ equations, where $M_p$ is the number of atoms in the basis of the unit cell. The price we pay is that we have to solve the equations of motion for each $\mathbf{q}$ anew. This is comparatively easy, though, since the effort scales linearly with the number of $\mathbf{q}$-points rather than cubically like the matrix inversion, and it has to be done only for a few $\mathbf{q}$ (since we will see that the $\mathbf{q}$-dependent functions are quite smooth). From Eq. (7.27), one gets a condition for the existence of non-trivial solutions to the eigenvalue problem, i.e., solutions must satisfy

$$\det\left( D(\mathbf{q}) - \omega^{2}\, \mathbf{1} \right) = 0 \,. \tag{7.28}$$

This problem has $3M_p$ eigenvalues, which are functions of the wave vector $\mathbf{q}$,

$$\omega = \omega_i(\mathbf{q}) \,, \qquad i = 1, 2, \ldots, 3M_p \,, \tag{7.29}$$

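and for each of these eigenvalues $\omega_i(\mathbf{q})$, Eq. (7.27) provides the eigenvector

$$\mathbf{c}_i(\mathbf{q}) = \left( c^{x}_{i,1}(\mathbf{q}),\, c^{y}_{i,1}(\mathbf{q}),\, c^{z}_{i,1}(\mathbf{q}),\, c^{x}_{i,2}(\mathbf{q}),\, \cdots \right) . \tag{7.30}$$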

These eigenvectors are determined up to a constant factor, and this factor is finally obtained by requiring all eigenvectors to be normalized (and orthogonal to each other).

In conclusion, we find that the displacement $s_I(t)$ of atom $I = (n, \alpha)$ away from the equilibrium position at time $t$ is given by

$$s^{\mu}_{I}(t) = \frac{1}{\sqrt{M_{\mathrm{nu},\alpha}}} \sum_{i,\mathbf{q}} A_i(\mathbf{q})\, c^{\mu}_{i,\alpha}(\mathbf{q})\, e^{i(\mathbf{q}\cdot\mathbf{L}_n - \omega_i(\mathbf{q}) t)} \,, \tag{7.31}$$

⁴ To distinguish electronic and nuclear degrees of freedom, $\mathbf{q}$ and not $\mathbf{k}$ is used in the description of lattice vibrations.


whereby the amplitudes $A_i(\mathbf{q})$ are determined by the initial conditions. With this, the system of $3M$ coupled oscillators is transformed into $3M$ decoupled (but “collective”) oscillators, and the solutions (the so-called normal modes) correspond to waves that propagate throughout the whole crystal. The resemblance of these waves to what we traditionally think of as oscillators is thus only formal. The normal modes form a basis for the description of lattice vibrations, i.e., any arbitrary displacement of the atoms can now be written as an expansion in these special solutions.
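The construction of the dynamical matrix in Eq. (7.26) can be illustrated with a minimal Python sketch. It is restricted to one dimension, and the function name `dynamical_matrix`, the layout of `force_constants`, and the nearest-neighbor spring model in the example are assumptions chosen for illustration; they are not part of the lecture.

```python
import cmath
import math


def dynamical_matrix(force_constants, masses, q, a):
    """Return D(q) as a nested list for a 1D crystal, following Eq. (7.26).

    force_constants[alpha][beta] maps the cell offset m = n - n' to the
    force constant Phi_{alpha beta}(n - n'); masses[alpha] is the mass of
    basis atom alpha; a is the lattice constant (so L_m = m * a).
    """
    n_basis = len(masses)
    D = [[0j] * n_basis for _ in range(n_basis)]
    for alpha in range(n_basis):
        for beta in range(n_basis):
            total = 0j
            for m, phi in force_constants[alpha][beta].items():
                total += phi * cmath.exp(1j * q * m * a)
            D[alpha][beta] = total / math.sqrt(masses[alpha] * masses[beta])
    return D


# Hypothetical example: one atom per cell, nearest-neighbor springs of strength f
f, M, a, q = 1.5, 2.0, 1.0, 0.7
D = dynamical_matrix([[{0: 2 * f, 1: -f, -1: -f}]], [M], q, a)
omega = math.sqrt(D[0][0].real)  # single branch: omega^2 is the only eigenvalue
```

For a basis with several atoms, the returned matrix would still have to be diagonalized at each wave number; the sketch only assembles it.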

7.2.4 Comparison between εn(k) and ωi(q)

We have derived the normal modes of lattice vibrations by exploiting the periodicity of the ideal crystal lattice, in a similar way as we did for the electron waves in chapters 4 and 5. Since it is only the symmetry that matters, the dispersion relations $\omega_i(\mathbf{q})$ will therefore exhibit a number of properties equivalent to those of the electronic eigenvalues $\epsilon_n(\mathbf{k})$. We will not derive these general properties again, but simply list them here (check chapters 4 and 5 to see how they emerge from the periodic properties of the lattice):

1. $\omega_i(\mathbf{q})$ is periodic in $\mathbf{q}$-space. The discussion can therefore always be restricted to the first Brillouin zone.

2. $\epsilon_n(\mathbf{k})$ and $\omega_i(\mathbf{q})$ have the same symmetry within the Brillouin zone. In addition to the space group symmetry of the lattice, there is also time inversion symmetry, yielding $\omega_i(\mathbf{q}) = \omega_i(-\mathbf{q})$.

3. Due to the use of periodic boundary conditions, the number of $\mathbf{q}$-values is finite. When the solid is composed of $M/M_p$ unit cells, there are $M/M_p$ different $\mathbf{q}$-values in the Brillouin zone. Since $i = 1 \ldots 3M_p$, there are in total $3M_p \times (M/M_p) = 3M$ values for $\omega_i(\mathbf{q})$, i.e., exactly the number of degrees of freedom of the crystal lattice. When speaking of solids, $M \sim 10^{23}$, so that $\mathbf{q}$ and the dispersion relations $\omega_i(\mathbf{q})$ become essentially continuous. For practical calculations using supercells of finite size, however, this aspect has to be kept in mind, as discussed in more detail in the exercises.

4. $\omega_i(\mathbf{q})$ is an analytic function of $\mathbf{q}$ within the Brillouin zone. Whereas the band index $n$ in $\epsilon_n(\mathbf{k})$ can take arbitrarily many values, $i$ in $\omega_i(\mathbf{q})$ is restricted to $3M_p$ values. We will see below that this is connected to the fact that electrons are fermions and lattice vibrations (phonons) are bosons.
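Properties 1 and 3 of the list above can be checked numerically for a one-dimensional chain. The following Python sketch is only an illustration: the function names `allowed_q_values` and `lattice_phase` and the chain of 8 cells are assumptions made here. It enumerates the wave numbers allowed by periodic boundary conditions and shows that the phase factor $e^{iqL_n}$ is unchanged when $q$ is shifted by a reciprocal lattice vector.

```python
import cmath
import math


def allowed_q_values(n_cells, a):
    """Wave numbers in (-pi/a, pi/a] compatible with periodic boundary
    conditions on a 1D chain of n_cells unit cells (lattice constant a)."""
    m_min = -((n_cells - 1) // 2)
    m_max = n_cells // 2
    return [2.0 * math.pi * m / (n_cells * a) for m in range(m_min, m_max + 1)]


def lattice_phase(q, n, a):
    """Phase factor exp(i q L_n) with L_n = n * a."""
    return cmath.exp(1j * q * n * a)


n_cells, a = 8, 1.0             # hypothetical chain: 8 unit cells
q_values = allowed_q_values(n_cells, a)
G = 2.0 * math.pi / a           # shortest reciprocal lattice vector in 1D
```

The number of entries in `q_values` equals the number of unit cells, which is the one-dimensional counterpart of the counting argument in item 3.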

7.2.5 Simple One-dimensional Examples

The difficulty of treating and understanding lattice vibrations is primarily one of bookkeeping. The many indices present in the general formulae of sections 7.2.2 and 7.2.3 make it easy to forget the basic (and in principle simple) physics contained in them. It is therefore very useful (albeit academic) to consider toy model systems to understand the idea behind normal modes. The simplest possible system representing an infinite periodic lattice (i.e., with periodic boundary conditions) is a linear chain of atoms, and we will see that with the conceptual understanding obtained for one dimension it will be straightforward to generalize to real three-dimensional solids.

Figure 7.1: Linear chain with one atomic species (mass $M$ and lattice constant $a$). The sketch marks the displacements $s(n-1)$, $s(n)$, $s(n+1)$, $s(n+2)$ and the unit cell of length $a$.

7.2.5.1 Linear Chain with One Atomic Species

Consider an infinite one-dimensional linear chain of atoms of identical mass $M$, connected by springs with spring constant $f$, as shown in Fig. 7.1. The distance between two atoms in their equilibrium positions is $a$, and $s(n)$ is the displacement of the $n$th atom from its equilibrium position. In analogy to Eq. (7.18), the Newton equation of motion for this system is written as

$$M\,\ddot{s}(n) = -f\,[s(n) - s(n+1)] + f\,[s(n-1) - s(n)] \,, \tag{7.32}$$

i.e., the sum over all atoms reduces to the left and right nearest neighbors, and the second derivative of the PES corresponds simply to the spring constant. We solve this problem as before by treating the displacement as a sinusoidally varying wave,

$$s(n) = \frac{1}{\sqrt{M}}\, c\, e^{i(qan - \omega t)} \,, \tag{7.33}$$

yielding

$$\omega^{2} M = f \left( 2 - e^{-iqa} - e^{iqa} \right) . \tag{7.34}$$

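Solving for $\omega$ and using $2 - e^{-iqa} - e^{iqa} = 2 - 2\cos(qa) = 4\sin^{2}(qa/2)$, one obtains

$$\omega(q) = 2 \sqrt{\frac{f}{M}}\, \left| \sin\left( \frac{qa}{2} \right) \right| . \tag{7.35}$$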
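As a quick consistency check, one can insert the plane wave of Eq. (7.33) with the frequency of Eq. (7.35) back into the equation of motion, Eq. (7.32). The Python sketch below does this numerically; the function names and the sample parameter values are assumptions made here for illustration.

```python
import cmath
import math


def omega_monatomic(q, f, M, a):
    """Dispersion of the monatomic chain, Eq. (7.35)."""
    return 2.0 * math.sqrt(f / M) * abs(math.sin(q * a / 2.0))


def plane_wave(n, t, q, f, M, a):
    """Displacement of atom n at time t according to Eq. (7.33) with c = 1."""
    w = omega_monatomic(q, f, M, a)
    return cmath.exp(1j * (q * a * n - w * t)) / math.sqrt(M)


def residual(q, f, M, a, n=0, t=0.3):
    """|left-hand side - right-hand side| of Eq. (7.32) for the plane wave."""
    w = omega_monatomic(q, f, M, a)
    s_prev = plane_wave(n - 1, t, q, f, M, a)
    s_here = plane_wave(n, t, q, f, M, a)
    s_next = plane_wave(n + 1, t, q, f, M, a)
    lhs = -M * w ** 2 * s_here  # M times the second time derivative
    rhs = -f * (s_here - s_next) + f * (s_prev - s_here)
    return abs(lhs - rhs)
```

A residual at the level of floating-point round-off for arbitrary $q$, $n$, and $t$ indicates that the ansatz indeed solves the equation of motion.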

Alternatively, one can solve this problem using the dynamical matrix formalism given by Eq. (7.27) and discussed in section 7.2.3. Formally, the potential energy terms that depend on $s(n)$ are

$$V = \frac{f}{2}\, [s(n-1) - s(n)]^{2} + \frac{f}{2}\, [s(n) - s(n+1)]^{2} \,, \tag{7.36}$$

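which yields the second derivatives

$$\frac{\partial^{2} V}{\partial s(n)\, \partial s(n-1)} = \frac{\partial^{2} V}{\partial s(n)\, \partial s(n+1)} = -f \quad\text{and}\quad \frac{\partial^{2} V}{\partial s(n)\, \partial s(n)} = 2f \,. \tag{7.37}$$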

Figure 7.2: Dispersion relation $\omega(q)$ for the linear chain with one atomic species. [Plot: $\omega(q)$ in units of $(f/M)^{1/2}$, ranging from 0 to 2, versus $q$ from $-\pi/a$ to $+\pi/a$.]

Accordingly, the full Hessian matrix has the structure

$$\Phi = \left( \begin{array}{ccccccccccc}
 & & & & & \vdots & & & & & \\
0 & \cdots & 0 & -f & 2f & -f & 0 & 0 & 0 & \cdots & 0 \\
0 & \cdots & 0 & 0 & -f & 2f & -f & 0 & 0 & \cdots & 0 \\
0 & \cdots & 0 & 0 & 0 & -f & 2f & -f & 0 & \cdots & 0 \\
 & & & & & \vdots & & & & &
\end{array} \right) . \tag{7.38}$$

Note that we have omitted the Cartesian indices $\mu, \nu$ in this one-dimensional case. Since this system has only one atom in the unit cell, it is sufficient to focus on one row of the Hessian ($\alpha = 1$). This yields the dynamical “matrix” with one entry:

$$D(q) = \frac{f}{M} \left( 2 - e^{iqa} - e^{-iqa} \right) . \tag{7.39}$$

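Accordingly, we get the dispersion relation

$$\omega(q) = 2 \sqrt{\frac{f}{M}}\, \left| \sin\left( \frac{qa}{2} \right) \right| , \tag{7.40}$$

the “eigenvector” $c = 1$, and thus the displacement field

$$s(n, t) = \frac{1}{\sqrt{M}} \sum_{q} A(q)\, e^{i(qan - \omega(q) t)} \,. \tag{7.41}$$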

As expected, $\omega$ is a periodic function of $q$ and symmetric with respect to $q$ and $-q$. The first period lies between $q = -\pi/a$ and $q = +\pi/a$ (first Brillouin zone), and the form of the dispersion is plotted in Fig. 7.2. Equation (7.41) gives the explicit solutions for the displacement pattern in the linear chain for any wave vector $q$. They describe waves propagating along the chain with phase velocity $\omega/q$ and group velocity $v_g = \partial\omega/\partial q$.

Let us look in more detail at the two limiting cases $q \to 0$ and $q = \pi/a$ (i.e., the center and the border of the Brillouin zone). For $q \to 0$, we can approximate $\omega(q)$ given by Eq. (7.35) by the leading term of its small-$q$ Taylor expansion ($\sin(x) \approx x$ for $x \to 0$),

$$\omega \approx \left( a \sqrt{\frac{f}{M}} \right) q \,. \tag{7.42}$$

Figure 7.3: Snapshot of the displacement pattern in the linear chain at $q = +\pi/a$. Neighboring atoms are displaced in opposite directions: $s(n-1) = -s(n)$, $s(n+1) = -s(n)$, $s(n+2) = s(n)$, $s(n+3) = -s(n)$, $s(n+4) = s(n)$.

$\omega$ is thus linear in $q$, and the proportionality constant is simply the group velocity $v_g = a\sqrt{f/M}$, which is in turn identical to the phase velocity and independent of frequency. For the displacement pattern, we obtain

$$s(n, t) = \frac{A(q)}{\sqrt{M}}\, e^{iq(an - v_g t)} \tag{7.43}$$

in this limit, i.e., at these small $q$-values (long wavelengths) the phonons propagate simply as waves with constant speed $v_g$ (the ordinary sound speed in this 1D crystal!). This is the same result we would have obtained within linear elasticity theory, which is understandable because for such very long wavelength displacements the atomic structure and spacing are less important (the atoms move almost identically on a short length scale). Fundamentally, the existence of a branch of solutions whose frequency vanishes as $q$ vanishes results from the symmetry requiring the energy of the crystal to remain unchanged when all atoms are displaced by an identical amount. We will therefore always obtain such modes, i.e., ones with $\omega(0) = 0$, regardless of whether we consider more complex interactions (e.g., spring constants to second nearest neighbors) or higher dimensions. Since their dispersion is linear in $q$ close to the center of the Brillouin zone, which is characteristic of sound waves, these modes are commonly called acoustic branches.⁵

One of the characteristic features of waves in discrete media, however, is that such a linear behaviour ceases to hold at wavelengths short enough to be comparable to the interparticle spacing. In the present case, $\omega$ falls below $v_g q$ as $q$ increases, and the dispersion curve bends over and becomes flat (i.e., the group velocity drops to zero). At the border of the Brillouin zone (for $q = \pi/a$), we then obtain

$$s(n, t) = \frac{1}{\sqrt{M}}\, e^{i\pi n}\, e^{-i\omega t} \,. \tag{7.44}$$

Note that $e^{i\pi n} = 1$ for even $n$, and $e^{i\pi n} = -1$ for odd $n$. This means that neighboring atoms vibrate against each other, as shown in Fig. 7.3. This also explains why the highest frequency occurs in this case. Again, the flattening of the dispersion relation is determined entirely by symmetry: From the repetition in the next Brillouin zone and with kinks forbidden, $v_g = \partial\omega/\partial q = 0$ must result at the Brillouin zone boundary, regardless of how much more complex and realistic we make our model.
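The two limits discussed above, the sound speed $a\sqrt{f/M}$ at long wavelengths and the vanishing group velocity at the zone boundary, can be reproduced with a simple finite-difference derivative of Eq. (7.35). This is a sketch only; the function names, the step size `h`, and the unit parameter values are assumptions made here.

```python
import math


def omega(q, f, M, a):
    """Dispersion of the monatomic chain, Eq. (7.35)."""
    return 2.0 * math.sqrt(f / M) * abs(math.sin(q * a / 2.0))


def group_velocity(q, f, M, a, h=1e-6):
    """Central-difference estimate of d(omega)/dq."""
    return (omega(q + h, f, M, a) - omega(q - h, f, M, a)) / (2.0 * h)


f, M, a = 1.0, 1.0, 1.0                                  # hypothetical units
sound_speed = a * math.sqrt(f / M)                       # slope of Eq. (7.42)
v_long_wavelength = group_velocity(1e-3, f, M, a)        # close to sound_speed
v_zone_boundary = group_velocity(math.pi / a, f, M, a)   # close to zero
```

Evaluating `group_velocity` on a grid of $q$-values between 0 and $\pi/a$ shows the smooth decrease from the sound speed to zero.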

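Figure 7.4: Linear chain with two atomic species (masses $M_1$ and $M_2$). The lattice constant is $a$ and the interatomic distance is $a/2$. The sketch marks the displacements $s(1,n)$, $s(2,n)$, $s(1,n+1)$, $s(2,n+1)$, with the two basis atoms located at $-a/4$ and $+a/4$ within the unit cell.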

7.2.5.2 Linear Chain with Two Atomic Species

Next, we consider the case shown in Fig. 7.4, where there are two different atomic species in the linear chain, i.e., the basis of the unit cell consists of two atoms. The equations of motion for the two atoms follow as a direct generalization of Eq. (7.32):

$$M_1\, \ddot{s}_1(n) = -f \left( 2 s_1(n) - s_2(n) - s_2(n-1) \right) \tag{7.45}$$

$$M_2\, \ddot{s}_2(n) = -f \left( 2 s_2(n) - s_1(n+1) - s_1(n) \right) . \tag{7.46}$$

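Hence, the same wave-like ansatz should work as before:

$$s_1(n) = \frac{1}{\sqrt{M_1}}\, c_1\, e^{i\left( q\left( n - \frac{1}{4} \right) a - \omega t \right)} \tag{7.47}$$

$$s_2(n) = \frac{1}{\sqrt{M_2}}\, c_2\, e^{i\left( q\left( n + \frac{1}{4} \right) a - \omega t \right)} \,. \tag{7.48}$$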

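This leads to

$$-\omega^{2} c_1 = -\frac{2f}{M_1}\, c_1 + \frac{2f}{\sqrt{M_1 M_2}}\, c_2 \cos\left( \frac{qa}{2} \right) \tag{7.49}$$

$$-\omega^{2} c_2 = -\frac{2f}{M_2}\, c_2 + \frac{2f}{\sqrt{M_1 M_2}}\, c_1 \cos\left( \frac{qa}{2} \right) , \tag{7.50}$$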

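i.e., in this case the amplitudes of the two atoms are still coupled to each other. The eigenvalues follow from the condition that the determinant vanishes,

$$\begin{vmatrix}
\dfrac{2f}{M_1} - \omega^{2} & -\dfrac{2f}{\sqrt{M_1 M_2}} \cos\left( \dfrac{qa}{2} \right) \\[2ex]
-\dfrac{2f}{\sqrt{M_1 M_2}} \cos\left( \dfrac{qa}{2} \right) & \dfrac{2f}{M_2} - \omega^{2}
\end{vmatrix} = 0 \,, \tag{7.51}$$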

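which yields the two solutions

$$\omega^{2}_{\pm} = f \left( \frac{1}{M_1} + \frac{1}{M_2} \right) \pm f \sqrt{ \left( \frac{1}{M_1} + \frac{1}{M_2} \right)^{2} - \frac{4}{M_1 M_2} \sin^{2}\left( \frac{qa}{2} \right) } = \frac{f}{M} \pm f \sqrt{ \frac{1}{M^{2}} - \frac{4}{M_1 M_2} \sin^{2}\left( \frac{qa}{2} \right) } \,, \tag{7.52}$$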

⁵ Notable exceptions do exist: For instance, graphene exhibits a parabolic dispersion in its acoustic modes close to $\Gamma$.


with the reduced mass $M = \left( \frac{1}{M_1} + \frac{1}{M_2} \right)^{-1}$.

Again, one can also solve this problem using the dynamical matrix formalism given by Eq. (7.27) and discussed in section 7.2.3. Formally, the potential energy terms that depend on $s_1(n)$ and $s_2(n)$ are given by

$$V = \frac{f}{2}\, [s_2(n-1) - s_1(n)]^{2} + \frac{f}{2}\, [s_1(n) - s_2(n)]^{2} + \frac{f}{2}\, [s_2(n) - s_1(n+1)]^{2} \,, \tag{7.53}$$

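which yields the second derivatives

$$\frac{\partial^{2} V}{\partial s_1(n)\, \partial s_2(n-1)} = \frac{\partial^{2} V}{\partial s_1(n)\, \partial s_2(n)} = \frac{\partial^{2} V}{\partial s_2(n)\, \partial s_1(n)} = \frac{\partial^{2} V}{\partial s_2(n)\, \partial s_1(n+1)} = -f \,,$$

$$\frac{\partial^{2} V}{\partial s_1(n)\, \partial s_1(n)} = \frac{\partial^{2} V}{\partial s_2(n)\, \partial s_2(n)} = 2f \,. \tag{7.54}$$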

Accordingly, the full Hessian matrix exhibits the structure

$$\Phi = \left( \begin{array}{ccccccccccc}
 & & & & & \vdots & & & & & \\
0 & \cdots & 0 & -f & 2f & -f & 0 & 0 & 0 & \cdots & 0 \\
0 & \cdots & 0 & 0 & -f & 2f & -f & 0 & 0 & \cdots & 0 \\
0 & \cdots & 0 & 0 & 0 & -f & 2f & -f & 0 & \cdots & 0 \\
 & & & & & \vdots & & & & &
\end{array} \right) . \tag{7.55}$$

Again, the Cartesian indices $\mu, \nu$ have been omitted for this one-dimensional case. In contrast to our earlier example, this system features two atoms in the unit cell, so that we now get a dynamical matrix of dimension two:

$$D(q) = \begin{pmatrix}
\dfrac{2f}{M_1} & -\dfrac{f}{\sqrt{M_1 M_2}} \left( 1 + e^{-iqa} \right) \\[2ex]
-\dfrac{f}{\sqrt{M_1 M_2}} \left( 1 + e^{+iqa} \right) & \dfrac{2f}{M_2}
\end{pmatrix} . \tag{7.56}$$

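The respective eigenvalues are the solutions of

$$\omega^{4} - 2f \underbrace{\left( \frac{1}{M_1} + \frac{1}{M_2} \right)}_{M^{-1}} \omega^{2} + \frac{4f^{2}}{M_1 M_2} \sin^{2}\left( \frac{qa}{2} \right) = 0 \,, \tag{7.57}$$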

which yields the dispersion relation

$$\omega_{\pm}(q) = \sqrt{ \frac{f}{M} \pm f \sqrt{ \frac{1}{M^{2}} - \frac{4}{M_1 M_2} \sin^{2}\left( \frac{qa}{2} \right) } } \,. \tag{7.58}$$
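The equivalence of the dynamical matrix of Eq. (7.56) and the closed-form dispersion of Eq. (7.58) can be checked numerically. The Python sketch below uses the standard eigenvalue formula for a Hermitian 2×2 matrix; the function names and the sample masses are assumptions made here for illustration.

```python
import cmath
import math


def diatomic_branches(q, f, M1, M2, a):
    """Return (omega_minus, omega_plus) from the 2x2 matrix of Eq. (7.56)."""
    d11 = 2.0 * f / M1
    d22 = 2.0 * f / M2
    d12 = -f / math.sqrt(M1 * M2) * (1.0 + cmath.exp(-1j * q * a))
    # Eigenvalues of the Hermitian matrix [[d11, d12], [conj(d12), d22]]
    mean = 0.5 * (d11 + d22)
    radius = math.sqrt((0.5 * (d11 - d22)) ** 2 + abs(d12) ** 2)
    lam_minus = max(mean - radius, 0.0)  # guard against negative round-off
    lam_plus = mean + radius
    return math.sqrt(lam_minus), math.sqrt(lam_plus)


def closed_form_branches(q, f, M1, M2, a):
    """Return (omega_minus, omega_plus) from the closed form, Eq. (7.58)."""
    inv_mu = 1.0 / M1 + 1.0 / M2  # inverse reduced mass
    arg = inv_mu ** 2 - 4.0 / (M1 * M2) * math.sin(q * a / 2.0) ** 2
    root = math.sqrt(max(arg, 0.0))
    return (math.sqrt(max(f * inv_mu - f * root, 0.0)),
            math.sqrt(f * inv_mu + f * root))
```

Calling both functions with the same arguments, for instance `diatomic_branches(0.3, 1.0, 1.0, 3.0, 1.0)`, should give the same pair of frequencies up to round-off.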

We therefore obtain two branches for $\omega(q)$, the dispersion of which is shown in Fig. 7.5. As apparent, one branch goes to zero for $q \to 0$, $\omega_-(0) = 0$, as in the case of the monoatomic chain, i.e., we again have an acoustic branch. The other branch, in contrast, exhibits a (high) finite value $\omega_+(0) = \sqrt{2f/M}$ for $q \to 0$ and shows a much weaker dispersion over the Brillouin zone. Whereas in the acoustic branch both atoms within the unit cell move in concert, they vibrate against one another in opposite directions in the other solution. In ionic crystals (i.e., where there are opposite charges connected to the two species), these modes can therefore couple to electromagnetic radiation and are responsible for much of the characteristic optical behavior of such crystals. Correspondingly, they are generally (i.e., not only in ionic crystals) called optical modes, and their dispersion is called the optical branch.

Figure 7.5: Dispersion relations $\omega(q)$ for the linear chain with two atomic species (the specific values are for $M_1 = 3M_2 = M$). Note the appearance of the optical branch not present in Fig. 7.2. [Plot: $\omega(q)$ in units of $(f/M)^{1/2}$, ranging from 0 to 2, versus $q$ from $-\pi/a$ to $+\pi/a$.]

At the border of the Brillouin zone at $q = \pi/a$, we finally obtain the values $\omega_+ = \sqrt{2f/M_1}$ and $\omega_- = \sqrt{2f/M_2}$ (for $M_1 < M_2$), i.e., as apparent from Fig. 7.5, a gap has opened up between the acoustic and the optical branch. The width of this gap depends only on the mass difference of the two atoms ($M_1$ vs. $M_2$) in this case.⁶ The gap closes for identical masses of the two species. This is comprehensible since then our model with equal spring constants and equal masses becomes identical to the previously discussed model with one atomic species: The optical branch is then nothing but the upper half of the acoustic branch, folded back into the half-sized Brillouin zone.
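The statements about the gap can be made concrete with a few lines of Python. This is a sketch under the assumptions of the model above (equal spring constants); the function names are chosen here for illustration. The last function is the monatomic dispersion of Eq. (7.35) evaluated for a chain with lattice constant $a/2$, which is what the diatomic chain reduces to for equal masses.

```python
import math


def zone_boundary_frequencies(f, M1, M2):
    """(acoustic, optical) frequencies of the diatomic chain at q = pi/a."""
    heavy, light = max(M1, M2), min(M1, M2)
    return math.sqrt(2.0 * f / heavy), math.sqrt(2.0 * f / light)


def gap_width(f, M1, M2):
    """Width of the gap between acoustic and optical branch at q = pi/a."""
    omega_acoustic, omega_optical = zone_boundary_frequencies(f, M1, M2)
    return omega_optical - omega_acoustic


def monatomic_half_spacing(q, f, M, a):
    """Eq. (7.35) for a monatomic chain with atomic spacing a/2."""
    return 2.0 * math.sqrt(f / M) * abs(math.sin(q * a / 4.0))
```

For equal masses, `gap_width` returns zero, and both zone-boundary frequencies coincide with `monatomic_half_spacing` at $q = \pi/a$, which is the folding argument in numerical form.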

7.2.6 Phonon Band Structures

With the conceptual understanding gained from the one-dimensional models, let us now proceed to look at the lattice vibrations of a real three-dimensional crystal. In principle, the formulae to describe the dispersion relations for the different solutions are already given in section 7.2.3. Most importantly, it is Eq. (7.28) that gives us these dispersions. In close analogy to the electronic case, the dispersions in their totality are called the phonon band structure. The crucial quantity is the dynamical matrix. The different branches of the band structure follow from the eigenvalues after diagonalizing this matrix.

This route is also followed in quantitative approaches to the phonon band structure problem, e.g., within density-functional theory. In the so-called direct, supercell, or frozen-phonon approach, one exploits the expression of the forces as given in Eq. (7.15) for the computation of the dynamical matrix, since in principle it is nothing more than a mass-weighted Fourier transform of the force constants. In practice, the different atoms in the unit cell are displaced by small amounts and the forces on all other atoms are recorded. In order to be able to compute the dynamical matrix with this procedure at different points in the Brillouin zone, larger supercells are employed and the (in principle equivalent) atoms within this larger periodicity are displaced differently, so as to account for the different phase behavior. Obviously, in this way only $\mathbf{q}$-points that are commensurate with the supercell periodicity under consideration can be accessed, and one runs into the problem that quite large supercells need to be computed in order to get the full dispersion curves. Approaches based on perturbation theory (linear response regime) are therefore also frequently employed. The details of the different methods go beyond the scope of the present lecture, but a good overview can for example be found in R.M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, Cambridge (2004).

Experimentally, phonon dispersion curves are primarily measured using inelastic neutron scattering, and Ashcroft/Mermin dedicate an entire chapter to the particularities of such measurements. Overall, phonon band structures obtained by experimental measurements and from DFT-LDA/GGA simulations typically agree extremely well these days, regardless of whether they are computed by a direct or a linear response method. Instead of discussing the details of either measurement or calculation, we rather proceed to some practical examples.

⁶ In more realistic cases, varying spring constants, e.g., $f_1$ and $f_2$, would also affect the gap.

Figure 7.6: DFT-LDA and experimental phonon band structure (left) and corresponding density of states (right) for fcc bulk Rh [from R. Heid, K.-P. Bohnen and W. Reichardt, Physica B 263, 432 (1999)].

If one has a mono-atomic basis in the three-dimensional unit cell, we expect from the one-dimensional model studies that only acoustic branches should be present in the phonon band structure. There are, however, three of these branches because the orientation of the polarization vector (i.e., the eigenvector belonging to the eigenvalue $\omega(\mathbf{q})$, cf. Eqs. (7.30) and (7.31)) matters in three dimensions.
In the linear chain, the displacement was only considered along the chain axis, and such modes are called longitudinal. If the atoms in the three-dimensional case vibrate perpendicular to the propagation direction of the wave, two further transverse modes are obtained, and longitudinal and transverse modes need not exhibit the same dispersion. Fig. 7.6 shows the phonon band structure for bulk Rh in the fcc structure and the phonon density of states (DOS) resulting from it (which is obtained analogously to the electronic structure case by integrating over the Brillouin zone). Particularly along the lines $\Gamma - X$ and $\Gamma - L$, which both run from the Brillouin zone center to the border, the dispersion of the bands qualitatively follows the form discussed for the linear chain model above and is thus conceptually comprehensible.

Figure 7.7: DFT-LDA and experimental phonon band structure (left) and corresponding density of states (right) for Si (upper panel) and AlSb (lower panel) in the diamond lattice. Note the opening up of the gap between acoustic and optical modes in the case of AlSb as a consequence of the large mass difference. 8 cm$^{-1}$ ≈ 1 meV [from P. Giannozzi et al., Phys. Rev. B 43, 7231 (1991)].

Turning to a poly-atomic basis, the anticipated new feature is the emergence of optical branches due to vibrations of the different atoms against each other. This is indeed apparent in the phonon band structure of Si in the diamond lattice shown in Fig. 7.7, and we again roughly find the shape that also emerged in the diatomic linear chain model for these branches (check again the $\Gamma - X$ and $\Gamma - L$ curves). Since there are in general $3M_p$ different branches for a basis of $M_p$ atoms, cf. section 7.2.4, $3M_p - 3$ optical branches result (i.e., in the present case of a diatomic basis, there are 3 optical branches, one longitudinal and two transverse). Although the diamond lattice has two inequivalent lattice sites (and thus represents a case with a diatomic basis), these sites are both occupied by the same atom type in bulk Si. From the discussion of the diatomic linear chain, we would therefore expect a vanishing gap between the upper end of the acoustic branches and the lower end of the optical ones. Comparing with Fig. 7.7, this is indeed the case. Looking at the phonon band structure of AlSb also shown in this figure, a huge gap can now, however, be discerned. Obviously, this is the result of the large mass difference between Al and Sb in this compound, showing that we can indeed understand most of the qualitative features of real phonon band structures from the analysis of the two simple linear chain models. In order to develop a rough feeling for the order of magnitude of lattice vibrational frequencies (or in turn the energies contained in these vibrations), it is helpful to keep the inverse variation of the phonon frequencies with mass in mind, cf. Eq. (7.26). For transition metals, we are talking about a range of 20-30 meV for the optical modes, cf. Fig. 7.6, while for lighter elements this becomes increasingly higher, cf. Fig. 7.7. First row elements like oxygen in solids (oxides) exhibit optical vibrational modes around 70-80 meV, whereas the acoustic modes go down to zero at the center of the Brillouin zone in all cases. Thus, all frequencies lie in the range of thermal energies, which already tells us how important these modes will be for the thermal behavior of the solid.
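The idea behind the direct (frozen-phonon) approach described in this section can be sketched for the monatomic linear chain. In the Python example below, a toy nearest-neighbor spring model stands in for the forces that a real calculation would take from DFT; this substitution, the function names, and the finite-difference step `delta` are assumptions made here for illustration. One atom of a periodic supercell is displaced, the forces on all atoms are recorded, and the resulting force constants are Fourier transformed at a commensurate wave number.

```python
import cmath
import math


def spring_forces(s, f):
    """Forces on a ring of atoms with nearest-neighbor springs, cf. Eq. (7.32).
    Toy stand-in for the forces of an electronic-structure calculation."""
    n = len(s)
    return [-f * (s[i] - s[(i + 1) % n]) + f * (s[i - 1] - s[i]) for i in range(n)]


def force_constant_row(n_atoms, f, delta=1e-3):
    """Phi_{0J} = -dF_J/ds_0 from central finite differences of the forces."""
    plus = [0.0] * n_atoms
    minus = [0.0] * n_atoms
    plus[0], minus[0] = delta, -delta
    f_plus, f_minus = spring_forces(plus, f), spring_forces(minus, f)
    return [-(fp - fm) / (2.0 * delta) for fp, fm in zip(f_plus, f_minus)]


def frozen_phonon_omega(m, n_atoms, f, M, a, delta=1e-3):
    """Frequency at the commensurate wave number q = 2*pi*m/(n_atoms*a)."""
    q = 2.0 * math.pi * m / (n_atoms * a)
    row = force_constant_row(n_atoms, f, delta)
    # Mass-weighted Fourier transform of the force constants, cf. Eq. (7.26)
    d = sum(phi * cmath.exp(-1j * q * a * j) for j, phi in enumerate(row)) / M
    return math.sqrt(max(d.real, 0.0))
```

Only the wave numbers $q = 2\pi m/(Na)$ of the $N$-atom supercell are accessible in this way, which is the commensurability restriction mentioned in the text.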

7.3 Quantum Theory of the Harmonic Crystal

In the preceding section, we have learnt to compute the lattice vibrational frequencies when the ions are treated as classical particles. In practice, this approximation is not too bad for quite a range of materials properties. The two main reasons are the rather heavy masses of the ions (with the exception of H and He) and the overall rather similar properties of the quantum mechanical and the classical harmonic oscillator, on which all of the harmonic theory of lattice vibrations is based. In a very rough view, one can say that if only the vibrational frequency $\omega(\mathbf{q})$ of a mode itself is relevant, the classical treatment already gives a good answer (which is ultimately why we could understand the experimentally measured phonon band structure with our classical analysis). As a matter of fact, the formalism introduced in the previous section has highlighted that the eigenvalues and eigenvectors are a property of the harmonic potential itself. If, on the other hand, the energy connected with this mode, $\hbar\omega(\mathbf{q})$, is important or, more precisely, the way one excites (or populates) this mode is concerned, then the classical picture usually fails. The prototypical example for this is the specific heat due to the lattice vibrations, which is nothing else but a measure of how effectively the vibrational modes of the lattice can take up thermal energy. In order to understand material properties of this kind, we need a quantum theory of the harmonic crystal. Just as in the classical case, let us first recall this theory for a one-dimensional harmonic oscillator before we really address the problem of the full solid-state Hamiltonian.

7.3.1 One-dimensional Quantum Harmonic Oscillator

Instead of just solving the quantum mechanical problem of the one-dimensional harmonic oscillator, the focus of this section will rather be to recap a specific formalism that can be easily generalized to the three-dimensional case of the coupled lattice vibrations afterwards. At first sight, this formalism might appear mathematically quite involved and seem like overkill for this simple problem, but in the end, the easy generalization will make it all worthwhile.

In accordance with Eq. (7.6), the classical Hamilton function for the one-dimensional harmonic oscillator is given by

$$H^{\mathrm{classic}}_{\mathrm{1D}} = T + V = \frac{p^{2}}{2m} + \frac{1}{2} f s^{2} = \frac{p^{2}}{2m} + \frac{m\omega^{2}}{2} s^{2} \,, \tag{7.59}$$

where $f = m\omega^{2}$ is the spring constant and $s$ the displacement. From here, the translation to quantum mechanics proceeds via the general rule to replace the momentum $p$ by the operator

$$p = -i\hbar \frac{\partial}{\partial s} \,, \tag{7.60}$$

where $p$ and $s$ satisfy the commutator relation

$$[p, s] = ps - sp = -i\hbar \,, \tag{7.61}$$

which in this case underlies the Heisenberg uncertainty principle. The resulting Hamiltonian preserves the form of Eq. (7.59), but $p$ and $s$ now have to be treated as (non-commuting) operators.

This structure can be considerably simplified by introducing the lowering operator

$$a = \sqrt{\frac{m\omega}{2\hbar}}\, s + i \sqrt{\frac{1}{2\hbar m\omega}}\, p \,, \tag{7.62}$$

and the raising operator

$$a^{\dagger} = \sqrt{\frac{m\omega}{2\hbar}}\, s - i \sqrt{\frac{1}{2\hbar m\omega}}\, p \,. \tag{7.63}$$

Note that the idea behind this pair of adjoint operators leads to the concept called second quantization, where $a$ and $a^{\dagger}$ are denoted as annihilation and creation operators, respectively. The canonical commutation relation of Eq. (7.61) then implies

$$[a, a^{\dagger}] = 1 \,, \tag{7.64}$$

and the Hamiltonian takes the simple form

$$H^{\mathrm{QM}}_{\mathrm{1D}} = \hbar\omega \left( a^{\dagger} a + \frac{1}{2} \right) . \tag{7.65}$$
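The step from Eqs. (7.62) and (7.63) to Eq. (7.65) can be made explicit by multiplying out the operator product. The following worked derivation uses only the definitions above and the commutator of Eq. (7.61), which gives $sp - ps = i\hbar$:

```latex
\begin{align*}
a^{\dagger} a
  &= \left( \sqrt{\frac{m\omega}{2\hbar}}\, s - i \sqrt{\frac{1}{2\hbar m\omega}}\, p \right)
     \left( \sqrt{\frac{m\omega}{2\hbar}}\, s + i \sqrt{\frac{1}{2\hbar m\omega}}\, p \right) \\
  &= \frac{m\omega}{2\hbar}\, s^{2} + \frac{1}{2\hbar m\omega}\, p^{2}
     + \frac{i}{2\hbar}\, (sp - ps) \\
  &= \frac{1}{\hbar\omega} \left( \frac{p^{2}}{2m} + \frac{m\omega^{2}}{2}\, s^{2} \right)
     - \frac{1}{2} \,.
\end{align*}
```

Multiplying by $\hbar\omega$ and rearranging reproduces the Hamiltonian in the form $\hbar\omega \left( a^{\dagger} a + \frac{1}{2} \right)$.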


From the commutation relation and the form of the Hamiltonian, it follows that the eigenstates $|n\rangle$ ($n = 0, 1, 2, \ldots$) of this problem fulfill the following relations:

$$a^{\dagger} |n\rangle = (n+1)^{1/2}\, |n+1\rangle \,, \tag{7.66}$$

$$a |n\rangle = n^{1/2}\, |n-1\rangle \quad (n \neq 0) \,, \tag{7.67}$$

$$a |0\rangle = 0 \,. \tag{7.68}$$

A derivation of these is given in any decent textbook on quantum mechanics.

We see that the operators $a$ and $a^{\dagger}$ lower or raise the excitation state of the system. This explains the names given to these operators, in particular when considering that each excitation can be seen as one quantum: The creation operator creates one quantum, while the annihilation operator wipes one out. With these recursion relations between the different states, the eigenvalues (i.e., the energy levels) of the one-dimensional harmonic oscillator are finally obtained as

$$E_n = \langle n | H^{\mathrm{QM}}_{\mathrm{1D}} | n \rangle = \left( n + \tfrac{1}{2} \right) \hbar\omega \,. \tag{7.69}$$

Note that $E_0 \neq 0$, i.e., even in the ground state the oscillator has some energy (the so-called energy due to zero-point vibrations or zero-point energy), whereas with each higher excited state one quantum $\hbar\omega$ is added to the energy. With the above-mentioned picture of creation and annihilation operators, one therefore commonly talks about the state $E_n$ of the system as corresponding to having excited $n$ phonons of energy $\hbar\omega$. This nomenclature thus accounts for the discrete nature of the excitation and emphasizes more the corpuscular/particle view. In the classical case, the energy of the system was determined by the amplitude of the vibration and could take arbitrary continuous values. This is no longer possible for the quantum mechanical oscillator.
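The ladder-operator relations can be made tangible by representing $a$ and $a^{\dagger}$ as matrices in a truncated number basis $|0\rangle, \ldots, |\mathrm{dim}-1\rangle$. The Python sketch below is an illustration only: the truncation to `dim = 6` states and all function names are assumptions made here, and the truncation necessarily spoils the commutator in the last basis state.

```python
import math


def lowering_operator(dim):
    """Matrix of a in the basis |0>, ..., |dim-1>: <n-1|a|n> = sqrt(n)."""
    a_op = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a_op[n - 1][n] = math.sqrt(n)
    return a_op


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


dim = 6
a_op = lowering_operator(dim)
a_dag = transpose(a_op)          # real matrix, so the adjoint is the transpose
number = matmul(a_dag, a_op)     # number operator: diagonal with entries 0..dim-1
commutator = [[p - q for p, q in zip(r1, r2)]
              for r1, r2 in zip(matmul(a_op, a_dag), number)]
energies = [number[n][n] + 0.5 for n in range(dim)]  # in units of hbar*omega
```

The diagonal of `number` counts the quanta, and `energies` reproduces the ladder $(n + 1/2)$ in units of $\hbar\omega$; `commutator` equals the unit matrix except for the last diagonal entry, an artifact of the finite basis.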

7.3.2 Three-dimensional Quantum Harmonic Crystal

For the real three-dimensional case, we will now proceed in an analogous fashion as in the simple one-dimensional model. To keep the formulae from becoming too complex, we will restrict ourselves to the case of a mono-atomic basis in this section, i.e., all ions have a mass $M_{\mathrm{nu}}$ and $M_p = 1$. The generalization to polyatomic bases is straightforward, but complicates the mathematics without yielding much further insight.

In section 7.2.2, we had derived the total Hamilton operator of the $M$ nuclei in the harmonic approximation. It is given by

$$H = \sum_{I=1}^{M} \frac{\mathbf{p}_I^{2}}{2M_{\mathrm{nu}}} + V^{\mathrm{BO}}(\mathbf{R}_I) + \frac{1}{2} \sum_{I,J=1}^{M} \sum_{\mu,\nu=1}^{3} s^{\mu}_{I}\, \Phi^{\mu\nu}_{IJ}\, s^{\nu}_{J} \,, \tag{7.70}$$

and the displacement field defined in Eq. (7.31) solves the classical equations of motion. For a mono-atomic basis, we can subsume all time-dependent terms into $s_i(\mathbf{q}, t)$ and define the transformation to normal mode coordinates

$$s^{\mu}_{I}(t) = \sum_{i,\mathbf{q}} s_i(\mathbf{q}, t)\, c^{\mu}_{i,\alpha}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{L}_n} \,. \tag{7.71}$$


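Inserting this form of the displacement field into Eq. (7.70) yields

$$H^{\mathrm{vib}} = \sum_{i=1}^{3} \sum_{\mathbf{q}} \frac{p_i^{2}(\mathbf{q})}{2M_{\mathrm{nu}}} + \frac{1}{2} \sum_{i=1}^{3} \sum_{\mathbf{q}} M_{\mathrm{nu}}\, \omega_i^{2}(\mathbf{q})\, s_i^{2}(\mathbf{q}, t) \tag{7.72}$$

$$\phantom{H^{\mathrm{vib}}} = \sum_{i=1}^{3} \sum_{\mathbf{q}} \underbrace{\left[ \frac{p_i^{2}(\mathbf{q})}{2M_{\mathrm{nu}}} + \frac{M_{\mathrm{nu}}\, \omega_i^{2}(\mathbf{q})}{2}\, s_i^{2}(\mathbf{q}, t) \right]}_{H^{\mathrm{vib}}_i(\mathbf{q})} \,, \tag{7.73}$$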

where we have introduced the generalized momentum $p_i(\mathbf{q}) = M_{\mathrm{nu}}\, \frac{d}{dt} s_i(\mathbf{q}, t)$ and focused only on the vibrational part of the Hamiltonian, $H^{\mathrm{vib}}$, i.e., we discarded the contribution from the static minimum of the potential energy surface by setting $V^{\mathrm{BO}}(\mathbf{R}_I) = 0$. With this transformation, the Hamiltonian has decoupled into a sum over independent harmonic oscillators defined by one-dimensional Hamiltonians $H^{\mathrm{vib}}_i(\mathbf{q})$ (compare with Eq. (7.59)!). It is thus straightforward to generalize the solution discussed in detail in the last section to this more complicated case. For this purpose, all previously defined operators need to become $i$- and $\mathbf{q}$-dependent: We define annihilation and creation operators for a phonon (i.e., normal mode) with frequency $\omega_i(\mathbf{q})$ and wave vector $\mathbf{q}$ as

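$$a_i(\mathbf{q}) = \sqrt{\frac{M_{\mathrm{nu}}\, \omega_i(\mathbf{q})}{2\hbar}}\, s_i(\mathbf{q}) + i \sqrt{\frac{1}{2\hbar M_{\mathrm{nu}}\, \omega_i(\mathbf{q})}}\, p_i(\mathbf{q}) \,, \tag{7.74}$$

$$a^{\dagger}_i(\mathbf{q}) = \sqrt{\frac{M_{\mathrm{nu}}\, \omega_i(\mathbf{q})}{2\hbar}}\, s_i(\mathbf{q}) - i \sqrt{\frac{1}{2\hbar M_{\mathrm{nu}}\, \omega_i(\mathbf{q})}}\, p_i(\mathbf{q}) \,. \tag{7.75}$$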

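Similarly, the Hamiltonian is brought to a form equivalent to Eq. (7.65), i.e.,

$$H^{\mathrm{vib}} = \sum_{i=1}^{3} \sum_{\mathbf{q}} \hbar\omega_i(\mathbf{q}) \left( a^{\dagger}_i(\mathbf{q})\, a_i(\mathbf{q}) + \frac{1}{2} \right) , \tag{7.76}$$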

and the eigenvalues generalize to

$$E^{\mathrm{vib}} = \sum_{i=1}^{3} \sum_{\mathbf{q}} \hbar\omega_i(\mathbf{q}) \left( n_i(\mathbf{q}) + \frac{1}{2} \right) . \tag{7.77}$$

The energy contribution of the lattice vibrations therefore stems from $3M$ harmonic oscillators with frequencies $\omega_i(\mathbf{q})$. The frequencies are the same as in the classical case, which is (as already remarked at the beginning of this section) why the classical analysis already permitted us to determine the phonon band structure. In the harmonic approximation, the $3M$ modes are completely independent of each other: Any mode can be excited independently and contributes the energy $\hbar\omega_i(\mathbf{q})$ per phonon. While this sounds very similar to the concept of the independent electron gas (with its energy contribution $\epsilon_n(\mathbf{k})$ from each state $(n, \mathbf{k})$), the fundamental difference is that in the electronic system each state (or mode in the present language) can only be occupied by one electron (or two in the case of non-spin-polarized systems). This results from the Pauli principle, and we talk about fermionic particles. In the phonon gas, on the other hand, each mode can be excited with an arbitrary number of (indistinguishable) phonons, i.e., the occupation numbers $n_i(\mathbf{q})$ are not restricted to the values 0 and 1. Phonons are therefore bosonic particles. In a hand-waving manner, the following analogy holds: The individual modes $\hbar\omega_i(\mathbf{q})$ are the phonon analogue of the electronic levels $\epsilon_n(\mathbf{k})$. In the phononic case, however, the respective occupation numbers $n_i(\mathbf{q})$ can in principle take any integer value in $[0, \infty)$, depending on the thermodynamic conditions (see next section). For this reason, the number of phonons is neither constant nor conserved, but dictated by the energy of the lattice and thereby by the temperature. At $T = 0$ K, the occupation numbers are zero ($n_i(\mathbf{q}) \equiv 0$). Still, the lattice is not static, since each mode contributes $\hbar\omega_i(\mathbf{q})/2$ to the lattice energy from the quantum mechanical zero-point vibrations, as discussed for Eq. (7.77). At increasing temperatures, more and more phonons are created, so that these lattice vibrations are further enhanced and the lattice energy rises.
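To illustrate how Eq. (7.77) is evaluated for given occupation numbers, the following Python sketch uses the one-dimensional monatomic chain of section 7.2.5.1 as a stand-in for the three-dimensional crystal. This reduction to one branch, the choice $\hbar = 1$, and the function names are assumptions made here for illustration.

```python
import math


def chain_frequencies(n_cells, f, M, a):
    """Frequencies of Eq. (7.35) at the n_cells allowed wave numbers.
    q runs over [0, 2*pi/a), which is equivalent to the first Brillouin
    zone because omega(q) is periodic in q."""
    qs = [2.0 * math.pi * m / (n_cells * a) for m in range(n_cells)]
    return [2.0 * math.sqrt(f / M) * abs(math.sin(q * a / 2.0)) for q in qs]


def vibrational_energy(frequencies, occupations, hbar=1.0):
    """E_vib = sum over modes of hbar*omega*(n + 1/2), cf. Eq. (7.77)."""
    return sum(hbar * w * (n + 0.5) for w, n in zip(frequencies, occupations))


freqs = chain_frequencies(2, 1.0, 1.0, 1.0)            # two cells: q = 0 and q = pi/a
zero_point = vibrational_energy(freqs, [0, 0])         # all occupation numbers zero
one_phonon = vibrational_energy(freqs, [0, 1])         # one phonon in the q = pi/a mode
```

The difference `one_phonon - zero_point` equals $\hbar\omega$ of the excited mode, while `zero_point` is nonzero even though no phonon is present.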

7.3.3 Lattice Energy at Finite Temperatures

Having obtained the general form of the quantum mechanical lattice energy with Eq. (7.77),the logical next step is to ask what this energy is at a given temperature T , i.e. we want toevaluate Evib(T ). Looking at the form of Eq. (7.77), the quantity we need to evaluate inorder to obtain Evib(T ) is obviously ni(q), i.e., the number of excited phonons in the nor-mal mode (q, ω(q)) as a function of temperature. According to the last two sections, theenergy of this mode with n phonons excited is En = (n+1/2)~ωi(q) and consequently theprobability for such an excitation at temperature T is proportional to exp(−En/(kBT )).Normalizing (the sum of all probabilities in the denominator must be one), we arrive at

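$$
P_n(T) = \frac{e^{-E_n/(k_BT)}}{\sum_{l} e^{-E_l/(k_BT)}}
= \frac{e^{-n\hbar\omega_i(\mathbf{q})/(k_BT)}}{\sum_{l=0}^{\infty} e^{-l\hbar\omega_i(\mathbf{q})/(k_BT)}}
= \frac{x^n}{\sum_{l=0}^{\infty} x^l} , \tag{7.78}
$$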

where we have defined the shorthand variable $x = \exp\left(-\hbar\omega_i(\mathbf{q})/(k_BT)\right)$ in the last step. From mathematics, the value of the geometric series is known as

$$
\sum_{n=0}^{\infty} x^n = \frac{1}{1-x} , \tag{7.79}
$$

which simplifies the probability to

$$
P_n(T) = x^n (1-x) . \tag{7.80}
$$

With this probability the mean energy of the oscillator at temperature T becomes

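$$
\begin{aligned}
E_i(T,\omega(\mathbf{q})) &= \sum_{n=0}^{\infty} E_n P_n(T) = E_0 + \hbar\omega(\mathbf{q}) \sum_{n=1}^{\infty} n P_n(T) \\
&= \frac{\hbar\omega_i(\mathbf{q})}{2} + \hbar\omega_i(\mathbf{q})\,(1-x) \sum_{n=0}^{\infty} n x^n .
\end{aligned} \tag{7.81}
$$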

Using the relation

$$
\sum_{n=0}^{\infty} n x^n = \frac{x}{(1-x)^2} , \tag{7.82}
$$


Figure 7.8: Bose-Einstein distribution $n_{\rm BE}$ as defined in Eq. (7.84): The left plot shows the frequency dependence ($n_{\rm BE}$ versus $\hbar\omega$ in meV) for three different temperatures ($T$ = 100 K, 300 K, 500 K), whereas the right plot shows the temperature dependence ($n_{\rm BE}$ versus $T$ in K) for three different phonon frequencies ($\hbar\omega$ = 10 meV, 25 meV, 40 meV). Note the dashed line that highlights the maximum possible occupation number in Fermi statistics.

which can be obtained by differentiating Eq. (7.79) with respect to $x$ and multiplying the result by $x$, we find that the mean energy of this oscillating mode is

$$
E_i(T,\omega(\mathbf{q})) = \hbar\omega_i(\mathbf{q}) \left( \frac{1}{e^{\hbar\omega_i(\mathbf{q})/(k_BT)} - 1} + \frac{1}{2} \right) . \tag{7.83}
$$

Comparing with the energy levels of the mode, $E_n = \hbar\omega_i(\mathbf{q})(n + 1/2)$, we can identify

$$
n_i(\mathbf{q}) = \frac{1}{e^{\hbar\omega_i(\mathbf{q})/(k_BT)} - 1} \tag{7.84}
$$

as the mean excitation (or occupation with phonons) of this particular mode at $T$. This is simply the famous Bose-Einstein statistics, consistent with the remark above that phonons behave as bosonic particles.

Knowing the energy of one mode at $T$, cf. Eq. (7.83), it is easy to generalize to all modes and arrive at the total energy due to all lattice vibrations at temperature $T$

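$$
E^{\rm vib}(T) = \sum_{i=1}^{3M_p} \sum_{\mathbf{q}} \hbar\omega_i(\mathbf{q}) \left( \frac{1}{e^{\hbar\omega_i(\mathbf{q})/(k_BT)} - 1} + \frac{1}{2} \right) . \tag{7.85}
$$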
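The step from the Boltzmann weights of Eq. (7.80) to the closed form of Eq. (7.84) can be checked numerically. The following is only a minimal sketch: the function names, the truncation `n_max`, and the chosen phonon energies are illustrative assumptions and not part of the lecture.

```python
import math

K_B_MEV_PER_K = 8.617333e-2  # Boltzmann constant in meV/K


def mean_occupation_direct(hw_mev, temperature_k, n_max=2000):
    """Mean phonon number from the weights P_n = x^n (1 - x) of Eq. (7.80).

    The infinite sum is truncated at n_max (an assumption of this sketch);
    this is adequate as long as x^n_max is negligible.
    """
    x = math.exp(-hw_mev / (K_B_MEV_PER_K * temperature_k))
    return sum(n * x**n * (1.0 - x) for n in range(n_max))


def bose_einstein(hw_mev, temperature_k):
    """Closed form of Eq. (7.84)."""
    return 1.0 / math.expm1(hw_mev / (K_B_MEV_PER_K * temperature_k))


for hw in (10.0, 25.0, 40.0):  # phonon energies in meV, as in Fig. 7.8
    direct = mean_occupation_direct(hw, 300.0)
    closed = bose_einstein(hw, 300.0)
    print(f"hw = {hw:4.1f} meV: direct sum = {direct:.6f}, Eq. (7.84) = {closed:.6f}")
```

Both columns should agree to within rounding. For much higher temperatures $x$ approaches one and the truncated sum would need a larger `n_max`.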

This is still a formidable expression, and we notice that the mean occupation of mode $i$ with frequency $\omega_i(\mathbf{q})$ and wave vector $\mathbf{q}$ does not explicitly depend on $\mathbf{q}$ itself (only indirectly via the dependence of the frequency on $\mathbf{q}$). It is therefore convenient to reduce the summations over modes and wave vectors to one simple frequency integral by introducing a density of normal modes (or phonon density of states) $g(\omega)$, defined so that $g(\omega)\,d\omega$ is the total number of modes per volume with frequencies between $\omega$ and $\omega + d\omega$,

$$
g(\omega) = \frac{1}{(2\pi)^3} \sum_i \int d\mathbf{q}\; \delta\left(\omega - \omega_i(\mathbf{q})\right) . \tag{7.86}
$$


This obviously implies that

$$
\int_0^{\infty} d\omega\, g(\omega) = 3M_p/V , \tag{7.87}
$$

i.e., the integral over all modes yields the total number of modes per volume, which is three times the number of atoms per volume, cf. Sec. 7.2.4.

With this definition of $g(\omega)$, the total energy due to all lattice vibrations at temperature $T$ of a solid of $M_p$ atoms in a volume $V$ takes the simple form

$$
E^{\rm vib}(T) = V \int_0^{\infty} d\omega\, g(\omega) \left[ \frac{1}{e^{\hbar\omega/(k_BT)} - 1} + \frac{1}{2} \right] \hbar\omega , \tag{7.88}
$$

i.e., the complete information about the specific vibrational properties of the solid is contained in the phonon density of states $g(\omega)$. Once this quantity is known, either from experiment or DFT (see the examples in Figs. 7.6 and 7.7), the contribution from the lattice vibrations can be evaluated immediately. Via thermodynamic relations, the entropy due to the vibrations is then also accessible, which allows one to compute the complete contribution of the phonons to thermodynamic quantities like the free energy or the Gibbs free energy.
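To make the use of Eq. (7.88) concrete, the following sketch evaluates the frequency integral for a density of states sampled on a grid. It is only an illustration under stated assumptions: the quadratic toy density of states, its cutoff `E_CUT`, the normalization per atom rather than per volume, and all function names are hypothetical and not taken from the lecture.

```python
import math

K_B = 8.617333e-2  # Boltzmann constant in meV/K


def bose_einstein(e_mev, temperature_k):
    """Mean occupation, Eq. (7.84); returns 0 at T = 0 and for frozen-out modes."""
    if temperature_k <= 0.0:
        return 0.0
    x = e_mev / (K_B * temperature_k)
    return 0.0 if x > 700.0 else 1.0 / math.expm1(x)


def vibrational_energy(energies, dos, temperature_k):
    """Trapezoidal estimate of Eq. (7.88), here per atom instead of per volume.

    energies: ascending grid of phonon energies hbar*omega in meV (all > 0)
    dos:      density of states on that grid, in modes per meV per atom
    """
    values = [g * (bose_einstein(e, temperature_k) + 0.5) * e
              for e, g in zip(energies, dos)]
    return sum(0.5 * (values[k] + values[k + 1]) * (energies[k + 1] - energies[k])
               for k in range(len(energies) - 1))


# Hypothetical toy DOS: quadratic up to a cutoff, normalized to 3 modes per atom.
E_CUT = 30.0  # cutoff energy in meV (assumed)
grid = [E_CUT * k / 2000 for k in range(1, 2001)]
toy_dos = [9.0 * e * e / E_CUT**3 for e in grid]

for t in (0.0, 100.0, 300.0, 3000.0):
    e_vib = vibrational_energy(grid, toy_dos, t)
    print(f"T = {t:6.0f} K: E_vib = {e_vib:9.3f} meV per atom")
```

For this toy density of states, the $T = 0$ value is the pure zero-point energy $9E_{\rm cut}/8$, and at very high temperatures the result approaches the classical value $3k_BT$ per atom. A measured or computed $g(\omega)$ can be passed in the same way.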

7.3.4 Phonon Specific Heat

We will discuss the phonon specific heat as a representative example of how the consideration of vibrational states improves the description of those quantities that were listed as failures of the static lattice model at the beginning of this chapter. The phonon specific heat $c_V$ measures how well the crystal can take up energy with increasing temperature, and we stated at the beginning of this chapter that the absence of low-energy excitations in a rigid-lattice insulator would lead to a vanishingly small specific heat at low temperatures, in clear contradiction to the experimental data. Taking lattice vibrations into account, it is obvious that they, too, may be excited thermally. In addition, we have seen that lattice vibrational energies are of the same order of magnitude as thermal energies, so that the possible excitation of vibrations in the lattice should yield a strong contribution to the total specific heat of any solid. The crucial point to notice is, however, that in a quantum-mechanical formalism vibrations cannot take up energy continuously (in the form of gradually increasing amplitudes). Instead, only discrete excitations $\hbar\omega_i(\mathbf{q})$ of the phonon modes are possible, and we will see that this has profound consequences for the low-temperature behavior of the specific heat. That the resulting functional form of the specific heat at low temperatures agrees with the experimental data was one of the earliest triumphs of the quantum theory of solids.

In general, the specific heat at constant volume is defined as

$$
c_V(T) = \frac{1}{V} \left. \frac{\partial U}{\partial T} \right|_V , \tag{7.89}
$$

i.e., it measures (normalized per volume) the amount of energy taken up by a solid when the temperature is increased. If we want to specifically evaluate the contribution from the


Figure 7.9: Qualitative form of the specific heat of solids as a function of temperature (here normalized to a temperature $\Theta_D$ below which quantum effects become important). At high temperatures, $c_V$ approaches the constant value predicted by Dulong-Petit, while it drops to zero at low temperatures.

lattice vibrations using Eq. (7.88), we get the general expression

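$$
\begin{aligned}
c_V^{\rm vib}(T) = \frac{1}{V} \left. \frac{\partial E^{\rm vib}}{\partial T} \right|_V
&= \frac{\partial}{\partial T} \int_0^{\infty} d\omega\, g(\omega) \left[ \frac{1}{e^{\hbar\omega/(k_BT)} - 1} + \frac{1}{2} \right] \hbar\omega \\
&= \frac{\partial}{\partial T} \int_0^{\infty} d\omega\, g(\omega) \left[ \frac{1}{e^{\hbar\omega/(k_BT)} - 1} \right] \hbar\omega .
\end{aligned} \tag{7.90}
$$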

In the last step, we have dropped the temperature-independent term associated withzero-point vibrations.

If one plugs in the phonon densities of states from experiment or DFT, the calculated $c_V^{\rm vib}$ agree excellently with the measured specific heats of solids, proving that the overwhelming contribution to this quantity indeed comes from the lattice vibrational modes. For most elements, the curves roughly have the form shown in Fig. 7.9. In particular, $c_V$ approaches a constant value at high temperatures, while in the lowest temperature range it scales with $T^3$ (in metals, this scaling is in fact $\alpha T + \beta T^3$, because there is an additional small contribution from the conduction electrons, which you calculated in exercise 4). Let us now see whether we can understand these generic features on the basis of the theory developed in this chapter (and not simply accept the data coming out of DFT or experiment).


Figure 7.10: Left: Illustration of the dispersion curves $\omega(k)$ for $k$ between $-\pi/a$ and $+\pi/a$ behind the Einstein and Debye models (dashed curves). Einstein modes approximate optical branches, while Debye modes focus on the acoustic branches that become particularly relevant at low temperatures. Right: Specific heat curves obtained with both models.

7.3.4.1 High-Temperature Limit (Dulong-Petit Law)

In the high-temperature limit, we have $k_BT \gg \hbar\omega$, and can therefore expand the exponential in the denominator in Eq. (7.90)

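$$
\begin{aligned}
c_V^{\rm vib}(T \to \infty) &= \frac{\partial}{\partial T} \int_0^{\infty} d\omega\, g(\omega) \left[ \frac{1}{e^{\hbar\omega/(k_BT)} - 1} \right] \hbar\omega \\
&= \frac{\partial}{\partial T} \int_0^{\infty} d\omega\, g(\omega) \left[ \frac{1}{\left(1 + \frac{\hbar\omega}{k_BT} + \dots\right) - 1} \right] \hbar\omega \\
&= \int_0^{\infty} d\omega\, g(\omega)\, \frac{\partial}{\partial T} (k_BT) \\
&= 3 k_B \frac{M_p}{V} = \text{constant} .
\end{aligned} \tag{7.91}
$$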

In the last step, we exploited that $g(\omega)$ is normalized to the total number of phonon states as given in Eq. (7.87). We therefore find that the specific heat indeed becomes constant at high temperatures and that the absolute value of $c_V^{\rm vib}$ is entirely given by the available degrees of freedom in the solid ($M_p/V$). This was long known as the Dulong-Petit law from classical statistical mechanics, i.e., from the equipartition theorem. In this high-temperature limit, we recover this behaviour because at such high temperatures the quantized nature of the vibrations does not show up anymore, i.e., $\hbar\omega \ll k_BT$. Therefore, energy can be taken up in a quasi-continuous manner and we get the classical result that each degree of freedom that enters the Hamiltonian quadratically contributes $k_BT/2$ to the energy of the solid.

7.3.4.2 Intermediate Temperature Range (Einstein Approximation, 1907)

The constant $c_V^{\rm vib}$ obtained in the high-temperature limit is not surprising, since it coincides with what we expect from classical statistical mechanics. This classical understanding is, however, not able to explain why the specific heat deviates from the constant value predicted by the Dulong-Petit law at intermediate and low temperatures. This lowering of $c_V^{\rm vib}$ can only be understood starting from quantum theory by taking the discrete, quantized nature of vibrational excitations into account. The simplest historic model that goes beyond a classical theory and considers this quantized nature of the oscillations stems from Einstein. He ignored all interactions between the vibrating atoms in the solid and assumed that all three degrees of freedom for atoms of one species vibrate with exactly one characteristic frequency $\omega_E$ (independent harmonic oscillators). Without interactions, the dispersion curves of the various phonon branches simply become constant in such an Einstein approximation. Accordingly, they can be regarded as a model for optical branches (see Fig. 7.10).

For a mono-atomic solid, the phonon density of states in the Einstein model is given by

$$
g^{\rm Einstein}(\omega) = 3 \frac{M_p}{V}\, \delta(\omega - \omega_E) , \tag{7.92}
$$

where $\omega_E$ is this characteristic frequency (often called Einstein frequency) with which all atoms vibrate. Inserting this form of $g(\omega)$ into Eq. (7.90) leads to

$$
c_V^{\rm Einstein}(T) = 3 k_B \frac{M_p}{V} \left[ \frac{x_E^2\, e^{x_E}}{\left(e^{x_E} - 1\right)^2} \right] . \tag{7.93}
$$

Here, the shorthand variable $x_E = \hbar\omega_E/(k_BT)$ expresses the ratio between vibrational and thermal energy. Eq. (7.93) already yields a fundamentally important result: In the high-temperature limit ($x_E \ll 1$), the term in brackets approaches unity and we recover the Dulong-Petit law; at intermediate temperatures, however, this term declines and finally drops to zero for $x_E \gg 1$ (i.e., $T \to 0$). Accordingly, the resulting specific heat curve shown in Fig. 7.10 exhibits almost all qualitative features of the experimental data. This is a big success for such a simple model and was a sensation a hundred years ago! To be more specific, the $c_V^{\rm Einstein}(T)$ curve starts to deviate significantly from the Dulong-Petit value for $x_E \sim 1$, i.e., for temperatures $\Theta_E \sim \hbar\omega_E/k_B$. $\Theta_E$ is known as the Einstein temperature; for temperatures above it, all modes can easily be excited and classical behavior is obtained. For temperatures of the order of or below $\Theta_E$, however, the thermal energy is of the same order of magnitude as the quantized excitations, so that these excitations can no longer be regarded as quasi-continuous. The discrete nature of the vibrational energy starts to show up, and one says that this mode starts to "freeze out". In other words, any phonon mode whose energy $\hbar\omega$ is much greater than $k_BT$ cannot be excited and thus no longer contributes anything of note to the specific heat. For $T \to 0$ K, the specific heat thus approaches zero.

Yet, precisely this limit is not obtained properly in the Einstein model. From Eq. (7.93) we see that the specific heat approaches the zero-temperature limit exponentially, which is not the proper $T^3$ scaling known from experiment. The reason for this failure lies, of course, in the gross simplification of the phonon density of states. Approximating all modes to have the same (dispersionless) frequency is not too bad for the treatment of optical phonon branches, as illustrated in Fig. 7.10. If the characteristic frequency is chosen properly in the range of the optical branches, the specific heat from this model in fact reproduces the experimental or properly computed $c_V^{\rm vib}$ quite well in the intermediate temperature range. At these temperatures, the specific heat is thus indeed dominated by the contributions from the optical modes. At lower temperatures, these modes freeze out rapidly, so that the Einstein model predicts a rapid exponential decline of $c_V^{\rm Einstein}$ to zero. While the optical branches are indeed unimportant at low temperatures, the low-energy acoustic branches still contribute. And it is the contribution from the latter that gives rise to the $T^3$ scaling for $T \to 0$ K. Since the Einstein model does not describe acoustic dispersion and low-frequency modes at all, it fails to reproduce this limit.
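The limits of Eq. (7.93) discussed above are easy to inspect numerically. The following is a minimal sketch; the function name and the sampled temperatures are illustrative assumptions. The bracket of Eq. (7.93) is rewritten with $e^{-x_E}$ so that it does not overflow at low temperatures, which is algebraically equivalent.

```python
import math


def einstein_cv_ratio(t_over_theta_e):
    """Bracket of Eq. (7.93): c_V divided by the Dulong-Petit value 3 k_B M_p / V.

    The argument is T / Theta_E, so that x_E = Theta_E / T.
    """
    x = 1.0 / t_over_theta_e
    # x^2 e^x / (e^x - 1)^2 == x^2 e^(-x) / (1 - e^(-x))^2, which is stable for large x
    return x * x * math.exp(-x) / math.expm1(-x) ** 2


for t in (0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 5.0):
    print(f"T/Theta_E = {t:5.2f}: c_V / c_V(Dulong-Petit) = {einstein_cv_ratio(t):.6f}")
```

The printed ratios rise from essentially zero (exponential freeze-out) towards one (Dulong-Petit), and at $T = \Theta_E$ the model has already reached roughly 92% of the classical value.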

7.3.4.3 Low Temperature Limit (Debye Approximation, 1912)

To properly address the low-temperature limit, we need a model for the acoustic branches, since these are the lowest-energy modes that can still be excited thermally for $T \to 0$ K. More specifically, the part of the acoustic branch close to the Brillouin zone center, cf. Fig. 7.10, needs to be considered. In Section 7.2.5 we discussed that in this region the dispersion is linear in $|\mathbf{q}|$ and that for the corresponding very long wavelength modes the solid behaves like a continuous elastic medium. This is the conceptual idea behind the Debye model, which approximates the dispersion of these modes using $\omega_i(\mathbf{q}) = v_g|\mathbf{q}|$, where $v_g$ (the group velocity) is the speed of sound that can be obtained from linear elasticity theory. Note that we are only going to discuss a minimalistic Debye model here, because even in a perfectly isotropic medium there would be at least different group velocities for longitudinal and transverse waves, while in real solids this gets even more differentiated. To illustrate the idea behind the Debye model, all of this is neglected and we only use one generic speed of sound for all three acoustic phonon modes.

The important aspect of the Debye model is in any case not this material constant $v_g$, but the linear scaling of the dispersion curves. With this linear scaling, one obtains the phonon density of states

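\begin{align}
g^{\rm Debye}(\omega) &= \frac{3}{(2\pi)^3} \int d^3q\; \delta(\omega - v_g|\mathbf{q}|) \tag{7.94} \\
&= \frac{3}{2\pi^2} \int dq\, q^2\, \delta(\omega - v_g q) \tag{7.95} \\
&= \frac{3}{2\pi^2} \frac{\omega^2}{|v_g|^3} \tag{7.96} \\
&\Rightarrow \frac{3}{2\pi^2} \frac{\omega^2}{|v_g|^3}\, \theta(\omega_D - \omega) . \tag{7.97}
\end{align}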

Since this quadratic form would be unbounded for $\omega \to \infty$, a cutoff frequency $\omega_D$ (Debye frequency) is introduced in the last step. The value of $\omega_D$ is determined by Eq. (7.87), i.e., by requiring that the integral over $g^{\rm Debye}(\omega)$ yields the correct number of available vibrational states per volume ($3/V$ in this case). Accordingly, one gets

$$
\omega_D = \sqrt[3]{|v_g|^3\, \frac{6\pi^2}{V}} , \tag{7.98}
$$

which allows us to define a Debye temperature $\Theta_D = \hbar\omega_D/k_B$ in close analogy to the Einstein temperature introduced above.


In such a Debye model, the specific heat defined in Eq. (7.90) is given by

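\begin{align}
c_V^{\rm Debye} &= k_B \int d\omega\, g^{\rm Debye}(\omega)\, \frac{\left(\frac{\hbar\omega}{k_BT}\right)^2 \exp\left(\frac{\hbar\omega}{k_BT}\right)}{\left(\exp\left(\frac{\hbar\omega}{k_BT}\right) - 1\right)^2} \tag{7.99} \\
&= \frac{3 k_B}{2\pi^2 |v_g|^3} \int_0^{\omega_D} d\omega\, \omega^2 \left[ \frac{x^2 \exp(x)}{\left(\exp(x) - 1\right)^2} \right] \tag{7.100} \\
&= \frac{9 k_B}{V} \left( \frac{T}{\Theta_D} \right)^3 \int_0^{\Theta_D/T} dx\, \frac{x^4 e^x}{\left(e^x - 1\right)^2} , \tag{7.101}
\end{align}

where $x = \hbar\omega/(k_BT)$.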

As was the case in the Einstein model, the Dulong-Petit value is also obtained properly in the Debye model in the high-temperature limit (at least for the three acoustic degrees of freedom). Also, $c_V^{\rm vib}$ declines to zero at low temperatures, $T \to 0$. As is immediately apparent from the equation above, the scaling in this limit is now correctly given by $T^3$: for $T \ll \Theta_D$ the upper integration limit tends to infinity, so that the integral becomes a temperature-independent number. As expected, the Debye model thus recovers the correct low-temperature limit. However, it fails for the intermediate temperature range: As illustrated in Fig. 7.10, the Debye model substantially deviates from the Einstein model, which reproduces experimental data in this range very well. Obviously, the reason is that at these temperatures the optical branches get noticeably excited, but these optical branches are essentially not accounted for in the Debye model, cf. Fig. 7.10.
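Both limits of Eq. (7.101) can be verified with a short numerical integration. This is a sketch under stated assumptions: the function name, the use of Simpson's rule, and the number of integration steps are illustrative choices. The comparison value $4\pi^4/15$ for the integral with infinite upper limit is a standard result that is not derived in this section.

```python
import math


def debye_cv_ratio(t_over_theta_d, n_steps=2000):
    """c_V of Eq. (7.101) divided by the Dulong-Petit value 3 k_B / V.

    Uses Simpson's rule on [0, Theta_D / T]; n_steps must be even.
    """
    upper = 1.0 / t_over_theta_d

    def integrand(x):
        if x < 1e-8:
            return x * x  # limit of x^4 e^x / (e^x - 1)^2 for x -> 0
        # equivalent form with e^(-x) that cannot overflow for large x
        return x**4 * math.exp(-x) / math.expm1(-x) ** 2

    h = upper / n_steps
    total = integrand(0.0) + integrand(upper)
    for k in range(1, n_steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    integral = total * h / 3.0
    return 3.0 * t_over_theta_d**3 * integral


for t in (0.02, 0.05, 0.1, 0.5, 1.0, 10.0):
    low_t_law = (4.0 * math.pi**4 / 5.0) * t**3  # expected T^3 law for T << Theta_D
    print(f"T/Theta_D = {t:5.2f}: ratio = {debye_cv_ratio(t):.6f}, T^3 law = {low_t_law:.6f}")
```

For $T \ll \Theta_D$ the numerical ratio follows the $T^3$ law, while for $T \gg \Theta_D$ it saturates at one. For extremely low temperatures the integration range becomes long and `n_steps` should be increased.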

The specific heat data from a real solid can therefore be well understood from the three temperature ranges just discussed. At low temperatures, only the acoustic modes contribute noticeably to $c_V$ and the Debye model provides a good description. With increasing temperatures, optical modes start to become excited, too, and the $c_V$ curve changes gradually to the form predicted by the Einstein model. Finally, at the highest temperatures all modes contribute quasi-continuously and the classical Dulong-Petit value is approached. Similar to the role of $\Theta_E$ in the Einstein model, the Debye temperature $\Theta_D$ also indicates the temperature range above which classical behavior sets in and below which modes begin to freeze out due to the quantized nature of the lattice vibrations. Debye temperatures are normally obtained by fitting the predicted $T^3$ form of the specific heat to low-temperature experimental data. They are tabulated for all elements and are typically of the order of a few hundred kelvin, which is thus the natural temperature scale for (acoustic) phonons.

Therefore, the Debye temperature plays the same role in the theory of lattice vibrations as the Fermi temperature $T_F$ plays in the theory of metals, separating the low-temperature region where quantum statistics must be used from the high-temperature region where classical statistical mechanics is valid. For conduction electrons, $T_F$ is of the order of 20000 K, though, i.e., at actual temperatures we are always well below $T_F$ in the electronic case, while for phonons both classical and quantum regimes can be encountered. This large difference between $T_F$ and $\Theta_D$ is also the reason why the contribution from the conduction electrons to the specific heat of metals is only noticeable at the lowest temperatures. In this regime, $c_V^{\rm el}$ scales linearly with $(T/T_F)$ (cf. exercise 4). The phonon contribution scales with $(T/\Theta_D)^3$, though. If we equate the two relationships, we see that the conduction


Figure 7.11: Measured specific heats at constant volume $c_V$ and at constant pressure $c_P$ for aluminum (from Brooks and Bingham, J. Phys. Chem. Sol. 29, 1553 (1968)).

electron contribution will show up only for temperatures

$$
T \sim \sqrt{\frac{\Theta_D^3}{T_F}} , \tag{7.102}
$$

i.e., for temperatures of the order of a few kelvin.

Finally, we note that a comparison of the tabulated values for $\Theta_D$ with measured melting temperatures reveals a rough correlation between the two quantities ($\Theta_D$ is about 30-50% of the melting temperature). The Debye temperature can therefore also be seen as a measure of the stiffness of the material and thus reflects the critical role that lattice vibrations also play in the process of melting (which is one of the other properties listed at the beginning of this chapter for which a static lattice model obviously has to fail). For the exact description of melting, however, we come to a temperature range in which the initially made assumption of small vibrations around the PES minimum, and with it the harmonic approximation, is certainly no longer valid. Melting is therefore one of the materials properties for which it is necessary to go beyond the theory of lattice vibrations developed so far in this chapter.
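Returning briefly to the crossover estimate of Eq. (7.102): the scaling relation alone gives a few tens of kelvin for typical values, and the "few kelvin" quoted above only emerge once numerical prefactors are included. The sketch below assumes the standard free-electron (Sommerfeld) result $c_V^{\rm el} = (\pi^2/2)\,(T/T_F)\,n k_B$ and the Debye low-temperature result $c_V^{\rm vib} = (12\pi^4/5)\,(T/\Theta_D)^3\, n k_B$, neither of which is derived in this section. The function name and the copper values are illustrative; the latter are approximate literature values.

```python
import math


def crossover_temperature(theta_d, t_fermi, valence=1.0):
    """Temperature (in K) at which the electronic and phononic specific heats are equal.

    Assumes c_el = (pi^2/2) Z (T/T_F) n k_B and c_ph = (12 pi^4/5) (T/Theta_D)^3 n k_B,
    where n is the ion density and Z (valence) the number of conduction electrons per ion.
    """
    prefactor = math.sqrt(5.0 * valence / (24.0 * math.pi**2))  # about 0.145 for Z = 1
    return prefactor * math.sqrt(theta_d**3 / t_fermi)


# Approximate literature values for copper (used for illustration only)
theta_d_cu = 343.0   # Debye temperature in K
t_fermi_cu = 8.16e4  # Fermi temperature in K

print(f"scaling estimate, Eq. (7.102): {math.sqrt(theta_d_cu**3 / t_fermi_cu):.1f} K")
print(f"with prefactors:               {crossover_temperature(theta_d_cu, t_fermi_cu):.1f} K")
```

With these inputs the bare scaling estimate is about 22 K, whereas the estimate including prefactors is roughly 3 K, in line with the statement that the electronic contribution only shows up at a few kelvin.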

7.4 Anharmonic Effects in Crystals

So far, our general theory of lattice vibrations has been based on the initial assumption that the amplitudes of the oscillations are quite small. This has led us to the so-called harmonic approximation, where higher-order terms beyond the quadratic one are neglected in the Taylor expansion around the PES minimum. This approximation is plausible at low temperatures, keeps the mathematics tractable, and can indeed explain a wide range of material properties. We exemplified this in the last section for a fundamental equilibrium property like the specific heat. Similarly, we would find that the developed theory is able to correctly reproduce and predict other thermodynamic equilibrium properties, e.g., free energies, at low and intermediate temperatures. At elevated temperatures, however, this theoretical framework becomes inaccurate: As shown in Fig. 7.11 for aluminum, the specific heat $c_V$ is not a simple constant in the high-temperature regime. Under these thermodynamic conditions, our initial assumption of small oscillations obviously breaks down, so that higher-order terms in the Taylor expansion (so-called anharmonic terms) need to be accounted for. At elevated temperatures, these anharmonic effects often quantitatively change the predictions made by a harmonic theory. There are, however, also a number of materials properties that cannot be understood within a harmonic theory of lattice vibrations at all. The most important of these are the thermal expansion of crystals and heat transport.

7.4.1 Thermal Expansion

In a rigorously harmonic crystal the equilibrium size would not depend on temperature. One can prove this formally (rather lengthy, see Ashcroft and Mermin), but it already becomes intuitively clear by looking at the equilibrium position in an arbitrary one-dimensional PES minimum. If we expand the Taylor series only to second order around the minimum, the harmonic potential is symmetric about the minimum. From symmetry alone it is therefore clear that, in the harmonic approximation, the average position of a particle vibrating around this minimum must coincide with the minimum position itself at any temperature. If the PES minimum represents the bond length (regardless of whether in a molecule or a solid), no change will therefore be observed with temperature. Only anharmonic terms, at least the cubic one, can yield an asymmetric potential and thus a temperature-dependent equilibrium position. With this understanding, it is also obvious that the temperature dependence of elastic constants of solids, e.g., the bulk modulus, must result from anharmonic effects.

Let us consider the thermal expansion coefficient $\alpha$ in more detail. For an isotropic expansion of a cube of volume $V$, the expansion coefficient can be expressed as

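$$
\alpha = \frac{1}{3V} \left( \frac{\partial V}{\partial T} \right)_P = \frac{1}{3B} \left( \frac{\partial P}{\partial T} \right)_V \tag{7.103}
$$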

by using the thermodynamic definition of the bulk modulus

$$
\frac{1}{B} = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_T . \tag{7.104}
$$

Since we know that the pressure P can be written as

$$
P = -\left( \frac{\partial U}{\partial V} \right)_T , \tag{7.105}
$$

i.e., as the volume derivative of the internal energy $U$ given in Eq. (7.85), the thermal expansion coefficient $\alpha$ can be calculated via

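$$
\alpha = \frac{1}{3B} \sum_{i,\mathbf{q}} \left( -\frac{\partial \hbar\omega_i(\mathbf{q})}{\partial V} \right)_T \left( \frac{\partial n_i(\mathbf{q})}{\partial T} \right)_V . \tag{7.106}
$$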


Here, $n_i(\mathbf{q})$ is the Bose-Einstein distribution for mode $i$ at wave vector $\mathbf{q}$. Since its derivative also appears in the definition of the specific heat $c_V(T)$, it is common to express the thermal expansion coefficient in terms of the specific heat. For this purpose, mode-specific specific heats

$$
c_V(i,\mathbf{q},T) = \frac{\hbar\omega_i(\mathbf{q})}{V} \frac{\partial n_i(\mathbf{q})}{\partial T} \tag{7.107}
$$

and mode-specific Grüneisen parameters

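$$
\gamma_i(\mathbf{q}) = -\frac{V}{\omega_i(\mathbf{q})} \frac{\partial \omega_i(\mathbf{q})}{\partial V} = -\frac{\partial \ln(\omega_i(\mathbf{q}))}{\partial \ln(V)} \tag{7.108}
$$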

are introduced. This yields

$$
\alpha = \frac{1}{3B} \sum_{i,\mathbf{q}} \gamma_i(\mathbf{q})\, c_V(i,\mathbf{q},T) = \frac{\gamma(T)\, c_V(T)}{3B} . \tag{7.109}
$$

In the last step, we have used the definition of the overall Grüneisen parameter

$$
\gamma(T) = \frac{\sum_{i,\mathbf{q}} \gamma_i(\mathbf{q})\, c_V(i,\mathbf{q},T)}{\underbrace{\sum_{i,\mathbf{q}} c_V(i,\mathbf{q},T)}_{c_V(T)}} \tag{7.110}
$$

to further simplify the equation.

The above discussion and especially Eq. (7.110) show that, to a first approximation ($\gamma(T) \approx$ constant, e.g., 1.6 for NaCl and 1.5 for KBr), $\alpha(T)$ will exhibit a temperature dependence similar to that of $c_V$. Specifically, $\alpha(T)$ approaches zero as $T^3$ for $T \to 0$ and becomes constant at high $T$. Notable exceptions exist, though, e.g., the negative thermal expansion of silicon that is discussed in the exercises. In this case, the approximation $\gamma(T) \approx$ constant also has its limits: the actual dependence of the Grüneisen parameter on temperature (and volume) needs to be additionally accounted for. This is also true at high temperatures and/or in the case of strong anharmonicities (see exercises).
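The symmetry argument at the beginning of this section can be illustrated with a purely classical one-dimensional sketch. Everything in it is an assumption made for illustration: the coefficients `C`, `G`, `F`, the arbitrary units, and the additional quartic term, which is included only to keep the model potential bounded so that the Boltzmann integrals converge. For comparison, first-order perturbation theory in the cubic coefficient gives $\langle x \rangle \approx 3 g k_B T/(4c^2)$ for $V(x) = c x^2 - g x^3$, a standard textbook estimate that is not derived in these notes.

```python
import math


def mean_displacement(potential, k_b_t, x_max=6.0, n_points=20001):
    """Classical thermal average <x> with weights exp(-V(x)/k_B T), trapezoidal rule."""
    h = 2.0 * x_max / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        x = -x_max + k * h
        w = math.exp(-potential(x) / k_b_t)
        if k in (0, n_points - 1):
            w *= 0.5  # end-point weights of the trapezoidal rule
        num += x * w
        den += w
    return num / den


C, G, F = 1.0, 0.2, 0.05  # hypothetical coefficients in arbitrary units


def harmonic(x):
    return C * x * x


def anharmonic(x):
    # cubic term makes the well asymmetric; quartic term keeps it bounded
    return C * x * x - G * x**3 + F * x**4


for k_b_t in (0.01, 0.02, 0.05, 0.1):
    estimate = 3.0 * G * k_b_t / (4.0 * C * C)
    print(f"k_B T = {k_b_t:4.2f}: harmonic <x> = {mean_displacement(harmonic, k_b_t):+.2e}, "
          f"anharmonic <x> = {mean_displacement(anharmonic, k_b_t):.5f}, "
          f"3 g k_B T / (4 c^2) = {estimate:.5f}")
```

The harmonic average stays at zero (up to rounding) for all temperatures, whereas the anharmonic average grows approximately linearly with temperature, which is the one-dimensional analogue of thermal expansion.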

7.4.2 Heat Transport

Heat transport in solid, isotropic materials is described by Fourier’s law:

$$
\mathbf{J} = -\kappa \nabla T . \tag{7.111}
$$

Whenever a temperature gradient $\nabla T$ is present, a heat flux $\mathbf{J}$ develops that drives the system back to equilibrium. The material-specific, temperature-dependent coefficient that relates $\mathbf{J}$ and $\nabla T$ is the thermal conductivity $\kappa$. Heat can be transported through a crystal either by electrons (metals) or by phonons (all solids).⁷ We will discuss electronic transport in more detail in a later chapter. Here, we focus on the lattice vibrations, which are the dominant contribution in semiconductors and insulators.

Heat in the form of a once-created wavepacket of lattice vibrations would travel indefinitely in a rigorously harmonic crystal: This becomes evident from the fact that the

⁷ For transparent materials at elevated temperatures, i.e., when a material is already incandescent, photons can play a role, too.


Figure 7.12: Left: Schematic one-dimensional representation (potential energy $V$ versus generalized coordinate $R$) of the Taylor expansion for an anharmonic potential-energy surface $V^{\rm BO}$: Both the harmonic approximation (green) and a third-order expansion (7.112) that additionally includes a cubic term (orange) are shown. Right: Computed thermal conductivity $\kappa$ of silicon (red) and germanium (blue) compared to experiment (full symbols), from Broido et al., Appl. Phys. Lett. 91, 231922 (2007).

energy stored in each individual mode is time-independent: neither the amplitudes $A_i(\mathbf{q})$ in the classical case (see Eq. (7.31)) nor the occupation numbers $n_i(\mathbf{q})$ in the quantum-mechanical formalism (see Eq. (7.77)) change over time. This is not surprising: These oscillations are eigenstates of the harmonic Hamiltonian and thus do not decay over time, but are completely determined by the initial conditions. Consequently, a perfectly harmonic solid would exhibit an infinite thermal conductivity.

To explain the finite thermal conductivity of insulators and semiconductors, anharmonic terms beyond the quadratic one must be considered in the PES expansion, e.g., by extending the Taylor expansion in Eq. (7.8) to third order:

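\begin{align}
V^{\rm BO}(\{\mathbf{R}_I\}) \approx{}& V^{\rm BO}(\{\mathbf{R}_I^0\}) + \frac{1}{2} \sum_{I,J} \sum_{\mu,\nu} s_I^{\mu} s_J^{\nu} \left[ \frac{\partial^2 V^{\rm BO}(\{\mathbf{R}_I\})}{\partial R_I^{\mu}\, \partial R_J^{\nu}} \right]_{\{\mathbf{R}_I^0\}} \tag{7.112} \\
&+ \frac{1}{6} \sum_{I,J,K} \sum_{\mu,\nu,\eta} s_I^{\mu} s_J^{\nu} s_K^{\eta} \underbrace{\left[ \frac{\partial^3 V^{\rm BO}(\{\mathbf{R}_I\})}{\partial R_I^{\mu}\, \partial R_J^{\nu}\, \partial R_K^{\eta}} \right]_{\{\mathbf{R}_I^0\}}}_{\Psi^{\mu\nu\eta}_{IJK}} . \tag{7.113}
\end{align}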

As shown in Fig. 7.12, such an approximation obviously improves the description of the PES in the region close to the actual minimum with respect to a purely harmonic potential. Still, inaccuracies arise whenever the system is no longer close to equilibrium: In particular, this third-order potential yields qualitatively wrong results in this limit, i.e., it is unbounded from below, $V^{\rm BO}(\{\mathbf{R}_I\}) \to -\infty$, due to the asymmetry of the cubic term. This makes an exact solution of the equations of motion in such a potential essentially impossible, both analytically and numerically. Instead, the cubic $\Psi^{\mu\nu\eta}_{IJK}$ term is typically treated via perturbation theory by expanding the Hamiltonian for the third-order potential in terms of the harmonic solutions. The actual derivation is lengthy and complex (see for instance Broido, Ward, and Mingo, Phys. Rev. B 72, 014308 (2005)), so we only summarize the final result here: In an isotropic crystal, the thermal conductivity can be evaluated via

$$
\kappa = \frac{1}{(2\pi)^3} \sum_i \int d\mathbf{q}\; c_V(i,\mathbf{q},T)\, v_g^2(i,\mathbf{q})\, \tau_i(\mathbf{q}) . \tag{7.114}
$$

Loosely speaking, this expression just formalizes that each phonon mode $(i,\mathbf{q})$ transports its energy $c_V(i,\mathbf{q},T)$ (the mode-specific specific heat) with the respective group velocity $v_g(i,\mathbf{q})$ for a certain time (the lifetime $\tau_i(\mathbf{q})$). After this lifetime, which is the only term that depends on the $\Psi^{\mu\nu\eta}_{IJK}$, the complete contribution of the mode is lost. This finite lifetime arises since the harmonic phonons are now no longer perfect solutions of the equations of motion, but only a first approximation. Phonons can therefore "decay" by scattering off each other, and this immediately yields a finite thermal conductivity.

As shown in Fig. 7.12, this kind of approach yields qualitatively and quantitatively excellent results at low temperatures and for harmonic materials with large thermal conductivities, i.e., whenever anharmonicity can indeed be regarded as a small perturbation. In all other cases, this approach is prone to fail, since a third-order expansion only describes a small portion of the potential-energy surface that is explored in strongly anharmonic systems at elevated temperatures (see Fig. 7.12). Accordingly, an assessment of thermal insulators requires taking the full potential-energy surface into account. This can be achieved with molecular dynamics simulations, which will be discussed in the exercises. The thermal conductivity can then be evaluated from such simulations (see Carbogno, Ramprasad, and Scheffler, Phys. Rev. Lett. 118, 175901 (2017)).


8 Magnetism

8.1 Introduction

The topic of this part of the lecture is the magnetic properties of materials. Most materials are generally considered to be "non-magnetic", which is a loose way of saying that they become magnetized only in the presence of an applied magnetic field (dia- and paramagnetism). We will see that in most cases these effects are very weak and the magnetization is lost as soon as the external field is removed. Much more interesting (also from a technological point of view) are those materials which not only have a large magnetization, but also retain it even after the removal of the external field. Such materials are called permanent magnets (ferromagnetism).

The property that like (unlike) poles of permanent magnets repel (attract) each other was already known to the Ancient Greeks and Chinese over 2000 years ago. They used this knowledge for instance in compasses. Since then, the importance of magnets has risen steadily. They play an important role in many modern technologies:

• Recording media (e.g., hard disks): Data is recorded on a thin magnetic coating. The revolution in information technology owes as much to magnetic storage as to information processing with computer chips (i.e., the ubiquitous silicon chip).

• Credit, debit, and ATM cards: All of these cards have a magnetic strip on one side.

• TVs and computer monitors: Monitors contain a cathode ray tube that employs an electromagnet to guide electrons to the screen. Plasma screens and LCDs use different technologies.

• Speakers and microphones: Most speakers employ a permanent magnet and a current-carrying coil to convert electric energy (the signal) into mechanical energy (movement that creates the sound).

• Electric motors and generators: Most electric motors rely upon a combination of an electromagnet and a permanent magnet, and, much like loudspeakers, they convert electric energy into mechanical energy. A generator is the reverse: it converts mechanical energy into electric energy by moving a conductor through a magnetic field.

• Medicine: Hospitals use (nuclear) magnetic resonance tomography to spot problems in a patient's organs without invasive surgery.

Figure 8.1 illustrates the type of magnetism in the elementary materials of the periodic table. Only Fe, Ni, Co and Gd exhibit ferromagnetism. Most magnetic materials are therefore alloys or oxides of these elements or contain them in another form. There is, however,


Figure 8.1: Magnetic type of the elements in the periodic table. For elements without a color designation, magnetism is smaller or not explored.

recent and increased research into new magnetic materials such as plastic magnets (organic polymers), molecular magnets (often based on transition metals) or molecule-based magnets (in which magnetism arises from strongly localized s and p electrons).

Electrodynamics gives us a first impression of magnetism. Magnetic fields act on moving electric charges or, in other words, currents. Very roughly, one may understand the behavior of materials in magnetic fields as arising from the presence of "moving charges". Bound core and quasi-free valence electrons possess a spin, which in a very simplified classical picture can be viewed as a rotating charge, i.e., a current. Bound electrons have an additional orbital angular momentum, which adds another source of current (again in a simplistic classical picture). These "microscopic currents" react in two different ways to an applied magnetic field: First, according to Lenz's law, a current is induced that creates a magnetic field opposing the external magnetic field (diamagnetism). Second, the "individual magnets" represented by the electron currents align with the external field and enhance it (paramagnetism). If these effects happen for each "individual magnet" independently, they remain small. The corresponding magnetic properties of the material may then be understood from the individual behavior of the constituents, i.e., from the individual atoms (insulators, semiconductors) or from the ions and free electrons (conductors), cf. Sec. 8.3. The much stronger ferromagnetism, however, arises from a collective behavior of the "individual magnets" in the material. In Sec. 8.4 we will first discuss the source of such an interaction before we move on to simple models treating either a possible coupling of localized moments or itinerant ferromagnetism.

8.2 Macroscopic Electrodynamics

Before we plunge into the microscopic sources of magnetism in solids, let us first recap a few definitions from macroscopic electrodynamics. In vacuum we have E and B as the electric field (unit: V/m) and the magnetic flux density (unit: Tesla = Vs/m²), respectively. Note that both are vector quantities, and in principle they are also functions of space and time. Since we will only deal with constant, uniform fields in this lecture, this dependence will be dropped throughout. Inside macroscopic media, both fields can be affected by the charges and currents present in the material, yielding the two new net fields, D (electric displacement, unit: As/m²) and H (magnetic field, unit: A/m). For the formulation of macroscopic electrodynamics (the Maxwell equations in particular) we therefore need so-called constitutive relations between the external (applied) and internal (effective) fields:

D = D(E,B), H = H(E,B). (8.1)

In general, this functional dependence can be written in the form of a multipole expansion. In most materials, however, already the first (dipole) contribution is sufficient. These dipole terms are called P (el. polarization) and M (magnetization), and we can write

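$$\mathbf{H} = \frac{\mathbf{B}}{\mu_0} - \mathbf{M} + \dots \tag{8.2}$$

$$\mathbf{D} = \epsilon_0 \left( \mathbf{E} + \mathbf{P} + \dots \right) , \tag{8.3}$$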

where $\epsilon_0 = 8.85 \cdot 10^{-12}$ As/Vm and $\mu_0 = 4\pi \cdot 10^{-7}$ Vs/Am are the dielectric constant and permeability of vacuum. P and M depend on the applied field, which we can formally write as a Taylor expansion in E and B. If the applied fields are not too strong, one can truncate this series after the first term (linear response), i. e., the induced polarization/magnetization is then simply proportional to the applied field. We will see below that external magnetic fields we can generate at present in the laboratory are indeed weak compared to microscopic magnetic fields, i. e., the assumption of a magnetic linear response is often well justified. As a side note, current high-intensity lasers may, however, bring us easily out of the linear response regime for the electric polarization. Corresponding phenomena are treated in the field of non-linear optics.

For the magnetization, it is actually more convenient to expand in the internal field H instead of in B. In the linear response regime, we thus obtain for the induced magnetization and the electric polarization

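$$\mathbf{M} = \underline{\underline{\chi}}^{\mathrm{mag}}\, \mathbf{H} \;\Rightarrow\; \mathbf{M} = \chi^{\mathrm{mag}}\, \mathbf{H} \tag{8.4}$$

$$\mathbf{P} = \underline{\underline{\chi}}^{\mathrm{el}}\, \mathbf{E} \;\Rightarrow\; \mathbf{P} = \chi^{\mathrm{el}}\, \mathbf{E} . \tag{8.5}$$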

Here, $\underline{\underline{\chi}}^{\mathrm{el/mag}}$ is the dimensionless electric/magnetic susceptibility tensor. In simple materials (on which we will focus in this lecture), the linear response is often isotropic in space and parallel to the applied field. The susceptibility tensors then reduce to a scalar form $\chi^{\mathrm{mag/el}}$.

These equations are in fact often directly written as defining equations for the dimensionless susceptibility constants in solid state textbooks. It is important to remember, however, that we made the dipole approximation and restricted ourselves to isotropic media. For this case, the constitutive relations between external and internal field reduce to

$$\mathbf{H} = \frac{1}{\mu_0 \mu_r}\, \mathbf{B} \tag{8.6}$$

$$\mathbf{D} = \epsilon_0 \epsilon_r\, \mathbf{E} , \tag{8.7}$$


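| material | susceptibility $\chi^{\mathrm{mag}}$ | magnetic behavior |
|---|---|---|
| vacuum | 0 | |
| H2O | $-8 \cdot 10^{-6}$ | diamagnetic |
| Cu | $\sim -10^{-5}$ | diamagnetic |
| Al | $2 \cdot 10^{-5}$ | paramagnetic |
| iron (depends on purity) | $\sim$ 100 - 1000 | ferromagnetic |
| µ-metal (nickel-iron alloy) | $\sim$ 80,000 - 100,000 | basically screens everything (used in magnetic shielding) |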

Table 8.1: Magnetic susceptibility of some materials.

where $\epsilon_r = 1 + \chi^{\mathrm{el}}$ is the relative dielectric constant, and $\mu_r = 1 + \chi^{\mathrm{mag}}$ the relative permeability of the medium. For $\chi^{\mathrm{mag}} < 0$, the magnetization is anti-parallel to H. Accordingly, we have $\mu_r < 1$ and, from Eq. (8.6), $|\mathbf{B}/\mu_0| < |\mathbf{H}|$. The response of the medium reduces (or screens) the external field. Such a system is called diamagnetic. For the opposite case ($\chi^{\mathrm{mag}} > 0$), the magnetization is aligned with H. Hence, we have $\mu_r > 1$ and therefore $|\mathbf{B}/\mu_0| > |\mathbf{H}|$. The external field is enhanced by the material, and we talk about paramagnetism.

To connect to microscopic theories, we consider the energy. The process of screening or enhancing the external field will require work, i. e., the energy of the system is changed (magnetic energy). Assume therefore that we have a material of volume V which we bring into a magnetic field B (which for simplicity we take to be along one dimension only). From electrodynamics, we know that the energy of this magnetic field is

$$E^{\mathrm{mag}} = \frac{1}{2} B H V \;\Rightarrow\; \frac{1}{V}\, dE^{\mathrm{mag}} = \frac{1}{2} \left( B\, dH + H\, dB \right) . \tag{8.8}$$

From Eq. (8.6), we find dB = µ0µrdH and therefore H dB = B dH, which leads to

$$\frac{1}{V}\, dE^{\mathrm{mag}} = B\, dH = \big( \underbrace{\mu_0 H}_{B_0} + \mu_0 M \big)\, dH \tag{8.9}$$

The first term $\mu_0 H = B_0$ describes the field in the vacuum, i. e., without the material, and thus just gives the field-induced energy change if no material was present. Only the second term is material dependent and describes the energy change of the field in reaction to the material. Due to energy conservation, the energy change in the material itself is therefore

$$dE^{\mathrm{mag}}_{\mathrm{material}} = -\mu_0 M V\, dH . \tag{8.10}$$

Recalling that the energy of a dipole with magnetic moment m in a magnetic field is $E = -mB$, we see that the approximations leading to Eq. (8.6) are equivalent to assuming that the homogeneous solid is built up of a constant density of “molecular dipoles”, i. e., the magnetization M is the (average) dipole moment density.

By rearranging Eq. (8.10), we finally arrive at an expression that is a special case of a frequently employed alternative definition of the magnetization

$$M(H) = -\frac{1}{\mu_0 V} \left. \frac{\partial E(H)}{\partial H} \right|_{S,V} , \tag{8.11}$$

and in turn of the susceptibility

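$$\chi^{\mathrm{mag}}(H) = \left. \frac{\partial M(H)}{\partial H} \right|_{S,V} = -\frac{1}{\mu_0 V} \left. \frac{\partial^2 E(H)}{\partial H^2} \right|_{S,V} . \tag{8.12}$$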


At finite temperatures, it is straightforward to generalize these definitions to

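$$M(H, T) = -\frac{1}{\mu_0 V} \left. \frac{\partial F(H, T)}{\partial H} \right|_{S,V} , \qquad \chi^{\mathrm{mag}}(H, T) = -\frac{1}{\mu_0 V} \left. \frac{\partial^2 F(H, T)}{\partial H^2} \right|_{S,V} . \tag{8.13}$$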

While the derivation via macroscopic electrodynamics is most illustrative, we will see that these last two equations will be much more useful for the actual quantum-mechanical computation of the magnetization and susceptibility of real systems. All we have to do is to derive an expression for the energy of the system as a function of the applied external field. The first and second derivatives with respect to the field then yield M and $\chi^{\mathrm{mag}}$, respectively.
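The recipe "energy as a function of field, then differentiate" can be tried out numerically. The following is only a sketch under an assumed model: a linear medium with energy $E(H) = -\frac{1}{2} \mu_0 \chi V H^2$, which follows from integrating Eq. (8.10) with $M = \chi H$. The function names, the sample volume and the field strength are illustrative choices, not part of the lecture.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in Vs/Am


def energy_linear_medium(H, chi, V):
    """Assumed model energy E(H) = -1/2 mu0 chi V H^2 of a linear medium."""
    return -0.5 * MU0 * chi * V * H**2


def magnetization(energy, H, V, dH=1.0e3):
    """M = -(1/(mu0 V)) dE/dH, Eq. (8.11), via a central finite difference."""
    dE = (energy(H + dH) - energy(H - dH)) / (2.0 * dH)
    return -dE / (MU0 * V)


def susceptibility(energy, H, V, dH=1.0e3):
    """chi = -(1/(mu0 V)) d^2E/dH^2, Eq. (8.12), via a central finite difference."""
    d2E = (energy(H + dH) - 2.0 * energy(H) + energy(H - dH)) / dH**2
    return -d2E / (MU0 * V)


chi_al = 2e-5   # susceptibility of Al as listed in Table 8.1
V = 1e-6        # sample volume in m^3 (illustrative)
H = 1e5         # applied field in A/m (illustrative)


def energy_of_field(h):
    return energy_linear_medium(h, chi_al, V)


M_num = magnetization(energy_of_field, H, V)
chi_num = susceptibility(energy_of_field, H, V)
print(M_num, chi_num)
```

For this quadratic model the finite differences are exact up to rounding, so the script should recover $M = \chi H$ and the input susceptibility. For a quantum-mechanical energy expression the same two functions could be reused with a different `energy` callable.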

8.3 Magnetism of Atoms and Free Electrons

We had already discussed that magnetism arises from the “microscopic currents” connected to the orbital and spin moments of electrons. Each electron therefore represents a “microscopic magnet”. But how do they couple in materials with a large number of electrons? Since any material is composed of atoms, it is reasonable to first reduce the problem to that of the electronic coupling inside an atom before attempting to describe the coupling of “atomic magnets”. Since both orbital and spin momentum of all bound electrons of an atom will contribute to its magnetic behavior, it will be useful to first recall how the total electronic angular momentum of an atom is determined (Sec. 8.3.1) before we turn to the effect of a magnetic field on the atomic Hamiltonian (Sec. 8.3.2). In metals, we have the additional presence of free conduction electrons. Their magnetic properties will be addressed in Sec. 8.3.3. Finally, we will discuss in Sec. 8.3.4 which aspects of magnetism we can already understand just on the basis of these results on atomic magnetism, i. e., in the limit of vanishing coupling between the “atomic magnets”.

8.3.1 Total Angular Momentum of Atoms

In the atomic shell model, the possible quantum states of electrons are labeled as $n l m_l m_s$, where n is the principal quantum number, l the orbital quantum number, $m_l$ the orbital magnetic quantum number, and $m_s$ the spin magnetic quantum number. For any given n that defines a so-called “shell”, the orbital quantum number l can take only integer values between 0 and (n − 1). For l = 0, 1, 2, 3, . . ., we generally use the letters s, p, d, f and so on. Within such an nl “subshell”, the orbital magnetic number $m_l$ may have the (2l + 1) integer values between −l and +l. The last quantum number, the spin magnetic quantum number $m_s$, takes the values −1/2 and +1/2. For the example of the first two shells, this leads to two 1s, two 2s, and six 2p states.

When an atom has more than one electron, the Pauli exclusion principle dictates that each quantum state specified by the set of four quantum numbers $n l m_l m_s$ can only be occupied by one electron. In all but the heaviest ions, spin-orbit coupling¹ is negligible. In this case, the Hamiltonian does not depend on spin or the orbital moment and therefore commutes with the corresponding operators (Russell-Saunders coupling). The total orbital

¹ Spin-orbit (or LS) coupling is a relativistic effect that requires the solution of the Dirac equation.


and spin angular momentum for a given subshell are then good quantum numbers and are given by

$$L = \sum m_l \quad \text{and} \quad S = \sum m_s . \tag{8.14}$$

If a subshell is completely filled, it is easy to verify that L = S = 0. The total electronic angular momentum of an atom, J = L + S, is therefore determined by the partially filled subshells. The occupation of quantum states in a partially filled shell is given by Hund's rules:

1. The states are occupied so that as many electrons as possible (within the limitations of the Pauli exclusion principle) have their spins aligned parallel to one another, i. e., so that the value of S is as large as possible.

2. Once it is determined how the spins are assigned, the electrons occupy states such that the value of L is a maximum.

3. The total angular momentum J is obtained by combining L and S as follows:

• if the subshell is less than half filled (i. e., if the number of electrons is < 2l + 1), then J = |L − S|;

• if the subshell is more than half filled (i. e., if the number of electrons is > 2l + 1), then J = L + S;

• if the subshell is exactly half filled (i. e., if the number of electrons is = 2l + 1), then L = 0, J = S.

The first two Hund's rules are determined purely from electrostatic energy considerations, e. g., electrons with equal spins are farther away from each other on account of the exchange hole. Only the third rule follows from spin-orbit coupling.

Each quantum state in this partially filled subshell is called a multiplet. The term comes from the spectroscopy community and refers to multiple lines (peaks) that have the same origin (in this case the same subshell nl). Without any interaction, e. g. electron-electron, spin-orbit, etc., all quantum states in this subshell would be energetically degenerate and the spectrum would show only one peak at this energy. The interaction lifts the degeneracy and, in an ensemble of atoms, different atoms might be in different quantum states. The single peak then splits into multiple peaks in the spectrum. The ground-state multiplet obtained by Hund's rules is not simply denoted LSJ, as one might have expected. Instead, for historical reasons, the notation $^{(2S+1)}L_J$ is used. To make matters worse, the letter symbols of the angular momentum are used for the total orbital angular momentum (L = 0, 1, 2, 3, . . . = S, P, D, F, . . .). The rules are in fact easier to apply than their description might suggest at first glance. Table 8.2 lists the filling and notation for the example of a d-shell.
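Hund's rules as stated above are mechanical enough to be written as a short routine. The following sketch is an illustration, not part of the lecture: the function name `hund_ground_state` and the plain-text symbol format (for example `2D3/2` for $^2D_{3/2}$) are assumptions made here. It fills the orbitals from $m_l = +l$ downwards, first with parallel spins and then with the opposite spin, and applies the third rule to obtain J.

```python
from fractions import Fraction

# Spectroscopic letters for L = 0, 1, 2, ... (the letter J is skipped by convention).
LETTERS = "SPDFGHIKLMN"


def hund_ground_state(l, n_electrons):
    """Return (S, L, J, symbol) for n_electrons in a subshell with orbital quantum number l."""
    n_orbitals = 2 * l + 1
    if not 0 <= n_electrons <= 2 * n_orbitals:
        raise ValueError("subshell cannot hold that many electrons")
    ml_values = list(range(l, -l - 1, -1))  # m_l = +l, ..., -l
    # Rule 1: the first (2l+1) electrons enter with parallel spins, the rest pair up.
    first_pass = min(n_electrons, n_orbitals)
    second_pass = n_electrons - first_pass
    S = Fraction(first_pass - second_pass, 2)
    # Rule 2: within each pass the largest m_l values are occupied first.
    L = abs(sum(ml_values[:first_pass]) + sum(ml_values[:second_pass]))
    # Rule 3: J = |L - S| up to half filling, J = L + S beyond it.
    J = abs(L - S) if n_electrons <= n_orbitals else L + S
    symbol = f"{2 * S + 1}{LETTERS[L]}{J}"
    return S, L, J, symbol


# Ground-state terms of the d-shell (l = 2) for 1 to 10 electrons.
d_shell_terms = [hund_ground_state(2, n)[3] for n in range(1, 11)]
print(d_shell_terms)
```

The printed list can be compared entry by entry with the last column of Table 8.2.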

8.3.2 General derivation of atomic susceptibilities

Having determined the total angular momentum of an atom, we now turn to the effect ofa uniform magnetic field

$$\mathbf{H} = \begin{pmatrix} 0 \\ 0 \\ H \end{pmatrix} \tag{8.15}$$


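| el. | $m_l = 2$ | 1 | 0 | −1 | −2 | S | $L = \lvert\sum m_l\rvert$ | J | Symbol |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ↓ | | | | | 1/2 | 2 | 3/2 | $^2D_{3/2}$ |
| 2 | ↓ | ↓ | | | | 1 | 3 | 2 | $^3F_2$ |
| 3 | ↓ | ↓ | ↓ | | | 3/2 | 3 | 3/2 | $^4F_{3/2}$ |
| 4 | ↓ | ↓ | ↓ | ↓ | | 2 | 2 | 0 | $^5D_0$ |
| 5 | ↓ | ↓ | ↓ | ↓ | ↓ | 5/2 | 0 | 5/2 | $^6S_{5/2}$ |
| 6 | ↓↑ | ↓ | ↓ | ↓ | ↓ | 2 | 2 | 4 | $^5D_4$ |
| 7 | ↓↑ | ↓↑ | ↓ | ↓ | ↓ | 3/2 | 3 | 9/2 | $^4F_{9/2}$ |
| 8 | ↓↑ | ↓↑ | ↓↑ | ↓ | ↓ | 1 | 3 | 4 | $^3F_4$ |
| 9 | ↓↑ | ↓↑ | ↓↑ | ↓↑ | ↓ | 1/2 | 2 | 5/2 | $^2D_{5/2}$ |
| 10 | ↓↑ | ↓↑ | ↓↑ | ↓↑ | ↓↑ | 0 | 0 | 0 | $^1S_0$ |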

Table 8.2: Ground states of ions with partially filled d-shells (l = 2), as constructed from Hund's rules.

along the z-axis on the electronic Hamiltonian of an atom, $H^e = T^e + V^{e-\mathrm{ion}} + V^{e-e}$. Note that the focus on $H^e$ means that we neglect the effect of H on the nuclear motion and spin. This is in general justified by the much greater mass of the nuclei, which renders the nuclear contribution to the atomic magnetic moment very small, but it would of course be crucial if we were to address, e. g., nuclear magnetic resonance (NMR) experiments. Since $V^{e-\mathrm{ion}}$ and $V^{e-e}$ are not affected by the magnetic field, we are left with the kinetic energy operator.

From classical electrodynamics, we know that in the presence of a magnetic field, we have to replace all momenta p (= $-i\hbar\nabla$) by the canonical momenta p → p + eA. For a uniform magnetic field (along the z-axis), a suitable vector potential A is

$$\mathbf{A} = -\frac{1}{2} \left( \mathbf{r} \times \mu_0 \mathbf{H} \right) , \tag{8.16}$$

for which it is straightforward to verify that it fulfills the conditions $\mu_0 \mathbf{H} = \nabla \times \mathbf{A}$ and $\nabla \cdot \mathbf{A} = 0$. The total kinetic energy operator is then written

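$$T^e(\mathbf{H}) = \frac{1}{2m} \sum_k \left[ \mathbf{p}_k + e\mathbf{A} \right]^2 = \frac{1}{2m} \sum_k \left[ \mathbf{p}_k - \frac{e}{2} \left( \mathbf{r}_k \times \mu_0 \mathbf{H} \right) \right]^2 . \tag{8.17}$$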

Expanding the square leads to

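$$T^e(\mathbf{H}) = \sum_k \left[ \frac{\mathbf{p}_k^2}{2m} + \frac{e}{2m}\, \mathbf{p}_k \cdot \left( \mu_0 \mathbf{H} \times \mathbf{r}_k \right) + \frac{e^2}{8m} \left( \mathbf{r}_k \times \mu_0 \mathbf{H} \right)^2 \right] \tag{8.18}$$

$$\phantom{T^e(\mathbf{H})} = T^e_o + \sum_k \frac{e \mu_0}{2m}\, \mathbf{H} \cdot \left( \mathbf{r}_k \times \mathbf{p}_k \right) + \frac{e^2 \mu_0^2 H^2}{8m} \sum_k \left( x_k^2 + y_k^2 \right) ,$$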

where we have used the fact that $\mathbf{p} \cdot \mathbf{A} = \mathbf{A} \cdot \mathbf{p}$ if $\nabla \cdot \mathbf{A} = 0$ and that the first term $T^e_o$ describes the kinetic energy of the system without any field. We simplify this equation further by introducing the dimensionless (!) total electronic orbital momentum operator $\mathbf{L} = \frac{1}{\hbar} \sum_k (\mathbf{r}_k \times \mathbf{p}_k)$ and the Bohr magneton $\mu_B = e\hbar/2m = 0.579 \cdot 10^{-4}$ eV/T, so that the variation of the kinetic energy operator due to the magnetic field can be expressed as

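$$\Delta T^e(\mathbf{H}) = T^e(\mathbf{H}) - T^e_o = \mu_B \mu_0\, \mathbf{L} \cdot \mathbf{H} + \frac{e^2 \mu_0^2}{8m} H^2 \sum_k \left( x_k^2 + y_k^2 \right) . \tag{8.19}$$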
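The numerical value of the Bohr magneton quoted in the text can be reproduced from the fundamental constants. This is a small arithmetic sketch; the constant values are the CODATA values as remembered here and the variable names are illustrative.

```python
E_CHARGE = 1.602176634e-19      # elementary charge in C
HBAR = 1.054571817e-34          # reduced Planck constant in Js
M_ELECTRON = 9.1093837015e-31   # electron mass in kg

# mu_B = e * hbar / (2 m), first in J/T, then converted to eV/T (1 eV = E_CHARGE joules).
mu_B_joule_per_tesla = E_CHARGE * HBAR / (2.0 * M_ELECTRON)
mu_B_ev_per_tesla = mu_B_joule_per_tesla / E_CHARGE

# Orbital Zeeman energy scale mu_B * mu_0 * H for a flux density mu_0 * H = 1 T.
zeeman_scale_ev = mu_B_ev_per_tesla * 1.0

print(mu_B_ev_per_tesla, zeeman_scale_ev)
```

The result, about $0.58 \cdot 10^{-4}$ eV per Tesla, sets the energy scale of the term linear in H in Eq. (8.19).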


If this were the only effect of the magnetic field on $H^e$, however, one could show that the magnetization in thermal equilibrium must always vanish (Bohr-van Leeuwen theorem). In simple terms, this is because the vector potential amounts to a simple shift in p which integrates out when a summation over all momenta is performed. The free energy is then independent of H and the magnetization is zero.

In reality, however, the magnetization does not vanish. The problem is that we have used the wrong kinetic energy operator. We should have solved the relativistic Dirac equation

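$$\begin{pmatrix} V - mc^2 & -c\, \boldsymbol{\sigma} \cdot (\mathbf{p} - e\mathbf{A}) \\ c\, \boldsymbol{\sigma} \cdot (\mathbf{p} - e\mathbf{A}) & -V - mc^2 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} = E \begin{pmatrix} \phi \\ \chi \end{pmatrix} . \tag{8.20}$$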

Here, $\phi = \begin{pmatrix} \phi_\uparrow \\ \phi_\downarrow \end{pmatrix}$ is an electron wave function and $\chi = \begin{pmatrix} \chi_\uparrow \\ \chi_\downarrow \end{pmatrix}$ a positron one, $\boldsymbol{\sigma}$ is a vector of 2 × 2 Pauli spin matrices, c is the speed of light, and V is the potential energy. To solve and fully understand this equation, one would have to resort to quantum electrodynamics (QED), which goes far beyond the scope of this lecture.

We therefore heuristically introduce another momentum, the electron spin, that would emerge from the fully relativistic treatment mentioned above. The new interaction energy term has the form

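$$\Delta H^{\mathrm{spin}}(\mathbf{H}) = g_0 \mu_B \mu_0\, \mathbf{S} \cdot \mathbf{H} , \quad \text{where} \quad \mathbf{S} = \sum_k \mathbf{S}_k . \tag{8.21}$$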

Here, S is the (again dimensionless) total spin angular momentum operator and $g_0$ is the so-called electronic g-factor (i. e., the g-factor for one electron spin). With the Dirac equation one obtains $g_0 = 2$, whereas a full QED approach yields

$$g_0 = 2 \left[ 1 + \frac{\alpha}{2\pi} + O(\alpha^2) + \dots \right] = 2.0023 \dots , \quad \text{where} \quad \alpha = \frac{e^2}{\hbar c} \approx \frac{1}{137} . \tag{8.22}$$
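As a quick arithmetic check of Eq. (8.22), the first-order QED correction can be evaluated directly. The value used for the fine-structure constant is an assumption of this sketch (the commonly quoted 1/137.036); the variable names are illustrative.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant (assumed value)

g0_dirac = 2.0
# Leading correction of Eq. (8.22): g0 = 2 * (1 + alpha / (2 pi)), higher orders dropped.
g0_first_order = 2.0 * (1.0 + ALPHA / (2.0 * math.pi))

# Relative error made by setting g0 = 2.
relative_error = (g0_first_order - g0_dirac) / g0_first_order

print(g0_first_order, relative_error)
```

The relative deviation from 2 is of the order of $10^{-3}$, which is why using $g_0 = 2$ costs little accuracy.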

For the sake of notational conciseness, we will use a value of $g_0 = 2$ in this lecture. Eventually, the total field-dependent Hamiltonian is therefore

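$$\Delta H^e(\mathbf{H}) = \Delta T^e(\mathbf{H}) + \Delta H^{\mathrm{spin}}(\mathbf{H}) = \mu_0 \mu_B \left( \mathbf{L} + g_0 \mathbf{S} \right) \cdot \mathbf{H} + \frac{e^2 \mu_0^2}{8m} H^2 \sum_k \left( x_k^2 + y_k^2 \right) . \tag{8.23}$$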

We will see below that the energy shifts produced by Eq. (8.23) are generally quite small on the scale of atomic excitation energies, even for the highest presently attainable laboratory field strengths. Perturbation theory can thus be used to calculate the changes in the energies connected with this modified Hamiltonian. Remember that magnetization and susceptibility were the first and second derivative of the field-dependent energy with respect to the field, which is why we would like to write the energy as a function of H. Since we will require the second derivative later, terms to second order in H must be retained. Recalling second order perturbation theory, we can write the energy of the nth non-degenerate level of the unperturbed Hamiltonian as

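$$E_n \rightarrow E_n + \Delta E_n(\mathbf{H}); \qquad \Delta E_n = \langle n | \Delta H^e(\mathbf{H}) | n \rangle + \sum_{n' \neq n} \frac{ | \langle n | \Delta H^e(\mathbf{H}) | n' \rangle |^2 }{ E_n - E_{n'} } . \tag{8.24}$$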


Substituting Eq. (8.23) into the above and retaining terms to quadratic order in H, we arrive at the basic equation for the theory of the magnetic susceptibility of atoms:

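$$\Delta E_n = \overbrace{ \mu_0 \mu_B \langle n | \left( \mathbf{L} + g_0 \mathbf{S} \right) \cdot \mathbf{H} | n \rangle }^{\Delta E^{\mathrm{Para}}} + \overbrace{ \frac{e^2 \mu_0^2}{8m} H^2 \langle n | \sum_k \left( x_k^2 + y_k^2 \right) | n \rangle }^{\Delta E^{\mathrm{LL}}} + \underbrace{ \mu_0^2 \mu_B^2 \sum_{n' \neq n} \frac{ | \langle n | \left( \mathbf{L} + g_0 \mathbf{S} \right) \cdot \mathbf{H} | n' \rangle |^2 }{ E_n - E_{n'} } }_{\Delta E^{\mathrm{VV}}} . \tag{8.25}$$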

Since we are mostly interested in the atomic ground state |0⟩, we will now proceed to evaluate the magnitude of the three terms contained in Eq. (8.25) and their corresponding susceptibilities. This will also give us a feeling for the physics behind the different contributions. Beforehand, a few remarks (that will be justified during the discussion below) on the order of magnitude of these terms:

• If J = L = S = 0, we are dealing with a closed shell atom. Accordingly, also excitation energies $E_n - E_{n'}$ will be large and the only non-vanishing term is $\Delta E^{\mathrm{LL}}$.

• If J = 0, but L and S are not, we get an additional contribution from $\Delta E^{\mathrm{VV}}$. The contribution will be the stronger the less energy an electronic excitation $E_n - E_{n'}$ costs.

• If J ≠ 0, $\Delta E^{\mathrm{Para}} \neq 0$ will completely dominate Eq. (8.25). In this case, however, the ground state is (2J + 1)-degenerate and we have to diagonalize the matrix in $\Delta E^{\mathrm{Para}}$.
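The three cases listed above amount to a simple decision rule on the ground-state quantum numbers. The following helper only restates that rule in code form; the function name and the returned labels are illustrative and not part of the lecture.

```python
def dominant_magnetic_term(J, L, S):
    """Name the leading contribution(s) to Eq. (8.25) for an atomic ground state (J, L, S)."""
    if J != 0:
        # Degenerate ground state: the linear (paramagnetic) term dominates.
        return "Para"
    if L == 0 and S == 0:
        # Closed shell: only the Larmor/Langevin term survives.
        return "LL"
    # J = 0 with non-vanishing L and S: Larmor and Van Vleck terms compete.
    return "LL+VV"


# Examples: a noble-gas atom, the d^4 configuration of Table 8.2, and a d^1 ion.
examples = [
    dominant_magnetic_term(0, 0, 0),
    dominant_magnetic_term(0, 2, 2),
    dominant_magnetic_term(1.5, 2, 0.5),
]
print(examples)
```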

8.3.2.1 ∆ELL: Larmor/Langevin Diamagnetism

To obtain an estimate for the magnitude of this term, we assume spherically symmetric wavefunctions ($\langle 0 | \sum_k (x_k^2 + y_k^2) | 0 \rangle \approx \frac{2}{3} \langle 0 | \sum_k r_k^2 | 0 \rangle$). This allows us to approximate the energy shift in the atomic ground state due to the second term as

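$$\Delta E^{\mathrm{dia}}_0 \approx \frac{e^2 \mu_0^2 H^2}{12m} \sum_k \langle 0 | r_k^2 | 0 \rangle \sim \frac{e^2 \mu_0^2 H^2}{12m}\, Z\, \bar{r}^2_{\mathrm{atom}} , \tag{8.26}$$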

where Z is the total number of electrons in the atom (resulting from the sum over the k electrons in the atom), and $\bar{r}^2_{\mathrm{atom}}$ is the mean square atomic radius. If we take Z ∼ 30 and $\bar{r}^2_{\mathrm{atom}}$ of the order of Å², we find $\Delta E^{\mathrm{dia}}_0 \sim 10^{-9}$ eV even for fields of the order of Tesla. This contribution is therefore rather small. We make a similar observation from an order of magnitude estimate for the susceptibility,

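$$\chi^{\mathrm{mag,dia}} = -\frac{1}{\mu_0 V} \frac{\partial^2 E_0}{\partial H^2} = -\frac{\mu_0 e^2 Z\, \bar{r}^2_{\mathrm{atom}}}{6 m V} \sim -10^{-4} . \tag{8.27}$$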
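The two order-of-magnitude estimates in Eqs. (8.26) and (8.27) can be reproduced with a few lines of arithmetic. This is a sketch with assumed inputs: Z = 30 and a mean square radius of 1 Å² as in the text, and a volume per atom of 10 Å³, which is an illustrative choice made here (the text does not state the volume it uses).

```python
import math

MU0 = 4.0e-7 * math.pi          # vacuum permeability in Vs/Am
E_CHARGE = 1.602176634e-19      # elementary charge in C
M_ELECTRON = 9.1093837015e-31   # electron mass in kg
ANGSTROM = 1.0e-10              # in m


def larmor_energy_shift_ev(B_tesla, Z, r2_m2):
    """Eq. (8.26) with mu0*H = B: e^2 B^2 Z <r^2> / (12 m), converted from J to eV."""
    shift_joule = E_CHARGE**2 * B_tesla**2 * Z * r2_m2 / (12.0 * M_ELECTRON)
    return shift_joule / E_CHARGE


def larmor_susceptibility(Z, r2_m2, volume_m3):
    """Eq. (8.27): chi = -mu0 e^2 Z <r^2> / (6 m V), dimensionless."""
    return -MU0 * E_CHARGE**2 * Z * r2_m2 / (6.0 * M_ELECTRON * volume_m3)


dE_dia = larmor_energy_shift_ev(1.0, 30, ANGSTROM**2)                  # field of 1 T
chi_dia = larmor_susceptibility(30, ANGSTROM**2, 10.0 * ANGSTROM**3)   # assumed 10 A^3 per atom

print(dE_dia, chi_dia)
```

With these inputs the energy shift comes out at a few $10^{-9}$ eV and the susceptibility at roughly $-10^{-4}$, in line with the estimates quoted in the text.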

With $\chi^{\mathrm{mag,dia}} < 0$, this term represents a diamagnetic contribution (the so-called Larmor or Langevin diamagnetism), i. e., the associated magnetic moment screens the external field. We have thus identified the term which we qualitatively expected to arise out of the induction currents initiated by the applied magnetic field.


We will see below that this diamagnetic contribution will only determine the overall magnetic susceptibility of the atom when the other two terms in Eq. (8.25) vanish. This is only the case for atoms with J|0⟩ = L|0⟩ = S|0⟩ = 0, i. e., for closed shell atoms or ions (e.g. the noble gas atoms). In this case, the ground state |0⟩ is indeed non-degenerate, which justifies our use of Eq. (8.25) to calculate the energy level shift. As we recall from chapter 6 on cohesion, the first excited state is much higher in energy for such closed shell systems, so that at all but the highest temperatures, there is also a negligible probability of the atom being in any but its ground state in thermal equilibrium. This means that the diamagnetic susceptibility will be largely temperature independent (otherwise it would have been necessary to use Eq. (8.13) for the derivation of $\chi^{\mathrm{mag,dia}}$).

8.3.2.2 ∆EVV: Van Vleck Paramagnetism

We first note that the evaluation of this term for a non-degenerate (J = 0) ground state |0⟩ and for a field aligned along the z-direction yields

$$\Delta E^{\mathrm{VV}} = \mu_0^2 \mu_B^2 H^2 \sum_{n \neq 0} \frac{ | \langle 0 | (L_z + g_0 S_z) | n \rangle |^2 }{ E_0 - E_n } , \tag{8.28}$$

in which $L_z$ and $S_z$ are the total angular projection operators along the z-axis for the orbital and spin angular momenta, respectively. By taking the second derivative, we obtain the susceptibility

$$\chi^{\mathrm{mag,vleck}} = \frac{2 \mu_0 \mu_B^2}{V} \sum_{n \neq 0} \frac{ | \langle 0 | (L_z + g_0 S_z) | n \rangle |^2 }{ E_n - E_0 } . \tag{8.29}$$

Note that we have compensated for the minus sign in the definition of the susceptibility (8.12) by reversing the order in the denominator. Since the energy of any excited state will necessarily be higher than the ground state energy ($E_n > E_0$), we thus get $\chi^{\mathrm{mag,vleck}} > 0$. The third term therefore represents a paramagnetic contribution (the so-called Van Vleck paramagnetism), which is connected to field-induced electronic transitions. If the electronic ground state is non-degenerate (and only for this case does the above formula hold), it is precisely the normally quite large separation between electronic states which makes this term similarly small as the diamagnetic one. Van Vleck paramagnetism therefore only plays an important role for atoms or ions that fulfill the following conditions: First, the total angular momentum J has to vanish, otherwise the paramagnetic term $\Delta E^{\mathrm{Para}}$ discussed in the next section would completely overshadow $\Delta E^{\mathrm{VV}}$. Second, L and S must not vanish individually, otherwise the contribution of $\Delta E^{\mathrm{VV}}$ would become vanishingly small due to the large separation between ground and excited states in closed-shell configurations. A closer look at Hund's rules (see Tab. 8.2 for the example of a d-shell) reveals that this is only the case for atoms whose partially filled shell is one electron short of being half filled. The magnetic behavior of such atoms or ions is determined by a balance between Larmor diamagnetism and Van Vleck paramagnetism. Both terms are small and tend to cancel each other. The atomic lanthanide series is a notable exception: In this case, the energy spacing between the ground and excited states is small enough so that Van Vleck paramagnetism indeed gives a notable contribution that is comparable in strength to $\Delta E^{\mathrm{Para}}$ both for Eu (J = 0) and for Sm (J = 5/2)², see Fig. 8.2. Here the effective

² Obviously, Eq. (8.28) is not directly applicable to Sm with J = 5/2. Rather, the six degenerate states $J_z \in \{-5/2, -3/2, \dots, +5/2\}$ need to be explicitly considered, as discussed in the next section.


Figure 8.2: Van Vleck paramagnetism for Sm and Eu in the lanthanide atoms. From Van Vleck's Nobel lecture.

magneton number $\mu_{\mathrm{eff}}$ is plotted, which is defined in terms of the susceptibility

$$\chi = \frac{N \mu_{\mathrm{eff}}^2 \mu_B^2}{3 k_B T} .$$

8.3.2.3 ∆EPara : Paramagnetism

We again evaluate this term for a field aligned with the z-direction and get:

$$\Delta E^{\mathrm{Para}} = \mu_0 \mu_B H \langle n | L_z + g_0 S_z | n \rangle . \tag{8.30}$$

As mentioned before, this first term does not vanish for any atom with J ≠ 0 (which is the case for most atoms) and will then completely dominate the magnetic behavior, so that we can neglect the second and third term in Eq. (8.25) in this case. However, atoms with J ≠ 0 have a (2J + 1)-fold degenerate ground state, which implies that the simple form of Eq. (8.25) cannot be applied. Instead, we have to diagonalize the perturbation matrix defined in Eq. (8.30) with respect to the α = 1, . . . , (2J + 1) degenerate ground states:

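$$\Delta E_{0,\alpha} = \mu_0 \mu_B H \sum_{\alpha'=1}^{(2J+1)} \langle 0\alpha | L_z + g_0 S_z | 0\alpha' \rangle = \mu_0 \mu_B H \sum_{\alpha'=1}^{(2J+1)} V_{\alpha,\alpha'} . \tag{8.31}$$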


The respective energy shifts are obtained by diagonalizing the interaction matrix $V_{\alpha,\alpha'}$ within the ground-state subspace. This is a standard problem in atomic physics (see e.g. Ashcroft/Mermin) and one finds that the basis diagonalizing this matrix consists of the states defined by J and $J_z$,

$$\langle JLS, J_z | L_z + g_0 S_z | JLS, J_z' \rangle = g(JLS)\, J_z\, \delta_{J_z, J_z'} , \tag{8.32}$$

where JLS are the quantum numbers defining the atomic ground state |0⟩ in the shell model and g(JLS) is the so-called Landé g-factor. Setting $g_0 \approx 2$, this factor is obtained as

$$g(JLS) = \frac{3}{2} + \frac{1}{2} \left[ \frac{ S(S+1) - L(L+1) }{ J(J+1) } \right] . \tag{8.33}$$
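Eq. (8.33) is easy to evaluate exactly with rational arithmetic. The sketch below is illustrative (the function name `lande_g` is an assumption); it uses $g_0 = 2$ as in the text and checks a few limiting cases: a pure spin moment should give g = 2 and a pure orbital moment g = 1.

```python
from fractions import Fraction


def lande_g(J, L, S):
    """Lande g-factor of Eq. (8.33), assuming g0 = 2; J, L, S may be ints or Fractions."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    if J == 0:
        raise ValueError("g(JLS) is not defined for J = 0")
    return Fraction(3, 2) + Fraction(1, 2) * (S * (S + 1) - L * (L + 1)) / (J * (J + 1))


g_spin_only = lande_g(Fraction(1, 2), 0, Fraction(1, 2))   # L = 0: pure spin
g_orbital_only = lande_g(2, 2, 0)                          # S = 0: pure orbital
g_d1 = lande_g(Fraction(3, 2), 2, Fraction(1, 2))          # d^1 ground state of Table 8.2

print(g_spin_only, g_orbital_only, g_d1)
```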

If we insert Eq. (8.32) into Eq. (8.31), we obtain

$$\Delta E_{JLS, J_z} = g(JLS)\, \mu_0 \mu_B\, J_z H , \tag{8.34}$$

i. e., the magnetic field splits the degenerate ground state into (2J + 1) equidistant levels separated by $g(JLS) \mu_0 \mu_B H$, i.e.,

$$g(JLS)\, \mu_0 \mu_B H\, J_z \quad \text{with} \quad J_z \in \{ -J, -J+1, \dots, J-1, J \} . \tag{8.35}$$

This is the same effect as the magnetic field would have on a magnetic dipole with magnetic moment

$$\mathbf{m}_{\mathrm{atom}} = -g(JLS)\, \mu_0 \mu_B\, \mathbf{J} . \tag{8.36}$$

Therefore, the first term in Eq. (8.25) can be interpreted as the expected paramagnetic contribution due to the alignment of a “microscopic magnet”. This is of course only true when we have unpaired electrons in partially filled shells (J ≠ 0).

Before we proceed to derive the paramagnetic susceptibility that arises from this contribution, we note that this identification of $\mathbf{m}_{\mathrm{atom}}$ via Eq. (8.25) serves nicely to illustrate the difference between phenomenological and so-called first-principles theories. We have seen that the simple Hamiltonian $H^{\mathrm{spin}} = -\mathbf{m}_{\mathrm{atom}} \cdot \mathbf{H}$ would equally well describe the splitting of the (2J + 1) ground state levels. If we had only known this splitting, e.g. from experiment (Zeeman effect), we could have simply used $H^{\mathrm{spin}}$ as a phenomenological, so-called “spin Hamiltonian”. By fitting to the experimental splitting, we could even have obtained the magnitude of the magnetic moment for individual atoms (which due to their different total angular momenta is of course different for different species). Examining the fitted magnitudes, we might even have discovered that the splitting is connected to the total angular momentum. The virtue of the first-principles derivation used in this lecture, on the other hand, is that this prefactor is directly obtained, without any fitting to experimental data. This not only allows us to unambiguously check our theory by comparison with the experimental data (which is not possible in the phenomenological theory, where the fitting can often easily cover up for deficiencies in the assumed ansatz). We also directly obtain the dependence on the total angular momentum and can even predict the properties of atoms which have not (yet) been measured. Admittedly, in many cases first-principles theories are harder to derive. Later we will see more examples of “spin Hamiltonians” which describe the observed magnetic behavior of the material very well, but for which an all-encompassing first-principles theory is still lacking.


[Plot: horizontal axis x from 0 to 4, vertical axis BJ(x) from 0 to 1, with curves for J = 1/2, 1, 3/2, 2, 5/2.]

Figure 8.3: Plot of the Brillouin function BJ(x) for various values of the spin J (the total angular momentum).

8.3.2.4 Paramagnetic Susceptibility: Curie’s Law

The separation of the (2J + 1) ground state levels of a paramagnetic atom in a magnetic field is $g(JLS) \mu_0 \mu_B H$. Recalling that $\mu_B = 0.579 \cdot 10^{-4}$ eV/T, we see that the splitting is only of the order of $10^{-4}$ eV, even for fields of the order of Tesla. This is small compared to $k_B T$ for realistic temperatures. Accordingly, more than one level will be populated at finite temperatures, so that we have to use statistical mechanics to evaluate the susceptibility, i.e., the derivatives of the free energy as defined in Eq. (8.13). The Helmholtz free energy

$$F = -k_B T \ln Z \tag{8.37}$$

is given in terms of the partition function Z with

$$Z = \sum_n e^{ -\frac{E_n}{k_B T} } . \tag{8.38}$$

Here, n runs over $J_z = -J, \dots, J$ and $E_n = g(JLS) \mu_0 \mu_B H J_z$ is the energy of the corresponding level. By defining

$$\eta = \frac{ g(JLS)\, \mu_0 \mu_B H }{ k_B T }$$

so that η is the ratio of the magnetic splitting to the thermal energy, it becomes evident that the partition function is a geometric sum that evaluates to:

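$$Z = \sum_{J_z=-J}^{J} e^{-\eta J_z} = e^{\eta J} + e^{\eta (J-1)} + \dots + e^{-\eta (J-1)} + e^{-\eta J} = \frac{ e^{-\eta (J+1/2)} - e^{\eta (J+1/2)} }{ e^{-\eta/2} - e^{\eta/2} } = \frac{ \sinh \left[ (J + 1/2) \eta \right] }{ \sinh \left[ \eta/2 \right] } . \tag{8.39}$$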
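The closed form in Eq. (8.39) can be checked against the direct sum over the (2J + 1) levels. In this sketch, J is passed as the integer 2J so that half-integer values are handled exactly in the loop; this convention and the function names are choices made here for illustration.

```python
import math


def partition_sum(two_J, eta):
    """Direct sum of exp(-eta * J_z) over J_z = -J, ..., J, with J given as two_J = 2J."""
    return sum(math.exp(-eta * (two_Jz / 2.0)) for two_Jz in range(-two_J, two_J + 1, 2))


def partition_closed_form(two_J, eta):
    """Closed form of Eq. (8.39): sinh((J + 1/2) eta) / sinh(eta / 2)."""
    J = two_J / 2.0
    return math.sinh((J + 0.5) * eta) / math.sinh(0.5 * eta)


# Compare both expressions for J = 1/2, 1, ..., 5/2 and a few values of eta.
max_relative_deviation = max(
    abs(partition_sum(two_J, eta) / partition_closed_form(two_J, eta) - 1.0)
    for two_J in range(1, 6)
    for eta in (0.1, 1.0, 3.0)
)
print(max_relative_deviation)
```

In the limit η → 0 both expressions approach 2J + 1, the number of levels, as expected when the splitting is negligible compared to $k_B T$.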


With this, one finds for the magnetization

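$$M(T) = -\frac{1}{\mu_0 V} \frac{\partial F}{\partial H} = -\frac{1}{\mu_0 V} \frac{ \partial (-k_B T \ln Z) }{ \partial H } \tag{8.40}$$

$$\phantom{M(T)} = \frac{k_B T}{\mu_0 V} \frac{\partial}{\partial H} \Big[ \ln \left( \sinh [ (J + 1/2) \eta ] \right) - \ln \left( \sinh [ \eta/2 ] \right) \Big] = \frac{ g(JLS)\, \mu_B J }{ V }\, B_J(\eta) .$$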

In the last step, we have defined the so called Brillouin function

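$$B_J(\eta) = \frac{1}{J} \left\{ \left( J + \frac{1}{2} \right) \coth \left[ \left( J + \frac{1}{2} \right) \eta \right] - \frac{1}{2} \coth \left[ \frac{\eta}{2} \right] \right\} . \tag{8.41}$$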

As shown in Fig. 8.3, $B_J \rightarrow 1$ for η ≫ 1, i.e., in the limit of a strong field and/or low temperatures. In this case, Eq. (8.40) simply tells us that all atomic magnets with momentum $\mathbf{m}_{\mathrm{atom}}$ defined in Eq. (8.36) have aligned to the external field and that the magnetization reaches its maximum value of $M = m_{\mathrm{atom}}/V$, i. e., the magnetic moment density.

We had, however, discussed above that η will be much smaller than unity for normal field strengths. At all but the lowest temperatures, the other limit for the Brillouin function, i. e., $B_J(\eta \rightarrow 0)$, will therefore be much more relevant. This small-η expansion yields

$$B_J(\eta \ll 1) \approx \frac{J+1}{3}\, \eta + O(\eta^3) . \tag{8.42}$$
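Both limits of the Brillouin function can be verified numerically. The sketch below uses the form that follows directly from differentiating ln Z of Eq. (8.39) with respect to H, i. e., $B_J(\eta) = \frac{1}{J} \{ (J + \frac{1}{2}) \coth[(J + \frac{1}{2}) \eta] - \frac{1}{2} \coth[\eta/2] \}$ with η as defined above; the function name is an illustrative assumption.

```python
import math


def brillouin(J, eta):
    """B_J(eta) = (1/J) * [ (J + 1/2) coth((J + 1/2) eta) - (1/2) coth(eta / 2) ], eta > 0."""
    def coth(x):
        return 1.0 / math.tanh(x)

    return ((J + 0.5) * coth((J + 0.5) * eta) - 0.5 * coth(0.5 * eta)) / J


# Strong field / low temperature: saturation towards 1.
saturation = brillouin(1.5, 50.0)

# Weak field / high temperature: linear regime with slope (J + 1) / 3, cf. Eq. (8.42).
eta_small = 1.0e-3
linear_value = brillouin(1.5, eta_small)
linear_estimate = (1.5 + 1.0) / 3.0 * eta_small

# For J = 1/2 the function reduces to tanh(eta / 2).
two_level = brillouin(0.5, 0.7)

print(saturation, linear_value, linear_estimate, two_level)
```

For J = 1/2 the expression reduces to tanh(η/2), the familiar result for a two-level system, which provides an independent check of the formula.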

In this limit, we finally obtain for the paramagnetic susceptibility

$$\chi^{\mathrm{mag,para}}(T) = \frac{\partial M}{\partial H} = \frac{ \mu_0 \mu_B^2\, g(JLS)^2 J(J+1) }{ 3 V k_B }\, \frac{1}{T} = \frac{C}{T} . \tag{8.43}$$

With $\chi^{\mathrm{mag,para}} > 0$, we have now confirmed our previous hypothesis that the first term in Eq. (8.25) gives a paramagnetic contribution. The inverse dependence of the susceptibility on temperature is known as Curie's law, $\chi^{\mathrm{mag,para}} = C_{\mathrm{Curie}}/T$ with $C_{\mathrm{Curie}}$ being the Curie constant. Such a dependence is characteristic for a system with “molecular magnets” that tend to align with the applied field and are being flipped by thermal fluctuations. Again, we see that the first-principles derivation not only reveals the range of validity for this empirical law (remember that it only holds in the small-η limit), but also provides the proportionality constant in terms of fundamental properties of the system.

To make a quick size estimate, we take the volume of atomic order (V ∼ Å³) to find $\chi^{\mathrm{mag,para}} \sim 10^{-2}$ at room temperature. The paramagnetic contribution is thus orders of magnitude larger than the diamagnetic or the Van Vleck one, even at room temperature. Nevertheless, with $\chi^{\mathrm{mag,para}} \ll 1$ it is still small compared to electric susceptibilities, which are of the order of unity. We will discuss the consequences of this in section 8.3.4, but before that we have to evaluate a last remaining, additional source of magnetic behavior in metals, i. e., the free electrons.
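The size estimate for the Curie susceptibility can be reproduced by evaluating Eq. (8.43) directly. The inputs are assumptions made for this sketch: a single spin-1/2 moment (J = 1/2, g = 2) per volume of 1 Å³ and T = 300 K; the function name is illustrative.

```python
import math

MU0 = 4.0e-7 * math.pi    # vacuum permeability in Vs/Am
MU_B = 9.2740100783e-24   # Bohr magneton in J/T
K_B = 1.380649e-23        # Boltzmann constant in J/K


def curie_susceptibility(g, J, volume_m3, T):
    """Eq. (8.43): chi = mu0 mu_B^2 g^2 J (J + 1) / (3 V k_B T), dimensionless."""
    return MU0 * MU_B**2 * g**2 * J * (J + 1.0) / (3.0 * volume_m3 * K_B * T)


chi_room = curie_susceptibility(g=2.0, J=0.5, volume_m3=1.0e-30, T=300.0)
chi_cold = curie_susceptibility(g=2.0, J=0.5, volume_m3=1.0e-30, T=150.0)

print(chi_room, chi_cold / chi_room)
```

Halving the temperature doubles the susceptibility, which is the 1/T behavior of Curie's law.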

8.3.3 Susceptibility of the Free Electron Gas

Having determined the magnetic properties of electrons bound to ions, it is valuable to also consider the opposite extreme and examine their properties as they move nearly freely in a metal. There are again essentially two major terms in the response of free fermions to an external magnetic field. One results from the “alignment” of spins (which is a rough way of visualizing the quantum mechanical action of the field on the spins) and it is called Pauli paramagnetism. The other arises from the orbital moments created by induced circular motions of the electrons. It thus tends to screen the exterior field and is known as Landau diamagnetism.

Let us first try to analyze the origin of Pauli paramagnetism and examine its order of magnitude like we have also done for all atomic magnetic effects. For this we consider again the free electron gas, i. e., N non-interacting electrons in a volume V. In chapter 2, we had already derived the density of states (DOS) per volume

$$N(\epsilon) = \frac{V}{2\pi^2} \left( \frac{2 m_e}{\hbar^2} \right)^{3/2} \sqrt{\epsilon} . \tag{8.44}$$

In the electronic ground state, all states are filled following the Fermi-distribution

$$\frac{N}{V} = \int_0^\infty N(\epsilon)\, f(\epsilon, T)\, d\epsilon . \tag{8.45}$$

The electron density N/V uniquely characterizes our electron gas. For sufficiently low temperatures, the Fermi distribution can be approximated by a step function so that all states up to the Fermi level are occupied. This Fermi level is conveniently expressed in terms of the Wigner-Seitz radius $r_s$ that we had already defined in chapter 6

$$\epsilon_F = \frac{ 50.1\ \mathrm{eV} }{ \left( \frac{r_s}{a_B} \right)^2 } . \tag{8.46}$$

The DOS at the Fermi level can then also be expressed in terms of rs

$$N(\epsilon_F) = \left( \frac{1}{20.7\ \mathrm{eV}} \right) \left( \frac{r_s}{a_B} \right)^{-1} . \tag{8.47}$$
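Eqs. (8.46) and (8.47) are simple enough to tabulate. The sketch below evaluates them literally as written above, with the result of Eq. (8.47) left in the units used there; the function names and the sample values of $r_s/a_B$ are illustrative assumptions. It also compares the Fermi energy with the Zeeman energy scale $\mu_B \mu_0 H \approx 0.579 \cdot 10^{-4}$ eV at 1 T.

```python
MU_B_EV_PER_TESLA = 0.579e-4   # Bohr magneton in eV/T, value quoted in the text


def fermi_energy_ev(rs_over_aB):
    """Eq. (8.46): Fermi energy in eV for a given Wigner-Seitz radius r_s / a_B."""
    return 50.1 / rs_over_aB**2


def dos_at_fermi_level(rs_over_aB):
    """Eq. (8.47): DOS at the Fermi level, in the units of that equation (per eV)."""
    return 1.0 / (20.7 * rs_over_aB)


for rs in (2.0, 3.0, 4.0, 5.0):
    print(rs, fermi_energy_ev(rs), dos_at_fermi_level(rs))

# Ratio of the Zeeman energy at 1 T to the Fermi energy for r_s / a_B = 3.
zeeman_to_fermi = MU_B_EV_PER_TESLA * 1.0 / fermi_energy_ev(3.0)
print(zeeman_to_fermi)
```

The ratio of the order of $10^{-5}$ illustrates how small the field-induced energy shifts are on the scale of electronic energies in a metal.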

Up to now we have not explicitly considered the spin degree of freedom which leads to the double occupation of each electronic state with a spin up and a spin down electron. Such an explicit consideration becomes necessary, however, when we now include an external field. This is accomplished by defining spin up and spin down DOS's, $N^\uparrow$ and $N^\downarrow$, in analogy to the spin-unresolved case. In the absence of a magnetic field, the ground state of the free electron gas has an equal number of spin-up and spin-down electrons, and the degenerate spin-DOS are therefore simply half-copies of the original total DOS defined in Eq. (8.44)

$$N^\uparrow(\epsilon) = N^\downarrow(\epsilon) = \frac{1}{2} N(\epsilon) , \tag{8.48}$$

cf. the graphical representation in Fig. 8.4a.

An external field $\mu_0 H$ will now interact differently with spin up and spin down states. Since the electron gas is fully isotropic, we can always choose the spin axis that defines up and down spins to go along the direction of the applied field, and thus reduce the problem to one dimension. If we focus only on the interaction of the spin with the field and neglect the orbital response for the moment, the effect boils down to an equivalent of Eq. (8.21), i. e., we have

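$$\begin{aligned} \Delta H_{\mathrm{spin}}(H) = \mu_0 \mu_B g_0\, \mathbf{H} \cdot \mathbf{s} &= +\mu_0 \mu_B H \quad \text{(for spin up)} \qquad (8.49) \\ &= -\mu_0 \mu_B H \quad \text{(for spin down)} \,, \qquad (8.50) \end{aligned}$$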

Figure 8.4: Free electron density of states and the effect of an external field: N↑ corresponds to the number of electrons with spins aligned parallel and N↓ antiparallel to the z-axis. (a) No magnetic field, (b) a magnetic field µ0H0 is applied along the z-direction, (c) as a result of (b) some electrons with spins antiparallel to the field (shaded region on left) change their spin and move into available lower energy states with spins aligned parallel to the field.

where we have again approximated the electronic g-factor g0 by 2. The effect is therefore simply to shift spin up electrons relative to spin down electrons. This can be expressed via the spin DOS’s

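$$N_\uparrow(\epsilon) = \frac{1}{2} N(\epsilon - \mu_0 \mu_B H) \tag{8.51}$$

$$N_\downarrow(\epsilon) = \frac{1}{2} N(\epsilon + \mu_0 \mu_B H) \,, \tag{8.52}$$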

or again graphically by shifting the two parabolas with respect to each other as shown in Fig. 8.4b. If we now fill the electronic states according to the Fermi-distribution, a different number of up and down electrons results and our electron gas exhibits a net magnetization due to the applied field. Since the states aligned parallel to the field are favored, the magnetization will enhance the exterior field and a paramagnetic contribution results.

For T ≈ 0, both parabolas are essentially filled up to a sharp energy, and the net magnetization is simply given by the dark grey shaded region in Fig. 8.4c. Even for fields of the order of Tesla, the energetic shift of the states is $\mu_B \mu_0 H \sim 10^{-4}$ eV, i. e., small on the scale of electronic energies. The shaded region representing our net magnetization can therefore be approximated by a simple rectangle of height $\mu_0 \mu_B H$ and width $\frac{1}{2} N(\epsilon_F)$, i. e., the Fermi level DOS of the unperturbed system without applied field. We then have

$$N_\uparrow = \frac{N}{2} + \frac{1}{2} N(\epsilon_F)\, \mu_0 \mu_B H \tag{8.53}$$

up electrons per volume and

$$N_\downarrow = \frac{N}{2} - \frac{1}{2} N(\epsilon_F)\, \mu_0 \mu_B H \tag{8.54}$$

down electrons per volume. Remember that the magnetization is equal to the magnetic dipole density and can therefore be written as dipole moment per electron times DOS in such a uniform system. Accordingly, each electron contributes a magnetic moment of µB to the magnetization and we have

$$M = (N_\uparrow - N_\downarrow)\, \mu_B = N(\epsilon_F)\, \mu_0 \mu_B^2 H \,. \tag{8.55}$$

This then yields the susceptibility

$$\chi_{\mathrm{mag,Pauli}} = \frac{\partial M}{\partial H} = N(\epsilon_F)\, \mu_0 \mu_B^2 \,. \tag{8.56}$$

Using Eq. (8.47) for the Fermi level DOS of the free electron gas, this can be rewritten as a function of the Wigner-Seitz radius rs

$$\chi_{\mathrm{mag,Pauli}} = 10^{-6} \left(\frac{2.59}{r_s/a_B}\right) . \tag{8.57}$$

Recalling that the Wigner-Seitz radius of most metals is of the order of 3-5 aB, we find that the Pauli paramagnetic contribution (χmag,Pauli > 0) of a free electron gas is small. In fact, it is as small as the diamagnetic susceptibility of atoms and therefore much smaller than their paramagnetic susceptibility. For magnetic atoms in a material, thermal disorder at finite temperatures prevents their complete alignment to an external field and therefore keeps the magnetization small. For an electron gas, on the other hand, it is the Pauli exclusion principle that opposes such an alignment by forcing the electrons to occupy energetically higher lying states when aligned. For the same reason, Pauli paramagnetism does not show the inverse temperature dependence we observed in Curie’s law in Eq. (8.43). The characteristic temperature for the electron gas is the Fermi temperature. One could therefore cast the Pauli susceptibility into Curie form, but with a fixed temperature of order TF playing the role of T. Since TF ≫ T, the Pauli susceptibility is then hundreds of times smaller and almost always completely negligible.

Turning to Landau diamagnetism, we note that its calculation would require solving the full quantum mechanical free electron problem along similar lines as done in the last section for atoms. The field-dependent Hamilton operator would then also produce a term that screens the applied field. Taking the second derivative of the resulting total energy with respect to H, one would obtain for the diamagnetic susceptibility

$$\chi_{\mathrm{mag,Landau}} = -\frac{1}{3}\, \chi_{\mathrm{mag,Pauli}} \,, \tag{8.58}$$

i. e., another term that is vanishingly small, as is the paramagnetic response of the free electron gas. Because this term is so small and the derivation does not yield new insight compared to the one already undertaken for the atomic case, we refer to the book by M. P. Marder, Condensed Matter Physics (Wiley, 2000) for a proper derivation of this result.
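To get a feeling for the numbers, Eqs. (8.57) and (8.58) can be evaluated for the range of Wigner-Seitz radii quoted above. The following is a minimal sketch; the function names are illustrative, and the total is simply the sum of the two free-electron terms, i. e., it assumes they can be superposed.

```python
def chi_pauli(rs_over_ab):
    """Pauli susceptibility of the free electron gas, Eq. (8.57)."""
    return 2.59e-6 / rs_over_ab

def chi_landau(rs_over_ab):
    """Landau susceptibility, Eq. (8.58): minus one third of the Pauli term."""
    return -chi_pauli(rs_over_ab) / 3.0

def chi_free_electrons(rs_over_ab):
    """Sum of both terms (assumes the two contributions simply add up)."""
    return chi_pauli(rs_over_ab) + chi_landau(rs_over_ab)

for rs in (3.0, 4.0, 5.0):
    print(f"r_s = {rs:.0f} a_B: Pauli {chi_pauli(rs):.2e}, "
          f"Landau {chi_landau(rs):.2e}, total {chi_free_electrons(rs):.2e}")
```

Under this assumption the net response of the free electron gas stays paramagnetic, at two thirds of the Pauli value.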

8.3.4 Atomic magnetism in solids

So far, we have discussed the magnetic behavior of materials from a tight-binding viewpoint by assuming that a material is “just” an ensemble of independent atoms or ions and electrons. That is why we first addressed the magnetic properties of isolated atoms and ions. As an additional source of magnetism in solids, we looked at free (delocalized or itinerant) electrons. In both cases we found two major sources of magnetism: a paramagnetic one resulting from the alignment of existing “microscopic magnets” (either total angular momentum in the case of atoms, or spin in the case of free electrons) and a diamagnetic one arising from induction currents that try to screen the external field.

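• Atoms (bound electrons):

- Paramagnetism: $\chi_{\mathrm{mag,para}} \approx 1/T \sim 10^{-2}$ (RT), $\Delta E_0^{\mathrm{para}} \sim 10^{-4}$ eV

- Van Vleck paramagnetism: $\chi_{\mathrm{mag,VV}} \approx \mathrm{const.} \sim 10^{-4}$, $\Delta E_0^{\mathrm{VV}} \sim 10^{-9}$ eV

- Larmor diamagnetism: $\chi_{\mathrm{mag,dia}} \approx \mathrm{const.} \sim -10^{-4}$, $\Delta E_0^{\mathrm{dia}} \sim 10^{-9}$ eV

• Free electrons:

- Pauli paramagnetism: $\chi_{\mathrm{mag,Pauli}} \approx \mathrm{const.} \sim 10^{-6}$, $\Delta E_0^{\mathrm{Pauli}} \sim 10^{-4}$ eV

- Landau diamagnetism: $\chi_{\mathrm{mag,Landau}} \approx \mathrm{const.} \sim -10^{-6}$, $\Delta E_0^{\mathrm{Landau}} \sim 10^{-4}$ eV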

If the coupling between the different sources of magnetism is small, the magnetic behavior of the material would simply be a superposition of all relevant terms. For insulators this would imply, for example, that the magnetic moment of each atom/ion does not change

Figure 8.5: (a) In rare earth metal atoms the incomplete 4f electronic subshell is located inside the 5s and 5p subshells, so that the 4f electrons are not strongly affected by neighboring atoms. (b) In transition metal atoms, e.g. the iron group, the 3d electrons are the outermost electrons and therefore interact strongly with other nearby atoms.

appreciably when transferred to the material and the total susceptibility would be just the sum of all atomic susceptibilities. And this is indeed what one primarily finds when going through the periodic table:

Insulators: When J = 0, the paramagnetic contribution vanishes. If in addition L = S = 0 (closed-shell), the response is purely diamagnetic. This is the case for noble gas solids, simple ionic crystals like the alkali halides (recall from the discussion on cohesion that the latter can be viewed as closed-shell systems!), many molecules (e.g. H2O), and thus also many molecular crystals. Otherwise, i.e., if J = 0 but L and S are not necessarily zero, the response is a balance between Van Vleck paramagnetism and Larmor diamagnetism. In all cases, the effects are small and the results from the theoretical derivation presented in the previous section are in excellent quantitative agreement with experiment.

For J ≠ 0, the response is dominated by the paramagnetic term. This is the case when the material contains rare earth (RE) or transition metal (TM) atoms (with partially filled f and d shells, respectively). These systems indeed obey Curie’s law, i. e., they exhibit susceptibilities that scale inversely with temperature. For the rare earths, even the magnitude of the measured Curie constant corresponds very well to the theoretical prediction given in Eq. (8.43). For TM atoms in insulating solids, however, the measured Curie constants can only be understood if one assumes L = 0, while S is still given by Hund’s rules. This effect is referred to as quenching of the orbital angular momentum and is caused by a general phenomenon known as crystal field splitting. As illustrated by Fig. 8.5, the d shells of a TM atom belong to the outermost valence shells and are particularly sensitive to the environment. The crystal field lifts the five-fold degeneracy of the TM d-states, as shown for an octahedral environment in Fig. 8.6a. Hund’s rules are then in competition with the energy scale of the d-state splitting. If the energy separation is large, only the three lowest states can be filled and Hund’s rules give a low spin configuration

Figure 8.6: Effect of an octahedral environment on a transition metal ion with 5 electrons (d5): a) the degenerate levels split into two groups, b) for large energy separation a low spin and c) for small energy separation a high spin configuration might be favorable.

d-count   # unpaired electrons       examples
          high spin    low spin
d4        4            2             Cr2+, Mn3+
d5        5            1             Fe3+, Mn2+
d6        4            0             Fe2+, Co3+
d7        3            1             Co2+

Table 8.3: High and low spin octahedral transition metal complexes.
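The entries of Table 8.3 follow from simple electron counting, which can be sketched in a few lines of Python. The sketch assumes an octahedral splitting into three lower and two upper orbitals (as in Fig. 8.6), that Hund's first rule applies within each group in the low spin case, and that it applies across all five orbitals in the high spin case. The function names are illustrative only.

```python
def unpaired_in_group(n_electrons, n_orbitals):
    """Unpaired electrons when Hund's first rule fills n_orbitals degenerate orbitals."""
    if n_electrons <= n_orbitals:
        return n_electrons                 # one electron per orbital, parallel spins
    return 2 * n_orbitals - n_electrons    # every further electron pairs one up

def unpaired_high_spin(d_count):
    """Small splitting: Hund's rule acts on all five d orbitals at once."""
    return unpaired_in_group(d_count, 5)

def unpaired_low_spin(d_count):
    """Large splitting: the three lower orbitals are filled completely first."""
    lower = min(d_count, 6)
    upper = d_count - lower
    return unpaired_in_group(lower, 3) + unpaired_in_group(upper, 2)

for n in (4, 5, 6, 7):
    print(f"d{n}: high spin {unpaired_high_spin(n)}, low spin {unpaired_low_spin(n)}")
```

Which of the two countings applies to a given ion is not decided by this sketch; it depends on how the crystal field splitting compares with the energy gained from Hund's rules.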

with weak diamagnetism (Fig. 8.6b). If the energy separation is small, Hund’s rules can still be applied to all five states and a high spin situation with strong paramagnetism emerges (Fig. 8.6c). Please note that the crystal field splitting does not affect RE atoms, since the partially filled f-orbitals of RE atoms are located deep inside the filled s and p shells and are therefore not significantly affected by the crystalline environment. Even in the case of TM atoms, the important thing to notice is that although the solid perturbs the magnitude of the “microscopic magnet” connected with the partially filled shell, all individual moments inside the material are still virtually decoupled from each other (i. e., the magnetic moment in the material is that of an atom that has simply adapted to a new symmetry situation). This is why Curie’s law is still valid, but the magnetic moment becomes an effective magnetic moment that depends on the crystalline environment. Table 8.3 summarizes the configurations for some transition metal ions in an octahedral environment.

Semiconductors: Covalent materials only have partially filled shells, so they could be expected to have a finite magnetic moment. However, covalent bonds form through a pair of

electrons with opposite spin, and hence the net orbital angular momentum is zero. Therefore, covalent materials like silicon exhibit only a vanishingly small diamagnetic response, as long as they are not doped. Dilute magnetic semiconductors, i. e., semiconductors doped with transition metal atoms, hold the promise to combine the powerful world of semiconductors with magnetism. However, questions like how pronounced the magnetism is and whether room temperature magnetism can be achieved are still heavily debated and the subject of active research.

Metals: In metals, the delocalized electrons add a new contribution to the magnetic behavior. In simple metals like the alkalis, this new contribution is in fact the only remaining one. As discussed in the chapter on cohesion, such metals can be viewed as closed-shell ion cores and a free electron glue formed from the valence electrons. The closed-shell ion cores have J = L = S = 0 and therefore exhibit only a negligible diamagnetic response. The magnetic behavior is then dominated by the Pauli paramagnetism of the conduction electrons. Measured susceptibilities are indeed of the order of magnitude that can be expected from Eq. (8.57). Quantitative differences arise from exchange-correlation effects that were not treated in our free electron model. We will come back to metals below.

8.4 Magnetic Order: Permanent Magnets

The theory we have developed up to now relies on the assumption that the magnetic properties of a material can be described entirely by summing up the contributions that stem from the individual atoms and free electrons. In all cases, the paramagnetic or diamagnetic response is very small, at least when compared to electric susceptibilities, which are of the order of unity. For this exact reason, typically only electric effects are discussed when analysing the interactions of materials with electromagnetic fields. At this stage, we could therefore close the chapter on magnetism of solids as something really unimportant, if it was not for a few exceptions: Some elemental and compound materials (formed of or containing transition metal or rare earth atoms) exhibit a completely different magnetic behavior: enormous susceptibilities and a magnetization that does not vanish when the external field is removed. These so-called ferromagnetic effects are not captured by the hitherto developed theory, since they originate from a strong coupling of the individual magnetic moments. This is what we will consider next.

8.4.1 Ferro-, Antiferro- and Ferrimagnetism

In our previously discussed, simple theory of paramagnetism the individual magnetic moments (ionic shells of non-zero angular momentum in insulators or the conduction electrons in simple metals) do not interact with one another. In the absence of an external field, the individual magnetic moments are then thermally disordered at any finite temperature, i. e., they point in random directions yielding a zero net moment for the solid as a whole, cf. Fig. 8.7a. In such cases, an alignment can only be caused by applying an external field, which leads to an ordering of all magnetic moments at sufficiently low temperatures, i.e., as long as the field is strong enough to overcome the thermodynamic trend towards disorder. Schematically, this is shown in Fig. 8.7b. However, a similar alignment could also be obtained by coupling the different magnetic moments, i. e., by an interaction that favors a parallel alignment for example. Already quite short-ranged interactions,

Figure 8.7: Schematic illustration of the distribution of directions of the local magnetic moments connected to “individual magnets” in a material. (a) Random thermal disorder in a paramagnetic solid with insignificant magnetic interactions, (b) complete alignment either in a paramagnetic solid due to a strong external field or in a ferromagnetic solid below its critical temperature as a result of magnetic interactions, and (c) example of antiferromagnetic ordering below the critical temperature.

e.g., nearest-neighbor interactions, can potentially lead to an ordered structure like the one shown in Fig. 8.7b. In the literature, interactions that introduce a magnetic ordering are often called magnetic interactions. This does however not imply that the fundamental source of the interaction is magnetic (e.g., a magnetic dipole-dipole interaction, which in fact is not the reason for the ordering as we will see below).

Materials that exhibit an ordered magnetic structure in the absence of an applied external field are called ferromagnets (or permanent magnets) and their resulting magnetic moment is known as spontaneous magnetization Ms. The complexity of the possible magnetically ordered states exceeds the simple parallel alignment case shown in Fig. 8.7b by far; some additional examples are shown in Fig. 8.8a. Additionally, Fig. 8.7c and Fig. 8.8b show another common case: Although a microscopic ordering is present, the individual local moments sum to zero, so that no spontaneous magnetization is present. Such magnetically ordered states with Ms = 0 are called antiferromagnetic. If a spontaneous magnetization Ms ≠ 0 occurs from the ordering of magnetic moments of different magnitude that are not completely aligned with Ms (not all local moments have a positive component along the direction of spontaneous magnetization), one talks about ferrimagnets. Fig. 8.8c shows a few examples of possible ordered ferrimagnetic structures. Please note that the complexity of possible magnetic structures is so large that it cannot be covered by a simple sketch. Indeed, some of them do not rigorously fall into any of the three categories discussed so far. Those structures are well beyond the scope of this lecture and will not be covered here.
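The three categories introduced above can be turned into a small decision rule. The following Python sketch is purely illustrative and makes strong simplifying assumptions: the local moments are collinear, they are given as signed numbers along one common axis, and the arrangement is already known to be ordered (a thermally disordered paramagnet would be misclassified). The function name `classify_order` is hypothetical.

```python
def classify_order(moments, tol=1e-9):
    """Classify an ordered, collinear set of local moments (signed numbers)."""
    if all(abs(m) <= tol for m in moments):
        return "no local moments"
    net = sum(moments)  # proportional to the spontaneous magnetization M_s
    if abs(net) <= tol:
        return "antiferromagnetic"   # ordered, but M_s = 0
    sign = 1.0 if net > 0 else -1.0
    if all(sign * m > -tol for m in moments):
        return "ferromagnetic"       # no moment points against M_s
    return "ferrimagnetic"           # M_s != 0, yet some moments oppose it

print(classify_order([1, 1, 1, 1]))
print(classify_order([1, -1, 1, -1]))
print(classify_order([2, -1, 2, -1]))
```

Non-collinear arrangements, which the text mentions as falling outside the three categories, are deliberately not handled here.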

8.4.2 Interaction versus Thermal Disorder: Curie-Weiss Law

Even in permanent magnets, the magnetic order does not usually prevail at all temperatures. Above a critical temperature TC, the thermal energy can overcome the ordering induced by the interactions. Then, the magnetic order vanishes and the material (most often) behaves like a simple paramagnet. In ferromagnets, TC is known as the Curie temperature, while in antiferromagnets it is often called the Néel temperature (sometimes denoted TN). Note that TC may depend strongly on the applied field (both in strength

Figure 8.8: Examples of some typical magnetic orderings that can be found in a simple linear array of spins: (a) ferromagnetic, (b) antiferromagnetic, and (c) ferrimagnetic.

Figure 8.9: Typical temperature dependence of the magnetization Ms, of the specific heat cV, and of the zero-field susceptibility χ0 of a ferromagnet. The temperature scale is normalized to the critical (Curie) temperature TC.

and direction!). If the external field is parallel to the direction of spontaneous magnetization, TC typically increases with increasing external field since both ordering tendencies support each other. For other field directions, a range of complex phenomena can arise as a result of the competition between both ordering tendencies.

At first, we are more interested in the ferromagnetic properties of materials in the absence of external fields, i. e., in the existence of a finite spontaneous magnetization. The gradual loss of order with increasing temperature is reflected in the continuous drop of Ms(T) illustrated in Fig. 8.9. Just below TC, a power law dependence is typically observed

$$M_s(T) \sim (T_C - T)^{\beta} \quad (\text{for } T \nearrow T_C) \,, \tag{8.59}$$

with a critical exponent β somewhere around 1/3. Coming from the high temperature side, the onset of ordering also appears in the zero-field susceptibility, which is found to diverge as T approaches TC

$$\chi_0(T) = \chi(T)\big|_{B=0} \sim (T - T_C)^{-\gamma} \quad (\text{for } T \searrow T_C) \,, \tag{8.60}$$

with γ around 4/3. This behavior already indicates that the material experiences dramatic changes at the critical temperature, which is also reflected by a divergence in other fundamental quantities like the zero-field specific heat

$$c_V(T)\big|_{B=0} \sim (T - T_C)^{-\alpha} \quad (\text{for } T \rightarrow T_C) \,, \tag{8.61}$$

with α around 0.1. The divergence is connected to the onset of long-range order (alignment), which sets in more or less suddenly at the critical temperature, resulting in a second order phase transition. The actual transition is difficult to describe theoretically and theories are therefore often judged by how well (or badly) they reproduce the experimentally measured critical exponents α, β and γ.

Above the critical temperature, the properties of magnetic materials often become “more normal”. In particular, for T ≫ TC, one usually finds a simple paramagnetic behavior,

      ms (in µB)   matom (in µB)   TC (in K)   ΘC (in K)
Fe    2.2          6 (4)           1043        1100
Co    1.7          6 (3)           1394        1415
Ni    0.6          5 (2)           628         650
Eu    7.1          7               289         108
Gd    8.0          8               302         289
Dy    10.6         10              85          157

Table 8.4: Magnetic quantities of some elemental ferromagnets: The saturation magnetization at T = 0 K is given in the form of the average magnetic moment ms per atom, and the critical temperature TC and the Curie-Weiss temperature ΘC are given in K. For comparison, the magnetic moment matom of the corresponding isolated atoms is also listed (the values in brackets are for the case of orbital angular momentum quenching).

with a susceptibility that obeys the so-called Curie-Weiss law, cf. Fig. 8.9,

$$\chi_0(T) \sim (T - \Theta_C)^{-1} \quad (\text{for } T \gg T_C) \,. \tag{8.62}$$

ΘC is called the Curie-Weiss temperature. Since γ in Eq. (8.60) is almost never exactly equal to 1, ΘC does not necessarily coincide with the Curie temperature TC. This can, for example, be seen in Table 8.4, where also the saturated values for the spontaneous magnetization at T → 0 K are listed.

In theoretical studies, one typically converts this measured Ms(T → 0 K) directly into an average magnetic moment ms per atom (in units of µB) by simply dividing by the atomic density in the material. In this way, we can compare directly with the corresponding values for the isolated atoms, obtained as matom = g(JLS)µBJ, cf. Eq. (8.36). As apparent from Table 8.4, the two values agree very well for the rare earth metals, but differ substantially for the transition metals. Recalling the discussion on the quenching of the orbital angular momentum of TM atoms immersed in insulating materials in section 8.3.4, we could once again try to ascribe this difference to the symmetry lowering of interacting d orbitals in the material. Yet, even using L = 0 (and thus J = S, g(JLS) = 2), the agreement does not become much better, cf. Table 8.4. Accordingly, one can summarize that in the RE ferromagnets the saturation magnetization appears to be a result of the perfect alignment of atomic magnetic moments that are not strongly affected by the presence of the surrounding material. Conversely, the TM ferromagnets (Fe, Co, Ni) exhibit a much more complicated magnetic structure that is not well described by the magnetic behavior of the individual atoms.

As a first summary of our discussion on ferromagnetism (or ordered magnetic states in general), we are therefore led to conclude that it must arise out of a (yet unspecified) interaction between magnetic moments. This interaction produces an alignment of the magnetic moments and subsequently a large net magnetization that prevails even in the absence of an external field. This means we are certainly outside the linear response regime that has been used as the fundamental basis for developing the dia- and paramagnetic theories in the first sections of this chapter. With respect to the alignment, the interaction seems to have the same qualitative effect as an applied field acting on a paramagnet, which explains the similarity between the Curie-Weiss and the Curie law of paramagnetic solids: Interaction and external field favor alignment, but are opposed by thermal disorder. Far below the critical temperature the alignment is perfect and far above TC thermal disorder is predominant. In this case, the competition between alignment with respect to an external field and disorder is equivalent to the situation in a paramagnet (just the origin of the temperature scale has shifted).

The interacting magnetic moments in ferromagnets can either stem from delocalized electrons (Pauli paramagnetism) or stem from partially filled atomic shells (paramagnetism). This makes it plausible why ferromagnetism is a phenomenon specific to only some metals, and in particular metals with a high magnetic moment that arises from partially filled d or f shells (i. e., transition metals, rare earths, as well as their compounds). However, we cannot yet explain why only some and not all TMs and REs exhibit ferromagnetic behavior. The comparison between the measured values of the spontaneous magnetization and the atomic magnetic moments suggests that ferromagnetism in RE metals can be understood as the coupling of localized magnetic moments (due to the inert partially filled f shells). The situation is more complicated for TM ferromagnets, where the picture of a simple coupling between atomic-like moments fails. We will see below that this so-called itinerant ferromagnetism arises out of a subtle mixture of delocalized s electrons and the more localized, but nevertheless not inert d orbitals.

8.4.3 Phenomenological Theories of Ferromagnetism

After this first overview of phenomena arising out of magnetic order, let us now try to establish a proper theoretical understanding. Unfortunately, the theory of ferromagnetism is one of the less well developed of the fundamental theories in solid state physics (a bit better for the localized ferromagnetism of the REs, worse for the itinerant ferromagnetism of the TMs). The complex interplay of single particle and many-body effects, as well as collective effects and strong local coupling, makes it difficult to break the topic down into simple models appropriate for an introductory level lecture. We will therefore not be able to present and discuss a full-blown ab initio theory like in the preceding chapters. Instead we will first consider simple phenomenological theories and refine them later, when needed. This will enable us to address some of the fundamental questions that are not amenable to phenomenological theories (like the source of the magnetic interaction).

8.4.3.1 Molecular (mean) field theory

The simplest phenomenological theory of ferromagnetism is the molecular field theory due to Weiss (1906). We had seen in the preceding section that the effect of the interaction between the discrete magnetic moments in a material is very similar to that of an applied external field (leading to alignment). For each magnetic moment, the net effect of the interaction with all other moments can therefore be thought of as a kind of internal field created by all other moments (and the magnetic moment itself). In this way we average over all interactions and condense them into one effective field, i. e., molecular field theory is a typical representative of a mean field theory. Since this effective internal, or so-called molecular, field corresponds to an average over all interactions, it is reasonable to assume that it will scale with M, i. e., the overall magnetization (density of magnetic moments). Without specifying the microscopic origin of the magnetic interaction and the resulting field, the ansatz of Weiss was to simply postulate a linear scaling with M

$$H_{\mathrm{mol}} = \mu_0 \lambda M \,, \tag{8.63}$$

Figure 8.10: Graphical solution of equation 8.66.

where λ is the so-called molecular field constant. This leads to an effective field

$$H_{\mathrm{eff}} = H + H_{\mathrm{mol}} = H + \mu_0 \lambda M \,. \tag{8.64}$$

For simplicity, we will consider only the case of an isotropic material, so that the internal and the external field point in the same direction, allowing us to write all formulae as scalar relations.

Recalling the magnetization of paramagnetic spins (Eq. 8.40),

$$M_0(T) = \frac{g(JLS)\, \mu_B J}{V}\, B_J(\eta) \quad \text{with} \quad \eta \sim \frac{H}{T} \,, \tag{8.65}$$

we obtain

$$M(T) = M_0\!\left(\frac{H_{\mathrm{eff}}}{T}\right) \rightarrow M_0\!\left(\frac{\lambda M}{T}\right) \tag{8.66}$$

when the external field is switched off (H = 0). We now ask whether such a field can exist without an external field. Since M appears on both the left and the right hand side of the equation, we solve it graphically. We set $x = \frac{\lambda}{T} M(T)$, which results in

$$M_0(x) = \frac{T}{\lambda}\, x \,. \tag{8.67}$$

The graphical solution is shown in Fig. 8.10 and tells us that the slope of M0(x) at x = 0 has to be larger than $T/\lambda$. If this condition is fulfilled, a ferromagnetic solution of our paramagnetic mean field model is obtained, although at this point we have no microscopic understanding of λ.
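The graphical construction can also be carried out numerically. The sketch below does this in reduced variables, which are an assumption on top of the text: m = M(T)/M(0 K), t = T/TC with TC = λC, and the argument of the Brillouin function written as 3J/(J+1) · m/t. For J = 1/2 this reduces to m = tanh(m/t). Function names and the simple fixed-point iteration are illustrative choices.

```python
import math

def brillouin(j, x):
    """Brillouin function B_J(x) for angular momentum quantum number j."""
    if abs(x) < 1e-6:
        # Leading term of the small-x expansion avoids dividing by tanh(0).
        return (j + 1.0) / (3.0 * j) * x
    a = (2.0 * j + 1.0) / (2.0 * j)
    b = 1.0 / (2.0 * j)
    return a / math.tanh(a * x) - b / math.tanh(b * x)

def reduced_magnetization(t, j=0.5, tol=1e-12, max_iter=100000):
    """Largest solution m of m = B_J(3j/(j+1) * m/t), with t = T/T_C."""
    if t >= 1.0:
        return 0.0  # the line is steeper than B_J at the origin: only m = 0
    if t <= 0.0:
        return 1.0  # complete alignment at T = 0
    scale = 3.0 * j / (j + 1.0)
    m = 1.0  # start from full alignment and iterate towards the fixed point
    for _ in range(max_iter):
        m_new = brillouin(j, scale * m / t)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

for t in (0.25, 0.5, 0.9, 1.1):
    print(f"T/T_C = {t:.2f}: M/M(0 K) = {reduced_magnetization(t):.4f}")
```

A non-zero solution appears only for t < 1, which is the numerical counterpart of the slope condition read off from the graphical solution.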

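For the mean-field susceptibility we then obtain with $M(T) = M_0(H_{\mathrm{eff}}/T)$

$$\chi = \frac{\partial M_0}{\partial H} = \frac{\partial M_0}{\partial H_{\mathrm{eff}}} \frac{\partial H_{\mathrm{eff}}}{\partial H} = \frac{C}{T}\left(1 + \lambda \frac{\partial M}{\partial H}\right) = \frac{C}{T}\,(1 + \lambda\chi) \tag{8.68}$$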

where we have used that for our paramagnet $\partial M_0 / \partial H_{\mathrm{eff}}$ is just Curie’s law. Solving Eq. 8.68 for χ gives the Curie-Weiss law

$$\chi = \frac{C}{T - \lambda C} \sim (T - \Theta_C)^{-1} \,. \tag{8.69}$$
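The step from the self-consistent relation χ = (C/T)(1 + λχ) to the Curie-Weiss form is a one-line rearrangement. Written out, and using only the symbols of Eqs. (8.68) and (8.69), it reads:

```latex
\begin{align*}
\chi &= \frac{C}{T}\,(1 + \lambda\chi)
  \;\;\Longrightarrow\;\;
  \chi\left(1 - \frac{\lambda C}{T}\right) = \frac{C}{T} \\
\chi &= \frac{C}{T - \lambda C} = \frac{C}{T - \Theta_C}
  \qquad \text{with} \qquad \Theta_C = \lambda C .
\end{align*}
```

The susceptibility diverges at T = λC, so within this mean field picture the Curie-Weiss temperature is fixed entirely by the molecular field constant and the Curie constant.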

ΘC is the Curie-Weiss temperature. The form of the Curie-Weiss law is compatible with an effective molecular field that scales linearly with the magnetization. Given the experimental observation of Curie-Weiss like scaling, the mean field ansatz of Weiss therefore seems reasonable. However, we can already see that this simple theory fails, as it predicts an inverse scaling with temperature for all T > TC, i. e., within molecular field theory ΘC = TC and γ = 1, at variance with experiment. The exponent of 1, by the way, is characteristic for any mean field theory.

Given the qualitative plausibility of the theory, let us nevertheless use it to derive a first order of magnitude estimate for this (yet unspecified) molecular field. This phenomenological consideration therefore follows the reverse logic to the ab initio one we prefer. In the latter, we would have analyzed the microscopic origin of the magnetic interaction and derived the molecular field as a suitable average, which in turn would have brought us into a position to predict materials’ properties like the saturation magnetization and the critical temperature. Now, we do the opposite: We will use the experimentally known values of Ms and TC to obtain a first estimate of the size of the molecular field, which might help us later on to identify the microscopic origin of the magnetic interactions (about which we still do not know anything). For the estimate, recall that the Curie constant C was given by Eq. (8.40) as

$$C = \frac{N \mu_0 \mu_B^2\, g(JLS)^2\, J(J+1)}{3 k_B V} \approx \left(\frac{N}{V}\right) \frac{\mu_0\, m_{\mathrm{atom}}^2}{3 k_B} \,, \tag{8.70}$$

where we have exploited that $J(J+1) \sim J^2$, allowing us to identify the atomic magnetic moment $m_{\mathrm{atom}} = \mu_B\, g(JLS)\, J$. With this, it is straightforward to arrive at the following estimate of the molecular field at T = 0 K,

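$$B_{\mathrm{mol}}(0\,\mathrm{K}) = \mu_0 \lambda M_s(0\,\mathrm{K}) = \mu_0 \frac{T_C}{C} \left(\frac{N}{V}\, m_s\right) \approx \frac{3 k_B T_C\, m_s}{m_{\mathrm{atom}}^2} \,. \tag{8.71}$$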

When discussing the content of Table 8.4 we had already seen that, for most ferromagnets, matom is at least of the same order as ms, so that plugging in the numerical constants we arrive at the following order of magnitude estimate

$$B_{\mathrm{mol}} \sim \frac{[5\, T_C\ \text{in K}]}{[m_{\mathrm{atom}}\ \text{in}\ \mu_B]}\ \text{Tesla} \,. \tag{8.72}$$
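The numerical prefactor in Eq. (8.72) comes from 3kB/µB, which is about 4.5 Tesla per Kelvin and is rounded to 5 in the estimate. The following Python sketch evaluates Eq. (8.71) directly. The constants are rounded literature values, the function name is illustrative, and the moments are passed in units of µB, which is a convention chosen here for convenience.

```python
K_B_EV_PER_K = 8.617333e-5    # Boltzmann constant in eV/K
MU_B_EV_PER_T = 5.7883818e-5  # Bohr magneton in eV/T

def molecular_field_tesla(t_c, m_s, m_atom):
    """B_mol ~ 3 k_B T_C m_s / m_atom^2 from Eq. (8.71), moments in units of mu_B."""
    return 3.0 * K_B_EV_PER_K * t_c * m_s / (MU_B_EV_PER_T * m_atom ** 2)

print(f"3 k_B / mu_B = {3.0 * K_B_EV_PER_K / MU_B_EV_PER_T:.2f} T/K")

# Values of T_C, m_s and m_atom taken from Table 8.4.
table = {"Fe": (1043.0, 2.2, 6.0), "Co": (1394.0, 1.7, 6.0), "Gd": (302.0, 8.0, 8.0)}
for element, (t_c, m_s, m_atom) in table.items():
    full = molecular_field_tesla(t_c, m_s, m_atom)
    crude = molecular_field_tesla(t_c, m_atom, m_atom)  # sets m_s = m_atom as in Eq. (8.72)
    print(f"{element}: Eq. (8.71) gives {full:.0f} T, with m_s = m_atom {crude:.0f} T")
```

The two columns differ because Eq. (8.72) replaces ms by matom, so the result should be read as an order of magnitude only.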

Looking at the values listed in Table 8.4, one realizes that molecular fields are of the order of some $10^3$ Tesla, which is at least one order of magnitude more than the strongest magnetic fields that can currently be produced in the laboratory. As take-home message we therefore keep in mind that the magnetic interactions must be quite strong to yield such a high internal molecular field.

Molecular field theory can also be employed to understand the temperature behavior of the zero-field spontaneous magnetization. As shown in Fig. 8.9, Ms(T) decays smoothly from its saturation value at T = 0 K to zero at the critical temperature. Since the molecular field produced by the magnetic interaction is indistinguishable from an applied external one, the variation of Ms(T) must be equivalent to the one of a paramagnet, only with the external field replaced by the molecular one. Using the results from section 8.3.2 for atomic paramagnetism, we therefore obtain directly a behavior equivalent to Eq. (8.40)

$$M_s(T) = M_s(0\,\mathrm{K})\, B_J\!\left(\frac{g(JLS)\, \mu_B B_{\mathrm{mol}}}{k_B T}\right) , \tag{8.73}$$

i. e., the temperature dependence is entirely given by the Brillouin function defined in Eq. (8.41). BJ(η), in fact, exhibits exactly the behavior we just discussed: it decays smoothly to zero as sketched in Fig. 8.9. Similar to the reproduction of the Curie-Weiss law, mean field theory therefore provides us with a simple rationalization of the observed phenomenon, but fails again on a quantitative level (and also does not tell us anything about the underlying microscopic mechanism). The functional dependence of Ms(T) in both limits T → 0 K and $T \nearrow T_C$ is incorrect, with e.g.

$$M_s(T \nearrow T_C) \sim (T_C - T)^{1/2} \tag{8.74}$$

and thus β = 1/2 instead of the observed values around 1/3. In the low temperature limit, the Brillouin function decays exponentially fast, which is also in gross disagreement with the experimentally observed $T^{3/2}$ behavior (known as Bloch $T^{3/2}$ law). The failure to predict the correct critical exponents is a general feature of mean field theories, which cannot account for the fluctuations connected with phase transitions. The wrong low temperature limit, on the other hand, is due to the non-existence of a particular set of low-energy excitations called spin-waves (or magnons), a point that we will come back to later.
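Both limits can be made explicit for the simplest case J = 1/2, where the Brillouin function reduces to a hyperbolic tangent. The short derivation below is a sketch under the assumption that the self-consistency condition of Eq. (8.73) is written in the reduced variables m = M_s(T)/M_s(0 K) and t = T/T_C, which are not used in the text itself:

```latex
\begin{align*}
m &= \tanh\!\left(\frac{m}{t}\right) , \qquad
  m = \frac{M_s(T)}{M_s(0\,\mathrm{K})} , \quad t = \frac{T}{T_C} \\[4pt]
t \nearrow 1:\quad
m &\approx \frac{m}{t} - \frac{1}{3}\left(\frac{m}{t}\right)^{3}
  \;\;\Longrightarrow\;\;
  m^{2} \approx 3\,t^{2}(1 - t)
  \;\;\Longrightarrow\;\;
  m \approx \sqrt{3}\,(1 - t)^{1/2} \\[4pt]
t \rightarrow 0:\quad
m &\approx 1 - 2\,e^{-2m/t} \approx 1 - 2\,e^{-2/t} .
\end{align*}
```

The first limit reproduces the exponent β = 1/2 of Eq. (8.74), while the second shows the exponentially small deviation from saturation that disagrees with the Bloch law.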

8.4.3.2 Heisenberg and Ising Hamiltonian

Another class of phenomenological theories is called spin Hamiltonians. They are based on the hypothesis that magnetic order is due to the coupling between discrete microscopic magnetic moments. If these microscopic moments are, for example, connected to the partially filled f-shells of RE atoms, one can replace the material by a discrete lattice model, in which each site i has a magnetic moment mi. A coupling between two different sites i and j in the lattice can then be described by a term

$$H^{\mathrm{coupling}}_{i,j} = -\frac{J_{ij}}{\mu_B^2}\, \mathbf{m}_i \cdot \mathbf{m}_j \,, \tag{8.75}$$

where Jij is known as the exchange coupling constant between the two sites (unit: eV). In principle, the coupling can take any value between $-J_{ij} m_i m_j / \mu_B^2$ (parallel (ferromagnetic) alignment) and $+J_{ij} m_i m_j / \mu_B^2$ (antiparallel (antiferromagnetic) alignment) depending on the relative orientation of the two moment vectors. Note that this simplified interaction has no explicit spatial dependence anymore. If this is required (e.g. in a non-isotropic crystal with a preferred magnetization axis), additional terms need to be added to the Hamiltonian.


Figure 8.11: Schematic illustrations of possible coupling mechanisms between localized magnetic moments: (a) direct exchange between neighboring atoms, (b) superexchange mediated by non-magnetic ions, and (c) indirect exchange mediated by conduction electrons.

Applying this coupling to the whole material, we first have to decide which lattice sites to couple. Since we are now dealing with a phenomenological theory we have no rigorous guidelines that would tell us which interactions are important. Instead, we have to choose the interactions (e.g. between nearest neighbors or up to next-nearest neighbors) and their strengths $J_{ij}$. Later, after we have solved the model, we can revisit these assumptions and see if they were justified. We can also use experimental (or first principles) data to fit the $J_{ij}$'s and hope to extract microscopic information. Often our choice is guided by intuition concerning the microscopic origin of the coupling. For the localized ferromagnetism of the rare earths one typically assumes a short-ranged interaction, which could either be between neighboring atoms (direct exchange) or reach across non-magnetic atoms in more complex structures (superexchange). As illustrated in Fig. 8.11 this conveys the idea that the interaction arises out of an overlap of electronic wavefunctions. Alternatively, one could also think of an interaction mediated by the glue of conduction electrons (indirect exchange), cf. Fig. 8.11c. However, without a proper microscopic theory it is difficult to distinguish between these different possibilities.

Having decided on the coupling, the Hamiltonian for the whole material becomes in its simplest form of only pair-wise interactions

$$H^{\mathrm{Heisenberg}} = \sum_{i,j=1}^{M} H^{\mathrm{coupling}}_{i,j} = -\sum_{i,j=1}^{M} \frac{J_{ij}}{\mu_B^2}\, \mathbf{m}_i \cdot \mathbf{m}_j . \tag{8.76}$$
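To make the double sum in Eq. (8.76) concrete, the following minimal sketch evaluates it for classical moment vectors. The function name, the convention of passing moments in units of $\mu_B$ (so that the factor $1/\mu_B^2$ drops out), and the exclusion of self-coupling terms $i = j$ are assumptions made for this illustration only.

```python
def heisenberg_energy(exchange, moments):
    """Evaluate -sum_{i,j} J_ij m_i . m_j for classical moment vectors.

    exchange: M x M nested list with J_ij (e.g. in eV).
    moments:  list of M three-component vectors in units of mu_B (assumption).
    Self-coupling terms i == j are skipped (assumption: J_ii = 0).
    """
    n_sites = len(moments)
    energy = 0.0
    for i in range(n_sites):
        for j in range(n_sites):
            if i == j:
                continue
            dot = sum(a * b for a, b in zip(moments[i], moments[j]))
            energy -= exchange[i][j] * dot
    return energy

# Two sites coupled with J = 0.01 eV (illustrative value, not taken from the text)
J_PAIR = [[0.0, 0.01], [0.01, 0.0]]
E_PARALLEL = heisenberg_energy(J_PAIR, [(0, 0, 1), (0, 0, 1)])
E_ANTIPARALLEL = heisenberg_energy(J_PAIR, [(0, 0, 1), (0, 0, -1)])
```

Note that the unrestricted double sum counts every pair twice, once as $(i,j)$ and once as $(j,i)$; for $J > 0$ the parallel configuration is lower in energy, as expected for ferromagnetic coupling.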

Similar to the discussion in chapter 6 on cohesion, such a pairwise summation is only justified when the interaction between two sites is not affected strongly by the presence of magnetic moments at other nearby sites. Since as yet we do not know anything about the microscopic origin of the interaction, this is quite hard to verify.

The particular spin Hamiltonian of Eq. (8.76) is known as the Heisenberg Hamiltonian and is the starting point for many theories of ferromagnetism. It looks deceptively simple, but in fact it is difficult to solve. Up to now, no general analytic solution has been found. Since analytic solutions are often favored over numerical solutions, further simplifications are made. Examples are one- or two-dimensional Heisenberg Hamiltonians (possibly only with nearest neighbor interaction) or the Ising model:

$$H^{\mathrm{Ising}} = -\sum_{i,j=1}^{M} \frac{J_{ij}}{\mu_B^2}\, \underbrace{m_{i,z} \cdot m_{j,z}}_{\pm 1 \text{ only}} . \tag{8.77}$$

These models opened up a whole field of research in the statistical mechanics community. It quickly developed a life of its own and became more and more detached from the original magnetic problem. Instead it was and is driven by the curiosity to solve the generic models themselves. In the meantime numerical techniques such as Monte Carlo have been developed that can solve the original Heisenberg Hamiltonian, although they do of course not reach the beauty of a proper analytical solution.

Since the Ising model has played a prominent role historically and because it has analytic solutions under certain conditions we will illustrate its solution in one dimension. For this we consider the Ising model in a magnetic field with only nearest neighbor interactions

$$H^{\mathrm{Ising}} = -\sum_{\langle ij \rangle}^{M} J_{ij}\, \sigma_i \sigma_j - \sum_{j} h_j \sigma_j \tag{8.78}$$

where $\langle ij \rangle$ denotes that the sum only runs over nearest neighbors, $h_j$ is the magnetic field $\mu_0 H$ at site $j$, and $\sigma_i = \pm 1$. Despite this simplified form no exact solution is known for the three dimensional case. For two dimensions Onsager found an exact solution in 1944 for zero magnetic field. Let us reduce the problem to one dimension and further simplify to $J_{i,i+1} = J$ and $h_j = h$:

$$H^{\mathrm{Ising}} = -J \sum_{i=1}^{N} \sigma_i \sigma_{i+1} - h \sum_{i} \sigma_i \tag{8.79}$$

where we have used the periodic boundary condition $\sigma_{N+1} = \sigma_1$. The partition function for this Hamiltonian is

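$$Z(N, h, T) = \sum_{\sigma_1} \ldots \sum_{\sigma_N} e^{\beta \left[ J \sum_{i}^{N} \sigma_i \sigma_{i+1} + \frac{h}{2} \sum_{i}^{N} (\sigma_i + \sigma_{i+1}) \right]} , \tag{8.80}$$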

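with $\beta = \frac{1}{k_B T}$. In order to carry out the spin summation we define a $2 \times 2$ matrix $P$ with elements:

$$\langle \sigma | P | \sigma' \rangle = e^{\beta \left[ J \sigma \sigma' + \frac{h}{2} (\sigma + \sigma') \right]} \tag{8.81}$$

$$\langle 1 | P | 1 \rangle = e^{\beta [J + h]} \tag{8.82}$$

$$\langle -1 | P | -1 \rangle = e^{\beta [J - h]} \tag{8.83}$$

$$\langle 1 | P | -1 \rangle = \langle -1 | P | 1 \rangle = e^{-\beta J} , \tag{8.84}$$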


P is called the transfer matrix and can be written in more compact form as

$$P = \begin{pmatrix} e^{\beta (J + h)} & e^{-\beta J} \\ e^{-\beta J} & e^{\beta (J - h)} \end{pmatrix} . \tag{8.85}$$

With this definition, the partition function becomes

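$$Z(N, h, T) = \sum_{\sigma_1} \ldots \sum_{\sigma_N} \langle \sigma_1 | P | \sigma_2 \rangle \langle \sigma_2 | P | \sigma_3 \rangle \ldots \langle \sigma_{N-1} | P | \sigma_N \rangle \langle \sigma_N | P | \sigma_1 \rangle \tag{8.86}$$

$$= \sum_{\sigma_1} \langle \sigma_1 | P^N | \sigma_1 \rangle = \mathrm{Tr}(P^N) . \tag{8.87}$$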

To evaluate the trace, we have to diagonalize $P$, which yields the following eigenvalues

$$\lambda_\pm = e^{\beta J} \left[ \cosh(\beta h) \pm \sqrt{\sinh^2(\beta h) + e^{-4 \beta J}} \right] . \tag{8.88}$$

With this, the trace equates to

$$\mathrm{Tr}(P^N) = \lambda_+^N + \lambda_-^N . \tag{8.89}$$
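The transfer-matrix result can be verified for a short chain by comparing it with a brute-force sum over all $2^N$ spin configurations. The sketch below is only an illustration: the function names are hypothetical, and the eigenvalues of the symmetric $2 \times 2$ matrix $P$ are obtained directly from its trace and determinant.

```python
import itertools
import math

def ising_partition_brute_force(n_spins, coupling, field, beta):
    """Sum exp(-beta * H) over all 2**N configurations of the periodic chain."""
    total = 0.0
    for spins in itertools.product((1, -1), repeat=n_spins):
        energy = 0.0
        for i in range(n_spins):
            # periodic boundary condition: sigma_{N+1} = sigma_1
            energy -= coupling * spins[i] * spins[(i + 1) % n_spins]
            energy -= field * spins[i]
        total += math.exp(-beta * energy)
    return total

def transfer_matrix_eigenvalues(coupling, field, beta):
    """Eigenvalues (lambda_plus, lambda_minus) of the 2x2 transfer matrix P."""
    a = math.exp(beta * (coupling + field))   # <1|P|1>
    d = math.exp(beta * (coupling - field))   # <-1|P|-1>
    b = math.exp(-beta * coupling)            # off-diagonal elements
    mean = 0.5 * (a + d)
    root = math.sqrt(0.25 * (a - d) ** 2 + b * b)
    return mean + root, mean - root

def ising_partition_transfer(n_spins, coupling, field, beta):
    """Partition function as Tr(P**N) = lambda_plus**N + lambda_minus**N."""
    lam_plus, lam_minus = transfer_matrix_eigenvalues(coupling, field, beta)
    return lam_plus ** n_spins + lam_minus ** n_spins
```

For small $N$ both routes agree to numerical precision, while the cost of the brute-force sum grows as $2^N$ and that of the transfer-matrix expression does not grow with $N$ at all.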

Since we are interested in the thermodynamic limit $N \to \infty$, $\lambda_+$ will dominate over $\lambda_-$, because $\lambda_+ > \lambda_-$ for any $h$. This finally yields

$$Z(N, h, T) = \lambda_+^N \tag{8.90}$$

and thus the free energy per spin,

$$F(h, T) = -k_B T \ln \lambda_+ = -J - \frac{1}{\beta} \ln \left[ \cosh(\beta h) + \sqrt{\sinh^2(\beta h) + e^{-4 \beta J}} \right] . \tag{8.91}$$

The magnetization then follows as usual

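$$M(h, T) = -\frac{\partial F}{\partial h} = \frac{\sinh(\beta h) + \dfrac{\sinh(\beta h) \cosh(\beta h)}{\sqrt{\sinh^2(\beta h) + e^{-4 \beta J}}}}{\cosh(\beta h) + \sqrt{\sinh^2(\beta h) + e^{-4 \beta J}}} = \frac{\sinh(\beta h)}{\sqrt{\sinh^2(\beta h) + e^{-4 \beta J}}} . \tag{8.92}$$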
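As a consistency check, the magnetization can be compared with a numerical derivative of the free energy per spin. This is a minimal sketch with illustrative function names; it uses the compact form $M = \sinh(\beta h)/\sqrt{\sinh^2(\beta h) + e^{-4\beta J}}$, which follows from the quotient above by factoring $\sinh(\beta h)$ out of the numerator.

```python
import math

def free_energy_per_spin(coupling, field, beta):
    """F = -J - (1/beta) ln[cosh(bh) + sqrt(sinh^2(bh) + exp(-4 beta J))]."""
    bh = beta * field
    root = math.sqrt(math.sinh(bh) ** 2 + math.exp(-4.0 * beta * coupling))
    return -coupling - math.log(math.cosh(bh) + root) / beta

def ising_chain_magnetization(coupling, field, beta):
    """Magnetization per spin of the one-dimensional Ising chain."""
    s = math.sinh(beta * field)
    return s / math.sqrt(s * s + math.exp(-4.0 * beta * coupling))

def numerical_magnetization(coupling, field, beta, step=1e-6):
    """Central finite-difference estimate of -dF/dh."""
    f_plus = free_energy_per_spin(coupling, field + step, beta)
    f_minus = free_energy_per_spin(coupling, field - step, beta)
    return -(f_plus - f_minus) / (2.0 * step)
```

The closed form vanishes identically at $h = 0$ for every finite $\beta$, and approaches full saturation only when $\beta \to \infty$ at finite field.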

The magnetization is shown in Fig. 8.12 for different ratios $h/J$. For zero magnetic field ($h = 0$), $\cosh(\beta h) = 1$ and $\sinh(\beta h) = 0$, and the magnetization vanishes. The nearest-neighbor Ising model in one dimension therefore does not yield a ferromagnetic solution in the absence of a magnetic field. It also does not produce a ferromagnetic phase transition at finite field, unless the temperature is 0. The critical temperature for this model is thus zero Kelvin.

Figure 8.12: The magnetization $M(T)$ in the 1-dimensional Ising model as a function of inverse temperature $\beta = (k_B T)^{-1}$ for different ratios $h/J$ (curves for $h/J$ = 0.01, 0.1, 0.5, 1, 2, and 5; horizontal axis $\beta$ from 0 to 0.6, vertical axis $M(\beta)$ from 0 to 1).

Returning now to the Heisenberg Hamiltonian, let us consider an elemental solid, in which all atoms are equivalent and exhibit the same magnetic moment $m$. Inspection of Eq. (8.76) reveals that $J > 0$ describes ferromagnetic and $J < 0$ antiferromagnetic coupling. The ferromagnetic ground state is given by a perfect alignment of all moments in one direction $|\uparrow\uparrow\uparrow \ldots \uparrow\uparrow\rangle$. Naively one would expect the lowest energy excitations to be a simple spin flip $|\uparrow\downarrow\uparrow \ldots \uparrow\uparrow\rangle$. Surprisingly, a detailed study of Heisenberg Hamiltonians reveals that these states are not eigenstates of the Hamiltonian. Instead, the low energy excitations have a much more complicated wave-like form, with only slight directional changes between neighboring magnetic moments as illustrated in Fig. 8.13. On second thought it is then intuitive that such waves will have only an infinitesimally higher energy than the ground state if their wavelength is infinitely long. Such spin-waves are called magnons, and it is easy to see that they are missing from the mean-field description and also from the Ising model. Because it contains these excitations, the Heisenberg Hamiltonian correctly accounts for the Bloch $T^{3/2}$-scaling of $M_s(T)$ for $T \to 0\,$K. Magnons can in fact be expected quite generally, whenever there is a direction associated with the local order that can vary spatially in a continuous manner at an energy cost that becomes small as the wavelength of the variation becomes large. Magnons are quite common in magnetism (see e.g. the ground state of Cr) and their existence has been confirmed experimentally by e.g. neutron scattering.

Figure 8.13: Schematic representation of the spatial orientation of the magnetic moments in a one-dimensional spin-wave.

The Heisenberg model also does a good job at describing other characteristics of ferromagnets. The high-temperature susceptibility follows a Curie-Weiss type law. Close to the critical temperature the solution deviates from the $1/T$-scaling. Monte Carlo simulations of Heisenberg ferromagnets yield critical exponents that are in very good agreement with the experimental data. Moreover, experimental susceptibility or spontaneous magnetization data can be used to fit the exchange constant $J_{ij}$. The values obtained for nearest neighbor interactions are typically of the order of $10^{-2}$ eV for TM ferromagnets and $10^{-3} - 10^{-4}$ eV for RE ferromagnets. These values are high compared to the energy splittings in dia- and paramagnetism and give us a first impression that the microscopic origin of ferromagnetic coupling must be comparatively strong.

We can also obtain this result by considering the mean-field limit of the Heisenberg Hamiltonian:

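$$H^{\mathrm{Heisenberg}} = -\sum_{i,j=1}^{M} \frac{J_{ij}}{\mu_B^2}\, \mathbf{m}_i \cdot \mathbf{m}_j = -\sum_{i=1}^{M} \mathbf{m}_i \cdot \left( \sum_{j=1}^{M} \frac{J_{ij}}{\mu_B^2}\, \mathbf{m}_j \right) = -\sum_{i=1}^{M} \mathbf{m}_i \cdot \mu_0 \mathbf{H}_i . \tag{8.93}$$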

In the last step, we have realized that the term in brackets has the units of Tesla, and can therefore be perceived as a magnetic field $\mu_0 \mathbf{H}$ arising from the coupling of moment $i$ with all other moments in the solid. However, up to now we have not gained anything by this rewriting, because the effective magnetic field is different for every state (i. e., directional orientation of all moments). The situation can only be simplified by assuming a mean field point of view and approximating this effective field by its thermal average $\langle \mathbf{H} \rangle$. This yields

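$$\mu_0 \langle \mathbf{H} \rangle = \sum_{j=1}^{M} \frac{J_{ij}}{\mu_B^2} \langle \mathbf{m}_j \rangle = \frac{\sum_{j=1}^{M} J_{ij}}{\mu_B^2} \left( \frac{V}{N} \right) M_s(T) \sim \lambda M_s(T) \tag{8.94}$$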

since the average magnetic moment at site $i$ in an elemental crystal is simply given by the macroscopic magnetization (dipole moment density) divided by the density of sites. As apparent from Eq. (8.94) we therefore arrive at a mean field that is linear in the magnetization as assumed by Weiss in his molecular field theory. In addition, we can now even relate the field to the microscopic quantity $J_{ij}$. If we assume only nearest neighbor interactions in an fcc metal (12 nearest neighbors), it follows from Eq. (8.94) that

$$J = J^{\mathrm{NN}}_{ij} = \frac{\mu_B^2 B^{\mathrm{mol}}}{12\, m_s} . \tag{8.95}$$

Using the saturation magnetizations listed in Table 8.4 and the estimate of molecular fields of the order of $10^3$ Tesla obtained above, we obtain exchange constants of the order of $10^{-2}$ eV for the TM or $10^{-3} - 10^{-4}$ eV for the RE ferromagnets in good agreement with the values mentioned before.

8.4.4 Microscopic origin of magnetic interaction

We have seen that the phenomenological models capture the characteristics of ferromagnetism. However, what is still lacking is microscopic insight into the origin of the strong interaction $J$. This must come from an ab initio theory. The natural choice (somehow also (falsely) conveyed by the name “magnetic interaction”) is to consider the direct dipolar interaction energy of two magnetic dipoles $\mathbf{m}_1$ and $\mathbf{m}_2$, separated by a distance $r$. For this, macroscopic electrodynamics gives us

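$$J_{\mathrm{mag\ dipole}} = \frac{1}{r^3} \left[ \mathbf{m}_1 \cdot \mathbf{m}_2 - 3 (\mathbf{m}_1 \cdot \hat{\mathbf{r}})(\mathbf{m}_2 \cdot \hat{\mathbf{r}}) \right] , \tag{8.96}$$

where $\hat{\mathbf{r}}$ is the unit vector along the line connecting the two dipoles. Omitting the angular dependence and assuming the two dipoles to be oriented parallel to each other and perpendicular to the vector between them we obtain the simplified expression

$$J_{\mathrm{mag\ dipole}} \approx \frac{1}{r^3}\, m_1 m_2 . \tag{8.97}$$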

Inserting magnetic moments of the order of $\mu_B$, cf. Table 8.4, and a typical distance of 2 Å between atoms in solids, this yields $J_{\mathrm{mag\ dipole}} \sim 10^{-4}$ eV. This value is 1-2 orders of magnitude smaller than the exchange constants we derived using the phenomenological model in the previous section.

Having ruled out magnetic dipole interactions the only thing left are electrostatic (Coulomb) interactions. These do in fact give rise to the strong coupling, as we will see in the following. To be more precise it is the Coulomb interaction in competition with the Pauli exclusion principle. Loosely speaking, the exclusion principle keeps electrons with parallel spins apart, and thus reduces their Coulomb repulsion. The resulting difference in energy between the parallel spin configuration and the antiparallel one is the exchange energy we were seeking. Parallel alignment is, however, favorable only in exceptional circumstances, because the increase in kinetic energy associated with parallel spins usually outweighs the favorable decrease in potential energy. The difficulty in understanding magnetic ordering lies in the fact that we need to go beyond the independent electron (one particle) picture. In the next few sections, we will look at how the Pauli principle and the Coulomb interaction can lead to magnetic effects. We will do this for two simple yet opposite model cases: two localized magnetic moments and the free electron gas.

8.4.4.1 Exchange interaction between two localized moments

The simplest realization of two localized magnetic moments is the familiar problem of the hydrogen molecule. Here the two magnetic moments are given by the spins of the two electrons (1) and (2), which belong to the two H ions A and B. If there was only one ion and one electron, we would simply have the Hamiltonian of the hydrogen atom ($h_0$)

$$h_0 |\phi\rangle = \epsilon_0 |\phi\rangle , \tag{8.98}$$

with its ground state solution $|\phi\rangle = |\mathbf{r}, \sigma\rangle$. For the molecule we then also have to consider the interaction between the two electrons and two ions themselves, as well as between the ions and electrons. This leads to the new Hamiltonian

$$H |\Phi\rangle = \left( h_0(A) + h_0(B) + h_{\mathrm{int}} \right) |\Phi\rangle , \tag{8.99}$$

where the two-electron wave function $|\Phi\rangle$ is now a function of the positions of the ions A and B, the two electrons (1) and (2), and their two spins $\sigma_1$, $\sigma_2$. Since $H$ has no explicit spin dependence, all spin operators commute with $H$ and the total two-electron wave function must factorize

$$|\Phi\rangle = |\Psi_{\mathrm{orb}}\rangle |\chi_{\mathrm{spin}}\rangle , \tag{8.100}$$

i. e., into an orbital part and a pure spin part. As a suitable basis for the latter, we can choose the eigenfunctions of $S^2$ and $S_z$, so that the spin part of the wave function is given by

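$$
\begin{array}{lll}
|\chi_{\mathrm{spin}}\rangle_{S} = 2^{-1/2} \left( |\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle \right) & S = 0 & S_z = 0 \\
|\chi_{\mathrm{spin}}\rangle_{T_1} = |\uparrow\uparrow\rangle & S = 1 & S_z = 1 \\
|\chi_{\mathrm{spin}}\rangle_{T_2} = 2^{-1/2} \left( |\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle \right) & S = 1 & S_z = 0 \\
|\chi_{\mathrm{spin}}\rangle_{T_3} = |\downarrow\downarrow\rangle & S = 1 & S_z = -1 \, .
\end{array}
$$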
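The behavior of these four spin functions under interchange of the two electrons can be checked with a few lines of code. This is only a small sketch: the basis ordering $(|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle)$ and all names are choices made for this illustration.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)

# Coefficients in the assumed basis order |uu>, |ud>, |du>, |dd>
SINGLET = (0.0, INV_SQRT2, -INV_SQRT2, 0.0)
TRIPLETS = [
    (1.0, 0.0, 0.0, 0.0),              # S_z = +1
    (0.0, INV_SQRT2, INV_SQRT2, 0.0),  # S_z = 0
    (0.0, 0.0, 0.0, 1.0),              # S_z = -1
]

def swap_spins(state):
    """Interchange the two spins: |ud> <-> |du>, |uu> and |dd> are unchanged."""
    uu, ud, du, dd = state
    return (uu, du, ud, dd)

def exchange_parity(state):
    """Overlap <state|swap|state>: +1 for symmetric, -1 for antisymmetric states."""
    swapped = swap_spins(state)
    return sum(a * b for a, b in zip(state, swapped))
```

The singlet has parity $-1$ and all three triplet states have parity $+1$, which is the property that fixes the symmetry of the accompanying orbital part.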

The fermionic character of the two electrons (and the Pauli exclusion principle) manifests itself at this stage in the general postulate that the two-electron wave function must change sign when interchanging the two indistinguishable electrons. $|\chi_{\mathrm{spin}}\rangle_S$ changes sign upon interchanging the two spins, while the three triplet wave functions $|\chi_{\mathrm{spin}}\rangle_T$ do not. Therefore only the following combinations are possible

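$$
\begin{aligned}
|\Phi\rangle_{\mathrm{singlet}} &= |\Psi_{\mathrm{orb,sym}}\rangle\, |\chi_S\rangle \\
|\Phi\rangle_{\mathrm{triplet}} &= |\Psi_{\mathrm{orb,asym}}\rangle\, |\chi_T\rangle ,
\end{aligned}
\tag{8.101}
$$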

where $|\Psi_{\mathrm{orb,sym}}(1, 2)\rangle = |\Psi_{\mathrm{orb,sym}}(2, 1)\rangle$ and $|\Psi_{\mathrm{orb,asym}}(1, 2)\rangle = -|\Psi_{\mathrm{orb,asym}}(2, 1)\rangle$. This tells us that although the Hamiltonian of the system is spin independent, a correlation between the spatial symmetry of the wave function and the total spin of the system emerges from the Pauli principle.

For the hydrogen molecule the full solution can be obtained numerically without too much difficulty. However, here we are interested in a solution that gives us insight and that allows us to connect to the Heisenberg picture. We therefore start in the fully dissociated limit of infinitely separated atoms ($\langle \Phi | h_{\mathrm{int}} | \Phi \rangle = 0$). The total wave function must separate into a product of wave functions from the isolated H atoms

$$\phi_A(\mathbf{r}) = \frac{1}{\sqrt{N}}\, e^{-\alpha |\mathbf{r} - \mathbf{R}_A|} . \tag{8.102}$$

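However, the symmetry of the two electron wave function has to be maintained. Omitting the two ionic choices $\phi(1A)\phi(2A)$ and $\phi(1B)\phi(2B)$ that have a higher energy, the two low energy choices are

$$
\begin{aligned}
|\Psi^{\infty}_{\mathrm{orb,sym}}\rangle &= 2^{-1/2} \left[\, |\phi(1A)\rangle |\phi(2B)\rangle + |\phi(2A)\rangle |\phi(1B)\rangle \,\right] \\
|\Psi^{\infty}_{\mathrm{orb,asym}}\rangle &= 2^{-1/2} \left[\, |\phi(1A)\rangle |\phi(2B)\rangle - |\phi(2A)\rangle |\phi(1B)\rangle \,\right] .
\end{aligned}
\tag{8.103}
$$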

Here $|\phi(1A)\rangle$ is our notation for electron (1) being in orbit around ion A. These two wave functions correspond to the ground state of H$_2$ in either a spin singlet or a spin triplet configuration, and it is straightforward to verify that both states are degenerate in energy with $2\epsilon_0$.

Let us now consider what happens when we bring the two atoms closer together. At a certain distance, the two electron densities will slowly start to overlap and to change. Heitler and London made the following approximation for the two wave functions

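$$\phi_A(\mathbf{r}) = \frac{1}{\sqrt{N}}\, e^{-\alpha r \left( 1 + \beta\, \frac{\mathbf{r} \cdot \mathbf{R}_A}{r} \right)} . \tag{8.104}$$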

The two-electron problem is then still of the form given in Eq. (8.103)

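$$
\begin{aligned}
|\Psi^{\mathrm{HL}}_{\mathrm{orb,sym}}\rangle &= (2 + 2S)^{-1/2} \left[\, |\phi(1A)\rangle |\phi(2B)\rangle + |\phi(2A)\rangle |\phi(1B)\rangle \,\right] \\
|\Psi^{\mathrm{HL}}_{\mathrm{orb,asym}}\rangle &= (2 - 2S)^{-1/2} \left[\, |\phi(1A)\rangle |\phi(2B)\rangle - |\phi(2A)\rangle |\phi(1B)\rangle \,\right] ,
\end{aligned}
\tag{8.105}
$$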

where S is the overlap integral

$$S = \langle \phi(1A) | \phi(1B) \rangle \langle \phi(2A) | \phi(2B) \rangle = |\langle \phi(1A) | \phi(1B) \rangle|^2 . \tag{8.106}$$

Due to this change, the singlet ground state energy $E_S = \langle \Phi | H | \Phi \rangle_{\mathrm{singlet}}$ and the triplet ground state energy $E_T = \langle \Phi | H | \Phi \rangle_{\mathrm{triplet}}$ begin to deviate, and we obtain

$$E_T - E_S = 2\, \frac{C S - A}{1 - S^2} , \tag{8.107}$$

where

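$$C = \langle \phi(1A) | \langle \phi(2B) |\, h_{\mathrm{int}} \,| \phi(2B) \rangle | \phi(1A) \rangle \quad \text{(Coulomb integral)} \tag{8.108}$$

$$A = \langle \phi(1A) | \langle \phi(2B) |\, h_{\mathrm{int}} \,| \phi(2A) \rangle | \phi(1B) \rangle \quad \text{(Exchange integral)} . \tag{8.109}$$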
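The splitting formula of Eq. (8.107) can be reproduced from the individual singlet and triplet energies. The sketch below assumes the standard Heitler-London expressions $E_{S} = 2\epsilon_0 + (C + A)/(1 + S)$ and $E_{T} = 2\epsilon_0 + (C - A)/(1 - S)$, which follow from the normalization factors in Eq. (8.105) but are not written out in the text. Energies are measured relative to $2\epsilon_0$, and the numerical inputs are arbitrary illustrative values, not H$_2$ integrals.

```python
def heitler_london_energies(coulomb, exchange, overlap):
    """Singlet and triplet energies relative to 2*eps_0 (assumed standard HL form)."""
    singlet = (coulomb + exchange) / (1.0 + overlap)
    triplet = (coulomb - exchange) / (1.0 - overlap)
    return singlet, triplet

def singlet_triplet_splitting(coulomb, exchange, overlap):
    """E_T - E_S = 2 (C S - A) / (1 - S^2), cf. Eq. (8.107)."""
    return 2.0 * (coulomb * overlap - exchange) / (1.0 - overlap ** 2)

# Arbitrary illustrative inputs (hypothetical values in arbitrary energy units)
C_DEMO, A_DEMO, S_DEMO = -0.3, -0.5, 0.4
E_S_DEMO, E_T_DEMO = heitler_london_energies(C_DEMO, A_DEMO, S_DEMO)
```

Taking the difference of the two energies and bringing it onto the common denominator $1 - S^2$ gives exactly Eq. (8.107); in particular the splitting vanishes when $A = CS$.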

$C$ is simply the Coulomb interaction of the two electrons with the two ions and themselves, while $A$ arises from the exchange of the indistinguishable particles. If the exchange did not incur a sign change (as e.g. for two bosons), then $A = CS$ and the singlet-triplet energy splitting would vanish. This tells us that it is the fermionic character of the two electrons that yields $A \neq CS$ as a consequence of the Pauli exclusion principle and in turn a finite energy difference between the singlet ground state (antiparallel spin alignment) and the triplet ground state (parallel spin alignment). For the H$_2$ molecule, the triplet ground state is in fact energetically less favorable, because it has an antisymmetric orbital wave function. The latter must have a node exactly midway between the two atoms, meaning that there is no electron density at precisely that point where it could screen the Coulomb repulsion between the two ions most efficiently. As we had anticipated, the coupling between the two localized magnetic moments is therefore the result of a subtle interplay of the electrostatic interactions and the Pauli exclusion principle. What determines the alignment is the availability of a spin state that minimizes the Coulomb repulsion between electrons once Fermi statistics has been taken into account. For this reason, magnetic interactions are also often referred to as exchange interactions (and the splitting between the corresponding energy levels as the exchange splitting).

By evaluating the two integrals $C$ and $A$ in Eqs. (8.108) and (8.109) using the one-electron wave functions of the H atom problem, we can determine the energy splitting ($E_T - E_S$). Realizing that the two level problem of the H$_2$ molecule can also be described by the Heisenberg Hamiltonian of Eq. (8.76) (as we will show in the next section), we have therefore found a way to determine the exchange constant $J$. The Heitler-London approximation sketched in this section can be generalized to more than two magnetic moments, and is then one possible way for computing the exchange constants of Heisenberg Hamiltonians for (anti)ferromagnetic solids. However, this works only in the limit of highly localized magnetic moments, in which the wave function overlap is such that it provides an exchange interaction, but the electrons are not too delocalized (in which case the Heitler-London approximation of Eq. (8.105) is unreasonable). With this understanding of the interaction between two localized moments, the frequently used restriction to only nearest-neighbor interactions in Heisenberg Hamiltonians also becomes plausible. As a side note, in the extreme limit of a highly screened Coulomb interaction that only operates at each lattice site itself (and does not even reach the nearest neighbor lattice sites anymore), the magnetic behavior can also be described by a so-called Hubbard model, in which electrons are hopping on a discrete lattice and pay the price of a repulsive interaction $U$ when two electrons of opposite spin occupy the same site.

8.4.4.2 From the Schrödinger equation to the Heisenberg Hamiltonian

Having gained a clearer understanding of how the exchange interaction between two atoms arises we will now generalize this and derive a magnetic Hamiltonian from the Schrödinger equation. Recapping Chapter 1, the non-relativistic many-body Hamiltonian for the electrons is

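$$H = -\sum_{i} \frac{\nabla_i^2}{2} - \sum_{ia} \frac{Z}{|\mathbf{r}_i - \mathbf{R}_a|} + \frac{1}{2} \sum_{i \neq j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} . \tag{8.110}$$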

It is more convenient to switch to 2nd quantization by introducing the field operators

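$$\psi(\mathbf{r}) = \sum_{\alpha, n, \lambda} \phi_{n\lambda}(\mathbf{r} - \mathbf{r}_\alpha)\, a_{n\lambda}(\mathbf{r}_\alpha) , \tag{8.111}$$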

where $a_{n\lambda}(\mathbf{r}_\alpha)$ annihilates an electron in orbital state $n$ and spin state $\lambda$ at site $\mathbf{r}_\alpha$ and $\phi$ is the associated orbital. For the interaction part of the Hamiltonian we then obtain in 2nd quantization

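$$\frac{1}{2} \sum_{\substack{\alpha_1 \alpha_2 \alpha_3 \alpha_4 \\ n_1 n_2 n_3 n_4 \\ \lambda_1 \lambda_2}} \langle \alpha_1 n_1 \alpha_2 n_2 | V | \alpha_3 n_3 \alpha_4 n_4 \rangle\, a^{\dagger}_{n_1 \lambda_1}(\mathbf{r}_{\alpha_1})\, a^{\dagger}_{n_2 \lambda_2}(\mathbf{r}_{\alpha_2})\, a_{n_3 \lambda_2}(\mathbf{r}_{\alpha_3})\, a_{n_4 \lambda_1}(\mathbf{r}_{\alpha_4}) . \tag{8.112}$$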

This is still exact, but a bit unwieldy and not very transparent (see footnote 3). Since the relevant orbitals are localized we can restrict the terms to those in which $\alpha_3 = \alpha_1$ and $\alpha_4 = \alpha_2$ or $\alpha_3 = \alpha_2$ and $\alpha_4 = \alpha_1$. The remaining terms involve various orbital excitations induced by the Coulomb interaction. These lead to off-diagonal exchange. We shall therefore restrict each electron to a definite orbital state. That is, we shall keep only those terms in which $n_3 = n_1$ and $n_4 = n_2$ or $n_3 = n_2$ and $n_4 = n_1$. If, for simplicity, we also neglect orbital-transfer terms, in which two electrons interchange orbital states, then the interaction part reduces to

Footnote 3: Try to explain why the operators occur in the order $a^{\dagger} a^{\dagger} a a$ instead of $a^{\dagger} a a^{\dagger} a$. Writing the creators to the left and annihilators to the right is called normal ordering in the literature.

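$$
\begin{aligned}
\frac{1}{2} \sum_{\substack{\alpha \alpha' \\ n n' \\ \lambda \lambda'}} \Big[ & \langle \alpha n \alpha' n' | V | \alpha n \alpha' n' \rangle\, a^{\dagger}_{n\lambda}(\mathbf{r}_\alpha)\, a^{\dagger}_{n'\lambda'}(\mathbf{r}_{\alpha'})\, a_{n'\lambda'}(\mathbf{r}_{\alpha'})\, a_{n\lambda}(\mathbf{r}_\alpha) \\
& + \langle \alpha n \alpha' n' | V | \alpha' n' \alpha n \rangle\, a^{\dagger}_{n\lambda}(\mathbf{r}_\alpha)\, a^{\dagger}_{n'\lambda'}(\mathbf{r}_{\alpha'})\, a_{n\lambda'}(\mathbf{r}_\alpha)\, a_{n'\lambda}(\mathbf{r}_{\alpha'}) \Big]
\end{aligned}
\tag{8.113, 8.114}
$$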

The first term is the direct and the second the exchange term that we encountered in theprevious section. Using the fermion anticommutation relation

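$$\left\{ a_{n\lambda}(\mathbf{r}_\alpha),\, a^{\dagger}_{n'\lambda'}(\mathbf{r}_{\alpha'}) \right\} = \delta_{\alpha\alpha'}\, \delta_{nn'}\, \delta_{\lambda\lambda'} , \tag{8.115}$$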

we may write the exchange term as

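$$\frac{1}{2} \sum_{\substack{\alpha \alpha' \\ n n' \\ \lambda \lambda'}} J_{nn'}(\mathbf{r}_\alpha, \mathbf{r}_{\alpha'})\, a^{\dagger}_{n\lambda}(\mathbf{r}_\alpha)\, a_{n\lambda'}(\mathbf{r}_\alpha)\, a^{\dagger}_{n'\lambda'}(\mathbf{r}_{\alpha'})\, a_{n'\lambda}(\mathbf{r}_{\alpha'}) . \tag{8.116}$$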

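Expanding the spin sums we can rewrite this as

$$\frac{1}{2} \sum_{\substack{\alpha \alpha' \\ n n' \\ \lambda \lambda'}} J_{nn'}(\mathbf{r}_\alpha, \mathbf{r}_{\alpha'}) \left[ \frac{1}{4} + \frac{1}{4}\, \mathbf{S}(\mathbf{r}_\alpha) \cdot \mathbf{S}(\mathbf{r}_{\alpha'}) \right] , \tag{8.117}$$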

where $\mathbf{S}$ is the spin (or magnetic moment) of the atom at site $\mathbf{r}_\alpha$. Now mapping over to lattice sites ($\mathbf{r}_\alpha \to i$ and $\mathbf{r}_{\alpha'} \to j$) we finally obtain the Heisenberg Hamiltonian from Eq. (8.76)

$$H^{\mathrm{spin}} = \sum_{ij} J_{ij}\, \mathbf{S}_i \cdot \mathbf{S}_j . \tag{8.118}$$

We have thus shown that magnetism does not arise from a spin-dependent part in the Schrödinger equation, but from the Pauli exclusion principle that the full many-body wave function has to satisfy. We could now use the Coulomb integrals derived in this section or the microscopic interaction we have derived in the previous section for $J_{ij}$ to fill the Heisenberg Hamiltonian with life.
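That a spin Hamiltonian of the form $J\, \mathbf{S}_1 \cdot \mathbf{S}_2$ indeed reproduces a singlet-triplet two-level structure can be seen from the action of $\mathbf{S}_1 \cdot \mathbf{S}_2$ on the two-spin basis. The following sketch assumes two spin-1/2 moments with $\hbar = 1$ and the basis order $(|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle)$; the function names are illustrative.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)
SINGLET = (0.0, INV_SQRT2, -INV_SQRT2, 0.0)
TRIPLET_ZERO = (0.0, INV_SQRT2, INV_SQRT2, 0.0)
TRIPLET_UP = (1.0, 0.0, 0.0, 0.0)

def apply_s1_dot_s2(state):
    """Apply S1.S2 = S1z S2z + (S1+ S2- + S1- S2+)/2 for two spin-1/2 moments."""
    uu, ud, du, dd = state
    return (
        0.25 * uu,
        -0.25 * ud + 0.5 * du,  # the flip-flop term couples |ud> and |du>
        0.5 * ud - 0.25 * du,
        0.25 * dd,
    )

def expectation_s1_dot_s2(state):
    """<state| S1.S2 |state> for a normalized real state vector."""
    image = apply_s1_dot_s2(state)
    return sum(a * b for a, b in zip(state, image))
```

The singlet gives $-3/4$ and every triplet state gives $+1/4$, so a Hamiltonian $J\, \mathbf{S}_1 \cdot \mathbf{S}_2$ has a singlet-triplet splitting of magnitude $|J|$. Matching this to $E_T - E_S$ is what fixes the exchange constant, up to the sign and double-counting conventions of the chosen Hamiltonian.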

8.4.4.3 Exchange interaction in the homogeneous electron gas

The conclusion from the localized moment model is that feeding Heitler-London derived exchange constants into Hubbard or Heisenberg models provides a good semi-quantitative basis for describing the ferromagnetism of localized moments as is typical for rare earth solids. However, the other extreme of the more itinerant ferromagnetism found in the transition metals cannot be described. For this it is more appropriate to return to the homogeneous electron gas, i. e., treating the ions as a jellium background. We had studied this model already in section 8.3.3 for the independent electron approximation (zero electron-electron interaction) and found that the favorable spin alignment in an external field is counteracted by an increase in kinetic energy resulting from the occupation of energetically higher lying states. The net effect was weak and since there is no driving force for a spin alignment without an applied field we obtained a (Pauli) paramagnetic behavior.

If we now consider the electron-electron interaction, we find a new source for spin alignment in the exchange interaction, i. e., an exchange-hole mediated effect to reduce the repulsion of parallel spin states. This new contribution supports the ordering tendency of an applied field, and could (if strong enough) maybe even favor ordering in the absence of an applied field. To estimate whether this is possible, we consider the electron gas in the Hartree-Fock (HF) approximation as the minimal theory accounting for the exchange interaction. In the chapter on cohesion, we already derived the HF energy per electron, but without explicitly considering spins

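$$(E/N)_{\mathrm{jellium}} = T_s + E^{\mathrm{XC}} \overset{\mathrm{HF}}{\approx} \frac{30.1\ \mathrm{eV}}{\left( \frac{r_s}{a_B} \right)^2} - \frac{12.5\ \mathrm{eV}}{\left( \frac{r_s}{a_B} \right)} . \tag{8.119}$$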

Compared to the independent electron model of section 8.3.3, which only yields the first kinetic energy term, a second (attractive) term arises from exchange. To discuss magnetic behavior we now have to extend this description to the spin-polarized case. This is done in the most straightforward manner by rewriting the last expression as a function of the electron density (exploiting the definition of the Wigner-Seitz radius as $r_s/a_B = \left( \frac{3}{4\pi} (V/N) \right)^{1/3}$)

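$$E_{\mathrm{jellium,HF}}(N) = N \left\{ 78.2\ \mathrm{eV} \left( \frac{N}{V} \right)^{2/3} - 20.1\ \mathrm{eV} \left( \frac{N}{V} \right)^{1/3} \right\} . \tag{8.120}$$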

The obvious generalization to a spin-resolved gas is then

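$$
\begin{aligned}
E^{\mathrm{spin}}_{\mathrm{jellium,HF}}(N^{\uparrow}, N^{\downarrow}) &= E_{\mathrm{jellium,HF}}(N^{\uparrow}) + E_{\mathrm{jellium,HF}}(N^{\downarrow}) \\
&= \left\{ 78.2\ \mathrm{eV} \left[ \left( \frac{N^{\uparrow}}{V} \right)^{2/3} + \left( \frac{N^{\downarrow}}{V} \right)^{2/3} \right] - 20.1\ \mathrm{eV} \left[ \left( \frac{N^{\uparrow}}{V} \right)^{1/3} + \left( \frac{N^{\downarrow}}{V} \right)^{1/3} \right] \right\} ,
\end{aligned}
\tag{8.121}
$$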

where $N^{\uparrow}$ is the number of electrons with up spin, and $N^{\downarrow}$ with down spin ($N^{\uparrow} + N^{\downarrow} = N$). More elegantly, the effects of spin-polarization can be discussed by introducing the polarization

$$P = \frac{N^{\uparrow} - N^{\downarrow}}{N} , \tag{8.122}$$

which possesses the obvious limits $P = \pm 1$ for complete spin alignment and $P = 0$ for a non-spin-polarized gas. Making use of the relations $N^{\uparrow} = \frac{N}{2}(1 + P)$ and $N^{\downarrow} = \frac{N}{2}(1 - P)$, Eq. (8.121) can be cast into

$$E_{\mathrm{jellium,HF}}(N, P) = N T \left\{ \frac{1}{2} \left[ (1 + P)^{5/3} + (1 - P)^{5/3} \right] - \frac{5}{4} \alpha \left[ (1 + P)^{4/3} + (1 - P)^{4/3} \right] \right\} , \tag{8.123}$$


Figure 8.14: Plot of $\Delta E(P)$, cf. Eq. (8.124), as a function of the polarization $P$ and for various values of $\alpha$. A ferromagnetic solution is found for $\alpha > 0.905$.

with $\alpha = 0.10\,(V/N)^{1/3}$ and $T$ the kinetic energy per electron of the unpolarized gas, i. e., the first term in Eq. (8.120) divided by $N$. Ferromagnetism, i. e., a non-vanishing spontaneous magnetization at zero field, can occur when a system configuration with $P \neq 0$ leads to a lower energy than the non-polarized case, $\Delta E(N, P) = E(N, P) - E(N, 0) < 0$. Inserting the expression for the HF total energy we thus arrive at

$$\Delta E(N, P) = N T \left\{ \frac{1}{2} \left[ (1 + P)^{5/3} + (1 - P)^{5/3} - 2 \right] - \frac{5}{4} \alpha \left[ (1 + P)^{4/3} + (1 - P)^{4/3} - 2 \right] \right\} . \tag{8.124}$$

Figure 8.14 shows $\Delta E(N, P)$ as a function of the polarization $P$ for various values of $\alpha$. We see that the condition $\Delta E(N, P) < 0$ can only be fulfilled for

$$\alpha > \alpha_c = \frac{2}{5} \left( 2^{1/3} + 1 \right) \approx 0.905 , \tag{8.125}$$

in which case the lowest energy state is always given by a completely polarized gas ($P = 1$). Using the definition of $\alpha$, this condition can be converted into a maximum electron density below which the gas possesses a ferromagnetic solution. In units of the Wigner-Seitz radius this is

$$\left( \frac{r_s}{a_B} \right) \gtrsim 5.45 , \tag{8.126}$$

i. e., only electron gases with a density that is low enough to yield Wigner-Seitz radii above about 5.45 $a_B$ would be ferromagnetic. Recalling that $1.8 < r_s < 5.6$ for real metals, only Cs would be a ferromagnet in the Hartree-Fock approximation.

Already this is obviously not in agreement with reality (Cs is paramagnetic and Fe/Co/Ni are ferromagnetic). However, the situation becomes even worse when we start to improve the theoretical description by considering further electron exchange and correlation effects beyond Hartree-Fock. The prediction of the paramagnetic-ferromagnetic phase transition in the homogeneous electron gas has a long history, and the results vary dramatically with each level of exchange-correlation. The maximum density below which ferromagnetism can occur is, however, consistently obtained much lower than in Eq. (8.126). The most quantitative work so far predicts ferromagnetism only for electron gases with densities so low that $(r_s/a_B) \gtrsim 50 \pm 2$ [F.H. Zong, C. Lin, and D.M. Ceperley, Phys. Rev. E 66, 036703 (2002)]. From this we conclude that Hartree-Fock theory allows us to qualitatively understand the reason behind itinerant ferromagnetism of delocalized electrons, but it overestimates the effect of the exchange interaction by one order of magnitude in terms of $r_s$. Yet, even a more refined treatment of electron exchange and correlation does not produce a good description of itinerant magnetism for real metals, none of which (with $1.8 < r_s < 5.6$) would be ferromagnetic at all in this theory. The reason for this failure cannot be found in our description of exchange and correlation, but must come from the jellium model itself. The fact that completely delocalized electrons (as in e.g. simple metals) cannot lead to ferromagnetism is correctly obtained by the jellium model, but not that Fe/Co/Ni are ferromagnetic. This results primarily from their partially filled d-electron shell, which is not well described by a free electron model as we had already seen in the chapter on cohesion. In order to finally understand the itinerant ferromagnetism of the transition metals, we therefore need to combine the exchange interaction with something that we have not yet considered at all: band structure effects.

8.4.5 Band consideration of ferromagnetism

In chapter 3 we had seen that an efficient treatment of electron correlation together with the explicit consideration of the ionic positions is possible within density-functional theory (DFT). Here, we can write for the total energy of the problem

$$E/N = T_s[n] + E^{\mathrm{ion-ion}}[n] + E^{\mathrm{el-ion}}[n] + E^{\mathrm{Hartree}}[n] + E^{\mathrm{XC}}[n] , \tag{8.127}$$

where all terms are uniquely determined functionals of the electron density $n(\mathbf{r})$. The most frequently employed approximation to the exchange-correlation term is the local density approximation (LDA), which gives the XC-energy at each point $\mathbf{r}$ as the one of the homogeneous electron gas with equivalent density $n(\mathbf{r})$. Recalling our procedure from the last section, it is straightforward to generalize to the so-called spin-polarized DFT, with

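$$E/N = T_s[n^{\uparrow}, n^{\downarrow}] + E^{\mathrm{ion-ion}}[n] + E^{\mathrm{el-ion}}[n] + E^{\mathrm{Hartree}}[n] + E^{\mathrm{XC}}[n^{\uparrow}, n^{\downarrow}] . \tag{8.128}$$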

$n^{\uparrow}(\mathbf{r})$ and $n^{\downarrow}(\mathbf{r})$ then are the spin up and down electron densities, respectively. We only need to consider them separately in the kinetic energy term (since there might be a different number of up- and down-electrons) and in the exchange-correlation term. For the latter, we proceed as before and obtain the local spin density approximation (LSDA) from the XC-energy of the spin-polarized homogeneous electron gas that we discussed in the previous section. Introducing the magnetization density

$$m(\mathbf{r}) = \left( n^{\uparrow}(\mathbf{r}) - n^{\downarrow}(\mathbf{r}) \right) \mu_B , \tag{8.129}$$

we can write the total energy as a functional of $n(\mathbf{r})$ and $m(\mathbf{r})$. Since the integral of the magnetization density over the whole unit cell yields the total magnetic moment, self-consistent solutions of the coupled Kohn-Sham equations can then either be obtained under the constraint of a specified magnetic moment (fixed spin moment (FSM) calculations), or directly produce the magnetic moment that minimizes the total energy of the system.

Figure 8.15: Spin-resolved density of states (DOS) for bulk Fe and Ni in DFT-LSDA [from V.L. Moruzzi, J.F. Janak and A.R. Williams, Calculated Electronic Properties of Metals, Pergamon Press (1978)].

Already in the LSDA this approach is very successful and yields a non-vanishing magnetic moment for Fe, Co, and Ni. The resulting spin-resolved densities of states (DOS) for Fe and Ni are shown in Fig. 8.15. After self-consistency is reached, these three ferromagnetic metals exhibit a larger number of electrons in one spin direction (called majority spin) than in the other (called minority spin). The effective potential seen by up and down electrons is therefore different, and correspondingly the Kohn-Sham eigenlevels differ (which upon filling the states up to a common Fermi-level yields the different number of up and down electrons that we had started with). Quantitatively one obtains in the LSDA the following average magnetic moments per atom, $m_s = 2.1\ \mu_B$ (Fe), $1.6\ \mu_B$ (Co), and $0.6\ \mu_B$ (Ni), which compare excellently with the experimental saturated spontaneous magnetizations at $T = 0\,$K listed in Table 8.4. We can therefore state that DFT-LSDA reproduces the itinerant ferromagnetism of the elemental transition metals qualitatively and quantitatively! Although this is quite an achievement, it is still somewhat unsatisfying, because we do not yet understand why only these three elements in the periodic table exhibit ferromagnetism.

8.4.5.1 Stoner model of itinerant ferromagnetism

To arrive at such an understanding, we take a closer look at the DOS of majority and minority spins in Fig. 8.15. Both DOS exhibit a fair amount of fine structure, but the most obvious observation is that the majority and minority DOS seem to be shifted relative to each other, but are otherwise very similar. This can be rationalized by realizing that the magnetization density (i.e., the difference between the up and down electron densities) is significantly smaller than the total electron density itself. For the XC potential one can therefore write a Taylor expansion to first order in the magnetization density as

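$$ V^{\uparrow\downarrow}_{\mathrm{XC}}(\mathbf{r}) = \frac{\delta E^{\uparrow\downarrow}_{\mathrm{XC}}[n(\mathbf{r}), m(\mathbf{r})]}{\delta n(\mathbf{r})} \approx V^{0}_{\mathrm{XC}}(\mathbf{r}) \pm V(\mathbf{r})\, m(\mathbf{r}) , \tag{8.130} $$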

where $V^{0}_{\mathrm{XC}}(\mathbf{r})$ is the XC potential in the non-magnetic case. If the electron density is only slowly varying, one may assume the difference between the up and down XC potentials (i.e., the term $V(\mathbf{r})\, m(\mathbf{r})$) to also vary slowly. In the Stoner model, one therefore approximates it with a constant term given by

$$ V^{\uparrow\downarrow}_{\mathrm{XC,Stoner}}(\mathbf{r}) = V^{0}_{\mathrm{XC}}(\mathbf{r}) \pm \frac{1}{2} I M . \tag{8.131} $$

The proportionality constant I is called the Stoner parameter (or exchange integral), and $M = \int_{\text{unit cell}} m(\mathbf{r})\, d\mathbf{r}$ (and thus $M = (m_s/\mu_B) \times$ number of atoms in the unit cell!). Since the only difference between the XC potentials seen by up and down electrons is a constant shift (in total: IM), the wave functions in the Kohn-Sham equations are not changed compared to the non-magnetic case. The only effect of the spin polarization in the XC potential is therefore to shift the eigenvalues $\epsilon_i^{\uparrow\downarrow}$ by a constant to lower or higher values

$$ \epsilon_i^{\uparrow\downarrow} = \epsilon_i^{0} \pm \frac{1}{2} I M . \tag{8.132} $$

The band structure and in turn the spin DOS are shifted with respect to the non-magnetic case,

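$$ N^{\uparrow\downarrow}(E) = \sum_i \int_{\mathrm{BZ}} \delta\!\left(E - \epsilon^{\uparrow\downarrow}_{\mathbf{k},i}\right) d\mathbf{k} = N^{0}\!\left(E \pm \tfrac{1}{2} I M\right) , \tag{8.133} $$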

i.e., the Stoner approximation corresponds exactly to our initial observation of a constant shift between the up and down spin DOS.

So what have we gained by this? So far nothing, but as we will see, we can employ the Stoner model to arrive at a simple condition that tells us, by looking at a non-magnetic calculation, whether the material is likely to exhibit a ferromagnetic ground state or not. What we start out with are the results of a non-magnetic calculation, namely the non-magnetic DOS $N^{0}(E)$ and the total number of electrons N. In the magnetic case, N remains the same, but what enters as a new free variable is the magnetization M. In the Stoner model the whole effect of a possible spin polarization is contained in the parameter I. We now want to relate M to I to arrive at a condition that tells us which I will yield


Figure 8.16: Graphical solution of the relation M = F(M) of Eq. (8.136) for two characteristic forms of F(M).

a non-vanishing M (and thereby a ferromagnetic ground state). We therefore proceed by noticing that N and M are given by integrating over all occupied states of the shifted spin DOSs up to the Fermi level

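$$ N = \int^{\epsilon_F} \left[ N^{0}\!\left(E + \tfrac{1}{2} I M\right) + N^{0}\!\left(E - \tfrac{1}{2} I M\right) \right] dE \tag{8.134} $$

$$ M = \int^{\epsilon_F} \left[ N^{0}\!\left(E + \tfrac{1}{2} I M\right) - N^{0}\!\left(E - \tfrac{1}{2} I M\right) \right] dE . \tag{8.135} $$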

Since $N^{0}(E)$ and N are fixed, these two equations determine the remaining free variables of the magnetic calculation, $\epsilon_F$ and M. However, the two equations are coupled (i.e., they depend on the same variables), so that one has to solve them self-consistently. The resulting M is therefore characterized by self-consistently fulfilling the relation

$$ M = \int^{\epsilon_F(M)} \left[ N^{0}\!\left(E + \tfrac{1}{2} I M\right) - N^{0}\!\left(E - \tfrac{1}{2} I M\right) \right] dE \equiv F(M) \tag{8.136} $$

As expected, M is only a function, F(M), of the Stoner parameter I, the DOS $N^{0}(E)$ of the non-magnetic material, and the filling of the latter (determined by $\epsilon_F$ and in turn by N). To really quantify the resulting M at this point, we would need explicit knowledge of $N^{0}(E)$. Yet, even without such knowledge we can derive some universal properties of the functional dependence of F(M) that simply follow from its form and the fact that $N^{0}(E) > 0$,

$$ F(M) = -F(-M) , \qquad F(0) = 0 , \qquad F(\pm\infty) = \pm M_{\infty} , \qquad F'(M) > 0 . $$

Here $M_{\infty}$ corresponds to the saturation magnetization at full spin polarization, when all majority spin states are occupied and all minority spin states are empty. The structure of F(M) must therefore be quite simple: monotonically rising from a constant value at $-\infty$ to a constant value at $+\infty$, and in addition antisymmetric with respect to the origin. Two possible generic forms of F(M) that comply with these requirements are sketched in Fig. 8.16. In case (A), there is only the trivial solution M = 0 to the relation M = F(M) of Eq.


(8.136), i.e., only a non-magnetic state results. In contrast, case (B) has three possible solutions. Apart from the trivial M = 0, two new solutions with a finite spontaneous magnetization $M = M_s \neq 0$ emerge. Therefore, if the slope of F(M) at M = 0 is larger than 1, the monotonic function F(M) must necessarily cross the straight line y = M at another point, too. A sufficient criterion for the existence of ferromagnetic solutions with finite magnetization $M_s$ is therefore $F'(0) > 1$.

Taking the derivative of Eq. (8.136), we find that $F'(0) = I N^{0}(\epsilon_F)$, and finally arrive at the famous Stoner criterion for ferromagnetism

$$ I N^{0}(\epsilon_F) > 1 . \tag{8.137} $$
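The self-consistent relation M = F(M) of Eq. (8.136) and the Stoner criterion can be explored numerically. The following is only a minimal sketch under stated assumptions: the Lorentzian model DOS, its width `GAMMA`, and all function names are hypothetical choices made for illustration and are not taken from the lecture. The model DOS is chosen because its integral is known analytically.

```python
import math

GAMMA = 1.0  # hypothetical half-width of the model DOS (arbitrary energy units)


def dos(e):
    """Assumed non-magnetic DOS per spin: a Lorentzian normalized to one state."""
    return GAMMA / (math.pi * (e * e + GAMMA * GAMMA))


def filled(e):
    """Number of states per spin below energy e (analytic integral of dos)."""
    return 0.5 + math.atan(e / GAMMA) / math.pi


def fermi_level(n_el, stoner_i, m):
    """Solve Eq. (8.134) for the Fermi level at fixed M by bisection."""
    lo, hi = -1.0e6, 1.0e6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        n = filled(mid + 0.5 * stoner_i * m) + filled(mid - 0.5 * stoner_i * m)
        if n < n_el:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f_of_m(n_el, stoner_i, m):
    """Right-hand side F(M) of Eq. (8.136) for the model DOS."""
    ef = fermi_level(n_el, stoner_i, m)
    return filled(ef + 0.5 * stoner_i * m) - filled(ef - 0.5 * stoner_i * m)


def solve_m(n_el, stoner_i, m_start=0.5, n_iter=500):
    """Iterate M <- F(M) starting from a finite trial magnetization."""
    m = m_start
    for _ in range(n_iter):
        m = f_of_m(n_el, stoner_i, m)
    return m


if __name__ == "__main__":
    n_el = 1.0  # half filling, so the non-magnetic Fermi level sits at E = 0
    for stoner_i in (2.0, 5.0):
        product = stoner_i * dos(0.0)  # left-hand side of Eq. (8.137)
        print(f"I = {stoner_i}: I*N0(eF) = {product:.3f}, M = {solve_m(n_el, stoner_i):.4f}")
```

For this model, the iteration decays to M = 0 when the product of I and the DOS at the Fermi level is below 1, and settles at a finite magnetization when it exceeds 1, which mirrors cases (A) and (B) of Fig. 8.16. Real materials require the DOS of an actual non-magnetic calculation instead of the assumed Lorentzian.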

A sufficient condition for a material to exhibit ferromagnetism is therefore to have a high density of states at the Fermi level and a large Stoner parameter. The latter is a material constant that can be computed within DFT (by taking the functional derivative of $V^{\uparrow\downarrow}_{\mathrm{XC}}(\mathbf{r})$ with respect to the magnetization; see, e.g., J.F. Janak, Phys. Rev. B 16, 255 (1977)). Loosely speaking, it measures the localization of the wave functions at the Fermi level. In turn it represents the strength of the exchange interaction per electron, while $N^{0}(\epsilon_F)$ tells us how many electrons are able to participate in the ordering. Consequently, the Stoner criterion has a very intuitive form: strength per electron times number of electrons participating. If this exceeds a critical threshold, a ferromagnetic state results.

Figure 8.17 shows the quantities entering the Stoner criterion calculated within DFT-LSDA for all elements. Gratifyingly, only the three elements Fe, Co, and Ni that are indeed ferromagnetic in reality fulfill the condition. With the understanding of the Stoner model we can now, however, also rationalize why it is only these three elements and not the others: For ferromagnetism we need a high number of states at the Fermi level, such that only transition metals come into question. In particular, $N^{0}(\epsilon_F)$ is always highest for the very late TMs with an almost filled d-band. In addition, we also require a strong localization of the wave functions at the Fermi level to yield a high Stoner parameter. And here, the compact 3d orbitals outperform the more delocalized ones of the 4d and 5d elements. Combining the two demands, only Fe, Co, and Ni remain.

Yet, it is interesting to see that Ca, Sc, and Pd also come quite close to fulfilling the Stoner criterion. This is particularly relevant if we consider situations other than the bulk. The DOSs of solids (although in general quite structured) scale to first order inversely with the band width W. In other words, the narrower the band, the higher the number of states per energy within the band. Just as by a stronger localization, W is also decreased by a lower coordination. In the atomic limit, the band width is zero and the Stoner criterion is always fulfilled. Correspondingly, all atoms with partially filled shells have a non-vanishing magnetic moment, given by Hund's rules. Somewhere in between are surfaces and thin films, where the atoms have a reduced coordination compared to the bulk. The bands are then narrower and such systems can exhibit a larger magnetic moment or even become magnetic although the material is non-magnetic in the bulk. The magnetism of surfaces and thin films is therefore substantially richer than the bulk magnetism discussed in this introductory lecture. Most importantly, it is also the basis of many technological applications.


Figure 8.17: Trend over all elements of the Stoner parameter I, of the non-magnetic DOS at the Fermi level (here denoted as $D(E_F)$), and of their product. The Stoner criterion is only fulfilled for the elements Fe, Co, and Ni [from H. Ibach and H. Lüth, Solid State Physics, Springer (1990)].


Figure 8.18: Schematic illustration of magnetic domain structure.

8.5 Magnetic domain structure

With the results from the last section we can understand the ferromagnetism of the RE metals as a consequence of the exchange coupling between localized moments due to the partially filled (and mostly atomic-like) f-shells. The itinerant ferromagnetism of the TMs Fe, Co, and Ni, on the other hand, comes from the exchange interaction of the still localized, but no longer atomic-like d orbitals together with the delocalized s-electron density. We can treat the former, localized magnetism using discrete Heisenberg or Hubbard models and exchange constants from Heitler-London type theories (or LDA+U type approaches), while the latter form of magnetism is more appropriately described within band theories like spin-DFT-LSDA. Despite this quite comprehensive understanding, an apparently contradictory feature of magnetic materials is that a ferromagnet does not always show a macroscopic net magnetization, even when the temperature is well below the Curie temperature. In addition, it is hard to grasp how the magnetization can be affected by external fields, when we know that the molecular internal fields are vastly larger.

The explanation for these puzzles lies in the fact that two rather different forces operate between the magnetic moments of a ferromagnet. At very short distances (of the order of atomic spacings), magnetic order is induced by the powerful, but quite short-ranged exchange forces. Simultaneously, at distances large compared with atomic spacings, the magnetic moments still interact as magnetic dipoles. We stressed in the beginning of the last section that the latter interaction is very weak; in fact, between nearest neighbors it is orders of magnitude smaller than the exchange interaction. On the other hand, the dipolar interaction is rather long-ranged, falling off only as the inverse cube of the separation. Consequently, it can still diverge if one sums over all interactions between a large population that is uniformly magnetized. In order to accommodate these two competing interactions (short-range alignment, but divergence if ordered ensembles become too large), ferromagnets organize themselves into domains as illustrated in Fig. 8.18. The dipolar energy is then substantially reduced, since due to the long-range interaction the energy of every spin drops. This bulk-like effect is opposed by the unfavorable exchange interactions with the nearby spins in the neighboring misaligned domains. Because, however, the exchange interaction is short-ranged, it is only the spins near the domain boundaries that will have their exchange energies raised, i.e., this is a surface effect. Provided that the domains are not too small, domain formation will be favored in spite of the vastly greater strength of


Figure 8.19: Schematic showing the alignment of magnetic moments at a domain wall with (a) an abrupt boundary and (b) a gradual boundary. The latter type is less costly in exchange energy.

the exchange interaction: Every spin can lower its (small) dipolar energy, but only a few (those near the domain boundaries) have their (large) exchange energy raised.

Whether or not a ferromagnet exhibits a macroscopic net magnetization, and how this magnetization is altered by external fields, has therefore less to do with creating alignment (or ripping it apart) than with the domain structure and how its size and orientational distribution are changed by the field. In this respect it is important to notice that the boundary between two domains (known as the domain wall or Bloch wall) often does not have the abrupt structure one might intuitively expect. Instead, a spread-out, gradual change as shown in Fig. 8.19 is less costly in exchange energy. All an external field has to do then is to slightly affect the relative alignments (not flip one spin completely), in order to induce a smooth motion of the domain wall, and thereby increase the size of favorably aligned domains. The way in which comparably weak external fields can "magnetize" a piece of unmagnetized ferromagnet is therefore to rearrange and reorient the various domains, cf. Fig. 8.20. The corresponding domain wall motion can be reversible, but it may also well be that it is hindered by crystalline imperfections (e.g., a grain boundary), through which the wall will only pass when the driving force due to the applied field is large. When the aligning field is removed, these defects may prevent the domain walls from returning to their original unmagnetized configuration, so that it becomes necessary to apply a rather strong field in the opposite direction to restore the original situation. The dynamics of the domains is thus quite complicated, and their states depend in detail upon the particular history that has been applied to them (which fields, how strong, which direction).

As a consequence, ferromagnets always exhibit so-called hysteresis effects when plotting the magnetization of the sample against the external magnetic field strength, as shown in Fig. 8.21. Starting out with an initially unmagnetized crystal (at the origin), the applied external field induces a magnetization, which reaches a saturation value $M_s$ when all dipoles are aligned. On removing the field, the irreversible part of the domain boundary dynamics leads to a slower decrease of the magnetization, leaving the so-called remanent magnetization $M_R$ at zero field. One needs a reverse field (opposite to the magnetization of the sample) to further reduce the magnetization, which finally reaches zero at the so-called coercive field $B_c$.

Depending on the value of this coercive field, one distinguishes between soft and hard magnets in applications: A soft magnet has a small $B_c$ and therefore exhibits only a small


Figure 8.20: (a) If all of the magnetic moments in a ferromagnet are aligned, they produce a magnetic field which extends outside the sample. (b) The net macroscopic magnetization of a crystal can be zero when the moments are arranged in (randomly oriented) domains. (c) Application of an external magnetic field can move the domain walls so that the domains which have the moments aligned parallel to the field are increased in size. In total this gives rise to a net magnetization.

Figure 8.21: Typical hysteresis loop for a ferromagnet.


Figure 8.22: Hysteresis loops for soft and hard magnets.

hysteresis loop as illustrated in Fig. 8.22. Such materials are employed whenever one needs to reorient the magnetization frequently without having to resort to strong fields to do so. Typical applications are cores in motors, transformers, or switches, and examples of soft magnetic materials are Fe or Fe-Ni and Fe-Co alloys. The idea of a hard magnet, which requires a huge coercive field, cf. Fig. 8.22, is to maximally hinder the demagnetization of a once magnetized state. A major field of application is therefore magnetic storage with Fe or Cr oxides, schematically illustrated in Fig. 8.23. The medium (tape or disc) is covered by many small magnetic particles, each of which behaves as a single domain. A magnet (the write head) records the information onto this medium by orienting the magnetic particles along particular directions. Reading information is then simply the reverse: The magnetic field produced by the magnetic particles induces a magnetization in the read head. In magnetoresistive materials, the electrical resistivity changes in the presence of a magnetic field, so that it is possible to directly convert the magnetic field information into an electrical response. In 1988 a structure consisting of extremely thin, i.e., three-atom-thick, alternating layers of Fe and Cr was found to have a magnetoresistance about a hundred times larger than that of any natural material. This effect is known as giant magnetoresistance and is by now exploited in today's disc drives.


Figure 8.23: Illustration of the process of recording information onto a magnetic medium. The coil converts an electrical signal into a magnetic field in the write head, which in turn affects the orientation of the magnetic particles on the magnetic tape or disc. Here there are only two possible directions of magnetization of the particles, so the data is in binary form and the two states can be identified as 0 and 1.


9 Transport Properties of Solids

9.1 Introduction

So far, we have almost exclusively discussed solids in thermodynamic equilibrium. In this chapter, we will now discuss transport effects, i.e., how solids react to an external perturbation, e.g., to an external electric field $\mathbf{E}$, i.e., to the gradient of an electric potential, $\mathbf{E} = -\nabla U$, or to a temperature gradient $\nabla T$. In all these cases, a flux ($\mathbf{J}_e$ and $\mathbf{J}_h$, see below) develops in reaction to the perturbation and tries to re-establish thermodynamic equilibrium. The effectiveness of this mechanism is characterized by the respective transport coefficients. These are material-, temperature-, and pressure-dependent constants that describe the linear relationship between the respective fluxes and perturbations, e.g., the electrical conductivity $\sigma$ in Ohm's law

$$ \mathbf{J}_e = -\sigma \nabla U \tag{9.1} $$

and the thermal conductivity $\kappa$ in Fourier's law

$$ \mathbf{J}_h = -\kappa \nabla T . \tag{9.2} $$

Please note that both $\sigma$ and $\kappa$ are tensorial quantities, e.g., a gradient along the x-axis can in principle lead to fluxes along the y- or z-axis. In most cases, however, these non-diagonal terms of $\sigma$ and $\kappa$ are orders of magnitude smaller than the diagonal terms, so that we will restrict our discussion to the diagonal terms in this chapter. Furthermore, it is important to realize that Eq. (9.1) and Eq. (9.2) are truncated first-order Taylor expansions of the full relationships for $\mathbf{J}_e(U)$ and $\mathbf{J}_h(T)$. For the purposes of this chapter, using such a linear response approximation is more than justified, given that the typical voltages $\Delta U < 1{,}000$ V and temperature differences $\Delta T < 1{,}000$ K applied in real-world applications to macroscopic solids (volume $V > 1\,\mu\mathrm{m}^3$) correspond to minute electric fields¹ and temperature gradients at the microscopic level ($V < 10\,\mathrm{nm}^3$).

In this regime, we can thus consider the solid in the so-called "Onsager picture" as consisting of microscopic, individual sections: Each of these sections is assumed to be so large that the rules of thermodynamic equilibrium hold and allow us to define thermodynamic quantities such as the temperature for each individual section. However, the individual sections are not in thermodynamic equilibrium with respect to each other. At first sight, this assumption seems to simplify the discussion. As we will see in this chapter, this is not the case in practice: The fundamental reason is that we

¹ As already mentioned in the previous chapter, this approximation would no longer hold for the interactions of solids with lasers. Today's lasers can generate huge electric fields at a microscopic level that in turn lead to non-linear responses.


are now discussing macroscopic effects driven by microscopic interactions, which requires spanning several orders of magnitude in time and length scales. Developing theories that accurately describe the microscopic level but still allow insights on the macroscopic level within affordable computational effort is still a topic of scientific research.

9.2 The Definition of Transport Coefficients

In the following, we will discuss electronic transport. For phonons (see Chap. 7), very similar relations are applicable with two important differences: Electrons carry charge and are fermions, while phonons carry no charge and are bosons.

If the fluxes consist of particles that carry both energy and charge, as is the case for electrons with charge e described by a chemical potential $\mu$ (i.e., the local Fermi energy in the respective "Onsager section"), the respective fluxes $\mathbf{J}_e$ and $\mathbf{J}_h$ are coupled. An electron that is contributing to a heat flux driven by a thermal gradient still carries a charge and thus also leads to an electrical flux (and vice versa). Formally this is described by the so-called Onsager coefficients $L_{ij}$:

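$$ \mathbf{J}_e = L_{11} \left( -\nabla U - \frac{\nabla\mu}{e} \right) + L_{12} \left( -\nabla T \right) \tag{9.3} $$

$$ \mathbf{J}_h = L_{21} \left( -\nabla U - \frac{\nabla\mu}{e} \right) + L_{22} \left( -\nabla T \right) . \tag{9.4} $$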

As we will show with microscopic considerations later, $L_{21} = T L_{12}$. With that, a comparison of the Onsager relations above with Eqs. (9.1) and (9.2) shows that the electrical conductivity corresponds to $\sigma = L_{11}$ and that the thermal conductivity corresponds to

$$ \kappa = L_{22} - \frac{L_{21} L_{12}}{L_{11}} = L_{22} - \frac{T (L_{12})^2}{L_{11}} . \tag{9.5} $$

The formal asymmetry in the definition of $\sigma$ and $\kappa$ stems in fact from their experimental definition. While $\sigma$ is defined for measurements in thermodynamic equilibrium, in which both cathode and anode are kept at the same temperature via interactions with the environment (so heat is added to and taken from the system), $\kappa$ is typically measured in an open circuit, in which an electronic flux and the respective charges cannot leave the system. This leads to a pile-up of charge on the cold side, which in turn leads to an electric flux in the opposite direction that needs to be accounted for by the non-diagonal $L_{ij}$ terms. This leads to the definition given in Eq. (9.5).

In turn, the considerations discussed above highlight an interesting and peculiar effect, the thermoelectric effect. For a non-zero temperature gradient and a vanishing electrical flux ($\mathbf{J}_e = 0$, i.e., an open circuit), Eq. (9.3) implies that a residual voltage is induced by the temperature gradient

$$ \left( -\nabla U - \frac{\nabla\mu}{e} \right) = \frac{L_{12}}{L_{11}} \nabla T = S \nabla T , \tag{9.6} $$

as described by the Seebeck coefficient S. This thermoelectric effect is currently very much in the focus of many scientific and industrial studies, since it in principle allows one to recover a useful voltage from otherwise wasted heat, e.g., from industrial plants operating at high temperatures, from exhaust fumes of vehicles, and even from electronics such


as CPUs. However, the conversion efficiency of the currently existing thermoelectric materials is still too low for a large-scale, economically attractive deployment. To enable such applications, thermoelectric materials with higher efficiency are needed; the efficiency is typically characterized by the figure of merit zT, defined as:

$$ zT = \frac{\sigma S^2 T}{\kappa_{\mathrm{el}} + \kappa_{\mathrm{ph}}} , \tag{9.7} $$

whereby currently known thermoelectric materials exhibit values of zT of ∼ 2 at most. Improving zT is far from trivial: Metals typically exhibit high values of $\sigma$ but also high values of $\kappa_{\mathrm{el}}$, which essentially leads to zT ≈ 0. Vice versa, $\kappa = \kappa_{\mathrm{ph}}$ becomes small in insulators due to the absence of charge carriers, but then $\sigma$ and thus zT vanish again. Accordingly, (highly doped) semiconductors are the most attractive material class for this kind of application. Finally, let us note that the Seebeck effect also works "the other way round": The so-called Peltier effect describes how an electric current through a closed circuit induces a heat current

$$ \mathbf{J}_h = T \frac{L_{12}}{L_{11}} \mathbf{J}_e = T\, S \cdot \mathbf{J}_e \tag{9.8} $$

that allows cooling. For instance, this technique is used today in so-called solid-state refrigerators, e.g., in airplanes.

To understand these effects and to estimate and compute these quantities, it is necessary to introduce concepts stemming from the still very active and quite complex research field of non-equilibrium quantum-mechanical thermodynamics and statistical mechanics. At first, let us just discuss the qualitative results of such a derivation. The microscopic, local flux densities $\mathbf{j}(\mathbf{r}, t)$ and $\mathbf{j}_H(\mathbf{r}, t)$ have to fulfill the respective continuity equations

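$$ e \frac{\partial n(\mathbf{r}, t)}{\partial t} + \nabla \cdot \mathbf{j}(\mathbf{r}, t) = 0 \;\Rightarrow\; \mathbf{J}(t) = \frac{1}{V} \int_V \mathbf{j}(\mathbf{r}, t)\, d\mathbf{r} \tag{9.9} $$

$$ \frac{\partial \varepsilon(\mathbf{r}, t)}{\partial t} + \nabla \cdot \mathbf{j}_H(\mathbf{r}, t) = 0 \;\Rightarrow\; \mathbf{J}_H(t) = \frac{1}{V} \int_V \mathbf{j}_H(\mathbf{r}, t)\, d\mathbf{r} , \tag{9.10} $$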

in which $e\, n(\mathbf{r}, t)$ is the charge density, $\varepsilon(\mathbf{r}, t)$ the energy density, and V the volume of an Onsager system. This allows us to define the quantum-mechanical flux operators

$$ \mathbf{j} = \frac{e}{m} \mathbf{p} \tag{9.11} $$

$$ \mathbf{j}_H = \frac{1}{2m} [H, \mathbf{p}] - h_e \mathbf{p} . \tag{9.12} $$

In the last equation, $h_e$ denotes the enthalpy per electron. A full quantum-mechanical treatment of these equations involves quite elaborate derivations. For this exact reason, we will consider these effects in a semi-classical approximation first.
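The relations between the Onsager coefficients and the measurable quantities of this section, Eqs. (9.5) to (9.8), can be collected in a few lines of code. This is only a sketch: the function names and all numerical values are hypothetical and chosen purely for illustration, and the relation $L_{21} = T L_{12}$ is assumed as stated above.

```python
def transport_from_onsager(l11, l12, l22, temperature):
    """Return (sigma, seebeck, kappa_el), assuming L21 = T * L12."""
    l21 = temperature * l12
    sigma = l11                       # electrical conductivity, sigma = L11
    seebeck = l12 / l11               # Seebeck coefficient, Eq. (9.6)
    kappa_el = l22 - l21 * l12 / l11  # open-circuit thermal conductivity, Eq. (9.5)
    return sigma, seebeck, kappa_el


def figure_of_merit(sigma, seebeck, temperature, kappa_el, kappa_ph):
    """Thermoelectric figure of merit zT, Eq. (9.7)."""
    return sigma * seebeck**2 * temperature / (kappa_el + kappa_ph)


def peltier_heat_flux(seebeck, temperature, j_e):
    """Heat flux carried by an electrical flux j_e, Eq. (9.8)."""
    return temperature * seebeck * j_e


if __name__ == "__main__":
    # Hypothetical input values in SI units, for illustration only.
    temperature = 300.0   # K
    l11 = 1.0e5           # S/m
    l12 = 20.0            # A/(m K)
    l22 = 2.0             # W/(m K)
    kappa_ph = 1.0        # W/(m K), assumed lattice contribution

    sigma, seebeck, kappa_el = transport_from_onsager(l11, l12, l22, temperature)
    zt = figure_of_merit(sigma, seebeck, temperature, kappa_el, kappa_ph)
    print(f"sigma = {sigma:.3g} S/m, S = {seebeck:.3g} V/K, "
          f"kappa_el = {kappa_el:.3g} W/(m K), zT = {zt:.3f}")
```

The sketch makes the trade-off behind zT explicit: raising $L_{11}$ increases $\sigma$ in the numerator, but the electronic thermal conductivity in the denominator is built from the same coefficients.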

9.3 Semiclassical Theory of Transport

To introduce non-equilibrium in our system, we first define a wave packet

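$$ \Psi_n(\mathbf{r}, \mathbf{k}, t) = \sum_{\mathbf{k}'} g(\mathbf{k}')\, \Phi_n(\mathbf{k}', \mathbf{r}) \exp\!\left( -\frac{i}{\hbar} \varepsilon_n(\mathbf{k}')\, t \right) \quad \text{with } g(\mathbf{k}') \approx 0 \text{ for } |\mathbf{k} - \mathbf{k}'| > \Delta k \tag{9.13} $$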


Figure 9.1: Sketch of an electron wave packet, as described by Eq. (9.13): The wave packet shall be spread out over multiple unit cells, i.e., it shall be larger than typical lattice constants. Also, the wave packet shall be smaller than the wavelength of the applied field.

using the one-particle states $\Phi_n(\mathbf{k}, \mathbf{r})$ and their eigenvalues $\varepsilon_n(\mathbf{k})$. The condition $g(\mathbf{k}') \approx 0$ for $|\mathbf{k} - \mathbf{k}'| > \Delta k$ implies that the wave packet is localized in k space and can thus be meaningfully approximated by an average k value. In turn, this implies that this wave packet is delocalized in real space, see Fig. 9.1. Accordingly, such a description is consistent with the Onsager picture defined in the introduction. Given that this wave packet is so widely spread, we can describe its dynamics in an electric field by classical equations of motion:

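$$ \dot{\mathbf{r}} = \mathbf{v}_n(\mathbf{k}) = \frac{1}{\hbar} \frac{\partial \varepsilon_n(\mathbf{k})}{\partial \mathbf{k}} \quad \text{and} \quad \hbar \dot{\mathbf{k}} = -e \mathbf{E}(\mathbf{r}, t) . \tag{9.14} $$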

While the first equation just makes use of the fact that a wave packet travels with a well-defined group velocity, a fully correct, formal derivation of the second equation is tricky. Still, one can qualitatively derive it by considering that energy conservation needs to hold during this motion:

$$ \varepsilon_n(\mathbf{k}) - eU = \text{constant} \tag{9.15} $$

Then, its time derivative must vanish:

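\begin{align}
0 &= \frac{\partial \varepsilon_n(\mathbf{k})}{\partial \mathbf{k}} \cdot \dot{\mathbf{k}} - e \nabla U \cdot \dot{\mathbf{r}} \tag{9.16} \\
  &= \mathbf{v}_n(\mathbf{k}) \cdot \left[ \hbar \dot{\mathbf{k}} - e \nabla U \right] = \mathbf{v}_n(\mathbf{k}) \cdot \left[ \hbar \dot{\mathbf{k}} + e \mathbf{E} \right] \tag{9.17}
\end{align}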

Please note that in this case we have assumed that the band index n is a constant of the motion, i.e., that the considered electrical and thermal fields cannot induce electronic transitions between different electronic bands. The flux associated with such a wave packet then becomes

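$$ \mathbf{J}_n(\mathbf{k}) = -e \mathbf{v}_n(\mathbf{k}) \quad \text{and} \quad \mathbf{J}^H_n(\mathbf{k}) = \varepsilon_n(\mathbf{k}) \mathbf{v}_n(\mathbf{k}) \tag{9.18} $$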

In this formalism, the individual electrons thus just travel with a velocity $\mathbf{v}_n(\mathbf{k})$ and thereby carry a charge $-e$ and an energy $\varepsilon_n(\mathbf{k})$ with them.


Already these very simple considerations allow some very important insights into the physics of transport:

• Solving these equations of motion for a constant DC field $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}$ yields

$$ \mathbf{k}(t) = \mathbf{k}(0) - \frac{e \mathbf{E} t}{\hbar} . \tag{9.19} $$

Given that our group velocity is periodic and obeys time reversal symmetry, $\mathbf{v}_n(\mathbf{k}) = -\mathbf{v}_n(-\mathbf{k})$, this implies that the group velocity can change sign over time. In turn, this means that a DC field could induce an AC current if the electrons travel long enough to cross the zone boundaries. We will see later that this is generally not the case.

• For the exact same reasons ($\mathbf{v}_n(\mathbf{k}) = -\mathbf{v}_n(-\mathbf{k})$, $\varepsilon_n(\mathbf{k}) = \varepsilon_n(-\mathbf{k})$), we can see that a fully filled band is inert, i.e., it does not carry any flux

$$ \mathbf{J}_n = \int d\mathbf{k}\, \mathbf{J}_n(\mathbf{k}) = 0 , \tag{9.20} $$

since the contributions from $\mathbf{k}$ and $-\mathbf{k}$ cancel each other out. That is why insulators are insulators and electronic transport only happens in semiconductors and metals with partially filled bands. This implies that the only relevant portion of the electronic band structure that contributes to transport is situated around the Fermi energy, $\varepsilon_F \pm k_B T$.

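• Electrons & Holes: Now, let us assume that band n contains only one electron at $\mathbf{k}'$ viz. $\varepsilon_n(\mathbf{k}')$, i.e.,

$$ f(\varepsilon_n(\mathbf{k})) = \begin{cases} 1 & \text{for } \mathbf{k} = \mathbf{k}' \\ 0 & \text{else.} \end{cases} \tag{9.21} $$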

In this case, the electric flux becomes:

$$ \mathbf{J}_n = \int d\mathbf{k}\, f(\varepsilon_n(\mathbf{k}))\, \mathbf{J}_n(\mathbf{k}) = -e \mathbf{v}_n(\mathbf{k}') . \tag{9.22} $$

If the band were fully filled except for one missing electron at $\mathbf{k}'$ viz. $\varepsilon_n(\mathbf{k}')$, the respective occupation numbers would become $f'(\varepsilon_n(\mathbf{k})) = 1 - f(\varepsilon_n(\mathbf{k}))$ and we would get the flux

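$$ \mathbf{J}_n = \int d\mathbf{k}\, f'(\varepsilon_n(\mathbf{k}))\, \mathbf{J}_n(\mathbf{k}) = \int d\mathbf{k}\, \mathbf{J}_n(\mathbf{k}) - \int d\mathbf{k}\, f(\varepsilon_n(\mathbf{k}))\, \mathbf{J}_n(\mathbf{k}) = e \mathbf{v}_n(\mathbf{k}') . \tag{9.23} $$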

The first integral vanishes due to Eq. (9.20); the second one describes the flux of one "electron with opposite charge", i.e., a hole. Although formally equivalent, it is often more convenient to discuss conduction in almost fully filled bands in terms of moving holes instead of electrons.

There is one additional important implication: Since in full thermodynamic equilibrium the Fermi distribution is symmetric, $f(\varepsilon_n(-\mathbf{k})) = f(\varepsilon_n(\mathbf{k}))$, no flux can be present even in not fully filled bands. That is why we will have to look at non-equilibrium distributions in the next section to understand transport.


• In semiconductors, in which only electrons/holes close to the conduction band minimum/valence band maximum contribute, one can often approximate the respective band structure with a second-order Taylor expansion:

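$$ \varepsilon_n(\mathbf{k}) \approx \varepsilon_n(\mathbf{k}_0) + \frac{1}{2} \sum_{i,j} \left. \frac{\partial^2 \varepsilon_n(\mathbf{k})}{\partial k_i \partial k_j} \right|_{\mathbf{k}_0} (k_i - k_{0,i})(k_j - k_{0,j}) . \tag{9.24} $$

Here $\mathbf{k}_0$ denotes the position of the band extremum, at which the first-order term vanishes. Formally, the dispersion thus corresponds to the one of a free electron with a slightly different, not necessarily isotropic mass, as described by the tensor: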

$$ [M]_{ij} = \left( \frac{1}{\hbar^2} \frac{\partial^2 \varepsilon_n(\mathbf{k})}{\partial k_i \partial k_j} \right)^{-1} . \tag{9.25} $$

Accordingly, one can speak about fast (light) and slow (heavy) electrons/holes in a solid.
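The observations in the list above can be checked on a simple model band. The following sketch assumes a hypothetical one-dimensional tight-binding band $\varepsilon(k) = -2t\cos(ka)$, which is not discussed in the text; the parameters `T_HOP` and `A_LAT`, the model units, and all function names are illustrative assumptions.

```python
import math

HBAR = 1.0      # reduced Planck constant (model units)
E_CHARGE = 1.0  # elementary charge e (model units)
T_HOP = 1.0     # hypothetical hopping energy t
A_LAT = 1.0     # hypothetical lattice constant a


def band(k):
    """Assumed 1D tight-binding dispersion."""
    return -2.0 * T_HOP * math.cos(k * A_LAT)


def velocity(k):
    """Group velocity (1/hbar) d(eps)/dk, first relation of Eq. (9.14)."""
    return 2.0 * T_HOP * A_LAT * math.sin(k * A_LAT) / HBAR


def k_of_t(k0, field, t):
    """Wave vector in a constant DC field, Eq. (9.19)."""
    return k0 - E_CHARGE * field * t / HBAR


def bloch_period(field):
    """Time after which k has moved by one reciprocal lattice vector."""
    return 2.0 * math.pi * HBAR / (E_CHARGE * abs(field) * A_LAT)


def k_grid(n_k):
    """Uniform grid over the first Brillouin zone, symmetric about k = 0."""
    return [(-0.5 + (i + 0.5) / n_k) * 2.0 * math.pi / A_LAT for i in range(n_k)]


def band_flux(occupied_ks):
    """Total electrical flux: sum of -e v(k) over occupied states, cf. Eq. (9.18)."""
    return sum(-E_CHARGE * velocity(k) for k in occupied_ks)


def effective_mass(k, dk=1.0e-3):
    """1D version of Eq. (9.25), using a finite-difference second derivative."""
    curvature = (band(k + dk) - 2.0 * band(k) + band(k - dk)) / dk**2
    return HBAR**2 / curvature


if __name__ == "__main__":
    ks = k_grid(200)
    print("flux of the filled band:", band_flux(ks))
    print("flux with one state empty:", band_flux(ks[:50] + ks[51:]),
          "vs. e*v(k'):", E_CHARGE * velocity(ks[50]))
    period = bloch_period(0.1)
    print("v at t = 0, T/2, T:",
          [velocity(k_of_t(0.3, 0.1, t)) for t in (0.0, 0.5 * period, period)])
    print("effective mass at band bottom and top:",
          effective_mass(0.0), effective_mass(math.pi / A_LAT))
```

In this model the filled band carries no flux, a band with one empty state carries the flux of a positively charged hole, the velocity in a DC field changes sign within one period, and the effective mass is positive at the band bottom and negative at the band top.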

9.4 Boltzmann Transport Theory

With the formalisms introduced in the previous sections, we are now able to discuss how an arbitrary electron population $g(\mathbf{r}, \mathbf{k}, t)$, i.e., one that is not necessarily in thermodynamic equilibrium, evolves over time. Using Liouville's theorem (conservation of phase space) and the relations derived for $\mathbf{r}(t)$ and $\mathbf{k}(t)$ in the previous section, we thus get:

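$$ g(\mathbf{r}, \mathbf{k}, t)\, d\mathbf{r}\, d\mathbf{k} = g\!\left( \mathbf{r} - \mathbf{v}(\mathbf{k})\, dt,\; \mathbf{k} + \frac{e \mathbf{E}}{\hbar}\, dt,\; t - dt \right) d\mathbf{r}\, d\mathbf{k} \tag{9.26} $$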

By linearizing this equation, we get the so-called "drift" terms, which in an Onsager picture have to be counterbalanced by a scattering or collision mechanism:

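$$ \left( \frac{dg(\mathbf{r}, \mathbf{k}, t)}{dt} \right)_{\mathrm{drift}} = \frac{\partial g}{\partial t} + \mathbf{v} \cdot \frac{\partial g}{\partial \mathbf{r}} - \frac{e \mathbf{E}}{\hbar} \cdot \frac{\partial g}{\partial \mathbf{k}} = \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll}} \tag{9.27} $$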

By defining a collision (scattering) probability from state $\mathbf{k}$ into $\mathbf{k}'$

$$ \frac{W_{\mathbf{k}\mathbf{k}'}}{(2\pi)^3}\, dt\, d\mathbf{k}' , \tag{9.28} $$

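we find that these collisions can either lead to fewer electrons at $\mathbf{k}$ (scattering from $\mathbf{k}$ into any state $\mathbf{k}'$):

$$ \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll,out}} = -g(\mathbf{k}) \int \frac{d\mathbf{k}'}{(2\pi)^3} W_{\mathbf{k}\mathbf{k}'} \left[ 1 - g(\mathbf{k}') \right] \tag{9.29} $$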

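or to additional electrons at $\mathbf{k}$ (scattering from any state $\mathbf{k}'$ into state $\mathbf{k}$)²:

$$ \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll,in}} = \left[ 1 - g(\mathbf{k}) \right] \int \frac{d\mathbf{k}'}{(2\pi)^3} W_{\mathbf{k}'\mathbf{k}}\, g(\mathbf{k}') . \tag{9.30} $$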

² Please note that the respective collision rates in Eqs. (9.29) and (9.30) are proportional to the population g of the states from which electrons are scattered and to the fraction 1 − g of unoccupied states into which these electrons can be scattered.


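With that, we get the full collision term

\begin{align}
\left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll}} &= \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll,in}} + \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll,out}} \tag{9.31} \\
&= -\int \frac{d\mathbf{k}'}{(2\pi)^3} \left\{ W_{\mathbf{k}\mathbf{k}'}\, g(\mathbf{k}) \left[ 1 - g(\mathbf{k}') \right] - W_{\mathbf{k}'\mathbf{k}} \left[ 1 - g(\mathbf{k}) \right] g(\mathbf{k}') \right\} \tag{9.32} \\
&\approx -\frac{g(\mathbf{k}) - g_0(\mathbf{k})}{\tau(\mathbf{k})} \tag{9.33}
\end{align}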

In the last step, we have applied the so-called relaxation time approximation (RTA) that allows us to solve the Boltzmann transport equation analytically. This relaxation time approximation assumes (a) that the electron distribution is close to the equilibrium one, i.e., to the Fermi function, (b) that scattering generates equilibrium, i.e., drives the population $g(\mathbf{k})$ towards $g_0(\mathbf{k})$, and (c) that all the involved scattering rates can be subsumed into a single state-specific lifetime or relaxation time $\tau(\mathbf{k})$ that does not depend on $g(\mathbf{k})$. Formally this implies:

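$$ \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll,out}} = -g(\mathbf{k}) \int \frac{d\mathbf{k}'}{(2\pi)^3} W_{\mathbf{k}\mathbf{k}'} \left[ 1 - g(\mathbf{k}') \right] \approx -\frac{g(\mathbf{k})}{\tau(\mathbf{k})} \tag{9.34} $$

$$ \left( \frac{\partial g}{\partial t} \right)_{\mathrm{coll,in}} = \left[ 1 - g(\mathbf{k}) \right] \int \frac{d\mathbf{k}'}{(2\pi)^3} W_{\mathbf{k}'\mathbf{k}}\, g(\mathbf{k}') \approx \frac{g_0(\mathbf{k})}{\tau(\mathbf{k})} \tag{9.35} $$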

Now, let us solve the Boltzmann transport equation under different boundary conditions:

• If no external forces/fields are present ($\mathbf{E} = 0$) and the electrons are uniformly distributed in space ($\nabla g = 0$), Eq. (9.27) becomes

$$ \frac{\partial g}{\partial t} = -\frac{g(\mathbf{k}) - g_0(\mathbf{k})}{\tau(\mathbf{k})} \tag{9.36} $$

in the RTA. Its solution is

$$ g(t) = \left( g(\mathbf{k}, t = 0) - g_0(\mathbf{k}) \right) \exp\!\left( -\frac{t}{\tau(\mathbf{k})} \right) + g_0(\mathbf{k}) , \tag{9.37} $$

which describes the relaxation of a non-equilibrium population $g(\mathbf{k}, t = 0)$ to the equilibrium one in an exponential decay characterized by the relaxation time $\tau$.

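• Under steady-state ($\partial g / \partial t = 0$), closed-circuit conditions ($\nabla g = 0$) with an external field $\mathbf{E}$, Eq. (9.27) becomes

$$ -\frac{e \mathbf{E}}{\hbar} \cdot \frac{\partial g}{\partial \mathbf{k}} = -\frac{g(\mathbf{k}) - g_0(\mathbf{k})}{\tau(\mathbf{k})} \tag{9.38} $$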

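in the RTA. Its solution is

$$ g(t) = g_0(t) + \tau \frac{e \mathbf{E}}{\hbar} \cdot \frac{\partial g}{\partial \mathbf{k}} \approx g_0(t) + \tau \frac{e \mathbf{E}}{\hbar} \cdot \frac{\partial g_0}{\partial \mathbf{k}} = g_0(t) + e \frac{\partial g_0}{\partial \epsilon(\mathbf{k})}\, \mathbf{v}(\mathbf{k})\, \tau \cdot \mathbf{E} \tag{9.39} $$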

272

In the last step, we have simplified the expression by assuming ∂g/∂k ≈ ∂g0/∂kand using the definition of the group velocity. With that, we can write the flux as

j = −e∑

n

∫dk

4π3gn(k, t)vn(k) = e2

∑

n

∫dk

4π3v2n(k)τn(k)

∂g0

∂ǫn(k)︸︷︷︸

σ

E . (9.40)

A comparison with Ohm’s law yields the definition of σ. Again, the fact that thederivative ∂g0/∂ǫn(k) enters the definition of σ above highlights that filled bandsare inert and that only electrons/holes close to the Fermi-energy matter. Please notethat the derivation can be generalized to the AC case

σ(ω) = e2∑

n

∫dk

4π3v2n(k)

∂g0

∂ǫn(k)

1

1/τn(k)− iω(9.41)

by usingE→ Re[E exp(iωt)] . (9.42)

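• Finally, we can generalize this derivation to describe all Onsager coefficients by defining the respective heat flux $\mathbf{j}_q$ and entropy flux $\mathbf{j}_s$ from the respective thermodynamic relation:

$$dQ = T\, dS \;\Rightarrow\; \mathbf{j}_q = T\, \mathbf{j}_s . \tag{9.43}$$

Similarly, the respective particle flux $\mathbf{j}_n$ and energy flux $\mathbf{j}_e$ can be defined using the respective thermodynamic relation and the chemical potential $\mu$:

$$T\, dS = dU - \mu\, dN \;\Rightarrow\; T\, \mathbf{j}_s = \mathbf{j}_e - \mu\, \mathbf{j}_n . \tag{9.44}$$

Accordingly, we get:

\begin{align}
\mathbf{j}_e &= \sum_n \int \frac{d\mathbf{k}}{4\pi^3}\, g_n(\mathbf{k}, t)\, \mathbf{v}_n(\mathbf{k})\, \epsilon_n(\mathbf{k}) \tag{9.45} \\
\mathbf{j}_n &= \sum_n \int \frac{d\mathbf{k}}{4\pi^3}\, g_n(\mathbf{k}, t)\, \mathbf{v}_n(\mathbf{k}) \tag{9.46} \\
\Rightarrow\; \mathbf{j}_q &= \sum_n \int \frac{d\mathbf{k}}{4\pi^3}\, g_n(\mathbf{k}, t)\, \mathbf{v}_n(\mathbf{k})\, \left( \epsilon_n(\mathbf{k}) - \mu \right) \tag{9.47}
\end{align}

Solving the respective BTEs in the RTA yields:

$$g(\mathbf{k}) \approx g_0(\mathbf{k}) - \left( \frac{\partial g_0}{\partial \epsilon_{\mathbf{k}}} \right) \mathbf{v}(\mathbf{k})\, \tau(\mathbf{k}) \left[ -e\mathbf{E} - \nabla\mu + \frac{\epsilon_n(\mathbf{k}) - \mu}{T} (-\nabla T) \right] . \tag{9.48}$$

Again, a comparison with the Onsager relations introduced before

\begin{align}
\mathbf{j} &= L_{11} (\mathbf{E} + \nabla\mu) + L_{12} (-\nabla T) \tag{9.49} \\
\mathbf{j}_q &= L_{21} (\mathbf{E} + \nabla\mu) + L_{22} (-\nabla T) \tag{9.50}
\end{align}

yields the following definition for the Onsager coefficients

$$L^{\alpha} = e^2 \sum_n \int \frac{d\mathbf{k}}{4\pi^3}\, v_n^2(\mathbf{k})\, \tau_n(\mathbf{k})\, \frac{\partial g_0}{\partial \epsilon_n(\mathbf{k})}\, \left( \epsilon_n(\mathbf{k}) - \mu \right)^{\alpha} , \tag{9.51}$$

where $L_{11} = L^{0}$, $L_{21} = T L_{12} = -\frac{1}{e} L^{1}$, and $L_{22} = \frac{1}{e^2 T} L^{2}$.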
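The first case above, Eqs. (9.36) and (9.37), can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the RTA rate equation with an explicit Euler scheme and compares the result with the analytic exponential decay. The occupations and the relaxation time (`g_init`, `g_eq`, `tau`) are purely illustrative assumptions.

```python
import math


def relax_analytic(g_init, g_eq, tau, t):
    """Eq. (9.37): exponential relaxation towards the equilibrium occupation."""
    return (g_init - g_eq) * math.exp(-t / tau) + g_eq


def relax_euler(g_init, g_eq, tau, t, steps=100_000):
    """Explicit Euler integration of Eq. (9.36), dg/dt = -(g - g_eq) / tau."""
    dt = t / steps
    g = g_init
    for _ in range(steps):
        g += -(g - g_eq) / tau * dt
    return g


# Illustrative (assumed) values: occupations of one state k and a lifetime in seconds.
g_init, g_eq, tau = 0.9, 0.5, 1.0e-14

for t in (0.5 * tau, tau, 5.0 * tau):
    exact = relax_analytic(g_init, g_eq, tau, t)
    numeric = relax_euler(g_init, g_eq, tau, t)
    print(f"t/tau = {t / tau:3.1f}  analytic = {exact:.6f}  Euler = {numeric:.6f}")
```

After one relaxation time the deviation from equilibrium has dropped to 1/e of its initial value, independent of the assumed numbers.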


From the derived equations, one can immediately derive some important material properties: For metals, which feature a large number of free carriers, the electrical and thermal conductivities are typically high. Using the Sommerfeld expansion around $\epsilon_F$ introduced in the exercises, one can immediately show that the thermal conductivity and the electrical conductivity are proportional to each other, as described by the Wiedemann-Franz law:

$$\kappa = \frac{\pi^2}{3} \left( \frac{k_B}{e} \right)^2 T \sigma . \tag{9.52}$$

As shown in Fig. 9.2, the typical temperature dependence of the resistivity $\rho = 1/\sigma$ follows the rule

$$\rho(T) \approx \rho(0) + B T^5 . \tag{9.53}$$

At low temperatures, we have a finite (residual) resistivity $\rho(0)$ that is caused by scattering at (intrinsic) defects, grain boundaries, and surfaces. At elevated temperatures, the resistivity increases with $T^5$ due to the scattering of electrons by phonons, which determines the lifetimes $\tau(\mathbf{k})$. As we will see in the next section, the resistivity can, however, drop to 0 (infinite conductivity) under certain circumstances.
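To get a feeling for the magnitude of the proportionality constant in the Wiedemann-Franz law, Eq. (9.52), the prefactor (commonly called the Lorenz number) can be evaluated in SI units. This is a small sketch; the conductivity used in the example is an assumed, roughly copper-like value chosen only for illustration.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant in J/K (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)


def lorenz_number():
    """Prefactor (pi^2 / 3) (k_B / e)^2 of Eq. (9.52), in W Ohm / K^2."""
    return math.pi ** 2 / 3.0 * (K_B / E_CHARGE) ** 2


def thermal_conductivity(sigma, temperature):
    """Electronic thermal conductivity kappa = L T sigma, in W / (m K)."""
    return lorenz_number() * temperature * sigma


# Assumed example: sigma = 6e7 S/m at T = 300 K (illustrative, not from the text).
print(f"Lorenz number: {lorenz_number():.4e} W Ohm / K^2")
print(f"kappa        : {thermal_conductivity(6.0e7, 300.0):.1f} W / (m K)")
```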

9.5 Superconductivity

So far in this lecture we have considered materials as a many-body “mix” of electrons and ions. The key approximations that we have made are

• the Born-Oppenheimer approximation, which allowed us to separate electronic and ionic degrees of freedom and to concentrate on the solution of the electronic many-body Hamiltonian with the nuclei only entering parametrically.

• the independent-electron approximation, which greatly simplifies the theoretical description and facilitates our understanding of matter. For density-functional theory we have shown in Chapter 3 that the independent-electron approximation is always possible for the many-electron ground state.

• the independent-quasiparticle picture, which can be seen as a simple extension of the independent-electron approximation (e.g. photoemission, Fermi liquid theory for metals, etc.).

We have also considered cases in which these approximations break down. The treatment of electron-phonon (or electron-nuclei) coupling, for example, requires a more sophisticated approach, as we saw. Also, most magnetic orderings are many-body phenomena. In Chapter 8 we studied, for example, how spin coupling is introduced through the many-body form of the wave function.

In 1911 H. Kamerlingh Onnes discovered superconductivity in Leiden. He observed that the electrical resistance of various metals such as mercury, lead and tin disappeared completely at low enough temperatures (see Fig. 9.2). Instead of dropping off as $T^5$ to a finite resistivity that is given by impurity scattering, as in normal metals, the resistivity of superconductors drops off sharply to zero at a critical temperature $T_c$. The record for the longest-lived persistent current in a material is now 2 1/2 years. However, according to the theories for ordinary metals developed in this lecture, perfect conductivity should not exist at $T > 0$. Moreover, superconductivity is not an isolated phenomenon. Figure 9.3 shows the elements in the periodic table that exhibit superconductivity. Their number is surprisingly large; much larger, in fact, than the number of ferromagnetic elements in Fig. 8.1. However, compared to the ferromagnetic transition temperatures discussed in the previous Chapter, typical critical temperatures for elemental superconductors are very low (see Tab. 9.1).

Figure 9.2: The resistivity of a normal metal (left) and a superconductor (right) as a function of temperature.

Figure 9.3: Superconducting elements in the periodic table. From Center for Integrated Research & Learning.

element    Hg (α)    Sn      Pb      Nb
Tc / K     4.15      3.72    7.19    9.26

Table 9.1: Critical temperatures in Kelvin for selected elemental superconductors.

The phenomenon of superconductivity eluded a proper understanding and theoretical description for many decades; such a description did not emerge until the 1950s and 1960s. Then in 1986 the picture that had been developed for “classical” superconductors was turned on its head by the discovery of high temperature superconductivity by Bednorz and Müller. This led to a sudden jump in achievable critical temperatures, as Fig. 9.4 illustrates. The record now lies upwards of 150 Kelvin, but although this is significantly higher than what has so far been achieved with “conventional” superconductors (red symbols in Fig. 9.4), the dream of room temperature superconductivity still remains elusive. What also remains elusive is the mechanism that leads to high temperature superconductivity. In this Chapter we will therefore focus on “classical” or “conventional” superconductivity and recap the most important historic steps that led to the formulation of the BCS (Bardeen, Cooper, and Schrieffer) theory of superconductivity. Suggested reading for this Chapter are the books by A. S. Alexandrov, Theory of Superconductivity – From Weak to Strong Coupling, M. Tinkham, Introduction to Superconductivity, and D. L. Goodstein, States of Matter.

Figure 9.4: Timeline of superconducting materials and their critical temperature. Red symbols mark conventional superconductors and blue symbols high temperature superconductors. From Wikipedia.

9.6 Meissner effect

Perfect conductivity is certainly the first hallmark of superconductivity. The next hallmark was discovered in 1933 by Meissner and Ochsenfeld. They observed that in the superconducting state the material behaves like a perfect diamagnet. Not only can an external field not enter the superconductor, which could still be explained by perfect conductivity, but also a field that was originally present in the normal state is expelled in the superconducting state (see Fig. 9.5). Expressed in the language of the previous Chapter this implies

$$\mathbf{B} = \mu_0 \mu \mathbf{H} = \mu_0 (1 + \chi) \mathbf{H} = 0 \tag{9.54}$$

and therefore $\chi = -1$, which is not only diamagnetic, but also very large. The expulsion of a magnetic field cannot be explained by perfect conductivity, which would tend to trap flux. The existence of such a reversible Meissner effect implies that superconductivity will be destroyed by a critical magnetic field $H_c$. The phase diagram of such a superconductor (denoted type I, as we will see in the following) is shown in Fig. 9.6.


Figure 9.5: Diagram of the Meissner effect. Magnetic field lines, represented as arrows,are excluded from a superconductor when it is below its critical temperature.

To understand this odd behavior we first consider a perfect conductor. However, we assume that only a certain portion $n_s$ of all electrons conduct perfectly. All others conduct dissipatively. In an electric field $\mathbf{E}$ the electrons will be accelerated

$$m \frac{d\mathbf{v}_s}{dt} = -e\mathbf{E} \tag{9.55}$$

and the current density is given by $\mathbf{j} = -e \mathbf{v}_s n_s$. Inserting this into eq. 9.55 gives

$$\frac{d\mathbf{j}}{dt} = \frac{n_s e^2}{m} \mathbf{E} . \tag{9.56}$$

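Substituting this into Faraday’s law of induction

$$\nabla \times \mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{B}}{\partial t} \tag{9.57}$$

we obtain

$$\frac{\partial}{\partial t} \left( \nabla \times \mathbf{j} + \frac{n_s e^2}{mc} \mathbf{B} \right) = 0 . \tag{9.58}$$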

Together with Maxwell’s equation

$$\nabla \times \mathbf{B} = \frac{4\pi}{c} \mathbf{j} \tag{9.59}$$

this determines the fields and currents that can exist in the superconductor. Eq. 9.58 tells us that any change in the magnetic field $\mathbf{B}$ will be screened immediately. This is consistent with Meissner’s and Ochsenfeld’s observation. However, there are static currents and magnetic fields that satisfy the two equations. This is inconsistent with the observation that fields are expelled from the superconducting region. Therefore perfect conductivity alone does not explain the Meissner effect.

Figure 9.6: The phase diagram of a type I superconductor as a function of temperature and the magnetic field H.

To solve this conundrum, F. and H. London proposed to restrict the solutions for superconductors to fields and currents that satisfy

$$\nabla \times \mathbf{j} + \frac{n_s e^2}{mc} \mathbf{B} = 0 . \tag{9.60}$$

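This equation is known as the London equation, and inserting it into Maxwell’s equation yields

$$\nabla \times (\nabla \times \mathbf{B}) + \frac{4\pi n_s e^2}{mc^2} \mathbf{B} = 0 . \tag{9.61}$$

Using the relation $\nabla \times (\nabla \times \mathbf{B}) = \nabla (\nabla \cdot \mathbf{B}) - \nabla^2 \mathbf{B}$ and applying Gauss’ law $\nabla \cdot \mathbf{B} = 0$ we obtain

$$\nabla^2 \mathbf{B} = \frac{4\pi n_s e^2}{mc^2} \mathbf{B} . \tag{9.62}$$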

For the solution of this differential equation we consider a half-infinite superconductor ($x > 0$). The magnetic field then decays exponentially

$$B(x) = B(0)\, e^{-x/\lambda_L} , \tag{9.63}$$

where the London penetration depth is given by

$$\lambda_L = \sqrt{\frac{mc^2}{4\pi n_s e^2}} = 41.9 \left( \frac{r_s}{a_0} \right)^{3/2} \left( \frac{n}{n_s} \right)^{1/2} \text{Å} . \tag{9.64}$$

The magnetic field at the border between a normal material and a superconductor is shown schematically in Fig. 9.7. Inside the superconductor the field is expelled by eddy currents at its surface. The field subsequently decays exponentially fast. The decay length is given by the London penetration depth, which for typical superconductors is of the order of $10^2$ to $10^3$ Å. Although London’s explanation phenomenologically explains the Meissner effect, it raises the question of whether we can create a microscopic theory (i.e. a version of London’s equation) that goes beyond standard electrodynamics, and of how we can create a phase transition in the electronic structure that leads to the absence of all scattering.

Figure 9.7: The magnetic field inside a type I superconductor drops off exponentially fast from its surface.
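The numerical prefactor in Eq. (9.64) and the quoted order of magnitude can be reproduced with a few lines of code. The sketch below uses the SI form of the London penetration depth, $\lambda_L = \sqrt{m / (\mu_0 n_s e^2)}$, which is equivalent to the Gaussian expression in the text; the superfluid density in the second example is an assumed, illustrative value.

```python
import math

M_E = 9.1093837015e-31      # electron mass in kg
E_CHARGE = 1.602176634e-19  # elementary charge in C
MU_0 = 1.25663706212e-6     # vacuum permeability in N / A^2
A_0 = 5.29177210903e-11     # Bohr radius in m


def london_depth(n_s):
    """London penetration depth in metres for a superfluid density n_s in m^-3 (SI form of Eq. 9.64)."""
    return math.sqrt(M_E / (MU_0 * n_s * E_CHARGE ** 2))


def field_profile(b_surface, x, n_s):
    """Eq. (9.63): exponentially decaying field at depth x (in metres) below the surface."""
    return b_surface * math.exp(-x / london_depth(n_s))


# Check of the prefactor: r_s = a_0 and n_s = n, i.e. n = 3 / (4 pi a_0^3).
n_unit = 3.0 / (4.0 * math.pi * A_0 ** 3)
print(f"lambda_L(r_s = a_0) = {london_depth(n_unit) * 1e10:.1f} Angstrom")

# Assumed example density (illustrative only): n_s = 1e28 m^-3.
print(f"lambda_L(n_s = 1e28 m^-3) = {london_depth(1.0e28) * 1e10:.0f} Angstrom")
```

Because $\lambda_L \propto n_s^{-1/2}$, a lower density of superconducting electrons lets the field penetrate deeper into the material.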

9.7 London theory

F. London first noticed that a degenerate Bose-Einstein gas provides a good basis for a phenomenological model of superconductivity. F. and H. London then proposed a quantum mechanical analog of London’s electrodynamical derivation shown in the previous section. They showed that the Meissner effect could be interpreted very simply by a peculiar coupling in momentum space, as if there was something like a condensed phase of the Bose gas. The idea to replace Fermi statistics by Bose statistics in the theory of metals led them to an equation for the current, which turned out to be microscopically exact. The quantum mechanical current density is given by

$$\mathbf{j}(\mathbf{r}) = \frac{ie}{2m} \left( \psi^* \nabla \psi - \psi \nabla \psi^* \right) - \frac{e^2}{m} \mathbf{A}\, \psi^* \psi \tag{9.65}$$

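where $\mathbf{A} \equiv \mathbf{A}(\mathbf{r})$ is a vector potential such that $\nabla \times \mathbf{A} = \mathbf{B}$. One should sum this expression over all electron states below the Fermi level of the normal metal. If the particles governed by the wave functions $\psi$ in this expression were electrons we would only observe weak Landau diamagnetism (see previous Chapter). So let us instead assume bosons with charge $e^*$ and mass $m^*$. They satisfy their Schrödinger equation

$$-\frac{1}{2m^*} \left( \nabla + i e^* \mathbf{A} \right)^2 \phi(\mathbf{r}) = E \phi(\mathbf{r}) . \tag{9.66}$$

Choosing the Maxwell gauge for the vector field ($\nabla \cdot \mathbf{A} = 0$) we obtain

$$\left( -\frac{1}{2m^*} \nabla^2 - \frac{i e^*}{m^*} \mathbf{A} \cdot \nabla + \frac{e^{*2}}{2m^*} \mathbf{A}^2 \right) \phi(\mathbf{r}) = E \phi(\mathbf{r}) . \tag{9.67}$$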


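Now we assume $\mathbf{A}$ to be small so that we can apply perturbation theory

$$\phi(\mathbf{r}) = \phi_0(\mathbf{r}) + \phi_1(\mathbf{r}) , \tag{9.68}$$

where $\phi_0(\mathbf{r})$ is the ground state wave function in the absence of the field and $\phi_1(\mathbf{r})$ the perturbation introduced by $\mathbf{A}$. We know from the theory of Bose condensates that at low temperatures the condensate wave function is a constant

$$\phi_0(\mathbf{r}) = \frac{1}{\sqrt{V}}, \quad \mathbf{k} = 0, \quad E_0 = 0, \tag{9.69}$$

where $V$ is the volume of the condensate. Next we insert eq. 9.68 into eq. 9.67 and expand to first order in $\mathbf{A}$ and $\phi_1$

$$-\frac{1}{2m^*} \nabla^2 \phi_0(\mathbf{r}) - \frac{1}{2m^*} \nabla^2 \phi_1(\mathbf{r}) - \frac{i e^*}{m^*} \mathbf{A} \cdot \nabla \phi_0(\mathbf{r}) = E_0 \phi_1(\mathbf{r}) + E_1 \phi_0(\mathbf{r}) . \tag{9.70}$$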

The first and third terms on the left hand side of this equation and the first term on the right hand side are zero. Since we are working to first order in $\mathbf{A}$, $E_1$ has to be proportional to $\mathbf{A}$, and because $E_1$ is a scalar our only choice is for it to be proportional to $\nabla \cdot \mathbf{A}$. This means it vanishes in the Maxwell gauge and $\phi_1(\mathbf{r}) = 0$ to first order in $\mathbf{A}$. Inserting $\phi(\mathbf{r}) = \phi_0(\mathbf{r})$ into the equation for the current density (eq. 9.65), the term in brackets evaluates to zero and we are left with

$$\mathbf{j}(\mathbf{r}) = -\frac{e^{*2}}{2m^*} \frac{N_s}{V} \mathbf{A} = -\frac{e^{*2} n_s}{2m^*} \mathbf{A} . \tag{9.71}$$

With $\nabla \times \mathbf{A} = \mathbf{B}$ and Maxwell’s equation $\nabla \times \mathbf{B} = \frac{4\pi}{c} \mathbf{j}$ we finally arrive at

$$\mathbf{B} + \lambda_L^2\, \nabla \times (\nabla \times \mathbf{B}) = 0, \tag{9.72}$$

which is identical to the London equation derived from electrodynamics in the previous section (cf. eq. 9.61). We now have a quantum mechanical derivation of the London equation and the London penetration depth, and we have identified the carriers in the superconducting state to be bosons. However, this raises the question of where these bosons come from.

9.8 Flux quantization

If the superconducting state is a condensate of bosons, as proposed by London, then its wave function is of the form

$$\phi(\mathbf{r}) = \sqrt{n_s}\, e^{i\varphi(\mathbf{r})} . \tag{9.73}$$

$n_s$ is the superconducting density that we introduced in section 9.6 and which we assume to be constant throughout the superconducting region. $\varphi(\mathbf{r})$ is the phase of the wave function, which is as yet undetermined. To pin down the phase let us consider a hole in our superconductor (cf. Fig. 9.8) or alternatively a superconducting ring. The magnetic flux through this hole is

$$\Phi_B = \int_{\text{surf}} d\mathbf{s} \cdot \mathbf{B} = \int_{\text{surf}} d\mathbf{s} \cdot \nabla \times \mathbf{A}(\mathbf{r}) = \oint_C d\mathbf{l} \cdot \mathbf{A}(\mathbf{r}) , \tag{9.74}$$

where we have used Stokes’ theorem to convert the surface integral into the line integral over the contour $C$. But according to London’s theory the magnetic field inside the superconductor falls off exponentially, so that we can find a contour on which $\mathbf{B}$ and $\mathbf{j}$ are zero. Using the equation for the current density (eq. 9.65) this yields

$$0 = \mathbf{j}(\mathbf{r}) = \frac{i e^*}{2m^*} \left( \psi^* \nabla \psi - \psi \nabla \psi^* \right) - \frac{e^{*2}}{m^*} \mathbf{A}\, \psi^* \psi . \tag{9.75}$$

Figure 9.8: Flux trapped in a hole in a superconductor is quantized.

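Inserting our wave function (eq. 9.73) for the condensate gives the following equation for the phase

$$\frac{i e^*}{2m^*} \left( i \nabla\varphi\, \phi^* \phi + i \nabla\varphi\, \phi \phi^* \right) - \frac{e^{*2}}{m^*} \mathbf{A}\, \phi^* \phi = 0 \tag{9.76}$$

or

$$-\frac{e^*}{m^*} \phi^* \phi \left( \nabla\varphi + e^* \mathbf{A} \right) = 0 . \tag{9.77}$$

This implies that there is a relation between the phase and the vector potential

$$\mathbf{A}(\mathbf{r}) = -\frac{\nabla\varphi(\mathbf{r})}{e^*} . \tag{9.78}$$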

Inserting this result into our expression for the magnetic flux (eq. 9.74) we finally obtain

$$\Phi_B = -\oint_C d\mathbf{l} \cdot \frac{\nabla\varphi}{e^*} = \frac{\delta\varphi}{e^*} . \tag{9.79}$$

Here $\delta\varphi$ is the change of the phase in the round trip along the contour, which is only determined modulo $2\pi$: $\delta\varphi = 2\pi p$ with $p = 0, 1, 2, \ldots$ This means the magnetic flux is quantized

$$\Phi_B = \frac{2\pi\hbar c}{e^*}\, p . \tag{9.80}$$

Experimentally this was confirmed by Deaver and Fairbank in 1961, who found $e^* = 2e$!
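The size of a single flux quantum follows directly from Eq. (9.80). The sketch below evaluates it in SI units, where the Gaussian expression $2\pi\hbar c / e^*$ becomes $h / e^*$; the ring radius and field in the usage example are assumed values for illustration only.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)


def flux_quantum(charge_multiple=2):
    """Flux quantum h / e* in Wb for carriers of charge e* = charge_multiple * e."""
    return H_PLANCK / (charge_multiple * E_CHARGE)


def quanta_through_ring(b_field, radius, charge_multiple=2):
    """Number of flux quanta corresponding to a field b_field (T) through a ring of given radius (m)."""
    return b_field * math.pi * radius ** 2 / flux_quantum(charge_multiple)


print(f"h / 2e = {flux_quantum(2):.6e} Wb")
print(f"h / e  = {flux_quantum(1):.6e} Wb")

# Assumed example: a 10 micrometre radius ring in a field of 1e-5 T.
print(f"quanta in ring: {quanta_through_ring(1.0e-5, 1.0e-5):.2f}")
```

The factor of two between the two printed flux quanta is what allowed the experiment to distinguish $e^* = 2e$ from $e^* = e$.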


9.9 Ogg’s pairs

Recapping at this point, we have learned that the superconducting state is made up of bosons with twice the electron charge. This led Richard Ogg Jr. to propose in 1946 that electrons might pair in real space. As for the hydrogen molecule discussed in section 8.4.4.1, the two electrons could pair chemically and form a boson with spin S=0 or 1. Ogg suggested that an ensemble of such two-electron entities could, in principle, form a superconducting Bose-Einstein condensate. The idea was motivated by his demonstration that electron pairs were a stable constituent of fairly dilute solutions of alkali metals in liquid ammonia. The theory was further developed by Schafroth, Butler and Blatt, but did not produce a quantitative description of superconductivity (e.g. the predicted $T_c$ of $\sim 10^4$ K is much too high). The theory could also not provide a microscopic force to explain the pairing of normally repulsive electrons. It was therefore concluded that electron pairing in real space does not work and the theory was forgotten.

9.10 Microscopic theory - BCS theory

9.10.1 Cooper pairs

The basic idea behind the weak attraction between two electrons was presented by Cooper in 1956. He showed that the Fermi sea of electrons is unstable against the formation of at least one bound pair, regardless of how weak the interaction is, so long as it is attractive. This result is a consequence of Fermi statistics and of the existence of the Fermi-sea background, since it is well known that binding does not ordinarily occur in the two-body problem in three dimensions until the strength of the potential exceeds a finite threshold value.

To see how the binding comes about, we consider the model of two quasiparticles with momentum $\mathbf{k}$ and $-\mathbf{k}$ in a Fermi liquid (e.g. free-electron metal). The spatial part of the two-particle wave function of this pair is then

$$\psi_0(\mathbf{r}_1, \mathbf{r}_2) = \sum_{\mathbf{k}} g_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{r}_1} e^{-i\mathbf{k}\cdot\mathbf{r}_2} . \tag{9.81}$$

For the spin part we assume singlet pairing $|\chi_{\text{spin}}\rangle_S = 2^{-1/2} \left( |\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle \right)$ (see also section 8.4.4.1), which gives us S=0, i.e. a boson. The triplet with S=1 would also be possible, but is of higher energy for conventional superconductors (see e.g. unconventional superconductivity). $|\chi_{\text{spin}}\rangle_S$ is antisymmetric with respect to particle exchange and $\psi_0(\mathbf{r}_1, \mathbf{r}_2)$ therefore has to be symmetric:

$$\psi_0(\mathbf{r}_1, \mathbf{r}_2) = \sum_{k > k_F} g_{\mathbf{k}} \cos \mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2) . \tag{9.82}$$

If we insert this wave function into our two-particle Schrödinger equation

$$\left[ \sum_{i=1,2} \frac{\mathbf{p}_i^2}{2m} + V(\mathbf{r}_1, \mathbf{r}_2) \right] \psi_0 = E \psi_0 \tag{9.83}$$

we obtain

$$(E - 2\epsilon_{\mathbf{k}})\, g_{\mathbf{k}} = \sum_{k' > k_F} V_{\mathbf{k}\mathbf{k}'}\, g_{\mathbf{k}'} . \tag{9.84}$$


$V_{\mathbf{k}\mathbf{k}'}$ is the characteristic strength of the scattering potential and is given by the Fourier transform of the potential $V$

$$V_{\mathbf{k}\mathbf{k}'} = \frac{1}{\Omega} \int d\mathbf{r}\, V(\mathbf{r})\, e^{-i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r}} \tag{9.85}$$

with $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ and $\Omega$ the normalization volume. The energies $\epsilon_{\mathbf{k}}$ in eq. 9.84 are the unperturbed plane-wave energies and are larger than $\epsilon_F$, because the sum runs over $k > k_F$. If a set of $g_{\mathbf{k}}$ exists such that $E < 2\epsilon_F$, then we would have a bound state of two electrons.

But how can this be? Recall that the Fourier transform of the bare Coulomb potential (i.e. free electrons) is

$$V_{\mathbf{k}\mathbf{k}'} = V_{\mathbf{q} = \mathbf{k} - \mathbf{k}'} = \frac{4\pi}{q^2} > 0 . \tag{9.86}$$

This is always larger than zero and therefore not attractive, as expected. However, quasiparticles are not free electrons and are screened by the electron sea. Let us consider Thomas-Fermi screening introduced in Chapter 3. The dielectric function becomes

$$\varepsilon(k) = \frac{k^2 + k_0^2}{k^2} \neq 1 . \tag{9.87}$$

The bare Coulomb potential is screened by $\varepsilon$

$$V(\mathbf{r}_1 - \mathbf{r}_2) = \frac{\varepsilon^{-1}(\mathbf{r}_1 - \mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|} . \tag{9.88}$$

Its Fourier transform

$$V_{\mathbf{q}} = \frac{4\pi}{q^2 + k_0^2} \tag{9.89}$$

is still positive, but now reduced compared to the bare Coulomb interaction. Building on the idea of screening we should also consider the ion cores. They are positively charged and also move to screen the negative charge of our electron pair. However, their “reaction speed” is much lower than that of the electrons and of the order of typical phonon frequencies. A rough estimate (see e.g. Ashcroft Chapter 26) to further screen the Thomas-Fermi expression could look like

$$V_{\mathbf{q}} = \frac{4\pi}{q^2 + k_0^2} \left( 1 + \frac{\omega_{\mathbf{q}}^2}{\omega^2 - \omega_{\mathbf{q}}^2} \right) , \tag{9.90}$$

where $\omega_{\mathbf{q}}$ is a phonon frequency of wave vector $\mathbf{q}$ and $\omega = \frac{1}{\hbar} (\epsilon_{\mathbf{k}} - \epsilon_{\mathbf{k}'})$. If now $\omega < \omega_{\mathbf{q}}$, this (oversimplified) potential would be negative and therefore attractive. In other words, phonons could provide our attractive interaction, which is schematically depicted in Fig. 9.9.

At normal temperatures, the positive ions in a superconducting material vibrate on the spot and constantly collide with the electron bath around them. It is these collisions that cause the electrical resistance that wastes energy and produces heat in any normal circuit. But the cooler the material gets, the less energy the ions have, so the less they vibrate. When the material reaches its critical temperature, the ions’ vibrations are incredibly weak and no longer the dominant form of motion in the lattice. The tiny attractive force of passing electrons that has always been there is suddenly enough to drag the positive ions out of position towards them. And that dragging affects the behaviour of the solid as a whole. When positive ions are drawn towards a passing electron, they create an area that is more positive than their surroundings, so another nearby electron is drawn towards them. However, those electrons are on the move, so by the time the second one has arrived the first one has moved on and created a path of higher positivity that the second electron keeps on following. The electrons are hitched in a game of catch-up that lasts as long as the temperature stays low.

Figure 9.9: Cooper pair formation schematically: at extremely low temperatures, an electron can draw the positive ions in a superconductor towards it. This movement of the ions creates a more positive region that attracts another electron to the area.

To proceed, Cooper then further simplified the potential

$$V_{\mathbf{k}\mathbf{k}'} = \begin{cases} -V & \text{for } |\epsilon_{\mathbf{k}} - \epsilon_F|,\, |\epsilon_{\mathbf{k}'} - \epsilon_F| < \hbar\omega_c \\ 0 & \text{otherwise} \end{cases} \tag{9.91}$$

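where $\omega_c$ is the Debye frequency. Now $V$ pulls out of the sum on the right-hand side of eq. 9.84 and we can rearrange for $g_{\mathbf{k}}$:

$$g_{\mathbf{k}} = V\, \frac{\sum_{\mathbf{k}'} g_{\mathbf{k}'}}{2\epsilon_{\mathbf{k}} - E} . \tag{9.92}$$

Applying $\sum_{\mathbf{k}}$ on both sides and canceling $\sum_{\mathbf{k}} g_{\mathbf{k}}$ we obtain

$$\frac{1}{V} = \sum_{k > k_F} (2\epsilon_{\mathbf{k}} - E)^{-1} . \tag{9.93}$$

Replacing the sum by an integral yields

$$\frac{1}{V} = \int_{\epsilon_F}^{\epsilon_F + \hbar\omega_c} d\epsilon\, N(\epsilon)\, \frac{1}{2\epsilon - E} \approx \frac{1}{2} N(\epsilon_F) \ln \frac{2\epsilon_F - E + 2\hbar\omega_c}{2\epsilon_F - E} \tag{9.94}$$

where we have made the assumption $N(\epsilon) \approx N(\epsilon_F)$. Solving this expression for $E$ we obtain

$$E \left( e^{-\frac{2}{NV}} - 1 \right) = 2\epsilon_F \left( e^{-\frac{2}{NV}} - 1 \right) + 2\hbar\omega_c\, e^{-\frac{2}{NV}} . \tag{9.95}$$

Now we make the “weak coupling” approximation $N(\epsilon_F) V \ll 1$ such that $1 - e^{-\frac{2}{NV}} \approx 1$ to arrive at the final expression

$$E \approx 2\epsilon_F - 2\hbar\omega_c\, e^{-\frac{2}{NV}} < 2\epsilon_F . \tag{9.96}$$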

We see that we indeed find a bound state, in which the binding outweighs the gain in kinetic energy for states above $\epsilon_F$, regardless of how small $V$ is. It is important to note that the binding energy is not analytic at $V = 0$, i.e. it cannot be expanded in powers of $V$. This means that perturbation theory is not applicable, which slowed down the development of the theory significantly. With a view to the Ogg pairs discussed in section 9.9 we note that the Cooper pair is bound in reciprocal space:

\begin{align}
\psi_0 &= \sum_{k > k_F} g_{\mathbf{k}} \cos \mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2) \tag{9.97} \\
&\sim \sum_{k > k_F} \frac{V}{2(\epsilon_{\mathbf{k}} - \epsilon_F) + 2\epsilon_F - E} \cos \mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2) \tag{9.98}
\end{align}

The weighting factor assumes its maximum value for electrons at the Fermi level and falls off from there. Thus the electron states within a range $2\epsilon_F - E$ above $\epsilon_F$ are those most strongly involved in forming the bound states.
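The non-analytic dependence of the binding energy on the coupling can be made concrete numerically. The following sketch compares the binding energy $2\epsilon_F - E$ obtained by solving Eq. (9.95) exactly with the weak-coupling result of Eq. (9.96). Energies are measured in units of $\hbar\omega_c$, and the sampled values of the dimensionless coupling $N(\epsilon_F)V$ are assumptions chosen for illustration.

```python
import math


def binding_energy_exact(nv, hbar_omega_c=1.0):
    """Binding energy 2*eps_F - E from solving Eq. (9.95) for E."""
    x = 2.0 / nv
    # 2 hw_c exp(-x) / (1 - exp(-x)); -expm1(-x) evaluates 1 - exp(-x) accurately.
    return 2.0 * hbar_omega_c * math.exp(-x) / (-math.expm1(-x))


def binding_energy_weak(nv, hbar_omega_c=1.0):
    """Weak-coupling limit, Eq. (9.96): 2 hw_c exp(-2 / (N V))."""
    return 2.0 * hbar_omega_c * math.exp(-2.0 / nv)


# Assumed sample couplings N(eps_F) V (dimensionless).
for nv in (0.1, 0.2, 0.3, 0.5, 1.0):
    exact = binding_energy_exact(nv)
    weak = binding_energy_weak(nv)
    print(f"NV = {nv:3.1f}  exact = {exact:.4e}  weak coupling = {weak:.4e}")
```

The binding energy stays positive for every coupling but vanishes faster than any power of $V$ as $V \to 0$, which is why no finite order of perturbation theory can capture it.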

9.11 Bardeen-Cooper-Schrieffer (BCS) Theory

Having seen that the Fermi sea is unstable against formation of a bound Cooper pair when the net interaction is attractive, we then expect pairs to condense until an equilibrium point is reached. This will occur when the state of the system is so greatly changed from the Fermi sea due to the large number of pairs that the binding energy for an additional pair has gone to zero. The wave function for such a state appears to be quite complex and it was the ingenuity of Bardeen, Cooper and Schrieffer to develop it. The basic idea is to create the total many-electron wave function from the pair wave functions $\phi$ we discussed in the previous section

$$\Psi = \phi(\mathbf{r}_1\sigma_1, \mathbf{r}_2\sigma_2)\, \phi(\mathbf{r}_3\sigma_3, \mathbf{r}_4\sigma_4) \ldots \phi(\mathbf{r}_{N-1}\sigma_{N-1}, \mathbf{r}_N\sigma_N) \tag{9.99}$$

and then to antisymmetrize ($\Psi_{\text{BCS}} = \mathcal{A}\Psi$). The coefficients of $\Psi_{\text{BCS}}$ could then be found by minimizing the total energy of the many-body Hamiltonian. BCS introduced a simplified pairing Hamiltonian

$$H = \sum_{\mathbf{k},\sigma} \xi_{\mathbf{k}}\, c^\dagger_{\mathbf{k}\sigma} c_{\mathbf{k}\sigma} + \sum_{\mathbf{k}\mathbf{k}'} c^\dagger_{\mathbf{k}\uparrow} c^\dagger_{-\mathbf{k}\downarrow}\, V_{\mathbf{k}\mathbf{k}'}\, c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} , \tag{9.100}$$

with $\xi_{\mathbf{k}} = \epsilon_{\mathbf{k}} - \epsilon_F$ and the usual electron creation and annihilation operators $c^\dagger_{\mathbf{k}\sigma}$ and $c_{\mathbf{k}\sigma}$. $V_{\mathbf{k}\mathbf{k}'}$ again is our attractive interaction as in Eq. (9.91), but it is restricted to $(\mathbf{k}\uparrow, -\mathbf{k}\downarrow)$ pairs. The problem is solvable in principle, but the solution is not very illustrative. Instead, we will consider a mean-field approximation.


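We start with the observation that the characteristic BCS pairing Hamiltonian will lead to a ground state which is some phase-coherent superposition of many-body states with pairs of Bloch states $(\mathbf{k}\uparrow, -\mathbf{k}\downarrow)$ occupied or unoccupied. Because of the coherence, operators such as $c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow}$ can have nonzero expectation values $b_{\mathbf{k}'} = \langle c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} \rangle_{\text{avg}}$ in such a state, rather than averaging to zero as in a normal metal, where the phases are random. Moreover, because of the large number of particles involved, the fluctuations about these expectation values should be small. This suggests that it will be useful to express such a product of operators formally as

$$c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} = b_{\mathbf{k}'} + \left( c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} - b_{\mathbf{k}'} \right) , \tag{9.101}$$

and subsequently neglect quantities which are quadratic in the (presumably small) fluctuation term in parentheses. By making the following mean-field approximation for the potential term

\begin{align}
\sum_{\mathbf{k}'} V_{\mathbf{k}\mathbf{k}'}\, c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} &\approx -V\, \Theta(\hbar\omega_c - |\xi_{\mathbf{k}}|) \sum_{\mathbf{k}'} \Theta(\hbar\omega_c - |\xi_{\mathbf{k}'}|)\, \langle c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} \rangle_{\text{avg}} \tag{9.102} \\
&=: \Delta_{\mathbf{k}} = \begin{cases} \Delta & \text{for } |\epsilon_{\mathbf{k}} - \epsilon_F| < \hbar\omega_c \\ 0 & \text{otherwise} \end{cases} , \tag{9.103}
\end{align}

inserting all into the BCS Hamiltonian Eq. (9.100), and expanding to first order, we write

\begin{align}
H_{\text{MF}} &= \sum_{\mathbf{k},\sigma} \xi_{\mathbf{k}}\, c^\dagger_{\mathbf{k}\sigma} c_{\mathbf{k}\sigma} + \sum_{\mathbf{k}\mathbf{k}'} V_{\mathbf{k}\mathbf{k}'} \left( c^\dagger_{\mathbf{k}\uparrow} c^\dagger_{-\mathbf{k}\downarrow}\, b_{\mathbf{k}'} + b^\dagger_{\mathbf{k}}\, c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} - b^\dagger_{\mathbf{k}} b_{\mathbf{k}'} \right) \tag{9.104} \\
&= \sum_{\mathbf{k},\sigma} \xi_{\mathbf{k}}\, c^\dagger_{\mathbf{k}\sigma} c_{\mathbf{k}\sigma} - \sum_{\mathbf{k}} \left( \Delta_{\mathbf{k}}\, c^\dagger_{\mathbf{k}\uparrow} c^\dagger_{-\mathbf{k}\downarrow} + \Delta^*_{\mathbf{k}}\, c_{-\mathbf{k}\downarrow} c_{\mathbf{k}\uparrow} - \Delta_{\mathbf{k}}\, b^\dagger_{\mathbf{k}} \right) . \tag{9.105}
\end{align}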

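This is an effective single-particle Hamiltonian (two c-operators instead of four), which we can solve. To diagonalize this mean-field Hamiltonian, we define a suitable linear transformation onto a new set of Fermi operators

\begin{align}
c_{\mathbf{k}\uparrow} &= u_{\mathbf{k}} \alpha_{\mathbf{k}} + v_{\mathbf{k}} \beta^\dagger_{\mathbf{k}} \tag{9.106} \\
c_{-\mathbf{k}\downarrow} &= u_{\mathbf{k}} \beta_{\mathbf{k}} - v_{\mathbf{k}} \alpha^\dagger_{\mathbf{k}} , \tag{9.107}
\end{align}

where $\alpha_{\mathbf{k}}$ and $\beta_{\mathbf{k}}$ are two types of new non-interacting Fermions (i.e. quasiparticles). The expansion coefficients are

\begin{align}
u_{\mathbf{k}}^2 &= \frac{1}{2} \left( 1 + \frac{\xi_{\mathbf{k}}}{\epsilon_{\mathbf{k}}} \right) \tag{9.108} \\
v_{\mathbf{k}}^2 &= \frac{1}{2} \left( 1 - \frac{\xi_{\mathbf{k}}}{\epsilon_{\mathbf{k}}} \right) \tag{9.109} \\
u_{\mathbf{k}} v_{\mathbf{k}} &= -\frac{\Delta_{\mathbf{k}}}{2\epsilon_{\mathbf{k}}} \tag{9.110}
\end{align}

and

$$\epsilon_{\mathbf{k}} = \sqrt{\xi_{\mathbf{k}}^2 + |\Delta_{\mathbf{k}}|^2} . \tag{9.111}$$

With this so-called Bogoliubov transformation, the mean-field Hamiltonian becomes diagonal

$$H_{\text{MF}} = E_0 + \sum_{\mathbf{k}} \epsilon_{\mathbf{k}} \left( \alpha^\dagger_{\mathbf{k}} \alpha_{\mathbf{k}} + \beta^\dagger_{\mathbf{k}} \beta_{\mathbf{k}} \right) , \tag{9.112}$$

with

$$E_0 = 2 \sum_{\mathbf{k}} \left( \xi_{\mathbf{k}} v_{\mathbf{k}}^2 + \Delta_{\mathbf{k}} u_{\mathbf{k}} v_{\mathbf{k}} \right) + \frac{|\Delta|^2}{V} . \tag{9.113}$$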

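The order parameter $\Delta$ is given as before by

$$\Delta = -V \sum_{\mathbf{k}'} \Theta(\hbar\omega_c - |\xi_{\mathbf{k}'}|)\, \langle c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} \rangle_{\text{avg}} . \tag{9.114}$$

If we now express $\langle c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow} \rangle_{\text{avg}}$ in terms of $\alpha_{\mathbf{k}}$ and $\beta_{\mathbf{k}}$ as given in Eq. (9.106) and (9.107), we obtain the following relation

$$\Delta = V \sum_{\mathbf{k}'} \frac{\Delta}{\epsilon_{\mathbf{k}'}} \left( 1 - 2 f_{\mathbf{k}'} \right) . \tag{9.115}$$

$f_{\mathbf{k}}$ is the quasiparticle distribution function $f_{\mathbf{k}} = \langle \alpha^\dagger_{\mathbf{k}} \alpha_{\mathbf{k}} \rangle = \langle \beta^\dagger_{\mathbf{k}} \beta_{\mathbf{k}} \rangle$. Unlike in the case of bare electrons, the total average number of quasiparticles is not fixed. Therefore, their chemical potential is zero in thermal equilibrium. Since they do not interact, their distribution is described by the usual Fermi-Dirac function, so that

$$f_{\mathbf{k}} = \frac{1}{e^{\epsilon_{\mathbf{k}} / k_B T} + 1} . \tag{9.116}$$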

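We see that at $T = 0$, $f_{\mathbf{k}} = 0$, so there are no quasiparticles in the ground state. This implies that $E_0$ is the ground state energy of the BCS superconductor at $T = 0$.

To evaluate $E_0$ we need to know $\Delta$. The trivial solution to Eq. (9.114) is $\Delta = 0$. It corresponds to the normal state and gives

$$E_0 = 2 \sum_{\mathbf{k}} \xi_{\mathbf{k}} v_{\mathbf{k}}^2 = 2 \sum_{\mathbf{k}} \xi_{\mathbf{k}}\, \frac{1}{2} \left( 1 - \frac{\xi_{\mathbf{k}}}{\epsilon_{\mathbf{k}}} \right) . \tag{9.117}$$

For $\Delta = 0$, we have $\epsilon_{\mathbf{k}} = |\xi_{\mathbf{k}}|$ and therefore

$$E_0 = 2 \sum_{\xi_{\mathbf{k}} < 0} \xi_{\mathbf{k}} , \tag{9.118}$$

as expected (the term for $|\mathbf{k}| > k_F$ gives zero since $\epsilon_{\mathbf{k}} = \xi_{\mathbf{k}}$). The non-trivial solution is obtained by canceling the $\Delta$ from Eq. (9.115):

$$1 = V \sum_{\mathbf{k}'} \frac{1}{\epsilon_{\mathbf{k}'}} . \tag{9.119}$$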

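Since $\epsilon_{\mathbf{k}}$ still depends on $\Delta$, we solve for it by replacing the momentum summation $\sum_{\mathbf{k}}$ by an energy integration $\int d\epsilon\, N(\epsilon)$ with the density of states $N(\epsilon)$. We again approximate the density of states by its value at the Fermi level, $N(\epsilon) = N(\epsilon_F)$. This yields

$$\frac{1}{N(\epsilon_F) V} = \int_0^{\hbar\omega_c} \frac{d\xi}{\sqrt{\Delta^2 + \xi^2}} = \sinh^{-1} \left( \frac{\hbar\omega_c}{\Delta} \right) . \tag{9.120}$$

Solving for $\Delta$ in the weak coupling limit $N(\epsilon_F) V \ll 1$ finally gives

$$\Delta = \frac{\hbar\omega_c}{\sinh \left( \frac{1}{N(\epsilon_F) V} \right)} \approx 2\hbar\omega_c\, e^{-\frac{1}{N(\epsilon_F) V}} . \tag{9.121}$$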
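As a quick consistency check of Eqs. (9.120) and (9.121), the gap can be evaluated for a few coupling strengths. This is a minimal sketch in units of $\hbar\omega_c$; the sampled values of $N(\epsilon_F)V$ are assumptions for illustration and do not refer to a specific material.

```python
import math


def gap_exact(nv, hbar_omega_c=1.0):
    """Gap from Eq. (9.121): hw_c / sinh(1 / (N V))."""
    return hbar_omega_c / math.sinh(1.0 / nv)


def gap_weak(nv, hbar_omega_c=1.0):
    """Weak-coupling limit of Eq. (9.121): 2 hw_c exp(-1 / (N V))."""
    return 2.0 * hbar_omega_c * math.exp(-1.0 / nv)


def gap_equation_residual(delta, nv, hbar_omega_c=1.0):
    """Right-hand side minus left-hand side of Eq. (9.120)."""
    return math.asinh(hbar_omega_c / delta) - 1.0 / nv


# Assumed sample couplings N(eps_F) V (dimensionless).
for nv in (0.1, 0.2, 0.3, 0.5):
    delta = gap_exact(nv)
    print(f"NV = {nv:3.1f}  Delta = {delta:.4e}  weak = {gap_weak(nv):.4e}  "
          f"residual = {gap_equation_residual(delta, nv):.1e}")
```

The relative error of the weak-coupling formula is of order $e^{-2/N(\epsilon_F)V}$, so it is already negligible for couplings around 0.3.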


Figure 9.10: The dispersion of electron-like excitations in the SC state is gapped and differsfrom that of free electrons (∆ = 0). Taken from Mean-Field Theory: Hartree-Fock andBCS, Erik Koch, http://www.cond-mat.de/events/correl16/manuscripts/koch.pdf

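Inserting everything into the equation for the ground state energy (Eq. (9.113)), we obtain for the condensation energy

\begin{align}
E_c &= E_0(\Delta \neq 0) - E_0(\Delta = 0) \tag{9.122} \\
&\approx 2 \sum_{\xi_{\mathbf{k}} < 0} \xi_{\mathbf{k}} - \frac{1}{2} N(\epsilon_F) \Delta^2 - 2 \sum_{\xi_{\mathbf{k}} < 0} \xi_{\mathbf{k}} \tag{9.123} \\
&= -\frac{1}{2} N(\epsilon_F) \Delta^2 < 0 . \tag{9.124}
\end{align}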

We see that the condensation energy is indeed smaller than zero at $T = 0$ and therefore favors Cooper pair formation. $\Delta = \Delta(T)$ is called the order parameter that determines when superconductivity sets in. The order parameter also affects the dispersion of the quasiparticles. Recall eq. 9.111 ($\epsilon_{\mathbf{k}} = \sqrt{\xi_{\mathbf{k}}^2 + |\Delta_{\mathbf{k}}|^2}$). Away from the Fermi surface, the quasiparticles $\alpha$ and $\beta$ are electrons with up and down spin. In the vicinity of the Fermi surface they are a mixture of both and their energy dispersion is remarkably different from that of the non-interacting electrons and holes (see Fig. 9.10). Most importantly, the dispersion now exhibits an energy gap. The existence of an energy gap in the superconducting state was confirmed by the optical experiments of Glover and Tinkham in 1956/1957 and is seen as the first decisive verification of the BCS theory.

The second strong piece of evidence for the existence of an energy gap is given by the temperature dependence of the heat capacity (shown in Fig. 9.11). Corak and co-workers had determined in 1954 and 1956 that the electronic specific heat in the superconducting state was dominated by an exponential dependence ($e^{-\Delta / k_B T}$), whereas in the normal state it followed the linear dependence on temperature expected for the conduction electrons of an ordinary metal.
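The gapped dispersion and the mixing of electron-like and hole-like character near the Fermi surface can be tabulated directly from Eqs. (9.108), (9.109), and (9.111). The following sketch uses the gap as the energy unit; the sampled values of $\xi$ are arbitrary illustrative choices.

```python
import math


def quasiparticle(xi, delta):
    """Return (energy, u^2, v^2) from Eqs. (9.111), (9.108), and (9.109).

    Requires xi and delta not to vanish simultaneously.
    """
    energy = math.hypot(xi, delta)
    u2 = 0.5 * (1.0 + xi / energy)
    v2 = 0.5 * (1.0 - xi / energy)
    return energy, u2, v2


delta = 1.0  # the gap sets the energy unit (assumed, illustrative)
for xi in (-10.0, -1.0, 0.0, 1.0, 10.0):
    energy, u2, v2 = quasiparticle(xi, delta)
    print(f"xi = {xi:6.1f}  E = {energy:7.3f}  u^2 = {u2:.3f}  v^2 = {v2:.3f}")
```

At $\xi = 0$ the excitation energy equals the gap and both coefficients carry equal weight, while far from the Fermi surface one of them tends to one and the other to zero.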


Figure 9.11: The heat capacity in the superconducting state exhibits an Einstein-likeexponential behavior with temperature and assumes the linear behavior of a metal in thenormal state.

Interestingly, it is the existence of this energy gap that is responsible for the vanishing electrical resistance of a superconductor, as a thought experiment by Landau shows: We consider a fluid of velocity $\mathbf{v}$ that flows in a rough pipe. Friction occurs when, via the interaction with the wall, an elementary excitation of momentum $\mathbf{p}$ and energy $\epsilon(\mathbf{p})$ is created. Landau showed that in order for this process to be possible, energy conservation requires

$$-\mathbf{p} \cdot \mathbf{v} = \epsilon(\mathbf{p}) .$$

Since $\mathbf{p} \cdot \mathbf{v} \geq -|\mathbf{p}||\mathbf{v}|$, the condition can only be met if $|\mathbf{p}||\mathbf{v}| \geq \epsilon(\mathbf{p})$. This is always the case for “normal” dispersions with $\epsilon(\mathbf{p}) \propto p^2$, but for gapped dispersions with $\epsilon(\mathbf{p}) \to \text{const.}$ for $p \to 0$, and also for linear dispersions with $\epsilon(\mathbf{p}) \propto p$, the condition cannot be fulfilled if $v$ becomes smaller than a certain critical velocity $v^*$. For currents slower than $v^*$, the flow is thus dissipationless. While Landau intended to explain the origin of superfluidity with this consideration, it can as well be applied to the case of superconductivity, where one can think of the Cooper pairs as behaving like a condensed fluid that flows through the metal without friction due to the gap in the quasiparticle excitation spectrum. Since the Cooper pairs carry a charge of $-2e$ each, the resistivity of the metal drops to zero.
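Landau's argument amounts to a critical velocity $v^* = \min_p \epsilon(p)/p$. The sketch below estimates this minimum on a momentum grid for three model dispersions. The units ($m = 1$, $p_F = 1$), the gap value, the sound velocity, and the BCS-like form of the gapped dispersion are assumptions made for illustration only.

```python
import math


def critical_velocity(dispersion, p_max=2.0, n=200_000):
    """Grid estimate of v* = min over 0 < p <= p_max of dispersion(p) / p."""
    return min(dispersion(p) / p for p in (p_max * (i + 1) / n for i in range(n)))


DELTA = 0.01  # assumed gap in units of p_F^2 / m
SOUND = 0.3   # assumed sound velocity in units of p_F / m


def free(p):
    """'Normal' quadratic dispersion p^2 / 2m with m = 1."""
    return 0.5 * p * p


def linear(p):
    """Linear (phonon-like) dispersion."""
    return SOUND * p


def gapped(p):
    """BCS-like gapped dispersion around p_F = 1 (assumed model form)."""
    xi = 0.5 * p * p - 0.5
    return math.hypot(xi, DELTA)


for name, eps in (("free", free), ("linear", linear), ("gapped", gapped)):
    print(f"{name:7s} v* = {critical_velocity(eps):.6f}")
```

For the quadratic dispersion the estimate shrinks towards zero as the grid is refined, so any flow can dissipate; for the linear case it equals the sound velocity, and for the gapped case it is approximately $\Delta / p_F$.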

9.12 Outlook

It is a remarkable success of many-body theory that the BCS model with its essentially very simple approximations and considerations was able to explain the peculiar phenomenon of superconductivity. Still, there are superconductors that cannot be described in the language of BCS and that we want to mention briefly.

So far, we considered Cooper pairing that was mediated by weak electron-phonon coupling, where it is sufficient to capture the essential physics by using a simple mean field to include the interaction. In fact, this works well on a qualitative level for many materials, but quantitatively only for a few elemental superconductors. We only mention that there exists a generalization of the ideas leading to the BCS theory that is able to describe a large class of superconductors quantitatively correctly, also in the strong coupling regime of the electron-phonon interaction. This theory was developed by Eliashberg in the early sixties.³

While Eliashberg theory is a considerable improvement over BCS, there are still superconductors that cannot be described in this way. These are called unconventional in the literature, where unconventional means that electron-phonon coupling alone does not seem to be sufficient to provide the pairing mechanism. The most important examples are the Cu-O-based high-temperature superconductors with critical temperatures up to $\sim 100$ K. The record at ambient pressure is currently held by the layered cuprate HgBa$_2$Ca$_2$Cu$_3$O$_8$ with a $T_c$ of 135 K. High-temperature superconductors are already of huge technological importance, as they, e.g., provide the magnetic fields necessary in magnetic resonance imaging devices used in medicine.

There is experimental evidence that the gap function $\Delta_{\mathbf{k}}$ of these materials is explicitly $\mathbf{k}$-dependent with a symmetry that differs from those explained by BCS/Eliashberg theory (d-wave symmetry). The actual pairing mechanism leading to d-wave pairing is still under debate. It could well be that the electron-phonon coupling is complemented by antiferromagnetic spin fluctuations that are able to explain pairing with d-wave symmetry, but a concise picture has not emerged yet and a lot of open questions remain.

³ Cf. this lecture script for an introduction to Eliashberg theory: http://www.cond-mat.de/events/correl13/manuscripts/ummarino.pdf
