
THE STANDARD MODEL

Y. Grossman and Y. Nir

DRAFT as of May 12, 2017


Contents

I Model building

1 Lagrangians
1.1 Scalars
1.2 Fermions
1.3 Fermions and scalars
1.4 Symmetries
1.5 Discrete spacetime symmetries: C, P and T
1.5.1 C and P
1.5.2 CP violation and complex couplings
1.6 Model building

2 Abelian symmetries
2.1 Global discrete symmetries
2.2 Global continuous symmetries
2.3 Charge
2.4 Product groups
2.5 Fermion masses
2.6 Local symmetries
2.6.1 Charge
2.7 Summary
Homework

3 QED
3.1 Defining QED
3.2 The Lagrangian
3.3 The spectrum
3.4 The interactions
3.5 Parameter counting and tests of QED
3.6 QED with more fermions
3.6.1 Two Dirac fermions
3.6.2 Accidental symmetries
3.6.3 Even more fields
Homework

4 Non-Abelian symmetries
4.1 Basics
4.2 Global symmetries
4.3 Local symmetries
4.4 Running coupling constants
Appendices
4.A Noether’s theorem
4.A.1 Free massless scalars
4.A.2 Free massless Dirac fermions
4.A.3 Free massive Dirac fermions
Homework

5 QCD
5.1 Defining QCD
5.2 The Lagrangian
5.3 The spectrum
5.4.1 Confinement
5.5 Accidental symmetries
5.6 Combining QCD with QED
Homework

6 Spontaneous Symmetry Breaking
6.1 Global discrete symmetries
6.2 Global Abelian continuous symmetries
6.3 Global non-Abelian continuous symmetries
6.4 Fermion masses
6.5 Local symmetries: the Higgs mechanism
6.6 Summary
Appendices
6.A The Goldstone Theorem
Homework

7 The Leptonic Standard Model
7.1 Defining the LSM
7.2 The Lagrangian
7.2.1 $\mathcal{L}_{\rm kin}$ and the gauge symmetry
7.2.2 $\mathcal{L}_\psi$
7.2.3 $\mathcal{L}_{\rm Yuk}$
7.2.4 $\mathcal{L}_\phi$ and spontaneous symmetry breaking
7.2.5 Summary
7.3 The Spectrum
7.3.1 Scalars: back to $\mathcal{L}_\phi$
7.3.2 Vector bosons: back to $\mathcal{L}_{\rm kin}(\phi)$
7.3.3 Fermions: back to $\mathcal{L}_{\rm Yuk}$
7.3.4 Summary
7.4.1 The Higgs boson
7.4.2 QED: Electromagnetic interactions
7.4.3 Neutral current weak interactions
7.4.4 Charged current weak interactions
7.4.5 Gauge boson self-interactions
7.4.6 Summary
7.5 Global symmetries and parameters
7.5.1 The interaction basis and the mass basis
7.5.2 The LSM parameters
7.5.4 Discrete symmetries: C, P and CP
7.6 Low Energy Tests of the LSM
7.6.1 CC weak interactions: Quasi-elastic neutrino–electron scattering
7.6.2 NC weak interactions: neutrino–electron scattering
7.6.3 Forward-backward asymmetry
Homework

8 The Standard Model
8.1 Defining the Standard Model
8.2 The Lagrangian
8.2.1 $\mathcal{L}_{\rm kin}$ and the gauge symmetry
8.2.2 $\mathcal{L}_\psi$
8.2.3 $\mathcal{L}_{\rm Yuk}$
8.2.4 $\mathcal{L}_\phi$ and spontaneous symmetry breaking
8.2.5 Summary
8.3 The Spectrum
8.3.1 Bosons
8.3.2 Fermions
8.3.3 Summary
8.4.1 Neutral current weak interactions
8.4.2 Charged current weak interactions
8.4.3 Interactions of the Higgs boson
8.4.4 Summary
8.5 Accidental symmetries and parameter counting
8.5.2 Parameter counting
8.5.3 Parameter counting in the SM
8.5.4 Parametrization of the CKM matrix
8.5.5 The strong CP parameter
8.6 P, C and CP
8.6.1 Unitarity Triangles
Homework

II Particle physics

9 QCD at the IR
9.1 The quark model
9.1.1 Hadron masses
9.1.2 Hadron lifetimes
9.1.3 Hadron quantum numbers
9.2 Combining QCD with the weak interaction
9.2.1 Factorization
9.2.2 The decay constant
9.2.3 Form factors
9.2.4 Lattice QCD
9.3 The approximate symmetries of QCD
9.3.1 The approximate symmetries of QCD: light quarks
9.3.2 The approximate symmetries of QCD: heavy quarks
9.4 High energy QCD
9.4.1 Quark hadron duality
9.4.2 Jets
Appendices
9.A Quark masses
9.B Flavor SU(3)
9.C Names and QN for hadrons
Homework

10 Mixing and CPV
10.1 Neutral meson mixing
10.1.1 Toy model
10.1.2 Flavor oscillations
10.1.3 Time scales
10.2 CP violation
10.2.1 CP violation in decay
10.2.2 CP violation in mixing
10.2.3 CP violation in interference of decays with and without mixing
10.2.4 Indirect CP violation
10.3 SM calculations
10.3.1 $M_{12}$
10.3.2 CP violation in decay: $D \to K^+K^-$
10.3.3 CP violation in mixing: $K \to \ell\nu\pi$
10.3.4 CP violation in interference of decays with and without mixing: $B \to \psi K_S$

III Testing the SM

11 Electroweak Precision Measurements
11.1 The SM beyond tree level
11.2 Electroweak Precision Measurements (EWPM)
11.3 The weak mixing angle, $\theta_W$
11.3.1 θ within the SM
11.4 Custodial symmetry
11.5 Probing new physics
11.5.1 NR operators and the $q^2$ expansion
11.5.2 The four generation SM

12 Flavor physics
12.1 Introduction
12.2 Tree level: The CKM parameters
12.3 FCNC processes
12.3.1 Photon and gluon mediated FCNCs
12.3.2 Z-mediated FCNCs
12.3.3 Higgs-mediated FCNCs
12.3.4 The CKM and $m_q$ dependence of FCNC
12.3.5 FCNC examples
12.4 CP violation
12.4.1 $B \to DK$
12.4.2 $B \to \psi K_S$
12.4.3 $B \to \pi\pi$
12.4.4 $K_L \to \pi\ell\nu$
12.5 Testing the flavor sector
12.5.1 Test of the SM flavor sector
12.5.2 Non-renormalizable terms
Appendices
12.A Extracting $V_{ud}$
12.A.1 Nuclear β decay
12.A.2 Neutron β decay
12.A.3 Pion β decay
12.B Extracting $V_{cb}$
12.B.1 $B \to D$ decays
12.B.2 Inclusive decays
Homework

13 Neutrinos
13.1 Introduction
13.2 The νSM: The SM with d = 5 terms
13.2.1 The neutrino spectrum
13.2.2 The scale of generation of neutrino masses
13.2.3 The neutrino interactions
13.2.4 Accidental symmetries and the lepton mixing parameters
13.3 The NSM: The SM with singlet fermions
13.3.1 Defining the NSM
13.3.2 The NSM Lagrangian
13.3.3 The NSM spectrum
13.3.4 The $N_i$ interactions
13.3.5 The case of mN v: Sterile neutrinos
13.4 Probing neutrino masses
13.4.1 Neutrino oscillations in vacuum
13.4.2 The MSW effect
13.4.3 Non-uniform density
13.4.4 Experimental results
Appendices
13.A Probing neutrino masses
13.A.1 Kinematic tests
13.A.2 Neutrinoless double-beta (0ν2β) decay
Homework

IV Connection to astronomy and cosmology

14 Connection to cosmology
14.1 Baryogenesis
14.1.1 The baryon asymmetry
14.1.2 Sakharov conditions
14.1.3 The suppression of KM baryogenesis
14.1.4 Leptogenesis
14.2 Dark Matter
14.2.1 Observational Evidence
14.2.2 Why not the neutrinos?
14.2.3 The relic density: back of the envelope estimate
14.2.4 Detecting WIMPs
14.2.5 DM@LHC?
14.2.6 Supersymmetry: neutralino dark matter

Appendices

A Lie Groups
A.1 Groups
A.2 Representations
A.3 Lie groups and Lie Algebras
A.4 Roots and Weights
A.5 SU(3)
A.6 Classification and Dynkin diagrams
A.7 Naming representations
A.8 Combining representations
Homework

Part I

Model building


Chapter 1

Lagrangians

Modern physics encodes the basic laws of Nature in the action S, and postulates the principle

of minimal action in its quantum interpretation. In Quantum Field Theory (QFT), the action is

an integral over spacetime of the “Lagrangian density” or Lagrangian, $\mathcal{L}$, for short. For most of

our purposes, it is enough to consider the Lagrangian, rather than the action. In this section we

explain how particle physicists “construct” Lagrangians. Later in the book we discuss how they

determine the numerical values of the parameters that appear in the Lagrangian, and how they

test whether a Lagrangian provides a viable description of Nature.

The QFT equivalent of the generalized coordinates of classical mechanics are the fields. The

action is given by

$$ S = \int d^4x \, \mathcal{L}[\phi_i(x), \partial_\mu\phi_i(x)] , \tag{1.1} $$

where $d^4x = dx^0\,dx^1\,dx^2\,dx^3$ is the integration measure in four-dimensional Minkowski space. The

index $i$ runs from 1 to the number of fields. Here we denote a generic field by φ(x). Later, we use

φ(x) for a scalar field, ψ(x) for a fermion field, and V(x) for a vector field.
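As an illustration of how Eq. (1.1) is used, the following sketch recalls the standard Euler–Lagrange equations that follow from requiring the action to be stationary. It is not part of the original text. We assume that the variations $\delta\phi_i$ vanish on the boundary of the integration region, so that the surface term can be dropped.

```latex
\begin{align}
\delta S &= \int d^4x \left[ \frac{\partial\mathcal{L}}{\partial\phi_i}\,\delta\phi_i
          + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\,\partial_\mu(\delta\phi_i) \right] \\
         &= \int d^4x \left[ \frac{\partial\mathcal{L}}{\partial\phi_i}
          - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)} \right]\delta\phi_i
          \;+\; \text{(surface term, assumed to vanish)} , \\
\delta S = 0 \;&\Longrightarrow\;
  \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}
  - \frac{\partial\mathcal{L}}{\partial\phi_i} = 0 \qquad \text{for each } i .
\end{align}
```

There is one such equation of motion per field, which is why it is usually enough to work with the Lagrangian rather than with the action itself.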

The action S has units of $ML^2T^{-1}$ or, equivalently, of ħ. In a natural unit system, where ħ = 1, S

is taken to be “dimensionless.” Then, in four dimensions, $\mathcal{L}$ has natural dimensions of $L^{-4} = M^4$.

In general, we require the following properties for the Lagrangian:

(i) It is a function of the fields and their derivatives only, so as to ensure translational invariance.

(ii) It depends on the fields taken at one spacetime point xµ only, leading to a local field theory.

(iii) It is real, so that the total probability is conserved.

(iv) It is invariant under the Poincaré group, that is under spacetime translations and Lorentz

transformations.

(v) It is an analytic function in the fields. This is not a general requirement, but it is common to

all field theories that are solved via perturbation theory. In all of these, we expand around


a minimum, and this expansion means that we consider a Lagrangian that is a polynomial

in the fields.

(vi) It is invariant under certain internal symmetry groups. The invariance of S (or of $\mathcal{L}$) is

in correspondence with conserved quantities and reflects basic symmetries of the physical

system.

We often impose two additional requirements:

(vii) Naturalness: Every term in the Lagrangian that is not forbidden by a symmetry should

appear.

(viii) Renormalizability. A renormalizable Lagrangian contains only terms that are of dimension

less than or equal to four in the fields and their derivatives.

The requirement of renormalizability ensures that the Lagrangian contains at most two $\partial_\mu$

operations, and leads to classical equations of motion that are of no higher than second order in derivatives.

If the full theory of Nature is described by QFT, its Lagrangian should indeed be renormalizable.

The theories that we consider, however, and, in particular, the Standard Model, are only

low energy effective theories, valid up to some energy scale Λ. Therefore, we must include also

non-renormalizable terms. These terms have coefficients with inverse mass dimensions, $1/\Lambda^n$,

$n = 1, 2, \ldots$. For most purposes, however, the renormalizable terms constitute the leading terms in

an expansion in E/Λ, where E is the energy scale of the physical processes under study. Therefore,

the renormalizable part of the Lagrangian is a good starting point for our study.
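The statement that non-renormalizable terms come with inverse powers of Λ follows from dimensional analysis. The sketch below works in natural units and assumes the standard canonical kinetic terms for a scalar and a fermion field. The bracket notation $[X]$ for the mass dimension of $X$, the generic operator $\mathcal{O}_d$, and the dimensionless coefficient $c_d$ are introduced here for illustration only.

```latex
\begin{align}
[S] = 0,\quad [d^4x] = -4 \quad &\Longrightarrow\quad [\mathcal{L}] = 4 , \\
[\partial_\mu] = 1,\quad [\partial_\mu\phi\,\partial^\mu\phi] = 4 \quad &\Longrightarrow\quad [\phi] = 1 , \\
[\bar\psi\,\partial\!\!\!/\,\psi] = 4 \quad &\Longrightarrow\quad [\psi] = \tfrac{3}{2} , \\
\mathcal{L} \supset \frac{c_d}{\Lambda^{d-4}}\,\mathcal{O}_d ,\quad [\mathcal{O}_d] = d ,\quad [c_d] = 0
  \quad &\Longrightarrow\quad \text{effects at energy } E \text{ scale roughly as } (E/\Lambda)^{d-4} .
\end{align}
```

Terms with $d \leq 4$ are therefore not suppressed by $E/\Lambda$, while each additional unit of operator dimension costs one more power of $E/\Lambda$.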

Properties (i)-(v) are not the subject of this book. You must be familiar with them from

your QFT course(s). We do, however, deal intensively with the other requirements. Actually, the

most important message that we would like to convey is the following: (Almost) all experimental

data for elementary particles and their interactions can be explained by the standard model of a

spontaneously broken SU(3) × SU(2) × U(1) gauge symmetry.¹

Writing down a specific Lagrangian is the endpoint of the process known as “model building,”

and the starting point for a phenomenological interpretation and experimental testing. In this

book we explain both sides of this modern way of understanding high energy physics.

We next show a few examples of simple Lagrangians.

1.1 Scalars

The Lagrangian for a real scalar field φ is given by

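$$ \mathcal{L}_S = \frac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\,\phi^2 - \frac{\mu}{2\sqrt{2}}\,\phi^3 - \frac{\lambda}{4}\,\phi^4 . \tag{1.2} $$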

¹ Actually, the great hope of the high-energy physics community is to prove this statement wrong, and to find

an even more fundamental theory.


The Lagrangian $\mathcal{L}_S$ of Eq. (1.2) is the most general renormalizable $\mathcal{L}(\phi)$ that can be written, so

it satisfies the naturalness principle. We emphasize the following points:

1. The term with derivatives is called the kinetic term. It is necessary if we want φ to be a

dynamical field, namely to be able to describe propagation in spacetime.

2. We work in the “canonically normalized” basis where the coefficient of the kinetic term is

1/2. (This is true for a real scalar field. For a complex scalar field, the canonically normalized

coefficient of the kinetic term is 1.)

3. We do not write a constant term since it does not enter the equation of motion for φ.

4. In principle we could write a linear term but it is not physical, that is, we can always redefine

the field such that the linear term vanishes.

5. The quadratic term ($\phi^2$) is a mass-squared term.

6. The trilinear ($\phi^3$) and quartic ($\phi^4$) terms describe interactions.

7. Terms with five or more scalar fields ($\phi^n$, $n \geq 5$) are non-renormalizable.
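Points 3 and 4 can be checked with a short calculation. The sketch below applies the standard Euler–Lagrange equation to Eq. (1.2), and then considers a hypothetical linear term $-\kappa\phi$. The coefficient $\kappa$, the shift $c$, and the shifted field $\phi'$ are names introduced here for illustration.

```latex
% Equation of motion for a single real scalar field, from Eq. (1.2)
\begin{align}
\partial_\mu \frac{\partial\mathcal{L}_S}{\partial(\partial_\mu\phi)} - \frac{\partial\mathcal{L}_S}{\partial\phi} = 0
\quad&\Longrightarrow\quad
\partial_\mu\partial^\mu\phi + m^2\phi + \frac{3\mu}{2\sqrt{2}}\,\phi^2 + \lambda\,\phi^3 = 0 .
\end{align}
% A constant added to L_S drops out of both derivatives above (point 3).

% Hypothetical linear term: L_S -> L_S - \kappa\phi, followed by the shift \phi = \phi' + c.
% Collecting the terms linear in \phi' gives
\begin{align}
-\left( \kappa + m^2 c + \frac{3\mu}{2\sqrt{2}}\,c^2 + \lambda\,c^3 \right)\phi' ,
\end{align}
% which vanishes for a suitable real c: for \lambda \neq 0 a real cubic always has a real root.
```

The shift also changes the values of $m^2$ and $\mu$, but it does not generate any new type of term, so the form of Eq. (1.2) is preserved.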

1.2 Fermions

The Lagrangian for a Dirac fermion field ψ is given by

$$ \mathcal{L}_F = i\bar\psi\,\partial\!\!\!/\,\psi - m\,\bar\psi\psi . \tag{1.3} $$

The Lagrangian $\mathcal{L}_F$ of Eq. (1.3) is the most general renormalizable $\mathcal{L}(\psi)$ that can be written, so it

satisfies the naturalness principle. (There is a subtlety involved in this statement. By saying that

the fermion in question is of the Dirac type, we are implicitly imposing a symmetry that forbids

Majorana mass terms. We discuss this issue later.) We treat ψ and $\bar\psi$ as independent fields. The

reason is that a fermion field is complex, and it is more convenient to deal with ψ and $\bar\psi$ than with

Re(ψ) and Im(ψ). We emphasize the following points:

1. The derivative term is the kinetic term. It is necessary if we want ψ to be a dynamical field.

2. We work in the canonically normalized basis where the coefficient of the kinetic term is 1.

3. Terms with an odd number of fermion fields violate Lorentz symmetry, and so they are

forbidden.

4. The quadratic term ($\bar\psi\psi$) is a mass term.

5. Terms with four or more fermion fields are non-renormalizable.
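Two of the points above can be made explicit. The sketch below treats ψ and $\bar\psi$ as independent fields, as stated in the text, and uses the mass dimension $[\psi] = 3/2$ that follows from requiring the kinetic term to have dimension four in natural units. The scale Λ and the dimensionless coefficient $c$ in the last line are introduced here for illustration.

```latex
% Varying with respect to \bar\psi (L_F contains no derivative of \bar\psi):
\begin{align}
\frac{\partial\mathcal{L}_F}{\partial\bar\psi} = 0
\quad&\Longrightarrow\quad (i\partial\!\!\!/ - m)\,\psi = 0 \qquad \text{(the Dirac equation)} .
\end{align}
% Dimension counting for fermion bilinears and four-fermion terms:
\begin{align}
[\psi] = \tfrac{3}{2} \quad&\Longrightarrow\quad [\bar\psi\psi] = 3 ,\quad [m] = 1 ,\quad [(\bar\psi\psi)^2] = 6 , \\
\mathcal{L} &\supset \frac{c}{\Lambda^2}\,(\bar\psi\psi)^2 , \qquad [c] = 0 .
\end{align}
```

A four-fermion term thus needs a coefficient of mass dimension $-2$, which is why it is non-renormalizable.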


1.3 Fermions and scalars

The renormalizable Lagrangian for a single Dirac fermion and a single real scalar field includes, in

addition to the terms written in Eqs. (1.2) and (1.3), the following term:

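$$ \mathcal{L}_{\rm Yuk} = -Y\,\bar\psi\psi\phi . \tag{1.4} $$

Such a term is called a Yukawa interaction and Y is the dimensionless Yukawa coupling. The most

general renormalizable Lagrangian for a real scalar field and a Dirac fermion is thus

$$ \mathcal{L}(\phi,\psi) = \mathcal{L}_S + \mathcal{L}_F + \mathcal{L}_{\rm Yuk} , \tag{1.5} $$

where $\mathcal{L}_S$ is given in Eq. (1.2), $\mathcal{L}_F$ in Eq. (1.3), and $\mathcal{L}_{\rm Yuk}$ in Eq. (1.4).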
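To see why Y is dimensionless, and what Eq. (1.5) looks like when written out, the sketch below combines Eqs. (1.2) to (1.4). Since the text uses the same symbol $m$ for the scalar and the fermion mass, we rename them $m_\phi$ and $m_\psi$ here; these labels are an assumption made for illustration only.

```latex
\begin{align}
[\bar\psi\psi\phi] = \tfrac{3}{2} + \tfrac{3}{2} + 1 = 4 \quad&\Longrightarrow\quad [Y] = 0 , \\
[\bar\psi\psi\phi^2] = 5 \quad&\Longrightarrow\quad \text{non-renormalizable} , \\
\mathcal{L}(\phi,\psi) &= \frac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m_\phi^2}{2}\,\phi^2
  - \frac{\mu}{2\sqrt{2}}\,\phi^3 - \frac{\lambda}{4}\,\phi^4 \nonumber \\
  &\quad + i\bar\psi\,\partial\!\!\!/\,\psi - m_\psi\,\bar\psi\psi - Y\,\bar\psi\psi\phi .
\end{align}
```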

1.4 Symmetries

We always seek deeper reasons for the laws of Nature that have been discovered. In modern

theories, these reasons are closely related to symmetries. The term symmetry refers to an invariance

of the equations that describe a physical system. The fact that a symmetry and an invariance are

related concepts is obvious enough: a smooth ball has a spherical symmetry and its appearance

is invariant under rotation.

Symmetries are built into physics as invariance properties of the Lagrangian. If we construct

our theories to encode various empirical facts and, in particular, the observed conservation laws,

then the equations turn out to exhibit certain invariance properties. For example, if we want the

theory to have energy conservation, then the Lagrangian must be invariant under time translations

(and therefore cannot depend explicitly on time). From this point of view, the conservation law is

the input and the symmetry is the output.
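The link between time-translation invariance and energy conservation can be made explicit in the classical-mechanics analogue. The sketch below is a standard textbook manipulation and is not taken from this chapter. The generalized coordinates $q_i$, the Lagrangian $L(q, \dot q, t)$, and the energy function $H$ are notation assumed here for illustration.

```latex
\begin{align}
H &\equiv \sum_i \frac{\partial L}{\partial\dot q_i}\,\dot q_i - L , \\
\frac{dH}{dt} &= \sum_i \left[ \frac{d}{dt}\!\left(\frac{\partial L}{\partial\dot q_i}\right)
   - \frac{\partial L}{\partial q_i} \right]\dot q_i - \frac{\partial L}{\partial t}
   = -\frac{\partial L}{\partial t} ,
\end{align}
% The bracket vanishes by the Euler-Lagrange equations, so
% \partial L / \partial t = 0  (no explicit time dependence)  implies  dH/dt = 0.
```

Read from left to right, invariance under time translations yields a conserved energy; read from right to left, demanding energy conservation forbids an explicit time dependence of the Lagrangian.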

Conversely, if we take the symmetries to be the fundamental rules, then various observed

features of particles and their interactions are a necessary consequence of the symmetry principle.

In this sense, symmetries provide an explanation of these features. In modern particle physics

(and, in particular, in this book), we often take the latter point of view, in which symmetries are

the input and conservation laws are the output.

In the following we discuss the consequences of imposing a symmetry on a Lagrangian. This

is the starting point of model building in modern particle physics: One defines the basic symmetries

and the particle content, and then obtains the predictions that follow from these imposed

symmetries.

We emphasize here that there are, however, symmetries that are not imposed and are called

accidental symmetries. They are outputs of the theory rather than external constraints. Accidental

symmetries arise due to the fact that we truncate our Lagrangian. In particular, the renormalizable


terms in the Lagrangian often have accidental symmetries that are broken by non-renormalizable

terms or by anomalies. Since we study mostly renormalizable Lagrangians, we will often encounter

accidental symmetries.

There are various types of symmetries. First, we distinguish between spacetime and internal

symmetries. Spacetime symmetries include the Poincaré group of translations, rotations and

boosts. They give us the energy–momentum and angular momentum conservation laws. As mentioned

above, we always impose this symmetry. The list of possible spacetime symmetries includes,

in addition, space inversion (also called parity) P, time reversal T, and charge conjugation C.

(While C is not truly a spacetime symmetry, the way it acts on fermions and the CPT theorem

make it simpler to include C in the same class of operators.) We discuss them briefly in Section 1.5.

Internal symmetries act on the fields, not directly on spacetime. In other words, they act in

mathematical spaces that are generated by the fields. This is the kind of symmetries that we

discuss in detail. In Chapter 2 we introduce Abelian symmetries. In Chapter 4 we introduce

non-Abelian symmetries.

1.5 Discrete spacetime symmetries: C, P and T

The discrete spacetime symmetries, C, P, and T, play an important role in our understanding of

Nature. Each of these three symmetries has been experimentally shown to be violated in Nature,

as discussed in detail further below. The CPT combination seems, however, to be an exact

symmetry of Nature. On the experimental side, no sign of CPT violation has been observed. On

the theoretical side, CPT must be conserved for any Lorentz invariant local field theory. Since we

only consider such theories, we assume that CPT holds. In this case, CP and T are equivalent.

Thus, we usually refer to CP.

1.5.1 C and P

We only consider C and P in theories that involve fermions. Under C, particles and antiparticles

are interchanged by conjugating all internal quantum numbers, e.g., reversing the sign of the

electromagnetic charge, Q → −Q. Under P, the handedness of space is reversed, $\vec{x} \to -\vec{x}$, and

the chirality of fermion fields is reversed, $\psi_L \leftrightarrow \psi_R$. For example, a left-handed (LH) electron $e^-_L$ transforms

under C into a LH positron $e^+_L$, and under P into a right-handed (RH) electron $e^-_R$. (For a formal discussion see,

for example, Section 3.6 of Ref. [11].) The result that is important for our purposes is that C and

P are violated in any chiral theory. This is the case if there is a different number of LH and RH

fields, or if the LH and RH fermion fields transform in a different way under the symmetry group.


1.5.2 CP violation and complex couplings

The CP transformation combines charge conjugation C with parity P. For example, a LH electron

$e^-_L$ transforms under CP into a RH positron, $e^+_R$. CP is a good symmetry if there is a basis where

all the parameters of the Lagrangian are real. We do not prove it here but provide a simple

intuitive explanation of this statement.

Consider a theory with a single scalar, φ, and two sets of N fermions, ψiL and ψiR (i = 1, 2, ..., N).

The Yukawa interactions are given by

−L = YijψLiφψRj + Y ∗ijψRjφ†ψLi, (1.6)

where we write the two hermitian conjugate terms explicitly. The CP transformation of the fields

is defined as follows:

\phi \to \phi^\dagger, \qquad \psi_{Li} \to \overline{\psi_{Li}}, \qquad \psi_{Ri} \to \overline{\psi_{Ri}}. \qquad (1.7)

Therefore, a CP transformation exchanges the operators

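\overline{\psi_{Li}}\,\phi\,\psi_{Rj} \;\overset{CP}{\longleftrightarrow}\; \overline{\psi_{Rj}}\,\phi^\dagger\,\psi_{Li}, \qquad (1.8)

but leaves their coefficients, Y_{ij} and Y^*_{ij}, unchanged. This means that CP is a symmetry of L if

Y_{ij} = Y^*_{ij}.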

In practice, things are more subtle, since one can define the CP transformation in a more

general way than Eq. (1.7):

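\phi \to e^{i\theta}\phi^\dagger, \qquad \psi_{Li} \to e^{i\theta_{Li}}\,\overline{\psi_{Li}}, \qquad \psi_{Ri} \to e^{i\theta_{Ri}}\,\overline{\psi_{Ri}}, \qquad (1.9)

with θ, θ_{Li}, θ_{Ri} convention-dependent phases. Then, there can be complex couplings, yet CP would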

be a good symmetry. The correct statement is that CP is violated if, using all freedom to redefine

the phases of the fields, one cannot find any basis where all couplings are real.
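
As a minimal illustration of this rephasing freedom, take N = 1, so that the Yukawa term contains a single complex coupling Y. Writing ψ_R = exp(−iδ) ψ'_R with δ = arg(Y) leaves the kinetic terms unchanged and turns the coupling of the redefined field into |Y|. The sketch below checks this numerically. It assumes that no other term in the Lagrangian is sensitive to the phase of ψ_R, and the function name and sample value are illustrative.

```python
import cmath

def rephase_yukawa(y):
    """Coupling seen by psi_R' when psi_R = exp(-i*delta) * psi_R', with delta = arg(y)."""
    delta = cmath.phase(y)
    return y * cmath.exp(-1j * delta)

y = 0.3 * cmath.exp(1.1j)      # an arbitrary complex Yukawa coupling (sample value)
y_new = rephase_yukawa(y)

# The rephased coupling is real and equal to |Y| up to rounding.
print(abs(y_new.imag) < 1e-12, abs(y_new.real - abs(y)) < 1e-12)
```

With more fields and more couplings there are, in general, more phases in the couplings than can be absorbed in this way, which is when CP violation becomes possible.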

Below we will encounter the idea of gauge interactions. We note that a theory with only gauge

interactions conserves CP as the coupling constants are real.

1.6 Model building

As stated above, writing a Lagrangian is the endpoint of “model building.” The starting point of

our way to build models is to provide as input the following two ingredients:

(i) The symmetry;

(ii) The transformation properties of the various scalar and fermion fields under the symmetry

operation.2

2 We use the terms “charge,” “quantum numbers,” and “representation” interchangeably for the transformation

properties. We later explain these terms in more detail.


Then we write down the most general renormalizable Lagrangian that depends on the scalar and

fermion fields and is invariant under the symmetry.

A Lagrangian defined in the above way has a finite number of parameters. For a theory with N

parameters, we need to perform N appropriate measurements to fix them. Additional measurements,

from the (N + 1)th onward, then test the theory. Note that in principle we never need to determine

the values of the parameters: we can use the experimental input directly to make predictions. In practice,

however, it is usually useful to consider the first N measurements as those that determine the

values of these parameters.

As we progress through this book, we repeat the process of model building several times. We will see how

QED (the theory of electromagnetic interactions), QCD (the theory of strong interactions), the

LSM (the theory of electroweak interactions among leptons), and the Standard Model itself, can

be understood in this modern way of thinking, starting from a postulate of symmetry principles.


Chapter 2

Abelian symmetries

2.1 Global discrete symmetries

We start with a simple example of imposing an internal discrete symmetry. Consider a real scalar

field φ. The most general Lagrangian we can write is given in Eq. (1.2). We now impose a

symmetry: We demand that L is invariant under φ→ −φ, namely

L(φ) = L(−φ) . (2.1)

L is invariant under this symmetry if µ = 0. Thus, by imposing the symmetry we force µ = 0:

The most general L(φ) that is invariant under φ→ −φ is then

\mathcal{L} = \frac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\,\phi^2 - \frac{\lambda}{4}\,\phi^4 . \qquad (2.2)

What conservation law corresponds to this symmetry? We note that the number of φ-particles

in a system that is described by the Lagrangian of Eq. (2.2) can change, but only by an even

number. Therefore, if we define φ-parity as (−1)^n, where n is the number of φ-particles in the

system, this φ-parity is conserved. If we do not impose the symmetry and µ ≠ 0, then the number

of particles can change by any integer and φ-parity is not conserved.

It is a useful exercise to describe the symmetry in terms of group theory. Here the relevant

group is Z2. It has two elements that we call even (+) and odd (−). The multiplication table is

very simple:

(+) · (+) = (−) · (−) = (+), (+) · (−) = (−) · (+) = (−). (2.3)

When we say that we impose a Z2 symmetry on L, with the Z2 transformation law φ→ −φ, what

we mean is that L belongs to the even representation of Z2, and φ belongs to the odd representation

of Z2. For L to be Z2-even, each term in L must be Z2-even. Since φ is Z2-odd, we can keep only

terms with even powers of φ. Then we can construct the most general L and it is given by Eq.

(2.2). Later on we use the language of group theory to deal with more complicated situations.
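
The selection rule "keep only Z2-even terms" can be phrased as a one-line filter. The sketch below is an illustration only: it considers the non-derivative monomials φ^n with 1 ≤ n ≤ 4 (the renormalizable ones) and keeps those that are unchanged under φ → −φ. The function name is hypothetical.

```python
def z2_even_powers(max_power):
    """Powers n >= 1 for which phi**n is unchanged under phi -> -phi."""
    return [n for n in range(1, max_power + 1) if (-1) ** n == 1]

# Only the mass term (n = 2) and the quartic term (n = 4) survive,
# in agreement with Eq. (2.2); the cubic term is Z2-odd and is removed.
print(z2_even_powers(4))  # [2, 4]
```

The kinetic term is bilinear in φ and is therefore Z2-even as well.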


2.2 Global continuous symmetries

We now extend our discussion to global continuous symmetries. The idea is that we demand that

L is invariant under rotation in some internal space. While (some of) the fields are not invariant

under rotation in that space, the combinations that appear in the Lagrangian are.

The fact that we can rotate between fields should not come as a surprise. This is nothing

but the idea of generalized coordinates. Any linear combination of the fields can be used as our

coordinates, not just the ones we chose to start with.

Our first example is that of a complex scalar field, φ. A complex field has two degrees of

freedom (DoF). There are two useful ways to write down explicitly the two DoF. First, we can use

a Cartesian form:

\phi \equiv \frac{1}{\sqrt{2}}\,(\phi_R + i\phi_I), \qquad (2.4)

with φR and φI real scalar fields. The most general renormalizable L(φR, φI) is given by

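\mathcal{L} = \frac{1}{2}\,\delta_{ij}\,\partial^\mu\phi_i\,\partial_\mu\phi_j - \frac{m^2_{ij}}{2}\,\phi_i\phi_j - \frac{\mu_{ijk}}{2\sqrt{2}}\,\phi_i\phi_j\phi_k - \frac{\lambda_{ijk\ell}}{4}\,\phi_i\phi_j\phi_k\phi_\ell, \qquad i, j, k, \ell = R, I. \qquad (2.5)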

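We impose invariance of the Lagrangian under rotations in the complex plane:

\begin{pmatrix} \phi_R \\ \phi_I \end{pmatrix} \to O \begin{pmatrix} \phi_R \\ \phi_I \end{pmatrix}, \qquad O = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \qquad (2.6)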

where θ is a number that does not depend on x^µ. Imposing that L(φR, φI) is invariant under the

transformation of Eq. (2.6) forbids many terms and relates others:

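\mathcal{L}(\phi_R, \phi_I) = \frac{1}{2}\,\partial_\mu\phi_R\,\partial^\mu\phi_R + \frac{1}{2}\,\partial_\mu\phi_I\,\partial^\mu\phi_I - \frac{m^2}{2}\,(\phi_R\phi_R + \phi_I\phi_I) - \frac{\lambda}{4}\,\left(\phi_R^4 + \phi_I^4 + 2\phi_I^2\phi_R^2\right). \qquad (2.7)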

In the language of group theory, the imposed symmetry – rotations in a two dimensional real plane

– is called SO(2).

Second, we can formulate the transformation law directly in terms of the complex field φ:

\phi \to e^{i\theta}\phi, \qquad \phi^\dagger \to e^{-i\theta}\phi^\dagger. \qquad (2.8)

Imposing that L(φ, φ†) is invariant under the transformation of Eq. (2.8) leads to

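\mathcal{L}(\phi, \phi^\dagger) = \partial_\mu\phi^\dagger\,\partial^\mu\phi - m^2\,\phi^\dagger\phi - \lambda\,(\phi^\dagger\phi)^2. \qquad (2.9)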

In the language of group theory, the imposed symmetry – rotations in a one dimensional complex

plane – is called U(1). (Mathematically, SO(2) and U(1) are isomorphic. The different names

represent the way we think about the underlying space.) It is easy to check that the Lagrangians

in Eqs. (2.7) and (2.9) are equivalent. Eq. (2.9) is, however, more compact. We emphasize the

following points regarding Eq. (2.9):


• All three terms that appear in this equation and, in particular, the mass term, do not

violate any internal symmetry. Thus, there is no way to forbid them by imposing an internal

symmetry.

• We would obtain the same result if we scale θ by any non-zero number. Explicitly, we would

obtain the same Lagrangian with a transformation law φ → exp(iqθ)φ for any non-zero q. Since

q is arbitrary, we can choose it to be one, as we did. The situation is different when we have

more than one field, as we discuss next.
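
The equivalence of Eqs. (2.7) and (2.9), and their invariance under the transformations (2.6) and (2.8), can be spot-checked numerically. The following is a minimal sketch under stated assumptions: it compares only the non-derivative terms, writes V for minus those terms, and uses arbitrary sample values for the fields and parameters. All function names are illustrative.

```python
import cmath
import math

def potential_complex(phi, m2, lam):
    """V = m^2 |phi|^2 + lam |phi|^4, minus the non-derivative part of Eq. (2.9)."""
    n = (phi.conjugate() * phi).real
    return m2 * n + lam * n ** 2

def potential_cartesian(phi_r, phi_i, m2, lam):
    """The same quantity written with the real fields, as in Eq. (2.7)."""
    return (m2 / 2) * (phi_r ** 2 + phi_i ** 2) + (lam / 4) * (
        phi_r ** 4 + phi_i ** 4 + 2 * phi_i ** 2 * phi_r ** 2)

def rotate(phi_r, phi_i, theta):
    """SO(2) rotation of Eq. (2.6)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * phi_r + s * phi_i, -s * phi_r + c * phi_i

# Arbitrary sample point and parameters.
phi_r, phi_i, m2, lam, theta = 0.7, -1.3, 2.0, 0.5, 0.9
phi = (phi_r + 1j * phi_i) / math.sqrt(2)          # Eq. (2.4)

v0 = potential_cartesian(phi_r, phi_i, m2, lam)
same_in_both_forms = math.isclose(v0, potential_complex(phi, m2, lam))
so2_invariant = math.isclose(v0, potential_cartesian(*rotate(phi_r, phi_i, theta), m2, lam))
u1_invariant = math.isclose(v0, potential_complex(cmath.exp(1j * theta) * phi, m2, lam))
print(same_in_both_forms, so2_invariant, u1_invariant)
```

A numerical check at one point is of course not a proof, but it catches sign and normalization mistakes, such as a wrong factor in the relation between (2.4) and (2.7).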

2.3 Charge

We are now ready to define charge. We are used to the notion of charge from the specific case

of electromagnetism. In electromagnetism, charge has two aspects: (i) It sets the strength of the

interaction of the fermions with the photon; (ii) It is a conserved quantity. Below we first deal

with the latter point, while the aspect of interaction strength will emerge only when we generalize

our discussion to local symmetries in Section 2.6. Charge conservation is related to symmetry. The

general relation between internal global continuous symmetries and conserved charges is known

as Noether’s theorem. Here we provide a simple example while the general case is explained in

Appendix 4.A.

Consider a theory with two complex scalar fields, φ1 and φ2. To each field φi we assign a real

number qi. We impose a symmetry under the simultaneous phase rotation of both fields:

φ1 → exp(iq1θ)φ1, φ2 → exp(iq2θ)φ2. (2.10)

The conjugate fields transform as follows:

\phi_1^\dagger \to \exp(-iq_1\theta)\,\phi_1^\dagger, \qquad \phi_2^\dagger \to \exp(-iq_2\theta)\,\phi_2^\dagger. \qquad (2.11)

We say that qi is the charge of the field φi. The charge of the conjugate field φ†i is −qi. While we

can always set one of the charges to 1, we cannot do it for both. The ratio of charges, q2/q1, is

physical.

The charge qi is an input to model building: We assign charges to the fields and write down the

Lagrangian that is invariant under the above rotations. As a concrete example, consider a model

with two complex scalar fields of charges q1 = 1 and q2 = 3. Then the most general renormalizable

Lagrangian that is invariant under Eq. (2.10) is

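\mathcal{L} = \partial_\mu\phi_1^\dagger\,\partial^\mu\phi_1 + \partial_\mu\phi_2^\dagger\,\partial^\mu\phi_2 - m_1^2\,\phi_1^\dagger\phi_1 - m_2^2\,\phi_2^\dagger\phi_2

\qquad - \lambda_{11}\,(\phi_1^\dagger\phi_1)^2 - \lambda_{22}\,(\phi_2^\dagger\phi_2)^2 - \lambda_{12}\,(\phi_1^\dagger\phi_1)(\phi_2^\dagger\phi_2) - (\eta\,\phi_1^3\phi_2^\dagger + \mathrm{h.c.}). \qquad (2.12)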

A few comments are in order:


• All the interactions that are allowed by the symmetry conserve the charge. This can be seen

formally (and most generally) from Noether’s theorem. It can also be seen for our specific

example by inspecting the Lagrangian of Eq. (2.12) and observing that each term carries an

overall charge zero, and therefore corresponds to creation and annihilation of particles such

that the initial and final charges are equal.

• Since charge is related to the phase shift of a field, it can only be assigned to complex fields.

Conversely, real fields carry no charge.

• There are two terms that are often used in the Physics jargon instead of “charges”: “quantum

numbers” (QN’s) and “representations.” The use of the term QN’s might be confusing as in

quantum mechanics charge is assigned to a particle while QN is assigned to a state. In QFT,

however, we use QN to describe the charges that are assigned to a field. The use of the term

representations will become clear when we discuss non-Abelian symmetries.
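
The statement that every term of Eq. (2.12) carries overall charge zero can be turned into a small search. The sketch below is illustrative: it enumerates non-derivative monomials in φ1, φ1†, φ2, φ2† of mass dimension at most four and keeps those whose total charge vanishes. The function name and the exponent-tuple representation are assumptions made for this example.

```python
from itertools import product

def neutral_monomials(q1, q2, max_dim=4):
    """Exponents (a, abar, b, bbar) of phi1, phi1^dagger, phi2, phi2^dagger
    for non-derivative monomials with total charge zero and 1 <= dimension <= max_dim."""
    found = []
    for a, abar, b, bbar in product(range(max_dim + 1), repeat=4):
        dim = a + abar + b + bbar                    # each scalar field has mass dimension 1
        charge = (a - abar) * q1 + (b - bbar) * q2   # conjugate fields carry charge -q
        if 1 <= dim <= max_dim and charge == 0:
            found.append((a, abar, b, bbar))
    return found

ops = neutral_monomials(1, 3)
print(len(ops))               # 7
print((3, 0, 0, 1) in ops)    # True: the eta term, phi1^3 phi2^dagger
```

For q1 = 1 and q2 = 3 the seven monomials are the two mass terms, the three quartic terms with couplings λ11, λ22 and λ12, and the η term together with its hermitian conjugate, which are exactly the non-derivative terms of Eq. (2.12).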

2.4 Product groups

In our discussion above, we introduced the simplest continuous Abelian symmetry – U(1). We can

generalize this idea and impose a larger symmetry, [U(1)]N . This gives us another tool for model

building. Consider, for example, a model with two complex scalar fields, where we require invari-

ance of the Lagrangian under two independent rotations by phases θa and θb. Such a symmetry is

called U(1)a × U(1)b. The two fields transform as follows:

\phi_1 \to \exp[i(q^a_1\theta_a + q^b_1\theta_b)]\,\phi_1, \qquad \phi_2 \to \exp[i(q^a_2\theta_a + q^b_2\theta_b)]\,\phi_2. \qquad (2.13)

The rotation of the conjugate fields is done with the replacement q^{a,b}_i → −q^{a,b}_i.

Consider the assignment (q^a_1, q^b_1) = (1, 0) and (q^a_2, q^b_2) = (0, 1). In such a case, the η term of

the Lagrangian of Eq. (2.12) is forbidden, and we have (repeated indices are summed over)

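\mathcal{L} = \partial_\mu\phi_i^\dagger\,\partial^\mu\phi_i - m_i^2\,\phi_i^\dagger\phi_i - \lambda_{ij}\,(\phi_i^\dagger\phi_i)(\phi_j^\dagger\phi_j). \qquad (2.14)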

We see again how imposing a symmetry can be used to forbid terms in the Lagrangian.

The U(1)a × U(1)b symmetry can be defined in various ways: Instead of having θa and θb as

the independent rotation angles, we can use linear combinations of them. For example, we can

use θ_± ≡ θ_a ± θ_b as the two independent angles. The corresponding charges are q^±_i = q^a_i ± q^b_i and

the symmetry is denoted by U(1)_+ × U(1)_−. Below we see examples of when such a

redefinition is useful.

It is illuminating to ask if there is a way to obtain the Lagrangian of Eq. (2.14) by imposing

a single U(1) symmetry as in Eq. (2.10). The answer is in the affirmative. For example, we can


assign charges q_1 = 1 and q_2 = 4. Note that this choice allows a term of the form φ_1^4 φ_2^†. The reason

we do not write it is that it is a dimension-five term and therefore non-renormalizable.

This is in fact our first example of an accidental symmetry. The symmetry we impose is a

U(1) symmetry with certain charge assignments, and the resulting Lagrangian has a U(1)× U(1)

symmetry, that is, there is an extra U(1) symmetry that is accidental. In fact, the Lagrangian of

Eq. (2.14) would be the same for any U(1) symmetry with q1 and q2 positive integers, co-prime to

each other, and q1 + q2 ≥ 5. To each such assignment corresponds a different non-renormalizable

term, φ_1^{q_2} (φ_2^†)^{q_1}, that breaks the [U(1)]^2 symmetry down to the U(1) that we imposed.
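
The dimension at which the accidental U(1) is first broken can be found by brute force. The sketch below is an illustration under stated assumptions: it scans non-derivative monomials in φ1, φ1†, φ2, φ2†, keeps those that are neutral under the imposed U(1) but not invariant under independent phase rotations of the two fields, and returns the smallest mass dimension found. The function name and the search cutoff are hypothetical choices.

```python
from itertools import product

def lowest_breaking_dimension(q1, q2, max_dim=8):
    """Smallest dimension of a monomial that is neutral under the imposed U(1)
    but breaks U(1) x U(1); returns None if nothing is found up to max_dim."""
    best = None
    for a, abar, b, bbar in product(range(max_dim + 1), repeat=4):
        dim = a + abar + b + bbar
        neutral = (a - abar) * q1 + (b - bbar) * q2 == 0
        breaks_both = (a != abar) or (b != bbar)   # not of the form |phi1|^2k |phi2|^2l
        if 1 <= dim <= max_dim and neutral and breaks_both:
            best = dim if best is None else min(best, dim)
    return best

results = {charges: lowest_breaking_dimension(*charges)
           for charges in [(1, 3), (1, 4), (2, 3)]}
print(results)
```

For charges (1, 3) the search returns 4, which is the renormalizable η term of Eq. (2.12). For (1, 4) and (2, 3) it returns 5 = q1 + q2, so the breaking term is non-renormalizable and the extra U(1) is accidental.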

2.5 Fermion masses

The basic fermion fields are two component Weyl fermions that are generically denoted by ψL and

ψR, where L and R denote, respectively, left-handed and right-handed chirality. They are related

to a Dirac field by

\psi_{R,L} = \frac{1}{2}\,(1 \pm \gamma_5)\,\psi. \qquad (2.15)

Each of ψL and ψR has two degrees of freedom and is a complex field. We can define a phase

transformation of the Weyl fermions:

ψL → exp(iqLθ)ψL, ψR → exp(iqRθ)ψR. (2.16)

The conjugate fields transform as

\overline{\psi_L} \to \exp(-iq_L\theta)\,\overline{\psi_L}, \qquad \overline{\psi_R} \to \exp(-iq_R\theta)\,\overline{\psi_R}. \qquad (2.17)

To understand the consequences of imposing such a U(1) symmetry on the Lagrangian, re-

call that the combination of renormalizability and Lorentz invariance allows only terms with two

fermion fields (or no fermion fields at all). We here focus on the fermion mass terms. There are

two possible mass terms for fermions: Dirac and Majorana.

Dirac masses couple left- and right-handed fields,

\mathcal{L}_{m_D} = m_D\,\overline{\psi_L}\,\psi_R + \mathrm{h.c.}, \qquad (2.18)

where mD is the Dirac mass.

Majorana masses couple a left-handed or a right-handed field to itself. Consider ψR, a right-

handed field. Defining

\psi^c_R = C\,\overline{\psi_R}^{\,T}, \qquad (2.19)

where C is the charge conjugation matrix, a Majorana mass term reads

\mathcal{L}_{m_M} = \frac{m_M}{2}\,\overline{\psi^c_R}\,\psi_R + \mathrm{h.c.}, \qquad (2.20)


where m_M is the Majorana mass. Note that \overline{ψ^c_R} and ψ_R transform in the same way under all

symmetries. In particular, if ψ_R → exp(iq_Rθ)ψ_R, then we have \overline{ψ^c_R} → exp(iq_Rθ)\overline{ψ^c_R}. Similar

expressions hold for left-handed fields.

There is a classification of symmetries applied to fermions that is relevant for our discussion.

Consider a theory with a single left-handed and a single right-handed fermion field and an imposed

U(1) symmetry. The symmetry is chiral if q_L ≠ q_R. The symmetry is vectorial if q_L = q_R. More

generally, if there are many fermion fields, the symmetry is vectorial if all left-handed fields and

all right-handed fields can be matched into pairs with the same charge, qLi = qRi for each i, and

chiral otherwise.

We emphasize the following points regarding Eqs. (2.18) and (2.20):

• Since ψL and ψR are different fields, there are four degrees of freedom with the same Dirac

mass, mD. In contrast, since only one Weyl fermion field is involved in a Majorana mass

term, there are only two degrees of freedom that have the same Majorana mass, mM .

• The relative factor of 1/2 between Majorana and Dirac mass terms is the analog of the similar

factor between the mass-squared terms for real and complex scalar fields.

• Consider a theory with one or more U(1) symmetries. To allow a Dirac mass, the charges of

\overline{ψ_L} and ψ_R under these symmetries must be opposite, which is the case when q(ψ_L) = q(ψ_R).

Thus, to have a Dirac mass term, the fermion has to be in a vector representation of the

symmetry group.

• Since the transformations of \overline{ψ^c_R} and ψ_R are the same, a fermion field can have a Majorana

mass only if it is neutral under all U(1) symmetries. In particular, as we discuss below, fields

that carry electric charges cannot acquire Majorana masses. If we include any non-Abelian

group (to be discussed later), a fermion field can have a Majorana mass only if it is in a real

representation of the symmetry group.

• When there are m left-handed fields and n right-handed fields with the same quantum num-

bers, the Dirac mass terms for these fields form an m× n general complex matrix mD:

(m_D)_{ij}\,\overline{(\psi_L)_i}\,(\psi_R)_j + \mathrm{h.c.} \qquad (2.21)

The mass eigenstates are, for m ≤ n, m Dirac fermions and (n −m) massless right-handed

fermions or, for m ≥ n, n Dirac fermions and (m− n) massless left-handed fermions. In the

SM, as we discuss later, fermion fields are present in three copies with the same quantum

numbers, and the Dirac mass matrices are 3× 3.

• When there are m neutral left-handed fields and n neutral right-handed fields, the mass

terms form an (m+n) × (m+n) symmetric complex matrix M_ψ. This matrix consists of an

m × m block of Majorana mass terms for the left-handed fields, an n × n block of Majorana

mass terms for the right-handed fields, and an m × n block of Dirac mass terms:

\begin{pmatrix} \overline{\psi_L} & \overline{\psi^c_R} \end{pmatrix} \begin{pmatrix} m_{M_L}^{(m\times m)} & m_D^{(m\times n)} \\ m_D^{T\,(n\times m)} & m_{M_R}^{(n\times n)} \end{pmatrix} \begin{pmatrix} \psi^c_L \\ \psi_R \end{pmatrix}. \qquad (2.22)

The (m + n) mass eigenstates are Majorana fermions. In the SM, neutrinos are the only

neutral fermions. If they have Majorana masses, then their mass matrix is 3 × 3.

We summarize these differences between Dirac and Majorana masses in Table 2.1.

Table 2.1: Dirac and Majorana masses

                          Dirac                      Majorana
# of degrees of freedom   4                          2
Representation            vector                     neutral
Mass matrix               m × n, general             (m + n) × (m + n), symmetric
SM fermions               quarks, charged leptons    neutrinos (?)

The main lesson that we can draw from these observations is the following: Charged fermions in

a chiral representation are massless. In other words, if we encounter massless fermions in Nature,

there is a way to explain their masslessness from symmetry principles.
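
The charge bookkeeping behind this lesson fits in a few lines. The sketch below is illustrative and covers a single U(1) with one left-handed and one right-handed Weyl fermion: a mass term is allowed only if its total charge vanishes. The second function reproduces the counting of mass eigenstates quoted after Eq. (2.21). Function names and dictionary keys are hypothetical.

```python
def allowed_mass_terms(q_left, q_right):
    """Which mass terms are U(1)-invariant for one LH and one RH Weyl fermion.

    The Dirac term couples the conjugate LH field to the RH field and carries
    charge q_right - q_left. A Majorana term for a field of charge q carries
    charge 2q (up to an overall sign).
    """
    return {
        "dirac": q_right - q_left == 0,
        "majorana_left": 2 * q_left == 0,
        "majorana_right": 2 * q_right == 0,
    }

def dirac_spectrum(m, n):
    """(number of Dirac fermions, number of massless Weyl fermions) for
    m LH and n RH fields with the same quantum numbers."""
    return min(m, n), abs(m - n)

print(allowed_mass_terms(-1, -1))   # vectorial and charged: Dirac mass only
print(allowed_mass_terms(-1, -2))   # chiral and charged: no mass term
print(allowed_mass_terms(0, 0))     # neutral: Dirac and Majorana masses allowed
print(dirac_spectrum(3, 3))         # (3, 0)
```

The second case is the one singled out in the text: a charged fermion in a chiral representation has no allowed mass term.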

2.6 Local symmetries

So far we discussed global symmetries, that is, symmetries where θ is independent of spacetime. In

this section we discuss local symmetries (also called gauge symmetries), that is, symmetries where

the transformation can be different at different spacetime points, θ = θ(x^µ). The implications of local

symmetries are far reaching. The symmetries that are imposed in defining the Standard Model

are all local symmetries.

We generalize Eq. (2.8) and define a local transformation of a complex scalar field:

\phi(x) \to e^{iq\theta(x)}\,\phi(x). \qquad (2.23)

(From here on, we do not write explicitly the transformation of the conjugate field, and we omit the

index µ from xµ.) Note that a global transformation is a special case of the local transformation.

Moreover, all terms in the Lagrangian that do not involve derivatives of fields and which are

invariant under a global symmetry are also invariant under the corresponding local symmetry.

This is, however, not the case for derivative terms. The local transformation of ∂µφ is given by

\partial_\mu\phi \to \partial_\mu[e^{iq\theta(x)}\phi] = e^{iq\theta(x)}\,\partial_\mu\phi + iq\,e^{iq\theta(x)}\,[\partial_\mu\theta(x)]\,\phi. \qquad (2.24)


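Consequently, the kinetic term of a scalar is not invariant under the local symmetry:

\partial^\mu\phi^\dagger\,\partial_\mu\phi \to \left(\partial^\mu\phi^\dagger - iq\,[\partial^\mu\theta(x)]\,\phi^\dagger\right)\left(\partial_\mu\phi + iq\,[\partial_\mu\theta(x)]\,\phi\right) \neq \partial^\mu\phi^\dagger\,\partial_\mu\phi. \qquad (2.25)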

The kinetic term violates the local symmetry also in the fermionic case. We learn that in a theory

that includes only scalars and fermions, a local symmetry acting on these scalar and fermion fields

forbids the kinetic terms. A field without a kinetic term is not dynamical and cannot describe the

particles we observe in Nature. Can we have a theory of dynamical scalars and fermions that is

invariant under a local symmetry?

The answer is in the affirmative. To do that, we have to “correct” for the extra terms that arise

in the transformation of the kinetic terms. For the global symmetry case, L remains invariant since

φ and ∂µφ transform in the same way, that is, both are just multiplied by the phase factor exp(iqθ).

Then, we construct all the terms in L as products of φ and φ† or their derivatives. (Recall, φ and

φ† transform with opposite phases.) This procedure gives us an idea of how to solve the situation

for the local case. We should replace ∂µφ with a so-called “covariant” derivative Dµφ and require

that Dµ transforms in such a way that the transformation law for Dµφ is

D_\mu\phi \to e^{iq\theta(x)}\,D_\mu\phi. \qquad (2.26)

If we have a globally invariant L(∂µφ, φ), and the transformation law (2.26) applies, then L(Dµφ, φ)

is guaranteed to be locally invariant under the corresponding symmetry.

How do we find what Dµ is? Given the transformation law (2.23), and the desired transforma-

tion law (2.26), let us try

Dµ = ∂µ + igqAµ , (2.27)

where g is a dimensionless constant, and Aµ is a vector field with the following transformation law:

A_\mu \to A_\mu - \frac{1}{g}\,\partial_\mu\theta. \qquad (2.28)

We leave it as homework to check that the covariant derivative of a field transforms in the same

way as the field, namely that Eqs. (2.27) and (2.28) indeed lead to (2.26). We have thus achieved our

goal of obtaining a locally invariant Lagrangian with dynamical scalar and/or fermion fields. We

do so by taking a Lagrangian L that is invariant under the global symmetry, and replacing ∂µ with

Dµ. The “price” to pay is that we must add vector fields to the model.
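
Before working through the algebra, the transformation law (2.26) can be sanity-checked numerically. The sketch below is a one-dimensional toy version under stated assumptions: the field profile, the gauge field, the function θ(x), and the values of g and q are arbitrary sample choices, and derivatives of the fields are taken by central finite differences. It is a numerical check, not a substitute for the derivation.

```python
import cmath
import math

G, Q, H = 0.8, 2.0, 1e-5   # coupling g, charge q, finite-difference step (sample values)

def phi(x):
    """Sample complex field profile."""
    return cmath.exp(-x * x) * (1.0 + 0.5j * x)

def gauge(x):
    """Sample gauge field A(x)."""
    return math.sin(x)

def theta(x):
    """Sample local transformation parameter."""
    return 0.3 * x ** 2 + math.cos(x)

def dtheta(x):
    """Exact derivative of theta(x)."""
    return 0.6 * x - math.sin(x)

def phi_t(x):
    """Transformed field, Eq. (2.23)."""
    return cmath.exp(1j * Q * theta(x)) * phi(x)

def gauge_t(x):
    """Transformed gauge field, Eq. (2.28)."""
    return gauge(x) - dtheta(x) / G

def deriv(f, x):
    """Central finite-difference approximation to df/dx."""
    return (f(x + H) - f(x - H)) / (2 * H)

def cov_deriv(f, a, x):
    """D f = (d/dx + i g q A) f, Eq. (2.27) restricted to one dimension."""
    return deriv(f, x) + 1j * G * Q * a(x) * f(x)

x0 = 0.4
phase = cmath.exp(1j * Q * theta(x0))
lhs = cov_deriv(phi_t, gauge_t, x0)        # covariant derivative after the transformation
rhs = phase * cov_deriv(phi, gauge, x0)    # phase times covariant derivative before it
ordinary_mismatch = abs(deriv(phi_t, x0) - phase * deriv(phi, x0))

print(abs(lhs - rhs) < 1e-6)       # the covariant derivative picks up only the phase
print(ordinary_mismatch > 1e-3)    # the ordinary derivative does not
```

The mismatch of the ordinary derivative is the extra term in Eq. (2.24); the shift of the gauge field in Eq. (2.28) is what cancels it.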

The field Aµ is called a gauge field. The constant g is called the coupling constant (for reasons

that will become clear below). Following our principle that we include all the terms that are

allowed, we add a kinetic term for A_µ. To do that, we define the field strength F^{µν}:

F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \qquad (2.29)

The Lorentz invariant kinetic term for the gauge field is given by

\mathcal{L}_V = -\frac{1}{4}\,F^{\mu\nu}F_{\mu\nu}. \qquad (2.30)


The Lagrangian (2.30) is the most general renormalizable L(A) that is invariant under the local

symmetry.

Note the following points:

• Given the kinetic term, Aµ is a dynamical field and its excitations are physical particles. For

example, as we show later, the photon is associated with such excitations.

• We work in the canonically normalized basis where the coefficient of the kinetic term is −1/4.

• While the kinetic term is invariant under the local phase transformation, a mass-squared term,

(1/2) m^2 A^µ A_µ, is not. You will prove it in your homework. Here we just emphasize the

result: Local invariance implies massless gauge fields. A massless gauge boson has only two degrees

of freedom.

• The gauge field does not couple to itself, that is, there is no term in the renormalizable

Lagrangian that involves more than two gauge fields. Specifically, a trilinear term is not

Lorentz invariant, while a quartic term violates the local symmetry.

• You may be puzzled by the fact that the transformation law for A_µ is additive, and not multiplicative,

as is the case for the scalar and fermion fields. You should not be entirely surprised.

When we write the transformation law for a scalar field as φ → exp(iqθ)φ, the phase of the

field is shifted, arg(φ) → arg(φ) + qθ. The A_µ field transforms like a phase.

• Consider Lagrangian terms involving only scalar and/or fermion fields and no derivatives of

fields. If such a term is invariant under the global symmetry then it is also invariant under the

corresponding local symmetry. This is not the case for the terms involving the gauge fields.

Indeed, a mass-squared term and a quartic interaction term are invariant under the global

U(1) but not under the local U(1). This is related to the fact that the U(1) transformation

law for scalars and fermions involves a multiplicative phase factor, while the one for gauge

fields involves an additive shift.

• The transformation of Eq. (2.28) is familiar from classical electromagnetism, where Aµ is

just the vector potential. In that context, the invariance of the physics under (2.28)

is often stated as the fact that only E and B are physical, while A_µ is redundant, and thus

we have an extra freedom that we call the gauge freedom. Here we reverse the logic: we

impose the gauge symmetry and the result is that the photon is massless and that only E

and B are physical.

• If a local symmetry decomposes into several commuting factors, each factor requires its

corresponding gauge field and has its own independent coupling constant. For example, if

the symmetry is a local U(1)_a × U(1)_b, we must include two massless gauge fields, A^a_µ and

A^b_µ, and their transformation laws involve two independent coupling constants, g_a and g_b.


2.6.1 Charge

The notion of charge under a global symmetry was introduced in Eq. (2.10). The global symmetry

implies charge conservation. In this subsection we show that, in the case of a local symmetry, there

is an additional implication to the charge: it sets the strength of the interaction with the gauge

boson.

Consider a local U(1) where the expression for the covariant derivative of a field of charge q is

given by

Dµ = ∂µ + igqAµ . (2.31)

Take, for example, a fermion field with charge q under the local U(1). Its kinetic term is

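i\,\overline{\psi}\,\slashed{D}\,\psi = i\,\overline{\psi}\,\slashed{\partial}\,\psi - gq\,\overline{\psi}\,\slashed{A}\,\psi. \qquad (2.32)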

The second term is an interaction term between the fermion and the vector field. The strength

of the interaction is given by the gauge coupling constant g times the charge q. In particular, the

larger the charge, the stronger the coupling to the gauge boson.

Two comments are in order with regard to Eqs. (2.31) and (2.32):

1. Unlike the derivative ∂µ, the covariant derivative Dµ depends on the charge q of the field on

which it acts. It is different for fields of different charges.1

2. What appears in a Lagrangian is the combination gq, and not g and q separately. Indeed,

for the Abelian case, one can re-scale the coupling constant g and the charge q such that gq

remains the same, and the physics of the model is unaffected. When we have several fields

of different charges, one should re-scale all of them in the same way. The ratio between

different charges, say q2/q1, is physical and sets the relative strength of the interactions of

φ2 and φ1 with the gauge field Aµ. The situation is different in the non-Abelian case as we

discuss later.

We can summarize the situation as follows. To have a Lagrangian of dynamical scalar and/or

fermion fields that is invariant under a local U(1) symmetry requires the introduction of a vector

boson. The vector boson necessarily interacts with all scalars and fermions that are charged under

the symmetry. Conversely, the gauge boson does not couple to fields that are neutral (q = 0) under

the symmetry. The global symmetry implies charge conservation. The local symmetry implies, in

addition, that the charge sets the strength of the interaction with the corresponding gauge field.

1 Perhaps it would have been helpful to denote a U(1)-related covariant derivative by D^q_µ, emphasizing the

q-dependence. It has become customary, however, to keep the q-dependence implicit and write just D_µ, and we comply

with this norm of the community.


2.7 Summary

We described the first steps in the process of model building for scalars and/or fermions. We need

to provide as input the following two ingredients:

(i) The symmetry;

(ii) The charges of the fermions and scalars.

Then we write down the most general renormalizable Lagrangian that depends on the scalar and

fermion fields and is invariant under the symmetry. If the imposed symmetry is local, corresponding

vector fields must be added.

The most general renormalizable Lagrangian with scalar, fermion and gauge fields can be

decomposed into

L = Lkin + Lψ + LYuk + Lφ. (2.33)

Here Lkin describes the free propagation in spacetime of all dynamical fields, as well as the gauge

interactions, L_ψ gives the fermion mass terms, L_Yuk describes the Yukawa interactions, and L_φ

gives the scalar potential. In all the examples we will study, our task will be to find the specific

form of each of these four parts of the renormalizable Lagrangian.

In the SM, only local symmetries are imposed. Similarly, in most of the extensions of the SM,

only local and global discrete symmetries are imposed. While it is possible, in principle, to impose

also global continuous symmetries, this is rarely done in current model building. The reason for

that is twofold. First, there are arguments that suggest that continuous global symmetries are

always broken by gravitational effects and thus can only arise as accidental, rather than imposed

symmetries. Second, there is no obvious phenomenological motivation to impose such symmetries.

Thus, in all the models presented in this book that aim to describe Nature, global continuous

symmetries are not imposed.

In the next chapter we show how QED, the theory of electromagnetic interactions, can be

derived by starting from the above principles.


Homework

Question 2.1: Charges

Consider a system with three complex fields, φ1, φ3 and φ4, where a field φq carries charge q.

Write all the allowed interaction terms with three and four fields.

Question 2.2: Local symmetries

In order to “gauge” a symmetry we assume that the symmetry transformation parameter is

x^µ-dependent, θ = θ(x^µ), and the field transformation properties are:

\phi(x^\mu) \to e^{iq\theta(x^\mu)}\,\phi(x^\mu), \qquad A_\mu(x^\mu) \to A_\mu(x^\mu) - \frac{1}{g}\,\partial_\mu\theta(x^\mu), \qquad (2.34)

where g is a coupling constant and q is the charge of the field. We consider a complex field φ with

the following Lagrangian

\mathcal{L} = |\partial_\mu\phi|^2 - m^2|\phi|^2 - \lambda|\phi|^4 \qquad (2.35)

1. Explain why, for any terms that do not involve derivatives, the θ → θ(xµ) substitution has

no effect on the symmetry properties.

2. Show that the kinetic term is not invariant under a local transformation, φ(x^µ) → e^{iqθ(x^µ)} φ(x^µ).

3. The covariant derivative is given by:

Dµφ = (∂µ + igqAµ)φ. (2.36)

Show that the covariant derivative of the field transforms in the same way as the field itself:

D_\mu\phi \to e^{iq\theta(x^\mu)}\,D_\mu\phi \qquad (2.37)

or equivalently:

D_\mu \to e^{iq\theta(x^\mu)}\,D_\mu\,e^{-iq\theta(x^\mu)}. \qquad (2.38)

We can therefore replace the ordinary derivative with a covariant derivative to make the

Lagrangian (2.35) gauge invariant.


4. Show that a mass term for A, that is, m^2 A^µ A_µ, is not invariant under the transformation

properties of Eq. (2.34).

5. Write L that is invariant under the local transformation with the inclusion of A and its

kinetic term. Keep terms up to d = 4.

Question 2.3: Chiral symmetry

Consider a single free Dirac fermion

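\mathcal{L} = i\,\overline{\psi}\,\slashed{\partial}\,\psi, \qquad (2.39)

and the following transformations:

\psi \to e^{i\alpha}\,\psi, \qquad \psi \to e^{i\alpha\gamma_5}\,\psi. \qquad (2.40)

1. Show that the adjoint field transforms as

\overline{\psi} \to \overline{\psi}\,e^{-i\alpha}, \qquad \overline{\psi} \to \overline{\psi}\,e^{i\alpha\gamma_5}. \qquad (2.41)

2. Show that the ψ → e^{iα} ψ transformation is vectorial: ψ_{R,L} → e^{iα} ψ_{R,L}.

3. Show that the ψ → e^{iαγ_5} ψ transformation is chiral: ψ_{R,L} → e^{±iα} ψ_{R,L}.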

4. Show that the Lagrangian (2.39) is invariant under both rotations of Eq. (2.40).

5. We now add a Dirac mass term to the Lagrangian:

\mathcal{L} = \overline{\psi}\,[i\slashed{\partial} - m]\,\psi. \qquad (2.42)

Show that the presence of m ≠ 0 breaks the chiral symmetry and conserves the vectorial

one. The above result is the source of the statement that we can use chiral symmetries to

forbid mass terms for fermions.


Chapter 3

QED

You are familiar with Quantum ElectroDynamics (QED) from your QFT courses, where the QED

Lagrangian is given and its implications are studied. In this chapter we introduce QED using tools

of model building: We postulate a symmetry principle and derive the Lagrangian.

3.1 Defining QED

The simplest version of QED is defined as follows:

(i) The symmetry is a local

U(1)EM. (3.1)

(ii) There are two fermion fields,

eL(−1), eR(−1). (3.2)

The sub-indices L,R denote the chirality: left-handed and right-handed, respectively. The

number in parentheses is the U(1)_EM charge.

(iii) There are no scalars.

3.2 The Lagrangian

As discussed above, the most general renormalizable Lagrangian with scalar, fermion, and gauge

fields can be decomposed into

L = Lkin + Lψ + LYuk + Lφ. (3.3)

It is now our task to find the specific form of the Lagrangian made of the fermion fields eL and eR

subject to the U(1)EM gauge symmetry.


The imposed U(1)EM gauge symmetry requires that we include a single gauge boson, Aµ (of

charge q = 0) that we call the photon field. The corresponding field strength is given by

F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \qquad (3.4)

The covariant derivative is

Dµ = ∂µ + ieqAµ, (3.5)

where e is the coupling constant and, specifically for the eL,R fields, q = −1.

Lkin includes the kinetic terms of all the fields:

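\mathcal{L}_{\rm kin} = -\frac{1}{4}\,F^{\mu\nu}F_{\mu\nu} + i\,\overline{e_L}\,\slashed{D}\,e_L + i\,\overline{e_R}\,\slashed{D}\,e_R. \qquad (3.6)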

Lψ includes a Dirac mass term for the electron fields:

\mathcal{L}_\psi = m_e\,\overline{e_L}\,e_R + \mathrm{h.c.} \qquad (3.7)

Finally, since there are no scalar fields in the model,

LYuk = 0, Lφ = 0. (3.8)

3.3 The spectrum

The spectrum of the model that we defined above consists of a massive Dirac fermion of mass me,

and a massless gauge boson – the photon. We call the Dirac fermion “the electron,” and denote it

by e. (It is somewhat unfortunate that the QED coupling constant and the electron field are both

denoted by e. Which of the two options is meant should be clear from the context.)

We emphasize the following points:

• The electron is a Dirac field, which has four DoF.

• The reason we can write a mass term for the electron is that QED is vectorial, that is, the

eL and eR fields are assigned the same charge.

• The masslessness of the photon is a consequence of the gauge symmetry.

3.4 The interactions

Expanding Dµ, we obtain the photon–fermion interaction term:

$\mathcal{L}_{\rm int} = e\,\bar{e}\,\slashed{A}\,e$, (3.9)

where we used qe = −1. When the charge of the electron is set by convention to q = −1 (as we do

in our definition of the model), the coupling constant e is related to the fine structure constant as

$\alpha = \frac{e^2}{4\pi}$. (3.10)

We emphasize the following two points:

• The photon is not charged under QED (qA = 0) and consequently, at tree level, photons do

not interact with photons.

• The QED interaction is vector-like and, consequently, it conserves C, P and CP .

3.5 Parameter counting and tests of QED

The QED Lagrangian has two free parameters: the electron mass, me, and the coupling constant,

e or, equivalently, $\alpha \equiv e^2/(4\pi)$. Both have been measured with impressive accuracy:

$m_e = 0.510998928 \pm 0.000000011$ MeV,

$\alpha^{-1} = 137.035999074 \pm 0.000000044$. (3.11)
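
As a small numerical illustration of Eq. (3.10), the quoted value of $\alpha^{-1}$ can be converted into the dimensionless coupling e. The sketch below assumes natural units and the convention $\alpha = e^2/(4\pi)$ used in the text; the variable names are illustrative only.

```python
import math

# Inverse fine-structure constant as quoted in Eq. (3.11).
alpha_inv = 137.035999074

# Eq. (3.10): alpha = e^2 / (4 pi), hence e = sqrt(4 pi alpha).
alpha = 1.0 / alpha_inv
e_coupling = math.sqrt(4.0 * math.pi * alpha)

print(f"alpha = {alpha:.6e}")
print(f"e     = {e_coupling:.4f}")
```

The coupling comes out at roughly 0.3, which is why perturbation theory in α works so well for QED.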

This model has no accidental symmetry.

The fact that the photon is massless is a prediction of QED, independent of the values of its

parameters. The massless photon field should generate a long-range potential of the form $e^2 q/r$.

This is indeed the form of the Coulomb potential.

Given that the model has two parameters, once two appropriate experiments are carried out

to measure these parameters, one can make predictions for any other QED-related observable.

Examples can be found in several textbooks on QFT, for example, in Chapter 5 of [11].

3.6 QED with more fermions

3.6.1 Two Dirac fermions

We study a generalization of QED where we add a second fermion to the theory. We obtain

interesting lessons regarding symmetries. Our definition of the model is as follows:


(i) The symmetry is a local

U(1)EM. (3.12)

(ii) There are four fermion fields,

$\ell^i_L(-1),\ \ell^i_R(-1)\quad (i = 1, 2)$. (3.13)

(iii) There are no scalars.

The Lagrangian is a simple extension to the previous model. Since the model has no scalars,

we still have LYuk = Lφ = 0. The canonically-normalized kinetic terms for the fermions read

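$\mathcal{L}_{\rm kin} = -i\,\overline{\ell^1_L}\,\slashed{D}\,\ell^1_L - i\,\overline{\ell^2_L}\,\slashed{D}\,\ell^2_L - i\,\overline{\ell^1_R}\,\slashed{D}\,\ell^1_R - i\,\overline{\ell^2_R}\,\slashed{D}\,\ell^2_R$. (3.14)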

The fermion mass terms read

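$\mathcal{L}_\psi = \begin{pmatrix}\overline{\ell^1_L} & \overline{\ell^2_L}\end{pmatrix}\begin{pmatrix} m_{11} & m_{12}\\ m_{21} & m_{22}\end{pmatrix}\begin{pmatrix}\ell^1_R\\ \ell^2_R\end{pmatrix} + \text{h.c.}$ (3.15)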

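We can always, however, transform to a different basis for the fermion fields,

$\begin{pmatrix}\ell^1_L\\ \ell^2_L\end{pmatrix} \to \begin{pmatrix}e_L\\ \mu_L\end{pmatrix} = V_L\begin{pmatrix}\ell^1_L\\ \ell^2_L\end{pmatrix},\qquad \begin{pmatrix}\ell^1_R\\ \ell^2_R\end{pmatrix} \to \begin{pmatrix}e_R\\ \mu_R\end{pmatrix} = V_R\begin{pmatrix}\ell^1_R\\ \ell^2_R\end{pmatrix}$, (3.16)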

such that in the new basis the mass matrix is diagonal:

$V_L\begin{pmatrix} m_{11} & m_{12}\\ m_{21} & m_{22}\end{pmatrix}V_R^\dagger = \begin{pmatrix} m_e & \\ & m_\mu\end{pmatrix}$. (3.17)

Since we work with canonical kinetic terms, the above rotation keeps the kinetic terms invariant.

The basis where the mass matrix is diagonal is called the mass basis. The fermion states in this

basis are called mass eigenstates. We call the mass eigenstates the electron e and the muon µ

where, by definition, the muon is heavier than the electron, mµ > me.
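
The bi-unitary rotation of Eqs. (3.16) and (3.17) is, in numerical terms, a singular value decomposition. The following is a minimal sketch under stated assumptions: the 2 × 2 complex mass matrix is a random, hypothetical input (not a physical one), and the names `V_L`, `V_R`, `M_diag` are chosen here only to mirror the notation of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical complex mass matrix m_ij in the original (l^1, l^2) basis.
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

# SVD: M = U @ diag(s) @ Vh, with s real, non-negative, in descending order.
U, s, Vh = np.linalg.svd(M)

# Swap the two states so that the lighter one comes first (m_e < m_mu).
P = np.array([[0, 1], [1, 0]])
V_L = P @ U.conj().T
V_R = P @ Vh

# Eq. (3.17): V_L M V_R^dagger is diagonal, real and non-negative.
M_diag = V_L @ M @ V_R.conj().T
m_e, m_mu = M_diag[0, 0].real, M_diag[1, 1].real
print(np.round(M_diag, 12))
```

Because `V_L` and `V_R` are unitary, the canonical kinetic terms are unchanged by this rotation, as stated above.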

In the language of the e and µ Dirac fields, the full QED Lagrangian reads

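$\mathcal{L}^{e,\mu}_{\rm QED} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - i\,\bar{e}\,\slashed{D}\,e - i\,\bar{\mu}\,\slashed{D}\,\mu + m_e\,\bar{e}e + m_\mu\,\bar{\mu}\mu$. (3.18)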

3.6.2 Accidental symmetries

An interesting feature of this model is that it exhibits an accidental symmetry. The imposed

symmetry is local U(1)EM symmetry. The Lagrangian is, however, symmetric under a global

U(1)e × U(1)µ symmetry. The conventional charge assignment under this symmetry is

e(+1, 0), µ(0,+1). (3.19)

The symmetry is manifest in Eq. (3.18), where there is no term that involves both e and µ fields.

Thus, an independent phase rotation of each of them represents a symmetry. The global U(1)EM

(which is automatically imposed when imposing local U(1)EM) is, however, part of U(1)e×U(1)µ.

When the phase rotation of the e and µ is the same, that is just the symmetry we imposed.

We can rephrase our findings in the following way. Equivalently to U(1)e × U(1)µ, we can use

U(1)EM × U(1)e−µ, (3.20)

with the charges

$q^{\rm EM}_e = q^{\rm EM}_\mu = -1,\qquad q^{e-\mu}_e = -q^{e-\mu}_\mu = +1$. (3.21)

U(1)EM is the imposed symmetry, while U(1)e−µ is an accidental symmetry.

The experimental implication of the accidental symmetry is that QED conserves the muon and

electron numbers. In any QED process, the number of electrons minus the number of positrons is

conserved, and the number of muons minus the number of anti-muons is conserved. The symmetry

we impose – electric charge conservation – is the sum of the above two conservation laws. For

example, the decay process $\mu^- \to e^-e^+e^-$ is allowed by the imposed U(1)EM but violates the

accidental U(1)e−µ symmetry.

The accidental symmetry is broken by higher dimension operators. For example, the dimension-

six term,

$\overline{e_L}\,e_R\,\overline{e_L}\,\mu_R$, (3.22)

breaks the accidental symmetry (it carries U(1)e−µ charge of −2), but is still invariant under

the imposed local symmetry (its U(1)EM charge is zero). It allows the QED-forbidden process

$\mu^- \to e^-e^+e^-$.
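
The charge bookkeeping for the operator of Eq. (3.22) can be spelled out in a few lines. This is only an illustrative sketch: the dictionary `CHARGES` encodes the assignments of Eq. (3.21), and the helper `operator_charge` is a hypothetical name introduced here.

```python
# Charges (q_EM, q_{e-mu}) of the two Dirac fields, following Eq. (3.21).
CHARGES = {"e": (-1, +1), "mu": (-1, -1)}

def operator_charge(factors):
    """Sum the charges of a product of fields.

    Each factor is (field_name, is_barred); a barred (conjugated)
    field carries the opposite charges.
    """
    total_em, total_emu = 0, 0
    for name, is_barred in factors:
        sign = -1 if is_barred else +1
        q_em, q_emu = CHARGES[name]
        total_em += sign * q_em
        total_emu += sign * q_emu
    return total_em, total_emu

# Dimension-six operator of Eq. (3.22): e_L-bar e_R e_L-bar mu_R.
dim6 = [("e", True), ("e", False), ("e", True), ("mu", False)]
# A mass term e-bar e, for comparison.
mass_term = [("e", True), ("e", False)]

print(operator_charge(dim6))
print(operator_charge(mass_term))
```

An operator is allowed by a symmetry only if the corresponding entry vanishes: the mass term is neutral under both factors, while the dimension-six operator is neutral only under U(1)EM.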

So far no decay process that violates the U(1)e−µ symmetry has been observed. In neutrino

oscillation experiments, however, such violation has been observed. We discuss it in more detail

in Chapter 13.

3.6.3 Even more fields

We can add more fields to QED. As long as we add pairs of LH and RH fields with the same

charge, the situation is rather similar to what we described above. There is always a basis where

the mass terms are diagonal, and thus each pair of LH and RH fields can be combined into a Dirac

field that does not interact with the other fields. In this basis

$\mathcal{L}_{\rm QED} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - i\sum_j \bar{\psi}_j\,\slashed{D}\,\psi_j - m_j\,\bar{\psi}_j\psi_j$, (3.23)

where j = 1, . . . , N runs over all the Dirac fields. This model has a $[U(1)]^N$ symmetry, which, in

this form, is manifest.

Note that not all fields necessarily carry the same U(1)EM charge. (The charge of each field is

implicit in the covariant derivative.) Indeed, the charged elementary fermions known to us come

in three different charges: The charged leptons (such as the electron and the muon) with q` = −1,

the up-type quarks of charge qu = +2/3, and the down-type quarks of charge qd = −1/3. There

are three Dirac fermions of each of these three types. The QED Lagrangian for these nine Dirac

fields has a global $[U(1)]^9$ symmetry. In other words, the number of fermions minus the number

of anti-fermions of a given charge and a given mass is conserved.


Homework

Question 3.1: Light by light scattering

One of the well known properties of waves is the superposition principle that states that two

electromagnetic waves do not interact with each other. The superposition principle, however, is

only a classical approximation and it does not hold at the quantum level. In order to understand

this statement we discuss the cross section for the scattering process γγ → γγ.

1. Explain why at tree level (also referred to as the classical level)

σ(γγ → γγ) = 0. (3.24)

2. Within the framework of QED with only an electron field, draw a one loop diagram that

contributes to the above process.

3. To leading order, the cross section was calculated in the 1930s by Euler and Kockel. (For a

review, see, for example, [13].) In the center of mass frame and for photons of energy E, the

differential cross section is given by

$\frac{d\sigma}{d\Omega} = \frac{139\,\alpha^4}{(180\pi E)^2}\,\frac{E^8}{m_e^8}\,\left(3 + \cos^2\theta\right)^2$. (3.25)

Explain the $\alpha^4$ and the $1/m_e^8$ factors.

4. Calculate the total cross section.

5. The above result takes into account only an intermediate electron. Estimate the effect of

including the muon.

6. The cross section is numerically very small for visible light. Calculate the probability of

one light by light interaction when two laser beams of visible light (of 500 nm) cross each

other for one second. Assume that the beam cross section is $10^{-6}$ m$^2$ and that the density

of photons is $10^{21}$ photons per second per m$^2$.

7. At what photon energies do you expect the effect to become significant?


Question 3.2: A mirror world

Consider the following model:

1. The symmetry is a local U(1)EM × U(1)D.

2. The fields in the theory are four-component Dirac fermions: e(−1, 0) and d(0,−1).

Our notation is such that the first number is the charge under U(1)EM and the second number is

the charge under U(1)D. We denote the gauge bosons by Aµ and Cµ, respectively. We assume

that the coupling constants are equal, gEM = gD ≡ e.

1. Write down the covariant derivative, Dµ, for e and for d.

2. Draw the Feynman rules for this theory, i.e. draw all vertices from the interaction terms in

the Lagrangian. Be sure to label the fields.

3. We define, as usual, Fµν to be the field strength of Aµ, and we denote by Cµν that of Cµ.

Consider the term $C_{\mu\nu}F^{\mu\nu}$. Is that term gauge invariant? Is it Lorentz invariant? Based on

the answers to these two questions, can we write it in our Lagrangian?

We now assume that there is no $C_{\mu\nu}F^{\mu\nu}$ term. In that case we can write L = LEM + LD and

the two sectors are completely decoupled. In particular, the process $e\bar{e} \to d\bar{d}$ is not allowed as it

connects the two sectors.

4. We now add a scalar field S(−1, 1). Write down the covariant derivative Dµ for it, write down

its most general coupling to the fermions (up to dimension four), and draw the Feynman

rules.

5. Now the process $e\bar{e} \to d\bar{d}$ is allowed. Draw a tree level Feynman diagram for this process.

6. We assume that the mass of the scalar, $M_S$, is large, that is, $M_S \gg E$, where E is the center

of mass energy of the incoming electron. In this limit, how does $\sigma(e\bar{e} \to d\bar{d})$ scale with $M_S$?

7. We now add another fermion field, b(0,−2). Write down its couplings to the gauge bosons,

the scalar, and other fermions, and draw the corresponding Feynman rules (again, up to

dimension four).

8. There is no tree level diagram for the $e\bar{e} \to b\bar{b}$ process, but there are one loop ones. Draw

one such loop diagram.


The model that we considered in this homework is representative of a class of models where, in

addition to a sector with the particles and interactions known to us, one adds a sector that is

either completely decoupled from the first sector, or coupled to it only via very heavy degrees of

freedom. The (almost) decoupled sector is often called “dark sector.” Thus, Cµ would be called a

“dark photon”, and its interactions can be termed as “dark QED.”


Chapter 4

Non-Abelian symmetries

4.1 Basics

In previous chapters we discussed Abelian symmetries, such as U(1). For this type of symmetries,

also known as commutative symmetries, the result of applying two symmetry transformations does

not depend on the order in which they are applied. In this chapter we discuss non-Abelian sym-

metries, such as SU(2) and SU(3). For this type of symmetries, also known as non-commutative

symmetries, the result of applying two transformations might depend on the order of applying

them. Within the Standard Model, and in models that extend it, both Abelian and non-Abelian

symmetries play an important role.

In order to extend our discussion to non-Abelian symmetries, we use the language of Lie groups.

A short review of Lie groups is provided in an Appendix, aimed for self-study. From here on we

use this formalism assuming that the reader is familiar with it. We now emphasize some of the

differences between transformation laws for Abelian and non-Abelian symmetries.

• Consider a U(1) symmetry and a field φ that has charge q ≠ 0 under this symmetry. For

q ≠ 0, φ is necessarily a complex field. The transformation law for this field is of the form

$\phi \to e^{iq\theta}\phi$. (4.1)

Thus, for an Abelian symmetry, the transformation law is defined for each single field sep-

arately. A U(1) symmetry operation changes the phase of the field proportionally to the

charge.

• Consider an SU(M) symmetry and a field φ in a representation R of dimension N > 1 under

this symmetry. Here φ is a vector with N components, φi with i = 1, . . . , N . If R is a

complex representation, φ is a complex field, while if R is a real representation, φ is real.

The transformation law for this field is of the form

$\phi_i \to \left(e^{iT_a\theta_a}\right)_{ij}\phi_j$. (4.2)

Here i, j = 1, . . . , N and a = 1, . . . , ($M^2 - 1$). The Ta’s are the generators of the SU(M)

algebra:

$[T_a, T_b] = i f_{abc} T_c$. (4.3)

For the field φ in the N-dimensional representation R the Ta’s are represented by N × N

matrices. Thus, for a non-Abelian symmetry, the transformation law is defined for each

multiplet of fields separately. The SU(M) symmetry operations consist of rotations among

the various components within each multiplet.

If the symmetry group is not simple, we can consider an independent rotation by each simple

subgroup. Then, it is convenient to represent the field as a vector under each of the simple Lie

subgroups. When applying a transformation under one subgroup, it does not affect the other ones.

For example, consider an SU(3) × SU(2) group, and a field that is a triplet under SU(3) and a

doublet under SU(2). We can denote the field by φαi, where α = 1, 2, 3 is the SU(3)-triplet index,

and i = 1, 2 is the SU(2)-doublet index. We can write separately the transformation laws under

an SU(3) symmetry transformation and under an SU(2) symmetry transformation:

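$\phi_{\alpha i} \to \left(e^{(i/2)\lambda_a\theta_a}\right)_{\alpha\beta}\phi_{\beta i}$,

$\phi_{\alpha i} \to \left(e^{(i/2)\tau_b\theta_b}\right)_{ij}\phi_{\alpha j}$. (4.4)

Here $\lambda_a$ are the eight 3 × 3 Gell-Mann matrices, with $\frac{1}{2}\lambda_a$ the SU(3) generators in the triplet

representation, and $\tau_b$ are the three 2 × 2 Pauli matrices, with $\frac{1}{2}\tau_b$ the SU(2) generators in the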

doublet representation.
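
As a concrete check of the algebra in Eq. (4.3) for the simplest case, one can verify numerically that the doublet generators $T_b = \tau_b/2$ close under commutation, with the Levi-Civita symbol as structure constants. This is a sketch, not part of the original notes; the array names are illustrative.

```python
import numpy as np

# Pauli matrices tau_b; the doublet generators are T_b = tau_b / 2.
tau = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
T = tau / 2

# Structure constants of SU(2): the Levi-Civita symbol epsilon_abc.
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

def commutator(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

# Check [T_a, T_b] = i eps_abc T_c for every pair (a, b).
algebra_ok = all(
    np.allclose(commutator(T[a], T[b]),
                1j * np.einsum("c,cij->ij", eps[a, b], T))
    for a in range(3) for b in range(3)
)
print(algebra_ok)
```

The same loop works for the Gell-Mann matrices once the SU(3) structure constants are supplied.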

4.2 Global symmetries

For a Lagrangian that is invariant under a U(1) symmetry, each term in the Lagrangian must

consist of products of fields such that the sum of their charges is zero. For a Lagrangian that is

invariant under a non-Abelian symmetry, each term must consist of products of various represen-

tations that are contracted into a singlet of the symmetry group.

Consider, for example, two SU(2)-doublets, ρ and σ. Under SU(2), 2 × 2 = 1 + 3. Thus, the

product ρσ can be decomposed into singlet and triplet representations of SU(2). If we want to

construct the quadratic term ρσ, we must contract them into the singlet combination, $\epsilon_{ij}\rho_i\sigma_j$. If

the theory includes also an SU(2)-triplet ∆, then we can write a term ρσ∆. Now, ρ and σ should

be contracted into a triplet, ρiσj − δijρkσk, and then this triplet combination can be contracted

with ∆ to form a singlet (3× 3 = 1 + 3 + 5). In what follows, we do not write these contractions

explicitly. It should be understood that the contraction is the one needed to make each of the

Lagrangian terms a singlet under all imposed symmetries. In the simplest cases there is only one

way to do so.
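
The invariance of the singlet contraction $\epsilon_{ij}\rho_i\sigma_j$ can be checked numerically. The sketch below assumes random complex doublets and a randomly drawn SU(2) matrix; the helper names `random_su2` and `singlet` are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    z = rng.normal(size=4)
    z /= np.linalg.norm(z)
    a, b = z[0] + 1j * z[1], z[2] + 1j * z[3]
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])

# Antisymmetric tensor epsilon_ij.
eps = np.array([[0, 1], [-1, 0]])

def singlet(rho, sigma):
    """Contraction epsilon_ij rho_i sigma_j (no complex conjugation)."""
    return rho @ eps @ sigma

rho = rng.normal(size=2) + 1j * rng.normal(size=2)
sigma = rng.normal(size=2) + 1j * rng.normal(size=2)
U = random_su2(rng)

before = singlet(rho, sigma)
after = singlet(U @ rho, U @ sigma)
print(abs(before - after))
```

The contraction picks up a factor det U under the rotation, which equals 1 for SU(2); this is why the antisymmetric combination, and not $\rho_i\sigma_i$, is the singlet.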


To understand what kind of symmetries can be imposed, consider a model that has N complex

scalar fields φi. The kinetic terms are given by

$\mathcal{L}_{\rm kin} = \partial_\mu\phi_i^*\,\partial^\mu\phi_i$. (4.5)

The symmetry of Lkin is U(N) = SU(N)×U(1), under which the scalar fields transform as a single

N -plet of charge q. The imposed non-Abelian symmetry can only be a subgroup of the SU(N). If

we choose to impose the full SU(N)× U(1), we have

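$\mathcal{L} = \partial^\mu\phi^\dagger\partial_\mu\phi - m^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2$. (4.6)

Here φ transforms as $(N)_q$, the fundamental representation of SU(N) with charge q under the

U(1). What we mean by φ†φ is the contraction of $(\bar{N})_{-q} \times (N)_q$ into $(1)_0$ of SU(N) × U(1).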

As another example, consider a model that has N left-handed Weyl fermions ψLi and N right-

handed Weyl fermions ψRi. The kinetic terms are given by

$\mathcal{L}_{\rm kin} = i\,\overline{\psi_{Li}}\,\slashed{\partial}\,\psi_{Li} + i\,\overline{\psi_{Ri}}\,\slashed{\partial}\,\psi_{Ri}$. (4.7)

The symmetry of Lkin is SU(N)L×SU(N)R×U(1)L×U(1)R, under which the ψL fields transform

as $(N, 1)_{q_L,0}$, while the ψR fields transform as $(1, N)_{0,q_R}$. The imposed non-Abelian symmetry can

only be a subgroup of the SU(N)L×SU(N)R. If we choose to impose the full $[SU(N)\times U(1)]^2$, only

the kinetic terms are allowed. If, for example, we impose the vectorial subgroup [SU(N)×U(1)]V ,

under which the left-handed and right-handed fields transform in the same way, a Dirac mass is

allowed. We can now write the Lagrangian in terms of the Dirac field ψ, which transforms as (N)q

under [SU(N)× U(1)]V :

$\mathcal{L} = i\,\bar{\psi}\,\slashed{\partial}\,\psi - m\,\bar{\psi}\psi$. (4.8)

Let us introduce a more specific example. Consider a model where we impose an SU(3)

symmetry and introduce a single scalar triplet (3) denoted by φ. The conjugate field, φ†, transforms

as an anti-triplet ($\bar{3}$). To construct the (renormalizable part of the) Lagrangian we need to find all

possible combinations of two, three or four triplets and anti-triplets that can be contracted into

an SU(3) singlet. At dimension two in the fields there is a single such combination, $3 \times \bar{3}$. At

dimension three in the fields there are two such combinations, $3 \times 3 \times 3$ and $\bar{3} \times \bar{3} \times \bar{3}$. The singlet

in these latter contractions is antisymmetric under the exchange of the (anti-)triplets. This implies

that we cannot form a singlet if we have a single (or even two) (anti-)triplets. The most general

Lagrangian is then

$\mathcal{L} = \partial^\mu\phi^\dagger\partial_\mu\phi - m^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2$. (4.9)

Let us now consider the case of an SU(3) symmetry and three scalar triplets φi(3), i = 1, 2, 3.

Now there are trilinear scalar couplings:

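$\mathcal{L} = \partial^\mu\phi^\dagger\partial_\mu\phi - m^2_{ij}\,\phi_i\phi_j^* - \lambda_{ijkl}\,\phi_i^\dagger\phi_j\phi_k^\dagger\phi_l - [c_{ijk}\,\phi_i\phi_j\phi_k + \text{h.c.}]$. (4.10)

The $m^2$ and λ terms are totally symmetric in the i, j, k, l indices, while c is anti-symmetric. Thus,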

there is only one independent cijk coupling: cijk = εijkc.
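
The statement that the trilinear coupling reduces to a single number can be illustrated numerically. In the sketch below (an illustration under stated assumptions, with random field values standing in for the three triplets), the SU(3) singlet is built with the antisymmetric tensor in the triplet indices, and the result is compared with $\epsilon_{ijk}$ in flavor space.

```python
import numpy as np

# Totally antisymmetric tensor with eps[0, 1, 2] = +1.
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

rng = np.random.default_rng(2)
# Phi[i, alpha]: flavor index i, SU(3)-triplet index alpha (random stand-in values).
Phi = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# Singlet contraction eps_{alpha beta gamma} phi_i^alpha phi_j^beta phi_k^gamma.
C = np.einsum("abc,ia,jb,kc->ijk", eps, Phi, Phi, Phi)

# With a single triplet the same contraction vanishes identically.
single = np.einsum("abc,a,b,c->", eps, Phi[0], Phi[0], Phi[0])

print(np.allclose(C, eps * np.linalg.det(Phi)), abs(single))
```

The tensor `C` is proportional to $\epsilon_{ijk}$, so only the totally antisymmetric part of $c_{ijk}$ contributes, and for a single triplet the cubic term is absent, in line with Eq. (4.9).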


4.3 Local symmetries

Theories with non-Abelian local symmetries are also called Yang-Mills theories. There are several

similarities to the case of local Abelian symmetries:

• Terms that depend on scalar and fermion fields but not on their derivatives, and which

are invariant under the corresponding global symmetry, are also invariant under the local

symmetry.

• The kinetic terms are not invariant under the local symmetry.

• To achieve such invariance we must add gauge fields, and replace the derivative ∂µφ with a

covariant derivative Dµφ such that it transforms like the field φ.

The more complicated SU(N) transformation law for φ,

$\phi \to e^{iT_a\theta_a(x)}\phi$, (4.11)

where Ta are the $N^2 - 1$ generators of the SU(N) algebra, leads to a more complicated transformation

law for Dµ:

$D^\mu \to e^{iT_a\theta_a(x)}\,D^\mu\,e^{-iT_a\theta_a(x)}$. (4.12)

Since the Ta form the adjoint representation of SU(N), to achieve local invariance we need to

introduce gauge fields, $A^\mu_a$, in the adjoint representation. The covariant derivative is given by

$D^\mu = \partial^\mu + i g T_a A^\mu_a$, (4.13)

where g is a dimensionless parameter called the coupling constant. The index a runs from 1 to

$N^2 - 1$. The transformation law for $A^\mu_a$ is given by

$A^\mu_a \to A^\mu_a - f_{abc}\,\theta_b A^\mu_c - \frac{1}{g}\,\partial^\mu\theta_a$. (4.14)

The fact that the non-Abelian gauge field is in the adjoint representation of the gauge group and,

in particular, that – unlike the Abelian case – it is not a singlet, has significant consequences: It

leads to self-interactions of the gauge fields, as we discuss below.

To promote $A^\mu_a$ to a dynamical field, we must introduce a kinetic term. We define $F^{\mu\nu}_a$ via

$[D^\mu, D^\nu] = i g T_a F^{\mu\nu}_a$. (4.15)

Then

$T_a F^{\mu\nu}_a = \partial^\mu T_a A^\nu_a - \partial^\nu T_a A^\mu_a + i g\,[T_a, T_b]\,A^\mu_a A^\nu_b$. (4.16)

Using

$\mathrm{tr}(T_a T_b) = \delta_{ab}$, (4.17)

we can rewrite Eq. (4.16) as follows:

$F^{\mu\nu}_a = \partial^\mu A^\nu_a - \partial^\nu A^\mu_a - g f_{abc}\,A^\mu_b A^\nu_c$. (4.18)

Compared to the Abelian case, Eq. (2.29), the non-Abelian case, Eq. (4.18), has an extra term.

This term is the source of self-interactions of the gauge fields.

We are now ready to find the kinetic term of the non-Abelian gauge fields:

$\mathcal{L}_V = -\frac{1}{4}F^{\mu\nu}_a F_{a\mu\nu}$. (4.19)

The Lagrangian (4.19) is the most general renormalizable L(A) that is invariant under the non-

Abelian local symmetry.


• Given the kinetic term, Aµa is a dynamical field and its excitations are physical particles. For

example, as we show later, the gluon is associated with such excitations.

• We work in the canonically normalized basis where the coefficient of the kinetic term is −1/4.

• While a kinetic term is invariant under the gauge transformation, a mass-squared term, $\frac{1}{2}m^2 A^\mu_a A_{a\mu}$, is not. Local invariance under non-Abelian symmetry implies massless gauge

fields in the adjoint representation.

• The non-Abelian gauge fields couple to themselves. This can be seen by replacing $F^{\mu\nu}_a$ in Eq.

(4.19) with the explicit expression (4.18), which leads to interaction terms that are trilinear

and quartic in the gauge fields:

$\mathcal{L}_{\rm self-interactions} = g f_{abc}\,(\partial_\mu A_{a\nu})\,A^\mu_b A^\nu_c - \frac{1}{4}g^2\,(f_{abc}A^\mu_b A^\nu_c)(f_{ade}A_{d\mu}A_{e\nu})$. (4.20)

• If the symmetry group decomposes into several commuting factors, each factor has its own

gauge fields in the corresponding adjoint representation and an independent coupling con-

stant. For example, if the symmetry is SU(3) × SU(2) × U(1), we must introduce three

irreducible representations of gauge fields: $(8, 1)_0$ representation, with coupling constant $g_3$,

to make the Lagrangian invariant under the local SU(3); $(1, 3)_0$, with coupling constant $g_2$,

to achieve invariance under local SU(2); and $(1, 1)_0$, with coupling constant $g_1$, to achieve

invariance under local U(1).

The self-interactions of the non-Abelian gauge fields constitute a significant difference with the

Abelian case. As mentioned above, the source of this difference is the fact that the U(1) gauge

fields are neutral under the U(1), while the SU(N) gauge fields are in the adjoint (and not in the

singlet) representation of the SU(N).


4.4 Running coupling constants

In QFT, the coupling constants depend on the energy scale. In the Physics jargon, we say that

the coupling constants run. Here we do not discuss the theoretical background of this effect, and

assume that the reader is familiar with it from QFT classes. We just give some basic formulae and

mention some important consequence of the running of the coupling constants.

The running is given by the beta function:

$\beta(g) = \frac{\partial g}{\partial\log\mu}$, (4.21)

where µ is the relevant energy scale. The beta function depends on the field content of the

theory. The leading order effects depend only on fields that are charged under the symmetry. The

fact mentioned above, that Abelian gauge fields are neutral under the Abelian symmetry, while

non-Abelian gauge fields are in the adjoint representation of the gauge symmetry, has important

implications on the running of the corresponding coupling constants.

For a local U(1) theory, with coupling constant g1, and with nf chiral fermion fields with charge

|q| = 1, we have

$\beta(g_1) = \frac{n_f\,g_1^3}{24\pi^2}$. (4.22)

(This result is often quoted for the case of nf Dirac fermions, where the 24 is replaced by 12.)

For a local SU(N) theory, with coupling constant gN , and with nf fermions in the fundamental

representation and nf fermions in the anti-fundamental representation, we have

$\beta(g_N) = \left(\frac{2n_f}{3} - \frac{11N}{3}\right)\frac{g_N^3}{16\pi^2}$. (4.23)

The important difference between the running effects of the Abelian (4.22) and non-Abelian

(4.23) cases is in the sign of the beta function. In the U(1) case, the beta function is always positive.

Consequently, the lower the relevant energy scale, the smaller the coupling. In the SU(N) case it

can assume either sign. In particular, if the number of fermions is not too large, nf < (11/2)N

(and in particular in the pure gauge case), the sign is negative. In this case, the lower the relevant

energy scale, the larger the coupling.
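
The sign discussion can be made concrete by coding Eqs. (4.22) and (4.23) and integrating the one-loop equation analytically. The sketch below is illustrative only: the starting value g = 1 and the choice N = 3, nf = 6 are assumptions made for the example, not fitted inputs, and `run_coupling` is a hypothetical helper name.

```python
import math

def beta_u1(g1, n_f):
    """One-loop U(1) beta function, Eq. (4.22), for n_f chiral fermions with |q| = 1."""
    return n_f * g1**3 / (24 * math.pi**2)

def beta_sun(g, n_colors, n_f):
    """One-loop SU(N) beta function, Eq. (4.23)."""
    return (2 * n_f / 3 - 11 * n_colors / 3) * g**3 / (16 * math.pi**2)

def run_coupling(g0, b, mu_over_mu0):
    """Solve dg/dlog(mu) = b g^3 / (16 pi^2).

    Integrating gives 1/g^2(mu) = 1/g0^2 - (b / (8 pi^2)) log(mu/mu0).
    """
    inv_g2 = 1.0 / g0**2 - b / (8 * math.pi**2) * math.log(mu_over_mu0)
    if inv_g2 <= 0:
        raise ValueError("coupling is no longer perturbative at this scale")
    return 1.0 / math.sqrt(inv_g2)

# Assumed example: SU(3) with n_f = 6 and g = 1 at the reference scale mu0.
b_su3 = 2 * 6 / 3 - 11 * 3 / 3
g_low = run_coupling(1.0, b_su3, 0.1)    # scale ten times lower
g_high = run_coupling(1.0, b_su3, 10.0)  # scale ten times higher
print(beta_sun(1.0, 3, 6) < 0, g_low > 1.0 > g_high)
```

For this example the coefficient is negative, so the coupling grows toward the IR and shrinks toward the UV; with nf above (11/2)N the sign flips.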

Reliable calculations of experimental observables are possible only in the perturbative regime.

If a coupling grows large, one loses the ability to make accurate predictions. The question of the

running is then closely related to the question of when we can make accurate predictions. For

U(1) theories, this is the case at low energies (the so-called IR). For SU(N) theories, if there are

not too many fermions, this is the case at high energies (the UV), while in the IR perturbativity

is lost. We discuss the implications of this situation in the next chapter where we present QCD –

the theory of strong interactions – as a specific and important example.

Appendix

4.A Noether’s theorem

Let φi(x) be a set of fields, i = 1, 2, . . . , N , on which the Lagrangian L(φ) depends. Consider an

infinitesimal change δφi in the fields. This is a symmetry if

L(φ+ δφ) = L(φ). (4.24)

Since L depends only on φ and ∂µφ, we have

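$\delta\mathcal{L}(\phi) = \mathcal{L}(\phi + \delta\phi) - \mathcal{L}(\phi) = \frac{\delta\mathcal{L}}{\delta\phi_j}\,\delta\phi_j + \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_j)}\,\delta(\partial_\mu\phi_j)$. (4.25)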

The relation between symmetries and conserved quantities is expressed by Noether’s theorem: To

every symmetry in the Lagrangian there corresponds a conserved current. To prove the theorem,

one uses the equation of motion:

$\partial_\mu\frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_j)} = \frac{\delta\mathcal{L}}{\delta\phi_j}$. (4.26)

The condition for a symmetry is then

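$\partial_\mu\left[\frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_j)}\right]\delta\phi_j + \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_j)}\,\delta(\partial_\mu\phi_j) = \partial_\mu\left[\frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_j)}\,\delta\phi_j\right] = 0$. (4.27)

Thus, the conserved current ($\partial_\mu J^\mu = 0$) is

$J^\mu = \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_j)}\,\delta\phi_j$. (4.28)

The conserved charge ($\dot{Q} = 0$) is given by

$Q = \int d^3x\, J^0(x)$. (4.29)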

We are interested in unitary transformations,

φ→ φ′ = Uφ, UU † = 1. (4.30)

Here φ is a vector with N components, U is an N × N unitary matrix, and 1 stands for the N × N

unit matrix. The reason that we are interested in unitary transformations is that they keep the

canonical form of the kinetic terms. A unitary matrix can always be written as

$U = e^{i\epsilon_a T^a},\qquad T^{a\dagger} = T^a$, (4.31)

where $\epsilon_a$ are numbers and $T^a$ are N × N hermitian matrices. For an infinitesimal transformation

($\epsilon_a \ll 1$),

$\phi' \approx (1 + i\epsilon_a T^a)\phi \implies \delta\phi = i\epsilon_a T^a\phi$. (4.32)

A global symmetry is defined by $\epsilon_a$ = const (independent of x). For an internal symmetry, δ(∂µφ) = ∂µ(δφ). Thus,

for an internal global symmetry,

$\delta(\partial_\mu\phi) = i\epsilon_a T^a\,\partial_\mu\phi$. (4.33)

In the physics jargon, we say that ∂µφ transforms like φ. The conserved current is

$J^a_\mu = i\,\frac{\delta\mathcal{L}}{\delta(\partial^\mu\phi)}\,T^a\phi$. (4.34)

The matrices $T^a$ form the algebra of the symmetry group,

$[T^a, T^b] = i f^{abc}\,T^c$. (4.35)

The charges that are associated with these symmetries also satisfy the algebra:

$[Q^a, Q^b] = i f^{abc}\,Q^c$. (4.36)

Note that T a are N × N matrices, while Qa are operators in the Hilbert space where the theory

lives.

4.A.1 Free massless scalars

Consider N real, free, massless scalar fields φj:

$\mathcal{L}(\phi) = \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi)$. (4.37)

The index j is called flavor index, and here and in what follows the summation over it is implicit.

The theory is invariant under the group of orthogonal N × N matrices, which is the group of

rotations in an N-dimensional real vector space. This group is called SO(N). The generators $T^a_{jk}$

are the N(N − 1)/2 independent antisymmetric N × N imaginary matrices, that is

$\delta\phi = i\epsilon_a T^a\phi$, (4.38)

with T a antisymmetric and imaginary. (It must be imaginary so that δφ is real.) For an internal

global symmetry, the spacetime and the internal symmetry group are unrelated and thus ∂µφ

transforms like φ, that is

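$\delta(\partial_\mu\phi) = i\epsilon_a T^a(\partial_\mu\phi)$. (4.39)

Then,

$\delta\mathcal{L} = \frac{\delta\mathcal{L}}{\delta\phi}\,\delta\phi + \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi)}\,\delta(\partial_\mu\phi) = 0 + (\partial^\mu\phi)\,i\epsilon_a T^a\,(\partial_\mu\phi) = 0$, (4.40)

where in the last step we used the antisymmetry of $T^a$ in its flavor indices. The associated conserved

current is then

$J^a_\mu = i\,\partial_\mu\phi\,T^a\phi$. (4.41)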
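
The role played by the antisymmetry of the generators in Eq. (4.40) is easy to see numerically. The sketch below assumes N = 4 and uses a random real vector as a stand-in for the components of ∂µφ at one spacetime point; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4

# A real antisymmetric matrix A gives an imaginary antisymmetric generator T = i A.
A = rng.normal(size=(N, N))
A = A - A.T
T = 1j * A

# Stand-in for the N real components of d_mu(phi) at one spacetime point.
dphi = rng.normal(size=N)

# Variation of the Lagrangian as in Eq. (4.40), with the parameter eps set to 1.
delta_L = dphi @ (1j * T) @ dphi

# The field variation i T phi stays real because T is imaginary.
delta_phi = (1j * T) @ dphi

print(abs(delta_L), np.max(np.abs(delta_phi.imag)))
```

The quadratic form of an antisymmetric matrix vanishes for any vector, which is the content of the last step in Eq. (4.40).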

The SO(N) group that we have found is the largest possible symmetry for a Lagrangian

involving N real scalar fields. In general, mass and interaction terms will reduce the symmetry to

a subgroup of SO(N). In the presence of such SO(N) breaking terms, an SO(N) transformation

by a broken generator constitutes a change of basis, where the mass and interaction terms change

their form. The SO(N) groups have no important role in the SM. We mention SO(4) when we

discuss the Higgs mechanism. In a more advanced course, you may encounter SO(10) as a grand

unifying group.

4.A.2 Free massless Dirac fermions

Consider N free, massless, spin-1/2, four-component fermion fields $\psi_j$:

$\mathcal{L}(\psi) = i\,\bar{\psi}\,\slashed{\partial}\,\psi$ (4.42)

The ψ’s are necessarily complex because of the Dirac structure. The theory is invariant under the

group of unitary N × N matrices. This group is called U(N) = SU(N) × U(1). The generators

are the $N^2$ independent Hermitian matrices:

$\delta\psi = i\epsilon_a T^a\psi$, (4.43)

where $T^a$ is a general Hermitian matrix. The generators can be divided into $N^2 - 1$

traceless generators that correspond to the SU(N) algebra, and a single U(1) generator. The

transformation law of $\bar{\psi}$ is as follows:

$\delta\bar{\psi} = \delta(\psi^\dagger\gamma^0) = (i\epsilon_a T^a\psi)^\dagger\gamma^0 = \psi^\dagger(-i)\epsilon_a^* T^{a\dagger}\gamma^0 = -i\psi^\dagger\gamma^0\epsilon_a T^a = -i\bar{\psi}\,\epsilon_a T^a$, (4.44)

where we used the hermiticity of T a. The T a matrices and the γµ matrices commute because they

act on different spaces. We also need to derive the transformation property of the derivative. For

an internal symmetry,

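$\delta\slashed{\partial}\psi = \slashed{\partial}\delta\psi$. (4.45)

For an internal global symmetry ($\epsilon_a$ independent of x)

$\delta\slashed{\partial}\psi = i\epsilon_a T^a\slashed{\partial}\psi$. (4.46)

We then have

$\delta\mathcal{L} = \delta\bar{\psi}\,\frac{\delta\mathcal{L}}{\delta\bar{\psi}} + \delta(\slashed{\partial}\bar{\psi})\,\frac{\delta\mathcal{L}}{\delta(\slashed{\partial}\bar{\psi})} + \frac{\delta\mathcal{L}}{\delta\psi}\,\delta\psi + \frac{\delta\mathcal{L}}{\delta(\slashed{\partial}\psi)}\,\delta(\slashed{\partial}\psi)$. (4.47)

We use

$\delta\bar{\psi}\,\frac{\delta\mathcal{L}}{\delta\bar{\psi}} = (-i\bar{\psi}\,\epsilon_a T^a)(i\slashed{\partial}\psi) = \bar{\psi}\,\epsilon_a T^a\slashed{\partial}\psi$,

$\delta(\slashed{\partial}\bar{\psi})\,\frac{\delta\mathcal{L}}{\delta(\slashed{\partial}\bar{\psi})} = 0$,

$\frac{\delta\mathcal{L}}{\delta\psi}\,\delta\psi = 0$,

$\frac{\delta\mathcal{L}}{\delta(\slashed{\partial}\psi)}\,\delta(\slashed{\partial}\psi) = (i\bar{\psi})(i\epsilon_a T^a\slashed{\partial}\psi) = -\bar{\psi}\,\epsilon_a T^a\slashed{\partial}\psi$, (4.48)

and find that δL = 0. The corresponding conserved current is

$J^a_\mu = \bar{\psi}\,\gamma_\mu T^a\psi$. (4.49)

The charge associated with U(1), $\int d^3x\,\psi^\dagger\psi$, is the fermion number operator.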

In fact, the symmetry of N free massless Dirac fermions is larger: $[U(N)]^2$, rather than just a

single U(N). To understand this point, let us define the following projection operators:

$P_\pm = \frac{1}{2}(1 \pm \gamma_5)$. (4.50)

The four-component Dirac fermion can be decomposed into left-handed (LH) and right-handed

(RH) Weyl spinor fields, ψL and ψR:

$\psi_L = P_-\psi,\quad \psi_R = P_+\psi,\quad \overline{\psi_L} = \bar{\psi}P_+,\quad \overline{\psi_R} = \bar{\psi}P_-$. (4.51)
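
The projector properties behind Eqs. (4.50) and (4.51) can be verified with explicit matrices. The sketch below assumes the Dirac representation of the gamma matrices, a choice made here for illustration; the identities themselves are representation independent.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Gamma matrices in the Dirac representation (an assumption of this sketch).
g0 = np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, s], [-s, Z2]]) for s in (s1, s2, s3))
g5 = 1j * g0 @ g1 @ g2 @ g3

P_plus = (np.eye(4) + g5) / 2
P_minus = (np.eye(4) - g5) / 2

checks = {
    "P+ is a projector": np.allclose(P_plus @ P_plus, P_plus),
    "P- is a projector": np.allclose(P_minus @ P_minus, P_minus),
    "P+ P- = 0": np.allclose(P_plus @ P_minus, 0),
    "P+ + P- = 1": np.allclose(P_plus + P_minus, np.eye(4)),
    # (P- psi)^dagger g0 = psi^dagger g0 P+, i.e. psi_L-bar = psi-bar P+.
    "bar relation": np.allclose(P_minus.conj().T @ g0, g0 @ P_plus),
}
print(checks)
```

The last entry is the reason the barred left-handed field in Eq. (4.51) comes with $P_+$ rather than $P_-$.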

The Lagrangian of Eq. (4.42) can be written as follows:

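$\mathcal{L}(\psi) = i\,\overline{\psi_{Lj}}\,\slashed{\partial}\,\psi_{Lj} + i\,\overline{\psi_{Rj}}\,\slashed{\partial}\,\psi_{Rj}$. (4.52)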

It follows straightforwardly that this Lagrangian has a symmetry under independent rotations of

the left-chirality and right-chirality fields, namely

[SU(N)× U(1)]L × [SU(N)× U(1)]R. (4.53)

The conserved currents are

$J^a_{L\mu} = \overline{\psi_L}\,\gamma_\mu T^a_L\,\psi_L,\qquad J^a_{R\mu} = \overline{\psi_R}\,\gamma_\mu T^a_R\,\psi_R$. (4.54)

The symmetry (4.53) is the largest possible symmetry group for a Lagrangian of N Dirac fermions.

In general, mass and interaction terms break the symmetry to a subgroup of (4.53). Transforma-

tions with broken [SU(N)× U(1)]L × [SU(N)× U(1)]R generators constitute a change of basis.

The ψL and ψR are eigenstates of the chirality operator γ5 (with eigenvalues −1 and +1,

respectively). For massless fields, they are also helicity eigenstates. To see that, consider a plane

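wave traveling in the z direction, $p^0 = p^3$, and $p^1 = p^2 = 0$. The Dirac equation in momentum

space is $\slashed{p}\psi = 0$, so $p^0(\gamma^0 - \gamma^3)\psi = 0$, or

$\gamma^0\psi = \gamma^3\psi$. (4.55)

The spin angular momentum in the z direction is

$J^3 = \sigma^{12}/2 = i\gamma^1\gamma^2/2$. (4.56)

Then

$J^3\psi_L = \frac{i}{2}\gamma^1\gamma^2\psi_L = \frac{i}{2}\gamma^0\gamma^0\gamma^1\gamma^2\psi_L = \frac{i}{2}\gamma^0\gamma^1\gamma^2\gamma^0\psi_L = \frac{i}{2}\gamma^0\gamma^1\gamma^2\gamma^3\psi_L = \frac{1}{2}\gamma_5\psi_L = -\frac{1}{2}\psi_L$. (4.57)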

We learn that ψL describes a massless particle with helicity −1/2. Similarly, ψR describes a

massless particle with helicity +1/2.
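
The chain of equalities in Eq. (4.57) can be checked with explicit matrices. The sketch below again assumes the Dirac representation, and it builds a solution of Eq. (4.55) by using the fact that $(\gamma^0 - \gamma^3)^2 = 0$; the random spinor `w` is only a seed for constructing such a solution.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Gamma matrices in the Dirac representation (an assumption of this sketch).
g0 = np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, s], [-s, Z2]]) for s in (s1, s2, s3))
g5 = 1j * g0 @ g1 @ g2 @ g3

# Spin along z, Eq. (4.56).
J3 = 0.5j * g1 @ g2

rng = np.random.default_rng(4)
w = rng.normal(size=4) + 1j * rng.normal(size=4)

# (g0 - g3) squares to zero, so psi = (g0 - g3) w satisfies g0 psi = g3 psi.
psi = (g0 - g3) @ w

# Left-handed component and the residual of J3 psi_L = -(1/2) psi_L.
psi_L = (np.eye(4) - g5) / 2 @ psi
helicity_residual = J3 @ psi_L + 0.5 * psi_L
print(np.max(np.abs(helicity_residual)))
```

Replacing the projector by $(1 + \gamma_5)/2$ gives the right-handed component, for which the analogous residual with helicity +1/2 vanishes.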

4.A.3 Free massive Dirac fermions

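Consider N free spin-1/2, four-component fermion fields $\psi_j$ with universal mass m:

$\mathcal{L} = i\,\bar{\psi}\,\slashed{\partial}\,\psi - m\,\bar{\psi}\psi = i\,\overline{\psi_L}\,\slashed{\partial}\,\psi_L + i\,\overline{\psi_R}\,\slashed{\partial}\,\psi_R - m\,\overline{\psi_L}\psi_R - m\,\overline{\psi_R}\psi_L$, (4.58)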

where the “flavor” index j is omitted. This Lagrangian is invariant under the symmetry in which

the LH and RH fields rotate together, U(N) = SU(N) × U(1). We learn that a universal mass

term breaks $[U(N)]^2 \to U(N)$. This U(N) symmetry is actually the one identified in Eq. (4.43),

with the conserved current of Eq. (4.49).

For a general, non-universal, mass term the symmetry is smaller. By performing an [SU(N) × U(1)]L × [SU(N) × U(1)]R transformation, the mass matrix M is modified:

$M \to V_L M V_R^\dagger$, (4.59)

where VL,R are unitary matrices. We can always choose a basis in this way where the mass matrix

is diagonal and real. In this basis,

$\mathcal{L} = i\,\overline{\psi_L}\,\slashed{\partial}\,\psi_L + i\,\overline{\psi_R}\,\slashed{\partial}\,\psi_R - m_i\left(\overline{\psi_{Li}}\psi_{Ri} + \overline{\psi_{Ri}}\psi_{Li}\right) = \bar{\psi}_i\,(i\slashed{\partial} - m_i)\,\psi_i$. (4.60)

The symmetry is $[U(1)]^N$. The conserved currents are

$J^\mu_i = \bar{\psi}_i\,\gamma^\mu\psi_i$, (4.61)

and the corresponding conserved charges are simply the fermion number for each fermion (mass

eigenstate) separately,

$Q_i = \int d^3x\,\psi_i^\dagger\psi_i$. (4.62)

Within the SM, this is the case of lepton flavor symmetry, which ensures that the flavor of the

leptons (namely, e, µ and τ) is conserved. There is also a similar flavor symmetry of the quark

sector, which is respected by the strong and the EM forces but not by the weak force.

The Lagrangian of Eq. (4.60) is the most general renormalizable Lagrangian that includes only

Dirac fermion fields [see Eq. (1.3)]. Thus, in the absence of Yukawa interactions, any renormalizable

Lagrangian that has N Dirac fermion fields has an accidental $[U(1)]^N$ symmetry.

Homework

Question 4.1: Running and GUT

Add a question about running and GUT

Question 4.2: Scalar Lagrangians

Consider a system with eight real scalar fields φi (i = 1, . . . , 8).

1. Write the kinetic term for this system. What is the symmetry of the kinetic term?

2. To impose a U(1) symmetry under which the scalars are charged, we must group the scalars

into U(1) representations. That is, we combine two real scalar fields to form a complex scalar

field. How many complex scalar fields are there?

3. We assign the scalar fields charges 1, 4, 10, 16. Write the most general Lagrangian that

satisfies the requirements we listed in class.

4. The above Lagrangian has an accidental symmetry. What is that accidental symmetry?

Write non-renormalizable terms that break the accidental symmetry completely.

5. We now impose an SO(8) symmetry and assign the eight scalars into a fundamental of SO(8).

That is, they form a vector in eight dimensional real space. Write the most general La-

grangian in this case. Explain why the mass and the interaction terms must be proportional

to a unit matrix.

6. Let us consider only the kinetic and mass terms. When we imposed SO(8) the mass matrix

was proportional to the unit matrix. Now, instead, we take the mass matrix to be

$$m^2 = \mathrm{diag}(m_1^2, m_1^2, m_1^2, m_1^2, m_2^2, m_2^2, m_2^2, m_2^2), \qquad m_1^2 \neq m_2^2. \tag{4.63}$$

What is the symmetry of this Lagrangian?


7. Last, we would like to impose a U(2) symmetry. For this, we think about the model as

having four complex fields and assign each two complex fields into an SU(2) doublet

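$$H_1 = \frac{1}{\sqrt2}\begin{pmatrix}\phi_1 + i\phi_2\\ \phi_3 + i\phi_4\end{pmatrix}, \qquad H_2 = \frac{1}{\sqrt2}\begin{pmatrix}\phi_5 + i\phi_6\\ \phi_7 + i\phi_8\end{pmatrix}. \tag{4.64}$$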

The U(1) charge of the two doublets can be different. There is one four-scalar interaction

term that we can write that breaks the SO(8) symmetry. Write this term.

Question 4.3: Discrete sub-symmetries

Consider

$$\mathcal{L} = \bar\psi_i[(i\slashed{\partial} - m)\delta_{ij}]\psi_j \tag{4.65}$$

where the flavor indices i, j run from 1 to 2. This Lagrangian is manifestly invariant under

some discrete symmetries in flavor space. These symmetries, however, are part of the U(2) flavor

symmetry. For example, the symmetry rotation ψi → −ψi can be generated by the U(1) symmetry:

$U = e^{i\alpha}$ with α = π. Find the U(2) = SU(2)×U(1) rotations for the following two symmetries:

(i) ψ1 → ψ1, ψ2 → −ψ2 (4.66)

(ii) ψ1 ↔ ψ2 .

Hints: You can write your answer as a series of several rotations. The following identity can also

be useful

$$e^{i\alpha\sigma_i} = I\cos\alpha + i\sigma_i\sin\alpha \tag{4.67}$$

Question 4.4: Flavor symmetries

Consider a system with 2 Dirac fermions. The most general Lagrangian is

$$\mathcal{L} = \bar\psi_i(i\slashed{\partial}\,\delta_{ij} - m_{ij})\psi_j \tag{4.68}$$

where the flavor indices i, j run from 1 to 2. The matrix mij has to be Hermitian and thus we

need four real parameters to describe it.

1. The flavor symmetry of this Lagrangian is U(1) × U(1). However, this symmetry is not

manifest. To make it manifest proceed as follows. Write

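$$m_{\rm diag} = U m U^\dagger \tag{4.69}$$

such that $m_{\rm diag}$ is diagonal. Define ψ′ = Uψ and show that $\mathcal{L}$ in (4.68) is equal to

$$\mathcal{L} = \bar\psi'_i[i\slashed{\partial}\,\delta_{ij} - (m_{\rm diag})_{ij}]\psi'_j \tag{4.70}$$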


2. In (4.70) the two U(1) symmetries are manifest. They are the independent rotations of each

flavor. One simple way to define them is

$$\psi_i \to e^{i\alpha_i}\psi_i \quad \text{for } i = 1, 2 \tag{4.71}$$

Show that these two U(1) symmetries can also be written as $e^{i\alpha_0}$ and $e^{i\alpha_3\sigma_3}$.

3. When m is proportional to the unit matrix, the flavor symmetry is SU(2)×U(1). In the most

general case the symmetry is only U(1)×U(1). In the above parametrization the two broken

symmetries are those corresponding to σ1 and σ2. Show by explicit calculation that they are

broken (it is enough to show it only for one of them).

4. The two broken symmetry generators can be used to generate U of (4.69). Namely, we can

write

$$U = e^{i\alpha_2\sigma_2}e^{i\alpha_1\sigma_1}, \tag{4.72}$$

such that α1 and α2 are given in terms of the parameters of $m_{ij}$. (We usually say that we used the symmetries to perform basis rotations.) Find α1 and α2. Hint: it may be easier to perform the two rotations one after the other. First use $U_1 = e^{i\alpha_1\sigma_1}$ to make m real and symmetric, and then use $U_2 = e^{i\alpha_2\sigma_2}$ to diagonalize it.

Question 4.5: A system with fermion and scalars

Consider a system with N Dirac fermion fields and one real scalar field, φ. Write the fermions

as a vector in flavor space ψT = (ψ1, ψ2, . . . , ψN). We impose a U(1) symmetry such that the

fermions are charged under this U(1) and the scalar is not. The most general Lagrangian for this

system is

$$\mathcal{L} = \bar\psi_i[i\slashed{\partial}\,\delta_{ij} - m_{ij} + \lambda_{ij}\phi]\psi_j + \mathcal{L}_S, \tag{4.73}$$

where i, j are flavor indices, λij are the Yukawa couplings and LS includes the kinetic, mass and

self interaction terms for the scalar field.

1. Generally, the flavor symmetry of (4.73) is U(1). Show that when λij ∝ mij the symmetry

is larger, $[U(1)]^N$.

2. When we add terms to a Lagrangian that break some of the symmetries the following relation

holds: “The number of new physical parameters is equal to the number of total (physical and

unphysical) new parameters minus the number of broken generators.” Namely, we “use” the

broken generators to remove unphysical parameters. Here we can think about an original

model with only kinetic terms that has an SU(N)×U(1) flavor symmetry, and then the mass

and Yukawa terms were added. Show that the number of physical parameters in the theory

described by (4.73) is $N^2 + 1$.


3. In the N = 1 case you should find that there are two physical parameters, that you can think

of as m and λ. More generally, you see that we can always find a basis where m is diagonal

but λ is not. Show that for N = 2 we can find a basis where λ is real. For N = 3, however,

we cannot do it in the most general case. We can always write any complex number as a

real number and a phase. What is the minimal number of phases that we need in the N = 3

case?

4. Find a way to modify the theory such that $m_{ij} = 0$ but allow for $\lambda_{ij} \neq 0$ in the Lagrangian

(4.73). Explain why this modification must involve a chiral symmetry.


Chapter 5

QCD

Quantum ChromoDynamics (QCD), the theory of strong interactions, is our first example of a

non-Abelian theory. The theory, however, has additional interesting aspects. In particular, at low

energy it is not perturbative, and the perturbative approach that we use cannot be applied. In

this chapter we define the theory at the UV in a perturbative way. We discuss the IR aspects in

Chapter 9.

5.1 Defining QCD

QCD is defined as follows:


(i) The symmetry is a local

SU(3)C. (5.1)

(ii) There are six left-handed and six right-handed quark fields,

QLi(3), QRi(3), i = 1, ..., 6. (5.2)


5.2 The Lagrangian

The most general renormalizable Lagrangian with fermion and gauge fields can be decomposed

into

L = Lkin + Lψ. (5.3)

It is now our task to find the specific form of the Lagrangian made of the fermion fields QLi and

QRi subject to the SU(3)C gauge symmetry.


The imposed SU(3)C gauge symmetry requires that we include a gauge field in the adjoint representation of SU(3)C, $G^\mu_a$, that we call the gluon field. The corresponding field strength is given by

$$G^{\mu\nu}_a = \partial^\mu G^\nu_a - \partial^\nu G^\mu_a - g_s f_{abc} G^\mu_b G^\nu_c. \tag{5.4}$$

Here, gs is the coupling constant, and fabc are the structure constants of SU(3). The covariant

derivative is

$$D^\mu = \partial^\mu + ig_sG^\mu_aL_a, \tag{5.5}$$

where $L_a$, a = 1, . . . , 8, are the SU(3) generators, $[L_a, L_b] = if_{abc}L_c$. In the triplet representation, $L_a = \frac{1}{2}\lambda_a$, where $\lambda_a$ are the eight 3 × 3 Gell-Mann matrices given explicitly in Eq. (A.19). The covariant derivative is the same for all the fermions in the theory and is given by

$$D^\mu = \partial^\mu + \frac{i}{2}g_sG^\mu_a\lambda_a. \tag{5.6}$$
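The commutation relation of the generators can be checked numerically for one choice of indices. The sketch below assumes the standard form of the first three Gell-Mann matrices and the standard value $f_{123} = 1$ (Eq. (A.19) itself is not reproduced here), and the helper names are illustrative only.

```python
# Check [L_1, L_2] = i f_123 L_3 with f_123 = 1, using L_a = lambda_a / 2.
# Assumption: standard Gell-Mann matrices lambda_1, lambda_2, lambda_3.

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def scale(c, a):
    """Multiply every entry of the matrix a by the number c."""
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

lambda_1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
lambda_2 = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
lambda_3 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]

L1, L2, L3 = (scale(0.5, m) for m in (lambda_1, lambda_2, lambda_3))

lhs = commutator(L1, L2)
rhs = scale(1j, L3)  # i * f_123 * L_3 with f_123 = 1
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(max_diff)
```

The first three generators close among themselves, which is the SU(2) subalgebra of SU(3); the remaining structure constants can be checked the same way once all eight matrices are entered.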


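$$\mathcal{L}_{\rm kin} = -\frac{1}{4}G^{\mu\nu}_a G_{a\mu\nu} + i\overline{Q_{Li}}\slashed{D}Q_{Li} + i\overline{Q_{Ri}}\slashed{D}Q_{Ri}. \tag{5.7}$$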

Lψ includes Dirac mass terms for the quarks:

$$\mathcal{L}_\psi = -m^Q_{ij}\,\overline{Q_{Li}}Q_{Rj} + \text{h.c.} \tag{5.8}$$

We can always, however, make a unitary transformation of the fermion fields,

$$Q_{Li} \to (V_L)_{ij}Q_{Lj}, \qquad Q_{Ri} \to (V_R)_{ij}Q_{Rj}, \tag{5.9}$$

such that in the new basis the mass matrix is diagonal (this basis is called the mass basis):

$$V_L m^Q V_R^\dagger = \mathrm{diag}(m_u, m_d, m_s, m_c, m_b, m_t). \tag{5.10}$$

The transformation (5.9) by unitary matrices preserves the canonical normalization of the kinetic

terms. We call the mass eigenstates the up u, down d, strange s, charm c, bottom b and top t

quarks where, by definition, mu < md < ms < mc < mb < mt.

In the language of the quark Dirac mass eigenstate fields, $q \equiv (Q_L, Q_R)^T$, the full QCD Lagrangian reads

$$\mathcal{L}_{\rm QCD} = -\frac{1}{4}G^{\mu\nu}_a G_{a\mu\nu} + i\bar q\slashed{D}q - m_q\bar q q, \qquad q = u, d, s, c, b, t. \tag{5.11}$$

5.3 The spectrum

When discussing the spectrum of QCD, the energy scale at which it is probed makes a significant

difference. The reason is that the strong coupling constant (uniquely among the SM coupling

constants) becomes stronger at lower energy or, equivalently, the larger the distance between


(anti)quarks. This leads to confinement: quarks and antiquarks bind into color-singlet states.

Here we discuss the short distance spectrum. The IR spectrum is discussed further below.

At short distance, the spectrum of QCD consists of six massive Dirac quarks of mass mq in the

color-triplet representation, and a massless gluon, in the color-octet representation. We emphasize

the following points:

• Since each of the six quarks is a color-triplet Dirac fermion, it has twelve DoF.

• The reason that we can write mass terms for all six quarks is that QCD is vectorial, that is,

the QL and QR fields are both color-triplets.

• The masslessness of the gluon is a consequence of the gauge symmetry. Since it is a color-octet

gauge boson, it has sixteen DoF.


Expanding Dµ, we obtain the quark–gluon interaction term:

$$\mathcal{L}_{\rm int} = -\frac{g_s}{2}\,\bar q_i\lambda_a\slashed{G}_a q_i. \tag{5.12}$$

Expanding Gµνa Gaµν we obtain the gluon self-interactions:

$$\mathcal{L}_{\rm self\text{-}int} = g_s f_{abc}(\partial^\mu G^\nu_a)G_{\mu b}G_{\nu c} - \frac{g_s^2}{4}(f_{abc}G^\mu_b G^\nu_c)(f_{ade}G_{\mu d}G_{\nu e}). \tag{5.13}$$

We emphasize the following points:

• Both the quark–gluon interaction and the gluon self-interaction depend on the same coupling

constant, gs.

• The quark–gluon interaction is vector-like, diagonal and universal.

• The strong interaction conserves P , T , and C and any combination of them.

The QCD Lagrangian has seven free parameters: the six masses, mq, and the coupling constant,

$g_s$ or, equivalently, $\alpha_s \equiv g_s^2/(4\pi)$. The values of the parameters depend on the scale where they

are measured, and on the renormalization scheme we are using. We do not discuss these issues

here, and just quote the numbers from the PDG:

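$$m_u = 2.3^{+0.7}_{-0.5}\ \text{MeV}, \qquad m_d = 4.8^{+0.5}_{-0.3}\ \text{MeV}, \qquad m_s = 95 \pm 5\ \text{MeV},$$

$$m_c = 1.275 \pm 0.025\ \text{GeV}, \qquad m_b = 4.18 \pm 0.03\ \text{GeV}, \qquad m_t = 160^{+5}_{-4}\ \text{GeV}. \tag{5.14}$$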

The value of the coupling constant is given by

$$\alpha_s(m_Z^2) = 0.1185 \pm 0.0006, \tag{5.15}$$

where mZ ∼ 91 GeV is the mass of the Z boson that we discuss in Chapter 7.


5.4.1 Confinement

We do not observe free quarks in Nature. To explain it one has to postulate that all asymptotic

states are singlets of SU(3)C . This is the confinement hypothesis: quarks, which are SU(3)C

triplets, must be confined within color-singlet bound states.

The confinement hypothesis is consistent with the way the strong coupling runs. The beta

function for the SU(N) coupling constant is given in Eq. (4.23). For QCD, where the gauge group

is SU(3) and the number of relevant quarks at µ > mt is six, we have

$$\beta(g_s)(\mu > m_t) = -\frac{7g_s^3}{16\pi^2} < 0. \tag{5.16}$$

For lower energy scales, where the number of relevant quarks is smaller, $\beta(g_s)$ becomes even more negative, $\beta(g_s) = -(33 - 2n_q^{\rm light})\,g_s^3/(48\pi^2)$, where $n_q^{\rm light}$ is the number of quarks with $m_q < \mu$. The

fact that the beta function is negative is very important. It implies that the strong coupling

constant becomes larger for lower energies. This behavior is what leads to confinement: when two

quarks are traveling away from each other, at some point it becomes energetically favorable to pop

up a quark-antiquark pair from the vacuum. The result is that we do not observe free quarks at

the IR. All quarks are confined into color-singlet objects.

The scale where the transition between the UV and IR description of QCD occurs is called

$\Lambda_{\rm QCD}$. Roughly speaking, it is the scale where the coupling constant becomes strong, $g_s \approx 4\pi$. It

occurs around ΛQCD ≈ 300 MeV (The value depends on the renormalization scheme). For most

practical purposes, however, the important question is at what energy scales we can treat QCD as

a perturbative theory. This is the case for processes where $q^2 \gg \Lambda_{\rm QCD}^2$.
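The growth of the coupling toward low energies can be made concrete with a one-loop sketch. It integrates the beta function quoted above, rewritten for $\alpha_s = g_s^2/(4\pi)$, from the input of Eq. (5.15). The function names are illustrative, and keeping the number of light quarks fixed at five (no flavor thresholds) is a simplifying assumption made here, so the numbers are only indicative.

```python
import math

def alpha_s_one_loop(mu, alpha_ref=0.1185, mu_ref=91.0, n_light=5):
    """One-loop strong coupling at scale mu (GeV), fixed number of light quarks.

    Solves d(alpha_s)/d(ln mu) = -(33 - 2 n_light) alpha_s^2 / (6 pi), which
    follows from beta(g_s) = -(33 - 2 n_light) g_s^3 / (48 pi^2).
    """
    b = (33 - 2 * n_light) / (6 * math.pi)
    inverse = 1.0 / alpha_ref + b * math.log(mu / mu_ref)
    if inverse <= 0:
        raise ValueError("mu is at or below the one-loop pole")
    return 1.0 / inverse

def one_loop_pole(alpha_ref=0.1185, mu_ref=91.0, n_light=5):
    """Scale (GeV) at which the one-loop coupling formally diverges."""
    b = (33 - 2 * n_light) / (6 * math.pi)
    return mu_ref * math.exp(-1.0 / (b * alpha_ref))

# The coupling grows as mu decreases (assumption: no flavor thresholds).
for mu in (91.0, 10.0, 2.0):
    print(mu, alpha_s_one_loop(mu))
print("one-loop pole (GeV):", one_loop_pole())
```

The pole of the one-loop formula signals the breakdown of perturbation theory rather than a physical divergence; its location depends on the loop order, the threshold treatment and the renormalization scheme.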

5.5 Accidental symmetries

The kinetic term of QCD has a large accidental global symmetry, $U(6)_L \times U(6)_R$. The mass terms break it, in general, to $[U(1)]^6$:

$$U(1)_u \times U(1)_d \times U(1)_s \times U(1)_c \times U(1)_b \times U(1)_t. \tag{5.17}$$

In the mass basis, Eq. (5.11), this symmetry is manifest.

The following comments are in order:

• The $[U(1)]^6$ symmetry predicts that the quarks are stable.

• More generally, the accidental symmetry implies that quark flavor is conserved. For example, $u\bar u \to c\bar c$ is allowed, but $u\bar c \to c\bar u$ is not.

• Both predictions are violated in Nature. In fact, the weak interaction breaks the $[U(1)]^6$ symmetry down to a single U(1) called baryon number. This $U(1)_B$ is one where all quarks rotate with the same phase. It is also called the diagonal U(1). This symmetry breaking by the

weak interactions allows the QCD-forbidden processes to be mediated by weak interactions.

• At and below the scale of present experiments, the weak interactions are considerably weaker

than the strong interactions. Thus, the QCD Lagrangian and its accidental $[U(1)]^6$ symmetry

constitute a good approximation.

• At low energies, the QCD Lagrangian has even larger approximate accidental symmetries:

isospin, SU(3)-flavor and heavy quark symmetry. These are discussed later.

• Similarly to QED, QCD also conserves C, P and CP since it is a vectorial theory with masses

and gauge interaction only.

5.6 Combining QCD with QED

In this section we combine QCD, the local SU(3)C theory of strong interactions, with QED, the

local U(1)EM theory of the electromagnetic (EM) interactions. The model describes the quark

spectrum and interactions in Nature, neglecting the weak interactions.

The model is defined as follows:


(i) The symmetry is a local

SU(3)C × U(1)EM. (5.18)

(ii) There are six left-handed and six right-handed quark fields,

$$U_{Li}(3)_{+2/3}, \quad U_{Ri}(3)_{+2/3}, \quad D_{Li}(3)_{-1/3}, \quad D_{Ri}(3)_{-1/3}, \qquad i = 1, 2, 3. \tag{5.19}$$


The resulting Lagrangian combines QED with QCD. The imposed symmetry requires that we

include the following nine gauge field DoF:

$$G^\mu_a(8)_0, \qquad A^\mu(1)_0. \tag{5.20}$$

The covariant derivative acting on the fermion fields is given by

$$D^\mu = \partial^\mu + \frac{i}{2}g_sG^\mu_a\lambda_a + ieqA^\mu, \tag{5.21}$$

where q = +2/3(−1/3) for the Ui(Di) fields.


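$$\mathcal{L}_{\rm kin} = -\frac{1}{4}G^{\mu\nu}_a G_{a\mu\nu} - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + i\overline{U_{Li}}\slashed{D}U_{Li} + i\overline{U_{Ri}}\slashed{D}U_{Ri} + i\overline{D_{Li}}\slashed{D}D_{Li} + i\overline{D_{Ri}}\slashed{D}D_{Ri}. \tag{5.22}$$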


From the kinetic term we read off the gauge interactions. The gluon interactions are identical to

the pure QCD Lagrangian and the photon interactions are identical to the pure QED Lagrangian

(except that the quark charges are different from those of the electron). The mass terms are the

same as in the QCD Lagrangian. In terms of the Dirac mass eigenstate quark fields, they read

$$\mathcal{L}_\psi = -m_u\bar uu - m_c\bar cc - m_t\bar tt - m_d\bar dd - m_s\bar ss - m_b\bar bb, \tag{5.23}$$

where u, c, t are the three q = +2/3 fields and d, s, b are the three q = −1/3 fields.

The resulting theory is not much different from just putting together $\mathcal{L}_{\rm QCD} + \mathcal{L}_{\rm QED}$. The global symmetry of the kinetic terms is now $[U(3)]^4$ (instead of the $[U(6)]^2$ symmetry of the pure QCD case), but the mass terms break it to the same symmetry as in QCD, $[U(1)]^6$.

We can also add the three charged leptons to this model. They are in the $(1)_{-1}$ representation of SU(3)C × U(1)EM. The terms that involve them are the same as in Eq. (3.18).


Homework

Question 5.1: β function

In class we discussed the idea that coupling constants run, that is, that higher order corrections

can be absorbed into the definition of the coupling constant. For example, consider the cross section

for e+e− → µ+µ−. In the CM frame and to leading order, this is given by (up to normalization

factors)

$$\sigma = \frac{\alpha^2}{E^2}, \tag{5.24}$$

where E is the energy of the electron. Higher order effects change this result. Most of the effect can be absorbed into the running of α, that is

$$\sigma = \frac{\alpha(\mu)^2}{E^2}, \tag{5.25}$$

where µ ∼ E is the energy scale in the problem, and α(µ) is a running coupling constant that

satisfies the differential equation

$$\frac{\partial\alpha}{\partial\log(\mu)} = \beta(\alpha), \tag{5.26}$$

where the beta function can be calculated to the desired precision in perturbation theory. In QED

at one loop with only electrons in the loop we have

$$\beta(\alpha) = B\alpha^2, \qquad B = \frac{2}{3\pi}. \tag{5.27}$$

1. Verify that the solution of the beta function equation is

$$\frac{1}{\alpha(\mu_1)} = \frac{1}{\alpha(\mu_2)} + B\log\left(\frac{\mu_2}{\mu_1}\right). \tag{5.28}$$

2. Use $\alpha(m_e) \approx 1/137.0$ to calculate $\alpha(m_Z)$, where $m_Z \sim 91$ GeV.

3. Find the Landau pole, that is, find µ where α(µ) → ∞. What can you say about this

infinity?


Measurements found that $\alpha(m_Z) \approx 1/128$. The reason for the disagreement with your result above is that there are other particles in the loop besides the electrons. The generalization of Eq. (5.28) is

$$\frac{1}{\alpha(\mu_1)} = \frac{1}{\alpha(\mu_2)} + B\sum_i Q_i^2 N_C^i \log\left(\frac{m_i}{\mu_1}\right), \tag{5.29}$$

where the sum over i runs over all the charged particles with mass below $\mu_1$, i.e. we assume $\mu_2 \le m_i < \mu_1$. $N_C^i = 1\,(3)$ for leptons (quarks), and $Q_i$ is the electric charge.

4. Give a physical argument why we only sum over particles with mass less than µ1.

5. Use the physical masses and charges of the known fermions

q = −1 : $m_\ell$ ∼ (0.5, 100, 1777) MeV,

q = 2/3 : mu ∼ (0.3, 1.4, 174) GeV,

q = −1/3 : md ∼ (0.3, 0.4, 4.2) GeV, (5.30)

and calculate α(mZ). How close is your number to the measured value? (Note that mu, md

and ms are much larger than what you find in PDG, because the correct quark masses to

use here are “valence quark masses”, which is not what the PDG gives.)

We now move to QCD where the beta function is more complicated

$$B = -\left(11 - \frac{2n_f}{3}\right)\frac{1}{2\pi}, \tag{5.31}$$

where $n_f$ is the number of quark flavors with masses below the relevant scale. Below we use the input $\alpha_s(m_Z) \sim 0.12$.

6. Note that the sign of the beta function depends on the number of flavors. How many flavors are needed to change the sign of the beta function?

7. Sketch the shape of the function $\alpha_s(\mu)$ for µ between 1 and $10^4$ GeV for (i) a theory with

fewer flavors, and (ii) a theory with more flavors than this critical value. Use log scale for µ.

8. Estimate ΛQCD, that is, the scale where αs = 1. For simplicity, you can neglect all quark

masses but that of the top.

9. The proton mass is very roughly mP ≈ 3ΛQCD. Can you tell if the mass of the proton would

be lighter or heavier if we did not have the third generation, assuming the same measured

value of αs(mZ)?


Chapter 6

Spontaneous Symmetry Breaking

The notion of broken symmetries may seem strange: In what sense is there a difference between

the case that we call “broken symmetry” and the case of not having the symmetry at all? The

idea of a broken symmetry is however meaningful in two scenarios:

• Explicit breaking of a symmetry by a small parameter. The Lagrangian includes terms

that break the symmetry, but these terms are characterized by a small parameter. The small

parameter can be either a small dimensionless coupling, or a small ratio between mass scales.

The symmetry is then approximate, and one can obtain selection rules for processes that are

forbidden in the symmetry limit.

• Spontaneous symmetry breaking (SSB). The Lagrangian is symmetric, but the vacuum state

is not. Even though with SSB there is no longer a conserved charge, the number of parameters

is the same as in the case of unbroken symmetry. In this sense, the predictive power of a

spontaneously broken symmetry is as strong as that of the unbroken symmetry.

In this chapter we introduce the idea of spontaneous symmetry breaking and analyze its conse-

quences.

SSB is based on the following ingredients. Symmetries of interactions are determined by the

symmetries of the Lagrangian. The states, however, do not have to obey these symmetries. Con-

sider, for example, the hydrogen atom. While the Lagrangian is invariant under rotations, an

eigenstate does not have to be. Any state with a finite m quantum number is not invariant under

rotation around the z-axis. This is a general case when we have degenerate states.

In perturbative QFT we always expand around the lowest energy state. This lowest state is

called the “vacuum” state. When the vacuum state is degenerate, the fact that we can expand

around any of the vacua, and the physics would be the same, is a consequence of the symmetry.

Yet, when we expand around a specific vacuum state out of the degenerate set of vacua, we expand

around a state that does not conserve the symmetry.


The name “spontaneously broken” indicates that there is no preference as to which of the states

is chosen. A very simple example is that of a hungry donkey. Consider a donkey that stands exactly

halfway between two stacks of hay. Symmetry tells us that it costs the same amount of energy to

go to either stack. Thus, the donkey cannot choose where to go and would not go anywhere! Yet,

in reality, the donkey would make an arbitrary choice and go to one of the stacks to eat. We say

that the donkey spontaneously breaks the symmetry between the two sides.

In previous chapters we encountered the predictive power of imposed symmetries. In this

chapter we show that spontaneously broken symmetries are no less predictive than exact ones,

though the predictions are different. While the symmetry is no longer manifest, in the sense that

processes that are forbidden in the symmetry limit may become allowed if it is spontaneously

broken, there are subtle relations between these ‘forbidden’ processes and the allowed ones which

reveal that the Lagrangian does have this symmetry. This is why a spontaneously broken symmetry

is also called a hidden symmetry.

6.1 Global discrete symmetries

Consider a model with a single real scalar field φ, where we impose φ-parity:

φ→ −φ. (6.1)

The Lagrangian reads

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{\mu^2}{2}\phi^2 - \frac{\lambda}{4}\phi^4. \tag{6.2}$$

In particular, the symmetry forbids a $\phi^3$ term.

Hermiticity of $\mathcal{L}$ requires that $\mu^2$ and λ are real. The scalar potential must be bounded from below, so we must have λ > 0. As for the $\mu^2$ term, we can have either $\mu^2 > 0$ or $\mu^2 < 0$. For $\mu^2 > 0$ we have an ordinary $\phi^4$ theory, and $\mu^2$ is the mass-squared of φ. The case of interest for our purposes is

$$\mu^2 < 0. \tag{6.3}$$

The minimum of the scalar potential should satisfy

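$$0 = \frac{\partial V}{\partial\phi} = \phi(\mu^2 + \lambda\phi^2). \tag{6.4}$$

Thus, the potential has two possible minima:

$$\phi_\pm = \pm\sqrt{\frac{-\mu^2}{\lambda}} \equiv \pm v. \tag{6.5}$$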

The classical solution would be either φ+ or φ−. We say that φ acquires a vacuum expectation

value (VEV):

$$\langle\phi\rangle \equiv \langle 0|\phi|0\rangle \neq 0. \tag{6.6}$$


Perturbative calculations should involve expansions around the classical minimum. Since the

two solutions are physically equivalent, the physics cannot depend on our choice, but we must

make a choice. Let us choose — without loss of generality — to expand around φ+. We define a

field φ′ with a vanishing VEV:

φ′ = φ− v. (6.7)

In terms of φ′, the Lagrangian reads

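$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi')(\partial^\mu\phi') - \frac{1}{2}(2\lambda v^2)\phi'^2 - \lambda v\phi'^3 - \frac{\lambda}{4}\phi'^4, \tag{6.8}$$

where we used $\mu^2 = -\lambda v^2$ and discarded a constant term.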
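The shift to the field φ′ can be checked numerically. The sketch below compares the original potential, evaluated at φ = v + φ′ with its value at the minimum subtracted, against the quadratic, cubic and quartic terms of the shifted Lagrangian. The sample values of $\mu^2$ and λ are arbitrary choices made here for illustration, and the function names are not from the text.

```python
import math

def potential(phi, mu2, lam):
    """Symmetric potential: V = (mu^2/2) phi^2 + (lambda/4) phi^4."""
    return 0.5 * mu2 * phi**2 + 0.25 * lam * phi**4

def shifted_potential(phi_prime, v, lam):
    """Potential terms in the shifted field, with the constant dropped."""
    return (lam * v**2 * phi_prime**2
            + lam * v * phi_prime**3
            + 0.25 * lam * phi_prime**4)

mu2, lam = -2.0, 0.5          # hypothetical sample values: mu^2 < 0, lambda > 0
v = math.sqrt(-mu2 / lam)     # location of the minimum

max_diff = 0.0
for phi_prime in (-1.5, -0.3, 0.0, 0.7, 2.0):
    exact = potential(v + phi_prime, mu2, lam) - potential(v, mu2, lam)
    max_diff = max(max_diff, abs(exact - shifted_potential(phi_prime, v, lam)))
print(max_diff)
```

The agreement at every sample point reflects that the linear term cancels at the minimum and that only the constant $V(v)$ was discarded.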

Let us emphasize several points:

(i) The Lagrangian (6.8) includes all possible terms for the real scalar field φ′. In particular,

it has no φ′-parity symmetry. Thus, the φ → −φ symmetry is hidden. It is spontaneously

broken by our choice of the ground state 〈φ〉 = +v.

(ii) Yet, the Lagrangian (6.8) is not the most general renormalizable L(φ′). While the general

L(φ′) depends on three independent parameters [see Eq. (1.2)], (6.8) depends on only two.

The two parameters can be chosen to be v and λ, or µ2 and λ. The first choice is the one we

made in writing Eq. (6.8). The second choice employs the same parameters of the original,

manifestly symmetric L(φ), see Eq. (6.2). It demonstrates that the SSB does not introduce

additional new parameters.

(iii) The coefficients of the quadratic and trilinear terms in (6.8) are different from those of the

quadratic and trilinear terms in Eq. (6.2). In contrast, the coefficients of the quartic terms

are the same. This is a general result: as long as we consider only the renormalizable terms,

SSB changes dimensionful parameters, but not dimensionless ones.

(iv) While the symmetry is manifest in Eq. (6.2), the phenomenological interpretation of this

model should start from Eq. (6.8). Specifically, the model describes a scalar particle of mass-squared $2\lambda v^2 = -2\mu^2$. This particle is an excitation of the φ′ field.

The fact that the three terms in the scalar potential — the mass term, the trilinear term and

the quartic term — depend on only two parameters means that there is a relation between the

three couplings. In terms of the parameters of the general Lagrangian (1.2), the relation is [note

that µ is here the coefficient of the trilinear term, different from µ of Eq. (6.2)]

$$\mu^2 = 4\lambda m^2. \tag{6.9}$$

This relation is the clue that the symmetry is spontaneously, rather than explicitly, broken.


6.2 Global Abelian continuous symmetries

Consider a model with a complex scalar field φ, with q = 1 under an imposed U(1) symmetry, and

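thus its transformation is

$$\phi \to e^{i\theta}\phi. \tag{6.10}$$

The Lagrangian reads

$$\mathcal{L} = (\partial_\mu\phi^\dagger)(\partial^\mu\phi) - \mu^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2. \tag{6.11}$$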

Equivalently, we can rewrite the Lagrangian in terms of two real scalar fields, h and ξ, where

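$$\phi = (h + i\xi)/\sqrt{2}, \tag{6.12}$$

and impose an SO(2) symmetry,

$$\begin{pmatrix}h\\ \xi\end{pmatrix} \to \begin{pmatrix}h\\ \xi\end{pmatrix}' = \begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}h\\ \xi\end{pmatrix}. \tag{6.13}$$

The Lagrangian then reads

$$\mathcal{L} = \frac{1}{2}(\partial_\mu h)(\partial^\mu h) + \frac{1}{2}(\partial_\mu\xi)(\partial^\mu\xi) - \frac{\mu^2}{2}(h^2 + \xi^2) - \frac{\lambda}{4}(h^2 + \xi^2)^2. \tag{6.14}$$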

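The $\mu^2$ and λ parameters are real, and we must have λ > 0. We consider the case that $\mu^2 < 0$. We define $v^2 = -\mu^2/\lambda$. The scalar potential can be written (up to a constant term) as

$$V = \lambda\left(\phi^\dagger\phi - \frac{v^2}{2}\right)^2. \tag{6.15}$$

Thus, φ acquires a VEV:

$$2\langle\phi^\dagger\phi\rangle = \langle h^2 + \xi^2\rangle = v^2 = -\frac{\mu^2}{\lambda}. \tag{6.16}$$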

In the (h, ξ) plane, there is a circle of radius v that corresponds to minima of the potential. We

now have to choose a specific vacuum to expand around it. We choose the real component of φ to

carry the VEV (〈Im φ〉 = 0):

〈h〉 = v, 〈ξ〉 = 0. (6.17)

We define the real scalar fields

h′ = h− v, ξ′ = ξ (6.18)

with vanishing VEVs:

〈h′〉 = 〈ξ′〉 = 0. (6.19)

We obtain the Lagrangian in terms of h′ and ξ′:

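$$\mathcal{L} = \frac{1}{2}(\partial_\mu h')(\partial^\mu h') + \frac{1}{2}(\partial_\mu\xi')(\partial^\mu\xi') - \lambda v^2h'^2 - \lambda vh'(h'^2 + \xi'^2) - \frac{\lambda}{4}(h'^2 + \xi'^2)^2. \tag{6.20}$$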
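The intermediate algebra is short enough to spell out. The following sketch starts from the potential written as a complete square, substitutes h = h′ + v and ξ = ξ′, and expands; "const" stands for the field-independent term that is dropped.

```latex
\begin{aligned}
h^2+\xi^2 &= (h'+v)^2+\xi'^2 = v^2 + 2vh' + (h'^2+\xi'^2),\\
V &= \frac{\lambda}{4}\left(h^2+\xi^2-v^2\right)^2 + \text{const}
   = \frac{\lambda}{4}\left[\,2vh' + (h'^2+\xi'^2)\right]^2 + \text{const}\\
  &= \lambda v^2 h'^2 + \lambda v\,h'\,(h'^2+\xi'^2) + \frac{\lambda}{4}\,(h'^2+\xi'^2)^2 + \text{const}.
\end{aligned}
```

No term quadratic in ξ′ alone appears, which is the algebraic origin of the massless scalar, while the $h'^2$ coefficient gives the mass-squared $2\lambda v^2$.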



1. The SO(2) symmetry is spontaneously broken. This can be seen from the presence of the $h'(h'^2 + \xi'^2)$ term.

2. The Lagrangian describes one massive scalar, h′, with $m^2 = 2\lambda v^2$, and one massless scalar, ξ′.

3. The Lagrangian of Eq. (6.20) is not the most general Lagrangian for two real scalar fields.

Many terms are missing, while others, that would have been independent in the general case,

are related. In particular, there are only two independent parameters, as for a Lagrangian

with an unbroken SO(2).

4. The quartic terms, with dimensionless couplings, are the same in (6.14) and (6.20). Only

dimensionful couplings are modified.

5. If the symmetry were not broken, it would be impossible to distinguish the two components

of the complex scalar field. With the symmetry spontaneously broken, these two DoF are

distinguishable. For example, they have different masses.

6. We chose a basis by assigning the VEV to the real component of φ. This is an arbitrary

choice. We made it since it is convenient. The physics does not depend on this choice.

7. Since the symmetry is (spontaneously) broken, the Lagrangian is no longer invariant under

the transformation (6.13). This transformation can be viewed as fixing θ or a choice of basis

(where the VEV is real and positive).

8. A particle with negative mass-squared is called a tachyon and it travels faster than light. The

example above shows why tachyons do not appear in QFT. For the physical interpretation, we

have to expand around the minimum, and thus physical particles have positive mass-squared.

Fields with negative mass-squared are sometimes referred to as tachyonic fields.

One of the most interesting features of the model presented here is the existence of a mass-

less scalar field. This result is not particular to our specific model, but the result of a general

theorem: The spontaneous breaking of a continuous global symmetry is always accompanied by

the appearance of massless scalars called Goldstone bosons. We discuss it in more detail in

Appendix 6.A.

Here we only briefly describe the intuition of that theorem. We can get a SSB only when the

vacuum is degenerate, and in a case of a continuous symmetry, this degeneracy is also continuous.

In the case of a U(1) symmetry the shape of the potential is usually called “a Mexican hat.” In this

case when expanding around any point in the valley it can be seen that one direction is flat. A flat

direction in the potential corresponds to a massless DoF. The Goldstone theorem is a generalization

of this simple picture. Fig. 6.1 demonstrates this point.
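The flat direction can be exhibited numerically by evaluating the curvature of the potential at a point in the valley. The sketch below uses the U(1) potential written in terms of h and ξ, with arbitrary sample values for $\mu^2$ and λ chosen here for illustration, and a finite-difference second derivative whose step size is likewise an assumption.

```python
import math

def potential(h, xi, mu2, lam):
    """V = (mu^2/2)(h^2 + xi^2) + (lambda/4)(h^2 + xi^2)^2."""
    r2 = h * h + xi * xi
    return 0.5 * mu2 * r2 + 0.25 * lam * r2 * r2

def curvature(h, xi, direction, mu2, lam, step=1e-3):
    """Central finite-difference second derivative along a unit direction."""
    dh, dxi = direction
    forward = potential(h + step * dh, xi + step * dxi, mu2, lam)
    centre = potential(h, xi, mu2, lam)
    backward = potential(h - step * dh, xi - step * dxi, mu2, lam)
    return (forward - 2.0 * centre + backward) / step**2

mu2, lam = -2.0, 0.5                  # hypothetical sample values
v = math.sqrt(-mu2 / lam)             # radius of the circle of minima

radial_mass2 = curvature(v, 0.0, (1.0, 0.0), mu2, lam)   # along h
angular_mass2 = curvature(v, 0.0, (0.0, 1.0), mu2, lam)  # along xi
print(radial_mass2, 2 * lam * v**2)
print(angular_mass2)
```

Up to discretization error, the curvature along h reproduces $2\lambda v^2$ and the curvature along ξ vanishes, which is the massless direction.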


Figure 6.1: The Mexican hat potential. The masses of the two DoFs correspond to the second derivatives of the potential. We see that one direction (the picture on the left) is flat, while the other one (on the right) is not flat. In the global case these two correspond to the Goldstone boson and the massive DoF. In the local case the flat direction is the one “eaten” by the gauge boson and the massive one is the Higgs boson. (The plots are taken from [9].)

6.3 Global non-Abelian continuous symmetries

The non-Abelian case is similar to the Abelian one, with one significant difference: the symmetry may not be totally broken. Since there is more than one generator, some of them may not be broken by the vacuum. The exact pattern of symmetry breaking depends on the specific potential and the representations of the scalar fields.

A generator corresponds to a broken symmetry if the vacuum is not invariant under an operation

by the corresponding group elements. On the contrary, a generator corresponds to an unbroken

symmetry if the vacuum is invariant to an operation by the corresponding group elements. Given

the fact that the group elements are the exponentials of the generators, we see that the condition can be expressed in terms of the generators as follows. We denote the vacuum state by 〈φ〉. A broken generator gives

$$T_a\langle\phi\rangle \neq 0, \tag{6.21}$$

while an unbroken one has

$$T_a\langle\phi\rangle = 0. \tag{6.22}$$

More details are given in the homework.

We now give two examples where we discuss breaking of SU(2) by a triplet and by a doublet.

Breaking by a triplet

Consider a model with an imposed SO(3) symmetry, and a scalar field, φ, that transforms as a triplet under the symmetry. This example gives some intuition for the non-Abelian case. What we have here is a vector in a real three-dimensional space. Once a vector is placed in space, the full three-dimensional rotation symmetry is broken down to rotations in the plane perpendicular to the vector. Here we show this intuition explicitly.

Given that φ is a triplet, it transforms as

$$\phi \to e^{(i/2)L_a\theta_a}\phi, \tag{6.23}$$

where $(L_a)_{bc} = \epsilon_{abc}$ are the generators of the SO(3) algebra in the triplet representation.

The Lagrangian reads

$$\mathcal{L} = \mathcal{L}_{\rm kin} - \frac{\mu^2}{2}\phi^\dagger\phi - \frac{\lambda}{4}(\phi^\dagger\phi)^2. \tag{6.24}$$

We take $\mu^2 < 0$, and define $v^2 = -\mu^2/\lambda$. Then φ acquires a VEV: |〈φ〉| = v. The triplet φ has three DoF. We choose a basis and define the VEV to be in the z direction,

$$\phi = \begin{pmatrix}\phi_1\\ \phi_2\\ v + \phi_3\end{pmatrix}, \tag{6.25}$$

such that $\langle\phi_i\rangle = 0$, i = 1, 2, 3.

The Lagrangian for the φi fields can be written as

L = Lkin + L2 + L3 + L4, (6.26)

where Ln includes terms that are n’th power in the φi fields. Let us comment on the significance

of each of these parts of the Lagrangian:

• Quadratic terms:

$$\mathcal{L}_2 = -\lambda v^2\phi_3^2. \tag{6.27}$$

We learn that the model has one massive scalar, $\phi_3$, of mass-squared $m_3^2 = 2\lambda v^2$, and two massless scalars, $m_1^2 = m_2^2 = 0$. This is a manifestation of the Goldstone theorem. Spontaneous symmetry breaking requires the appearance of massless Goldstone bosons in correspondence to the broken generators. Since the SO(3) algebra has three generators and it is broken to SO(2), which has one generator, our model must have two Goldstone bosons.

• Trilinear terms:

$$\mathcal{L}_3 = -\lambda v\,\phi_3\,\phi_i\phi_i. \tag{6.28}$$

The fact that $\mathcal{L}_3 \neq 0$ is a manifestation of the SO(3) breaking.

• The quartic terms:

$$\mathcal{L}_4 = -(\lambda/4)(\phi_1^2 + \phi_2^2 + \phi_3^2)^2. \tag{6.29}$$

This part of the Lagrangian has dimensionless couplings and therefore it is unchanged from

the symmetric form.


We can check that the unbroken generator is indeed the rotation around the z axis. Writing $L_z$ explicitly, we see that it annihilates the vacuum:

$$\begin{pmatrix}0&1&0\\ -1&0&0\\ 0&0&0\end{pmatrix}\begin{pmatrix}0\\ 0\\ v\end{pmatrix} = 0. \tag{6.30}$$

We elaborate on this in the homework.
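The check of Eq. (6.30) can be repeated for all three generators. The following is a minimal sketch, assuming the convention $(L_a)_{bc} = \epsilon_{abc}$ of Eq. (6.23); the helper names (`levi_civita`, `generator`, `act`) and the value of `v` are our own illustrative choices. Indices 0, 1, 2 stand for x, y, z.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def generator(a):
    """Matrix (L_a)_{bc} = epsilon_{abc}, following the convention of Eq. (6.23)."""
    return [[levi_civita(a, b, c) for c in range(3)] for b in range(3)]


def act(matrix, vector):
    """Matrix-vector product for plain Python lists."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


v = 1.0  # illustrative value of the VEV (assumption)
vacuum = [0.0, 0.0, v]

# A generator is unbroken when it annihilates the vacuum, Eq. (6.22).
unbroken = [a for a in range(3) if all(x == 0 for x in act(generator(a), vacuum))]
broken = [a for a in range(3) if a not in unbroken]
print("unbroken:", unbroken, "broken:", broken)
```

With the VEV along z, only the generator labeled 2 (that is, $L_z$) should be reported as unbroken, in line with the SO(3) → SO(2) pattern.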

Breaking by a complex doublet

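Consider a model with an imposed SU(2) symmetry, and a complex scalar field, φ, that transforms as a doublet under the symmetry:

$$\phi \to e^{(i/2)\tau_a\theta_a}\phi, \tag{6.31}$$

where $\tau_a$ are the Pauli matrices. The Lagrangian is

$$\mathcal{L} = \mathcal{L}_{\rm kin} - \mu^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2. \tag{6.32}$$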

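We take $\mu^2 < 0$, and define $v^2 = -\mu^2/\lambda$. Then φ acquires a VEV: $|\langle\phi\rangle| = v/\sqrt{2}$. The complex doublet φ has four DoFs. We choose a basis and define

$$\phi = \frac{1}{\sqrt{2}}\begin{pmatrix}\phi_3 + i\phi_4\\ v + \phi_1 + i\phi_2\end{pmatrix}, \tag{6.33}$$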

such that 〈φi〉 = 0, i = 1, 2, 3, 4. The Lagrangian for the φi fields can be written as

L = Lkin + L2 + L3 + L4, (6.34)

where Ln includes terms that are n’th power in the φi fields. Let us comment on the significance

of each of these parts of the Lagrangian:

• Quadratic terms:

$$\mathcal{L}_2 = -\lambda v^2\phi_1^2. \tag{6.35}$$

We learn that the model has one massive scalar, $\phi_1$, of mass-squared $m_1^2 = 2\lambda v^2$, and three massless scalars, $m_{2,3,4}^2 = 0$. This is a manifestation of the Goldstone theorem. Spontaneous symmetry breaking requires the appearance of massless Goldstone bosons in correspondence to the broken generators. Since the SU(2) algebra has three generators, all of which are broken here, our model must have three Goldstone bosons.

• Trilinear terms:

$$\mathcal{L}_3 = -\lambda v\,\phi_1\,\phi_i\phi_i. \tag{6.36}$$

The fact that $\mathcal{L}_3 \neq 0$ is a manifestation of the SU(2) breaking.


• The quartic terms:

$$\mathcal{L}_4 = -\frac{\lambda}{4}\left(\phi_1^2+\phi_2^2+\phi_3^2+\phi_4^2\right)^2. \tag{6.37}$$

This part of the Lagrangian has dimensionless couplings and therefore it is unchanged from

the symmetric form.

The symmetry structure of this model deserves further discussion. The complex doublet φ has four real DoFs. If we write φ of Eq. (6.31) in the form

$$\phi = \frac{1}{\sqrt{2}}\begin{pmatrix}\phi_3 + i\phi_4\\ \phi_1 + i\phi_2\end{pmatrix}, \tag{6.38}$$

it becomes clear that the Lagrangian of Eq. (6.32) depends only on the combination $(\phi_1^2+\phi_2^2+\phi_3^2+\phi_4^2)$ and consequently has an accidental symmetry such that the overall symmetry is SO(4). The VEV of φ breaks SO(4) → SO(3). This can be seen from the fact that the Lagrangian of Eqs. (6.34)–(6.37) depends on $\phi_{2,3,4}$ only in the combination $(\phi_2^2+\phi_3^2+\phi_4^2)$. The SO(4) group has six generators while SO(3) has three, hence the appearance of three Goldstone bosons.
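The counting above can be cross-checked numerically. The sketch below, under the assumption of illustrative values for λ and v, evaluates the second-derivative matrix of the potential of Eq. (6.32) in terms of the four real fields by finite differences, and compares the number of massless directions with dim SO(4) − dim SO(3). The function names and the field ordering are our own choices.

```python
def potential(fields, lam=0.5, v=2.0):
    """V = lam * (phi^dagger phi - v^2/2)^2, with phi^dagger phi = sum(f_i^2) / 2.

    This equals mu^2 phi^dagger phi + lam (phi^dagger phi)^2 up to a constant,
    using v^2 = -mu^2 / lam.
    """
    phi_sq = 0.5 * sum(f * f for f in fields)
    return lam * (phi_sq - 0.5 * v * v) ** 2


def hessian(func, point, step=1e-3):
    """Second-derivative matrix from central finite differences."""
    n = len(point)

    def shifted(i, j, di, dj):
        p = list(point)
        p[i] += di * step
        p[j] += dj * step
        return func(p)

    return [[(shifted(i, j, 1, 1) - shifted(i, j, 1, -1)
              - shifted(i, j, -1, 1) + shifted(i, j, -1, -1)) / (4 * step * step)
             for j in range(n)] for i in range(n)]


def dim_so(n):
    """Number of generators of SO(n)."""
    return n * (n - 1) // 2


lam, v = 0.5, 2.0                 # illustrative values (assumption)
vacuum = [v, 0.0, 0.0, 0.0]       # ordering: (v + phi_1, phi_2, phi_3, phi_4)
mass_sq = hessian(potential, vacuum)

# In this basis the matrix is diagonal, so we can count massless modes directly.
massless = sum(1 for i in range(4) if abs(mass_sq[i][i]) < 1e-4)
print(mass_sq[0][0], massless, dim_so(4) - dim_so(3))
```

The single nonzero entry should be close to $2\lambda v^2$, and the number of massless directions should match the number of broken SO(4) generators.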

6.4 Fermion masses

Spontaneous symmetry breaking can give masses to chiral fermions. We explain this statement by

an explicit example.

Consider a model with a U(1) symmetry. The field content consists of a left-handed fermion

ψL, a right-handed fermion ψR, and a complex scalar φ with the following U(1) charges:

q (ψL) = 1, q (ψR) = 2, q (φ) = 1. (6.39)

The most general Lagrangian we can write is

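$$\mathcal{L} = \mathcal{L}_{\rm kin} - \mu^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2 - Y\phi\,\overline{\psi_R}\psi_L + \text{h.c.} \tag{6.40}$$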

Since the fermions are charged and chiral, we cannot write mass terms for them (Lψ = 0).

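We take $\mu^2 < 0$, so that the scalar potential is the one given in Eq. (6.14), leading to a VEV for φ: $|\langle\phi\rangle| = v/\sqrt{2} \neq 0$. As above, we choose $\langle{\rm Re}\,\phi\rangle = v/\sqrt{2}$, $\langle{\rm Im}\,\phi\rangle = 0$. We define the real fields h and ξ in such a way that they have vanishing VEVs:

$$\phi = \frac{h + v + i\xi}{\sqrt{2}}. \tag{6.41}$$

Expanding around the chosen vacuum we find

$$\mathcal{L} = \mathcal{L}_{\rm kin} - V(h,\xi) - \frac{Yv}{\sqrt{2}}\,\overline{\psi_R}\psi_L - \frac{Y}{\sqrt{2}}(h + i\xi)\,\overline{\psi_R}\psi_L + \text{h.c.}, \tag{6.42}$$

where $V(h,\xi)$ can be read off Eq. (6.20). We learn that $\psi_L$ and $\psi_R$ combine to form a Dirac fermion with mass

$$m_\psi = \frac{Yv}{\sqrt{2}}. \tag{6.43}$$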

This is possible because the symmetry under which the fermion is chiral is completely broken. The

two real scalar fields, h and ξ, couple to the fermion with the same Yukawa coupling, and the coupling

is proportional to the fermion mass.

In a more general case, the symmetry might be only partially broken, namely a subgroup of

the original group remains unbroken. In this case, a necessary condition for generating fermion

masses is that the fermion representation is vector-like under the unbroken subgroup.
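The charge bookkeeping behind Eqs. (6.39)–(6.43) can be made explicit. The following sketch assumes only the charge assignment of Eq. (6.39); the representation of a term as a list of (field, conjugated) pairs and the numerical inputs in the last line are hypothetical choices made for illustration.

```python
import math

# U(1) charges of Eq. (6.39)
CHARGES = {"psi_L": 1, "psi_R": 2, "phi": 1}


def total_charge(term):
    """Sum the charges of a product of fields.

    `term` is a list of (field_name, is_conjugated) pairs; a conjugated
    (or barred) field carries the opposite charge.
    """
    return sum(-CHARGES[name] if conj else CHARGES[name] for name, conj in term)


def fermion_mass(yukawa, vev):
    """Dirac mass generated by the VEV, Eq. (6.43)."""
    return yukawa * vev / math.sqrt(2)


dirac_mass_term = [("psi_R", True), ("psi_L", False)]               # psi_R-bar psi_L
yukawa_term = [("phi", False), ("psi_R", True), ("psi_L", False)]   # phi psi_R-bar psi_L

print(total_charge(dirac_mass_term))  # nonzero: the bare mass term is forbidden
print(total_charge(yukawa_term))      # zero: the Yukawa term is allowed
print(fermion_mass(0.1, 246.0))       # illustrative numbers only
```

A term is allowed by the U(1) symmetry only if its total charge vanishes, which is why the mass arises only after φ is replaced by its VEV.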

6.5 Local symmetries: the Higgs mechanism

In this section we discuss spontaneous breaking of local symmetries. We demonstrate it by studying

a U(1) gauge symmetry. We will find out that breaking of a local symmetry results in mass terms

for the gauge bosons that correspond to the broken generators. It is a somewhat surprising result,

since the spontaneous breaking of a global symmetry gives massless Goldstone bosons. In the

case of a local symmetry, these would-be Goldstone bosons are “eaten” by the gauge bosons and

become the longitudinal components of the resulting massive vector-bosons.

Consider a theory similar to the one discussed in Section 6.2, where we have a single scalar

field that is charged under a U(1) symmetry. The difference is that here we impose a local U(1)

symmetry, that is

$$\phi \to e^{i\theta(x)}\phi. \tag{6.44}$$

The Lagrangian is given by

$$\mathcal{L} = (D^\mu\phi)^\dagger(D_\mu\phi) - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \mu^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2. \tag{6.45}$$

The covariant derivative is given by

$$D_\mu\phi = (\partial_\mu + igA_\mu)\phi, \tag{6.46}$$

where $A_\mu$ is the gauge field, $F_{\mu\nu}$ is defined in Eq. (2.29), and g is the coupling constant.

We consider the case of µ2 < 0, leading to SSB via a VEV of φ:

$$\langle\phi\rangle = \frac{v}{\sqrt{2}}, \qquad v^2 = -\frac{\mu^2}{\lambda}. \tag{6.47}$$

We choose the real component of φ to carry the VEV. We again write the complex scalar in terms

of two scalar fields with vanishing VEVs, 〈h〉 = 〈ξ〉 = 0 but, unlike the global case, it is convenient

to write the two DoFs as a phase, ξ(x) and a magnitude, h(x):

$$\phi(x) = e^{i\xi(x)/v}\,\frac{v + h(x)}{\sqrt{2}}. \tag{6.48}$$


Note that we normalized ξ(x) such that it has mass dimension one. To leading order Eq. (6.48) is

the same as Eq. (6.41). We usually refer to Eq. (6.48) as a non-linear realization and to Eq. (6.41)

as a linear realization.

When a symmetry is spontaneously broken, the Lagrangian is no longer invariant under the

broken symmetry transformation. Instead, the transformation constitutes a change of basis. We

can use this change of basis to our advantage, by choosing a basis that makes the physics of the

model more transparent. This is what we do here by choosing a specific gauge: θ(x) = −ξ(x)/v.

(It is fully legitimate to choose the phase to be related to a field.)

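With this choice of gauge,

$$\phi \to \phi' = \frac{1}{\sqrt{2}}(h + v), \qquad A_\mu \to V_\mu = A_\mu + \frac{1}{gv}\partial_\mu\xi. \tag{6.49}$$

The Lagrangian in terms of h and $V_\mu$ reads

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}(\partial_\mu h)(\partial^\mu h) + \frac{1}{2}(g^2v^2)V_\mu V^\mu - \frac{1}{2}(2\lambda v^2)h^2 + \frac{g^2}{2}V_\mu V^\mu h(2v + h) - \lambda v h^3 - \frac{\lambda}{4}h^4. \tag{6.50}$$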


Several points are worth emphasizing:

1. The elementary particles of this model are a massive vector boson of mass-squared $m_V^2 = (gv)^2$ and a massive scalar of mass-squared $m_h^2 = 2\lambda v^2$.

2. The ξ field is “eaten” in order to give mass to the gauge boson. It was a convenient choice to take the phase to be the “eaten” DoF. The total number of degrees of freedom does

not change: instead of the scalar ξ, we have the longitudinal component of a massive vector

boson.

3. In the limit g → 0 we have mV → 0. This situation describes a massless gauge boson

and a massless scalar. We see that in that limit the longitudinal component is the massless

Goldstone boson as expected.

4. The field that acquires a VEV, in our case φ, is called the Brout-Englert-Higgs (BEH) field

or the Higgs field. The h scalar is called “a Higgs boson.”

5. The Lagrangian (6.45) depends on three parameters. They can be taken to be g, v and λ.

The Lagrangian (6.50) has two mass terms and four interaction terms which depend on the

same three parameters. Thus, the six relevant terms, which would be independent in the

absence of a symmetry, obey three relations among them. This is a sign of SSB.

6. The hV V coupling is proportional to the mass-squared of the vector boson.

7. The dimensionless V V hh and hhhh couplings are unchanged from the symmetric Lagrangian.
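The relations among the terms of Eq. (6.50) listed above can be spelled out in a few lines. This is a sketch only: the function name, the dictionary keys and the numerical inputs are our own, and the entries are the magnitudes of the coefficients as they appear in Eq. (6.50), without any symmetry factors.

```python
import math


def broken_u1_spectrum(g, v, lam):
    """Masses and coupling coefficients read off Eq. (6.50)."""
    return {
        "m_V": g * v,                    # from (1/2) g^2 v^2 V V
        "m_h": math.sqrt(2 * lam) * v,   # from (1/2) (2 lam v^2) h^2
        "hVV": g ** 2 * v,               # coefficient of V V h
        "hhVV": g ** 2 / 2,              # coefficient of V V h h
        "hhh": lam * v,                  # magnitude of the h^3 coefficient
        "hhhh": lam / 4,                 # magnitude of the h^4 coefficient
    }


g, v, lam = 0.5, 2.0, 0.25   # illustrative values (assumption)
s = broken_u1_spectrum(g, v, lam)

# Six terms, three parameters: the remaining relations are predictions of SSB.
print(s["hVV"], s["m_V"] ** 2 / v)            # hVV coupling vs. m_V^2 / v
print(s["hhh"], s["m_h"] ** 2 / (2 * v))      # h^3 coupling vs. m_h^2 / (2 v)
print(s["hhhh"], s["m_h"] ** 2 / (8 * v ** 2))
```

Each printed pair should agree, which is the statement that the two masses and four couplings are fixed by g, v and λ alone.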


In the example above we considered the SSB of a local U(1) symmetry. The basic

ingredients are, however, much more generic and apply also to non-Abelian symmetries and to

product groups. In fact, the SM incorporates SSB of a local SU(2) × U(1) symmetry. The

following lessons are generic to all cases of spontaneous breaking of a local symmetry:

1. Spontaneous symmetry breaking gives masses to the gauge bosons related to the broken

generators.

2. Gauge bosons related to an unbroken subgroup remain massless, because their masslessness

is protected by the symmetry.

3. The field that acquires a VEV (the BEH field) must be a scalar field. Otherwise its VEV

would break Lorentz invariance.

Spontaneous breaking of a local symmetry can give masses also to fermions, as is the case

for a spontaneously broken global symmetry. To see this, we add to our model fermions, with

q(ψR) − q(ψL) = q(φ), as in the model of Section 6.4. Working in the physical gauge, θ(x) =

−ξ(x)/v, we learn that Eq. (6.42) is modified to

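$$\mathcal{L} = \mathcal{L}_{\rm kin} - V(h) - \frac{Yv}{\sqrt{2}}\,\overline{\psi_R}\psi_L - \frac{Y}{\sqrt{2}}\,h\,\overline{\psi_R}\psi_L + \text{h.c.} \tag{6.51}$$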

In the physical gauge, the ξ field does not appear explicitly anymore, as expected. The lon-

gitudinal component of the vector boson couples to the fermion, with a coupling strength that

is proportional to the fermion mass (∝ mf/v). The coupling of the transverse component is

proportional to the gauge coupling.

6.6 Summary

Symmetries in QFT have a strong predictive, or explanatory, power. The main consequences of

the various types of symmetries are summarized in Table 6.1.

To construct a model, we first define the following three ingredients:

(i) The symmetry;

(ii) The transformation properties of the fermions and scalars;

(iii) The pattern of spontaneous symmetry breaking (SSB).

Then we write the most general renormalizable Lagrangian that is invariant under the symmetry.

The renormalizable Lagrangian has a finite number of parameters that we need to determine by experiment. In principle, for a theory with N independent parameters, we need to perform N appropriate measurements to extract the values of the parameters. Additional measurements test the theory.

Table 6.1: Symmetries

| Type | Consequences |
|---|---|
| Spacetime | Conservation of energy, momentum, angular momentum |
| Discrete | Selection rules |
| Global (exact) | Conserved charges |
| Global (spon. broken) | Massless scalars |
| Local (exact) | Interactions, massless spin-1 mediators |
| Local (spon. broken) | Interactions, massive spin-1 mediators |

Strictly speaking, the pattern of the SSB is not an input, that is, it depends on the values of

the parameters. In all models that we discuss, the SSB pattern depends on the sign of µ2. Yet,

since the phenomenology is so different based on this choice, we prefer to quote it as an input

ingredient.


Appendix

6.A The Goldstone Theorem

The Goldstone Theorem states the following: The spontaneous breaking of a global continuous

symmetry is accompanied by massless scalars. Their number and quantum numbers equal those of

the broken generators.

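Consider $\phi_j$ to be some multiplet of scalar fields with the Lagrangian

$$\mathcal{L}(\phi) = \frac{1}{2}(\partial_\mu\phi_j)(\partial^\mu\phi_j) - V(\phi), \tag{6.52}$$

where $\mathcal{L}(\phi)$ is invariant under some symmetry group as in Eq. (4.2):

$$\phi_j \to \left(\exp\left[iT^a\theta^a\right]\right)_{jk}\phi_k. \tag{6.53}$$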

We want to perturb around a minimum of the potential $V(\phi)$. We expect the φ field to have a VEV, $\langle\phi\rangle$, which minimizes V. The condition that $\langle\phi\rangle$ is an extremum of $V(\phi)$ reads

$$V'_j\Big|_{\phi=\langle\phi\rangle} = 0 \quad \forall j, \qquad \text{where}\quad V'_j \equiv \frac{\partial V}{\partial\phi_j}. \tag{6.54}$$

The condition for a minimum at $\langle\phi\rangle$ is, in addition to (6.54), that the second derivative matrix at the extremum,

$$m^2_{ij} \equiv \frac{\partial^2 V}{\partial\phi_i\,\partial\phi_j}\bigg|_{\phi=\langle\phi\rangle}, \tag{6.55}$$

is a positive semidefinite matrix, that is, that all of its eigenvalues are nonnegative. Note that $m^2_{ij}$ is the scalar mass-squared matrix. We can see that by expanding $V(\phi)$ in a Taylor series in the shifted fields $\phi' = \phi - \langle\phi\rangle$ and noting that the mass term is the one with two fields.

Now we check the behavior of $\langle\phi\rangle$ under the transformation (6.53). There are two cases. If

$$T^a\langle\phi\rangle = 0 \tag{6.56}$$

for all a, the symmetry is not broken. This is certainly what happens if $\langle\phi\rangle = 0$. It is also possible that

$$T^a\langle\phi\rangle \neq 0 \quad \text{for some } a. \tag{6.57}$$

This is the case when $T^a$ is spontaneously broken.

We focus on the case where some generators of the original symmetry are spontaneously broken while others are not. Note that the set of generators satisfying Eq. (6.56) is closed under commutation because

$$T^a\langle\phi\rangle = 0 \;\;\&\;\; T^b\langle\phi\rangle = 0 \;\Longrightarrow\; [T^a, T^b]\langle\phi\rangle = 0, \tag{6.58}$$

and therefore they generate the unbroken subgroup of the original symmetry group.

Because V is invariant under Eq. (6.53), we have to leading order (that is, for $\theta^a \ll 1$)

$$V(\phi + \delta\phi) - V(\phi) = i\,\frac{\partial V(\phi)}{\partial\phi_k}\,\theta^a (T^a)_{kl}\,\phi_l = 0. \tag{6.59}$$

If we differentiate with respect to $\phi_j$, and set $\phi = \langle\phi\rangle$, we get

$$m^2_{jk}(T^a)_{kl}\langle\phi\rangle_l + V'_k(\langle\phi\rangle)(T^a)_{kj} = 0. \tag{6.60}$$

The second term drops out because we work around the minimum, see Eq. (6.54), and we obtain

$$m^2_{jk}(T^a)_{kl}\langle\phi\rangle_l = 0. \tag{6.61}$$

For $T^a$ in the unbroken subgroup, $T^a\langle\phi\rangle = 0$ and Eq. (6.61) is trivially satisfied. But if $T^a\langle\phi\rangle \neq 0$, Eq. (6.61) requires that $T^a\langle\phi\rangle$ is an eigenvector of $m^2$ with eigenvalue zero. It corresponds to a massless boson field given by

$$\phi_j (T^a)_{jl}\langle\phi\rangle_l, \tag{6.62}$$

which is called a Goldstone boson.
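Eq. (6.61) can be verified numerically in a simple case. The sketch below assumes the SO(3) triplet model with potential $V = (\lambda/4)(\phi_i\phi_i - v^2)^2$ (the potential of Eq. (6.24) up to a constant), generators $(L_a)_{bc} = \epsilon_{abc}$, and an arbitrarily chosen direction for the VEV; all names and numbers are illustrative.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def generator(a):
    """Matrix (L_a)_{bc} = epsilon_{abc}."""
    return [[levi_civita(a, b, c) for c in range(3)] for b in range(3)]


def act(matrix, vector):
    """Matrix-vector product for plain Python lists."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def mass_matrix(phi, lam, v):
    """d^2 V / d phi_i d phi_j for V = (lam / 4) (phi.phi - v^2)^2."""
    phi_sq = sum(x * x for x in phi)
    return [[lam * (phi_sq - v * v) * (1 if i == j else 0) + 2 * lam * phi[i] * phi[j]
             for j in range(3)] for i in range(3)]


lam, v = 0.3, 1.5                      # illustrative values (assumption)
direction = [1 / 3, 2 / 3, 2 / 3]      # unit vector, chosen arbitrarily
vacuum = [v * n for n in direction]
m2 = mass_matrix(vacuum, lam, v)

# Eq. (6.61): m^2 annihilates T^a <phi> for every generator.
goldstone_checks = [act(m2, act(generator(a), vacuum)) for a in range(3)]
# The radial direction is the massive mode, with eigenvalue 2 lam v^2.
radial_check = act(m2, vacuum)
print(goldstone_checks)
print(radial_check, [2 * lam * v * v * x for x in vacuum])
```

The three vectors $L_a\langle\phi\rangle$ span only a two-dimensional space, since one combination of generators annihilates the vacuum, so this model has two Goldstone directions and one massive radial mode.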


Homework

Question 6.1: SSB with many scalars

Consider a Lagrangian for N interacting real scalar fields $\phi_i$ with $i = 1, \ldots, N$,

$$\mathcal{L} = \frac{1}{2}\partial_\mu\phi_i\,\partial^\mu\phi_i - \frac{1}{2}\mu^2\sum_i \phi_i^2 - \frac{1}{4}\lambda\Big[\sum_i \phi_i^2\Big]^2, \tag{6.63}$$

with $\mu^2 < 0$ and $\lambda > 0$. This Lagrangian is a generalization of Eq. (6.11). It is symmetric under SO(N) rotations $\phi_i \to U_{ij}\phi_j$.

1. Show that $\mathcal{L}$ describes a massive field of mass $\sqrt{-2\mu^2}$ and N − 1 massless Goldstone bosons.

2. What is the unbroken symmetry group?

3. The Goldstone theorem states that the number of massless bosons is equal to the number of

broken generators. Show explicitly that this relation holds.

Question 6.2: Broken and unbroken symmetries

In this question we elaborate on Eqs. (6.21) and (6.22), which state that an unbroken generator $T_a$ annihilates the vacuum, $T_a\langle\phi\rangle = 0$, while a broken one does not, $T_a\langle\phi\rangle \neq 0$. For this, consider the operation of a generator on the vacuum,

$$\langle\phi'\rangle = e^{iT_a\theta}\langle\phi\rangle. \tag{6.64}$$

1. Explain why the symmetry is unbroken if $\langle\phi'\rangle = \langle\phi\rangle$ for any θ, and why it is broken if there is a θ such that $\langle\phi'\rangle \neq \langle\phi\rangle$.

2. Explain why the above implies that Ta〈φ〉 = 0 if Ta corresponds to an unbroken symmetry.

3. A familiar example is the case of a vector in three dimensions, which breaks the symmetry from SO(3) to SO(2) (or from SU(2) to U(1)), that is, from rotations in three dimensions to rotations in a plane. The unbroken symmetry is the rotation around the direction of the vector, that is, in the plane perpendicular to it. Consider a case where we choose the normalized vector to be $\vec{v} = (0, 1, 0)$. Show that $L_z$ is still a symmetry while $L_x$ and $L_y$ are not. It is useful to recall the explicit representation of $L_i$ for a vector:

$$L_x = \begin{pmatrix}0&1&0\\ 1&0&1\\ 0&1&0\end{pmatrix}, \qquad L_y = \begin{pmatrix}0&i&0\\ -i&0&i\\ 0&-i&0\end{pmatrix}, \qquad L_z = \begin{pmatrix}1&0&0\\ 0&0&0\\ 0&0&-1\end{pmatrix}. \tag{6.65}$$

4. Now consider a generic normalized vector, $\vec{v} = (a, b, c)$ such that $a^2 + b^2 + c^2 = 1$. Show that

$$\left[(a - c)L_x - i(a + c)L_y - 2bL_z\right]\vec{v} = 0. \tag{6.66}$$

The above shows that there is always one generator that is not broken, so indeed the unbroken

symmetry is SO(2).

5. While a 3 (aka a vector) breaks SU(2) to U(1), a 2 (aka a spinor) breaks it completely. In order to show this we need to prove that no combination of the generators, which in this case are just the Pauli matrices, annihilates a spinor. Consider a simple case where the spinor is just $\vec{s} = (0, 1)$ and show that no normalized linear combination of the Pauli matrices annihilates it. From that we learn that by choosing a spinor the SU(2) symmetry is totally broken.
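Before attempting the proofs in items 3 and 4, it may help to run a quick numerical sanity check of Eq. (6.66) with the matrices of Eq. (6.65). This is only a spot check for one arbitrarily chosen unit vector, not a substitute for the derivation; the helper names are our own.

```python
# Matrices of Eq. (6.65), written with Python complex numbers.
LX = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
LY = [[0, 1j, 0], [-1j, 0, 1j], [0, -1j, 0]]
LZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]


def act(matrix, vector):
    """Matrix-vector product for plain Python lists."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def unbroken_combination(a, b, c):
    """Matrix (a - c) L_x - i (a + c) L_y - 2 b L_z of Eq. (6.66)."""
    return [[(a - c) * LX[r][s] - 1j * (a + c) * LY[r][s] - 2 * b * LZ[r][s]
             for s in range(3)] for r in range(3)]


vec = [2 / 7, 3 / 7, 6 / 7]   # arbitrary unit vector: 4 + 9 + 36 = 49
residual = act(unbroken_combination(*vec), vec)
print(max(abs(x) for x in residual))

# Item 3: for v = (0, 1, 0) only L_z annihilates the vector.
special = [0, 1, 0]
print(act(LZ, special), act(LX, special), act(LY, special))
```

The residual should vanish up to rounding, and for the special vector only the action of $L_z$ should give zero.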

Question 6.3: The sigma model

A classic example of spontaneous symmetry breaking with Goldstone bosons is the so-called “σ-model”, which tries to explain pion-nucleon interactions.

We begin by putting the proton and neutron in a doublet of global SU(2), which we call isospin. We further impose a chiral $SU(2)_L \times SU(2)_R$ symmetry such that the left-handed field transforms under $SU(2)_L$ and the right-handed field under $SU(2)_R$. Then the most general $\mathcal{L}$ is

$$\mathcal{L} = i\bar\psi\,\slashed{\partial}\,\psi, \qquad \psi = \begin{pmatrix}P\\ N\end{pmatrix}. \tag{6.67}$$

The infinitesimal symmetry transformations are

$$\delta\psi_L = i\epsilon^a_L T^a\psi_L, \qquad \delta\psi_R = i\epsilon^a_R T^a\psi_R. \tag{6.68}$$

1. Show that $\mathcal{L}$ is invariant under these chiral symmetries. Rewrite them in the form

$$\delta\psi = i\epsilon^a T^a\psi, \qquad \delta\psi = i\gamma_5\,\epsilon^a_5 T^a\psi, \tag{6.69}$$

and express $\epsilon^a$ and $\epsilon^a_5$ in terms of $\epsilon^a_L$ and $\epsilon^a_R$.


What you showed is that we can write the symmetry in a different basis. Instead of $SU(2)_L \times SU(2)_R$ we can write $SU(2)_V \times SU(2)_A$. ($SU(2)_V$ is also called “diagonal SU(2)”.)

2. A mass term for ψ breaks one of the SU(2) symmetries. Which one? Show it.

Instead of adding a mass term we introduce a scalar field, Σ, that transforms as a doublet under both $SU(2)_L$ and $SU(2)_R$. Then the most general Lagrangian I can write is

$$\mathcal{L} = \mathcal{L}_{\rm kin} - g\left(\overline{\psi_L}\,\Sigma\,\psi_R + \text{h.c.}\right) + \mathcal{L}(\Sigma^\dagger\Sigma). \tag{6.70}$$

In general Σ is a doublet under two SU(2) symmetries, so naively it must have 8 real components. Yet, recalling that $SO(4) \sim SU(2) \times SU(2)$, you should not be surprised that we can write it in terms of only 4 real scalar fields. We write

$$\Sigma = \sigma + i\tau^a\pi^a, \tag{6.71}$$

where $\tau^a$ are the Pauli matrices.

3. Show that the rotations of δΣ rotate these fields into each other, and write the rotations in

terms of σ and π and the ε and ε5 parameters. I will start you off: δσ = −εa5πa.

4. Rewrite the scalar-fermion couplings in terms of σ and π.

In order to give the nucleon a mass we need to break the chiral symmetry spontaneously,

$$SU(2)_L \times SU(2)_R \to SU(2), \tag{6.72}$$

that is, we would like to break the symmetry that forbids the nucleon mass. For that we would like the Σ field to acquire a vev. This can be done if we write its scalar potential as

$$V = \frac{1}{4}\lambda\left[\sigma^2 + \vec{\pi}^2 - F_\pi^2\right]^2. \tag{6.73}$$

Here $F_\pi$ is the so-called “pion decay constant” and it is the only mass scale in the theory.

5. What are the minima of V ? Find a minimum when only σ but not π acquires a vev.

6. Rewrite L in terms of ψ, πa and s ≡ σ − Fπ. Note that all these fields are physical, that is,

have no vev. What are the masses of these new fields?

7. What is the nucleon-nucleon-pion interaction term? Show that it satisfies $g = m_N/F_\pi$. (This relation is known as the Goldberger-Treiman relation and it is satisfied in Nature to a good accuracy.)

8. How many generators are broken? Does the Goldstone theorem hold?


9. In Nature the pions have small masses (small compared to the nucleon). That is, we like to

think about a small breaking of a symmetry. Which is this symmetry? Can you think about

a way to modify the model such that the pions end up with small masses? What other terms

have to be added once you decide to have a small breaking of the symmetry?

Question 6.4: The Higgs mechanism

We consider the model of section 6.5 with the Lagrangian of Eq. (6.45).

1. Show that to leading order in ξ/v and h/v, Eq. (6.48) is the same as Eq. (6.41), that is, show that

$$e^{i\xi(x_\mu)/v}\,\frac{v + h(x_\mu)}{\sqrt{2}} \approx \frac{v + h(x_\mu) + i\xi(x_\mu)}{\sqrt{2}}. \tag{6.74}$$

2. Use Eqs. (6.48) and (6.49) to derive Eq. (6.50).

3. Draw the tree-level diagrams for the hh → hh scattering and write down the amplitude.

Note that there are two types of diagrams.

4. Calculate (up to numerical constants and phase space) the differential cross section in the limit where $E \gg v$ and θ ∼ 1. Here E is the center of mass energy of the collision and θ is the scattering angle such that θ = 0 is forward scattering.

5. Consider now the same theory but with $\mu^2 > 0$ and calculate (again, up to numerical constants and phase space) the $\phi\phi^* \to \phi\phi^*$ differential cross section in the limit where $E \gg \mu$ and θ ∼ 1. Explain the similarity to the result of the hh → hh scattering cross section obtained in the previous item.


Chapter 7

The Leptonic Standard Model

7.1 Defining the LSM

The Leptonic Standard Model (LSM) incorporates the three aspects of imposed symmetries that

are discussed in previous chapters: Abelian symmetries, non-Abelian symmetries, and spontaneous

symmetry breaking. Moreover, the model is relevant to Nature. It accounts for the weak, electro-

magnetic and Yukawa interactions of the leptons. The LSM is part of the SM. The complete SM

adds quarks and strong interactions to the LSM.

In Section 6.6 we presented the three ingredients that are required to define a model. For the

LSM, these three ingredients are defined as follows:


(i) The symmetry is a local

$$SU(2)_L \times U(1)_Y. \tag{7.1}$$

(ii) The pattern of spontaneous symmetry breaking is as follows:

$$SU(2)_L \times U(1)_Y \to U(1)_{\rm EM}. \tag{7.2}$$

(iii) There are three fermion generations, each consisting of two different representations:

$$L_{Li}(2)_{-1/2}, \qquad E_{Ri}(1)_{-1}, \qquad i = 1, 2, 3. \tag{7.3}$$

There is a single scalar multiplet:

$$\phi(2)_{+1/2}. \tag{7.4}$$

We use the notation (N)Y such that N is the irreducible representation (irrep) under SU(2)L

and Y is the hypercharge (the charge under U(1)Y ). What we mean by Eq. (7.3) is that there are

nine Weyl fermion degrees of freedom that are grouped into three copies (“generations”) of the

same gauge representations. The three fermionic degrees of freedom in each generation form an

SU(2)-doublet (of hypercharge −1/2) and an SU(2)-singlet (of hypercharge −1).


It is now our task to find the specific form of the Lagrangian of Eq. (2.33) made of the $L_{Li}$, $E_{Ri}$

[Eq. (7.3)] and φ [Eq. (7.4)] fields, subject to the gauge symmetry of Eq. (7.1) and leading to the

SSB of Eq. (7.2).

7.2 The Lagrangian

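The most general renormalizable Lagrangian with scalar, fermion and gauge fields can be decomposed into

$$\mathcal{L} = \mathcal{L}_{\rm kin} + \mathcal{L}_\psi + \mathcal{L}_\phi + \mathcal{L}_Y, \tag{7.5}$$

such that $\mathcal{L}_{\rm kin}$ involves all the kinetic terms, $\mathcal{L}_\psi$ only fermion fields, $\mathcal{L}_\phi$ only scalar fields, and $\mathcal{L}_Y$ (denoted $\mathcal{L}_{\rm Yuk}$ below) has terms that combine fermion and scalar fields.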

We discuss each of these terms below.

7.2.1 Lkin and the gauge symmetry

The gauge group is given in Eq. (7.1). It has four generators: three $T_a$'s that form the SU(2) algebra

$$[T_a, T_b] = i\epsilon_{abc}T_c, \tag{7.6}$$

where a, b, c = 1, 2, 3, and a single Y that corresponds to the U(1) group, and thus does not form a non-trivial algebra. The generators of the SU(2) and the U(1) must commute as they belong to different gauge groups:

$$[T_a, Y] = 0. \tag{7.7}$$

There are two independent coupling constants in Lkin: there is a single g for all the SU(2) couplings

and a different one, g′, for the U(1) coupling. The SU(2) couplings must all be the same because

they mix with one another under SU(2) rotations. The U(1) coupling can be different from that

of the SU(2) because the generator Y never appears as a commutator of SU(2) generators.

The local symmetry requires four gauge boson DoFs, three in the adjoint representation of

SU(2) and one related to the U(1) symmetry:

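$$W^\mu_a(3)_0, \qquad B^\mu(1)_0. \tag{7.8}$$

The corresponding field strengths are given by [see Eqs. (2.29) and (4.18)]

$$W^{\mu\nu}_a = \partial^\mu W^\nu_a - \partial^\nu W^\mu_a - g\,\epsilon_{abc}W^\mu_b W^\nu_c, \qquad B^{\mu\nu} = \partial^\mu B^\nu - \partial^\nu B^\mu. \tag{7.9}$$

The covariant derivative is

$$D^\mu = \partial^\mu + igW^\mu_a T_a + ig'B^\mu Y. \tag{7.10}$$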


We define Lkin to include the kinetic terms of all the fields:

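$$\mathcal{L}_{\rm kin} = -\frac{1}{4}W^{\mu\nu}_a W_{a\mu\nu} - \frac{1}{4}B^{\mu\nu}B_{\mu\nu} - i\,\overline{L_{Li}}\,\slashed{D}\,L_{Li} - i\,\overline{E_{Ri}}\,\slashed{D}\,E_{Ri} - (D^\mu\phi)^\dagger(D_\mu\phi). \tag{7.11}$$

For the $SU(2)_L$ doublets $T_a = \sigma_a/2$ ($\sigma_a$ are the Pauli matrices), while for the $SU(2)_L$ singlets, $T_a = 0$. [For $SU(2)_L$ triplets, $(T_a)_{bc} = \epsilon_{abc}$, which has already been used in writing Eq. (7.9).] Explicitly,

$$\begin{aligned}
D^\mu\phi &= \left(\partial^\mu + \frac{i}{2}gW^\mu_a\sigma_a + \frac{i}{2}g'B^\mu\right)\phi,\\
D^\mu L_L &= \left(\partial^\mu + \frac{i}{2}gW^\mu_a\sigma_a - \frac{i}{2}g'B^\mu\right)L_L,\\
D^\mu E_R &= \left(\partial^\mu - ig'B^\mu\right)E_R.
\end{aligned} \tag{7.12}$$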

7.2.2 Lψ

There are no mass terms for the fermions in the LSM. We cannot write Dirac mass terms for the fermions because they are assigned to chiral representations of the gauge symmetry. We cannot write Majorana mass terms for the fermions because they all have $Y \neq 0$. Hence,

$$\mathcal{L}_\psi = 0. \tag{7.13}$$

We learn that the LSM is a chiral theory, that is a theory without bare mass terms for the fermions.

7.2.3 LYuk

The Yukawa part of the Lagrangian is given by

$$\mathcal{L}_{\rm Yuk} = Y^e_{ij}\,\overline{L_{Li}}\,E_{Rj}\,\phi + \text{h.c.}, \tag{7.14}$$

where i, j = 1, 2, 3 are flavor indices. The Yukawa matrix $Y^e$ is a general complex 3 × 3 matrix of dimensionless couplings. Without loss of generality, we can choose a basis where $Y^e$ is diagonal and real (see the discussion in subsection 7.5.1):

$$Y^e = {\rm diag}(y_e, y_\mu, y_\tau). \tag{7.15}$$

7.2.4 Lφ and spontaneous symmetry breaking

The Higgs potential, which leads to the spontaneous symmetry breaking, is given by

$$-\mathcal{L}_\phi = \mu^2\phi^\dagger\phi + \lambda\left(\phi^\dagger\phi\right)^2. \tag{7.16}$$

The discussion follows the same lines as the U(1) and SU(2) models presented in Chapter 6. The quartic coupling λ is dimensionless and real, and has to be positive for the potential to be bounded from below. The quadratic coupling $\mu^2$ has mass dimension 2 and is real. If the gauge symmetry is to be spontaneously broken, Eq. (7.2), we must take $\mu^2 < 0$. Defining

$$v^2 = -\frac{\mu^2}{\lambda}, \tag{7.17}$$

we can rewrite Eq. (7.16) as follows (up to a constant term):

$$-\mathcal{L}_\phi = \lambda\left(\phi^\dagger\phi - \frac{v^2}{2}\right)^2. \tag{7.18}$$

The scalar potential (7.18) implies that the scalar field acquires a VEV, $|\langle\phi\rangle| = v/\sqrt{2}$. We have to make a choice, and we choose it in the real direction of the lower component,

$$\langle\phi\rangle = \begin{pmatrix}0 + i0\\ v/\sqrt{2} + i0\end{pmatrix}. \tag{7.19}$$

This VEV breaks the SU(2) × U(1) symmetry down to a U(1) subgroup. The way to see this is to note that only one linear combination of generators annihilates the vacuum state. For our choice of vacuum it is $T_3 + Y$. Since the generator of $U(1)_{\rm EM}$, Q, must be the generator of the unbroken subgroup, we identify the unbroken generator as

$$Q = T_3 + Y. \tag{7.20}$$

Before we proceed, let us clarify a few points regarding our choice of having the VEV in the

direction of the T3 = −1/2 component of φ:

1. We could equally well choose to have the VEV in the direction of the $T_3 = +1/2$ component. In this case we would have $Q = T_3 - Y$, and the physics would remain the same.

2. Let us write explicitly the two components of the $SU(2)_L$ doublets:

$$L_{L1} = \begin{pmatrix}\nu_{eL}\\ e_L\end{pmatrix}, \qquad \phi = \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix}. \tag{7.21}$$

The charge under $U(1)_{\rm EM}$ of the different components, q, is given by

$$q(\nu_{eL}) = 0, \quad q(e_L) = -1, \quad q(e_R) = -1, \quad q(\phi^+) = +1, \quad q(\phi^0) = 0. \tag{7.22}$$

3. If SU(2)L×U(1)Y were an exact symmetry of Nature, there would be no way of distinguishing

particles of different electric charges in the same SU(2)L multiplet. The SSB makes, for

example, νeL distinguishable from eL.
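The charge assignments of Eq. (7.22) follow directly from $Q = T_3 + Y$ and the representations of Eqs. (7.3) and (7.4). A minimal sketch of this bookkeeping is given below; the dictionary layout and field labels are our own choices.

```python
from fractions import Fraction as F

# (T3, Y) for each component, from Eqs. (7.3), (7.4) and (7.21).
FIELDS = {
    "nu_eL": (F(1, 2), F(-1, 2)),
    "e_L": (F(-1, 2), F(-1, 2)),
    "e_R": (F(0), F(-1)),
    "phi_plus": (F(1, 2), F(1, 2)),
    "phi_0": (F(-1, 2), F(1, 2)),
}


def electric_charge(t3, y):
    """Q = T3 + Y, Eq. (7.20)."""
    return t3 + y


charges = {name: electric_charge(*qn) for name, qn in FIELDS.items()}
for name, q in charges.items():
    print(name, q)
```

The component that acquires the VEV, $\phi^0$, comes out neutral, which is the statement that Q annihilates the vacuum and $U(1)_{\rm EM}$ remains unbroken.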

Let us denote the four real components of the scalar doublet as three phases, $\theta_a(x)$ (a = 1, 2, 3), and one magnitude, h(x). We choose the three phases to be the three would-be Goldstone bosons, in a way that is similar to the case we discuss in Section 6.5. In our case the broken generators are $T_1$, $T_2$, and $T_3 - Y$, and thus we write

$$\phi(x) = \exp\left[(i/2)\left(\sigma_a\theta_a(x) - I\theta_3(x)\right)\right]\frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v + h(x)\end{pmatrix}. \tag{7.23}$$

The local $SU(2)_L \times U(1)_Y$ symmetry of the Lagrangian allows one to rotate away the explicit dependence on the three $\theta_a(x)$. They represent the three would-be Goldstone bosons that are eaten by the three gauge bosons that acquire masses as a result of the SSB. See the discussion in Section 6.5. In this gauge φ(x) has one DoF:

$$\phi(x) = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v + h(x)\end{pmatrix}. \tag{7.24}$$

7.2.5 Summary

The renormalizable part of the Leptonic Standard Model Lagrangian is given by

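$$\mathcal{L}_{\rm SM} = -\frac{1}{4}W^{\mu\nu}_a W_{a\mu\nu} - \frac{1}{4}B^{\mu\nu}B_{\mu\nu} - (D^\mu\phi)^\dagger(D_\mu\phi) - i\,\overline{L_{Li}}\,\slashed{D}\,L_{Li} - i\,\overline{E_{Ri}}\,\slashed{D}\,E_{Ri} + \left(Y^e_{ij}\,\overline{L_{Li}}\,E_{Rj}\,\phi + \text{h.c.}\right) - \mu^2\phi^\dagger\phi - \lambda\left(\phi^\dagger\phi\right)^2, \tag{7.25}$$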

where i, j = 1, 2, 3.

7.3 The Spectrum

7.3.1 Scalars: back to Lφ

The scalar sector contains one real scalar field that we denote by h. This is the Higgs boson of the LSM. Its mass can be obtained by plugging (7.24) into (7.18), and is given by

$$m_h = \sqrt{2\lambda}\,v. \tag{7.26}$$

Experiment gives (PDG 2015)

$$m_h = 125.09 \pm 0.24\ {\rm GeV}. \tag{7.27}$$
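Eq. (7.26) can be inverted to estimate the quartic coupling once v is known. The sketch below assumes $v \approx 246.22$ GeV, a value that is not quoted in this section and is taken here as an external input; the function name is our own.

```python
M_H = 125.09    # GeV, Eq. (7.27)
V_EW = 246.22   # GeV, assumed external input (not quoted in this section)


def quartic_from_mass(m_h, v):
    """Invert m_h = sqrt(2 lam) v, Eq. (7.26), to obtain lam."""
    return m_h ** 2 / (2 * v ** 2)


lam = quartic_from_mass(M_H, V_EW)
print(round(lam, 3))
```

Under this assumption the tree-level quartic coupling comes out at roughly 0.13, comfortably in the perturbative regime.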

7.3.2 Vector bosons: back to Lkin(φ)

Since the symmetry that is related to three out of the four generators is spontaneously broken,

three of the four vector bosons acquire masses, while one remains massless. To see how this

happens, we examine (Dµ〈φ〉)†(Dµ〈φ〉). Using Eq. (7.12) for Dµφ, we obtain:

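$$D^\mu\begin{pmatrix}0\\ v/\sqrt{2}\end{pmatrix} = \frac{i}{\sqrt{8}}\left(gW^\mu_a\sigma_a + g'B^\mu\right)\begin{pmatrix}0\\ v\end{pmatrix} = \frac{i}{\sqrt{8}}\begin{pmatrix}gW^\mu_3 + g'B^\mu & g(W^\mu_1 - iW^\mu_2)\\ g(W^\mu_1 + iW^\mu_2) & -gW^\mu_3 + g'B^\mu\end{pmatrix}\begin{pmatrix}0\\ v\end{pmatrix}. \tag{7.28}$$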


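The mass terms for the vector bosons are thus given by (we omit Lorentz indices)

$$\mathcal{L}_{M_V} = \frac{1}{8}\begin{pmatrix}0 & v\end{pmatrix}\begin{pmatrix}gW_3 + g'B & g(W_1 - iW_2)\\ g(W_1 + iW_2) & -gW_3 + g'B\end{pmatrix}\begin{pmatrix}gW_3 + g'B & g(W_1 - iW_2)\\ g(W_1 + iW_2) & -gW_3 + g'B\end{pmatrix}\begin{pmatrix}0\\ v\end{pmatrix}. \tag{7.29}$$

We define an angle $\theta_W$ via

$$\tan\theta_W \equiv \frac{g'}{g}. \tag{7.30}$$

We define four gauge boson states:

$$W^\pm = \frac{1}{\sqrt{2}}(W_1 \mp iW_2), \qquad Z^0 = \cos\theta_W W_3 - \sin\theta_W B, \qquad A^0 = \sin\theta_W W_3 + \cos\theta_W B. \tag{7.31}$$

We see that the $W^\pm$ are charged under electromagnetism (hence the superscripts ±), while $A^0$ and $Z^0$ are not. In terms of the vector boson fields of Eq. (7.31), we write Eq. (7.29) as follows:

$$\mathcal{L}_{M_V} = \frac{1}{4}g^2v^2\,W^+W^- + \frac{1}{8}(g^2 + g'^2)v^2\,Z^0Z^0. \tag{7.32}$$

We learn that the four states of Eq. (7.31) are the mass eigenstates, with masses

$$m_W^2 = \frac{1}{4}g^2v^2, \qquad m_Z^2 = \frac{1}{4}(g^2 + g'^2)v^2, \qquad m_A^2 = 0. \tag{7.33}$$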

(Recall that for a complex field φ with mass m the mass term is $m^2|\phi|^2$ while for a real field it is $m^2\phi^2/2$.) Three points are worth emphasizing:

1. As anticipated, three vector bosons acquire masses.

2. $m_A^2 = 0$ is not a prediction, but rather a consistency check on our calculation.

3. The angle θW represents a rotation angle of the two neutral vector bosons from the interaction

basis, where fields have well-defined transformation properties under the full gauge symmetry,

(W3, B), into the mass basis for the vector bosons, (Z,A).
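The rotation described in point 3 can be checked explicitly. From Eq. (7.29), the neutral part of the mass term is $(v^2/8)(gW_3 - g'B)^2$, which defines a 2 × 2 mass-squared matrix in the $(W_3, B)$ basis. The sketch below rotates it by $\theta_W$ as in Eq. (7.31); the function names and the numerical values of g, g′ and v are illustrative assumptions, not fitted to data.

```python
import math


def neutral_mass_matrix(g, gp, v):
    """M^2 in the (W3, B) basis, defined by (1/2) (W3 B) M^2 (W3 B)^T = (v^2/8) (g W3 - g' B)^2."""
    k = v * v / 4
    return [[k * g * g, -k * g * gp], [-k * g * gp, k * gp * gp]]


def rotate(m, theta):
    """Return R M R^T, where (Z, A)^T = R (W3, B)^T as in Eq. (7.31)."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    rm = [[sum(r[i][k] * m[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return [[sum(rm[i][k] * r[j][k] for k in range(2)) for j in range(2)] for i in range(2)]


g, gp, v = 0.65, 0.35, 246.0   # illustrative values (assumption)
theta_w = math.atan2(gp, g)    # tan(theta_W) = g'/g, Eq. (7.30)
m_diag = rotate(neutral_mass_matrix(g, gp, v), theta_w)

print(m_diag[0][0], (g * g + gp * gp) * v * v / 4)   # m_Z^2, Eq. (7.33)
print(m_diag[1][1], m_diag[0][1])                    # m_A^2 and the off-diagonal entry
```

In the rotated basis the matrix should be diagonal, with entries $m_Z^2$ and zero, which is the consistency check mentioned in point 2.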

In Chapter 6 it is emphasized that SSB leads to relations between observables that would have been independent in the absence of a symmetry. One such important relation involves the vector-boson masses and their couplings:

$$\frac{m_W^2}{m_Z^2} = \frac{g^2}{g^2 + g'^2}. \tag{7.34}$$

It is conventional to express this relation in terms of $\theta_W$, defined in Eq. (7.30):

$$\rho \equiv \frac{m_W^2}{m_Z^2\cos^2\theta_W} = 1. \tag{7.35}$$

This relation is testable. The left hand side of Eq. (7.34) can be derived from the measured

spectrum, and the right hand side from interaction rates. The ρ = 1 relation is a consequence of


the SSB by SU(2)-doublets. (See the homework for other possibilities.) It thus tests this specific

ingredient of the LSM.

The experimental values of the weak gauge boson masses are given by (PDG 2014)

$$m_W = 80.385 \pm 0.015\ {\rm GeV}; \qquad m_Z = 91.1876 \pm 0.0021\ {\rm GeV}. \tag{7.36}$$

We can then use the ρ = 1 relation to determine $\sin^2\theta_W$:

$$\frac{m_W}{m_Z} = 0.8815 \pm 0.0002 \;\Longrightarrow\; \sin^2\theta_W = 1 - (m_W/m_Z)^2 = 0.2229 \pm 0.0004. \tag{7.37}$$

Below we describe the determination of $\sin^2\theta_W$ by various interaction rates. We will see that the ρ = 1 relation is indeed realized in Nature (within experimental errors, and up to calculable quantum corrections that we discuss at length in Chapter 11).
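The arithmetic behind Eq. (7.37) is short enough to reproduce. The sketch below uses the masses of Eq. (7.36) and a simple Gaussian error propagation, assuming uncorrelated uncertainties; this is our own estimate of the errors and may differ slightly from the rounding used in the text.

```python
import math

M_W, DM_W = 80.385, 0.015     # GeV, Eq. (7.36)
M_Z, DM_Z = 91.1876, 0.0021   # GeV, Eq. (7.36)

ratio = M_W / M_Z
# Relative errors add in quadrature for a ratio (assuming uncorrelated errors).
d_ratio = ratio * math.hypot(DM_W / M_W, DM_Z / M_Z)

# sin^2(theta_W) = 1 - (m_W / m_Z)^2, using the rho = 1 relation.
sin2 = 1 - ratio ** 2
d_sin2 = 2 * ratio * d_ratio

print(f"m_W/m_Z = {ratio:.4f} +- {d_ratio:.4f}")
print(f"sin^2(theta_W) = {sin2:.4f} +- {d_sin2:.4f}")
```

The central values should reproduce Eq. (7.37); the uncertainty is dominated by the W mass, since the Z mass is known far more precisely.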

7.3.3 Fermions: back to LYuk

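Here we see how some of the chiral fermions in the model acquire masses. The Yukawa part of the Lagrangian is given by Eq. (7.14). The SSB allows us to tell apart the upper and lower components of the doublet. In the basis defined in Eq. (7.15), we denote these components as follows:

$$L_{L1} = \begin{pmatrix}\nu_{eL}\\ e_L\end{pmatrix}, \qquad L_{L2} = \begin{pmatrix}\nu_{\mu L}\\ \mu_L\end{pmatrix}, \qquad L_{L3} = \begin{pmatrix}\nu_{\tau L}\\ \tau_L\end{pmatrix}, \tag{7.38}$$

where e, µ, τ are ordered by the size of $y_{e,\mu,\tau}$ (from smallest to largest). We also define

$$E_{R1} = e_R, \qquad E_{R2} = \mu_R, \qquad E_{R3} = \tau_R. \tag{7.39}$$

Eq. (7.22) tells us that the neutrinos ($\nu_{eL}$, $\nu_{\mu L}$, $\nu_{\tau L}$) have charge zero, while the left-handed charged leptons ($e_L$, $\mu_L$, $\tau_L$) and the right-handed leptons ($e_R$, $\mu_R$, $\tau_R$) carry charge −1.

With $\phi^0$ acquiring a VEV, the Yukawa term has a piece that corresponds to the charged lepton masses. These terms are the ones obtained by replacing φ by its VEV in Eq. (7.14), leading to

$$-\frac{y_e v}{\sqrt{2}}\,\overline{e_L}\,e_R - \frac{y_\mu v}{\sqrt{2}}\,\overline{\mu_L}\,\mu_R - \frac{y_\tau v}{\sqrt{2}}\,\overline{\tau_L}\,\tau_R + \text{h.c.}, \tag{7.40}$$

namely

$$m_e = \frac{y_e v}{\sqrt{2}}, \qquad m_\mu = \frac{y_\mu v}{\sqrt{2}}, \qquad m_\tau = \frac{y_\tau v}{\sqrt{2}}. \tag{7.41}$$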

The crucial point is that while the leptons are in a chiral representation of the full gauge

group SU(2)L × U(1)Y , the charged leptons — e, µ, τ — are in a vectorial representation of the

subgroup that is not spontaneously broken, that is U(1)EM. This situation is the key to opening

the possibility of acquiring masses as a result of the SSB, as realized in Eq. (7.40).

In your homework you will find that the number of Higgs representations that can give the gauge bosons their masses is large, but only very few also give masses to the fermions.


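Table 7.1: The LSM particles

| particle | spin | Q | mass (theo) |
|---|---|---|---|
| $W^\pm$ | 1 | ±1 | $\frac{1}{2}gv$ |
| $Z^0$ | 1 | 0 | $\frac{1}{2}\sqrt{g^2 + g'^2}\,v$ |
| $A^0$ | 1 | 0 | 0 |
| h | 0 | 0 | $\sqrt{2\lambda}\,v$ |
| e | 1/2 | −1 | $\frac{1}{\sqrt{2}}y_e v$ |
| µ | 1/2 | −1 | $\frac{1}{\sqrt{2}}y_\mu v$ |
| τ | 1/2 | −1 | $\frac{1}{\sqrt{2}}y_\tau v$ |
| $\nu_e$ | 1/2 | 0 | 0 |
| $\nu_\mu$ | 1/2 | 0 | 0 |
| $\nu_\tau$ | 1/2 | 0 | 0 |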

The charged lepton masses have been measured:

$$m_e = 0.510998928(11)\ \text{MeV}, \quad m_\mu = 105.6583715(35)\ \text{MeV}, \quad m_\tau = 1776.82(16)\ \text{MeV}. \tag{7.42}$$
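As a rough numerical illustration, the measured masses can be turned into Yukawa couplings by inverting Eq. (7.41). The sketch below assumes the rounded value v = 246 GeV for the VEV; the dictionary and function names are only illustrative.

```python
import math

# Measured charged-lepton masses from Eq. (7.42), converted to GeV.
masses_gev = {
    "e": 0.510998928e-3,
    "mu": 105.6583715e-3,
    "tau": 1776.82e-3,
}

# Assumption: v = 246 GeV, the rounded value of the scalar VEV.
V_GEV = 246.0

def yukawa(mass_gev, v_gev=V_GEV):
    """Invert m = y v / sqrt(2), Eq. (7.41), for the Yukawa coupling y."""
    return math.sqrt(2.0) * mass_gev / v_gev

yukawas = {name: yukawa(m) for name, m in masses_gev.items()}

for name, y in yukawas.items():
    print(f"y_{name} = {y:.3e}")
```

The three couplings come out spread over roughly four orders of magnitude, which the LSM accommodates but does not explain.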

The neutrinos are massless in this model. There are no $L_R(2)_{-1/2}$ fields in the LSM, so there are no Dirac mass terms for the neutrinos in the symmetry limit. There are no $N_R(1)_0$ fields in the LSM, so the neutrinos cannot acquire Dirac masses as a result of the SSB. A priori, since the neutrinos have no charge under U(1)EM, the possibility of acquiring Majorana masses is not closed. Nevertheless, the neutrinos do not acquire Majorana masses from renormalizable terms, because lepton number is an accidental symmetry of the theory (see Section 7.5.3).

7.3.4 Summary

We presented the details of the spectrum of the leptonic standard model. These are summarized

in Table 7.1. All masses are proportional to the VEV of the scalar field, v. For the three

massive gauge bosons, and for the three charged leptons, this must be the case: In the absence

of spontaneous symmetry breaking, the former would be protected against acquiring a mass by

the gauge symmetry and the latter by their chiral nature. For the Higgs boson, the situation is

different, as a mass-squared term does not violate any symmetry. Here it is just a manifestation

of the fact that the LSM has a single dimensionful parameter, which can be taken to be v, and

therefore all masses must be proportional to this parameter.



In this Section, we obtain the interactions among the mass eigenstates of well-defined EM charge

of the LSM. The scalar potential of Eq. (7.16) leads to Higgs self-interactions. The Yukawa terms

of Eq. (7.14) lead to Higgs-mediated Yukawa interactions among the charged leptons. The kinetic

terms of Eq. (7.11) lead to three types of interactions mediated by vector bosons: the photon-mediated

electromagnetic interactions (QED), the Z-mediated weak interactions (neutral current

weak interactions), and the W±-mediated weak interactions (charged current weak interactions).

To obtain the latter three types of interactions, we need to rewrite the covariant derivative given

in Eq. (7.10) in terms of the vector boson mass eigenstates, A, Z and W± defined in Eq. (7.31):

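$$D_\mu = \partial_\mu + ig\left(W^+_\mu T^+ + W^-_\mu T^-\right) + i\left(g\sin\theta_W T_3 + g'\cos\theta_W Y\right)A_\mu + i\left(g\cos\theta_W T_3 - g'\sin\theta_W Y\right)Z_\mu, \tag{7.43}$$

where $T^\pm = (T_1 \mp iT_2)/\sqrt{2}$.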

7.4.1 The Higgs boson

The kinetic, gauge-interaction, self-interaction and Yukawa interaction terms of h are given by

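$$\begin{aligned}
\mathcal{L}_h = {} & \frac{1}{2}\partial_\mu h\,\partial^\mu h - \frac{1}{2}m_h^2 h^2 - \frac{m_h^2}{2v}h^3 - \frac{m_h^2}{8v^2}h^4 \\
& + m_W^2\,W^-_\mu W^{\mu+}\left(\frac{2h}{v} + \frac{h^2}{v^2}\right) + \frac{1}{2}m_Z^2\,Z_\mu Z^\mu\left(\frac{2h}{v} + \frac{h^2}{v^2}\right) \\
& - \frac{h}{v}\left(m_e\,\overline{e_L}\,e_R + m_\mu\,\overline{\mu_L}\,\mu_R + m_\tau\,\overline{\tau_L}\,\tau_R + \text{h.c.}\right).
\end{aligned} \tag{7.44}$$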

We write Lh in a way that demonstrates that all of the Higgs couplings can be expressed in terms

of the masses of the particles to which it couples.

The Higgs mass is given in Eq. (7.26), $m_h = \sqrt{2\lambda}\,v$. It determines its quartic self-coupling,

$$\frac{m_h^2}{2v^2} = \lambda, \tag{7.45}$$

which is unchanged from the quartic coupling in (7.16), and its trilinear self-coupling,

$$\frac{m_h^2}{2v} = \lambda v, \tag{7.46}$$

which arises as a consequence of the SSB. The Higgs coupling to the weak interaction gauge bosons

is proportional to their masses-squared. The dimensionless hhV V couplings,

$$\frac{m_W^2}{v^2} = \frac{g^2}{4}, \quad \frac{m_Z^2}{2v^2} = \frac{g^2 + g'^2}{8}, \tag{7.47}$$

are unchanged from Eq. (7.11). The hV V couplings,

$$\frac{2m_W^2}{v} = \frac{g^2 v}{2}, \quad \frac{m_Z^2}{v} = \frac{(g^2 + g'^2)\,v}{4}, \tag{7.48}$$

arise as a consequence of the SSB.

There is neither an hAA nor hhAA coupling. One can understand the absence of these couplings

in two ways. First, the Higgs boson is electromagnetically neutral, so it should not couple to the

electromagnetic force carrier. Second, the photon is massless, so it should not couple to the Higgs

boson.

The Yukawa couplings of the Higgs boson to the charged leptons are proportional to their masses: the heavier the lepton, the stronger the coupling. Note that these couplings, $m_\ell/v = y_\ell/\sqrt{2}$, are unchanged from Eq. (7.14).

7.4.2 QED: Electromagnetic interactions

In this Section we study the photon interactions and show that we recover QED. Using Eq. (7.43)

and the definition of θW in Eq. (7.30), we find that the photon coupling is proportional to

$$g\sin\theta_W T_3 + g'\cos\theta_W Y = \frac{gg'}{\sqrt{g^2 + g'^2}}\,(T_3 + Y). \tag{7.49}$$

This is what we wanted! The coupling is proportional to T3 + Y , which we defined as Q, the

generator of U(1)EM. The photon coupling is conventionally defined as eQ. We learn that

$$g = \frac{e}{\sin\theta_W}, \quad g' = \frac{e}{\cos\theta_W}. \tag{7.50}$$
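To get a feeling for the sizes involved, Eq. (7.50) can be evaluated numerically. This is only a sketch: it assumes the low-energy value α ≈ 1/137.036 with e² = 4πα, and uses the illustrative value sin²θW = 0.23.

```python
import math

ALPHA = 1 / 137.036      # assumption: low-energy fine structure constant
SIN2_THETA_W = 0.23      # illustrative value of sin^2(theta_W)

e = math.sqrt(4 * math.pi * ALPHA)            # e^2 = 4 pi alpha
g = e / math.sqrt(SIN2_THETA_W)               # Eq. (7.50)
g_prime = e / math.sqrt(1 - SIN2_THETA_W)     # Eq. (7.50)

# Cross-check against Eq. (7.49): e = g g' / sqrt(g^2 + g'^2).
e_check = g * g_prime / math.hypot(g, g_prime)

print(f"e = {e:.4f}, g = {g:.4f}, g' = {g_prime:.4f}")
```

With these inputs g comes out roughly twice as large as e, so the SU(2)L gauge coupling is not small compared to the electromagnetic one.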

Thus, the electromagnetic interactions are described by the QED Lagrangian [see Eq. (3.18)],

which is now understood as the part of the LSM Lagrangian that involves the photon field, A and

the charged fermions:

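$$\mathcal{L}_{\rm QED} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + eA_\mu\,\overline{\ell_i}\,\gamma^\mu\ell_i, \tag{7.51}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. The $\ell_{1,2,3} = e, \mu, \tau$ fields are the Dirac fermions with $Q = -1$ that are formed from the $T_3 = -1/2$ component of a left-handed lepton doublet and from a right-handed lepton singlet, for example $\tau = (\tau_L, \tau_R)^T$.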

The QED interactions are discussed in Chapter 3. Here we only emphasize again some impor-

tant features that arise from Eq. (7.51):

1. The photon couplings are vector-like: the photon couples to the left-handed and right-handed fields

in the same way.

2. Thus, electromagnetic interactions are parity conserving.

3. Diagonality. The photon couples to e+e−, µ+µ− and τ+τ−, but not to e±µ∓, e±τ∓ or

µ±τ∓ pairs. Thus, electromagnetic interactions do not change flavor. This is a result of the

unbroken local U(1)EM symmetry.

4. Universality: the couplings of the photon to the different generations are universal. This is

a result of the U(1)EM gauge invariance.


7.4.3 Neutral current weak interactions

In this Section we study the Z-boson interactions with fermions. Using Eq. (7.43) and the definition

of θW in Eq. (7.30), we find that the Z-boson coupling is proportional to

$$g\cos\theta_W T_3 - g'\sin\theta_W Y = \frac{g}{\cos\theta_W}\left(T_3 - \sin^2\theta_W\,Q\right). \tag{7.52}$$

This leads to the following Zff interactions:

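$$\mathcal{L}_{Zff} = \frac{g}{\cos\theta_W}\left[-\left(\frac{1}{2} - \sin^2\theta_W\right)\overline{\ell_{iL}}\,Z\!\!\!/\,\ell_{iL} + \sin^2\theta_W\,\overline{\ell_{iR}}\,Z\!\!\!/\,\ell_{iR} + \frac{1}{2}\,\overline{\nu_{iL}}\,Z\!\!\!/\,\nu_{iL}\right], \tag{7.53}$$

where $\ell_{1,2,3} = e, \mu, \tau$ and $\nu_{1,2,3} = \nu_e, \nu_\mu, \nu_\tau$. Note that, unlike the photon, the Z couples to neutrinos.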

Z-exchange gives rise to neutral current weak interactions (NCWI). Eq. (7.53) reveals some further

important features of the model:

1. The Z-boson couplings are chiral: the Z-boson couples to left-handed and right-handed fields with different strengths.

2. Thus, the Z-interactions are parity violating.

3. Diagonality. The Z-boson couples to, for example, $e^+e^-$, $\mu^+\mu^-$, $\nu_{eL}\overline{\nu_{eL}}$ and $\nu_{\mu L}\overline{\nu_{\mu L}}$, but not to, for example, $e^\pm\mu^\mp$ and $\nu_{eL}\overline{\nu_{\mu L}}$ pairs. Consequently, there are no flavor changing neutral currents (FCNCs). This can be thought of as a result of an accidental $U(1)^3$ symmetry of the model, see Section 7.5.3.

4. Universality: the couplings of the Z-boson to the different generations within each of the three sectors ($\nu_L$, $\ell_L$, $\ell_R$) are universal. This is a result of a special feature of the LSM: all fermions of given chirality and given charge come from the same SU(2) × U(1) representation.

The above points have been experimentally tested. For example, the branching ratios of the

Z-boson into charged lepton pairs,

BR(Z → e+e−) = (3.363± 0.004)% , (7.54)

BR(Z → µ+µ−) = (3.366± 0.007)% ,

BR(Z → τ+τ−) = (3.367± 0.008)% .

beautifully confirm universality:

Γ(µ+µ−)/Γ(e+e−) = 1.0009± 0.0028, (7.55)

Γ(τ+τ−)/Γ(e+e−) = 1.0019± 0.0032.

Diagonality is also tested by the following experimental searches:

$$\text{BR}(Z \to e^+\mu^-) < 1.7 \times 10^{-6}, \quad \text{BR}(Z \to e^+\tau^-) < 9.8 \times 10^{-6}, \quad \text{BR}(Z \to \mu^+\tau^-) < 1.2 \times 10^{-5}. \tag{7.56}$$


The branching ratio of Z decays into invisible final states which, in our model, is interpreted as the decay into neutrino pairs, is measured to be

$$\text{BR}(Z \to \nu\bar\nu) = (20.00 \pm 0.06)\%. \tag{7.57}$$

From Eq. (7.53) we obtain

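$$\frac{\text{BR}(Z \to \ell^+\ell^-)}{\text{BR}(Z \to \nu_\ell\bar\nu_\ell)} = \frac{(1/2 - \sin^2\theta_W)^2 + \sin^4\theta_W}{1/4} = 1 - 4\sin^2\theta_W + 8\sin^4\theta_W. \tag{7.58}$$

Taking the single-flavor neutrino branching ratio to be one third of Eq. (7.57), we can thus extract $\sin^2\theta_W$ from the experimental data, $\sin^2\theta_W = 0.226$, consistent with Eq. (7.37).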

We discuss these decays in more detail in Chapter 11.
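The extraction of sin²θW from Eq. (7.58) amounts to solving a quadratic equation. The following sketch uses the branching ratios of Eqs. (7.54) and (7.57) and assumes that the invisible branching ratio is shared equally among the three neutrino flavors; the function name is illustrative.

```python
import math

# Branching ratios in percent, from Eqs. (7.54) and (7.57).
br_charged = {"e": 3.363, "mu": 3.366, "tau": 3.367}
br_invisible = 20.00

# Assumption: the invisible width is shared equally by three neutrino flavors.
br_single_neutrino = br_invisible / 3.0

def sin2_from_ratio(ratio):
    """Solve 8 x^2 - 4 x + (1 - ratio) = 0, i.e. Eq. (7.58), for x = sin^2(theta_W).

    The smaller root is returned; the larger root (x > 1/4) is the second
    mathematical solution and is not the one quoted in the text.
    """
    disc = 16.0 - 32.0 * (1.0 - ratio)
    if disc < 0:
        raise ValueError("ratio below the minimum value 1/2 allowed by Eq. (7.58)")
    return (4.0 - math.sqrt(disc)) / 16.0

results = {
    flavor: sin2_from_ratio(br / br_single_neutrino)
    for flavor, br in br_charged.items()
}
for flavor, s2 in results.items():
    print(f"{flavor}: sin^2(theta_W) = {s2:.3f}")
```

Because the measured ratio sits close to the minimum value 1/2 of the quadratic, the extracted sin²θW is quite sensitive to small shifts in the input branching ratios.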

We also remark that we can write the coupling of the Z to fermions in terms of Dirac fermions.

We define

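$$g_L = T_3 - \sin^2\theta_W Q, \quad g_R = -\sin^2\theta_W Q, \tag{7.59}$$

$$g_V = g_L + g_R = T_3 - 2\sin^2\theta_W Q, \quad g_A = g_L - g_R = T_3,$$

and we get that for any fermion ψ the coupling is

$$\mathcal{L}_{Z\psi\psi} = \frac{g}{2\cos\theta_W}\,\overline{\psi}\,Z\!\!\!/\,(g_V - g_A\gamma_5)\,\psi, \tag{7.60}$$

where the factor 1/2 arises from the chiral projectors, $g_L P_L + g_R P_R = \frac{1}{2}(g_V - g_A\gamma_5)$ with $P_{L,R} = (1 \mp \gamma_5)/2$.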

7.4.4 Charged current weak interactions

In this Section we study the W±-boson interactions with fermions. Using Eq. (7.43), the definition

of θW in Eq. (7.30), and the explicit form of the Ta matrices, we find the W -boson couplings to

fermion pairs are given by

$$\mathcal{L}_W = -\frac{g}{\sqrt{2}}\,\overline{\nu_{iL}}\,W\!\!\!\!/^{\,+}\,\ell^-_{iL} + \text{h.c.} \tag{7.61}$$

The interactions mediated by the $W^\pm$ vector bosons are called charged current weak interactions (CCWI).

They are unique among the interactions of the LSM, as the fermion pairs to which the W -boson

couples consist of two different fermions, a neutrino and a charged lepton. This must be the case

as the W -bosons are charged, so they must change the identity of the particle with which they

interact.

Eq. (7.61) reveals some important features of the model:

1. Only left-handed particles take part in charged-current interactions.

2. Parity violation: a consequence of the previous feature is that the W -mediated interactions

violate parity.

3. Diagonality: the charged current interactions couple each charged lepton to a single neutrino,

and each neutrino to a single charged lepton.


4. Universality: the couplings of the W-boson to $\tau\bar\nu_\tau$, to $\mu\bar\nu_\mu$ and to $e\bar\nu_e$ are equal. This is a

result of the local nature of the imposed SU(2): a global symmetry would have allowed an

independent coupling to each lepton pair.

All of these predictions have been experimentally tested. As an example of how well universality

works, consider the decay rates of the W -bosons to the three lepton pairs:

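$$\text{BR}(W^+ \to e^+\nu_e) = (10.75 \pm 0.13) \times 10^{-2}, \quad \text{BR}(W^+ \to \mu^+\nu_\mu) = (10.57 \pm 0.15) \times 10^{-2}, \quad \text{BR}(W^+ \to \tau^+\nu_\tau) = (11.25 \pm 0.20) \times 10^{-2}. \tag{7.62}$$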

You must be impressed by the nice agreement!
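One simple way to quantify the agreement is to form ratios of the branching ratios in Eq. (7.62) and compare them with the universality prediction of 1. The sketch below treats the quoted errors as uncorrelated, which is an assumption made here for illustration only.

```python
import math

# W branching ratios and uncertainties (in percent), from Eq. (7.62).
br_w = {"e": (10.75, 0.13), "mu": (10.57, 0.15), "tau": (11.25, 0.20)}

def ratio_with_error(num, den):
    """Ratio of two measurements, with uncorrelated errors added in quadrature."""
    (a, da), (b, db) = num, den
    r = a / b
    return r, r * math.hypot(da / a, db / b)

r_mu_e = ratio_with_error(br_w["mu"], br_w["e"])
r_tau_e = ratio_with_error(br_w["tau"], br_w["e"])

for label, (r, dr) in (("mu/e", r_mu_e), ("tau/e", r_tau_e)):
    pull = (r - 1.0) / dr   # deviation from universality in standard deviations
    print(f"{label}: {r:.3f} +/- {dr:.3f}  ({pull:+.1f} sigma)")
```

Under this assumption both ratios are compatible with 1 at the level of roughly one to two standard deviations.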

The charged current interaction gives rise to all flavor changing weak decays. One example is

the $\mu^- \to e^-\nu_\mu\bar\nu_e$ decay. One can use this decay rate as yet another independent way to determine $\sin^2\theta_W$. At low energy, $q^2 \ll m_W^2$, the W-propagator is well approximated by a four-fermion coupling:

$$\frac{g^2}{m_W^2 - q^2} \approx \frac{g^2}{m_W^2} = \frac{4\pi\alpha}{\sin^2\theta_W\,m_W^2} \equiv 4\sqrt{2}\,G_F, \tag{7.63}$$

where we used $g^2 = 4\pi\alpha/\sin^2\theta_W$ based on Eqs. (3.10) and (7.50). The measured muon lifetime,

$$\tau_\mu = (2.197034 \pm 0.000021) \times 10^{-6}\ \text{s}, \tag{7.64}$$

determines $G_F$ via

$$\Gamma_\mu = \frac{1}{\tau_\mu} = \frac{G_F^2 m_\mu^5}{192\pi^3}\,f(m_e^2/m_\mu^2)\,(1 + \delta_{RC}), \quad f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\log x, \tag{7.65}$$

where $f(x)$ is the phase space function for a three-body decay with two massless final particles (it is normalized to 1 in the case when all final particles are massless) and $\delta_{RC}$ encodes radiative corrections, and is known to $O(\alpha^2)$. One gets:

$$G_F = 1.16637(1) \times 10^{-5}\ \text{GeV}^{-2}. \tag{7.66}$$

Using α of Eq. (??), mW of Eq. (7.36) and GF of Eq. (7.66), we obtain

$$\sin^2\theta_W = 0.215, \tag{7.67}$$

in good agreement with Eq. (7.37). The difference between the two is accounted for by higher

order radiative corrections (we discuss them in Chapter 11).
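The chain from the muon lifetime to sin²θW can be retraced numerically. The sketch below is a tree-level estimate only: it sets δRC = 0 in Eq. (7.65), and it assumes ħ = 6.582119569 × 10⁻²⁵ GeV s, α ≈ 1/137.036 and mW ≈ 80.4 GeV, none of which are quoted in this section.

```python
import math

HBAR_GEV_S = 6.582119569e-25   # hbar in GeV*s (assumption: standard value)
ALPHA = 1 / 137.036            # assumption: low-energy fine structure constant
M_W = 80.4                     # GeV; assumption standing in for the value of Eq. (7.36)

M_E = 0.510998928e-3           # GeV, Eq. (7.42)
M_MU = 105.6583715e-3          # GeV, Eq. (7.42)
TAU_MU = 2.197034e-6           # s, Eq. (7.64)

def phase_space(x):
    """Phase space function f(x) of Eq. (7.65)."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)

def fermi_constant_tree(tau_s):
    """Invert Eq. (7.65) for G_F, setting delta_RC = 0 (tree level only)."""
    width = HBAR_GEV_S / tau_s                      # Gamma in GeV
    f = phase_space((M_E / M_MU) ** 2)
    return math.sqrt(192 * math.pi**3 * width / (M_MU**5 * f))

def sin2_theta_w(g_fermi, m_w=M_W, alpha=ALPHA):
    """Solve Eq. (7.63) for sin^2(theta_W)."""
    return math.pi * alpha / (math.sqrt(2) * g_fermi * m_w**2)

G_F_TREE = fermi_constant_tree(TAU_MU)
G_F_QUOTED = 1.16637e-5        # GeV^-2, Eq. (7.66)

print(f"G_F (tree level)  = {G_F_TREE:.5e} GeV^-2")
print(f"sin^2(theta_W)    = {sin2_theta_w(G_F_QUOTED):.3f}")
```

The tree-level G_F lands a few per mille below the value of Eq. (7.66); that gap is the size of the radiative correction δRC that the sketch drops.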

Note that $G_F$ also determines the VEV. Using $G_F = g^2/(4\sqrt{2}\,m_W^2)$ and $m_W^2 = g^2v^2/4$ we obtain

$$v = (\sqrt{2}\,G_F)^{-1/2} \approx 246\ \text{GeV}. \tag{7.68}$$
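The arithmetic behind Eq. (7.68) is short enough to check directly. In this sketch the value of g is an arbitrary illustrative number, used only to confirm that it cancels between the two relations quoted in the text.

```python
import math

G_F = 1.16637e-5   # GeV^-2, Eq. (7.66)

# Eq. (7.68): G_F = 1 / (sqrt(2) v^2)  =>  v = (sqrt(2) G_F)^(-1/2).
v = (math.sqrt(2) * G_F) ** -0.5

# Consistency with the two relations used in the text:
# G_F = g^2 / (4 sqrt(2) m_W^2) and m_W^2 = g^2 v^2 / 4, for an arbitrary g.
g = 0.65                                   # illustrative value; it cancels out
m_w_sq = g**2 * v**2 / 4
g_f_back = g**2 / (4 * math.sqrt(2) * m_w_sq)

print(f"v = {v:.1f} GeV")
```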


7.4.5 Gauge boson self-interactions

The gauge boson self-interactions that are presently most relevant to experiments are the W+W−V

(V = Z,A) couplings which, in the LSM, have the following form:

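$$\begin{aligned}
\mathcal{L}_{WWV} = {} & ie\cot\theta_W\left[\left(W^+_{\mu\nu}W^{-\mu} - W^-_{\mu\nu}W^{+\mu}\right)Z^\nu + W^+_\mu W^-_\nu Z^{\mu\nu}\right] \\
& + ie\left[\left(W^+_{\mu\nu}W^{-\mu} - W^-_{\mu\nu}W^{+\mu}\right)A^\nu + W^+_\mu W^-_\nu A^{\mu\nu}\right].
\end{aligned} \tag{7.69}$$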

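Here $W_{\mu\nu} = \partial_\mu W_\nu - \partial_\nu W_\mu$, $Z_{\mu\nu} = \partial_\mu Z_\nu - \partial_\nu Z_\mu$, and $A_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. (Note that we usually use $F_{\mu\nu}$ instead of what we now denote as $A_{\mu\nu}$.)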

The above interaction depends on only two parameters, e and θW . It is much more restrictive

than the most general one. Moreover, these parameters can be measured from other sectors of the

theory. Thus, it can be used to test the theory. For example, the most general CP invariant form

of the couplings is given by (see for example [12] and references therein)

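$$\begin{aligned}
\mathcal{L}_{WWV} = -ig_{WWV}\Big[ & g_1^V\left(W^+_{\mu\nu}W^{-\mu} - W^-_{\mu\nu}W^{+\mu}\right)V^\nu + \kappa_V\,W^+_\mu W^-_\nu V^{\mu\nu} \\
& + \frac{\lambda_V}{M_W^2}\,W^{+}_{\mu}{}^{\nu}\,W^{-}_{\nu}{}^{\rho}\,V_{\rho}{}^{\mu} - ig_5^V\,\varepsilon^{\mu\nu\rho\sigma}\left(W^+_\mu\partial_\rho W^-_\nu - W^-_\nu\partial_\rho W^+_\mu\right)V_\sigma\Big],
\end{aligned} \tag{7.70}$$

where $g_{WWA} = e$, $g_{WWZ} = e\cot\theta_W$ and, due to EM gauge invariance, $1 - g_1^\gamma = g_5^\gamma = 0$. (Note that we use here γ interchangeably with A.) The LSM predicts the following values for the parameters:

$$g_1^Z = \kappa_\gamma = \kappa_Z = 1, \quad \lambda_\gamma = \lambda_Z = g_5^Z = 0. \tag{7.71}$$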

The experimental values are

$$g_1^Z = 0.98 \pm 0.02, \quad \kappa_Z = 0.92 \pm 0.07, \quad \lambda_Z = -0.09 \pm 0.07, \quad \kappa_\gamma = 0.97 \pm 0.04, \quad \lambda_\gamma = -0.03 \pm 0.02, \quad g_5^Z = 0.07 \pm 0.09, \tag{7.72}$$

in very good agreement with the SM predictions.
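The level of agreement can be made explicit by computing, for each parameter, the deviation from the prediction in units of the quoted error. The sketch below does this for the values of Eqs. (7.71) and (7.72); the dictionary keys are just illustrative labels, and correlations between the measurements are ignored.

```python
# LSM predictions, Eq. (7.71), and measured values with errors, Eq. (7.72).
predictions = {"g1Z": 1.0, "kappaZ": 1.0, "lambdaZ": 0.0,
               "kappaGamma": 1.0, "lambdaGamma": 0.0, "g5Z": 0.0}
measurements = {"g1Z": (0.98, 0.02), "kappaZ": (0.92, 0.07), "lambdaZ": (-0.09, 0.07),
                "kappaGamma": (0.97, 0.04), "lambdaGamma": (-0.03, 0.02), "g5Z": (0.07, 0.09)}

# Pull = (measured - predicted) / error, in units of standard deviations.
pulls = {
    name: (value - predictions[name]) / error
    for name, (value, error) in measurements.items()
}

for name, pull in pulls.items():
    print(f"{name:12s} pull = {pull:+.2f} sigma")
```

All six pulls stay below two standard deviations in magnitude.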

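Finally, we present the quartic gauge boson couplings within the SM:

$$\begin{aligned}
\mathcal{L}_{4V} = {} & g^2\cos^2\theta_W\left(W^+_\mu W^-_\nu Z^\mu Z^\nu - W^+_\mu W^{-\mu}Z_\nu Z^\nu\right) \\
& + e^2\left(W^+_\mu W^-_\nu A^\mu A^\nu - W^+_\mu W^{-\mu}A_\nu A^\nu\right) \\
& + \frac{g^2}{2}\,W^+_\mu W^-_\nu\left(W^{+\mu}W^{-\nu} - W^{+\nu}W^{-\mu}\right) \\
& + e^2\cot\theta_W\left[W^+_\mu W^-_\nu\left(Z^\mu A^\nu + Z^\nu A^\mu\right) - 2W^+_\mu W^{-\mu}Z_\nu A^\nu\right].
\end{aligned} \tag{7.73}$$

Here the photon terms follow from the $Z$ terms upon replacing $\cos\theta_W Z_\mu \to \cos\theta_W Z_\mu + \sin\theta_W A_\mu$ and using $e = g\sin\theta_W$.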

The experimental precision is not yet good enough to significantly probe them.

7.4.6 Summary

Leptons have four types of interactions. These interactions are summarized in Table 7.1.


Table 7.1: The LSM lepton interactions

interaction            force carrier   coupling                          range
electromagnetic (EM)   γ               $eQ$                              long
NC weak                Z0              $e(T_3 - s_W^2 Q)/(s_W c_W)$      short
CC weak                W±              $g$                               short
Yukawa                 h               $y_\ell$                          short

The name weak interactions is somewhat misleading. In fact, the weak coupling g is larger

than the electromagnetic coupling e. The more important feature is that the weak interactions are

mediated by massive vector bosons, and consequently they are short range, while the electromag-

netic interactions are mediated by the massless photon and hence they are long range. It is the

short range of the weak interactions which makes the neutrinos, which do not have electromagnetic

interactions, very hard to detect.
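To make "short range" more concrete, one can estimate the range of a force mediated by a boson of mass m as ħ/(mc). The sketch below assumes ħc ≈ 0.1973 GeV fm and round values for the mediator masses; these numbers are inputs chosen for illustration and are not taken from this section.

```python
HBAR_C_GEV_FM = 0.1973269804   # hbar*c in GeV*fm (assumption: standard value)

# Assumed masses in GeV; they stand in for the measured values quoted elsewhere.
mediator_mass_gev = {"W": 80.4, "Z": 91.19, "h": 125.0}

def yukawa_range_fm(mass_gev):
    """Range ~ hbar c / (m c^2) of a force carried by a boson of mass m."""
    return HBAR_C_GEV_FM / mass_gev

ranges_fm = {name: yukawa_range_fm(m) for name, m in mediator_mass_gev.items()}
for name, r in ranges_fm.items():
    print(f"{name}: range ~ {r:.2e} fm")
```

The resulting ranges are a few thousandths of a femtometer, far below the size of a nucleon (about 1 fm), whereas the massless photon gives an infinite range.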

7.5 Global symmetries and parameters

7.5.1 The interaction basis and the mass basis

The interaction basis is the one where all fields have well-defined transformation properties under

the imposed symmetries of the Lagrangian. In particular, in this basis, the gauge interactions are

universal.

If there are several fields with the same quantum numbers, then the interaction basis is not

unique. The kinetic and gauge terms are invariant under a global unitary transformation among

these fields. On the other hand, the Yukawa terms and the fermion mass terms are, in general, not invariant under a unitary transformation among fermion fields with the same quantum numbers, $f_i \to U^f_{ji} f_j$. Similarly, the Yukawa terms and the scalar potential are, in general, not invariant under a unitary transformation among scalar fields with the same quantum numbers, $s_i \to U^s_{ji} s_j$. Thus, by performing such transformations, we are changing the interaction basis.

In the LSM, there are three copies of $(2)_{-1/2}$ fermions and three copies of $(1)_{-1}$ fermions. Transforming the former by a unitary transformation $U_L$, and the latter by an independent unitary transformation $U_R$, the Yukawa matrix $Y^e$ is transformed into $U_L Y^e U_R^\dagger$. The matrix $Y^e$ is a 3 × 3 complex matrix and thus has, in general, nine complex parameters. We can always find a bi-unitary transformation that would make $Y^e$ real and diagonal, and thus depend on only three real parameters:

$$Y^e \to U_L Y^e U_R^\dagger = Y^e_{\rm diag} = \text{diag}(y_e, y_\mu, y_\tau). \tag{7.74}$$


Often one chooses a basis where the number of Lagrangian parameters is minimal, as is the case

with the diagonal basis of Eq. (7.74). One could work in any other interaction basis. However,

when calculating physical observables, only the eigenvalues of $Y^{e\dagger}Y^e$ play a role. Using the

diagonal basis just provides a shortcut to this result.

The mass basis is the one where all fields have well defined transformation properties under the

symmetries that are not spontaneously broken and are mass eigenstates. The fields in this basis

correspond to the particles that are eigenstates of free propagation in spacetime. The Lagrangian

parameters in this basis correspond directly to physical observables.

For the LSM, the interaction eigenstates have well defined transformation properties under the

SU(2)L × U(1)Y symmetry:

$$W_a(3)_0, \quad B(1)_0, \quad L_{L1,2,3}(2)_{-1/2}, \quad E_{R1,2,3}(1)_{-1}, \quad \phi(2)_{+1/2}. \tag{7.75}$$

The mass eigenstates have well defined electromagnetic charge and mass:

$$W^\pm, \quad Z^0, \quad A^0, \quad e^-, \quad \mu^-, \quad \tau^-, \quad \nu_e, \quad \nu_\mu, \quad \nu_\tau, \quad h^0. \tag{7.76}$$

The number of degrees of freedom is the same in both bases. To verify this statement one has

to take into account the following features:

1. Wa and B have only transverse components, while W± and Z0 have also a longitudinal one.

2. LL and ER are Weyl fermions, while e, µ, τ are Dirac fermions.

3. φ is a complex scalar, while h is a real one.

The three electromagnetically neutral neutrino states are, at the renormalizable level, massless

and, in particular, degenerate. Thus, there is freedom in choosing the mass basis for the neutrinos.

We choose the basis where the W± couplings to the charged lepton mass eigenstates are diagonal.

One could choose a different mass basis, related to the one we chose by a unitary transformation

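of the three neutrino fields,

$$\begin{pmatrix} \nu_e \\ \nu_\mu \\ \nu_\tau \end{pmatrix} \to \begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \end{pmatrix} = U^\dagger \begin{pmatrix} \nu_e \\ \nu_\mu \\ \nu_\tau \end{pmatrix}. \tag{7.77}$$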

Let us see how the decay rate of the W -boson into an electron and a neutrino is calculated in this

basis. Since the experiment does not distinguish between ν1, ν2, ν3, one has to sum over all three

species:

$$\Gamma(W^+ \to e^+\nu) = \sum_{i=1,2,3}\Gamma(W^+ \to e^+\nu_i) = \Gamma(W^+ \to e^+\nu_e)\left(|U_{e1}|^2 + |U_{e2}|^2 + |U_{e3}|^2\right) = \Gamma(W^+ \to e^+\nu_e). \tag{7.78}$$


The last step uses the unitarity of U. Thus, if the neutrinos are degenerate, the elements of the matrix U have no physical significance: they cannot appear in any physical observable. Our choice of basis ($\nu_e, \nu_\mu, \nu_\tau$) provides a shortcut to this result.
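The cancellation in Eq. (7.78) relies only on the rows of a unitary matrix having unit norm. A small numerical sketch, with a unitary matrix built from three plane rotations and one complex phase (the angles and the phase are arbitrary choices made here for illustration):

```python
import cmath
import math

def rotation(i, j, angle, phase=0.0, n=3):
    """Unitary n x n rotation in the (i, j) plane with an optional complex phase."""
    m = [[1.0 + 0j if r == c else 0j for c in range(n)] for r in range(n)]
    m[i][i] = m[j][j] = math.cos(angle)
    m[i][j] = math.sin(angle) * cmath.exp(-1j * phase)
    m[j][i] = -math.sin(angle) * cmath.exp(1j * phase)
    return m

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

# Arbitrary illustrative angles and phase; any choice gives a unitary U.
U = matmul(rotation(1, 2, 0.7), matmul(rotation(0, 2, 0.3, phase=1.1), rotation(0, 1, 0.5)))

# Row "e" of U: the weights |U_ei|^2 that multiply Gamma(W+ -> e+ nu_e) in Eq. (7.78).
weights = [abs(U[0][i]) ** 2 for i in range(3)]
total = sum(weights)
print(weights, total)
```

Whatever angles are chosen, the three weights add up to 1, so the summed rate does not depend on U.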

Later we will see that non-renormalizable terms provide the neutrinos with (non-degenerate)

masses, and then the mass basis becomes unique.

7.5.2 The LSM parameters

There are seven independent parameters in the LSM. This implies that, in principle, we need to

perform seven appropriate measurements and then we can make predictions for any other processes

involving the leptons and the Higgs boson that are mediated by the EM, weak or Yukawa inter-

actions. It is convenient to think of these experiments as measurements of the seven parameters.

There are various ways in which we can choose the seven independent parameters, for example,

$$g, \quad g', \quad v, \quad \lambda, \quad y_e, \quad y_\mu, \quad y_\tau. \tag{7.79}$$

Another example would be $m_W$, $m_Z$, $m_h$, $m_e$, $m_\mu$, $m_\tau$, and α. This example shows that by measuring the mass spectrum of the LSM particles and the fine structure constant, all other interaction rates are predicted.

A good choice of parameters would be one where the experimental errors in their determination

are the smallest. As of now this set is the following:

$$\alpha, \quad G_F, \quad m_e, \quad m_\mu, \quad m_\tau, \quad m_Z, \quad m_h. \tag{7.80}$$

By now, all seven parameters have been measured, with mh (or, equivalently, λ in the previous

list) the latest addition. In the following we use the above seven parameters to show a few more

examples of how the LSM has been tested.


If we set the Yukawa couplings to zero, $\mathcal{L}_{\rm Yuk} = 0$, the LSM gains a large accidental global symmetry:

$$G^{\rm global}_{\rm LSM}(Y^e = 0) = U(3)_L \times U(3)_E = SU(3)_L \times SU(3)_E \times U(1)_L \times U(1)_E. \tag{7.81}$$

The $(L_{L1}, L_{L2}, L_{L3})$ fields transform as $(3, 1)_{q_L,0}$ under this symmetry. The $(E_{R1}, E_{R2}, E_{R3})$ fields transform as $(1, 3)_{0,q_R}$ under the symmetry. All other fields are singlets, $(1, 1)_{0,0}$. Concerning the U(1) factors, the choice of $q_L$ and $q_R$ is arbitrary (except that neither may vanish). It is customary to normalize these charges to +1.

The Yukawa couplings break this symmetry into the following subgroup:

$$G^{\rm global}_{\rm LSM} = U(1)_e \times U(1)_\mu \times U(1)_\tau. \tag{7.82}$$


The U(1) factors are called electron number, muon number, and tau number, respectively. The charges of $\nu_{eL}$ and $e$ are (1, 0, 0), of $\nu_{\mu L}$ and $\mu$ are (0, 1, 0), and of $\nu_{\tau L}$ and $\tau$ are (0, 0, 1). Thus, electron number, muon number and tau number are accidental symmetries of the LSM. This situation allows, for example, the muon decay mode $\mu^- \to e^-\bar\nu_e\nu_\mu$, but forbids $\mu^- \to e^-\gamma$ and $\mu^- \to e^-e^+e^-$. Also, scattering processes such as $e^+e^- \to \mu^+\mu^-$ are allowed, but $e^+\mu^- \to \mu^+e^-$ is forbidden.

It is useful to also define total lepton number, which is the sum of these three lepton flavor numbers and corresponds to a $U(1)_L$ symmetry. Since this $U(1)_L$ is a subgroup of $G^{\rm global}_{\rm LSM}$, total lepton number is conserved as well: it, too, is an accidental symmetry of the LSM. The conservation of total lepton number explains why Majorana masses for neutrinos are not allowed within the LSM.

These accidental symmetries are, however, all broken by nonrenormalizable terms of the form

$(1/\Lambda)L_{Li}L_{Lj}\phi\phi$. If the scale Λ is high enough, these breaking effects are very small. It means

that the “forbidden” processes mentioned above are expected to occur, but at very low rates. It

also implies that we expect very small Majorana masses. We discuss these points in detail in

Chapter 13.

Finally, let us point out that the breaking of the symmetry (7.81) into (7.82) is by the Yukawa couplings $y_e, y_\mu, y_\tau$, which are small, of $O(10^{-6}, 10^{-3}, 10^{-2})$, respectively. Thus, the full $[U(3)]^2$ remains an approximate symmetry of the LSM.

7.5.4 Discrete symmetries: C, P and CP

The LSM violates C and P because it is a chiral theory: there are more left-handed (LH) degrees of freedom than right-handed (RH) ones.

On the other hand, CP is conserved by the LSM. This can be seen by the fact that the seven parameters of the model defined in Eqs. (7.79) or (7.80) can be chosen real. In fact, we explicitly

found a basis where this is the case.

Experimentally, P violation was demonstrated in many ways. One example is given by the

measurement of τ polarization Pτ (also denoted as Aτ ) in the Z → τ+τ− decay. It is given by

$$P_\tau \equiv \frac{\sigma_R - \sigma_L}{\sigma_R + \sigma_L}, \tag{7.83}$$

where σR(σL) is the cross section of producing a RH (LH) tau in Z decay. In a parity invariant

theory, Pτ = 0. The LSM prediction at tree level can be read from the couplings of the Z to the

fermions in Eq. (7.53):

$$P_\tau = \frac{(1/2 - s_W^2)^2 - (s_W^2)^2}{(1/2 - s_W^2)^2 + (s_W^2)^2} \approx 0.16, \tag{7.84}$$

where we used $s_W^2 = 0.23$. Experimentally [10],

$$P_\tau = 0.143 \pm 0.004, \tag{7.85}$$


which corresponds to sin2 θW = 0.2320± 0.0005. The small deviations from other determinations

that we discuss in this section are mainly due to higher order corrections that we neglect.
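Both numbers in this paragraph can be reproduced from the Z couplings of Eq. (7.53), where the left-handed tau couples with strength −(1/2 − s²W) and the right-handed tau with s²W. The sketch below evaluates the tree-level asymmetry and inverts the measured value by bisection; the overall sign is chosen to match the positive value quoted in the text, and the function names are illustrative.

```python
import math

def tau_polarization(s2w):
    """Tree-level asymmetry built from the Z couplings of Eq. (7.53):
    g_L = -(1/2 - s2w) for tau_L and g_R = s2w for tau_R."""
    g_l_sq = (0.5 - s2w) ** 2
    g_r_sq = s2w ** 2
    return (g_l_sq - g_r_sq) / (g_l_sq + g_r_sq)

def sin2_from_polarization(p, lo=0.0, hi=0.25, tol=1e-12):
    """Invert tau_polarization by bisection on [0, 1/4], where it decreases monotonically."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tau_polarization(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_tree = tau_polarization(0.23)
s2w_measured = sin2_from_polarization(0.143)
print(f"P_tau(s2w = 0.23) = {p_tree:.3f}")
print(f"s2w from P_tau = 0.143: {s2w_measured:.4f}")
```

Because the asymmetry vanishes at s²W = 1/4 and the physical value lies close to that point, a modest measurement error on Pτ translates into a small error on sin²θW.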

7.6 Low Energy Tests of the LSM

Nowadays, experiments produce the W and Z bosons and measure their properties directly. It

is interesting to understand, however, how the SM was tested at the time before the energy in

experiments became high enough for such direct production. It is not only the historical aspect

that is interesting: it is also important to see how we can use low energy data to probe shorter

distances.

7.6.1 CC weak interactions: Quasi-elastic neutrino–electron scattering

Let us compare the charged current contributions to the two quasi-elastic scattering processes

$$\nu_\mu e^- \to \nu_e\mu^-, \quad \bar\nu_e e^- \to \bar\nu_\mu\mu^-. \tag{7.86}$$

(These processes are sometimes called “inverse muon decays”.) Since these are flavor changing processes, in the LSM the only contributions come from W exchange. We consider scattering with a center-of-mass energy in the range $m_\mu^2 \ll s \ll m_W^2$. In particular, we can consider the leptons massless.

We define θ to be the angle between the incoming (anti)neutrino and the outgoing muon. Then $\cos\theta = 1$ corresponds to backward scattering of the beam particle. For $\bar\nu_e e^-$ scattering, $\bar\nu_L$ and $\ell_L$ have positive and negative helicities, respectively. Thus, in the center of mass frame, their spins are in the same direction. Therefore $(J_z)_i = +1$. When the scattering is backwards, the respective momenta of the antineutrino and the charged lepton change to the opposite directions and, since the helicities are fixed, so do their spins: $(J_z)_f = -1$. Therefore, backward $\bar\nu\ell$ scattering is forbidden by angular momentum conservation. In fact, the process $\bar\nu_e e \to \bar\nu_\mu\mu$ proceeds entirely in a $J = 1$ state with net helicity +1. That is, only one of the three states is allowed. In contrast, in $\nu_\mu e \to \nu_e\mu$, backward scattering has $(J_z)_i = (J_z)_f = 0$ and all helicity states are allowed. The full calculation yields, for $m_\mu^2 \ll s \ll m_W^2$, and working in the electron rest frame,

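$$\frac{d\sigma(\nu_\mu e^- \to \nu_e\mu^-)}{d\Omega} = \frac{G_F^2 s}{4\pi^2}, \quad \frac{d\sigma(\bar\nu_e e^- \to \bar\nu_\mu\mu^-)}{d\Omega} = \frac{G_F^2 s}{16\pi^2}(1 - \cos\theta)^2, \tag{7.87}$$

$$\sigma(\nu_\mu e^- \to \nu_e\mu^-) = \frac{G_F^2 s}{\pi}, \quad \sigma(\bar\nu_e e^- \to \bar\nu_\mu\mu^-) = \frac{G_F^2 s}{3\pi},$$

with $s = 2m_eE_\nu$. In particular, the ratio of cross sections is predicted by the LSM. To leading order it does not depend on any parameter:

$$\frac{\sigma(\nu_\mu e^- \to \nu_e\mu^-)}{\sigma(\bar\nu_e e^- \to \bar\nu_\mu\mu^-)} = 3. \tag{7.88}$$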
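The factor of 3 follows from integrating the two angular distributions of Eq. (7.87) over the full solid angle. The sketch below does the integration numerically with a Simpson rule, in units where G²F s = 1; the helper names are illustrative.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Angular distributions of Eq. (7.87) in units of G_F^2 s, as functions of c = cos(theta).
def dsigma_nu(c):
    return 1 / (4 * math.pi**2)

def dsigma_nubar(c):
    return (1 - c) ** 2 / (16 * math.pi**2)

# d(Omega) = 2 pi d(cos theta) after integrating over the azimuthal angle.
sigma_nu = 2 * math.pi * integrate(dsigma_nu, -1, 1)
sigma_nubar = 2 * math.pi * integrate(dsigma_nubar, -1, 1)

ratio = sigma_nu / sigma_nubar
print(f"sigma_nu = {sigma_nu:.4f}, sigma_nubar = {sigma_nubar:.4f}, ratio = {ratio:.3f}")
```

The antineutrino distribution vanishes at cos θ = 1, which is the angular momentum suppression described above, and its average over angles is one third of the isotropic neutrino case.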


7.6.2 NC weak interactions: neutrino–electron scattering

There are several observables that can be used to test neutral current interactions. The first example is low energy elastic scattering:

$$\nu_\mu e^- \to \nu_\mu e^-, \quad \bar\nu_\mu e^- \to \bar\nu_\mu e^-. \tag{7.89}$$

Since the W-boson couples diagonally, it does not couple to a $\nu_\mu e^-$ pair. Since neutrinos are involved, these processes cannot be mediated by photons. Consequently, the $\nu_\mu e^- \to \nu_\mu e^-$ and $\bar\nu_\mu e^- \to \bar\nu_\mu e^-$ scattering processes are mediated purely by the Z-boson.

We can use the ratio

$$R \equiv \frac{\sigma(\nu_\mu e \to \nu_\mu e)}{\sigma(\bar\nu_\mu e \to \bar\nu_\mu e)} \tag{7.90}$$

to fix sin2 θW . Defining

$$g_L^e = -1/2 + \sin^2\theta_W, \quad g_R^e = \sin^2\theta_W, \tag{7.91}$$

$$g_V^e = g_L^e + g_R^e = -1/2 + 2\sin^2\theta_W, \quad g_A^e = g_L^e - g_R^e = -1/2,$$

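and working in the electron rest frame, the results read

$$\sigma_{\nu_\mu} = \frac{G_F^2 s}{2\pi}\left[(g_L^e)^2 + \frac{1}{3}(g_R^e)^2\right], \quad \sigma_{\bar\nu_\mu} = \frac{G_F^2 s}{2\pi}\left[(g_R^e)^2 + \frac{1}{3}(g_L^e)^2\right], \tag{7.92}$$

with $s = 2m_eE_\nu$.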

The experimental results for the scattering rates give $g_A^e = -0.507 \pm 0.014$ and $g_V^e = -0.040 \pm 0.015$ [?] (in the “Electroweak model and constraints on new physics” review), which leads to

$$\sin^2\theta_W = 0.230 \pm 0.008, \tag{7.93}$$

in good agreement with other determinations.
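The step from the measured vector coupling to sin²θW is a one-line inversion of Eq. (7.91), and the ratio R of Eq. (7.90) follows from Eq. (7.92). A minimal sketch, using simple linear error propagation (an assumption made here for illustration):

```python
def sin2_from_gv(g_v):
    """Invert g_V^e = -1/2 + 2 sin^2(theta_W), Eq. (7.91)."""
    return (g_v + 0.5) / 2

def ratio_r(s2w):
    """R of Eq. (7.90) from Eq. (7.92); the prefactor G_F^2 s / (2 pi) cancels."""
    g_l = -0.5 + s2w
    g_r = s2w
    return (g_l**2 + g_r**2 / 3) / (g_r**2 + g_l**2 / 3)

g_v, g_v_err = -0.040, 0.015            # measured g_V^e quoted in the text
s2w = sin2_from_gv(g_v)
s2w_err = g_v_err / 2                    # linear error propagation

print(f"sin^2(theta_W) = {s2w:.3f} +/- {s2w_err:.4f}")
print(f"R(s2w = {s2w:.3f}) = {ratio_r(s2w):.3f}")
```

In the limit sin²θW → 0 the right-handed coupling vanishes and R reduces to 3, the same helicity factor that appears in the charged current case.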

7.6.3 Forward-backward asymmetry

We consider e+e− → µ+µ− scattering. This process is mediated by both QED interactions and

NC weak interactions. The former are vector-like contributions, and therefore conserve parity.

The latter are parity violating. The interference between the photon-mediated contribution and

the Z-mediated contribution leads to a forward-backward asymmetry, which is a manifestation of

parity violation.

The forward-backward asymmetry is defined as follows:

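$$A_{FB} = \frac{\sigma_F - \sigma_B}{\sigma_F + \sigma_B}, \quad \sigma_F = 2\pi\int_0^1 d\cos\theta\,\frac{d\sigma}{d\cos\theta}, \quad \sigma_B = 2\pi\int_{-1}^0 d\cos\theta\,\frac{d\sigma}{d\cos\theta}. \tag{7.94}$$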

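A detailed calculation gives, for $m_\mu^2 \ll s \ll m_Z^2$,

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4s}\left[1 + \cos^2\theta - \frac{4g_A^2}{c_W^2 s_W^2}\,\frac{s}{m_Z^2}\,\cos\theta\right], \tag{7.95}$$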


yielding

$$A_{FB}(m_\mu^2 \ll s \ll m_Z^2) = -\frac{3g_A^2}{2c_W^2 s_W^2}\,\frac{s}{m_Z^2}. \tag{7.96}$$
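The step from Eq. (7.95) to Eq. (7.96) is an integration over the forward and backward hemispheres: for a distribution proportional to 1 + cos²θ + b cos θ, the asymmetry is 3b/8. The sketch below checks this numerically; the value of b used is a hypothetical number chosen only for illustration.

```python
def integrate(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def forward_backward_asymmetry(b):
    """A_FB for an angular distribution proportional to 1 + c^2 + b*c, c = cos(theta).

    In Eq. (7.95), b = -(4 g_A^2 / (c_W^2 s_W^2)) * (s / m_Z^2).
    Overall factors (alpha^2 / 4s and 2 pi) cancel in the ratio.
    """
    shape = lambda c: 1 + c * c + b * c
    forward = integrate(shape, 0.0, 1.0)
    backward = integrate(shape, -1.0, 0.0)
    return (forward - backward) / (forward + backward)

# Hypothetical value of the interference coefficient, chosen only for illustration.
b_example = -0.2
a_fb = forward_backward_asymmetry(b_example)
print(f"A_FB = {a_fb:.4f}, 3b/8 = {3 * b_example / 8:.4f}")
```

Only the term odd in cos θ, which comes from the photon-Z interference, contributes to the numerator, while the symmetric 1 + cos²θ part fixes the normalization 8/3.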


Homework

Question 7.1: Some algebra

1. Starting from Eq. (7.29) and using the definitions of Eqs. (7.30) and (7.31) derive Eq. (7.32).

2. Starting from Eq. (7.43) and using the definition of θW in Eq. (7.30) and of Q in Eq. (7.20)

derive both sides of Eq. (7.52).

3. Using the gauge boson kinetic terms from Eq. (7.25), and the definitions in Eqs. (7.30), (7.31)

and (7.50) derive Eq. (7.69).

Question 7.2: Lepton universality

Here we consider muon and tau decays in the SM.

1. Find in the PDG the main decay mode of the muon. What is its width?

2. Find in the PDG the bound on the decay

Γ(µ→ eγ). (7.97)

What is the SM prediction for this mode? Give a short explanation, not just a number.

3. Draw the tree level Feynman diagram for the leading muon decay.

4. We now move to tau decays. Based on lepton universality, what do you expect for the

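following ratios:

$$\frac{\Gamma(\tau \to e\nu\bar\nu)}{\Gamma(\tau \to \mu\nu\bar\nu)}, \quad \frac{\Gamma(\mu \to e\nu\bar\nu)}{\Gamma(\tau \to e\nu\bar\nu)}. \tag{7.98}$$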

Compare your results with the PDG and explain any small deviation that you find compared

with your predictions.


Question 7.3: A modified leptonic SM

Consider a model where the gauge group is SU(2)L × U(1)Y broken to U(1)EM. We consider one generation of leptons. We assume that the lepton and Higgs representations are

$$L_L(3)_{-1}, \quad E_R(2)_{-3/2}, \quad \phi(2T + 1)_{1/2}, \tag{7.99}$$

where T is undetermined.

1. Can we have a bare mass term for the leptons?

2. What are the possible values of T such that some of the leptons are massive after SSB?

Namely, what T can give a Yukawa term of the form $\overline{L_L}\,E_R\,\phi$?

3. Explain why only T = 1/2 is consistent with the observed masses of the W and Z bosons.

From now on we then assume T = 1/2.

4. By considering the components of the fields and the Yukawa term, argue that the model

contains a massless q = 0 state (the neutrino), a massive, q = −1 Dirac state (the electron)

and a massive, q = −2 Dirac state that has not been observed in Nature, which we denote as λ.

5. Show that $m_\lambda = \sqrt{2}\,m_e$. (Note: this question does require some amount of calculation. Thus, if you get stuck move on to the next item.)

6. Write down the fermionic charged current interaction terms in the Lagrangian, that is, the

coupling of the W to the fermions. You are asked to keep track of factors of 2. Recall that

the spin-1/2 matrices are

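$$\frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{7.100}$$

and the spin-1 matrices are

$$\frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{7.101}$$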

7. Is the λ stable? Explain.

8. Calculate the ratio

$$\frac{\Gamma(W^- \to e_R^+\lambda_R^{--})}{\Gamma(W^- \to e_L^+\lambda_L^{--})}. \tag{7.102}$$

Hint: make sure you are not doing a long calculation.


9. Is this model a candidate to replace the SM? Explain.

Question 7.4: ρ for a general Higgs

In the SM the Higgs transforms under SU(2)L × U(1)Y as (2)1/2. However, any scalar that is

charged under the gauge group and acquires a vev will break the SM gauge symmetry. We assume

that the Higgs potential is given by Eq. (7.18).

1. Consider a scalar φ that transforms as $(2T + 1)_Y$. Since SU(2) is a non-Abelian group, 2T + 1 has to be a positive integer, that is, T is a non-negative half-integer. Since U(1) is Abelian, a priori Y can assume any real value. Yet, we would like φ to be responsible for the

SU(2)L × U(1)Y → U(1)EM breaking where we define Q = T3 + Y . This definition restricts

the possible values for Y . Find these values.

2. Show that ρ, defined as

$$\rho \equiv \frac{m_W^2}{m_Z^2\cos^2\theta_W}, \tag{7.103}$$

is given by

$$\rho = \frac{T(T + 1) - Y^2}{2Y^2}. \tag{7.104}$$

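Hint: Recall that the (2T + 1)-dimensional representation of SU(2) is given by

$$T_3 = \text{diag}(T, T - 1, T - 2, \ldots, -T), \tag{7.105}$$

$$T_1 = \begin{pmatrix} 0 & a_1 & 0 & \cdots & 0 \\ a_1 & 0 & a_2 & & \vdots \\ 0 & a_2 & \ddots & \ddots & 0 \\ \vdots & & \ddots & 0 & a_{2T} \\ 0 & \cdots & 0 & a_{2T} & 0 \end{pmatrix}, \quad T_2 = \begin{pmatrix} 0 & ia_1 & 0 & \cdots & 0 \\ -ia_1 & 0 & ia_2 & & \vdots \\ 0 & -ia_2 & \ddots & \ddots & 0 \\ \vdots & & \ddots & 0 & ia_{2T} \\ 0 & \cdots & 0 & -ia_{2T} & 0 \end{pmatrix},$$

where

$$a_i = \frac{\sqrt{T(T + 1) - (T - i)(T - i + 1)}}{2}. \tag{7.106}$$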

3. For T > 0 and Y = 0 one can see from Eq. (7.104) that ρ →∞ independent of T . Explain

this result using symmetry arguments.

4. Suppose that there exist several Higgs representations (i = 1, . . . , N) whose neutral members

acquire vevs vi. Find ρ in terms of vi, Ti and Yi.

5. Assume that, in addition to the usual Higgs doublet T = 1/2, Y = 1/2 with vev vW , there

exists one other multiplet Ti, Yi which acquires a much smaller vev vi. Find δρ ≡ ρ− 1 to

first order in (vi/vW )2.


6. Assume that experimentally −0.01 ≤ δρ ≤ +0.005. Find the constraint on (vi/vW )2 for the

following multiplets: $(5)_{-1}$ and $(4)_{3/2}$.

7. From Eq. (7.104) it is clear that ρ = 1 for all multiplets with $3Y^2 = T(T + 1)$. Since experimentally ρ is very close to 1, we assume that the SM Higgs is one of these multiplets. While from the consideration of ρ alone it makes no difference which multiplet we take, in the SM we do make a choice and take T = 1/2 and Y = 1/2. What is the advantage of the SM Higgs compared to the other possible choices?
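When working through this question, it can help to have a quick way of evaluating Eq. (7.104) exactly. The sketch below uses rational arithmetic for a few sample multiplets; it only evaluates the formula and is not a derivation of it, and the multiplet (7)₂ is included here merely as an additional illustration.

```python
from fractions import Fraction as F

def rho(t, y):
    """Tree-level rho parameter of Eq. (7.104) for a multiplet (2T+1)_Y."""
    t, y = F(t), F(y)
    if y == 0:
        raise ZeroDivisionError("Y = 0: Eq. (7.104) diverges")
    return (t * (t + 1) - y * y) / (2 * y * y)

examples = {
    "doublet (2)_{1/2}": rho(F(1, 2), F(1, 2)),
    "(5)_{-1}": rho(2, -1),
    "(4)_{3/2}": rho(F(3, 2), F(3, 2)),
    "(7)_{2}": rho(3, 2),
}
for name, value in examples.items():
    print(f"{name}: rho = {value}")
```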

Question 7.5: Left Right Symmetric (LRS) model for leptons

Assume the Left Right Symmetric (LRS) model for leptons: The gauge group is

$$G_{LR} = SU(2)_L \times SU(2)_R \times U(1)_X. \tag{7.107}$$

The fermions transform as

$$L_L(2, 1)_{-1}, \quad L_R(1, 2)_{-1}, \tag{7.108}$$

where our notation is $(N_L, N_R)_{q_X}$ such that $N_L$ ($N_R$) is the representation under $SU(2)_L$ ($SU(2)_R$) and $q_X$ is the charge under $U(1)_X$. The left-handed leptons are denoted as $L_L = (\nu_L, e^-_L)$ and the right-handed leptons as $L_R = (\nu_R, e^-_R)$. We use $W_L$, $W_R$ and $C$ to denote the seven gauge fields, and $g_L$, $g_R$ and $g_X$ to denote the three different coupling constants of the $SU(2)_L$, $SU(2)_R$ and $U(1)_X$ groups respectively. Note that we have extended both the gauge group and the lepton content of the SM.

1. Write down the commutation relations between the various generators [the analogue of (7.6)].

2. Write down the covariant derivative [the analogue of (7.10)].

3. The LSM group SU(2)L×U(1)Y has to be included in the LRS group which is the case when

U(1)Y ⊂ SU(2)R × U(1)X . Find the linear combination of the LRS generators which gives

the SM generator Y . Then, find the linear combination of generators which gives the electric

charge Q [the analogue of (7.20)].

4. What are the SU(2)L × U(1)Y charges of the right handed neutrino, νR?

5. Write down explicitly the charged current interactions of the leptons [the analogue of (7.61)].

It is enough to write it for one generation only.

6. In the SM, muon decay is mediated by WL. In the LRS model there is one more tree level

Feynman diagram that contributes to muon decay. Draw this new diagram.


7. Assuming that there is no mixing between WL and WR find the ratio of the new amplitude

to the SM amplitude in terms of the coupling constants and the masses of the gauge bosons.

8. The B^µ field of the SM must be a linear combination of W^µ_{R3} and C^µ. The orthogonal linear combination is called Z'^µ. Write the expressions for B^µ and Z'^µ [the analogue of (7.31)]. Use θR to define the mixing angle.

9. Express gR and gX in terms of g′ and θR [the analogue of (7.50)].

10. Find the coupling of the Z ′ to the fermions in terms of T3R and Y [the analogue of (7.53)].

11. We now add a scalar, φ, to the model and demand that it couples to the fermions. (We need it so that, after it acquires a vev, the fermions become massive.) Namely, we want a term of the form

\overline{L_L} φ L_R + h.c. (7.109)

will be allowed. Find the representation of the scalar under GLR. What are the electric

charges of the various components of the scalar field?

12. The first guess for the SSB sector may be to use the scalar that we know we must add,

namely the one that couples to the fermions. As you just found it is

Φ(2, 2)_0 = \begin{pmatrix} Φ^0_1 & Φ^+_1 \\ Φ^-_2 & Φ^0_2 \end{pmatrix}, (7.110)

where the superscripts are the electric charges and we explicitly wrote Φ in a matrix notation such that the transformation law under SU(2)L × SU(2)R is Φ → U_L Φ U_R^†. We now assume that Φ acquires a vev

⟨Φ⟩ = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix}. (7.111)

Explain why we require that only the neutral components of Φ acquire a vev. What is the

symmetry breaking pattern generated by these vevs?

13. This choice, however, does not give a realistic model. For example, the lightest charged gauge boson is an equal mixture of WL and WR. (The calculation is somewhat lengthy, but straightforward. You are encouraged to do it and check the above statement yourself.) In order to solve the problem we require a SSB pattern that will result in m_{W_R} ≫ m_{W_L}. This is done by adding two more scalars

∆_L = (3, 1)_2, ∆_R = (1, 3)_2. (7.112)

Write each of the scalars as a triplet (namely, as a vector with three components). What is

the hypercharge (Y ) and the electric charge (Q) of each component?


14. We assume that the neutral components of ∆L and ∆R acquire vevs, vL and vR, respectively.

What is the symmetry breaking pattern generated by each of these vevs? (Namely, assume

first that the only vev is vL and write down the SSB pattern. Then repeat the question

assuming that the only vev is vR.)

15. We assume the following hierarchy

v_L \ll k_i \ll v_R. (7.113)

Now you are going to show why we assume these relations. Which of the four vevs (k1,

k2, vL and vR) affect the ρ = 1 relation? What bounds on the vevs can you deduce from

the experimental bound |ρ − 1| < 0.01 ? (You can neglect mixing between the WL and WR

bosons.)

16. Assuming the above relations, estimate the masses of the W^±_L and W^±_R in terms of the vevs and gauge couplings (do not worry about numbers and sub-leading corrections). What bounds on the vevs can you deduce from the experimental bound on the right-handed amplitude in muon decay, |g^V_{RR}| < 0.033? (You can assume that g^V_{LL} = 1. Then, g^V_{RR} is nothing but the ratio between the right-handed and left-handed muon decay amplitudes.)


Chapter 8

The Standard Model

8.1 Defining the Standard Model

The Standard Model (SM) is defined as follows:

(i) The gauge symmetry is

SU(3)_C \times SU(2)_L \times U(1)_Y. (8.1)

(ii) The pattern of spontaneous symmetry breaking is as follows:

SU(3)_C \times SU(2)_L \times U(1)_Y \to SU(3)_C \times U(1)_{EM} (Q_{EM} = T_3 + Y). (8.2)

(iii) There are three fermion generations, each consisting of five different representations:

Q_{Li}(3, 2)_{+1/6}, U_{Ri}(3, 1)_{+2/3}, D_{Ri}(3, 1)_{-1/3}, L_{Li}(1, 2)_{-1/2}, E_{Ri}(1, 1)_{-1}, i = 1, 2, 3. (8.3)

(iv) There is a single scalar multiplet:

φ(1, 2)_{+1/2}. (8.4)

We use the notation (A,B)Y such that A (B) is the irrep under SU(3)C (SU(2)L), and Y is the

hypercharge. The fermions that transform as triplets of SU(3)C are called quarks, while those

that transform as singlets of SU(3)C are called leptons.

8.2 The Lagrangian

As explained in previous chapters, the most general renormalizable Lagrangian with scalar and

fermion fields can be decomposed into

L = Lkin + Lψ + LYuk + Lφ. (8.5)


It is now our task to find the specific form of the Lagrangian made of the fermion fields QLi, URi,

DRi, LLi and ERi (8.3), and the scalar field φ (8.4), subject to the gauge symmetry (8.1) and

leading to the SSB of Eq. (8.2).

8.2.1 Lkin and the gauge symmetry

The gauge group is given in Eq. (8.1). It has twelve generators: eight L_a's that form the SU(3) algebra, three T_b's that form the SU(2) algebra, and a single Y that corresponds to the U(1) group:

[L_a, L_b] = i f_{abc} L_c, [T_a, T_b] = i ε_{abc} T_c, [L_a, T_b] = [L_a, Y] = [T_b, Y] = 0. (8.6)

Thus there are three independent coupling constants in Lkin: gs related to the SU(3)C subgroup,

g related to the SU(2)L subgroup, and g′ related to the U(1)Y subgroup.

The local symmetry requires twelve gauge boson degrees of freedom, eight in the adjoint representation of SU(3)C, three in the adjoint representation of SU(2)L, and one related to the U(1)Y symmetry:

G^\mu_a(8, 1)_0, W^\mu_a(1, 3)_0, B^\mu(1, 1)_0. (8.7)

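The corresponding field strengths are given by

G^{\mu\nu}_a = \partial^\mu G^\nu_a - \partial^\nu G^\mu_a - g_s f_{abc} G^\mu_b G^\nu_c,

W^{\mu\nu}_a = \partial^\mu W^\nu_a - \partial^\nu W^\mu_a - g ε_{abc} W^\mu_b W^\nu_c,

B^{\mu\nu} = \partial^\mu B^\nu - \partial^\nu B^\mu, (8.8)

where f_{abc} are the structure constants of SU(3). The covariant derivative is given by

D^\mu = \partial^\mu + i g_s G^\mu_a L_a + i g W^\mu_b T_b + i g' Y B^\mu. (8.9)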

For the SU(3)C triplets, L_a = \frac{1}{2}λ_a (λ_a are the Gell-Mann matrices), while for the SU(3)C singlets, L_a = 0. For the SU(2)L doublets, T_b = \frac{1}{2}σ_b (σ_b are the Pauli matrices), while for the SU(2)L singlets, T_b = 0. For SU(3)C adjoints, (L_a)_{bc} = f_{abc}, and for SU(2)L adjoints, (T_a)_{bc} = ε_{abc}, which have already been used in writing (8.8).


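Lkin includes the kinetic terms of all the fields:

L_{kin} = -\frac{1}{4} G^{\mu\nu}_a G_{a\mu\nu} - \frac{1}{4} W^{\mu\nu}_b W_{b\mu\nu} - \frac{1}{4} B^{\mu\nu} B_{\mu\nu}
 - i\overline{Q_{Li}} \slashed{D} Q_{Li} - i\overline{U_{Ri}} \slashed{D} U_{Ri} - i\overline{D_{Ri}} \slashed{D} D_{Ri} - i\overline{L_{Li}} \slashed{D} L_{Li} - i\overline{E_{Ri}} \slashed{D} E_{Ri} - (D^\mu φ)^\dagger (D_\mu φ). (8.10)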

Explicitly, the covariant derivatives acting on the various fermion fields are given by

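D^\mu Q_L = \left( \partial^\mu + \frac{i}{2} g_s G^\mu_a λ_a + \frac{i}{2} g W^\mu_b σ_b + \frac{i}{6} g' B^\mu \right) Q_L,

D^\mu U_R = \left( \partial^\mu + \frac{i}{2} g_s G^\mu_a λ_a + \frac{2i}{3} g' B^\mu \right) U_R,

D^\mu D_R = \left( \partial^\mu + \frac{i}{2} g_s G^\mu_a λ_a - \frac{i}{3} g' B^\mu \right) D_R,

D^\mu L_L = \left( \partial^\mu + \frac{i}{2} g W^\mu_b σ_b - \frac{i}{2} g' B^\mu \right) L_L,

D^\mu E_R = \left( \partial^\mu - i g' B^\mu \right) E_R. (8.11)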

8.2.2 Lψ

There are no mass terms for the fermions of the SM,

Lψ = 0. (8.12)

In Chapter 7 we have seen that this is the case for leptons. Note that a larger symmetry means

stronger constraints, hence it is impossible that lepton masses would become allowed when the

gauge symmetry is extended to include SU(3)C . As concerns the quarks, we cannot write Dirac

mass terms because they are assigned to chiral representations of the SU(2)L × U(1)Y gauge

symmetry. We cannot write Majorana mass terms for the quarks because they all have Y ≠ 0.

8.2.3 LYuk

The Yukawa part of the Lagrangian is given by

L_{Yuk} = Y^u_{ij} \overline{Q_{Li}} U_{Rj} \tilde{φ} + Y^d_{ij} \overline{Q_{Li}} D_{Rj} φ + Y^e_{ij} \overline{L_{Li}} E_{Rj} φ + h.c., (8.13)

where i, j = 1, 2, 3 are flavor indices, and \tilde{φ}_a = ε_{ab} φ^*_b. The Yukawa matrices Y^u, Y^d and Y^e are general complex 3 × 3 matrices of dimensionless couplings.

For the leptons, the situation is the same as what we discussed in Chapter 7. For the quarks, because there are two Yukawa matrices, the situation is different in a significant way.

We start with the leptons. Without loss of generality, we can use a bi-unitary transformation,

Y^e \to \hat{Y}^e = U_{eL} Y^e U^\dagger_{eR}, (8.14)

to change the basis to one where the Yukawa matrix \hat{Y}^e is diagonal and real:

\hat{Y}^e = diag(y_e, y_µ, y_τ). (8.15)

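In the basis defined in Eq. (8.15), we denote the components of the lepton SU(2)-doublets, and the three lepton SU(2)-singlets, as follows:

\begin{pmatrix} ν_{eL} \\ e_L \end{pmatrix}, \begin{pmatrix} ν_{µL} \\ µ_L \end{pmatrix}, \begin{pmatrix} ν_{τL} \\ τ_L \end{pmatrix}; e_R, µ_R, τ_R, (8.16)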

where e, µ, τ are ordered by the size of ye, yµ, yτ (from smallest to largest).


We now move to the quarks. Similarly, without loss of generality, we can use a bi-unitary

transformation,

Y^u \to \hat{Y}^u = V_{uL} Y^u V^\dagger_{uR}, (8.17)

to change the basis to one where the Yukawa matrix \hat{Y}^u is diagonal and real:

\hat{Y}^u = diag(y_u, y_c, y_t). (8.18)

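In the basis defined in Eq. (8.18), we denote the components of the quark SU(2)-doublets, and the up-type quark SU(2)-singlets, as follows:

\begin{pmatrix} u_L \\ d_{uL} \end{pmatrix}, \begin{pmatrix} c_L \\ d_{cL} \end{pmatrix}, \begin{pmatrix} t_L \\ d_{tL} \end{pmatrix}; u_R, c_R, t_R, (8.19)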

where u, c, t are ordered by the size of yu, yc, yt (from smallest to largest).

We can use yet another bi-unitary transformation,

Y^d \to \hat{Y}^d = V_{dL} Y^d V^\dagger_{dR}, (8.20)

to change the basis to one where the Yukawa matrix \hat{Y}^d is diagonal and real:

\hat{Y}^d = diag(y_d, y_s, y_b). (8.21)

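In the basis defined in Eq. (8.21), we denote the components of the quark SU(2)-doublets, and the down-type quark SU(2)-singlets, as follows:

\begin{pmatrix} u_{dL} \\ d_L \end{pmatrix}, \begin{pmatrix} u_{sL} \\ s_L \end{pmatrix}, \begin{pmatrix} u_{bL} \\ b_L \end{pmatrix}; d_R, s_R, b_R, (8.22)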

where d, s, b are ordered by the size of yd, ys, yb (from smallest to largest).

If V_{uL} ≠ V_{dL}, as is the general case, then the interaction basis defined by (8.18) is different from the interaction basis defined by (8.21). In the former, Y^d can be written as a unitary matrix times a diagonal one,

Y^u = \hat{Y}^u, Y^d = V \hat{Y}^d. (8.23)

In the latter, Y^u can be written as a unitary matrix times a diagonal one,

Y^d = \hat{Y}^d, Y^u = V^\dagger \hat{Y}^u. (8.24)

In either case, the matrix V is given by

V = V_{uL} V^\dagger_{dL}, (8.25)

where V_{uL} and V_{dL} are defined in Eqs. (8.17) and (8.20), respectively. Note that V_{uL}, V_{uR}, V_{dL} and V_{dR} depend on the basis from which we start the diagonalization. The combination V = V_{uL} V^\dagger_{dL}, however, does not. This is a hint that V is physical. Indeed, below we see that it plays a crucial role in the charged current weak interactions.
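The basis independence of V can be checked numerically. The following is a minimal sketch, assuming that the bi-unitary transformations of Eqs. (8.17) and (8.20) are obtained from a singular value decomposition; the helper names (`bi_unitary_diagonalize`, `mixing_matrix`) and the random Yukawa matrices are illustrative and not part of the text. Since the diagonalizing matrices are only fixed up to diagonal phases, the check compares the absolute values of the elements of V.

```python
import numpy as np

def random_complex_matrix(rng, n=3):
    """Generic complex n x n matrix standing in for a Yukawa matrix."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def random_unitary(rng, n=3):
    """Unitary matrix from the QR decomposition of a random complex matrix."""
    q, _ = np.linalg.qr(random_complex_matrix(rng, n))
    return q

def bi_unitary_diagonalize(Y):
    """Return (V_L, y, V_R) with V_L @ Y @ V_R^dagger = diag(y), y ascending."""
    U, s, Vh = np.linalg.svd(Y)          # Y = U @ diag(s) @ Vh, s descending
    order = np.argsort(s)                # reorder from smallest to largest
    V_L = U[:, order].conj().T
    V_R = Vh[order, :]
    return V_L, s[order], V_R

def mixing_matrix(Yu, Yd):
    """V = V_uL V_dL^dagger, as in Eq. (8.25)."""
    VuL, _, _ = bi_unitary_diagonalize(Yu)
    VdL, _, _ = bi_unitary_diagonalize(Yd)
    return VuL @ VdL.conj().T

rng = np.random.default_rng(0)
Yu, Yd = random_complex_matrix(rng), random_complex_matrix(rng)
V = mixing_matrix(Yu, Yd)

# Move to a different interaction basis: a common rotation of the doublets
# (WQ) and independent rotations of the up and down singlets (WU, WD).
WQ, WU, WD = (random_unitary(rng) for _ in range(3))
V_rotated = mixing_matrix(WQ @ Yu @ WU.conj().T, WQ @ Yd @ WD.conj().T)

print(np.round(np.abs(V), 3))
print(np.round(np.abs(V_rotated), 3))
```

The two printed matrices should agree element by element, while the individual matrices V_{uL} and V_{dL} differ between the two bases.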


8.2.4 Lφ and spontaneous symmetry breaking

The scalar field is a singlet of the SU(3)C group. Thus, the form of Lφ is the same as in the LSM,

L_φ = -µ^2 φ^\dagger φ - λ (φ^\dagger φ)^2. (8.26)

Choosing µ^2 < 0 and λ > 0 leads, as in the LSM, to spontaneous symmetry breaking, with |⟨φ⟩| = v/\sqrt{2} (v^2 = -µ^2/λ). Since φ is an SU(3)C singlet, the SU(3)C subgroup remains unbroken, and the pattern of spontaneous symmetry breaking is as required by Eq. (8.2).

The spontaneous breaking of SU(2)L × U(1)Y into U(1)EM allows us to distinguish the components of the SU(2)L-doublet fermion fields by their electromagnetic charges. For the lepton fields of Eq. (7.21), we presented the charges in Eq. (7.22): −1 for the T3 = −1/2 member and 0 for the T3 = +1/2 member. Writing down the two components of the SU(2)L-doublet quark fields as

Q_L = \begin{pmatrix} U_L \\ D_L \end{pmatrix}, (8.27)

we have the following EM charges of the quark fields:

q(U_L) = +\frac{2}{3}, q(D_L) = -\frac{1}{3}, q(U_R) = +\frac{2}{3}, q(D_R) = -\frac{1}{3}. (8.28)

In what follows, we often call the q = −1/3 quarks “down-type quarks,” and the q = +2/3 quarks

“up-type quarks.”

8.2.5 Summary

The renormalizable part of the Standard Model Lagrangian is given by

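L_{SM} = -\frac{1}{4} G^{\mu\nu}_a G_{a\mu\nu} - \frac{1}{4} W^{\mu\nu}_b W_{b\mu\nu} - \frac{1}{4} B^{\mu\nu} B_{\mu\nu} - (D^\mu φ)^\dagger (D_\mu φ)
 - i\overline{Q_{Li}} \slashed{D} Q_{Li} - i\overline{U_{Ri}} \slashed{D} U_{Ri} - i\overline{D_{Ri}} \slashed{D} D_{Ri} - i\overline{L_{Li}} \slashed{D} L_{Li} - i\overline{E_{Ri}} \slashed{D} E_{Ri}
 + \left( Y^u_{ij} \overline{Q_{Li}} U_{Rj} \tilde{φ} + Y^d_{ij} \overline{Q_{Li}} D_{Rj} φ + Y^e_{ij} \overline{L_{Li}} E_{Rj} φ + h.c. \right) - λ \left( φ^\dagger φ - v^2/2 \right)^2, (8.29)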

where i, j = 1, 2, 3.

8.3 The Spectrum

8.3.1 Bosons

Given the spontaneous breaking of the SU(2)L × U(1)Y symmetry to the U(1)EM subgroup, the spectrum of the electroweak gauge bosons remains the same as in the LSM: three massive vector bosons, W± and Z0, and a massless photon, A0. Furthermore, since the breaking is induced by an SU(2)L-doublet, the relation ρ ≡ m_W^2/(m_Z^2 \cos^2 θ_W) = 1 holds.

The new ingredient is the existence of a gluon in the octet representation of SU(3)C. Since the SU(3)C gauge symmetry remains unbroken, the gluon is massless.

As concerns scalars, the three would-be Goldstone bosons become the longitudinal components of the three massive vector bosons. The fourth scalar degree of freedom is the Higgs boson h, a real massive scalar field.

8.3.2 Fermions

Since the SM allows no bare mass terms for the fermions, their masses can only arise from the Yukawa part of the Lagrangian, which is given in Eq. (8.13). Indeed, with ⟨φ^0⟩ = v/\sqrt{2}, Eq. (8.13) has a piece that corresponds to charged lepton masses:

m_e = \frac{y_e v}{\sqrt{2}}, m_µ = \frac{y_µ v}{\sqrt{2}}, m_τ = \frac{y_τ v}{\sqrt{2}}, (8.30)

a piece that corresponds to up-type quark masses,

m_u = \frac{y_u v}{\sqrt{2}}, m_c = \frac{y_c v}{\sqrt{2}}, m_t = \frac{y_t v}{\sqrt{2}}, (8.31)

and a piece that corresponds to down-type quark masses,

m_d = \frac{y_d v}{\sqrt{2}}, m_s = \frac{y_s v}{\sqrt{2}}, m_b = \frac{y_b v}{\sqrt{2}}. (8.32)

We conclude that all charged fermions acquire Dirac masses as a result of the spontaneous symme-

try breaking. The key to this feature is that, while the charged fermions are in chiral representations

of the full gauge group SU(3)C × SU(2)L × U(1)Y , they are in vector-like representations of the

SU(3)C × U(1)EM group:

• The LH and RH charged lepton fields, e, µ and τ , are in the (1)−1 representation.

• The LH and RH up-type quark fields, u, c and t, are in the (3)+2/3 representation.

• The LH and RH down-type quark fields, d, s and b, are in the (3)−1/3 representation.

On the other hand, as discussed in Section 7.3.3, the neutrinos remain massless:

mνe = mνµ = mντ = 0. (8.33)

The experimental values of the charged fermion masses are

m_e = 0.510998910(13) MeV, m_µ = 105.658367(4) MeV, m_τ = 1776.82(16) MeV,

m_u = 1.5 − 3.1 MeV, m_c = 1.29^{+0.05}_{-0.11} GeV, m_t = 172.9 ± 0.12 GeV,

m_d = 4.1 − 5.7 MeV, m_s = 100^{+30}_{-20} MeV, m_b = 4.9^{+0.18}_{-0.06} GeV, (8.34)

where the quark masses are given at a scale µ = 2 GeV. We discuss how the masses of the quarks are determined in more detail in Chapter 9.
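To get a feeling for the hierarchy of the Yukawa couplings, Eqs. (8.30)-(8.32) can be inverted, y_f = \sqrt{2} m_f / v. The sketch below uses the central values quoted in Eq. (8.34) for the fermions that have one (u and d are quoted only as ranges and are left out). The value v ≈ 246 GeV is an assumption that is not stated in this section, and the names used here are illustrative only.

```python
import math

# Assumption: the Higgs vev, v ~ 246 GeV, is not quoted in this section.
VEV_GEV = 246.0

# Central values from Eq. (8.34), converted to GeV.
MASSES_GEV = {
    "e": 0.510998910e-3,
    "mu": 105.658367e-3,
    "tau": 1776.82e-3,
    "s": 100e-3,
    "c": 1.29,
    "b": 4.9,
    "t": 172.9,
}

def yukawa_coupling(mass_gev, vev_gev=VEV_GEV):
    """Invert m_f = y_f v / sqrt(2)."""
    return math.sqrt(2.0) * mass_gev / vev_gev

yukawas = {name: yukawa_coupling(m) for name, m in MASSES_GEV.items()}

# Print from the smallest coupling to the largest.
for name, y in sorted(yukawas.items(), key=lambda item: item[1]):
    print(f"y_{name:<3} = {y:.3e}")
```

With these inputs the top Yukawa comes out close to one, while the electron Yukawa is of order 10^{-6}: the couplings span roughly six orders of magnitude.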


Table 8.1: The SM particles

particle      spin   color   Q      mass [v]
W±            1      (1)     ±1     g/2
Z0            1      (1)     0      √(g^2 + g′^2)/2
A0            1      (1)     0      0
g             1      (8)     0      0
h             0      (1)     0      √(2λ)
e, µ, τ       1/2    (1)     −1     y_{e,µ,τ}/√2
νe, νµ, ντ    1/2    (1)     0      0
u, c, t       1/2    (3)     +2/3   y_{u,c,t}/√2
d, s, b       1/2    (3)     −1/3   y_{d,s,b}/√2

8.3.3 Summary

The mass eigenstates of the SM, their SU(3)C × U(1)EM quantum numbers, and their masses in

units of the VEV v, are presented in Table 8.1. All masses are proportional to the VEV of the

scalar field, v. For the three massive gauge bosons, and for the fermions, this is expected: In the

absence of spontaneous symmetry breaking, the former would be protected from acquiring masses

by the gauge symmetry and the latter by their chiral nature. For the Higgs boson, the situation

is different, as a mass-squared term does not violate any symmetry.


8.4 The Interactions

In this section, we discuss the interactions of the fermion and scalar mass eigenstates of the

Standard Model. The QED interactions of the leptons have been presented in Chapters 3 and

7. The electromagnetic interactions of the quarks are dictated by their charges, presented in Eq.

(8.28), and can be obtained along the lines explained in Section 3.6.3. The QCD interactions of

the quarks have been presented in Chapter 5. Here we focus on the weak and Yukawa interactions,

that is, the couplings of fermions to the W , Z and h bosons.

8.4.1 Neutral current weak interactions

The Z couplings to fermions can be written as follows:

L_{Z,fermions} = \frac{e}{\sin θ_W \cos θ_W} \left( T_{3i} - \sin^2 θ_W Q_i \right) \overline{ψ_i} \slashed{Z} ψ_i. (8.35)


Using the T3 and Y assignments of the various fermion fields, we find the following types of Z

couplings:

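L = \frac{e}{s_W c_W} \Big[ -\left( \frac{1}{2} - s_W^2 \right) \overline{e_L} \slashed{Z} e_L + s_W^2 \overline{e_R} \slashed{Z} e_R + \frac{1}{2} \overline{ν_{eL}} \slashed{Z} ν_{eL}
 + \left( \frac{1}{2} - \frac{2}{3} s_W^2 \right) \overline{u_L} \slashed{Z} u_L - \frac{2}{3} s_W^2 \overline{u_R} \slashed{Z} u_R - \left( \frac{1}{2} - \frac{1}{3} s_W^2 \right) \overline{d_L} \slashed{Z} d_L + \frac{1}{3} s_W^2 \overline{d_R} \slashed{Z} d_R \Big]
 + (e, ν_e, u, d \to µ, ν_µ, c, s) + (e, ν_e, u, d \to τ, ν_τ, t, b). (8.36)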

The Z couplings are chiral, parity-violating, diagonal and universal.

Omitting common factors, particularly a factor of e^2/(4 s_W^2 c_W^2), and phase-space factors, we obtain the following predictions for the Z decays into a one-generation fermion pair of each type:

Γ(Z \to ν\bar{ν}) ∝ 1,

Γ(Z \to ℓ\bar{ℓ}) ∝ 1 - 4 s_W^2 + 8 s_W^4,

Γ(Z \to u\bar{u}) ∝ 3 \left( 1 - \frac{8}{3} s_W^2 + \frac{32}{9} s_W^4 \right),

Γ(Z \to d\bar{d}) ∝ 3 \left( 1 - \frac{4}{3} s_W^2 + \frac{8}{9} s_W^4 \right). (8.37)

Putting s_W^2 = 0.225, we obtain

Γ_ν : Γ_ℓ : Γ_u : Γ_d = 1 : 0.51 : 1.74 : 2.24. (8.38)

Experiments measure the following average branching ratio into a single generation of each fermion

species:

BR(Z \to ν\bar{ν}) = (6.67 ± 0.02)%,

BR(Z \to ℓ\bar{ℓ}) = (3.37 ± 0.01)%,

BR(Z \to u\bar{u}) = (11.6 ± 0.6)%,

BR(Z \to d\bar{d}) = (15.6 ± 0.4)%, (8.39)

which, using central values, give

Γ_ν : Γ_ℓ : Γ_u : Γ_d = 1 : 0.505 : 1.74 : 2.34, (8.40)

in very nice agreement with the predictions.
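The numbers in Eq. (8.38) follow directly from the couplings in Eq. (8.35). As a minimal sketch (the function and dictionary names are illustrative), each partial width is taken proportional to N_c (g_L^2 + g_R^2), with g_L = T_3 − s_W^2 Q, g_R = −s_W^2 Q and N_c the number of colors, and is then normalized to the neutrino width:

```python
def z_width_weight(t3, charge, n_colors, s2w):
    """Relative Z partial width, 4 * N_c * (g_L^2 + g_R^2), phase space ignored."""
    g_left = t3 - s2w * charge     # coupling of the left-handed field
    g_right = -s2w * charge        # right-handed fields have T3 = 0
    return 4 * n_colors * (g_left**2 + g_right**2)

# (T3 of the left-handed field, electric charge, number of colors)
FERMIONS = {
    "nu": (0.5, 0.0, 1),
    "lepton": (-0.5, -1.0, 1),
    "u": (0.5, 2.0 / 3.0, 3),
    "d": (-0.5, -1.0 / 3.0, 3),
}

def z_width_ratios(s2w=0.225):
    """Partial widths of Eq. (8.37), normalized to the neutrino one."""
    weights = {
        name: z_width_weight(t3, q, nc, s2w)
        for name, (t3, q, nc) in FERMIONS.items()
    }
    return {name: w / weights["nu"] for name, w in weights.items()}

ratios = z_width_ratios()
for name, value in ratios.items():
    print(f"{name:>6}: {value:.3f}")
```

Changing `s2w` shows how sensitive the charged-lepton width is to the weak mixing angle, since 1 − 4 s_W^2 is small for s_W^2 close to 1/4.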

8.4.2 Charged current weak interactions

We now study the couplings of the charged vector bosons, W±, to fermion pairs. For the lepton

mass eigenstates, things are simple, because there exists an interaction basis that is also a mass

basis. Thus, the W interactions must be universal also in the mass basis:

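-\frac{g}{\sqrt{2}} \left( \overline{ν_{eL}} \slashed{W}^+ e^-_L + \overline{ν_{µL}} \slashed{W}^+ µ^-_L + \overline{ν_{τL}} \slashed{W}^+ τ^-_L + h.c. \right). (8.41)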


As concerns quarks, things are more complicated, since there is no interaction basis that is also

a mass basis. In the interaction basis where the down quarks are mass eigenstates (8.22), the W

interactions have the following form:

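-\frac{g}{\sqrt{2}} \left( \overline{u_{dL}} \slashed{W}^+ d_L + \overline{u_{sL}} \slashed{W}^+ s_L + \overline{u_{bL}} \slashed{W}^+ b_L + h.c. \right). (8.42)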

The Yukawa matrices in this basis have the form (8.24), and in particular, for the up sector, we

have

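L^u_{Yuk} = \left( \overline{u_{dL}} \;\; \overline{u_{sL}} \;\; \overline{u_{bL}} \right) V^\dagger \hat{Y}^u \begin{pmatrix} u_R \\ c_R \\ t_R \end{pmatrix}, (8.43)

which tells us straightforwardly how to transform to the mass basis:

\begin{pmatrix} u_L \\ c_L \\ t_L \end{pmatrix} = V \begin{pmatrix} u_{dL} \\ u_{sL} \\ u_{bL} \end{pmatrix}. (8.44)

Using Eq. (8.44), we obtain the form of the W interactions (8.42) in the mass basis:

-\frac{g}{\sqrt{2}} \left( \overline{u_L} \;\; \overline{c_L} \;\; \overline{t_L} \right) V \slashed{W}^+ \begin{pmatrix} d_L \\ s_L \\ b_L \end{pmatrix} + h.c. (8.45)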

Recalling that V = V_{uL} V^\dagger_{dL} is basis independent, you can convince yourself that we would have

obtained the same form starting from any arbitrary interaction basis.

Eq. (8.45) reveals some important features of the model:

1. Only left-handed particles take part in charged-current interactions. Consequently, parity is

violated by these interactions.

2. The W couplings to the quark mass eigenstates are neither universal nor diagonal. The

universality of gauge interactions is hidden in the unitarity of the matrix V .

The matrix V is called the Cabibbo-Kobayashi-Maskawa (CKM) matrix.

The form of the CKM matrix is not unique. First, there is freedom in defining V in that we

can permute between the various generations. This freedom is fixed by ordering the up quarks

and the down quarks by their masses, i.e. (u1, u2, u3) → (u, c, t) and (d1, d2, d3) → (d, s, b). The

elements of V are therefore written as follows:

V = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}. (8.46)

We discuss the CKM matrix in more detail later in this chapter and in Chapter ??.


Omitting common factors (particularly, a factor of g2/4) and phase-space factors, we obtain

the following predictions for the W decays:

Γ(W^+ \to ℓ^+ ν_ℓ) ∝ 1,

Γ(W^+ \to u_i \bar{d}_j) ∝ 3 |V_{ij}|^2 (i = 1, 2; j = 1, 2, 3). (8.47)

The top quark is not included because it is heavier than the W boson. Taking this fact into account, and the CKM unitarity relations

|V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 = |V_{cd}|^2 + |V_{cs}|^2 + |V_{cb}|^2 = 1, (8.48)

we obtain

Γ(W → hadrons) ≈ 2Γ(W → leptons). (8.49)

Experimentally,

BR(W → leptons) = (32.40± 0.27)%, BR(W → hadrons) = (67.60± 0.27)%, (8.50)

which leads to

Γ(W → hadrons)/Γ(W → leptons) = 2.09± 0.01, (8.51)

in beautiful agreement with the SM prediction. The (hidden) universality within the quark sector

is tested by the prediction

Γ(W \to uX) = Γ(W \to cX) = \frac{1}{2} Γ(W \to hadrons). (8.52)

Experimentally,

Γ(W → cX)/Γ(W → hadrons) = 0.49± 0.04. (8.53)
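For completeness, the counting behind Eqs. (8.49) and (8.52) can be written out explicitly. This is only a sketch of the arithmetic, using Eq. (8.47), the unitarity relations (8.48), three lepton flavors, and neglecting phase-space and radiative corrections:

```latex
\frac{\Gamma(W \to \text{hadrons})}{\Gamma(W \to \text{leptons})}
  = \frac{3 \sum_{i=u,c} \sum_{j=d,s,b} |V_{ij}|^2}{\sum_{\ell=e,\mu,\tau} 1}
  = \frac{3\,(1 + 1)}{3} = 2 ,
\qquad
\frac{\Gamma(W \to cX)}{\Gamma(W \to \text{hadrons})}
  = \frac{3 \sum_{j=d,s,b} |V_{cj}|^2}{3\,(1 + 1)} = \frac{1}{2} .
```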

We discuss more aspects of the phenomenology related to the CKM matrix in Chapter ??.

8.4.3 Interactions of the Higgs boson

The Higgs boson has self-interactions, weak interactions, and Yukawa interactions:

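L_h = \frac{1}{2} \partial_\mu h \partial^\mu h - \frac{1}{2} m_h^2 h^2 - \frac{m_h^2}{2v} h^3 - \frac{m_h^2}{8v^2} h^4
 + m_W^2 W^-_\mu W^{\mu+} \left( \frac{2h}{v} + \frac{h^2}{v^2} \right) + \frac{1}{2} m_Z^2 Z_\mu Z^\mu \left( \frac{2h}{v} + \frac{h^2}{v^2} \right)
 - \frac{h}{v} \left( m_e \overline{e_L} e_R + m_µ \overline{µ_L} µ_R + m_τ \overline{τ_L} τ_R + m_u \overline{u_L} u_R + m_c \overline{c_L} c_R + m_t \overline{t_L} t_R + m_d \overline{d_L} d_R + m_s \overline{s_L} s_R + m_b \overline{b_L} b_R + h.c. \right). (8.54)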

Note that the Higgs boson couples diagonally to the quark mass eigenstates. The reason for this is that the Yukawa couplings determine both the masses and the Higgs couplings to the fermions. Thus, in the mass basis the Yukawa interactions are also diagonal. Formally we can see it as follows. Let us start from an arbitrary interaction basis:

h \overline{D_L} Y^d D_R = h \overline{D_L} (V^\dagger_{dL} V_{dL}) Y^d (V^\dagger_{dR} V_{dR}) D_R
 = h (\overline{D_L} V^\dagger_{dL}) (V_{dL} Y^d V^\dagger_{dR}) (V_{dR} D_R)
 = h (\overline{d_L} \;\; \overline{s_L} \;\; \overline{b_L}) \hat{Y}^d (d_R \;\; s_R \;\; b_R)^T. (8.55)

We conclude that the Higgs couplings to the fermion mass eigenstates are diagonal, but not uni-

versal. Instead, they are proportional to the fermion masses: the heavier the fermion, the stronger

the coupling.

Thus, the Higgs boson decay is dominated by the heaviest particle which can be pair-produced

in the decay. For mh ∼ 125 GeV, this is the bottom quark. Indeed, the SM predicts the following

branching ratios for the leading decay modes:

BR_{b\bar{b}} : BR_{WW^*} : BR_{gg} : BR_{τ^+τ^-} : BR_{ZZ^*} : BR_{c\bar{c}} = 0.58 : 0.21 : 0.09 : 0.06 : 0.03 : 0.03. (8.56)

The following comments are in order with regard to Eq. (8.56):

1. Of the six branching ratios, three (b, τ, c) stand for two-body tree-level decays. Thus, at tree level, the respective branching ratios obey BR_{b\bar{b}} : BR_{τ^+τ^-} : BR_{c\bar{c}} = 3m_b^2 : m_τ^2 : 3m_c^2. QCD radiative corrections are significant and suppress the two modes with quark final states (b, c) compared to the one with the lepton final state (τ).

2. The WW ∗ and ZZ∗ modes stand for the three-body tree-level decays, where one of the vector

bosons is on-shell and the other off-shell.

3. The Higgs boson does not have a tree-level coupling to gluons since it carries no color (and

the gluons have no mass). The decay into final gluons proceeds via loop diagrams. The

dominant contribution comes from the top-quark loop.

4. Similarly, the Higgs decays into two photons via loop diagrams with a small (BR_{γγ} ∼ 0.002), but observable, rate. The dominant contributions come from the W and the top-quark loops, which interfere destructively.

Experimentally, the decays into ZZ∗, WW∗ and γγ have been established, and there is evidence for the τ+τ− mode.

8.4.4 Summary

Within the SM, quarks have five types of interactions. These interactions are summarized in Table 8.2.

Table 8.2: The SM quark interactions

interaction       force carrier   coupling                      range
electromagnetic   γ               eQ                            long
strong            g               g_s                           long
NC weak           Z0              e(T3 − s_W^2 Q)/(s_W c_W)     short
CC weak           W±              gV                            short
Yukawa            h               y_q                           short

8.5 Accidental symmetries and parameter counting


8.5.1 Accidental symmetries

If we set the Yukawa couplings to zero, LYuk = 0, the SM gains a large accidental global symmetry:

G^{global}_{SM}(Y^{u,d,e} = 0) = U(3)_Q \times U(3)_U \times U(3)_D \times U(3)_L \times U(3)_E, (8.57)

where U(3)Q has (Q1, Q2, Q3) transforming as an SU(3)Q triplet, and all other fields singlets,

U(3)U has (U1, U2, U3) transforming as an SU(3)U triplet, and all other fields singlets, U(3)D has

(D1, D2, D3) transforming as an SU(3)D triplet, and all other fields singlets, U(3)L has (L1, L2, L3)

transforming as an SU(3)L triplet, and all other fields singlets, and U(3)E has (E1, E2, E3) trans-

forming as an SU(3)E triplet, and all other fields singlets.

The Yukawa couplings break this symmetry into the following subgroup:

G^{global}_{SM} = U(1)_B \times U(1)_e \times U(1)_µ \times U(1)_τ. (8.58)

Under U(1)B, all quarks (antiquarks) carry charge +1/3 (−1/3), while all other fields are neutral. This symmetry explains why proton decay has not been observed. Possible proton decay modes, such as p → π0e+

or p→ K+ν, are not forbidden by the SU(3)C ×U(1)EM symmetry. However, they violate U(1)B,

and therefore do not occur within the SM. The lesson here is quite general: The lightest particle

that carries a conserved charge is stable. The accidental U(1)B symmetry also explains why

neutron-antineutron oscillations have not been observed.

Note that U(1)B as well as each of the lepton numbers are anomalous. The combination

of B − L, however, is anomaly free. Due to the anomaly, baryon and lepton number violating

processes occur non-perturbatively. However, the non-perturbative effects obey ∆B = ∆L = 3n, with n an integer, and thus do not lead to proton decay. Moreover, they are very small and can be neglected in almost all cases we study, and thus we do not discuss them any further.

The accidental symmetries of the renormalizable part of the SM Lagrangian also explain the

vanishing of neutrino masses. A Majorana mass term violates the accidental B − L symmetry by


two units. Thus, the symmetry prevents mass terms not only at tree level but also to all orders

in perturbation theory. Moreover, since the B − L symmetry is non-anomalous, Majorana mass

terms do not arise even at the non-perturbative level. We conclude that the renormalizable SM

gives the exact prediction:

mν = 0. (8.59)

8.5.2 Parameter counting

Before we discuss the SM parameters in detail, we explain the basics of identifying the number

of physical parameters. The Lagrangian written in a general interaction basis might include a

number of parameters that is larger than the number of physical parameters. This means that

when we express physical observables in terms of the Lagrangian parameters, only a subset of

these parameters (or combinations of them) will appear. It also means that there is a specific

basis where the non-physical parameters are identically zero. For the purpose of testing a model,

it is important to count and identify its physical parameters. In this subsection we explain how to

determine the number of physical parameters.

We start with a very simple example. Consider a hydrogen atom in a uniform magnetic field.

Before turning on the magnetic field, the hydrogen atom is invariant under spatial rotations, which

are described by the SO(3) group. Furthermore, there is an energy eigenvalue degeneracy of the

Hamiltonian: states with different angular momenta have the same energy. This degeneracy is a

consequence of the symmetry of the system.

When magnetic field is added to the system, we can define, without loss of generality, the

direction of the magnetic field. The common convention is to define the positive z direction to

be the direction of the magnetic field. Consider this choice more carefully. A generic uniform

magnetic field would be described by three real numbers: the three components of the magnetic

field (Bx, By, Bz). The magnetic field breaks the SO(3) symmetry of the hydrogen atom system

down to an SO(2) symmetry of rotations in the plane perpendicular to the magnetic field. The

one generator of the SO(2) symmetry is the only valid symmetry generator now; the remaining

two SO(3) generators in the orthogonal planes are broken. These broken symmetry generators

allow us to rotate the system such that the magnetic field points in the z direction:

O_{xz} O_{yz} (B_x, B_y, B_z) = (0, 0, B'_z), (8.60)

where Oxz and Oyz are rotations in the xz and yz planes respectively. The two broken generators

were used to rotate away two unphysical parameters, leaving us with one physical parameter, the

magnitude of the magnetic field. We learn that, when turning on the magnetic field, all measurable quantities in the system depend on only one new parameter, rather than the naive three.
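The two rotations of Eq. (8.60) can be constructed explicitly. The following is a minimal numerical sketch, with illustrative function names and one particular sign convention for the rotation angles: a rotation in the yz plane removes B_y, a subsequent rotation in the xz plane removes B_x, and what remains is the magnitude of the field.

```python
import numpy as np

def rotation_yz(angle):
    """Rotation in the yz plane (leaves the x component untouched)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def rotation_xz(angle):
    """Rotation in the xz plane (leaves the y component untouched)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, -s],
                     [0.0, 1.0, 0.0],
                     [s, 0.0, c]])

def align_with_z(field):
    """Use the two broken generators to rotate the field onto the z axis."""
    field = np.asarray(field, dtype=float)
    o_yz = rotation_yz(np.arctan2(field[1], field[2]))   # removes B_y
    intermediate = o_yz @ field
    o_xz = rotation_xz(np.arctan2(intermediate[0], intermediate[2]))  # removes B_x
    return o_xz @ intermediate

B = (1.0, 2.0, 2.0)
B_rotated = align_with_z(B)
print(B_rotated)   # only the z component survives; it equals |B|
```

The remaining SO(2) rotation in the xy plane leaves the rotated field unchanged, which is the numerical counterpart of the statement that it is the unbroken symmetry.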

The results described above are more generally applicable. Particularly, they are useful in

studying the flavor physics of quantum field theories. Consider a gauge theory with matter content.


The kinetic and gauge terms (Lkin) have a certain global symmetry, Gf . In adding terms (Lψ +

Lφ +LYuk) that respect the imposed gauge symmetries, the global symmetry may be broken down

to a smaller symmetry group. In breaking the global symmetry, there is an added freedom to use

the broken Gf generators to change basis and, in particular, rotate away unphysical parameters,

as when a magnetic field is added to the hydrogen atom system.

We are interested in obtaining the number of parameters affecting physical measurements,

Nphys. In a general basis, the added terms depend on Ngeneral parameters. The global symmetry

of the entire model, Hf , has fewer generators than Gf . We call the difference in the number of

generators Nbroken. Then Nphys is given by

N_{phys} = N_{general} - N_{broken}. (8.61)

Furthermore, the rule in (8.61) applies separately to real parameters and to phases. A general n × n complex matrix can be parameterized by n^2 real parameters and n^2 phases. Imposing restrictions like Hermiticity or unitarity reduces the number of parameters required to describe the matrix. A Hermitian matrix can be described by n(n + 1)/2 real parameters and n(n − 1)/2 phases. As for generators, the rules for unitary matrices are as follows. The generator of U(1) is, clearly, a phase. For U(n), the real parameters are associated with the SO(n) subgroup. Thus, a unitary n × n matrix has n(n − 1)/2 real parameters and n(n + 1)/2 phases.

8.5.3 Parameter counting in the SM

The rule given by (8.61) can be applied to the standard model. Consider the quark sector of the

model. The kinetic term has a global symmetry

Gf = U(3)Q × U(3)U × U(3)D. (8.62)

A U(3) algebra has 9 generators (3 real and 6 imaginary), so the total number of generators of Gf

is 27. The Yukawa interactions defined in Eq. (8.13), Y F (F = u, d), are 3× 3 complex matrices,

which contain a total of 36 parameters (18 real parameters and 18 phases) in a general basis. These

parameters also break Gf down to baryon number:

U(3)Q × U(3)U × U(3)D → U(1)B. (8.63)

While U(3)^3 has 27 generators, U(1)B has only one and thus N_{broken} = 26. This broken symmetry

allows us to rotate away a large number of the parameters by moving to a more convenient basis.

Using (8.61), the number of physical parameters should be given by

Nphys = 36− 26 = 10. (8.64)

These parameters can be split into real parameters and phases. The three unitary matrices generating the symmetry of the kinetic and gauge terms have a total of 9 real parameters and 18 phases. The symmetry is broken down to a symmetry with only one phase generator. Thus,

N^{(r)}_{phys} = 18 - 9 = 9, N^{(i)}_{phys} = 18 - 17 = 1. (8.65)

Let us now identify these parameters. Of the 9 real parameters, 6 are the fermion masses and

three are the CKM matrix mixing angles. The one phase is the CP-violating phase of the CKM

mixing matrix. (In your homework you will count the number of parameters for different models.)

We thus conclude that the full SM has 18 parameters: 3 gauge couplings, 2 parameters of the

Higgs potential, the 3 lepton masses and the 10 parameters of the quark sector.
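The counting of Eq. (8.65) is easily generalized to n generations. The following sketch (function names are illustrative) applies the rule (8.61) separately to real parameters and phases, for two general complex n × n Yukawa matrices and a global symmetry U(n)^3 broken down to U(1)_B:

```python
def unitary_generators(n):
    """Generators of U(n), split into (real parameters, phases)."""
    return n * (n - 1) // 2, n * (n + 1) // 2

def quark_sector_parameters(n_generations):
    """Physical (real parameters, phases) of the quark Yukawa sector."""
    n = n_generations
    yukawa_real = 2 * n * n        # two complex n x n matrices: Y^u and Y^d
    yukawa_phases = 2 * n * n
    group_real, group_phases = unitary_generators(n)
    broken_real = 3 * group_real             # U(n)_Q x U(n)_U x U(n)_D
    broken_phases = 3 * group_phases - 1     # U(1)_B remains unbroken
    return yukawa_real - broken_real, yukawa_phases - broken_phases

for n in (1, 2, 3):
    n_real, n_phases = quark_sector_parameters(n)
    print(f"{n} generation(s): {n_real} real parameters, {n_phases} phase(s)")
```

For n = 3 this reproduces the 9 real parameters and one phase found above. For n = 2 it gives five real parameters (four masses and one mixing angle) and no phase.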

8.5.4 Parametrization of the CKM matrix

A general 3 × 3 unitary matrix has three real parameters and six phases, but we concluded above that only one of these phases is physical. This implies that we can find bases

where V has a single phase. This physical phase is the Kobayashi-Maskawa phase that is usually

denoted by δKM.

The fact that there are only three real and one imaginary physical parameters in V can be made

manifest by choosing an explicit parametrization. For example, the standard parametrization, used

by the Particle Data Group (PDG) [?], is given by

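V = \begin{pmatrix}
c_{12} c_{13} & s_{12} c_{13} & s_{13} e^{-iδ} \\
-s_{12} c_{23} - c_{12} s_{23} s_{13} e^{iδ} & c_{12} c_{23} - s_{12} s_{23} s_{13} e^{iδ} & s_{23} c_{13} \\
s_{12} s_{23} - c_{12} c_{23} s_{13} e^{iδ} & -c_{12} s_{23} - s_{12} c_{23} s_{13} e^{iδ} & c_{23} c_{13}
\end{pmatrix}, (8.66)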

where c_{ij} ≡ \cos θ_{ij} and s_{ij} ≡ \sin θ_{ij}. The three \sin θ_{ij} are the three real mixing parameters while δ is the Kobayashi-Maskawa phase. With the fixed mass ordering explained above, we have θ_{ij} ∈ [0, π/2] and δ ∈ [0, 2π]. Another parametrization is the Wolfenstein parametrization, where the four mixing parameters are (λ, A, ρ, η), with η representing the CP-violating phase. The Wolfenstein parametrization is an expansion in the small parameter λ = |V_{us}| ≈ 0.22. To O(λ^3) the parametrization is given by

V = \begin{pmatrix}
1 - \frac{1}{2} λ^2 & λ & A λ^3 (ρ - iη) \\
-λ & 1 - \frac{1}{2} λ^2 & A λ^2 \\
A λ^3 (1 - ρ - iη) & -A λ^2 & 1
\end{pmatrix}. (8.67)

In Chapter ?? we discuss in detail the measurements of the CKM parameters. Here we just mention

that the Wolfenstein parametrization provides a good approximation to the actual numerical values: The CKM matrix is close to a unit matrix, with off-diagonal terms that are small. The order

of magnitude of each element can be read from the power of λ in the Wolfenstein parametrization.
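As a rough numerical illustration of Eqs. (8.66) and (8.67), the sketch below builds the standard parametrization, checks unitarity, and compares it with the Wolfenstein form. The input values of the mixing angles and phase are illustrative assumptions, not numbers taken from the text, and the function names are hypothetical. The mapping used is $s_{12} = \lambda$, $s_{23} = A\lambda^2$ and $s_{13}e^{-i\delta} = A\lambda^3(\rho - i\eta)$.

```python
import numpy as np


def ckm_standard(s12, s23, s13, delta):
    """Standard parametrization of Eq. (8.66)."""
    c12, c23, c13 = (np.sqrt(1.0 - s**2) for s in (s12, s23, s13))
    e = np.exp(1j * delta)
    return np.array([
        [c12 * c13, s12 * c13, s13 * np.conj(e)],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ])


def ckm_wolfenstein(lam, A, rho, eta):
    """Wolfenstein form of Eq. (8.67), valid to O(lambda^3)."""
    return np.array([
        [1 - lam**2 / 2, lam, A * lam**3 * (rho - 1j * eta)],
        [-lam, 1 - lam**2 / 2, A * lam**2],
        [A * lam**3 * (1 - rho - 1j * eta), -A * lam**2, 1.0],
    ])


# Illustrative inputs, assumed for this sketch (not taken from the text).
s12, s23, s13, delta = 0.225, 0.042, 0.0037, 1.2
V = ckm_standard(s12, s23, s13, delta)

# Unitarity: V V^dagger should equal the identity up to rounding.
unitarity_defect = np.max(np.abs(V @ V.conj().T - np.eye(3)))

# Translate the angles into Wolfenstein parameters.
lam = s12
A = s23 / lam**2
rho_minus_i_eta = s13 * np.exp(-1j * delta) / (A * lam**3)
W = ckm_wolfenstein(lam, A, rho_minus_i_eta.real, -rho_minus_i_eta.imag)

# The two forms should differ only by terms beyond O(lambda^3).
max_difference = np.max(np.abs(V - W))

print(np.round(np.abs(V), 4))
print(unitarity_defect, max_difference, lam**4)
```

The printed magnitudes show the hierarchy described above: diagonal entries close to one and off-diagonal entries suppressed by increasing powers of $\lambda$.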

8.5.5 The strong CP parameter

The above counting of parameters is done at the classical level. Usually, when quantizing a

system, the number of parameters is not changed. Yet, there are exceptions that are related to


non-Abelian gauge groups. In the SM it turns out that there is one more renormalizable parameter

that is unphysical at the classical level but is physical at the quantum level. This parameter is

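called $\theta_{\rm QCD}$:

$$ \mathcal{L}_{\theta_{\rm QCD}} = \frac{\theta_{\rm QCD}}{32\pi^2}\,\epsilon_{\mu\nu\rho\sigma}G^{\mu\nu a}G^{\rho\sigma a}. \tag{8.68} $$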

This term violates P and CP. In particular, it leads to an electric dipole moment (EDM) of the

neutron, $d_n$. The experimental upper bound on the EDM of the neutron,

$$ d_n < 2.9 \times 10^{-26}\ e\,{\rm cm}, \tag{8.69} $$

implies a very small value for $\theta_{\rm QCD}$, $\theta_{\rm QCD} \lesssim 10^{-9}$. The problem of why $\theta_{\rm QCD}$ is so small is known as

the strong CP problem. We do not discuss it any further here. We just conclude that the number

of independent parameters in the quantum SM is thus 19: the 18 mentioned above and θQCD.

8.6 P, C and CP

Just like in the LSM, also the full SM violates C and P as it is a chiral theory. For CP, however,

the situation is different. The SM violates CP while the LSM conserves it.

The single phase of the CKM matrix is the only source of CP violation within the SM. Note

that the fact that CP is violated in the SM is closely related to the number of generations as three

is the minimal number that exhibits a phase.

Various parameterizations differ in the way that the freedom of phase rotation is used to leave

a single phase in V . One can define, however, a CP violating quantity in V that is independent of

the parametrization. This quantity, the Jarlskog invariant, JCKM, is defined through

$$ {\rm Im}\left(V_{ij}V_{kl}V_{il}^*V_{kj}^*\right) = J_{\rm CKM}\sum_{m,n=1}^{3}\epsilon_{ikm}\epsilon_{jln}, \qquad (i, j, k, l = 1, 2, 3). \tag{8.70} $$

In terms of the explicit parameterizations given above, we have

$$ J_{\rm CKM} = c_{12}c_{23}c_{13}^2 s_{12}s_{23}s_{13}\sin\delta \approx \lambda^6 A^2 \eta. \tag{8.71} $$
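Eq. (8.71) can be checked numerically. The sketch below evaluates the quartet ${\rm Im}(V_{us}V_{cb}V_{ub}^*V_{cs}^*)$, which is Eq. (8.70) with $(i,j,k,l) = (1,2,2,3)$, and compares it with the closed form in terms of angles and with the leading Wolfenstein estimate. The numerical inputs are illustrative assumptions, not values from the text.

```python
import numpy as np

# Illustrative inputs, assumed for this sketch (not taken from the text).
s12, s23, s13, delta = 0.225, 0.042, 0.0037, 1.2
c12, c23, c13 = (np.sqrt(1.0 - s**2) for s in (s12, s23, s13))
phase = np.exp(1j * delta)

# The four elements of Eq. (8.66) that enter the chosen quartet.
V_us = s12 * c13
V_cb = s23 * c13
V_ub = s13 * np.conj(phase)
V_cs = c12 * c23 - s12 * s23 * s13 * phase

# Eq. (8.70) with (i, j, k, l) = (1, 2, 2, 3); the epsilon sum equals +1.
J_quartet = np.imag(V_us * V_cb * np.conj(V_ub) * np.conj(V_cs))

# Closed form of Eq. (8.71) in the standard parametrization.
J_angles = c12 * c23 * c13**2 * s12 * s23 * s13 * np.sin(delta)

# Leading Wolfenstein estimate, using s12 = lam, s23 = A lam^2,
# s13 sin(delta) = A lam^3 eta.
lam = s12
A = s23 / lam**2
eta = s13 * np.sin(delta) / (A * lam**3)
J_wolfenstein = lam**6 * A**2 * eta

print(J_quartet, J_angles, J_wolfenstein)
```

The first two numbers agree to rounding, while the Wolfenstein estimate differs by a few percent, as expected from the neglected higher orders in $\lambda$.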

While there is room for CP violation in the SM, so that we expect that indeed CP is violated,

this is not necessarily the case. A necessary and sufficient condition for CP violation in the quark

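sector of the SM is given by

$$ \Delta m^2_{tc}\,\Delta m^2_{tu}\,\Delta m^2_{cu}\,\Delta m^2_{bs}\,\Delta m^2_{bd}\,\Delta m^2_{sd}\,J_{\rm CKM} \neq 0, \tag{8.72} $$

where $\Delta m^2_{ij} \equiv m_i^2 - m_j^2$. Eq. (8.72) puts the following requirements on the SM in order that CP is violated: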

1. Within each quark sector, there should be no mass degeneracy;


2. None of the three mixing angles should be zero or π/2;

3. The phase should be neither 0 nor π.

These conditions can also be written as a single requirement on the quark mass matrices in the

interaction basis:

$$ X_{CP} \equiv {\rm Im}\left\{\det\left[M_dM_d^\dagger,\, M_uM_u^\dagger\right]\right\} \neq 0 \iff \text{CP violation}. \tag{8.73} $$

This is a convention independent condition.
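The relation between Eqs. (8.72) and (8.73) can be illustrated numerically. The sketch below uses a random unitary matrix as a stand-in for a generic mixing matrix and toy mass values (both are assumptions made only for this example). It checks the standard identity that the magnitude of ${\rm Im}\det$ of the commutator equals $2|J|$ times the product of the six mass-squared differences, and that the invariant vanishes when two down-type masses are degenerate. The helper names are hypothetical.

```python
import numpy as np

# A random unitary matrix stands in for a generic mixing matrix (assumption).
rng = np.random.default_rng(0)
Z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
V, _ = np.linalg.qr(Z)


def x_cp(m_up, m_down, V):
    """Im det[M_d M_d^dagger, M_u M_u^dagger] in the basis where M_u is diagonal."""
    H_u = np.diag(np.square(m_up)).astype(complex)
    H_d = V @ np.diag(np.square(m_down)) @ V.conj().T
    commutator = H_d @ H_u - H_u @ H_d
    return np.imag(np.linalg.det(commutator))


def mass_factor(m):
    """Product of the three mass-squared differences within one sector."""
    m2 = np.square(m)
    return (m2[2] - m2[1]) * (m2[2] - m2[0]) * (m2[1] - m2[0])


# Toy masses in arbitrary units, chosen to keep the arithmetic well conditioned.
m_up = np.array([1.0, 2.0, 3.0])
m_down = np.array([1.5, 2.5, 4.0])

# Jarlskog invariant from one quartet of matrix elements, as in Eq. (8.70).
J = np.imag(V[0, 1] * V[1, 2] * np.conj(V[0, 2]) * np.conj(V[1, 1]))

lhs = abs(x_cp(m_up, m_down, V))
rhs = 2 * abs(J) * abs(mass_factor(m_up) * mass_factor(m_down))

# Degenerate down-type masses: the invariant should vanish.
x_degenerate = x_cp(m_up, np.array([1.5, 1.5, 4.0]), V)

print(lhs, rhs, x_degenerate)
```

Because the check works for an arbitrary unitary matrix, it also illustrates that the condition does not depend on the phase convention chosen for V.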

8.6.1 Unitarity Triangles

A very useful concept is that of the unitarity triangles. The unitarity of the CKM matrix leads to

various relations among the matrix elements. Of particular interest are the six orthogonality relations between pairs of distinct columns or rows of V, for example

$$ \sum_i V_{id}V_{is}^* = 0. \tag{8.74} $$

These relations require the sum of three complex quantities to vanish. Therefore, they can be

geometrically represented in the complex plane as a triangle and are called “unitarity triangles.” It

is a feature of the CKM matrix that all unitarity triangles have equal areas. Moreover, the area of

each unitarity triangle equals |JCKM|/2 while the sign of JCKM gives the direction of the complex

vectors around the triangles.

The triangle which corresponds to the relation

$$ V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0 \tag{8.75} $$

has its three sides of roughly the same length. Furthermore, both the length of its sides and

its angles are experimentally accessible in practice. For these reasons, the term “the unitarity

triangle” is reserved for Eq. (8.75). We further define the rescaled unitarity triangle. It is derived

from (8.75) by choosing a phase convention such that $V_{cd}V_{cb}^*$ is real and dividing the lengths of all sides by $|V_{cd}V_{cb}^*|$. The rescaled unitarity triangle is similar to the unitarity triangle. Two vertices

of the rescaled unitarity triangle are fixed at (0,0) and (1,0). The coordinates of the remaining

vertex correspond to the Wolfenstein parameters (ρ, η). The unitarity triangle is shown in Fig.

8.1.

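The lengths of the two complex sides are

$$ R_u \equiv \left|\frac{V_{ud}V_{ub}}{V_{cd}V_{cb}}\right| = \sqrt{\rho^2 + \eta^2}, \qquad R_t \equiv \left|\frac{V_{td}V_{tb}}{V_{cd}V_{cb}}\right| = \sqrt{(1-\rho)^2 + \eta^2}. \tag{8.76} $$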

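The three angles of the unitarity triangle are defined as follows:

$$ \alpha \equiv \arg\left[-\frac{V_{td}V_{tb}^*}{V_{ud}V_{ub}^*}\right], \qquad \beta \equiv \arg\left[-\frac{V_{cd}V_{cb}^*}{V_{td}V_{tb}^*}\right], \qquad \gamma \equiv \arg\left[-\frac{V_{ud}V_{ub}^*}{V_{cd}V_{cb}^*}\right]. \tag{8.77} $$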

They are physical quantities and can be independently measured, as we discuss below. Another

commonly used notation is $\phi_1 = \beta$, $\phi_2 = \alpha$, and $\phi_3 = \gamma$. Note that in the standard parametrization $\gamma = \delta_{\rm KM}$, up to corrections of order $\lambda^4$.
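The geometry of the rescaled unitarity triangle can be made concrete with a short numerical sketch. With the apex at $(\rho, \eta)$, the two complex sides of Eq. (8.76) are $-(\rho + i\eta)$ and $-(1 - \rho - i\eta)$, so the angles of Eq. (8.77) follow from complex arguments. The values of $(\lambda, A, \rho, \eta)$ below are illustrative assumptions, not numbers from the text, and the area comparison holds only to leading order in $\lambda$.

```python
import numpy as np

# Illustrative Wolfenstein parameters, assumed for this sketch.
lam, A, rho, eta = 0.225, 0.82, 0.16, 0.35
apex = complex(rho, eta)

# Lengths of the two complex sides, Eq. (8.76).
R_u = abs(apex)
R_t = abs(1 - apex)

# Angles of Eq. (8.77) written in terms of the apex of the rescaled triangle.
gamma = np.angle(apex)                # angle at the vertex (0, 0)
beta = np.angle(1 / (1 - apex))       # angle at the vertex (1, 0)
alpha = np.angle(-(1 - apex) / apex)  # angle at the apex (rho, eta)
angle_sum = alpha + beta + gamma

# Area: eta/2 for the rescaled triangle; undoing the rescaling multiplies it
# by |V_cd V_cb^*|^2, which is (A lam^3)^2 at leading order.
area_rescaled = eta / 2
area = (A * lam**3) ** 2 * area_rescaled
J_leading = lam**6 * A**2 * eta

print(np.degrees([alpha, beta, gamma]), R_u, R_t)
print(area, J_leading / 2)
```

The three angles sum to $\pi$, and the area reproduces $|J_{\rm CKM}|/2$ at the order to which Eq. (8.71) is quoted.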

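[Figure 8.1: The unitarity triangle. The rescaled triangle has vertices at (0, 0), (1, 0) and $(\rho, \eta)$, with the angles $\gamma$, $\beta$ and $\alpha$ at these vertices, respectively. The two complex sides are $V_{ud}V_{ub}^*/V_{cd}V_{cb}^*$ and $V_{td}V_{tb}^*/V_{cd}V_{cb}^*$.]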

Homework

Question 8.1: Semi-leptonic decays and CKM

A semi-leptonic decay is one where in the final state we have both hadrons and leptons. Here

we consider semileptonic b decays.

1. Draw the tree-level diagram for $b \to u e \nu$ and estimate the diagram.

2. Estimate the ratio

$$ \frac{\Gamma(b \to u e \nu)}{\Gamma(b \to c e \nu)} \tag{8.78} $$

as a function of some CKM matrix element. You can neglect the mass of the electron and use the phase space function for a $1 \to 3$ decay with two massless final states. It is given in Eq. (7.65) and we repeat it here:

$$ f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\log x, \qquad x_f \equiv (m_f/m_b)^2. \tag{8.79} $$

For the quark masses use

$$ \frac{m_c}{m_b} \approx 0.3, \qquad \frac{m_u}{m_b} \approx 0. \tag{8.80} $$

3. Based on the following experimental data, estimate the relevant CKM matrix element ratio:

$$ \frac{\Gamma(b \to u \ell \nu)}{\Gamma(b \to c \ell \nu)} \approx 2 \times 10^{-2}. \tag{8.81} $$

Question 8.2: τ decays

1. Almost always, the muon decays via $\mu^- \to e^- \nu_\mu \bar\nu_e$. Draw the tree level Feynman diagram for this process.

2. The tau lepton is relatively heavy, and can decay purely leptonically and semi-leptonically (that is, also to hadrons). List the possible tree level decay processes of the tau lepton in terms of leptons and quarks. (Note that while the charm quark is lighter than the tau, what we care about is the mass of the lightest charmed hadron, the D meson, which is heavier than the tau.)


3. In terms of CKM elements, estimate the branching ratio of each mode.

4. Within the SM, the tau could not decay by τ− → µ−γ. Explain why.

5. We now move to the neutrino sector. We add three right handed neutrinos to the SM,

$$ N_R^i(1, 1)_0, \qquad i = e, \mu, \tau. \tag{8.82} $$

Now the neutrinos can have Dirac masses just like the other fermions of the SM. (See footnote 1.) Write down

the terms in L which give neutrino masses. Write them both before and after the electroweak

SSB.

6. How many physical parameters are needed now in order to describe the lepton sector? Separate them into masses, mixing angles and phases.

7. In this model, τ− → µ−γ is possible. Still, it is not allowed at tree level. Explain why.

8. We assume that there is no degeneracy in the neutrino sector. We denote the neutrino mass

eigenstates by νi with i = 1, 2, 3 where

m3 > m2 > m1. (8.83)

Draw the leading one loop diagram(s) for τ− → µ−γ.

9. We assume that $m_3 \gg m_2 \gg m_1$ and we take $m_3 = 10^{-1}$ eV. In terms of the neutrino

masses, estimate the branching ratio of τ− → µ−γ. In case you need more information about

the neutrino parameters, just assume some values for them and explain your assumptions.

10. Consider a model with only two right-handed neutrino fields. How many physical parameters

are needed now in order to describe the lepton sector? Separate them into masses, mixing

angles and phases.

Question 8.3: W decays

In this question we study decays of the W boson. In all items you should neglect the effect of

the fermion masses when they are small compared to MW (namely, all the fermions but for the

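top quark). The $W \to \mu^+ \nu_\mu$ width at tree level is given by

$$ \Gamma_\mu \equiv \Gamma(W \to \mu^+ \nu_\mu) = \frac{g^2 m_W}{48\pi} = \frac{G_F m_W^3}{6\sqrt{2}\pi} \approx 227\ {\rm MeV}. \tag{8.84} $$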

Footnote 1: Here we ignore the possibility of a lepton-number-violating mass, the so-called Majorana mass. If you have never heard about Majorana masses, then just ignore this footnote. If you do know what they are, then assume you cannot write them. Majorana masses can be forbidden by imposing lepton number as a global symmetry.


1. What are the predicted widths for Γ(W → e+νe) and Γ(W → τ+ντ )?

2. Write down all the hadronic decays of the W in terms of quarks, that is, $W \to q\bar q'$, and

estimate their widths in terms of Γµ, the CKM elements and the number of colors, Nc = 3.

3. What is roughly ΓW , the total width of the W? Write your answer in GeV, to a precision of

two decimal points.

4. Experimentally ΓW = 2.12(4) GeV . Explain the difference between your result and the

experimental measurement.

Question 8.4: LeptoQuarks

In this question we study some properties of scalar LeptoQuarks (LQs). A LQ is a hypothetical field which couples to a quark and a lepton. For example, $F$, which is a doublet of $SU(2)_L$, couples as

$$ \lambda^{FQE}_{ij}\, \overline{Q_L^i}\, E_R^j\, F, \tag{8.85} $$

where $SU(3)_C$ and $SU(2)_L$ indices are omitted. Here $i$ and $j$ are generation indices. While $\lambda^{FQE}_{ij}$ above is in the flavor basis, in the following you can assume that the rotation to the mass basis is small and can be neglected. Recall that the representations of the SM fermions under the SM gauge group are

$$ Q_L(3, 2)_{1/6}, \quad U_R(3, 1)_{2/3}, \quad D_R(3, 1)_{-1/3}, \quad L_L(1, 2)_{-1/2}, \quad E_R(1, 1)_{-1}. \tag{8.86} $$

1. What is the representation of F under the SM group?

2. What is the electric charge of each of the two components in F?

3. We assume that $F$ does not acquire a VEV, that is, $\langle F \rangle = 0$. Briefly describe the phenomenological problems of $\langle F \rangle \neq 0$.

4. What are the baryon and lepton numbers of F? (Recall that we define the baryon number

of the proton to be +1, so the baryon number of the quarks is 1/3. We define the lepton

number of the electron to be +1.)

5. F contributes to the quark level decay b → sµ+e−. Draw the tree-level Feynman diagram

for this decay.

6. Next we find a bound on mF . (Here we neglect the splitting between the two components

of F .) For this we would like to compare the rate of the LQ mediated b → sµ+e− decay to

that of the W mediated one, b→ ce−ν. Draw the tree-level diagram for b→ ce−ν.


7. Estimate the ratio

$$ \frac{\Gamma(b \to s \mu^+ e^-)}{\Gamma(b \to c e^- \nu)}. \tag{8.87} $$

Express the ratio in terms of $\lambda^{FQE}_{ij}$, $m_F$, $g$, $m_W$, and the CKM matrix elements. You should assume that $m_F \gg m_b$ and that the hadronic matrix elements are similar in the two decays and thus cancel in the ratio.

8. We now assume that $\lambda^{FQE}_{ij} \sim g$ for all $i$ and $j$. Using the experimental data

$$ {\rm BR}(b \to c e^- \nu) \sim 10^{-1}, \qquad {\rm BR}(b \to s \mu^+ e^-) \lesssim 10^{-5}, \tag{8.88} $$

estimate a lower bound on mF .

9. F couples to another combination of a quark and a lepton. Write it down. You can omit

SU(3)C and SU(2)L indices.

10. Besides F there is one more scalar doublet LQ which we denote by G. Find the representation

of G and write down its couplings (the equivalent of (8.85)). You can omit SU(3) and SU(2)

indices.

11. Now we move to other types of scalar LQs. There are several of them, but here we consider

only an $SU(2)_L$ singlet, $R$, that couples to SM fermions as

$$ \lambda^{RDE}\, \overline{D_R}\, E_R^c\, R, \tag{8.89} $$

where the flavor indices are omitted. Recall that $\psi_R^c$ is a left handed field with opposite charges compared to $\psi_R$. In particular, $E_R^c$ is a left handed field which transforms as $(1, 1)_{+1}$ and carries lepton number of $-1$.

12. What is the SM representation of R?

13. $R$ also couples to pairs of quarks. For example, we can have

$$ \lambda^{RUU}\, \overline{U_R^c}\, U_R\, R. \tag{8.90} $$

Show that this term is indeed a singlet under the SM gauge group. Recall that $U_R^c$ is a left handed field which transforms as $(\bar 3, 1)_{-2/3}$. As for $SU(3)$ algebra, recall that $3 \times 3 = 6 + \bar 3$ and $3 \times \bar 3 = 1 + 8$.

14. $R$ breaks both baryon and lepton numbers. That is, the lepton and baryon number assignment of $R$ in (8.89) is inconsistent with that in (8.90). Show that this is indeed the

case.


15. $R$ can mediate proton decay. Draw a Feynman diagram for the decay

$$ P^+ \to \pi^0 \mu^+, \tag{8.91} $$

and give a rough estimate of its rate in terms of the couplings and $m_R$. Be sure to write the flavor dependence of the couplings. (You can assume $m_R \gg 1$ GeV.)

16. Experimental data tell us that the proton lifetime is longer than about $10^{33}$ years. Taking all the couplings of $R$ to be of order one, estimate a lower bound on $m_R$. Express your result in units of GeV. Recall that $\hbar = 1 = 6.58 \times 10^{-22}$ MeV s.

Question 8.5: CPV in the SM

Calculate the number of physical parameters in a SM with 2 and 3 generations, and show that CPV enters for a 3 generation SM but not for one with 2 generations.


Part II

Particle physics


Chapter 9

QCD at the IR

In Chapter 5 we presented the high energy effects of QCD, where the theory is perturbative. In

this chapter we discuss some low energy aspects of QCD.

9.1 The quark model

As we discussed in Chapter 5, QCD is strongly coupled in the IR, and consequently there is no

perturbative expansion in terms of the fundamental DoF, quarks and gluons. What is observed are

bound states which we call hadrons, which are SU(3)C-singlets. What we know from experiments

about hadrons are their masses, lifetimes, spins and electric charges.

Note, however, that the fact that there is no reliable perturbation theory in terms of quarks and

gluons does not imply that there is no way to solve QCD. For example, Lattice QCD is a framework that aims at solving QCD from first principles, but it is not perturbative. Chiral perturbation theory is a systematic expansion that works at low energy, but it uses hadrons, rather than quarks and gluons, as its DoF. These two examples provide us with a lot of understanding of QCD at low energy, and we very briefly discuss them later. In the following we discuss a less rigorous, but still useful, approach known as the quark model.

The idea of the quark model is that all bound states are color singlet combinations of quarks

and anti-quarks. In particular, we assume that the quantum numbers of the hadrons are dictated

by the quantum numbers of the constituent quarks and antiquarks. This is clearly a model, as it

assumes the minimum quark content of the hadron, while for strongly coupled theories the wave

function of the hadron is unavoidably more complex. Yet, the model works surprisingly well, and

gives us a lot of insights into the IR properties of QCD. It is difficult, however, to estimate the

related errors. In this Section, we focus on the spectroscopy of the hadrons. Yet, there are also

properties of the dynamics that can be described in more detailed models.

In the quark model, the simplest hadrons belong to one of three types:

• Mesons, which have quark-antiquark constituents, $M = q\bar q$;

• Baryons, which have three quark constituents, $B = qqq$;

• Antibaryons, which have three antiquark constituents, $\bar B = \bar q\bar q\bar q$.

The lightest mesons are the pions:

$$ \pi^+ = u\bar d, \qquad \pi^0 = \frac{1}{\sqrt{2}}(u\bar u - d\bar d), \qquad \pi^- = d\bar u. \tag{9.1} $$

The lightest baryons are the proton and the neutron:

p = uud, n = udd. (9.2)

Mesons carry no baryon number, baryons carry baryon number that we normalize to B = +1, and

anti-baryons carry baryon number $B = -1$. Note that the letter B is used to denote a baryon and baryon number (as well as a certain type of mesons). Which of these options is meant should be

clear from the context.

The fact that QCD is confining results in an infinite spectrum of bound states. Their names

and Quantum Numbers (QNs) are given in Appendix 9.C. For mesons the notation is similar

to that of the hydrogen atom, as it is a two body system. We have the total spin, J , the spin

combination of the spin of the quark and anti-quark, S, that can be zero or one, and the orbital

angular momentum, L.

The quark model can have states that involve more than two or three quarks and anti-quarks,

that are referred to as exotics. These include tetraquarks ($qq\bar q\bar q$) and pentaquarks ($qqqq\bar q$). While

not usually described as part of the quark model, there are also states involving gluons, in particular

the glueballs made of gluons only. Some of these states have recently been discovered. We do not

elaborate further on them.

9.1.1 Hadron masses

If we could solve QCD we would be able to calculate its IR spectrum. Since, however, QCD is a

strongly coupled theory we cannot do it. Yet, we can make several qualitative statements.

The masses of the hadrons come from two sources: the masses of the constituent quarks and

the binding energy coming from QCD interactions. Let us compare the situation to that of the

hydrogen atom. The mass of a hydrogen atom comes from the masses of its constituent proton

and electron plus a potential energy V = −13.6 eV. Indeed, it can be experimentally checked

that the mass of the hydrogen atom is 13.6 eV less than the sum of the masses of the proton and

electron. To make this statement it is crucial that we are able to physically pull a proton and

electron asymptotically apart and measure their masses independent of their mutual electric field.

For hadrons the situation is more complicated. Confinement implies that we cannot pull

individual quarks apart and measure their masses independently. (We discuss the issues of quark masses in more detail in Appendix 9.A.) If we were able to extract the quark masses at high

scales, where QCD is weak, then we could run these masses down to the IR. The running of the

masses to low energies involves, however, the interactions, which is precisely what we wanted to

separate from our mass measurement. Let us also remark that, unlike atoms, hadrons can have

masses larger than the sum of the masses of their constituent quarks. In other words, the potential

energy can be positive. This, of course, is a signature of confinement.

As concerns quark masses, we split the six quarks into two groups: Light quarks — u, d, s

— which have masses (even though renormalization-scheme dependent) much smaller than ΛQCD,

and heavy quarks — c, b, t — with masses much larger than ΛQCD. The problem we just described

is particularly relevant for the light u, d and s quark masses, since we probe them at energies

much larger than their pole masses (to the extent that a pole mass is at all well defined). The

bottom line is that these quark masses are regularization-scheme dependent. To understand what

is meant by a light quark mass, the regularization scheme that is used must be spelled out. Because

of these complications, it is at times useful to define “constituent quark masses.” The idea is that

we include the binding energy into the mass. In that way, we can define the mass of the u and the

d quark to be a third of the mass of the proton and the neutron. While this definition is somewhat

more intuitive, it does not relate to a fundamental parameter in our Lagrangian. It is almost a

restatement of the definition of ΛQCD.

The situation with the heavy c and b quark masses is different, as the mass of hadrons with

heavy quarks arises mainly from the heavy quark mass. The binding energy is a small correction

of relative order ΛQCD/mc,b.

As concerns the top quark, it is so heavy that its decay width from the weak interaction is larger than the typical width of a QCD resonance: $\Gamma_t \approx 1.4\ {\rm GeV} \gg \Lambda_{\rm QCD}$. This situation is often

described by saying that the top “does not form hadrons.” There are subtleties, however, in this

statement. Because the top decays so quickly, we can never identify the hadron that it forms.

Arguably it is even an ill-posed question to identify the top hadron within the very short time

scale. In principle, though, we can theoretically “turn off” the weak interaction and calculate a

spectrum of top hadrons. This is a very strongly coupled problem that is intractable by known

techniques, but in principle one can calculate the spectrum of top hadrons when the weak force is

neglected.
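The comparison of time scales behind the statement that the top "does not form hadrons" can be made explicit with a short estimate. The sketch below converts the width quoted above into a lifetime and compares it with a typical strong-interaction time scale $\hbar/\Lambda_{\rm QCD}$. The value $\Lambda_{\rm QCD} \approx 0.3$ GeV is an assumption made for this illustration, not a number from the text.

```python
HBAR_GEV_S = 6.582e-25  # hbar in GeV * s

gamma_top = 1.4    # GeV, top width quoted in the text
lambda_qcd = 0.3   # GeV, assumed typical QCD scale for this estimate

# Lifetime of the top quark from its weak decay width.
tau_top = HBAR_GEV_S / gamma_top

# Rough time scale on which QCD binds quarks into hadrons.
t_hadronization = HBAR_GEV_S / lambda_qcd

print(f"tau_top         ~ {tau_top:.2e} s")
print(f"t_hadronization ~ {t_hadronization:.2e} s")
print(f"ratio           ~ {t_hadronization / tau_top:.1f}")
```

Under these assumptions the top decays several times faster than the time needed to form a bound state, which is the quantitative content of the statement above.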

9.1.2 Hadron lifetimes

With some abuse of language, we classify hadrons as either stable particles or unstable resonances.

In this context, stability is not meant in the absolute sense. (In fact, the only truly stable hadron

is the proton.) We can define what we mean by stability in two ways: experiment-related and

theory-related. Experimentally, we can determine the lifetime of a particle (1) by directly measuring its decay width or (2) by measuring a displaced vertex, which is possible if the particle is sufficiently long-lived and has high velocity. We call particles “stable” if their lifetime is large enough to be measured by the latter method (or even larger, so that they escape the detector before decaying). Theoretically, a more rigorous definition is that a stable particle is one that does not decay through QCD interactions. These are the hadrons whose decays necessarily violate the accidental $[U(1)]^6$ symmetry of QCD, see Eq. (5.17). Thus, we define a stable particle as one that either is really

stable or decays only through weak interactions. Resonances are the particles which are not stable.

In fact, some resonances have a width that is so broad (of the order of their mass) that it becomes

very hard to determine whether or not it is a bound state at all. Recall that when we do QFT

we work with states in the asymptotic past and future. When a decay width of a particle is of

the order of its mass, the notion of an asymptotic state becomes ill defined. In the PDG, particles

whose name contains their mass in parenthesis, e.g. ρ(770), are resonances while those that do not

are stable.

As an example of a resonance, consider the ρ meson. This is a bound state of up and down

quark and anti-quark. The dominant contribution to its mass comes from QCD. We thus expect

the mass to be of the order of $\Lambda_{\rm QCD}$, which is hundreds of MeV. Indeed, $m_\rho \approx 775$ MeV. The ρ decays via QCD, almost always to two pions, and its width is very large, $\Gamma_\rho \approx 150$ MeV, just a factor of five smaller than its mass. As an example of a stable particle, consider the charged π meson, which is also a bound state of up and down quark and anti-quark. Its mass is 140 MeV. It decays via the weak interaction to leptons, and its width is of order $10^{-8}$ eV, clearly much smaller than its mass. The ratio $\Gamma_\rho/\Gamma_\pi \sim 10^{16}$ demonstrates the difference between a resonance

and a stable particle, and explains why the weakly decaying hadrons are called stable.
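The numbers in this comparison follow from $\Gamma = \hbar/\tau$. The sketch below converts the charged pion lifetime of Table 9.1 into a width, forms the ratio with the ρ width, and also evaluates the decay lengths $c\tau$, which connect to the experimental definition of stability through displaced vertices. The physical constants are standard values; the variable names are illustrative.

```python
HBAR_EV_S = 6.582e-16     # hbar in eV * s
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV * fm
C_M_PER_S = 2.998e8       # speed of light in m / s

tau_pi = 2.6e-8           # s, charged pion lifetime (Table 9.1)
gamma_rho_mev = 150.0     # MeV, rho width quoted in the text

# Width of the charged pion from its lifetime.
gamma_pi_ev = HBAR_EV_S / tau_pi

# Ratio of the rho width to the pion width.
ratio = gamma_rho_mev * 1e6 / gamma_pi_ev

# Decay lengths c * tau (without any Lorentz boost).
ctau_pi_m = C_M_PER_S * tau_pi
ctau_rho_fm = HBAR_C_MEV_FM / gamma_rho_mev

print(f"Gamma_pi  ~ {gamma_pi_ev:.2e} eV")
print(f"ratio     ~ {ratio:.1e}")
print(f"c tau_pi  ~ {ctau_pi_m:.1f} m")
print(f"c tau_rho ~ {ctau_rho_fm:.1f} fm")
```

The pion travels meters before decaying, while the ρ travels only about a femtometer, a distance comparable to its own size.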

The stable mesons are the lightest $J^P = 0^-$ mesons that are charged under the $[U(1)]^6$ symmetry, and they play an important role in the investigation of flavor physics. Their list is given in Table 9.1. Note that the four neutral meson pairs with well defined $U(1)_s \times U(1)_c \times U(1)_b$ QNs are not mass eigenstates. In fact, each pair mixes into a pair of mesons of well defined masses and decay widths: $(K_S, K_L)$, $(D_H, D_L)$, $(B_H, B_L)$ and $(B_{sH}, B_{sL})$. Here, $(P_H, P_L)$ (for $P = D, B, B_s$) are distinguished by mass (Heavy and Light), while $(K_S, K_L)$ are distinguished by lifetime (Short-lived and Long-lived). The mass splitting within each pair is tiny (see Table ??). The width difference within each of the $D$, $B$ and $B_s$ pairs is tiny, but for the $K$ pair it is very large.

When discussing the difference between stable hadrons and resonances, tritium (a hydrogen isotope with two neutrons) provides a useful analogy. The excited 2P tritium state emits a photon to decay to the 1S ground state on a time scale of the order of a nanosecond. The 1S state decays into helium-3 with a lifetime of the order of 18 years. One can similarly imagine an excited (heavy) B resonance decaying rapidly to the $B^0$ by emitting a pion and then the $B^0$ decaying via the weak interaction on a much longer time scale. One should be comfortable thinking about the plethora of hadrons in analogy to excited hydrogen atoms in basic quantum mechanics. In fact, let us turn to one of the basic features of hydrogen in quantum mechanics: the quantum numbers that describe a state.

Table 9.1: List of the weakly decaying $J^P = 0^-$ mesons with their quark decomposition, $U(1)_s \times U(1)_c \times U(1)_b$ charge, mass and lifetime. (*) For the neutral K-meson mass eigenstates, $K_S$ and $K_L$, the respective lifetimes are $\tau_S = 9.0 \times 10^{-11}$ s and $\tau_L = 5.1 \times 10^{-8}$ s.

Meson              Quark content           (S, C, B)       Mass [GeV]   τ [s]
$\pi^\pm$          $u\bar d,\ \bar u d$    (0, 0, 0)       0.140        2.6 × 10^-8
$K^\pm$            $u\bar s,\ \bar u s$    (±1, 0, 0)      0.494        1.2 × 10^-8
$K^0, \bar K^0$    $d\bar s,\ \bar d s$    (±1, 0, 0)      0.498        (*)
$D^\pm$            $c\bar d,\ \bar c d$    (0, ±1, 0)      1.87         1.0 × 10^-12
$D^0, \bar D^0$    $c\bar u,\ \bar c u$    (0, ±1, 0)      1.87         4.1 × 10^-13
$D_s^\pm$          $c\bar s,\ \bar c s$    (±1, ±1, 0)     1.97         5.0 × 10^-13
$B^\pm$            $u\bar b,\ \bar u b$    (0, 0, ±1)      5.28         1.6 × 10^-12
$B^0, \bar B^0$    $d\bar b,\ \bar d b$    (0, 0, ±1)      5.28         1.5 × 10^-12
$B_s, \bar B_s$    $s\bar b,\ \bar s b$    (∓1, 0, ±1)     5.37         1.5 × 10^-12
$B_c^\pm$          $c\bar b,\ \bar c b$    (0, ±1, ±1)     6.28         5.1 × 10^-13

9.1.3 Hadron quantum numbers

Each hadron can be identified by its mass, and has a well defined width. In addition, the hadrons

are characterized by three kinds of hadronic quantum numbers:

• Exact QNs. These are the electromagnetic charge, Q and the spin, J .

• QNs that are respected by QCD and QED but broken by the weak interaction.

These are the charges under the global $[U(1)]^6$ symmetry, often referred to as flavor QNs. In addition, for mesons we can define the discrete parity $P$, and for neutral mesons also charge conjugation $C$. We define parity as

$$ P = (-1)^{L+1}, \tag{9.3} $$

where $L$ is the relative orbital angular momentum between the quark and anti-quark. The reason that the $L = 0$ state is parity odd has to do with the fact that the quarks are fermions: a fermion and its anti-fermion have opposite intrinsic parities, so the quark-antiquark pair is odd under parity even for $L = 0$. As concerns charge conjugation, charged states are not eigenstates of $C$, while neutral ones are.

• Approximate QNs. The approximate symmetries of the QCD interactions are isospin,

SU(3)-flavor, and heavy quark symmetry. They are discussed in Sections 9.3.1 and 9.3.2.
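The assignment of $J^{PC}$ to quark-antiquark states can be tabulated with a few lines of code. The sketch below uses Eq. (9.3) for parity together with the standard quark-model rule $C = (-1)^{L+S}$ for neutral, flavorless mesons. The rule for $C$ is not stated in the text and is included here as an assumption, and the function name is hypothetical.

```python
def meson_jpc(L, S):
    """List the J^{PC} labels of a quark-antiquark state with given L and S.

    Parity follows Eq. (9.3); C = (-1)^(L+S) is the standard quark-model rule
    for neutral flavorless mesons and is an assumption not spelled out in the text.
    """
    parity = (-1) ** (L + 1)
    charge_conjugation = (-1) ** (L + S)
    # Total spin J runs over |L - S|, ..., L + S.
    j_values = range(abs(L - S), L + S + 1)

    def sign(x):
        return "+" if x > 0 else "-"

    return [f"{J}{sign(parity)}{sign(charge_conjugation)}" for J in j_values]


for L in (0, 1):
    for S in (0, 1):
        print(f"L={L}, S={S}: {meson_jpc(L, S)}")
```

For example, $L = 0$, $S = 0$ gives the pseudo-scalar $0^{-+}$ of the neutral pion, while $L = 0$, $S = 1$ gives the vector $1^{--}$ of the ρ.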


9.2 Combining QCD with the weak interaction

In our review of hadrons we already presented the main problem we face when dealing with QCD:

Our Lagrangians are written in terms of quarks and gluons, but our world is made up of hadrons.

Extracting quark interactions from these hadronic interactions is far from trivial. The situation is

even more problematic. As you recall, the way scattering in QFT is set up is such that we work

with asymptotic states, say electrons and positrons. Quarks and gluons, however, cannot be at

infinity.

The way to go on with this problem is to do what is called “parametrizing our ignorance.” That is, we use symmetries to isolate the parts of the amplitudes that are not perturbative. We may then use approximate symmetries (see below), Lattice QCD (see ???) or models to calculate them. The aim of this section is to introduce the vocabulary that is used in such a process and to see how we isolate the non-perturbative aspects.

The point to emphasize is that there is no equivalent of Feynman diagrams for this task, and in each case we use some combination of the available tools.

9.2.1 Factorization

We start by defining factorization. The basic idea is that we can factorize any process to its

QCD and non-QCD parts. Then we treat the leptonic and electromagnetic part of a process

perturbatively.

Let us recall how we do calculations and start with a simple purely leptonic example. Consider

the process $W \to \ell\nu$. The amplitude, or the matrix element, is written as

$$ A = \langle \ell\nu | O | W \rangle, \tag{9.4} $$

where $O$ is some operator. In the SM, one can write down the relevant tree-level term for $O$,

$$ O = \bar\ell \gamma^\mu W_\mu \nu. \tag{9.5} $$

We can then write down the decay rate as

$$ \Gamma \sim \int |A|^2 \times d(\text{phase space}), \tag{9.6} $$

where in this example the amplitude is trivially $A = g/\sqrt{2}$.

Moving on, we can see what happens when we look at hadronic decays. Consider $\pi^+ \to \ell^+\nu$ in the SM. The matrix element is

$$ A = \langle \ell\nu | O | \pi^+ \rangle, \qquad O \sim \bar\ell\gamma_\mu\nu\ \bar u\gamma^\mu d. \tag{9.7} $$

The leptonic part of this matrix element is simple, precisely because we have asymptotic external states which are contracted with the creation and annihilation operators in $O$. The complication is in the $\pi^+$. In the quark model the charged pion is made out of $u\bar d$ and thus we can think about the $\bar u\gamma^\mu d$ combination in the operator as the one that annihilates it. Yet, this is not the whole story, as the operator annihilates a free quark and anti-quark, while those inside the pion are bound and are not free.

What we are doing here is using the factorization hypothesis. It states that we can treat the leptonic and hadronic parts of the matrix element separately and write the result as a product. This assumption is very solid: the leptons do not participate in the strong interaction and thus they can be factored out.

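This means that our matrix element should also factorize,

$$ \langle \ell\nu | O | \pi^+ \rangle = \langle \ell\nu | O_\ell | 0 \rangle \times \langle 0 | O_H | \pi^+ \rangle, \tag{9.8} $$

with

$$ O_\ell \sim \bar\ell\gamma_\mu\nu, \qquad O_H \sim \bar u\gamma^\mu d. \tag{9.9} $$

The first factor on the right-hand side is simple to calculate perturbatively. The second, $\langle 0 | O_H | \pi^+ \rangle$, is the one that we cannot calculate perturbatively. It is this factor that we will parametrize, and refer to in general as a “hadronic matrix element.”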

We can move on and discuss more complicated hadronic decays. Consider $K^+ \to \pi^0\ell^+\nu$. The matrix element is

$$ A = \langle \pi^0\ell\nu | O | K^+ \rangle, \qquad O \sim \bar\ell\gamma_\mu\nu\ \bar u\gamma^\mu s. \tag{9.10} $$

The leptonic part of this matrix element is simple, precisely because we again have asymptotic external states which appear as creation and annihilation operators in $O$. What about the $K^+$ and the $\pi^0$? Once again we appeal to factorization and write

$$ \langle \pi^0\ell\nu | O | K^+ \rangle = \langle \ell\nu | O_\ell | 0 \rangle \langle \pi^0 | O_H | K^+ \rangle. \tag{9.11} $$

We now move to discuss how we deal with such unknown hadronic matrix elements. Those with one hadron are related to a decay constant, and those that relate two hadrons are parametrized using form factors. Matrix elements with more than two hadrons do not have any special names. We discuss decay constants and form factors below.

9.2.2 The decay constant

Consider

$$ \langle 0 | O_H | \pi^+ \rangle, \qquad O_H \sim \bar u\,\Gamma\, d, \tag{9.12} $$

for some Dirac structure $\Gamma$. We can always simplify the Dirac structure to one of $S, P, V, A, T$ (scalar, pseudo-scalar, vector, axial vector, tensor).


Our main tool in going forward is to use symmetries to reduce the problem. In the above case we will show that the matrix elements of some of the operators vanish due to the exact symmetries of QCD (that is, parity and Lorentz invariance).

We know that the π is a pseudo-scalar and the vacuum is parity-even, thus $O$ must be parity-odd; that is, we know that the matrix elements with $\Gamma = S, V, T$ vanish.

What can we say about the nonvanishing ones? We only discuss here the axial vector one, $\langle 0 | \bar u\gamma^\mu\gamma_5 d | \pi \rangle$, as it is the one relevant to the SM (the pseudo-scalar one is given as homework). We do not know how to calculate this object, but we can parametrize it. To do so, note that it must be a Lorentz vector and thus must be built from the vectors available in the problem. In this case there is only one vector, the pion momentum. We then conclude that $\langle 0 | A^\mu | \pi \rangle \propto p^\mu$. The proportionality constant can depend only on Lorentz scalar quantities. In our case there is only $p^2 = m_\pi^2$, which is not a dynamical variable. We can then define the proportionality constant as the decay constant, $f_\pi$, and write

$$ \langle 0 | A^\mu | \pi \rangle \equiv -i p^\mu f_\pi. \tag{9.13} $$

The decay constant can then be measured using one of the processes in which it is involved. (See one of the homework questions.) Once it is measured, it can be used in all other processes. Numerically,

$$ f_\pi \sim 131\ {\rm MeV}. \tag{9.14} $$

Note that at times one also uses $F_\pi \equiv f_\pi/\sqrt{2} \approx 93$ MeV. Similar decay constants can be defined for other hadrons.
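To illustrate how a measured decay constant feeds into a physical rate, the sketch below evaluates the standard tree-level expression for the leptonic width, $\Gamma(\pi^+ \to \ell^+\nu) = G_F^2 |V_{ud}|^2 f_\pi^2 m_\pi m_\ell^2 (1 - m_\ell^2/m_\pi^2)^2/(8\pi)$, in the normalization where $f_\pi \approx 131$ MeV. This formula is not derived in the text and is quoted here as an assumption, as are the input values of $G_F$, $|V_{ud}|$ and the masses. Radiative corrections are ignored.

```python
import math

G_F = 1.1664e-5        # Fermi constant in GeV^-2 (standard value)
V_UD = 0.974           # |V_ud|, assumed input
F_PI = 0.131           # GeV, decay constant quoted in the text
M_PI = 0.1396          # GeV, charged pion mass
M_MU = 0.1057          # GeV, muon mass
HBAR_GEV_S = 6.582e-25 # hbar in GeV * s


def leptonic_width(m_lepton):
    """Tree-level width of pi+ -> l+ nu in GeV, for f_pi ~ 131 MeV normalization."""
    helicity_and_phase_space = m_lepton**2 * (1 - m_lepton**2 / M_PI**2) ** 2
    return G_F**2 * V_UD**2 * F_PI**2 * M_PI * helicity_and_phase_space / (8 * math.pi)


gamma_mu = leptonic_width(M_MU)

# The muonic mode dominates, so its width approximately fixes the lifetime.
tau_pi = HBAR_GEV_S / gamma_mu

# The alternative normalization of the decay constant.
F_pi_alt = F_PI / math.sqrt(2)

print(f"Gamma(pi -> mu nu) ~ {gamma_mu * 1e9:.2e} eV")
print(f"tau_pi             ~ {tau_pi:.2e} s")
print(f"F_pi               ~ {F_pi_alt * 1e3:.1f} MeV")
```

The resulting lifetime is close to the $2.6 \times 10^{-8}$ s listed in Table 9.1, which is how this relation can be turned around to extract $f_\pi$ from data.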

Before we move on we discuss the physics of the decay constant. So far it has appeared just as a parameter. To gain some physical intuition we use a perturbative system, positronium. This is an $e^+e^-$ bound state that decays into photons, and thus should have a decay constant, just like the pion. Unlike the pion case, however, here we can calculate it using QED. While we do not go over the calculation here, let us discuss the physics. The relevant question is how positronium decays. Semiclassically this occurs when the electron and positron touch, that is, when they are in the same place. Quantum mechanically it must then be proportional to the wave function at $r = 0$ (recall that $r$ is the distance between the electron and the positron). The decay constant of positronium can thus be heuristically interpreted as something like $|\Psi(r = 0)|^2$.

The above intuition carries over to QCD. The physics of a decay constant has to do with the wave function at the origin. While we do not know the wave function, at times we know general scaling properties of it. We discuss some of them later.

9.2.3 Form factors

We now move on to discuss matrix elements between two hadrons. We do it via an example which
will be useful later on: the β decay of a neutron into a proton, n → p e− ν̄. Let us focus only on
the vector current. The relevant matrix element, which we cannot calculate from first principles, is

〈p+(p′)|dγµu|n(p)〉. (9.15)

To parametrize it we must write out the most general linear combination of kinematic variables

(and products of those variables). The matrix element is a Lorentz vector and thus the result must

depend on the Lorentz vectors we have in hand, that is, pµ and p′µ. Explicitly we then write

〈p+(p′)|dγµu|n(p)〉 ∼ apµ + bp′µ, (9.16)

where the coefficients a and b are our form factors. We know that these can only depend on
Lorentz scalars; there are three of these available: p², p′², and p · p′. The first two are just masses
and are not dynamical, so we are left with only the third one. We perform a basis change, use
p ± p′ as the basis, and define

q ≡ p − p′, q² = p² + p′² − 2p · p′. (9.17)

We then rewrite our form factors as f±, which are functions of q², and thus write our matrix element
as

〈p+(p′)|dγµu|n(p)〉 = f+(q2)(p+ p′)µ + f−(q2)(p− p′)µ. (9.18)

For now we will leave this definition of the f± form factors. Below we will apply it to the determi-

nation of the |Vud| CKM matrix element and see how, in specific cases, this structure can be made

to simplify even further.
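To make the single dynamical variable q² concrete, here is a minimal numerical sketch for neutron β decay. It uses the nucleon masses quoted later in this chapter; the recoil momentum k, the metric convention (+, −, −, −), and all function names are assumptions chosen for illustration.

```python
import math


def minkowski_dot(a, b):
    """Dot product with metric (+, -, -, -); vectors are (E, px, py, pz)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def q_squared(p, p_prime):
    """q^2 for q = p - p'."""
    q = tuple(x - y for x, y in zip(p, p_prime))
    return minkowski_dot(q, q)


M_N = 939.565  # neutron mass in MeV
M_P = 938.272  # proton mass in MeV

# Neutron at rest; proton recoiling along z with an illustrative momentum (MeV).
k = 0.5
p = (M_N, 0.0, 0.0, 0.0)
p_prime = (math.sqrt(M_P**2 + k**2), 0.0, 0.0, k)

q2_direct = q_squared(p, p_prime)
q2_from_scalars = M_N**2 + M_P**2 - 2.0 * minkowski_dot(p, p_prime)
q2_max = (M_N - M_P) ** 2  # reached when the proton is produced at rest
```

Both ways of computing q² agree, as Eq. (9.17) requires. In this decay q² never exceeds (mn − mp)², which is tiny compared to hadronic scales, so the form factors are probed only very close to q² = 0.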

The same procedure applies in more complicated cases: one identifies all possible Lorentz
structures and the Lorentz scalars that the form factors can depend on. In the homework
there are more examples to work out.

For the decay constant we explained that it is related to the wave function at the origin. What is
the physical intuition for a form factor? The answer is that it corresponds to the wave function
overlap of the two hadrons. This is not a precise statement, as the two hadrons have different
constituents, and thus different wave functions. What we refer to is the relevant wave function
after the transition. The intuition can be borrowed from the sudden approximation in quantum
mechanics. When a system changes very fast, the probability to end up in a given state
of the new system is proportional to the wave function overlap of the original state and the final
one. This is indeed what we see here: there is a sudden transition due to the weak interaction.
The form factors are just a more formal way to encode the wave function overlap.

9.2.4 Lattice QCD

Discuss Lattice, what it is, and how it is used


9.3 The approximate symmetries of QCD

In this section we discuss some approximate symmetries of QCD. They are useful for relating a priori
unrelated quantities to each other. For example, they were used in the past to predict masses of
undiscovered particles. As concerns the weak interaction, and short distance physics in general,
these symmetries can be used, for example, to relate hadronic matrix elements to each other. Using
such relations is important when we try to disentangle the weak interaction effects. We will see
examples of this later.

9.3.1 The approximate symmetries of QCD: light quarks

Consider the QCD Lagrangian for the up and the down quarks [see Eq. (5.11)]:

$$ \mathcal{L}_{\rm QCD} = -\frac{1}{4} G^{\mu\nu}_a G^a_{\mu\nu} - i\bar q \slashed{D} q - m_q \bar q q, \qquad q = u, d. \qquad (9.19) $$

Since mu ≠ md, this Lagrangian has a [U(1)]² flavor symmetry. If, however, we had mu = md,
the flavor symmetry would be U(2) = SU(2) × U(1), where the U(1) symmetry is just baryon
number, under which the up and down quarks transform in the same way. The SU(2) part is called
isospin symmetry. Under isospin the up and down quarks form a doublet:

$$ Q = \begin{pmatrix} u \\ d \end{pmatrix}, \qquad \bar Q = \begin{pmatrix} \bar d \\ \bar u \end{pmatrix}. \qquad (9.20) $$

The up and down quarks are much lighter than ΛQCD. In particular, md − mu ≪ ΛQCD. Given
that, the isospin symmetry is an approximate symmetry of the QCD interactions, broken by a
small parameter:

$$ \varepsilon_I \equiv \frac{m_d - m_u}{\Lambda_{\rm QCD}} \sim 10^{-2}. \qquad (9.21) $$

One might wonder why the QCD scale enters here. In general, what we have to compare the mass
difference to is the relevant energy scale of the process. Due to confinement this energy is at least of order
the QCD scale, so the symmetry breaking parameter may be smaller, but not larger, than εI.
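As a rough numerical illustration of the size of the isospin breaking parameter, the sketch below evaluates (md − mu)/ΛQCD. The quark masses and the value of ΛQCD are illustrative inputs that are not taken from the text; quark masses in particular are scheme dependent, so only the order of magnitude is meaningful.

```python
# Illustrative inputs in MeV (assumptions, not values given in the text).
M_UP = 2.2
M_DOWN = 4.7
LAMBDA_QCD = 300.0


def isospin_breaking(m_up, m_down, lambda_qcd):
    """Isospin breaking parameter (m_d - m_u) / Lambda_QCD."""
    return (m_down - m_up) / lambda_qcd


eps_I = isospin_breaking(M_UP, M_DOWN, LAMBDA_QCD)
print(f"eps_I ~ {eps_I:.1e}")
```

With these inputs the result is just below 10⁻², in line with the estimate quoted for εI.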

In addition to εI, the isospin symmetry is broken by QED due to the different charges of the
up and down quarks. The size of this breaking is of order α ∼ 0.01, which is of similar order as εI.
The weak interaction also breaks isospin. Yet, in most applications the weak interaction effect is small
enough that we can neglect it.

What are the implications of this approximate symmetry? The symmetry implies that the

hadron mass eigenstates can be assigned into well defined representations of isospin, where the

hadrons within an isospin multiplet are approximately degenerate. This statement is based on a

symmetry of the QCD Lagrangian and not on any quark model. We use the quark model to assign

the mass eigenstates into quark representations.

Consider, for example, the baryons. The two lightest baryons are the neutron and the proton,
which are almost degenerate: mp = 938.272 MeV and mn = 939.565 MeV. It is natural to assume
that they form an isospin doublet. At a somewhat higher scale, m∆ ≈ 1.23 GeV, there are the four
nearly degenerate ∆ states with charges Q = −1, 0, +1, +2, which fit nicely into an isospin quartet.
Within the quark model a baryon is made of three quarks. Thus, in terms of isospin, the baryons
that are formed from u and d quarks are a product of three isospin doublets. This product gives

$$ \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{2} + \frac{3}{2}. \qquad (9.22) $$

(The other isospin-half in the combination is antisymmetric and thus vanishes for identical dou-

blets.) For the nucleons, the quark assignment is given in Eq. (9.2), while that of ∆ is given

by

∆− = ddd, ∆0 = ddu, ∆+ = duu, ∆++ = uuu. (9.23)

The fact that each multiplet is almost degenerate is a result of the approximate symmetry. The
difference of O(ΛQCD) between the masses of the nucleon doublet and the ∆ quartet is due to the
fact that the QCD interaction that gives the binding energy is very different for different representations
of the isospin group.
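The multiplet counting behind the product of three isospin doublets can be checked by the standard addition of SU(2) representations. The block below is a sketch of that bookkeeping; the symbols ⊗ and ⊕ are notation introduced here, not used in the text, and the second column counts the number of states on each side.

```latex
\begin{align}
\tfrac12 \otimes \tfrac12 &= 0 \oplus 1,
  & 2 \times 2 &= 1 + 3, \\
\tfrac12 \otimes \tfrac12 \otimes \tfrac12
  &= (0 \oplus 1) \otimes \tfrac12
   = \tfrac12 \oplus \tfrac12 \oplus \tfrac32,
  & 2 \times 2 \times 2 &= 2 + 2 + 4 .
\end{align}
```

The isospin-3/2 quartet matches the four ∆ states and one isospin-1/2 doublet matches the nucleons. The second isospin-1/2 doublet is the one that the parenthetical remark after Eq. (9.22) refers to.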

The situation with the mesons is similar. Among the JP = 0− mesons, we identify the three
pions, which are quasi-degenerate (mπ± = 139.57 MeV, mπ0 = 134.98 MeV), as an isospin triplet,
while the heavier η meson (mη = 547.86 MeV) is an isospin singlet. Indeed, the combination of
two isospin doublets gives

$$ \frac{1}{2} \times \frac{1}{2} = 0 + 1. \qquad (9.24) $$

The quark model assignment of the pions is given in Eq. (9.1), while that of η is given by

$$ \eta^0 = \frac{u\bar u + d\bar d}{\sqrt{2}}. \qquad (9.25) $$

Isospin can play an important role in weak decays. For example, it is used to predict some
form factors: the neutron to proton form factor in beta decay is predicted to be unity in the
isospin limit. We use this result in Section (measuring Vud).

We close this subsection by remarking that we can also treat the s quark as light. In that
situation the approximate symmetry of the QCD Lagrangian is SU(3), which is usually called “flavor
SU(3).” We discuss some details of it in the appendix.

9.3.2 The approximate symmetries of QCD: heavy quarks

Heavy quark symmetry is different from the symmetries which we are already used to. For example,
in chiral symmetry we know that there is a parameter in the Lagrangian which we can set to zero
to yield a larger symmetry. In heavy quark symmetry we take the limit where the heavy quark
mass goes to infinity. This is not a symmetry which is manifest in the Lagrangian, as one cannot
take parameters to infinity. Yet, just like other symmetries, it leads to very useful results.

Most of the essential physics of heavy quark effective theory is contained in the undergraduate
quantum mechanics of the hydrogen atom. Let’s start with the following question: What is the
difference between hydrogen and deuterium? (Deuterium is a hydrogen atom with an additional
neutron in the nucleus.) In chemistry they are called isotopes and are treated as the same as far as the chemistry
goes. We know that technically hydrogen and deuterium differ by their quantum numbers under the
Lorentz group: they have different mass and spin. Why, then, is it that these two distinguishable
particles are basically the same in chemistry?

The reason why hydrogen and deuterium are chemically basically the same is that the electrons

really don’t care about the nucleus. Or, to put it in more general terms the light degrees of freedom

are insensitive to the heavy degrees of freedom that source the potential, and it is the light DoF that

do the chemistry. To leading order in the inverse mass of the nucleus the potential is insensitive

to the mass and spin of the actual nuclear source of the electromagnetic field.

What do the corrections to that leading order statement look like? The relevant one is the
hyperfine splitting:

$$ E_n \sim \frac{m_e \alpha^2}{n^2}, \qquad \Delta E_{\rm hf} \sim m_e \alpha^4 \frac{m_e}{m_p}. \qquad (9.26) $$

Note the extra me/mp suppression of the hyperfine splitting. Physically this is telling us that
in the limit of infinite mass one cannot rotate the proton to reverse its dipole moment. For
deuterium, where the nucleus has spin one rather than spin one-half, the hyperfine splitting is different. So we see two effects of
adding a neutron: it changes the mass and the spin of the nucleus. Both of these affect the hyperfine splitting.
The other difference comes from the larger nuclear mass, which leads to a different reduced mass; this effect
again scales like 1/mp.
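To get a feeling for how strongly the 1/mp effects are suppressed, one can evaluate the two scales of Eq. (9.26) numerically. This is only an order-of-magnitude sketch: all O(1) factors are dropped, exactly as in the text, and the constants and function names are inputs supplied here for illustration.

```python
ALPHA = 1 / 137.036   # fine-structure constant
M_E = 0.510999e6      # electron mass in eV
M_P = 938.272e6       # proton mass in eV


def gross_level(n):
    """Scale of the n-th level, m_e alpha^2 / n^2, in eV (O(1) factors dropped)."""
    return M_E * ALPHA**2 / n**2


def hyperfine_scale():
    """Scale of the hyperfine splitting, m_e alpha^4 (m_e / m_p), in eV."""
    return M_E * ALPHA**4 * (M_E / M_P)


ratio = hyperfine_scale() / gross_level(1)
print(f"gross scale     ~ {gross_level(1):.1f} eV")
print(f"hyperfine scale ~ {hyperfine_scale():.1e} eV")
print(f"ratio           ~ {ratio:.1e}")
```

The ratio is α² me/mp, of order 10⁻⁸, which shows how well the light degrees of freedom decouple from the details of the heavy source.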

The two differences between hydrogen and deuterium lead to energy splittings, but both effects
decouple in the limit where the nucleus is taken to infinite mass, since both scale like 1/mnucl.
The infinite nuclear mass limit is the precise analog of the heavy quark limit that we are using.
The heavy mesons which we consider, the B and D mesons, are similar to the hydrogen system,
except that the binding comes not from electromagnetism but from QCD.

Yet there is a very big difference between the electromagnetic potential and the QCD potential:

while the former is perturbative, the latter is hopelessly non-perturbative. Since virtual partons

are O(1) effects, we should not even be able to say that the B meson is composed of a b and a

light quark, say u.

This is precisely the issue of the brown muck, the dirty nonperturbative physics that should

make the heavy mesons intractable. Our goal is to get around the brown muck without getting

ourselves too dirty; we want to find ways to calculate the things that we care about without having

to understand the intractable physics of the muck.

Now assume that mc, mb ≫ ΛQCD. Just as we could figure out, to leading order, everything
about deuterium once we understood hydrogen, we can similarly determine the rest of the B
spectrum simply based on the D spectrum and the lowest B mass as a reference point. Further,
we are able to do this without knowing the physics of the brown muck, just as we didn’t have to
re-derive the deuterium spectrum from scratch. The situation is in fact simpler, since the B and
the D have the same spin. The only difference between the D mesons and the B mesons is the mass of the heavy
quark and, as we said before, the light degrees of freedom just don’t care.

For example, in the mb →∞ limit, we have the relation

$$ \frac{m_{B^*} - m_B}{m_{B^*} + m_B} = 0. \qquad (9.27) $$

The B and B∗ are degenerate in the heavy quark limit. In the language of symmetry, we say that
these two states form an SU(2) doublet with respect to the spin orientation of the heavy
quark. (Recall that these are just the singlet and triplet states of the hyperfine structure.)

Further, up to some normalization we know that

〈B|O|D〉 = 1. (9.28)

What we mean by this is that in this transition the light degrees of freedom just don’t care. The

b→ c transition hasn’t changed the QCD potential which the light quark feels. In fact, it doesn’t

matter what the operator O is that enacts the b → c decay. This is just a statement about the

wavefunction overlap.

We can now see that heavy quark symmetry is very different from the usual symmetries that
we work with in effective field theory. In normal effective field theories we integrate out some heavy
degrees of freedom. In the chiral Lagrangian, for example, we can take things to zero and see
how we recover symmetries. Heavy quark symmetry is different in a very fundamental way. The
M → ∞ limit does not increase the symmetry of the Lagrangian. In fact, we will only integrate
out part of the degrees of freedom.

When we do a regular EFT, we expand about the vacuum and integrate out high frequency

modes. In heavy quark effective theory, we will expand about the background of a single heavy

quark, say a b. In such a background one assumes that the heavy quark is there classically; we

get it ‘for free’ without having to consider how it might pop out of the vacuum. If we want any

additional heavy quarks, however, we have to honestly pay the cost of introducing an additional

high frequency mode to the background.

Let’s now consider the spectrum of mesons containing a heavy quark. This turns out to be

very simple and can be done even before dipping into heavy quark effective theory. The main idea

is to consider an expansion in powers of ΛQCD/mQ, where mQ is the mass of the quark. Due to the

ambiguities in defining a quark mass, we should really say that it is some effective mass—a pole

mass, MS mass, whatever—which is identified with a physical mass by some prescription. The

mass of a hadron containing this heavy quark, mH , is

mH = mQ + Λ + a/mQ + · · · . (9.29)


This is a trivial parameterization where we’ve explicitly written out the zeroth, first, and second

order terms in our expansion. The leading term just reminds us that in the heavy quark limit

the mass of the hadron and the mass of the quark are basically the same. The sub-leading terms

all depend on the light degrees of freedom; the Λ is just the mass of the light degrees of freedom

and the a term parameterizes the interaction between the light degrees of freedom and the heavy

quark. Note that Λ doesn’t know about the mass or spin of the heavy quark, it is a parameter that

only deals with the light degrees of freedom independent of the heavy degree of freedom sourcing

the QCD potential. Only the interaction term takes these into account; this is just the hyperfine

splitting in hydrogen.

Consider the case when the hadron H is a B or B∗. One only observes a difference between
the B and B∗ masses at O(1/mQ), i.e. in the a term. On the other hand, the difference between the
B and the Bs masses occurs at O(mQ⁰), i.e. in the Λ term.

The information contained in a is the reduced mass, or kinetic energy, of the heavy quark in the
meson rest frame, and the spin. We parametrize a as

$$ a = -\lambda_1 + 2\left[J(J+1) - \frac{3}{2}\right]\lambda_2. \qquad (9.30) $$

Here the λ1 term is universal and the λ2 term is associated with the total spin of the meson, J.
By dimensional analysis λ1,2 have mass dimension two, and we can guess that λ1,2 ≈ Λ²QCD and
Λ ≈ ΛQCD.
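As a short worked evaluation of Eq. (9.30), consider the two spin states of the ground-state heavy mesons, the pseudoscalar (J = 0, e.g. the B) and the vector (J = 1, e.g. the B∗):

```latex
\begin{align}
J = 0: \quad a &= -\lambda_1 + 2\left[0 - \tfrac32\right]\lambda_2
                = -\lambda_1 - 3\lambda_2, \\
J = 1: \quad a &= -\lambda_1 + 2\left[2 - \tfrac32\right]\lambda_2
                = -\lambda_1 + \lambda_2, \\
a(J=1) - a(J=0) &= 4\lambda_2 .
\end{align}
```

The universal λ1 piece cancels in the difference, so the vector-pseudoscalar splitting is controlled by λ2 alone.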

For example, consider the ratio of the Bs–Bd mass splitting to the Ds–Dd splitting. Heavy

quark symmetry predicts that these splittings are the same up to effects on the order of 1/mc.

Applying (9.29) gives

$$ r_1 \equiv \frac{m(B_s) - m(B_d)}{m(D_s) - m(D_d)} = \frac{\Lambda_s - \Lambda_d + O(1/m_b)}{\Lambda_s - \Lambda_d + O(1/m_c)} = 1 + O(1/m_c). \qquad (9.31) $$
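The prediction for r1 can be compared with data in a few lines. The meson masses below are approximate values supplied here as inputs (they are not quoted in the text), and the dictionary keys are illustrative labels, with "Bd" and "Dd" denoting the mesons that contain a d quark.

```python
# Approximate meson masses in MeV (inputs assumed here, not quoted in the text).
MASSES = {"Bs": 5366.9, "Bd": 5279.7, "Ds": 1968.3, "Dd": 1869.7}


def splitting_ratio(m):
    """Ratio of the Bs-Bd mass splitting to the Ds-Dd mass splitting."""
    return (m["Bs"] - m["Bd"]) / (m["Ds"] - m["Dd"])


r1 = splitting_ratio(MASSES)
print(f"r1 = {r1:.2f}")
```

The result deviates from unity by roughly ten percent, which is a plausible size for a correction of order ΛQCD/mc.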

As another example, one can show that

m2(B∗)−m2(B) = 4λ2. (9.32)

It is, of course, not surprising that this splitting only comes from the non-universal part of the a

term. Comparing to actual measurements, we find

λ2(mB) ≈ 0.12GeV2. (9.33)

We can do the same calculation for the charmed mesons and obtain a similar value. Note that the
bottom gives the more reliable result, since the error goes like 1/mQ. In fact, the differences
in the squared masses of the vector and pseudoscalar mesons for the B and the D are

m²(B∗) − m²(B) = 0.47 GeV², (9.34)

m²(D∗) − m²(D) = 0.55 GeV². (9.35)

What about the kaon? Of course, we don’t expect heavy quark symmetry to hold, but it
turns out that m²(K∗) − m²(K) = 0.55 GeV². What do we get for the ρ and the pion? We can neglect
the pion mass, and we recall that mρ = 770 MeV, so that the difference of the squared masses is
0.59 GeV². There is no reason why these relations should hold, but it is a notable observation that they do.
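The pattern of vector-pseudoscalar splittings can be reproduced with a short script. The masses below are approximate inputs supplied for illustration (they are not quoted in the text), so the last digit of each result should not be taken seriously; the extraction of λ2 assumes the relation of Eq. (9.32).

```python
# Approximate masses in GeV, as (vector, pseudoscalar) pairs.
# These inputs are assumptions for illustration, not values from the text.
PAIRS = {
    "B": (5.3247, 5.2793),
    "D": (2.0103, 1.8697),
    "K": (0.8917, 0.4937),
    "rho/pi": (0.7753, 0.1396),
}


def squared_mass_splitting(m_vector, m_pseudoscalar):
    """Difference of squared masses, in GeV^2."""
    return m_vector**2 - m_pseudoscalar**2


splittings = {name: squared_mass_splitting(*m) for name, m in PAIRS.items()}
lambda_2 = splittings["B"] / 4.0  # uses m^2(B*) - m^2(B) = 4 lambda_2

for name, value in splittings.items():
    print(f"{name:7s} {value:.2f} GeV^2")
print(f"lambda_2 ~ {lambda_2:.2f} GeV^2")
```

All four splittings come out close to one another, around 0.5 to 0.6 GeV², even though heavy quark symmetry only justifies the comparison for the B and D systems.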

We do not discuss the more formal development of heavy quark effective theory
(HQET), and just remark that it puts all the ideas we discussed above into a formal
setting.

9.4 High energy QCD

In high energy QCD processes, both long and short distance physics are needed to explain the

observations. In this section we discuss a few aspects of it.

9.4.1 Quark hadron duality

The way we set up QCD, theoretical calculations are done in terms of quarks and gluons, but what
we deal with experimentally are hadrons. The question is then how we can connect the two aspects
of QCD, that is, how to use the experimental data to probe the theory.

The answer is given by the assumption of “quark-hadron duality.” The duality states that
inclusive hadronic observables at high energies, when integrated over a large enough energy range,
can be described by a calculation in terms of quarks and gluons. (The generic name for the
quark or gluon in such a process is parton.) This duality is used in many cases with much success,
for example, in e+e− annihilation, deep inelastic scattering, τ decays, and semileptonic B decays.

Despite this success, the notion of quark-hadron duality remains vague. In particular,
there are no formal criteria for when it can be used. Moreover, it is not clear how we can estimate
deviations from the duality results. Yet, the intuition is clear: at short distances QCD is
perturbative, and the strong interaction results in energy shifts of O(ΛQCD) between the different
partons. By integrating over a large enough interval, we average over these processes and then recover
the underlying results.

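To demonstrate the idea, consider the ratio of scattering cross sections (for s ≪ m²Z)

$$ R(E) \equiv \frac{\int_0^E \sigma(e^+e^- \to {\rm hadrons})}{\int_0^E \sigma(e^+e^- \to \mu^+\mu^-)}. \qquad (9.36) $$

Quark-hadron duality implies that it is given to leading order by

$$ R(E) \approx \frac{\int_0^E \sigma(e^+e^- \to q\bar q)}{\int_0^E \sigma(e^+e^- \to \mu^+\mu^-)} \approx 3\sum_q Q_q^2, \qquad (9.37) $$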

where 3 is the number of colors, Qq is the EM charge of the quark q, the sum goes over all the
quarks that are lighter than E, and the approximation sign is due to the fact that we neglect
the phase space factor, the weak interaction effects, and higher orders in QCD. The agreement is
rather impressive, as can be seen in the figure from the PDG.
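The leading-order prediction of Eq. (9.37) is simple enough to tabulate. The sketch below takes the set of active quark flavors as an explicit input rather than deriving it from the energy, so that no assumption about thresholds is built in; the function name is illustrative.

```python
from fractions import Fraction

N_COLORS = 3
CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def r_leading_order(active_flavors):
    """Leading-order R: number of colors times the sum of squared charges."""
    return N_COLORS * sum(CHARGE[q] ** 2 for q in active_flavors)


for flavors in ("uds", "udsc", "udscb"):
    print(flavors, r_leading_order(flavors))
```

The prediction is a staircase: R = 2 with only u, d, s active, 10/3 once charm is included, and 11/3 once bottom is included. Each step is a direct count of colors and charges, which is why the measurement of R is a classic test of the number of colors.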

The situation is more complicated when the initial state involves hadrons, for example, ep or

pp collisions. Then it is not clear what is the initial state in terms of quarks and gluons. The way

to go is to define a “Parton Distribution Function” (PDF) that describes the distributions of the

quarks, antiquarks and gluons inside the initial hadrons as functions of the involved energy. The

PDF is a non-perturbative quantity that cannot be calculated from first principles perturbatively.

We determine it by measuring it in one process and then using it in another. Once we have it,

the calculation is done on the same principle as discussed before, that is the calculation is done

in terms of the partons (quarks and gluons) and compared with the integrated measurements over

hadrons.

9.4.2 Jets

When we smash particles together at a collider, the hard scattering process can be described by
the UV part of QCD. For example, we can think about e+e− → qq̄ or e+e− → gg processes. Yet,
confinement tells us that we cannot see the two quarks going back to back as we see the muons in
the case of e+e− → µ+µ−. What we find is that qq̄ pairs coming from the vacuum combine with
the original partons in order to form color-neutral final states.

The number of qq̄ pairs that are created depends on the relative energy between the original
partons. The higher the energy is, the more qq̄ pairs are likely to be created. Most of the hadrons
that are produced are traveling roughly in the same direction as the original partons. The collection
of hadrons related to a single original parton comes under the name of a jet.

While there is no first principle way to relate jets to the UV properties of QCD, the fact that
jets form gives us ways to probe QCD at short distance. To a good approximation, a process like
e+e− → qq̄ is searched for as e+e− → 2j (here j represents a jet), while e+e− → qq̄g as e+e− → 3j.

The main problem that we encounter when we discuss jets is how to relate the jet to the
underlying process. The reason is that a jet is a finite-size object and we can define it in several
ways. In each of these ways there is some probability that the jets do not directly correspond to
the underlying process. For example, if we have two partons that are produced almost collinear,
such that their separation is much smaller than a typical jet size, we will see them as one jet. There
is no simple solution to this problem. What is done is to use simulations to determine the
probability that such things occur and to correct for them. That is, there is an intrinsic theoretical error
in translating a result for, say, e+e− → 2j to the one we calculate, that is, e+e− → qq̄.
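The merging of nearly collinear partons into a single jet can be illustrated with a toy model. The sketch below is not one of the jet algorithms used by experiments: it is a deliberately naive greedy clustering in opening angle, with an arbitrary jet size, meant only to show how the jet count depends on that size. All names and numbers are assumptions.

```python
import math


def angle_between(a, b):
    """Opening angle (radians) between two 3-momentum directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def count_jets(parton_directions, jet_size):
    """Greedy toy clustering: a parton is absorbed by the first jet axis
    within jet_size of it; otherwise it starts a new jet."""
    axes = []
    for direction in parton_directions:
        if not any(angle_between(direction, axis) < jet_size for axis in axes):
            axes.append(direction)
    return len(axes)


JET_SIZE = 0.4  # radians; an arbitrary illustrative choice

back_to_back = [(0, 0, 1), (0, 0, -1)]
well_separated = [(0, 0, 1), (0, 0, -1), (1, 0, 0)]
nearly_collinear = [(0, 0, 1), (0, 0, -1), (0.05, 0, -1)]

for name, event in [("q qbar", back_to_back),
                    ("q qbar g, separated", well_separated),
                    ("q qbar g, collinear", nearly_collinear)]:
    print(name, "->", count_jets(event, JET_SIZE), "jets")
```

In the last event three partons are reconstructed as two jets, which is exactly the kind of mismatch between parton-level and jet-level final states that has to be corrected for with simulations.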

To leading order what we care about is the fact that we have a jet. We can learn more about

the short distance physics by the properties of the jet. Some of the properties depend on the

parton that originates the jet. For example, one can, statistically speaking, tell a quark jet from

a gluon jet. Other properties are universal. For example, the most likely mesons to be produced
in the jet are pions, since they are the lightest. There are also not many baryons in a jet. We can
understand this fact by the need to pop up two qq̄ pairs, and the combinatorics is such that the
three qq̄ pairs are more likely to end up in three mesons than in two baryons.


Appendix

9.A quark masses

9.B Flavor SU(3)

Here we discuss flavor SU(3).

9.C Names and QN for hadrons

The pseudoscalar (JPC = 0−+) mesons include the π, η, η′, K, D, and B mesons. The first three

are called ‘unflavored’ by the PDG, meaning that they contain no heavy flavor. The η and η′

contain some admixture of ss, but this has no net strangeness quantum number. The K, D, and

B are flavored mesons: The kaons (K) all have net strangeness, the D mesons have net charm,

and the B mesons have net beauty.

The vector mesons (JPC = 1−−) include the isospin-triplet ρ(770), and the ω(782) and φ(1020)

which are isospin singlets. The flavored vector mesons are indicated by stars relative to the

pseudoscalars: K∗, D∗, B∗.

The scalar (JPC = 0++) mesons are formed by quark configurations with orbital angular
momentum ℓ = 1 and spin s = 1 such that the net spin is zero: a0, a1, a2, . . . The vector 1+−
mesons are formed by quark configurations with ℓ = 1 and s = 0: b0, b1, . . .

The baryons with three light quarks (u, d combinations) include the isospin doublet of nucleons

N = p, n, and the isospin quartet of ∆ = ∆++,∆+,∆0,∆−.

The baryons with two u and/or d quarks include the isospin singlets Λ,Λc,Λb, and the isospin

triplets Σ,Σc,Σb. The isospin comes only from the light quarks. These particles are named

according to their heavy quark. No index implies an s, while a subscript c or b indicates charmed

or beautiful baryons.

Baryons with just one u or d quark are isospin doublets and are called Ξ,Ξc,Ξb (which is too

difficult for particle physicists to pronounce consistently so we tend to call them ‘cascades’). These

are also named according to their heavy quarks with s implied by no index. Thus we would name

a baryon with one light quark, a strange quark, and a charm quark the Ξc. Whether the light
quark is u or d is implied by the charge. For example, the Ξ0 must be a uss combination.

Finally, there are baryons with no u and d quarks. The most famous of these is the Ω, composed of
three strange quarks. This is the particle that Ne’eman and Gell-Mann predicted on the basis of
the decuplet of SU(3). They wrote down its charge, quark content, and mass a few years before
its discovery.


Homework

Question 9.1: Using the PDG

Read the quark model review and the light meson summary table from the PDG.

1. What are the component quarks of the D+ meson? What is its mass?

2. What are the component quarks of the Λ baryon? What is its spin?

3. What is the lifetime (in sec) and width (in eV) of the B+ meson?

4. Explain what P , C, J , I and G are. For each of these Quantum Numbers (QNs) indicate if

they are (i) exact in Nature; (ii) exact in QCD; or (iii) approximately conserved in QCD.

(Exact in QCD means to consider QCD gauge interactions and quark masses, approximate

means consider only QCD gauge interactions.)

5. Find the masses and the above mentioned QNs of the π, η, ρ and ω.

6. Find the Branching Ratios (BRs) of the ρ, η and ω to two and three pions.

7. You can see that the η does not decay to two pions, the ω decay rate to two pions is very
small, while the ρ decays almost always to two pions. Based on the QNs we mentioned above,
explain these results.


Homework

Question 9.1: Pseudoscalar decay constant

We’ve now explored the meaning of the axial matrix element and have parameterized the QCD

brown muck into a single pion decay constant. Armed with this, what can we say about the

pseudoscalar matrix element? It turns out that we don’t need to do any more calculations or

introduce any more parameters: we can use a slick trick to get the pseudoscalar matrix element

from the axial matrix element. This trick falls under the umbrella of the current algebra, something

which has fallen out of favor in modern textbooks. Here we’ll just sketch what we need.

Let us consider the pseudoscalar kaon matrix element,

P = 〈K(p)|sγ5u|0〉. (9.38)

This is manifestly a Lorentz scalar quantity. The result that we would like to show is

$$ P = \frac{i f_K m_K^2}{m_u + m_s}. \qquad (9.39) $$

The trick to derive this result is to take the divergence of the axial current,

i∂µ(sγµγ5u) = i(∂µs)γµγ5u+ sγµγ5(i∂µu) (9.40)

= (i/∂s)γ5u− sγ5(i/∂u) (9.41)

= −(ms +mu)sγ5u. (9.42)

Now take this and put it between the kaon and the vacuum. The derivative just gives
a momentum (∂µ = −ipµ), so we get

〈K|i∂µ(s̄γµγ5u)|0〉 = pµ(−ifKpµ) = −ifKm²K, (9.43)

from which we obtain (9.39) straightforwardly.

Question 9.2: Form Factors

Consider the semi-leptonic decay B → D∗ℓν. Let us again focus on calculating the hadronic
vector matrix element.

1. Before diving into this particular decay, consider B → D+ℓν. Convince yourself that the
relevant hadronic vector matrix element takes the form of (9.18).

2. Prove that

〈D∗+(pD, ε)|V µ|B(pB)〉 = ig(q2)εµναβε∗ν(pD + pB)αqβ, (9.44)

where ε is the totally antisymmetric tensor.

3. Using the fact that T acts as a complex conjugate, show that g(q2) is real.

4. Parameterize the axial matrix element, 〈D∗(pD, ε)|Aµ|B(pB)〉.


Chapter 10

Mixing and CPV

10.1 Neutral meson mixing

Neutral meson mixing is an FCNC process. Within the SM, it provides indirect measurements of

CKM parameters. Beyond the SM, it probes very high energy scales. In this Section, we present

the formalism that is used to investigate these processes, and explain how the time evolution of

the neutral meson system depends on the tiny mass splitting between the two quasi-degenerate

mass eigenstates.

10.1.1 Toy model

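Let us denote the time-evolved state of an initial state |P〉 by |P(t)〉. For mass eigenstates, the
time evolution is simple, |PL,H(t)〉 = exp(−iEL,H t)|PL,H〉. But the time evolution of |P⁰(t)〉 and |P̄⁰(t)〉
is more complicated:

$$ |P^0(t)\rangle = \cos\left(\frac{\Delta E\,t}{2}\right)|P^0\rangle + i\sin\left(\frac{\Delta E\,t}{2}\right)|\bar P^0\rangle, $$

$$ |\bar P^0(t)\rangle = \cos\left(\frac{\Delta E\,t}{2}\right)|\bar P^0\rangle + i\sin\left(\frac{\Delta E\,t}{2}\right)|P^0\rangle. \qquad (10.1) $$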

Since flavor is not conserved, the probability 𝒫 to measure a specific flavor, that is P⁰ or P̄⁰,
oscillates in time:

$$ \mathcal{P}(P^0 \to P^0)[t] = \left|\langle P^0(t)|P^0\rangle\right|^2 = \frac{1 + \cos(\Delta E\,t)}{2}, $$

$$ \mathcal{P}(P^0 \to \bar P^0)[t] = \left|\langle P^0(t)|\bar P^0\rangle\right|^2 = \frac{1 - \cos(\Delta E\,t)}{2}. \qquad (10.2) $$

Thus, neutral meson mixing, M12 ≠ 0, leads to flavor oscillations.
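The step from the amplitudes of Eq. (10.1) to the probabilities of Eq. (10.2) can be checked numerically. The sketch below assumes a state that starts as a pure P⁰, so that the coefficient of the unmixed flavor is cos(∆E t/2) and that of the opposite flavor is i sin(∆E t/2); the function names and the units of ∆E and t are illustrative.

```python
import math


def flavor_amplitudes(delta_e, t):
    """Amplitudes (same flavor, opposite flavor) for a state starting as P0."""
    half_phase = delta_e * t / 2.0
    return math.cos(half_phase), 1j * math.sin(half_phase)


def flavor_probabilities(delta_e, t):
    """Probabilities to find the original and the opposite flavor at time t."""
    amp_same, amp_flipped = flavor_amplitudes(delta_e, t)
    return abs(amp_same) ** 2, abs(amp_flipped) ** 2


DELTA_E = 1.0  # arbitrary units, with t measured in units of 1/DELTA_E
for t in (0.0, math.pi / 2, math.pi, 2 * math.pi):
    p_same, p_flipped = flavor_probabilities(DELTA_E, t)
    print(f"t = {t:5.2f}  P(same) = {p_same:.3f}  P(flipped) = {p_flipped:.3f}")
```

The two probabilities always add up to one in this toy model because decays are not included. At ∆E t = π the state has fully converted into the opposite flavor.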

In the meson rest frame, ∆E = ∆m and t = τ , the proper time. Thus, ∆m sets the frequency

of the flavor oscillations. This is a very interesting result:

• On the theoretical side, ∆m is related to FCNC transitions: the quark transitions that
correspond to K⁰–K̄⁰, D⁰–D̄⁰, B⁰–B̄⁰, and Bs⁰–B̄s⁰ mixing are, respectively, s̄d → sd̄,
ūc → uc̄, b̄d → bd̄, and b̄s → bs̄. Thus, ∆m for each of the four systems gives an indirect
measurement of CKM parameters and can probe new physics.

• On the experimental side, we learn that by measuring the oscillation frequency we can

determine the mass splitting between the two mass eigenstates. One way this can be done

is by measuring the flavor of the meson both at production and decay. It is not trivial to

measure the flavor at both ends, and we do not explain here how it is done, but you are

encouraged to think and learn about it.
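To connect the two points above, the relation between an oscillation frequency and a mass splitting can be made explicit with a unit conversion. In the sketch below the value of ∆m is an illustrative input of roughly 0.5 inverse picoseconds, of the order of what is measured in the B⁰ system; it is not a number quoted in the text, and the function names are assumptions.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def oscillation_period_ps(delta_m_inverse_ps):
    """Period 2*pi/Delta m of the cos(Delta m * t) term, in picoseconds."""
    return 2.0 * math.pi / delta_m_inverse_ps


def delta_m_in_ev(delta_m_inverse_ps):
    """Convert Delta m from inverse picoseconds (hbar = 1) to eV."""
    return delta_m_inverse_ps * 1e12 * HBAR_EV_S


# Illustrative input (an assumption, not quoted in the text).
DELTA_M = 0.5  # inverse picoseconds

print(f"period  ~ {oscillation_period_ps(DELTA_M):.1f} ps")
print(f"Delta m ~ {delta_m_in_ev(DELTA_M):.1e} eV")
```

A frequency measured on picosecond time scales thus corresponds to a mass splitting of order 10⁻⁴ eV, many orders of magnitude below the meson mass itself. This is the sense in which the mass splitting between the two eigenstates is tiny.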

10.1.2 Flavor oscillations

There are four neutral meson pairs where mixing can occur: K⁰–K̄⁰, D⁰–D̄⁰, B⁰–B̄⁰, and
Bs⁰–B̄s⁰.¹

Consider a neutral meson P (P = K, D, B or Bs). We consider the case where initially, at
t = 0, it is some specific combination of P⁰ and P̄⁰:

|ψP(0)〉 = a(0)|P⁰〉 + b(0)|P̄⁰〉 . (10.4)

It evolves in time, and acquires components that correspond to all possible decay final states
f1, f2, . . .:

|ψP(t)〉 = a(t)|P⁰〉 + b(t)|P̄⁰〉 + c1(t)|f1〉 + c2(t)|f2〉 + · · · . (10.5)

Our interest lies in obtaining only a(t) and b(t). For this aim, one can use a simplified formalism,
where the full Hamiltonian is replaced with a 2 × 2 effective Hamiltonian H that is not Hermitian.
The non-Hermiticity is related to the possibility of decays, which makes the P⁰, P̄⁰ system an
open one. The complex matrix H can be written in terms of Hermitian matrices M and Γ as

$$ H = M - \frac{i}{2}\Gamma. \qquad (10.6) $$

The matrices M and Γ are associated with (P⁰, P̄⁰) ↔ (P⁰, P̄⁰) transitions via off-shell (dispersive)
and on-shell (absorptive) intermediate states, respectively. Diagonal elements of M and Γ are
associated with the flavor-conserving transitions P⁰ → P⁰ and P̄⁰ → P̄⁰. The CPT symmetry
implies that M11 = M22 and Γ11 = Γ22. The off-diagonal elements are associated with the flavor
changing transitions P⁰ ↔ P̄⁰.

¹ You may be wondering why there are only four such systems. If you do not wonder and do not know the answer,
then you should wonder. You will answer this question in your homework.

Before we proceed, let us clarify a semantic issue. The effective Hamiltonian H and, similarly,

its Hermitian part M , is a combination of operators. What we need for our purposes is its matrix

element between specific meson states. With some abuse of language, we denote by Mij both

the operator and its matrix element. Model independently, the diagonal matrix elements fulfill

M11 = M22 = m and Γ11 = Γ22 = Γ. The off-diagonal elements are those of interest to us. When

we refer to a specific meson system, we will use MPP for the matrix element 〈P⁰|M12|P̄⁰〉.

In all cases (P = K, D, B, Bs), H is not a diagonal matrix. Thus, the states that have well
defined masses and decay widths are not P⁰ and P̄⁰, but rather the eigenvectors of H. We denote

the light and heavy eigenstates by PL and PH with masses mH > mL. (Another possible choice,

which is standard for K mesons, is to define the mass eigenstates according to their lifetimes. We

denote the short-lived and long-lived eigenstates by KS and KL with decay widths ΓS > ΓL. The

KL meson is experimentally found to be the heavier state.) The eigenstates of H are given by

|PL,H〉 = p|P 0〉 ± q|P 0〉, (10.7)

where (q

p

)2

=M∗

12 − (i/2)Γ∗12

M12 − (i/2)Γ12

, (10.8)

and with the normalization |p|2 + |q|2 = 1. Since H is not Hermitian, the eigenstates need not be

orthogonal to each other.

The masses and decay-widths are given by the real and imaginary parts of the eigenvalues,

respectively. The average mass and the average width are given by

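$$ m \equiv \frac{m_H + m_L}{2}, \qquad \Gamma \equiv \frac{\Gamma_H + \Gamma_L}{2}. \qquad (10.9) $$

The mass difference $\Delta m$ and the width difference $\Delta\Gamma$ are defined as follows:

$$ \Delta m \equiv m_H - m_L, \qquad \Delta\Gamma \equiv \Gamma_H - \Gamma_L. \qquad (10.10) $$

Here $\Delta m$ is positive by definition, while the sign of $\Delta\Gamma$ is to be determined experimentally. (Alternatively, one can use the states defined by their lifetimes to have $\Delta\Gamma \equiv \Gamma_S - \Gamma_L$ positive by definition.) It is useful to define dimensionless ratios $x$ and $y$:

$$ x \equiv \frac{\Delta m}{\Gamma}, \qquad y \equiv \frac{\Delta\Gamma}{2\Gamma}. \qquad (10.11) $$

We also define

$$ \theta = \arg(M_{12}\Gamma_{12}^*). \qquad (10.12) $$

Solving the eigenvalue equation gives

$$ (\Delta m)^2 - \frac{1}{4}(\Delta\Gamma)^2 = 4|M_{12}|^2 - |\Gamma_{12}|^2, \qquad \Delta m\,\Delta\Gamma = 4\,{\rm Re}(M_{12}\Gamma_{12}^*). \qquad (10.13) $$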
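The relations in Eq. (10.13) can be checked numerically. The sketch below is only an illustration: the function name and the sample values of $M_{12}$ and $\Gamma_{12}$ are made up, and it uses the fact that, with equal diagonal entries, the two eigenvalues of the 2×2 matrix $H$ differ by $2\sqrt{H_{12}H_{21}}$.

```python
import cmath

def mixing_parameters(M12, G12):
    """Return (dm, dG, |q/p|) for H = M - (i/2)*Gamma with equal diagonal entries.

    M12 and G12 are complex numbers; the name and interface are illustrative only.
    """
    H12 = M12 - 0.5j * G12                          # <P0|H|P0bar>
    H21 = M12.conjugate() - 0.5j * G12.conjugate()  # <P0bar|H|P0>
    # The eigenvalues are (m - i*Gamma/2) +/- sqrt(H12*H21), hence
    # mu_H - mu_L = dm - (i/2)*dG = 2*sqrt(H12*H21).
    # cmath.sqrt returns the root with non-negative real part, so dm >= 0.
    root = cmath.sqrt(H12 * H21)
    dm = 2.0 * root.real
    dG = -4.0 * root.imag
    abs_q_over_p = abs(H21 / H12) ** 0.5            # from Eq. (10.8)
    return dm, dG, abs_q_over_p

# Hypothetical inputs in arbitrary units, chosen only to exercise the formulas.
M12 = cmath.rect(1.0, 0.3)
G12 = cmath.rect(0.4, -0.2)
dm, dG, r = mixing_parameters(M12, G12)
print(dm**2 - dG**2 / 4, 4 * abs(M12)**2 - abs(G12)**2)   # two sides of Eq. (10.13)
print(dm * dG, 4 * (M12 * G12.conjugate()).real)
```

When $\Gamma_{12}/M_{12}$ is real, the same function returns $|q/p| = 1$, $\Delta m = 2|M_{12}|$ and $|\Delta\Gamma| = 2|\Gamma_{12}|$.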

We move on to study the time evolution of a neutral meson. For simplicity, we assume CP

conservation. In Section 10.2 we study CP violation, and there we relax this assumption. Many

important points can, however, be understood in the simplified case where CP is conserved. If CP

is a good symmetry of H then Γ12/M12 is real, leading to

|q/p| = 1 . (10.14)

It follows that the mass eigenstates are also CP eigenstates, and are orthogonal to each other, $\langle P_H|P_L\rangle = |p|^2 - |q|^2 = 0$. The phase of $q/p$ is convention dependent, and not a physical observable.

As concerns the mass and decay widths, Eq. (10.13) simplifies to

∆m = 2|M12|, |∆Γ| = 2|Γ12|. (10.15)

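Let us denote the time-evolved state of an initial state $|P\rangle$ by $|P(t)\rangle$. For mass eigenstates, the time evolution is simple,

$$ |P_{L,H}(t)\rangle = e^{-i m_{L,H} t - \frac{1}{2}\Gamma_{L,H} t}\,|P_{L,H}\rangle. \qquad (10.16) $$

But the time evolution of $|P^0(t)\rangle$ and $|\bar P^0(t)\rangle$ is more complicated:

$$ |P^0(t)\rangle = g_+(t)|P^0\rangle - (q/p)\,g_-(t)|\bar P^0\rangle , $$
$$ |\bar P^0(t)\rangle = g_+(t)|\bar P^0\rangle - (p/q)\,g_-(t)|P^0\rangle , \qquad (10.17) $$

where

$$ g_\pm(t) = \frac{1}{2}\left(e^{-i m_H t - \frac{1}{2}\Gamma_H t} \pm e^{-i m_L t - \frac{1}{2}\Gamma_L t}\right). \qquad (10.18) $$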

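Let us further define decay amplitudes of $P^0$ and its CP conjugate $\bar P^0$ into a final state $f$:

$$ A_f = \langle f|H|P^0\rangle, \qquad \bar A_f = \langle f|H|\bar P^0\rangle. \qquad (10.19) $$

It is also useful to define the parameter $\lambda_f$:

$$ \lambda_f \equiv (q/p)(\bar A_f/A_f). \qquad (10.20) $$

The time-dependent decay rates of $P^0 \to f$ and $\bar P^0 \to f$ are given by

$$ \frac{d\Gamma[P^0(t)\to f]/dt}{e^{-\Gamma t} N_f |A_f|^2} = (1 + |\lambda_f|^2)\cosh(y\Gamma t) + (1 - |\lambda_f|^2)\cos(x\Gamma t) + 2\,{\rm Re}(\lambda_f)\sinh(y\Gamma t) - 2\,{\rm Im}(\lambda_f)\sin(x\Gamma t), \qquad (10.21) $$

$$ \frac{d\Gamma[\bar P^0(t)\to f]/dt}{e^{-\Gamma t} N_f |\bar A_f|^2} = (1 + |\lambda_f|^{-2})\cosh(y\Gamma t) + (1 - |\lambda_f|^{-2})\cos(x\Gamma t) + 2\,{\rm Re}(\lambda_f^{-1})\sinh(y\Gamma t) - 2\,{\rm Im}(\lambda_f^{-1})\sin(x\Gamma t), \qquad (10.22) $$

where $N_f$ is a common, time-independent normalization factor.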
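As a cross-check of Eq. (10.21), one can compare it with the squared amplitude obtained directly from Eqs. (10.17) and (10.18), namely $|\langle f|H|P^0(t)\rangle|^2 = |A_f|^2\,|g_+(t) - \lambda_f g_-(t)|^2$. The following sketch does this numerically. The function names and the sample parameter values are illustrative assumptions, and the overall factor 1/2 corresponds to one particular choice of the normalization $N_f$.

```python
import cmath
import math

def g_plus_minus(t, m_H, m_L, G_H, G_L):
    """g_+(t) and g_-(t) of Eq. (10.18)."""
    e_H = cmath.exp((-1j * m_H - 0.5 * G_H) * t)
    e_L = cmath.exp((-1j * m_L - 0.5 * G_L) * t)
    return 0.5 * (e_H + e_L), 0.5 * (e_H - e_L)

def rate_from_amplitude(t, lam, m_H, m_L, G_H, G_L):
    """|g_+ - lambda_f g_-|^2, i.e. the P0(t) -> f rate in units of |A_f|^2."""
    g_p, g_m = g_plus_minus(t, m_H, m_L, G_H, G_L)
    return abs(g_p - lam * g_m) ** 2

def rate_from_formula(t, lam, m_H, m_L, G_H, G_L):
    """Right-hand side of Eq. (10.21) times exp(-Gamma t)/2 (assumed N_f = 1/2)."""
    G = 0.5 * (G_H + G_L)                 # average width, Eq. (10.9)
    x = (m_H - m_L) / G                   # Eq. (10.11)
    y = (G_H - G_L) / (2.0 * G)
    bracket = ((1 + abs(lam) ** 2) * math.cosh(y * G * t)
               + (1 - abs(lam) ** 2) * math.cos(x * G * t)
               + 2 * lam.real * math.sinh(y * G * t)
               - 2 * lam.imag * math.sin(x * G * t))
    return 0.5 * math.exp(-G * t) * bracket

# Hypothetical parameters in arbitrary units.
params = dict(lam=complex(0.6, -0.5), m_H=5.3, m_L=5.0, G_H=0.9, G_L=1.1)
for t in (0.0, 0.7, 2.5):
    print(t, rate_from_amplitude(t, **params), rate_from_formula(t, **params))
```

The two columns agree for any choice of parameters, which confirms the signs of the sinh and sin terms.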


10.1.3 Time scales

There are various time scales involved in meson mixing, and understanding the hierarchy (or lack

of hierarchy) between them leads to insights and simplifications.

The first important time scale is the oscillation period. As can be seen from Eq. (10.3), the

oscillation time scale is given by ∆m.2

To understand which other time scales are relevant, we need to introduce the notion of “flavor tagging.” The flavor eigenstates $P^0$ and $\bar P^0$ have a well defined flavor content. For example, $B^0$ ($\bar B^0$) is a $\bar b d$ ($b \bar d$) bound state. The term “flavor tagging” is used, in physicists' jargon, for the experimental determination of whether a neutral P meson is in a $P^0$ or a $\bar P^0$ state. Flavor tagging is provided to us by Nature, when the meson decays into a flavor-specific final state, namely a state that can come from either a $P^0$ or a $\bar P^0$ state, but not from both.3 Semileptonic decays are very good flavor tags. Take, for example, semileptonic b (anti)quark decays:

$$ b \to c\,\mu^-\bar\nu, \qquad \bar b \to \bar c\,\mu^+\nu. \qquad (10.23) $$

Thus, the charge of the lepton tells us the flavor: $\mu^+$ comes from a $B^0$ (or $B^+$) decay, while $\mu^-$ comes from a $\bar B^0$ (or $B^-$) decay. Of course, before the meson decays it could be in a superposition of $B^0$ and $\bar B^0$. The decay acts as a quantum measurement. In the case of semileptonic decay, it acts as a measurement of flavor vs. anti-flavor.

Thus, a second relevant time scale is that of flavor tagging. Since the flavor is tagged when the

meson decays, the relevant time scale is determined by the decay width, Γ. We can then use the

dimensionless quantity x [defined in Eq. (10.11)] to understand the possible hierarchies between

these two time scales:

1. $x \ll 1$ (“slow oscillations”): The meson decays before it has time to oscillate, and thus flavor is conserved to a good approximation. Putting $\cos(\Delta m\,t) \approx 1$ in Eq. (10.3), we obtain $\mathcal{P}(P^0 \to P^0) \approx 1$ and $\mathcal{P}(P^0 \to \bar P^0) \to 0$. A measurement of $\Delta m$ is challenging, but experiments can provide a useful upper bound even before the required precision for an actual measurement is achieved. This case is relevant for the D system.

2. $x \gg 1$ (“fast oscillations”): The meson oscillates many times before decaying, and thus the oscillating term practically averages out to zero. Putting $\cos(\Delta m\,t) \approx 0$ in Eq. (10.3), we obtain $\mathcal{P}(P \to P) \approx \mathcal{P}(P \to \bar P) \approx 1/2$. A measurement of $\Delta m$ is challenging, but experiments can provide a useful lower bound even before the required precision for an actual measurement is achieved. This case is relevant for the $B_s$ system.

3. $x \sim 1$: The oscillation and decay times are roughly the same. The meson has time to oscillate and the oscillations do not average out. This is the case where it is experimentally easiest to measure $\Delta m$. This case is relevant to both the K and the B systems. We emphasize that the physics processes that determine $\Gamma$ and $\Delta m$ are unrelated, so there is no reason to expect $x \sim 1$. Yet, amazingly, Nature has been kind enough to choose flavor parameters such that $x \sim 1$ in two out of the four neutral meson systems.

2 The time scale is, of course, $1/\Delta m$. Physicists know, however, how to match dimensions. We thus interchange between time and energy freely, counting on the reader to understand what we mean.

3 Final states that are common to the decays of both $P$ and $\bar P$ are also very useful in flavor physics and, in particular, to the study of CP violation. They are discussed in Section 10.2.

Thus, flavor oscillations give us sensitivity to mass difference of the order of the width, which

is much smaller than the mass itself. In fact, we have been able to measure mass differences that

are fourteen orders of magnitude smaller than the corresponding masses. It is due to the quantum

mechanical nature of the oscillation that such high precision can be achieved.

In some cases there is one more time scale: ∆Γ. In these cases, we have one more relevant

dimensionless parameter: y ≡ ∆Γ/(2Γ). Note that y is bounded, |y| ≤ 1. (This is in contrast to

x which has no upper bound.) Thus, we can talk about several cases depending on the values of

y and x.

1. $|y| \ll 1$ and $y \ll x$. In this case the width difference is irrelevant. This is the case for the $B^0$ system.

2. $y \sim x$. In this case the width difference is as important as the oscillation. This is the case in the D system, where $y \ll 1$, and for the K system, with $y \sim 1$.

3. $|y| \sim 1$ and $y \ll x$. In this case the oscillation averages out and the width difference can be observed simply as a difference in the lifetimes of the two mass eigenstates. This case is relevant to the $B_s$ system, where $y \sim 0.1$.

There are a few other limits (like $y \gg x$) that are not realized in the four meson systems. Yet, they might be realized in some other systems yet to be discovered.

To conclude this subsection, we present in Table 10.1 the experimental data on meson mixing.

In all cases (including the K meson system) we define x and y as in Eqs. (10.10) and (10.11). For the $K^0$ system, the error on y is well below one per mille. For the $B^0$ system, there is only an upper bound on y.
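The statement that oscillations probe mass differences many orders of magnitude below the masses themselves can be made concrete with the central values of Table 10.1. The short sketch below uses $\Delta m = x\Gamma$ from Eq. (10.11); the dictionary layout and function names are illustrative choices, and only central values are used, with no error propagation.

```python
# Central values (m [GeV], Gamma [GeV], x) copied from Table 10.1.
TABLE_10_1 = {
    "K0": (0.498, 3.68e-15, 0.945),
    "B0": (5.28, 4.33e-13, 0.775),
    "Bs": (5.37, 4.34e-13, 26.82),
}

def mass_splitting(width, x):
    """Delta m = x * Gamma, from the definition of x in Eq. (10.11)."""
    return x * width

def relative_splitting(mass, width, x):
    """Delta m / m, a dimensionless measure of the sensitivity."""
    return mass_splitting(width, x) / mass

for name, (mass, width, x) in TABLE_10_1.items():
    print(f"{name}: dm = {mass_splitting(width, x):.3e} GeV, "
          f"dm/m = {relative_splitting(mass, width, x):.1e}")
```

For the K system the ratio comes out at the level of $10^{-14}$, which is the origin of the "fourteen orders of magnitude" quoted above.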

Table 10.1: The experimental values of the neutral meson mixing parameters.

P      m [GeV]   Γ [GeV]          x                  y
K^0    0.498     3.68 × 10^-15    0.945 ± 0.001      −0.997
D^0    1.86      1.60 × 10^-12    0.0048 ± 0.0017    +0.014 ± 0.002
B^0    5.28      4.33 × 10^-13    0.775 ± 0.006      −0.0075 ± 0.0090
B_s    5.37      4.34 × 10^-13    26.82 ± 0.23       −0.061 ± 0.008

10.2 CP violation

To date, CP violation has been observed (at a level higher than 5σ) in about thirty different decay modes. It has not been observed in baryon decays, nor in the leptonic sector, nor in flavor diagonal processes, such as electric dipole moments. We thus present in this section the formalism for CP violation in meson decays.

The experimental observation of CP violation is challenging for several reasons:

1. For a CP asymmetry to arise in a decay process, so-called “strong phases”, which are CP conserving phases arising from intermediate on-shell particles, must be present. These phases might be small (or vanish) and thus suppress the CP asymmetry (or make it vanish).

2. CPT implies that the total width of a particle and its anti-particle are the same. Thus, any

CP violation in one channel must be compensated by CP violation with an opposite sign in

other channels. Consequently, CP violation is suppressed in inclusive measurements.

3. Within the SM, CP violation arises only when all three generations are involved. With the

smallness of the CKM mixing angles, this means that either the CP asymmetries are small,

or they appear in modes with small branching ratios.

CP violation in meson decays is an interference effect. In neutral meson decays the phe-

nomenology of CP violation is particularly rich thanks to the fact that meson mixing, as described

in Section 10.1, can contribute to the CP violating interference effects. One distinguishes three

types of CP violation in meson decays, depending on which amplitudes interfere:

1. In decay: The interference is between two decay amplitudes.

2. In mixing: The interference is between the absorptive and dispersive mixing amplitudes.

3. In interference of decays with and without mixing: The interference is between the direct

decay amplitude and a first-mix-then-decay amplitude.

Our starting point for discussing in more detail each of these three types of CP violation is Eqs. (10.21) and (10.22), which give the time-dependent decay rates $\Gamma(B^0 \to f)[t]$ and $\Gamma(\bar B^0 \to f)[t]$. Before we proceed to do so, we present some physics ingredients concerning the decay amplitudes, and some further notations. We do so for the specific case of B-meson decays, but our discussion applies to all meson decays.

Consider $A_f$, the $B \to f$ decay amplitude, and $\bar A_{\bar f}$, the amplitude of the CP conjugate process, $\bar B \to \bar f$. There are two types of phases that may appear in these decay amplitudes. First, complex parameters in any Lagrangian term that contributes to $A_f$ appear in a complex conjugate form in $\bar A_{\bar f}$. In other words, CP violating phases change sign between $A_f$ and $\bar A_{\bar f}$. In the SM, these phases appear only in the couplings of the $W^\pm$-bosons, hence the CP violating phases are called “weak phases.” Second, phases can appear in decay amplitudes even when the Lagrangian is real. They arise from contributions of intermediate on-shell states. These CP conserving phases appear with the same sign in $A_f$ and $\bar A_{\bar f}$. In meson decays, such rescattering is usually driven by strong interactions, hence the CP conserving phases are called “strong phases.”

It is useful to factorize each contribution $a_i$ to $A_f$ into three parts: the magnitude $|a_i|$, the weak phase $\phi_i$, and the strong phase $\delta_i$. If there are two such contributions, $A_f = a_1 + a_2$, we write

$$ A_f = |a_1| e^{i(\delta_1 + \phi_1)} + |a_2| e^{i(\delta_2 + \phi_2)}, $$
$$ \bar A_{\bar f} = |a_1| e^{i(\delta_1 - \phi_1)} + |a_2| e^{i(\delta_2 - \phi_2)}. \qquad (10.24) $$

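It is further useful to define

$$ \phi_f \equiv \phi_2 - \phi_1, \qquad \delta_f \equiv \delta_2 - \delta_1, \qquad r_f \equiv |a_2/a_1|. \qquad (10.25) $$

For neutral meson mixing, it is useful to write

$$ M_{12} = |M_{12}| e^{i\phi_M}, \qquad \Gamma_{12} = |\Gamma_{12}| e^{i\phi_\Gamma}. \qquad (10.26) $$

Each of the phases appearing in Eqs. (10.24) and (10.26) is convention dependent, but combinations such as $\delta_1 - \delta_2$, $\phi_1 - \phi_2$, and $\phi_M - \phi_\Gamma$ are physical.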

10.2.1 CP violation in decay

CP violation in decay corresponds to

$$ |\bar A_{\bar f}/A_f| \neq 1. \qquad (10.27) $$

In charged meson decays, this is the only possible contribution to the CP asymmetry:

$$ \mathcal{A}_{f^\pm} \equiv \frac{\Gamma(B^- \to f^-) - \Gamma(B^+ \to f^+)}{\Gamma(B^- \to f^-) + \Gamma(B^+ \to f^+)} = \frac{|\bar A_{f^-}/A_{f^+}|^2 - 1}{|\bar A_{f^-}/A_{f^+}|^2 + 1}. \qquad (10.28) $$

Using Eq. (10.24), we obtain, for $r_f \ll 1$,

$$ \mathcal{A}_{f^\pm} = 2 r_f \sin\phi_f \sin\delta_f . \qquad (10.29) $$

This result shows explicitly that we need two decay amplitudes, that is, $r_f \neq 0$, with different weak phases, $\phi_f \neq 0, \pi$, and different strong phases, $\delta_f \neq 0, \pi$. A few comments are in order:

1. In order to have a large CP asymmetry, we need each of the three factors in Eq. (10.29) to be large.

2. A similar expression holds for the contribution of CP violation in decay in neutral meson

decays. In this case there are, however, additional contributions.

3. Another complication with regard to neutral meson decays is that it is not always possible to tell the flavor of the decaying meson, that is, whether it is a $B^0$ or a $\bar B^0$. This can be a problem or a virtue.

4. In general the strong phase is not calculable since it is related to QCD. This is not a problem

if the aim is just to demonstrate CP violation, but it is if we want to extract the weak

parameter φf . In some cases, however, the phase can be independently measured, eliminating

this particular source of theoretical uncertainty.
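The small-$r_f$ result of Eq. (10.29) can be compared with the exact asymmetry that follows from inserting the two-amplitude form of Eq. (10.24) into Eq. (10.28). The sketch below is only illustrative: the function names and the sample values of $r_f$, $\phi_f$ and $\delta_f$ are assumptions, and the common factor $|a_1| e^{i\delta_1}$ is dropped because it cancels in the ratio.

```python
import cmath
import math

def decay_asymmetry_exact(r, phi, delta):
    """(|Abar|^2 - |A|^2) / (|Abar|^2 + |A|^2) for the two-amplitude form of Eq. (10.24).

    The common factor a_1 cancels, so A is written relative to it (phi_1 = 0 convention).
    """
    A = 1 + r * cmath.exp(1j * (delta + phi))      # B+ -> f+ amplitude, in units of a_1
    Abar = 1 + r * cmath.exp(1j * (delta - phi))   # CP conjugate: weak phase flips sign
    return (abs(Abar) ** 2 - abs(A) ** 2) / (abs(Abar) ** 2 + abs(A) ** 2)

def decay_asymmetry_small_r(r, phi, delta):
    """Leading-order result of Eq. (10.29)."""
    return 2 * r * math.sin(phi) * math.sin(delta)

# Hypothetical inputs.
r, phi, delta = 0.01, 1.0, 0.5
print(decay_asymmetry_exact(r, phi, delta), decay_asymmetry_small_r(r, phi, delta))
print(decay_asymmetry_exact(r, phi, 0.0))   # vanishes without a strong phase difference
```

The exact expression vanishes whenever either phase difference is 0 or π, in line with the requirement of both a weak and a strong phase difference.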

10.2.2 CP violation in mixing

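CP violation in mixing corresponds to

$$ |q/p| \neq 1 . \qquad (10.30) $$

In decays into flavor specific final states ($\bar A_f = 0$ and, consequently, $\lambda_f = 0$), and, in particular, semileptonic neutral meson decays, this is the only source of CP violation:4

$$ \mathcal{A}_{\rm SL}(t) \equiv \frac{\Gamma[\bar B^0(t) \to \ell^+ X] - \Gamma[B^0(t) \to \ell^- X]}{\Gamma[\bar B^0(t) \to \ell^+ X] + \Gamma[B^0(t) \to \ell^- X]} = \frac{1 - |q/p|^4}{1 + |q/p|^4}. \qquad (10.31) $$

Using Eq. (10.8), we obtain for $|\Gamma_{12}/M_{12}| \ll 1$,

$$ \mathcal{A}_{\rm SL} = -|\Gamma_{12}/M_{12}| \sin(\phi_M - \phi_\Gamma). \qquad (10.32) $$

A few comments are in order:
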
1. Eq. (10.31) implies that this asymmetry of time-dependent decay rates is actually time

independent.

2. The calculation of |Γ12/M12| is difficult, since it depends on low-energy QCD effects. Hence,

it would be difficult in general to extract the value of the CP violating phase φM − φΓ from

a measurement of ASL.
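To see how good the approximation in Eq. (10.32) is, one can evaluate Eq. (10.31) exactly, with $|q/p|^4$ taken from Eq. (10.8), and compare. The following sketch does so for made-up values of $M_{12}$ and $\Gamma_{12}$. The function names are illustrative, and no real meson system is being modeled.

```python
import cmath
import math

def a_sl_exact(M12, G12):
    """Eq. (10.31), with |q/p|^4 computed from Eq. (10.8)."""
    q_over_p_sq = (M12.conjugate() - 0.5j * G12.conjugate()) / (M12 - 0.5j * G12)
    r4 = abs(q_over_p_sq) ** 2          # |q/p|^4
    return (1 - r4) / (1 + r4)

def a_sl_approx(M12, G12):
    """Eq. (10.32), valid for |Gamma12/M12| << 1."""
    phi_M = cmath.phase(M12)            # phases as defined in Eq. (10.26)
    phi_G = cmath.phase(G12)
    return -abs(G12 / M12) * math.sin(phi_M - phi_G)

# Hypothetical inputs with |Gamma12/M12| = 0.01.
M12 = cmath.rect(1.0, 0.4)
G12 = cmath.rect(0.01, -0.3)
print(a_sl_exact(M12, G12), a_sl_approx(M12, G12))
```

The asymmetry carries the small factor $|\Gamma_{12}/M_{12}|$ and vanishes when $\phi_M = \phi_\Gamma$ (or when the two phases differ by π).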

CP violation in $K^0$−$\bar K^0$ mixing is measured via a semileptonic asymmetry which is defined as follows:

$$ \delta_L \equiv \frac{\Gamma(K_L \to \ell^+ \nu_\ell \pi^-) - \Gamma(K_L \to \ell^- \bar\nu_\ell \pi^+)}{\Gamma(K_L \to \ell^+ \nu_\ell \pi^-) + \Gamma(K_L \to \ell^- \bar\nu_\ell \pi^+)} = \frac{1 - |q/p|^2}{1 + |q/p|^2}. \qquad (10.33) $$

This asymmetry is somewhat different from the one defined in Eq. (10.31), in that the decaying meson is the neutral mass eigenstate, rather than the flavor eigenstate. Hence also the different dependence on $|q/p|$. The experimental value is $\delta_L = (3.32 \pm 0.06) \times 10^{-3}$.

4 This statement holds within the SM where, to lowest order in $G_F$, $|A_{\ell^+ X}| = |\bar A_{\ell^- X}|$ and $A_{\ell^- X} = \bar A_{\ell^+ X} = 0$.

Here one can overcome the difficulty of calculating $|\Gamma_{K\bar K}|$ by taking into account the experimental result that $\Delta\Gamma_K/\Delta m_K \approx -2$, and that, given that the CP violating effects are experimentally determined to be small, $|\Delta\Gamma_K/\Delta m_K| \simeq |\Gamma_{K\bar K}/M_{K\bar K}|$. Then one obtains, in the phase convention where $\Gamma_{K\bar K}$ is real,

$$ {\rm Re}(\varepsilon_K) = \frac{1}{2}\delta_L \simeq \frac{{\rm Im}(M_{K\bar K})}{2\Delta m_K}, \qquad (10.34) $$

where we connect the commonly used CP violating parameter $\varepsilon_K$ to our notations. Thus, to calculate the theoretical prediction of ${\rm Re}(\varepsilon_K)$, we need to obtain $M_{K\bar K}$.

10.2.3 CP violation in interference of decays with and without mixing

CP violation in interference of decays with and without mixing corresponds to

$$ \frac{{\rm Im}(\lambda_f)}{1 + |\lambda_f|^2} \neq 0. \qquad (10.35) $$

A particularly simple case is the CP asymmetry in decays into final CP eigenstates. Moreover, a situation that is relevant in many cases is when the effects of CP violation in decay are negligible, $|\bar A_{f_{CP}}/A_{f_{CP}}| \simeq 1$, and the effects of CP violation in mixing are small, $|q/p| \simeq 1$. In this case, $\lambda_{f_{CP}}$ is a pure phase, $|\lambda_{f_{CP}}| = 1$. Further consider the case where $y = 0$. We obtain the very simple result:

$$ \mathcal{A}_{f_{CP}}(t) \equiv \frac{\Gamma[\bar B^0(t) \to f_{CP}] - \Gamma[B^0(t) \to f_{CP}]}{\Gamma[\bar B^0(t) \to f_{CP}] + \Gamma[B^0(t) \to f_{CP}]} = {\rm Im}(\lambda_{f_{CP}}) \sin(\Delta m_B t). \qquad (10.36) $$

Using Eq. (10.20), we obtain, for $|\Gamma_{12}/M_{12}| \ll 1$,

$$ {\rm Im}(\lambda_{f_{CP}}) = {\rm Im}\left(\frac{M_{12}^*}{|M_{12}|}\,\frac{\bar A_{f_{CP}}}{A_{f_{CP}}}\right) = -\sin(\phi_M + 2\phi_1). \qquad (10.37) $$

The phase $\phi_M$ is defined in Eq. (10.26), while the phase $\phi_1$ is defined in Eq. (10.24), and we assume that $a_2$ can be neglected.
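Eq. (10.36) follows from the general rates in Eqs. (10.21) and (10.22) once $|\lambda_{f_{CP}}| = 1$ and $y = 0$ are imposed. The sketch below builds the asymmetry from the general rates and checks this limit numerically. The function names, the value of the phase and the sample time are illustrative assumptions; the relation $|\bar A_f|^2 = |A_f|^2 |\lambda_f|^2/|q/p|^2$ used to fix the relative normalization follows from Eq. (10.20).

```python
import cmath
import math

def bracket(lam, x, y, tau):
    """Right-hand side of Eq. (10.21), with tau = Gamma * t."""
    return ((1 + abs(lam) ** 2) * math.cosh(y * tau)
            + (1 - abs(lam) ** 2) * math.cos(x * tau)
            + 2 * lam.real * math.sinh(y * tau)
            - 2 * lam.imag * math.sin(x * tau))

def cp_asymmetry(lam, x, y, tau, abs_q_over_p=1.0):
    """[Gamma(B0bar(t)->f) - Gamma(B0(t)->f)] / [sum], from Eqs. (10.21) and (10.22)."""
    rate_b = bracket(lam, x, y, tau)                    # units of exp(-tau) N_f |A_f|^2
    # |Abar_f|^2 / |A_f|^2 = |lambda_f|^2 / |q/p|^2, from the definition (10.20).
    rate_bbar = (abs(lam) / abs_q_over_p) ** 2 * bracket(1.0 / lam, x, y, tau)
    return (rate_bbar - rate_b) / (rate_bbar + rate_b)

# Hypothetical pure-phase lambda = -exp(-2i*beta), with an illustrative beta.
beta = 0.38
lam = -cmath.exp(-2j * beta)
x, tau = 0.775, 1.3                                     # x for the B0 system from Table 10.1
print(cp_asymmetry(lam, x, 0.0, tau), math.sin(2 * beta) * math.sin(x * tau))
```

Keeping $y \neq 0$ or $|\lambda_f| \neq 1$ in the same function shows how the simple sine form gets modified.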

10.2.4 Indirect CP violation

A different classification of CP violation in meson decays is the following. When one can choose

a phase convention where all CP violating effects can be described by a phase in M12, we have

indirect CP violation. When this is impossible, we have direct CP violation. Thus, when we

observe CP violation in decay, or when two different final states give two different values of CP

violation in the interference of decays with and without mixing, we have direct CP violation. When

we observe CP violation in mixing, or a situation where all asymmetries resulting from interference

of decays with and without mixing have the same value, we have indirect CP violation.

In the neutral D system, one can isolate the effects of indirect CP violation. Consider the decay rates in Eqs. (10.21) and (10.22). The mixing processes modify the time dependence from a pure exponential. However, given the small values of x and y, the time dependencies can be recast, to a good approximation, into purely exponential form, but with modified decay-rate parameters. Take, for example, the $K^+K^-$ final state:

$$ \Gamma_{D^0 \to K^+K^-} = \Gamma \times [1 + |q/p|(y\cos\phi_D - x\sin\phi_D)], $$
$$ \Gamma_{\bar D^0 \to K^+K^-} = \Gamma \times [1 + |p/q|(y\cos\phi_D + x\sin\phi_D)], \qquad (10.38) $$

where we defined $\phi_D$ via $\lambda_{K^+K^-} = -|q/p| \exp(i\phi_D)$. Then, we can define a CP violating parameter,

$$ A_\Gamma \equiv \frac{\Gamma_{D^0 \to K^+K^-} - \Gamma_{\bar D^0 \to K^+K^-}}{2\Gamma} = \frac{y}{2}\left(\left|\frac{q}{p}\right| - \left|\frac{p}{q}\right|\right)\cos\phi_D - \frac{x}{2}\left(\left|\frac{q}{p}\right| + \left|\frac{p}{q}\right|\right)\sin\phi_D. \qquad (10.39) $$

The y-dependent term represents CP violation in mixing, while the x-dependent term represents CP violation in the interference of decays with and without mixing. Both effects and, consequently, $A_\Gamma$, represent indirect CP violation.

Present measurements of $A_\Gamma$ are consistent with zero,

$$ A_\Gamma = (-0.06 \pm 0.04) \times 10^{-2}. \qquad (10.40) $$

10.3 SM calculations

10.3.1 M12

We now explain how the theoretical calculation of the mixing parameters is done. Our focus is on

∆m. We present the SM calculation, but the tools that we develop can be used in a large class of

models.

For the sake of concreteness, we discuss in this section the neutral B meson system. The operator $M_{12}$ is given, within the SM, by $C_{\rm SM}(\bar d_L\gamma_\mu b_L)(\bar d_L\gamma^\mu b_L)$, where $C_{\rm SM}$ is the Wilson coefficient. The matrix element is given by

$$ M_{B\bar B} = \frac{C_{\rm SM}}{2m_B}\langle B^0|(\bar d_L\gamma_\mu b_L)(\bar d_L\gamma^\mu b_L)|\bar B^0\rangle. \qquad (10.41) $$

The mass splitting is given by

$$ \Delta m_B = 2|M_{B\bar B}|, \qquad (10.42) $$

so that, within the SM, we have

$$ \Delta m_B = -\frac{1}{3} m_B B_B f_B^2 C_{\rm SM}, \qquad (10.43) $$

where we parameterized the hadronic matrix element as $\langle B^0|(\bar d_L\gamma_\mu b_L)(\bar d_L\gamma^\mu b_L)|\bar B^0\rangle = -\frac{1}{3} m_B^2 B_B f_B^2$. (Lattice calculations give $\sqrt{B_B}\, f_B \approx 0.22$ GeV.)

Our task is then to calculate $C_{\rm SM}$. Since the operator in Eq. (10.41) is an FCNC operator, within the SM it cannot be generated at tree level. The one loop diagrams that generate it are called “box diagrams”. They are displayed in Fig. 10.1. The calculation of the box diagrams gives, to a good approximation,

$$ M_{B\bar B} = \frac{G_F^2}{12\pi^2}\, m_B m_W^2 (B_B f_B^2)\, S_0(x_t)\, (V_{tb}V_{td}^*)^2, \qquad (10.44) $$

where $x_t = m_t^2/m_W^2$. A few comments are in order:

1. The box diagrams have two W -boson propagators, which yield the G2F factor.

2. The box diagrams have two up-type quark (i and j) propagators, yielding six different combinations: ij = uu, cc, tt, uc, ut, ct. Each such diagram depends on a different combination of CKM elements and quark masses, $(V_{ib}V_{id}^*)(V_{jb}V_{jd}^*)\,F(m_i^2/m_W^2, m_j^2/m_W^2)$.

3. The unitarity of the CKM matrix implies that any $(m_i, m_j)$-independent terms vanish.

4. The three CKM combinations $V_{id}^* V_{ib}$ are comparable in size. (They are all cubic in the Wolfenstein parameter λ.)

5. The six kinematic functions $F(m_i^2/m_W^2, m_j^2/m_W^2)$ are very different in size. In particular, $S_0(x_t) = F(x_t, x_t)$ is the largest.

6. The conclusion of the last two statements is that the dominant contribution comes from the

box diagram with two top-quark propagators. In the physicists’ jargon we say that MBB is

dominated by the top-quark.

The function $S_0(x_t)$ is quadratically sensitive to $m_t$. Similar to the EWPM, this non-decoupling effect is related to the fact that the larger the top mass, the stronger its Yukawa coupling. When $\Delta m_B$ was first measured, the top quark had not yet been discovered, and one could use Eq. (10.44) to predict (correctly!) the top mass. At present, when the top mass is known (yielding $S_0(x_t) \approx 2.36$), Eq. (10.44) serves to constrain the CKM combination $|V_{tb}V_{td}^*|$.

Let us comment on the calculation of $\Delta m_P$ in the other meson systems:

1. $\Delta m_K$: Due to the CKM structure, it is dominated by the charm quark in the loop. Consequently, $\Delta m_K$ is GIM-suppressed by a factor of $m_c^2/m_W^2$ (see the discussion in Section 12). The lightness of the charm quark implies also considerably larger theoretical uncertainties in the calculation compared to $\Delta m_B$.

2. $\Delta m_D$: Due to the CKM structure, the contributions involving the bottom quark are suppressed. The calculation of the box diagrams with intermediate down and strange quarks is not a good approximation to $\Delta m_D$.

3. $\Delta m_{B_s}$: The calculation goes along very similar lines to that of $\Delta m_B$. In the ratio $\Delta m_B/\Delta m_{B_s}$, much of the uncertainty in the calculation of the hadronic matrix elements cancels out, providing an excellent measurement of $|V_{td}/V_{ts}|$.

Figure 10.1: A box diagram that generates an operator that can lead to a $B \leftrightarrow \bar B$ transition. (Quark-line labels in the figure: b, d, $u_i$, $u_j$.)

Finally, let us mention the calculation of $\Gamma_{P\bar P}$. An estimate of it can be made by calculating the on-shell part of the box diagram. Yet, since the intermediate quarks are light and on-shell, QCD effects are important, and the theoretical uncertainties in the calculation of $\Gamma_{P\bar P}$ are large.
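As a rough numerical illustration of how Eq. (10.44) constrains $|V_{tb}V_{td}^*|$, the sketch below inverts $\Delta m_B = 2|M_{B\bar B}|$ using $\sqrt{B_B} f_B \approx 0.22$ GeV, $S_0(x_t) \approx 2.36$, and the B0 entries of Table 10.1. The values of $G_F$ and $m_W$ are standard inputs that are not quoted in this section and are assumptions here. Eq. (10.44) holds only "to a good approximation" and the sketch includes no QCD corrections and no uncertainties, so the output should be read as an order-of-magnitude estimate only.

```python
import math

# Inputs. G_F and m_W are assumed standard values (not quoted in this section);
# the remaining numbers are taken from the text and from Table 10.1.
G_F = 1.1664e-5        # GeV^-2 (assumption)
M_W = 80.4             # GeV    (assumption)
M_B = 5.28             # GeV, Table 10.1
SQRT_BB_FB = 0.22      # GeV, lattice value quoted in the text
S0_XT = 2.36           # value quoted in the text

def abs_M_BB(ckm):
    """|M_BB| from Eq. (10.44) for a given |V_tb V_td^*|."""
    return (G_F ** 2 / (12 * math.pi ** 2)) * M_B * M_W ** 2 * SQRT_BB_FB ** 2 * S0_XT * ckm ** 2

def ckm_from_mass_splitting(delta_m):
    """Invert Delta m_B = 2 |M_BB| (Eq. (10.42)) for |V_tb V_td^*|."""
    return math.sqrt(delta_m / (2 * abs_M_BB(1.0)))

delta_m_B = 0.775 * 4.33e-13   # Delta m = x * Gamma, with x and Gamma from Table 10.1
print(f"|V_tb V_td^*| ~ {ckm_from_mass_splitting(delta_m_B):.1e}")
```

The result comes out at the level of several times $10^{-3}$. A quantitative extraction would need the corrections and uncertainties that this sketch leaves out.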

10.3.2 CP violation in decay: D → K+K−

We give here an example of the SM contribution to CP violation in decay in the D → K+K− mode.

This decay proceeds via the quark transition $c \to s\bar s u$. Within the SM, there are contributions from both tree ($t$) and penguin ($p^q$, where $q = d, s, b$ is the quark in the loop) diagrams. Factoring out the CKM dependence, we have

$$ A_{K^+K^-} = (V_{cs}^* V_{us})\, t_{KK} + \sum_{q=d,s,b} (V_{cq}^* V_{uq})\, p^q_{KK} . \qquad (10.45) $$

Using CKM unitarity, $A_{K^+K^-}$ can be written in terms of just two CKM combinations:

$$ A_{K^+K^-} = (V_{cs}^* V_{us})\, T_{KK} + (V_{cb}^* V_{ub})\, P^b_{KK}, \qquad (10.46) $$

where $T_{KK} = t_{KK} + p^s_{KK} - p^d_{KK}$ and $P^b_{KK} = p^b_{KK} - p^d_{KK}$. CP violating phases appear only in the CKM elements, so that

$$ \frac{A_{K^+K^-}}{\bar A_{K^+K^-}} = \frac{(V_{cs}^* V_{us})\, T_{KK} + (V_{cb}^* V_{ub})\, P^b_{KK}}{(V_{cs} V_{us}^*)\, T_{KK} + (V_{cb} V_{ub}^*)\, P^b_{KK}} . \qquad (10.47) $$

Due to CKM suppression and loop suppression, we expect the $P^b_{KK}$-related contribution to be much smaller than the $T_{KK}$-related contribution, and thus the contribution from CP violation in decay to the CP asymmetry is given by

$$ \mathcal{A}^d_{K^+K^-} \approx -2\,{\rm Im}\left(\frac{P^b_{KK}}{T_{KK}}\right) \frac{|V_{cb}^* V_{ub}|}{|V_{cs}^* V_{us}|} \sin\gamma, \qquad (10.48) $$

where γ is defined in Eq. (8.77). The superscript d on $\mathcal{A}^d_{K^+K^-}$ denotes that we include here only the contribution from CP violation in decay.
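The steps from Eq. (10.45) to Eq. (10.48) can be spelled out as follows. This is a sketch of the algebra, under the assumption (not stated explicitly above) that in the standard CKM phase convention the relative weak phase $\arg[(V_{cb}^* V_{ub})/(V_{cs}^* V_{us})]$ equals $-\gamma$ to a good approximation.

```latex
% Orthogonality of the u and c rows of the CKM matrix:
V_{cd}^* V_{ud} + V_{cs}^* V_{us} + V_{cb}^* V_{ub} = 0
\quad\Longrightarrow\quad
V_{cd}^* V_{ud} = -V_{cs}^* V_{us} - V_{cb}^* V_{ub} .

% Substituting into Eq. (10.45) eliminates the q = d CKM combination:
A_{K^+K^-} = (V_{cs}^* V_{us})\,\bigl(t_{KK} + p^s_{KK} - p^d_{KK}\bigr)
           + (V_{cb}^* V_{ub})\,\bigl(p^b_{KK} - p^d_{KK}\bigr) ,

% which is Eq. (10.46). Comparing with Eq. (10.24), identify
r_f = \left|\frac{P^b_{KK}}{T_{KK}}\right| \frac{|V_{cb}^* V_{ub}|}{|V_{cs}^* V_{us}|}, \qquad
\delta_f = \arg\!\left(\frac{P^b_{KK}}{T_{KK}}\right), \qquad
\phi_f = \arg\!\left(\frac{V_{cb}^* V_{ub}}{V_{cs}^* V_{us}}\right) \simeq -\gamma .

% Eq. (10.29) with |P/T| \sin\delta_f = \mathrm{Im}(P/T) then gives Eq. (10.48):
\mathcal{A}^d_{K^+K^-} \approx 2\, r_f \sin\phi_f \sin\delta_f
 = -2\, \mathrm{Im}\!\left(\frac{P^b_{KK}}{T_{KK}}\right)
   \frac{|V_{cb}^* V_{ub}|}{|V_{cs}^* V_{us}|}\, \sin\gamma .
```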

The CKM parameters are known, and generate a suppression factor of $O(10^{-3})$. The factor of ${\rm Im}(P^b_{KK}/T_{KK})$ depends on the relative size of the penguin and tree contributions, as well as the relative strong phase. Both ingredients arise from QCD dynamics at the scale of $m_D$. At present, there is no rigorous way to calculate this factor. Thus, one cannot use a measurement of $\mathcal{A}_{K^+K^-}$ to extract, for example, the value of the CP violating phase γ.

10.3.3 CP violation in mixing: K → ℓνπ

We give here an example of the SM contribution to CP violation in $K^0$−$\bar K^0$ mixing, represented by ${\rm Re}(\varepsilon_K)$. As explained above, we need to obtain the SM contribution to $M_{K\bar K}$. Similarly to the neutral B system, this contribution comes from box diagrams with intermediate up-type quarks, leading to

$$ M_{K\bar K} = \frac{G_F^2 m_W^2}{12\pi^2}\, m_K (B_K f_K^2) \left[ S_0(x_c)(V_{cs}V_{cd}^*)^2 + S_0(x_t)(V_{ts}V_{td}^*)^2 + S_0(x_c, x_t)(V_{cs}V_{cd}^* V_{ts}V_{td}^*) \right], \qquad (10.49) $$

where $x_c = m_c^2/m_W^2$. In contrast to the case of $M_{B\bar B}$ in Eq. (10.44), in the neutral K system, $M_{K\bar K}$ is dominated by the charm quark. The reason is that, of the three relevant CKM combinations, the top-related one is highly suppressed: $|V_{td}^* V_{ts}| \sim \lambda^5$ compared to $|V_{cd}^* V_{cs}| \simeq \lambda$. Thus, $\Delta m_K$ is dominated by the charm quark. We used this fact when writing down Eq. (??). The pure charm contribution to ${\rm Im}(M_{12})$ is, however, highly suppressed, and the top quark is dominant in ${\rm Re}(\varepsilon_K)$.

10.3.4 CP violation in interference of decays with and without mixing:

B → ψKS

We give here an example of the SM contribution to CP violation in the interference of decays with

and without mixing in the B → ψKS mode. This is often called “the golden mode” with regard

to CP violation as its theoretical calculation is uniquely clean of hadronic uncertainties. In fact,

the CP asymmetry can be translated into a value of sin 2β [β is defined in Eq. (8.77)] with a

theoretical uncertainty smaller than one percent.

For the neutral B meson system, $|\Gamma_{B\bar B}/M_{B\bar B}| \ll 1$ holds. From Eq. (10.44) we obtain

$$ \frac{M_{B\bar B}^*}{|M_{B\bar B}|} = \frac{V_{tb}^* V_{td}}{V_{tb} V_{td}^*} . \qquad (10.50) $$

The $B \to \psi K$ decay proceeds via a $b \to c\bar c s$ transition:

$$ A_{\psi K} = (V_{cb}^* V_{cs})\, T_{\psi K} + (V_{ub}^* V_{us})\, P^u_{\psi K} . \qquad (10.51) $$

The second term is CKM and loop suppressed, and can be safely neglected. Since $B^0$ decays into $\psi K^0$ while $\bar B^0$ decays into $\psi \bar K^0$, an additional phase from $K^0$−$\bar K^0$ mixing, $(V_{cd}^* V_{cs})/(V_{cd} V_{cs}^*)$, enters the calculation of $\bar A_{\psi K_S}/A_{\psi K_S}$:

$$ \frac{\bar A_{\psi K_S}}{A_{\psi K_S}} = -\frac{V_{cb} V_{cd}^*}{V_{cb}^* V_{cd}} . \qquad (10.52) $$

Combining Eq. (10.50) and Eq. (10.52), we obtain

$$ \lambda_{\psi K_S} = -e^{-2i\beta} \;\Longrightarrow\; {\rm Im}(\lambda_{\psi K_S}) = \sin 2\beta. \qquad (10.53) $$

This demonstrates the power of CP asymmetries in measuring CKM parameters. The experimental measurement of ${\rm Im}(\lambda_{\psi K_S})$ translates directly into the value of a CKM parameter, β, without any hadronic parameters. A crucial role is played by the CP symmetry of the strong interactions. The size and the phase of the amplitude $T_{\psi K}$ cannot be calculated, but it is the same in the CP conjugate amplitudes $A_{\psi K_S}$ and $\bar A_{\psi K_S}$ and therefore cancels out when their ratio is taken.

Part III

Testing the SM


Chapter 11

Electroweak Precision Measurements

11.1 The SM beyond tree level

The SM is not a full theory of Nature. It is only a low energy effective theory, valid below some scale $\Lambda \gg m_Z$. Then, the SM Lagrangian should be extended to include all non-renormalizable terms, suppressed by powers of Λ:

$$ \mathcal{L} = \mathcal{L}_{\rm SM} + \frac{1}{\Lambda} O_{d=5} + \frac{1}{\Lambda^2} O_{d=6} + \cdots , \qquad (11.1) $$

where $O_{d=n}$ represents operators that are products of SM fields, transforming as singlets under the SM gauge group, of overall dimension n in the fields. For physics at an energy scale E well below Λ, the effects of operators of dimension n > 4 are suppressed by $(E/\Lambda)^{n-4}$. Thus, in general, the higher the dimension of an operator, the smaller its effect at low energies.

In previous sections, we studied the gauge sector of the SM at tree level and with only renor-

malizable terms. We can classify the effects of including loop corrections and nonrenormalizable

terms into three broad categories:

1. Forbidden processes: Various processes are forbidden by the accidental symmetries of the

Standard Model. Nonrenormalizable terms (but not loop corrections!) can break these

accidental symmetries and allow the forbidden processes to occur. Examples include neutrino

masses and proton decay.

2. Rare processes: Various processes are not allowed at tree level. These effects can often be

related to accidental symmetries that hold within a particular sector of, but not in the entire,

SM. Here both loop corrections and nonrenormalizable terms can contribute. Examples

include flavor changing neutral current (FCNC) processes.

3. Tree level processes: Often tree level processes in a particular sector depend on a small subset of the SM parameters. This situation leads to relations among different processes within this sector. These relations are violated by both loop effects and nonrenormalizable terms. Here, precision measurements and precision theory calculations are needed to observe these small effects. Examples include electroweak precision measurements (EWPM).

As concerns the last two types of effects, where loop corrections and nonrenormalizable terms may both contribute, their use in phenomenology can be divided into two eras. Before all the SM particles had been directly discovered and all the SM parameters measured, one could assume the validity of the renormalizable SM and indirectly measure the properties of the yet unobserved SM particles. Indeed, the charm quark, the top quark and the Higgs boson masses were predicted in this way. Once all the SM particles have been observed and the parameters measured directly, the loop corrections can be quantitatively determined, and effects of nonrenormalizable terms can be unambiguously probed. Thus, at present, all three classes of processes serve to search for new physics.

11.2 Electroweak Precision Measurements (EWPM)

Consider a situation where a class of processes is described at tree level by only one sector of a

theory, and where this sector depends on only a small number of parameters. If the number of

observables is larger than the number of parameters, relations among the observables are predicted.

At the quantum level, however, the tree level relations are violated, and the processes depend on

all parameters of the theory. In some cases, the tree level predictions follow from a symmetry

which is respected by the relevant sector but not by other sectors of the theory. Violations of such

symmetry-based relations are particularly sensitive to loop effects. These features of quantum

theories are taken advantage of in the program of EWPM.

At tree level, all (flavor diagonal) electroweak processes depend on only three parameters of the

renormalizable SM Lagrangian. In the language of the SM Lagrangian, the three parameters are

g, g′ and v. It is convenient for our purposes to work with the combinations of these parameters

that are best measured: α, mZ and GF . The number of relevant observables is much larger than

three. Thus, at tree level, a large number of relations among these observables are predicted.

These predictions are, however, violated by SM loop effects, and possibly by nonrenormalizable

operators that are generated by beyond SM (BSM) physics.

The full SM has eighteen parameters. The EWPM allow us to probe some of the additional

fifteen parameters through their modification of the tree level relations. Eleven of the fifteen

parameters (eight of the Yukawa couplings and the three CKM mixing angles) are small, and

consequently have negligible effects on deviations from the tree level relations. The four large

parameters are the Kobayashi-Maskawa (KM) phase, the strong coupling constant, the Higgs self-

coupling and the top Yukawa coupling. The KM phase has negligible effects on flavor diagonal

processes. As concerns the strong coupling constant, its universality and the fact that the electroweak vector bosons do not couple directly to gluons combine to make its effect on the relevant parameters very small. Thus, in practice, there are only two SM parameters that have significant effects on the EWPM: $m_t/v$ and $m_h/v$. In the past, when these masses had not yet been directly measured, the EWPM were used to determine their values. Now that the top quark and the Higgs boson have been discovered and their masses are known from direct measurements, the EWPM are used to probe nonrenormalizable operators, that is, BSM physics.

As concerns the experimental aspects of the EWPM program, the relevant processes can be divided into two classes: low-energy and high-energy. The “low-energy” observables involve processes with a characteristic energy scale well below $m_W$ and $m_Z$, so that the intermediate W-boson or Z-boson is far off-shell. The “high-energy” observables are measured in processes where the W-boson or the Z-boson is on-shell. The low-energy EWPM include measurements of $G_F$ and α, as well as data from neutrino scattering, deep inelastic scattering (DIS), atomic parity violation (APV) and low energy $e^+e^-$ scattering. The high-energy EWPM include measurements of the masses, the total widths and partial decay widths of the W and Z bosons. A summary of the current data on the relevant observables is given in Fig. ???. In the appendix we discuss some examples of these observables in detail and derive their dependence on the SM parameters.

11.3 The weak mixing angle, θW

As an example of the way EWPM can be used, we consider three definitions of the weak angle. Each

definition involves a different set of observables. At tree level, all three definitions are equivalent,

and correspond to

\[ \tan\theta_{\rm tree} \equiv \frac{g'}{g}. \tag{11.2} \]

At the one loop level, however, they differ.

(i) Definition in terms of α, GF and mZ :

\[ \sin^2 2\theta_0 \equiv \frac{4\pi\alpha(m_Z)}{\sqrt{2}\,G_F\,m_Z^2}. \tag{11.3} \]

Quantitatively, θ0 is defined in terms of the best measured observables and thus has the

smallest experimental uncertainties.

(ii) Definition in terms of mW and mZ :

\[ \sin^2\theta_W \equiv 1 - \frac{m_W^2}{m_Z^2}. \tag{11.4} \]

This definition is based on the tree level relation ρ = 1 discussed earlier.


(iii) Definition in terms of gV and gA:

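\[ \sin^2\theta_*^i \equiv \frac{g_A^i - g_V^i}{2Q_i}, \tag{11.5} \]

where i is a flavor index that is not summed, and where $g^i_{V,A} = g^i_L \pm g^i_R$ refer to the Z
couplings to fermions in Eq. (7.60),

\[ \mathcal{L}_{Z\psi\psi} \propto \bar\psi_i\,\gamma^\mu\left(g_V^i - g_A^i\gamma_5\right)\psi_i\,Z_\mu. \tag{11.6} \]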

In principle we have here nine different definitions of θ∗, one for each charged fermion type.

(The Zνν coupling does not involve θW .)
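To make the first two definitions concrete, here is a minimal Python sketch that evaluates Eq. (11.3) and Eq. (11.4) numerically. The input values (`ALPHA_MZ`, `G_F`, `M_Z`, `M_W`) are illustrative assumptions and are not quoted in the text; the function names are likewise hypothetical.

```python
import math

# Illustrative inputs (assumptions, not values quoted in the text).
ALPHA_MZ = 1.0 / 128.9   # running QED coupling at the Z mass
G_F = 1.1663787e-5       # Fermi constant in GeV^-2
M_Z = 91.1876            # Z mass in GeV
M_W = 80.379             # W mass in GeV


def sin2_theta_0(alpha, g_f, m_z):
    """Solve Eq. (11.3): sin^2(2 theta_0) = 4 pi alpha / (sqrt(2) G_F m_Z^2)."""
    sin2_2theta = 4.0 * math.pi * alpha / (math.sqrt(2.0) * g_f * m_z**2)
    # sin^2(theta) = (1 - cos(2 theta)) / 2, on the branch theta < 45 degrees.
    return 0.5 * (1.0 - math.sqrt(1.0 - sin2_2theta))


def sin2_theta_w(m_w, m_z):
    """Eq. (11.4): sin^2(theta_W) = 1 - m_W^2 / m_Z^2."""
    return 1.0 - (m_w / m_z) ** 2


s2_0 = sin2_theta_0(ALPHA_MZ, G_F, M_Z)
s2_w = sin2_theta_w(M_W, M_Z)
print(f"sin^2(theta_0) = {s2_0:.4f}")
print(f"sin^2(theta_W) = {s2_w:.4f}")
print(f"difference     = {s2_0 - s2_w:+.4f}")
```

With these assumed inputs the two numbers differ at the percent level, which is the size of effect that the loop corrections discussed in this section are meant to account for.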

For any given model, one can compute all relevant corrections as functions of the model parame-

ters. For pedagogical and practical purposes we would like, however, to work in a simple framework

that represents a large class of models. To do so, we make the following working assumption:

1. The only significant effects are in the electroweak gauge boson propagators. These effects

are called oblique corrections.

We note that this assumption implies that all θi∗ are equal, that is, the relevant one loop effects

are flavor universal. In some models the assumption that all corrections are oblique does not hold,

yet the corrections are flavor universal.

We denote the oblique corrections to the electroweak gauge boson propagators by ΠAB(q2).

The propagators PAB are defined as follows:

\[ P_{AB}(q^2) = \frac{-i}{q^2 - m_A^2}\left[\delta_{AB} + \frac{-i\,\Pi_{AB}(q^2)}{q^2 - m_B^2}\right]. \tag{11.7} \]

Taking charge conservation into account, we learn that there are four ΠAB’s that do not vanish.

We can choose the set ΠWW , ΠZZ , Πγγ and ΠγZ(= ΠZγ), corresponding to the mass eigenstates.

Alternatively, we can use Π+−, Π33, Π00 and Π30, corresponding to the interaction eigenstates.

The transition between the two sets is straightforward.

At q2 = m2A, the relevant Π can be identified as a correction to the mass of the corresponding

gauge boson. Thus, gauge invariance (specifically, m2γ = 0) guarantees that

Πγγ(q2 = 0) = ΠγZ(q2 = 0) = 0. (11.8)

To proceed further, we expand the ΠAB’s that are relevant for low energy observables, $\Pi_{AB}(q^2 \ll m_W^2)$, in $q^2$, and make one more working assumption:

2. Terms of order (q2)2 and higher can be neglected in low energy observables:

\[ \Pi_{AB}(q^2 \ll m_W^2) = \Pi_{AB}(0) + q^2\,\Pi'_{AB}(0), \qquad \Pi'(q^2) \equiv \frac{d\Pi(q^2)}{dq^2}. \tag{11.9} \]


How do the various corrections to the propagators affect the three differently-defined θ’s? We

define ∆ sin2 θA ≡ sin2 θA − sin2 θtree and obtain [Peskin]:

(i) Corrections to θ0 [See (21.131) of Peskin]:

\[ \Delta\sin^2\theta_0 = \frac{\sin^2 2\theta}{4\cos 2\theta}\left[\Pi'_{\gamma\gamma}(0) + \frac{\Pi_{WW}(0)}{m_W^2} - \frac{\Pi_{ZZ}(m_Z^2)}{m_Z^2}\right]. \tag{11.10} \]

The three terms correspond to corrections to α, GF and mZ , respectively.

(ii) Corrections to θW [See (21.128) of Peskin]:

\[ \Delta\sin^2\theta_W = -\frac{1}{m_Z^2}\left[\Pi_{WW}(m_W^2) - \frac{m_W^2}{m_Z^2}\,\Pi_{ZZ}(m_Z^2)\right]. \tag{11.11} \]

The two terms correspond to corrections to the tree level heavy gauge boson masses,

\[ \Delta m_W^2 = \Pi_{WW}(m_W^2), \qquad \Delta m_Z^2 = \Pi_{ZZ}(m_Z^2). \tag{11.12} \]

(iii) Corrections to θ∗ [See (21.127) of Peskin]:

\[ \Delta\sin^2\theta_* = -\sin\theta\cos\theta\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2}. \tag{11.13} \]

The corrections arise from the mixing between the off-shell photon and the Z-boson.

We learn that the loop effects are different for the three definitions. Once we extract the values of

sin2 θ from each of the three sets of observables, we can probe these effects.

11.3.1 θ within the SM

Our analysis so far applies to all models with only oblique corrections. In this subsection we discuss

the specific case of the SM. We make the following approximation:

3. We consider only one loop diagrams involving the top and bottom quarks.

Within the SM, these diagrams provide the leading oblique corrections. We thus need to calculate

diagrams of the general form

Plot of the diagram (11.14)

The relevant diagrams will be added to this text, but for now you can just find them in Fig. 21.12

of Peskin’s book. We do not reproduce the details of the calculation here, but we make one

comment regarding the finiteness of the results. Naively, each diagram is quadratically divergent.

Ward identities prevent, however, the quadratic divergences, leaving only logarithmic ones. The

logarithmic divergences appear in each of the ∆ sin2 θ, but they cancel in the differences between


any two observable quantities. The final results (see Eq. (21.157) of Peskin) are the following (we

approximate mb = 0):

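\[ \sin^2\theta_0 - \sin^2\theta_* = \frac{3\alpha}{16\pi\cos^2 2\theta}\,\frac{m_t^2}{m_Z^2}, \qquad
\sin^2\theta_W - \sin^2\theta_* = \frac{-3\alpha}{16\pi\sin^2\theta}\,\frac{m_t^2}{m_Z^2}. \tag{11.15} \]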

The factor of α/(16π) is typical of electroweak one-loop effects. The factor of 3 is the color factor

of the quarks in the loop. The θ dependence is different in the two cases, reflecting the specific

combination of observables involved in each case. The factor of $m_t^2/m_Z^2$ deserves a more detailed

discussion, which we now turn to.

Naively, quadratic dependence on the top mass is puzzling since it seems to violate the so-called

decoupling theorem. This theorem states that the effect of heavy states on low energy observables

must go to zero as their mass goes to infinity. The intuition behind this theorem is straightforward.

The heavier a state is, the smaller its effects (when off-shell) become. This can be understood based

on the uncertainty principle, or on second order perturbation theory, or simply by considering the

form of propagators in QFT. Why doesn’t this theorem apply to the top contribution to EWPM?

The solution to the puzzle lies in the fact that the SM quarks acquire their masses from the Higgs

mechanism. Consequently, their Yukawa couplings are proportional to their masses. The heavier

the top, the stronger its Yukawa coupling becomes. Indeed, the top-related loop corrections to

EWPM depend on the top coupling to the longitudinal W and Z, which is its Yukawa coupling.

The quadratic dependence on the top mass reflects the proportionality of the loop corrections to

the top Yukawa coupling, and not to its mass. (In fact, the mt →∞ limit cannot be taken, because

perturbation theory does not hold anymore.)

Eqs. (11.15) demonstrate the main point of this section: One can use the observables of the

EWPM to determine parameters outside the pure electroweak sector. Using the measured values of

the observables on the left-hand side of these equations, one can, first, determine mt and, second,

test the SM by examining whether the ranges allowed for mt from the two sets of observables

overlap.
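The extraction of mt described above can be sketched numerically. The following minimal Python example uses only the second relation of Eq. (11.15), evaluates the predicted shift for a trial top mass, and then inverts the relation. The inputs `ALPHA`, `SIN2_THETA` and `M_Z` are illustrative assumptions, and the function names are hypothetical.

```python
import math

# Illustrative inputs (assumptions, not values quoted in the text).
ALPHA = 1.0 / 128.9   # QED coupling near the Z mass
SIN2_THETA = 0.231    # reference value of sin^2(theta)
M_Z = 91.1876         # Z mass in GeV


def shift_w_minus_star(m_t):
    """Second relation of Eq. (11.15): sin^2(theta_W) - sin^2(theta_*), with m_b = 0."""
    return -3.0 * ALPHA / (16.0 * math.pi * SIN2_THETA) * (m_t / M_Z) ** 2


def top_mass_from_shift(shift):
    """Invert the relation above for m_t, given a (negative) shift."""
    if shift >= 0.0:
        raise ValueError("the top-loop shift in Eq. (11.15) is negative")
    return M_Z * math.sqrt(-shift * 16.0 * math.pi * SIN2_THETA / (3.0 * ALPHA))


shift = shift_w_minus_star(173.0)
m_t_recovered = top_mass_from_shift(shift)
print(f"shift = {shift:+.4f}, recovered m_t = {m_t_recovered:.1f} GeV")
```

Because the shift scales as mt squared, a fractional uncertainty on the measured difference translates into half that fractional uncertainty on mt.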

Additional oblique corrections of interest come from loop diagrams with an internal Higgs boson.

At one loop level, the mh-dependence is logarithmic (and thus the EWPM are less sensitive to

mh than they are to mt). This result is known as the screening theorem. We do not discuss it in

detail here. Two loop contributions are proportional to m2h, but are small because of the extra loop

factor. We note, however, that the precision of the EWPM was good enough to have sensitivity

to the Higgs-related oblique corrections and provided an allowed range for the Higgs mass well

before it was actually measured. The present allowed range from EWPM (removing all direct

measurements of the Higgs mass, production and decay rates) is [pdg] 60 GeV ≤ mh ≤ 127 GeV

at the 90% CL.


11.4 Custodial symmetry

The SM Higgs potential has an accidental symmetry. This so-called “custodial symmetry” predicts

tree level relations among various observables. The fact that the custodial symmetry is not a

symmetry of the full SM makes EWPM sensitive to the symmetry breaking SM parameters. This

is the topic of this subsection. The fact that the symmetry is accidental makes EWPM sensitive

to nonrenormalizable terms that violate the symmetry. This is one of the topics of the next

subsection.

Consider the SM Higgs potential:

\[ V = \mu^2|\phi|^2 + \lambda|\phi|^4. \tag{11.16} \]

Since φ is a complex, SU(2)L-doublet scalar field, it has four degrees of freedom:

\[ \phi = \begin{pmatrix} \phi_3 + i\phi_4 \\ \phi_1 + i\phi_2 \end{pmatrix}. \tag{11.17} \]

The scalar potential, when written in terms of these four components, depends only on the
combination $\phi_1^2 + \phi_2^2 + \phi_3^2 + \phi_4^2$, and thus has manifestly an SO(4) symmetry. At the algebra level,

SO(4) ∼ SU(2) × SU(2). Out of the six generators, four are also generators of the gauge group

SU(2)L × U(1)Y . The two extra generators are then related to an accidental symmetry of the

scalar sector of the SM.
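The SO(4) invariance can be checked numerically. The sketch below, in plain Python, rotates a random four-component field through each of the six coordinate planes (one per SO(4) generator) and compares the potential of Eq. (11.16) before and after. The parameter values `mu2=-1.0` and `lam=0.5` are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math
import random


def higgs_potential(phi, mu2, lam):
    """V = mu^2 |phi|^2 + lambda |phi|^4 with |phi|^2 = phi1^2 + ... + phi4^2."""
    norm2 = sum(x * x for x in phi)
    return mu2 * norm2 + lam * norm2**2


def rotate_plane(phi, i, j, angle):
    """Rotate components i and j of phi by `angle` (one SO(4) generator)."""
    out = list(phi)
    c, s = math.cos(angle), math.sin(angle)
    out[i] = c * phi[i] - s * phi[j]
    out[j] = s * phi[i] + c * phi[j]
    return out


random.seed(0)
phi = [random.uniform(-1.0, 1.0) for _ in range(4)]

# The six coordinate planes of a four-dimensional space, one per generator.
planes = [(i, j) for i in range(4) for j in range(i + 1, 4)]

rotated = phi
for i, j in planes:
    rotated = rotate_plane(rotated, i, j, random.uniform(0.0, 2.0 * math.pi))

v_before = higgs_potential(phi, mu2=-1.0, lam=0.5)
v_after = higgs_potential(rotated, mu2=-1.0, lam=0.5)
print(len(planes), abs(v_before - v_after) < 1e-12)
```

The count of planes reproduces the six generators mentioned in the text, and the potential is unchanged because each plane rotation preserves the sum of squares.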

The VEV of the Higgs field breaks three of the generators, leaving (within the pure Higgs

sector) an unbroken SU(2) symmetry. This symmetry is called the custodial symmetry. Under

this symmetry, the (W1,W2,W3) DoF transform as a triplet. Consequently, the mass terms induced

by the spontaneous symmetry breaking are equal for these three DoF.

The most general mass matrix in the (W1,W2,W3, B) basis, that is consistent with U(1)EM
gauge invariance, is given by

\[ \begin{pmatrix}
m_W^2 & & & \\
 & m_W^2 & & \\
 & & m_Z^2 c_W^2 & m_Z^2 c_W s_W \\
 & & m_Z^2 c_W s_W & m_Z^2 s_W^2
\end{pmatrix}. \tag{11.18} \]

The custodial symmetry requires that the top three diagonal terms are equal and thus that
$m_Z^2 c_W^2 = m_W^2$, namely the ρ = 1 relation.
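As a numerical illustration, the following Python sketch builds the neutral (W3, B) block of the matrix in Eq. (11.18), checks that it has one massless eigenstate (the photon) and one of mass squared mZ² (the Z), and evaluates ρ when the custodial condition is imposed. The inputs `M_Z` and `SIN2_W` are illustrative assumptions, and ρ is taken to mean mW²/(mZ² cW²), as implied by the relation above.

```python
import math

# Illustrative inputs (assumptions, not values quoted in the text).
M_Z = 91.1876    # Z mass in GeV
SIN2_W = 0.231   # sin^2 of the weak mixing angle


def neutral_block(m_z, sin2_w):
    """Lower-right 2x2 block of Eq. (11.18) in the (W3, B) basis."""
    c_w = math.sqrt(1.0 - sin2_w)
    s_w = math.sqrt(sin2_w)
    m2 = m_z**2
    return [[m2 * c_w**2, m2 * c_w * s_w],
            [m2 * c_w * s_w, m2 * s_w**2]]


def symmetric_2x2_eigenvalues(m):
    """Closed-form eigenvalues (ascending) of a real symmetric 2x2 matrix."""
    mean = 0.5 * (m[0][0] + m[1][1])
    half_gap = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return mean - half_gap, mean + half_gap


def rho(m_w, m_z, sin2_w):
    """rho = m_W^2 / (m_Z^2 cos^2(theta_W))."""
    return m_w**2 / (m_z**2 * (1.0 - sin2_w))


light, heavy = symmetric_2x2_eigenvalues(neutral_block(M_Z, SIN2_W))

# Custodial condition: the W3 W3 entry equals m_W^2.
m_w_custodial = M_Z * math.sqrt(1.0 - SIN2_W)
rho_custodial = rho(m_w_custodial, M_Z, SIN2_W)
print(f"photon mass^2 = {light:.2e}, Z mass^2 = {heavy:.1f}, rho = {rho_custodial:.6f}")
```

The massless eigenvalue reflects the U(1)EM gauge invariance used to restrict the matrix, while ρ = 1 holds only once the three diagonal W entries are set equal.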

The custodial symmetry holds at tree level for models with any number of scalar doublets and
singlets. It is however not a symmetry of the full SM. In particular, it is broken by the Yukawa
coupling since $Y_D \neq Y_U$. The strongest breaking parameter is then $m_t^2 - m_b^2$. This breaking is
communicated to the Higgs sector by loop effects, resulting in violation of the predictions that
follow from the custodial symmetry. This is the reason that the leading correction to the ρ = 1
relation is proportional to $m_t^2 - m_b^2$, see Eq. (11.15).


The custodial symmetry is just an accidental symmetry of the Higgs sector. It is broken

by nonrenormalizable operators of dimension six (and higher). We present these dimension six

operators in the next subsection.

11.5 Probing new physics

Within the SM, the EWPM program is sensitive at tree level to three input parameters and at the

loop level to a few more. Since we have more observables than relevant SM parameters, EWPM

can be used to test the SM. So far, no significant deviation from the SM was found. Furthermore,

as we already mentioned, all the relevant SM parameters are now directly measured, and therefore

the data can be used to constrain BSM physics.

As concerns the probing of new physics with EWPM, one can go in either of two ways. First,

one can consider a specific model, calculate the new contributions to the observables, and constrain

the BSM parameters by comparing to the experimental results (just as we did for the SM itself).

We demonstrate this by a brief discussion of the four generation extension of the Standard Model

in subsection 11.5.2. Second, one can consider the effects of nonrenormalizable terms without

committing to a specific model. This is what we do in some detail in subsection 11.5.1. Under

reasonable assumptions, to be specified below, there is only a small number of dimension six terms

that affect the EWPM. Here we discuss these operators and how they are constrained by EWPM.

11.5.1 NR operators and the q2 expansion

In general there are many operators that affect the EWPM. The number of operators with poten-

tially significant effects is however much smaller in a large class of models that fulfill the following

three conditions:

1. The scale of the new physics is much higher than the electroweak breaking scale, $\Lambda \gg m_W$.

(This condition holds, by definition, for new physics whose effects can be represented by

nonrenormalizable terms.)

2. The new physics generates only oblique corrections to the relevant observables.

3. There are no contributions from new heavy gauge bosons.

(The analysis below applies to a broader class of models than implied by the third condition

[barbieri], but we assume this stronger condition for the sake of simplicity.)

The contributions to oblique corrections from nonrenormalizable terms can come from tree-level

diagrams or from loop diagrams. We consider here only tree level contributions. Consequently, our

analysis concerns only operators that involve exactly two electroweak gauge fields and no fermions.

In addition to the two electroweak gauge fields, the operators can have derivatives and Higgs fields.


Tree level contributions come from replacing the Higgs fields in these operators with their VEV.

The leading contributions (for $\Lambda \gg v$) come from dimension-six operators.

It turns out that there are only four dimension-six terms that contribute to the oblique correc-

tions [5, 3, 6, 4]:

\[ \mathcal{L}_{\rm o.c.} = \frac{1}{\Lambda^2}\left(c_{WB}O_{WB} + c_{HH}O_{HH} + c_{BB}O_{BB} + c_{WW}O_{WW}\right), \tag{11.19} \]

where the cXY ’s are dimensionless coefficients, and the operators are defined as follows (note that

in the literature, a variety of normalizations are used):

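\[ \begin{aligned}
O_{WB} &= (H^\dagger\tau^a H)\,W^a_{\mu\nu}B^{\mu\nu} \;\to\; \frac{1}{2}v^2\,W^3_{\mu\nu}B^{\mu\nu}, \\
O_{HH} &= |H^\dagger D_\mu H|^2 \;\to\; \frac{1}{16}v^4\,(gW^3_\mu - g'B_\mu)^2, \\
O_{BB} &= (\partial_\rho B_{\mu\nu})^2, \\
O_{WW} &= (D_\rho W^a_{\mu\nu})^2.
\end{aligned} \tag{11.20} \]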

The symmetry properties of these operators are important. In particular, the way they can be

probed by the EWPM depends on whether their contributions to the oblique corrections violate

both the electroweak gauge symmetry and the custodial symmetry, or just the electroweak symme-

try, or neither. (All operators that contribute to oblique corrections at tree level belong to one of

these three classes. In particular, no such operators break the custodial symmetry without break-

ing the electroweak gauge symmetry.) Among the four operators, OHH breaks both symmetries,

while OWB breaks the gauge symmetry but respects the custodial symmetry. The two remaining

operators, OBB and OWW , respect both symmetries.

Intuitively, symmetry breaking effects are easiest to probe. This intuition goes well with the

actual situation: If the new physics breaks the custodial symmetry with O(1) parameters, then

the lower bound on the scale Λ coming from OHH would be the strongest. (It corrects the ρ = 1

relation.) Conversely, if we assume that the new physics respects the custodial symmetry, so that

cHH = 0, then the bound coming from OWB is the strongest.

In addition to the expansion in inverse powers of Λ, we can expand in powers of q2. As we

will see below, this expansion leads to considerable simplification of the analysis. Since we deal

with oblique corrections, we expand the vacuum polarization amplitudes ΠAB(q2) with AB =

W+W−,W3W3, BB,W3B:

\[ \Pi_{AB}(q^2) = \Pi_{AB}(0) + q^2\,\Pi'_{AB}(0) + \frac{(q^2)^2}{2}\,\Pi''_{AB}(0) + \cdots. \tag{11.21} \]

What is the relation between the q2 and the 1/Λ expansions? First we note that the number of

derivatives in any operator is related to the order in the q2 expansion. Since Π is dimension two,

and the only dimensionful parameter in the SM is v, we can write

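\[ \begin{aligned}
\Pi(0) &= v^2\left(1 + v^2/\Lambda^2 + \cdots\right), \\
\Pi'(0) &= 1 + v^2/\Lambda^2 + \cdots, \\
\Pi''(0) &= \frac{1}{\Lambda^2}\left(1 + v^2/\Lambda^2 + \cdots\right), \\
\Pi^{(n)}(0) &= \frac{1}{\Lambda^{2n-2}}\left(1 + v^2/\Lambda^2 + \cdots\right).
\end{aligned} \tag{11.22} \]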

Since we neglect terms of dimension higher than six, we can truncate the q2 expansion at order

(q2)2: Terms of order (q2)3 and higher are suppressed by at least 1/Λ4. One more observation from

the above is that the dimension-six operators affect Π(0), Π′(0) and Π′′(0). Specifically, examining

Eq. (11.20), we straightforwardly learn that OWB contributes to Π′W3B(0), OHH contributes to

ΠW3W3(0), ΠBB(0) and ΠW3B(0), OBB contributes to Π′′BB(0), and OWW contributes to Π′′W3W3(0)

and Π′′W+W−(0).

To be clear, the $q^2$ expansion of Eq. (11.9) applies to any model but only to low energy
observables ($q^2 \ll m_W^2$). The $q^2$ expansion of Eq. (11.21) applies to all observables but only to
new physics models that are characterized by a scale that is much higher than the electroweak
scale ($\Lambda \gg v$).

S, T and U

To study the contributions of dimension-six terms to oblique corrections, we need to keep terms
up to Π′′ in the q2 expansion. Thus, the four functions, ΠAB(q2), are replaced by twelve numbers,

ΠAB(0), Π′AB(0), and Π′′AB(0). Out of these twelve numbers, two combinations vanish due to the

U(1)EM gauge invariance, Πγγ(0) = ΠγZ(0) = 0. Three other combinations are fixed by the three

tree-level SM parameters. The remaining seven parameters are then related to observables, and

can be fitted. The seven parameters are usually called S, T, U, V,W,X, Y . From the theoretical

side, one can obtain the contributions of the four dimension-six terms to these parameters and

finally constrain their coefficients in the Lagrangian.

In order to provide a simple demonstration of the relation to observables, we provisionally

truncate the expansion at order q2. This leaves only three parameters that neither vanish nor are

fixed by the tree level parameters. These are the S, T and U parameters defined as follows:

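\[ \alpha T = \frac{\Pi_{WW}(0)}{m_W^2} - \frac{\Pi_{ZZ}(0)}{m_Z^2}, \tag{11.23} \]

\[ \frac{\alpha S}{4\sin^2 2\theta} = \Pi'_{ZZ}(0) - \frac{2\cos^2 2\theta}{\sin^2 2\theta}\,\Pi'_{Z\gamma}(0) - \Pi'_{\gamma\gamma}(0), \tag{11.24} \]

\[ \frac{\alpha U}{4\sin^2\theta} = \Pi'_{WW}(0) - \cos^2\theta\,\Pi'_{ZZ}(0) - \sin 2\theta\,\Pi'_{\gamma Z}(0) - \sin^2\theta\,\Pi'_{\gamma\gamma}(0). \tag{11.25} \]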


1. These S, T and U parameters receive one loop contributions in the SM and possibly also

BSM contributions. One can subtract the SM values and redefine them such that within the

SM they vanish. With this new definition, a non-zero value would be a sign of new physics.

The values presented below correspond to this definition.

[Figure 11.1 plot: 68%, 95% and 99% CL fit contours in the (S, T) plane with U free, both axes running from −0.5 to 0.5. SM reference point: MH = 126 GeV, mt = 173 GeV. SM prediction shown for MH = 125.7 ± 0.4 GeV, mt = 173.18 ± 0.94 GeV, and for MH ∈ [100, 1000] GeV. Plot label: Gfitter SM, Sep 12.]

Figure 11.1: The current bounds on S and T taken from [8]

2. S, T and U are pure numbers. Their scaling by α−1 relative to ΠAB (or Π′AB) means that

they roughly reflect the allowed size of new physics contributions compared to the SM one

loop contributions.

3. The U parameter rarely provides a significant constraint. The reason is that U arises from

a dimension-eight operator, while S and T arise from dimension-six operators.

4. Any new physics that respects the custodial symmetry gives T = U = 0, while S may

be different from zero. More generally, T and U are proportional to custodial symmetry

breaking parameters.

The EWPM determine the allowed ranges for these parameters. The strongest bounds arise

from S and T but there are experimental correlations between them that have to be taken into

account, see Fig. 11.1. The current experimental bounds read [?]

S = +0.05± 0.11,

T = +0.09± 0.13,

U = +0.01± 0.11, (11.26)

and, in particular, S ≤ 0.14 and T ≤ 0.20 at 95% CL.

The contributions of the dimension-six operators to S and T are given by [4]

\[ S = \frac{2\sin 2\theta}{\alpha}\,\frac{v^2}{\Lambda^2}\,c_{WB}, \qquad
T = -\frac{1}{2\alpha}\,\frac{v^2}{\Lambda^2}\,c_{HH}. \tag{11.27} \]

We note that T is proportional to cHH , which is the only custodial symmetry breaking parameter

among these nonrenormalizable terms. Using the experimental upper bounds of Eq. (11.26) as

reference points gives

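\[ \frac{\Lambda^2}{c_{WB}} > (9.7\ {\rm TeV})^2\left(\frac{0.14}{S}\right), \qquad
\frac{\Lambda^2}{c_{HH}} > (4.4\ {\rm TeV})^2\left(\frac{0.20}{T}\right). \tag{11.28} \]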
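The arithmetic behind these bounds can be reproduced with a short Python sketch that inverts Eq. (11.27) for Λ, taking the magnitude of each coefficient. The inputs `ALPHA`, `SIN2_THETA` and `V_TEV` are assumed illustrative values that are not quoted in the text, and the function names are hypothetical.

```python
import math

# Illustrative inputs (assumptions, not values quoted in the text).
ALPHA = 1.0 / 128.0   # QED coupling near the Z mass
SIN2_THETA = 0.231    # sin^2 of the weak mixing angle
V_TEV = 0.246         # Higgs VEV in TeV


def lambda_bound_from_s(s_max, c_wb=1.0):
    """Lower bound on Lambda (TeV) from S = (2 sin 2theta / alpha)(v^2/Lambda^2) c_WB."""
    sin_2theta = 2.0 * math.sqrt(SIN2_THETA * (1.0 - SIN2_THETA))
    lambda2 = 2.0 * sin_2theta * V_TEV**2 * abs(c_wb) / (ALPHA * s_max)
    return math.sqrt(lambda2)


def lambda_bound_from_t(t_max, c_hh=1.0):
    """Lower bound on Lambda (TeV) from |T| = (1/(2 alpha))(v^2/Lambda^2)|c_HH|."""
    lambda2 = V_TEV**2 * abs(c_hh) / (2.0 * ALPHA * t_max)
    return math.sqrt(lambda2)


bound_s = lambda_bound_from_s(0.14)
bound_t = lambda_bound_from_t(0.20)
print(f"Lambda/sqrt(|c_WB|) > {bound_s:.1f} TeV")
print(f"Lambda/sqrt(|c_HH|) > {bound_t:.1f} TeV")
```

With these assumed inputs and O(1) coefficients, the bounds on Λ come out close to 9.7 TeV and 4.4 TeV. Since Λ scales as the inverse square root of the allowed S or T, halving the experimental limit raises the reach in Λ by a factor of about 1.4.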

11.5.2 The four generation SM

The four generation SM is excluded by measurements of the Higgs production and decay rates.

Yet, it provides a simple example of constraining a specific new physics model by the EWPM. We

assume that the fourth generation fermions are heavier than the top quark. We further assume no

flavor mixing between the fourth generation and the three known ones. To leading order in the

masses of the new fermions one finds [7]

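\[ \begin{aligned}
S &= \frac{1}{2\pi}\left[1 - \frac{1}{3}\log\left(\frac{m_{u_4}^2}{m_{d_4}^2}\right)\right] + \frac{1}{6\pi}\left[1 + \log\left(\frac{m_{\nu_4}^2}{m_{e_4}^2}\right)\right], \\
T &= \frac{1}{2\pi\sin^2(2\theta_W)}\left(\frac{m_{u_4}^2 - m_{d_4}^2}{m_Z^2}\right) + \frac{1}{6\pi\sin^2(2\theta_W)}\left(\frac{m_{\nu_4}^2 - m_{e_4}^2}{m_Z^2}\right),
\end{aligned} \tag{11.29} \]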

where mf4 are the masses of the fourth generation fermions. We learn that T is related to the mass

splitting between u4 and d4 and between ν4 and e4, which are the custodial symmetry breaking

parameters in this model. In the case of a degenerate quark doublet and a degenerate lepton

doublet, we have T = 0 and S = 2/(3π). In a way, the S parameter “counts” the number of extra

generations. The current measurement of S by itself excludes an extra degenerate generation (so

that T = U = 0) at the 7σ level. Yet, when the correlation with T is taken into account, the

bounds on the extra generations become much less severe.
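Eq. (11.29) is simple enough to evaluate directly. The Python sketch below implements both expressions as written and checks the degenerate limit. The default values of `m_z` and `sin2_theta_w`, the trial masses, and the function names are illustrative assumptions.

```python
import math


def s_fourth_generation(m_u4, m_d4, m_nu4, m_e4):
    """S of Eq. (11.29); masses in any common unit."""
    quark = (1.0 - math.log(m_u4**2 / m_d4**2) / 3.0) / (2.0 * math.pi)
    lepton = (1.0 + math.log(m_nu4**2 / m_e4**2)) / (6.0 * math.pi)
    return quark + lepton


def t_fourth_generation(m_u4, m_d4, m_nu4, m_e4, m_z=91.1876, sin2_theta_w=0.231):
    """T of Eq. (11.29); masses in GeV. Defaults are assumed illustrative values."""
    sin2_2theta = 4.0 * sin2_theta_w * (1.0 - sin2_theta_w)
    quark = (m_u4**2 - m_d4**2) / m_z**2 / (2.0 * math.pi * sin2_2theta)
    lepton = (m_nu4**2 - m_e4**2) / m_z**2 / (6.0 * math.pi * sin2_2theta)
    return quark + lepton


# Degenerate doublets: T vanishes and S reduces to 2/(3 pi).
s_deg = s_fourth_generation(400.0, 400.0, 300.0, 300.0)
t_deg = t_fourth_generation(400.0, 400.0, 300.0, 300.0)

# A split quark doublet (hypothetical masses) lowers S and switches on T.
s_split = s_fourth_generation(450.0, 400.0, 300.0, 300.0)
t_split = t_fourth_generation(450.0, 400.0, 300.0, 300.0)
print(f"degenerate: S = {s_deg:.3f}, T = {t_deg:.3f}")
print(f"split:      S = {s_split:.3f}, T = {t_split:.3f}")
```

The split case illustrates why the correlation between S and T matters: a mass splitting moves both parameters at once, so a point excluded by S alone can move back toward the allowed region.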


Chapter 12

Flavor physics

12.1 Introduction

The effects of non-renormalizable terms might be observed in rare processes, where the contribution

from the renormalizable SM is highly suppressed. The prime examples of such processes are flavor

changing neutral current (FCNC) processes. In this section, we explain what these processes are,

describe the phenomenological constraints on deviations from the SM predictions, and extract

lower bounds on the scale that suppresses dimension-six terms that contribute to these processes.

The term “flavors” is used to describe several copies of the same gauge representation. Within

the Standard Model, each of the four different types of fermionic particles (namely different

SU(3)C × U(1)EM representations) comes in three flavors:

• Up-type quarks in the (3)+2/3 representation: u, c, t;

• Down-type quarks in the (3)−1/3 representation: d, s, b;

• Charged leptons in the (1)−1 representation: e, µ, τ ;

• Neutrinos in the (1)0 representation: ν1, ν2, ν3.

In this section we discuss only quark flavor physics. (We discuss the lepton flavor physics in

Chapter 13.)

The term “flavor physics” refers to interactions that distinguish between flavors. Within the

SM, these are the W -mediated weak interactions and the Yukawa interactions. The term “flavor

parameters” refers to parameters that carry flavor indices. Within the quark sector of the SM,

there are ten flavor parameters: the six quark masses and the four CKM parameters. The term

“flavor universal” refers to interactions with couplings (or to parameters) that are proportional to

a unit matrix in flavor space. Within the SM, the strong, electromagnetic, and Z-mediated weak

interactions are flavor-universal. The term “flavor diagonal” refers to interactions with couplings


(or to parameters) that are diagonal, but not necessarily universal, in flavor space. Within the

SM, the Yukawa interactions are flavor-diagonal in the mass basis.

In this chapter we discuss three types of measurements that are used to probe the flavor sector

of the SM: tree-level decays, loop processes and CP violating processes. Then we show how the

combination of all of them is used to test the SM and to put bounds on possible extensions of it.

12.2 Tree level: The CKM parameters

Among the SM interactions, the W -mediated interactions are the only ones that are not diagonal.

Consequently, all flavor changing processes depend on the CKM parameters. The fact that there are

only four independent CKM parameters, while the number of measured flavor changing processes

is much larger, allows for stringent tests of the CKM mechanism for flavor changing processes.

We emphasize, however, that there is an inherent difficulty in determining the CKM parameters.

While the SM Lagrangian has the quarks as its degrees of freedom, in Nature they appear only

within hadrons. As discussed in Chapter 9, there are various tools to overcome this difficulty.

For example, isospin symmetry is used to extract |Vud| from various beta decays (see Appendix

12.A for details), and heavy quark symmetry is used to extract |Vcb| from rates of various b → cℓν
decays (see Appendix 12.B for details). The most useful processes are those related to q → q′ℓν
transitions, and here we only give a short summary of the results:

• Processes related to u → dℓ+ν transitions give |Vud| = 0.97425 ± 0.00022.

• Processes related to s → uℓ−ν transitions give |Vus| = 0.2253 ± 0.0008.

• Processes related to c → dℓ+ν or to νµ + d → c + µ− transitions give |Vcd| = 0.225 ± 0.008.

• Processes related to c → sℓ+ν or to cs → ℓ+ν transitions give |Vcs| = 0.986 ± 0.016.

• Processes related to b → cℓ−ν transitions give |Vcb| = 0.0411 ± 0.0013.

• Processes related to b → uℓ−ν transitions give |Vub| = 0.0041 ± 0.0005.

There are two additional classes of tree level processes that depend on the CKM parameters:

• Processes related to single top production in hadron colliders give |Vtb| = 1.02± 0.03.

• Processes related to $b \to s c \bar{u}$ and $b \to s u \bar{c}$ transitions give $\gamma \equiv \arg\left(-\frac{V_{ud}V_{ub}^*}{V_{cd}V_{cb}^*}\right) = \left(68.0^{+8.0}_{-8.5}\right)^\circ$.

These eight distinct classes of processes depend on only four CKM parameters. The system is thus

over-constrained and tests the SM. The SM passes this test successfully. The following ranges of

the four parameters are consistent with all the measurements:

λ = 0.2254± 0.0006, A = 0.81± 0.02, ρ = 0.12± 0.02, η = 0.354± 0.015. (12.1)
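A quick consistency check of Eq. (12.1) against the direct measurements listed above can be sketched in Python. The example assumes the leading-order Wolfenstein relations |Vus| ≈ λ, |Vcb| ≈ Aλ², |Vub| ≈ Aλ³√(ρ² + η²) and γ ≈ arg(ρ + iη); these relations are an assumption of this sketch and are not derived in this section, and higher-order terms in λ are ignored.

```python
import math

# Central values from Eq. (12.1).
LAM, A, RHO, ETA = 0.2254, 0.81, 0.12, 0.354

# Leading-order Wolfenstein relations (assumed here; higher orders in lambda ignored).
v_us = LAM
v_cb = A * LAM**2
v_ub = A * LAM**3 * math.hypot(RHO, ETA)
gamma_deg = math.degrees(math.atan2(ETA, RHO))

# Direct measurements quoted in the text: (central value, uncertainty).
measured = {
    "|Vus|": (v_us, 0.2253, 0.0008),
    "|Vcb|": (v_cb, 0.0411, 0.0013),
    "|Vub|": (v_ub, 0.0041, 0.0005),
}

for name, (predicted, central, sigma) in measured.items():
    pull = (predicted - central) / sigma
    print(f"{name}: predicted {predicted:.4f}, measured {central} +- {sigma}, pull {pull:+.1f}")
print(f"gamma: predicted {gamma_deg:.1f} degrees")
```

The pulls give a rough picture of how over-constrained the system is; a full treatment would use the exact parametrization and the correlated fit that produced Eq. (12.1).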


12.3 FCNC processes

A central role in testing the CKM sector of the SM is played by flavor changing processes. The

term “flavor-changing” refers to processes where the initial and final flavor-numbers (that is, the

number of particles of a certain flavor minus the number of anti-particles of the same flavor) are

different. In “flavor changing charged current” (FCCC) processes, both up-type and down-type

flavors are involved. Examples are K− → µ−ν̄µ, which corresponds, at the quark level, to an sū → µ−ν̄µ transition, and B− → ψK− (b → cc̄s transition). Within the Standard Model, these processes

are mediated by the W-bosons and occur at tree level. In “flavor changing neutral current” (FCNC)

processes, either up-type or down-type flavors but not both are involved. Examples of FCNC decays

include K0 → µ+µ− (sd → µ+µ− transition), and B− → φK− (b → sss transition). Within the

Standard Model, these processes do not occur at tree level, and are strongly suppressed.

Historically, FCNC processes have played an important role in predicting the existence of SM

particles before they were directly discovered, and in predicting their masses:

• The smallness of Γ(KL → µ+µ−) led to predicting a fourth (the charm) quark;

• The size of ∆mK led to a successful prediction of the charm mass;

• The measurement of εK led to predicting the third generation;

• The size of ∆mB led to a successful prediction of the top mass.

We discuss these measurements below.

FCNC processes are highly suppressed within the SM. Our aim in this section is to explain the

suppression factors that affect FCNCs within the SM. There are three suppression factors: loop,

CKM and GIM. The first one is due to the basic structure of the SM, while the last two are there

due to the specific numerical values of the SM flavor parameters.

We first discuss the conditions that ensure that FCNCs cannot be mediated at tree-level. The

W -boson cannot mediate FCNC processes at tree level, since it couples to up-down pairs, or to

neutrino-charged lepton pairs. Only neutral bosons could mediate FCNC at tree level. The SM

has four neutral bosons: the gluon, the photon, the Z-boson and the Higgs-boson. As we explain

below, within the SM all of them couple diagonally in the mass basis, and therefore cannot mediate

FCNC at tree level.

12.3.1 Photon and gluon mediated FCNCs

As concerns the massless gauge bosons, the gluon and the photon, their couplings are flavor-

universal and, in particular, flavor-diagonal. This is guaranteed by gauge invariance. The uni-

versality of the kinetic terms in the canonical basis requires universality of the gauge couplings

related to the unbroken symmetries. Hence neither the gluon nor the photon can mediate flavor


changing processes at tree level. Since we require that extensions of the SM respect the local

SU(3)C × U(1)EM symmetry, this result holds in all such extensions.

As we explain below, the situation concerning the Z-boson and the Higgs-boson is more com-

plicated. In fact, the diagonality of their tree-level couplings is a consequence of special features

of the SM, and can be violated with new physics.

12.3.2 Z-mediated FCNCs

The Z-boson, similarly to the W -boson, does not correspond to an unbroken gauge symmetry (as

manifest in the fact that it is massive). Hence, there is no fundamental symmetry principle that

forbids flavor changing couplings. Yet, as mentioned in Section 8.4.1, in the SM the couplings are

universal and diagonal.

The key point is the following. The Z couplings are proportional to T3 − Q sin2 θW . A sector

of mass eigenstates is characterized by spin, SU(3)C representation and U(1)EM charge. While Q

must be the same for all the flavors in a given sector, there are two possibilities regarding T3:

1. All mass eigenstates in this sector originate from interaction eigenstates in the same SU(2)L×U(1)Y representation, and thus have the same T3 and Y .

2. The mass eigenstates in this sector mix interaction eigenstates of different SU(2)L × U(1)Y

representations and thus have different T3 and Y (but, of course, with the same Q = T3 +Y ).

Let us examine the Z couplings in the interaction and mass bases for all quarks (spin 1/2, color-

triplet) in either the up (Q = +2/3) or the down (Q = −1/3) sector:

1. In the first class, the Z couplings in the interaction basis are universal, namely they are

proportional to the unit matrix (times T3−Q sin2 θW of the relevant interaction eigenstates).

The rotation to the mass basis maintains the universality:

VfM × 1× V †fM = 1, (f = u, d;M = L,R). (12.2)

2. In the second class, the Z couplings in the interaction basis are diagonal but not universal.

Each diagonal entry is proportional to the relevant T3−Q sin2 θW . In this case, the rotation

to the mass basis does not maintain the diagonality:

VfM × Gdiagonal × V †fM = Gnon−diagonal, (f = u, d;M = L,R). (12.3)

You will work out an explicit example of a model of this type in your homework.
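The two classes can be contrasted in a two-flavor toy example. The Python sketch below rotates a diagonal coupling matrix G into the mass basis, V G Vᵀ, using a real rotation V. The rotation angle, the value of `SIN2_THETA_W`, and the choice of a hypothetical left-handed up-type singlet (T3 = 0) mixing with a doublet member (T3 = +1/2) are all illustrative assumptions.

```python
import math

SIN2_THETA_W = 0.231  # illustrative value (assumption)


def z_coupling(t3, charge):
    """Z coupling factor T3 - Q sin^2(theta_W)."""
    return t3 - charge * SIN2_THETA_W


def rotate(g_diag, angle):
    """Return V G V^T for G = diag(g_diag) and a real 2x2 rotation V(angle)."""
    c, s = math.cos(angle), math.sin(angle)
    v = [[c, -s], [s, c]]
    return [[sum(v[i][k] * g_diag[k] * v[j][k] for k in range(2))
             for j in range(2)] for i in range(2)]


g_doublet = z_coupling(+0.5, 2.0 / 3.0)  # left-handed up quark from a doublet
g_singlet = z_coupling(0.0, 2.0 / 3.0)   # hypothetical left-handed up-type singlet

# Class 1: both flavors come from the same representation (universal couplings).
universal = rotate([g_doublet, g_doublet], angle=0.3)

# Class 2: the two flavors come from different representations.
mixed = rotate([g_doublet, g_singlet], angle=0.3)

print(f"class 1 off-diagonal: {universal[0][1]:.3e}")
print(f"class 2 off-diagonal: {mixed[0][1]:.3e}")
```

The off-diagonal entry in the second class is proportional to the difference of the two T3 values, since the Q-dependent part of the coupling is common to both flavors and remains universal.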

The special feature of the SM fermions is that they belong to the first class: All fermion mass

eigenstates in a given SU(3)C × U(1)EM representation come from the same SU(3)C × SU(2)L ×U(1)Y representation. For example, all the left-handed up quark mass eigenstates, which are in


the (3)+2/3 representation, come from interaction eigenstates in the (3, 2)+1/6 representation. This

is the reason that the SM predicts universal Z couplings to fermions. If, for example, Nature had

also left-handed quarks in the (3, 1)+2/3 representation, then the Z couplings in the left-handed

up sector would be non-universal and the Z could mediate FCNC, such as t → cZ decay, at tree

level.

12.3.3 Higgs-mediated FCNCs

The Yukawa couplings of the Higgs boson are not universal. In fact, in the interaction basis,

they are given by completely general 3 × 3 matrices. Yet, as explained in Section 8.4.3, in the

fermion mass basis they are diagonal. The reason is that the fermion mass matrix is proportional

to the corresponding Yukawa matrix. Consequently, the mass matrix and the Yukawa matrix are

simultaneously diagonalized. The general condition for the absence of Higgs-mediated FCNCs at

tree level is that the only source of masses for any fermion type is a single Higgs field.

The special features of the SM in this regard are the following:

1. All the SM fermions are chiral, and therefore there are no bare mass terms.

2. The scalar sector has a single Higgs doublet.

In contrast, either of the following possible extensions would lead to flavor changing Higgs cou-

plings:

1. There are quarks or leptons in vector-like representations, and thus there are bare mass

terms.

2. There is more than one SU(2)L-doublet scalar that couples to a specific type of fermion.

In your homework you will work out an example of such extensions and see explicitly how the Higgs

couplings are not diagonal in the mass basis.

It is interesting to note, however, that not all multi Higgs doublet models lead to flavor changing

Higgs couplings. If all the fermions of a given sector couple to one and the same doublet, then the

Higgs couplings in that sector would still be diagonal. For example, in a model with two Higgs

doublets, φ1 and φ2, and Yukawa terms of the form

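\[ \mathcal{L}_{\rm Yuk} = Y^u_{ij}\,\overline{Q_{Li}}\,U_{Rj}\,\phi_2 + Y^d_{ij}\,\overline{Q_{Li}}\,D_{Rj}\,\phi_1 + Y^e_{ij}\,\overline{L_{Li}}\,E_{Rj}\,\phi_1 + {\rm h.c.}, \tag{12.4} \]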

the Higgs couplings are flavor diagonal. In the physics jargon, we say that such models have natural

flavor conservation (NFC).

We conclude that within the SM, all FCNC processes are loop suppressed. However, in exten-

sions of the SM, FCNC can appear at the tree level, mediated by the Z boson or by the Higgs

boson or by new massive bosons.

Figure 12.1: One of the sdZ amplitudes.

12.3.4 The CKM and mq dependence of FCNC

While FCNCs cannot be mediated at tree level in the SM, they can be mediated at one loop. The

reason is that there is no symmetry that protects against FCNC in the quark sector, and thus

they should occur. Concretely, the W interactions are flavor changing and two insertions of the W

interaction can mediate a FCNC process. Several explicit examples are presented in Section 12.3.5.

Here we use one simple example to qualitatively explain the resulting dependence of FCNC on the

flavor parameters, namely the quark masses and the CKM parameters.

We consider, for example, the sdZ vertex. It is induced at the one loop level by several

diagrams, one of which is plotted in Fig. 12.1. Since the FCNC in this example involves external

down-type quarks, the internal quarks are from the up sector and, in fact, one must sum over all

three up-type quarks, i = u, c, t. The most general form of the effective Zsd coupling is thus

\[ g_{Zsd} = \frac{C g^2}{16\pi^2} \sum_{i=u,c,t} f\!\left(\frac{m_i}{m_W}\right) V_{id} V_{is}^*, \tag{12.5} \]

where C is a dimensionless coefficient and f is some (calculable) function of the dimensionless

ratio between the mass of the internal quark and the W mass.

As concerns the CKM dependence, we note that for each of the three up-type quarks, there is

a product of an element from the d-column and an element from the s-column, of which at least

one is off-diagonal. (Similarly, in FCNC involving external up-type quarks, the internal quarks are

from the down sector, and the contribution of each of the latter involves two terms from the same

column but different rows, of which at least one is off-diagonal.) The important feature is that

within the SM, FCNC are proportional to off-diagonal CKM elements. A quick look at the absolute

values of the off-diagonal entries of the CKM matrix, Eq. (12.13), reveals that they are small. A

rough estimate of the CKM suppression can be obtained by counting powers of $\lambda$ in the Wolfenstein parametrization, Eq. (8.67): $|V_{us}|$ and $|V_{cd}|$ are suppressed by $\lambda$, $|V_{cb}|$ and $|V_{ts}|$ by $\lambda^2$, $|V_{ub}|$ and $|V_{td}|$ by $\lambda^3$. Thus, in the example of the $g_{Zsd}$ coupling, the top contribution is CKM-suppressed by $\lambda^5$, while the charm and up contributions are suppressed only by $\lambda$.
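The power counting can be checked numerically. The following is a minimal sketch, assuming the leading-order Wolfenstein expansion of the d and s columns and the central parameter values quoted in Eq. (12.14); the names `V`, `ckm_factor` and `factors` are ours, introduced only for illustration.

```python
# Central values of the Wolfenstein parameters quoted in Eq. (12.14).
lam, A, rho, eta = 0.2254, 0.811, 0.131, 0.345

# CKM elements of the d and s columns at leading order in lambda
# (standard Wolfenstein expansion; higher orders are dropped here).
V = {
    ("u", "d"): complex(1 - lam**2 / 2),
    ("u", "s"): complex(lam),
    ("c", "d"): complex(-lam),
    ("c", "s"): complex(1 - lam**2 / 2),
    ("t", "d"): A * lam**3 * (1 - rho - 1j * eta),
    ("t", "s"): complex(-A * lam**2),
}

def ckm_factor(i):
    """Magnitude of V_id V_is^* for the internal up-type quark i."""
    return abs(V[(i, "d")] * V[(i, "s")].conjugate())

factors = {i: ckm_factor(i) for i in ("u", "c", "t")}
for i, size in factors.items():
    print(f"|V_{i}d V_{i}s^*| = {size:.2e}")
```

In this approximation the up and charm factors are equal and of order $\lambda$, while the top factor is of order $A^2\lambda^5$, a few times $10^{-4}$.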

As concerns the dependence on quark masses, it is important to bear in mind that CKM

unitarity implies

\[ V_{ud}V_{us}^* + V_{cd}V_{cs}^* + V_{td}V_{ts}^* = 0. \tag{12.6} \]

We learn that the contribution of mi-independent terms in f vanishes when summed over all

internal quarks. Even more significantly, it implies that within the SM, the loop-induced FCNC

processes in the down (up) sector would vanish if the up (down) quarks were all degenerate. This,

in turn, implies that FCNC amplitudes must depend on the mass-splittings among the quarks.

Explicitly, using Eq. (12.6), we can write

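\[ \sum_{i=u,c,t} f\!\left(\frac{m_i}{m_W}\right) V_{id}V_{is}^* = \sum_{i=c,t} \left[ f\!\left(\frac{m_i}{m_W}\right) - f\!\left(\frac{m_u}{m_W}\right) \right] V_{id}V_{is}^*. \tag{12.7} \]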

This situation, where FCNC are proportional to non-degeneracy among the quarks, is known as

the Glashow-Iliopoulos-Maiani (GIM) mechanism.
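The cancellation can be made concrete with a small numerical check. The sketch below builds an exactly unitary matrix in the standard CKM parametrization and verifies that the sum vanishes for degenerate masses and that the subtracted form of Eq. (12.7) agrees with the full sum. The mixing angles, the quark masses and the toy loop function $f(x) = x^2$ are illustrative assumptions, not values or functions taken from the text.

```python
import cmath
import math

def ckm_standard(t12, t13, t23, delta):
    """Unitary 3x3 matrix in the standard parametrization (rows u,c,t; columns d,s,b)."""
    s12, c12 = math.sin(t12), math.cos(t12)
    s13, c13 = math.sin(t13), math.cos(t13)
    s23, c23 = math.sin(t23), math.cos(t23)
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]

def loop_sum(V, f_values):
    """Sum over i = u, c, t of f_i V_id V_is^* (d is column 0, s is column 1)."""
    return sum(f * V[i][0] * complex(V[i][1]).conjugate()
               for i, f in enumerate(f_values))

# Illustrative inputs (assumed): angles in radians, masses in GeV.
V = ckm_standard(0.227, 0.0036, 0.041, 1.2)
m_W = 80.4
masses = [0.002, 1.27, 173.0]              # u, c, t
f = [(m / m_W) ** 2 for m in masses]       # toy loop function f(x) = x^2

degenerate = loop_sum(V, [f[2]] * 3)       # all internal quarks equally heavy
full = loop_sum(V, f)                      # left-hand side of Eq. (12.7)
subtracted = sum((f[i] - f[0]) * V[i][0] * complex(V[i][1]).conjugate()
                 for i in (1, 2))          # right-hand side of Eq. (12.7)

print(abs(degenerate), abs(full - subtracted))
```

Both printed numbers are zero up to floating-point rounding, independently of the choice of $f$.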

The exact form of the dependence on the mass splitting depends on the process. For example,

the leading dependence on the masses for the flavor changing Z-boson effective coupling discussed

here is proportional to $m_i^2 - m_j^2$. The analogous flavor changing photon and gluon effective couplings are proportional to $\log(m_i) - \log(m_j) = \log(m_i/m_j)$. These two cases go under the names of hard-GIM (for power dependence) and soft-GIM (for log dependence), respectively.

We reach the following conclusions concerning FCNC within the SM:

1. FCNC processes are loop-suppressed.

2. FCNC are proportional to off-diagonal CKM elements and are thus CKM-suppressed.

3. FCNC would vanish if quarks were degenerate and thus they are proportional to mass differences among quarks. When the quark masses that are involved are small (that is, for all the quarks but the top) they are GIM-suppressed.

The quantitative consequences of these features are discussed below for several specific examples.

12.3.5 FCNC examples

Here we give examples of two important FCNC processes: neutral meson mixing and FCNC

decays. These processes are also characterized as ∆F = 2 and ∆F = 1 processes, where F is a

flavor number.

Neutral meson mixing corresponds to a transition between a flavored meson and its antiparticle. Nature provides us with four relevant pairs of CP-conjugate neutral mesons: $K^0 - \bar K^0$, $B^0 - \bar B^0$, $B_s^0 - \bar B_s^0$, and $D^0 - \bar D^0$. The result of this mixing is that the degenerate pair of

QCD-interaction eigenstates splits into two mass eigenstates of different masses and widths. In

Section 10.1 we discuss the measurements and the SM calculations of the mass and width difference.

Here we show how these results incorporate the suppression factors we discussed above.

Consider, for example, the $\Delta s = 2$ process, $K^0 - \bar K^0$ mixing. The result of the SM calculation of the mass splitting is presented in Eq. (10.49). As explained there, the top contribution to $\Delta m_K$ can be neglected. We further put $m_u = 0$ and keep the leading term in $m_c/m_W$. We obtain:

\[ \frac{\Delta m_K}{m_K} \propto \frac{g^4}{16\pi^2} \left(\frac{m_c^2}{m_W^2}\right) \left(V_{cs}V_{cd}^*\right)^2. \tag{12.8} \]

Indeed, all three suppression factors discussed above are manifest:

1. The loop-suppression factor, $g^2/(16\pi^2) \sim 10^{-2}$.

2. The GIM-suppression factor, $m_c^2/m_W^2 \sim 10^{-4}$.

3. The CKM-suppression factor, $(V_{cs}V_{cd}^*)^2 \sim 5 \times 10^{-2}$.

As concerns the other meson systems, we see an important hierarchy of GIM- and CKM-

suppression factors:

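• $B^0 - \bar B^0$ mixing: $(m_t^2/m_W^2)\,(V_{tb}V_{td})^2 \sim 3 \times 10^{-4}$;

• $B_s^0 - \bar B_s^0$ mixing: $(m_t^2/m_W^2)\,(V_{tb}V_{ts})^2 \sim 10^{-2}$;

• $K^0 - \bar K^0$ mixing: $(m_c^2/m_W^2)\,(V_{cs}V_{cd})^2 \sim 10^{-5}$;

• $D^0 - \bar D^0$ mixing: $(m_s^2/m_W^2)\,(V_{cs}V_{us})^2 \sim 5 \times 10^{-8}$.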
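These order-of-magnitude estimates are easy to reproduce. The sketch below uses the central CKM magnitudes quoted in Eq. (12.13); the quark and W masses are rough values that we assume for illustration and that are not taken from the text.

```python
# Central CKM magnitudes as quoted in Eq. (12.13).
V = {"us": 0.22536, "cd": 0.22522, "cs": 0.97343,
     "td": 8.86e-3, "ts": 0.0405, "tb": 0.99914}

# Assumed, approximate masses in GeV (illustrative only).
m_W, m_t, m_c, m_s = 80.4, 173.0, 1.27, 0.095

def suppression(m_q, v1, v2):
    """GIM factor (m_q/m_W)^2 times the CKM factor (|v1| |v2|)^2."""
    return (m_q / m_W) ** 2 * (v1 * v2) ** 2

estimates = {
    "B":  suppression(m_t, V["tb"], V["td"]),
    "Bs": suppression(m_t, V["tb"], V["ts"]),
    "K":  suppression(m_c, V["cs"], V["cd"]),
    "D":  suppression(m_s, V["cs"], V["us"]),
}

# Print from the most suppressed system to the least suppressed one.
for meson, value in sorted(estimates.items(), key=lambda kv: kv[1]):
    print(f"{meson:>2}: {value:.1e}")
```

With these inputs the ordering is D, K, B, Bs, from the most to the least suppressed, in line with the four estimates listed above.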

We learn an important prediction of hierarchy among the FCNC mixing processes:

\[ \frac{\Delta m_D}{m_D} \ll \frac{\Delta m_K}{m_K} \ll \frac{\Delta m_B}{m_B} \ll \frac{\Delta m_{B_s}}{m_{B_s}}. \tag{12.9} \]

We can compare it to the experimental results

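\[ \frac{\Delta m_K}{m_K} = 7.0 \times 10^{-15}, \quad \frac{\Delta m_D}{m_D} = 8.7 \times 10^{-15}, \quad \frac{\Delta m_B}{m_B} = 6.3 \times 10^{-14}, \quad \frac{\Delta m_{B_s}}{m_{B_s}} = 2.1 \times 10^{-12}. \tag{12.10} \]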

For the down sector mesons (K, B and Bs), this pattern is indeed observed in Nature.

On the other hand, ∆mD/mD is experimentally much larger than this naive estimate. Indeed,

there are strong arguments that the contribution to ∆mD represented by quark diagrams (the

so-called “short-distance contribution”) is negligible compared to the contribution from interme-

diate hadrons (the “long-distance contribution”) which is, however, not subject to a perturbative

calculation.

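As a second example consider the inclusive $\Delta B = 1$ process, $B \to X_s\nu\bar\nu$. Normalizing to the tree-level W-mediated semileptonic decay, we have

\[ \frac{{\rm BR}(B \to X_s\nu\bar\nu)}{{\rm BR}(B \to X_c\ell\nu)} = C \left(\frac{g^2}{16\pi^2}\right)^2 \frac{|V_{tb}V_{ts}^*|^2}{|V_{cb}|^2}\, X^2(m_t^2/m_W^2), \tag{12.11} \]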

where the dimensionless coefficient C includes phase space and QCD correction factors. A few

comments are in order:

• The CKM factor suppresses the rate $\Gamma(B \to X_s\nu\bar\nu)$ but not the ratio of rates, since a

numerically similar factor suppresses also the tree level decay rate.

• There is no GIM suppression in this case. The reason is that the dominant contribution

to the decay amplitude involves an internal top quark which is neither close in mass to the

charm and up quarks nor light.

• Thus, the ratio of rates is effectively suppressed by the loop factor, $(g^2/16\pi^2)^2 \sim 10^{-4}$.

At present, there are only upper bounds on the $B \to X_s\nu\bar\nu$ rates.

12.4 CP violation

Among the SM interactions, the W -mediated interactions are the only ones that violate CP.

Consequently, all CP violating processes depend on the CKM parameters. The fact that there is

only a single independent CP violating parameter (η in the Wolfenstein parametrization), while

the number of CP violating processes that have been measured is much larger, allows

for stringent tests of the KM mechanism for CP violation.

Another virtue of probing flavor physics via CP violating processes is that CP is a good sym-

metry of the strong and electromagnetic interactions. In several processes, this fact leads to

cancelations of QCD uncertainties and allows a very precise determination of CKM parameters

via the measurements of CP asymmetries.

CP violation is a very subtle phenomenon in the three generation SM. As explained in Sec-

tion 8.6, and written explicitly in Eq. (8.72), there is a large number of necessary conditions on

the SM flavor parameters (quark masses, CKM mixing angles and the CKM phase) for CP to be

violated. In terms of Yukawa couplings and CKM parameters, these conditions read

\[ X_{\rm CPV} \equiv (y_t^2 - y_c^2)(y_t^2 - y_u^2)(y_c^2 - y_u^2)(y_b^2 - y_s^2)(y_b^2 - y_d^2)(y_s^2 - y_d^2)\, J_{\rm CKM} \neq 0. \tag{12.12} \]

Indeed, $X_{\rm CPV} \neq 0$ in Nature: there are no degeneracies in the quark sector, all mixing angles are different from 0 and $\pi/2$, and the phase is different from 0 and $\pi$. Somewhat surprisingly, CP asymmetries are not necessarily small (some are order one) in spite of the fact that $X_{\rm CPV} \sim 10^{-18}$.

This situation is related to two features:

1. CP asymmetries are often defined as a ratio between the difference and the sum of two

CP-conjugate processes. In this ratio, small CKM parameters often cancel out.

2. The experiments measure exclusive asymmetries, distinguishing some quark mass eigenstates from others. In such a case, the asymmetry is independent of the factors with the corresponding Yukawa couplings.

There are four main classes of measurements of CP violation that are particularly useful in

using flavor physics to test the CKM mechanism, determine the CKM parameters, and constrain

new physics:

• B → DK: CP violation in decay.

• B → ψKS: CP violation in the interference of decays with and without mixing.

• B → ππ, ρρ, ρπ: CP violation in the interference of decays with and without mixing.

• K → ππ and K → πℓν: CP violation in mixing [Re(εK)] and in the interference of decays

with and without mixing [Im(εK)].

12.4.1 B → DK

One measures decays that proceed via $b \to c\bar u s$ transitions and via $b \to u\bar c s$ transitions. The former produce a $D^0$ meson, and the latter a $\bar D^0$ meson, and the experiments measure final states that are common to $D^0$ and $\bar D^0$. Such processes can determine the relative phase between, for example, $V_{cb}V_{cs}^*$ and $V_{ub}V_{us}^*$, which is the angle $\gamma = \arctan(\eta/\rho)$.

By observing the B, D and K mesons, the experiments distinguish the b, s, c and u quark mass

eigenstates from the others, and thus the theoretical expression for the observed CP violation is

independent of all the Yukawa factors of Eq. (12.12).

12.4.2 B → ψKS

One measures the time-dependent CP asymmetry in this decay mode. As explained in Section 10.2.3, this asymmetry depends on the relative phase between $V_{tb}V_{td}^*$ and $V_{cb}V_{cd}^*$, which is $\beta$. Explicitly, $S_{\psi K} = \sin 2\beta = 2(1-\rho)\eta/[(1-\rho)^2 + \eta^2]$.

By observing the B, ψ and $K_S$ mesons, the experiments distinguish the b, s, d and c quark mass eigenstates from the others. We expect the $(y_t^2 - y_u^2)$ factor to appear in the theoretical expression for the asymmetry. Indeed, the measured asymmetry is $S_{\psi K}\sin(\Delta m_B t)$, and the $m_t^2$-dependence of $\Delta m_B$ represents this factor.

12.4.3 B → ππ

Here one combines the experimental measurements of the CP asymmetry in $B \to \pi^+\pi^-$ and the rates for $B^\pm \to \pi^\pm\pi^0$ and $B \to \pi^0\pi^0$ decays with an isospin analysis in order to measure the relative phase between $M_{B\bar B}$ and the tree-level (or, more precisely, $\Delta I = 3/2$) contribution to $\bar A_{\pi^+\pi^-}/A_{\pi^+\pi^-}$. Within the SM, this is the relative phase between $V_{tb}V_{td}^*$ and $V_{ub}V_{ud}^*$, which is $\alpha$.

By observing the B and charged π mesons, the experiments distinguish the b, d and u quark mass eigenstates from the others. We expect the $(y_t^2 - y_c^2)$ factor to appear in the theoretical expression for the asymmetry. Indeed, the measured asymmetry is $S_{\pi\pi}\sin(\Delta m_B t)$, and the $m_t^2$-dependence of $\Delta m_B$ represents this factor.

12.4.4 KL → πℓν

One measures the relative phase between $M_{K\bar K}$ and $\Gamma_{K\bar K}$. As explained in Section 10.2.2, the measurement determines a combination of the phases of $(V_{ts}V_{td}^*)^2$, $V_{ts}V_{td}^*V_{cs}V_{cd}^*$ and $(V_{cs}V_{cd}^*)^2$ relative to $(V_{us}V_{ud}^*)^2$.

By observing the KL and charged π mesons, the experiments distinguish the s, d and u quark

mass eigenstates from the others. Hence, the theoretical expression depends on the non-degeneracy

between the top and the charm quarks.

12.5 Testing the flavor sector

We can now combine measurements of CP conserving and CP violating, tree-level and FCNC pro-

cesses into a global test of the SM. The primary question is whether the long list of measurements

can be fitted by the four CKM parameters.

12.5.1 Test of the SM flavor sector

The present status of our knowledge of the absolute values of the various entries in the CKM

matrix can be summarized as follows:

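\[ |V| = \begin{pmatrix} 0.97427 \pm 0.00014 & 0.22536 \pm 0.00061 & (3.55 \pm 0.15) \times 10^{-3} \\ 0.22522 \pm 0.00061 & 0.97343 \pm 0.00015 & 0.0414 \pm 0.0012 \\ (8.86 \pm 0.33) \times 10^{-3} & 0.0405 \pm 0.0012 & 0.99914 \pm 0.00005 \end{pmatrix}. \tag{12.13} \]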

These ranges take into account all the relevant tree-level and loop decays. Yet, as explained above,

the test of the SM is stronger when we reduce the above to the four CKM parameters. Indeed,

the following ranges for the four Wolfenstein parameters are consistent with all measurements:

\[ \lambda = 0.2254 \pm 0.0007, \quad A = 0.811^{+0.022}_{-0.012}, \quad \rho = +0.131^{+0.026}_{-0.013}, \quad \eta = +0.345 \pm 0.014. \tag{12.14} \]

For the sake of presentation, it is useful to project the constraints onto the (ρ, η) plane:

• The rates of inclusive and exclusive charmless semileptonic B decays depend on $|V_{ub}|^2 \propto \rho^2 + \eta^2$;

• The CP asymmetry in $B \to \psi K_S$, $S_{\psi K_S} = \sin 2\beta = \frac{2\eta(1-\rho)}{(1-\rho)^2 + \eta^2}$;

• The rates of various $B \to DK$ decays depend on the phase $\gamma$, where $e^{i\gamma} = \frac{\rho + i\eta}{\sqrt{\rho^2 + \eta^2}}$;

• The rates of various $B \to \pi\pi, \rho\pi, \rho\rho$ decays depend on the phase $\alpha = \pi - \beta - \gamma$;

• The ratio between the mass splittings in the neutral B and $B_s$ systems is sensitive to $|V_{td}/V_{ts}|^2 = \lambda^2[(1-\rho)^2 + \eta^2]$;

• The CP violation in $K \to \pi\pi$ decays, $\epsilon_K$, depends in a complicated way on $\rho$ and $\eta$.

The resulting constraints are shown in Fig. 12.1.

Figure 12.1: Allowed region in the $(\rho, \eta)$ plane (CKMfitter, FPCP 13; the excluded area has CL > 0.95). Superimposed are the individual constraints from charmless semileptonic B decays ($|V_{ub}|$), mass differences in the $B^0$ ($\Delta m_d$) and $B_s$ ($\Delta m_s$) neutral meson systems, and CP violation in $K \to \pi\pi$ ($\epsilon_K$), $B \to \psi K$ ($\sin 2\beta$), $B \to \pi\pi, \rho\pi, \rho\rho$ ($\alpha$), and $B \to DK$ ($\gamma$).

The consistency of the various constraints is impressive. Given the consistency of the measure-

ments with the renormalizable SM, and the fact that all the SM parameters are known, one can

use the upper bounds on possible deviations from the SM predictions to set upper bounds on the

size of non-renormalizable terms.
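The projections onto the $(\rho, \eta)$ plane listed above can be evaluated directly from the central values of Eq. (12.14). The following is a minimal sketch; it ignores the quoted uncertainties, and the variable names are ours.

```python
import math

# Central values quoted in Eq. (12.14); uncertainties are ignored here.
lam, rho, eta = 0.2254, 0.131, 0.345

beta = math.atan2(eta, 1 - rho)      # phase of (1 - rho) + i eta
gamma = math.atan2(eta, rho)         # phase of rho + i eta
alpha = math.pi - beta - gamma       # the three angles sum to pi

sin2beta = 2 * eta * (1 - rho) / ((1 - rho) ** 2 + eta ** 2)
vub_sq_shape = rho ** 2 + eta ** 2                          # |V_ub|^2 is proportional to this
vtd_over_vts_sq = lam ** 2 * ((1 - rho) ** 2 + eta ** 2)    # |V_td/V_ts|^2

for name, angle in (("alpha", alpha), ("beta", beta), ("gamma", gamma)):
    print(f"{name} = {math.degrees(angle):.1f} deg")
print(f"sin(2 beta) = {sin2beta:.3f}")
print(f"|V_td/V_ts| = {math.sqrt(vtd_over_vts_sq):.3f}")
```

With these inputs $\beta$ comes out near 22 degrees and $\gamma$ near 69 degrees, so that $\alpha$ is close to a right angle.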

12.5.2 Non-renormalizable terms

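We now go beyond testing the self-consistency of the CKM picture of flavor physics and CP violation. The aim is to quantify how much room is left for new physics in the flavor sector and to translate these constraints into lower bounds on the scale of higher-dimension flavor-violating operators.

Table 12.1: Lower bounds on the scale of new physics $\Lambda$, in units of TeV, for $|z_{ij}| = 1$ and for ${\rm Im}\,z_{ij} \sim 1$, and upper bounds on $z_{ij}$, assuming $\Lambda = 1$ TeV.

\begin{tabular}{llllll}
Operator & $\Lambda$ [TeV] CPC & $\Lambda$ [TeV] CPV & $|z_{ij}|$ & ${\rm Im}(z_{ij})$ & Observables \\
$(\bar s_L\gamma^\mu d_L)^2$ & $9.8 \times 10^{2}$ & $1.6 \times 10^{4}$ & $9.0 \times 10^{-7}$ & $3.4 \times 10^{-9}$ & $\Delta m_K$; $\epsilon_K$ \\
$(\bar s_R d_L)(\bar s_L d_R)$ & $1.8 \times 10^{4}$ & $3.2 \times 10^{5}$ & $6.9 \times 10^{-9}$ & $2.6 \times 10^{-11}$ & $\Delta m_K$; $\epsilon_K$ \\
$(\bar c_L\gamma^\mu u_L)^2$ & $1.2 \times 10^{3}$ & $2.9 \times 10^{3}$ & $5.6 \times 10^{-7}$ & $1.0 \times 10^{-7}$ & $\Delta m_D$; $A_\Gamma$ \\
$(\bar c_R u_L)(\bar c_L u_R)$ & $6.2 \times 10^{3}$ & $1.5 \times 10^{4}$ & $5.7 \times 10^{-8}$ & $1.1 \times 10^{-8}$ & $\Delta m_D$; $A_\Gamma$ \\
$(\bar b_L\gamma^\mu d_L)^2$ & $6.6 \times 10^{2}$ & $9.3 \times 10^{2}$ & $2.3 \times 10^{-6}$ & $1.1 \times 10^{-6}$ & $\Delta m_B$; $S_{\psi K}$ \\
$(\bar b_R d_L)(\bar b_L d_R)$ & $2.5 \times 10^{3}$ & $3.6 \times 10^{3}$ & $3.9 \times 10^{-7}$ & $1.9 \times 10^{-7}$ & $\Delta m_B$; $S_{\psi K}$ \\
$(\bar b_L\gamma^\mu s_L)^2$ & $1.4 \times 10^{2}$ & $2.5 \times 10^{2}$ & $5.0 \times 10^{-5}$ & $1.7 \times 10^{-5}$ & $\Delta m_{B_s}$; $S_{\psi\phi}$ \\
$(\bar b_R s_L)(\bar b_L s_R)$ & $4.8 \times 10^{2}$ & $8.3 \times 10^{2}$ & $8.8 \times 10^{-6}$ & $2.9 \times 10^{-6}$ & $\Delta m_{B_s}$; $S_{\psi\phi}$ \\
\end{tabular}

Consider, for example, the following set of dimension-six operators: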

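\[ \mathcal{L}^{\Delta F=2}_{\rm NP} = \sum_{i \neq j} \frac{z_{ij}}{\Lambda^2} \left(\overline{Q_{Li}}\gamma_\mu Q_{Lj}\right)^2, \tag{12.15} \]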

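where the $z_{ij}$ are dimensionless couplings. The consistency of the experimental results with the SM predictions for neutral meson mixing allows us to impose the condition $|M^{\rm NP}_{P\bar P}| < |M^{\rm SM}_{P\bar P}|$ for $P = K, B, B_s$, which implies that

\[ \Lambda > \frac{4.4\ {\rm TeV}}{|V_{ti}^* V_{tj}|/|z_{ij}|^{1/2}} \sim \begin{cases} 1.3 \times 10^{4}\ {\rm TeV} \times |z_{sd}|^{1/2} \\ 5.1 \times 10^{2}\ {\rm TeV} \times |z_{bd}|^{1/2} \\ 1.1 \times 10^{2}\ {\rm TeV} \times |z_{bs}|^{1/2} \end{cases} \tag{12.16} \]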
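The numerical coefficients in Eq. (12.16) follow from simple arithmetic. As a rough check, the sketch below evaluates $4.4\ {\rm TeV}/|V_{ti}^* V_{tj}|$ using the central values of Eq. (12.13); the dictionary names are ours.

```python
# Central CKM magnitudes from Eq. (12.13).
V_td, V_ts, V_tb = 8.86e-3, 0.0405, 0.99914
prefactor_TeV = 4.4   # numerator of Eq. (12.16)

# |V_ti^* V_tj| for the three down-type transitions.
ckm_products = {
    "sd": V_ts * V_td,
    "bd": V_tb * V_td,
    "bs": V_tb * V_ts,
}

# Lower bound on Lambda in TeV for |z_ij| = 1.
bounds_TeV = {ij: prefactor_TeV / v for ij, v in ckm_products.items()}

for ij, bound in bounds_TeV.items():
    print(f"Lambda_{ij} > {bound:.1e} TeV x |z_{ij}|^(1/2)")
```

The coefficients come out at roughly $1.2 \times 10^4$, $5 \times 10^2$ and $1.1 \times 10^2$ TeV for the sd, bd and bs transitions, respectively.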

A more detailed list of the bounds derived from the ∆F = 2 observables in Table ?? is given in

Table 12.1. The bounds refer to two representative sets of dimension-six operators: (i) left-left

operators, that are also present in the SM, and (ii) operators with different chirality, where the

bounds are strongest because of larger hadronic matrix elements.

The first lesson that we draw from these bounds on Λ is that new physics can contribute to

FCNC at a level comparable to the SM contributions even if it takes place at a scale that is six

orders of magnitude above the electroweak scale. A second lesson is that if the new physics has a

generic flavor structure, that is $z_{ij} = O(1)$, then its scale must be above $10^4 - 10^5$ TeV.

A different lesson can be drawn from the bounds on zij. It could be that the scale of new physics

is of order TeV, but its flavor structure is far from generic. Specifically, if new particles at the TeV

scale couple to the SM fermions, then there are two ways in which their contributions to FCNC

processes, such as neutral meson mixing, can be suppressed: degeneracy and alignment. Either of

these principles, or a combination of both, signifies non-generic structure.

It is instructive to compare the FCNC bounds to those from electroweak precision measurements

(EWPM). The bounds from EWPM are of order 10 TeV, see Eq. (11.28), while FCNC bounds,

as can be seen from Table 12.1, are as high as $10^5$ TeV. In both cases the SM explains the data

well, and the strength of the bound depends on the theoretical and experimental errors, which are

roughly at the same level. The reason for the FCNC bounds being much stronger than the EWPM

bounds stems from the difference in the suppression factors. In both cases, the relevant SM effects

are at the one-loop level. Yet, for the FCNC, there are two extra suppression factors, CKM and GIM. These suppression factors result in much smaller SM effects and thus stronger bounds.

Appendix

12.A Extracting Vud

The key process to measure d → u transitions is β decay, and here we briefly discuss three different such decays:

• Nuclear β decay, for example tritium $^3$H to $^3$He or $^{14}$C to $^{14}$N.

• Free neutron β decay, $n \to p e\bar\nu$.

• Pion β decay, $\pi^+ \to \pi^0 e\nu$.

All of them probe the d → u transition, yet the QCD involved in each of them is different and thus requires a different treatment. The basic approach for determining $V_{ud}$ in each of these processes is the same. We take the amplitude and compare it to muon decay. The difference at the level of the quark diagram is the presence of the factor of $V_{ud}$.

12.A.1 Nuclear β decay

Nuclei are complicated objects and this seems hard to calculate from first principles. Fortunately,

symmetries make our lives much easier. In general we would like to determine the matrix element

of an operator O sandwiched between two nucleon states N and N′,

〈N |O|N ′〉. (12.17)

In the Standard Model O can be vector or axial. We know that we can appeal to cases where

either the A or V operator vanishes. For example, A = 0 when the initial and final states are spin zero, since a spin-changing operator gives zero when it acts on a spin-less state. In nuclear physics, N → N′ transitions between J = 0 states are called superallowed β decays. In such a

transition one of the neutrons within the nucleus N converts into a proton to produce nucleus N ′.

Nothing else has changed, certainly not spin.

Now we invoke an approximate symmetry: in the limit of exact isospin symmetry, nothing

changes when we make this n→ p transition in the nucleus. The n and p are just states of isospin

+1/2 and −1/2 respectively. The QCD transition matrix element should thus be unity. We thus

expect isospin breaking to give one or two percent corrections. In practice one can use a model to

estimate the size of these corrections.

The most general decay amplitude N → N′ takes the form:

\[ M^\mu = \langle N'(p')|A^\mu + V^\mu|N(p)\rangle. \tag{12.18} \]

We shall choose spin-less initial and final states, $J_N = J_{N'} = 0$, so that $\langle N'|A^\mu|N\rangle = 0$. Thus this reduces to the form factors that we introduced in (??),

\[ M^\mu = \langle N'(p')|V^\mu|N(p)\rangle = f_+(q^2)(p+p')^\mu + f_-(q^2)\,q^\mu. \tag{12.19} \]

Now we invoke the isospin limit where $m_N = m_{N'}$. Note that while in this limit the decay cannot occur kinematically, this is not a problem since we are interested in the calculation of the hadronic matrix element. Since the real world is very close to the symmetry limit, the matrix element at $q^2 = 0$ should be very close to its true value. In other words, $q^2$ is very small relative to $\Lambda_{\rm QCD}^2$ so that

\[ f_\pm(q^2) \approx f_\pm(0). \tag{12.20} \]

(We can test this assumption by looking at the spectrum of this decay with respect to q2. If f±

is constant then the spectrum is a straight line. This is precisely the dominant feature in the

so-called Kurie plot of tritium β decay.)

In the isospin limit we have yet another very helpful simplification: $f_-(q^2 \ll \Lambda_{\rm QCD}^2) = 0$. To see this we invoke the Ward identity,

\[ 0 = q_\mu M^\mu = f_+(q^2)(p^2 - p'^2) + f_-(q^2)\,q^2 = f_+(q^2)(m_N^2 - m_{N'}^2) + f_-(q^2)\,q^2. \tag{12.21} \]

Isospin tells us that $m_N = m_{N'}$ so that the $f_+$ term vanishes. This means that $f_-(q^2)\,q^2 = 0$.

Next we further simplify by using the non-relativistic limit. The energy emitted by the electron

in nuclear β decay is much smaller than the rest energy of the nucleus. The vector operator giving

us the non-relativistic contribution to this matrix element changes I3 by one unit; this is precisely

the difference between N and N ′ as far as isospin and QCD is concerned. In terms of quarks, this

is because $V^\mu = \bar u\gamma^\mu d$ and each of the quarks changes isospin by 1/2. Since this is a $\Delta I_3 = 1$ operator, we know that it is proportional to the usual SU(2) raising operator acting on isospin space,

\[ V|j,m\rangle \propto \sqrt{(j-m)(j+m+1)}\,|j,m+1\rangle. \tag{12.22} \]

The overall coefficient can be determined, but the point for us is that we do not care since we just want to take the ratio with $\mu \to e\nu\bar\nu$ anyway. The coefficient above is just a Clebsch-Gordan coefficient and tells us that, finally,

\[ \langle N'(p')|V^\mu|N(p)\rangle \propto \sqrt{(j-m)(j+m+1)}\,(p+p')^\mu, \tag{12.23} \]

where we stress that the j and m are isospin numbers.
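The group-theory factor in Eqs. (12.22) and (12.23) is straightforward to evaluate. The following is a small sketch for arbitrary isospin quantum numbers; the function name `raising_coefficient` is ours and only the standard SU(2) ladder formula is assumed.

```python
import math
from fractions import Fraction

def raising_coefficient(j, m):
    """Coefficient sqrt((j - m)(j + m + 1)) multiplying |j, m + 1>."""
    j, m = Fraction(j), Fraction(m)
    # m must be one of -j, -j + 1, ..., j.
    if not (-j <= m <= j) or (j - m).denominator != 1:
        raise ValueError("m must lie in -j, -j + 1, ..., j")
    return math.sqrt((j - m) * (j + m + 1))

# An isodoublet (j = 1/2) and an isotriplet (j = 1).
doublet = raising_coefficient(Fraction(1, 2), Fraction(-1, 2))
triplet = raising_coefficient(1, 0)
top_state = raising_coefficient(1, 1)   # highest state is annihilated

print(doublet, triplet, top_state)
```

For an isodoublet the coefficient is 1, while for transitions within an isotriplet it is $\sqrt{2}$; the highest-weight state is annihilated.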

The bottom line is impressive. We started with some hadronic matrix element that we knew

nothing about. Using Lorentz invariance, isospin and the non-relativistic limit we were able to

get rid of all QCD unknowns and end up with expressing everything in terms of a single number

that comes from group theory. After taking into account corrections to the approximations we’ve

made, one can extract

|Vud| = 0.97425(22). (12.24)

Look at how many significant digits we have. This is certainly an experimental triumph that we

have such accuracy despite QCD.

12.A.2 Neutron β decay

Let us briefly comment on the other measurements of Vud. Consider the free neutron decay process

$n \to p e\bar\nu$. While naively this appears to be very similar to the nuclear calculation, the external states are different: the hadronic states are spin one-half, so we cannot get away with using the $J_N = J_{N'} = 0$ trick to get rid of the axial contribution. This ends up giving six matrix elements to calculate, which nontrivially reduce to

\[ G_V = \langle n|V^\mu|p\rangle (p+p')_\mu, \qquad G_A = \langle n|A^\mu|p\rangle (p+p')_\mu. \tag{12.25} \]

The relevant quantity turns out to be the ratio of these matrix elements, $g_A \equiv G_A/G_V$. This is a ratio between two hadronic matrix elements with no symmetries forcing any obvious hierarchies between the two terms so that we expect $g_A \sim O(1)$. Indeed, it turns out that $g_A = 1.27$.

$V_{ud}$ depends on the neutron lifetime and $g_A$ as

\[ |V_{ud}|^{-2} \sim \tau_n \times (1 + 3g_A^2)(1 + \text{radiative corrections}). \tag{12.26} \]

We can calculate the right-hand side to determine Vud; it turns out to have roughly the same value

and precision as the nuclear matrix element method. This should be very reassuring since this

method carries different hadronic assumptions. For two recent reviews, see [?] and [?].
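To get a feeling for the numbers, here is a rough sketch of the extraction. It assumes that $|V_{ud}|^2$ is inversely proportional to $\tau_n(1 + 3g_A^2)$ and that the proportionality constant, with phase space and radiative corrections absorbed, is about 4908.7 s. That constant and the lifetime of 880 s are assumptions taken from the standard literature and are not given in this text; only $g_A = 1.27$ is quoted above.

```python
import math

# Assumed constant (in seconds) absorbing phase space and radiative
# corrections; not taken from the text.
K_SECONDS = 4908.7

def vud_from_neutron(tau_n, g_A):
    """|V_ud| from the neutron lifetime tau_n (in seconds) and g_A."""
    return math.sqrt(K_SECONDS / (tau_n * (1 + 3 * g_A ** 2)))

tau_n = 880.0    # assumed, approximate neutron lifetime in seconds
g_A = 1.27       # value quoted in the text

vud = vud_from_neutron(tau_n, g_A)
print(f"|V_ud| = {vud:.3f}")
```

With these rounded inputs the result lands within about half a percent of the nuclear value in Eq. (12.24). It is quite sensitive to $g_A$, so a precise comparison needs more digits than are used here.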

12.A.3 Pion β decay

Finally, for pion β decay, $\pi^+ \to \pi^0 e\nu$, the hadronic matrix element is between two spin-0 states,

\[ \langle\pi^0|O|\pi^\pm\rangle. \tag{12.27} \]

The pions live in an I = 1 isotriplet and are much simpler objects than nuclei. It even turns out

that the calculation of the corrections to the isospin limit is much easier for pions. In fact, this

is the cleanest way to measure Vud. Yet, it is suppressed by phase space. Experimentally we are

limited by the total number of these decays that we measure.

12.B Extracting Vcb

Here we discuss how to get $V_{cb}$. The main theoretical tool is heavy quark symmetry. We discuss the exclusive and inclusive approaches.

12.B.1 B → D decays

We consider the following decays

\[ B \to D\ell\nu, \tag{12.28} \]

\[ B \to D^*\ell\nu. \tag{12.29} \]

Under heavy quark symmetry the rates for (??) and (12.29) should be the same since the B and

B∗ (as well as the D and D∗) are doublets under spin symmetry.

In general $B \to D\ell\nu$ contains two form factors, while $B \to D^*\ell\nu$ contains four. In addition, we can also use the baryon decay,

\[ \Lambda_1 \to \Lambda_2\ell\nu, \tag{12.30} \]

which is essentially the same as neutron decay. This also has six form factors, giving a grand total

of twelve form factors between the three decays.

In the heavy quark limit all of these form factors are either zero or the same non-zero value,

which we can normalize to unity. This allows us to write everything in terms of a single form factor.

Even better, QCD differs from the heavy quark limit in terms of an object called the Isgur-Wise

function which is universal and depends only on $v_b \cdot v_c$.

Let us work in the specific limit

\[ \frac{m_b}{m_c} = {\rm const}\ (\sim 3) \tag{12.31} \]

while simultaneously taking mb →∞. Imagine a B meson to be composed of a b quark surrounded

by brown muck. Consider the velocity of the b to be the same as the velocity of the entire meson, $v_b = v$. Suppose the b decays into a c, and let us write the velocity of the c as $v_c = v'$.

Now we would like to ask what would happen if we replaced the B meson with a B∗ or if we

changed the D to a D∗? Would the brown muck also be excited? No, nothing would happen. We

have a theory of QCD where we treat electroweak currents as external sources. If Harry Potter

waved his magic wand and turned the b into a c quark, the brown muck would more or less stay the

same. This is just like our atomic physics analogy where adding another neutron to an atom won’t

lead to a big change in its chemical properties; nothing happens. And when nothing happens, the

intuitive value of the form factor is unity (suitably normalized, of course).

In the limit of v → v′, we don’t care about b versus c, D versus D∗, or any of these ‘magic

wand’ modifications that don’t significantly affect the brown muck. In fact, we don’t even care

how the b→ c transition occurs, as long as it is short distance. This transition could just as well

have come from a vector current, an axial current, some V ± A combination, or Harry Potter’s

magic wand.

Now consider $v \neq v'$. In the decay B → D this corresponds to the c quark picking up a non-zero

velocity in the meson rest frame. Note that we are not allowed to assume the non-relativistic limit

for the velocity of the c quark, v′. The only thing we know is that the b is non-relativistic in the

meson rest frame. (It is a fallacy to say that ‘everything’ is non-relativistic in the heavy-quark

limit.) The c can be relativistic in the rest frame of the B. In this case, the b decays to c with

some generically-not-small velocity and suddenly brown muck sees the color flow moving. Then

one of two things happens:

1. The brown muck can follow the color flow. This corresponds to B → D decay.

2. The brown muck can pop things out of the vacuum to produce other things.

This brings us back to the meaning of the form factor: the probability for B → D versus other

decay processes when things are not at rest. The analogous statement in the hydrogen/deuterium

atom is that not only do we take away a neutron, but we give the proton a kick. The form factor

can be interpreted as the overlap between the electron wavefunction before and after the kick and

represents the probability of finding an electron in a given shell after the proton kick. The case with

the brown muck is completely analogous except that we do not know the explicit wavefunction.

There’s one simple option for generating a Lorentz invariant out of v and v′: w ≡ v · v′, where

w is just what we usually call γ, the boost. The form factor is given by some universal function

of only w called the Isgur-Wise function, ξ(w), such that ξ(1) = 1, i.e. when v = v′. In general

w ≥ 1. We don’t know much more about ξ(w), but we can see how far we can go with these

properties.
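For orientation, the kinematic range of w in B → D decays can be worked out from $q^2 = m_B^2 + m_D^2 - 2 m_B m_D w$, which follows from $p = m_B v$ and $p' = m_D v'$. The sketch below assumes approximate meson masses (not given in the text) and neglects the lepton masses.

```python
def recoil_w(q2, m_B, m_D):
    """w = v . v' as a function of the momentum transfer squared q2."""
    return (m_B ** 2 + m_D ** 2 - q2) / (2 * m_B * m_D)

# Assumed, approximate meson masses in GeV (illustrative only).
m_B, m_D = 5.28, 1.87

q2_max = (m_B - m_D) ** 2              # D at rest in the B rest frame
w_min = recoil_w(q2_max, m_B, m_D)     # zero recoil
w_max = recoil_w(0.0, m_B, m_D)        # maximal recoil, massless leptons

print(f"w ranges from {w_min:.3f} to {w_max:.3f}")
```

The zero-recoil point $w = 1$ corresponds to the maximal $q^2$, and with these masses w reaches about 1.6 at $q^2 = 0$.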

Let us choose a somewhat different normalization so that we write the form factors as

\[ \frac{1}{m_B}\langle B(v)|V^\mu|B(v')\rangle = \xi(w)\,(v+v')^\mu. \tag{12.32} \]

We can compare this to the ‘usual’ way of parameterizing the form factors,

\[ \langle B|V^\mu|B\rangle = F(q^2)\,(p+p')^\mu. \tag{12.33} \]

By the way, why is there no (p− p′) term? It is zero in the isospin limit.

Now we would like to calculate the form factors based on the Isgur-Wise function. Let’s look

at the decay of B → D so that we would like

\[ \frac{1}{\sqrt{m_B m_D}}\langle B(v)|V^\mu|D(v')\rangle = \xi(w)\,(v+v')^\mu. \tag{12.34} \]

In the ‘usual’ notation,

\[ \langle B|V^\mu|D\rangle = F_+(q^2)(p+p')^\mu + F_-(q^2)\,q^\mu. \tag{12.35} \]

Our hope is to relate the form factors F± to the Isgur-Wise function ξ(w). First consider the case

v = v′ (w = 1). We find

\[ F_\pm(q^2)\Big|_{w=1} = \frac{m_B \pm m_D}{2\sqrt{m_B m_D}}. \tag{12.36} \]

HELP: please specify the ξ(w) dependence. In principle there are two form factors, but in

the heavy quark limit they are related so that we only have to determine one of them before we

can start making predictions.
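As a sketch of how the $\xi(w)$ dependence arises at leading order in the heavy quark limit, one can match Eqs. (12.34) and (12.35) directly. The derivation below is ours and rests on three assumptions: the right-hand side of Eq. (12.34) is read as $\xi(w)\,(v+v')^\mu$; the momenta are $p = m_B v$ for the B and $p' = m_D v'$ for the D; and the sign convention is $q = p' - p$ (with $q = p - p'$ the sign of $F_-$ flips).

```latex
\begin{align*}
\sqrt{m_B m_D}\,\xi(w)\,(v+v')^\mu
  &= F_+(q^2)\,(m_B v + m_D v')^\mu + F_-(q^2)\,(m_D v' - m_B v)^\mu, \\
\text{coefficient of } v^\mu:\quad
  \sqrt{m_B m_D}\,\xi(w) &= m_B\,\bigl[F_+(q^2) - F_-(q^2)\bigr], \\
\text{coefficient of } v'^\mu:\quad
  \sqrt{m_B m_D}\,\xi(w) &= m_D\,\bigl[F_+(q^2) + F_-(q^2)\bigr], \\
\Longrightarrow\quad
  F_\pm(q^2) &= \frac{m_B \pm m_D}{2\sqrt{m_B m_D}}\,\xi(w),
  \qquad w = \frac{m_B^2 + m_D^2 - q^2}{2 m_B m_D}.
\end{align*}
```

At $w = 1$, where $\xi(1) = 1$, this reduces to Eq. (12.36).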

Three remarks are in order:

1. Does this work for B → D∗? Yes, it’s more complicated because the form factors include

more terms, e.g. contractions with ε tensors, but at the end of the day everything is indeed

expressed in terms of ξ(w).

2. What kind of corrections are there to the heavy quark limit, $\xi(1) = 1$? There’s a correction from the finiteness of the heavy quark mass, $1/m_Q^2$, as well as perturbative $\alpha_S$ corrections from soft gluons.

3. Luke’s theorem. If you expand about an extremum, the correction is always second order. This is why the leading order correction to $\xi(w) = \xi(1) + \cdots$ starts at $O(1/m_Q^2)$. The

point ξ(1) is an extremum because we’re looking at a situation with ‘maximum wavefunction

overlap.’

Now what do we want to do with all of this? We would like to measure B → D and use the

spectrum to obtain |Vcb|. Ideally we would like to look at the case where w = 1, i.e. where the

charm is at rest, but the phase space—and hence statistics—for this is very small. We must thus

look at all events, plotting the total number of events as a function of w. We can then use the data to extrapolate the curve to the w = 1 case. [Insert plot.] Since we know ξ(1) = 1 we can expand

about this point,

$$\xi(w) = 1 - \rho^2 (w - 1), \tag{12.37}$$

and fit for the parameter ρ. There is a small unremovable theoretical uncertainty associated with

this. The number that we get is something like 0.44. (HELP: check this)
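The extrapolation to w = 1 can be illustrated with a minimal sketch. It assumes that the measured quantity is proportional to |Vcb| ξ(w) with the linear form of Eq. (12.37), so that a straight-line fit returns the normalization at w = 1 and the slope ρ². The function name `fit_isgur_wise` and the pseudo-data are hypothetical; the points are generated from the linear form itself and are not measurements.

```python
def fit_isgur_wise(ws, ys):
    """Least-squares fit of y = N * (1 - rho2 * (w - 1)) to (w, y) points.

    Returns (N, rho2), where N is the fitted value at w = 1.
    """
    xs = [w - 1.0 for w in ws]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # value extrapolated to w = 1
    return intercept, -slope / intercept


# Hypothetical, noise-free pseudo-data (illustration only, not a measurement).
TRUE_NORM, TRUE_RHO2 = 0.04, 0.8
ws = [1.05, 1.15, 1.25, 1.35, 1.45, 1.55]
ys = [TRUE_NORM * (1.0 - TRUE_RHO2 * (w - 1.0)) for w in ws]

norm, rho2 = fit_isgur_wise(ws, ys)
print(norm, rho2)
```

In a real analysis the points carry statistical errors and the fit would be weighted; the sketch only shows why data away from w = 1 still determine the value at w = 1.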

12.B.2 Inclusive decays

There is another way to get information about |Vcb|. Consider the decay B → Xc ℓν, where Xc means any state containing a c quark. This is called an inclusive decay. There is a principle called quark-hadron duality¹ which says that information about inclusive hadronic decays tells us about the parton-level processes. For B → Xc, for example, this duality tells us that

$$B \to X_c\, \ell \nu \;\approx\; b \to c\, \ell \nu . \tag{12.38}$$

¹ This is a duality from the pre-string theory days, when we [phenomenologists] had more dualities than they did.

The important symbol here is the ‘≈.’ The B decays into something charmed and potentially a

lot of other stuff. When we sum over all of this ‘other stuff,’ we say that this amplitude should be

approximately the same as that of the quark-level b→ c amplitude. Said in another way, one can

predict the rates for inclusive hadronic processes by calculating the quark process.

It is important to stress that despite its seemingly innocuous appearance, this quark-hadron

duality is far from trivial. Consider the measurement of the famous ‘R ratio’,

$$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} . \tag{12.39}$$

Recall that the plot of R versus q² is one of the famous checks for the existence of quarks. Quark-hadron duality tells us that this plot should look like a series of step functions with a step at each quark’s mass threshold, with some smoothing due to phase space. [include plot]
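The step structure can be made concrete with a short sketch. It assumes the leading-order parton-level result R = N_c Σ Q_q², summed over the quarks that are kinematically accessible. The threshold energies below are rough, illustrative numbers chosen for this example and are not taken from the text.

```python
from fractions import Fraction

N_COLORS = 3

# (flavor, electric charge, rough pair-production threshold in GeV).
# The thresholds are illustrative assumptions, not fitted values.
QUARKS = [
    ("u", Fraction(2, 3), 0.0),
    ("d", Fraction(-1, 3), 0.0),
    ("s", Fraction(-1, 3), 1.0),
    ("c", Fraction(2, 3), 3.0),
    ("b", Fraction(-1, 3), 9.5),
]


def r_parton(sqrt_s):
    """Leading-order parton-level R at center-of-mass energy sqrt_s (GeV)."""
    return N_COLORS * sum(q * q for _, q, thr in QUARKS if sqrt_s > thr)


for energy in (2.0, 5.0, 20.0):
    print(energy, r_parton(energy))
```

Each new flavor raises the plateau by 3 Q_q², which is the series of steps described above.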

If we compare this to actual experimental plots of the R ratio, we can immediately see the

problem. [include plot] There are lots of peaks associated with hadronic resonances which

clearly do not appear in the quark-level analysis. For example, at the ρ resonance, e+e− → ρ→ ππ

has a huge cross section—much larger than one would predict from the naive quark-level diagrams.

Clearly there’s a subtlety in the quark-hadron duality principle. The subtlety is that we must

smear out the data, for example by smoothing it numerically in Mathematica. What is the scale

of the smoothing? ΛQCD, of course! When we smear out features of this size and smaller (that

is, when we integrate over these features), we begin to follow the quark-level plot.
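A toy illustration of the smearing, written here in Python rather than Mathematica. Everything in it is an assumption made for the example: the flat plateau stands in for the quark-level prediction, the narrow bump stands in for a resonance, and the ±0.15 GeV window is a stand-in for a scale of order ΛQCD.

```python
def boxcar_smooth(values, half_width):
    """Average each point with its neighbors within +-half_width bins."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def toy_r(e, level=2.0, peak=30.0, e0=0.77, width=0.02):
    """Toy curve: flat plateau plus a narrow Lorentzian bump (not real data)."""
    return level + peak * width**2 / ((e - e0) ** 2 + width**2)


energies = [0.3 + 0.01 * i for i in range(141)]  # 0.30 ... 1.70 GeV
raw = [toy_r(e) for e in energies]
smooth = boxcar_smooth(raw, half_width=15)       # window of +-0.15 GeV

print(max(raw), max(smooth))
```

The narrow bump dominates the raw curve, while the smoothed curve stays much closer to the plateau.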

To measure |Vcb| we just have to plot the spectrum of B → Xc ℓν, integrate over some region

of q², and pretend that we’re looking at a plot of b → cℓν. In the heavy quark limit it’s clear that

the B decay really is the same as the b decay, so that to leading order we’re done. Of course, after

developing HQET we can go further and discuss the corrections coming from λ1 and λ2. [See

long homework problem on this.]

Operator dependence. Manohar and Wise originally calculated the 1/m² corrections to

B → Xc eν from the vector operator. Suppose you wanted to calculate the scalar operator, as

Yuval and Zoltan Ligeti did when they were graduate students. How do you expect the λ1 and λ2

corrections to relate to those calculated by Manohar and Wise? The λ1 prefactor is the same

while λ2 is different. This is because λ1 is the correction coming from the kinetic energy of the

heavy quark, which has nothing to do with the operator.


Homework

Question 12.1: Exotic light quarks

We consider a model with the gauge symmetry SU(3)C×SU(2)L×U(1)Y spontaneously broken

by a single Higgs doublet into SU(3)C × U(1)EM . The lepton sector is as in the SM. The quark

sector, however, differs from the SM one as it consists of three quark flavors, that is, we do not

have the c, b and t quarks. The quark representations are non-standard. Of the left handed quarks,

QL = (uL, dL) form a doublet of SU(2)L while sL is a singlet. All the right handed quarks are

singlets. All color representations and electric charges are the same as in the standard model.

1. Write down (a) the gauge interactions of the quarks with the charged W bosons (before

SSB); (b) the Yukawa interactions (before SSB); (c) the bare mass terms (before SSB); (d)

the mass terms after SSB.

2. Show that there are four physical flavor parameters in this model. How many are real and

how many imaginary? Is there CP violation in this model? Separate the parameters into

masses, mixing angles and phases.

3. Are there photon- and gluon-mediated FCNCs? Support your answer by an argument based on

symmetries.

4. Write down the gauge interactions of the quarks with the Z boson in both the interaction

basis and the mass basis. (You do not have to rewrite terms that do not change when you

rotate to the mass basis. Write only the terms that are modified by the rotation to the mass

basis.) Are there generally tree level Z exchange FCNC’s?

5. We assume that the masses of the particles and the value of the Cabibbo angle are as found

in Nature and that the leptons are described by the SM. Then, in this model we can have

processes like KL → µ+µ−. Estimate its rate (normalize it to K+ → µ+ν).

6. Explain why this result practically ruled out the model.


Question 12.2: Two Higgs doublet model

Consider the two Higgs doublet model (2HDM) extension of the SM. In this model we add a

Higgs doublet to the SM fields. Namely, instead of the one Higgs field of the SM we now have two,

denoted by φ1 and φ2. For simplicity you can work with two generations when the third generation

is not explicitly needed.

1. Write down (in a matrix notation) the most general Yukawa potential of the quarks.

2. Carry out the diagonalization procedure for such a model. Show that the Z couplings are

still flavor diagonal.

3. In general, however, there are FCNCs in this model mediated by the Higgs bosons. To show that, write the Higgs fields as $\mathrm{Re}(\phi_i) = v_i + h_i$ where i = 1, 2 and $v_i \neq 0$ is the vev of $\phi_i$, and define $\tan\beta = v_2/v_1$. Then, write down the Higgs–fermion interaction terms in the mass basis. Assuming that there is no mixing between the Higgs fields, you should find non-diagonal Higgs–fermion interaction terms.

4. Since there are FCNCs in this model, processes like $b \to s\ell^+\ell^-$ can proceed at tree level. Assume that $\tan\beta \sim 1$ and $m_{H_i} \sim m_W$ and give a very rough estimate of the ratio

$$\frac{\Gamma(b \to s\mu^+\mu^-)}{\Gamma(b \to c\mu^-\bar\nu)} . \tag{12.40}$$

(For the numerical values use mb = 4.3 GeV and Vcb = 0.04.)

5. The current upper bound on this ratio is $5 \times 10^{-6}$. Can we already probe this model using

this ratio?

6. Can you find a symmetry that will forbid the Higgs exchange FCNCs? In particular, try to

find a symmetry that will couple φ1 only to the up type quarks, and φ2 to the down type

quarks.

Question 12.3: The four mesons

It is now time to come back to the question of why there are only four meson pairs that are relevant to flavor oscillations. Explain why the following systems are irrelevant to flavor oscillations:

1. $B^+ - B^-$.

2. $s - d$. Consider a case where, say, a charm quark decays into a superposition of d and s quarks.

3. $T - \bar{T}$ (a T is a meson made out of a t quark and a u quark).

4. $K^* - \bar{K}^*$ oscillation.

Hint: The last three cases all have to do with time scales. In principle there are oscillations in

these systems, but they are irrelevant.

Question 12.4: Kaons

Here we study some properties of the kaon system. We did not talk about it at all. You have

to go back and recall (or learn) how kaons decay, and combine that with what we discussed in the

lecture.

1. Because $\tau_L \gg \tau_S$ we have $y_K \approx 1$. What is the reason for that?

2. In a hypothetical world where we could change the mass of the kaon without changing any other masses, how would the value of $y_K$ change if we made $m_K$ smaller or larger?

Question 12.5: Mixing beyond the SM

Consider a model without a top quark, in which the first two generations are as in the SM,

while the left–handed bottom (bL) and the right–handed bottom (bR) are SU(2) singlets.

1. Draw a tree-level diagram that contributes to $B - \bar{B}$ mixing in this model.

2. Is there a tree-level diagram that contributes to $K - \bar{K}$ mixing?

3. Is there a tree-level diagram that contributes to $D - \bar{D}$ mixing?

Question 12.6: Condition for CP violation

Using Eq. (10.24), show that in order to observe CP violation, $\Gamma(B \to f) \neq \Gamma(\bar{B} \to \bar{f})$, we need two amplitudes with different weak and strong phases.

Question 12.7: Mixing formalism

In this question, you are asked to develop the general formalism of meson mixing.

1. Show that the mass and width differences are given by

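$$4(\Delta m)^2 - (\Delta\Gamma)^2 = 4\left(4|M_{12}|^2 - |\Gamma_{12}|^2\right), \qquad \Delta m\,\Delta\Gamma = 4\,\mathrm{Re}(M_{12}\Gamma_{12}^*), \tag{12.41}$$

and that

$$\left|\frac{q}{p}\right| = \left|\frac{\Delta m - i\Delta\Gamma/2}{2M_{12} - i\Gamma_{12}}\right| . \tag{12.42}$$

2. When CP is a good symmetry all mass eigenstates must also be CP eigenstates. Show that CP invariance requires

$$\left|\frac{q}{p}\right| = 1 . \tag{12.43}$$

3. In the limit $|\Gamma_{12}| \ll |M_{12}|$ show that

$$\Delta m = 2|M_{12}|, \qquad \Delta\Gamma = 2|\Gamma_{12}|\cos\theta, \qquad \left|\frac{q}{p}\right| = 1 . \tag{12.44}$$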

4. Derive Eqs. (10.21).

5. Derive Eq. (10.36).

6. Show that when $\Delta\Gamma = 0$ and $|q/p| = 1$

$$\Gamma(B \to X\ell^-\bar\nu)[t] = e^{-\Gamma t}\sin^2(\Delta m\, t/2), \qquad \Gamma(B \to X\ell^+\nu)[t] = e^{-\Gamma t}\cos^2(\Delta m\, t/2). \tag{12.45}$$
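The relations in Eqs. (12.41) and (12.42) can be spot-checked numerically before attempting the derivation. The sketch below assumes the standard effective Hamiltonian H = M − iΓ/2 with equal diagonal entries, so that the eigenvalue splitting is 2√(H12 H21), identified here with ∆m − i∆Γ/2. The input values of M12 and Γ12 are arbitrary test numbers, and the function name is hypothetical.

```python
import cmath


def mixing_observables(m12, g12):
    """Return (dm, dgamma, |q/p|) from the off-diagonal elements M12, Gamma12.

    Assumes H = M - i*Gamma/2 with equal diagonal entries; the splitting of
    its eigenvalues is 2*sqrt(H12*H21), identified with dm - i*dgamma/2.
    """
    h12 = m12 - 0.5j * g12
    h21 = m12.conjugate() - 0.5j * g12.conjugate()
    split = 2.0 * cmath.sqrt(h12 * h21)
    dm = split.real
    dgamma = -2.0 * split.imag
    q_over_p = abs(cmath.sqrt(h21 / h12))
    return dm, dgamma, q_over_p


# Arbitrary test values (not physical inputs).
m12 = 1.0 * cmath.exp(0.3j)
g12 = 0.2 * cmath.exp(1.1j)
dm, dg, qp = mixing_observables(m12, g12)

lhs1 = 4 * dm**2 - dg**2
rhs1 = 4 * (4 * abs(m12) ** 2 - abs(g12) ** 2)
lhs2 = dm * dg
rhs2 = 4 * (m12 * g12.conjugate()).real
qp_formula = abs((dm - 0.5j * dg) / (2 * m12 - 1j * g12))

print(lhs1 - rhs1, lhs2 - rhs2, qp - qp_formula)
```

A numerical check of this kind does not replace the derivation, but it catches sign and factor-of-two slips quickly.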

Question 12.8: B → π+π− and CP violation

One of the interesting decays to consider is B → ππ. Here we only briefly discuss it.

1. First assume that there is only a tree level decay amplitude (that is, neglect penguin amplitudes). Draw the Feynman diagram of the amplitude, paying special attention to its CKM dependence.

2. In that case, which angle of the unitarity triangle is the time dependent CP asymmetry,

Eq. (10.36), sensitive to?

3. Can you estimate the error introduced by neglecting the penguin amplitude? (Note that one

can use isospin to reduce this error. Again, you are encouraged to read about it in one of

the reviews.)

Question 12.9: B decays and CP violation

Consider the decays B0 → ψKS and B0 → φKS. Unless explicitly noted, we always work

within the framework of the standard model.

1. B0 → ψKS is a tree-level process. Write down the underlying quark decay. Draw the

tree level diagram. What is the CKM dependence of this diagram? In the Wolfenstein

parametrization, what is the weak phase of this diagram?


2. Write down the underlying quark decay for B0 → φKS. Explain why there is no tree level

diagram for B0 → φKS.

3. The leading one loop diagram for B0 → φKS is a gluonic penguin diagram. As we have

discussed, there are several diagrams and only their sum is finite. Draw a representative

diagram with an internal top quark. What is the CKM dependence of the diagram? In the

Wolfenstein parametrization, what is the weak phase of the diagram?

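4. Next we consider the time dependent CP asymmetries. We define as usual

$$\lambda_f \equiv \frac{\bar{A}_f}{A_f}\,\frac{q}{p}, \qquad A_f \equiv A(B^0 \to f), \qquad \bar{A}_f \equiv A(\bar{B}^0 \to f). \tag{12.46}$$

In our case we neglect subleading diagrams and then we have $|\lambda_f| = 1$ and thus

$$a_f \equiv \frac{\Gamma(B^0(t) \to f) - \Gamma(\bar{B}^0(t) \to f)}{\Gamma(B^0(t) \to f) + \Gamma(\bar{B}^0(t) \to f)} = -\mathrm{Im}\lambda_f\, \sin(\Delta m_B\, t). \tag{12.47}$$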

Both $a_{\psi K_S}$ and $a_{\phi K_S}$ measure the same angle of the unitarity triangle. That is, in both cases, $\mathrm{Im}\lambda_f = \sin 2x$ where x is one of the angles of the unitarity triangle. What is x? Explain.

5. Experimentally,

$$\mathrm{Im}\lambda_{\psi K_S} = 0.68(2), \qquad \mathrm{Im}\lambda_{\phi K_S} = 0.59(14). \tag{12.48}$$

Comment about these two results. In particular, do you think these two results are in

disagreement?

6. Assume that in the future we will find

$$\mathrm{Im}\lambda_{\psi K_S} = 0.68(1), \qquad \mathrm{Im}\lambda_{\phi K_S} = 0.55(3). \tag{12.49}$$

That is, that the two results are not the same. Below are three possible “solutions”. For

each solution explain if you think it could work or not. If you think it can work, show how.

If you think it cannot, explain why.

(a) There are standard model corrections that we neglected.

(b) There is a new contribution to $B^0$–$\bar{B}^0$ mixing with a weak phase that is different from the SM one.

(c) There is a new contribution to the gluonic penguin with a weak phase that is different

from the SM one.
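When comparing pairs of numbers such as those in Eqs. (12.48) and (12.49), it helps to express the difference in units of its combined uncertainty. The sketch below assumes uncorrelated Gaussian errors, which is an assumption of this example and not a statement from the text; the function name `pull` is hypothetical.

```python
import math


def pull(a, sigma_a, b, sigma_b):
    """Difference of two measurements in units of the combined uncertainty,
    assuming uncorrelated Gaussian errors."""
    return (a - b) / math.hypot(sigma_a, sigma_b)


current = pull(0.68, 0.02, 0.59, 0.14)  # values of Eq. (12.48)
future = pull(0.68, 0.01, 0.55, 0.03)   # hypothetical values of Eq. (12.49)

print(round(current, 2), round(future, 2))
```

The first pair differs by well under one standard deviation, while the hypothetical second pair differs by about four.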

Question 12.10: FCNCs, GIM and finiteness

The GIM mechanism is also important in understanding the finiteness of loop amplitudes. Any one loop amplitude corresponding to a decay where the tree level amplitude is zero must be finite. Technically, this can be seen by noticing that if it were divergent, a counter term at tree level would be needed, but that cannot be the case if the tree-level amplitude vanishes. The amplitude for b → sγ is naively log divergent. (Make sure you do the counting and see it for yourself.) Yet, it is only the $m_i$ independent term that diverges. The GIM mechanism is here to save us as it guarantees that this term is zero. The $m_i$ dependent term is finite, as it should be.


Chapter 13

Neutrinos

13.1 Introduction

In the SM, the neutrinos are exactly massless. Experiments, however, established that neutrinos

have masses. While the individual neutrino mass eigenvalues are not known, two mass-squared

differences are inferred from experiments:

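$$\begin{aligned}
\Delta m^2_{21} &\equiv m_2^2 - m_1^2 = (7.5 \pm 0.2) \times 10^{-5}\ \mathrm{eV}^2, \\
\Delta m^2_{32} &\equiv m_3^2 - m_2^2 = \pm(2.3 \pm 0.1) \times 10^{-3}\ \mathrm{eV}^2.
\end{aligned} \tag{13.1}$$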

We discuss the way these ranges were obtained in section 13.4. This is a clear experimental

indication of physics beyond the SM.

The SM prediction that the neutrinos are massless is related to the lepton number symmetry.

The SM prediction that the neutrinos do not mix is related to the lepton flavor symmetry. Similar

to other predictions that depend on accidental symmetries of the SM, these predictions are violated

in generic extensions of the SM. In Chapter 11 we saw that d = 6 terms violate the approximate

custodial symmetry of the scalar sector of the SM, and consequently are constrained by EWP

measurements. In Chapter ?? we saw that d = 6 terms violate the approximate flavor symmetries

of the SM, and consequently are constrained by measurements of FCNC processes. In this chapter

we will see that d = 5 terms violate the accidental lepton number and lepton flavor symmetries of

the SM, and consequently are probed by measurements of neutrino masses and mixing.

13.2 The νSM: The SM with d = 5 terms

In this section we study a model that we call the νSM. It is the SM extended to include the most

general d = 5 terms.

There is a single class of dimension five terms that depend on SM fields and obey the SM

symmetries. These terms involve two SU(2)-doublet lepton fields and two SU(2)-doublet scalar fields:

$$\mathcal{L}_{\nu{\rm SM}} = \mathcal{L}_{\rm SM} + \frac{Z^\nu_{ij}}{\Lambda}\,\phi\phi L_i L_j , \tag{13.2}$$

where $Z^\nu$ is a symmetric and complex 3 × 3 matrix of dimensionless couplings, and Λ is a high mass scale, $\Lambda \gg v$.

13.2.1 The neutrino spectrum

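With $\phi^0$ acquiring a VEV, $\langle\phi^0\rangle = v/\sqrt{2}$, $\mathcal{L}_{\nu{\rm SM}}$ in Eq. (13.2) has a piece that corresponds to a Majorana mass matrix for the neutrinos:

$$\mathcal{L}_{\nu{\rm SM,mass}} = \frac{1}{2}(m_\nu)_{ij}\,\nu_i\nu_j, \qquad (m_\nu)_{ij} = \frac{v^2}{\Lambda} Z^\nu_{ij} . \tag{13.3}$$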

The matrix mν can be diagonalized by a unitary transformation:

$$V_{\nu L}\, m_\nu\, V_{\nu L}^T = \hat{m}_\nu = \mathrm{diag}(m_1, m_2, m_3). \tag{13.4}$$

Majorana mass matrices are always symmetric. While the diagonalization of a general mass matrix M involves a general bi-unitary transformation, $M_{\rm diag} = V_L M V_R^\dagger$, for a symmetric mass matrix the diagonalization is by a unitary matrix and its transpose, as in Eq. (13.4).

We denote the corresponding neutrino mass eigenstates by ν1, ν2, ν3. If we choose a basis

where the mixing angles are all not larger than π/4, there are six possible mass orderings. Yet,

based on the experimental results (as we explain below) there are only two possibilities that are

consistent with the data. The convention here is that the states ν1 and ν2 are the ones separated by

the smaller mass-squared difference, with m2 > m1. The state ν3 is the one whose mass-squared

difference from the other two is the largest. It is not yet known experimentally whether it is

heavier (‘normal hierarchy’) or lighter (‘inverted hierarchy’) than the other two. This convention

is in one-to-one correspondence with the way that the experimental results are presented in Eq. (13.1): $|\Delta m^2_{32}| > \Delta m^2_{21} > 0$.

13.2.2 The scale of generation of neutrino masses

In this section we explain the implications of the measured neutrino masses for the scale Λ where

these masses are generated. As long as experiments probe only the low energy effective theory,

what is measured is the combination Zν/Λ. Thus, there is an ambiguity in the definition of Λ and

Zν . The separation of the coefficient of a d = 5 term to a dimensionless coupling and a scale is

meaningful when we discuss a full high energy theory which generates the effective term. What we

refer to as the scale of a non-renormalizable term is $\Lambda/Z^\nu$ (or, in case that $Z^\nu$ is a matrix, as in Eq. (13.2), $\Lambda/Z^\nu_{\max}$, where $Z^\nu_{\max}$ is the largest eigenvalue of $Z^\nu$). Note, however, that the combination of a measurement of $\Lambda/Z^\nu$ and the assumption that $Z^\nu$ is generated by perturbative physics and therefore $Z^\nu \lesssim 1$ translates into an upper bound on Λ.

The measurements of the neutrino mass-squared differences, Eq. (13.1), do not tell us the individual masses of the neutrinos, though they provide a lower bound on two mass eigenvalues: There is at least one neutrino mass heavier than $\sqrt{|\Delta m^2_{32}|}$,

$$m_{\rm heaviest} \geq \sqrt{|\Delta m^2_{32}|} \simeq 0.05\ \mathrm{eV}, \tag{13.5}$$

and there is at least one additional mass heavier than $\sqrt{\Delta m^2_{21}} \sim 0.009$ eV. There is, however,

additional information from experiments and cosmology which provides an upper bound on the

absolute mass scale of the neutrinos of order 1 eV. We discuss the experimental upper bounds in

Appendix 13.A.
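The two lower bounds follow directly from the central values in Eq. (13.1). A minimal sketch: for a given lightest mass and ordering, the other two masses are fixed by the mass-squared differences, and setting the lightest mass to zero saturates the bounds. The function name `spectrum` and the ordering labels are choices made for this example.

```python
import math

DM2_21 = 7.5e-5      # eV^2, central value from Eq. (13.1)
DM2_32_ABS = 2.3e-3  # eV^2, central value of |Delta m^2_32|


def spectrum(m_lightest, ordering):
    """Return (m1, m2, m3) in eV for a given lightest mass and ordering."""
    if ordering == "normal":      # m3 is the heaviest state
        m1 = m_lightest
        m2 = math.sqrt(m1**2 + DM2_21)
        m3 = math.sqrt(m2**2 + DM2_32_ABS)
    elif ordering == "inverted":  # m3 is the lightest state
        m3 = m_lightest
        m2 = math.sqrt(m3**2 + DM2_32_ABS)
        m1 = math.sqrt(m2**2 - DM2_21)
    else:
        raise ValueError("ordering must be 'normal' or 'inverted'")
    return m1, m2, m3


normal = spectrum(0.0, "normal")
inverted = spectrum(0.0, "inverted")
print(normal, inverted)
```

With a vanishing lightest mass, the heaviest state in the inverted ordering sits exactly at √|∆m²32| ≈ 0.048 eV, and the second state in the normal ordering sits at √∆m²21 ≈ 0.0087 eV.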

The effective low energy Lagrangian of Eq. (13.2) where, by definition, $\Lambda \gg v$, predicts that the neutrino masses are much lighter than the weak scale:

$$m_{1,2,3} \sim v^2/\Lambda \ll v . \tag{13.6}$$

The fact that experiments find that the neutrinos are indeed much lighter than the W mass, makes

the notion that neutrino masses are generated by d = 5 terms very plausible.

In fact, all fermions of the SM except for the top quark are light relative to mW . The lightness of

charged fermions is related to the smallness of the corresponding Yukawa couplings. The question

of why Yukawa couplings are small may find an answer in a more fundamental theory, beyond the

SM. The neutrinos, however, are not only much lighter than mW , but also lighter by at least six

orders of magnitude than all charged fermions. This extreme lightness of the neutrinos is explained

if their masses are generated by d = 5 terms.

Clearly, the SM cannot be a valid theory above the Planck scale, $\Lambda \lesssim M_{\rm Pl}$. We thus expect that $m_i \gtrsim v^2/M_{\rm Pl} \sim 10^{-5}$ eV. A more relevant scale might be the scale of Grand Unified Theories (GUTs). In GUTs, the $G_{\rm SM} = SU(3)_C \times SU(2)_L \times U(1)_Y$ gauge group of the SM is assumed to be a subgroup of a unifying group, such as SU(5), which is spontaneously broken to $G_{\rm SM}$ at a scale $\Lambda_{\rm GUT} = O(10^{16}\ \mathrm{GeV})$. If the d = 5 terms are generated at $\Lambda_{\rm GUT}$, then we expect $m_\nu \sim 10^{-2}$ eV.

Conversely, an experimental lower bound on neutrino masses provides an upper bound on the

scale of relevant new physics. Using the lower bound of Eq. (13.5) and the relation of Eq. (13.3),

we conclude that the SM cannot be a valid theory above the scale

$$\Lambda \lesssim \frac{v^2}{m_\nu} \sim 10^{15}\ \mathrm{GeV}. \tag{13.7}$$

This proves that the SM cannot be valid up to the Planck scale. Furthermore, this upper bound

is intriguingly close to the GUT scale.
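The order-of-magnitude estimates in this section can be reproduced with a few lines of arithmetic. The sketch assumes v ≈ 246 GeV, a Planck mass of about 1.2 × 10^19 GeV, and couplings Z of order one; these inputs and the function names are choices made for this example.

```python
V_EW = 246.0        # GeV, electroweak VEV (assumed input)
GEV_TO_EV = 1.0e9

M_PLANCK = 1.2e19   # GeV (assumed input)
LAMBDA_GUT = 1.0e16 # GeV, the scale quoted in the text


def neutrino_mass_ev(scale_gev, z=1.0):
    """Neutrino mass m ~ z * v^2 / Lambda, returned in eV."""
    return z * V_EW**2 / scale_gev * GEV_TO_EV


def scale_gev(mass_ev, z=1.0):
    """Scale Lambda ~ z * v^2 / m in GeV for a neutrino mass given in eV."""
    return z * V_EW**2 / (mass_ev / GEV_TO_EV)


m_planck = neutrino_mass_ev(M_PLANCK)   # about 5e-6 eV
m_gut = neutrino_mass_ev(LAMBDA_GUT)    # about 6e-3 eV
lambda_max = scale_gev(0.05)            # about 1.2e15 GeV

print(m_planck, m_gut, lambda_max)
```

The three numbers match the estimates of order 10^-5 eV, 10^-2 eV and 10^15 GeV quoted in the text, up to factors of order one.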

13.2.3 The neutrino interactions

Table 13.1: The neutrino interactions

interaction    force carrier    coupling
NC weak        Z0               e/(2 sW cW)
CC weak        W±               g U/√2
Yukawa         h                2m/v

The addition of the dimension-five terms leads to significant changes in the phenomenology of the lepton sector. The modifications can be understood by re-writing the neutrino-related terms in the mass basis. The renormalizable SM gives

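$$\mathcal{L}_{{\rm SM},\nu} = i\,\bar\nu_\alpha \slashed{\partial}\, \nu_\alpha - \frac{g}{2c_W}\,\bar\nu_\alpha \slashed{Z}\, \nu_\alpha - \frac{g}{\sqrt{2}}\left(\bar\ell_{L\alpha} \slashed{W}^- \nu_\alpha + {\rm h.c.}\right), \tag{13.8}$$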

where α = e, µ, τ . (The Lagrangian (13.8) describes massless neutrinos, and consequently the basis

(νe, νµ, ντ ) serves as both an interaction basis and a mass basis.) The Lagrangian of Eq. (13.2)

gives

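$$\begin{aligned}
\mathcal{L}_{\nu{\rm SM},\nu} ={}& i\,\bar\nu_i \slashed{\partial}\, \nu_i - \frac{g}{2c_W}\,\bar\nu_i \slashed{Z}\, \nu_i - \frac{g}{\sqrt{2}}\left(\bar\ell_{L\alpha} \slashed{W}^- U_{\alpha i}\,\nu_i + {\rm h.c.}\right) \\
& + m_i\,\nu_i\nu_i + \frac{2m_i}{v}\, h\,\nu_i\nu_i + \frac{m_i}{v^2}\, hh\,\nu_i\nu_i .
\end{aligned} \tag{13.9}$$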

Here α = e, µ, τ denotes only the charged lepton mass eigenstates, while i = 1, 2, 3 denotes the

neutrino mass eigenstates. The neutrino mass parameters m1,2,3 are real, and the mixing matrix

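U is unitary. Starting from an arbitrary interaction basis, the matrix U is given by

$$U = V_{eL} V_{\nu L}^\dagger . \tag{13.10}$$

While each of $V_{eL}$ and $V_{\nu L}$ is basis-dependent, the combination $V_{eL}V_{\nu L}^\dagger$ is not. Explicitly we write it as

$$U = \begin{pmatrix} U_{e1} & U_{e2} & U_{e3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{pmatrix}. \tag{13.11}$$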

The most significant changes from (13.8) to (13.9) concerning neutrino interactions are the

following:

• The leptonic charged current interactions are neither universal nor diagonal. Instead, they

involve the mixing matrix U .

• The Higgs boson has Yukawa couplings to neutrinos. These couplings break lepton number. The size of the Yukawa couplings is, however, tiny, of order $m_i/v \sim 10^{-13}$, leading to an unobservably small branching ratio for h → νν.

The νSM-neutrinos thus have three types of interactions, mediated by massive bosons. These

interactions are summarized in Table 13.1.


13.2.4 Accidental symmetries and the lepton mixing parameters

The dimension-five terms in Eq. (13.2) break the U(1)e × U(1)µ × U(1)τ accidental symmetry of

the SM. With the addition of only d = 5 terms, all that remains of the $G^{\rm global}_{\rm SM}$ symmetry of the SM [see Eq. (8.58)] is baryon number symmetry:

$$G^{\rm global}_{\nu{\rm SM}} = U(1)_B . \tag{13.12}$$

This symmetry is, however, anomalous and broken by non-perturbative effects. In addition, it is

broken by dimension-six terms.

The counting of flavor parameters in the quark sector remains unchanged: six quark masses

and four mixing parameters, of which one is imaginary. How many physical flavor parameters

are involved in the lepton sector? The Lagrangian of Eq. (13.2) involves the 3 × 3 matrix $Y^e$ (9 real and 9 imaginary parameters), and the symmetric 3 × 3 matrix $Z^\nu$ (6 real and 6 imaginary parameters). The kinetic and gauge terms have a $U(3)_L \times U(3)_E$ accidental global symmetry, that is completely broken by the $Y^e$ and $Z^\nu$ terms. Thus, the number of physical lepton flavor parameters is $(15_R + 15_I) - 2 \times (3_R + 6_I) = 9_R + 3_I$. Six of the real parameters are the three charged lepton masses $m_{e,\mu,\tau}$ and the three neutrino masses $m_{1,2,3}$. We conclude that the 3 × 3 unitary matrix U depends on three real mixing angles and three phases.
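The counting above generalizes to n generations, and a short sketch makes the bookkeeping explicit. It assumes, as in the text, a general complex n × n matrix for the charged leptons, a symmetric complex n × n matrix for the neutrinos, and a fully broken U(n) × U(n) symmetry, where each U(n) has n(n−1)/2 real angles and n(n+1)/2 phases. The function name is hypothetical.

```python
def lepton_flavor_parameters(n):
    """Count physical lepton flavor parameters for n generations
    with Majorana neutrino masses (general Y^e, symmetric Z^nu)."""
    yukawa_real, yukawa_imag = n * n, n * n
    majorana_real = majorana_imag = n * (n + 1) // 2
    # Two broken U(n) factors: n(n-1)/2 angles and n(n+1)/2 phases each.
    broken_real = 2 * (n * (n - 1) // 2)
    broken_imag = 2 * (n * (n + 1) // 2)
    real = yukawa_real + majorana_real - broken_real
    imag = yukawa_imag + majorana_imag - broken_imag
    masses = 2 * n  # n charged lepton masses plus n neutrino masses
    return {
        "real": real,
        "imaginary": imag,
        "masses": masses,
        "mixing_angles": real - masses,
        "phases": imag,
    }


three = lepton_flavor_parameters(3)
two = lepton_flavor_parameters(2)
print(three, two)
```

For n = 3 this reproduces 9 real and 3 imaginary parameters. For n = 2 it gives one mixing angle and one phase, in contrast to the two-generation quark sector, which has no physical phase.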

Why does the lepton mixing matrix U depend on three phases, while the quark mixing matrix V

depends on only a single phase? The reason for this difference lies in the fact that the Lagrangian of

Eq. (13.2) leads to Majorana masses for neutrinos. Consequently, there is no freedom in changing

the mass basis by redefining the neutrino phases, as such redefinition will introduce phases into

the neutrino mass terms. While redefinitions of the six quark fields allowed us to remove five non-physical phases from V, redefinitions of the three charged lepton fields allow us to remove only three non-physical phases from U. The two additional physical phases in U are called “Majorana

phases,” since they appear as a result of the (assumed) Majorana nature of neutrinos. They affect

lepton number violating processes.

A convenient parametrization of U is the following:

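$$U = \begin{pmatrix}
c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\
-s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\
s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13}
\end{pmatrix} \times \mathrm{diag}(1, e^{i\alpha_1}, e^{i\alpha_2}), \tag{13.13}$$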

where α1,2 are the Majorana phases, sij ≡ sin θij and cij ≡ cos θij. We describe the experimental

determination of the lepton mixing parameters in Section 13.4.
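As a quick sanity check of the parametrization in Eq. (13.13), the sketch below builds U for a set of angles and phases and verifies unitarity numerically. The numerical values of the angles and phases are illustrative inputs chosen for this example, not fitted values, and the function names are hypothetical.

```python
import cmath
import math


def pmns(theta12, theta23, theta13, delta, alpha1=0.0, alpha2=0.0):
    """Lepton mixing matrix in the parametrization of Eq. (13.13)."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    e = cmath.exp(1j * delta)
    rows = [
        [c12 * c13, s12 * c13, s13 * e.conjugate()],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]
    majorana = [1.0, cmath.exp(1j * alpha1), cmath.exp(1j * alpha2)]
    return [[rows[a][i] * majorana[i] for i in range(3)] for a in range(3)]


def unitarity_defect(u):
    """Largest deviation of U U^dagger from the identity matrix."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            entry = sum(u[a][i] * u[b][i].conjugate() for i in range(3))
            worst = max(worst, abs(entry - (1.0 if a == b else 0.0)))
    return worst


# Illustrative inputs in radians (assumptions of this example).
U = pmns(0.59, 0.79, 0.15, delta=1.2, alpha1=0.4, alpha2=2.0)
print(unitarity_defect(U), abs(U[0][2]))
```

The Majorana phases multiply whole columns, so they drop out of the absolute values |U_{αi}| and of the unitarity check.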

The present status of our knowledge of the absolute values of the various entries in the lepton

mixing matrix can be summarized as follows (we quote here the 3σ ranges):

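$$|U| = \begin{pmatrix}
0.80\text{--}0.85 & 0.51\text{--}0.58 & 0.14\text{--}0.16 \\
0.22\text{--}0.52 & 0.44\text{--}0.70 & 0.61\text{--}0.79 \\
0.25\text{--}0.53 & 0.46\text{--}0.71 & 0.59\text{--}0.78
\end{pmatrix}. \tag{13.14}$$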

When working in the mass basis, the formalisms of quark and lepton flavor mixing are very

similar. The difference between these two phenomena arises due to the way neutrino experiments

are done. While quarks and charged leptons are identified as mass eigenstates, neutrinos are

identified as interaction eigenstates. Explicitly, they are identified as νe or νµ or ντ according to

whether they produce in the detector an e or µ or τ lepton, respectively.

There are several ways to think about fermion mixing. The basic point is that there are three

relevant matrices in each of the quark and lepton sectors – two mass matrices and one matrix

for the W -couplings – and at most two can be simultaneously diagonal. For quarks, mixing is

best understood in the mass basis. The W couplings to quark mass eigenstates are not diagonal,

namely mu and md are diagonal but V is not. For leptons, it is often more convenient to work in

an interaction basis which is also the charged lepton mass basis. Here me and the W couplings

are diagonal but mν is not.

Note also that the indices of the two matrices are reversed. In V the first index corresponds to

the T3 = +1/2 component of the doublet and the second one to the T3 = −1/2 component. In U

it is the other way around.

13.3 The NSM: The SM with singlet fermions

In Section 13.2 we introduced the νSM and showed that the addition of non-renormalizable, dimension-five terms to the SM Lagrangian gives the neutrinos masses and, moreover, explains why they are much

lighter than the charged fermions. Non-renormalizable terms must arise from a more fundamental

theory. One uses the term “UV-completion” for a full high energy theory that leads to the effective

theory. For the full high energy theory we again write only renormalizable terms.

In this section we provide an example of a full high energy theory that generates at low energy

the dimension-five terms. (In your homework you will work out one more example.) This extension

amounts to adding heavy gauge-singlet fermions to the SM. The way these singlets generate masses

to the neutrinos is called the seesaw mechanism. The reason for this name, as will become clear

when we analyze the model, is that the heavier the singlet fermions, the lighter the neutrinos.

The Lagrangian Ld=5 of Eq. (13.30) can come not only from the NSM, but also from other

high-energy theories. In particular, there are three types of seesaw models, called type I, II, and

III, which differ by the type of heavy fields that one adds to the SM:

• Type I: $(1, 1)_0$ fermion fields. (This is the NSM we discuss below.)

• Type II: $(1, 3)_{-1}$ scalar fields. (An example is provided by the LRS model in your homework.)

• Type III: $(1, 3)_0$ fermion fields. (You will work out this model in your homework.)

All three types of seesaw models predict that the light neutrinos have Majorana masses, and that their mass scale is inversely proportional to the mass scale of the new heavy particles (hence the name “seesaw models”) and, in particular, much lower than the electroweak breaking scale. These kinds of models arise in various extensions of the SM, such as SO(10) grand unified theories (GUT), and left-right symmetric (LRS) models.

13.3.1 Defining the NSM

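The NSM is defined as follows:

(i) The symmetry is a local

$$SU(3)_C \times SU(2)_L \times U(1)_Y . \tag{13.15}$$

(ii) There are three fermion generations (i = 1, 2, 3), each consisting of six different representations:

$$Q_{Li}(3, 2)_{+1/6},\quad U_{Ri}(3, 1)_{+2/3},\quad D_{Ri}(3, 1)_{-1/3},\quad L_{Li}(1, 2)_{-1/2},\quad E_{Ri}(1, 1)_{-1},\quad N_{Ri}(1, 1)_{0}. \tag{13.16}$$

(iii) There is a single scalar multiplet:

$$\phi(1, 2)_{+1/2} . \tag{13.17}$$

(iv) The pattern of spontaneous symmetry breaking is as follows:

$$SU(3)_C \times SU(2)_L \times U(1)_Y \to SU(3)_C \times U(1)_{\rm EM} \quad (Q_{\rm EM} = T_3 + Y). \tag{13.18}$$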

13.3.2 The NSM Lagrangian

The NSM has the same gauge group, the same scalar content, and the same pattern of spontaneous

symmetry breaking as the SM. In the fermion sector, all the SM representations are included. The

only difference is the addition of the fermionic NRi fields. Since the imposed symmetry is the same,

all the terms that appear in the SM Lagrangian appear also in the NSM Lagrangian. The NSM

Lagrangian has, however, several additional terms. These are all the terms that involve the NRi

fields. We can write:

LNSM = LSM + LN . (13.19)

Our task now is to find the specific form of LN . We note the following points in this regard:

1. Given that the NR fields are singlets of the gauge group, we have DµNR = ∂µNR.

2. Since the NR fields carry no conserved charge, they can have Majorana mass terms.

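3. The combination $\overline{L_L} N_R$ transforms as $(1, 2)_{+1/2}$ under the gauge group, and can thus have a Yukawa coupling to the scalar doublet.

We thus obtain the most general form for the renormalizable terms in $\mathcal{L}_N$:

$$\mathcal{L}_N = i\,\overline{N_{Ri}}\, \slashed{\partial}\, N_{Ri} - \left(\frac{1}{2} M^N_{ij}\, \overline{N^c_{Ri}} N_{Rj} + Y^\nu_{ij}\, \overline{L_{Li}}\, \tilde\phi\, N_{Rj} + {\rm h.c.}\right), \qquad \tilde\phi \equiv i\sigma_2 \phi^* . \tag{13.20}$$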

Here MN is a symmetric 3 × 3 complex matrix, with entries of mass dimension 1, and Y ν is a

general 3× 3 complex matrix of dimensionless Yukawa couplings.

13.3.3 The NSM spectrum

As concerns the spectrum of this theory, clearly the bosonic spectrum remains unchanged from

the SM. As concerns the fermions, we note that, since the NR fields are singlets of the full gauge

group, they are also singlets of the unbroken subgroup, namely they transform as (1)0 under

SU(3)C×U(1)EM. This means that also the spectrum of the charged fermions (quarks and charged

leptons) remains unchanged from the SM.

As concerns the neutrinos (the νL components of the SU(2)-doublet leptons and the NR fields),

taking into account the spontaneous symmetry breaking, we find the following mass terms in LN :

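$$\mathcal{L}_{N,{\rm mass}} = -\frac{1}{2} M^N_{ij}\, \overline{N^c_{Ri}} N_{Rj} - \frac{Y^\nu_{ij}\, v}{\sqrt{2}}\, \overline{\nu_{Li}} N_{Rj} + {\rm h.c.} \tag{13.21}$$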

This gives a 6 × 6 neutrino mass matrix, that can be decomposed into four 3 × 3 blocks as follows [see Eq. (??)]:

$$M_\nu = \begin{pmatrix} 0 & m_D \\ m_D^T & M^N \end{pmatrix}, \qquad m_D = \frac{v Y^\nu}{\sqrt{2}} . \tag{13.22}$$

To obtain the six neutrino mass eigenstates, we need to diagonalize Mν .

Note that, unlike the case for the charged leptons and the quarks, where there are three mass

eigenstates, there are six neutrino mass eigenstates. The reason is that charged fermions have Dirac

masses, and each mass eigenstate has four DoF. The neutrinos have Majorana masses, where each

mass eigenstate has only two DoF. The total number of DoF is thus the same in each sector.

Before we analyze the neutrino spectrum of the three generation case, we gain some intuition

by analyzing the simpler, one generation model, where we have only one copy of the L and N

fields. Without loss of generality, we choose a basis where Mν of Eq. (13.22) is a symmetric real

2 × 2 matrix. We further make an important assumption, inspired by both phenomenology and

theoretical model building: We assume that $M^N$ is much larger than the electroweak breaking scale, $M^N \gg v$. To leading order in $m_D/M^N$, we obtain the following mass eigenvalues and mixing angle:

$$m_1 = \frac{m_D^2}{M^N}, \qquad m_2 = M^N, \qquad \sin\theta = \frac{m_D}{M^N} . \tag{13.23}$$

These results demonstrate several important points about the see-saw mechanism:

1. The mixing angle between the two states is very small. The light state is almost a pure

SU(2)-doublet ν, while the heavy one is almost a pure SU(2)-singlet N .


2. The light mass is inversely proportional to the heavy one. This is the reason why the mechanism

that generates masses for the light neutrinos via their Yukawa couplings to heavy neutrinos

is called “the see-saw mechanism.”

3. The NSM constitutes a possible UV-completion of the νSM. Within the NSM, the unspecified

scale of the νSM, Λ, is interpreted as the mass scale of the heavy neutrino, MN .
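The one-generation results of Eq. (13.23) can be compared with the exact diagonalization of the real symmetric 2 × 2 matrix with entries (0, m_D; m_D, M_N). The sketch below uses the closed-form eigenvalues (M_N ± √(M_N² + 4m_D²))/2 and the mixing angle from tan 2θ = 2m_D/M_N. The input values are illustrative assumptions, and the light eigenvalue is returned as an absolute value since its sign can be absorbed in a field redefinition.

```python
import math


def seesaw_exact(m_d, m_n):
    """Exact (|light mass|, heavy mass, sin(theta)) for [[0, m_d], [m_d, m_n]]."""
    root = math.hypot(m_n, 2.0 * m_d)      # sqrt(M^2 + 4 m_D^2)
    heavy = 0.5 * (m_n + root)
    # Equal to 0.5 * (root - m_n), written to avoid numerical cancellation.
    light = 2.0 * m_d**2 / (m_n + root)
    theta = 0.5 * math.atan2(2.0 * m_d, m_n)
    return light, heavy, math.sin(theta)


# Illustrative inputs in GeV: a weak-scale Dirac mass and a very heavy singlet.
m_d, m_n = 100.0, 1.0e14
light, heavy, sin_theta = seesaw_exact(m_d, m_n)
print(light * 1.0e9, "eV", heavy, "GeV", sin_theta)

# A milder hierarchy shows the size of the corrections to Eq. (13.23).
light_mild, heavy_mild, _ = seesaw_exact(1.0, 10.0)
print(light_mild, 1.0**2 / 10.0)
```

With these inputs the light mass comes out near 0.1 eV. For the mild hierarchy the exact light mass is about one percent below m_D²/M_N, which is the size expected for corrections of relative order m_D²/M_N².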

We now return to the three generation case. We can always use a unitary transformation to

bring $M^N$ of Eq. (13.22) to a diagonal and real form:

$$M^N \to U_N^T M^N U_N = \hat{M}^N = \mathrm{diag}(M_1, M_2, M_3). \tag{13.24}$$

Unlike the SM, which has a single dimensionful parameter, v, the NSM has four dimensionful

parameters: v,M1,M2,M3. We assume that the Majorana masses of the heavy states are much

higher than the weak scale:

$$M_{1,2,3} \gg v . \tag{13.25}$$

We define $m_N$ to be the mass scale of the heavy neutrinos. Then, we can perform the diagonalization of $M_\nu$ to leading order in $v/m_N$. First, we use the unitary matrix K,

$$K = \begin{pmatrix} 1 & m_D M_N^{-1} \\ -m_D M_N^{-1} & 1 \end{pmatrix}, \tag{13.26}$$

where we omitted terms of order $v^2/m_N^2$, to block-diagonalize $M_\nu$:

$$K M_\nu K^T = \begin{pmatrix} -m_D^T M_N^{-1} m_D & 0 \\ 0 & M_N \end{pmatrix}. \tag{13.27}$$

The lower-right block is already diagonalized. The upper-left block,

$$ m_\nu = m_D^T M_N^{-1} m_D, \tag{13.28} $$

can be diagonalized by a further unitary transformation:

$$ V_{\nu L}^T m_\nu V_{\nu L} = \hat{m}_\nu = \mathrm{diag}(m_1, m_2, m_3). \tag{13.29} $$

We thus learn the following points:

1. There are three heavy Majorana neutrinos of masses M1,M2,M3. We call these states

N1, N2, N3. These mass eigenstates are approximately SU(2)-singlet states, but have a small,

O(v/mN), SU(2)-doublet component. The masses are, by assumption, much larger than the

electroweak scale.

2. There are three light neutrinos of masses m1,m2,m3 of order v2/mN . We call these states

ν1, ν2, ν3. These mass eigenstates are approximately SU(2)-doublet states, but have a small,

O(v/mN), SU(2)-singlet component. The masses are much smaller than the electroweak

scale.


Table 13.1: The NSM particles

particle       spin   color   Q       mass
W±             1      (1)     ±1      (1/2) g v
Z0             1      (1)     0       (1/2) √(g² + g′²) v
A0             1      (1)     0       0
g              1      (8)     0       0
h              0      (1)     0       √(2λ) v
e, µ, τ        1/2    (1)     −1      y_{e,µ,τ} v/√2
ν1, ν2, ν3     1/2    (1)     0       m_{1,2,3}
N1, N2, N3     1/2    (1)     0       M_{1,2,3}
u, c, t        1/2    (3)     +2/3    y_{u,c,t} v/√2
d, s, b        1/2    (3)     −1/3    y_{d,s,b} v/√2

The details of the spectrum of the NSM are summarized in Table 13.1. There are three different

mass scales:

• The masses of all bosons and of the charged fermions are of order v.

• The masses of the (approximately) singlet neutrinos are heavy, of order mN .

• The masses of the (approximately) doublet neutrinos are light, of order v2/mN .

Furthermore, the heavier the gauge-singlet neutrinos, the lighter the SU(2)L-doublet neutrinos.

If the singlet neutrinos are very heavy, then they cannot be produced directly in experiments.

(Given that they are gauge-singlets, it would be difficult to produce them even if it were kinemat-

ically possible to do so.) They can thus be integrated out from the theory. This would leave the

SM as the effective low energy theory, with non-renormalizable terms suppressed by mN , the mass

scale of the heavy neutrinos. The dimension-five terms are

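$$ \mathcal{L}_{d=5} = \frac{Z^\nu_{ij}}{\Lambda}\, \phi\phi L_i L_j, \tag{13.30} $$

where

$$ \frac{Z^\nu_{ij}}{\Lambda} = \left[ Y^\nu (M_N)^{-1} Y^{\nu T} \right]_{ij}. \tag{13.31} $$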

Thus, the leading terms in the low energy effective theory of the NSM are those of Eq. (13.2). We

learn that the NSM is indeed a possible UV completion of the νSM.

We end this subsection with an explanation of why we choose to have three NR fields in our NSM.

• With a single NR field, the matrix Zν has two zero eigenvalues. Thus the model predicts

that two of the νi’s are massless and is therefore excluded.


• With two NR fields, the matrix Zν has a single zero eigenvalue. Thus the model predicts a

single massless νi and is phenomenologically viable. If experiments prove that the neutrino

spectrum is quasi-degenerate, the model will be excluded.

• With three NR fields, the matrix Zν is a general symmetric 3 × 3 matrix of complex, di-

mensionless couplings. Thus the model can accommodate any light neutrino spectrum and

mixing.

• The low energy effective theory of NSM models with more than three NR fields is the same

as that of the model with three NR fields.

Thus, three is the minimal number required to generate at low energy the most general LνSM. This

is the reason that we define the NSM in this way.

13.3.4 The Ni interactions

The interaction eigenstates NR are gauge-singlets. Consequently, they do not have any gauge

interactions. The only type of interaction that they do have is Yukawa interaction, as described

by the Lagrangian LN of Eq. (13.20). Unlike the Yukawa interactions of the SM, in the NSM the

Higgs boson has off-diagonal couplings. In particular, it couples the heavy Ni states to the light νj

states. As explained in Section 12.3, the special SM features which lead to diagonality are, first,

that for a given fermion sector all fermions are chiral and therefore there are no bare mass terms

and, second, that the scalar sector has a single Higgs doublet. In the NSM, the first condition is

violated, and the Higgs boson has neutrino flavor changing couplings.

The heavy mass eigenstates Ni have (in addition to the dominant NR component) a small

component, of order v/mN , of νL, the SU(2)L doublets. This means that the Ni fields also have

weak interactions. The leptonic charged current interaction depends on a 3 × 6 lepton mixing

matrix, where the three rows refer to the three charged lepton mass eigenstates (e, µ, τ), and

the six columns refer to the six neutrino mass eigenstates (ν1, ν2, ν3, N1, N2, N3). The neutral

current interaction of neutrinos is neither universal nor even diagonal. As explained in Section

12.3, universality of the Z couplings holds only if all fermions of given chirality and given color

and charge come from the same SU(2) × U(1) representation. In the NSM, neutrinos come from

two different types of SU(2) × U(1) representations, (2)−1/2 and (1)0, and therefore there is no

universality in the couplings of the Z-boson to neutrinos. The NSM demonstrates then that the

absence of tree-level FCNCs is a rather special feature of the SM, that is violated in general by

new physics.

The Yukawa and weak interactions of the Ni states imply that they are unstable, and decay

to a light lepton (charged or neutral) and a boson (h, Z, or W ). By assumption, however, the Ni

particles are heavy, and thus cannot be produced in experiments, and their interactions cannot be


directly probed. We thus do not study their interactions any further here. We note, however, that

the Yukawa interactions of the Ni particles might be the source of the baryon asymmetry of the

Universe, a scenario that is known by the name of leptogenesis. See Section 14.1.4 for a discussion.

13.3.5 The case of mN ≪ v: Sterile neutrinos

So far we worked under the assumption of Eq. (13.25), that is, $m_N \gg v$. Here we comment on the consequences of the opposite hierarchy, $m_N \ll v$. Note that in this case the extreme lightness of the neutrinos is accounted for by tiny Yukawa couplings, $y_\nu \lesssim 10^{-12}$, rather than by a high scale

of new physics.

Let us first discuss the special case of MN = 0. In order to satisfy the Naturalness principle

discussed in Section 1, setting MN to zero requires that we postulate a global symmetry – lepton

number – in addition to the gauge symmetry of the SM. Lepton number is an anomalous symmetry,

but U(1)B−L is non-anomalous and achieves the same renormalizable Lagrangian, namely LNSM

with MN = 0.

With MN = 0, the neutrinos are Dirac particles, similarly to the charged fermions. This should

be the case, because in this model the neutrinos carry a conserved charge (L, or B−L). One may

think that this is the simplest way for neutrinos to acquire their masses. However, extending the SM

not only by adding matter fields but also by imposing a global symmetry is a much more dramatic

modification than just adding matter fields and considering the most general Lagrangian, as is the

case for the NSM. Furthermore, in the NSM, the lightness of the doublet neutrinos is explained by

a new high scale of physics, which is well motivated. In contrast, the lightness of Dirac neutrinos

requires that their dimensionless Yukawa couplings are set to be tiny by hand.

For finite $m_N$ but with $m_N \ll v$, we have to consider two cases. First, if $m_N \sim y_\nu v$, then there are six very light Majorana mass eigenstates. Light states that are dominantly electroweak singlets are called sterile neutrinos, while those that are dominantly electroweak doublets are called active neutrinos. Sterile neutrinos can significantly change the neutrino phenomenology, but so far there is no conclusive evidence for their existence. Second, if $m_N \ll y_\nu v$, the six Majorana mass eigenstates divide into three pairs of pseudo-Dirac neutrinos. Each pair is quasi-degenerate, with average mass of order $y_\nu v$ and splitting of order $m_N$.

The question of whether the neutrinos are Majorana or Dirac fermions is not yet experimentally

decided. If neutrinoless double beta decay is observed (see discussion in Section 13.A.2), it will

prove their Majorana nature.


13.4 Probing neutrino masses

13.4.1 Neutrino oscillations in vacuum

In experiments, neutrinos are produced and detected by charged current weak interactions. Thus,

the states that are relevant to production and detection are the SU(2)L-doublet partners of the

charged lepton mass eigenstates, namely νe, νµ, ντ . On the other hand, the eigenstates of free

propagation in spacetime are the mass eigenstates, ν1, ν2, ν3. In general, the interaction eigenstates

are different from the mass eigenstates:

$$ |\nu_\alpha\rangle = U^*_{\alpha i} |\nu_i\rangle \qquad (\alpha = e, \mu, \tau;\ i = 1, 2, 3). \tag{13.32} $$

Consequently, flavor is not conserved during propagation in spacetime and, in general, we may produce $\nu_\alpha$ but detect $\nu_\beta \neq \nu_\alpha$.

The probability Pαβ of producing neutrinos of flavor α and detecting neutrinos of flavor β is

calculable in terms of:

• The neutrino energy $E$;

• The distance between source and detector $L$;

• The mass-squared differences $\Delta m^2_{ij} \equiv m_i^2 - m_j^2$;

• The parameters (mixing angles and the Dirac phase) of the mixing matrix $U$.

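Starting from Eq. (13.32), we can write the expression for the time-evolved state $|\nu_\alpha(t)\rangle$ (where $|\nu_\alpha(0)\rangle \equiv |\nu_\alpha\rangle$):

$$ |\nu_\alpha(t)\rangle = U^*_{\alpha i} |\nu_i(t)\rangle, \tag{13.33} $$

where

$$ |\nu_i(t)\rangle = e^{-iE_i t} |\nu_i(0)\rangle. \tag{13.34} $$

In all cases of interest, the neutrinos are relativistic, and we then approximate

$$ E_i = p_i + \frac{m_i^2}{2E_i}. \tag{13.35} $$

Thus, the probability of a state that is produced as $\nu_\alpha$ to be detected as $\nu_\beta$ is given by

$$ P_{\alpha\beta} = |\langle \nu_\beta | \nu_\alpha(t) \rangle|^2. \tag{13.36} $$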

Explicit calculation (see your homework) gives

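$$ P_{\alpha\beta} = \delta_{\alpha\beta} - 4 \sum_{j>i} \mathrm{Re}\left( U_{\alpha i} U^*_{\beta i} U^*_{\alpha j} U_{\beta j} \right) \sin^2\left( \frac{\Delta m^2_{ij} L}{4E} \right) + 2 \sum_{j>i} \mathrm{Im}\left( U_{\alpha i} U^*_{\beta i} U^*_{\alpha j} U_{\beta j} \right) \sin\left( \frac{\Delta m^2_{ij} L}{2E} \right). \tag{13.37} $$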


If we apply this calculation to the two generation case, where there is a single mixing angle

(and no relevant phase) and a single mass-squared difference,

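$$ U = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \qquad \Delta m^2 = m_2^2 - m_1^2, \tag{13.38} $$

we obtain, for $\alpha \neq \beta$,

$$ P_{\alpha\beta} = \sin^2 2\theta \, \sin^2 x, \qquad x \equiv \frac{\Delta m^2 L}{4E}. \tag{13.39} $$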

We learn that the time evolution of neutrinos that are produced in a flavor eigenstate exhibits

oscillations (as a function of time or, equivalently, distance) between the different flavor eigenstates.

This phenomenon is known as “neutrino oscillations.”
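The two-generation result of Eq. (13.39) can be reproduced by summing the mass-eigenstate amplitudes directly, as in Eqs. (13.33)-(13.36). The sketch below assumes natural units (so that L is measured in inverse energy) and drops the common phase, which does not affect the probability; the function names and the sample numbers are illustrative only.

```python
import cmath
import math


def p_two_flavor_amplitude(theta, dm2, length, energy):
    """P(nu_alpha -> nu_beta), alpha != beta, from the sum over mass-eigenstate amplitudes."""
    # Mixing matrix of Eq. (13.38): rows are flavors, columns are mass eigenstates.
    u = [[math.cos(theta), math.sin(theta)],
         [-math.sin(theta), math.cos(theta)]]
    # Only the mass-squared difference matters, so m1^2 is set to zero.
    m2 = [0.0, dm2]
    # E_i - p = m_i^2 / (2E) for relativistic neutrinos, Eq. (13.35); U is real here.
    amp = sum(u[1][i] * u[0][i] * cmath.exp(-1j * m2[i] * length / (2.0 * energy))
              for i in range(2))
    return abs(amp) ** 2


def p_two_flavor_closed(theta, dm2, length, energy):
    """Closed form of Eq. (13.39)."""
    x = dm2 * length / (4.0 * energy)
    return math.sin(2.0 * theta) ** 2 * math.sin(x) ** 2


# Arbitrary dimensionless test point (natural units).
args = (0.6, 2.0, 3.0, 1.5)
print(p_two_flavor_amplitude(*args), p_two_flavor_closed(*args))
```

The two printed numbers agree up to rounding, and the appearance probability never exceeds sin² 2θ.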

The expression (13.39) depends on two parameters that are related to the experimental design,

E and L, and two that are parameters of the Lagrangian, ∆m2 and θ. To be sensitive to the

Lagrangian parameters, one has to design the experiment appropriately:

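$$ \begin{aligned} \Delta m^2 L/E \ll 1: &\quad P_{\alpha\beta} \to 0, \\ \Delta m^2 L/E \sim 1: &\quad P_{\alpha\beta} \text{ sensitive to } \Delta m^2 \text{ and } \theta, \\ \Delta m^2 L/E \gg 1: &\quad P_{\alpha\beta} \to \tfrac{1}{2} \sin^2 2\theta. \end{aligned} \tag{13.40} $$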

We learn that to allow observation of neutrino oscillations, Nature needs to provide sin2 2θ that is

not too small. To get an intuition it is useful to write

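$$ x \approx 1.27 \left( \frac{\Delta m^2}{\mathrm{eV}^2} \right) \left( \frac{L}{\mathrm{km}} \right) \left( \frac{\mathrm{GeV}}{E} \right). \tag{13.41} $$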

To probe small ∆m2, we need experiments with large L/E. Indeed, given natural neutrino sources

as well as neutrinos produced in reactors and accelerators, we can probe a rather large range of

∆m2; see the list in Table 13.1.
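The numerical factor in Eq. (13.41) follows from restoring ħc in x = ∆m²L/(4E). The sketch below recomputes it and uses it to estimate the ∆m² at which x ∼ 1 for a given experiment. The value of ħc is an input not given in the text (it is the standard 197.327 MeV fm), and the function names are illustrative.

```python
HBAR_C_EV_M = 1.973269804e-7  # hbar*c in eV*m (assumed standard value, not from the text)


def oscillation_phase_coefficient():
    """Numerical factor c in x = c * (dm2 / eV^2) * (L / km) * (GeV / E)."""
    hbar_c_ev_km = HBAR_C_EV_M * 1.0e-3  # convert eV*m to eV*km
    gev_in_ev = 1.0e9
    return 1.0 / (4.0 * gev_in_ev * hbar_c_ev_km)


def dm2_for_unit_phase(energy_gev, length_km):
    """Mass-squared difference in eV^2 for which x = 1 at the given E and L."""
    return energy_gev / (oscillation_phase_coefficient() * length_km)


print(f"coefficient: {oscillation_phase_coefficient():.4f}")
# Long-baseline reactor setup: E ~ 1 MeV, L ~ 100 km.
print(f"dm2 at x = 1: {dm2_for_unit_phase(1.0e-3, 100.0):.2e} eV^2")
```

The estimate is only an order-of-magnitude guide: an experiment remains sensitive over a range of ∆m² around the value where x ∼ 1.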

It is interesting to understand the differences between neutrino oscillations and neutral meson

oscillations (discussed in Section 10.1):

1. Mesons decay while the neutrinos are stable (at least on the time scale of experiments).

This brings the meson decay width into the analysis of the time evolution of mesons, see

Eq. (10.21). This is also the reason that in meson mixing the two Hamiltonian eigenstates

may be non-orthogonal.

2. Neutrino oscillations depend on the mixing angle. In neutral meson mixing we consider

particle-antiparticle oscillation. In this case, CPT requires that the two diagonal elements

in the mass matrix are equal and thus the mixing angle is maximal at π/4.

3. Consider the argument of the time-dependent oscillation. Comparing Eq. (10.3) to Eq. (13.39) we see that it is $\Delta m\, t$ for mesons and $\Delta m^2 t/(2E)$ for neutrinos. This apparent difference, however, is nothing but the effect of relativistic time dilation. Using, for the quasi-degenerate mesons, $\Delta m = \Delta m^2/(2m)$ and, for the ultra-relativistic neutrinos, $\tau = t/\gamma$ (with $\tau$ the proper time) and $\gamma = E/m$, the dependencies on time and mass become the same.

Table 13.1: Neutrino oscillation experiments. The column ∆m² gives the range to which the corresponding class of experiments is sensitive. The physics of solar neutrinos, and the separation into vacuum (VO) and matter (MSW) effects, are explained in Section 13.4.2.

Source             E [MeV]        L [km]         ∆m² [eV²]
Solar (VO)         1              10^8           10^-11 − 10^-9
SB Reactor         1              0.1 − 1        10^-3 − 10^-2
LB Reactor         1              10^2           10^-5 − 10^-3
Atmospheric        10^3           10^1 − 10^4    10^-5 − 1
SB accelerator     10^3 − 10^4    0.1            ≳ 0.1
LB accelerator     10^4           10^2 − 10^3    10^-3 − 10^-2

Source             n0 [cm^-3]     r0 [cm]        ∆m² [eV²]
Solar (MSW)        6 × 10^25      7 × 10^9       10^-9 − 10^-5

13.4.2 The MSW effect

To describe the propagation of neutrinos through matter, modifications to the oscillation formalism

are necessary. The smallness of the cross sections makes most effects of neutrino scattering off

medium negligible. This is, however, not the case for forward scattering, where there is no energy or

momentum exchange between the neutrinos and the medium. The effect of forward scattering is to

induce effective masses for the neutrinos (similar to the effect of medium on photon propagation).

The resulting modification to Pαβ of Eq. (13.39) can be very dramatic. This is known as the

Mikheyev-Smirnov-Wolfenstein (MSW) effect.

For the current measurements of neutrino flavor transitions, the relevant effects come from

neutrino propagation in the Sun or in Earth. In both cases, matter consists of electrons, protons

and neutrons. In particular, there are neither muons, nor tau-leptons nor anti-leptons in the

medium.

All neutrinos have the same (universal) neutral current interactions. In contrast, in matter

that has electrons but neither muons nor tau-leptons, only νe has charged current interactions

with matter. The effective potential induced by the charged current interactions of νe is given by

$$ V_C = \sqrt{2}\, G_F n_e \approx 7.6\, \frac{n_e}{n_p + n_n} \left( \frac{\rho}{10^{14}\ \mathrm{g/cm^3}} \right) \mathrm{eV}, \tag{13.42} $$


where $n_i$ ($i = e, p, n$) stands for the number density, and $\rho$ is the mass density. For example, at the solar core, $\rho \sim 100$ g/cm$^3$, which gives rise to $V_C \sim 10^{-12}$ eV, while at the Earth core, $\rho \sim 10$ g/cm$^3$, which gives rise to $V_C \sim 10^{-13}$ eV.
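The numerical prefactor in Eq. (13.42) can be recovered by converting the electron number density to natural units. The sketch below is an illustration under stated assumptions: the values of G_F, ħc and the nucleon mass are standard inputs not quoted in the text, the function name is hypothetical, and the electron fraction ne/(np + nn) = 0.5 used for the examples is an assumed round number.

```python
import math

G_F_EV = 1.1663787e-23         # Fermi constant in eV^-2 (assumed: 1.1663787e-5 GeV^-2)
HBAR_C_EV_CM = 1.973269804e-5  # hbar*c in eV*cm (assumed standard value)
M_NUCLEON_G = 1.67e-24         # approximate nucleon mass in grams (assumed)


def charged_current_potential_ev(rho_g_cm3, electron_fraction):
    """V_C = sqrt(2) G_F n_e in eV, with n_e = electron_fraction * rho / m_nucleon."""
    n_e_cm3 = electron_fraction * rho_g_cm3 / M_NUCLEON_G
    # A number density in cm^-3 becomes eV^3 after multiplying by (hbar*c)^3.
    n_e_ev3 = n_e_cm3 * HBAR_C_EV_CM**3
    return math.sqrt(2.0) * G_F_EV * n_e_ev3


# Prefactor of Eq. (13.42): rho = 1e14 g/cm^3 and n_e / (n_p + n_n) = 1.
print(f"prefactor:  {charged_current_potential_ev(1.0e14, 1.0):.2f} eV")
# Rough solar-core and Earth-core estimates with an assumed electron fraction of 0.5.
print(f"solar core: {charged_current_potential_ev(100.0, 0.5):.1e} eV")
print(f"Earth core: {charged_current_potential_ev(10.0, 0.5):.1e} eV")
```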

Current data indicate that $m_\nu \gtrsim 10^{-3}$ eV, and thus $m_\nu \gg V_C$. One may then naively think that matter effects are irrelevant. Matter effects, however, arise from vector interactions while masses are scalar operators. Consequently, the right comparison to make is between $m_\nu^2$ and $E V_C$, where $E$ is the neutrino energy. Since $E \gg m_\nu$, matter effects can be important. To see how this

enhancement of matter effects arises, consider a uniform, unpolarized medium at rest and a one

generation model. In this case, the four-vector effective interaction is given by Vµ = (VC , 0, 0, 0).

Due to $V_C$, the vacuum dispersion relation of the neutrino, $p_\mu p^\mu = m^2$, is modified as follows:

$$ (p_\mu - V_\mu)(p^\mu - V^\mu) = m^2 \quad \Rightarrow \quad E \approx p + V_C + \frac{m^2}{2p}, \tag{13.43} $$

where the approximation holds for ultra-relativistic neutrinos, $E \approx p \gg m$. Writing the dispersion relation as $E^2 = p^2 + m_m^2$, we learn that the effective mass-squared in matter, $m_m^2$, is given by

$$ m_m^2 = m^2 + A_e, \qquad A_e \equiv 2 E V_C. \tag{13.44} $$

It is the vector nature of the weak interaction that makes the matter effects practically relevant.

We analyze the matter effects in a two neutrino framework. In vacuum, in the mass basis

(ν1, ν2), the Hamiltonian can be written as

$$ H = p + \begin{pmatrix} \frac{m_1^2}{2E} & 0 \\ 0 & \frac{m_2^2}{2E} \end{pmatrix}. \tag{13.45} $$

In the interaction basis (νe, νa), where νa is a combination of νµ and ντ , we have

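$$ H = p + \frac{m_1^2 + m_2^2}{4E} + \begin{pmatrix} -\frac{\Delta m^2}{4E}\cos 2\theta & \frac{\Delta m^2}{4E}\sin 2\theta \\ \frac{\Delta m^2}{4E}\sin 2\theta & \frac{\Delta m^2}{4E}\cos 2\theta \end{pmatrix}. \tag{13.46} $$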

In matter that contains only electrons, protons and neutrons, the Hamiltonian in the interaction

basis is modified from its vacuum form of Eq. (13.46):

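$$ H = p + V_a + \frac{m_1^2 + m_2^2}{4E} + \frac{1}{4E} \begin{pmatrix} 2A_e - \Delta m^2 \cos 2\theta & \Delta m^2 \sin 2\theta \\ \Delta m^2 \sin 2\theta & \Delta m^2 \cos 2\theta \end{pmatrix}. \tag{13.47} $$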

Omitting the part in the Hamiltonian that is proportional to the unit matrix in flavor space (which

plays no role in the oscillations), we see that the effective mass-squared difference and mixing angle

in matter are given by

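$$ \Delta m_m^2 = \sqrt{(\Delta m^2 \cos 2\theta - A_e)^2 + (\Delta m^2 \sin 2\theta)^2}, \tag{13.48} $$

$$ \tan 2\theta_m = \frac{\Delta m^2 \sin 2\theta}{\Delta m^2 \cos 2\theta - A_e}, $$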


where the subindex m stands for matter.

The oscillation probability in matter with constant ne is simply obtained from Eq. (13.39) by

replacing ∆m2 and θ with ∆m2m and θm:

$$ P_{\alpha\beta} = \sin^2 2\theta_m \, \sin^2 x_m, \qquad x_m = \frac{\Delta m_m^2 L}{4E}. \tag{13.49} $$

The following points are worth mentioning regarding Eq. (13.49):

1. The vacuum result is reproduced for A = 0, as it should.

2. Vacuum mixing is needed in order to get mixing in matter.

3. For $\Delta m^2 \cos 2\theta \gg |A|$, the matter effect is a small perturbation to the vacuum result.

4. For $\Delta m^2 \cos 2\theta \ll |A|$, the neutrino mass is a small perturbation to the matter effect. In that case the oscillations are highly suppressed since the effective mixing angle is very small.

5. For ∆m2 cos 2θ = A, the mixing is maximal, namely it is on resonance.
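The limits listed above can be explored numerically from Eq. (13.48). The sketch below evaluates the effective parameters and also checks that a rotation by θm, taken in the same convention as Eq. (13.38), removes the off-diagonal element of the matrix in Eq. (13.47). The function names and the sample inputs are illustrative assumptions, and A here denotes the quantity Ae of Eq. (13.44).

```python
import math


def matter_parameters(dm2, theta, a_e):
    """Effective mass-squared difference and mixing angle in matter, Eq. (13.48)."""
    c = dm2 * math.cos(2.0 * theta) - a_e
    s = dm2 * math.sin(2.0 * theta)
    dm2_m = math.hypot(c, s)
    # For s >= 0, atan2 keeps 2*theta_m in [0, pi], so theta_m crosses pi/4 at resonance.
    theta_m = 0.5 * math.atan2(s, c)
    return dm2_m, theta_m


def offdiagonal_residual(dm2, theta, a_e):
    """Off-diagonal element of U_m^T M U_m, where M is the matrix of Eq. (13.47) times 4E."""
    h11 = 2.0 * a_e - dm2 * math.cos(2.0 * theta)
    h12 = dm2 * math.sin(2.0 * theta)
    h22 = dm2 * math.cos(2.0 * theta)
    _, theta_m = matter_parameters(dm2, theta, a_e)
    return 0.5 * math.sin(2.0 * theta_m) * (h11 - h22) + h12 * math.cos(2.0 * theta_m)


dm2, theta = 7.5e-5, 0.58            # illustrative inputs (eV^2, radians)
a_res = dm2 * math.cos(2.0 * theta)  # resonance condition, point 5
for a_e in (0.0, 0.1 * a_res, a_res, 100.0 * a_res):
    dm2_m, theta_m = matter_parameters(dm2, theta, a_e)
    print(f"A = {a_e:.2e}: dm2_m = {dm2_m:.2e}, sin^2(2 theta_m) = {math.sin(2.0 * theta_m) ** 2:.3f}")
```

At A = 0 the vacuum values are returned, at the resonance sin² 2θm reaches 1, and for very large A it falls toward zero, matching points 1, 4 and 5.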

13.4.3 Non-uniform density

When the matter density is not constant there are further modifications to the oscillation formal-

ism. Density variation results in changing the effective neutrino masses and their mixing angles.

Then, the flavor composition of the neutrinos along their path is a function of the medium density

profile.

At any point r on the neutrino path we care about the change of the potential. Given that the

potential is linear in the density, we define the derivative of the density

$$ n'(r) = \frac{dn(r)}{dr}. \tag{13.50} $$

For constant density, $n' = 0$, the flavor conversion probability is controlled by the effective masses and mixing angles. For varying density, $n' \neq 0$, there are extra parameters that affect the flavor conversion probability. Most important is the adiabatic parameter

$$ Q(r) = \frac{\Delta m^2 \sin^2 2\theta}{E \cos 2\theta}\, \frac{n(r)}{n'(r)}. \tag{13.51} $$

In the adiabatic limit, $Q \gg 1$, the density variation is slow. In this case the constant density

formalism can be applied locally. Of particular interest is the case of large L where we can average

the oscillation and just think of propagation of mass eigenstates. In the adiabatic limit transition

between effective mass eigenstates is highly suppressed, and thus flavor transition is governed only

by the way the mass eigenstate evolves.


In the non-adiabatic limit, Q < 1, the density variation is fast. Then, transition between

effective mass eigenstates is possible, and the constant density formalism cannot be used. Both

limits can be of interest in reality, but here we concentrate on the adiabatic case.

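To gain insight into the importance of the variation we consider the mixing angle as a function of location, $\theta_m = \theta_m(n_e(x))$:

$$ \tan 2\theta_m(x) = \frac{\Delta m^2 \sin 2\theta}{\Delta m^2 \cos 2\theta - 2\sqrt{2}\, G_F n_e(x) E}. \tag{13.52} $$

In particular, as $n_e(x)$ decreases, so does $\theta_m(x)$. Defining

$$ n_e^R = \frac{\Delta m^2 \cos 2\theta}{2\sqrt{2}\, G_F E}, \tag{13.53} $$

we have

$$ \begin{aligned} n_e \gg n_e^R &\Longrightarrow \theta_m \approx \pi/2, \\ n_e = n_e^R &\Longrightarrow \theta_m = \pi/4, \\ n_e = 0 &\Longrightarrow \theta_m = \theta. \end{aligned} \tag{13.54} $$

We conclude that, for a small $\theta$, $\nu_2^m$ propagating along a decreasing $n_e$ is mostly $\nu_e$ for $n_e$ above $n_e^R$ and mostly $\nu_a$ for $n_e$ below $n_e^R$.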

We now describe the characteristics of $\nu_e$ production and propagation in the Sun. The electron density in the Sun can be parameterized as $n_e(x) \approx 2 n_0 \exp(-x/r_0)$, where the relevant parameters are given in Table 13.1. Consider the case where $n_e^{\rm prod} \gg n_e^R$. Then, according to Eq. (13.54), we have at the production point $\nu = \nu_2^m$ ($\theta_m = \pi/2$). Further assume that the propagation is adiabatic at $n_e \sim n_e^R$. Then, at the resonance point we still have $\nu = \nu_2^m$ ($\theta_m = \pi/4$). Finally, as the neutrino arrives at the surface of the Sun, it is still $\nu_2^m$, but now, according to Eq. (13.54), we have $\theta_m = \theta$, and the neutrino is simply the heavy mass eigenstate. Being a mass eigenstate, it does not oscillate along its propagation to Earth. We conclude that for solar electron neutrinos with energy in the range

$$ E \gg \frac{\Delta m^2}{G_F n_e^{\rm prod}}, \tag{13.55} $$

and where the adiabatic condition is satisfied, the probability of being detected as $\nu_e$ is given by

$$ P_{ee}^{\rm MSW} = |\langle \nu_e | \nu_2 \rangle|^2 = |U_{e2}|^2 = \sin^2\theta. \tag{13.56} $$

It is highly sensitive to θ and provides a way to probe small mixing angles.

On the other hand, for solar neutrinos with energy in the range

$$ E \ll \frac{\Delta m^2 \cos 2\theta}{G_F n_e^{\rm prod}}, \tag{13.57} $$

namely $n_e^{\rm prod} \ll n_e^R$, the produced state is $\nu = \sin\theta\, \nu_2^m + \cos\theta\, \nu_1^m$. Approaching the surface of the Sun, $\nu = \sin\theta\, \nu_2 + \cos\theta\, \nu_1$ and $P_{ee}(R) = 1$. Along the propagation to Earth, the neutrino is subject to vacuum oscillations, with the final result [see Eq. (13.40)]

$$ P_{ee}^{\rm VO} = 1 - \frac{1}{2} \sin^2 2\theta. \tag{13.58} $$

The comparison between Eqs. (13.56) and (13.58) demonstrates the importance of matter effects.

A few final remarks are in order:

1. Note that $P_{ee}^{\rm MSW} < \frac{1}{2}$ is possible, while $P_{ee}^{\rm VO} > \frac{1}{2}$. For solar neutrinos, the transition between those subject to the MSW effect, Eq. (13.56), and those subject to vacuum oscillations, Eq. (13.58), occurs at $E \sim$ MeV.

2. Matter effects are sensitive to the sign of $\Delta m^2$, as manifest in Eq. (13.48). This is why we know the sign of $\Delta m^2_{21}$ but not of $\Delta m^2_{32}$.

3. Examining Table 13.1, we conclude that, if $\sin\theta \sim 1$, neutrino masses in the entire theoretically interesting range, $10^{-11}\ {\rm eV}^2 \lesssim \Delta m^2 \lesssim {\rm eV}^2$, could be discovered. For $10^{-2} \lesssim \sin\theta \ll 1$, neutrino masses could still be discovered via the adiabatic MSW effect for $\Delta m^2 \sim 10^{-5}\ {\rm eV}^2$.
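As a rough numerical illustration of remark 1, the sketch below evaluates Eqs. (13.56) and (13.58) for the measured solar mixing angle, and estimates the energy at which the resonance density of Eq. (13.53) equals the density at the center of the Sun. It assumes the two-generation formulas with θ = θ12 and ∆m² = ∆m²21 from Eq. (13.59), and ne = 2n0 from the parameterization and table above; G_F and ħc are standard inputs not quoted in the text, and the function names are hypothetical.

```python
import math

G_F_EV = 1.1663787e-23         # Fermi constant in eV^-2 (assumed standard value)
HBAR_C_EV_CM = 1.973269804e-5  # hbar*c in eV*cm (assumed standard value)


def p_ee_msw(sin2_theta):
    """Adiabatic MSW survival probability, Eq. (13.56)."""
    return sin2_theta


def p_ee_vacuum_averaged(sin2_theta):
    """Averaged vacuum-oscillation survival probability, Eq. (13.58)."""
    sin2_2theta = 4.0 * sin2_theta * (1.0 - sin2_theta)
    return 1.0 - 0.5 * sin2_2theta


def resonance_energy_ev(dm2_ev2, sin2_theta, n_e_cm3):
    """Energy at which n_e equals the resonance density of Eq. (13.53)."""
    cos_2theta = 1.0 - 2.0 * sin2_theta
    v_c = math.sqrt(2.0) * G_F_EV * n_e_cm3 * HBAR_C_EV_CM**3  # Eq. (13.42), in eV
    return dm2_ev2 * cos_2theta / (2.0 * v_c)


sin2_theta12, dm2_21 = 0.30, 7.5e-5  # central values of Eq. (13.59)
n_e_center = 2.0 * 6.0e25            # n_e(0) = 2 n0 in cm^-3

print(f"P_ee (MSW)      = {p_ee_msw(sin2_theta12):.2f}")
print(f"P_ee (VO)       = {p_ee_vacuum_averaged(sin2_theta12):.2f}")
print(f"E at resonance  = {resonance_energy_ev(dm2_21, sin2_theta12, n_e_center) / 1e6:.2f} MeV")
```

The resonance energy comes out at the MeV scale, consistent with the transition energy quoted in remark 1.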

13.4.4 Experimental results

Neutrino flavor transitions have been observed for solar, atmospheric, reactor and accelerator

neutrinos. Five flavor parameters – two mass-squared differences and the three mixing angles –

have been measured:

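$$ \begin{aligned} \Delta m^2_{21} &= (7.5 \pm 0.2) \times 10^{-5}\ \mathrm{eV}^2, \\ |\Delta m^2_{32}| &= (2.46 \pm 0.05) \times 10^{-3}\ \mathrm{eV}^2, \\ \sin^2\theta_{12} &= 0.30 \pm 0.01, \\ \sin^2\theta_{23} &= 0.45 \pm 0.03, \\ \sin^2\theta_{13} &= 0.022 \pm 0.001. \end{aligned} \tag{13.59} $$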

All the results in the neutrino sector so far are consistent with the νSM. The following param-

eters are still not experimentally determined:

• The absolute mass scale of the neutrinos is still unknown. On one extreme, they could

be quasi-degenerate and as heavy as a fraction of an eV. On the other extreme, they could be

hierarchical, with the lightest possibly massless.

• It is not known whether the spectrum has normal or inverted hierarchy.

• None of the three phases has been measured.


While the results can be accommodated in the νSM, there are other ways to explain the data.

The following questions are of interest as further tests of the idea that the νSM is the correct low

energy description of the neutrino sector:

• Are the neutrinos Dirac or Majorana fermions?

• Are there sterile neutrinos, that is, other light states that mix with the active neutrinos?

• Are there dimension six operators that significantly affect the neutrino interactions?


Appendix

13.A Probing neutrino masses

The positive evidence for neutrino masses measures only $\Delta m^2$. Here we discuss attempts to measure the mass itself. As of now, these measurements give only upper bounds. Specifically, we discuss kinematic tests for neutrino masses and neutrinoless double beta decay. We do not discuss here astrophysical and cosmological probes of neutrino masses; we mention these in Section 14.1.

13.A.1 Kinematic tests

In decays that produce neutrinos, the decay spectra are sensitive to neutrino masses. For example,

in the π → µν decay, the muon momentum is fixed (up to tiny width effects) by the masses of the

pion, the muon and the neutrino. To first order in $m_\nu^2/m_\pi^2$, the muon momentum in the pion rest frame is given by

$$ |\vec{p}\,| = \frac{1}{2 m_\pi} \left( m_\pi^2 - m_\mu^2 - \frac{m_\pi^2 + m_\mu^2}{m_\pi^2 - m_\mu^2}\, m_\nu^2 \right). \tag{13.60} $$

Since the correction to the massless neutrino limit is proportional to m2ν , the kinematic tests are

not very sensitive to small neutrino masses. The current best bounds obtained using kinematic

tests are the following [10]:

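$$ \begin{aligned} m_\nu &< 18.2\ \mathrm{MeV} \quad \text{from } \tau \to 5\pi + \nu, \\ m_\nu &< 190\ \mathrm{keV} \quad \text{from } \pi \to \mu\nu, \\ m_\nu &< 2\ \mathrm{eV} \quad \text{from } {}^3\mathrm{H} \to {}^3\mathrm{He} + e + \nu. \end{aligned} \tag{13.61} $$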

The combination of oscillation experiments, which are sensitive to the neutrino mass-squared

differences, and kinematic tests, which are sensitive to the neutrino masses themselves, implies

that all three neutrino masses are lighter than 2 eV.
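To see how weakly the muon momentum in π → µν depends on the neutrino mass, one can compare Eq. (13.60) with the exact two-body formula. The sketch below is illustrative: the pion and muon masses are standard values not quoted in the text, the function names are hypothetical, and the neutrino mass is set at the 190 keV bound of Eq. (13.61).

```python
import math


def muon_momentum_exact(m_pi, m_mu, m_nu):
    """Exact two-body momentum in the pion rest frame (all masses in the same units)."""
    # Kallen function written in factorized form.
    lam = (m_pi**2 - (m_mu + m_nu)**2) * (m_pi**2 - (m_mu - m_nu)**2)
    return math.sqrt(lam) / (2.0 * m_pi)


def muon_momentum_first_order(m_pi, m_mu, m_nu):
    """Expansion to first order in m_nu^2, Eq. (13.60)."""
    d = m_pi**2 - m_mu**2
    return (d - (m_pi**2 + m_mu**2) / d * m_nu**2) / (2.0 * m_pi)


M_PI, M_MU = 139.570, 105.658  # MeV (assumed standard values)
p0 = muon_momentum_exact(M_PI, M_MU, 0.0)
p_bound = muon_momentum_exact(M_PI, M_MU, 0.190)
print(f"massless limit:         {p0:.4f} MeV")
print(f"shift for 190 keV:      {p0 - p_bound:.2e} MeV")
print(f"first-order prediction: {p0 - muon_momentum_first_order(M_PI, M_MU, 0.190):.2e} MeV")
```

Even at the current bound the shift is a small fraction of the muon momentum, which is why this decay constrains the neutrino mass only at the 100 keV level.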

13.A.2 Neutrinoless double-beta (0ν2β) decay

Neutrino Majorana masses violate lepton number by two units. Therefore, if neutrinos have

Majorana masses we expect that there are also ∆L = 2 processes. The smallness of the neutrino


masses indicates that such processes have very small rates. Therefore, the only practical way to

look for ∆L = 2 processes is in places where the lepton number conserving ones are forbidden or

highly suppressed. Neutrinoless double-beta (0ν2β) decay, where the single beta decay is forbidden,

is such a process. An example for such processes is

$$ {}^{76}_{32}\mathrm{Ge} \to {}^{76}_{34}\mathrm{Se} + 2e^-. \tag{13.62} $$

The only physical background to 0ν2β decay is from double-beta decay with two neutrinos.

The 0ν2β decays are sensitive to the following combination of neutrino parameters:

$$ m_{\beta\beta} = \sum_{i=1}^{3} m_i U_{ei}^2. \tag{13.63} $$

The best bound derived from 0ν2β decay is mββ < 0.34 eV [10]. We emphasize the following

points:

1. If the neutrinos are Dirac particles, lepton number is conserved, and their masses do not

contribute to 0ν2β decays.

2. The 0ν2β decay rate depends not only on m2ββ but also on some nuclear matrix elements.

Those matrix elements introduce theoretical uncertainties in extracting mββ from the signal,

or in deriving an upper bound on mββ if no signal is observed.

3. The 0ν2β decay is sensitive to other ∆L = 2 operators, and not only to the neutrino Majorana

masses. Thus, the relation between the 0ν2β decay rate and the neutrino mass is model

dependent.
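For orientation, the combination of Eq. (13.63) can be evaluated from the oscillation parameters of Eq. (13.59) once the unmeasured quantities are fixed by hand. The sketch below makes several assumptions that are not part of the text: normal ordering with a chosen lightest mass, the standard parameterization in which |Ue1|² = cos²θ12 cos²θ13, |Ue2|² = sin²θ12 cos²θ13 and |Ue3|² = sin²θ13, and two relative phases `phi2` and `phi3` that stand in for the unknown Majorana phases. It returns the absolute value of the sum, and the function name is hypothetical.

```python
import cmath
import math


def m_beta_beta(m_lightest, dm2_21, dm2_32, sin2_12, sin2_13, phi2=0.0, phi3=0.0):
    """|sum_i m_i U_ei^2| in eV for normal ordering, with m1 the lightest mass."""
    m1 = m_lightest
    m2 = math.sqrt(m1**2 + dm2_21)
    m3 = math.sqrt(m2**2 + dm2_32)
    cos2_13 = 1.0 - sin2_13
    # Moduli |U_ei|^2 in the assumed standard parameterization.
    ue1_sq = (1.0 - sin2_12) * cos2_13
    ue2_sq = sin2_12 * cos2_13
    ue3_sq = sin2_13
    # phi2 and phi3 are assumed relative phases of U_e2^2 and U_e3^2.
    total = (m1 * ue1_sq
             + m2 * ue2_sq * cmath.exp(1j * phi2)
             + m3 * ue3_sq * cmath.exp(1j * phi3))
    return abs(total)


# Central values of Eq. (13.59), massless lightest neutrino, vanishing phases.
value = m_beta_beta(0.0, 7.5e-5, 2.46e-3, 0.30, 0.022)
print(f"m_bb = {value:.2e} eV")
```

With these inputs the result is at the level of a few meV, well below the bound quoted above; other choices of the lightest mass, ordering and phases change the value.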


Homework

Question 13.1: The νSM

Here you are asked to fill in some of the details omitted in the main text.

1. Restore the SU(2) indices that are implicit in Eq. (13.2).

2. Show that Eq. (13.3) arises from Eq. (13.2) when the Higgs field is replaced by its VEV.

Question 13.2: 2× 2 matrices and the see-saw mechanism

Consider a 2× 2 Hermitian matrix:

$$ M = \begin{pmatrix} a & c \\ c^* & b \end{pmatrix}, \tag{13.64} $$

with a and b real.

1. Show that the eigenvalues are

$$ \lambda_{1,2} = \frac{\mathrm{Tr}M \pm \sqrt{(\mathrm{Tr}M)^2 - 4 \det M}}{2}. \tag{13.65} $$

2. Prove that

$$ (\mathrm{Tr}M)^2 \geq 4 \det M. \tag{13.66} $$

This is needed in order to ensure that the eigenvalues are real.

3. Next we move to study real symmetric matrices, that is, we assume that c is real. In that

case the matrix M can be diagonalized by an orthogonal matrix O:

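$$ O = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \tag{13.67} $$

We call $\theta$ the mixing angle. Show that

$$ \theta = \frac{1}{2} \tan^{-1}\left( \frac{2c}{b - a} \right). \tag{13.68} $$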


4. The seesaw mechanism is a special case of the above where $a = 0$ and $c \ll b$. Show that, to leading order in $c/b$, the eigenvalues and mixing angle are given by

$$ \lambda_1 = \frac{c^2}{b}, \qquad \lambda_2 = b, \qquad \theta = \frac{c}{b}, \tag{13.69} $$

that is, confirming Eq. (13.23).

Question 13.3: Type III see-saw

Consider a one generation SM where we add one fermion field, N(1, 3)0. This model is called

the type III see-saw model and here you are asked to show how it generates the light neutrino

mass.

1. Write the new terms in $\mathcal{L}$ that involve $N$. Denote them by $\mathcal{L}^N_{\rm Kin}$, $\mathcal{L}^N_\psi$, and $\mathcal{L}^N_{\rm Yuk}$. Denote the Majorana mass term by $M$. Write the covariant derivative explicitly. In the Yukawa term, make sure you correctly use $\phi$ or $\tilde\phi$.

2. The Higgs field acquires a VEV: $\langle \phi^0 \rangle = v/\sqrt{2}$. Assume that $M \gg v$. Show that the neutrino acquires a mass, $m_\nu = c v^2/M$. Find out what $c$ is.

3. Both the neutral and the charged components of N mix with the SM fields. The mixing

generates a mass splitting between the charged and neutral components of N . Estimate the

size of the splitting.

4. The mass splitting you found above is rather small and, in fact, negligible compared to those

generated by one loop effects. Discuss where these one loop effects come from, and estimate

their size.

Question 13.4: Neutrino oscillations

Here you are asked to derive some of the basic formulas of neutrino oscillations.

1. Derive Eq. (13.37). It is useful to recall that

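$$ P_{\alpha\beta} = |\langle \nu_\beta | \nu_\alpha(t) \rangle|^2 = \Big| \sum_i \langle \nu_\beta | \nu_i \rangle \langle \nu_i | \nu_\alpha(t) \rangle \Big|^2. \tag{13.70} $$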

2. Derive expressions for the difference between the T-conjugate processes,

$$ \Delta_T \equiv P(\nu_e \to \nu_\mu) - P(\nu_\mu \to \nu_e), \tag{13.71} $$

and for the difference between the CP-conjugate processes,

$$ \Delta_{CP} \equiv P(\nu_e \to \nu_\mu) - P(\bar\nu_e \to \bar\nu_\mu). \tag{13.72} $$

Show that in the CP limit, which you can take as the case where U is real, ∆T = ∆CP = 0.

3. Due to CPT we know that ∆T −∆CP = 0. Check that this is indeed the case.

4. Consider a case where the production state is νe and the detection is done via elastic scat-

tering. In that case the detected neutrino is some superposition of flavor eigenstates which

we define as νd. Find P (νe → νd) as a function of distance. For simplicity assume only two

generations.

Question 13.5: Matter effects

1. Derive Eqs. (13.48) and (13.49).

2. Consider the case that xm is small. Show that to leading order in xm, matter effects cancel

in the oscillation probability, that is, you recover the vacuum result.

Question 13.6: Solar neutrinos

While by now ruled out, in the past one possible solution for the solar neutrino problem was

the so-called “vacuum oscillation.” The idea is that the mass difference is very small, and then

matter effects can be neglected, and the only relevant effect is the oscillation of the neutrinos

while traveling from the Sun to Earth.

1. Estimate ∆m2 which is needed for such a solution to work.

2. One prediction of this idea is seasonal variation. Explain how this can be observed and

estimate the magnitude of the effect.


Part IV

Connection to astronomy and cosmology


Chapter 14

Connection to cosmology

14.1 Baryogenesis

14.1.1 The baryon asymmetry

Observations indicate that the number of baryons in the Universe is unequal to the number of

antibaryons. To the best of our understanding, all the structures that we see in the Universe –

stars, galaxies, and clusters – consist of matter (baryons and electrons) and there is no antimatter

(antibaryons and positrons) in appreciable quantities. Since various considerations suggest that

the Universe has started from a state with equal numbers of baryons and antibaryons, the observed

baryon asymmetry must have been generated dynamically, a scenario that is known by the name

of baryogenesis.

The baryon asymmetry of the Universe is expressed in the literature in various ways:

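$$ \eta_B \equiv \left. \frac{n_B - n_{\bar B}}{n_\gamma} \right|_0 = (6.21 \pm 0.16) \times 10^{-10}, $$

$$ Y_B \equiv \left. \frac{n_B - n_{\bar B}}{s} \right|_0 = (8.75 \pm 0.23) \times 10^{-11}, \tag{14.1} $$

where $n_B$, $n_{\bar B}$, $n_\gamma$ and $s$ are the number densities of, respectively, baryons, antibaryons, photons and entropy, and a subscript 0 implies “at present time.” One can also present the asymmetry in terms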

of the baryonic fraction of the critical energy density ΩB ≡ (ρB/ρcrit):

$$ \Omega_B h^2 = 0.0219 \pm 0.0007, \tag{14.2} $$

where $h \equiv H_0/(100\ \mathrm{km\,s^{-1}\,Mpc^{-1}}) = 0.70 \pm 0.01$ is the present Hubble parameter.

The value of the baryon asymmetry of the Universe is inferred in two independent ways. The

first way is via big bang nucleosynthesis. This chapter in cosmology predicts the abundances of

the light elements, D, 3He, 4He, and 7Li. These predictions depend on a single parameter, which is

ηB. The abundances of D and 3He are very sensitive to ηB. The second way is from measurements


of the cosmic microwave background radiation. A larger ηB would enhance the odd peaks in the

spectrum. The fact that the two determinations agree gives much confidence in the value of the

baryon asymmetry. A consistent theory of baryogenesis should thus explain $n_B \approx 10^{-9} n_\gamma$ and $n_{\bar B} = 0$.

14.1.2 Sakharov conditions

Three conditions that are required to dynamically generate a baryon asymmetry were formulated

by Sakharov:

• Baryon number violation: This condition is required in order to evolve from an initial state

with ηB = 0 to a state with ηB ≠ 0.

• C and CP violation: If either C or CP were conserved, then processes involving baryons

would proceed at precisely the same rate as the C- or CP-conjugate processes involving

antibaryons, with the overall effect that no baryon asymmetry is generated.

• Out of equilibrium dynamics: In chemical equilibrium, there are no asymmetries in quantum

numbers that are not conserved (such as B, by the first condition).

These necessary ingredients are all present in the Standard Model. Quantitatively, however,

the SM fails to explain the observed asymmetry:

• Baryon number is violated in the SM, and the resulting baryon number violating processes are fast in the early Universe. The violation is due to the triangle anomaly, and leads to processes that involve nine left-handed quarks (three of each generation) and three left-handed leptons (one from each generation); see Fig. 14.1.$^1$ At zero temperature, the amplitude of the baryon number violating processes is proportional to $e^{-8\pi^2/g^2}$, which is too small to have any observable effect. At high temperatures ($T \gtrsim T_{\rm EWPT}$, where $T_{\rm EWPT}$, of order 100 GeV, is the temperature at which the electroweak phase transition starts to take place), however, these transitions become unsuppressed, $\Gamma \propto 250\,\alpha_w^5 T$, and are thus faster than the expansion rate of the Universe for $T_{\rm EWPT} \lesssim T \lesssim 10^{12}$ GeV.

• The weak interactions of the SM violate C maximally and violate CP via the Kobayashi-Maskawa mechanism. As argued in Section 14.1.3, the KM mechanism introduces a suppression factor of order $10^{-20}$ into the SM contribution to the baryon asymmetry. Since there are practically no kinematic enhancement factors in the thermal bath, it is impossible to generate $\eta_B \sim 10^{-9}$ with such a small amount of CP violation. Consequently, baryogenesis implies that there must exist new sources of CP violation, beyond the KM phase of the SM.

$^1$ A selection rule is obeyed, $\Delta B = \Delta L = \pm 3n$, preserving proton stability.

Figure 14.1: A schematic presentation of a $B + L$ violating process in the SM. (Diagram labels: $L_e$, $L_\mu$, $L_\tau$, $Q_1$, $Q_2$, $Q_3$.)

• Within the Standard Model, departure from thermal equilibrium occurs at the electroweak phase transition (EWPT). Here, the non-equilibrium condition is provided by the interactions of particles with the bubble wall, as it sweeps through the plasma. The experimental measurement of $m_h \sim 126$ GeV implies, however, that this transition is not strongly first order, as required for successful baryogenesis (see Fig. 14.2). Thus, a different kind of departure from thermal equilibrium is required from new physics or a modification to the electroweak phase transition.

Figure 14.2: The evolution of the Standard Model scalar potential with decreasing temperature for (left) $m_h < 70$ GeV and (right) $m_h > 70$ GeV. (Axes: $V$ versus $\phi$; curves labeled $T > T_c$, $T_c$ and $T < T_c$; the field value $\phi_c$ is marked.)

We learn that baryogenesis requires new physics that extends the SM in at least two ways. It

must introduce new sources of CP violation, and it must either provide a departure from thermal

equilibrium in addition to the EWPT or modify the EWPT.

Among the proposed scenarios for possibly successful baryogenesis we should mention GUT

baryogenesis, leptogenesis, electroweak baryogenesis, and the Affleck-Dine mechanism. We describe

leptogenesis in more detail in Section 14.1.4.

14.1.3 The suppression of KM baryogenesis

As explained in Section 12.4, the three generation SM violates CP if $X_{CP} \neq 0$. The baryon asymmetry of the Universe is a CP violating observable. As such, it is proportional to $X_{CP}$. More precisely, it is proportional to $X_{CP}/T_c^{12}$, where $T_c \sim 100$ GeV is the critical temperature of the electroweak phase transition. When one puts in the measured values of the quark masses and CKM parameters, one obtains $X_{CP}/T_c^{12} \sim 10^{-20}$, and thus the KM mechanism cannot account for a baryon asymmetry as large as $O(10^{-10})$.

One may wonder why the suppression by XCP does not apply to all CP asymmetries measured

in experiments. After all, there are CP asymmetries such as $S_{\pi\pi}$ that are experimentally of order one and theoretically known to be suppressed by the KM phase ($\sin 2\alpha$) but by none of the mixing angles or small quark mass-squared differences of $X_{CP}$. The answer provides some insights as to

how the KM mechanism operates. As concerns the mixing angles, they often cancel in the CP

asymmetries which are ratios of CP violating to CP conserving rates. The physics behind the mass

factors in Eq. (8.73) is that, in order to exhibit CP violation, a process has to “go through” all three

flavors of each quark type, and “sense” that their masses are different from each other. Sometimes,

the experiment does that for us. For example, when experimenters measure the CP asymmetry in $B \to \pi\pi$, they already distinguish the bottom, up, and down masses from the others (by identifying the $B$ and $\pi$ mass eigenstates) and thus ‘get rid’ of the corresponding mass factors. What remains is the $(m_t^2 - m_c^2)$ factor. This factor does appear in $\Delta m_B$ and, indeed, if this factor were zero, the CP asymmetry, which is really $S_{\pi\pi}\sin(\Delta m_B t)$, would vanish. In contrast, baryogenesis is a flavor-blind process (it sums over all flavors), and is suppressed by all six mass-squared factors of Eq. (??).

The important conclusion of the failure of the KM mechanism to account for the baryon

asymmetry is the following: There must exist sources of CP violation beyond the KM phase of

the SM.

14.1.4 Leptogenesis

The addition of the $N_{Ri}$ fields, with the Yukawa ($Y^\nu$) and mass ($M_N$) terms of Eq. (13.20), is

motivated by the seesaw mechanism for light neutrino masses. The addition of these terms implies,

however, an additional intriguing consequence: The physics of the singlet fermions is likely to play

a role in dynamically generating a lepton asymmetry in the Universe. The reason that leptogenesis

is qualitatively almost unavoidable once the seesaw mechanism is invoked is that the Sakharov

conditions, described in Section 14.1.2, are (likely to be) fulfilled:

• Lepton number violation: The Lagrangian terms (13.20) violate $L$ because lepton number cannot be consistently assigned to the $N_{Ri}$ fields in the presence of $Y^\nu$ and $M_N$. If $L(N_R) = 1$, then $Y^\nu$ respects $L$ but $M_N$ violates it by two units. If $L(N_R) = 0$, then $M_N$ respects $L$ but $Y^\nu$ violates it by one unit. (Remember that the fact that the SM interactions violate $B + L$ implies that the requirement for baryogenesis from new physics is $B - L$ violation, and not necessarily $B$ violation.)

• CP violation: Since there are irremovable phases in $Y^\nu$ (once the $\ell^-_{L,R}$ and $N$ fields are rephased to make $Y^e$ and $M_N$ real), the Lagrangian terms (13.20) provide new sources of CP violation.

• Departure from thermal equilibrium: The interactions of the $N_i$ are only of the Yukawa type. If the $Y^\nu$ couplings are small enough, these interactions can be slower than the expansion rate of the Universe, in which case the singlet fermions will decay out of equilibrium.

Thus, in the presence of the seesaw terms, leptogenesis is qualitatively almost unavoidable, and the

question of whether it can successfully explain the observed baryon asymmetry is a quantitative

one.

We consider leptogenesis via the decays of $N_1$, the lightest of the singlet fermions $N_i$. When the decay is into a single flavor $\alpha$, $N_1 \to L_\alpha\phi$ or $\bar L_\alpha\phi^\dagger$, the baryon asymmetry can be written as follows:

$$Y_B = \left(\frac{135\,\zeta(3)}{4\pi^4 g_*}\right) \times C_{\rm sphal} \times \eta_{\rm eff} \times \epsilon_{\rm CP}. \tag{14.3}$$

The first factor is the equilibrium $N_1$ number density divided by the entropy density at temperature $T \gg M_1$. It is of $O(4\times 10^{-3})$ when the number of relativistic degrees of freedom $g_*$ is taken as in the SM, $g_*^{\rm SM} = 106.75$. The other three factors on the right hand side of Eq. (14.3) represent the following physics aspects:

1. $\epsilon_{\rm CP}$ is the CP asymmetry in $N_1$ decays. For every $1/\epsilon_{\rm CP}$ $N_1$ decays, there is one more $L$ than there are $\bar L$’s.

2. $\eta_{\rm eff}$ is the efficiency factor. Inverse decays, other “washout” processes, and inefficiency in $N_1$ production reduce the asymmetry by a factor $0 \le \eta_{\rm eff} \le 1$. In particular, $\eta_{\rm eff} = 0$ is the limit of $N_1$ in perfect equilibrium, so no asymmetry is generated.

3. $C_{\rm sphal}$ describes further dilution of the asymmetry due to fast processes which redistribute the asymmetry that was produced in lepton doublets among other particle species. These include gauge, Yukawa, and $B + L$ violating non-perturbative effects.
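The size of the first factor in Eq. (14.3) is easy to check numerically. The following is a minimal sketch; the function names are illustrative, and the values passed for $\epsilon_{\rm CP}$, $\eta_{\rm eff}$ and $C_{\rm sphal}$ in the last line are hypothetical inputs chosen only to show the orders of magnitude involved.

```python
import math

ZETA_3 = 1.2020569031595943  # Riemann zeta(3)


def n1_over_s(g_star=106.75):
    """Equilibrium N_1 number density over entropy density for T >> M_1,
    i.e. the first factor of Eq. (14.3)."""
    return 135.0 * ZETA_3 / (4.0 * math.pi**4 * g_star)


def baryon_yield(eps_cp, eta_eff, c_sphal, g_star=106.75):
    """Y_B of Eq. (14.3), written as a plain product of its four factors."""
    return n1_over_s(g_star) * c_sphal * eta_eff * eps_cp


print(f"n_N1/s = {n1_over_s():.2e}")
# Hypothetical inputs, for illustration only.
print(f"Y_B    = {baryon_yield(eps_cp=1e-6, eta_eff=0.1, c_sphal=0.35):.1e}")
```

For $g_* = 106.75$ the prefactor evaluates to about $3.9\times 10^{-3}$, in line with the $O(4\times 10^{-3})$ quoted above.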

Figure 14.3: The diagrams contributing to the CP asymmetry $\epsilon$. (Diagram labels: $N_1$, $N_2$, $L_\alpha$, $L_\beta$, $\phi$.)

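These three factors can be calculated, with $\epsilon_{\rm CP}$ and $\eta_{\rm eff}$ depending on the Lagrangian parameters. The final result can be written (with some simplifying assumptions) as

$$Y_B \sim 10^{-3}\,\frac{10^{-3}\ {\rm eV}}{\tilde m}\,\epsilon_{\rm CP}, \tag{14.4}$$

where the diagrams of Fig. 14.3 give (we define $x_j \equiv M_j^2/M_1^2$)

$$\epsilon_{\rm CP} = \frac{1}{8\pi}\,\frac{1}{(Y^{\nu\dagger}Y^\nu)_{11}} \sum_j {\rm Im}\left\{\left[(Y^{\nu\dagger}Y^\nu)_{1j}\right]^2\right\}\sqrt{x_j}\left[\frac{1}{1-x_j} + 1 - (1+x_j)\ln\left(\frac{1+x_j}{x_j}\right)\right], \tag{14.5}$$

and

$$\tilde m = \frac{(Y^{\nu\dagger}Y^\nu)_{11}\,v^2}{M_1}. \tag{14.6}$$

The plausible range for $\tilde m$ is the one suggested by the range of hierarchical light neutrino masses, $10^{-3} - 10^{-1}$ eV, so we expect a rather mild washout effect, $\eta_{\rm eff} \gtrsim 0.01$. Then, to account for $Y_B \sim 10^{-10}$, we need $|\epsilon_{\rm CP}| \gtrsim 10^{-5} - 10^{-6}$. Using Eq. (14.5), we learn that this condition roughly implies, for the seesaw parameters,

$$\frac{M_1}{M_2}\,\frac{{\rm Im}\left[(Y^{\nu\dagger}Y^\nu)^2_{12}\right]}{(Y^{\nu\dagger}Y^\nu)_{11}} \gtrsim 10^{-4} - 10^{-5}, \tag{14.7}$$

which is quite natural.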
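To see how Eq. (14.5) leads to a condition of the form (14.7), it helps to evaluate the loop function numerically. The sketch below is an illustration under stated assumptions: the Yukawa matrix is stored as `yukawa[alpha][i]` (lepton flavor `alpha`, singlet `i`), which is an index convention chosen here, and the two-generation matrix and masses in the usage lines are hypothetical toy values.

```python
import math


def loop_function(x):
    """sqrt(x) [1/(1-x) + 1 - (1+x) ln((1+x)/x)], the x-dependent part of Eq. (14.5)."""
    return math.sqrt(x) * (1.0 / (1.0 - x) + 1.0 - (1.0 + x) * math.log1p(1.0 / x))


def ydag_y(yukawa, i, j):
    """(Y^dagger Y)_{ij} for yukawa[alpha][i] (assumed index convention)."""
    return sum(row[i].conjugate() * row[j] for row in yukawa)


def epsilon_cp(yukawa, masses):
    """CP asymmetry in N_1 decays, Eq. (14.5); masses[0] is M_1."""
    norm = ydag_y(yukawa, 0, 0).real
    total = 0.0
    # The j = 1 term is skipped: (Y^dagger Y)_11 is real, so its square has no imaginary part.
    for j in range(1, len(masses)):
        x_j = (masses[j] / masses[0]) ** 2
        total += (ydag_y(yukawa, 0, j) ** 2).imag * loop_function(x_j)
    return total / (8.0 * math.pi * norm)


# Hypothetical toy input: two lepton flavors, two singlets, one physical phase.
toy_yukawa = [[1e-2, 1e-2 * complex(math.cos(0.5), math.sin(0.5))],
              [1e-2, 1e-2]]
toy_masses = [1e10, 1e12]  # M_1, M_2 in GeV
print(f"epsilon_CP = {epsilon_cp(toy_yukawa, toy_masses):.2e}")
```

In the hierarchical limit $x_j \gg 1$ the bracket in Eq. (14.5), multiplied by $\sqrt{x_j}$, tends to $-3/(2\sqrt{x_j}) = -3M_1/(2M_j)$, which is the origin of the factor $M_1/M_2$ in Eq. (14.7).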

We learn that leptogenesis is attractive not only because all the required features are qualitatively present, but also because the quantitative requirements are plausibly satisfied. In particular, $\tilde m \sim 0.01$ eV, as suggested by the light neutrino masses, is optimal for thermal leptogenesis as it leads to effective production of $N_1$’s in the early Universe and only mild washout effects. Furthermore, the required CP asymmetry can be achieved in large parts of the seesaw parameter space.

14.2 Dark Matter

Roughly 20% of the energy density of the Universe consists of neutral weakly interacting non-

baryonic matter, ‘dark matter’. The picture of structure formation by growth of fluctuations in

weakly interacting matter explains the elements of structure in the Universe from the fluctuations

in the cosmic microwave background down almost to the scale of galaxies. However, from the point

of view of particle physics, we have no idea what dark matter is made of. We know, however, that

the following conditions must be satisfied for the DM:

• Stable on cosmological time scales;

• Very weakly interacting;

• Have the right relic density.

The possibilities range in mass from axions (mass $10^{-5}$ eV) to primordial black holes (mass $10^{-5}\,M_\odot$). One of the most attractive classes of dark matter candidates is that of WIMPs (weakly

interacting massive particles). They are heavy, neutral, weakly interacting particles with interac-

tion cross sections nevertheless large enough that they were in thermal equilibrium for some period

in the early universe.

14.2.1 Observational Evidence

Assuming that galaxies are in virial equilibrium, one can relate the mass at a given distance r from

the center of a galaxy to its rotational velocity:

$$v^2 \propto \frac{G_N M(r)}{r}. \tag{14.8}$$

The rotational velocity $v$ is measured by observing 21 cm emission lines in HI regions (neutral hydrogen) beyond the point where most of the light in the galaxy ceases. If the bulk of the mass is associated with light, then beyond the point where most of the light stops, $M$ should be constant and $v^2 \propto 1/r$. This is not the case. The rotation curves are approximately flat, i.e. $v \sim$ constant outside the core of the galaxy. In our own galaxy, $v \simeq 240$ km/s at the location of our solar system, with little change out to the largest observable radius. This implies the existence of a dark halo, with mass density $\rho(r) \propto 1/r^2$, i.e. $M(r) \propto r$. At some point, $\rho$ has to fall off faster, in order to keep the total mass of the galaxy finite, but we do not know at what radius this happens. This leads to a lower bound on the DM mass density, $\Omega_{\rm DM} \gtrsim 0.1$.
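The step from a flat rotation curve to the halo profile can be spelled out. The lines below are a sketch that assumes circular orbits in a spherically symmetric mass distribution, so that the proportionality in Eq. (14.8) becomes an equality; the radius $r \approx 8$ kpc used for the solar system in the last line is an assumed value, not one given in the text.

```latex
\begin{align*}
v^2 = \frac{G_N M(r)}{r}
  \quad &\Longrightarrow \quad M(r) = \frac{v^2 r}{G_N} \propto r
  \qquad (v = \text{const}), \\
\rho(r) = \frac{1}{4\pi r^2}\,\frac{dM}{dr}
  \quad &\Longrightarrow \quad \rho(r) = \frac{v^2}{4\pi G_N r^2} \propto \frac{1}{r^2}, \\
M(8\ \text{kpc}) &\approx \frac{(240\ \text{km/s})^2 \times 8\ \text{kpc}}{G_N}
  \approx 10^{11}\, M_\odot .
\end{align*}
```

The enclosed mass keeps growing linearly with $r$ for as long as the rotation curve stays flat, which is why the total halo mass is only bounded from below.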

The observation of clusters of galaxies tends to give somewhat larger values, $\Omega_{\rm DM} \simeq 0.2$. These

observations include (i) measurements of the peculiar velocities of galaxies in the cluster, which

are a measure of their potential energy if the cluster is virialized; (ii) measurements of the X-ray

temperature of hot gas in the cluster, which again correlates with the gravitational potential felt by

the gas; and, most directly, (iii) studies of the (weak) gravitational lensing of background galaxies

on the cluster.

A particularly compelling example involves the bullet cluster which recently (on cosmological

time scales) passed through another cluster. As a result, the hot gas forming most of the baryonic

mass of the cluster was shocked and decelerated, whereas the galaxies in the clusters proceeded

on ballistic trajectories. Gravitational lensing shows that most of the total mass also moved

ballistically, indicating that DM self-interactions are weak.

The currently most accurate, if somewhat indirect, determination of ΩDM comes from global

fits of cosmological parameters to a variety of observations. For example, using measurements

of the anisotropy of the cosmic microwave background (CMB) and of the spatial distribution of

galaxies, the density of cold, non-baryonic matter is found to be [PDG]

$$\Omega_{\rm nbm} h^2 = 0.119 \pm 0.002, \tag{14.9}$$

where h is the Hubble constant in units of 100 km/(s Mpc). Some part of the baryonic matter

density [PDG],

$$\Omega_b h^2 = 0.0223 \pm 0.0002, \tag{14.10}$$

may well contribute to (baryonic) DM, e.g. MACHOs or cold molecular gas clouds.

The DM density in the neighborhood of our solar system is of considerable interest. The most recent estimates, based on a detailed model of our galaxy and constrained by a host of observables including the galactic rotation curve, find [PDG]

$$\rho^{\rm local}_{\rm DM} \sim 0.4\ \frac{\rm GeV}{{\rm cm}^3}. \tag{14.11}$$

14.2.2 Why not the neutrinos?

The SM does have dark matter particles among its list of elementary particles. These are the

neutrinos, which carry neither color charge, nor electromagnetic charge. The lightest of these (if

not all three) is stable. Can it constitute the dark matter of the Universe? The answer is negative.

Due to the weak interactions, neutrinos have been in thermal equilibrium at high enough temperature. They decouple at a temperature $T_D$, when the expansion rate of the Universe becomes larger than their interaction rate. The cross section is given by $\sigma \simeq G_F^2 T^2$ and the neutrino number density is $n \simeq T^3$. Thus $\Gamma_{\rm int} = n\sigma v \simeq G_F^2 T^5$ and

$$\frac{\Gamma_{\rm int}}{H} \simeq \frac{G_F^2 T^5}{T^2/m_{\rm Pl}} \simeq \left(\frac{T}{1\ {\rm MeV}}\right)^3. \tag{14.12}$$
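The 1 MeV scale in Eq. (14.12) follows from setting $G_F^2 m_{\rm Pl} T^3 = 1$. A minimal numerical sketch, assuming the standard values $G_F \approx 1.166\times 10^{-5}$ GeV$^{-2}$ and $m_{\rm Pl} \approx 1.22\times 10^{19}$ GeV (neither is quoted in the text) and dropping all $O(1)$ factors, as the estimate above does:

```python
G_F = 1.1664e-5   # assumed Fermi constant [GeV^-2]
M_PL = 1.2209e19  # assumed Planck mass [GeV]


def rate_ratio(t_gev):
    """Gamma_int / H ~ G_F^2 T^5 / (T^2 / m_Pl), Eq. (14.12), O(1) factors dropped."""
    return G_F**2 * M_PL * t_gev**3


def decoupling_temperature_gev():
    """Temperature at which Gamma_int / H = 1 in this estimate."""
    return (G_F**2 * M_PL) ** (-1.0 / 3.0)


t_d = decoupling_temperature_gev()
print(f"T_D ~ {t_d * 1e3:.2f} MeV")
print(f"Gamma/H at 10 MeV ~ {rate_ratio(1e-2):.0f}")
```

The estimate gives a decoupling temperature just below 1 MeV, and the steep $T^3$ dependence means neutrinos are firmly in equilibrium at 10 MeV.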

Below 1 MeV, the neutrino temperature $T_\nu$ scales like $a^{-1}$. Shortly after neutrino decoupling, the temperature drops below the mass of the electron, and the entropy in $e^+e^-$ pairs is transferred to the photons, but not to the decoupled neutrinos. For $T \gtrsim m_e$, the particle species in thermal equilibrium with photons include the photon ($g_\gamma = 2$) and $e^+e^-$ pairs ($g_e = 4$), which give $g = 2 + \frac{7}{8}\times 4 = 11/2$. For $T \ll m_e$, only the photons are in equilibrium, $g = g_\gamma = 2$. For particles in thermal equilibrium with the photons, $g(aT)^3$ remains constant. Therefore, the value of $aT$ after the $e^\pm$ annihilation must be larger than before $e^\pm$ annihilation by a factor of the third root of the

ratio of $g$ before (= 11/2) and after (= 2). Thus, $aT_\gamma$ increases by $(11/4)^{1/3}$ while $aT_\nu$ remains constant. Consequently, the present ratio of temperatures is

$$\frac{T}{T_\nu} = \left(\frac{11}{4}\right)^{1/3} = 1.40 \implies T_\nu = 1.95\ {\rm K}. \tag{14.13}$$

The present number density of each flavor of neutrinos is

$$[n(\nu_i)]_0 = \frac{3}{11}[n_\gamma]_0 \approx 110/{\rm cm}^3. \tag{14.14}$$

Their contribution to the mass density of the universe is then

$$\Omega_\nu = \frac{\sum_i m_i [n(\nu_i)]_0}{\rho_c} \sim 10^{-3}, \tag{14.15}$$

where we took $\sum_i m_i \sim 0.05$ eV.
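The numbers in Eqs. (14.13) to (14.15) can be reproduced in a few lines. This is a sketch under assumed present-day inputs that the text does not quote ($T_\gamma \approx 2.7255$ K, $n_\gamma \approx 411\ {\rm cm^{-3}}$, $\rho_c/h^2 \approx 1.05\times 10^{4}$ eV cm$^{-3}$); the function names are illustrative.

```python
T_GAMMA_K = 2.7255               # assumed present photon temperature [K]
N_GAMMA_CM3 = 410.7              # assumed present photon number density [cm^-3]
RHO_CRIT_OVER_H2_EV = 1.0537e4   # assumed critical density / h^2 [eV cm^-3]


def temperature_ratio():
    """T / T_nu = (11/4)^(1/3), Eq. (14.13)."""
    return (11.0 / 4.0) ** (1.0 / 3.0)


def neutrino_temperature_k():
    return T_GAMMA_K / temperature_ratio()


def neutrino_density_cm3():
    """Present number density per neutrino flavor, Eq. (14.14)."""
    return 3.0 / 11.0 * N_GAMMA_CM3


def omega_nu(sum_m_ev=0.05, h=0.70):
    """Eq. (14.15): every flavor has the same number density, so only sum_i m_i enters."""
    return sum_m_ev * neutrino_density_cm3() / (RHO_CRIT_OVER_H2_EV * h**2)


print(f"T/T_nu   = {temperature_ratio():.3f}")
print(f"T_nu     = {neutrino_temperature_k():.2f} K")
print(f"n_nu     = {neutrino_density_cm3():.0f} cm^-3 per flavor")
print(f"Omega_nu = {omega_nu():.1e}")
```

Because $\Omega_\nu$ is linear in $\sum_i m_i$, reaching $\Omega_{\rm DM} \simeq 0.2$ would require neutrino masses a couple of hundred times larger than the value assumed above.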

Another major problem with the idea that neutrinos could constitute a major component in

the dark matter is that analyses of structure formation in the Universe indicate that most DM

should be “cold”, i.e., should have been non-relativistic at the onset of galaxy formation (when

there was a galactic mass inside the causal horizon).

14.2.3 The relic density: back of the envelope estimate

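Let us study what happens to a WIMP dark matter particle $\chi$ of mass $m$ when the temperature $T$ cools below $m$. Annihilations with cross section $\sigma(\chi\chi \to \text{SM particles})$ try to maintain thermal equilibrium, $n_\chi \propto \exp(-m/T)$. But they fail at $T \lesssim m$, when $n_\chi$ is so small that the collision rate $\Gamma$ experienced by $\chi$ becomes smaller than the expansion rate $H$:

$$\Gamma \sim n_\chi \sigma \lesssim H \sim T^2/m_{\rm Pl}. \tag{14.16}$$

Annihilations become ineffective, leaving the following out-of-equilibrium relic abundance of dark matter particles:

$$\frac{n_\chi}{n_\gamma} \sim \frac{m^2/(m_{\rm Pl}\sigma)}{m^3} \sim \frac{1}{m_{\rm Pl}\sigma m}, \tag{14.17}$$

i.e.,

$$\frac{\rho_\chi(T)}{\rho_\gamma(T)} \sim \frac{m}{T}\,\frac{n_\chi}{n_\gamma} \sim \frac{1}{m_{\rm Pl}\sigma T}. \tag{14.18}$$

Inserting the observed DM density, $\rho_\chi \sim \rho_\gamma$ at $T \sim T_0$, and a typical cross section $\sigma \sim g^4/m^2$, gives

$$m \sim \sqrt{T_0 m_{\rm Pl}} \sim {\rm TeV}, \tag{14.19}$$

for a WIMP with $g \sim 1$. More generally, to account for $\Omega_i h^2 \sim 0.1$, we need

$$\langle\sigma v\rangle \sim 0.9\ {\rm pb} \sim \frac{1}{10^8\ {\rm GeV}^2}, \tag{14.20}$$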

which is a typical weak interaction rate. It is remarkable that the estimate points to a mass scale

where we expect for other reasons to find new physics beyond the standard model. The fact that

the cross section required to obtain the correct relic abundance is of the order of weak interaction

cross section has been termed “the WIMP miracle.”
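The two numbers behind the WIMP miracle, the mass scale of Eq. (14.19) and the cross section of Eq. (14.20), can be checked with a short script. This is a back-of-the-envelope sketch only: $T_0$, $m_{\rm Pl}$, the Boltzmann constant and the GeV$^{-2}$ to pb conversion are assumed standard values not quoted in the text, and the function names are illustrative.

```python
import math

K_B_GEV_PER_K = 8.617333e-14      # Boltzmann constant [GeV / K]
T0_GEV = 2.7255 * K_B_GEV_PER_K   # assumed present photon temperature [GeV]
M_PL_GEV = 1.2209e19              # assumed Planck mass [GeV]
PB_PER_GEV2 = 3.8938e8            # 1 GeV^-2 expressed in picobarn


def wimp_mass_estimate_gev(g=1.0):
    """m ~ g^2 sqrt(T_0 m_Pl), from sigma ~ g^4/m^2 and Eq. (14.18) at T ~ T_0."""
    return g**2 * math.sqrt(T0_GEV * M_PL_GEV)


def pb_to_gev2(sigma_pb):
    """Convert a cross section from picobarn to natural units [GeV^-2]."""
    return sigma_pb / PB_PER_GEV2


print(f"m ~ {wimp_mass_estimate_gev():.0f} GeV")
print(f"0.9 pb = {pb_to_gev2(0.9):.1e} GeV^-2")
```

In natural units 0.9 pb is a few times $10^{-9}$ GeV$^{-2}$, which agrees with the round number in Eq. (14.20) at the order-of-magnitude level intended there.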

14.2.4 Detecting WIMPs

The WIMP miracle not only provides a model-independent motivation for dark matter at the weak

scale, but it also has strong implications for how dark matter might be detected. For WIMPs χ to

have the observed relic density, they must annihilate to other particles. Assuming that these other

particles are SM particles, the necessity of χχ → SM SM interactions suggests three promising

strategies for dark matter detection:

• Indirect detection: if dark matter annihilated in the early Universe, it must also annihilate

now through χχ→ SM SM, and the annihilation products may be detected.

• Direct detection: dark matter can scatter off normal matter through χ SM→ χ SM interac-

tions, depositing energy that may be observed in sensitive, low background detectors.

• Particle colliders: dark matter may be produced at particle colliders through SM SM → χχ.

Such events are undetectable, but are typically accompanied by related production mecha-

nisms, such as SM SM → χχ + SM, where “SM” denotes one or more standard model

particles. These events are observable and provide signatures of dark matter at colliders.

It is important to note that the WIMP miracle not only implies that such dark matter interac-

tions must exist, it also implies that the dark matter-SM interactions must be efficient. Although

WIMPs may not make up a significant fraction of the dark matter, they certainly cannot have an energy
density more than $\Omega_{\rm DM}$. Cosmology therefore provides lower bounds on interaction rates. This

fact provides highly motivated targets for a diverse array of experimental searches that may be

able to detect WIMPs and constrain their properties.

14.2.5 DM@LHC?

Our current understanding of the weak interaction is that it arises from a gauge theory of the

group SU(2) × U(1) that is spontaneously broken at the hundred-GeV scale. An astronomer

might note this as a remarkable coincidence. A particle theorist might go further. There are

many possible, and competing, models of weak interaction symmetry breaking. In any of these

models, it is possible to add a discrete symmetry that makes the lightest newly introduced particle

stable. Generically, this particle is heavy and neutral and meets the definition of a WIMP that

we have given above. In many cases, the discrete symmetry in question is actually required for

the phenomenological viability of the model or arises naturally from its geometry. For example, in

models of supersymmetry, imposing a discrete symmetry called R-parity is the most straightforward

way to eliminate dangerous baryon-number violating interactions. Thus, as particle theorists, we

are almost justified in saying that the problem of electroweak symmetry breaking predicts the

existence of WIMP dark matter.

There is a further assumption that, when added to the properties of a WIMP just listed,

has dramatic consequences for experiments. Models of electroweak symmetry breaking typically

contain new heavy particles with QCD color. These appear as partners of the quarks to provide

new physics associated with the generation of the large top mass. In supersymmetry, and in many

other models, electroweak symmetry breaking appears as a result of radiative corrections due to

these particles, enhanced by the large coupling of the Higgs boson to the top quark. Thus, we

would like to add to the structure of WIMP models the assumption that there exists a new particle

that carries the conserved discrete symmetry and couples to QCD. This particle should have a mass

of the same order of magnitude as the WIMP, below TeV.

Any particle with these properties will be pair-produced at the CERN LHC with a cross section

of tens of pb. The particle will decay to quark or gluon jets plus a WIMP that escapes the detector, producing an imbalance of measured momentum. These ‘missing energy’ events are well known to be a signature of models of supersymmetry.

In fact, they should be seen in any model (subject to the assumption just given) that contains a

WIMP dark matter candidate.

14.2.6 Supersymmetry: neutralino dark matter

In models of supersymmetry, the role of the WIMP χ is taken by the lightest ‘neutralino’ - a

mixture of the superpartners of γ and Z (‘gauginos’) and the superpartners of the neutral Higgs

bosons (‘higgsinos’). Depending on the spectrum and couplings of the superpartners, several

different reactions can dominate the process of neutralino pair annihilation.

The simplest possibility is that neutralinos annihilate to standard model fermions by exchanging their scalar superpartners. Sleptons are typically lighter than squarks, so the dominant reactions are $\chi\chi \to \ell^+\ell^-$. It turns out, however, that this reaction is less important than one might expect over most of the supersymmetry parameter space. Because neutralinos are Majorana particles, they annihilate in the S-wave only in a configuration of total spin 0. However, light fermions are naturally produced in a spin-1 configuration, and the spin-0 state is helicity suppressed by a factor $(m_\ell/m_\chi)^2$. The dominant annihilation is then in the P-wave. Since the relic density is determined at a temperature at which the neutralinos are non-relativistic, the annihilation cross-section is suppressed and the prediction for the relic density is typically too large. To obtain values of the relic density that agree with the observed value, we need light sleptons, with masses below 200 GeV.

Neutralinos can also annihilate to Standard Model vector bosons. A pure U(1) gaugino (‘bino’)

cannot annihilate to $W^+W^-$ or $Z^0Z^0$. However, these annihilation channels open up if the gaugino

contains an admixture of SU(2) gaugino (‘wino’) or higgsino content. The annihilation cross

sections to vector bosons are large, so only a relatively small mixing is needed.

The annihilation to third generation fermions can be enhanced by a resonance close to threshold.

In particular, if the mass of the CP odd Higgs boson $A^0$ is close to $2m_\chi$, the resonance produced by this particle can enhance the S-wave amplitude for neutralino annihilation to $b\bar b$ and $\tau^+\tau^-$.

If other superparticles are close in mass to the neutralino, these particles can have significant densities when the neutralinos decouple, and their annihilation cross sections can also contribute to the determination of the relic density through a co-annihilation process. If the sleptons are only slightly heavier than the neutralino, the reactions $\tilde\ell\chi \to \gamma\ell$ and $\tilde\ell\tilde\ell \to \ell\ell$ can proceed in the S-wave and dominate the annihilation. Co-annihilation with $W^+$ partners (‘charginos’) and with the top squarks can also be important in some regions of the MSSM parameter space.

A common feature of all four mechanisms is that the annihilation cross section depends strongly

both on the masses of the lightest supersymmetric particles and on the mixing angles that relate

the original gaugino and higgsino states to the neutralino mass eigenstates. Both sets of parameters

must be fixed in order to obtain a precise prediction for the relic density.

Appendix

Appendix A

Lie Groups

A crucial role in model building is played by symmetries. You are already familiar with symmetries

and with some of their consequences. For example, Nature seems to have the symmetry of the

Lorentz group which implies conservation of energy, momentum and angular momentum. In order

to understand the interplay between symmetries and interactions, we need a mathematical tool

called Lie groups. These are the groups that describe all continuous symmetries.

In the following we only give definitions and quote statements without proving them. There

are many texts about Lie groups where the statements we make below are proven. Three that are

very useful for particle physics purposes are the book by Howard Georgi (“Lie Algebras in particle

physics”), the book by Robert Cahn (“Semi-simple Lie algebras and their representations”) and

the physics report by Richard Slansky (“Group Theory for Unified Model Building”, Phys. Rept.

79 (1981) 1).

A.1 Groups

We start by presenting a series of definitions.

Definition: A group $G$ is a set $\{x_i\}$ (finite or infinite), with a multiplication law $\cdot$, subject to the following four requirements:

• Closure:

$$x_i \cdot x_j \in G \quad \forall\ x_i, x_j. \tag{A.1}$$

• Associativity:

$$x_i \cdot (x_j \cdot x_k) = (x_i \cdot x_j) \cdot x_k \quad \forall\ x_i, x_j, x_k. \tag{A.2}$$

• There is an identity element $I$ (or $e$) such that

$$I \cdot x_i = x_i \cdot I = x_i \quad \forall\ x_i. \tag{A.3}$$

• Each element has an inverse element $x_i^{-1}$:

$$x_i \cdot x_i^{-1} = x_i^{-1} \cdot x_i = I \quad \forall\ x_i. \tag{A.4}$$

A group is specified by its multiplication table.

Definition: A group is Abelian if all its elements commute:

$$x_i \cdot x_j = x_j \cdot x_i \quad \forall\ x_i, x_j. \tag{A.5}$$

A non-Abelian group is a group that is not Abelian, that is, at least one pair of elements does not

commute.

Let us give a few examples:

• $Z_2$, also known as parity, is a group with two elements, $I$ and $P$, such that $I$ is the identity and $P^{-1} = P$. This completely specifies the multiplication table. This group is finite and Abelian.

• $Z_N$, with $N$ an integer, is a generalization of $Z_2$. It contains $N$ elements labeled from zero to $N - 1$. The multiplication law is the same as addition modulo $N$: $x_i \cdot x_j = x_{(i+j)\,{\rm mod}\,N}$. The identity element is $x_0$, and the inverse element is given by $x_i^{-1} = x_{N-i}$. This group is also finite and Abelian.

• Multiplication of positive numbers. It is an infinite Abelian group. The identity is the

number one and the multiplication law is just a standard multiplication.

• S3, the group that describes permutation of 3 elements. It contains 6 elements. This group

is non-Abelian. In your homework you will find for yourself the 6 elements and their multi-

plication table.

A.2 Representations

One of the most important aspects of group theory that is relevant to physics is related to repre-

sentation theory, and that is what we discuss next.

Definition: A representation is a realization of the multiplication law among matrices.

Definition: Two representations are equivalent if they are related by a similarity transformation.

Definition: A representation is reducible if it is equivalent to a representation that is block

diagonal.

Definition: An irreducible representation (irrep) is a representation that is not reducible.

Definition: An irrep that contains matrices of size n× n is said to be of dimension n.

Statement: Any reducible representation can be written as a direct sum of irreps, e.g. $D = D_1 \oplus D_2$.

Statement: The dimension of all irreps of an Abelian group is one. For non-Abelian groups

there is at least one irrep that has dimension larger than one.

Statement: Any finite group has a finite number of irreps $R_i$. If $N$ is the number of elements in the group, the irreps satisfy

$$\sum_{R_i} [\dim(R_i)]^2 = N. \tag{A.6}$$

Infinite groups have an infinite number of irreps.

Statement: For any group there exists a trivial representation such that all the matrices are

just the number 1. This representation is also called the singlet representation. As we see later, it

is of particular importance for us.

Let us give some examples for the statements that we made here.

• $Z_2$: Its trivial irrep is $I = 1$, $P = 1$. The other irrep is $I = 1$, $P = -1$. Clearly these two irreps satisfy Eq. (A.6).

• $Z_N$: An example of a non-trivial irrep is $x_k = \exp(2\pi i k/N)$.

• S3: In your homework you will work out its properties.
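The $Z_N$ statements above are simple enough to verify by brute force. The sketch below builds the multiplication law as addition modulo $N$ and checks that $x_k \to \exp(2\pi i q k/N)$ respects it. The integer label `q` is introduced here for illustration (the text quotes the case $q = 1$); the $N$ choices $q = 0, \dots, N-1$ give $N$ one-dimensional representations, consistent with Eq. (A.6).

```python
import cmath


def zn_multiply(i, j, n):
    """Z_N multiplication law: x_i . x_j = x_{(i+j) mod N}."""
    return (i + j) % n


def zn_inverse(i, n):
    """Inverse element x_{N-i}; the modulo maps the identity to itself."""
    return (n - i) % n


def zn_irrep(q, n):
    """One-dimensional representation labeled by q: x_k -> exp(2 pi i q k / N)."""
    return [cmath.exp(2j * cmath.pi * q * k / n) for k in range(n)]


def is_representation(rep, n, tol=1e-12):
    """Check D(x_i) D(x_j) = D(x_i . x_j) for all pairs of elements."""
    return all(
        abs(rep[i] * rep[j] - rep[zn_multiply(i, j, n)]) < tol
        for i in range(n)
        for j in range(n)
    )


N = 5
irreps = [zn_irrep(q, N) for q in range(N)]
print(all(is_representation(rep, N) for rep in irreps))
print(sum(1**2 for _ in irreps) == N)  # sum of squared dimensions, Eq. (A.6)
```

The representation with $q = 0$ is the trivial (singlet) one, and all of them are one-dimensional, as expected for an Abelian group.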

The groups that we are interested in are transformation groups of physical systems. Such

transformations are associated with unitary operators in the Hilbert space. We often describe the

elements of the group by the way that they transform physical states. When we refer to representations of the group, we mean either the appropriate set of unitary operators or, equivalently, the matrices that operate on the vector states of the Hilbert space.

A.3 Lie groups and Lie Algebras

While finite groups are very important, the ones that are most relevant to particle physics and,

in particular, to the Standard Model, are infinite groups, in particular continuous groups, that is

of cardinality $\aleph_1$. These groups are called Lie groups. They give us formal ways to talk about rotations in any real or abstract space. The different groups correspond to rotations in different spaces.

Definition: A Lie group is an infinite group whose elements are labeled by a finite set of $N$ continuous real parameters $\alpha_\ell$, and whose multiplication law depends smoothly on the $\alpha_\ell$’s. The number $N$ is called the dimension of the group.

Different groups have different $N$. Yet, the dimension of the group does not uniquely define it. We discuss below the classifications of groups.

Statement: An Abelian Lie group has N = 1. A non-Abelian Lie group has N > 1.

The first example is a group we denote by U(1). It represents addition of real numbers modulo

2π, that is, rotation on a circle. Such a group has an infinite number of elements that are labeled

by a single continuous parameter α. We can write the group elements as M = exp(iα). We can

also represent it by M = exp(2iα) or, more generally, as M = exp(iXα) with X real. Each X

generates an irrep of the group.

We are mainly interested in compact Lie groups. We do not define this term formally here,

but we can use the U(1) example to give an intuitive explanation of what it means. A group of

adding with a modulo is compact, while just adding (without the modulo) would be non-compact.

In the first, if you repeat the same addition a number of times, you may return to your starting

point, while in the latter this would never happen. In other words, in a compact Lie group, the

parameters have a finite range, while in a non-compact group, their range is infinite. (Do not

confuse that with the number of elements, which is infinite in either case.) Another example is

rotations and boosts: Rotations represent a compact group while boosts do not.

Statement: The elements of any compact Lie group can be written as

$$M = \exp(i\alpha_\ell X_\ell) \tag{A.7}$$

such that the $X_\ell$ are specific Hermitian matrices and the $\alpha_\ell$, as mentioned before, are real numbers. (We use the standard summation convention, that is, $\alpha_\ell X_\ell \equiv \sum_\ell \alpha_\ell X_\ell$.)

Definition: The $X_\ell$ are called the generators of the group.

Let us perform some algebra before we turn to our next definition. Consider two elements of a group, $A$ and $B$, such that in $A$ only $\alpha_a \neq 0$, and in $B$ only $\alpha_b \neq 0$ and, furthermore, $\alpha_a = \alpha_b = \lambda$:

$$A \equiv \exp(i\lambda X_a), \qquad B \equiv \exp(i\lambda X_b). \tag{A.8}$$

Since $A$ and $B$ are in the group, each of them has an inverse. Thus also

$$C = BAB^{-1}A^{-1} \equiv \exp(i\beta_c X_c) \tag{A.9}$$

is in the group. Let us take $\lambda$ to be a small parameter and expand around the identity. Clearly, if $\lambda$ is small, also all the $\beta_c$ are small. Keeping the leading order terms, we get

$$C = \exp(i\beta_c X_c) \approx I + i\beta_c X_c, \qquad C = BAB^{-1}A^{-1} \approx I + \lambda^2 [X_a, X_b]. \tag{A.10}$$

In the $\lambda \to 0$ limit, we have

$$[X_a, X_b] = i\,\frac{\beta_c}{\lambda^2}\,X_c. \tag{A.11}$$

The combination

$$f_{abc} \equiv \lambda^{-2}\beta_c \tag{A.12}$$

is independent of $\lambda$. Furthermore, while $\lambda$ and $\beta_c$ are infinitesimal, the $f_{abc}$ constants do not diverge. This brings us to a new set of definitions.

Definition: $f_{abc}$ are called the structure constants of the group.

Definition: The commutation relations [see Eq. (A.11)]

$[X_a, X_b] = i f_{abc} X_c,$ (A.13)

constitute the algebra of the Lie group.
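The expansion leading to Eqs. (A.10) and (A.11) can be checked numerically. The following is a minimal sketch, assuming the spin-1/2 choice $X_a = \sigma_a/2$ (Pauli matrices) for SU(2); the helper names (`expm`, `group_commutator`) are illustrative and the matrix exponential is a plain Taylor sum, adequate only for small matrices.

```python
# Numerical check of Eqs. (A.8)-(A.11) for SU(2) in the spin-1/2 representation.
# Assumption: X_a = sigma_a / 2, so that [X_1, X_2] = i X_3.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def expm(A, terms=25):
    """Matrix exponential from its Taylor series (adequate for small matrices)."""
    result, term = identity(len(A)), identity(len(A))
    for k in range(1, terms):
        term = scale(1 / k, matmul(term, A))
        result = add(result, term)
    return result

X = [scale(0.5, m) for m in (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)]

def group_commutator(a, b, lam):
    """C = B A B^-1 A^-1 with A = exp(i lam X_a), B = exp(i lam X_b)."""
    A, A_inv = expm(scale(1j * lam, X[a])), expm(scale(-1j * lam, X[a]))
    B, B_inv = expm(scale(1j * lam, X[b])), expm(scale(-1j * lam, X[b]))
    return matmul(matmul(B, A), matmul(B_inv, A_inv))

lam = 1e-3
C = group_commutator(0, 1, lam)
# (C - I) / lam^2 should approach [X_1, X_2] = i X_3 as lam -> 0.
estimate = scale(1 / lam**2, add(C, scale(-1, identity(2))))
target = scale(1j, X[2])
max_error = max(abs(estimate[i][j] - target[i][j])
                for i in range(2) for j in range(2))
print(max_error)  # expected to be small, of order lam
```

Repeating the run with a smaller `lam` should shrink `max_error` roughly in proportion, which is the numerical counterpart of the statement that the structure constants are independent of λ.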

Note the following points regarding the Lie Algebra:

• The algebra defines the local properties of the group but not its global properties. Usually,

this is all we care about.

• The Algebra is closed under the commutation operator.

• Similar to our discussion of groups, one can define representations of the algebra, that is,
matrix representations of the $X_\ell$. In particular, each representation has its own dimension. (Do
not confuse the dimension of the representation with the dimension of the group.)

• The generators satisfy the Jacobi identity

[Xa, [Xb, Xc]] + [Xb, [Xc, Xa]] + [Xc, [Xa, Xb]] = 0. (A.14)

• For each algebra there is the trivial (singlet) representation which is $X_\ell = 0$ for all ℓ. The

trivial representation of the algebra generates the trivial representation of the group.

• Since the Abelian Lie group U(1) has only one generator, its algebra is trivial: there are no
non-vanishing commutators. More generally, a compact Abelian Lie algebra is a direct sum of U(1) algebras.

• Non-Abelian Lie groups have non-trivial algebras.

• The generators of the Non-Abelian Lie groups are traceless.

The example of SU(2) algebra is well-known from QM courses:

[Xa, Xb] = iεabcXc. (A.15)

Here εabc are the structure constants of the SU(2) group. Usually, in QM, X is called L, S, or J .

The three matrices Sx, Sy and Sz for a given spin S correspond to a given irrep of SU(2). The

SU(2) group represents non-trivial rotations in a two-dimensional complex space. Its algebra is

the same as the algebra of the SO(3) group, which represents rotations in the three-dimensional

real space.

We should explain what we mean when we say that “the group represents rotations in a space.”

The QM example makes it clear. Consider a finite Hilbert space of, say, a particle with spin S. The

matrices that rotate the direction of the spin are written in terms of exponent of the Si operators.

For a spin-half particle, the Si operators are written in terms of the Pauli matrices. For particles

with spin different from 1/2, the Si operators will be written in terms of different matrices. We learn

that the group represents rotations in some space, while the various representations correspond to

different objects that can “live” in that space.

There are three important irreps that have special names. The first one is the trivial – or

singlet – representation that we already mentioned. Its importance stems from the fact that it

corresponds to something that is symmetric under rotations. While that might sound confusing it

is really trivial. Rotation of a singlet does not change its representation. Rotation of a spin half

does change its representation.

The second important irrep is the fundamental representation. This is the smallest non-trivial

irrep. For SU(2), this is the spinor, or spin half, representation. An important property of the

fundamental representation is that it can be used to get all other representations. We return to

this point later. Here we just remind you that this statement is well familiar from QM. One can

get spin-1 by combining two spin-1/2, and you can get spin-3/2 by combining three spin-1/2. Any

non-Abelian Lie group has a fundamental irrep.

The third important irrep is the Adjoint representation. It is made out of the structure constants

themselves. Think of a matrix representation of the generators. Each entry, $T^c_{ij}$, is labelled by three

indices. One is the c index of the generator itself, that runs from 1 to N , such that N depends

on the group. The other two indices, i and j, are the matrix indices that run from 1 to the

dimension of the representation. One can show that each Lie group has one representation where

the dimension of the representation is the same as the dimension of the group. This representation

is obtained by defining

$(X_c)_{ab} \equiv -i f_{abc}.$ (A.16)

In other words, the structure constants themselves satisfy the algebra of their own group. In

SU(2), the Adjoint representation is that of spin-1. In your homework you will check for yourself

that the εijk are just the set of the three 3× 3 representations of spin 1.
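As an illustration of Eq. (A.16), the sketch below builds the three 3 × 3 matrices $(X_c)_{ab} = -i\varepsilon_{abc}$ and checks that they satisfy the SU(2) algebra of Eq. (A.15), just as the 2 × 2 spin-1/2 matrices do. The function names are illustrative, and indices run over 0, 1, 2 rather than 1, 2, 3. The sum of squares of the generators is also computed; for spin 1 it should equal j(j+1) = 2 times the unit matrix.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

# Adjoint generators from Eq. (A.16): (X_c)_{ab} = -i f_{abc}, with f = epsilon.
ADJOINT = [[[-1j * levi_civita(a, b, c) for b in range(3)] for a in range(3)]
           for c in range(3)]

# Spin-1/2 generators X_a = sigma_a / 2, for comparison.
SPIN_HALF = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def satisfies_su2_algebra(gens, tol=1e-12):
    """Check [X_a, X_b] = i epsilon_abc X_c for a list of three matrices."""
    n = len(gens[0])
    for a in range(3):
        for b in range(3):
            lhs = commutator(gens[a], gens[b])
            for i in range(n):
                for j in range(n):
                    rhs = sum(1j * levi_civita(a, b, c) * gens[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True

# X_1^2 + X_2^2 + X_3^2 for the adjoint representation.
squares = [matmul(X, X) for X in ADJOINT]
casimir = [[sum(sq[i][j] for sq in squares) for j in range(3)] for i in range(3)]

print(satisfies_su2_algebra(SPIN_HALF), satisfies_su2_algebra(ADJOINT))
```

Both representations obey the same commutation relations; only the dimension of the matrices differs (2 versus 3).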

Before closing this section we remark on subalgebras and simple groups.

Definition: A subalgebra M is a set of generators that are closed under commutation.

Definition: Consider an algebra L with two subalgebras L1 and L2 such that for any X ∈ L1

and Y ∈ L2, [X, Y ] = 0. The algebra L is not simple and it can be written as a direct product:

L = L1 × L2.

Definition: A simple Lie algebra is an algebra that cannot be written as a direct product.

Since any algebra can be written as a direct product of simple Lie algebras, we can think

about each of the simple algebras separately. A useful example is that of the U(2) group. A U(2)

transformation corresponds to a rotation in two-dimensional complex space. This group is not

simple:

U(2) = SU(2)× U(1). (A.17)

Think, for example, about the rotation of a spinor. It can be separated into two: The trivial

rotation is just a U(1) transformation, that is, a phase multiplication of the spinor. The non-

trivial rotation is the SU(2) transformation, that is, an internal rotation between the two spin

components.
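The separation of a U(2) transformation into a phase and an SU(2) rotation can be sketched as follows. This is a minimal illustration with a hypothetical helper `split_u2`; the example matrix is made up for the purpose. Note that the split is only fixed up to an overall sign (phase → −phase, S → −S), a global detail that the algebra does not see.

```python
import cmath

def det2(U):
    return U[0][0] * U[1][1] - U[0][1] * U[1][0]

def split_u2(U):
    """Return (phase, S) with U = phase * S, |phase| = 1 and det S = 1."""
    theta = cmath.phase(det2(U))       # det U = exp(i theta) for unitary U
    phase = cmath.exp(1j * theta / 2)  # the U(1) factor
    S = [[u / phase for u in row] for row in U]  # the SU(2) factor
    return phase, S

# Hypothetical example: a real rotation by 0.4 times an overall phase exp(0.7i).
c, s = cmath.cos(0.4), cmath.sin(0.4)
U = [[cmath.exp(0.7j) * c, -cmath.exp(0.7j) * s],
     [cmath.exp(0.7j) * s, cmath.exp(0.7j) * c]]

phase, S = split_u2(U)
print(abs(phase), abs(det2(S)))  # both should be close to 1
```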

A.4 Roots and Weights

Here we move to discuss properties of the algebra and the representations. From this point on

we only consider irreps, and thus we do not distinguish anymore between a representation and an

irrep.

Definition: The Cartan subalgebra is the largest subset of generators whose matrix represen-

tations can all be diagonalized at once. Obviously, these generators all commute with each other

and thus they constitute a subalgebra.

Definition: The number of generators in the Cartan subalgebra is called the rank of the algebra.

Let us consider a few examples. Since the U(1) algebra has only a single generator, it is of

rank one. SU(2) is also rank one. You can make one of its three generators, say Sz, diagonal, but

not two of them simultaneously. SU(3) is rank two. We later elaborate on SU(3) in much more

detail. (We have to, because the Standard Model has an SU(3) symmetry.)

Our next step is to introduce the terms roots and weights. We do that via an example. Consider

the SU(2) algebra. It has three generators. We usually choose S3 to be in the Cartan subalgebra,

and we can combine the two other generators, S1 and S2, to a raising and a lowering operator,

S± = S1 ± iS2. Any representation can be defined by the eigenvalues under the operation of the

generators in the Cartan subalgebra, in this case S3. For example, for the spin-1/2 representation,

the eigenvalues are −1/2 and +1/2; for the spin-1 representation, the eigenvalues are −1, 0, and
+1. Under the operation of the raising (S+) and lowering (S−) generators, we “move” from one
eigenstate of S3 to another. For example, for a spin-1 representation, we have $S_-|1\rangle \propto |0\rangle$.

Let us now consider a general Lie group of rank n. Any representation is characterized by the
possible eigenvalues of its eigenstates under the operation of the Cartan subalgebra: $|e_1, e_2, \ldots, e_n\rangle$.

Statement: We can assemble all the operators that are not in the Cartan subalgebra into

“lowering” and “raising” operators. That is, when they act on an eigenstate they either move it

to another eigenstate or annihilate it.

Definition: The weight vectors (or simply weights) of a representation are the possible eigen-

values of the generators in the Cartan subalgebra.

Definition: The roots of the algebra are the various ways in which the generators move a state

between the possible weights.

Statement: The weights completely describe the representation.

Statement: The roots completely describe the algebra.

Statement: The weights of the adjoint representations are the roots of the Lie algebra.

Note that both roots and weights live in an n-dimensional vector space, where n is the rank

of the group. The number of roots is the dimension of the group. The number of weights is the

dimension of the representation.

Let us return to our SU(2) example. The vector space of roots and weights is one-dimensional.

The three roots are −1, 0,+1. The trivial representation has only one weight, zero; The funda-

mental has two, ±1/2; The adjoint has three, 0,±1; and so on. You can also see that the weights

of the adjoint irrep are the roots of the algebra.

A.5 SU(3)

In this section we discuss the SU(3) group. It is more complicated than SU(2) and it allows us to

demonstrate a few aspects of Lie groups that cannot be demonstrated with SU(2). Of course, it is

also important since it is relevant to physics.

SU(3) is a generalization of SU(2). It may be useful to think about it as rotations in three-dimensional
complex space. Similar to SU(2), the full symmetry of these rotations is called U(3),

and it can be written as a direct product of simple groups, U(3) = SU(3) × U(1). The SU(3)

algebra has eight generators. (You can see it by recalling that rotation in a complex space is

done by unitary matrices, and any unitary matrix can be written with a Hermitian matrix in the

exponent. There are nine independent Hermitian 3× 3 matrices. They can be separated to a unit

matrix, which corresponds to the U(1) part, and eight traceless matrices, which correspond to the

SU(3) part.) We go on and study SU(3) without proving that it is related to the above intuitive

picture of rotation in three dimensional complex space.

Similar to the use of the Pauli matrices for the fundamental representation of SU(2), the

fundamental representation of SU(3) is usually written in terms of the Gell-Mann matrices,

Xa = λa/2, (A.18)

with

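$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$

$\lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},$

$\lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad \lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$

$\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda_8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$ (A.19)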

We would like to emphasize the following points:

1. The Gell-Mann matrices are traceless, as they should be.

2. There are three SU(2) subalgebras. One of them is manifest and it is given by λ1, λ2 and

λ3. Can you find the other two?

3. It is manifest that SU(3) is of rank two: λ3 and λ8 are in the Cartan subalgebra.
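The properties listed above are easy to verify by machine. The sketch below enters the Gell-Mann matrices of Eq. (A.19) as nested lists and checks that they are traceless and Hermitian, and that λ3 and λ8 commute while λ1 and λ3 do not. It also checks the normalization Tr(λa λb) = 2δab, which is a standard property not stated in the text. All helper names are illustrative.

```python
import math

S = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

def is_zero(A, tol=1e-12):
    return all(abs(x) < tol for row in A for x in row)

def is_hermitian(A, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - complex(A[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))

traceless = all(abs(trace(m)) < 1e-12 for m in GELL_MANN)
hermitian = all(is_hermitian(m) for m in GELL_MANN)
# lambda_3 and lambda_8 (list positions 2 and 7) commute: rank two.
cartan_commutes = is_zero(commutator(GELL_MANN[2], GELL_MANN[7]))
# lambda_1 does not commute with lambda_3, so it cannot be diagonalized with it.
l1_l3_commute = is_zero(commutator(GELL_MANN[0], GELL_MANN[2]))
# Normalization Tr(lambda_a lambda_b) = 2 delta_ab.
normalized = all(
    abs(trace(matmul(GELL_MANN[a], GELL_MANN[b])) - (2 if a == b else 0)) < 1e-12
    for a in range(8) for b in range(8))

print(traceless, hermitian, cartan_commutes, l1_l3_commute, normalized)
```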

Having explicit expressions for the fundamental representation at our disposal, we can draw the

weight diagram. In order to do so, let us recall how we do it for the fundamental (spinor) rep-

resentation of SU(2). We have two basis vectors (spin-up and spin-down); we apply Sz on them

and obtain the two weights, +1/2 and −1/2. Here we follow the same steps. We take the three

vectors,

(1, 0, 0)T , (0, 1, 0)T , (0, 0, 1)T , (A.20)

and apply to them the two generators in the Cartan subalgebra, X3 and X8. We find the three

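weights

$\left(+\tfrac{1}{2}, +\tfrac{1}{2\sqrt{3}}\right), \quad \left(-\tfrac{1}{2}, +\tfrac{1}{2\sqrt{3}}\right), \quad \left(0, -\tfrac{1}{\sqrt{3}}\right).$ (A.21)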

We can plot this in a weight diagram in the X3 −X8 plane. Please do it.

Once we have the weights we can get the roots. They are just the combination of generators

that move us between the weights. Clearly, the two roots that are in the Cartan are at the origin.

The other six are those that move us between the three weights. We find that they are

$\left(\pm\tfrac{1}{2}, \pm\tfrac{\sqrt{3}}{2}\right), \quad (\pm 1, 0).$ (A.22)

Again, it is a good idea to plot it. This root diagram is also the weight diagram of the Adjoint

representation. In terms of the Gell-Mann matrices, we can see that the raising and lowering

generators are proportional to

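$I_\pm = \tfrac{1}{2}(\lambda_1 \pm i\lambda_2), \quad V_\pm = \tfrac{1}{2}(\lambda_4 \pm i\lambda_5), \quad U_\pm = \tfrac{1}{2}(\lambda_6 \pm i\lambda_7).$ (A.23)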

The names I, U , and V are, at this point, just names. Later on we will see that they are related

to some specific SU(3) symmetry.
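The weights and roots quoted in Eqs. (A.21) and (A.22) can be reproduced in a few lines. The sketch below reads the weights of the fundamental representation off the diagonal entries of X3 = λ3/2 and X8 = λ8/2, and then forms the six non-zero roots as differences between distinct weights. The variable names are illustrative.

```python
import math

# Diagonal entries of X_3 = lambda_3 / 2 and X_8 = lambda_8 / 2 (Eqs. A.18, A.19).
X3_DIAG = [0.5, -0.5, 0.0]
X8_DIAG = [1 / (2 * math.sqrt(3)), 1 / (2 * math.sqrt(3)), -1 / math.sqrt(3)]

# Each basis vector of Eq. (A.20) is an eigenvector of both generators;
# its weight is the pair of eigenvalues.
weights = list(zip(X3_DIAG, X8_DIAG))

# The non-zero roots connect distinct weights: root = w_i - w_j for i != j.
roots = [(wi[0] - wj[0], wi[1] - wj[1])
         for i, wi in enumerate(weights)
         for j, wj in enumerate(weights) if i != j]

for w in weights:
    print("weight", round(w[0], 4), round(w[1], 4))
for r in roots:
    print("root  ", round(r[0], 4), round(r[1], 4))
```

Plotting the two lists in the X3 − X8 plane gives a triangle for the weights and a hexagon for the non-zero roots.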

A.6 Classification and Dynkin diagrams

The SU(3) example allows us to obtain more formal results. In the case of SU(2), it is clear what

are the raising and lowering operators. The generalization to groups with higher rank is as follows.

Definition: A positive (negative) root is a root whose first non-zero component is positive

(negative). A raising (lowering) operator corresponds to a positive (negative) root.

Definition: A simple root is a positive root that is not the sum of other positive roots.

Statement: Every rank-k algebra has k simple roots. Which ones they are is a matter of

convention, but their relative lengths and angles are fixed.

In fact, it can be shown that the simple roots fully describe the algebra. It can be further

shown that there are only four possible angles and corresponding relative lengths between simple

roots:

angle            90°    120°     135°      150°
relative length  N/A    1 : 1    1 : √2    1 : √3      (A.24)

The above rules can be visualized using Dynkin diagrams. Each simple root is described by a

circle. The angle between two roots is described by the number of lines connecting the circles:

○  ○ (90°),    ○─○ (120°),    ○=● (135°),    ○≡● (150°)      (A.25)

where the solid circle in a link represents the longer root.

There are seven classes of Lie groups. Four classes are infinite and three classes, called the

exceptional groups, have each only a finite number of Lie groups. Below you can find all the sets.

The number of circles is the rank of the group. Note that different names for the infinite groups

are used in the physics and mathematics communities. Below we give both names, but we use only

the physics names from now on.

        ○
        │
○─○─ ⋯ ─○─○      SO(2k) [D_k]

●─●─ ⋯ ─●=○      SO(2k + 1) [B_k]

○─○─ ⋯ ─○=●      Sp(2k) [C_k]

○─○─ ⋯ ─○─○      SU(k + 1) [A_k]      (A.26)

    ○
    │
○─○─○─○─○          E6

    ○
    │
○─○─○─○─○─○        E7

    ○
    │
○─○─○─○─○─○─○      E8

○─○=●─●            F4

○≡●                G2      (A.27)

Consider, for example, SU(3). The two simple roots are equal in length and have an angle of
120° between them. Thus, the Dynkin diagram is just ○─○.
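The definitions of positive and simple roots can be applied mechanically to the SU(3) roots of Eq. (A.22). The sketch below selects the positive roots (first non-zero component positive), discards those that are a sum of two other positive roots, and computes the angle between the two that remain. Function names are illustrative, and testing only sums of pairs is an assumption that suffices for this example.

```python
import math
from itertools import combinations

R3 = math.sqrt(3) / 2
# The six non-zero roots of SU(3), Eq. (A.22).
ROOTS = [(0.5, R3), (0.5, -R3), (-0.5, R3), (-0.5, -R3), (1.0, 0.0), (-1.0, 0.0)]

def is_positive(root, tol=1e-12):
    """A root is positive if its first non-zero component is positive."""
    for component in root:
        if abs(component) > tol:
            return component > 0
    return False

def close(u, v, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(u, v))

positive = [r for r in ROOTS if is_positive(r)]

def is_sum_of_positive(root):
    """True if root equals the sum of two distinct positive roots."""
    return any(close(root, (p[0] + q[0], p[1] + q[1]))
               for p, q in combinations(positive, 2))

simple = [r for r in positive if not is_sum_of_positive(r)]

def angle_degrees(u, v):
    dot = u[0] * v[0] + u[1] * v[1]
    return math.degrees(math.acos(dot / (math.hypot(*u) * math.hypot(*v))))

angle = angle_degrees(simple[0], simple[1])
print(simple, round(angle, 6))
```

The two surviving roots have equal length and the angle between them comes out as 120°, in agreement with the single line in the SU(3) Dynkin diagram.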

Dynkin diagrams are also a very good tool for identifying the subalgebras of a

given algebra. We do not describe the procedure in detail here, and you are encouraged to read

it for yourself in one of the books. One simple point to make is that removing a simple root

always corresponds to a subalgebra. For example, removing simple roots you can see the following

breaking pattern:

E6 → SO(10)→ SU(5)→ SU(3)× SU(2). (A.28)

You may find such a breaking pattern in the context of Grand Unified Theories (GUTs).

Finally, we would like to mention that the algebras of some small groups are identical:

SU(2) ≃ SO(3) ≃ Sp(2), SU(4) ≃ SO(6), SO(4) ≃ SU(2) × SU(2), SO(5) ≃ Sp(4). (A.29)

A.7 Naming representations

We are now back to discuss representations. How do we name an irrep? In the context of SU(2),

which is rank one, there are three different ways to do so.

(i) We denote an irrep by its highest weight. For example, spin-0 denotes the singlet repre-

sentation, spin-1/2 refers to the fundamental representation, where the highest weight is 1/2, and

spin-1 refers to the adjoint representation, where the highest weight is 1.

(ii) We can define the irrep according to the dimension of the representation-matrices, which

is also the number of weights. Then the singlet representation is denoted by 1, the fundamental

by 2, and the adjoint by 3.

(iii) We can name the representation by the number of times we can apply S− to the highest

weight without annihilating it. In this notation, the singlet is denoted as (0), the fundamental as

(1), and the adjoint as (2).

Before we proceed, let us explain in more detail what we mean by “annihilating the state”. Let

us examine the weight diagram. In SU(2), which is rank-one, this is a one dimensional diagram.

For example, for the fundamental representation, it has two entries, at +1/2 and −1/2. We now

take the highest weight (in our example, +1/2), and move away from it by applying the root that

corresponds to the lowering operator, −1. When we apply it once, we move to the lowest weight,

−1/2. When we apply it once more, we move out of the weight diagram, and thus “annihilate the

state”. Thus, for the spin-1/2 representation, we can apply the root corresponding to S− once to

the highest weight before moving out of the weight diagram, and — in the naming scheme (iii) —

we call the representation (1).

We are now ready to generalize this to general Lie algebras. Either of the methods (ii) and

(iii) are used. Method (ii) is straightforward, but somewhat problematic as there could be several

different representations with the same dimension. We give an example of such a situation later.

In scheme (iii), the notation used for a specific state is unambiguous. To use it, we must order

the simple roots in a well-defined (even if arbitrary) order. Then we have a unique highest weight.

We denote a representation of a rank-k algebra as a k-tuple, such that the first entry is the maximal

number of times that we can apply the first simple root on the highest weight before the state is

annihilated, the second entry refers to the maximal number of times that we can apply the second

simple root on the highest weight before annihilation, and so on. Take again SU(3) as an example.

We order the Cartan subalgebra as X3, X8 and the two simple roots as

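$S_1 = \left(+\tfrac{1}{2}, +\tfrac{\sqrt{3}}{2}\right), \qquad S_2 = \left(+\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2}\right).$ (A.30)

Consider the fundamental representation where we chose the highest weight to be $\left(\tfrac{1}{2}, \tfrac{1}{2\sqrt{3}}\right)$.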

Subtracting S1 twice or subtracting S2 once from the highest weight would annihilate it. Thus

the fundamental representation is denoted by (1, 0). You can work out the case of the adjoint

representation and find that it should be denoted as (1, 1). In fact, it can be shown that any pair

of non-negative integers forms a different irrep. (For SU(2) with the naming scheme (iii), any

non-negative integer defines a different irrep.)
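Naming scheme (iii) can be turned into a short procedure: start from the highest weight and count how many times each simple root can be subtracted before leaving the weight diagram. The sketch below does this for the SU(3) weight diagrams given above. The function names are illustrative, the highest weight is picked by comparing the X3 component first and then X8, and the weights of the anti-fundamental are taken to be minus those of the fundamental (a standard fact that the text does not state explicitly).

```python
import math

H = 1 / (2 * math.sqrt(3))
R = math.sqrt(3) / 2
SIMPLE_ROOTS = [(0.5, R), (0.5, -R)]                       # Eq. (A.30)
FUNDAMENTAL = [(0.5, H), (-0.5, H), (0.0, -2 * H)]         # Eq. (A.21)
ANTI_FUNDAMENTAL = [(-a, -b) for a, b in FUNDAMENTAL]      # assumed: negated weights
ADJOINT = [(0.5, R), (0.5, -R), (-0.5, R), (-0.5, -R),
           (1.0, 0.0), (-1.0, 0.0), (0.0, 0.0), (0.0, 0.0)]  # Eq. (A.22) and two zeros

def contains(weights, w, tol=1e-9):
    return any(abs(w[0] - v[0]) < tol and abs(w[1] - v[1]) < tol for v in weights)

def labels(weights, simple_roots):
    """Count how often each simple root can be subtracted from the highest weight."""
    highest = max(weights)  # first component decides, then the second
    result = []
    for root in simple_roots:
        steps, w = 0, highest
        while True:
            w = (w[0] - root[0], w[1] - root[1])
            if not contains(weights, w):
                break  # the state is "annihilated"
            steps += 1
        result.append(steps)
    return tuple(result)

print(labels(FUNDAMENTAL, SIMPLE_ROOTS))
print(labels(ANTI_FUNDAMENTAL, SIMPLE_ROOTS))
print(labels(ADJOINT, SIMPLE_ROOTS))
```

With the ordering of Eq. (A.30), the three calls give the tuples quoted in the text for the fundamental and the adjoint, and the reversed tuple for the anti-fundamental.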

From now on we limit our discussion mostly to SU(N).

Statement: For any group the singlet irrep is (0, 0, ..., 0).

Statement: For any SU(N) algebra, the fundamental representation is (1, 0, 0, ..., 0).

Statement: For any SU(N ≥ 3) algebra, the adjoint representation is (1, 0, 0, ..., 1).

Definition: For any SU(N), the conjugate representation is the one where the order of the

k-tuple is reversed.

For example, (0, 1) is the conjugate of the fundamental representation, which is usually called

the anti-fundamental representation. An irrep and its conjugate have the same dimension. In

the naming scheme (ii), they are called m and $\bar{m}$. Note that some representations are self-conjugate,
e.g., the adjoint representation; such representations are also called real representations.

Representations that are not self conjugate are also called complex representations. Also note that

all irreps of SU(2) are real.

We now return to the notion that the groups that we are dealing with are transformation

groups of physical states. These physical states are often just particles. For example, when we

talk about the SU(2) group that is related to the spin transformations, the physical system that is

being transformed is often that of a single particle with well-defined spin. In this context we often

abuse the language by saying that the particle is, for example, in the spin-1/2 representation of

SU(2). What we mean is that, as a state in the Hilbert space, it transforms by the spin operator

in the 1/2 representation of SU(2). Similarly, when we say that the proton and the neutron form

a doublet of isospin-SU(2), we mean that we represent p by the vector-state (1, 0)T and n by

the vector-state (0, 1)T , so that the appropriate representation of the isospin generators is by the

2× 2 Pauli matrices. In other words, we loosely speak on “particles in a representation” when we

mean “the representation of the group generators acting on the vector states that describe these

particles.”

How many particles are there in a given irrep? Here, again, we consider only SU(N) and state

the results.

• Consider an (α) representation of SU(2). It has

N = α + 1, (A.31)

particles. The singlet (0), fundamental (1) and adjoint (2) representations have, respectively,

1, 2, and 3 particles.

• Consider an (α, β) representation of SU(3). It has

$N = (\alpha + 1)(\beta + 1)\,\frac{\alpha + \beta + 2}{2}$ (A.32)

particles. The singlet (0, 0), fundamental (1, 0) and adjoint (1, 1) representations have, re-

spectively, 1, 3, and 8 particles.

• Consider an (α, β, γ) representation of SU(4). It has

$N = (\alpha + 1)(\beta + 1)(\gamma + 1)\,\frac{\alpha + \beta + 2}{2}\,\frac{\beta + \gamma + 2}{2}\,\frac{\alpha + \beta + \gamma + 3}{3}$ (A.33)

particles. The singlet (0, 0, 0), fundamental (1, 0, 0) and adjoint (1, 0, 1) representations have,

respectively, 1, 4, and 15 particles. Note that there is no α+γ+2 factor. Only a consecutive

sequence of the label integers appears in any factor.

• The generalization to any SU(N) is straightforward. It is easy to see that the fundamental

of SU(N) has N particles and the adjoint has $N^2 - 1$ particles.
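The pattern in Eqs. (A.31) to (A.33) can be written as one function: every consecutive run of labels contributes a factor equal to (sum of the labels in the run + length of the run) divided by the length of the run. The sketch below implements this reading of the "straightforward generalization"; the function name `dim_su_n` is illustrative, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def dim_su_n(labels):
    """Number of particles in the SU(N) irrep with the given (N-1)-tuple of labels."""
    shifted = [a + 1 for a in labels]
    dim = Fraction(1)
    # One factor per consecutive run labels[i..j], as in Eqs. (A.31)-(A.33).
    for i in range(len(shifted)):
        for j in range(i, len(shifted)):
            dim *= Fraction(sum(shifted[i:j + 1]), j - i + 1)
    return int(dim)

# Examples quoted in the text.
print(dim_su_n((0,)), dim_su_n((1,)), dim_su_n((2,)))           # SU(2)
print(dim_su_n((0, 0)), dim_su_n((1, 0)), dim_su_n((1, 1)))     # SU(3)
print(dim_su_n((0, 0, 0)), dim_su_n((1, 0, 0)), dim_su_n((1, 0, 1)))  # SU(4)
```

For a single label the function reduces to α + 1, and for two or three labels it reproduces Eqs. (A.32) and (A.33) factor by factor.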

In SU(2), the number of particles in a representation is unique. In a general Lie group, however,

the case may be different. Yet, it is often used to identify irreps. For example, in SU(3) we usually

call the fundamental 3, and the adjoint 8. For the anti-fundamental we use $\bar{3}$. In cases where there

are several irreps with the same number of particles we often use a prime to distinguish them. For

example, both (4, 0) and (2, 1) contain 15 particles. We denote them by 15 and 15′ respectively.

Last, we remark on how the count goes in terms of subgroup irreps. For example, any SU(3)
irrep can be written in terms of irreps of its SU(2) subgroup. Two useful decompositions are 3 = 2 + 1
and 8 = 3 + 2 + 2 + 1, where the irreps on the left are those of SU(3) and those on the right are of SU(2).
You can deduce them from the weight diagrams of the SU(3) irreps simply by moving only in one
of the SU(2) subgroup directions on the diagrams, for example, only moving in the X3 direction.
We do not elaborate further here; we only mention that there is a way to decompose any
irrep of a bigger group as a sum of irreps of its subgroups.
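The two decompositions quoted above can be read off the weight diagrams by machine. The sketch below groups the weights by their X8 value (that is, it moves only in the X3 direction) and, within each row, peels off SU(2) multiplets starting from the largest X3 value. The function name `su2_content` is illustrative and the routine assumes that each row is a valid collection of SU(2) weights.

```python
import math
from collections import defaultdict

H = 1 / (2 * math.sqrt(3))
R = math.sqrt(3) / 2
TRIPLET = [(0.5, H), (-0.5, H), (0.0, -2 * H)]                     # Eq. (A.21)
OCTET = [(0.5, R), (0.5, -R), (-0.5, R), (-0.5, -R),
         (1.0, 0.0), (-1.0, 0.0), (0.0, 0.0), (0.0, 0.0)]          # Eq. (A.22) and two zeros

def su2_content(weights, tol=1e-9):
    """Dimensions of the SU(2) irreps obtained by moving only along X3."""
    rows = defaultdict(list)
    for x3, x8 in weights:
        rows[round(x8, 6)].append(x3)  # weights with the same X8 form a row
    dims = []
    for x3_values in rows.values():
        remaining = sorted(x3_values, reverse=True)
        while remaining:
            top = remaining[0]
            n = int(round(2 * top)) + 1  # multiplet with highest weight `top`
            for k in range(n):           # remove top, top - 1, ..., -top
                target = top - k
                index = next(i for i, v in enumerate(remaining)
                             if abs(v - target) < tol)
                remaining.pop(index)
            dims.append(n)
    return sorted(dims, reverse=True)

print(su2_content(TRIPLET))
print(su2_content(OCTET))
```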

A.8 Combining representations

When we study spin, we learn how to combine SU(2) representations. The canonical example is

to combine two spin-1/2 to generate a singlet (spin-0) and a triplet (spin-1). We often write it

as 1/2 × 1/2 = 0 + 1. There is a similar method to combine representations for any Lie group.

The basic idea is, just like in SU(2), that we need to find all the possible ways to combine the

indices and then assign it to the various irreps. That way we know what irreps are in the product

representation and the corresponding Clebsch-Gordan (CG) coefficients.

Here, however, we do not explain how to construct the product representation. The reason

is that often all we want to know is what irreps appear in the product representation, without

the need to get all the CG coefficients. (In particular, many times all we care about is how to

generate the singlet.) There is a simple way to do just this for a general SU(N). This method

is called Young Tableaux, or Young Diagrams. The details of the method are well explained in

several places, for example, in the PDG. In the homework you are asked to learn how to use it.

Here we give a few examples of combining irreps in SU(3) that we will use when we discuss the

SM. Using naming scheme (ii) we have

$3 \times \bar{3} = 1 + 8, \qquad 3 \times 3 = \bar{3} + 6, \qquad 3 \times 6 = 10 + 8.$ (A.34)

From this we can conclude that

3× 3× 3 = 10 + 8 + 8 + 1. (A.35)

Note that the number of particles on both sides is equal, as it should be. Of particular interest
to us is that $3 \times \bar{3}$ and $3 \times 3 \times 3$ contain the singlet irrep.
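The statement that the product of a triplet and an anti-triplet contains a singlet and an octet can be illustrated at the level of weights, without any Clebsch-Gordan coefficients. The sketch below assumes two standard facts not spelled out in the text: the weights of a product representation are all pairwise sums of the weights of the factors, and the weights of the conjugate irrep are minus those of the irrep. Variable names are illustrative.

```python
import math

H = 1 / (2 * math.sqrt(3))
TRIPLET = [(0.5, H), (-0.5, H), (0.0, -2 * H)]       # weights of 3, Eq. (A.21)
ANTI_TRIPLET = [(-a, -b) for a, b in TRIPLET]        # assumed weights of the conjugate

# Weights of the product representation: all pairwise sums.
product = [(a[0] + b[0], a[1] + b[1]) for a in TRIPLET for b in ANTI_TRIPLET]

def is_zero(w, tol=1e-9):
    return abs(w[0]) < tol and abs(w[1]) < tol

zero_weights = [w for w in product if is_zero(w)]
nonzero_weights = [w for w in product if not is_zero(w)]

# Nine weights in total: six non-zero ones plus three zeros.
print(len(product), len(nonzero_weights), len(zero_weights))
```

The six non-zero weights are the roots of Eq. (A.22). Together with two of the zero weights they fill the adjoint (8), and the remaining zero weight is the singlet (1), consistent with 9 = 8 + 1.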

When we combine identical irreps, the resulting irreps have well-defined symmetry properties under exchange
of the two irreps, that is, they are even or odd under an exchange. For example, when combining
two spin halves, the singlet is odd while the triplet is even under the exchange. For SU(2) the rule
is simple: the highest irrep is symmetric, and going down the symmetry alternates. For other groups it is
more complicated and we do not discuss it here. When it is important we may add a subscript
to denote the symmetry property. For example, in SU(2) we may write $2 \times 2 = 1_a + 3_s$ and
$3 \times 3 = 5_s + 3_a + 1_s$. In the example above for SU(3) we may write

$3 \times 3 = \bar{3}_a + 6_s, \qquad 3 \times 3 \times 3 = 10_s + 8_m + 8_m + 1_a,$ (A.36)

where $8_m$ refers to a mixed symmetry.

The symmetry properties are important when the two irreps that we are combining are identical.

For example, the antisymmetric combination of two vectors in real three-dimensional space is the
cross product. Being antisymmetric, we know it identically vanishes for two identical vectors,

~a×~a = 0. This result is a general one, and applies to all fully antisymmetric combinations in any

group.
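A quick way to cross-check the subscripts in expressions such as Eq. (A.36) is to count dimensions: the product of two identical n-dimensional irreps splits into a symmetric part of dimension n(n+1)/2 and an antisymmetric part of dimension n(n−1)/2. The sketch below counts the index pairs explicitly and also shows the cross-product example; the function names are illustrative.

```python
from itertools import combinations, combinations_with_replacement

def exchange_dimensions(n):
    """(symmetric, antisymmetric) dimensions for two identical n-dimensional irreps."""
    symmetric = sum(1 for _ in combinations_with_replacement(range(n), 2))
    antisymmetric = sum(1 for _ in combinations(range(n), 2))
    return symmetric, antisymmetric

def cross(a, b):
    """Antisymmetric combination of two real three-dimensional vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

print(exchange_dimensions(2))  # compare with 2 x 2 = 1_a + 3_s
print(exchange_dimensions(3))  # compare with 3 x 3 = 5_s + 3_a + 1_s
print(cross((1, 2, 3), (1, 2, 3)))  # vanishes for identical vectors
```

For n = 3 the symmetric count 6 matches both $5_s + 1_s$ in SU(2) and $6_s$ in SU(3), while the antisymmetric count 3 matches $3_a$ and $\bar{3}_a$ respectively.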

Combining representations plays a very important role in physics. In particular, we would like

to find combinations of representations that combine into the singlet, as such a combination

is invariant under rotation in the relevant space.

Homework

Question A.1: S3

In this question we study the group S3. It is the smallest finite non-Abelian group. You can

think about it as all possible permutations of three elements. The group has 6 elements. Thinking

about the permutations we see that we get the following representation of the group:

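$() = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad (12) = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad (13) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix},$

$(23) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad (123) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \quad (321) = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$ (A.37)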

The names are instructive. For example, (12) represents exchanging the first and second elements.

(123) and (321) are cyclic permutations to the right or left.

1. Write explicitly the 6× 6 multiplication table for the group.

2. Show that the group is non-Abelian. Hint: it is enough to find one example.

3. Z3 is a subgroup of S3. Find the three elements that form a Z3.

4. In class we mentioned the following theorem for finite groups

$\sum_{R_i} [\dim(R_i)]^2 = N,$ (A.38)

where N is the number of elements in the group and Ri are all the irreps. Based on this,

prove that the representation in Eq. (A.37) is reducible. Then, write it explicitly in a (1 + 2)

block diagonal representation. (Hint: find a vector which is an eigenvector of all the above

matrices.)

5. In the last item you found a two-dimensional and a one-dimensional representation of S3.

Based on (A.38) you know that there is only one more representation and that it is one

dimensional. Find it.

Question A.2: Lie algebras

Consider two general elements of a Lie group,

A ≡ exp(iλXa), B ≡ exp(iλXb). (A.39)

where Xi is a generator. We think about λ as a small parameter. Then, consider a third element

$C = BAB^{-1}A^{-1} \equiv \exp(i\beta_c X_c).$ (A.40)

Expand C in powers of λ and show that at lowest order you get the Lie algebra

$[X_a, X_b] = i f_{abc} X_c, \qquad f_{abc} \equiv \frac{\beta_c}{\lambda^2}.$ (A.41)

Question A.3: SU(3)

1. The three Gell–Mann matrices, aλ1, aλ2 and aλ3 satisfy an SU(2) algebra, where a is a

constant. What is a?

2. Does this fact mean that SU(3) is not a simple Lie group?

3. Draw the root diagram of SU(3).

4. There are two other independent combinations of Gell–Mann matrices that satisfy SU(2)

algebras. What are they? Hint: Look at the root diagram.

Question A.4: Dynkin diagrams

1. Draw the Dynkin diagram of SO(10).

2. What is the rank of SO(10)?

3. How many generators are there for SO(10)? (We did not prove a general formula for the

number of generators for SO(N). It should be simple for you to find such a formula using

your understanding of rotations in real N -dimensional spaces.)

4. Based on the Dynkin diagram show that SO(10) has the following subalgebras

SO(8), SU(5), SU(4)× SU(2), SU(3)× SU(2)× SU(2). (A.42)

In each case show which simple root you can remove from the SO(10) Dynkin diagram.

Question A.5: Representations

Here we practice finding the number of degrees of freedom in a given irrep.

1. In SU(5), how many particles are there in the (1, 1, 0, 0) irrep?

2. In SU(3), how many particles are there in the following irreps

(3, 0), (2, 2). (A.43)

3. Consider the (3, 0) irrep of SU(3). Draw its weight diagram and from it decompose it into

its SU(2) irreps.

Question A.6: Combining irreps

Here we are going to practice the use of Young Tableaux. The details of the method can be

found in the PDG (there is a link in the website of the course). Study the algorithm and do the

following calculations. Make sure you check that the number of particles on both sides is the same.

Write your answer both in the k-tuple notation and the number notation. For example, in SU(3)

you should write

$(1, 0) \times (0, 1) = (0, 0) + (1, 1), \qquad 3 \times \bar{3} = 1 + 8.$ (A.44)

1. In SU(3) calculate

3× 3, 3× 8, 10× 8. (A.45)

2. Given that the quarks are SU(3)C triplets, 3, the anti-quarks are $\bar{3}$ and the gluons are color

octets, 8, which of the following could be an observable bound state?

$q\bar{q}$, qq, qg, gg, qqg, qqq. (A.46)

Note that an observable bound state must be a color singlet.

3. Find what the 5 and 10 of SU(5) are in the k-tuple notation.

4. Calculate 10× 10 in SU(5).

Bibliography

[1] C. P. Burgess and G. D. Moore, “The standard model: A primer,” Cambridge Univ. Pr.

(2007).

[2] A. Pich, arXiv:1201.0537 [hep-ph].

[3] Z. Han and W. Skiba, Phys. Rev. D 71, 075009 (2005) [hep-ph/0412166].

[4] W. Skiba, arXiv:1006.2142 [hep-ph].

[5] R. Barbieri, A. Pomarol, R. Rattazzi and A. Strumia, Nucl. Phys. B 703, 127 (2004) [hep-ph/0405040].

[6] G. Cacciapaglia, C. Csaki, G. Marandella and A. Strumia, Phys. Rev. D 74, 033011 (2006)

[hep-ph/0604111].

[7] M. E. Peskin and T. Takeuchi, Phys. Rev. D 46, 381 (1992).

[8] The GFitter home page at http://gfitter.desy.de/

[9] Flip Tenado, Quantum Diaries blog, http://www.quantumdiaries.org/

[10] The Review of Particle Physics, K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38,

090001 (2014).

[11] M.E. Peskin and D.V. Schroeder, QFT book.

[12] O. J. P. Eboli and M. C. Gonzalez-Garcia, Phys. Rev. D 70, 074011 (2004)

doi:10.1103/PhysRevD.70.074011 [hep-ph/0405269].

[13] Y. Liang and A. Czarnecki, Can. J. Phys. 90, 11 (2012) doi:10.1139/p11-144 [arXiv:1111.6126

[hep-ph]].

[14] G. Buchalla and A. J. Buras, Nucl. Phys. B 548, 309 (1999) doi:10.1016/S0550-3213(99)00149-2
[hep-ph/9901288].

[15] G. Buchalla and A. J. Buras, Nucl. Phys. B 412, 106 (1994) doi:10.1016/0550-3213(94)90496-0

[hep-ph/9308272].
