MULTIVARIATE METHODS FOR HADRONIC FINAL STATES IN
ELECTRON-POSITRON COLLISIONS AT √s = 500 GeV
Saurav Pathak
A DISSERTATION
in
Physics and Astronomy
Presented to the Faculties of the University of Pennsylvania
in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
2005
Robert Hollebeek, Supervisor of Dissertation
Randy Kamien, Graduate Group Chair
© Copyright 2005
by
Saurav Pathak
To my parents
Acknowledgements
The use of data mining techniques in high-energy physics is a novelty, and the solu-
tions that they offer to the challenges posed by the next generation of accelerators are
now being appreciated. I was very fortunate to participate in the National Scalable
Cluster Project (NSCP) that explored this area under Robert Hollebeek, an early
proponent of this approach to high-energy physics data. This thesis is clearly inspired by his vision and knowledge. Prof. Hollebeek opened up this exciting world and made it possible for me to explore the limits of newer computational techniques: their amazing abilities as well as their limitations.
I wish to thank Kevin Sterner for maintaining a strong interest in my work and
setting high standards. His guidance on the design of FastCal was crucial for its
success. I shall ever be grateful for the revisions and editorial scrutiny that he has
subjected this thesis to. Though I have benefited much from this, the errors if any
are my own.
Pavlos Protopapas has been a friend and mentor throughout my graduate life at the University of Pennsylvania. His interest and expertise in computational work have been a major influence on this thesis.
I wish to thank Gary Bower at SLAC for his encouragement especially for the
development of the CJNN neural network package. I have benefited much from my
interaction with him.
I would like to offer special thanks to Turgut Durduran, Marc Llaguno, Regine
Choe and Eylem Ozkaramanli for enriching my graduate life. Turgut has been very
helpful with his advice, especially on computer matters, during emergencies and otherwise.
This thesis was possible because of the strong, unconditional emotional support that I received from my parents, my soon-to-be in-laws and my vast extended family in distant Assam, India. Their interest and their faith in me were an engine that drove me toward the completion of this thesis.
I have no words to convey my thanks for the love and support of Maina, who has
shared with me much of the joys and sorrows of graduate life. Thank you Maina, for
this wonderful journey.
ABSTRACT
MULTIVARIATE METHODS FOR HADRONIC FINAL STATES IN
ELECTRON-POSITRON COLLISIONS AT √s = 500 GeV
Saurav Pathak
Supervisor: Robert Hollebeek
We approach hadronic final state events at a future linear collider at √s = 500 GeV from the knowledge discovery (data mining) point of view. We present FastCal, a fast, configurable calorimeter Monte Carlo simulator for linear collider detector simulations that produces data at 3000 times the rate of full simulation. Neural networks based on earlystopping are designed for the jet-combinatorial problem. CJNN, a neural network package, is presented for use in the linear collider analysis environment. Neural network performance is optimized by implementing
an ensemble of neural networks. A binary tree is used to obtain novel automatic cuts
on physics variables. Data visualization is introduced as a crucial component of data
analysis, and principal component analysis is used to understand data distributions
and structures in multiple dimensions. Finally, cluster analyses with fuzzy c-means
and demographic clustering are used to partition data automatically in an unsuper-
vised regime, and we show that for fruitful use of these algorithms, understanding
the data structures is crucial.
Contents
Acknowledgements iv
Abstract vi
Contents vii
List of Tables xiii
List of Figures xv
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 New Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Accelerators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.3 Multijet events in LC . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Multivariate Analysis and Kinematics . . . . . . . . . . . . . . . . . . 5
1.3 Knowledge Discovery as an approach . . . . . . . . . . . . . . . . . . 6
1.4 Mining for the Z boson – CDF data . . . . . . . . . . . . . . . . . . . 6
1.4.1 Demographic Clustering . . . . . . . . . . . . . . . . . . . . . 7
1.4.2 Mining for Z . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Plan of thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2 Knowledge Discovery and Physics 17
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 A short history of Data Mining . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Digital databases . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.2 Classical Statistics . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.3 Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.4 Meeting together . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3 Knowledge Discovery . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.1 An example - Binary Tree . . . . . . . . . . . . . . . . . . . . 21
2.3.2 Description of KDD . . . . . . . . . . . . . . . . . . . . . . . 23
2.4 Knowledge discovery and scientific discovery . . . . . . . . . . . . . . 24
2.5 Knowledge discovery and high-energy physics . . . . . . . . . . . . . 25
2.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Neural Networks 27
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 The problem – Separation of Z from background . . . . . . . . . . . . 28
3.3 Rationale for a neural network . . . . . . . . . . . . . . . . . . . . . . 29
3.3.1 The F -measure . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3.2 Comparison with some other methods . . . . . . . . . . . . . 30
3.4 A discussion on neural networks . . . . . . . . . . . . . . . . . . . . . 31
3.4.1 The neuron . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4.2 The network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.5 Neural Network Training . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5.1 Data representation . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5.2 Learning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5.3 Bias-variance dilemma and earlystopping . . . . . . . . . . . . 36
3.6 Neural Network Architecture . . . . . . . . . . . . . . . . . . . . . . . 38
3.6.1 The number of layers . . . . . . . . . . . . . . . . . . . . . . . 38
3.6.2 The number of units in each layer . . . . . . . . . . . . . . . . 39
3.6.3 Fixing ε . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.6.4 Root mean square weights . . . . . . . . . . . . . . . . . . . . 40
3.7 What do neural networks do . . . . . . . . . . . . . . . . . . . . . . . 42
3.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4 FastCal: A fast Monte Carlo simulator for the LCD calorimeter 45
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 Simulated data and real data . . . . . . . . . . . . . . . . . . . . . . 46
4.3 Rationale for FastCal . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3.1 Description of existing FastMC . . . . . . . . . . . . . . . . . 48
4.4 Description of FastCal . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.4.1 Design philosophy and approach . . . . . . . . . . . . . . . . . 49
4.4.2 Geometry of Calorimeter . . . . . . . . . . . . . . . . . . . . . 50
4.5 FastCal simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5.1 Particle Propagation . . . . . . . . . . . . . . . . . . . . . . . 52
4.5.2 High-energy physics processes . . . . . . . . . . . . . . . . . . 56
4.6 Single Particle Comparison . . . . . . . . . . . . . . . . . . . . . . . . 60
4.6.1 Low-energy physics processes . . . . . . . . . . . . . . . . . . 63
4.6.2 A synopsis of the hadronic particle simulations . . . . . . . . . 69
4.7 FastCal gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5 Neural Networks – comparison between GISMO and FastCal 73
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.2 Calorimeter deposition . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.2.1 Cluster level comparison . . . . . . . . . . . . . . . . . . . . . 75
5.2.2 Jet level comparison . . . . . . . . . . . . . . . . . . . . . . . 77
5.2.3 Jet-Quark Association . . . . . . . . . . . . . . . . . . . . . . 79
5.3 Neural network training . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.1 FastCal-GISMO comparison . . . . . . . . . . . . . . . . . . . 81
5.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6 Neural Network – results 84
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.2 Jet-Boson Association . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.3 Ensemble of neural networks . . . . . . . . . . . . . . . . . . . . . . . 87
6.3.1 Ensemble Results . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.4 Classifying Z and W jet-pairs . . . . . . . . . . . . . . . . . . . . . . 90
6.4.1 Distinguishing W and Z . . . . . . . . . . . . . . . . . . . . . 93
6.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7 Unsupervised Methods 97
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.2 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7.2.1 The combinatorial problem and Principal Component Analysis 100
7.2.2 PCA to 2 dimensions . . . . . . . . . . . . . . . . . . . . . . . 101
7.2.3 The density plot . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.4 PCA at the quark level . . . . . . . . . . . . . . . . . . . . . . 109
7.3 Clustering methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.3.1 Fuzzy clustering . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.3.2 Demographic clustering on the combinatorial problem . . . . . 117
7.3.3 Comparing clustering result . . . . . . . . . . . . . . . . . . . 119
7.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8 Conclusion and Future work 121
8.1 The goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.2 The Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
8.2.1 FastCal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
8.2.2 Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . 122
8.2.3 Exploratory Data Mining . . . . . . . . . . . . . . . . . . . . 124
8.2.4 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.3 Data mining and the future . . . . . . . . . . . . . . . . . . . . . . . 126
8.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
A Using FastCal in JAS 128
A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
A.2 Use of the FastCal package . . . . . . . . . . . . . . . . . . . . . . . . 129
B CJNN – Neural Network GUI package for JAS 131
B.1 A brief description of CJNN . . . . . . . . . . . . . . . . . . . . . . . 131
B.2 Use of CJNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
B.3 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
B.4 Training interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
B.5 Error Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
B.6 Application of CJNN in code . . . . . . . . . . . . . . . . . . . . . . 136
C Demographic Clustering – an example 138
C.1 Condorcet’s Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
C.2 Demographic Clustering example . . . . . . . . . . . . . . . . . . . . 139
D Binary Trees 141
D.1 SPRINT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
D.1.1 Split Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
E Fuzzy Clustering 143
E.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
F Principal Component Analysis and Multidimensional Scaling 145
F.1 Classical Multidimensional Scaling . . . . . . . . . . . . . . . . . . . 145
F.2 MDS Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Glossary 147
Bibliography 151
List of Tables
2.1 Alternative cuts for Z reconstruction. CUTS II are obtained from a
decision tree modeled on the jet pair invariant mass mjj. . . . . . . . 23
3.1 F -measure (Equation 3.3) for different classifiers. . . . . . . . . . . . 31
3.2 The binary tree rule for one of the branches trained on the neural network result. . . . 44
4.1 Detector Parameters ldmar01, a design specification for the LD design,
used in this study. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Table of parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3 Table of lengths. Tyvek is not included in the calculation. Since it
forms such a thin layer in comparison to the others, it is expected to
produce minor corrections. . . . . . . . . . . . . . . . . . . . . . . . 59
4.4 A comparison between FastCal and GISMO. Four jet events are those
in which both the Z’s decay hadronically, and the events have at least
four jets. Good pairs are those pairs of the four jets that can be
associated with the quarks from the same Z. . . . . . . . . . . . . . 71
5.1 The F -measures (Equation 3.3) for the three data sets. . . . . . . . 82
6.1 Comparison of the Boson Content and the Angular Proximity methods
in Z pair identification. 1 denotes a good jet pair and 0 denotes a bad
jet pair. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.1 The transformation matrix. Note that the transformation is between
the scaled and standardized (zero mean, unit variance) values of vari-
ables in the left column and the principal components . . . . . . . . . 102
7.2 Fuzzy clustering for 2 clusters (c = 2). The cluster 1 is interpreted as
the incorrect jet-pair cluster. The F -measure is calculated for cluster
0 and interpreted as the correct jet-pair cluster. . . . . . . . . . . . . 114
7.3 Fuzzy clustering for 3 clusters (c = 3). The F -measure for cluster 0
does not improve with the increase of c from 2 to 3. . . . . . . . . . 115
7.4 The centroids of the three clusters. Note that the variables are stan-
dardized to zero mean and unit variance. . . . . . . . . . . . . . . . . 115
7.5 Comparison between fuzzy c-means and demographic clustering. . . 119
List of Figures
1.1 Invariant mass distribution m of jet pairs, for events with a good e+e−
pair (electrons with energies greater than 20 GeV). . . . . . . . . . . 9
1.2 Clustering of jet pairs with a pair of e+e−. Clustering is performed on the following variables: jet-jet invariant mass (m_ij), jet-jet opening angle (dphi), the transverse momenta of the first and second jets (pt_i and pt_j, respectively) and the rapidities of the first and second jets (y_i and y_j, respectively). The two jets in each jet pair are ordered so that the first jet has the higher energy. The gray filled histograms denote the distributions of the entire dataset, whereas the histograms with thick lines denote distributions for the specific cluster. 10
1.3 Invariant mass distribution m of jet pairs with lowered e+e− threshold
energy (5 GeV). Note that the Z bump clearly visible in Figure 1.1 is
washed out. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Clustering of jet pairs with lowered e+e− threshold energy, representing
Z. The variables are those given in Figure 1.2. . . . . . . . . . . . . 12
1.5 Clustering of jet pairs with lowered e+e− threshold energy, representing
W±. The variables are explained in Figure 1.2 . . . . . . . . . . . . . 13
1.6 The Z cluster in the 2 maximum energy jet variables. The variables
are the same as in Figure 1.2, but the subscripts (1,2) now represent
the two massive jets. . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.7 The Z cluster in the 2 maximum energy jet variables. The variables are the same as in Figure 1.2, but the subscripts (1,2) now represent the two massive jets. The threshold jet energy is lowered to 5 GeV. The distance unit is set for the invariant mass m_12 (15 GeV), the transverse momenta pt_1 (5 GeV) and pt_2 (5 GeV). . . . 15
2.1 The graph on the left is one obtained from cuts. The one on the right
is from cuts obtained from the binary tree method. . . . . . . . . . . 22
3.1 Performance of different classifiers. For a quantitative comparison, see
Table 3.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 The distribution of the two output neural units trained on output data
that is either (1,0) or (0,1). The straight line is not a fitted line, but
the output distribution of the two neural units, showing the sum of the
outputs is unity, as is required for a probabilistic interpretation. There
are a few outliers present, though, which shows that some regions of
the variable space were not adequately represented and these regions
were under-trained. . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 (a) With ε = 0.01, the weights decay too fast for the neural network
to learn. This is denoted by the flat error curve in the figure. The
difference in the error values on the training and the validation datasets
is due to the difference in size. (b) With ε = 0.00001, the error curves
exhibit the same behavior as for ε = 0. . . . . . . . . . . . . . . . . . 41
3.4 The RMS weights of neural units in the first and second hidden layers.
In (a), the low-valued RMS weights are clearly isolated, and all are less than 1. In (b), the low-valued RMS weights are not isolated. So we
follow (a) and delete the neural units that have RMS weights less than
1.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5 With a √⟨W²⟩ (RMS weight) cutoff of 1.0, a 6-21-22-2 network is obtained, which, after the initial spike in the learning curve, continues in the expected fashion.
Also note that the error on the validation dataset is lower compared
to the error from the validation set on the older network. This indi-
cates that the generalization capability has improved with pruning. By
contrast in the more drastic pruning (b), we see that after the spike,
the error curves do not follow the behavior of the unpruned network,
thus exhibiting poor learning. This also reinforces our assertion that
we require at least two hidden layers to solve the problem. . . . . . . 43
4.1 Plot for Hadronic Barrel Calorimeter for neutral hadrons. Each circle
represents an intersection point of a neutral particle with the inner
surface of a barrel calorimeter. . . . . . . . . . . . . . . . . . . . . . 54
4.2 Plot for Hadronic Barrel Calorimeter for neutral hadrons. . . . . . . 54
4.3 Plot for Hadronic Barrel Calorimeter for charged hadrons. . . . . . . 55
4.4 Plot for Hadronic Barrel Calorimeter for charged hadrons. . . . . . . 56
4.5 Event with charged final state particle trajectories in FastCal and
GISMO for comparison. Figure (a) is obtained from FastCal and Fig-
ure (b) is the same event simulated by GISMO and viewed using LCD-
Wired. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.6 Calorimeter response to single e− events at 50 GeV. The left column of histograms shows the FastCal response, the right column the GISMO response. The top figures show the total energy deposit (E_HCAL + E_ECAL). The middle figures show the energy in the ECAL (E_ECAL), and the bottom figures the HCAL response (E_HCAL). Note that there is no energy leakage from the ECAL in FastCal, represented by the flat distribution of HCAL energy in the bottom left figure. . . . 58
4.7 Calorimeter response to single π− particle at 50 GeV without any fluc-
tuation. The placement of the histograms is the same as that described
in Figure 4.6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.8 Calorimeter response to single π− particle at 50 GeV with fluctuations
in the shower origin. The placement of the histograms is the same as
that described in Figure 4.6. . . . . . . . . . . . . . . . . . . . . . . 62
4.9 Calorimeter response to single π− particle at 50 GeV with fluctuations
in the shower origin. The placement of the histograms is the same as
that described in Figure 4.6. . . . . . . . . . . . . . . . . . . . . . . 64
4.10 Calorimeter response to single π− particle at 50 GeV with additional
fluctuations in the shower length. The placement of the histograms is
the same as that described in Figure 4.6. . . . . . . . . . . . . . . . 65
4.11 The distribution of w. . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.12 Calorimeter response to single π− particle at 50 GeV with additional
fluctuations in the w. The placement of the histograms is the same as
that described in Figure 4.6. . . . . . . . . . . . . . . . . . . . . . . 67
4.13 The quadratic fit for dE/dx. . . . . . . . . . . . . . . . . . . . . . . 68
4.14 Comparison of low energy depositions in the ECAL for 50 GeV negative
π with minimum ionization and δ-ray simulations. Note that the lowest
energy deposit matches well due to the fitting on the minimum energy
given by Equation 4.15. The spread to the right is due to the δ-ray
simulation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.15 The algorithm for FastCal energy deposition in ECAL due to hadronic
showers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.1 The ECAL energy deposition obtained in (a) FastCal and (b) GISMO.
The two distributions are in qualitative agreement. . . . . . . . . . . 74
5.2 The HCAL energy deposition obtained in (a) FastCal and (b) GISMO.
The two distributions are in qualitative agreement. . . . . . . . . . . 75
5.3 The total calorimeter energy deposition obtained in (a) FastCal and
(b) GISMO. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.4 The total energy per event in final state e+, e−, γ and hadrons (gen-
erator level). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.5 (a) The comparison of cluster energy distribution in FastCal and GISMO
(FullSim). There is a general agreement between FastCal and GISMO,
except for low energies. (b) The first bin in (a) blown up to show
GISMO has a high number of low energy clusters (<1 GeV). The very
low energy “garbage” clusters do not impact the jet finding algorithms. 77
5.6 Jet energy distribution in FastCal and GISMO (FullSim). The numbers of jets match, except for low energy jets. . . . 78
5.7 (a) The comparison of cluster energy distribution in FastCal and GISMO.
There is a general agreement between FastCal and GISMO, except for
low energies. (b) The first bin in (a) blown up to show GISMO has a
high number of low energy clusters (<1 GeV). . . . . . . . . . . . . . 78
5.8 The jet-quark energy plot for GISMO events with one Z decaying hadronically, while the other decays into muons and neutrinos. The most energetic jets are then associated with the quarks, and a match is established if the larger of the two jet-quark opening angles is less than 0.3 radians. . . . 80
5.9 The jet-quark energy plot for FastCal events with one Z decaying hadronically, while the other decays into muons and neutrinos. The most energetic jets are then associated with the quarks, and a match is established if the larger of the two jet-quark opening angles is less than 0.3 radians. . . . 80
5.10 The neural network training graph. The lowest validation error is
reached at the 780-th training cycle. The network at this stage is used
in the testing phase. . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.1 The energy fraction of each jet for the first boson, given by Equation 6.1 86
6.2 The ε-p graph illustrating the ensemble result. See Section 6.3.1. . . 89
6.3 The ε-p curves display the performance of a neural network trained on
e+e− → W+W− data and tested (1) on e+e− → ZZ data (Z NN on
W data) and (2) on e+e− → W+W− data (W NN on W data). Note
that the network trained on W data performs equally well on Z data. . . 91
6.4 The ε-p curves display the performance of a neural network trained on
e+e− → ZZ data and tested (1) on e+e− → ZZ data (Z NN on Z data)
and (2) on e+e− → W+W− data (W NN on Z data). As in Figure 6.3,
this shows that the networks trained on right and wrong jet pairs are
picking up the features of a general heavy boson. . . . . . . . . . . . 92
6.5 The ε-p curve for the Z jet pairs against a background of W jet pairs is given by the dark line. For comparison, the ε-p curve for the Z jet pairs against the background of wrong jet pairs is given. The dark line is straighter, denoting a poor classification performance in the mid probabilities (See Figure 6.6). . . . 94
6.6 The neural network output of the first unit (probability of Z jet pair).
Since there are two classes, the probability for W is complementary.
The classification is not optimal between the output values of 0.15
and 0.7, and it is particularly suboptimal between 0.15 and 0.5. The
peaks most likely denote subclasses of the jet pairs that have particular
kinematic variable distributions. . . . . . . . . . . . . . . . . . . . . 95
7.1 The surface matrix plot for the four variables mentioned in Section 7.2.1. . . . 102
7.2 The scree plot, which displays the eigenvalues of the principal components. . . . 103
7.3 The two orthogonal components with the highest standard deviations
after principal component analysis. In this plot, the Z jet-pairs are
given in black and the wrong combinations are given in gray. . . . . 104
7.4 The surface plot for the density distribution, calculated using the Gaussian-like kernel described above. Notice the peaks. The higher of the two peaks represents the signal. . . . 107
7.5 The contour plot of the density distribution. . . . . . . . . . . . . . 107
7.6 The efficiency-purity curve. The two curves compare the result of (a) the classical multidimensional scaling procedure and (b) the neural network. In (a), a density cutoff is used to identify the Z. The density is obtained in a 2-dimensional reduced subspace obtained from a classical scaling on the following variables: E_1E_2, m_12, θ_12 and δm²_12. The neural network is trained on E_1, E_2, cos θ_1, cos θ_2, m_12 and θ_12. . . . 108
7.7 The PCA plot using the same transformation as the jets but on quark
information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.8 The right and wrong combinations at the quark level. . . . 111
7.9 Four perspectives of three-dimensional PCA at the quark level. The dark (black) points are Z and the light (green) points are wrong combinations. The two classes form different distributions. See Section 7.2.4 for a description. . . . 112
7.10 The plot in the first two principal components on the fuzzy cluster data.
Note the structures in the data distribution. The dark datapoints are
the correct jet-jet pairs, and the lighter datapoints are the incorrect
jet-jet pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.11 Demographic Clustering: The purity as a function of the distance parameter on the θ_12 variable. . . . 118
7.12 Demographic Clustering: The efficiency-purity graph obtained by varying the distance parameter on the θ_12 variable. Comparison is made with fuzzy clustering, a θ_12 cut and an ensemble of neural networks. . . . 120
B.1 The network setup panel. The working directory sets the folder in which the data files exist and to which all CJNN-created files are copied. Network configuration configures the basic network architecture. The neural network can alternatively be read from an existing configuration file. Further, the configuration of a network can be written to a file for later use. . . . 133
B.2 The second panel in the GUI sets up the training parameters. eta is
the learning coefficient and alpha is the momentum coefficient term.
Rescaling the data transforms variables such that they have zero mean
and unit variance along columns. The training parameters can be
written to and read from a file. . . . . . . . . . . . . . . . . . . . . . 134
B.3 The third panel, under Weights Initialization, sets the weights in the
network. This can be done either from a file, or the weights can be
randomized. Once the weights are set, the iterative training can begin. 135
Chapter 1
Introduction
1.1 Motivation
The experimental program in high-energy physics has seen tremendous achievements
and successes in recent times. In particular, a unified electroweak model, the Standard
Model (SM) [36, 87, 77] was put in place by the discovery of the W [6] and the
Z [9] bosons. Numerous precision tests at e+e− colliders have further vindicated this model, and it is now generally accepted that it does indeed describe the physics at the energy scales accessible to present particle colliders.
Though the Standard Model has seen such success in precision tests, not all of its assertions have been verified.
The Standard Model, which has a SU(2)×U(1) symmetry, has a spectrum of mass-
less particles. The masses are generated by what is known as the Higgs mechanism
which requires a boson multiplet to provide the mass term via vacuum expectation
values, thus breaking the symmetry. Though there have been suggestions of evidence for the Higgs boson, its existence has not yet been proven. Further, it is still not proven that the Higgs boson indeed provides the masses of all the fermions and gauge bosons
in the SM particle spectrum. A lighter Higgs is presently favored, and with the indication of a Higgs at about 115 GeV from Large Electron-Positron Collider (LEP) runs, and a lower bound of about 113 GeV at 95% CL [10], it is now highly probable that the Higgs will be discovered either at the Large Hadron Collider (LHC) or sooner at the Tevatron. To prove that the Higgs is responsible for the gauge boson masses as well as the fermion masses, it is essential to obtain the absolute couplings of these particles to the possible extended Higgs sector. This will be possible only at the linear collider [3].
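For concreteness, the mechanism sketched above can be summarized by the standard textbook scalar potential of the minimal SM Higgs sector (a reminder of well-known results, not a derivation specific to this thesis):

```latex
% Minimal SM Higgs sector: one complex SU(2) doublet \phi.
V(\phi) = -\mu^2\,\phi^\dagger\phi + \lambda\,(\phi^\dagger\phi)^2,
\qquad
\langle\phi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v\end{pmatrix},
\qquad
v = \frac{\mu}{\sqrt{\lambda}} \simeq 246~\mathrm{GeV}.
% Expanding about this vacuum breaks SU(2) x U(1)_Y down to U(1)_EM and
% yields m_W = g v / 2, while the Yukawa couplings y_f give the fermions
% masses m_f = y_f v / \sqrt{2}.
```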
The SM Higgs prescription, in which a single complex scalar doublet breaks the gauge symmetry and then provides the finite fermion masses through Yukawa couplings, can be further extended. Supersymmetry (SUSY), in particular, demands at least two complex doublets.
Therefore, one of the major concerns in high-energy physics is the discovery of the Higgs boson and of the mechanism of mass generation. But the discovery of a Standard Model Higgs boson is itself not sufficient: a single fundamental Higgs boson has a theoretical problem of its own, known as the hierarchy problem [83].
1.1.1 New Physics
The hierarchy problem is characterized by the need to fine-tune the renormalized
coupling constants and masses to 26 decimal places, if the Standard Model is to be
considered as an effective theory of a more fundamental unified theory. The objection
against fine-tuning comes not from formal mathematics, but from the naturalness of
the gauge theory.
To solve the hierarchy and other theoretical problems that are inherent in the
Standard Model, many alternative extensions have been suggested. A particularly
popular solution is provided by supersymmetry, which relates fermions with bosons.
Under this symmetry, all particles in nature have a supersymmetric partner that has
identical properties except spin and R-parity. Supersymmetry solves the hierarchy
problem elegantly by producing spontaneous symmetry breaking in the electroweak
sector as a by-product of the supersymmetry breaking in the soft terms. Supersym-
metric particles have not been observed because the supersymmetry breaking itself
gives the super-partners masses in a range that has not been reached by the particle
colliders built to date. At higher energies, it is possible that super-partners will
be produced, and their indirect signals will be observed.
Supersymmetry does not provide a single prescription for the problems of the Standard
Model Higgs sector; instead it offers a number of individual supersymmetric
models. Other, non-supersymmetric models have also been
suggested: models based on new gauge interactions, or on extra spatial dimensions and
quantum gravity.
At the linear collider, it is expected that the Higgses will be produced by various
production processes, depending upon the correct model. But for a light Higgs, the
Higgsstrahlung process is the most favored, which produces a Z boson and a Higgs.
The expected backgrounds are the SM ZZ and WW processes. A lighter Higgs is
expected to decay mostly to bb, with WW and ZZ decays being important. For a
heavier Higgs, the WW and ZZ decays become dominant. In the decoupling limit of
the minimal supersymmetric model (MSSM), the light CP-even (charge-parity) Higgs
has properties close to the SM Higgs. For lighter mA0, all the physical Higgs bosons
would be produced at the linear collider [3].
1.1.2 Accelerators
Some of these issues that confront high-energy physics will be addressed in the new
accelerators that are being built. The next high energy frontier machine is the Large
Hadron Collider which will operate at 14 TeV and collide protons. Although there
have been indications of a Standard Model Higgs at LEP, it is expected to be conclu-
sively found at either the Tevatron or the LHC. The LHC, with a much higher energy
reach, will also begin to explore the new physics.
Although the LHC is expected to discover the Higgs and signals of the new physics,
it will have to be complemented by a program of precision high-energy physics
experiments at an e+e− linear collider (LC). It is expected that a 500 GeV linear collider
with luminosity above 10^34 cm^−2 sec^−1, and energy extensible to the TeV range, will
be required for the precision experiments that will be performed [3].
1.1.3 Multijet events in LC
A linear collider running at high energy will mostly produce heavy particles: Z, W±,
and heavy quarks. The heavy quarks produce multiple jets. The massive gauge
bosons decay hadronically with a branching ratio of about 70%. Therefore, physics
processes in the linear collider at 500 GeV produce copious multi-jet events. With
the decay particles carrying high energy, the jets are well collimated and well defined.
The physics experiments that will be carried out in the linear collider will deal largely
with jets.
For example, for the Higgsstrahlung process, e+e− → Zh, which is the main
Higgs boson discovery process, the background is the double Z production process:
e+e− → ZZ [2]. In the hadronic decay of both the Z and the h in the two processes,
it is not clear which jets arise from which particle.
High jet multiplicity is a feature of many of the processes that will be studied at the
linear collider. In the tt process there are at least 6 jets; in the tth process there are
at least 8 jets; and in the AH process there are at least 12 jets.
A linear collider running at 500 GeV or higher will thus be characterized by
multi-jet events. A primary issue in such events is associating jets with decaying
particles. In a collider that is expected to see many novel processes never before seen,
methods are required that identify jets and associate them with decaying particles
with minimum theoretical bias.
In the scenarios held up by the different possible models, the Higgs decays to the
heavy gauge bosons and to quarks are significant. Since the W and Z themselves decay
predominantly into hadrons, the study of hadronic jets becomes of paramount impor-
tance. Typically, events will have multiple hadronic decays, and the problem will be
to classify the event and associate the jets with the parent particle.
1.2 Multivariate Analysis and Kinematics
The problem mentioned above is essentially a multivariate problem. As we try to
tackle the issues mentioned in the previous section, we are forced to handle many
different measured quantities simultaneously. The traditional approach in high-energy
physics analysis is to perform cuts one variable at a time, which is a monovariate
approach. This can be done in a more efficient way by recognizing the multivariate
nature of the problem, and using multivariate methods.
One popular method presently used in the high-energy physics (HEP) community
is the feedforward neural network. We propose to use neural networks as the
benchmark and to develop multivariate algorithms to efficiently classify events, associate jets,
and understand the algorithms themselves.
In this thesis, we propose to use kinematics alone. Although for specific studies
additional signals like muon distribution, strange particles and jet charges will provide
significant inputs, in general kinematics will be sensitive to new physics. Therefore,
we will confine the use of multivariate methods to the use of kinematics.
We attempt here to develop an approach to multijet events that is based on
knowledge discovery, which is introduced in the following section.
1.3 Knowledge Discovery as an approach
The knowledge discovery methods and approach are those that have developed within
the discipline of machine learning outside traditional statistics. In general, these data-
centric methods require an iterative learning process by which algorithms produce a
set of rules. These rules, which are the results of the data mining algorithm, constitute
the discovered knowledge; and the entire process, including the algorithmic learning
process, constitutes the discovery phase. It should be noted here that even though
the discovery phase possibly uses some prior domain knowledge (in this case physics
knowledge), the algorithmic learning process does not use any domain knowledge.
The knowledge discovery approach is further discussed in Chapter 2. Often the
phrase “data mining” is used interchangeably with “knowledge discovery in databases”
(KDD). Furthermore, we use the phrase “knowledge discovery approach” to mean the
use of the data-centric approach of data mining.
To illustrate the knowledge discovery approach, we use the Demographic Clustering
method in the following section to search for Z bosons. The Demographic
Clustering algorithm is explained in Appendix C.
1.4 Mining for the Z boson – CDF data
In a typical particle collision, particles interact at high energies giving off many other
particles. As collider energies increase, the number of particles produced in every
event and the number of processes involved increase, resulting in very big datasets.
The problem of finding the relevant events with the desired process and particles is,
therefore, a formidable one. As the data produced by the colliders increase, there
exists a need for efficient algorithms that can sieve out the events a physicist might
be interested in. One such approach to retrieving relevant data from large datasets is
Data Mining. Data Mining techniques are being increasingly used to look for patterns
in large datasets. In addition to the data that one might have been looking for, these
algorithms are also capable of finding patterns in the data that are hitherto unknown.
As a first illustration, we will examine the efficacy of such a technique in detecting
the Z boson in Collider Detector at Fermilab (CDF) events. The pp collisions pro-
duce Z bosons, which decay via the hadronic and leptonic modes. The hadronic decay
modes predominate, as the branching ratios show. In leptonic decays, neutrinos escape
detection and muons leave a minimal signal in the calorimeters. The dominant tau
particle decay modes include neutrinos, which make them difficult to use for recon-
struction. Therefore, the electron-positron decay channel is the only leptonic decay
mode that could be used for Z boson reconstruction from calorimeter information.
Here we use Data Mining to look for significant events that have Z boson decays.
As an initial step, jet-pair data for events with good e+e− pairs are used for mining.
Then the threshold energies of the e+e− are lowered to include more events. Finally
we introduce even more noise by considering only the two most massive jets with
energies above certain threshold energies.
1.4.1 Demographic Clustering
The Data Mining algorithm used is demographic clustering [37]. In Appendix C an
example is given to illustrate the algorithm. We use this method to partition the data
into different clusters in the n-dimensional phase space of the chosen jet parameters.
Clusters are groups of data points that have the following two properties [41]:
(1) homogeneity within clusters, i.e., data that belong to the same cluster should be
as similar as possible; and (2) heterogeneity between clusters, i.e., data that belong
to different clusters should be as different as possible. We examine here whether
the clusters that demographic clustering provides us have physically significant
characteristics. We shall also examine which choice of parameters is best
suited for this kind of analysis.
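These two criteria can be made concrete with a toy numerical check that is independent of any particular clustering algorithm (the Demographic Clustering method itself, described in Appendix C, uses its own voting-based criterion). The function names below are illustrative, not from the thesis code: a partition is good when the mean pairwise distance within each cluster is small and the distance between cluster centroids is large.

```python
import math

def mean_pairwise_distance(points):
    # Average Euclidean distance over all unordered pairs of points.
    n = len(points)
    total, pairs = 0.0, 0
    for a in range(n):
        for b in range(a + 1, n):
            total += math.dist(points[a], points[b])
            pairs += 1
    return total / pairs if pairs else 0.0

def cluster_quality(clusters):
    # Homogeneity: small mean distance within each cluster.
    within = [mean_pairwise_distance(c) for c in clusters]
    # Heterogeneity: large mean distance between cluster centroids.
    centroids = [tuple(sum(x) / len(c) for x in zip(*c)) for c in clusters]
    between = mean_pairwise_distance(centroids)
    return within, between
```

For two well-separated toy clusters, `between` comes out much larger than every entry of `within`, which is the signature of a good partition.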
1.4.2 Mining for Z
In the leptonic decay mode, the Z boson decays into an energetic back-to-back lepton
and anti-lepton pair. The invariant mass of the Z is calculated from the four-momenta
of the electron/positron pairs. Since the mass of the Z boson is high, we expect the
transverse momenta to be high, and the rapidities to have a narrow distribution.
Therefore, as a first choice, we choose the following jet parameters for clustering: the
invariant mass (mij), transverse momenta (pTi,j) of the two jets, their rapidities (yi,j)
and the angle (δφ) between the jet pairs. The subscript i (j) represents the jet with
higher (lower) energy. Note that the choice of these variables is the use of domain
(physics) knowledge.
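A sketch of how these clustering variables follow from the jet four-momenta is given below; the function and variable names are illustrative, not from the analysis code, and the jets are assumed to be energy-ordered as in the text.

```python
import math

def jet_pair_variables(jet_i, jet_j):
    # jet_i, jet_j: four-momenta (E, px, py, pz) in GeV, with jet_i the
    # higher-energy jet; we assume E > |pz| so the rapidity is defined.
    Ei, pxi, pyi, pzi = jet_i
    Ej, pxj, pyj, pzj = jet_j
    # Invariant mass of the pair: m^2 = (Ei + Ej)^2 - |p_i + p_j|^2
    m2 = (Ei + Ej) ** 2 - ((pxi + pxj) ** 2 + (pyi + pyj) ** 2
                           + (pzi + pzj) ** 2)
    m_ij = math.sqrt(max(m2, 0.0))
    # Transverse momenta
    pt_i = math.hypot(pxi, pyi)
    pt_j = math.hypot(pxj, pyj)
    # Rapidities y = (1/2) ln((E + pz)/(E - pz))
    y_i = 0.5 * math.log((Ei + pzi) / (Ei - pzi))
    y_j = 0.5 * math.log((Ej + pzj) / (Ej - pzj))
    # Azimuthal opening angle, folded into [0, pi]; the thesis histograms
    # span [0, 2*pi], but a back-to-back pair sits near pi either way.
    dphi = abs(math.atan2(pyi, pxi) - math.atan2(pyj, pxj))
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return m_ij, pt_i, pt_j, y_i, y_j, dphi
```

For a Z-like pair, this yields a large invariant mass, high transverse momenta, small rapidities, and dphi near π, exactly the profile the clusters below are judged against.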
Events with good electrons
As a preliminary first step, the jet+2 events are clustered, with 2 “good” electrons
having energies above 20 GeV. Figure 1.1 shows the mass distribution of the jet
pairs for events with exactly one electron and one positron. Because of the strong
QCD background, the bump in the jet invariant masses is feeble compared to that
in the electron invariant masses, but the bump at 91 GeV is still clearly visible
against the QCD background.
Figure 1.2 shows the distribution of the six different variables for one cluster
Figure 1.1: Invariant mass distribution m of jet pairs, for events with a good e+e−
pair (electrons with energies greater than 20 GeV).
among many. The distributions indicate that this cluster is a good candidate for
the Z boson. Both the transverse momenta are high and the rapidities are low.
The invariant mass distribution has a peak at around 90 GeV and the two decaying
particles are back-to-back. The cluster is not a pure Z sample because it also contains
a low-mass component with small opening angles. The clustering algorithm was thus
able to isolate a physically significant class of phenomena, though it was contaminated
by another phenomenon. A strong candidate for the contaminating phenomenon is
photon conversion to an electron-positron pair.
Events with lowered threshold energy
As a second step, more noise is introduced into the data by lowering the electron
energy threshold (including electrons with energies above 5 GeV). Figure 1.3 shows the
Figure 1.2: Clustering of jet pairs with a pair of e+e− (ge00, Cluster 3, 10.3% of the
population). Clustering is performed on the following variables: jet-jet invariant mass
(m_ij), jet-jet opening angle (dphi), the transverse momenta of the first and second
jets (pt_i and pt_j respectively) and the rapidities of the first and second jets (y_i
and y_j respectively). The two jets in the jet pairs are ordered so that the first jet
has higher energy than the second. The gray filled histograms denote the distributions
of the entire dataset, whereas the histograms with thick lines denote distributions for
the specific cluster.
resulting jet invariant mass distribution, within which the 91 GeV bump is signifi-
cantly washed out.
Figure 1.4 shows the 91 GeV clustering result on the loose-cut sample. Significantly,
the low-mass, small-angle jets separate out and form a different cluster.
This cluster can be identified with the Z decay. The result is shown in the table
below.
       Mean        Std Dev
mij    95.89 GeV   21.23
δφ     3.09 rad    0.51
Figure 1.3: Invariant mass distribution m of jet pairs with lowered e+e− threshold
energy (5 GeV). Note that the Z bump clearly visible in Figure 1.1 is washed out.
Figure 1.5 shows another cluster, with mij = 89.14 GeV and σ = 36.57 GeV. It shows
that one of the jets (the j-jet) has high transverse momentum and low rapidity. This
indicates that this cluster is a good candidate for the W± boson, which decays into
an electron and a neutrino. The standard deviations are rather large, which could be
because the distance unit has not been properly set, so that the clustering is not
tight enough to sieve out some spurious data.
Clustering of the two massive jets
Since the Z boson decays into a pair of energetic e+e−, there exists a high probability
that they are picked up as the two most massive jets in the event. But when only the
two most massive jets are used to reconstruct Zs, a lot of QCD background is also
Figure 1.4: Clustering of jet pairs with lowered e+e− threshold energy, representing
the Z (ge33, Cluster 6, 3.8% of the population). The variables are those given in
Figure 1.2.
picked up, and any information regarding the invariant mass bumps is completely
washed out.
With the distance unit set to 30 GeV for the invariant mass m12 (the subscript 1
represents the jet with higher energy), a cluster is found with the same characteristics
as that for the Z boson in the other cases: high transverse momenta, low rapidity and
back-to-back angle [Figure 1.6]. This cluster is a strong candidate for Z decay.
       Mean      Std Dev
mij    115.11    50.84
δφ     3.06      0.66
Even though the mass of the Z falls within the standard deviation, the error is
very high. To increase the confidence level that only Z decay events are included
Figure 1.5: Clustering of jet pairs with lowered e+e− threshold energy, representing
the W± (ge33, Cluster 8, 3.3% of the population). The variables are explained in
Figure 1.2.
in the cluster, further clustering is required to pick out the significant events.
For clustering with more noise, the distance parameter becomes very important.
In Figure 1.7, the clustering result is shown, with the threshold of the jet energies
lowered to 5 GeV. The distance parameters are set for the invariant mass m12
(15 GeV) and the transverse momenta pt1 (5 GeV) and pt2 (5 GeV). Cluster
number 16 can be identified with the Z boson. The cluster is 0.5% of the whole data
set, which is partitioned into more than 60 clusters. The average invariant mass
(m12) is 119.97 GeV,
with a standard deviation of 38.43 GeV. Though this data set had more noise, the
clustering resulted in a lower standard deviation, which underlines the importance of
the distance parameter.
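The role of the distance unit can be sketched with a toy similarity vote: two records count as similar on a field when their values lie within that field's distance unit, so tightening the units makes fewer pairs similar and the resulting clusters tighter. This is a simplified illustration, not the actual Demographic Clustering implementation, and the sample jet-pair numbers are hypothetical.

```python
def similarity_votes(rec_a, rec_b, units):
    # One vote per field: +1 if the two values lie within that field's
    # distance unit, -1 otherwise (a simplified pairwise voting score).
    score = 0
    for a, b, u in zip(rec_a, rec_b, units):
        score += 1 if abs(a - b) <= u else -1
    return score

# Two hypothetical jet pairs described by (m_12, pt_1, pt_2) in GeV:
pair_a = (119.0, 40.0, 35.0)
pair_b = (140.0, 43.0, 38.0)
loose = (30.0, 10.0, 10.0)   # wide distance units
tight = (15.0, 5.0, 5.0)     # the tighter units used in Figure 1.7
```

With the loose units the two pairs agree on all three fields; with the tight units they disagree on the invariant mass, so they are less likely to land in the same cluster, which is how a smaller mass unit can yield a smaller cluster standard deviation.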
Figure 1.6: The Z cluster in the two maximum-energy jet variables (gj12x, Cluster 0,
8.6% of the population). The variables are the same as in Figure 1.2, but the
subscripts (1, 2) now represent the two most massive jets.
1.4.3 Discussion
The clustering method used above illustrates a naive application of the knowledge
discovery method. Although the domain knowledge input in the example above was
made in the choice of the physically significant variables, the Demographic Clustering
method is unaware of this. It performs the partition of the input data into clusters
solely on the basis of the distribution in the six-dimensional variable space. The in-
variant mass of the jet pair is explicitly used in the dataset since it is the strongest
discriminant of parent particles. The Demographic Clustering method orders it con-
sistently as the most significant discriminator in all the clusters seen above. It lists
the opening angle δφ as the second most important discriminant. The significance of
this result is that the clustering method could automatically find a physically
significant cluster and, further, could identify the most important discriminant
among the six variables.

Figure 1.7: The Z cluster in the two maximum-energy jet variables (gj33x (15,5,5),
Cluster 16, 0.5% of the population). The variables are the same as in Figure 1.2,
but the subscripts (1, 2) now represent the two most massive jets. The threshold
jet energy is lowered to 5 GeV. The distance unit is set for the invariant mass m_12
(15 GeV) and the transverse momenta pt_1 (5 GeV) and pt_2 (5 GeV).
1.5 Plan of thesis
This thesis is structured in the following way. Chapter 2 discusses the knowledge
discovery approach and how it has been applied in this work. Chapter 3 discusses
the neural network approach to the problem. Chapter 4 discusses a fast Monte Carlo
Simulation of the Linear Collider Detector Calorimeter, as designed by the Ameri-
can Linear Collider Working Group. Chapter 5 compares the use of two simulation
schemes FastCal and GISMO (a package for particle transport and detector simula-
tion) and validates the use of FastCal. Chapter 6 provides results on neural networks.
Chapter 7 discusses the unsupervised approach to the problem. In Chapter 8, we
summarize and conclude the work.
1.6 Conclusion
In this chapter we have laid out the context in which the next generation of particle
accelerators will operate and the physics interests that they hold. We have asserted
that these particle accelerators are expected to produce copious well-collimated and
well-defined jets on which important physics studies will be done. We have further
asserted that this situation poses a multivariate problem, that could be gainfully
addressed in the data mining realm. We have tried to motivate the use of data
mining tools by applying the demographic clustering method to the search for the Z
boson, and we have explained the promising result.
Chapter 2
Knowledge Discovery and Physics
2.1 Introduction
Our approach to high-energy physics data is that of knowledge discovery. Knowledge
discovery, as defined more precisely below, is an approach to data analysis that has
developed at the confluence of many different fields, namely artificial intelligence,
statistics, database management and others. This approach has been successfully
applied in business, economics, and in a more modest way in the sciences. In this
thesis, we adopt the approach and the tools of knowledge discovery to analyze linear
collider data. This chapter elaborates on this approach and its application to our
problem, and compares and contrasts the approach to the traditional high-energy
physics approach.
2.2 A short history of Data Mining
Data Mining, and knowledge discovery as an approach, have been possible because
of the confluence of three largely independent traditions: large digital databases,
classical statistics and artificial intelligence via machine learning. In this section we
give a brief description of these fields.
2.2.1 Digital databases
According to Moore's Law [63, 64], the density of integrated circuits in electronic
chips increases exponentially over time. This prediction has proved correct: the
density doubled every 12 months in the 1970s and every 18 months thereafter.
Though it is likely that the present approach to miniaturization will meet a physical
barrier, this does not guarantee the end of the law. Moore's Law has been observed
over many different substrates, from electro-mechanical devices to vacuum tubes to
transistors and finally to integrated circuits [49].
Moore's Law has been extended to data storage devices. Just as the microprocessor
is being miniaturized, digital storage devices too have seen an exponential
increase in capacity. Furthermore, this has been accompanied by an even faster in-
crease in the storage capacity per unit price. As a result, the availability of digital
data has seen an exponential growth.
The nature of this accumulating data is as varied as the sources from which they
come. The data could be textual, pictorial, categorical or numerical, and frequently,
these data are noisy and contaminated. In extracting meaningful information from
digital databases of such varied nature, the traditional approach to data, classical
statistics, falls short.
2.2.2 Classical Statistics
The traditional approach to understanding data has been classical statistics [30].
Classical statistics is closely associated with the notion of an experiment in which
each event produces an outcome according to an underlying probability distribution.
Each outcome is considered independent and the total set of outcomes is considered as
the sample space. Classical statistics is thus narrowly defined, and most scientific
experiments are designed on its principles [29].
In a typical scientific situation, an experiment is designed with a definite hy-
pothesis that has to be tested, or a parameter that has to be measured. In this
hypothetico-deductive paradigm, the exercise begins with a lack of data. An adequate
amount of data is then collected, either to prove or disprove the hypothesis or to
measure the quantity. This is the standard approach of most
laboratory based scientific endeavors where controlled experiments are conducted.
In an alternative paradigm, the data are already available not from a measurement
but in the form of a record of an event. This data could be a business transaction,
the image recorded during an astronomical observation of the sky or the record of
an event in a particle detector. These records do not fall neatly in the design of
single-phenomenon experiments that are so well described by classical statistics [39].
A typical dataset can be visualized as an n×p matrix, with n rows or records, and
p columns or simply variables. The output for each row could be a further r variables,
called the response. The large datasets that are available today are typically large in
both n and p. This poses special problems in classical statistics. For very large n,
methods that scale higher than O(n) are better replaced by adaptive or sequential
methods [39]. For large p, the explanatory variable space becomes sparse, rendering
structures impossible to identify, a problem called the “curse of dimensionality” [12]
or COD.
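The curse of dimensionality can be made concrete with a standard volume argument: the fraction of a unit hypercube occupied by its inscribed ball, π^(d/2) (1/2)^d / Γ(d/2 + 1), collapses toward zero as the dimension d grows, so uniformly spread data become sparse and "central" neighborhoods become nearly empty. The sketch below is an illustration of this well-known fact, not part of the thesis analysis.

```python
import math

def inscribed_sphere_fraction(d):
    # Volume of the radius-1/2 ball inscribed in the unit hypercube,
    # as a fraction of the cube's volume:
    #   V = pi^(d/2) * (1/2)^d / Gamma(d/2 + 1)
    return math.pi ** (d / 2) * 0.5 ** d / math.gamma(d / 2 + 1)
```

In d = 2 the fraction is π/4, about 79%; by d = 10 it is already below 1%, which is why structures in a large-p explanatory variable space become so hard to identify.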
Further, the data that are available in digital form are often categorical and not
numerical. Digital data also include images, audio, text and geographical data that
are not easily handled by classical statistics [39]. Furthermore, the data are contaminated
or noisy. These data characteristics pose special challenges to classical statistics.
2.2.3 Machine Learning
An alternative to the hypothetico-deductive paradigm of classical statistics is the
inductive approach that includes the techniques that have been developed under the
classification of machine learning [62]. Machine learning techniques owe their origins
to initial research on artificial intelligence. These methods are typically based on
a learning mechanism. That is, they are iterative techniques that improve in their
predictive abilities as the number of iterations increases.
Some of the well-known techniques in use under machine learning include decision
trees, neural networks, genetic algorithms, etc. These techniques are usually based
on a variety of learning procedures. Some of the different learning procedures are
supervised learning, unsupervised learning and reinforced learning. In this thesis, the
kind of learning procedure that is employed is discussed in the context.
Among the notable features of machine learning methods is their robustness in the
presence of noise. Methods based on classical statistics perform poorly in the
presence of noise, whereas neural networks can actually improve in the presence of
noise [4].
2.2.4 Meeting together
The three major trends: the growth of large digital recorded datasets, the inadequacy
of classical statistics and the growth of machine-learning-based algorithms, have
resulted in the development of the core data mining techniques and approach. These
techniques consist of such methods as neural networks, clustering, etc., resulting in a
corpus of tools that are able to tackle modern and large datasets. Machine learning,
which began in the womb of Artificial Intelligence, is now being recast in the language
of statistics, and statistics is now being extended to include heuristic processes
(Statistical Learning Theory [85]). This has further resulted in the development of
such methods as Support Vector Machines [23], which have formally brought neural
networks into the realm of statistics [70].
Together, these techniques are now grouped under the heading of data mining,
in recognition of the original intention of mining already available data. These tech-
niques have given rise to a unique approach to data, which is now called Knowledge
Discovery in Databases, or KDD for short.
2.3 Knowledge Discovery
There are various definitions suggested for knowledge discovery. A general one is [31]:
Knowledge discovery is the nontrivial extraction of implicit, previously
unknown and potentially useful information from data. Specifically, given
a set of facts (data) F , a language L, and some certainty C, we define a
pattern as a statement S in L that describes relationships among a subset
Fs of F with certainty c, such that S is simpler than the enumeration of
all facts in Fs.
Below, we illustrate this definition with the use of the binary tree algorithm in
the search for Z bosons.
2.3.1 An example - Binary Tree
Consider an example of the Z boson reconstruction in the linear collider context. In
e+e− → ZZ simulated events with √s = 500 GeV that have two hard muons or large
(a) Traditional Cuts (b) Binary Tree Cuts
Figure 2.1: The graph on the left is one obtained from cuts. The one on the right isfrom cuts obtained from the binary tree method.
missing energy (two neutrinos), all the energy deposited in the electromagnetic and
hadronic calorimeters is due to the decay of a single Z boson. Using the calorimeter
depositions, jets are constructed using the JADE algorithm, which yields in general
not two but a variable number of jets. Using the two most energetic jets to recon-
struct the Z boson will not yield a clean Z peak. Using a traditional fiducial cut of
cos θjj < 0.8 and an opening-angle cut of θjj > 0.5, we obtain a much cleaner sample
of Z (Figure 2.1(a)).
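The JADE scheme mentioned above can be sketched as follows: the pairwise measure y_ij = 2 E_i E_j (1 − cos θ_ij) / E_vis² is computed for all pairs, and the closest pair is merged until every remaining pair exceeds a cut y_cut. This is a minimal illustration, not the thesis implementation; in particular, the recombination convention shown (plain four-momentum addition, the "E scheme") is an assumption.

```python
import math

def jade_cluster(particles, y_cut):
    # particles: list of four-momenta (E, px, py, pz).
    # Repeatedly merge the pair with the smallest JADE measure
    #   y_ij = 2 * Ei * Ej * (1 - cos theta_ij) / E_vis^2
    # until every pair exceeds y_cut; the survivors are the jets.
    evis2 = sum(p[0] for p in particles) ** 2
    jets = [list(p) for p in particles]
    while len(jets) > 1:
        best = None
        for i in range(len(jets)):
            for j in range(i + 1, len(jets)):
                Ei, Ej = jets[i][0], jets[j][0]
                pi, pj = jets[i][1:], jets[j][1:]
                dot = sum(a * b for a, b in zip(pi, pj))
                den = (math.sqrt(sum(a * a for a in pi))
                       * math.sqrt(sum(a * a for a in pj)))
                cos_t = dot / den if den else 1.0
                y = 2 * Ei * Ej * (1 - cos_t) / evis2
                if best is None or y < best[0]:
                    best = (y, i, j)
        y, i, j = best
        if y > y_cut:
            break
        # merge j into i by four-momentum addition (E-scheme assumption)
        jets[i] = [a + b for a, b in zip(jets[i], jets[j])]
        del jets[j]
    return jets
```

Because soft and collinear fragments merge first, the number of surviving jets depends on y_cut rather than being fixed at two, which is exactly why the text notes that the algorithm "yields in general not two but a variable number of jets."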
Alternatively, we may use a binary tree to partition the data to give us automatic
cuts. We use SLIQ/SPRINT [58, 79] to partition the data to obtain a cleaner sample
(see Appendix D). The data consists of the following discriminant variables: cos θj1,2 ,
θjj and δjj with the invariant jet-jet mass, mjj as the classification label. The algo-
rithm partitions the data at each node of the binary tree using the gini coefficient
(see Appendix D) to choose the partition variable and the split point. On examining
the different leaves that the binary tree yields, the Z boson can be easily picked up
(Figure 2.1(b)).
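The gini-based partitioning can be illustrated with a sketch: SLIQ/SPRINT-style splitting (see Appendix D) scans candidate thresholds on a numeric attribute and keeps the one that minimizes the size-weighted gini index of the two resulting partitions. The function names and the toy "b"/"z" class labels below are illustrative, not from the thesis code.

```python
def gini(labels):
    # gini(S) = 1 - sum_k p_k^2 over the class frequencies p_k.
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((m / n) ** 2 for m in counts.values())

def best_split(values, labels):
    # Scan candidate thresholds (midpoints between sorted values) on one
    # numeric attribute and return the split point minimizing the
    # size-weighted gini of the two partitions.
    order = sorted(zip(values, labels))
    best = (None, float("inf"))
    for k in range(1, len(order)):
        thr = 0.5 * (order[k - 1][0] + order[k][0])
        left = [c for v, c in order[:k]]
        right = [c for v, c in order[k:]]
        g = (len(left) * gini(left) + len(right) * gini(right)) / len(order)
        if g < best[1]:
            best = (thr, g)
    return best
```

Applied recursively at each node, with the best attribute chosen the same way, this yields leaves whose cut boundaries read like the statement S above, which is what makes the binary tree's discovered knowledge so interpretable.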
This is an example of the knowledge discovery method. We began with a given
dataset, (F ), from which we obtained a subset of the data Fs, described by the
statement 0.75 < θjj ≤ 1.04 (S).
                                            mZ              σ
CUTS I    cos θ1, cos θ2 < 0.8; θ12 > 0.5   90.83 ± 0.71    18.13 ± 0.68
CUTS II   0.74 < θ12 ≤ 1.04                 90.27 ± 0.64    13.78 ± 0.52

Table 2.1: Alternative cuts for Z reconstruction. CUTS II is obtained from a decision
tree modeled on the jet-pair invariant mass mjj.
Note that the statement S in this case is a high-level and physically significant
statement, that is very similar in language to the one that an expert (physicist)
uses while applying cuts to data. The statements S obtained from most data mining
techniques are not stated as clearly. For example, training a neural network results
in fitting a model to the training dataset, and the statement analogous to the one
from the binary tree in the example above is an implicit, complex decision boundary.
2.3.2 Description of KDD
The knowledge discovery process thus begins with data. In the example above, though
we have used some knowledge of physics (domain knowledge), we did not make any
assumption on the relationship between the different discriminant variables.
Typically, the knowledge discovery process begins at the data exploration stage
in which the data are prepared. This involves using domain knowledge to identify
relevant discriminatory variables, such as cos θj, θjj, mjj, etc., or it could involve
statistical methods that reduce dimensionality, like principal component analysis, or
other feature extraction methods.
The next stage involves using an algorithm that would perform the model building.
The algorithms used at this stage could be a binary tree, as above, or neural networks,
or a clustering method, which have been developed as machine learning techniques.
The model is validated using a separate dataset. Since the performance of the model
is dependent on the dataset, it is possible that the first stage is revisited and the
second stage is repeated with a new dataset.
The third stage is the application of the model on new data.
The three stages along with the use of the iterative, and non-linear algorithms
constitute the knowledge discovery approach to data.
2.4 Knowledge discovery and scientific discovery
Though the basic process of scientific discovery is similar to that of knowledge
discovery, the two differ greatly in content. Chiefly, in scientific discovery, the data are
taken under very controlled conditions to eliminate unwanted parameters and isolate
desired ones. The data are, therefore, neither general nor copious. In the analysis
of the data, the domain knowledge and the methods used for the analysis are rarely
automated, and frequently not based on computers.
Though the aim and methods of scientific and knowledge discovery differ in gen-
eral, certain areas of science have successfully used the knowledge discovery process.
Sky Image Cataloging and Analysis Tool (SKICAT) [28] is an automated system
for cataloging the objects in the sky. It involves a first stage of feature extraction
from digital images, and then the use of a binary classifier to catalog the objects as
either stars or galaxies. In addition to the broad classification, clustering was able
to provide two sub-classes of galaxies (with and without cores). Uses of knowledge
discovery have been extended to geology and biogenetics [27].
2.5 Knowledge discovery and high-energy physics
The data produced by high-energy physics colliders are indeed copious and general.
These data sets are large and have high dimensionality, the two chief criteria
that invite knowledge discovery methods. The large amount of data requires an
efficient data management system. The large dimensionality of high-energy physics data
makes it appropriate for the use of knowledge discovery techniques.
With newer particle colliders, the combined database of real and simulated data
is in hundreds of terabytes (1 terabyte = 2^40 bytes), and possibly in petabytes
(1 petabyte = 2^50 bytes). The DØ experiment is accumulating 100 terabytes of
data every year, and the forthcoming Large Hadron Collider is expected to collect
petabytes of data every year during its operational time [13]. Though collider events
are controlled in certain ways and devised to isolate relevant parameters, the data
obtained are general enough to be used for different kinds of analyses. A typical
analysis consists of a preselection of events and the calculation of quantities
based on this preselection. This is precisely the exercise that is possible
in the knowledge discovery process.
Some of the well-known tools used in knowledge discovery are already in use in
high-energy physics. Hardware based neural networks are often used in trigger circuits
at the data collection stage. In the data analysis stage, neural networks have been
used for jet identification [55] and jet feature extraction [53].
In this thesis, we will examine the use of these techniques on the problem of
boson classification using kinematic variables. In particular, we will examine the use
of the neural network (Chapter 3), and means to improve its performance. We will
also examine different multivariate methods to improve the discriminant ability of
the variables using principal component analysis and multidimensional scaling along
with clustering methods (Chapter 7).
2.6 Conclusion
In this chapter we have introduced the concept of knowledge discovery. The origin of
this approach was identified as the confluence of three disparate disciplines: database
management, statistics and machine learning, which in turn owes its origin to artificial
intelligence. The salient features of this approach were described and discussed, and
the relevance to high-energy physics data emphasized.
Chapter 3
Neural Networks
3.1 Introduction
In high-energy physics, neural networks have been used for on-line as well as off-line
applications. The on-line applications have been in hardware implementations in trig-
gers in the CDF, H1 and LHC experiments. For off-line applications, neural networks
have been widely used for many different purposes. In both types of applications,
neural networks have been almost exclusively used for classification purposes.
In this chapter, we discuss the design of a neural network used for the purpose of
identifying the correct combination of jet-pairs from kinematic variables in e+e− →
ZZ events in which both the Z’s decay hadronically.
The neural network package written for this work is called CJNN, which has
been developed in both C++ and Java languages. CJNN is a general-purpose neural
network package easily configurable and reusable. The Java version has been imple-
mented for use in the Java Analysis Studio (JAS) environment and designed so that
trained neural networks could be included as part of a bigger analysis package. The
CJNN graphical interface for training is described in Appendix B. This chapter is in
part a description of this neural network package.
3.2 The problem – Separation of Z from background
In e+e− → ZZ events we face a combinatorial problem as the decay products of
the two Z bosons can easily be interchanged. This problem is particularly acute
when both the Z bosons decay hadronically, which happens more than half the time.
Hadronic decays result in jets, and because of gluon radiation, jet misidentification
and contamination, the number of jets that are observed in such events could be
something other than four. In the case of exactly four decay products, reconstructing
the second Z becomes trivial, once the first is reconstructed. Reconstructing the first
Z then becomes a combinatorial problem of one in three. In the case of hadronic
decays into an arbitrarily large number of jets, the combinatorial problem becomes
worse. Moreover, the successful reconstruction of one Z does not trivially lead to the
reconstruction of the second Z.
The approach we adopt here is that the four highest-energy jets carry the max-
imum amount of information about the decay products, and so we look for the Z
bosons in these four jets. Thus, we get six combinations of jets, out of which at most
two are the right combinations. We would try to reconstruct the Z from the kinematic
information of each pair of jets, and expect to find at most two right combinations
per event.
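The enumeration of candidate pairings is a small exercise in combinatorics: four jets yield C(4,2) = 6 distinct pairs. A minimal sketch in Python, representing each jet by its energy (the values below are purely illustrative):

```python
from itertools import combinations

def candidate_pairs(jet_energies):
    """Enumerate all distinct pairs among the four highest-energy jets.

    For four jets this yields C(4,2) = 6 candidate pairings, of which at
    most two can be correct Z -> jet-jet combinations in a given event.
    """
    leading = sorted(jet_energies, reverse=True)[:4]  # four highest-energy jets
    return list(combinations(leading, 2))

# a five-jet event: only the leading four enter the pairing
pairs = candidate_pairs([120.0, 95.0, 80.0, 60.0, 12.0])
print(len(pairs))  # 6
```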
Each jet, constructed from the calorimeter energy and position of cluster four-
vectors with masses assumed to be zero, has three independent kinematic quantities.
Each jet pair therefore has at most six independent kinematic quantities.
In this chapter, we design and optimize neural networks to identify the correct jet
pairs. For this purpose we use a sample of e+e− → ZZ events in which both the Z
bosons decay hadronically (produce jets). The training data are pre-classified using a
jet-quark association rule described later in Section 5.2.3. The neural network takes
six variables as input: the two jet energies, Ej; the two cosines of polar angles, cos θj;
the jet-jet opening angle θjj and the jet-jet invariant mass, mjj.
3.3 Rationale for a neural network
An important characteristic of neural networks is the ability to perform robust non-
linear modeling. In contrast to traditional statistical non-linear modeling, a user of
neural networks need not specify, a priori, the exact nature of non-linear dependence.
Therefore, they are particularly appropriate for modeling unknown dependencies.
Further, as the number of independent variables increases, statistical multivariate
methods become unstable while neural networks do not.
Neural networks further provide a robust method for classification problems. That
is, neural networks do not degrade when noise is introduced. On the contrary, noise
adds to the generalization capabilities of the network, which a user can exploit to get
better results.
Neural networks typically perform better than other classifiers. Therefore, in this
thesis, we use them as a benchmark while we discuss and compare other classification
methods. In the rest of the section, we define a measure of comparison and then
demonstrate the assertion that neural networks do indeed perform better than certain
others.
3.3.1 The F -measure
The two performance criteria generally used for comparing classification algorithms
are the efficiency (ε) and purity (p). They are defined as follows:
ε = TP / (TP + FN),   (3.1)
p = TP / (TP + FP).   (3.2)
Here TP, FN and FP are the numbers of true positives, false negatives and false
positives, respectively. Therefore, the efficiency is the fraction of the signal that
the classifier identifies, and the purity is the fraction of true signal in the
identified sample. These two numbers are displayed in a two-dimensional plot.
On their own, neither ε nor p are good measures of performance. Therefore, we use
the harmonic mean of the efficiency and purity to reduce the two-number comparison
to a one-number comparison. The F-measure [52] is defined as

F = 2εp / (ε + p).   (3.3)

The higher the F-measure, the better the network performance. The measure is
bounded between zero, reached when either the efficiency or the purity is zero,
and unity, reached by a perfect classifier for which both efficiency and purity
are unity.
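The three quantities above follow directly from the confusion counts; a small sketch (the counts below are illustrative, not results from the thesis datasets):

```python
def efficiency_purity_F(tp, fn, fp):
    """Efficiency (Eq. 3.1), purity (Eq. 3.2) and F-measure (Eq. 3.3)."""
    eps = tp / (tp + fn)   # fraction of the true signal that is identified
    p = tp / (tp + fp)     # fraction of the identified sample that is signal
    return eps, p, 2 * eps * p / (eps + p)

# illustrative counts
eps, p, F = efficiency_purity_F(tp=60, fn=40, fp=20)
print(round(eps, 3), round(p, 3), round(F, 3))  # 0.6 0.75 0.667
```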
3.3.2 Comparison with some other methods
A neural network is compared with two other classifiers in Figure 3.1. The two
classifiers are part of the Intelligent Miner (IM) suite of data mining tools. They
are (1) a decision tree classification (IM-DT) [79, 58] and (2) a feedforward neural
network (IM-NN), with learning via backpropagation. They are trained and tested
on the same datasets. We also compare the performance with the cuts obtained from
Section 2.3.1 (θjj-cuts).

[Figure: efficiency vs. purity for the five classifiers of Table 3.1.]
Figure 3.1: Performance of different classifiers. For a quantitative comparison, see Table 3.1.
Classifier                            F-measure
Ensemble Network (NN)                 0.5773
Enhanced Signal (NN)                  0.5719
Decision Tree (Intelligent Miner)     0.538
Neural Network (Intelligent Miner)    0.447
θjj-cut                               0.4309

Table 3.1: F-measure (Equation 3.3) for different classifiers.
3.4 A discussion on neural networks
The biological brain inspired the neural network [44]. An early challenge was to
develop a mathematical neuron that would mimic the biological one.
Having been inspired by the brain, the newly emerging neural network soon began
moving away from its strictly neurological groundings. With the first simulation
of a neural network on a digital computer [71], and the demonstration of its
universal computational abilities [65, 80], researchers, usually physicists and
engineers, began using neural networks to solve problems in non-biological fields.
3.4.1 The neuron
One of the early successes was the binary McCulloch-Pitts neuron [57], sometimes
called the MP neuron. This neuron calculates a weighted sum of its input and has two
states: active and inactive. The neuron is active when the sum of the inputs times
their weights crosses a threshold value. This is formally performed by an activation
function, the Heaviside function, with the threshold named the bias of the neuron.
This mimics the activity of a real neuron, with the active state representing the neural
firing, and the inactive state representing the quiescent state.
This design of a neuron, a weighted sum of inputs followed by an activation
function, was a major breakthrough. Though the neuron itself is simple, networks
of such neurons are capable of complex behavior [40].
The MP neuron is a binary unit on account of the Heaviside function, and the
non-linearity of the network is attributed to the use of this function. The MP neuron
can be generalized with the use of a non-linear, continuous function instead of the
discontinuous Heaviside. In practice, monotonically increasing sigmoid (S-shaped)
functions are used; the popular choices are the logistic function and the
hyperbolic tangent.
The neuron in a network performs the following basic computational function:
g(a) = 1 / (1 + exp(−∑_i W_i x_i)),   (3.4)
where W is the vector of weights, including the threshold value, and x is the vector
of inputs to the neuron, including the constant-valued (-1) input for the threshold.
g(a) is the output of the neuron. Equation 3.4 is the definition of a logistic activation
function.
An important consequence of a network based on the logistic activation function
(Equation 3.4) and trained by the backpropagation of a squared error function is that
the output can be interpreted as the a posteriori Bayesian class probability [88, 75,
86, 69]. This makes it possible to construct combinations of networks and attribute
errors to neural network classifications. Figure 3.2 demonstrates this probability
interpretation.
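A minimal sketch of the logistic neuron of Equation 3.4, following the text's convention of appending a constant −1 input for the threshold (the weight and input values are illustrative):

```python
import math

def neuron_output(weights, inputs):
    """Logistic neuron of Eq. 3.4: g(a) = 1 / (1 + exp(-sum_i W_i x_i)).

    The last entry of `weights` is the threshold (bias); a constant -1
    input is appended to pair with it.
    """
    x = list(inputs) + [-1.0]                      # threshold input
    a = sum(w * xi for w, xi in zip(weights, x))   # weighted sum
    return 1.0 / (1.0 + math.exp(-a))

# zero weights and zero threshold put the neuron at its midpoint
print(neuron_output([0.0, 0.0, 0.0], [0.3, -1.2]))  # 0.5
```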
3.4.2 The network
There are many kinds of neural networks, each suited for solving different problems.
The broad classes of neural networks are: multilayered perceptrons, recurrent net-
works and self-organizing maps. Multilayered perceptrons (MLP) consist of neural
nodes [73] that are arranged in more than one layer [61], with connections restricted
to the adjoining layers. In these kinds of networks, the input is provided to one of the
end layers, and the output is obtained at the other end. These networks are suited
for function fitting, optimization and classification problems. The recurrent network
is similar to the MLP, but it is characterized by connections between the outputs of
subsequent layers to neural nodes in the upstream layers [25]. This kind of network
is suited for problems with sequential or time-series data. The last broad category
of neural network is the self-organizing map, or Kohonen map [45, 46]. This kind of
neural network is suited for feature extraction or dimension reduction, and follows
an unsupervised learning algorithm.

[Figure: scatter of the two output units, lying along the line where their sum is unity.]
Figure 3.2: The distribution of the two output neural units trained on output data that is either (1,0) or (0,1). The straight line is not a fitted line, but the output distribution of the two neural units, showing that the sum of the outputs is unity, as is required for a probabilistic interpretation. A few outliers are present, though, which shows that some regions of the variable space were not adequately represented and were under-trained.
In this work, we wish to use neural networks to classify pairs of jets as the decay
product of either a heavy boson or a wrong-combination pair. We will be using the
jet kinematic variables to provide the training data. Thus, the network of choice for
our problem is the MLP.
This MLP consists of layers of neural units that are fully connected with the neural
units of adjacent layers. CJNN, the implementation of the MLP, allows us to adjust
the number of layers and neural units and optimize the network.
3.5 Neural Network Training
3.5.1 Data representation
The input data are continuous in all the variables and there are no missing values.
Nevertheless, variables have different variances. For instance, values for mjj, the
jet-jet invariant mass, range from 0 to a few hundred GeV, whereas values for cos θj
range from -1 to 1. Though the raw data could be used for network training, the high
input values are likely to introduce pathologies in the values of weights. Therefore,
we normalize the input data as follows:
x_i^n → (x_i^n − x̄_i) / σ_i,   (3.5)

where x̄_i and σ_i are the mean and standard deviation of the distribution of variable
x_i on the training data set. In this form, all variables in the training data have unit
variance and zero mean. Since the transformation is linear, we have not introduced
any artifact into the data.
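The standardization of Equation 3.5 can be sketched as follows (the helper name and the sample values are illustrative; the mean and standard deviation are taken over the training set, as in the text):

```python
import math

def standardize(columns):
    """Transform each variable to zero mean and unit variance (Eq. 3.5).

    `columns` maps variable name -> list of training values; returns the
    scaled columns plus the (mean, sigma) pairs needed to apply the same
    training-set transformation to new data.
    """
    scaled, stats = {}, {}
    for name, values in columns.items():
        mean = sum(values) / len(values)
        sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        scaled[name] = [(v - mean) / sigma for v in values]
        stats[name] = (mean, sigma)
    return scaled, stats

scaled, stats = standardize({"mjj": [80.0, 90.0, 100.0]})
print([round(v, 3) for v in scaled["mjj"]])  # [-1.225, 0.0, 1.225]
```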
The output class label data are encoded using the 1-of-c encoding scheme. For
example, in the combinatorial problem (Section 3.2) we have two classes: (1) jet pairs
are Zs and (2) jet pairs are wrong combinations. Therefore, we require that there are
two output neural nodes. The Z jet pair is encoded (1,0) and the wrong combination
jet pair is encoded (0,1).
3.5.2 Learning Algorithm
CJNN uses the classical backpropagation method for training neural networks [76].
Backpropagation methods, coupled with earlystopping (see Section 3.5.3) perform
better for bigger networks and do not overfit, as opposed to faster learning algorithms
like conjugate gradient [20].
To prevent the training from languishing in a local minimum, simulated annealing
is implemented. A random noise added to the weights at every update shakes up the
network, and throws it out of any local minimum it might be trapped in [72]. Thus,
the change in the weight is given by
∆w = ∆w0 + σ,   (3.6)
∆w0 = −η∇E,   (3.7)
σ = [1 / (1 + exp(α(Nc − β)))] × ∆w0 × ran.   (3.8)
Above, α and β are numbers, η the learning rate, Nc the learning cycle (or epoch num-
ber) and ran a random number with a Gaussian distribution. E is the generalization
error.
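A sketch of one such noisy weight update, following Equations 3.6-3.8; the values of η, α and β below are illustrative placeholders, not the settings used in CJNN:

```python
import math
import random

def weight_update(grad, n_cycle, eta=0.1, alpha=0.01, beta=500.0, rng=None):
    """One weight change with simulated-annealing noise (Eqs. 3.6-3.8).

    dw0 = -eta * grad is the plain gradient step; the noise sigma is the
    step scaled by a Gaussian draw and by a logistic envelope in the cycle
    number n_cycle, so the shaking dies away as training proceeds.
    """
    rng = rng or random.Random(0)
    dw0 = -eta * grad
    envelope = 1.0 / (1.0 + math.exp(alpha * (n_cycle - beta)))
    sigma = envelope * dw0 * rng.gauss(0.0, 1.0)
    return dw0 + sigma

early = weight_update(grad=1.0, n_cycle=0)       # noisy step early in training
late = weight_update(grad=1.0, n_cycle=10_000)   # essentially pure gradient step
```

Early in training the envelope is close to one and the update is strongly shaken; late in training the envelope vanishes and the update reduces to ∆w0.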
3.5.3 Bias-variance dilemma and earlystopping
A central issue of effective neural network training is the bias-variance trade-off [35].
Bias is defined as the average difference between the neural network result and the
function the network is trying to model. The variance on the other hand is defined
as the sensitivity of the result to the dataset in use [16]. These two terms occur
additively in the error term but behave differently. The goal of network training is to
reduce both the terms simultaneously.
Theoretically it is possible to progressively train a sequence of networks with
increasing size and training dataset that will reduce both the bias and variance to-
gether [88]. But computationally, this is an expensive procedure, on account of large
networks and large datasets taking a very long time to train. From this point of view,
the goal of neural training is to provide a network that is optimized on network size
and data size. Therefore, the goal of network training in this chapter is not only to
decrease the generalization error but also to provide a choice for an optimized network
size as well as training data size.
Validation dataset
The concept of the bias-variance dilemma can be used to obtain optimized neural
network training. The calculation of the error on the training dataset indicates only
the bias of the network and not the variance. This is rectified with the use of a
validation dataset—a dataset that is independent of the training data sample.
The validation dataset performs the function of a general dataset. As the neural
network is being trained on the training dataset, the error is calculated alongside on
the validation dataset. A network that is closely fitted on the training dataset is
likely to pick up the idiosyncratic nature of the dataset thus giving rise to overfitting,
a condition with low bias but high variance. The validation dataset is used to
monitor the training: an increase in the variance results in an increase in the
error on the validation set. The network for which the validation error is the lowest has the
largest generalization capability and the best bias-variance trade-off. In practice,
the network state at this minimum is saved, and training continues beyond the
minimum point to make sure that it is indeed the minimum. The use of this saved
network state is called earlystopping [78].
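The earlystopping procedure can be sketched as a training loop that remembers the best-validating state and runs a fixed number of cycles past the minimum before restoring it; all names, signatures and the toy error surface below are illustrative, not the CJNN implementation:

```python
def train_with_earlystopping(train_step, val_error, max_cycles, patience=50):
    """Earlystopping sketch.

    `train_step(cycle)` performs one training cycle and returns the current
    weights; `val_error(weights)` scores them on the validation set. The
    weights with the lowest validation error are kept, and training runs
    `patience` cycles past the minimum to confirm it.
    """
    best_err, best_weights, since_best = float("inf"), None, 0
    for cycle in range(max_cycles):
        weights = train_step(cycle)
        err = val_error(weights)
        if err < best_err:
            best_err, best_weights, since_best = err, weights, 0
        else:
            since_best += 1
            if since_best >= patience:   # trained past the minimum
                break
    return best_weights, best_err

# toy run: validation error falls, reaches a minimum, then rises (overfitting)
w, e = train_with_earlystopping(lambda c: c, lambda w: (w - 100) ** 2, 10_000)
print(w, e)  # 100 0
```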
For optimal performance of earlystopping, it is important to make sure that the
variance (overfitting), does not increase in the early phase of the training, and the
decrease in the error is solely due to a decrease in the bias. Since overfitting is due to
fitting the training dataset too closely, the network is characterized by large values of
assorted weights. Therefore, the networks are initialized by a random set of weights
that are very close to zero. In this work, we use random numbers between -0.01 and
0.01 to initialize weights and thresholds of the neurons.
3.6 Neural Network Architecture
The first task in setting up a neural network is deciding on an architecture—fixing the
number of layers and the number of neural units in each layer. There are some guide-
lines available [51], but in most applications, the architecture is invariably problem-
specific and empirical. The chief trade-off in fixing an architecture is between accuracy
and overfitting. A neural network with an insufficient number of neural units or in-
sufficient number of layers will be unable to learn a problem with sufficient accuracy.
But on the other hand, a neural network with too many neural units and layers would
easily overfit a problem, leading to inaccuracies again, in addition to taking longer to
train.
3.6.1 The number of layers
It has been shown theoretically that a single hidden layer with a sufficient number
of neural units is a universal approximator [42, 33]. It has been heuristically shown
that for continuous numeric data, a neural network with a minimum of two hidden
layers of sigmoids is sufficient to model any R^n → R^m mapping [50], provided
that each layer has a sufficient number of neural units. Though MLPs with a single
hidden layer are capable of universal approximations, additional hidden layers (and
the simultaneous removal of neural units) add to the computational efficiency of the
network by decreasing the number of weights in the network [54]. Thus a network
with two hidden layers is sufficient for training on data for which the nature of the
decision boundary is not known.
Even when a second hidden layer is not warranted by the decision boundary,
adding an extra layer, and decreasing the number of units in the first hidden layer
might lead to a decrease in the net number of weights, which would result in a faster
network [21].
3.6.2 The number of units in each layer
Though there exist theoretical results and heuristic proofs on the required number
of hidden layers, no such result exists for the number of hidden units [16]. Getting
an optimal number of neural units is important because it has a bearing on the
bias-variance trade-off [35], discussed above. Computationally, a network with an
insufficient number of neural units will be unable to model the problem adequately,
and a larger network will be computationally slower.
Therefore, we require a method to fix an optimal size of the network.
Pruning networks
There are two approaches one may take in deciding the optimum number of units in
each layer. The first is to begin with a large network and prune the number of units.
The other is to start with a smaller network and progressively add more units.
One way of pruning a big network is to let the weights decay. We let the connection
strengths of neural units decay with each iteration. The decay is done using the
following prescription:
w_ij^new = (1 − γ) w_ij^old,   (3.9)
γ = ε′ / (1 + ∑ w_ij²)².   (3.10)
Here, ε′ is the product of the learning rate (η = 0.1) and a parameter ε. The summa-
tion in the denominator is over all weights of the input connections for a particular
neural unit.
Since weights with lower absolute values connect neural units weakly, they play a
diminished role in the output of the network. Therefore the prescription above tends
to decay the neural units faster when the sum of squared weights per unit is low.
This decay acts against the iterative updates, so that unreinforced weights
decrease faster.
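A sketch of the decay prescription of Equations 3.9 and 3.10 for a single neural unit (the weight values below are illustrative):

```python
def decay_weights(weights, eta=0.1, eps=1e-5):
    """Weight decay of Eqs. 3.9-3.10 for one neural unit.

    `weights` are the unit's input-connection weights; eps' = eta * eps.
    gamma is largest when the summed squared weights are small, so weakly
    reinforced units decay fastest.
    """
    eps_prime = eta * eps
    s = sum(w * w for w in weights)
    gamma = eps_prime / (1.0 + s) ** 2
    return [(1.0 - gamma) * w for w in weights]

small = decay_weights([0.01, -0.02])   # weak unit: decays relatively fast
large = decay_weights([3.0, -4.0])     # strong unit: barely decays
```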
3.6.3 Fixing ε
To fix the value of the parameter ε, we note that too large a value would decrease the
weights faster than they can be reinforced during network training (Figure 3.3(a)).
Therefore, the choice of ε should be large enough so that redundant neural weights
decrease during regular training, but the overall error during the training process
shows similar behavior as for ε = 0. Using this empirical requirement, we find that
ε = 0.00001 is a good value (Figure 3.3(b)).
3.6.4 Root mean square weights
Next the weights of each neural unit are examined to ascertain important ones. We
use the root-mean-squared (RMS) value of all the weights for any particular neural
unit to represent the overall value of the neural unit. Neural units with smaller RMS
weights are considered as less important than those with larger values, since they
have a much smaller effect on the next layer of neurons on average.
The distributions of RMS of weights for the two hidden layers are shown in Fig-
ures 3.4(a) and 3.4(b). In the first layer, some neural units show low RMS of weights.
The set of neural units with low RMS of weights is consistent over training cycles.
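The RMS criterion can be sketched as follows, mirroring the pruning used on the hidden layers (the weight vectors below are illustrative):

```python
import math

def prune_by_rms(layer_weights, cutoff=1.0):
    """Keep only neural units whose RMS input weight exceeds `cutoff`.

    `layer_weights` is a list of weight vectors, one per neural unit;
    returns the indices of the surviving units.
    """
    survivors = []
    for i, ws in enumerate(layer_weights):
        rms = math.sqrt(sum(w * w for w in ws) / len(ws))
        if rms > cutoff:
            survivors.append(i)
    return survivors

units = [[0.1, -0.2, 0.05], [2.0, -1.5, 0.8], [0.0, 0.01, -0.02]]
print(prune_by_rms(units))  # [1]
```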
[Figure: training and validation error vs. training cycle for two values of ε, compared with ε = 0.]
Figure 3.3: (a) With ε = 0.01, the weights decay too fast for the neural network to learn, as shown by the flat error curve. The difference in the error values on the training and the validation datasets is due to the difference in size. (b) With ε = 0.00001, the error curves exhibit the same behavior as for ε = 0.
This suggests that a good cutoff for the RMS weights is 1.
Using this cutoff, we obtain a new network: 6–21–22–2. This network has 677
parameters (weights and biases). This is approximately half of that in the original 6–
34–24–2 network. The errors in the training and validation sets are given in Figure 3.5.
We observe an initial spike when the network is pruned, but the training follows closely
that of the unpruned network. The errors in the validation set show that the pruning
probably has an effect on the generalization capability of the network, as is expected.
Since computation time scales with W, the number of weights and biases, the
reduction of W by about a half is a significant gain.
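The parameter counts quoted above follow from a simple count of weights and biases in a fully connected network, assuming one bias per non-input unit; a quick check:

```python
def mlp_parameters(layers):
    """Count weights and biases in a fully connected MLP.

    `layers` lists the unit counts per layer, e.g. (6, 21, 22, 2); every
    non-input unit carries one bias.
    """
    weights = sum(a * b for a, b in zip(layers, layers[1:]))
    biases = sum(layers[1:])
    return weights + biases

print(mlp_parameters((6, 34, 24, 2)))  # 1128 parameters in the original network
print(mlp_parameters((6, 21, 22, 2)))  # 677 parameters after pruning
```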
[Figure: histograms of RMS weights for neural units in the two hidden layers.]
Figure 3.4: The RMS weights of neural units in the first and second hidden layers. In (a), the low-valued RMS weights are clearly isolated and are all less than 1. In (b), the low-valued RMS weights are not isolated, so we follow (a) and delete the neural units that have RMS weights less than 1.0.
3.7 What do neural networks do
What does a neural network do? As will be seen, neural networks do not offer rules
that are easily understood. The information they provide is low-level and does not
constitute "knowledge", as defined in knowledge discovery. The decision boundary
that a sufficiently trained network constructs is in general complicated, and
cannot be expressed in terms of any object other than the network itself.
Here we employ a binary tree on data that has been classified by a neural net-
work. Binary trees offer rectangular decision boundaries in the form of cuts and
limits on variables, which are closest in form to those employed in traditional physics
classifications. Therefore, a binary tree approximation of the neural network decision
boundary could be instructive.
[Figure: learning curves for the pruned networks. (a) √(W²) < 1.0 yields a 6-21-22-2 network; (b) √(W²) < 2.3 yields a 6-21-2 network.]
Figure 3.5: With a √(W²) cutoff of 1.0, a 6-21-22-2 network is obtained, which after the initial spike in the learning curve continues in the expected fashion. Also note that the error on the validation dataset is lower than the error from the validation set on the older network. This indicates that the generalization capability has improved with pruning. By contrast, in the more drastic pruning (b), the error curves after the spike do not follow the behavior of the unpruned network, thus exhibiting poor learning. This also reinforces our assertion that we require at least two hidden layers to solve the problem.
In the combinatorial problem, one of the branches offers cuts (Table 3.2) for the
correct Z boson. It offers a cut that is similar to the one obtained by a binary tree
alone (see Section), in addition to two cuts on the cosine of the polar angles which
are very close to the fiducial cuts in traditional physics, as well as a cut on the jet-jet
invariant mass.
0.6 ≤ φij < 1.38
cos θi < 0.76
cos θj < 0.78
mij ≥ 81.28 GeV

Table 3.2: The binary-tree rule for one of the branches, trained on the neural network result.
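The rule in Table 3.2 is just a conjunction of rectangular cuts; a sketch (the function name and the test values are invented for illustration):

```python
def passes_branch_cuts(phi_ij, cos_i, cos_j, m_ij):
    """Apply the rectangular cuts of Table 3.2, one binary-tree branch
    approximating the neural-network decision boundary."""
    return (0.6 <= phi_ij < 1.38
            and cos_i < 0.76
            and cos_j < 0.78
            and m_ij >= 81.28)

print(passes_branch_cuts(1.0, 0.3, -0.5, 91.2))  # True: Z-like jet pair
print(passes_branch_cuts(1.0, 0.9, -0.5, 91.2))  # False: fails the cos cut
```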
3.8 Conclusion
In this chapter, we have motivated the use of the neural network for solving the
combinatorial problem. We have discussed the different types of neural networks that
are possible and picked the multilayered perceptron as the most appropriate network
for our problem. We solved the first hurdle in network design by fixing the number
of layers from empirical arguments and the number of neural units by a weight
decay and pruning algorithm. Finally, we gave a brief solution for approximating the
complicated and unknown neural network decision boundary with the help of a binary tree.
Chapter 4
FastCal: A fast Monte Carlo
simulator for the LCD calorimeter
4.1 Introduction
This chapter describes FastCal, a fast linear collider detector (LCD) simulation pack-
age developed to provide data for this thesis. The central feature of this package is the
hadronic shower parameterization to provide a fast implementation of the detector
simulator. The main goal of this package is to provide a replacement for a full
simulator (for example, GISMO and GEANT) that offers data at the cluster level
rather than at the level of calorimeter cells, and is suited for statistics-limited
studies that do not
depend on finer details of shower development.
Section 4.2 compares simulated and real data, and emphasizes the need for simu-
lated data. Section 4.3 provides the rationale for the development of a fast simulator.
Section 4.4 provides a general description and the design philosophy of FastCal.
Section 4.5 describes the process that is being simulated. Section 4.6 compares hadron
showers from FastCal with those from GISMO. Section 4.8 concludes this chapter with
a summary of FastCal features and results.
4.2 Simulated data and real data
Simulated data are limited by extant knowledge of physics on which the simulations
are based. In spite of this limitation, they play a very important role in the design
and development of accelerator equipment and data analysis algorithms. When real data
are not available, simulated data provide the necessary input for development efforts.
Further, for the development of data analysis routines, simulated data provide internal
information about physics processes that is not available in real data. In this thesis, we
make use of the secondary heavy boson and quark information to create training and
testing data sets. Training and testing data created from real data will be biased by
the methods used to classify the training sample. To circumvent this, simulated data
are indispensable.
4.3 Rationale for FastCal
Monte Carlo simulations of detectors have become integral to every part of experi-
mental high-energy physics, from detector design to data analysis. They have become
the primary tools by which the theories of particle physics are turned into testable,
quantitative predictions. The first hurdle in a detailed simulation is generally the
cost in terms of CPU time, though this is becoming less of an issue with the availabil-
ity of ultra-fast modern processors and multi-processor systems. Two such detailed
simulation packages (generically called full simulation) in use today in the LCD com-
munity are GISMO [19, 7] and GEANT4 [81]. In these full simulation schemes, a high
fraction of the CPU time is expended in the detailed simulation of the calorimeter.
The most time-consuming part of these full simulations is the simulation of
shower development. Energetic particles passing through matter lose energy by
producing more particles. These particles in turn produce other particles, thus forming
cascades of particles which are called particle showers. There are two major classifi-
cation of showers:
Electromagnetic showers At energies higher than 1 GeV, electrons and positrons
lose energy mainly by the process of Bremsstrahlung, producing additional high-energy
photons. Photons lose energy predominantly by the process of electron-positron
pair production. Therefore, incident electrons, positrons and photons
produce a cascade of particles containing mainly these three kinds of particles.
Hadronic showers Hadrons undergo a variety of processes, including hadron
production, nuclear de-excitation, and pion and muon decays, resulting in a
cascade of hadronic particles that constitutes the hadronic shower. In these
processes, a significant fraction of the secondary particles produced are π0's,
which do not take part in hadronic interactions. Since the predominant π0
decay mode is to a photon pair, hadronic showers have electromagnetic shower
components.
Full simulation methods simulate all the processes that contribute to the elec-
tromagnetic and hadronic showers, all the way down to individual particles. That
is, new particles are created, their paths are calculated and their interactions with
the calorimeter components simulated. A significant fraction of the computation is
attributable to the simulation of particle showers.
In general, particles moving through calorimeters deposit energy in regions which
are spread over many calorimeter cells. Particles are reconstructed from the energy
deposits in these cells by clustering them. For studies that do not require individual
particle identification, clusters could be used directly to form jets.
In applications such as these, which require large statistics, a faster simulation
method is needed. FastCal provides us with such a simulation package. It replaces the
most compute-intensive part of the calorimeter simulation by parameterized functions
to be evaluated quickly. As a result, an enormous statistical improvement is obtained.
The LCD collaboration has devised such a software package, a fast Monte Carlo
(FastMC), to supplement its full Monte Carlo system. It is used for roughly establish-
ing the physics reach of the several LCD prototype designs for a wide array of physics
channels. One of the major drawbacks of FastMC is that it lacks a facility for simu-
lating calorimeter responses. Another fast Monte Carlo package, also named FastMC
and part of the LCD suite of tools for the ROOT environment (LCDROOT) [43]
provides smearing of particle energy and merging of clusters.
4.3.1 Description of existing FastMC
The current FastMC package can be executed either from the JAS or the LCDROOT
environments. In the JAS environment, FastMC uses generator-level particle lists,
along with a detector geometry file, to produce information that can be handled by
the JAS analysis package.
This information must be fully reconstructed before being passed to the analysis
stage, meaning that FastMC is responsible for track, cluster and vertex reconstruction.
Generator-level particle lists are provided to FastMC in the form of StdHep [34]
files. They can be produced by any number of standard generator packages. For
the purposes of FastCal development, we used files generated by the Pandora-Pythia
package, which is based upon Pythia. Because these generator files are also used
for the GISMO-based LCD full simulation, this enabled us to do direct comparisons
between the full simulation and FastCal.
4.4 Description of FastCal
FastCal is a fast Monte Carlo simulation of the electromagnetic (ECAL) and hadronic
(HCAL) calorimeter responses to linear collider events in the √s = 500 GeV range. It
aims to simulate the physical response of calorimeters in as short a time as possible by
forgoing as much detail as necessary, while still yielding sufficiently accurate results
so that kinematic properties of jets from FastCal match closely with those from
GISMO. It does this by offering energy depositions at the level of clusters, with
each particle forming at most one cluster each in the ECAL and HCAL, as opposed
to GISMO, which yields information at the calorimeter cell level.
The simulation is written in Java, for use in the JAS environment. For input, it
can accept data from event generators (e.g. Pandora-Pythia) [82, 67], in the StdHep
format [34].
4.4.1 Design philosophy and approach
The design goal for FastCal was that the running time should be comparable to that
of the existing LCD FastMC. In order to achieve this, we had to adhere to certain
policies.
The first policy was to avoid “swimming” particles through the detector (i.e.,
moving them along their trajectories in a series of fixed steps). At each step, rather,
we calculate the next “significant event” in each particle’s journey (e.g., initiating
a shower or crossing a detector boundary) and analytically solve for the particle’s
position and energy at that point. This approach speeds up the process by eliminating
the loop over many small step sizes.
The second policy was to avoid iterative numerical integration whenever possible.
The third policy was not to create new particles. All decays, like the V decays,
are handled by the generator (Pandora-Pythia).
The fourth policy is to parameterize physical processes.
The fifth policy is not to simulate the calorimeter in its granularity (different layers
of materials and segments), but to use average properties of calorimeters.
Not only does this latter policy save compute time directly, but it also allows
us to avoid performing cluster reconstruction and recognition on detector hits.
Each shower directly becomes a cluster that can be used in the JAS analysis.
Though this introduces the error of removing the artifacts of cluster recognition
and reconstruction, the effect is acceptable here, since use is made of jets
constructed from clusters, and the individual clusters are not important in
these examples.
4.4.2 Geometry of Calorimeter
There are a number of detector designs under development for the LC experiments [1].
These detector designs are described in a generic format using the
eXtensible Markup Language (XML). This enables unified and rapid simulation
and testing of these detector designs. In keeping with this philosophy, FastCal
calorimeter geometrical information is not hardcoded, but is read from the relevant
detector files.
Three reference designs are currently used for simulation studies in the Next Lin-
ear Collider (NLC) context: a large tracking volume detector to optimize tracking
precision for the high-energy interaction region with a magnetic field strength of
3 Tesla (LD); a silicon tracking detector to optimize Particle Flow calorimetry in the
high-energy interaction region with a field strength of 5 Tesla (SD); and a precision
detector for Giga-Z in a low-energy interaction region (P) [3].
Both the LD and the SD detectors have similar geometries. At the inner core
reside the vertex detector and tracking chambers. The calorimeters consist of an
inner ECAL, and an outer HCAL. The magnetic coil lies between the ECAL and the
HCAL in the SD design, whereas it lies outside the HCAL in the LD design.
                      ECAL         HCAL
Barrel Inner Radius   196.0 cm     233.4 cm
Barrel Outer Radius   220.0 cm     365.4 cm
EndCap Z Inner        297.5 cm     334.0 cm
EndCap Z Outer        321.5 cm     466.0 cm
EndCap Inner Radius   29.0 cm      31.0 cm
Number of layers      10           3
Active material       Polystyrene  Polystyrene
Inactive material     Lead         Lead
Field Strength        3.0 T        3.0 T

Table 4.1: Detector parameters ldmar01, a design specification for the LD design, used in this study.
The calorimeters in both the detector designs consist of a cylindrical barrel and
annular end cap calorimeters. The calorimeters are sampling calorimeters that consist
of stacked layers of active and inactive materials.
4.5 FastCal simulation
For FastCal we adopt the following picture for simulation. Incident e+, e− and γ
trigger electromagnetic showers in the ECAL very close to the inner surface. As such,
ECAL completely contains these showers. For our scheme in which single particles
produce single clusters, there is no need to simulate the electromagnetic shower ex-
plicitly. Instead, all the energy of the particle is fluctuated according to the energy
resolution of the ECAL and deposited in the calorimeter. Since electromagnetic show-
ers are initiated close to the inner surface of the ECAL, the energy is deposited at
the intersection of the particle trajectory and the inner surface itself.
For other leptons, the neutrinos escape detection in the calorimeter. Muons escape
with a minimum ionization deposit. The τ -leptons decay into other particles before
entering the calorimeter. The incident hadrons trigger hadronic showers, not all of
which are contained in the ECAL. The hadronic shower parameterization, described
below, is used to calculate the energy deposited in the ECAL. The rest of the energy
is deposited in the HCAL. We assume that ECAL and HCAL contain the entire
hadronic shower.
4.5.1 Particle Propagation
Particles in the Monte Carlo tables (obtained from Pandora-Pythia event generation)
that are flagged “final state” are made to propagate from their initial positions to the
calorimeter surfaces. Charged particles follow a helical trajectory in the magnetic field
and neutral particles follow straight trajectories. The trajectories depend upon the
initial momenta and positions of the particles. We assume that the particles passing
through the tracking chambers do not undergo any process that results in any loss of
energy, or creation of new particles. The significant events for each particle trajectory
are the points of intersection with the different calorimeter detector surfaces: the inner
and outer walls of the barrel calorimeter for both ECAL and HCAL; the inner and
outer walls of the two annular detectors at the positive and negative ends for both
ECAL and HCAL and the beam pipe circular holes.
Neutral particles
For neutral particles, the z coordinate at which a particle hits the barrel is the
solution of a quadratic equation:

z = \left(z_0 - \frac{B}{A}\right) + h_z \sqrt{\left(\frac{B}{A} - z_0\right)^2 - \left[z_0\left(z_0 - \frac{2B}{A}\right) + \frac{x_0^2 + y_0^2 - r^2}{A}\right]}   (4.1)

where r is the radius of the barrel calorimeter surface, h_z is the sign of p_z, and

A = \frac{p_x^2 + p_y^2}{p_z^2}   (4.2)

B = \frac{p_x x_0 + p_y y_0}{p_z}   (4.3)
Figures 4.1 and 4.2 display the intersections of neutral particles with one of the
calorimeter surfaces and demonstrate that the analytic solution for the intersection,
given in Equation 4.1, is correct.
Charged particles
For charged particles, the x and y positions along the helical trajectory are given as
a function of z as follows:

x = x_0 + R_H\left[\cos\left(\Phi_0 + h_q\,\frac{z - z_0}{R_H \tan\lambda}\right) - \cos\Phi_0\right]   (4.4)

y = y_0 + R_H\left[\sin\left(\Phi_0 + h_q\,\frac{z - z_0}{R_H \tan\lambda}\right) - \sin\Phi_0\right]   (4.5)

where

\tan\lambda = \frac{p_z}{p_\perp}   (4.6)
Figure 4.1: Plot for Hadronic Barrel Calorimeter for neutral hadrons. Each circle represents an intersection point of a neutral particle with the inner surface of a barrel calorimeter.
Figure 4.2: Plot for Hadronic Barrel Calorimeter for neutral hadrons.
Figure 4.3: Plot for Hadronic Barrel Calorimeter for charged hadrons.
\tan\Phi_0 = \frac{\sqrt{x_0^2 + y_0^2}}{z_0}   (4.7)

R_H = \frac{p \cos\lambda}{|\kappa q B|}   (4.8)

Here κ is a constant, q is the charge of the particle and B is the field in the positive
z direction; h_q is the sign of the charge.
Figures 4.3 and 4.4 display the charged particle trajectory intersections with the
inner surface of the HCAL. The big circle consists of small circles, each representing
an intersection of a charged particle with the inner surface of the barrel calorimeter.
This shows that the analytic solution for the helical path of a charged particle, given
in Equations 4.4 and 4.5, yields a correct intersection point.
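A sketch of Equations 4.4 and 4.5 follows; the helper name is hypothetical, and Φ0, RH, tan λ and hq are taken as precomputed inputs:

```python
import math

def helix_xy(x0, y0, z0, phi0, RH, tan_lambda, hq, z):
    """(x, y) of a charged-particle helix at longitudinal position z
    (Eqs. 4.4-4.5); hq is the sign of the charge."""
    dphi = hq * (z - z0) / (RH * tan_lambda)
    x = x0 + RH * (math.cos(phi0 + dphi) - math.cos(phi0))
    y = y0 + RH * (math.sin(phi0 + dphi) - math.sin(phi0))
    return x, y
```

By construction every point on the curve stays at distance RH from the helix axis, which is a quick self-check of the formulas.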
Figure 4.5 displays the trajectories of the final state particles of the same example
event in FastCal and GISMO for comparison. Note that the trajectories match
in orientation and in propagation distance. This shows that the match between
Figure 4.4: Plot for Hadronic Barrel Calorimeter for charged hadrons.
particle paths in FastCal and GISMO is good.
4.5.2 High-energy physics processes
Electrons/positrons and photons
During their passage through matter, electrons and positrons lose energy to ionization.
In addition, photons are produced by Bremsstrahlung. Prompt photons, as well as
those from Bremsstrahlung, convert by pair production predominantly to electrons
and positrons. Therefore, electrons, positrons and photons together produce a cascade
of particles, referred to as an electromagnetic shower [59]. The longitudinal shower
development depends on the density of electrons in the material and, therefore, can
be described in terms of the radiation length (X0) [26]. The thickness of the ECAL is
∼ 25 X0, and as a result most of the electromagnetic (EM) shower is contained in the
ECAL. Therefore, in the first approximation, we assume there is no leakage of the
Figure 4.5: Event with charged final state particle trajectories in FastCal and GISMO for comparison. Figure (a) is obtained from FastCal and Figure (b) is the same event simulated by GISMO and viewed using LCDWired.
EM shower, and deposit the entire energy from electrons, positrons and photons in
the ECAL (Figure 4.6). Any correction to this will contribute a very small correction
to the jets, which are constructed from the cluster-level calorimeter energy response
in both the hadronic and electromagnetic showers.
Hadronic Showers
Hadronic particles passing through matter interact with the nuclei in a series of
inelastic nuclear interactions. These interactions result in more hadronic particles, in
a cascade called a hadronic shower. A significant fraction of the particles produced
is π0 particles, which decay mostly into photons. Therefore hadronic showers have a
significant electromagnetic component.
The depths at which hadronic showers originate depend on a single parameter, the
interaction length. The energy deposited in the ECAL is calculated from the longitudinal
profile of the hadronic shower development, parameterized after Bock et al. [17].
Figure 4.6: Calorimeter response to single e− events at 50 GeV. The left column of histograms shows the FastCal response, the right column the GISMO response. The top figures show the total energy deposit (EHCAL + EECAL). The middle figures show the energy in the ECAL (EECAL), and the bottom figures the HCAL response (EHCAL). Note that there is no energy leakage from the ECAL in FastCal, represented by the flat distribution of HCAL energy in the bottom left figure.
Longitudinal profiles of hadronic showers are simulated in FastCal by the well-known
parameterization [17, 8, 89]

\int_0^x dE = E_0\left(w\,P(a, bt) + (1 - w)\,P(c, du)\right)   (4.9)
The parameterization consists of two parts. The first part, which depends on the
radiation length, is due to the electromagnetic decay of the π0 produced, whereas
the second part is due to the purely hadronic part of the shower. The proportion of
the two components is controlled by the parameter w. The normalization constant
is proportional to the energy of the particle at the shower origin. The parameters
a, b, c and d are taken from [17] (Table 4.2), and are obtained by fitting Equation 4.9
to incident π− test beam data obtained from the WA1, 379 FNAL and the UA1
experiments.
t = x/X0
u = x/λ
a = 0.6165 + 0.3183 log E0
b = 0.2198
c = a
d = 0.9099 − 0.0237 log E0
w = 0.4634

Table 4.2: Table of parameters
             X0 (cm)   λ (cm)   x_i (cm)
Pb           0.56      17.09    1.60
Air          30420     69600    0.32
Polystyrene  34.40     79.36    0.40
Tyvek        47.9      82.5     0.08

Table 4.3: Table of lengths. Tyvek is not included in the calculation. Since it forms such a thin layer in comparison to the others, it is expected to produce minor corrections.
The total energy deposited in the calorimeter is calculated by performing the
integration in Equation 4.9 from the shower origin to the end of the calorimeter along
the particle trajectory. In Equation 4.9, the P(a, x)s are partial gamma functions [68]

P(a, x) = \frac{\int_0^x t^{a-1} e^{-t}\,dt}{\int_0^\infty t^{a-1} e^{-t}\,dt}   (4.10)

that are evaluated numerically.
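As a sketch of this numerical evaluation (an assumed implementation; FastCal's actual routine is not shown in the text), P(a, x) of Equation 4.10 can be computed from its power series, after which Equation 4.9 follows directly. The helper names are hypothetical, and the series form is adequate for moderate x:

```python
import math

def reg_gamma_p(a, x, eps=1e-12):
    """Regularized lower incomplete gamma function P(a, x) (Eq. 4.10),
    evaluated by its power series (adequate for moderate x)."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    s = term
    n = 0
    while term > eps * s:
        n += 1
        term *= x / (a + n)
        s += term
    # Prefactor x^a * e^{-x} / Gamma(a), computed in log form for stability.
    return s * math.exp(-x + a * math.log(x) - math.lgamma(a))

def shower_fraction(t, u, a, b, c, d, w):
    """Fraction of shower energy deposited up to depths t (in X0) and u
    (in lambda), per Eq. 4.9: w*P(a, b*t) + (1 - w)*P(c, d*u)."""
    return w * reg_gamma_p(a, b * t) + (1.0 - w) * reg_gamma_p(c, d * u)
```

With the parameters of Table 4.2, the fraction rises monotonically from 0 at the shower origin toward 1 deep in the calorimeter.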
In the calculations above, the radiation length and the interaction length of the
calorimeter are obtained by the weighted average of the values for each material
according to the thickness in each layer.
\frac{1}{X_{\mathrm{ECAL,HCAL}}} = \frac{\sum_i \frac{1}{X_i}\,\delta_i}{\sum_i \delta_i},   (4.11)
where X_i is the interaction/radiation length of the i-th calorimeter material and δ_i the thickness of its layer.
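Equation 4.11 amounts to a thickness-weighted average of the inverse lengths; a minimal sketch (hypothetical helper name):

```python
def effective_length(lengths_cm, thicknesses_cm):
    """Effective radiation/interaction length of a layered calorimeter
    (Eq. 4.11): thickness-weighted average of the inverse lengths."""
    inv_sum = sum(d / X for X, d in zip(lengths_cm, thicknesses_cm))
    total = sum(thicknesses_cm)
    return total / inv_sum
```

With the Table 4.3 values for lead and polystyrene, the effective length lands between the two pure-material lengths, dominated by the shorter one.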
The energy that is obtained from Equation 4.9 is deposited in the ECAL. The
shower continues to develop in the HCAL. We assume that the entire hadronic shower
is contained in the HCAL. Therefore there is no need to develop the hadronic shower
explicitly. Instead the remaining energy is deposited in the HCAL according to the
HCAL energy resolution, given below.
\frac{\delta E}{E} = \frac{43\%}{\sqrt{E}} \oplus 4\%   (4.12)
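Here ⊕ denotes addition in quadrature. A minimal sketch of the corresponding Gaussian smearing (hypothetical helper name):

```python
import math
import random

def smear_hcal_energy(e_gev, rng=random):
    """Gaussian smearing of an HCAL deposit with resolution
    sigma/E = 43%/sqrt(E) added in quadrature with 4% (Eq. 4.12)."""
    sigma = e_gev * math.hypot(0.43 / math.sqrt(e_gev), 0.04)
    return rng.gauss(e_gev, sigma)
```

At 50 GeV the stochastic term dominates, giving a spread of roughly 3.6 GeV per deposit.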
4.6 Single Particle Comparison
To match FastCal against GISMO more closely, a single-particle comparison is carried
out using negative pions at 50 GeV. In particular, fluctuations are introduced in the
hadronic shower model parameters to obtain a realistic energy deposition.
Figure 4.7: Calorimeter response to a single π− particle at 50 GeV without any fluctuation. The placement of the histograms is the same as that described in Figure 4.6.
Hadronic shower parametrization fluctuations
The energy deposits in FastCal for π particles without any fluctuations are given in
Figure 4.7.
Shower origin fluctuation: For hadronic particles, hadronic shower origins are
first obtained with a randomized depth distribution of the form e^{−s/λ}, where s is
measured from the ECAL inner surface. Here λ is the nuclear interaction length, the
values of which are obtained from the Particle Data Group tables [38]. The inverse of
the interaction length is averaged for the ECAL according to Equation 4.11. With the
addition of fluctuations in the shower origin, the energy deposit is given in Figure 4.8.
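The exponential depth distribution can be drawn by inverse-transform sampling; a minimal sketch (hypothetical helper name):

```python
import math
import random

def shower_origin_depth(lambda_cm, rng=random):
    """Depth s of the hadronic shower origin behind the ECAL inner
    surface, sampled from e^{-s/lambda} by inverse-transform sampling."""
    # 1 - rng.random() lies in (0, 1], so the log argument is never zero.
    return -lambda_cm * math.log(1.0 - rng.random())
```

The sample mean converges to λ, e.g. about 17 cm for lead (Table 4.3).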
Figure 4.8: Calorimeter response to a single π− particle at 50 GeV with fluctuations in the shower origin. The placement of the histograms is the same as that described in Figure 4.6.
Energy deposit fluctuation: The deposited energy is fluctuated according to
the calorimeter energy resolution,

\frac{\delta E}{E} = \frac{17\%}{\sqrt{E}} \oplus 0\%   (4.13)

The fluctuation is given by a Gaussian distribution with mean equal to the
parameterized energy E and standard deviation δE. This spread completely
accounts for the spread in the total energy deposit in both calorimeters
(Figure 4.9). The values displayed in Equation 4.13 are obtained from the detector
design files.
Shower length scaling and fluctuation: The shower length fluctuation arises
because of the fluctuation of the shower center of gravity from the shower origin. Here
we adopt the distribution obtained in [66] for lead, and π− beams at 13 and 20 GeV.
Thus in Equation 4.9, t → t/f and u → u/f .
π0/π+/− fluctuation: This is achieved by fluctuating w in Equation 4.9. The
distribution is adopted from the one used by the CDF collaboration [32]: a
uniform probability between 0.001 and 0.4; above this, the probability is given by a
Gaussian with mean 0.4 and standard deviation 0.25 (Figure 4.11). w is not sampled
above 0.99.
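A sketch of sampling w from this piecewise distribution by accept/reject; the helper name is hypothetical, and the relative normalization of the plateau and the Gaussian tail (taken to join continuously at w = 0.4) is an assumption:

```python
import math
import random

def sample_w(rng=random):
    """Draw the EM-fraction parameter w: flat on [0.001, 0.4], joined
    continuously to a Gaussian tail (mean 0.4, sigma 0.25), truncated
    at 0.99. Accept/reject against the piecewise density; the relative
    normalization of the two pieces is an assumption."""
    lo, hi = 0.001, 0.99
    while True:
        w = rng.uniform(lo, hi)
        density = 1.0 if w <= 0.4 else math.exp(-0.5 * ((w - 0.4) / 0.25) ** 2)
        if rng.random() < density:
            return w
```

Under this normalization, slightly more than half of the draws fall on the flat plateau below 0.4.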
4.6.1 Low-energy physics processes
In this section, processes are described that account for the low-value end of the
ECAL energy distribution of 50 GeV π− particles. The minimum energy deposit is
simulated via the ionization due to the charge of the particle, and the spread in the
energy is simulated with delta rays.
Figure 4.9: Calorimeter response to a single π− particle at 50 GeV with fluctuations in the shower origin. The placement of the histograms is the same as that described in Figure 4.6.
Ionization energy loss
Charged particles passing through the ECAL and HCAL deposit a small amount of
energy on account of ionization. The rate of energy loss for a moderately relativistic
charged particle is given by [15]
-\frac{dE}{dx} = K z^2 \frac{Z}{A}\,\frac{1}{\beta^2}\left[\frac{1}{2}\log\frac{2 m_e c^2 \beta^2 \gamma^2 T_{\max}}{I^2} - \beta^2\right]   (4.14)
Figure 4.10: Calorimeter response to a single π− particle at 50 GeV with additional fluctuations in the shower length. The placement of the histograms is the same as that described in Figure 4.6.
Figure 4.11: The distribution of w.
where the meaning of each variable is as given in the above-mentioned reference.
Note that, in this form, the energy loss does not include the density correction.
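As a rough numerical check of Equation 4.14, a sketch follows. The helper name is hypothetical; the lead Z, A and mean excitation energy I are assumed values, and Tmax uses the standard kinematic expression, which is not given in the text:

```python
import math

ME_C2_MEV = 0.511        # electron rest energy
K_MEV_CM2_MOL = 0.307075 # 4*pi*N_A*r_e^2*m_e*c^2, MeV cm^2/mol

def bethe_bloch(beta_gamma, M_mev, z=1, Z=82, A=207.2, I_mev=823e-6):
    """Mean ionization loss -dE/dx in MeV cm^2/g (Eq. 4.14, no density
    correction). Defaults sketch a singly charged particle in lead;
    I ~ 823 eV for Pb is an assumed value."""
    gamma = math.sqrt(1.0 + beta_gamma**2)
    beta = beta_gamma / gamma
    me_over_M = ME_C2_MEV / M_mev
    tmax = (2 * ME_C2_MEV * beta_gamma**2 /
            (1 + 2 * gamma * me_over_M + me_over_M**2))
    log_arg = 2 * ME_C2_MEV * beta**2 * gamma**2 * tmax / I_mev**2
    return (K_MEV_CM2_MOL * z**2 * (Z / A) / beta**2 *
            (0.5 * math.log(log_arg) - beta**2))
```

For a muon near minimum ionization (βγ ≈ 3) this gives on the order of 1.1 MeV cm²/g in lead, with the characteristic relativistic rise at higher βγ.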
To obtain a close match with GISMO at low energies, FastCal energy deposits
from muons are obtained from Equation 4.14 and fitted to the energies obtained from
GISMO. Figure 4.13 represents this fit, where each data point represents the energy
loss of muons with a given initial energy-momentum. The best fit is quadratic, given
by Equation 4.15.

E_{\mathrm{MIP}} = 0.0092 + 0.37E + 9E^2   (4.15)
Continuous energy loss of hadrons
The ionization energy loss accounts for the minimum energy deposits, but does not
account for the low-energy spread seen in GISMO. The high-energy
Figure 4.12: Calorimeter response to a single π− particle at 50 GeV with additional fluctuations in w. The placement of the histograms is the same as that described in Figure 4.6.
[Figure 4.13 panel: dE/dx (GISMO) [GeV/cm] versus dE/dx (FastCal) [GeV/cm], with the quadratic fit χ²/ndf = 2.306/4, p0 = 0.009219 ± 0.0001755, p1 = 0.3752 ± 0.03858, p2 = 9.78 ± 2.112.]
Figure 4.13: The quadratic fit for dE/dx.
tail with a minimum peak in the GISMO data indicates that the charged particles
are losing energy to additional processes. We approximate this spread by simulating
delta rays (the production of knock-on electrons) [15, 74].
Figure 4.14 displays the energy distribution in FastCal and GISMO at low energies
for π− at 50 GeV. The good match in the minimum deposit is due to the energy fit
to muon data in the last section. The spread in the energy is due to delta rays in
FastCal, and the FastCal and GISMO energies agree well in the low energy deposits.
Both physical processes, the minimum ionization by charged hadrons and the
simulation of δ-rays, make small energy deposits in the ECAL that do not significantly
alter the overall energy deposition in the calorimeters, or the jet-level results that we
compare in the next chapter. This section, however, provides a close match between
FastCal and GISMO for single-particle comparisons.
Figure 4.14: Comparison of low energy depositions in the ECAL for 50 GeV negative π with minimum ionization and δ-ray simulations. Note that the lowest energy deposit matches well due to the fitting of the minimum energy given by Equation 4.15. The spread to the right is due to the δ-ray simulation.
4.6.2 A synopsis of the hadronic particle simulations
Figure 4.15 displays the flowchart for hadronic particle energy deposition in FastCal,
described below:
• Particle paths are traced to the front of the ECAL.
• From the front of the ECAL, a random shower origin is chosen according to an
e^{−s/λ} distribution.
• The particle path is traced to the back of the ECAL.
• If the particle is charged and the hadronic shower originates in the ECAL,
minimum ionization and δ-ray simulations are performed from the front of ECAL
to the shower origin, and the energy is deposited in the ECAL. The deposited
energy is removed from the particle.
• If the particle is charged and the hadronic shower does not originate in the
ECAL, the minimum ionization and δ-ray simulations are performed from the
front of the ECAL to the back of the ECAL and the energy is deposited in the
ECAL. The deposited energy is removed from the particle.
• If the hadronic shower originates in the ECAL, hadronic shower parameteriza-
tion is performed with the remaining energy. The energy obtained is deposited
in the ECAL, and that amount is removed from the particle.
• The remaining particle energy is deposited in the HCAL after fluctuating it
according to the HCAL energy resolution.
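The flow above can be sketched as follows. All names and the stand-in numbers (ECAL depth, effective λ, MIP loss rate) are assumptions for illustration, and the shower-profile term here is a simple exponential stand-in for the Bock parameterization of Equation 4.9:

```python
import math
import random

ECAL_THICKNESS_CM = 24.0  # assumed radial depth of the ECAL
LAMBDA_ECAL_CM = 30.0     # assumed averaged interaction length (Eq. 4.11)

def deposit_hadron(energy_gev, charged, rng=random):
    """Split one hadron's energy between ECAL and HCAL following the
    FastCal flow of Section 4.6.2 (simplified stand-in parameterizations).
    Returns (ECAL deposit, HCAL deposit) in GeV."""
    e_ecal = 0.0
    e_left = energy_gev
    # Random shower origin behind the ECAL front face, ~ e^{-s/lambda}.
    s = -LAMBDA_ECAL_CM * math.log(1.0 - rng.random())
    in_ecal = s < ECAL_THICKNESS_CM
    if charged:
        # Ionization (MIP) deposit up to the shower origin or ECAL back.
        path = min(s, ECAL_THICKNESS_CM)
        mip = min(0.01 * path, e_left)  # assumed ~10 MeV/cm effective loss
        e_ecal += mip
        e_left -= mip
    if in_ecal:
        # Shower deposit in the remaining ECAL depth (stand-in profile).
        frac = 1.0 - math.exp(-(ECAL_THICKNESS_CM - s) / LAMBDA_ECAL_CM)
        shower = frac * e_left
        e_ecal += shower
        e_left -= shower
    # Remaining energy goes to the HCAL, smeared by its resolution (Eq. 4.12).
    if e_left > 0.0:
        sigma = e_left * math.hypot(0.43 / math.sqrt(e_left), 0.04)
        e_left = max(rng.gauss(e_left, sigma), 0.0)
    return e_ecal, e_left
```

On average the two deposits sum to the incident energy, since only the HCAL remainder is smeared.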
4.7 FastCal gain
FastCal has yielded results that show very good agreement with those from GISMO,
for single particle tests with electrons and negative pions. The optimization scheme,
implemented to speed up simulation, yields very good results, as Table 4.4 demon-
strates.
4.8 Conclusion
In this chapter we have motivated and described FastCal, the fast parameterized
Linear Collider Detector calorimeter simulator. FastCal simulates the calorimeter
response to final state e+, e−, γ, µ and hadrons produced by the event generator Pandora-
Figure 4.15: The algorithm for FastCal energy deposition in the ECAL due to hadronic showers.
                              FastCal          GISMO
Events                        9,900            9,900
Hadronic events               4,877            4,877
Four Jet Events               4,750            4,809
Good Pairs                    6,752            6,380
Generation (100 events)       8.125 sec (PII)  8.125 sec (PII)
Full Simulation (100 events)  –                12,276 sec (SLAC)
Clustering (100 events)       4.11 sec (PII)   61.27 sec (PII)
Total (100 events)            4.11 sec (PII)   12,337.27 sec (PII)

Table 4.4: A comparison between FastCal and GISMO. Four jet events are those in which both the Z's decay hadronically, and the events have at least four jets. Good pairs are those pairs of the four jets that can be associated with the quarks from the same Z.
Pythia. Hadronic showers are simulated explicitly. The single electron test events
indicate that most of the electromagnetic showers are contained in the ECAL and for
single cluster simulation, explicit simulation of showers is not required.
For hadronic showers, it was shown using single π− particles that shower origin and
energy deposit fluctuations alone are not sufficient to obtain a good match between
FastCal and GISMO. In addition, fluctuations in the shower peak (shower length)
and the electromagnetic shower component are important. All four fluctuations,
taken together, yield a good match between FastCal and GISMO.
For a GISMO/FastCal match at low ECAL energy deposits, ionization energy loss and
the simulation of delta rays are important. For a closer fit with GISMO, muons are
used to fit FastCal energies to GISMO energies.
Finally, FastCal as implemented provides a significant gain in time, performing the
simulation in about 0.03% of the time required by GISMO.
Chapter 5
Neural Networks – comparison
between GISMO and FastCal
5.1 Introduction
In Chapter 3, an optimal neural network was designed with a training data set of
80,000 pairs of jets. This corresponds to about 50,000 events for each neural network.
A full-fledged neural network training program requires many times that number of
events for efficient training. Since full simulation under GISMO is CPU intensive,
we employ FastCal to generate detector data.
This chapter compares the simulated data provided by GISMO and FastCal and
their application in data analysis. For the tests in this chapter, the GISMO data used
are from the SLAC repository. The GISMO data comprise a full simulation of the
ldmar01 detector design, and the FastCal data are based on the corresponding Pandora-
Pythia StdHep events. Thus GISMO and FastCal are two different simulations of the
same underlying events with the same detector.
Figure 5.1: The ECAL energy deposition obtained in (a) FastCal and (b) GISMO. The two distributions are in qualitative agreement.
5.2 Calorimeter deposition
The ECAL depositions are shown in Figure 5.1 for both FastCal and GISMO. For
FastCal, the energy per event is the sum of the energy deposited in the ECAL. For
GISMO, the energy is the sum of the ECAL calorimeter hit energies times a scaling
factor. There is good qualitative and quantitative agreement between the two distributions.
Figure 5.2 shows the HCAL energy depositions for FastCal and GISMO, and
they are obtained similarly to the depositions in the ECAL. Though the FastCal and
GISMO depositions are similar for the HCAL, they differ from the deposition in the
ECAL in the absence of the double hump.
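The two per-event energy definitions just described can be sketched as follows; the hit record layout and the scaling-factor value are illustrative assumptions, not the actual hep.lcd interfaces:

```python
def fastcal_event_energy(ecal_deposits):
    """FastCal: the event energy is simply the sum of the ECAL deposits."""
    return sum(ecal_deposits)

def gismo_event_energy(ecal_hits, scale=1.0):
    """GISMO: sum the ECAL CalorimeterHit energies and apply a scaling
    factor; the factor's value is detector-dependent (1.0 is a placeholder)."""
    return scale * sum(hit["energy"] for hit in ecal_hits)
```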
The total calorimeter energy deposition is compared for FastCal and GISMO in
Figure 5.3. The simulated energy depositions can be compared with the total energy
of the final state e+, e−, γ and hadrons in Figure 5.4.
Figure 5.2: The HCAL energy deposition obtained in (a) FastCal and (b) GISMO. The two distributions are in qualitative agreement.
5.2.1 Cluster level comparison
The particles passing through the calorimeter deposit energy in calorimeter cells. For
a detector simulated by GISMO, these are represented by the CalorimeterHit objects
in hep.lcd.event. The clusters are constructed from the hits by the ClusterCheater
algorithm, a proximity algorithm that uses the particle Monte Carlo table to identify
the particle content of each calorimeter hit. The energy of a cluster is calculated by
multiplying the energy recorded for the cluster by the sampling fraction.
In the FastCal simulation of the calorimeters, there are no finer details of the
calorimeter beyond the average property. Since no new particles are created beyond
the ones created by the event generator (Pandora-Pythia), each event generator final
state particle passing through the calorimeter produces at most a single cluster. Since
the calorimeter simulated is not a sampling calorimeter, no sampling correction is
required.
The cluster energy distributions thus obtained in FastCal and GISMO are compared
in Figure 5.5(a). There is a general agreement in the distributions except in
the low energy range: Figure 5.5(b) shows an order of magnitude higher number of
clusters with energy <1 GeV in GISMO. The difference is too large to be explained by
the particles created by GISMO during the detector simulation, and is possibly an
artifact.

Figure 5.3: The total calorimeter energy deposition obtained in (a) FastCal and (b) GISMO.

Figure 5.4: The total energy per event in final state e+, e−, γ and hadrons (generator level).
Figure 5.5: (a) The comparison of the cluster energy distribution in FastCal and GISMO (FullSim). There is a general agreement between FastCal and GISMO, except for low energies. (b) The first bin in (a) blown up to show that GISMO has a high number of low energy clusters (<1 GeV). The very low energy "garbage" clusters do not impact the jet finding algorithms.
5.2.2 Jet level comparison
The clusters obtained as described above in FastCal and GISMO are then used in the
construction of jets. The JadeEJetFinder algorithm available in the hep.lcd package
is used with ycut = 0.005.
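A minimal from-scratch sketch of Jade clustering in the E scheme, in the spirit of (but not taken from) JadeEJetFinder: the pair with the smallest y_ij = m_ij^2 / E_vis^2 is merged by adding four-momenta until no pair remains below ycut.

```python
import numpy as np

def jade_e_jets(particles, ycut=0.005):
    """Jade E-scheme jet clustering sketch.

    particles: list of four-vectors (E, px, py, pz).
    Pairs with y_ij = m_ij^2 / E_vis^2 below ycut are merged
    (four-momenta added) until the smallest y_ij exceeds ycut.
    """
    jets = [np.asarray(p, dtype=float) for p in particles]
    e_vis = sum(p[0] for p in jets)
    while len(jets) > 1:
        best = None
        # find the pair with the smallest y_ij
        for i in range(len(jets)):
            for j in range(i + 1, len(jets)):
                p = jets[i] + jets[j]
                m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
                y = m2 / e_vis ** 2
                if best is None or y < best[0]:
                    best = (y, i, j)
        y, i, j = best
        if y >= ycut:
            break
        jets[i] = jets[i] + jets[j]
        del jets[j]
    return jets
```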
Figure 5.6 compares the jet energy distribution in FastCal and GISMO. There is
a general agreement in the two distributions with a slight discrepancy in the number
of low energy jets.
In Figure 5.7(a), the number of jets per event in FastCal and GISMO are compared.
As expected, the GISMO distribution peaks at a higher value. Fitting a Gaussian to
each and comparing the means, we obtain a difference of approximately half a jet
between FastCal and GISMO. This is seen in Figure 5.7(b). Further, with increasingly
higher jet energy cuts, the difference progressively decreases, with no statistically
significant difference in the jets per event between FastCal and GISMO at a cut of
40 GeV.
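The cut scan itself is straightforward to sketch; the Gaussian fitting shown in the figure is omitted here, and the function and argument names are hypothetical:

```python
import numpy as np

def mean_jets_vs_cut(jet_energies_per_event, cuts):
    """For each jet-energy cut, count the jets above the cut in every event
    and return the mean number of surviving jets per event."""
    means = []
    for cut in cuts:
        counts = [sum(1 for e in jets if e > cut)
                  for jets in jet_energies_per_event]
        means.append(float(np.mean(counts)))
    return means
```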
Figure 5.6: Jet energy distribution in FastCal and GISMO (FullSim). The numbers of jets match, except for low energy jets.
Figure 5.7: (a) The number of jets per event in FastCal and GISMO (FullSim). (b) The means of Gaussian fits to the jets-per-event distributions as a function of the jet energy cut; the difference between FastCal and GISMO decreases as the cut increases.
5.2.3 Jet-Quark Association
Jets are constructed using the Jade [11] jet finding algorithm (JadeE) available in JAS.
For GISMO data, JadeE is applied to the clusters constructed by the ClusterCheater
algorithm, which builds clusters from neighboring hits using a proximity rule. In FastCal,
since the data are available at the cluster level directly, no cluster builder algorithm is used.
For this chapter, the jet-quark association is done using the following algorithm:
• Consider the four most massive jets.
• Compute their opening angles with the quarks.
• Associate the jet with the quark closest to it.
• Break the association if the opening angle is larger than 0.3 radians.
• Break the association if another jet is closer to the quark.
• Break the association if the energy of the quark is less than 10 GeV.
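The rules above can be sketched as follows; the jet/quark record layout is a hypothetical choice for illustration:

```python
import math

def opening_angle(u, v):
    """Opening angle between two 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def associate(jets, quarks, max_angle=0.3, min_quark_energy=10.0):
    """Associate each jet with its nearest quark under the rules above.

    jets, quarks: lists of dicts with 'energy' (GeV) and 'p' (3-momentum).
    Returns {jet_index: quark_index} for surviving associations.
    """
    # consider only the four leading jets
    order = sorted(range(len(jets)), key=lambda i: jets[i]["energy"],
                   reverse=True)[:4]
    pairs = {}
    for ji in order:
        angles = [opening_angle(jets[ji]["p"], q["p"]) for q in quarks]
        qi = min(range(len(quarks)), key=lambda k: angles[k])
        if angles[qi] > max_angle:
            continue                      # too far apart
        if quarks[qi]["energy"] < min_quark_energy:
            continue                      # quark too soft
        # break the association if another jet is closer to this quark
        closer = any(
            opening_angle(jets[oj]["p"], quarks[qi]["p"]) < angles[qi]
            for oj in order if oj != ji)
        if not closer:
            pairs[ji] = qi
    return pairs
```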
Using these association rules, we have a jet-quark association only if the jet is
closest to the quark, the opening angle is less than 0.3 radians, and the quark energy
is at least 10 GeV. The rule given above is used later to associate jets and quarks in ZZ → qqqq
events, and can be used for association in events in which one of the Zs decays
hadronically while the other decays into muons or neutrinos. Since neutrinos and
muons leave negligible energy depositions in the calorimeters, these events show up
in the calorimeter as single Z decay events. For such events, we may associate the jets
and quarks as follows. For the two most energetic jets, the jet-quark opening angle
is examined, and the pairs are associated if the greater of the two angles is less than
0.3 radians. Figure 5.8 displays the energies for GISMO jets. Figure 5.9 displays the
result for FastCal jets.
Figure 5.8: The jet-quark energy plot for GISMO events with one Z decaying hadronically, while the other decays into muons and neutrinos. The two most energetic jets are then associated with the quarks, and a match is established if the bigger of the two jet-quark opening angles is less than 0.3 radians.
Figure 5.9: The jet-quark energy plot for FastCal events with one Z decaying hadronically, while the other decays into muons and neutrinos. The two most energetic jets are then associated with the quarks, and a match is established if the bigger of the two jet-quark opening angles is less than 0.3 radians.
5.3 Neural network training
5.3.1 FastCal-GISMO comparison
Here, a comparison is made of the FastCal and GISMO neural network training.
For a head-to-head comparison of FastCal and GISMO, see Table 4.4. The
data consist of the 9,900 generated and fully simulated e+e− → ZZ events available
in the SLAC repository. Hadronic events are those events in which both Z's decay
hadronically; they are identified by the particle decays in the Monte Carlo tables.
The Z bosons decay into quarks that hadronize and form jets. It is assumed that
the four most energetic jets represent the fragmentation and hadronization of the
four daughter quarks of the Z boson. To find the right combination, each jet pair
(six, from four jets) is compared with both quark pairs and matched. The jet-quark
association algorithm is given in Section 5.2.3.
In addition to the SLAC generated data, we compare the results from newly
generated Pandora-Pythia data. Pandora-Pythia events are generated using Pandora-Pythia
V3.2. This interface uses Pandora V2.2 (with the patches available) and the
Pythia in the CERNLIB 2002 package. The data consist of e+e− → ZZ events with
√s = 500 GeV. The electrons and positrons are unpolarized; each beam carries half
the center-of-mass energy of 500 GeV. For use with FastCal, the following particles
are decayed: (1) K0S, (2) K0L, (3) Λ, (4) Σ+, (5) Σ−, (6) Ξ0, (7) Ξ− and (8) Ω−.
Neural Network
A 6-21-22-2 neural network is implemented, with inputs from jet pairs. The input
variables are jet-jet invariant mass mjj, the jet direction cosines cos θj, the jet energies
Ej and the jet-jet opening angle θjj. There are two hidden layers with 21 and 22 nodes
respectively. The two output nodes represent the probabilities for the right and the
wrong jet pair combinations.

Figure 5.10: The neural network training graph. The lowest validation error is reached at the 780th training cycle. The network at this stage is used in the testing phase.
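The six input variables can be computed directly from the two jet four-vectors; a sketch, with the (E, px, py, pz) tuple convention as an assumption:

```python
import math

def jet_pair_inputs(j1, j2):
    """Build the six network inputs (E1, cos_theta1, E2, cos_theta2,
    m_jj, theta_jj) from two jet four-vectors (E, px, py, pz)."""
    def cos_theta(p):
        return p[3] / math.sqrt(p[1]**2 + p[2]**2 + p[3]**2)

    def angle(a, b):
        dot = a[1]*b[1] + a[2]*b[2] + a[3]*b[3]
        na = math.sqrt(a[1]**2 + a[2]**2 + a[3]**2)
        nb = math.sqrt(b[1]**2 + b[2]**2 + b[3]**2)
        return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

    e = j1[0] + j2[0]
    px, py, pz = (j1[i] + j2[i] for i in (1, 2, 3))
    m_jj = math.sqrt(max(0.0, e**2 - px**2 - py**2 - pz**2))
    return (j1[0], cos_theta(j1), j2[0], cos_theta(j2), m_jj, angle(j1, j2))
```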
ε-p comparison
Since neither efficiency nor purity alone is a good measure of neural network
performance, we compare the F-measure instead. Table 5.1 displays the F-measure
for three sets of data: SLAC-GISMO, SLAC-FastCal and Pandora-Pythia FastCal.
SLAC-FastCal has a slightly higher performance, which may not be significant but
could possibly be due to the non-decay of the V0 particles. The Pandora-Pythia
FastCal F-measure is the mean over thirteen testing datasets, with a spread of 0.004.
                          F-measure
SLAC GISMO                0.858
SLAC FastCal              0.865
Pandora-Pythia FastCal    0.864
Table 5.1: The F -measures (Equation 3.3) for the three data sets.
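Equation 3.3 is not reproduced in this chapter; assuming it is the usual harmonic mean of efficiency ε and purity p, the F-measure can be computed as:

```python
def f_measure(efficiency, purity):
    """Harmonic mean of efficiency and purity (assumed form of Eq. 3.3)."""
    return 2.0 * efficiency * purity / (efficiency + purity)
```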
5.4 Conclusion
In this chapter, we have compared the results from FastCal at the cluster and jet
levels with those from GISMO and have shown that FastCal can appropriately be
used in our analysis routines in lieu of GISMO. FastCal and GISMO produce
qualitatively and quantitatively similar ECAL and HCAL depositions. At the cluster
and jet levels, the energy distributions agree qualitatively and quantitatively, though
there is some divergence at low energies. This divergence does not impact the studies
done here, since the four most energetic jets are used in the analysis. The neural
network training on FastCal and GISMO data shows similar behavior and results.
Thus, as developed for this study, FastCal provides a fast calorimeter simulator that
replaces GISMO with a significant increase in simulation speed but without loss of
physical realism.
Chapter 6
Neural Network – results
6.1 Introduction
In this chapter, the neural network designed in Chapter 3 is used to solve and explore
the combinatorial problem in ZZ production events in which both bosons decay
hadronically, using kinematic variables. In addition, a neural network is also used to
distinguish between the Z and W bosons using kinematic variables. The data used in
this chapter are simulated using the FastCal detector simulator that was introduced
in Chapter 4, the results of which were compared to those of the full simulation
in Chapter 5.
6.2 Jet-Boson Association
In order to train a neural network, the training program needs to know the true
identity of each jet. This is provided by the Monte Carlo internal particle tables that
document quark fragmentations and hadronizations. In the previous examples the
jets were associated with the quarks that the bosons decayed into with the use of an
angular proximity rule described in Section 5.2.3. In that method for jet identification,
Monte Carlo generator level information, in the form of quark energy-momenta, is
used directly, and the jet content is ignored. This introduces two sources of error
into the training data. First, because of the strict proximity rule, it ignores jets that
legitimately originate from a boson but have been deflected from the original
path by physics processes like gluon radiation. Second, it does not
provide any control or measure of jet contamination from clusters that have different
origins.
In this chapter, this deficiency in the angular proximity rule is rectified by the use
of an alternative boson content rule. Using the Monte Carlo table from the generator
level, we define the fractional boson content of a jet as follows:
f_b = e_b / (Σ_b' e_b' + e_unknown),    (6.1)
where eb is the sum of energies of clusters in the jet that originate from a particular
boson. The denominator is the total jet energy, which is the sum of energies from
all decaying bosons in the event as well as energies from other sources. Figure 6.1
displays the distribution of the fractional boson content f_b of the first Z over all jets in
e+e− → ZZ events. Since there are two bosons in the events, the plot is symmetric.
In this fractional boson content scheme, a jet is associated with a boson if more
than 65% of the jet energy comes from that particular boson. The 65% cutoff ignores
the relatively flat portion in the middle of the graph; the boson content begins to rise
sharply above 65%. This allows a small percentage of the jets to be identified as
of unknown origin, since both bosons contribute to their energy nearly equally.
This association scheme, therefore, provides a measure of jet contamination and
is not dependent on angular proximity of the jet to the original quark.
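A sketch of the boson content rule of Equation 6.1 with the 65% cutoff; the bookkeeping of clusters as (origin, energy) pairs is an assumption for illustration:

```python
def fractional_boson_content(cluster_energies, boson):
    """Equation 6.1: fraction of the jet energy contributed by `boson`.

    cluster_energies: list of (origin, energy) pairs for the clusters in
    the jet, where origin is a boson label or 'unknown'.
    """
    total = sum(e for _, e in cluster_energies)
    e_b = sum(e for b, e in cluster_energies if b == boson)
    return e_b / total

def associate_jet(cluster_energies, bosons, cutoff=0.65):
    """Associate the jet with a boson only if it supplies more than
    65% of the jet energy; otherwise the jet is of unknown origin."""
    for b in bosons:
        if fractional_boson_content(cluster_energies, b) > cutoff:
            return b
    return None
```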
Figure 6.1: The energy fraction of each jet for the first boson, given by Equation 6.1.
In Table 6.1, a comparison is made between jet-boson association (based on boson
content) and jet-quark association (based on angular proximity). It shows that the
angular proximity and the boson content rules give a generally similar association,
represented by large values of true positives and true negatives. Of the jets associated
by angular proximity, about 10% are contaminated, and more than 20% of the correct
jet-boson associations are lost due to the strict proximity rule.
                      Boson Content
Angular Proximity      1       0
               1     425      54
               0     133    1388
Table 6.1: Comparison of the Boson Content and the Angular Proximity methods in Z pair identification. 1 denotes a good jet pair and 0 denotes a bad jet pair.
86
For our training and testing purposes, we use the boson-content based jet-boson
association scheme.
6.3 Ensemble of neural networks
In Chapter 3, we designed a neural network, adopted a training scheme and imple-
mented them. In this section, we suggest a means to improve the performance of
neural networks in the classification of Z bosons, and their separation from a back-
ground of wrong jet combinations.
It has been suggested by many authors that to improve the performance of neural
networks, an ensemble of neural networks should be used instead of a single neural
network ([56] and references therein). The accuracy of such ensembles is seen to
improve under two conditions: that the classifiers are themselves as different from
each other as possible, and that each classifier is itself a very accurate classifier on its
own [48], but perhaps in a restrictive domain.
Here we consider an ensemble of neural networks, each of which has the same
architecture as the optimized network designed in Chapter 3. The networks differ in
the initial weights with which they begin training. Following the best-classifier
criterion mentioned above, the training is made to go on for much longer than
earlystopping (see Section 3.5.3) would require. As we shall see, this is crucial for a
good result. For such an ensemble of networks, the ensemble average is given
by:
O = (1/N_e) Σ_i O_i,    (6.2)

where O_i is the output vector of the i-th neural network and N_e is the number of
networks in the ensemble.
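Equation 6.2 amounts to a component-wise mean of the member networks' output vectors, e.g.:

```python
import numpy as np

def ensemble_output(outputs):
    """Equation 6.2: average the output vectors of the N_e member networks."""
    return np.mean(np.asarray(outputs, dtype=float), axis=0)
```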
In preparing the training data sample, we adopt a variation of the bagging
technique [18]. The bagging technique consists of resampling, with replacement, the
training dataset for each individual network. This means that the training data for
individual networks are a subset of the entire dataset with some records repeated.
Thus in a given training cycle the network training is reinforced on the repeated
records. The bagging technique is particularly appropriate for unstable networks
that produce widely different results for a small change in the training dataset. Here,
the neural network, which has been designed to classify the correct and incorrect jet
combinations, is not unstable. Instead, the finite size of the network imposes a
restriction on the size of the training dataset on which it can train. Since we are
interested in improving the performance of a stable network by overcoming the
limitations on the size of the training data imposed by the network architecture, we
generate the individual datasets for each network by resampling 80,000 records
without replacement.
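The sampling variant described above (without replacement, one subset per network) can be sketched as follows; the function name and signature are hypothetical:

```python
import random

def make_training_sets(dataset, n_networks, sample_size=80000):
    """Draw one training set per network by sampling without replacement,
    so each network trains on a distinct (possibly overlapping) subset
    with no repeated records inside any one subset."""
    return [random.sample(dataset, min(sample_size, len(dataset)))
            for _ in range(n_networks)]
```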
6.3.1 Ensemble Results
Figure 6.2 displays the result of two kinds of ensemble neural networks: (1) networks
that were stopped early using a validation dataset, and (2) networks that were trained
for a much longer period until the error on the training dataset was sufficiently low.
The error in the training dataset does not reach a minimum, and so a qualitative
judgment is required for stopping the training.
There are a number of inferences that can be drawn from the graph. First, the two
kinds of networks show different characteristics. The earlystopped
neural networks are grouped at a higher efficiency-purity performance than the
group of networks that were trained to a lower error on the training data set.
This indicates that the assertion made in Section 3.5.3 is correct. Earlystopping of
individual neural networks improves performance.

Figure 6.2: The ε-p graph illustrating the ensemble result. See Section 6.3.1.
We notice however that though individual networks that were not stopped early
performed worse than those that were stopped early, in ensembles they performed
better. We may understand this in terms of overfitting. Overfitting is the situation
when the neural network fits the given data well, but at the same time it creates
features in the model that are not actually present. These features are introduced at
random. Therefore, each individual neural network creates its own random features
which are washed out when the averaging takes place, i.e., the overfitted features are
averaged out. Alternatively, since each dataset represents a fraction of the original
dataset, individual neural networks trained on such datasets without earlystopping
can be considered to be highly fitted to the region in variable space the sample dataset
represents. The averaging of the neural network outputs thus represents the average
features of these overlapping and highly fitted regions.
This also means that earlystopping has some limitations. As the training pro-
gresses and the neural network fits the data, the overfitted features are also getting
stronger. As a result, the earlystopping condition is a point where the effect of over-
fitting overwhelms the fitting. That is not the point where perfect fitting has taken
place. Therefore, an ensemble average gives us a better result than earlystopping.
It is to be expected that if this is indeed the case, then a significant overlap of the
training datasets of individual neural networks could be crucial.
Furthermore, the ensemble of neural networks, as implemented here, can be used
to circumvent the limit on the size of the dataset that neural networks impose. In
many high-energy physics applications, it is likely that a large simulated data sample
is available. Using the training sample procedure described here, it becomes possible
to utilize a larger data set for training.
6.4 Classifying Z and W jet-pairs
In this section, we wish to explore the classification of the W and Z bosons from
hadronic decays using kinematic variables. This is a harder problem, since the two
objects, the Z and the W, are kinematically very similar in their hadronic decay
modes.
The neural network has the (6-21-22-2) architecture as in the previous example,
described in Chapter 3. The training data consist of events of the type e+e− →
W+W− where jet pairs are characterized by the variables used earlier (Ej1, cos θj1,
Ej2, cos θj2, mjj, θjj). The variables in the training data are standardized to zero mean
and unit variance according to Equation 3.5. The jet pairs are classified according to
the jet-boson association scheme described in Section 6.2.
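The standardization of Equation 3.5, assumed here to be the usual z-score transform, can be sketched as:

```python
import numpy as np

def standardize(x):
    """Scale each column (training variable) to zero mean and unit variance;
    the per-column mean and std are returned so the same transform can be
    applied to the test data."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)
    return (x - mu) / sigma, mu, sigma
```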
Figure 6.3: The ε-p curves display the performance of a neural network trained on e+e− → W+W− data and tested (1) on e+e− → ZZ data (Z NN on W data) and (2) on e+e− → W+W− data (W NN on W data). Note that the network, trained on W data, performs equally well on Z data.
Figure 6.4: The ε-p curves display the performance of a neural network trained on e+e− → ZZ data and tested (1) on e+e− → ZZ data (Z NN on Z data) and (2) on e+e− → W+W− data (W NN on Z data). As in Figure 6.3, this shows that the networks trained on right and wrong jet pairs are picking up the features of a general heavy boson.
Figure 6.3 displays the efficiency-purity curves for the neural network distinguishing
the correct jet pairs from the incorrect jet pairs in ZZ and WW events. Trained
on e+e− → W+W− data, the neural network performs nearly equally on both Z and
W data. This behavior is repeated in neural networks trained on e+e− → ZZ data
(Figure 6.4). For this network too, the performance of the network on W test data
is only marginally inferior to its performance on Z test data.
Thus the neural network designed for the combinatorial problem has picked up
features for the identification of a general heavy boson against a background of wrong
jet combinations, but it does not distinguish W from Z.
6.4.1 Distinguishing W and Z
In this section, we examine in more detail the performance of the network in distin-
guishing between W and Z. The training data now have 40,000 jet-boson-associated
jet pairs identified as W and an equal number of jet-boson-associated jet pairs iden-
tified as Z. The neural network has a fully connected 6-21-22-2 architecture trained
on the same six variables as above.
Figure 6.5 displays the efficiency-purity curve for the network that distinguishes
the Z from the W jet pairs. Since there are two classes in the data, the efficiency-purity
curve for the same network distinguishing the W from the Z jet pairs
is complementary and is not shown. In comparison to the Z boson classification
against the wrong jet pair combinations, the neural network does not perform as
well. Moreover, the curve is straighter and does not have a well-defined concave
shape.
This straighter shape in Figure 6.5 can be understood from the neural network
probability output shown in Figure 6.6. It shows that the neural network output
between 0.15 and 0.7 is equally distributed between the Z and W jet pairs, following
identical distributions, indicating that this mid region is not a good discriminator.
The straighter efficiency-purity curve for the Z from Z,W test data in Figure 6.5 is
attributable to the low performance of the neural network in this region.
6.5 Conclusion
In this chapter, we have designed a training scheme for an ensemble of neural net-
works. It was seen that an ensemble of neural networks performs better than a single
neural network, though the performance does not improve drastically. There are two
crucial factors determining the success of this scheme. The first is that the neural
networks should be trained on bagged data; the second is that the networks should
not be regularized by methods like earlystopping.

Figure 6.5: The ε-p curve for the Z jet pairs against a background of W jet pairs is given by the dark line. For comparison, the ε-p curve for the Z jet pairs against the background of wrong jet pairs is also given. The dark line is straighter, denoting a poorer classification performance in the mid probabilities (see Figure 6.6).

Figure 6.6: The neural network output of the first unit (probability of Z jet pair). Since there are two classes, the probability for W is complementary. The classification is not optimal between the output values of 0.15 and 0.7, and it is particularly suboptimal between 0.15 and 0.5. The peaks most likely denote subclasses of the jet pairs that have particular kinematic variable distributions.
The variation of the bagging method used in the ensemble of networks is a means
by which neural networks can be trained on larger datasets. The neural network
architecture imposes a restriction on the size of the dataset on which the networks
can optimally train. When a large dataset is available for training, the ensemble
technique can be used to improve the performance of neural networks.
We have also examined the training of a neural network to classify heavy bosons
into Z and W separately. A naive implementation of such a classifier does not operate
perfectly, and there is a range of neural network output (i.e., Z probability output
between 0.15 and 0.7) where the classifier does not work. For an optimal classification,
a better understanding of the kinematic variable space is needed. This exploration of
the kinematic variable space is done in the next chapter.
Chapter 7
Unsupervised Methods
7.1 Introduction
The training of the feedforward neural network used so far is an example of supervised
learning. Under this training paradigm, the classifier is given a dataset with records
that contain training variables as well as the pre-determined class each record be-
longs to. The classifier implements an iterative learning procedure and learns on this
training set. In neural networks this learning procedure occurs with the iterative ad-
justments of the connection strengths (weights) between neural units. In binary trees,
the learning procedure occurs via repeated binary partition of the data according to
an optimizing criterion that depends on the specific binary tree implementation.
An alternative form of learning is unsupervised training. In this scheme, the
dataset on which the training takes place does not contain information on the correct
classification of each record. In this scenario, the learning algorithm learns the data
without supervision and produces the partitions (classes) spontaneously.
There are two basic advantages unsupervised learning methods have over super-
vised learning methods. First, unsupervised methods are not susceptible to the bias
of the training data sample that is characteristic of data samples containing a
pre-determined classification. In the high-energy physics context, the training datasets
determined classification. In the high-energy physics context, the training datasets
are generally obtained from simulation studies. A bias is introduced into the train-
ing data sample by the method, usually domain knowledge based (see Jet-Boson
association in Section 6.2), that determines the classes that each record belongs to.
Unsupervised methods, which do not train on data that have been pre-classified, are
therefore not limited by the domain knowledge inherent in training datasets.
The other advantage is the possible discovery of novel signals in the data. Since
supervised methods are trained on a fixed number of classes, any novel signal in
new data will likely be missed by supervised methods. Unsupervised methods, not
confined by this restriction of supervised methods, are more sensitive to novel signals
in new data.
From a computational point of view, a third advantage is that unsupervised meth-
ods are faster than supervised methods like neural networks. Combined with the first
two advantages, unsupervised methods are more suited for examining large datasets
for the presence of novel signals.
These advantages of unsupervised methods are offset by the ability of supervised
methods to model data in greater detail and thus offer greater classification perfor-
mance.
In the previous chapter we encountered the problem of a neural network
not performing optimally in the classification of the Z against the W. A possible
explanation is that the problem was ill posed; that is, the variables on which
the training took place were not adequate to separate the two different species of
jet pairs. A significant portion of the Z and W data overlap, and the neural network
was unable to separate them in that region.
The solution would be to look at the distribution of the data variables (perhaps
using visualization techniques) and then to try to find better variables. To understand
the data distributions, we first provide an exercise in data visualization and explo-
ration. Most unsupervised methods rely on finding features in the data distributions.
In Section 7.2, we look at Principal Component Analysis as a tool to understand
and visualize data distributions. We also present a density cutoff procedure based on
principal components as an alternative method of signal separation. In Section 7.3
we introduce two clustering methods and examine their efficacy in separating signals
and backgrounds.
7.2 Visualization
Data visualization is important in high-energy physics, since it is the first step toward
data analysis. A popular form of one-dimensional data visualization and presentation
is the histogram. This idea can be extended to two dimensions, and then it is referred
to as a lego plot. This type of visualization cannot be extended to higher dimensions.
Another form of visualization aid is the scatter plot for two or at most three
dimensions. Scatter plots are particularly helpful in the search for structures in the
data distribution and classification. Scatter plots are sensitive to the size of the
dataset. A very large dataset would render the plot entirely black and thus of no use.
Too small a dataset would produce too sparse a scatter plot, again of no use. It is
also easy for significant clusters to hide within larger clusters.
Where the number of dimensions is more than three, visualization of the data
becomes more difficult. In particular, it becomes important to choose the variables
that best characterize the data and exhibit the separation between the classes.
A priori, it is not possible to order variables in importance without specific domain
knowledge. In the examples earlier, the choice of the invariant mass, an important
discriminating variable, was motivated by domain knowledge. Even if an ordering of
variables is possible, there are possibly correlations within the chosen variables that
hide the particular distribution of the data that would best display the structure of
the data.
We discuss below a method, Principal Component Analysis (PCA), that addresses
these issues, enabling us to visualize the data optimally.
7.2.1 The combinatorial problem and Principal Component
Analysis
In the combinatorial problem, we have data for all possible pairs of jets in an event,
from which we would like to classify the correct pairs. The question is: can we distin-
guish the correct pair from these kinematic quantities using a visualization method?
The four variables we examine are the product of the two jet energies (E1E2),
the jet pair invariant mass (m12), the jet opening angle (θ12) and the error in the
invariant mass (δm^2_12). These variables are known to be good discriminants. Ideally,
a knowledge discovery tool should be able to pick up the relevant discriminant on its
own. However, if the discriminant is a nonlinear function of several variables, this is
very unlikely to happen. Since the method we examine here, Principal Component
Analysis, is a linear method, we need to include non-linear discriminants explicitly.
Principal Component Analysis
PCA is a linear transformation from the p-dimensional space to another p-dimensional
space. The components in the new space are ordered according to the proportion of
the variance in the data in that direction. This can be explained geometrically as
follows: the first component is a straight-line fit to the data in the original
p-dimensional space. Therefore, it is the direction along which the data is most
widely spread. The second component is orthogonal to the first and accounts for the
next biggest spread in the data. The third component accounts for the third highest
variance in the data and it is orthogonal to the first two components. And so on.
Since the components are now ordered according to the variance, the structure
of the data becomes obvious in the first few components, now called the principal
components. Using the complete set of new p components does not yield anything
new, since the data in the new space has as much information as the data in the
old space. The advantage with the principal components lies in the fact that they
are ordered according to the variance along the components. For data with a large
number of variables (dimensions), the principal components that account for the
smallest measure of the variance in the data can be dropped. This results in a loss
of some information, but this loss is compensated by a gain because of the decrease
in the dimensionality of the data [12].
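The geometric description above corresponds to diagonalizing the covariance matrix of the standardized data and sorting the eigenvectors by eigenvalue. The following is a minimal sketch of that procedure, not the analysis code used in this thesis; the function and variable names are illustrative:

```python
import numpy as np

def pca(X, n_components=2):
    """Project standardized data onto its leading principal components.

    X is an (n, p) data matrix.  Columns are standardized to zero mean
    and unit variance, the covariance matrix is diagonalized, and the
    eigenvectors are sorted by decreasing eigenvalue (variance).
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    cov = np.cov(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # returned in ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xs @ eigvecs[:, :n_components]     # coordinates in PC space
    explained = eigvals / eigvals.sum()         # fraction of variance per PC
    return scores, explained
```

Dropping the trailing columns of `scores` is exactly the dimension-reduction step described above: the discarded components carry the smallest variance fractions.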
7.2.2 PCA to 2 dimensions
Figure 7.1 shows the distribution of the combinations used for our unsupervised training
in a scatter matrix plot, with the variables plotted against each other. The sample consists of both
Z boson pairs and wrong combinations. We are trying to address the problem of
visualizing the separation of the two kinds of pairs. To do this we apply PCA.
The transformation matrix of the four variables given in Section 7.2.1 that results
in PC1, PC2, PC3 and PC4 is given in Table 7.1, where PC1, PC2, PC3 and PC4
are the principal components, given in the order of maximum variance to minimum
variance. Their relative importances are displayed in the scree¹ diagram (Figure 7.2).
¹A scree is characterized by a sharply rising region and a flat region. The point where the two regions meet (sometimes called the knee) can be used as a cutoff on some parameter. Here, we use the scree diagram to identify the number of significant principal components.
Figure 7.1: The scatter matrix plot for the four variables mentioned in Section 7.2.1.
                 PC1          PC2          PC3          PC4
E1E2 (COM)   0.5008767   -0.4829708   -0.6611164    0.28068994
m12          0.5233797   -0.2480016    0.2479882   -0.77657627
θ12          0.4426680    0.8397824   -0.3069293   -0.06786034
δm212        0.5284328   -0.0000699    0.6381390    0.55994413

Table 7.1: The transformation matrix. Note that the transformation is between the scaled and standardized (zero mean, unit variance) values of the variables in the left column and the principal components.
The knee of the scree diagram suggests that the first two principal components are
the most important. The transformation is on normalized variables.
Figure 7.3 shows the distribution of the two most important principal components
in a scatter plot. This view shows the data in the two dimensions in which it has the
maximum spread. The background wrong combinations of the jets are more spread
out, and the Z jet-pairs are considerably localized.
Figure 7.2: The scree plot, which displays the eigenvalues of the principal components.
Interpretation of the transformation
A major reason for the use of principal component analysis is the ordering of compo-
nents, a property that makes it attractive for dimension reduction. We use the order-
ing property here to visualize the data and later as inputs to unsupervised training.
The two components shown in Figure 7.3 account for nearly 90% of the variance in
the data.
The transformation equations, including the initial rescaling and the subsequent
principal component transformation, are shown in Equations 7.1 and 7.2. In the
equations below, PC1 denotes the first principal component and PC2 denotes the
second principal component.
PC1 = 6.56×10⁻⁵ E1E2 + 6.25×10⁻³ m12 + 4.98×10⁻¹ θ12 + 3.43×10⁻³ δm212 − 2.87    (7.1)

PC2 = 6.33×10⁻⁵ E1E2 + 3.00×10⁻³ m12 − 9.38×10⁻¹ θ12 + 6.49×10⁻⁶ δm212 + 1.00    (7.2)

Figure 7.3: The two orthogonal components with the highest standard deviations after principal component analysis. In this plot, the Z jet-pairs are given in black and the wrong combinations are given in gray.
Here the transformations are from the unscaled and unstandardized variables as op-
posed to the transformations in Table 7.1, which transform from scaled and standard-
ized variables (Section 3.5.1).
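The relation between the two forms of the transformation is just the algebra of the rescaling: PC = Σᵢ wᵢ (xᵢ − μᵢ)/σᵢ = Σᵢ (wᵢ/σᵢ) xᵢ − Σᵢ wᵢ μᵢ/σᵢ, so the coefficients on the unscaled variables are the Table 7.1 loadings divided by the per-variable standard deviations, and the constant term (such as the −2.87 in Equation 7.1) collects the means. In the sketch below, the means and standard deviations are hypothetical placeholders, not the values used in the thesis:

```python
import numpy as np

# Loadings of PC1 in the standardized space (first column of Table 7.1).
w = np.array([0.5008767, 0.5233797, 0.4426680, 0.5284328])

# Hypothetical means and standard deviations of the raw variables
# (E1E2, m12, theta12, dm12^2); the thesis does not list them here.
mu = np.array([6.0e3, 90.0, 1.5, 120.0])
sigma = np.array([7.6e3, 84.0, 0.89, 154.0])

# PC1 = w . (x - mu)/sigma  =  (w/sigma) . x  -  w . (mu/sigma)
coeff = w / sigma                   # coefficients on the unscaled variables
offset = -np.dot(w, mu / sigma)    # constant term, analogous to the -2.87

x = np.array([5.0e3, 91.0, 1.4, 100.0])       # one raw data point
pc1_raw = np.dot(coeff, x) + offset           # via the unscaled form
pc1_std = np.dot(w, (x - mu) / sigma)         # via the standardized form
assert abs(pc1_raw - pc1_std) < 1e-12         # the two routes agree
```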
In this space, the Z jet pairs are localized in a smaller region, whereas the wrong
jet pairs are spread out. The wrong combinations do not occupy the entire space, but
are themselves bounded. These bounds are due to kinematic constraints.
Multidimensional scaling
Multidimensional scaling [22] (MDS) is another technique for transforming the vari-
ables. Given an n×p dataset, this can be used to map from a p-dimensional space to
n-dimensional space with the variables ordered according to the discriminating power
as in PCA. Traditionally MDS is used to map to a 2-dimensional space. The chief
characteristic of MDS is that the inter-record distance in the dataset is preserved as
best as possible. Multidimensional scaling methods that are based on the Euclidean
distance give a result that is equivalent to that from PCA. Thus the interpretation
of MDS, namely that the transformation maintains the relative distances between data
points, can be extended to the PCA transformation above.
Though the Euclidean-based MDS can be extended further by using different
distance measures to explore different kinds of data distributions, MDS methods do
not scale well with increasing dataset size, because the method requires a distance
matrix D, which for an n × p data matrix is an n × n matrix. Thus MDS methods
scale as O(n²) and are therefore much slower than PCA, and they require considerably
more memory. Though the extensions to the basic MDS
might provide more insight into the data, we do not consider MDS methods further
in this thesis.
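The classical-scaling equivalence mentioned above can be sketched directly: classical (Torgerson) MDS double-centers the squared distance matrix and eigendecomposes it, and with Euclidean distances the recovered coordinates agree with the PCA projection up to rotation and sign. The sketch below (illustrative, not the thesis code) also makes the O(n²) cost explicit, since the full n × n distance matrix is built:

```python
import numpy as np

def classical_mds(X, k=2):
    """Classical (Torgerson) MDS from pairwise Euclidean distances.

    Builds the full n x n squared-distance matrix -- the O(n^2) cost
    noted in the text -- double-centers it, and recovers k coordinates
    from the top eigenpairs.  With Euclidean distances this reproduces
    the inter-point distances exactly for intrinsically k-dimensional
    data, matching the PCA projection up to rotation and sign.
    """
    n = X.shape[0]
    D2 = np.square(np.linalg.norm(X[:, None] - X[None, :], axis=-1))
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ D2 @ J                        # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:k]        # top k eigenpairs
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))
```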
7.2.3 The density plot
As seen in Figure 7.3, the signal (Z jet pairs) is localized against the background of
noise (wrong jet-pair combinations). Therefore, we explore the viability of using a
density cutoff to identify the Z jet pairs in this section.
A density plot is obtained using a Gaussian-like kernel of the following form:

K(d, c) = (1 − (d/3c)²)²  for d < c,
K(d, c) = 0               for d ≥ c.    (7.3)

The kernel is normalized by πc², and c is the radius of an arbitrary circle. Figures 7.4
and 7.5 show the plot of this density. The density peak is located in the region of
the Z boson jet pairs. A cutoff on the density can thus be used to define Z. With
pairs falling within a grid square with density above the cutoff identified as Z pairs,
we obtain the efficiency-purity curve given in Figure 7.6.
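The density evaluation behind Figures 7.4 and 7.5 can be sketched as follows, taking the kernel of Equation 7.3 at face value; the radius c = 0.3, the grid, and the data in the usage below are illustrative, not the values used in the thesis:

```python
import numpy as np

def kernel_density(points, grid, c=0.3):
    """Average density at each grid location from the truncated kernel.

    The kernel is taken from Equation 7.3:
    K(d) = (1 - (d/3c)^2)^2 for d < c and 0 for d >= c, with each
    kernel normalized by pi*c^2 as stated in the text.
    """
    norm = np.pi * c**2
    dens = np.zeros(len(grid))
    for i, g in enumerate(grid):
        d = np.linalg.norm(points - g, axis=1)       # distances to grid point
        k = np.where(d < c, (1.0 - (d / (3.0 * c))**2)**2, 0.0)
        dens[i] = k.sum() / (len(points) * norm)     # mean kernel value
    return dens
```

A cutoff on the returned density then plays the role described in the text: grid squares above the cutoff are identified as the localized Z region, and scanning the cutoff traces out an efficiency-purity curve like the one in Figure 7.6.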
Figure 7.4: The surface plot for the density distribution, calculated using the Gaussian-like kernel described above. Notice the peaks. The higher of the two peaks represents the signal.
Figure 7.5: The contour plot of the density distribution.
Figure 7.6: The efficiency-purity curve. The two curves compare the result of (a) the classical multidimensional scaling procedure and (b) the neural network. In (a), a density cutoff is used to identify the Z. The density is obtained in a 2-dimensional reduced subspace obtained from a classical scaling on the following variables: E1E2, m12, θ12 and δm212. The neural network is trained on E1, E2, cos θ1, cos θ2, m12 and θ12.
PCA and Density Cutoff
Figure 7.6 shows the efficiency-purity curve of the PCA density-cutoff procedure and
compares it with the supervised neural network result. The neural network gives a
better result, for two reasons. First, the neural network is a non-linear procedure,
as opposed to PCA, which is a linear one. Second, the neural network is a supervised
training procedure, whereas the PCA is unsupervised.
Even though the PCA procedure lacks the precision of the neural network, it is
expected to generalize better: it should perform well on a minimum-bias sample and
on samples unlike those it was developed on.
7.2.4 PCA at the quark level
To understand better the distribution of the Z and the wrong-combination jet pairs
in the Principal Component Analysis above, the PCA transformations are applied at
the quark level. In this dataset, there are exactly four quarks in each event, giving
six pairings, of which two are correct Z pairs and four are wrong combinations. The
corresponding first two principal component distributions for the quark-level
information are given in Figure 7.7.
absence of the energy spread at the quark level, which is unavoidable at the jet level,
Figure 7.7 lets us examine the nature of the Z decay against the wrong combinations
more clearly. We observe that the general distribution of the wrong combinations is
very similar to the one observed in Figure 7.3. The distribution of the correct quark
pairs shows a much sharper and significant shape, which stands out against the wrong
combinations. Figure 7.8 displays the correct and wrong quark-pair distributions
separately to bring out the shapes of these distributions better.
In Figure 7.8(a), the sharp upper bound is the lower bound on the opening angle
Figure 7.7: The PCA plot using the same transformation as for the jets, but on quark information.
of the quarks from the decaying Z. The long tail denotes the spread in the opening
angle.
The quark-level distributions in the two principal components indicate that there
exist some structures in the data distribution that are not clearly displayed in the
two-dimensional plot. To examine the distribution further, we add the third principal
component and look at the three-dimensional distribution of the data in the following
section.
PCA to three dimensions at the quark level
With the additional view of the third principal component, some more structure
becomes visible. In Figure 7.9, several views are shown. The blue denotes the
wrong-combination quark pairs, whereas the red denotes the Z quark pairs.

(a) Right combinations (b) Wrong combinations
Figure 7.8: The right and wrong combinations at the quark level.
The blue data points fall on a curved surface and the red data points do not fall
on that surface. This is an indication that the two data classes are separable on the
basis of these four variables alone. This is significant because the data now show a
structure in the three dimensions.
7.3 Clustering methods
Since an important task in high-energy physics data analysis is classification, our goal is to
be able to use classification methods in an automatic manner. Visualization tech-
niques described above are a means to look at preliminary features in the data. The
visualization technique, based on Principal Component Analysis, was extended for
classification. In this section we examine the use of clustering methods to classify
data automatically.
Figure 7.9: Four perspectives of the three-dimensional PCA at the quark level. The dark (black) points are Z and the light (green) points are wrong combinations. The two classes form different distributions. See Section 7.2.4 for a description.
7.3.1 Fuzzy clustering
Here we employ a simple fuzzy clustering [41] method called fuzzy C-means or FCM
(see Appendix E). FCM is a k-means-style classifier, which means that the number
of clusters has to be specified at the outset. The salient feature of this classifier is its
fuzziness, which is defined by the membership attribute of each data point calculated
for each cluster. The membership attribute in fuzzy methods is a number between
0 and 1, in contrast to the regular k-means method, where the membership attribute is
either 0 or 1, thus forming a hard boundary. The fuzzy attribute offers soft boundaries
and higher interpretability.
The method consists of selecting a predetermined number of clusters c based on our
understanding of the problem; in the jet combinatorial problem there are good and
bad combinations of jet pairs, so ideally c = 2. The clusters are initialized by
randomly selecting the centroids in the parameter space. The membership of data
point k in cluster i is given by
mki = dki⁻¹ / Σj dkj⁻¹    (7.4)
where dki is the Euclidean distance between the datapoint k and cluster centroid i.
The soft membership of the datapoints, as implemented in the equation above,
enables the algorithm to handle situations in which the clusters overlap at the bound-
aries. The cluster centroids are calculated by
Ci = Σk mki Pk / Σk mki    (7.5)
where Pk is the k-th datapoint.
The membership attributes are calculated for each datapoint with respect to each
centroid using Equation 7.4. The new centroids are then calculated using Equa-
tion 7.5. This is iterated until the centroids converge. The convergence criterion is
decided by a tolerance of 10⁻⁸ on the Euclidean distance change in each centroid. This
tolerance is acceptable, since decreasing it further does not change the result.
Fuzzy cluster application
The FCM method is applied to the combinatorial problem. Since there are two
classes in the problem, the algorithm is initialized with c = 2. The results are given
in Table 7.2. Fuzzy clustering with Euclidean distance d and membership weight 1/d is
performed for two and three clusters. The clustering algorithm has been able to
Cluster index   Z-pair   non-Z-pair   F-measure
0               11,379       17,759        0.56
1                   66       20,796

Table 7.2: Fuzzy clustering for 2 clusters (c = 2). Cluster 1 is interpreted as the incorrect jet-pair cluster. The F-measure is calculated for cluster 0, which is interpreted as the correct jet-pair cluster.
partition the data into two clusters, one containing nearly all the correct (Z) jet
pairs (cluster 0). This cluster is the Z-pair cluster, which has a very high efficiency
but lower than 50% purity. The other cluster (cluster 1), has a high purity in the
incorrect pair, and just over 50% efficiency. In effect, the clustering for c = 2 has
been successful in identifying a little over 50% of the incorrect pairs.
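The efficiency, purity, and F-measure quoted here can be computed directly from the cluster counts. Assuming the standard definitions (efficiency as the fraction of all signal pairs captured by the cluster, purity as the signal fraction within the cluster, and F-measure as their harmonic mean), the counts for cluster 0 of Table 7.2 reproduce the quoted F-measure of 0.56:

```python
def efficiency_purity_f(n_signal_in, n_signal_total, n_cluster):
    """Efficiency, purity and F-measure of a single cluster.

    efficiency: fraction of all signal pairs captured by the cluster,
    purity:     fraction of the cluster that is signal,
    F-measure:  harmonic mean of efficiency and purity.
    """
    eff = n_signal_in / n_signal_total
    pur = n_signal_in / n_cluster
    return eff, pur, 2 * eff * pur / (eff + pur)

# Cluster 0 of Table 7.2: 11,379 of the 11,379 + 66 Z pairs,
# alongside 17,759 wrong combinations.
eff, pur, f = efficiency_purity_f(11379, 11379 + 66, 11379 + 17759)
```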
The effect of partitioning the data into more than two clusters is examined. The
clusters for c = 3 are displayed in Table 7.3. Comparing the c = 2 and c = 3
clusters, it is seen that the additional cluster does not change the partition
drastically. Instead, the third cluster is created out of cluster 1 of the c = 2
clustering (Table 7.2). Increasing the number of clusters c did not partition the
cluster 0 of c = 2 any further, which would have been required for a better efficiency
and purity for the correct jet pair (Z).

Cluster index   Z-pair   non-Z-pair   F-measure
0               11,380       17,595        0.56
1                    4           86
2                   61       20,874

Table 7.3: Fuzzy clustering for 3 clusters (c = 3). The F-measure for cluster 0 does not improve with the increase of c from 2 to 3.
The cluster centroids are given in Table 7.4. Note that in the FCM algorithm the
centroids cannot be interpreted as the average of the datapoints belonging to that
cluster, since each centroid is a weighted average of all datapoints.
Cluster index        E1      cos θ1         E2      cos θ2        m12        θ12
0             -1.96e-05    1.41e-06  -2.36e-05   -9.21e-07  -3.59e-05  -3.35e-05
1              3.37e-06   -3.46e-07   3.99e-06    3.35e-07   6.05e-06   5.62e-06
2              1.62e-05   -1.07e-06   1.97e-05    5.85e-07   2.98e-05   2.79e-05

Table 7.4: The centroids of the three clusters. Note that the variables are standardized to zero mean and unit variance.
Table 7.2 demonstrates that this clustering method is unsuitable for the separation
of the Z-pair jets from the non-Z-pair jets. This is because the data have non-spherical
structures, whereas FCM methods based on Euclidean distance work well only for spherical
distributions. Figure 7.10 displays the first two principal components of the data.
The signal forms a high-density localized distribution overlapping another localized
distribution of the wrong jet-jet pairs.
So while this method can enhance the sample, it is not an efficient classifier.
Figure 7.10 shows the non-spherical distribution of the Z boson jet-pair data.
With an appropriate choice of a non-standard norm in the distance metric, an ellipse,
with the major axis along the longitudinal axis of the Z jet-pair distribution, can be
[Figure 7.10: scatter plot of the first two principal components; axes labelled X and Y.]
.. .
.
.
.
.
.
.
..
..
.
.
.
.
.
.
...
.
.
..
.
... .
..
.
.
.
.
.
.
.
.
.
.
. ...
...
.
.
.
.
.
.
.
.
.
.
.
. ..
..
..
..
..
. .
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
. .
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
..
.
.
..
..
.
.
. .
.
..
..
.
..
.
.
..
.
.
.
.
..
..
..
...
.
.
.
.
.
..
..
..
..
..
..
.
.. ..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
. ...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.. ..
.
..
..
. .
.
.
.
.
.
.
..
.
..
.
..
.
.
..
.
..
..
.
.
.
.
.
.
..
.
.
.
.
.
..
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
. ..
.
.
.
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
..
..
.
..
. .
..
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
..
..
.. .
.
..
..
..
.
.
..
..
.
.
..
..
.
.
.. ..
..
.. .
.
.
.
.
.
.
..
..
.
..
.
.
..
..
.
.
..
.
.
.
.
..
.
. ...
..
..
..
.
..
.
.
..
...
....
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
...
...
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
.
.
.
. ..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
..
.
.
.
.
..
.
.
..
.
.
..
.
... .
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.. .
..
..
.
..
.
.
..
..
..
..
..
..
..
.
.
.
.
.
.
..
..
..
.
..
..
.
.
.
.
.
.
..
.
.
.
.
..
.
. .
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
.
.
.
..
.
..
..
.
..
.
.
.
..
. .
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
. .
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
..
..
..
.
.
..
.
.
.
.
.
.
..
..
.. .
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
..
..
...
.. .
..
.
.
.
.
.
..
..
..
.
.
.
.
.
..
.
.
..
..
..
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
..
.
...
. ..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
...
.. .
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
. .
.. ..
.
.
..
.
.
.
.
.
..
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
..
..
. .
..
.
..
..
.
..
..
..
.
.
.
.
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
..
. .
.
.
.
..
.
.
.
.
..
.
.
..
.
..
.
..
.
.
..
.
.
.
.
.. .
.
.
.
.. .
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
.. ..
..
.. ..
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
..
... .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
. .
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
..
..
...
..
...
..
..
..
.
.
.
.
.
.
...
.. .
..
..
..
.
..
..
.
..
..
...
.
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
. ..
... .
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
..
.
.
.
..
.
...
.
.
. . .
.
.
.
.
.
...
...
..
..
..
..
..
..
..
.
.
..
. .
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
...
. .
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
.
.
..
...
...
..
..
..
.
.
.
.
.
.
..
.
.
..
..
..
..
..
.
.
..
..
..
..
.
..
..
.
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
. .
.
.
.
..
..
.
..
.
.
.. .
.
.
.
.
.
.
.
.
.
..
..
..
..
.
...
.
.
..
..
..
..
. .
....
.
.
..
.
.
.
.
.
.
..
..
..
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
..
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
..
.
.
..
.
.
.
.
.
. ..
.
.
. .
.
.
.
.
.
..
.
.
.
.
. ..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
....
.
.
. .
.
..
..
.
.
.
.
.
..
.
.
.
.
.
. ..
..
..
...
..
.
...
...
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
..
..
..
.
.
.
.
.
.
..
..
...
.
.
.
.
.
.
.. ..
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
...
... .
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
. ..
.
.
.
.
.
.
..
.
.
..
...
...
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
..
..
.
...
..
.
.
..
.
.
...
. ..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.. ..
.
.
..
..
..
..
.
..
..
.
.
.
.
.
.
..
.
.
.
.
...
..
..
..
.
.
..
.
.
.
.
.
. .
.
.
.
.
.
..
.
.
..
..
..
..
.
.
..
.
.
..
.
.
..
..
..
..
..
.
.
.. ..
..
..
..
..
..
.
.
.
.
.
.
.
.
..
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
...
...
..
.
.
. .
..
.
.
...
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
. .
.
..
..
. ..
.
.
..
..
.
..
.
.
.
.
.
.
.
..
..
..
..
.
.
..
..
. .
..
.. .
.. .
..
. .
..
.
.
.
.
.
.
..
. .
...
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
.. .
.
..
.
.
..
.
.
..
.
.
.
.
.
..
.
.
.
..
.
.
.
.
..
.
. .
..
.
.
.
.
.
.
. ...
...
..
..
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.. .
.
..
.
. .
.
..
..
.
.
..
.
.
.
.. ..
.
.
.
.
.
.
.
..
.
.
..
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
. .
. .
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
..
.
.
..
. .
.. .
.
.
.
.
.
..
..
..
.
.
..
.
.
.
.
.
.
.
. ..
..
..
.
..
..
.
...
...
.
.
.
.
.
.
..
..
..
.
..
. .
.
..
.
.
..
..
..
..
..
.
.
..
...
. ..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
. .
....
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
. .
.. ..
.
.
..
..
. .
..
..
.
.
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
. .
..
..
..
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
..
..
..
..
..
...
..
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
..
..
.
..
. .
..
.
.
.
.
.
.
..
.
.
...
.
..
..
..
. .
..
..
..
..
...
... .
....
. .
.. ..
.
.
.
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
. ..
..
..
.
.
.
.
.
.
..
.
.
..
.
..
..
.
..
..
..
.
..
..
.
...
...
..
..
.
.
..
.
.
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
...
.
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
..
..
. .
..
.
.
..
.
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
..
.
..
.
.
..
.
....
.
.
.
.
.
.
.
..
.
.
..
..
..
.. .
.
.
.
.
.
.
....
..
....
.
..
..
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
..
.
.
..
.
.
..
..
..
.. .
.
.
.
.
.
.
.
.
.
.
.
.
...
.
.
.
.
.
.
.
.
..
. .
.. ..
..
..
.
.
.
.
.
.
.
.
.
.
.
. .
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
..
.
.
..
..
.
.
..
.
.
..
.
.
..
.
.
..
..
.
.
.. .
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
..
.. ..
.
.
..
.
.. .
.
.
.
.
.
.
.. .
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
. .
..
. .
..
.
.
.
.
.
.
.
.
..
..
..
. .
..
.
.
.
.
.
.
..
.
.
. .
.
.
.
.
..
.
. . ..
.
..
..
....
.
.
..
.
.
.
.
.
.
..
..
...
.
.
.
.
.
..
.
..
.
...
...
..
.
.
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
....
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
..
.
.
..
.
.
..
. .
..
.
..
...
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
..
..
..
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
. . .
.
.
.
.
.
.
.
.
.
.
. ..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
....
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
..
..
..
..
..
..
.
..
..
.
.
..
..
.
..
..
..
.
.
. .
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
..
..
..
.
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
..
.
..
..
.
.
.
.
.
.
..
..
. .
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.. .
...
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
....
.
.
.
.
.
.
.
..
..
. .
.
..
..
.
..
..
..
..
. .
..
..
. .
..
.
.. .
.
...
.
.
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
.
..
.
.
..
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
...
...
..
..
..
..
. .
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
. ...
.
..
..
..
..
.
.
. .
.
.
.
.
.
.
...
.. .
..
.
.
.
.
.
.
.
.
.
.
..
. .
. .
..
..
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
. ..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
. .
.
.
..
..
.
.
.
..
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
....
. .
..
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
.
.
.
..
..
...
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
...
...
.
..
. .
.
..
..
..
..
..
..
..
..
..
..
.
.
..
.
.
.
.
.
...
. .
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
...
..
..
..
.
.
..
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
. .
.. .
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
..
...
.
..
..
..
..
..
..
.
.
. .
.
.
.
.
.
.
..
. .
.
.
.
.
. .
.
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
..
.. .
... .
...
.
.
.
.
.
.
.
..
.
.
.
.. .
.
.
.
.
.
...
...
..
..
..
.
. .
..
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
..
..
.
.
.
.
.
.
..
..
..
.
.
. .
.
.
.
.
.
.
.
.
..
..
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
..
.
. .
..
..
.
..
..
..
..
..
..
..
.
.
..
.
.
.
.
.
....
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
...
.
..
.
.
.
...
.. ..
.
.
.
.
.
.
....
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
...
... .
.
.
.
.
.
..
..
.. .
.
.
.
.
.
..
.
.
..
.
.
.
.
.
..
.
.
.
..
.
.
.
.
.
.
.
..
..
.
..
..
...
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
....
.
..
..
..
.....
. .
.. ..
.
.
.
.
.
.
.
.
.
.
.
.
. ...
. ..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
..
..
.
..
..
..
.
.
.
.
.
.
..
..
..
..
..
..
..
. .
..
..
..
..
.
.
.
.
.
...
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
.
.
.
.
..
. .
..
.
..
. .
..
..
..
.
..
..
. .
..
..
..
.
.
.
.
.
.
..
.
...
..
.
.
..
.
.
.
.
.
.
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
. .
..
.
.
.
.
.
..
.
.
.
.
...
..
..
..
..
..
.
.
.
.
.
.
..
..
..
..
..
..
..
.
.
.. ..
..
.. .
.
.
.
.
..
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
.
. ...
...
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
..
..
..
....
.
.
.
. .
.
.
.
..
..
.
..
..
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
.
.
..
..
. .
..
..
..
. ....
...
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
..
.
.
..
..
.
.
..
..
.
..
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
.. ..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
.
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.. ..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
... .
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
..
..
..
..
..
..
..
..
..
..
...
...
.
.
.
.
.
.
.
...
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
..
.
.
.
..
..
..
..
..
.. ..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
..
..
.
...
...
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
..
.
.
..
..
.
.
..
..
.
.
..
.
.
.
.
.
. ..
.
.
..
..
..
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
...
.
.
.
.
..
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.. .
...
...
...
.
.
.
.
.
.
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
..
.
.
..
..
..
..
.
..
.
.
.
..
..
....
..
....
. .
..
.
..
..
.
.
.
.
.
.
.
...
...
..
..
..
.
.
.
.
.
.
...
.. .. .
.
.
..
. .
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
. .
...
.
.
.
.
.
.
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
.
.
. .
.
.
.
.
.
.
..
..
..
..
. .
....
. .
...
.
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
...
.
.
.
.
.
..
.
.
..
.
..
. .
..
..
..
.
.
.
. .
.
. .
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
..
.
.
.
.
.
.
.
.
..
.
.
.
..
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
..
..
. .
.
...
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
. .
..
...
.. .
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
...
...
.
.
.
.
.
.
...
....
.
.
.
.
.
..
..
..
..
.
.
. .
.
..
..
.
.
.
.
.
.
.
.
..
..
.
..
..
...
..
...
..
. .
. .
..
..
..
.
.
.
.
.
.
.
..
..
.
..
..
.
.
.
.
..
.
. .
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
.. .
.
..
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
..
..
.
.
.
..
.
.
...
...
...
... ..
.
.
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
..
..
.
.. ..
.
.
.
.
.
.
.
..
..
..
..
. .
..
..
. .
..
.
... .
.
.
.
.
.
.
.
.
.
.
.
.
.
...
.. .
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
..
..
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
..
..
..
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
. ..
.
.
.
.
.
.
..
.
.
..
...
...
...
. ..
..
.
.
..
..
..
..
.
.
.
.
.
..
.
.
.
..
..
. .
..
..
.
.
..
.
.
.
.
.
.
..
..
..
.
..
..
.
..
..
...
..
..
.
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
..
.
.
.
.
.
.
.
.
.
.. ..
..
... .. ...
...
..
..
..
..
..
...
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
...
.. .
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
. .
...
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
..
.. ...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
..
.
.
.
..
.
.
..
.
.
..
.
.
.
....
.
..
..
...
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
....
..
.
.
.
.
.
..
.
.
.
..
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
..
..
..
..
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
. .
.
.
.
.
.
..
..
..
.
.
.
.
.
..
.
..
.
.
..
.
.
..
..
.
. ..
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
..
..
...
.
.
.
.
..
.
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
...
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
. ..
.
....
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
.
.
. .
.
.
.
.
.
.
..
..
..
..
..
...
.
.
.
.
. ..
.
.
..
.
..
..
.
.
.
.
.
.
...
. .
..
..
. .
..
.
. .
..
.
..
.
.
..
.
.
.
.
.
.
.
..
..
. .
..
..
..
..
..
.
..
. .
. .
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
.
..
.
...
..
..
..
.
.
.
.
.
. ..
.
.
..
..
.
.
...
.
.
.
..
..
.
.
..
..
..
.. ..
..
..
.
.
.
.
.
. ..
..
..
.
.
.
.
.
..
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
....
.
..
..
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
.. ..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
..
.
.
. . ..
.
..
.
..
.
.
..
.
.
.
.
.
..
.
.
.
..
..
..
.
.
.
.
..
.
.
.. .
...
.
..
..
.
...
...
.
.
.
.
.
.
...
...
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.. ..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
..
..
.
.
.
.
.
.
.
...
.. .
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
..
..
..
..
..
..
..
..
.
.
.
.
.
.
..
.
. .
.
.
.
.
.
.
.. ..
.
.
.
..
.
.
.
.
.
.
.
.
...
...
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
. . .
.
.
.
.
.
...
...
.
..
..
. .
.
.
.
.
.
.
.. ..
..
..
..
.
..
..
....
.
.
..
..
. .
..
..
..
..
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.... .
.
.
.
.
.
.
. .
.
.
.
.
.
..
.
.
...
..
..
.
..
..
..
..
..
..
.
.
.
.
.
.
..
.
.
..
..
.
.
...
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
. ..
.
.
..
..
..
.. ..
.
.
..
.
.
.
.
.
.
.
..
...
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
...
... .
.
.
.
..
..
..
....
..
.. .
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
...
.
.
.
.
.
..
.
...
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
..
..
..
.
.
..
..
. .
..
..
..
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
. ..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
..
..
..
. .
..
.
.
.
.
.
.
.
.
. .
.
.
..
.
.
..
.
.
.
.
.
.
..
. .
..
..
..
..
..
.
.
..
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
...
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
..
..
.
.
.
.
.
.
.
..
..
....
.
.
..
..
..
..
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
..
..
..
.
.
.
.
.
.
.
..
..
.
...
. .. ..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
. .
..
...
...
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.. ..
.
.
.
.
.
.
.
.
..
..
..
..
. ..
..
.
.
..
.
....
.
...
...
..
..
..
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
..
..
...
.
.
.
.
.
.
.
.
.
.
.
.
.
... .
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
...
.
.
.
..
.
..
..
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
..
....
..
.. ..
..
..
.
.
.
.
.
.
.
... .
.
...
..
.
..
..
..
.
.
.
.
.
..
.
.
.
.
.
..
..
..
..
. .
..
...
. ...
..
..
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
....
..
...
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
. ...
.
..
. .
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. ..
..
..
..
..
..
.
..
..
.
..
..
..
..
.
.
..
.
.
.
.
.
..
.
.
.
.
.
..
.
.
..
.
.
.
.
.
..
.
.
.
.
.
..
.
..
. ..
.
.
..
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
. .
..
. .
..
.
.
.
.
.
. ..
. .
..
..
..
..
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
..
. .
.. .
.
.
.
.
.
..
.
.
. .
.
..
..
.
.
.
.
.
.
.
..
..
.. ..
. .
. .
.
..
..
.
..
.
.
..
.
.
.
.
.
..
.
.
.
.
. ..
.
.
..
.
..
..
.
..
.
.
..
..
..
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
..
..
.
.
..
..
.
.
..
.
.
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
..
.
....
.
.
..
..
.
.
.
.
.
.
.
..
. .
..
..
..
..
.
.
..
.. ..
.
...
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
..
.
..
.
.
.
.
.
..
..
..
.
.
.
.
.
.
...
... .
.
..
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
..
..
.
..
..
..
...
...
..
.
.
..
.
.
.
.
.
.
.
.. ..
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
. .
. .
..
..
..
.. ...
...
..
..
...
.
.
.
.
.
.
.
.
.
.
. ..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
. .
..
..
.
.
.
.
.
.
..
..
.. .
.
.
.
.
.
.
..
..
.
...
...
.
.
..
..
...
... .
..
..
.
.
..
. .
.
..
.
.
..
..
..
..
.
....
.
..
.
.
..
.
.
.
.
.
...
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
...
.
.
.. .
.
.
.
.
.
.
.
.
.
.
.
...
...
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
... .
.
...
...
.
.
.
.
.
.
..
.
.
..
.
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
..
.
.
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
. .
.
.
.
.
.
..
..
...
.
.
.
..
.
.
.
.
.
.
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
..
..
..
.
.
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
.
..
.
..
..
..
..
..
..
..
.
.
..
..
..
..
.
.
.
.
.
. .
..
..
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
.
..
. .
.. .
.
..
.
.
.
.
.
.
.
.
..
. .
..
..
. .
..
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
..
..
. . ..
.
.
..
.
..
. .
. ..
.
.. .
.
.
.
.
.
..
.
.
.
.
.
..
.
.
...
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
...
... .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
..
...
...
..
..
...
..
..
.
..
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.. ..
.
.
..
.
.
.
.
.
.
..
..
..
.
..
..
.
...
...
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
...
.
..
...
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
.
.
..
..
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
...
.
.
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
..
.
.
..
..
.
..
.
..
..
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
..
.
.
.
..
..
..
...
..
..
.
.
.
.
.
.
.
..
.
.
..
.
.. ..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
..
.
.
.
..
...
... .
.
.
.
.
..
.
.
.
.
.
..
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
. . .
.
. .
.
..
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.. .
.
..
..
..
.
..
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
..
. .
..
..
..
..
.
.
.
.
.
.
.
.
.
.
..
..
..
.
..
. .
.
..
.
.
..
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
...
...
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
. .
..
..
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
.
.
.
.
..
..
.. .
.
..
.
..
.
.
.
..
.
.
.
.
.
.
..
.
.
..
.
..
. .
.
.
.
..
.
.
.
.
.
.
.
.
..
..
..
..
.
.
..
..
. ..
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
..
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
. ...
.
.
.
.
.
..
.
.
.
.
.. .
.
..
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
...
...
.
..
..
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
...
..
..
.
.
.
.
.
.
.
.
.
.
.
. ...
...
..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
..
..
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
...
...
..
.
.
..
...
. ..
.
.
.
.
.
.
..
.
.
..
..
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
.
.. .
.
.
.
.
.
.
.
..
.
.
.
..
..
.
.
..
..
.
..
..
..
..
..
..
.
.. ..
.
.
.
.
.
.
..
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
...
.
.
..
.
.
.
.
.
.
...
...
..
.
.
...
.
.
.
.
.
..
.
.
..
..
.
.
..
..
..
..
..
..
..
. .
.
...
..
..
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
.
..
.
.
..
..
. .
.
.
.
.
.
..
.
.
...
.
.
.
.
.
..
.
...
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
.. ..
..
..
.
.
.
.
.
.
..
..
..
..
. .
..
...
...
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
. .
...
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
..
.
.
..
.
.
.
.
..
.
..
..
.
.
.
. .
.
. .
.
.
..
.
.
.
.
.
.
.
..
..
..
..
..
. . .
.
.
.
.
.
..
.
.
..
...
. ..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.. ..
.
.
.. ..
..
..
..
.
.
..
...
. ..
..
..
..
..
.
.
..
..
..
..
..
..
..
..
..
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
..
. .
..
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
..
..
.. .
.
.
.
.
.
..
.
.
..
..
.
.
..
...
...
.
.
.
.
.
..
.
.
.
.
.
..
.
.
..
.
.
..
.
.
...
.. .
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
. .
.
.
.
.
.
.
...
. ..
..
..
. .
.
.
.
.
.
.
..
..
..
.
.
..
.
.
..
.
.
..
.
.
. ..
.
.
.
.
.
.
.
..
..
...
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
. .
.
.
..
. .
..
..
.
.
..
..
..
..
.
.. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
... .
.
.
.
.
.
.
.
.
.
..
.
. .
.
.
.
.
.
.
.. .
.
.
..
..
.. .
.
.
.
..
. .
.
.
..
..
..
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
...
..
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
...
..
.
.
.
.
.
.
.
..
. .
..
..
..
...
..
..
.
..
.
.
..
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
..
.
.
..
..
..
....
..
. .
..
..
..
.
.
..
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. ..
..
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
..
..
.
.
.
.
.
..
.
..
..
.
.
.
.
..
..
.
.
..
..
.
.
..
..
.
.
..
..
..
..
..
.
.
..
..
.
.
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
.
..
. .
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
..
...
...
.
.
.
.
.
.
..
..
.. ..
. .
..
.
.
.
.
.
.
..
..
..
.
..
..
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
. .
..
..
.
.
..
..
.
.
..
..
..
..
...
...
.
.
.
.
.
. .
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
.
.
..
..
..
..
...
...
.
. .
..
. .
..
..
.
.
.
.
.
.
.
.
....
.
...
...
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
...
...
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
..
..
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.. ..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
..
.
.
..
.
.
..
.
.
.
. .
..
.
..
.
.
..
.
.
.
.
.
.
..
..
. .
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.. ..
..
.
.
.
.
.
..
.
.
..
.
..
. .
.
.
..
..
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
..
. .
..
.
.
.
.
.
.
..
.
...
..
.
.
..
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
. ..
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
...
...
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
..
. .
..
..
.
.
..
..
..
.. ..
.
.
..
.
.
.
.
.
.
..
.
.
..
...
...
..
..
..
..
. .
..
.
....
. .
.
.
.
.
.
..
..
..
..
. .
..
.
.
.
.
.
.
..
.
..
.
..
.
.
.. .
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
.
.
. .
.
.
.
.
..
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
. .
.
. .
..
.
..
..
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
..
.
.
..
..
.
.
. .
..
..
..
.
.
.
.
.
.
..
..
..
.
..
..
. .
.
.
.
.
.
..
..
.
.
..
..
..
.
.
.
.
.
.
..
.
.
.. .
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
. ..
. .
.. .
.
..
.
.
..
. .
.. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
..
. .
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
.. ...
..
..
.
.
.
.
.
..
.
.
..
..
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
...
.
.
.
.
.
.
.
..
..
..
..
..
..
..
.
.
.
.
.
.
.
. ..
.
.
.. ..
. .
.. ..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
..
. .
..
.
.
.
..
..
.
.
..
.
.
..
.
.
..
.
...
..
. .
. ...
..
..
...
...
..
..
..
..
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
. .
....
.
.
...
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
..
. .
...
.
..
..
.
..
..
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
..
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
. ..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
...
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.. ...
...
.
..
..
.
..
.
.
..
..
.
.
..
..
.
.
..
.
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
...
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
...
..
.. ..
.
.
..
..
..
..
..
.
.
..
..
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
. . .
....
.
.
.
..
.
.
.
.
.
. ..
..
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
. ..
.
.
..
.
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.. .
.
.
.
.
.
.
...
.
.
..
..
. .
.
..
..
.
.
.
.
.
.
.
..
..
..
..
.
.
..
.
.
..
.
. .
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
..
.. ..
..
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
...
. ..
..
..
..
.
.. ..
.
..
.
.
..
..
..
. .
.
....
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
..
.
.
.
.
.
.
. .
.
.
.
..
.
..
...
...
.. .
..
.
.
..
...
...
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
. ..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
..
..
.. .
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
..
..
.
.
..
..
..
..
..
.
.
.
.
..
.
.
..
..
..
..
..
.
.
.. .
..
..
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
. .
..
..
....
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.. .
.
.
..
.
.
..
.
.
.
.
.
.
..
. .
..
.
..
..
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
.
. .
..
.
..
.
.
..
..
..
..
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
.
.
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
...
.
.
.
..
.
.
.
.
.
. ..
.
.
. .
.
.
.
..
.
.
.
..
..
.
.
.
.
.
...
..
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
..
.
.
.
.
..
.
.
..
..
..
.
.
..
.
.
.
..
..
.
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
. .
.
.
..
..
..
..
.
.
.
.
.
.
.
.
..
.
..
.
..
.
.
..
.
...
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
....
.
.
..
..
.
.
..
..
..
..
..
.
.
.. ..
.
.
..
...
.. .
..
..
.. ..
. .
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
...
..
.
..
. .
.. .
.
.
.
.
.
.
.
.
.
.
.
..
..
..
...
...
..
.
.
..
.
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
..
.
.
..
..
..
..
..
..
..
.
.
.
.
.
.
..
.
.
..
..
.
.
...
.
.
.
.. .
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
. .
..
..
.
.
..
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
. .
..
.
.
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.. .
..
.
..
. .
..
..
..
..
..
.
.
.. .
..
..
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
.. .
.
..
..
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
....
.
..
.
.
..
.
..
..
.
...
...
.
.
.
.
.
.
..
.
.
..
...
..
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
..
..
..
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
.
..
.
.
..
...
...
..
.
.
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
.
.
..
..
..
..
.
.
..
.
.
.
.
.
...
..
..
.
..
...
..
.
..
.
...
...
.
..
..
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
.
.
.
.
.
.
..
..
..
..
..
..
..
.
.
..
...
...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
..
.
.
..
..
..
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
..
.
.
..
..
..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
.
...
..
.
.
..
...
...
.
.
.
.
.
.
.
..
. .
.
.
...
.
.
..
.
.
..
.
.
.
.
..
.
.
.
.
.
.
.
.. .
.
.
.
.
.
.
..
.
.
.
.
.
..
.
..
..
..
..
..
.. .
.. .
..
..
.. ..
.
.
..
.
.
.
.
.
.
..
..
..
..
. ..
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
..
..
...
.
.
.
.
.
..
..
..
..
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
.
...
..
.
..
.
.
..
.
.
.
.
.
.
.
....
.
..
.
.
..
.
.. ..
. ..
.
.
.
.
..
.
..
.
..
.
.
..
.
.
..
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
....
.
..
..
..
.
.
.
.
.
.
..
..
..
.
..
. .
..
.. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
..
.
.
..
..
...
...
.
.
.
.
..
..
..
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
..
.
..
..
.
..
..
..
.
.
..
.
.
..
.
.
..
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
...
..
..
..
..
..
..
..
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.
.
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
. ..
.
.
..
.
.
.
.
.
.
..
.
.
..
..
..
..
..
..
..
.
..
..
.
. .
..
..
..
..
..
..
.
.
.. .
.
.
.
..
..
. .
..
.
.
.
.
.
.
..
..
..
.
.
. .
..
.
..
..
.
..
..
..
.
.
.
.
.
.
.
..
..
. .
.
.
.
.
.
..
..
..
.
..
..
.
.
.
.
.
.
.
.
.
.
.
.
. ..
.
.
. .
.
.
.
.
.
.
..
..
..
.
.
.
.
..
.
..
..
.
..
.
.
..
..
..
..
..
. .
..
.
..
..
.
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
..
.
..
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
..
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...
..
..
.. ..
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
.. ..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
..
..
..
.
.
..
.
..
..
.
.
..
...
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
..
..
..
..
..
..
.
.
..
.
..
..
.
..
.
.
.
.
..
.
.
..
..
.
.
. .
.
.
.
.
.
..
.
.
.
.
. ..
.
.
..
..
..
..
.
..
..
.
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
...
.
...
..
.
.
..
..
..
..
..
..
...
..
..
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
...
...
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
... . ..
..
.
.
..
.
....
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
..
.
.
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
...
.
.
..
..
..
.
.. .
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
..
..
.
...
..
..
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
. ..
.
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
.. ..
.
.
.. .
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
..
..
.
.
.
.
.
.
.
..
..
.
..
.
.
..
.
.
.
.
.
.
...
...
.
.
..
.
.
.
.. .
.
.
.
.
.
.
.
.
...
. ..
.
.
..
.
.
.
.
.
.
.
..
.
.
.
.
.
..
..
..
..
.
.
..
...
...
.
.
..
.
.
..
. .
..
..
.
.
..
.
...
.
.
..
..
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
....
.
..
..
...
.
.
.
.
.
..
.
.
..
..
.
.
..
..
..
.. ..
. .
..
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
...
.
..
.
.
..
.
.
..
.
.
.
.
.
...
.
.
..
..
..
..
..
..
..
.
.
.
.
..
..
.
.
..
.
.
.
.
.
..
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
..
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
. .
..
..
.
.
.
..
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
... .
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
... .
.
.
.
.
.
.
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
..
.
.
..
..
..
.. ..
.
.
..
.
.
. .
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
..
.
.
..
.
.
.
.
.
.
..
.
.
. .
.
.
.
.
.
. .
.
.
.
.
.
.
.
.
.
.
.
..
. .
..
.
.
.
.
.
.
.
..
..
.
.
.
.
.
.
.
...
...
..
.
.
..
.
.
.
.
.
. .
.
.
.
.
.
.
.
..
.
.
.
.
..
.
.
..
..
..
.
.
.
.
.
.
..
. .
..
..
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
..
..
..
..
..
.
.
..
.
.
..
..
..
. .
..
.
.
.
.
.
.
..
..
..
...
...
.
..
.
.
.
.
..
..
.
..
.
..
.
..
..
..
..
..
..
.
.
.
.
.
.
..
.
.
..
..
..
..
.
.
.
.
.
.
..
..
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. .
.
.
.
.
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
..
..
.
.
..
.
.
.
.
Figure 7.10: The plot in the first two principal components on the fuzzy cluster data. Note the structures in the data distribution. The dark datapoints are the correct jet-jet pairs, and the lighter datapoints are the incorrect jet-jet pairs.
deformed into a circle. This would render the Z jet-pair distribution spherical, making the FCM more efficient. But this would be a non-general solution, appropriate only for this particular data distribution.
7.3.2 Demographic clustering on the combinatorial problem
In the last section, we examined the use of a classifier based on the Euclidean distance that is capable of enhancing the sample but incapable of efficiently classifying it. Here, we apply the demographic clustering algorithm [37] to the combinatorial problem. In contrast to the FCM algorithm, demographic clustering measures the proximity of datapoints through a voting mechanism.
The voting mechanism introduces the non-Euclidean element into this algorithm. Each pair of datapoints receives one vote per variable axis. If the separation along an axis is large (see the distance parameter below), the pair receives a vote against; if not, the pair receives a vote for. This is repeated for each axis in the problem (six in the present case). The six votes are then tallied: if the "for" votes outnumber the "against" votes, the pair belongs to the same cluster; otherwise, the pair belongs to different clusters. For a detailed description of this procedure, see Appendix C.
Thus the voting mechanism makes the demographic clustering method non-Euclidean
as well as non-linear.
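The per-axis voting rule can be sketched as follows. This is a minimal illustration of the tallying described above, not the full algorithm of [37]: the class and method names, and the fixed per-axis thresholds, are assumptions of the sketch (the actual distance parameters are optimized internally, as discussed below).

```java
// Sketch of the pairwise voting rule used in demographic clustering.
// For each variable axis, the pair gets a vote "for" if the separation
// along that axis is within the distance parameter d for that axis,
// and a vote "against" otherwise. More "for" than "against" votes
// places the pair in the same cluster.
public class DemographicVote {

    /** Decide whether two datapoints belong to the same cluster. */
    public static boolean sameCluster(double[] a, double[] b, double[] d) {
        int votesFor = 0, votesAgainst = 0;
        for (int axis = 0; axis < a.length; axis++) {
            if (Math.abs(a[axis] - b[axis]) <= d[axis]) {
                votesFor++;          // small separation: vote for
            } else {
                votesAgainst++;      // large separation: vote against
            }
        }
        return votesFor > votesAgainst;
    }

    public static void main(String[] args) {
        double[] d = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};  // one threshold per axis
        double[] p = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
        double[] q = {0.5, 0.5, 0.5, 2.0, 2.0, 0.5};  // close on 4 of 6 axes
        System.out.println(sameCluster(p, q, d));     // true
    }
}
```

With six axes, as in the present problem, a pair that is close on at least four axes is placed in the same cluster.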
The separation between two datapoints is measured according to a distance parameter, d. This parameter is defined for each axis and is optimized by an internal algorithm by default, but it can also be specified externally.
The dataset is the same one used in the last section. The data are not standardized for this section: because the algorithm is based on voting counts rather than on a metric, standardizing the data affords no advantage.

Figure 7.11: Demographic Clustering: The purity as a function of the distance parameter on the θ12 variable.
θ12 parameter
The use of the demographic clustering algorithm is automatic: the algorithm internally optimizes all its parameters. Here we examine the optimization of the distance parameter d that the algorithm uses to assign voting scores along the different variable axes. An important variable in the classification of correct and incorrect jet pairs is the jet-jet opening angle θ12 (Section 2.3.1).
Figure 7.11 displays the result of varying the distance parameter dθ for θ12. Purity
is used to optimize the value of dθ because it is a measure of how well the cluster can
be identified with the signal. In this case we observe that the maximum in the purity
coincides with the maximum in the F -measure. Table 7.5 displays the classification
ability of the two clustering methods.
                        efficiency   purity   F-measure
Fuzzy c-means               0.99      0.39      0.56
Demographic Clustering      0.97      0.51      0.67
Table 7.5: Comparison between fuzzy c-means and demographic clustering.
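The F-measure in Table 7.5 is the harmonic mean of efficiency and purity, F = 2ep/(e + p). A quick check, with the class name invented for the illustration, reproduces the tabulated values after rounding to two decimals:

```java
// Harmonic mean of efficiency and purity, as used in Table 7.5.
public class FMeasure {

    public static double f(double efficiency, double purity) {
        return 2.0 * efficiency * purity / (efficiency + purity);
    }

    public static void main(String[] args) {
        System.out.printf("fuzzy c-means:          %.2f%n", f(0.99, 0.39)); // 0.56
        System.out.printf("demographic clustering: %.2f%n", f(0.97, 0.51)); // 0.67
    }
}
```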
7.3.3 Comparing clustering results
Figure 7.12 compares the partitions obtained from demographic clustering, from the θ12-cuts of Section 2.3.1, and from fuzzy clustering. As unsupervised classifiers, demographic clustering performs better than fuzzy clustering: its graph lies farther to the right and thus has a higher F-measure. Note, however, that the previous supervised method using an ensemble of neural networks performs much better than any of the unsupervised methods.
7.4 Conclusion
In this chapter, we have examined the use of Principal Component Analysis as a means to visualize data and conduct initial data exploration. PCA is helpful in high-dimensional problems because it selects the most significant coordinates in which to view the data. The signal is localized in the space defined by the first two principal components, which we exploited to identify signal using a density cutoff. The use of PCA at the quark level has shown that there are non-linear structures in the data distribution.
For automatic splitting of the data, fuzzy c-means clustering was used. The clustering method was able to enhance the signal in the dataset, but was unable to classify
119
0.3 0.4 0.5 0.6 0.7 0.8 0.9 1Purity
0.5
0.6
0.7
0.8
0.9
1
Eff
icie
ncy
θ-cutsDemographic ClusteringNN EnsembleFuzzy Cluster
Comparision Efficiency-Purity
Figure 7.12: Demographic Clustering: The efficiency-purity graph obtained by vary-ing the distance parameter on the θ12 variable. Comparison is made with fuzzyclustering, θ12-cut and an ensemble of neural networks.
it (that is, to produce a cluster with purity higher than 50%). This was attributed to the non-linear data distribution and the Euclidean distance measure used in the fuzzy clustering method. To overcome this shortfall, demographic clustering, based on a non-Euclidean distance parameter and a voting mechanism, was used, with improved results.
Chapter 8
Conclusion and Future work
8.1 The goal
The goal of this work is to examine multivariate methods in the study of high-energy physics data. High-energy physics data are manifestly multivariate, and we have analyzed them from a data mining point of view. Data mining techniques are well developed, but their application to specific problems is not trivial. With the LHC close to completion, and with research and development pushing the linear collider toward certainty as well, we are on the verge of a deluge of experimental data that could alter our understanding of the basic nature of our physical world. Data mining techniques, as we have demonstrated in this thesis, can provide tools for analyzing these data, with potentially useful insight into the physics.
We have chosen the context of a future linear collider to examine the use of data mining, since it is the context to which this approach is best suited. We restricted ourselves to kinematic quantities, since they are general and different phenomena are likely to have different kinematic signatures.
8.2 The Results
8.2.1 FastCal
Since the work was designed to meet the challenges of a future accelerator, whose experimental parameters are still under development, the study had to adjust to a changing environment. To address the dearth of data, a fast Monte Carlo simulator for the detector calorimeter, FastCal, was designed to provide voluminous data quickly. The central features of this Monte Carlo simulator are the parameterization of the hadronic shower and the averaging of calorimeter material properties. The simulator produces data about four orders of magnitude faster than the full simulation, yielding about 50 times the statistical power.
FastCal yields cluster-level information, with each particle producing at most one cluster in each of the two calorimeters (ECAL and HCAL). This level of granularity in the data is sufficient for jet-based studies, which do not require calorimeter cell-level information but do require very high statistics.
8.2.2 Neural Network
As classifiers, supervised neural networks are in general superior to the other methods considered. This is attributable to the non-linear sigmoidal activation function of the neural units and to the ability of neural networks to form complex class boundaries in variable space. In contrast, binary trees, which are also trained with a supervised learning algorithm, are limited to rectangular decision boundaries in variable space. Though this makes them less efficient classifiers, their decision boundaries are akin to the variable cuts of traditional physics data analysis and are therefore somewhat easier to understand.
The neural network models are difficult to understand and interpret in physics
terms. To understand neural models better, binary trees were used on neural network
results to approximate the complex decision boundaries with rectangular ones.
Though neural networks work efficiently, their optimum performance depends on many non-trivial specifications, for which we seek general solutions. Of immediate concern is the network architecture: too small a network cannot model the problem adequately, and too large a network introduces model artifacts and loses generalizing ability. These issues were addressed with a regularization scheme of weight decay, pruning based on a simple rule to minimize the network size, and an earlystopping mechanism to prevent overfitting.
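The regularization scheme can be sketched as follows: weight decay shrinks every weight toward zero at each gradient step, and a simple pruning rule then removes weights whose magnitude remains below a threshold. The decay rate, threshold, and names below are illustrative assumptions, not the CJNN internals.

```java
// Sketch of weight decay plus threshold pruning for network weights.
public class WeightDecay {

    /** One gradient step with weight decay: w <- w - eta * (grad + lambda * w). */
    public static double[] step(double[] w, double[] grad, double eta, double lambda) {
        double[] out = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            out[i] = w[i] - eta * (grad[i] + lambda * w[i]);
        }
        return out;
    }

    /** Simple pruning rule: zero out weights below a magnitude threshold. */
    public static double[] prune(double[] w, double threshold) {
        double[] out = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            out[i] = Math.abs(w[i]) < threshold ? 0.0 : w[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] w = {0.8, 0.01, -0.5};
        double[] g = {0.0, 0.0, 0.0};   // no error gradient: pure decay
        w = step(w, g, 0.1, 0.5);       // each weight shrinks by 5%
        w = prune(w, 0.05);             // the tiny weight is removed
        System.out.println(java.util.Arrays.toString(w));
    }
}
```

Decay keeps the surviving weights small, and pruning then discards connections that contribute little, reducing the network size.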
The size of the neural network further restricts the size of the training dataset on which it can optimally train. Too large or too small a training dataset results in poor performance.
An ensemble of neural networks was used to improve the classification performance. The individual networks are trained on separate training datasets with some overlap, and such ensembles perform better than individual networks. A crucial issue is the training of the individual networks: earlystopping decreases the efficiency of such ensembles, so it is important to continue training until a low error is obtained. It is also important to use a bagging technique when designing the datasets, so that some of the data are repeated across different networks and those features are reinforced.
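The bagging step can be sketched as drawing each network's training set by sampling with replacement from the full dataset, so that records overlap and recur across networks, with the ensemble output taken as the average of the member outputs. The sizes, seed, and names here are illustrative.

```java
import java.util.Random;

// Sketch of bagging: bootstrap training sets plus output averaging.
public class Bagging {

    /** Draw a bootstrap sample of indices (with replacement) for one network. */
    public static int[] bootstrapIndices(int datasetSize, int sampleSize, Random rng) {
        int[] idx = new int[sampleSize];
        for (int i = 0; i < sampleSize; i++) {
            idx[i] = rng.nextInt(datasetSize);  // repeats allowed: overlap across sets
        }
        return idx;
    }

    /** Ensemble output: average of the individual network outputs. */
    public static double ensembleOutput(double[] memberOutputs) {
        double sum = 0.0;
        for (double o : memberOutputs) sum += o;
        return sum / memberOutputs.length;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Three networks, each trained on its own bootstrap sample of 1000 records
        for (int net = 0; net < 3; net++) {
            int[] sample = bootstrapIndices(1000, 1000, rng);
            System.out.println("network " + net + ": first index " + sample[0]);
        }
        // Averaging the member outputs gives the ensemble classification score
        System.out.println(ensembleOutput(new double[]{0.9, 0.8, 0.7}));
    }
}
```

Because each bootstrap sample repeats some records while omitting others, the member networks see overlapping but distinct views of the data, which is what makes the averaged output more robust than any single network.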
CJNN
For the study of neural networks, a neural network package was designed and implemented. The package provides a general, configurable neural network that can be used for problems beyond those addressed in this thesis. It was specifically designed for the JAS environment, so that neural networks can be trained and used within JAS, and a trained network can form part of a larger analysis package. The package is sufficiently general for neural network solutions to problems beyond the jet-combinatorial problem addressed here.
8.2.3 Exploratory Data Mining
Data mining approaches depend critically on the data distributions. For problems with more than three dimensions, the data distributions are hard to visualize. Data exploration was done using Principal Component Analysis, a linear method that rotates the axes and orders them by variance content. The axes with the highest variance were used to visualize the data distribution. When performing PCA, it is important to center the data and scale it to unit variance along each original axis.
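The centering and scaling step can be sketched as: for each original axis (column), subtract the mean and divide by the standard deviation, so that every axis has zero mean and unit variance before the PCA rotation. The class name is for the sketch only.

```java
// Sketch of the standardization step applied before PCA.
public class Standardize {

    /** Center each column to zero mean and scale it to unit variance. */
    public static double[][] standardize(double[][] data) {
        int n = data.length, p = data[0].length;
        double[][] out = new double[n][p];
        for (int j = 0; j < p; j++) {
            double mean = 0.0;
            for (double[] row : data) mean += row[j];
            mean /= n;
            double var = 0.0;
            for (double[] row : data) var += (row[j] - mean) * (row[j] - mean);
            double sd = Math.sqrt(var / n);
            for (int i = 0; i < n; i++) {
                out[i][j] = (data[i][j] - mean) / sd;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two axes with very different scales end up comparable after scaling
        double[][] data = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
        System.out.println(java.util.Arrays.deepToString(standardize(data)));
    }
}
```

Without this step, an axis with a large numerical range would dominate the variance and hence the leading principal components, regardless of its physical relevance.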
The data distributions in the principal components generally provide clues on how the analysis can be done, with interesting features appearing as regions of high density. As an example, a cutoff in the first two principal components was applied to distinguish correct jet pairs from incorrect jet pairs. The performance of such a classifier approaches that of the neural networks.
Of particular significance are the shapes of the data distributions in principal-component space. The kinematic limits on the wrong combinations result in distributions that are curved and widely dispersed. These non-linear structures are not adequately addressed by the linear transformations of the principal components.
8.2.4 Clustering
Visualization of data distributions is generally limited to three dimensions. Though PCA orders the axes according to variance, and thus importance, visualization does not aid the detection of structures in higher dimensions. For problems in higher dimensions, other multivariate methods are required; cluster analysis methods are one class that can be applied. A wide variety of such methods is available, and their application is non-trivial.
Fuzzy c-means (FCM) clustering was applied to the combinatorial problem. The fuzzy membership measure of a datapoint with respect to a cluster can be interpreted as a probability of membership, akin to the Bayesian output of neural networks. For c = 2, the FCM isolated a high-purity (but low-efficiency) cluster of wrong-combination jets, and was thus able to enhance the signal (the correct jet pairs) in the other cluster. The FCM, based on the Euclidean distance, was unable to identify the correct jet pairs adequately.
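The fuzzy membership just mentioned follows the standard FCM formula: for a datapoint x, the membership in cluster i is u_i = 1 / Σ_k (‖x − c_i‖ / ‖x − c_k‖)^(2/(m−1)), which lies in [0, 1] and sums to one over the clusters — the property that permits the probability interpretation. The fuzzifier m = 2 and the cluster centers below are illustrative.

```java
// Sketch of the fuzzy c-means membership computation for one datapoint.
// Assumes x does not coincide exactly with a cluster center.
public class FuzzyMembership {

    static double dist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    /** FCM membership of x in each cluster, for fuzzifier m > 1. */
    public static double[] memberships(double[] x, double[][] centers, double m) {
        int c = centers.length;
        double[] u = new double[c];
        for (int i = 0; i < c; i++) {
            double di = dist(x, centers[i]);
            double sum = 0.0;
            for (int k = 0; k < c; k++) {
                sum += Math.pow(di / dist(x, centers[k]), 2.0 / (m - 1.0));
            }
            u[i] = 1.0 / sum;
        }
        return u;
    }

    public static void main(String[] args) {
        double[][] centers = {{0.0, 0.0}, {4.0, 0.0}};      // c = 2, as in the text
        double[] u = memberships(new double[]{1.0, 0.0}, centers, 2.0);
        System.out.println(u[0] + " " + u[1]);              // memberships sum to 1
    }
}
```

A point three times closer to one center than the other receives a membership of 0.9 in the nearer cluster, which is the graded, probability-like output referred to above.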
A possible correction for the non-spherical clusters is a non-standard norm for the distance metric. Though such norms can squeeze ellipsoidal distributions into spherical ones, making the data more amenable to Euclidean distance methods, they are non-general solutions. For instance, if all signal clusters have aligned ellipsoidal distributions (major axes parallel), a non-standard norm would improve detection; if the signal clusters are not aligned, then no single norm improves detection for all of them.
To improve the clustering ability, demographic clustering was used. This method, based on a voting mechanism, does not depend on the Euclidean distance, and on account of this property it performed better than FCM clustering: the correct jet-pair cluster had a purity higher than 50%, and the F-measure was 0.67.
Clustering methods, which are examples of unsupervised learning, are capable of distinguishing data distributions by their broad properties. This is in contrast to methods based on supervised learning, which can build models of the data that are finer in detail. As a result, cluster analyses do not yield results that match the performance of supervised learning algorithms such as neural networks.
8.3 Data mining and the future
The data mining process is iterative, analogous to the iterative processes that constitute the learning algorithms themselves, whether neural networks or cluster analysis. With each pass through the data, a better understanding of the problem is achieved, and the relevant domain knowledge is added to the next cycle of the data mining process, further improving the result.
The work in this thesis is a preliminary exploration toward a more systematic use
of data mining techniques in high-energy physics. One of the main results of this
thesis was the importance of the data distribution in kinematic variable space. The
Principal Component Analysis demonstrated that data distributions have structures.
For example, the data distribution of the wrong jet pairs in the three-dimensional principal component space is a curved hypersurface. Principal Component Analysis, which is a linear transformation of variables, cannot provide a transformation in which this hypersurface is flat rather than curved. Such a distribution would be better suited for separating the wrong jet pairs from the correct jet pairs.
To unfold curved hypersurfaces, non-linear transformations are required. Transformations like Isomap [84], which directly address the extraction of low-dimensional structures from high-dimensional data, could be used. Since Isomap uses geodesic distances, measured along curved surfaces, as opposed to the Euclidean distances used in PCA or MDS, the structures associated with the curved surfaces will be laid bare. With the curved surface well defined, signal distributions that occur away from these surfaces will stand out more easily.
Isomap outputs can be further supplemented with self-associative neural-network-based non-linear PCA (NPCA) [47]. Self-associative neural networks, trained on sufficient data, would yield a lower-dimensional, non-linearly transformed data sample.
8.4 Conclusion
The data mining approach to data analysis in high-energy physics offers an opportunity for experimentalists to address the issues of large datasets and the search for novel signals in the data. An iterative technique, this approach is data-centric and particularly suited to data analysis in an experimental realm, especially in the experimental environments of the Large Hadron Collider and the linear colliders of the future.
Appendix A
Using FastCal in JAS
A.1 Introduction
FastCal is a general, fast, parameterized Monte Carlo simulator for the linear collider detector, written in Java. It is designed to work in the JAS environment with the hep.lcd suite of packages, and to coexist and cooperate with other simulation packages. It works directly on the final-state particles in the particle tables of hep.lcd.event.LCDEvent objects. The preferred inputs are StdHep files generated by Pandora-Pythia.
The main class FastCal extends the class AbstractProcessor in the lcd package. The method process(LCDEvent event) processes every event and adds two ClusterList objects, containing the clusters in the ECAL and HCAL, to the LCDEvent data.
FastCal is available as a standalone Java archive (.jar) file, and also as a part of
the hep.lcd package under hep.lcd.mc.fast.cluster.saurav.
A.2 Use of the FastCal package
The package can be used standalone. In that case, the .jar file should be in the extensions folder in JAS version 3 (or in the classpath in JAS version 2). The package classes are imported with the following code:
import fastcal.*;
If FastCal is used from the hep.lcd package then no special installation is required,
and the classes can be imported via the following code:
import hep.lcd.mc.fast.cluster.saurav.*;
The FastCal simulation is added to the analysis code by specifying a detector and
adding a FastCal object as follows:
Detector.setCurrentDetector(new Detector("ldmar01"));
add(new FastCal(true));
add(new UserAnalysis());
The first line sets the detector type for use (in this case, detector ldmar01), and
the second line includes a FastCal object. The constructor has an input parameter of
type boolean. This parameter, when set to true, flips the direction of the magnetic
field in the detector. If it is set to false, the magnetic field is not flipped. This
parameter can be used to compare the FastCal results with those from GISMO (set it
to true). The third line adds the user analysis object UserAnalysis which contains
the user’s analysis code.
In the process() method in user’s analysis code, one can retrieve the two ClusterList
objects in the following way:
ClusterList hadCL = (ClusterList) event.get("HADFCalClusterList");
ClusterList emCL = (ClusterList) event.get("EMFCalClusterList");
These objects can now be used like any object with a ClusterList interface. The Cluster objects can be used in conjunction with other objects in the event data, such as tracks, or on their own.
Appendix B
CJNN – Neural Network GUI
package for JAS
CJNN is a neural network package that is designed to work in the JAS environment,
in both the JAS2 and JAS3 versions. The goals for CJNN were:
• A user-friendly graphical user interface for network training.
• An ability to use the trained network from within JAS in analysis codes written
in Java.
• An ability to include a trained network in a Java-archived analysis package.
CJNN can also be used as a standalone neural network program.
B.1 A brief description of CJNN
CJNN is a fully connected, multi-layered neural network package in Java. The number of layers and the number of neural units in each layer can be varied according to user needs. The neural units are based on the sigmoidal (logistic) function, with a bias term. Learning is via standard backpropagation on a standard error term, with a momentum term to speed it up, and the number of learning iterations is limited via earlystopping [78].
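The momentum term mentioned above adds a fraction α of the previous weight update to the plain gradient step, Δw(t) = −η ∂E/∂w + α Δw(t−1). A minimal sketch, with η and α values chosen only for illustration:

```java
// Sketch of a backpropagation weight update with a momentum term.
public class MomentumUpdate {

    /**
     * One weight update with momentum. Returns the new delta, which the
     * caller carries to the next iteration; the weight moves by that delta.
     */
    public static double delta(double grad, double prevDelta, double eta, double alpha) {
        return -eta * grad + alpha * prevDelta;
    }

    public static void main(String[] args) {
        double w = 0.5, prev = 0.0;
        double eta = 0.1, alpha = 0.9;
        // Repeated steps along a constant gradient accelerate: the momentum
        // term accumulates the contribution of the previous updates.
        for (int t = 0; t < 3; t++) {
            prev = delta(1.0, prev, eta, alpha);
            w += prev;
            System.out.println("step " + t + ": delta = " + prev);
        }
    }
}
```

Along a consistently downhill direction the step size grows geometrically toward −η/(1 − α) times the gradient, which is the speed-up the momentum term provides.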
B.2 Use of CJNN
The CJNN package is available as a .jar file that should be saved in the JAS3 extensions directory. The CJNN plugin adds a menu item to the toolbar in JAS; selecting this menu item launches the CJNN graphical user interface.
B.3 Data
CJNN requires two independent datasets—one for training and the other for validation. Training data are read from flat files, with data arranged in columns. For a training dataset that consists of p input variables, r output variables, and n records, the data are read from files with p + r columns and n rows: the first p columns are the input variables and the rest are the output variables. The validation data are read from flat files in the same format.
The training interface is graphics-based, and can be run from within the JAS
environment. Figures below illustrate the GUI.
B.4 Training interface
Figure B.1 shows the CJNN SetUp Network panel that lets the user choose the
working directory and the network architecture. The network architecture is encoded
in the form i-h1-h2-o, where i is the number of inputs, h1 and h2 are the number of
hidden units in the two hidden layers and o is the number of output units. There
Figure B.1: The network setup panel. The working directory sets up the folder in which the data files exist and to which all CJNN-created files are copied. Network configuration configures the basic network architecture. The neural network can alternatively be read from an already existing configuration file. Further, the configuration of a network can be written to a file for later use.
can be more hidden layers, and all units will be fully connected with all the units in
adjoining layers.
Figure B.2 shows the CJNN Training Params panel, which lets the user set η, the learning rate, and α, the momentum rate; default values are provided. The training and validation datasets are flat files selected from the working directory. Rescaling the training data is optional; it standardizes the input variables to zero mean and unit variance.
Figure B.3 displays the Train panel. It gives the user the option to initialize the
network. The network is initialized by assigning uniformly distributed random values
to the weights. The assigned values are between the specified limits. For efficient
network training, the random numbers should be small.
Figure B.2: The second panel in the GUI sets up the training parameters. eta is the
learning coefficient and alpha is the momentum coefficient. Rescaling the data
transforms variables such that they have zero mean and unit variance along columns.
The training parameters can be written to and read from a file.
Figure B.3: The third panel, under Weights Initialization, sets the weights in the
network. This can be done either from a file, or the weights can be randomized.
Once the weights are set, the iterative training can begin.
The network can also be initialized from a saved network. This option can be
used to retrain partially trained and previously saved networks (weights.last). The
network can be trained for a fixed number of cycles (epochs).
B.5 Error Display
CJNN uses the JAIDA graphics packages, and during training a real-time graph of
the training and validation errors is displayed. Early stopping is implemented by
saving the network configuration with the minimum error on the validation dataset
in a file weights.min.
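The early-stopping bookkeeping can be sketched as follows: track the validation error per epoch and remember the epoch at which it is smallest, which is the point where CJNN writes weights.min. The class and method names are illustrative only.

```java
// Minimal sketch of the early-stopping bookkeeping described
// above: keep the weights from the epoch with the lowest
// validation error. Names are illustrative, not CJNN's API.
public class EarlyStoppingSketch {

    public static int bestEpoch(double[] validationError) {
        int best = 0;
        for (int epoch = 1; epoch < validationError.length; epoch++) {
            if (validationError[epoch] < validationError[best]) {
                best = epoch;
                // In CJNN, this is where weights.min would be written.
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Validation error typically falls, then rises again as the
        // network starts to overfit the training data.
        double[] valErr = {0.90, 0.55, 0.40, 0.38, 0.41, 0.47};
        System.out.println("minimum at epoch " + bestEpoch(valErr));
    }
}
```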
B.6 Application of CJNN in code
The use of a trained CJNN neural network is illustrated by the following example code:
import cjnn.backprop.*;

public class TestCJNN {
    NetworkBasic net;

    TestCJNN() {
        // Create the network
        net = new NetworkBasic();
        // Set the working folder
        net.setWorkingFolder("/home/saurav/cjnn-test");
        // Configure the network from the default file.
        // The file is called <network.net>, and should have been
        // saved at the time of training.
        net.configureNetwork();
        // Read in the weights from the file.
        net.readWeights("weights.min");
        // Read in the normalization parameters. The parameters
        // are read from a file called <data.standard>.
        net.readStandardFile();
    }

    public void analysisCode() {
        double[] in = {1.588442e-04, 0, 1, 1.588442e-04};
        double[] out = net.ffOutStandard(in);
        for (int i = 0; i < out.length; i++)
            System.out.println(out[i]);
    }

    public static void main(String[] args) {
        TestCJNN cjnnTest = new TestCJNN();
        cjnnTest.analysisCode();
    }
}
Appendix C
Demographic Clustering – an
example
The demographic clustering method [37] partitions data with an algorithm that
maximizes the similarity between records within a cluster while minimizing the
similarity between records in different clusters. The criterion it optimizes is
based on the Condorcet solution from 1785.
C.1 Condorcet’s Criterion
Marquis de Condorcet (1743–1794) was a French philosopher, mathematician and
political scientist who devised a way to choose a candidate in a preferential voting
system. It is based on a ranking of candidates, but the winner is chosen not by
counting the highest preferential votes, but by a pairwise preference count against all
other candidates. That is, if more voters prefer candidate A over B, then A is the
winner against B. This rule is used to sort the list of candidates and find the ultimate
winner.
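Condorcet's pairwise rule can be sketched in a few lines; the ballot representation (candidate indices listed in order of preference) is an assumption for illustration.

```java
// Sketch of Condorcet's pairwise rule: candidate A beats B if
// more ballots rank A ahead of B. Each ballot lists candidate
// indices in order of preference; the setup is illustrative.
public class CondorcetVote {

    // Count ballots that rank candidate a ahead of candidate b.
    public static int prefers(int[][] ballots, int a, int b) {
        int count = 0;
        for (int[] ballot : ballots) {
            for (int c : ballot) {
                if (c == a) { count++; break; }
                if (c == b) break;
            }
        }
        return count;
    }

    // A Condorcet winner beats every other candidate pairwise.
    public static int winner(int[][] ballots, int nCandidates) {
        for (int a = 0; a < nCandidates; a++) {
            boolean beatsAll = true;
            for (int b = 0; b < nCandidates; b++)
                if (a != b && prefers(ballots, a, b) <= prefers(ballots, b, a))
                    beatsAll = false;
            if (beatsAll) return a;
        }
        return -1; // no Condorcet winner (preferences form a cycle)
    }

    public static void main(String[] args) {
        // Three voters ranking candidates 0, 1 and 2.
        int[][] ballots = {{1, 0, 2}, {1, 2, 0}, {0, 1, 2}};
        System.out.println("winner: candidate " + winner(ballots, 3));
    }
}
```

Note that a Condorcet winner need not exist: pairwise preferences can form a cycle, in which case the sketch returns -1.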
Inspired by this rule, Michaud [60] proposed the New Condorcet Criterion. This
rule is based on a "vote" dij between two records i and j: for each variable in which
the two records differ by more than a distance threshold, the vote increases by 1.
Therefore, dij is the count of the number of variables in which the two records differ.
If there are p variables, then p − dij is the count of variables in which the two records
agree.
For any partition P (the division of the dataset into clusters), the goodness crite-
rion is measured as:
G(P) = \sum_{k=1}^{N} \sum_{i \in C_k} \Bigl[ \sum_{j \in C_k,\, j \neq i} (p - d_{ij}) + \sum_{j \notin C_k} d_{ij} \Bigr], \qquad (C.1)

where N is the number of clusters and C_k is the k-th cluster.
The demographic clustering method is an implementation of this criterion: it seeks
the particular partition that maximizes G(P).
C.2 Demographic Clustering example
The algorithm used in the Intelligent Miner application can be understood from the
following example.
Let us consider three records, with three fields each
1 5.6 4.5 6.5
2 5.0 6.2 6.3
3 8.9 6.0 9.3
For each field we define a distance parameter d, which we fix at 1.0 for all three
fields. We consider the records pairwise, comparing the values of each field between
the two records. If the values in a particular field fall within d of each other, the pair
receives a score of 1 for; if not, it receives a score of 1 against. Since there are three
records, we may write down the scores of the three pairs of records:
For Against
1,2 2 1
2,3 1 2
1,3 0 3
We consider all the possible combinations of clusters and score them. The scoring
is done in the following way. If two records are kept in the same cluster, then their
scores, for and against, are the same as given in the table above. If they are in
different clusters, the scores get reversed. For example, consider the combination
(1,2)(3); where records 1 and 2 belong to one cluster and record 3 belongs to a
different cluster.
(1,2)(3) For Against
1,2 2 1
2,3 2 1
1,3 3 0
Total 7 2
Therefore, this combination receives a total score of 7/9. We may calculate the
scores for all the other combinations ((1,3)(2), (1)(2)(3), etc.) and pick the
combination with the highest score. The combination given above has the highest
score.
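The scoring in the worked example above can be reproduced with a short sketch. The class and method names are illustrative; the records and the threshold d = 1.0 are those of the example.

```java
// Sketch of the scoring in the worked example: pairs in the same
// cluster keep their (for, against) counts, pairs in different
// clusters have them reversed.
public class DemographicScore {

    static final double D = 1.0;          // per-field distance threshold
    static final double[][] RECORDS = {
        {5.6, 4.5, 6.5}, {5.0, 6.2, 6.3}, {8.9, 6.0, 9.3}
    };

    // Number of fields in which two records agree to within D.
    public static int votesFor(double[] a, double[] b) {
        int f = 0;
        for (int k = 0; k < a.length; k++)
            if (Math.abs(a[k] - b[k]) <= D) f++;
        return f;
    }

    // Total "for" votes of a partition, out of 9 possible.
    // cluster[i] is the cluster label assigned to record i.
    public static int score(int[] cluster) {
        int total = 0;
        for (int i = 0; i < RECORDS.length; i++) {
            for (int j = i + 1; j < RECORDS.length; j++) {
                int f = votesFor(RECORDS[i], RECORDS[j]);
                int p = RECORDS[i].length;
                // Same cluster: count agreements; different: disagreements.
                total += (cluster[i] == cluster[j]) ? f : p - f;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("(1,2)(3):  " + score(new int[]{0, 0, 1}) + "/9");
        System.out.println("(1,3)(2):  " + score(new int[]{0, 1, 0}) + "/9");
        System.out.println("(1)(2)(3): " + score(new int[]{0, 1, 2}) + "/9");
    }
}
```

Running this reproduces the 7/9 score for (1,2)(3), which beats the other partitions.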
If we add a fourth record, a similar table to the first one can be made; we will
thus have six pairs. With the first three records already partitioned as (1,2)(3),
there are three combinations to choose from for the fourth record: (1,2,4)(3),
(1,2)(3,4) and (1,2)(3)(4). As we go down the dataset and test each subsequent
record in this way, there is always a possibility that a new cluster is created.
Appendix D
Binary Trees
The binary tree is a supervised data mining algorithm. Given a training dataset,
the algorithm trains by recursively dividing the data into two parts until each part
consists of members of the same class. Because of this binary, recursive splitting
of the data, methods that employ this approach are called binary trees.
The particular implementation of the binary tree used in this thesis is called
SPRINT, which is a part of the tool set in Intelligent Miner from IBM.
D.1 SPRINT
The SPRINT algorithm is optimized for speed, large datasets and categorical as well
as continuous data. The exact implementation for this will not be discussed here (for
details, see [79]).
At each node of the binary tree, the algorithm seeks to divide the data into two
parts that separate the classes more cleanly. That is, if there are two classes on which
the algorithm is training, it seeks to divide the data so that each part contains data
from only one of the classes. It does this by imposing a cut on one of the variables
in the data.
Therefore, at each node the algorithm has to make two choices: first, which variable
to cut on; and second, where to place the cut on this variable.
D.1.1 Split Points
The split points are chosen with the help of the gini index. If S denotes the entire
dataset at the given node and there are n classes, then the gini index is given by
g(S) = 1 - \sum_{j=1}^{n} p_j^2, \qquad (D.1)

where p_j is the relative frequency of class j in S.
All possible splits are considered, and for each split the split gini index g_s(S) is
evaluated:

g_s(S) = \frac{n_1}{n}\, g(S_1) + \frac{n_2}{n}\, g(S_2), \qquad (D.2)

where S_1 and S_2 are the two parts, containing n_1 and n_2 records respectively,
out of n records in total. The split with the lowest g_s(S) is chosen.
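Equations D.1 and D.2 can be illustrated with a small sketch on a one-dimensional toy dataset; this is not the SPRINT implementation, only the gini bookkeeping it uses.

```java
// Sketch of the split-point search described above: for each
// candidate cut on a single continuous variable, compute the
// weighted gini index of the two parts and keep the lowest.
public class GiniSplit {

    // g(S) = 1 - sum_j p_j^2 over the class frequencies in labels.
    public static double gini(int[] labels, int nClasses) {
        if (labels.length == 0) return 0.0;
        int[] counts = new int[nClasses];
        for (int l : labels) counts[l]++;
        double g = 1.0;
        for (int c : counts) {
            double p = (double) c / labels.length;
            g -= p * p;
        }
        return g;
    }

    // Split gini g_s(S) = (n1/n) g(S1) + (n2/n) g(S2), with
    // values <= cut going into S1 and the rest into S2.
    public static double splitGini(double[] x, int[] y, double cut) {
        java.util.List<Integer> left = new java.util.ArrayList<>();
        java.util.List<Integer> right = new java.util.ArrayList<>();
        for (int i = 0; i < x.length; i++)
            (x[i] <= cut ? left : right).add(y[i]);
        int[] y1 = left.stream().mapToInt(Integer::intValue).toArray();
        int[] y2 = right.stream().mapToInt(Integer::intValue).toArray();
        double n = x.length;
        return (y1.length / n) * gini(y1, 2) + (y2.length / n) * gini(y2, 2);
    }

    public static void main(String[] args) {
        // A perfectly separable toy dataset: class 0 below 2.5.
        double[] x = {1.0, 2.0, 3.0, 4.0};
        int[] y = {0, 0, 1, 1};
        System.out.println("cut at 2.5: g_s = " + splitGini(x, y, 2.5));
        System.out.println("cut at 1.5: g_s = " + splitGini(x, y, 1.5));
    }
}
```

A cut that separates the classes perfectly drives g_s(S) to zero, which is why the lowest split gini is chosen.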
Appendix E
Fuzzy Clustering
The fuzzy clustering method is a k-means method of clustering: "k-means" denotes
the requirement that the number of clusters into which the data will be partitioned
must be given as an input. It is called fuzzy because the members are not given a
discrete (0,1) membership in a cluster but a probability of membership. The
membership probability is based on the inverse of the Euclidean distance from the
cluster centroids. This method is also called fuzzy c-means (FCM) [24, 14].
E.1 Algorithm
The algorithm is initiated with a randomized selection of cluster centroids. The
membership probability pij is calculated for each record by
p_{ij} = \frac{d_{ij}^{-1}}{\sum_k d_{kj}^{-1}}, \qquad (E.1)

where p_{ij} is the probability that record j belongs to the i-th cluster. The summation
in the denominator over all clusters normalizes the probability. The distance d_{ij} is
the Euclidean distance between the record data point and the cluster centroid.
The new centroid positions are found by the weighted average of the data points
\vec{c}_i^{\,\mathrm{new}} = \frac{\sum_j p_{ij}\, \vec{r}_j}{N}, \qquad (E.2)

where \vec{c}_i^{\,\mathrm{new}} is the new position vector of the i-th cluster centroid and \vec{r}_j is the
position vector of the j-th record; N is the total number of records in the dataset.
This repositioning of the cluster centroids is done iteratively until the change in
the position of every cluster falls below a tolerance level.
This algorithm implicitly minimizes the following objective function:

J = \sum_i \sum_j p_{ij}\, d_{ij}. \qquad (E.3)
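One update step of Eqs. E.1 and E.2 can be sketched in one dimension. The class is illustrative, and a small epsilon is added to the distances to guard against division by zero when a record sits exactly on a centroid.

```java
// One pass of the fuzzy c-means update described above:
// membership probabilities from inverse distances (Eq. E.1) and
// centroid repositioning by the weighted average over all N
// records (Eq. E.2). One-dimensional toy data for illustration.
public class FuzzyStep {

    // Eq. E.1: p_ij = d_ij^{-1} / sum_k d_kj^{-1}.
    public static double[][] memberships(double[] data, double[] centroids) {
        double[][] p = new double[centroids.length][data.length];
        for (int j = 0; j < data.length; j++) {
            double norm = 0.0;
            for (int i = 0; i < centroids.length; i++) {
                // Epsilon avoids 1/0 when a record equals a centroid.
                double d = Math.abs(data[j] - centroids[i]) + 1e-12;
                p[i][j] = 1.0 / d;
                norm += p[i][j];
            }
            for (int i = 0; i < centroids.length; i++) p[i][j] /= norm;
        }
        return p;
    }

    // Eq. E.2: c_i = sum_j p_ij r_j / N.
    public static double[] newCentroids(double[] data, double[][] p) {
        double[] c = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            for (int j = 0; j < data.length; j++) c[i] += p[i][j] * data[j];
            c[i] /= data.length;
        }
        return c;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 8.0, 9.0};
        double[][] p = memberships(data, new double[]{2.0, 8.0});
        double[] c = newCentroids(data, p);
        // Memberships for each record sum to 1 across clusters.
        System.out.println(p[0][0] + p[1][0]);
        System.out.println(c[0] + " " + c[1]);
    }
}
```

In a full run, these two steps would be repeated until the centroid shifts fall below the tolerance, as the text describes.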
Appendix F
Principal Component Analysis and
Multidimensional Scaling
F.1 Classical Multidimensional Scaling
The classical multidimensional scaling (MDS) technique transforms a distance
matrix into a two-dimensional space such that the relative distances between
records are preserved. It is equivalent in result to principal component analysis.
F.2 MDS Algorithm
Consider the configuration matrix, an n × p matrix that constitutes the entire
dataset: n is the number of records and p is the number of variates in each record.
Since the variates have different units and ranges, Euclidean distances between
record pairs are not directly meaningful. Therefore the variates of
the configuration matrix are normalized:
x' = \frac{x - \mu}{\sigma}. \qquad (F.1)
Here µ is the mean and σ is the standard deviation of the variate.
From the configuration matrix, the distance-squared matrix D2 is obtained. This
is an n × n matrix, with each element representing the Euclidean distance squared
between record pairs. This is a symmetric matrix. D2 is centered using the following
prescription:
B = -\frac{1}{2}\Bigl(I - \frac{1}{n}U\Bigr) D^2 \Bigl(I - \frac{1}{n}U\Bigr). \qquad (F.2)
Here I is the identity matrix and U is a matrix with all unit elements.
A singular value decomposition is performed. Since B is a symmetric and centered
matrix, the decomposition is of the form:
B = U W U', \qquad (F.3)

where W is a diagonal matrix and U' is the transpose of U.
From the decomposition, we may reconstruct the new configuration matrix Y as
follows:
Y = U W^{1/2}. \qquad (F.4)
Here Y is an n × n matrix, but we may keep just the first two columns and neglect
the rest, since they correspond to the most important of the variates and adequately
preserve the D^2 matrix. Thus we now have an n × 2 matrix, called Y_2, which
replaces our original configuration matrix X.
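As a sketch, the double-centering step can be computed directly from the row means, column means and grand mean of D², without forming the centering matrices explicitly; the SVD step is omitted, since a linear-algebra library would normally supply it. The class name is illustrative.

```java
// Sketch of the centering step of classical MDS described above:
// given the squared-distance matrix D2, compute the centered
// matrix B using row means, column means, and the grand mean,
// which is algebraically the same as the double-centering product.
public class MdsCentering {

    public static double[][] center(double[][] d2) {
        int n = d2.length;
        double[] rowMean = new double[n], colMean = new double[n];
        double grand = 0.0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                rowMean[i] += d2[i][j] / n;
                colMean[j] += d2[i][j] / n;
                grand += d2[i][j] / (n * n);
            }
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                b[i][j] = -0.5 * (d2[i][j] - rowMean[i] - colMean[j] + grand);
        return b;
    }

    public static void main(String[] args) {
        // Squared Euclidean distances between three points on a
        // line at positions 0, 1 and 3.
        double[][] d2 = {{0, 1, 9}, {1, 0, 4}, {9, 4, 0}};
        double[][] b = center(d2);
        // Each row of the centered matrix sums to zero.
        System.out.println(b[0][0] + b[0][1] + b[0][2]);
    }
}
```

For one-dimensional points this B equals the outer product of the mean-centered coordinates, so its leading singular vector recovers the original configuration up to sign.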
Glossary
C
CDF Collider Detector at Fermilab.
CJNN A neural network package written in C++ and Java.
CP Charge-Parity.
E
ECAL Electromagnetic Calorimeter.
EM Electromagnetic.
F
FastMC A fast Monte Carlo simulator.
G
GISMO A Package for Particle Transport and Detector Simulation.
H
HCAL Hadronic Calorimeter.
HEP High-energy Physics.
I
IBM International Business Machines.
IM Intelligent Miner, a suite of data mining tools from IBM.
J
JAS Java Analysis Studio.
K
KDD Knowledge Discovery in Databases.
L
LC Linear Collider.
LCD Linear Collider Detector.
LCDROOT A suite of tools for LCD analysis in the ROOT framework.
LEP Large Electron-Positron Collider.
LHC Large Hadron Collider.
M
MDS Multidimensional Scaling.
MLP Multilayered Perceptron.
MP McCulloch-Pitts, a kind of artificial neuron.
MSSM Minimal Supersymmetric Model.
N
NLC Next Linear Collider.
NN Neural Network.
NPCA Non-linear Principal Component Analysis.
NSCP National Scalable Cluster Project.
P
PCA Principal Component Analysis.
R
RMS Root Mean Square.
ROOT An Object Oriented Data Analysis Framework.
S
SKICAT Sky Image Cataloging and Analysis Tool.
SLIQ Supervised Learning In Quest, where Quest is the Data Mining project at
the IBM Almaden Research Center.
SM Standard Model.
SPRINT Scalable Parallelizable Induction of Decision Tree.
X
XML eXtensible Markup Language.
Bibliography
[1] T. Abe et al. Detectors for the linear collider (chapter 15). [3]. Resource book
for Snowmass 2001, 30 Jun - 21 Jul 2001, Snowmass, Colorado.
[2] T. Abe et al. Higgs bosons at the linear collider (chapter 3). [3]. Resource book
for Snowmass 2001, 30 Jun - 21 Jul 2001, Snowmass, Colorado.
[3] T. Abe et al. Linear Collider Physics Resource Book for Snowmass 2001. Stan-
ford Linear Accelerator Center, Stanford, California, June 2001. Resource book
for Snowmass 2001, 30 Jun - 21 Jul 2001, Snowmass, Colorado.
[4] Guozhong An. The effects of adding noise during backpropagation training on a
generalization performance. Neural Computation, 8:643–674, 1996.
[5] J. A. Anderson and E. Rosenfeld, editors. Neurocomputing. MIT Press, 1988.
[6] G. Arnison et al. Experimental observation of isolated large transverse energy
electrons with associated missing energy at √s = 540 GeV. Phys. Lett.,
B122:103–116, 1983.
[7] W. B. Atwood, T. Burnett, R. Cailliau, D. R. Myers, and K. M. Storr. Gismo:
An object oriented program for high-energy physics event simulation and recon-
struction. Int. J. Mod. Phys., C3:459–478, 1992.
[8] P. Avery et al. MCFast: A fast simulation package for detector design studies. In
Proceedings of Computing in High-energy Physics (CHEP 97), Berlin, Germany,
7-11 Apr 1997, 1997.
[9] M. Banner et al. Observation of single isolated electrons of high transverse mo-
mentum in events with missing transverse energy at the CERN anti-p p collider.
Phys. Lett., B122:476–485, 1983.
[10] R. Barate et al. Search for the Standard Model Higgs boson at LEP. Physics
Letters B, 565:61–75, 2003.
[11] W. Bartel et al. JADE collaboration. Z. Phys., C33:23, 1986.
[12] R. Bellman. Adaptive Control Processes: A Guided Tour. Princeton University
Press, New Jersey, 1961.
[13] S. Bethke, M. Cavetti, H. F. Hoffmann, D. Jacobs, M. Kasemann, and D. Linglin.
Report of the steering group of the LHC computing review. Technical Report
CERN/LHCC/2001-004, CERN, 2001.
[14] J. C. Bezdek. Fuzzy Mathematics in Pattern Recognition. PhD thesis, Applied
Math. Center, Cornell University, 1973.
[15] H. Bichsel, D. E. Groom, and S. R. Klein. Passage of particles through matter.
Phys. Rev., D66:010001–195–010001–206, 2002.
[16] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press,
New York, 1995.
[17] R. K. Bock, T. Hansl-Kozanecka, and T. P. Shah. Parametrization of the longitu-
dinal development of hadronic showers in sampling calorimeters. NIM, 186:533–
539, 1981.
[18] Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
[19] T. H. Burnett, W. B. Atwood, R. Cailliau, D. Myers, and K. M. Storr. The
GISMO project: Application of object oriented techniques to detector simu-
lation. In Proceedings, Data structures for particle physics experiments, pages
125–130, 1990.
[20] Rich Caruana, Steve Lawrence, and C. Lee Giles. Overfitting in neural networks:
Backpropagation, conjugate gradient, and early stopping. In Advances in Neural
Information Processing Systems, Denver, Colorado, 2000.
[21] D. L. Chester. Why two hidden layers are better than one. In Lawrence Erlbaum,
editor, Proceedings of the International Joint Conference on Neural Networks,
pages 265–268, 1990.
[22] Trevor Cox and Michael Cox. Multidimensional Scaling. Chapman & Hall,
London, U.K, 1994.
[23] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines.
Cambridge University Press, 2000.
[24] Joseph C. Dunn. A fuzzy relative of ISODATA process and its use in detecting
compact, well separated clusters. Journ. Cybern., 3:95–104, 1973.
[25] J. L. Elman. Finding structure in time. Cognitive Science, 14:179–211, 1990.
[26] C. W. Fabjan. Calorimetry in high-energy physics. NATO Adv. Study Inst. Ser.
B Phys., 128:281, 1985.
[27] U. Fayyad, D. Haussler, and P. Stolorz. KDD for science data analysis: issues and
examples. In Proceedings of the Second International Conference on Knowledge
Discovery and Data Mining, Portland, Oregon, 1996. AAAI Press.
[28] U. M. Fayyad, N. Weir, and S. Djorgovski. SKICAT: A machine learning system
for automated cataloging of large scale sky surveys. In Proceedings of the Tenth
International Conference on Machine Learning, pages 112–119. ICML, 1993.
[29] R. A. Fisher. The Design of Experiments. Oliver and Boyd, London, 1925.
[30] R. A. Fisher. Statistical Methods and Scientific Inference. Oliver and Boyd,
London, 2 edition, 1956.
[31] W. Frawley, G. Piatetsky-Shapiro, and C. Matheus. Knowledge discovery in
databases: An overview. AI Magazine, pages 213–228, Fall 1992.
[32] J. Freeman and A. Beretvas. A short review of the CDF electromagnetic and
hadronic shower simulation. In Proceedings of the 1986 Summer Study on the
Physics of the Superconducting Supercollider, pages 482–486, New York, NY,
1986. American Physical Society.
[33] K. Funahashi. On the approximate realization of continuous mappings by neural
networks. Neural Networks, 2:183–192, 1989.
[34] Lynn Garren. StdHep 5.01: Monte Carlo Standardization at FNAL, 2002.
[35] S. Geman, E. Bienenstock, and R. Doursat. Neural networks and the
bias/variance dilemma. Neural Computation, 4:1–58, 1992.
[36] S. L. Glashow. Partial symmetries of weak interactions. Nucl. Phys., 22:579–588,
1961.
[37] Johannes Grabmeier and Andreas Rudolf. Techniques of cluster algorithms in
data mining. Data Mining and Knowledge Discovery, 6:303–360, 2002.
[38] D. E. Groom. Atomic and nuclear properties of materials (rev.). Phys. Rev.,
D66:010001, 2002.
[39] David J. Hand. Data mining: Statistics and more? The American Statistician,
52(2), 1998.
[40] D. O. Hebb. The Organization of Behavior, chapter 4, pages 60–78. Wiley, New
York, 1949.
[41] Frank Hoppner, Frank Klawonn, Rudolf Kruse, and Thomas Runkler. Fuzzy
Cluster Analysis. John Wiley & Sons, New York, 1999.
[42] K. Hornik, M. Stinchcombe, and H. White. Multilayer feedforward networks are
universal approximators. Neural Networks, 2:359–366, 1989.
[43] Masako Iwasaki and Toshinori Abe. LCD ROOT simulation and analysis tools.
2001. hep-ex/0102015.
[44] William James. Psychology (Briefer Course), chapter XVI, pages 253–279. Holt,
New York, 1890.
[45] T. Kohonen. Self-organized formation of topologically correct feature maps.
Biological Cybernetics, 43:59–69, 1982. Reprinted in [5].
[46] T. Kohonen. Self-Organization and Associative Memory. Springer-Verlag, Berlin,
third edition, 1989.
[47] M. A. Kramer. Nonlinear principal component analysis using autoassociative
neural networks. AIChe Journal, 37:233–243, 1991.
[48] Anders Krogh and Jesper Vedelsby. Neural network ensembles, cross validation
and active learning, volume 7 of Advances in Neural Information Processing
Systems. MIT, Cambridge, Massachussets, 1995.
[49] Ray Kurzweil. The law of accelerating returns. 2000.
http://www.kurzweilai.net/articles/art0134.html.
[50] A. Lapedes and R. Farber. How neural nets work. American Institute of Physics,
New York, 1988.
[51] Yann LeCun, Leon Bottou, Geneviere B. Orr, and Klaus-Robert Muller. Effi-
cient backprop. In Geneviere B. Orr and Klaus-Robert Muller, editors, Neural
Networks: Tricks of the Trade. 1998.
[52] David D. Lewis. Evaluating and optimizing autonomous text classification sys-
tems. 1995.
[53] L. Lonnblad, C. Peterson, Hong Pi, and T. Rognvaldsson. Self-organizing net-
works for extracting jet features. Computer Physics Communications, 67:193,
1991.
[54] L. Lonnblad, C. Peterson, and T. Rognvaldsson. Mass reconstruction with a
neural network. Physics Letters B, 278:181–186, 1992.
[55] L. Lonnblad, C. Peterson, and T. Rognvaldsson. Using neural networks to identify
jets. Nuclear Physics B, 349:675, 1991.
[56] Richard Maclin and David Opitz. An empirical evaluation of bagging and boost-
ing. In The Fourteenth National Conference on Artificial Intelligence, pages
546–551, Rhode Island, 1997. AAAI Press.
[57] W. S. McCulloch and W. Pitts. A logical calculus of the ideas immanent in
nervous activity. Bulletin of Mathematical Biophysics, 5:115–133, 1943.
[58] Manish Mehta, Rakesh Agrawal, and Jorma Rissanen. SLIQ: A fast scalable
classifier for data mining. In Extending Database Technology, pages 18–32, 1996.
[59] H. Messel and D. F. Crawford. Electron–photon shower distribution function;
tables for lead, copper, and air absorbers. Pergamon Press, New York, 1970.
[60] Pierre Michaud. Clustering techniques. Future Gener. Comput. Syst., 13(2-
3):135–147, 1997.
[61] M. Minsky and S. Papert. Perceptrons. MIT Press, 1969. Partially reprinted in
[5].
[62] Tom M. Mitchell. Does machine learning really work? AI Magazine, 18(3):11–20,
1997.
[63] Gordon Moore. Cramming more components onto integrated circuits. Electron-
ics, 38(8), 1965.
[64] Gordon Moore. No exponential is forever: but forever can be delayed! In IEEE
International Solid-State Circuit Conference, 2003.
[65] J. von Neumann. The Computer and the Brain, pages 66–82. Yale University
Press, New Haven, 1958.
[66] M. De Palma, C. Favuzzi, G. Maggi, E. Nappi, A. Ranieri, and P. Spinelli.
Measurement, parametrization and fast simulation of hadronic showers in lead.
Nucl. Inst. Meths. in Phys. Res., 219:87–96, 1984.
[67] Michael E. Peskin. Pandora: An object-oriented event generator for linear col-
lider physics. 1999. hep-ph/9910519.
[68] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flan-
nery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge
University Press, 1992.
[69] M. D. Richard and R. P. Lippmann. Neural network classifiers estimate Bayesian
a posteriori probabilities. Neural Computation, 3(4):461–483, 1991.
[70] B. D. Ripley. Pattern Recognition and Neural Networks. Cambridge University
Press, 1996.
[71] N. Rochester, J. H. Holland, L. H. Haibt, and W. L. Duda. Tests on a cell
assembly theory of the action of the brain, using a large digital computer. IRE
Transactions on Information Technology, 1956.
[72] Thorsteinn Rognvaldsson. On Langevin updating in multilayer perceptrons.
Neural Computation, 6, 1994.
[73] F. Rosenblatt. The perceptron: a probabilistic model for information storage
and organization in the brain. Psychological Review, 65:386–408, 1958.
[74] B. Rossi. High-Energy Particles. Prentice Hall Inc., Englewood Cliffs, NJ, 1952.
[75] Dennis W. Ruck et al. The multilayer perceptron as an approximation to a
Bayes optimal discrimination function. IEEE Transactions on Neural Networks,
1(4):296–298, 1990.
[76] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning internal represen-
tations by error propagation. In D.E. Rumelhart and J.L. McClelland, editors,
Parallel Distributed Processing: Explorations in the Microstructure of Cognition,
volume 1, pages 318–362. MIT Press, 1986.
[77] Abdus Salam. Weak and electromagnetic interactions. In Proc. of the 8th Nobel
Symposium on 'Elementary particle theory, relativistic groups and analyticity',
pages 367–377, Stockholm, Sweden, 1968.
[78] W. Sarle. Stopped training and other remedies for overfitting. In Proceedings
of the 27th Symposium on Interface of Computing Science and Statistics, pages
352–360. Interface Foundation, 1995.
[79] John C. Shafer, Rakesh Agrawal, and Manish Mehta. SPRINT: A scalable par-
allel classifier for data mining. In T. M. Vijayaraman, Alejandro P. Buchmann,
C. Mohan, and Nandlal L. Sarda, editors, Proc. 22nd Int. Conf. Very Large
Databases, VLDB, pages 544–555. Morgan Kaufmann, 1996.
[80] H. Siegelmann and E. Sontag. Neural nets are universal computing devices.
Technical Report SYCON-91-08, 1991.
[81] G. Simone. GEANT4: Simulation for the next generation of HEP experiments.
In International Conference on Computing in High-energy Physics (CHEP 95),
Rio de Janeiro, Brazil, 18-22 Sep 1995, 1995.
[82] Torbjorn Sjostrand, Leif Lonnblad, and Stephen Mrenna. Pythia 6.2: Physics
and manual. 2001. hep-ph/0108264.
[83] Leonard Susskind. The gauge hierarchy problem, technicolor, supersymmetry,
and all that. (talk). Phys. Rept., 104:181–193, 1984.
[84] Joshua B. Tenenbaum, Vin de Silva, and John C. Langford. A global geometric
framework for nonlinear dimensionality reduction. Science, 290:2319–2323, 2000.
[85] Vladimir N. Vapnik. Statistical Learning Theory: Adaptive and learning systems
for signal processing, communications and control. John Wiley and Sons Inc.,
1998.
[86] E. A. Wan. Neural network classification: A bayesian interpretation. IEEE
Transactions on Neural Networks, 1(4):303–305, 1990.
[87] Steven Weinberg. A model of leptons. Phys. Rev. Lett., 19:1264–1266, 1967.
[88] H. White. Connectionists nonparametric regression: multilayer feedforward net-
works can learn arbitrary mappings. Neural Networks, 3:535–549, 1990.
[89] Julia Yarba. User’s guide for showering/calorimetry in MCFAST. 1999.