Project Report
Reconstruction of PINGU data with a Differential Evolution Minimizer
Christoph Raab
January 9, 2014
Department of Physics and Astronomy
Supervisor: Carlos Perez de los Heros; Reviewer: David Boersma
Abstract
The Precision IceCube Next Generation Upgrade (PINGU) is supposed to have an energy threshold of ≲ 10 GeV in order to resolve the neutrino mass hierarchy. In order to reconstruct the energy and direction of neutrinos interacting in this array, producing both a hadronic cascade and a muon track, advanced reconstruction methods need to be employed. A class of these seeks to maximize a complicated likelihood function within an 8-dimensional parameter space describing the event, and requires sophisticated minimizers to achieve the necessary resolution in a reasonable time. In this report, a pre-existing but hitherto unused minimizer which samples that parameter space with several Markov chains at once, based on the Differential Evolution Monte Carlo algorithm, is developed further, and its behaviour and performance are tested on simulated data of the IceCube/PINGU array. The tests compare both various configurations of the minimizer and Markov Chain Monte Carlo, a similar previous approach.
Contents

1. Introduction
   1.1. Neutrino Telescopes
   1.2. Track reconstruction
2. Simulation, Processing and Reconstruction Chain
   2.1. Simulation
   2.2. Processing
   2.3. Likelihood Function
3. Differential Evolution Monte Carlo
   3.1. Differential Evolution Algorithm
   3.2. Implementation
      3.2.1. Example Script
      3.2.2. Tray Segment
      3.2.3. Module
   3.3. Problems and challenges
      3.3.1. Stuck chains
      3.3.2. Cascade Energy
   3.4. Performance
      3.4.1. Comparison to Markov Chain Monte Carlo
      3.4.2. Resolution for SPE4 Seed
      3.4.3. Resolution for Mixed Monte Carlo/SPE4 Seed
4. Summary, Conclusions and Outlook
   4.1. Summary
   4.2. Conclusions
   4.3. Outlook
5. Acknowledgements
A. Software options and parameters
   A.1. Example script
   A.2. Tray script
   A.3. Module
B. Supplementary figures
Bibliography
Chapter 1.
Introduction

1.1. Neutrino Telescopes
High-energy cosmic neutrinos impinge upon the Earth at a high flux from the entire sky and can be produced in a variety of sources. Large-scale instruments that are used to observe these sources by measuring the direction of the neutrinos are called neutrino telescopes. One such neutrino telescope is installed at the South Pole and uses the antarctic ice as a target for neutrinos to interact in. In the case of the weak charged-current interaction of a muon-flavour neutrino, this creates a muon alongside a hadronic cascade from the interacting nucleus:
ν̄µ + N → µ⁺ + N∗   (1.1)
νµ + N → µ⁻ + N∗   (1.2)
The muon and nuclear fragments deposit their energy into secondary particles, which in turn induce Cherenkov light in the ice, whose absorption length typically lies between 90 m and 270 m, depending strongly on the ice layer. The muon does so along a line, while the cascade creates a mostly point-like and nearly isotropic light source (see fig. 1.1). If this light is captured by an array of sensors, it is possible to infer the direction of the incoming neutrino with a certain precision. This opens up the possibility of neutrino astronomy, where neutrino sources from the entire sky can be observed. An irreducible background to cosmic neutrinos is formed by neutrinos created by the interaction of cosmic rays in the atmosphere.

IceCube is an array of 86 strings of 60 digital optical modules (DOMs) each, buried between depths of 1500 m and 2500 m. The modules consist of large photomultipliers, which can resolve the shape of pulses due to single Cherenkov photons, and readout electronics, which send data to the surface via a common cable, where the detector is triggered according to coincidence conditions. The strings are spread out over a square kilometer of surface area and instrument a cubic kilometer of ice in total. The resulting string spacing of 125 m leads to an energy threshold of 50 - 100 GeV, corresponding to the shortest muon track that can be resolved, making IceCube mostly useful for observing high-energy neutrinos. These can e.g. give information about the origin of high-energy cosmic rays. However, neutrino telescopes can also be used to detect signatures of dark matter particles gravitationally accumulated in certain locations such as the Sun or
[Fig. 1.1: diagram of the muon track with its Cherenkov cone and the cascade from a νµ charged-current interaction, indicating DOM spacings of 17 m (IceCube) vs. 5 m/3 m (PINGU) and string spacings of 125 m (IceCube) vs. 14 - 26 m (PINGU).]

Fig. 1.1.: Diagram of the light sources from a νµ charged-current interaction inside the IceCube array.
Galactic center, or to measure the oscillation of neutrinos through the Earth. For these purposes, a lower energy threshold is required, and IceCube contains the DeepCore inset array of more closely spaced strings, each in turn with 50 closely spaced DOMs of 35% higher quantum efficiency. This lowers the energy threshold of the combined detector to 10 GeV by gaining light yield for low-energy events, and improves the resolution of short tracks. DeepCore's position at the bottom center of the array, from a depth of 2100 m to 2450 m, allows the rest of IceCube and a layer of 10 further DOMs per string above 2000 m to act as a veto for the DeepCore sub-array. [4]
PINGU is a planned extension of the IceCube/DeepCore array with an inset of even higher-density strings. Neutrinos oscillate between their flavours due to having three different mass eigenstates, m1...3. In vacuum, these oscillations are parametrized only by the absolute value |∆m²ᵢⱼ| = |m²ᵢ − m²ⱼ|. In matter, however, the MSW effect is sensitive to the signs as well, and by measuring the oscillations of atmospheric neutrinos through the Earth, PINGU will be employed to resolve the neutrino mass hierarchy (NMH). In the oscillation, the two "free" parameters are the energy of the neutrino and the oscillation length. The latter corresponds to the zenith angle of the neutrino arriving in the detector. Hence, the resolution of both of these will need to be improved, especially for E < 10 GeV, where the effect is expected to be most strongly pronounced [5].
The geometry used in this report is called "V6" (fig. 1.2). It consists of 20 strings within the central eight that make up DeepCore, with a horizontal spacing of 26 m. Each string has 60 DOMs with a vertical spacing of 5 m.
1.2. Track reconstruction
The energy of muons which may leave the array (≳ 100 GeV) is estimated by the energy loss rate along the track, which is approximately linear in E above ≳ 500 GeV [3]. For the low energies of concern here, the track is most likely contained in the detector, and its energy loss has a constant rate of ≈ 0.222 GeV/m, so that the muon energy can be estimated from the total length of the track. However, the energy lost through the hadronic cascade is com-
Fig. 1.2.: Left: A top view of the IceCube/DeepCore array with PINGU in the V6 geometry ("IceCube-86 with PINGU v6 inlay"). Right: A close view of V6, an inset of 20 strings (26 m spacing) with 60 high-quantum-efficiency DOMs each (5 m vertical spacing).
parable to the energy lost in the track (if any), so that Eν + Mnucleus = Ecascade + Etrack. This means that for PINGU, the event hypothesis is a combination of the light yield from charged particles originating at one point and along a line segment. Such a hypothesis is described by 8 parameters in total: the interaction time, the vertex position, the zenith and azimuth angles, Ecascade and Etrack. The last step from the actual event to these parameters is the reconstruction of the signal from all the individual DOMs. Models describing the light production in the ice, its propagation and the detector response result in a likelihood function L, which is the probability of the measurement given a certain hypothesis. A likelihood reconstruction seeks to find the parameter vector ~x that maximizes L(event|~x), or L(~x) for short. Especially for a low number of detected photons / number of channels (hit DOMs), this function has several local maxima. This multimodality is a challenge that all minimizing algorithms face. In addition, a standard reconstruction method is needed already before the construction of PINGU in order to evaluate its sensitivity. Several candidates for this are being studied, like SANTA/monopod, which was already used for IceCube/DeepCore, and Multinest, which is currently being developed. This report considers a new minimizer that explores the 8-dimensional parameter space at several points simultaneously, based on an algorithm called Differential Evolution Monte Carlo (DE). This algorithm is then tested in various configurations on simulated data of the PINGU/IceCube array, and compared in performance to Markov Chain Monte Carlo (MC), a similar previous approach.
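For illustration, the 8 parameters of such a hypothesis can be collected in a small container. The field names and ordering below are assumptions made for this sketch, not the keys or ordering used by the actual reconstruction software:

```python
from dataclasses import dataclass

@dataclass
class EventHypothesis:
    """Hypothetical container for the 8 parameters of a PINGU event
    hypothesis (names are illustrative, not the actual software keys)."""
    t: float          # interaction time [ns]
    x: float          # vertex position [m]
    y: float
    z: float
    zenith: float     # direction [rad]
    azimuth: float
    e_cascade: float  # hadronic cascade energy [GeV]
    e_track: float    # muon track energy [GeV]

    def as_vector(self):
        # The ordering of this vector is an assumption for this sketch.
        return [self.t, self.x, self.y, self.z,
                self.zenith, self.azimuth, self.e_cascade, self.e_track]
```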
Chapter 2.

Simulation, Processing and Reconstruction Chain
2.1. Simulation
Reconstruction studies for PINGU necessarily involve simulated data. For the events studied in this report, νµ and ν̄µ with a flat spectrum between 1 GeV and 80 GeV were thrown into the V6 geometry of IceCube/PINGU (fig. 1.2). Their interaction was simulated with the GENIE software, and the resulting light was propagated assuming the SPICE-Mie ice model, i.e. a combination of absorption and scattering lengths depending on the depth in the ice. This model describes the optical properties of the ice at the South Pole and was developed based on dedicated calibration runs. When a photon impinges upon a DOM, the detector response is simulated accordingly, creating a waveform. This includes for instance afterpulses, which can follow the main photon-induced pulse like an echo with a ∼ 6 µs delay due to the ionization of residual gases inside the PMT [12]. The readout is triggered according to the waveform. On top of the hits resulting from this simulation, noise hits are added.
The events were also selected according to the simulated location of the vertex, imposing R < 50 m and −400 m < Z < −200 m, with R the distance from the central string (#36) and Z the depth in IceCube coordinates [9]. This containment criterion could also be implemented in a less ideal form from simulated/real data, and ensures that the reconstruction studies are relevant to PINGU in particular. Furthermore, neutral-current reactions, which would not contain a muon, were removed, since they a) do not match the expected event signature, but more importantly b) the outgoing neutrino would carry away an unknown amount of energy and make studying the energy resolution of the reconstruction pointless.
2.2. Processing
These simulation files were then processed as experimental data would be. For each hit DOM, the waveform as output by the digitizers (fig. 2.1) undergoes a feature extraction that linearly decomposes it into a series of pulses with a certain time, width and charge. These pulses then undergo a cleaning filter that seeks to remove noise. In our case, the static time window cleaning was used, which simply selects hits within a constant time window relative to each event's trigger time. A method that was previously used is
the sliding time window, which seeks the time window of a pre-defined length that contains the greatest number of pulses, and disregards those outside. Using this type of cleaning could however lead to cleaning away hits from low-energy events if there was a coincident event of higher energy, even when the dimmer one triggered the detector. This behaviour is naturally undesirable for low-energy analysis. After that, only events that had more than 10 hits were allowed to pass. On the hits within the PINGU subarray, an SPE4 reconstruction was performed as a preliminary guess for the following ones. This reconstruction is based on the SPE1st reconstruction, which minimizes a likelihood computed using the first hit from each DOM and the Pandel likelihood function, an analytical approximation to the probability of a DOM being hit, given a track hypothesis. This is plugged into the I3IterativeFitter module from the gulliver suite. This module varies the result, minimizes again, and after four iterations returns the fit result with the highest logL out of the five as the SPE4 fit result.
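As an illustration of the sliding time window cleaning described above, the window search can be sketched in a few lines. Representing the pulse series as a plain list of times is a simplification of the actual pulse objects:

```python
def sliding_window_cleaning(pulse_times, window_length):
    """Return the pulses inside the fixed-length time window that
    contains the greatest number of pulses; pulses outside the best
    window are disregarded (sketch of the sliding-window cleaning)."""
    times = sorted(pulse_times)
    best_start, best_count = 0, 0
    j = 0
    for i, t0 in enumerate(times):
        # advance j to just past the last pulse inside [t0, t0 + window_length]
        while j < len(times) and times[j] <= t0 + window_length:
            j += 1
        if j - i > best_count:
            best_count, best_start = j - i, i
    return times[best_start:best_start + best_count]
```

For a low-energy event coincident with a brighter one, the densest window belongs to the brighter event, which is exactly the failure mode described in the text.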
Fig. 2.1.: The output of the four different amplifiers present on a DOM. The FADC chip digitizes the PMT output signal with a long readout window (6400 ns) after passing a shaping amplifier, and the ATWD chip captures shorter time windows (422 ns) at a higher resolution while making use of three gain levels. Figure from [13].
2.3. Likelihood Function
The likelihood function is taken from Millipede, a reconstruction software developed within the IceCube collaboration (project millipede). The model at the basis of the Millipede likelihood function is one of Poissonian statistics (see ch. 2ff of [3]). For each DOM and each time bin within the duration that it recorded a waveform, signified by index i, the measured quantity ki is the deposited charge in units of photoelectrons
(PE), corresponding to a single photon. It is expected to follow a Poissonian distribution Li = (λi^ki / ki!) e^(−λi). The average λi is the expected number of photons. On top of that, there is expected to be a noise level ρi, so that λi → λi + ρi. The complete likelihood is then a product of the Li, or

logL = Σi logLi = Σi [ki ln(λi + ρi) − (λi + ρi) − ln ki!]   (2.1)
For DOMs that were not hit within the readout time window, the logL sum over time bins simplifies into Pno hit, the probability that not a single photon reached this particular DOM during the entire readout time.
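Eq. 2.1 can be transcribed almost directly. The helper below is a sketch, not the Millipede implementation; k, lam and rho are per-bin observed charges, signal expectations and noise expectations:

```python
import math

def binned_log_likelihood(k, lam, rho):
    """Poissonian log-likelihood of eq. 2.1 for one DOM:
    sum over bins of k_i ln(lam_i + rho_i) - (lam_i + rho_i) - ln k_i!.
    A bin with k_i == 0 contributes -(lam_i + rho_i), i.e. the
    'no hit' probability for that bin."""
    logL = 0.0
    for ki, li, ri in zip(k, lam, rho):
        mu = li + ri
        # math.lgamma(ki + 1) == ln(ki!)
        logL += ki * math.log(mu) - mu - math.lgamma(ki + 1)
    return logL
```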
Millipede was configured to use time bins of variable width that contain at most 1 PE and are at most 200 ns long. The λi are retrieved from so-called spline tables, which are based on Monte Carlo simulation of the detector response for idealized forms of sources. These are Cherenkov-emitting track segments of 15 m length, corresponding to minimally ionizing muons, and isotropically emitting point sources for hadronic cascades. The resulting expectation values λi were then fitted to spline functions [10] on the parameter space, and the fit results stored in lookup tables, from which the λi to be used for any given configuration of sources, i.e. event hypothesis, can be computed more accurately than from analytical models [3]. Even so, these lookups make up the largest portion of a likelihood function call, which in itself dominates the processing time of likelihood reconstruction methods.

Since the photospline tables are only computed for integer numbers of track segments, likelihood values for track lengths that lie between two multiples of 15 m are approximated by a linear interpolation. This is implemented in the LikelihoodWrapper module, which was written by M. Dunkman as part of the hybrid-reco project.
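The linear interpolation between integer segment counts can be sketched as follows. Here logl_for_segments is a hypothetical stand-in for the spline-table lookup with an integer number of track segments, not the actual LikelihoodWrapper interface:

```python
import math

def interpolated_track_logl(length, logl_for_segments, segment_length=15.0):
    """Linearly interpolate logL between the two nearest integer
    segment counts for a track of the given length [m].
    `logl_for_segments(n)` stands in for a table lookup with n
    track segments (hypothetical interface)."""
    n_lo = math.floor(length / segment_length)
    n_hi = n_lo + 1
    frac = length / segment_length - n_lo  # 0 at n_lo, 1 at n_hi
    return (1 - frac) * logl_for_segments(n_lo) + frac * logl_for_segments(n_hi)
```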
Chapter 3.
Differential Evolution Monte Carlo

3.1. Differential Evolution Algorithm
One way to find the maximum of L(~x) is to simply sample points ~x out of the 8-dimensional parameter space (interaction time, vertex position, zenith and azimuth angle, Ecascade and Etrack), compute L(~x) for each and take the best one. An attempt to make this sampling more efficient than taking an 8-dimensional grid is called Markov Chain Monte Carlo. It proceeds as follows:
1. Choose a seed as the starting vector ~x, for instance a first guess of the parameters.

2. Sample a new point ~y from a Gaussian distribution around ~x.

3. Replace ~x = ~y with a probability of min(1, L(~y)/L(~x)). (This is called a Metropolis-Hastings test.)

4. Reiterate from the current point. This progression is called a chain.
After a number of burn-in steps, the distribution of points in the history of the chain will be a sampling of the input distribution L, so the points are denser where L is higher, i.e. in the interesting region(s) of parameter space. The one parameter vector with the highest L is then chosen as the reconstructed maximum.
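The scheme above can be sketched in a few lines of Python. This is a generic Metropolis-Hastings chain working on logL values, not the code used in the report:

```python
import math
import random

def metropolis_chain(log_likelihood, seed, step_sizes, n_steps):
    """Sketch of the MCMC scheme described above: Gaussian proposals
    around the current point, accepted with probability
    min(1, L(y)/L(x)).  Returns the best point seen and its logL."""
    x = list(seed)
    logl_x = log_likelihood(x)
    best, best_logl = list(x), logl_x
    for _ in range(n_steps):
        y = [xi + random.gauss(0.0, s) for xi, s in zip(x, step_sizes)]
        logl_y = log_likelihood(y)
        # Metropolis-Hastings test on the likelihood ratio
        if logl_y >= logl_x or random.random() < math.exp(logl_y - logl_x):
            x, logl_x = y, logl_y
            if logl_x > best_logl:
                best, best_logl = list(x), logl_x
    return best, best_logl
```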
This method is still rather slow in approximating the true maximum. It also carries with it the possibility of the chain converging to a non-global maximum, depending on the chosen step size. Differential Evolution Monte Carlo (DE) is an attempt to accelerate the convergence and make it more resistant against multimodality.

DE acts upon several chains in parallel. At each step, they are collectively called a generation. In summary:
1. Smear the seed into the initial generation (taking N samples out of a Gaussian).

2. For the i-th vector ~xi, calculate an updated version ~xp:

a) Randomly choose two other, non-identical vectors ~xr1 and ~xr2 out of the generation.

b) Sample a vector ~e from a distribution that is small compared to the spread within the generation.
7
Chapter 3. Differential Evolution Monte Carlo
c) Compute

~xp = ~xi + γ(~xr2 − ~xr1 + ~e)   (3.1)

(The factor γ is 0.2 by default; the ~e widths are the step sizes in tab. A.3.)

3. Replace ~xi = ~xp according to a Metropolis-Hastings test as in MC.

4. Reiterate.
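One generation update (steps 2 and 3) can be sketched as follows. This is a simplified illustration of eq. 3.1 plus the Metropolis-Hastings test, not the DarwinizedChainer implementation:

```python
import math
import random

def de_mc_step(generation, logls, log_likelihood, gamma=0.2, e_width=1e-4):
    """One Differential Evolution Monte Carlo generation update.
    `generation` is a list of parameter vectors, `logls` their current
    logL values; both are updated in place."""
    n = len(generation)
    for i in range(n):
        # a) choose two other, distinct chain indices
        r1, r2 = random.sample([j for j in range(n) if j != i], 2)
        # b) small perturbation vector e
        e = [random.gauss(0.0, e_width) for _ in generation[i]]
        # c) proposal x_p = x_i + gamma * (x_r2 - x_r1 + e), as in eq. 3.1
        proposal = [xi + gamma * (a - b + ei)
                    for xi, a, b, ei in zip(generation[i], generation[r2],
                                            generation[r1], e)]
        logl_p = log_likelihood(proposal)
        # Metropolis-Hastings test, as in plain MCMC
        if logl_p >= logls[i] or random.random() < math.exp(logl_p - logls[i]):
            generation[i], logls[i] = proposal, logl_p
```

Because the proposal scale is set by differences within the generation, the step size adapts automatically as the population spread shrinks.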
The unique stationary distribution of the sampled parameter vectors is L, as in MC (proof in [1]). In this method, when the spread of the generation goes down, so does the variation from step to step, which could accelerate the convergence to the stationary distribution. Taking the "best" vector out of several chains also means that it is enough for part of them to converge to the region of the true maximum. On the other hand, the rest of the generation will still use up processing time for its updates.
3.2. Implementation
A basic version of a DE-based likelihood reconstruction was already implemented by Ken Clark and Matt Dunkman (PSU) as a module within the likelihood-scanner project (sandbox/mdunkman/likelihood-scanner/trunk at rev 105435). I updated the order of parameters to the one currently used by HybridReco (sandbox/reagan/hybrid-reco/trunk at rev 100077), added timing functionality, more options and output keys including the possibility to write a log of the entire evolution to a separate file, made it possible to pass bounds and step sizes to the method from within a script via the HybridReco parameter service, synchronized these values to the defaults of the MC method, fixed miscellaneous bugs, implemented new kinds of parameter-vector updates, and packaged it in scripts which can be used by the IceCube software framework IceTray, as described below.
3.2.1. Example Script199
There is an example script in likelihood-scanner/resources/examples/darwin_chain.py.200
This tray script201
• reads in an input file

• skips neutral-current events

• computes the hadronic scaling factor F. This factor corrects for the fact that a hadronic cascade has a smaller light yield (or visible energy) than an electromagnetic cascade of the same energy, due to neutral hadrons (like neutrons). The photospline tables are calculated for electromagnetic cascades, so the actual cascade energy relates to the reconstructed visible energy by

Eactual = Evisible / F(Eactual)   (3.2)
8
Chapter 3. Differential Evolution Monte Carlo
The factor can be parametrized as

F(E) = 1 − 0.690 (max(2.7 GeV, E) / 0.188 GeV)^(−0.162)   (3.3) [8]

Since the current implementation requires Monte Carlo truth input, this factor is not used for the evaluation of this method's resolution.

• includes the tray segment likelihood_scanner.darwin_chain described below

• and finally writes output in the form of .i3 files and, via tableio, in ROOT or HDF format.

Its command-line options are listed in tab. A.1.
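Eqs. 3.2 and 3.3 can be sketched as follows. The fixed-point inversion of eq. 3.2 is an illustration only; the report applies F computed from the Monte Carlo truth energy instead:

```python
def hadronic_scaling(e_gev):
    """Hadronic scaling factor F(E) of eq. 3.3 (energies in GeV)."""
    return 1.0 - 0.690 * (max(2.7, e_gev) / 0.188) ** -0.162

def actual_cascade_energy(e_visible, n_iter=20):
    """Invert eq. 3.2, E_actual = E_visible / F(E_actual), by
    fixed-point iteration.  The iteration scheme is an assumption
    for this sketch, not what the example script does."""
    e = e_visible
    for _ in range(n_iter):
        e = e_visible / hadronic_scaling(e)
    return e
```

Since F < 1, the actual hadronic energy always comes out larger than the reconstructed visible energy, as expected.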
3.2.2. Tray Segment

The tray segment is included in the above example tray script. It

• sets the log-level for the involved modules

• adds the Photospline services (I3PhotoSplineServiceFactory) for both tracks and cascades

• adds the time window in which pulses could have been recorded by the simulated read-out, for use by the likelihood, via I3WaveformTimeRangeCalculator. Simulation files which contained both a noise-free pulse series and one with added noise used to have a bug in the trigger simulation which allowed noise pulses to prompt the error "Millipede: Pulse time before readout window start".

• adds the Millipede likelihood service. This service in turn is configured to use the aforementioned muon and cascade photospline services, the time window, and to count 1 pulse per variable-width time bin.

• adds a seed particle to the frame. In the case of a Monte Carlo seed, this particle has position, time and energy from the cascade, and length and direction from the muon track. In the case of the infinite-track SPE4_PINGU seed, the latter is appended with a default length of 50 m and an energy of 25 GeV. If chosen, the position and time are replaced by the MC values.

• stores the seed in the HybridReco seed service (HybridRecoSeedServiceFactory)

• adds the HybridReco parameter service (HybridRecoParametrizationServiceFactory) and stores step sizes and bounds for all parameters therein (see tab. A.3), which are later used by the minimizer. A step size of 0 leaves the parameter fixed.
These services were taken from the project hybrid-reco (sandbox/mdunkman/likelihood-scanner/trunk at rev 105435).
• Finally, the DarwinizedChainer module is added and provided with the likelihood, parameter and seed services.

• The tray returns a list of output keys, which can e.g. be passed to a table writer.

Options passed to the tray segment are described in tab. A.2.
3.2.3. Module

The module implements the actual minimizer as a C++ class which inherits from the I3Module class. On inclusion into a tray script, it receives a set of parameters which are handled (once) in the method DarwinizedChainer::Configure (see A.3). These are detailed in tab. A.3. Among these parameters are several external pieces of software:

• a seed service, from which the seed particle is retrieved

• a parameter service, which supplies the bounds and step sizes used during the evolution update (see A.3)

• a likelihood service, containing the likelihood function to be maximized

The general form of the iteration is described in A.3. Upon completion, the best parameter-and-LLH vector and its history, along with information about the used CPU time, are written into the output file. It is possible to write an additional file containing the complete evolution of all individual chains. For more details, see A.3.
3.3. Problems and challenges

3.3.1. Stuck chains
While most chains gradually improve their likelihood over the course of the evolution, some chains stay stuck at a lower likelihood which is clearly separate from the rest of the generation. In figure 3.1, this is shown once for 80 chains on one specific event, and once for two particular chains representing each case. Examining individual chains (see fig. 3.1) shows that the stuck chains are updated with constant frequency, but seem to oscillate around a broad, local maximum. In comparison, the higher-likelihood chains slowly converge to a maximum, as the parameter space they can move into shrinks.

Separating these two subsets of chains with a logL cut and plotting the X-Y coordinates of the vertex in fig. 3.2 shows that the stuck chains progress from the initial (seed) region to the edges of the detector volume. Plotting the duration of a likelihood-function call versus the number of track segments for these two sets in fig. 3.2, it is apparent that the stuck chains extend their tracks farther beyond the edge of the detector. These track segments lie outside the array, so there are no DOMs nearby that would penalize them with Millipede's P(no hit) (see 2.3). The energy of the cascade meanwhile increases to match the expected photoelectrons from an interaction at the edge of the array to the hits
[Fig. 3.1, three panels of logL vs. step number: (a) all chains in a single file, the stuck chains are below −880; (b) a normal chain; (c) a stuck chain.]

Fig. 3.1.: The evolution of logL with the steps for one event. Note the relative scale of their variations.
inside the array, while the interaction time has to decrease to accommodate the increased propagation time.
The hits inside the array will only contribute a flat, noise-like logL from each track segment which is too far removed compared to the scale of the absorption length. Hence, in fig. 3.2 the linear fits are

time for normal chains [s] = 0.18 + 0.09 × Nsegments
time for stuck chains [s] = 0.22 + 0.04 × Nsegments

The out-of-reach tracks take half the time per track segment. One track segment corresponds to one lookup of the photospline table (see 2.3) for each hit, so the higher number of track segments still increases the average time (2.2 s → 7.9 s).
Comparing the logL call time for two representative chains in fig. 3.3 shows that the track in the "stuck" chain continues to grow almost steadily. For some events, the proportion of stuck chains can be so high that their diverging track length is clearly reflected in an overall diverging computation time per step (fig. 3.3). The non-divergent events, on the other hand, contain a smaller fraction of stuck chains.

This behaviour is unwelcome not only due to the increase in computation cost, but also because evolution steps that involve taking a difference between one conservative,
high-logL and one extremely distant, low-logL region are unlikely to improve the logL of chains in the high-logL region, and therefore the current best likelihood. Several measures were attempted to control it.

Parameter Boundaries
Parameter Boundaries294
Since the divergence in time is due to a divergence in track length and is accompanied295
by variations in cascade position, time, and energy, a straight-forward solution to this296
problem is by implementing boundaries on all of these parameters. Especially the track297
length had hitherto no upper boundary, and the vertex position/time were bound far298
too losely compared to what could be expected of the data.
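One simple way to enforce such boundaries on a proposed parameter vector is to reflect it at the bounds. This is an illustrative convention; the report does not specify how the bounds are enforced inside the module:

```python
def reflect_into_bounds(x, lo, hi):
    """Map a proposed parameter vector into [lo, hi] component-wise
    by reflecting at the boundaries (a common convention; not
    necessarily what the actual module does)."""
    out = []
    for xi, l, h in zip(x, lo, hi):
        # reflect repeatedly until the value lies inside [l, h]
        while xi < l or xi > h:
            if xi < l:
                xi = 2 * l - xi
            else:
                xi = 2 * h - xi
        out.append(xi)
    return out
```

Reflection, unlike hard clamping, avoids piling proposals up exactly on the boundary values.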
Fig. 3.2.: Left: The vertex for normal chains (gradient colours) and stuck chains (red) in one event, with the bounds indicated. Right: Time of a logL function call for one event vs. number of track segments, separated into normal chains (green) and stuck chains (red), showing means with RMS error bars per bin, linear fits and the upper boundary.
Jump Steps

In a jump step, the γ parameter in eqn. 3.1 is set to 1, so the update becomes

~y = ~x + (~xr2 − ~xr1) + ~e   (3.4)

The idea is that stuck chains could be removed from their local maximum by a step with a randomly chosen difference vector between the stuck sub-population and the converging sub-population. The random choice would then result in an exponential decay of the stuck population size. However, the stuck chains cover a broad region in parameter space, which increases the time constant of this decay. Also, the other chains are affected
by this jump step, too. The time spectrum (fig. 3.4) did not qualitatively change for runs seeded with Monte Carlo truth, while the divergent chains were cut off for the SPE4 seed. This however remains to be separated from the effect of the track length boundary.

The effects of both jump steps and parameter boundaries are examined in fig. 3.5, where the time per step as well as the best logL are shown vs. step number for the same event being processed with and without these measures, once with an SPE seed and once with a Monte Carlo truth seed. Of the four curves, the only one diverging in time is for SPE without jump steps and unbounded parameters. The times of the SPE runs are comparable before the divergence, and the MC run's time is clearly higher with the counter-divergence measures before it becomes compatible after ∼ 250 steps. Note that this excludes the actual jump steps, which have significantly higher times. For both seeds, the logL is enhanced by bounded parameters, with a Monte Carlo seed always exceeding an SPE seed.
Hard Time Limit

The divergence of processing time for a whole event can be mitigated by simply interrupting the evolution if the total time exceeds an estimated limit:

Σ t ≤ 2 × Nsteps × ⟨t⟩first 20 steps   (3.5)

Once all the above measures are implemented, the fraction of events cancelled by this condition is on average 0.1% for both 40 chains and 80 chains, one event being cancelled in each run with an SPE seed. The fractions were 1% (40) and 0.4% (80) for a Monte Carlo seed. The reason for this was not investigated. The time per step is then well bounded to ≤ 165 s/step and does not diverge, as seen in fig. 3.4, which contrasts with the previous upper limit of 634 s/step for a diverging event.
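The limit of eq. 3.5 can be sketched as a small guard object; this is an illustration, not the module's actual bookkeeping:

```python
class StepTimeGuard:
    """Sketch of the hard time limit of eq. 3.5: abort the evolution
    once the accumulated time exceeds
    2 * n_steps * <t over the first 20 steps>."""
    def __init__(self, n_steps, n_calibration=20):
        self.n_steps = n_steps
        self.n_calibration = n_calibration
        self.times = []

    def record(self, step_time):
        self.times.append(step_time)

    def exceeded(self):
        if len(self.times) < self.n_calibration:
            return False  # still measuring the reference average
        mean_first = sum(self.times[:self.n_calibration]) / self.n_calibration
        return sum(self.times) > 2 * self.n_steps * mean_first
```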
[Fig. 3.3, two panels of time per step vs. step number: (a) compared to the number of track segments (gradient colours); (b) compared to a normal chain (blue).]

Fig. 3.3.: Time per step for a stuck chain (red) vs. step number (no jump steps).
(a) Step times before and after (blue: from a large sample of events; red: from a smaller sample). (b) Time spectrum before and after bounds and jump steps (SPE4 seed). (c) Time spectrum before and after bounds and jump steps (Monte Carlo seed).
Fig. 3.4.: Distribution of time per step. Blue: after all parameter bounds, jump steps, and time limits have been implemented. Red: before.
3.3.2. Cascade Energy
The photospline tables (sec. 2.3) are actually computed for electromagnetic cascades. The
likelihood reconstruction is hence expected, ideally, to give the energy of an electromagnetic
cascade with the same light yield as the hadronic cascade in the data. Since in
a hadronic cascade more energy is deposited stochastically into neutral particles such
as neutrons, which do not produce Cherenkov light, this leads on average to an
underestimation of the energy of the hadronic cascade. This underestimation is reflected
in a bias of −9.7 GeV (averaged over the used data sample) when comparing the reconstructed
cascade energy to the cascade energy taken from the Monte Carlo
truth information. The effect becomes more pronounced towards low energies, as shown in
fig. 3.6. If the analytical approximation of the scaling factor, computed from
the true cascade energy, is applied as an unphysical correction factor to the reconstructed cascade
energy, the bias reduces to −4.8 GeV, which is smaller than the RMS of 9.9 GeV. Applying
this correction using purely reconstructed data would require a separate study to derive
a different correction function, and hence no such attempt was made.
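For illustration, applying such a scaling factor could look as follows. The functional form is of the type commonly used for hadronic light-yield parametrizations, but the constants here are placeholders, not the fitted values of the Radel parametrization [8]:

```python
# Illustrative mean visible-energy fraction F(E) for hadronic cascades.
# The form 1 - (E/E0)^(-m) * (1 - f0) is a common parametrization shape;
# E0, M, F0 below are placeholder values, not fitted constants.
E0, M, F0 = 0.2, 0.16, 0.3

def hadronic_factor(e_true_gev):
    """Mean ratio of visible (EM-equivalent) to true hadronic cascade energy."""
    return 1.0 - (e_true_gev / E0) ** (-M) * (1.0 - F0)

def correct_reco_energy(e_reco_gev, e_true_gev):
    """Unphysical correction as described in the text: divide the reconstructed
    (EM-equivalent) energy by F evaluated at the *true* cascade energy."""
    return e_reco_gev / hadronic_factor(e_true_gev)
```

Since F(E) < 1 and grows with energy, the correction raises the reconstructed energy most strongly at low energies, where the bias is largest.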
3.4. Performance
Several quantities were computed to describe the deviation between reconstruction and
Monte Carlo truth. Here, the form ∆X means Xreco − Xtrue.
The first two are the most interesting for physical analyses:
• Relative energy error |∆E/Etrue|, with the total reconstructed and true neutrino energy
respectively
(a) Time per step vs. step number, 1 event. (b) Best logL evolution, 1 event.
Fig. 3.5.: Green: SPE seed. Blue: Monte Carlo seed. Dashed: unbound track length, no jump steps. Solid: bound track length, 1/20 jump steps. The separate points are for the jump steps.
• Zenith error |∆θ|
Two more are energy-related, also with physical relevance:
• Energy bias ∆E, the signed difference of total reconstructed and true neutrino
energy. This is expected to have a bias proportional to the true energy, see sec. 3.3.2
• Fraction error |∆(Eµ/Eν)|, which evaluates how well the actual hybrid hypothesis is
matched
Vertex-related quantities which affect the reconstruction quality of others (see sec. 3.4.3):
• Vertex position |∆~R|, the distance between the true and reconstructed vertex
• Interaction time |∆T|
The performance of the reconstruction method is then evaluated by e.g. the median
of these errors across the event sample, called the resolution.
Plots are shown here for the first group, and in appendix B for the rest.
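The resolution as used here can be sketched as follows (a minimal illustration, not the analysis code; the zenith helper is a hypothetical example for one of the error quantities):

```python
def median(values):
    """Median of a list; applied to per-event errors this is the 'resolution'."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])

def zenith_resolution(reco_zen, true_zen):
    """Median absolute zenith error |dtheta| over an event sample."""
    return median([abs(r - t) for r, t in zip(reco_zen, true_zen)])
```

The median is preferred over the mean here because a few badly reconstructed events would otherwise dominate the figure of merit.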
3.4.1. Comparison to Markov Chain Monte Carlo
First, a comparison to the original MC has to be made. Both methods were run on
the same events, with the same SPE4 seeds, and their best vector recorded at regular
intervals of steps, along with the time elapsed since the start of the iteration (see A.3).
For DE, the effective number of steps is given by the number of generations times the
size of the generation. The chosen generation size was 40, since previous tests showed
(a) Parametrization: F(E) scaling factor vs. E/GeV. (b) Cascade energy reconstruction/truth vs. truth, mean 0.57 (reconstruction mean ± rms).
Fig. 3.6.: Left: Average visible over actual energy for hadronic cascades, parameterized according to L. Radel [8]. Right: Reconstructed over true energy, with mean (red) and mean ± rms (blue). Note the different ranges and the large width of the distribution.
promising results with this number, which is on the scale of the recommended 10 × d (see
3.1). A population of 10 × d = 80 was also tried, but limits on computing time interrupted
the reconstruction on the majority of the events.
The median resolutions for both methods can then be plotted over the step number
at which they were achieved; i.e., the resolution if the iterations were interrupted at this
step. These plots are shown in the left column of fig. 3.7. The difference between two
medians does not contain all information about the relative merits. A second type of
quantity is therefore also computed: the portion of events where a certain resolution improved
when moving from MC to DE, called the separation. These make up the right column
of fig. 3.7.
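The separation can be sketched as follows (a hypothetical helper, not the analysis code):

```python
def separation(errors_de, errors_mc):
    """Portion of events where the DE error is smaller than the MC error.

    A value above 0.5 means DE improved the resolution on most events;
    0.5 means the two methods are indistinguishable by this measure.
    """
    assert len(errors_de) == len(errors_mc)
    better = sum(1 for d, m in zip(errors_de, errors_mc) if d < m)
    return better / len(errors_de)
```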
(a) Zenith resolution and separation: median zenith resolution vs. step, and portion of events where zenith resolution DE&lt;MC vs. step. (b) Relative energy resolution and separation: median relative energy resolution vs. step, and portion of events where relative energy resolution DE&lt;MC vs. step.
Fig. 3.7.: Left: Resolutions for both MC (blue) and DE (red), plotted vs. step number. Right: Separation between MC and DE, expressed as the portion of events where DE&lt;MC.
The zenith resolution (fig. 3.7, top) of DE declines from a larger starting value of 11.8°,
which has to be due to the initial smearing (A.3). Meanwhile, the MC zenith resolution
loses its rate of decline more quickly, leading to a close approach at 5.8° (DE) and 5.5°
(MC), with a separation between 0.48 and 0.5, after 25k steps. The relative energy
resolution (fig. 3.7, bottom) begins at a slightly higher level, but here the separation
crosses 0.5 at 2000 steps already and ends up above 0.62 after 25k steps, corresponding
to resolutions of 30% (DE) and 38% (MC).
An interesting point to evaluate is also the processing time per step for each method.
This quantity depends on the computing resources available, in this case the NPX4
cluster at the University of Wisconsin–Madison. However, it is apparent in fig. 3.8
that DE differs clearly from MC in that regard, with almost twice the processing time
per step.
(a) Separation of times vs. step: portion of events where tMC &lt; tDE. (b) time(DE) vs. time(MC) at step 30k. (c) Mean time/step for DE (red) and MC (blue).
Fig. 3.8.: Comparison of running times for DE and MC.
This motivates a second type of plot, namely the resolutions at a certain time,
including all events that have already concluded their iterations; i.e., the resolution if
the iterations were interrupted after this CPU time. These are shown in fig. 3.9.
Here, MC resolutions can be seen reaching a minimum between 15 ks and 20 ks,
(a) Zenith resolution vs. time. (b) Relative energy resolution vs. time.
Fig. 3.9.: Resolutions vs. time for MC (blue) and DE (red). The resolution at a certain time corresponds to the condition that the iteration would be interrupted either after this time is reached or at the maximum number of steps. Hence, the asymptotic behaviour is not intrinsic to the method, but rather a result of the limited step number leading to fewer events completing their iteration after a certain time.
with the few events still running after that point worsening the resolution. The more
rapid descent of DE in zenith resolution is balanced by the longer time, so that the two
methods actually achieve the same resolution at 27 ks.
For benchmark purposes, a display step of 25200 is chosen, which corresponds to
the step after which DE remains underneath the zenith resolution it achieves at the
maximum step of 30000. The resolutions and separations can then be measured at this
step, but varying over the (true neutrino) energy (fig. 3.10) and cosine of zenith (fig. B.3).
(Panels: median zenith resolution vs. energy at step 25k; portion of events where zenith resolution DE&lt;MC vs. energy; median relative energy resolution vs. energy at step 25k; portion of events where relative energy resolution DE&lt;MC vs. energy.)
Fig. 3.10.: Left: Resolutions for both MC (blue) and DE (red); right: separation between DE and MC, vs. true neutrino energy, at step 25200. From top to bottom: zenith, relative energy resolution, energy bias, fraction of muon over total energy.
The main deficiency of DE w.r.t. MC in zenith resolution is at energies &lt; 20 GeV,
which is weighted more strongly in the overall median resolution due to the E−2 input
spectrum. Both relative energy resolutions increase from the low to the high end of the
energy range, but stay in the same order of magnitude. This is consistent with the clear
dependence of the energy bias in fig. B.2. Due to DE starting to rise at a lower energy than
MC, the separation dips to a minimum of 0.51 between 40 and 50 GeV. Meanwhile, the
separation is 0.75 above 70 GeV and 0.65 below 10 GeV.
3.4.2. Resolution for SPE4 Seed
The DE method is run for two different population sizes, 40 and 80. The “display” step
number of 25200 is then reached after respectively 625 and 312.5 generations. Since
multiples of 50 were recorded in this run, generations 600 and 300 are used for the plots in
fig. 3.11.
The zenith resolution always improves with energy, at least until 50 GeV. It is intuitively
understandable that the events &lt; 10 GeV with few hits are the hardest to
reconstruct with a track direction. The slight improvement from 7.2° for 80 chains to 5.9°
for 40 chains can be seen to continue in the zenith resolution for MC, which achieves
5.6° with one chain (see fig. 3.7). The energy resolutions are even more similar, but still
show an improvement from 33% (40) to 30% (80).
3.4.3. Resolution for Mixed Monte Carlo/SPE4 Seed
A trial was made to evaluate the performance of DE under the condition that the
vertex position and interaction time are known to high accuracy. In such a case, which
for example can be possible using Multinest for reconstruction, it would be possible to
actually fix these four parameters at their initial values for the entire evolution. Not
only would this mean slightly less computation time spent on updating them and an
effective decrease in the number of dimensions, but the true maximum of the likelihood
function in the remaining free parameters would also better approach the true values.
To this end, a seed particle was prepared using the direction from SPE4, default cascade
energy and track length, and the Monte Carlo truth X, Y, Z and T. The latter four were
neither smeared before, nor updated during, the evolution. Again, population sizes of 40
and 80 were used.
The zenith resolution (fig. 3.12) is improved dramatically with respect to the SPE seed
(fig. 3.11) and even MC (fig. 3.10), at 3.8° for both 40 and 80 chains. In particular, the
region &lt; 10 GeV no longer shows a resolution as large in proportion to the total resolution,
while for the SPE seed the factor was ∼ 3. The energy resolutions (fig. 3.12) are also
similar at 33% and show the same energy dependence as before.
(a) Zenith resolution vs. energy: median 5.91° @ gen 600 (40 chains), 7.23° @ gen 300 (80 chains). (b) Relative energy resolution vs. energy: median 0.303 @ gen 600, 0.325 @ gen 300.
Fig. 3.11.: Resolutions for populations of 40 (left) and 80 (right) vs. true neutrino energy, with the SPE4 seed.
(a) Zenith resolution vs. energy: median 3.81° @ gen 600, 3.75° @ gen 300. (b) Relative energy resolution vs. energy: median 0.333 @ gen 600, 0.329 @ gen 300.
Fig. 3.12.: Resolutions for populations of 40 (left) and 80 (right) vs. true neutrino energy, with the mixed and partially-fixed seed.
Chapter 4.
Summary, Conclusions and Outlook
4.1. Summary
A minimizer based on Differential Evolution Markov Chains was employed for the likelihood
reconstruction of both a track and a cascade within PINGU. The minimizer was equipped
to compare it to the previously used Metropolis-Hastings Markov Chain minimizer.
Whether the original MC or this DE method has the more favourable resolution depends
on the quantity; in general, however, energy-derived quantities are better reconstructed.
DE takes 1.6× the time/step of MC. The difference between running DE with 40 and
80 chains appears mostly as a slight detriment to the zenith resolution. A trial
using the true vertex to fix its four parameters, however, improves the zenith resolution
below that of MC, which makes trials with an additional, better vertex reconstruction
promising. A drastic reduction in the required processing time was not observed; instead,
DE takes longer per step. The problem of chains diverging towards ever-greater track
lengths (along with the vertex moving to the detector edge), and hence driving up the
time for LLH function calls, was mitigated with a bound on the track length and jump
steps, so that events where the majority of chains exhibit this behaviour and significantly
affect the overall processing time no longer appear.
4.2. Conclusions
Differential Evolution is a minimizer that encompasses many choices of parameters and
modes of application, which have to be studied and chosen carefully, beyond the scope
of this project. The apparatus to systematically study these is in place, so that someone
continuing the study should be able to pick up at the same point. It may be seen as
encouragement that the method is not unambiguously worse than the method it was supposed to
replace, even though in the places where it marks an improvement it cannot be said
to be significantly faster, either. Furthermore, study of the method has revealed a wide
variety of modifications to explore, which are detailed in 4.3. In its current state, the
long CPU time required for DE makes Multinest a more viable default minimizer
for IceCube/PINGU.
4.3. Outlook
There are several points from which these studies can move forward. In the following,
these are detailed in several groups.
As has been shown in section 3.3.1, stuck chains remain part of the generation, which
is a point to investigate further.
The fact that they are little affected by jump steps might be a hint that
• these steps need to be constructed in a different way, for instance through the
snooker update
• there needs to be a heuristic to detect stuck chains, so that any measure against
them could be applied more discriminately
• the step magnitude is too big in general. This would not only impede the convergence,
but also mean that jump steps do not make any significant difference.
This could be due either to a too-large scale of the ~e variation, or to the γ factor.
Compared to MC, the speed/resolution performance is still lacking. Apart from variations
in the algorithm's parameters, several types of updates could be of help:
Crossover and Block Update
Both of these related ideas are described in the 2008 paper by ter Braak et al. [1]. A crossover
appears in many genetic algorithms. It involves replacing certain parameters in the trial
vector with the un-updated version according to a certain pattern, for instance binomially
with a crossover probability for each parameter. A special case of non-random crossover
simply divides the parameters into blocks, subsets that get updated in turn. This effectively
decreases the number of dimensions and with that the minimal generation size,
and correspondingly the time spent in each step.
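A binomial crossover can be sketched as follows (a hypothetical helper, not part of the DarwinizedChainer module):

```python
import random

def binomial_crossover(current, trial, cr=0.5, rng=random):
    """Keep each trial parameter with probability cr, else restore the
    un-updated value from the current vector."""
    out = []
    # Force at least one parameter to come from the trial vector, as is
    # conventional in DE, so the update is never a no-op.
    forced = rng.randrange(len(current))
    for i, (c, t) in enumerate(zip(current, trial)):
        out.append(t if (i == forced or rng.random() < cr) else c)
    return out
```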
Snooker Update
The module contains an implementation (::EvolveSnooker) of the snooker update
which I derived from the 2008 paper ([2]). It chooses the direction and magnitude of
the step separately. For a vector ~x, the update proceeds as follows:
1. Randomly choose a different chain ~z, yielding the direction ~n = ~z − ~x
2. Until the trial vector is in bounds:
a) Choose two more chains ~r1 ≠ ~r2
b) Project ~r1 − ~r2 onto ~n, yielding ~d
c) Set the trial ~y = ~x + γs ~d
3. Accept the trial with probability min(1, r), where

r = (L(~y)/L(~x)) × (‖~y − ~z‖ / ‖~x − ~z‖)^(d−1)   (4.1)

For d = 1, this is a special case of the standard update with ~e = 0.
One major advantage would be that it becomes more likely for stuck chains to be moved
towards the normal chains as the number of stuck chains decreases, the reverse of the case
for the standard update. There are, however, conceptual problems with the step magnitude,
which would be dominated by the parameters with the largest numerical scale,
i.e. time and to a lesser extent position, while the relative spread of the angles would
be virtually irrelevant. This could be mitigated by applying a sort of scaling factor,
or by separating the parameters into blocks. Either the step would be done separately in
each parameter subspace, with a common ~r1 and ~r2, or the whole update could be done
separately for each parameter block. Still, a trial of the snooker update, possibly in a
certain mixture with other updates, seems worthwhile.
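A sketch of such a snooker step, under the simplifying assumption of an unbounded parameter space (so the in-bounds loop is omitted); `loglh` is a hypothetical stand-in for the likelihood service:

```python
import math
import random

def snooker_update(x, population, loglh, gamma_s=1.7, rng=random):
    """One snooker step (after ter Braak & Vrugt 2008), acceptance per eq. 4.1."""
    others = [p for p in population if p is not x]
    z = rng.choice(others)
    n = [zi - xi for zi, xi in zip(z, x)]  # direction through x and z
    norm2 = sum(c * c for c in n)
    if norm2 == 0.0:
        return x  # x coincides with z; no direction to move along
    r1, r2 = rng.sample(others, 2)
    # Project r1 - r2 onto n, then step along n by gamma_s times that length
    proj = sum((a - b) * c for a, b, c in zip(r1, r2, n)) / norm2
    y = [xi + gamma_s * proj * c for xi, c in zip(x, n)]
    d = len(x)
    dist = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    # Metropolis test including the snooker correction factor (eq. 4.1)
    log_r = (loglh(y) - loglh(x)
             + (d - 1) * (math.log(dist(y, z)) - math.log(dist(x, z))))
    return y if math.log(rng.random()) < log_r else x
```

The `(d - 1)`-power correction compensates for the fact that proposals are drawn along a line rather than from the full d-dimensional space.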
DE-Z
In the DE-Z algorithm [2], every K steps the current population of size N is appended to
an archive matrix of size M = N × (step/K). At every step, the indices R1 and R2 are sampled from
1, . . . , M. This allows for DE-like updates with a generation size N &lt; d, and hence less
time per step.
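The bookkeeping can be sketched as follows (names are illustrative, not the module's interface):

```python
import random

class DEZArchive:
    """Minimal DE-Z sketch: every K steps the whole population is appended
    to an archive Z, and difference vectors are drawn from Z rather than
    from the current generation."""

    def __init__(self, population, k=10):
        self.k = k
        self.z = [list(p) for p in population]  # initial archive

    def maybe_append(self, step, population):
        if step % self.k == 0:
            self.z.extend(list(p) for p in population)

    def sample_difference(self, rng=random):
        r1, r2 = rng.sample(range(len(self.z)), 2)
        return [a - b for a, b in zip(self.z[r1], self.z[r2])]
```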
Other techniques that could be worth trying are
• elitism: the best members of a population get copied directly to the next generation
without any variation
• annealing: according to a “cooling schedule”, the probability of a step
towards a smaller likelihood gets decreased, so that the Metropolis-Hastings test
becomes r &lt; min(1, T(step) × LH(trial)/LH(current))
• a convergence criterion: define a metric (on the current generation or its history)
that interrupts the evolution when the waiting time for the next “better” vector is
above a pre-defined threshold with some certainty
• One parameter worth varying is left within the Millipede likelihood function. Millipede
defines time bins with variable width, given a maximum contained charge
and length for each bin.
• Also to improve convergence, the smearing size of the seed should be evaluated
so that it is not larger than necessary. If the smearing of the initial population is
too large compared to the resolution of the seed, the time for any chain to evolve
towards the maximum is increased, as is the risk of converging onto the wrong
maximum.
• To allow the method to be evaluated within the field of reconstruction methods for
PINGU, a common hit series (i.e. cleaning) and seed will have to be used. As the
current de-facto standard, the SANTA/monopod reconstruction's choices would
be wise to follow. This, and the fact that the current “baseline” geometry
is V15 instead of V6, means processing new simulation data. Following this, the
Multinest minimizer and SANTA/monopod reconstruction could also be drawn
into the comparison.
Chapter 5.
Acknowledgements
I wish to thank Carlos Perez de los Heros for accepting the job of supervisor for my
project and coming up with the topic. David Boersma helped me continuously with
a plethora of software and reconstruction questions and guided the overall structure
of the report. Both David and Carlos were also present with helpful commentary and
enduring patience during its writing. The entire Uppsala IceCube group deserves thanks
for welcoming me into their midst and into an environment that gave me many
opportunities to learn more about the physics and techniques involved with IceCube, as
well as a pleasant social climate. Both of these eventually convinced me to stay in the
group for my master's thesis. The transatlantic connection to IceCubers in Madison and
Pennsylvania was very valuable for their unexpected impulses and the encouragement
that comes with being taken seriously. Matt Dunkman and Ryan Eagan started the
DarwinizedChainer, without which I might be working with mysterious Fortran code,
and were helpful on questions about how to compare it to the Metropolis-Hastings
Markov Chain minimizer. Similarly, it rarely took long to get a response from the larger
IceCube community to the many questions I posted to the mailing lists.
Appendix A.
Software options and parameters
A.1. Example script
Option | Description | Default

Input and Output:
--gcd | The input file containing G(eometry), C(alibration) and D(etector status) frames | ""
--input_dir | Directory containing the input data files | ""
--file_base | Specific file in this directory | ""
--event | Specific event in the file to process; if -1, process multiple events | -1
--max_events | Maximum number of events to read in, in that case | 15
--output_dir | Directory containing the output files | ""
--suffix | String to append to the file names | ""

Options passed to the tray segment:
--statistics | Has the DarwinizedChainer module write a diagnostic ROOT file | (off)
--pulses | Pulse series from the frame on which the reconstruction acts | "WavedeformPulses_STW"
--seed_particle | Seed particle to create the initial generation from | "SPE4_PINGU"
--cascade_mode | Force track length 0 | "False"
--mc_vertex | Use the Monte Carlo truth for vertex position and time in the seed | (off)
--fixed_vertex | Do not smear or evolve the vertex position and time | (off)
--burn_in | Number of burn-in steps which are not counted towards the mean output vector | 50
--record_best_every | Frequency to record the current best vector | 50
--steps | Steps to take after burn-in | 150
--jump_steps | Frequency for jump steps; if 0, take none | 0
--population | Number of chains | 16
A.2. Tray script
Option | Description | Default
shower_spline_dir | Directory containing shower spline tables | ""
track_spline_dir | Directory containing track segment spline tables | ""
cascade_mode | Force zero track segments | False
output_base | Prefix for output object keys | "Darwin"
seed_particle | Name of seed particle to take from frame to seed (x,y,z,t,zen,azi); if "MC", take the Monte Carlo truth seed | "SPE4_PINGU"
pulses | Name of pulse series to use in LLH calculations | "WavedeformPulses_STW"
burn_in | Number of burn-in steps which are not counted towards the mean output vector | 50
record_best_every | Put the best vector to the frame every N steps | 50
steps | Steps to take after burn-in | 150
log_level | Logging level for HybridReco and the DarwinizedChainer | I3LogLevel.LOG_INFO
A.3. Module
Option | Description | Default
StatMapName | Name of the covariance map to add to the frame | "I3StatisticsMap"
OutputBase | Base for the output keys | "Darwin"
SeedServiceName | Name of seed service | ""
LLHServiceName | Name of likelihood service | ""
ParamServiceName | Name of parameter service | ""
Generations | Number of generations to evolve | 150
BurnInLength | Number of points to throw out during burn-in | 50
RecordBestEvery | Record best vector in frame every N steps | 50
NumChains | Number of chains to use in the evolution | 80
RootFileName | Name of ROOT file; write none if empty | ""
ScaleFactor | Scale factor for evolution addition | 0.2
Seed | Random number seed | seed
BigJumpSteps | Take a big jump every N steps, or none if 0 | 0
BestLLHKeep | Average the best N vectors for each of the last N steps | 5
DarwinizedChainer::Configure
The method checks the arguments (tab. A.3) and whether the services exist. It records
the names, bounds and steps of the parameters stored in the parameter service in the class
attribute vectors vector&lt;string&gt; names_, vector&lt;double&gt; steps_, vector&lt;double&gt; mins_
and vector&lt;double&gt; maxs_.
The names of the parameters, their default bounds and step sizes are:
Index | Parameter | Unit | [Min,Max] | ± Step | Comments
0, 1 | X, Y | m | [-500,500] | ± 2 | horizontal vertex position in standard IceCube coordinates [9]
2 | Z | m | [-500,500] | ± 5 | vertical vertex position
3 | T | ns | [9000,11000] | ± 5 | vertex/interaction time relative to the event trigger
4 | CosZenith | 1 | [-1,1] (wrapped) | ± 0.02 | cos(zenith) for the track
5 | Azimuth | rad | [0,2π] (wrapped) | ± 2 | azimuth angle of the track
6 | CascadeEnergy | GeV | [0,120] | ± 0.2 | energy of the cascade at the vertex
7 | NumTrackSegments | 1 | [0,35] | ± 0.1 | number of 15 m track segments; 1 GeV muon ≡ 4.5 m track
It also creates the TNtupleD objects which are later filled and stored in the statistics
ROOT file.
DarwinizedChainer::Physics
The method DarwinizedChainer::Physics implements the minimizer and is called for
each P frame processed. First, it creates the initial generation by
• retrieving the seed particle from the seed service
• for each chain:
– for each parameter where the step size is &lt; 0:
∗ as initial smearing, add a value sampled from a uniform distribution with a half-width of
X,Y,Z: 50 m; T: 100 ns; CosZenith: 0.1; Azimuth: 0.2 rad; CascadeEnergy: 10 GeV; NumTrackSegments: 5 segments
∗ re-sample until the parameter is within the boundaries
– appending the likelihood value (computed with the likelihood wrapper from 2.3)
• finding and recording the best vector
• and storing each of them in the currentMatrix, making up the starting generation.
The iteration over the defined number of steps follows. It is interrupted by a criterion
for time divergence (see 3.3.1) that uses the average time of the first 20 steps. Afterwards,
the total time may not exceed double the total time expected for the full iteration.
Each step is an iteration over the 80 vectors (the generation), in which each is updated,
either in the standard way or by alternative update methods.
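The initial-generation smearing described above can be sketched as follows (a minimal illustration with the half-widths and bounds from the tables above; the dictionary-based interface is hypothetical):

```python
import random

# Smearing half-widths and parameter bounds, taken from the tables above.
HALF_WIDTH = {"X": 50.0, "Y": 50.0, "Z": 50.0, "T": 100.0,
              "CosZenith": 0.1, "Azimuth": 0.2,
              "CascadeEnergy": 10.0, "NumTrackSegments": 5.0}
BOUNDS = {"X": (-500, 500), "Y": (-500, 500), "Z": (-500, 500),
          "T": (9000, 11000), "CosZenith": (-1, 1),
          "Azimuth": (0.0, 6.283185307179586),
          "CascadeEnergy": (0, 120), "NumTrackSegments": (0, 35)}

def smear_seed(seed, rng=random):
    """Offset each seed parameter by a uniform draw, re-sampling until the
    result lies within its bounds; returns one chain's starting vector."""
    chain = {}
    for name, value in seed.items():
        lo, hi = BOUNDS[name]
        w = HALF_WIDTH[name]
        while True:
            trial = value + rng.uniform(-w, w)
            if lo <= trial <= hi:  # re-sample until within the boundaries
                chain[name] = trial
                break
    return chain
```

Calling this once per chain yields the starting generation around the seed.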
Standard Update (::Evolve2006)
The standard update is described in the 2006 paper by ter Braak et al. [1]. As
described in 3.1, it consists of replacing
~x → ~x + γ (~r1 − ~r2) + ~e   (A.1)
according to a Metropolis-Hastings test, where ~r1 ≠ ~r2 are randomly chosen members of
the population, different from the current one, and ~e is sampled from an 8-dimensional
uncorrelated Gaussian. If necessary, the boundaries are enforced for each parameter
individually by re-choosing ~r1 and ~r2 and re-throwing ~e until the new value is within
them. This saves the time of re-doing the entire calculation for the sake of a single
parameter. The vertex coordinates are re-thrown together, however, since they are
expected to be closely correlated in the LLH function. A jump step (see also 3.3.1) sets
γ = 1.
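A sketch of this update (bounds handling omitted for brevity; `loglh` is a hypothetical stand-in for the likelihood service, and the default γ is the commonly recommended 2.38/√(2d)):

```python
import math
import random

def evolve_2006(x, population, loglh, gamma=None, e_scale=1e-4, rng=random):
    """One standard DE-MC update (eq. A.1) with a Metropolis-Hastings test."""
    d = len(x)
    if gamma is None:
        gamma = 2.38 / math.sqrt(2 * d)  # common default; a jump step sets gamma = 1
    others = [p for p in population if p is not x]
    r1, r2 = rng.sample(others, 2)
    # Trial vector: x + gamma * (r1 - r2) + e, with e small Gaussian noise
    trial = [xi + gamma * (a - b) + rng.gauss(0.0, e_scale)
             for xi, a, b in zip(x, r1, r2)]
    # Metropolis-Hastings acceptance on the log-likelihood difference
    if math.log(rng.random()) < loglh(trial) - loglh(x):
        return trial
    return x
```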
Output
The module puts the following to the frame:
Type | Name (with prefix "Darwin") | Description
I3Map&lt;string,double&gt; | DarwinBest | Best parameters and LLH after the full evolution
I3Map&lt;string,double&gt; | DarwinBest.step | DarwinBest after each recorded step
I3Map&lt;string,vector&lt;double&gt;&gt; | DarwinBestVector | The same information in one object, with processing time at &lt;step&gt;
I3Bool | DarwinCompleted | Whether the chain was interrupted by the time divergence criterion from 3.3.1
I3Map&lt;string,double&gt; | DarwinMeanAll | Mean of all vectors past burn-in
I3Map&lt;string,double&gt; | DarwinMeanBestN | Mean of the best N vectors of each of the last N steps
I3Particle | DarwinBestParticle | The same as DarwinBest as a particle, with fit status from DarwinCompleted
I3Double | DarwinTime | The total duration of the evolution
I3Double | HadronicFactor | The hadronic scaling factor in 3.3
The statistics ROOT file contains two trees:
DarwinizedChain — parameters, LLH, LLH call time, and whether it was actually updated, for each chain and each step
GenerationStats — hitherto best parameters, best LLH, duration of the whole step and the fraction of accepted updates for each step
Appendix B.
Supplementary figures
(Panels, top to bottom: median energy bias vs. step and portion of events where energy bias DE&lt;MC; median Eµ/Eν fraction resolution vs. step and its separation; median vertex resolution vs. step and its separation; median interaction time resolution vs. step and its separation.)
Fig. B.1.: Left: Resolutions for both MC (blue) and DE (red), right: separation betweenDE and MC, vs. step number. From top to bottom: Zenith, relative energyresolution, energy bias, fraction of track over total energy.
[Figure B.2: plots not recoverable from extraction. Eight panels vs. true Eν in GeV at step 25k; left column (MC): Median Energy Bias, Median Eµ/Eν Fraction Resolution, Median Vertex Resolution, Median Interaction Time Resolution; right column: portion of events where DE < MC for each quantity.]

Fig. B.2.: Left: Resolutions for both MC (blue) and DE (red), right: separation between DE and MC, vs. true neutrino energy in GeV, at step 25200. From top to bottom: Zenith, relative energy resolution, energy bias, fraction of track over total energy.
[Figure B.3: plots not recoverable from extraction. Eight panels vs. cos(θ) at step 25k; left column (MC): Median Zenith Resolution, Median Relative Energy Resolution, Median Energy Bias, Median Eµ/Eν Fraction Resolution; right column: portion of events where DE < MC for each quantity.]

Fig. B.3.: Left: Resolutions for both MC (blue) and DE (red), right: separation between DE and MC, vs. true cosine of zenith, at step 25200. From top to bottom: Zenith, relative energy resolution, energy bias, fraction of track over total energy.
[Figure B.4: plots not recoverable from extraction. Four panels vs. time in s (MC): Median Energy Bias, Median Eµ/Eν Fraction Resolution, Median Vertex Resolution, Median Interaction Time Resolution.]

Fig. B.4.: Resolutions vs. time for MC (blue) and DE (red). From top to bottom: Energy bias, track/total energy fraction resolution, vertex resolution, interaction time resolution. The asymptotic behaviour is due to statistics tailing off at high times.
[Figure B.5: 2D histograms not recoverable from extraction. Panels vs. true Eν in GeV at generations 600 and 300: Energy Bias (gen 600: mean -6.76, RMS 13.79; gen 300: mean -5.36, RMS 14.49), Eµ/Eν Fraction Resolution (medians 0.186 / 0.202), Vertex Resolution (medians 4.08 m / 5.58 m), Interaction Time Resolution (medians 8.05 ns / 12.25 ns).]

Fig. B.5.: Resolutions for populations of 40 (left) and 80 (right) with the SPE4 seed, vs. true neutrino energy in GeV. From top to bottom: Energy bias, track/total energy fraction resolution, vertex resolution, interaction time resolution. The red curves for energy bias are mean and RMS, for the rest they are medians.
[Figure B.6: 2D histograms not recoverable from extraction. Panels vs. true Eν in GeV at generations 600 and 300: Energy Bias (gen 600: mean -11.48, RMS 10.25; gen 300: mean -11.24, RMS 10.39), Eµ/Eν Fraction Resolution (medians 0.196 / 0.198).]

Fig. B.6.: Resolutions for populations of 40 (left) and 80 (right) with the mixed and partially-fixed seed, vs. true neutrino energy in GeV. Top: Energy bias, bottom: track/total energy fraction resolution. The red curves for energy bias are mean and RMS, for the fraction they are medians. The vertex resolution is not shown since these parameters were fixed during this run.
Bibliography

[1] C.J.F. ter Braak, A Markov Chain Monte Carlo version of the genetic algorithm Differential Evolution: easy Bayesian computing for real parameter spaces, Stat Comput (2006) 16:239-249

[2] C.J.F. ter Braak, J.A. Vrugt, Differential Evolution Markov Chain with snooker updater and fewer chains, Stat Comput, DOI 10.1007/s11222-008-9104-9

[3] M.G. Aartsen et al., Energy Reconstruction Methods and Performance in the IceCube Neutrino Detector

[4] R. Abbasi et al., The Design and Performance of IceCube DeepCore, arXiv:1109.6096 [astro-ph.IM]

[5] D. Williams, The Precision IceCube Next Generation Upgrade, presented at DPF 2013, arXiv:1310.1287 [astro-ph.IM]

[6] M.G. Aartsen et al., Measurement of South Pole ice transparency with the IceCube LED calibration system, Appendix B, arXiv:1301.5361 [astro-ph.IM]

[7] M. Kowalski, Search for Neutrino-Induced Cascades with the AMANDA-II Detector (PhD thesis, January 13th 2004), ch. 3.4.2, p. 21ff

[8] L. Radel, Simulation Studies of the Cherenkov Light Yield from Relativistic Particles in High-Energy Neutrino Telescopes with Geant4 (Master thesis, September 2012), ch. 4.3.2

[9] IceCube coordinate system on the IceCube wiki

[10] N. Whitehorn, J. van Santen, S. Lafebre, Penalized Splines for Smooth Representation of High-dimensional Monte Carlo Datasets, arXiv:1301.2184 [physics.data-an]

[11] J. Lundberg et al., Light tracking through ice and water - Scattering and absorption in heterogeneous media with PHOTONICS, arXiv:astro-ph/0702108

[12] R. Abbasi et al., Calibration and Characterization of the IceCube Photomultiplier Tube, ch. 11, arXiv:1002.2442 [astro-ph.IM]

[13] IceCube Bootcamp 2008 presentation on Hits, PEs and Waveforms, p. 11