
Project Report

Reconstruction of PINGU data with a Differential Evolution Minimizer

Christoph Raab

January 9, 2014

Department of Physics and Astronomy

Supervisor: Carlos Perez de los Heros; Reviewer: David Boersma

Abstract

The Precision IceCube Next Generation Upgrade (PINGU) is planned to have an energy threshold of ≲ 10 GeV in order to resolve the neutrino mass hierarchy. In order to reconstruct the energy and direction of neutrinos interacting in this array, producing both a hadronic cascade and a muon track, advanced reconstruction methods need to be employed. A class of these seeks to maximize a complicated likelihood function within an 8-dimensional parameter space describing the event, and requires sophisticated minimizers to achieve the necessary resolution in a reasonable time. In this report, a pre-existing but hitherto unused minimizer, which samples that parameter space with several Markov chains at once and is based on the Differential Evolution Monte Carlo algorithm, is developed further, and its behaviour and performance are tested on simulated data of the IceCube/PINGU array. The tests compare various configurations of the minimizer with each other and with Markov Chain Monte Carlo, a similar previous approach.

Contents

1. Introduction
   1.1. Neutrino Telescopes
   1.2. Track reconstruction
2. Simulation, Processing and Reconstruction Chain
   2.1. Simulation
   2.2. Processing
   2.3. Likelihood Function
3. Differential Evolution Monte Carlo
   3.1. Differential Evolution Algorithm
   3.2. Implementation
      3.2.1. Example Script
      3.2.2. Tray Segment
      3.2.3. Module
   3.3. Problems and challenges
      3.3.1. Stuck chains
      3.3.2. Cascade Energy
   3.4. Performance
      3.4.1. Comparison to Markov Chain Monte Carlo
      3.4.2. Resolution for SPE4 Seed
      3.4.3. Resolution for Mixed Monte Carlo/SPE4 Seed
4. Summary, Conclusions and Outlook
   4.1. Summary
   4.2. Conclusions
   4.3. Outlook
5. Acknowledgements
A. Software options and parameters
   A.1. Example script
   A.2. Tray script
   A.3. Module
B. Supplementary figures
Bibliography

Chapter 1.

Introduction

1.1. Neutrino Telescopes

High-energy cosmic neutrinos impinge upon the Earth at a high flux from the entire sky and can be produced in a variety of sources. Large-scale instruments that are used to observe these sources by measuring the direction of the neutrinos are called neutrino telescopes. One such neutrino telescope is installed at the South Pole and uses the Antarctic ice as a target for neutrinos to interact in. In the case of the weak charged-current interaction of a muon-flavour neutrino, this creates a muon alongside a hadronic cascade from the interacting nucleus:

\bar{\nu}_\mu + N \to \mu^+ + N^* \qquad (1.1)
\nu_\mu + N \to \mu^- + N^* \qquad (1.2)

The muon and nuclear fragments deposit their energy into secondary particles, which in turn induce Cherenkov light in the ice, with absorption lengths typically between 90 m and 270 m, strongly depending on the ice layer. The muon does so along a line, while the cascade creates a mostly point-like and nearly isotropic light source (see fig. 1.1). If this light is captured by an array of sensors, it is possible to infer the direction of the incoming neutrino with a certain precision. This opens up the possibility of neutrino astronomy, where neutrino sources from the entire sky can be observed. An irreducible background to cosmic neutrinos consists of neutrinos created by the interaction of cosmic rays in the atmosphere.

IceCube is an array of 86 strings of 60 digital optical modules (DOMs) each, buried between depths of 1500 m and 2500 m. The modules consist of large photomultipliers, which can resolve the shape of pulses due to single Cherenkov photons, and readout electronics, which send data to the surface via a common cable, where the detector is triggered according to coincidence conditions. The strings are spread out over a square kilometer of surface and instrument a cubic kilometer of ice in total. The resulting string spacing of 125 m leads to an energy threshold of 50 - 100 GeV, corresponding to the shortest muon track that can be resolved, which makes IceCube mostly useful for observing high-energy neutrinos. These can e.g. give information about the origin of high-energy cosmic rays. However, neutrino telescopes can also be used to detect signatures of dark matter particles gravitationally accumulated in certain locations such as the Sun or the Galactic center, or to measure the oscillation of neutrinos through the Earth. For these purposes, a lower energy threshold is required, and IceCube contains the DeepCore inset array of more closely spaced strings with, in turn, 50 more closely spaced DOMs of 35% higher quantum efficiency. This lowers the energy threshold of the combined detector to 10 GeV by gaining light yield for low-energy events, and improves the resolution of short tracks. DeepCore’s position at the bottom center of the array, from a depth of 2100 m to 2450 m, allows the rest of IceCube and a layer of 10 further DOMs per string above 2000 m to act as a veto for the DeepCore sub-array [4].

Fig. 1.1.: Diagram of the light sources (muon track with Cherenkov cone, and cascade) from a ν_μ charged-current interaction inside the IceCube array. Dimensions labelled in the figure: 17 m (IceCube) and 5 m/3 m (PINGU); 125 m (IceCube) and 14 - 26 m (PINGU).

PINGU is a planned extension of the IceCube/DeepCore array with an inset of even higher-density strings. Neutrinos oscillate between their flavours due to having three different mass eigenstates, m_1...3. In vacuum, these oscillations are only parametrized by the absolute values |\Delta m^2_{ij}| = |m_i^2 - m_j^2|. In matter, however, the MSW effect is sensitive to the signs as well, and by measuring the oscillations of atmospheric neutrinos through the Earth, PINGU will be employed to resolve the neutrino mass hierarchy (NMH). In the oscillation, the two “free” parameters are the energy of the neutrino and the oscillation length. The latter corresponds to the zenith angle of the neutrino arriving in the detector. Hence, the resolution of both will need to be improved, especially for E < 10 GeV, where the effect is expected to be most strongly pronounced [5].

The geometry used in this report is called “V6” (fig. 1.2). It consists of 20 strings, with a horizontal spacing of 26 m, placed within the central eight strings that make up DeepCore. Each string has 60 DOMs with a vertical spacing of 5 m.

1.2. Track reconstruction

The energies of muons which may leave the array (≳ 100 GeV) are estimated from the energy loss rate along the track, which is approximately linear in energy (∝ E) at ≳ 500 GeV [3]. For the low energies of concern here, the track is most likely contained in the detector, and its energy loss has a constant rate of ≈ 0.222 GeV/m, so that the muon energy can be estimated from the total length of the track. However, the energy lost through the hadronic cascade is comparable to the energy lost in the track (if any), so that E_ν + M_nucleus = E_cascade + E_track. This means that for PINGU, the event hypothesis is a combination of the light yield from charged particles originating at one point and along a line segment. Such a hypothesis is described by 8 parameters in total: the interaction time, the vertex position, the zenith and azimuth angles, E_cascade and E_track. The last step from the actual event to these parameters is the reconstruction of the signal from all the individual DOMs. Models describing the light production in the ice, its propagation and the detector response result in a likelihood function L, which is the probability of the measurement given a certain hypothesis. A likelihood reconstruction seeks to find the parameter vector \vec{x} that maximizes L(event|\vec{x}), or L(\vec{x}) for short. Especially for a low number of detected photons or channels (hit DOMs), this function has several local maxima. This multimodality is a challenge that all minimizing algorithms face. In addition, a standard reconstruction method is needed already before the construction of PINGU in order to evaluate its sensitivity. Several of these are being studied, like SANTA/monopod, which was already used for IceCube/DeepCore, and Multinest, which is currently being developed. This report considers a new minimizer that explores the 8-dimensional parameter space at several points simultaneously, based on an algorithm called Differential Evolution Monte Carlo (DE). This algorithm is then tested in various configurations on simulated data of the PINGU/IceCube array, and compared in performance to Markov Chain Monte Carlo (MC), a similar previous approach.

Fig. 1.2.: Left: A top view of the IceCube/DeepCore array with PINGU in the V6 geometry (“IceCube-86 with PINGU v6 inlay”). Right: A close view of V6, an inset of 20 strings (26 m spacing) with 60 high-quantum-efficiency DOMs each (5 m vertical spacing).

Chapter 2.

Simulation, Processing and Reconstruction Chain

2.1. Simulation

Reconstruction studies for PINGU necessarily rely on simulated data. For the events studied in this report, ν_μ and ν̄_μ with a flat spectrum between 1 GeV and 80 GeV were thrown into the V6 geometry of IceCube/PINGU (fig. 1.2). Their interaction was simulated with the GENIE software, and the resulting light propagated assuming the SPICE-Mie ice model, i.e. a combination of absorption and scattering lengths depending on the depth in the ice. This model describes the optical properties of the ice at the South Pole and was developed based on dedicated calibration runs. When a photon impinges upon a DOM, the detector response is simulated accordingly, creating a waveform. This includes for instance afterpulses, which can follow the main photon-induced pulse like an echo with a ∼ 6 µs delay due to the ionization of residual gases inside the PMT [12]. The readout is triggered according to the waveform. On top of the hits resulting from this simulation, noise hits are added.

The events were also selected according to the simulated location of the vertex, imposing R < 50 m and −400 m < Z < −200 m, with R the distance from the central string (#36) and Z the depth in IceCube coordinates [9]. This containment criterion could also be implemented in a less ideal form on simulated or real data, and ensures that the reconstruction studies are in particular relevant to PINGU. Furthermore, neutral-current reactions, which would not contain a muon, were removed, since a) they do not match the expected event signature and, more importantly, b) the outgoing neutrino would carry away an unknown amount of energy, which would make studying the energy resolution of the reconstruction pointless.

2.2. Processing

These simulation files were then processed as experimental data would be. For each hit DOM, the waveform as output by the digitizers (fig. 2.1) undergoes a feature extraction that linearly decomposes it into a series of pulses with a certain time, width and charge. These pulses then pass through a cleaning filter that seeks to remove noise. In our case, the static time window cleaning was used, which simply selects hits within a constant time window relative to each event’s trigger time. A method that was previously used is the sliding time window, which seeks the time window of a pre-defined length that contains the greatest number of pulses, and disregards those outside. Using this type of cleaning could however lead to cleaning away hits from low-energy events if there was a coincident event of higher energy, even when the dimmer one triggered the detector. This behaviour is naturally undesirable for low-energy analysis. After that, only events that had more than 10 hits were allowed to pass. On the hits within the PINGU subarray, an SPE4 reconstruction was performed as a preliminary guess for the following ones. This reconstruction is based on the SPE1st reconstruction, which minimizes a likelihood computed by using the first hit from each DOM and the Pandel likelihood function, which is an analytical approximation to the probability of a DOM being hit, given a track hypothesis. This is plugged into the I3IterativeFitter module from the gulliver suite. This module varies the result, minimizes again, and after four iterations returns the fit result with the highest logL out of the five as the SPE4 fit result.
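The difference between the two cleaning strategies described above can be illustrated with a minimal sketch. It is not the IceCube implementation: the function names, the window parameters and the toy pulse times are assumptions chosen for illustration only.

```python
def static_window(pulse_times, trigger_time, t_before, t_after):
    """Keep pulses inside a fixed window placed relative to the trigger time."""
    return [t for t in pulse_times
            if trigger_time - t_before <= t <= trigger_time + t_after]


def sliding_window(pulse_times, length):
    """Keep the pulses inside the window of given length that holds the most pulses."""
    times = sorted(pulse_times)
    best_start, best_count = 0, 0
    end = 0
    for start in range(len(times)):
        # Advance 'end' past every pulse within 'length' of the window start.
        while end < len(times) and times[end] - times[start] <= length:
            end += 1
        if end - start > best_count:
            best_start, best_count = start, end - start
    return times[best_start:best_start + best_count]


# Toy example (times in ns): a dim event that triggered at t = 0,
# followed by a brighter coincident event around t = 1000.
pulses = [0.0, 10.0, 20.0, 1000.0, 1005.0, 1010.0, 1015.0]
kept_static = static_window(pulses, trigger_time=0.0, t_before=100.0, t_after=100.0)
kept_sliding = sliding_window(pulses, length=50.0)
```

In this toy case the static window keeps the three pulses of the triggering event, while the sliding window keeps only the four pulses of the brighter coincident event, which is the undesirable behaviour mentioned in the text.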

Fig. 2.1.: The output of the four different amplifiers present on a DOM. The FADC chip digitizes the PMT output signal with a long readout window (6400 ns) after passing a shaping amplifier, and the ATWD chip captures shorter time windows (422 ns) at a higher resolution while making use of three gain levels. Figure from [13].

2.3. Likelihood Function

The likelihood function is taken from Millipede, a reconstruction software developed within the IceCube collaboration (project millipede). The model at the basis of the Millipede likelihood function is one of Poissonian statistics (see ch. 2ff of [3]). For each DOM and each time bin within the duration that it recorded a waveform, signified by the index i, the measured quantity k_i is the deposited charge in units of photoelectrons (PE), where one PE corresponds to a single photon. It is expected to follow a Poissonian distribution

L_i = \frac{\lambda_i^{k_i}}{k_i!} e^{-\lambda_i}

The mean λ_i is the expected number of photons. On top of that, there is expected to be a noise level ρ_i, so that λ_i → λ_i + ρ_i. The complete likelihood is then the product of the L_i, or

\log L = \sum_i \log L_i = \sum_i \left[ k_i \ln(\lambda_i + \rho_i) - (\lambda_i + \rho_i) - \ln k_i! \right] \qquad (2.1)

For DOMs that were not hit within the readout time window, the logL sum over time bins simplifies to a single term, log P(no hit), where P(no hit) is the probability that not a single photon reached this particular DOM during the entire readout time.
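Eqn. 2.1 and its no-hit limit can be written down in a few lines. The following is only a sketch of the formula, not Millipede's code; the function names are hypothetical, and ln k_i! is evaluated with the log-gamma function so that non-integer charges are handled as well (an assumption of this sketch).

```python
import math


def poisson_log_likelihood(k, lam, rho):
    """Sum over bins of k_i ln(lam_i + rho_i) - (lam_i + rho_i) - ln k_i! (eqn. 2.1).

    k   : measured charge per bin in PE
    lam : expected number of photons per bin (must give lam + rho > 0)
    rho : expected noise per bin
    """
    total = 0.0
    for k_i, lam_i, rho_i in zip(k, lam, rho):
        mu = lam_i + rho_i
        total += k_i * math.log(mu) - mu - math.lgamma(k_i + 1.0)
    return total


def log_p_no_hit(lam, rho):
    """No-hit limit: with all k_i = 0, only the -(lam_i + rho_i) terms remain."""
    return -sum(lam_i + rho_i for lam_i, rho_i in zip(lam, rho))
```

For a DOM without hits both functions agree, since k_i = 0 removes the logarithmic term and ln 0! = 0.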

Millipede was configured to use time bins of variable width that contain at most 1 PE and are at most 200 ns long. The λ_i are retrieved from so-called spline tables, which are based on Monte Carlo simulation of the detector response for idealized forms of sources. These are Cherenkov-emitting track segments of 15 m length, corresponding to minimally ionizing muons, and isotropically emitting point sources for hadronic cascades. The resulting expectation values λ_i were then fitted to spline functions [10] on the parameter space, and the fit results stored in lookup tables, from which the λ_i for any given configuration of sources, i.e. event hypothesis, can be computed more accurately than from analytical models [3]. Even so, the table lookups make up the largest portion of the likelihood function call, which itself dominates the processing time of likelihood reconstruction methods.

Since the photospline tables are only computed for integer numbers of track segments, likelihood values for track lengths that lie between two multiples of 15 m are approximated by a linear interpolation. This is implemented in the LikelihoodWrapper module, which was written by M. Dunkman as part of the hybrid-reco project.
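The interpolation between whole numbers of track segments might look like the sketch below. This does not reproduce the LikelihoodWrapper interface: the function name and the callable `llh_for_segments`, which stands in for a likelihood evaluation with an integer number of 15 m segments, are assumptions for illustration.

```python
SEGMENT_LENGTH_M = 15.0  # length of one tabulated track segment


def interpolated_log_likelihood(track_length_m, llh_for_segments):
    """Linearly interpolate logL between the two neighbouring whole-segment tracks.

    llh_for_segments(n) is a hypothetical callable returning logL for a
    track made of n segments of 15 m each.
    """
    n_low = int(track_length_m // SEGMENT_LENGTH_M)
    frac = track_length_m / SEGMENT_LENGTH_M - n_low
    llh_low = llh_for_segments(n_low)
    if frac == 0.0:
        # Track length is an exact multiple of 15 m: no interpolation needed.
        return llh_low
    llh_high = llh_for_segments(n_low + 1)
    return (1.0 - frac) * llh_low + frac * llh_high
```

Note that a track length between two multiples costs two likelihood evaluations in this sketch, one for each neighbouring number of segments.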

Chapter 3.

Differential Evolution Monte Carlo

3.1. Differential Evolution Algorithm

One way to find the maximum of L(\vec{x}) is to simply sample points \vec{x} out of the 8-dimensional parameter space (interaction time, vertex position, the zenith and azimuth angles, E_cascade and E_track), compute L(\vec{x}) for each and take the best one. An attempt to make this sampling more efficient than taking an 8-dimensional grid is called Markov Chain Monte Carlo. It proceeds as follows:

1. Choose a seed as the starting vector \vec{x}, for instance a first guess of the parameters.

2. Sample a new point \vec{y} from a Gaussian distribution around \vec{x}.

3. Replace \vec{x} by \vec{y} with a probability of min(1, L(\vec{y})/L(\vec{x})). (This is called a Metropolis-Hastings test.)

4. Reiterate from the current point. This progression is called a chain.

After a number of burn-in steps, the distribution of points in the history of the chain will be a sampling of the input distribution L, so the points are denser where L is higher, i.e. in the interesting region(s) of parameter space. The one parameter vector with the highest L will then be chosen as the reconstructed maximum.
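The four steps above can be sketched as follows. This is an illustration only, not the MC code used for the comparison in this report: the function name, the per-parameter Gaussian widths and the toy likelihood in the usage lines are assumptions. The acceptance test is done in log space, which is equivalent to min(1, L(y)/L(x)).

```python
import math
import random


def metropolis_chain(log_l, seed, step_sizes, n_steps, rng):
    """Run one chain and return the best (parameters, logL) it has visited."""
    x = list(seed)
    llh_x = log_l(x)
    best_x, best_llh = list(x), llh_x
    for _ in range(n_steps):
        # Step 2: Gaussian proposal around the current point.
        y = [x_d + rng.gauss(0.0, s_d) for x_d, s_d in zip(x, step_sizes)]
        llh_y = log_l(y)
        # Step 3: accept with probability min(1, L(y)/L(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < llh_y - llh_x:
            x, llh_x = y, llh_y
            if llh_x > best_llh:
                best_x, best_llh = list(x), llh_x
    return best_x, best_llh


# Toy usage: a 2-dimensional Gaussian logL with its maximum at (3, 3).
toy_log_l = lambda p: -0.5 * sum((p_d - 3.0) ** 2 for p_d in p)
best_x, best_llh = metropolis_chain(toy_log_l, [0.0, 0.0], [0.5, 0.5],
                                    2000, random.Random(1))
```

The fixed `step_sizes` illustrate the trade-off mentioned in the text: small steps converge slowly, while the chosen step size also determines whether the chain can leave a local maximum.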

This method is still rather slow in approximating the true maximum. It also carries with it the possibility of the chain converging to a non-global maximum, depending on the chosen step size. Differential Evolution Monte Carlo (DE) is an attempt to accelerate the convergence and make it more robust against multimodality.

DE acts upon several chains in parallel. At each step, they are collectively called a generation. In summary:

1. Smear the seed into the initial generation (taking N samples out of a Gaussian).

2. For the i-th vector \vec{x}_i, calculate an updated version \vec{x}_p:

   a) Randomly choose two other, non-identical vectors \vec{x}_{r1} and \vec{x}_{r2} out of the generation.

   b) Sample a vector \vec{e} from a distribution that is small compared to the spread within the generation.

   c) Compute

      \vec{x}_p = \vec{x}_i + \gamma (\vec{x}_{r2} - \vec{x}_{r1} + \vec{e}) \qquad (3.1)

      (The factor γ is 0.2 by default, the \vec{e} widths are the steps in A.3.)

3. Replace \vec{x}_i by \vec{x}_p according to a Metropolis-Hastings test as in MC.

4. Reiterate.
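A compact sketch of this scheme is given below. It is not the DarwinizedChainer module: the function name, argument names and toy likelihood are assumptions, chains are updated one after another within a generation (also an assumption), and no parameter bounds are applied.

```python
import math
import random


def de_mc(log_l, seed, seed_widths, step_sizes, n_chains, n_steps, gamma, rng):
    """Differential Evolution Monte Carlo sketch; returns the best (parameters, logL)."""
    dim = len(seed)
    # Step 1: smear the seed into the initial generation.
    gen = [[seed[d] + rng.gauss(0.0, seed_widths[d]) for d in range(dim)]
           for _ in range(n_chains)]
    llh = [log_l(x) for x in gen]
    best_i = max(range(n_chains), key=lambda i: llh[i])
    best_x, best_llh = list(gen[best_i]), llh[best_i]
    for _ in range(n_steps):
        for i in range(n_chains):
            # Step 2a: two other, distinct chains (requires n_chains >= 3).
            r1, r2 = rng.sample([j for j in range(n_chains) if j != i], 2)
            # Step 2b: small random vector e.
            e = [rng.gauss(0.0, s_d) for s_d in step_sizes]
            # Step 2c: eqn. 3.1.
            x_p = [gen[i][d] + gamma * (gen[r2][d] - gen[r1][d] + e[d])
                   for d in range(dim)]
            llh_p = log_l(x_p)
            # Step 3: Metropolis-Hastings test in log space.
            if math.log(1.0 - rng.random()) < llh_p - llh[i]:
                gen[i], llh[i] = x_p, llh_p
                if llh_p > best_llh:
                    best_x, best_llh = list(x_p), llh_p
    return best_x, best_llh


# Toy usage: 10 chains on a 2-dimensional Gaussian logL peaked at (3, 3).
toy_log_l = lambda p: -0.5 * sum((p_d - 3.0) ** 2 for p_d in p)
de_best_x, de_best_llh = de_mc(toy_log_l, [0.0, 0.0], [1.0, 1.0], [0.1, 0.1],
                               n_chains=10, n_steps=200, gamma=0.2,
                               rng=random.Random(0))
```

Because the proposal is built from the difference of two current chain positions, the typical step length follows the spread of the generation without any manual tuning of a step size.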

The unique stationary distribution of the sampled parameter vectors is L, as in MC (proof in [1]). In this method, when the spread of the generation goes down, so will the variation from step to step, which could accelerate the convergence to the stationary distribution. Taking the “best” vector out of several chains also means that it is enough for part of them to converge to the region of the true maximum. On the other hand, the rest of the generation will still use up processing time for its updates.

3.2. Implementation

A basic version of a DE-based likelihood reconstruction was already implemented by Ken Clark and Matt Dunkman (PSU) as a module within the likelihood-scanner project (sandbox/mdunkman/likelihood-scanner/trunk at rev 105435). I updated the order of parameters to the one currently used by HybridReco (sandbox/reagan/hybrid-reco/trunk at rev 100077); added timing functionality, more options and output keys, including the possibility to write a log of the entire evolution to a separate file; made it possible to pass bounds and step sizes to the method from within a script via the HybridReco parameter service; synchronized these values with the defaults of the MC method; fixed miscellaneous bugs; implemented new kinds of parameter-vector updates; and packaged it in scripts which can be used by the IceCube software framework IceTray, as described below.

3.2.1. Example Script

There is an example script in likelihood-scanner/resources/examples/darwin_chain.py. This tray script

• reads in an input file

• skips neutral-current events

• computes the hadronic scaling factor F. This factor corrects for the fact that a hadronic cascade has a smaller light yield (or visible energy) than an electromagnetic cascade of the same energy, due to neutral hadrons (like neutrons). The photospline tables are calculated for electromagnetic cascades, so the actual cascade energy relates to the reconstructed visible energy by

  E_{actual} = \frac{E_{visible}}{F(E_{actual})} \qquad (3.2)

  The factor can be parametrized as [8]

  F(E) = 1 - 0.690 \left( \frac{\max(2.7\,\mathrm{GeV},\, E)}{0.188\,\mathrm{GeV}} \right)^{-0.162} \qquad (3.3)

  Since the current implementation requires Monte Carlo truth input, this factor is not used for the evaluation of this method’s resolution.

• includes the tray segment likelihood_scanner.darwin_chain described below

• and finally writes output in the form of .i3 files and via tableio in ROOT or HDF format.

Its command-line options are listed in tab. A.1.
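The parametrization of eqn. 3.3 is straightforward to evaluate. The sketch below also shows one possible way to solve eqn. 3.2 for the actual energy when only the visible energy is known, using a simple fixed-point iteration. That iteration is an assumption for illustration and is not part of the example script, which takes the true energy from the Monte Carlo information; the function names are hypothetical.

```python
def hadronic_factor(energy_gev):
    """Hadronic scaling factor F(E) of eqn. 3.3, with E in GeV."""
    e = max(2.7, energy_gev)
    return 1.0 - 0.690 * (e / 0.188) ** -0.162


def actual_energy(visible_gev, n_iter=50):
    """Solve E_actual = E_visible / F(E_actual) (eqn. 3.2) by fixed-point iteration."""
    e = visible_gev
    for _ in range(n_iter):
        e = visible_gev / hadronic_factor(e)
    return e
```

Below 2.7 GeV the factor is constant (about 0.55), and it rises slowly with energy, so the relative correction is largest for low-energy cascades.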

3.2.2. Tray Segment

The tray segment is included in the above example tray script. It

• sets the log-level for the involved modules

• adds the Photospline services (I3PhotoSplineServiceFactory) for both tracks and cascades

• adds the time window in which pulses could have been recorded by the simulated read-out, for use by the likelihood via I3WaveformTimeRangeCalculator. Simulation files which contained both a noise-free pulse series and one with added noise used to have a bug in the trigger simulation which allowed noise pulses to prompt the error “Millipede: Pulse time before readout window start”.

• adds the Millipede likelihood service. This service in turn is configured to use the aforementioned muon and cascade photospline services and the time window, and to count 1 pulse per variable-width time bin.

• adds a seed particle to the frame. In the case of a Monte Carlo seed, this particle has position, time and energy from the cascade, and length and direction from the muon track. In the case of the infinite-track SPE4_PINGU seed, the latter is appended with a default length of 50 m and energy of 25 GeV. If chosen, the position and time are replaced by the MC values.

• stores the seed in the HybridReco seed service (HybridRecoSeedServiceFactory)

• adds the HybridReco parameter service (HybridRecoParametrizationServiceFactory) and stores step sizes and bounds for all parameters therein (see tab. A.3), which are later used by the minimizer. A step size of 0 leaves the parameter fixed.

These services were taken from the project hybrid-reco (sandbox/mdunkman/likelihood-scanner/trunk at rev 105435).

• Finally, the DarwinizedChainer module is added and provided with likelihood, parameter and seed services.

• The tray returns a list of output keys, which can e.g. be passed to a table writer.

Options passed to the tray segment are described in tab. A.2.

3.2.3. Module

The module implements the actual minimizer as a C++ class which inherits from the I3Module class. On inclusion into a tray script, it receives a set of parameters which are handled (once) in the method DarwinizedChainer::Configure (see A.3). These are detailed in tab. A.3. Among these parameters are several external pieces of software:

• a seed service, from which the seed particle is retrieved

• a parameter service, which supplies the bounds and step sizes used during the evolution update (see A.3)

• a likelihood service, containing the likelihood function to be maximized

The general form of the iteration is described in A.3. Upon completion, the best parameter-and-LLH vector and its history, along with information about the CPU time used, are written into the output file. It is possible to write an additional file containing the complete evolution of all individual chains. For more details, see A.3.

3.3. Problems and challenges

3.3.1. Stuck chains

While most chains gradually improve their likelihood over the course of the evolution, some chains stay stuck at a lower likelihood which is clearly separate from the rest of the generation. In fig. 3.1, this is shown once for 80 chains on one specific event, and once for two particular chains representing each case. Examining individual chains (see fig. 3.1) shows that the stuck chains are updated with constant frequency, but seem to oscillate around a broad, local maximum. In comparison, the higher-likelihood chains slowly converge to a maximum, as the parameter space they can move into shrinks.

Separating these two subsets of chains with a logL cut and plotting the X-Y coordinates of the vertex in fig. 3.2 shows that the stuck chains progress from the initial (seed) region to the edges of the detector volume. Plotting the duration of a likelihood-function call versus the number of track segments for these two sets in fig. 3.2, it is apparent that the stuck chains extend their tracks farther from the edge of the detector. These track segments lie outside the array, so there are no DOMs nearby that would penalize them with Millipede’s P(no hit) (see 2.3). The energy of the cascade meanwhile increases to match the expected photoelectrons from an interaction at the edge of the array to the hits inside the array, while the interaction time has to decrease to accommodate the increased propagation time.

Fig. 3.1.: The evolution of logL (LLH) with the steps for one event. Note the relative scale of their variations. (a) All chains; the stuck chains are below -880. (b) A normal chain (chainNum==13). (c) A stuck chain (chainNum==71).

The hits inside the array will only contribute a flat, noise-like logL from each track segment which is too far removed compared to the scale of the absorption length. Hence, in fig. 3.2, the linear fits are

time for normal chains [s] = 0.18 + 0.09 × N_segments
time for stuck chains [s] = 0.22 + 0.04 × N_segments

The out-of-reach tracks take about half the time per track segment. One track segment corresponds to one lookup of the photospline table (see 2.3) for each hit, so the higher number of track segments still increases the average time (2.2 s → 7.9 s).
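As a rough cross-check of these numbers, the two linear fits can be inverted at the quoted average call times. This is only a back-of-the-envelope sketch: it assumes that the averages lie on the fitted lines, which the report does not state, and the helper name is hypothetical.

```python
def segments_from_time(call_time_s, offset_s, slope_s_per_segment):
    """Invert a linear fit t = offset + slope * N_segments for N_segments."""
    return (call_time_s - offset_s) / slope_s_per_segment


# Fit parameters and average call times as quoted in the text.
n_normal = segments_from_time(2.2, 0.18, 0.09)
n_stuck = segments_from_time(7.9, 0.22, 0.04)
```

Under this assumption the normal chains correspond to roughly 22 track segments and the stuck chains to roughly 190, so the lower cost per segment of the stuck chains is far outweighed by their much longer tracks.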

Comparing the logL call time for two representative chains in fig. 3.3 shows that the track in the “stuck” chain continues to grow almost steadily. For some events, the proportion of stuck chains can be so high that their diverging track length is clearly reflected in an overall diverging computation time per step (fig. 3.3). The non-divergent events, on the other hand, contain a smaller fraction of stuck chains.

This behaviour is unwelcome not only due to the increase in computation cost, but also because evolution steps that involve taking a difference between one conservative, high-logL and one extremely distant, low-logL region are unlikely to improve the logL of chains in the high-logL region, and therefore the current best likelihood. Several measures were tried to control it.

Parameter Boundaries

Since the divergence in time is due to a divergence in track length and is accompanied by variations in cascade position, time, and energy, a straightforward solution to this problem is to implement boundaries on all of these parameters. In particular, the track length hitherto had no upper boundary, and the vertex position and time were bound far too loosely compared to what could be expected of the data.

Fig. 3.2.: Left (a): The vertex for normal chains (gradient colours) and stuck chains (red) in one event, with the bounds indicated. Right (b): Time of a logL function call for one event vs. number of track segments, separated into normal chains (green) and stuck chains (red), showing means with RMS error bars per bin, linear fits and the upper boundary.

Jump Steps

In a jump step, the γ parameter in eqn. 3.1 is set to 1, so the update becomes

\vec{x}_p = \vec{x}_i + (\vec{x}_{r2} - \vec{x}_{r1} + \vec{e}) \qquad (3.4)

The idea is that stuck chains could be removed from their local maximum by a step with a randomly chosen difference vector between the stuck sub-population and the converging sub-population. The random choice would then result in an exponential decay of the stuck population size. However, the stuck chains cover a broad region in parameter space, which increases the time constant of this decay. The other chains are affected by the jump step, too. The time spectrum (fig. 3.4) did not change qualitatively for runs seeded with Monte Carlo truth, whereas the divergent chains were cut off for the SPE4 seed. This effect, however, remains to be separated from that of the track length boundary.

The effects of both jump steps and parameter boundaries are examined in fig. 3.5, where the time per step as well as the best logL are shown vs. step number for the same event being processed with and without these measures, once with an SPE seed and once with a Monte Carlo truth seed. Of the four curves, the only one diverging in time is the one for the SPE seed without jump steps and with unbound parameters. The times of the SPE runs are comparable before the divergence, and the MC run’s time is clearly higher with the counter-divergence measures before it becomes compatible after ∼ 250 steps. Note that this excludes the actual jump steps, which have significantly higher times. For both seeds, the logL is enhanced by bound parameters, with a Monte Carlo seed always exceeding an SPE seed.

Hard Time Limit

The divergence of processing time for a whole event can be mitigated by simply interrupting the evolution as soon as the total time violates an estimated limit:

\sum t \le 2 \times N_{steps} \times \langle t \rangle_{\text{first 20 steps}} \qquad (3.5)

Once all the above measures are implemented, the fraction of events cancelled by this condition is on average 0.1% for both 40 chains and 80 chains, one event being cancelled in each run with an SPE seed. The fractions were 1% (40) and 0.4% (80) for a Monte Carlo seed. The reason for this was not investigated. The time per step is then well-bound to ≤ 165 s/step and does not diverge, as seen in fig. 3.4, in contrast to the previous upper limit of 634 s/step for a diverging event.
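The condition of eqn. 3.5 can be sketched as follows. To keep the example self-contained it operates on a list of already measured step times rather than timing a real evolution; the function name and arguments are assumptions and do not correspond to options of the module.

```python
def steps_before_time_limit(step_times, n_steps, n_reference=20, factor=2.0):
    """Return how many steps are completed before eqn. 3.5 is violated.

    step_times  : time spent in each step, in seconds
    n_steps     : planned total number of steps
    n_reference : number of initial steps used to estimate the mean step time
    factor      : the factor 2 of eqn. 3.5
    """
    total = 0.0
    limit = float("inf")  # no limit until the reference steps are done
    for done, t in enumerate(step_times[:n_steps], start=1):
        total += t
        if done == n_reference:
            limit = factor * n_steps * total / n_reference
        if total > limit:
            return done  # evolution interrupted after this step
    return min(n_steps, len(step_times))


# Toy usage: 100 planned steps, the step time jumps from 1 s to 10 s after step 20.
well_behaved = steps_before_time_limit([1.0] * 100, n_steps=100)
diverging = steps_before_time_limit([1.0] * 20 + [10.0] * 80, n_steps=100)
```

In the toy example the limit is 200 s; the well-behaved event runs all 100 steps, while the diverging one is interrupted after 39 steps.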

Fig. 3.3.: Time per step [s] for a stuck chain (red) vs. step number (no jump steps). (a) Compared to number of track segments (gradient colours). (b) Compared to a normal chain (blue).

Fig. 3.4.: Distribution of time per step [s]. Blue: after all parameter bounds, jump steps, and time limits have been implemented. Red: Before. (a) Blue: from a large sample of events. Red: from a smaller sample. (b) Just SPE4 seed. (c) Just Monte Carlo seed.

3.3.2. Cascade Energy

The photospline tables (see 2.3) are actually computed for electromagnetic cascades. The likelihood reconstruction is hence expected to ideally give the energy of an electromagnetic cascade that had the same light yield as the hadronic cascade in the data. Since in a hadronic cascade more energy is deposited stochastically into neutral particles such as neutrons, which do not result in Cherenkov light, this would lead on average to an underestimation of the energy of the hadronic cascade. This underestimation is reflected in a bias of −9.7 GeV (averaged over the used data sample) when comparing the reconstructed cascade energy to the energy of the cascade as taken from the Monte Carlo truth information. This effect gets more pronounced towards low energies, as shown in Fig. 3.6. If one applies the analytical approximation of the factor, as computed from the true cascade energy, as an unphysical correction factor to the reconstructed cascade energy, this bias reduces to −4.8 GeV, which is smaller than the RMS of 9.9 GeV. Any way to apply this correction using purely reconstructed data was thought to involve a separate study to arrive at a different correction function, and hence no such attempts were made.

[Figure 3.5, two panels for one event, vs. step number: (a) Time per step [s] vs. step. (b) Best logL evolution.]

Fig. 3.5.: Green: SPE seed. Blue: Monte Carlo seed. Dashed: unbound track length, no jump steps. Solid: bound track length, 1/20 jump steps. The separate points are for the jump steps.

3.4. Performance

Several quantities were computed to describe the deviation between reconstruction and Monte Carlo truth. Here, the form $\Delta X$ means $X_\text{reco} - X_\text{true}$.

The first two are the most interesting for physical analyses:

• Relative energy error $\left|\frac{\Delta E}{E_\text{true}}\right|$, with the total reconstructed and true neutrino energy respectively

• Zenith error $|\Delta\theta|$

Two more are energy-related, also with physical relevance:

• Energy bias $\Delta E$, the signed difference of total reconstructed and true neutrino energy. This is expected to have a bias proportional to true energy, see sec. 3.3.2

• Fraction error $\left|\Delta\frac{E_\mu}{E_\nu}\right|$, which evaluates how well the actual hybrid hypothesis is matched

Vertex-related quantities which affect the reconstruction quality of others (see sec. 3.4.3):

• Vertex position $|\Delta\vec{R}|$, the distance between the true and reconstructed vertex.

• Interaction time $|\Delta T|$

The performance of the reconstruction method is then evaluated by e.g. the median of these errors across the event sample, called the resolution.

Plots are shown here for the first group, and in the appendix B for the rest.
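As an illustration of how the first group of quantities and the resolution relate, the following Python sketch computes the relative energy error and the zenith error per event and takes their medians. The event values and all function names are invented for this example and do not come from the analysis code.

```python
from statistics import median


def relative_energy_error(e_reco, e_true):
    """|Delta E / E_true| with Delta E = E_reco - E_true."""
    return abs((e_reco - e_true) / e_true)


def zenith_error(zen_reco, zen_true):
    """|Delta theta| in the same unit as the inputs (degrees here)."""
    return abs(zen_reco - zen_true)


def resolution(errors):
    """The resolution is taken as the median error across the event sample."""
    return median(errors)


# Made-up events: energies in GeV, zenith angles in degrees.
events = [
    {"e_reco": 8.0, "e_true": 10.0, "zen_reco": 95.0, "zen_true": 90.0},
    {"e_reco": 30.0, "e_true": 20.0, "zen_reco": 118.0, "zen_true": 120.0},
    {"e_reco": 36.0, "e_true": 40.0, "zen_reco": 150.0, "zen_true": 140.0},
]

energy_res = resolution(
    [relative_energy_error(ev["e_reco"], ev["e_true"]) for ev in events]
)
zenith_res = resolution(
    [zenith_error(ev["zen_reco"], ev["zen_true"]) for ev in events]
)
print(energy_res, zenith_res)
```

Using the median rather than the mean keeps single badly reconstructed events from dominating the quoted resolution.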

3.4.1. Comparison to Markov Chain Monte Carlo

First, a comparison to the original MC has to be made. Both methods were run on the same events, with the same SPE4 seeds, and their best vector recorded at regular intervals of steps, along with the time elapsed since the start of the iteration (see A.3). For DE, the effective number of steps is given by the number of generations times the size of the generation. The chosen generation size was 40, since previous tests showed promising results with this number, which is on the scale of the recommended 10 × d (see 3.1). A population of 10 × d = 80 was also tried, but limits on computing time interrupted the reconstruction on the majority of the events.

[Figure 3.6, two panels: (a) Parametrization: scaling factor F(E) vs. E/GeV. (b) Reconstruction mean ± rms: "Cascade Energy Reconstruction/Truth vs. Truth, Mean 0.57".]

Fig. 3.6.: Left: Average visible over actual energy for hadronic cascades, parameterized according to L. Radel [8]. Right: Reconstructed over true energy, with mean (red) and mean ± rms (blue). Note the different ranges and large width of the distribution.

The median resolutions for both methods can then be plotted over the step number at which they were achieved; i.e., the resolution if the iterations were interrupted at this step. These plots are shown in the left column of fig. 3.7. The difference between two medians does not contain all information about the relative merits. A second type of quantity is therefore also computed: the portion of events where a certain resolution improved when moving from MC to DE, called the separation. These make up the right column of fig. 3.7.
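The separation can be sketched in a few lines of Python. The function name and the numbers are invented for illustration; the example is chosen so that both methods have the same median error while the separation still differs from 0.5, which is the information the median alone does not carry.

```python
from statistics import median


def separation(errors_de, errors_mc):
    """Portion of events in which the DE error is smaller than the MC error."""
    if len(errors_de) != len(errors_mc):
        raise ValueError("both methods must be evaluated on the same events")
    better = sum(1 for de, mc in zip(errors_de, errors_mc) if de < mc)
    return better / len(errors_de)


# Made-up per-event errors for the same five events.
errors_de = [1.0, 2.0, 3.0, 20.0, 30.0]
errors_mc = [2.0, 3.0, 3.0, 4.0, 5.0]

print(median(errors_de), median(errors_mc))  # 3.0 3.0
print(separation(errors_de, errors_mc))      # 0.4
```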

[Figure 3.7, two rows of panels vs. step (5000 to 30000): (a) Zenith resolution and separation: "Median Zenith Resolution vs. Step" (Δθ [°]) and "Portion of events where Zenith Resolution DE<MC vs. Step". (b) Relative energy resolution and separation: "Median Relative Energy Resolution vs. Step" (ΔE/E) and "Portion of events where Relative Energy Resolution DE<MC vs. Step".]

Fig. 3.7.: Left: Resolutions for both MC (blue) and DE (red), plotted vs. step number. Right: Separation between MC and DE, expressed as the portion of events where DE<MC.


The zenith resolution (fig. 3.7, top) of DE declines from a larger starting value of 11.8°, which has to be due to the initial smearing (A.3). Meanwhile the MC zenith resolution loses its rate of decline quicker, leading to a close approach under 5.8° (DE) and 5.5° (MC), with a separation between 0.48 and 0.5, after 25k steps. The relative energy resolution (fig. 3.7, bottom) begins at a slightly higher level, but here the separation crosses 0.5 at 2000 steps already and ends up above 0.62 after 25k steps, corresponding to resolutions of 30% (DE) and 38% (MC).

An interesting point to evaluate is also the processing time per step for each method. This quantity depends on the computing resources available, in this case the NPX4 cluster at the University of Wisconsin in Madison. However, it is apparent in Fig. 3.8 that DE differs clearly from MC in that regard, with almost twice the processing time per step.

[Figure 3.8, three panels: (a) Separation of times vs. step. (b) time(DE) [s] vs. time(MC) [s] at step 30k. (c) Mean time/step [s] for DE (red) and MC (blue) vs. step.]

Fig. 3.8.: Comparison of running times for DE and MC.

This motivates a second type of plot, namely the resolutions at a certain time, but including all events that already have concluded their iterations; i.e., the resolution if the iterations were interrupted after this CPU time. These are shown in fig. 3.9.

Here, MC resolutions can be seen reaching a minimum between 15 ks and 20 ks, with the few events still running after that point worsening the resolution. The more rapid descent of DE in zenith resolution is balanced by the longer time, so that the two methods actually achieve the same resolution at 27 ks.

[Figure 3.9, two panels vs. time [s] (0 to 60000): (a) Zenith resolution vs. time: "Median Zenith Resolution vs. Time" (Δθ [°]). (b) Relative energy resolution vs. time: "Median Relative Energy Resolution vs. Time" (ΔE/E).]

Fig. 3.9.: Resolutions vs. time for MC (blue) and DE (red). The resolution at a certain time corresponds to the condition that the iteration would be interrupted either after this time is reached or at the maximum number of steps. Hence, the asymptotic behaviour is not intrinsic to the method, but rather a result of the limited step number leading to fewer events completing their iteration after a certain time.

For benchmark purposes, a display step of 25200 is chosen, which corresponds to the step after which DE remains underneath the zenith resolution it achieves at the maximum step of 30000. The resolutions and separations can then be measured at this step, but varying over the (true neutrino) energy (fig. 3.10) and cosine of zenith (fig. B.3).

[Figure 3.10, two rows of panels vs. true E_ν [GeV] (0 to 80): "Median Zenith Resolution vs. Energy at step 25k" (Δθ [°]) and "Portion of events where Zenith Resolution DE<MC vs. Energy at step 25k"; "Median Relative Energy Resolution vs. Energy at step 25k" (ΔE/E) and "Portion of events where Relative Energy Resolution DE<MC vs. Energy at step 25k".]

Fig. 3.10.: Left: Resolutions for both MC (blue) and DE (red), right: separation between DE and MC, vs. true neutrino energy, at step 25200. From top to bottom: Zenith, relative energy resolution, energy bias, fraction of muon over total energy.


The main deficiency of DE w.r.t. MC in zenith resolution is at energies < 20 GeV, which is weighted more strongly in the overall median resolution due to the $E^{-2}$ input spectrum. Both relative energy resolutions increase from the low to the high end of the energy range, but stay in the same order of magnitude. This is consistent with the clear dependence of the energy bias in fig. B.2. Due to DE starting to rise at a lower energy than MC, the separation dips to a minimum of 0.51 between 40 and 50 GeV. Meanwhile the separation is 0.75 above 70 GeV and 0.65 below 10 GeV.

3.4.2. Resolution for SPE4 Seed

The DE method is run for two different population sizes, 40 and 80. The "display" step number of 25200 is then reached after 630 and 315 generations respectively. Since multiples of 50 were recorded in this run, generations 600 and 300 are used for plots in fig. 3.11.

The zenith resolution always improves with energy, at least until 50 GeV. It is intuitively understandable that the events < 10 GeV with few hits are the hardest to reconstruct with a track direction. The slight improvement from 7.2° for 80 chains to 5.9° for 40 chains can be seen to continue in the zenith resolution for MC, which achieves 5.6° with one chain (see fig. 3.7). The energy resolutions are even more similar, but still show an improvement from 33% (80) to 30% (40).

3.4.3. Resolution for Mixed Monte Carlo/SPE4 Seed

A trial was made to evaluate the performance of DE under the condition where the vertex position and interaction time were known to a high accuracy. In such a case, which for example can be possible using Multinest for reconstruction, it would be possible to actually fix these four parameters at their initial values for the entire evolution. Not only would this mean slightly less computation time spent on updating them and an effective decrease in the number of dimensions, but the true maximum of the likelihood function in the remaining free parameters would also better approach the true values. To this end, a seed particle was prepared using the direction from SPE4, default cascade energy and track length, and the Monte Carlo truth X, Y, Z and T. The latter four were neither smeared before, nor updated during the evolution. Again, population sizes of 40 and 80 were used.

The zenith resolution (fig. 3.12) is improved dramatically with regard to the SPE seed (fig. 3.11) and even MC (fig. 3.10), arriving at 3.8° for both 40 and 80 chains. In particular, the resolution below 10 GeV is no longer as large in proportion to the total resolution, while for the SPE seed the factor was ∼ 3. The energy resolutions (fig. 3.12) are also similar at 33% and show the same energy dependence as before.

[Figure 3.11, two rows of panels vs. true E_ν [GeV] (0 to 80): (a) Zenith resolution Δθ [°], one panel titled "Zenith Resolution (DE 80) vs. Energy @ gen 600, Median 5.91". (b) Relative energy resolution ΔE/E, one panel titled "Relative Energy Resolution (DE 80) vs. Energy @ gen 600, Median 0.303".]

Fig. 3.11.: Resolutions for populations of 40 (left) and 80 (right) vs. true neutrino energy, with the SPE4 seed.

[Figure 3.12, two rows of panels vs. true E_ν [GeV] (0 to 80): (a) Zenith resolution |Δθ| [°]. (b) Relative energy resolution ΔE/E.]

Fig. 3.12.: Resolutions for populations of 40 (left) and 80 (right) vs. true neutrino energy, with the mixed and partially-fixed seed.

Chapter 4.

Summary, Conclusions and Outlook

4.1. Summary

A minimizer based on Differential Evolution Markov Chains was employed for likelihood reconstruction of both a track and a cascade within PINGU. The minimizer was equipped for a comparison with the previously used Metropolis-Hastings Markov Chain minimizer. Whether the original MC or this DE method has the more favourable resolution depends on the quantity. In general, however, energy-derived quantities are better reconstructed. DE takes 1.6× the time/step of MC. The difference between running DE with 40 and 80 chains appears mostly in a slight detriment to the zenith resolution. A trial using the true vertex to fix its four parameters, however, improves the zenith resolution below that of MC, which makes trials with an additional, better vertex reconstruction promising. A drastic reduction in the required processing time was not observed; instead, DE takes longer per step. The problem of chains diverging towards ever-greater track lengths (along with the vertex moving to the detector edge), and hence driving up the time for LLH function calls, was mitigated with a bound on the track length and jump steps, so that events where the majority of chains exhibit this behaviour and significantly affect the overall processing time no longer appear.

4.2. Conclusions

Differential Evolution is a minimizer that encompasses many choices of parameters and modes of application, which have to be studied and chosen carefully, beyond the scope of this project. The apparatus to systematically study these is in place, so that someone continuing the study should be able to pick up at the same point. It might be seen as encouragement that it is not unambiguously worse than the method it was supposed to replace, even though in the places where it marks an improvement it cannot be said to be significantly faster, either. Furthermore, study of the method has revealed a wide variety of modifications to explore, which are detailed in 4.3. In its current state, the long CPU time required for the DE makes Multinest a more viable default minimizer for IceCube/PINGU.


4.3. Outlook

There are several points from which these studies can move forward. In the following, these are detailed in several groups.

As has been shown in section 3.3.1, stuck chains remain part of the generation, which is a point to further investigate.

The fact that they are little affected by jump steps might be a hint that

• these steps need to be constructed in a different way, for instance through the snooker update

• there needs to be a heuristic to detect stuck chains, so that any measure against them could be applied more selectively

• the step magnitude is too big in general. This would not only impede the convergence, but also mean that jump steps do not make any significant difference. This could be either due to a too-large scale of the $\vec{e}$ variation, or the γ factor.

Compared to MC, the speed/resolution performance is still lacking. Apart from variations in the algorithm's parameters, several types of updates could be of help:

Crossover and Block Update

Both these related ideas are described in the 2008 paper by ter Braak et al. [1]. A crossover appears in many genetic algorithms. It involves replacing certain parameters in the trial vector with the un-updated version according to a certain pattern, for instance binomially with a crossover probability for each parameter. A special case of non-random crossover simply divides the parameters into blocks, subsets that get updated in turn. This effectively decreases the number of dimensions and with that the minimal generation size, corresponding to the time spent in each step.
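The two variants can be sketched in Python as follows. This is only an illustration and not part of the module: the function names are hypothetical, and forcing at least one parameter to be taken from the proposal is an assumption borrowed from common Differential Evolution practice, not something stated above.

```python
import random


def binomial_crossover(current, proposal, cr, rng):
    """Take each parameter from the proposal with probability cr,
    otherwise keep the un-updated value of the current vector."""
    forced = rng.randrange(len(current))  # assumption: always update one parameter
    return [
        p if (k == forced or rng.random() < cr) else c
        for k, (c, p) in enumerate(zip(current, proposal))
    ]


def block_update(current, proposal, block):
    """Non-random crossover: only the parameter indices in `block` are updated."""
    return [p if k in block else c for k, (c, p) in enumerate(zip(current, proposal))]


rng = random.Random(0)
current = [0.0] * 8   # 8 parameters, as for the hybrid hypothesis
proposal = [1.0] * 8

print(binomial_crossover(current, proposal, cr=0.5, rng=rng))
# Update only the vertex block (X, Y, Z, T), assuming it occupies indices 0-3.
print(block_update(current, proposal, block={0, 1, 2, 3}))
```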

Snooker Update

The module contains an implementation (::EvolveSnooker) of the snooker update which I derived from the 2008 paper ([2]). It chooses the direction and magnitude of the step separately. For a vector $\vec{x}$, the update proceeds as follows:

1. Randomly choose a different chain $\vec{z}$, yielding the direction $\vec{n} = \vec{z} - \vec{x}$

2. Until the trial vector is in bounds:

a) Choose two more $\vec{r}_1 \neq \vec{r}_2$.

b) Project $\vec{r}_1 - \vec{r}_2$ onto $\vec{n}$, yielding $\vec{d}$

c) Set trial $\vec{y} = \vec{x} + \gamma_s \vec{d}$

3. Accept the trial with probability min(1, r), where

$$r = \frac{L(\vec{y})}{L(\vec{x})} \left( \frac{\|\vec{y} - \vec{z}\|}{\|\vec{x} - \vec{z}\|} \right)^{d-1} \qquad (4.1)$$

For d = 1, this is a special case of the standard update with $\vec{e} = 0$.

One major advantage would be that it becomes more likely for stuck chains to be moved towards the normal chains as the number of stuck chains decreases, the reverse of the case for the standard update. There are however conceptual problems with step magnitude, which would be dominated by the parameters with the largest numerical scale, i.e. time and to a lesser extent position, while the relative spread of the angles would be virtually irrelevant. This could be mitigated by applying a sort of scaling factor, or separating the parameters into blocks. Either the step would be done separately in each parameter subspace, with a common $\vec{r}_1$ and $\vec{r}_2$, or the whole update could be done separately for each parameter block. Still, a trial of the snooker update, possibly in a certain mixture with other updates, seems worthwhile.
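To make the projection step and the acceptance ratio of eq. (4.1) concrete, here is a minimal Python sketch. It is not the ::EvolveSnooker code: the function names are hypothetical, the bounds loop of step 2 is left out, and the likelihoods enter as log-likelihoods, which is an assumption about how they would be available.

```python
import math


def snooker_trial(x, z, r1, r2, gamma_s):
    """Project r1 - r2 onto the direction n = z - x and step along it."""
    n = [zi - xi for zi, xi in zip(z, x)]
    norm2 = sum(ni * ni for ni in n)
    diff = [a - b for a, b in zip(r1, r2)]
    scale = sum(di * ni for di, ni in zip(diff, n)) / norm2
    d_vec = [scale * ni for ni in n]  # projection of (r1 - r2) onto n
    return [xi + gamma_s * di for xi, di in zip(x, d_vec)]


def snooker_ratio(logl_y, logl_x, x, y, z):
    """Acceptance ratio r of eq. (4.1), with log-likelihoods as input."""
    dim = len(x)
    return math.exp(logl_y - logl_x) * (math.dist(y, z) / math.dist(x, z)) ** (dim - 1)


# Two-dimensional toy example with equal likelihood at x and y.
x, z = [0.0, 0.0], [2.0, 0.0]
r1, r2 = [1.0, 3.0], [0.0, 1.0]
y = snooker_trial(x, z, r1, r2, gamma_s=1.0)
print(y)                                 # [1.0, 0.0]
print(snooker_ratio(0.0, 0.0, x, y, z))  # 0.5
```

The toy example shows the point made about numerical scales: only the component of $\vec{r}_1 - \vec{r}_2$ along $\vec{n}$ survives the projection, so a parameter with a large numerical spread would dominate both the direction and the distance factor.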

DE-Z

In the DE-Z algorithm [2], every K steps the current population of size N is added into a matrix of $M = N \times \frac{\text{step}}{K}$ rows. At every step, the indices R1 and R2 are sampled from 1, . . . , M. This allows for DE-like updates with a generation size N < d, and hence less time per step.
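The bookkeeping of the archive matrix can be sketched in Python as below. The function names are hypothetical, the noise term of the standard update is left out for brevity, and the archive starts empty here so that its size follows M = N × step/K; an actual run would presumably need a filled archive before the first proposal.

```python
import random


def update_archive(archive, population, step, k):
    """Every k steps, append a copy of the current population to the archive."""
    if step % k == 0:
        archive.extend(list(vec) for vec in population)


def archive_proposal(x, archive, gamma, rng):
    """DE-like proposal whose difference vector is drawn from the archive."""
    r1, r2 = rng.sample(range(len(archive)), 2)  # two distinct rows R1, R2
    return [xi + gamma * (a - b) for xi, a, b in zip(x, archive[r1], archive[r2])]


rng = random.Random(3)
population = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]  # N = 3 chains
archive = []
for step in range(1, 31):
    update_archive(archive, population, step, k=10)

print(len(archive))  # M = 3 * 30 / 10 = 9
trial = archive_proposal(population[0], archive, gamma=0.5, rng=rng)
```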

Other techniques that could be worth trying are

• elitism: the best members of a population get directly copied to the next generation without any variation

• annealing: according to a "cooling schedule", the scale of the probability of a step towards a smaller likelihood gets decreased, so that the Metropolis-Hastings test becomes r < min(1, T(step) × LH(trial)/LH(current))

• a convergence criterion: define a metric (on the current generation or its history) that interrupts the evolution when the waiting time for the next "better" vector is above a pre-defined threshold with some certainty

• One parameter worth varying is left within the Millipede likelihood function. Millipede defines time bins with variable width, given a maximum contained charge and length for each bin.

• Also to improve convergence, the smearing size of the seed should be evaluated so that it is not larger than necessary. If the smearing of the initial population is too large compared to the resolution of the seed, the time for any chain to evolve towards the maximum is increased, as is the risk of converging onto the wrong maximum.

• To allow the method to be evaluated within the field of reconstruction methods for PINGU, a common hit series (i.e. cleaning) and seed will have to be used. It would be wise to follow the choices of the SANTA/monopod reconstruction, the current de-facto standard. This, and the fact that the current "baseline" geometry is V15 instead of V6, means processing new simulation data. Following this, the Multinest minimizer and SANTA/monopod reconstruction could also be drawn into comparison.
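Of the techniques listed above, the annealed Metropolis-Hastings test lends itself to a short Python sketch. The exponential cooling schedule and all names and constants are assumptions made for this illustration; the text only fixes the form of the test, r < min(1, T(step) × LH(trial)/LH(current)).

```python
import random


def cooling(step, t0=1.0, decay=0.99, t_min=0.05):
    """Hypothetical cooling schedule: exponential decay with a lower floor."""
    return max(t_min, t0 * decay**step)


def annealed_accept(lh_trial, lh_current, step, rng):
    """Accept if a uniform random number r is below min(1, T * LH ratio)."""
    probability = min(1.0, cooling(step) * lh_trial / lh_current)
    return rng.random() < probability


rng = random.Random(5)
# Early in the evolution a better trial is always accepted ...
print(annealed_accept(2.0, 1.0, step=0, rng=rng))
# ... while late steps with an equal likelihood are accepted only rarely.
late = sum(annealed_accept(1.0, 1.0, step=10_000, rng=rng) for _ in range(10_000))
print(late / 10_000)
```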


Chapter 5.

Acknowledgements

I wish to thank Carlos Perez de los Heros for accepting the job of a supervisor for my project and coming up with the topic. David Boersma helped me continuously with a plethora of software and reconstruction questions and guided the overall structure of the report. Both David and Carlos were also present with helpful commentary and enduring patience during its writing. The entire Uppsala IceCube group deserves thanks for welcoming me into their midst and into an environment that gave me both many opportunities to learn more about the physics and techniques involved with IceCube, as well as a pleasant social climate. Both of these eventually convinced me to stay in the group for my master's thesis. The transatlantic connection to IceCubers in Madison and Pennsylvania was very valuable for their unexpected impulses and the encouragement that comes with being taken seriously. Matt Dunkman and Ryan Eagan started the DarwinizedChainer, without which I might be working with mysterious Fortran code, and were helpful on questions about how to compare it to the Metropolis-Hastings Markov Chain minimizer. Similarly it rarely took long to get a response from the larger IceCube community to the many questions I posted to the mailing lists.


Appendix A.

Software options and parameters

A.1. Example script

Input and Output:

Option | Description | Default
--gcd | The input file containing G(eometry), C(alibration) and D(etector status) frames | ""
--input_dir | Directory containing the input data files | ""
--file_base | Specific file in this directory | ""
--event | Specific event in the file to process, if -1 process multiple events | -1
--max_events | Maximum number of events to read in, in that case | 15
--output_dir | Directory containing the output files | ""
--suffix | String to append to the file names | ""

Options passed to the tray segment:

Option | Description | Default
--statistics | Has the DarwinizedChainer module write a diagnostic ROOT file | (off)
--pulses | Pulse series from the frame on which the reconstruction acts | "WavedeformPulses_STW"
--seed_particle | Seed particle to create the initial generation from | "SPE4_PINGU"
--cascade_mode | Force track length 0 | "False"
--mc_vertex | Use the Monte Carlo truth for vertex position and time in the seed | (off)
--fixed_vertex | Do not smear or evolve the vertex position and time | (off)
--burn_in | Number of burn-in steps which are not counted towards the mean output vector | 50
--record_best_every | Frequency to record the current best vector | 50
--steps | Steps to take after burn-in | 150
--jump_steps | Frequency for jump steps, if 0 take none | 0
--population | Number of chains | 16


A.2. Tray script

Option | Description | Default
shower_spline_dir | Directory containing shower spline tables | ""
track_spline_dir | Directory containing track segment spline tables | ""
cascade_mode | Force zero track segments | False
output_base | Prefix for output object keys | "Darwin"
seed_particle | Name of seed particle to take from frame to seed (x,y,z,t,zen,azi); if MC, take Monte Carlo truth seed | "SPE4_PINGU"
pulses | Name of pulse series to use in LLH calculations | "WavedeformPulses_STW"
burn_in | Number of burn-in steps which are not counted towards the mean output vector | 50
record_best_every | Put the best vector to frame every N steps | 50
steps | Steps to take after burn-in | 150
log_level | Logging level for HybridReco and the DarwinizedChainer | I3LogLevel.LOG_INFO


A.3. Module

Option | Description | Default
StatMapName | Name of the covariance map to add to frame | "I3StatisticsMap"
OutputBase | Base for the output keys | "Darwin"
SeedServiceName | Name of seed service | ""
LLHServiceName | Name of likelihood service | ""
ParamServiceName | Name of parameter service | ""
Generations | Number of generations to evolve | 150
BurnInLength | Number of points to throw out during burn in | 50
RecordBestEvery | Record best vector in frame every N steps | 50
NumChains | Number of chains to use in the evolution | 80
RootFileName | Name of root file, write none if empty | ""
ScaleFactor | Scale factor for evolution addition | 0.2
Seed | Random number seed | seed
BigJumpSteps | Take a big jump every N steps, or none if 0 | 0
BestLLHKeep | Average the best N vectors for each of the last N steps | 5

DarwinizedChainer::Configure

The method checks the arguments (tab. A.3) and whether the services exist. It records the names, bounds and steps of the parameters stored in the parameter service in the class attribute vectors vector<string> names_, vector<double> steps_, vector<double> mins_ and vector<double> maxs_.

The names of the parameters, their default bounds and step sizes are:

Index | Parameter | Unit | [Min,Max] | ± Step | Comments
0, 1 | X, Y | m | [-500,500] | ± 2 | horizontal vertex position in standard IceCube coordinates [9]
2 | Z | m | [-500,500] | ± 5 | (vertical vertex position)
3 | T | ns | [9000,11000] | ± 5 | vertex/interaction time relative to the event trigger
4 | CosZenith | 1 | [-1,1] (wrapped) | ± 0.02 | cos(zenith) for the track
5 | Azimuth | rad | [0,2π] (wrapped) | ± 2 | azimuth angle of the track
6 | CascadeEnergy | GeV | [0,120] | ± 0.2 | energy of the cascade at the vertex
7 | NumTrackSegments | 1 | [0,35] | ± 0.1 | number of 15 m track segments, 1 GeV muon ≡ 4.5 m track

It also creates the TNTupleD objects which are later filled and stored in the statistics ROOT file.

DarwinizedChainer::Physics

The method DarwinizedChainer::Physics implements the minimizer and is called for each P frame processed. First, it creates the initial generation by

• retrieving the seed particle from the seed service

• for each chain:

– For each parameter where the step size is > 0:

∗ As initial smearing, add a value sampled from a uniform distribution with a half-width of

Parameter | X, Y, Z | T | CosZenith | Azimuth | CascadeEnergy | NumTrackSegments
Half-width | 50 m | 100 ns | 0.1 | 0.2 rad | 10 GeV | 5 segments

∗ Re-sample until the parameter is within the boundaries

– Append the likelihood value (computed with the likelihood wrapper from 2.3)

• find and record the best vector

• and store each of them in the currentMatrix, making up the starting generation.
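The smearing part of this procedure can be sketched in Python as follows. The half-widths and bounds are copied from the tables in this appendix; everything else is an assumption for illustration: the function name, the dictionary layout, the seed values, the `max_tries` guard, and the fact that the wrapped parameters are simply re-sampled like the others. The likelihood evaluation is left out.

```python
import math
import random

BOUNDS = {
    "X": (-500.0, 500.0), "Y": (-500.0, 500.0), "Z": (-500.0, 500.0),
    "T": (9000.0, 11000.0), "CosZenith": (-1.0, 1.0),
    "Azimuth": (0.0, 2.0 * math.pi), "CascadeEnergy": (0.0, 120.0),
    "NumTrackSegments": (0.0, 35.0),
}
HALF_WIDTH = {
    "X": 50.0, "Y": 50.0, "Z": 50.0, "T": 100.0, "CosZenith": 0.1,
    "Azimuth": 0.2, "CascadeEnergy": 10.0, "NumTrackSegments": 5.0,
}


def smear_seed(seed, rng, max_tries=1000):
    """Return one chain's starting vector: seed plus uniform smearing,
    re-sampled per parameter until the value lies within its bounds."""
    vector = {}
    for name, value in seed.items():
        low, high = BOUNDS[name]
        for _ in range(max_tries):
            trial = value + rng.uniform(-HALF_WIDTH[name], HALF_WIDTH[name])
            if low <= trial <= high:
                vector[name] = trial
                break
        else:
            raise ValueError(f"could not place {name} inside its bounds")
    return vector


# Made-up seed, deliberately close to some of the boundaries.
seed = {"X": 10.0, "Y": -20.0, "Z": -300.0, "T": 9950.0, "CosZenith": -0.95,
        "Azimuth": 0.1, "CascadeEnergy": 5.0, "NumTrackSegments": 2.0}
rng = random.Random(42)
generation = [smear_seed(seed, rng) for _ in range(40)]
print(len(generation))
```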

The iteration over the defined number of steps follows. It is interrupted by a criterion for time divergence (see 3.3.1) that uses the average time of the first 20 steps. Afterwards, the total time so far may not exceed double the total time expected for the full iteration. Each step is an iteration over the 80 vectors (the generation), in which each is updated, either the standard way or by alternative update methods.

Standard Update (::Evolve2006)

The standard update is described in the 2006 paper by C.F. ter Braak et al. [1]. As described in 3.1, it consists of replacing

$$\vec{x} \to \vec{x} + \gamma (\vec{r}_1 - \vec{r}_2) + \vec{e} \qquad (A.1)$$

according to a Metropolis-Hastings test, where $\vec{r}_1 \neq \vec{r}_2$ are randomly chosen members of the population, different from the current one, and $\vec{e}$ is sampled from an 8-dimensional uncorrelated gaussian. If necessary, the boundaries are enforced for each parameter individually by re-choosing $\vec{r}_1$ and $\vec{r}_2$ and re-throwing $\vec{e}$ until the new value is within them. This saves time from re-doing the entire calculation for the sake of a single parameter. The vertex coordinates are re-thrown together, however, since they are expected to be closely correlated in the LLH function. A jump step (see also 3.3.1) sets γ = 1.
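A minimal Python sketch of the update in eq. (A.1) is given below. It is not the ::Evolve2006 code: the likelihood is a toy Gaussian standing in for the LLH service, the boundary handling is omitted, chains are updated in place one after another, and the default γ = 2.38/√(2d), the noise width and the jump step every 20 generations are assumptions chosen for the example.

```python
import math
import random


def log_likelihood(vec):
    """Toy stand-in for the LLH service: a unit Gaussian centred at the origin."""
    return -0.5 * sum(v * v for v in vec)


def evolve_generation(population, logl, gamma, noise_sigma, rng):
    """Update every chain once via x -> x + gamma * (r1 - r2) + e."""
    n_chains = len(population)
    for i in range(n_chains):
        current = population[i]
        # r1 != r2, both different from the current chain
        j1, j2 = rng.sample([j for j in range(n_chains) if j != i], 2)
        trial = [
            x + gamma * (a - b) + rng.gauss(0.0, sigma)
            for x, a, b, sigma in zip(current, population[j1], population[j2], noise_sigma)
        ]
        # Metropolis-Hastings test on the likelihood ratio, done in log space
        log_ratio = logl(trial) - logl(current)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            population[i] = trial
    return population


rng = random.Random(1)
dim, n_chains = 2, 10
population = [[5.0 + rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_chains)]
start_mean = sum(log_likelihood(v) for v in population) / n_chains

for step in range(1, 501):
    # jump step with gamma = 1 every 20 generations, otherwise the default gamma
    gamma = 1.0 if step % 20 == 0 else 2.38 / math.sqrt(2 * dim)
    evolve_generation(population, log_likelihood, gamma, [0.01] * dim, rng)

end_mean = sum(log_likelihood(v) for v in population) / n_chains
print(start_mean, end_mean)
```

Working with the difference of log-likelihoods avoids overflow in the ratio, which matters when the logL values are of the order of several hundred as in fig. 3.5.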

Output

The module puts to the frame:

Type | Name (with prefix "Darwin") | Description
I3Map<string,double> | DarwinBest | Best parameters and LLH after the full evolution
I3Map<string,double> | DarwinBest.step | DarwinBest after each recorded step
I3Map<string,vector<double>> | DarwinBestVector | The same information in one object, with processing time at <step>
I3Bool | DarwinCompleted | Whether the chain was interrupted by the time divergence criterion from 3.3.1
I3Map<string,double> | DarwinMeanAll | Mean of all vectors past burn-in
I3Map<string,double> | DarwinMeanBestN | Mean of the best N vectors of each of the last N steps
I3Particle | DarwinBestParticle | The same as DarwinBest as a particle, with fit status from DarwinCompleted
I3Double | DarwinTime | The total duration of the evolution
I3Double | HadronicFactor | The hadronic scaling factor in 3.3

The statistics ROOT file contains two trees.

DarwinizedChain: Parameters, LLH, LLH call time, and whether it was actually updated, for each chain and each step

GenerationStats: hitherto best parameters, best LLH, duration of the whole step and the fraction of accepted updates, for each step.

Appendix B.

Supplementary figures

[Figure B.1 panels. Left column, vs. step (MC), 5000 to 30000: Median Energy Bias ΔE [GeV]; Median Eµ/Eν Fraction Resolution; Median Vertex Resolution |Δr| [m]; Median Interaction Time Resolution ΔT [ns]. Right column, vs. step (DE): portion of events where DE<MC for the same four quantities.]

Fig. B.1.: Left: Resolutions for both MC (blue) and DE (red); right: separation between DE and MC, vs. step number. From top to bottom: energy bias, fraction of track over total energy, vertex resolution, interaction time resolution.
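The two kinds of panels in these figures (a median resolution per method, and the portion of events where DE does better than MC) can be sketched as follows. This is an illustration only: it assumes per-event absolute errors listed in the same event order for both methods, and it counts only strictly smaller DE errors as "DE<MC". The function name and the tie handling are assumptions, not taken from the analysis code.

```python
from statistics import median

def compare_resolutions(err_de, err_mc):
    """Median error per method and portion of events with err_de < err_mc.

    err_de and err_mc are per-event absolute reconstruction errors for the
    same events, in the same order (assumed inputs).
    """
    if len(err_de) != len(err_mc):
        raise ValueError("both methods must cover the same events")
    wins = sum(1 for d, m in zip(err_de, err_mc) if d < m)
    return {
        "median_de": median(err_de),
        "median_mc": median(err_mc),
        "portion_de_better": wins / len(err_de),
    }

# Toy example with four events.
result = compare_resolutions([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 1.0, 5.0])
```

A portion above 0.5 means DE reconstructs more than half of the events better than MC, even when the two medians are close.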

[Figure B.2 panels, at step 25k, vs. true Eν [GeV], 0 to 80. Left column: Median Energy Bias ΔE [GeV]; Median Eµ/Eν Fraction Resolution; Median Vertex Resolution |Δr| [m]; Median Interaction Time Resolution ΔT [ns]. Right column: portion of events where DE<MC for the same four quantities.]

Fig. B.2.: Left: Resolutions for both MC (blue) and DE (red); right: separation between DE and MC, vs. true neutrino energy in GeV, at step 25200. From top to bottom: energy bias, fraction of track over total energy, vertex resolution, interaction time resolution.

[Figure B.3 panels, at step 25k, vs. cos(θ), -1 to 1. Left column: Median Zenith Resolution Δθ [°]; Median Relative Energy Resolution ΔE/E; Median Energy Bias ΔE [GeV]; Median Eµ/Eν Fraction Resolution. Right column: portion of events where DE<MC for the same four quantities.]

Fig. B.3.: Left: Resolutions for both MC (blue) and DE (red); right: separation between DE and MC, vs. true cosine of zenith, at step 25200. From top to bottom: Zenith, relative energy resolution, energy bias, fraction of track over total energy.

[Figure B.4 panels, vs. time [s] (MC), 0 to 60000: Median Energy Bias ΔE [GeV]; Median Eµ/Eν Fraction Resolution; Median Vertex Resolution |Δr| [m]; Median Interaction Time Resolution ΔT [ns].]

Fig. B.4.: Resolutions vs. time for MC (blue) and DE (red). From top to bottom: Energy bias, track/total energy fraction resolution, vertex resolution, interaction time resolution. The asymptotic behaviour is due to statistics tailing off at high times.

[Figure B.5 panels, each vs. true Eν [GeV], 0 to 80. Surviving panel titles: Energy Bias (DE 80) vs. Energy @ gen 600, Mean -6.756736, RMS 13.792047; Eµ/Eν Fraction Resolution (DE 80) vs. Energy @ gen 600, Median 0.185500; Vertex Resolution (DE 80) vs. Energy @ gen 600, Median 4.075000, and @ gen 300, Median 5.575000; Interaction Time Resolution (DE 80) vs. Energy @ gen 600, Median 8.050000, and @ gen 300, Median 12.250000.]

Fig. B.5.: Resolutions for populations of 40 (left) and 80 (right) with the SPE4 seed, vs. true neutrino energy in GeV. From top to bottom: Energy bias, track/total energy fraction resolution, vertex resolution, interaction time resolution. The red curves for energy bias are mean and rms; for the rest they are medians.
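The red summary curves (mean and rms for the bias, median for the other quantities) amount to per-bin statistics over true energy. A minimal Python sketch under stated assumptions: the bin width, the energy range, and the function name are illustrative choices, and "rms" is taken here to mean the spread about the bin mean, which may differ from the definition used for the plots.

```python
from statistics import fmean, median, pstdev

def profile(true_energy, values, bin_width=10.0, e_max=80.0):
    """Per-bin mean, rms (spread about the mean) and median of `values`.

    Events are binned in true energy from 0 to e_max; events outside that
    range and empty bins are skipped.
    """
    nbins = int(e_max / bin_width)
    bins = [[] for _ in range(nbins)]
    for energy, value in zip(true_energy, values):
        index = int(energy // bin_width)
        if 0 <= index < nbins:
            bins[index].append(value)
    rows = []
    for index, content in enumerate(bins):
        if not content:
            continue
        rows.append({
            "e_low": index * bin_width,
            "mean": fmean(content),
            "rms": pstdev(content),
            "median": median(content),
        })
    return rows

# Toy example: two events in the 0-10 GeV bin, one in the 10-20 GeV bin.
rows = profile([5.0, 7.0, 15.0], [-2.0, -4.0, 1.0])
```

Using the median for the resolution panels keeps the curve insensitive to a few badly reconstructed events, whereas the mean and rms of the bias respond to such outliers.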

[Figure B.6 panels, each vs. true Eν [GeV], 0 to 80. Top row: energy bias ΔE [GeV]; bottom row: Eµ/Eν fraction resolution.]


Fig. B.6.: Resolutions for populations of 40 (left) and 80 (right) with the mixed and partially-fixed seed, vs. true neutrino energy in GeV. Top: Energy bias; bottom: track/total energy fraction resolution. The red curves for energy bias are mean and rms; for the fraction they are medians. The vertex resolution is not shown since these parameters were fixed during this run.

