A statistical test for point source searches - Aart Heijboer - AWG - CERN, June 2002


Page 1:

A statistical test for point source searches
Aart Heijboer

contents:
Motivation
Hypothesis testing reminder
Likelihood ratio test
Calculation of Likelihood: formulas, ingredients
Event generators
test results
Conclusions and plans

Under construction! Results shown only serve as illustration!

Page 2:

Motivation

Suppose we use a binned method and find a candidate bin; we would like to know a bit more about these events:

What is the energy?
Are the events located together within the bin?
Is the angular separation compatible with the measured muon energy?

These questions are not unimportant.

[Figure: two 2° x 2° bins, one labelled "less signal like" and one "more signal like"; annotation: "what is the energy of this event?"]

Try to develop a method that uses all information (no information loss by binning or clustering).

Page 3:

Hypothesis testing - reminder

Given the data, which is more likely?
H0: only atmospheric neutrinos are present, or
H1: in addition to the background there is some signal.

How to decide between the two:
Choose a parameter that is a function of the data: λ(data), the 'test statistic'.
The distribution of λ(data) should be sensitive to whether the data was 'caused' by H0 or H1.
Define a region where λ is unlikely if H0 is true: reject H0 if λ is in this region.

[Figure: probability density of λ under H0 and H1, with the acceptance region and the rejection region marked on the λ axis]

Page 4:

Hypothesis testing reminder

[Figure: probability density of λ under H0 and H1, with the acceptance and rejection regions marked on the λ axis; labelled areas: "1 - level of significance" and "1 - power"]

Two important parameters of a test:
level of significance (aka size or confidence level) ≡ 1 - the probability of rejecting H0 when H0 is true.
the power ≡ 1 - the probability of accepting H0 when H1 is true.

At a fixed level of significance, the power is related to the sensitivity for 'detecting' H1.
For a given H1, a large power results in the rejection of H0 at a high confidence level.
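A minimal sketch of how these two numbers can be estimated from simulated experiments. The Gaussian toy distributions of λ, the 1% tail probability and all function names are assumptions made for illustration; none of them come from the slides. Here `alpha` is the probability of rejecting H0 when H0 is true, so the level of significance in the convention above is 1 - alpha.

```python
import random

def critical_value(h0_values, alpha):
    """Threshold exceeded by at most a fraction alpha of the H0 values."""
    ordered = sorted(h0_values)
    k = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[k]

def fraction_above(values, threshold):
    """Fraction of values lying strictly above the threshold (rejection region)."""
    return sum(v > threshold for v in values) / len(values)

rng = random.Random(1)
# Toy distributions of the test statistic (assumed, for illustration only).
lambda_h0 = [rng.gauss(0.0, 1.0) for _ in range(20000)]
lambda_h1 = [rng.gauss(3.0, 1.0) for _ in range(20000)]

cut = critical_value(lambda_h0, alpha=0.01)
alpha_observed = fraction_above(lambda_h0, cut)   # P(reject H0 | H0 true)
power = fraction_above(lambda_h1, cut)            # P(reject H0 | H1 true)
print(f"cut={cut:.2f}  P(reject H0 | H0)={alpha_observed:.4f}  power={power:.3f}")
```

The cut is fixed from the H0 distribution alone; the power then follows from where the H1 distribution lies relative to that cut.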

Page 5:

How to choose the test statistic?

We are free to choose any function of the data as test statistic λ!

Examples for H1 = 'there is a point source':
number of events in a direction bin (how to define the bin size?)
number of events in a cluster (how to define the clustering algorithm?)
minimum of the difference in direction of all pairs of events

If H1 is completely specified (no unknown parameters): there exists a recipe for the best possible test statistic!
If H1 has free parameters: there is a recipe that usually performs very well.
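Two of the simple test statistics listed above can be sketched as follows. Event directions are taken as (longitude, latitude) pairs in degrees; the function names and the toy event list are hypothetical.

```python
import math

def angular_separation(lon1, lat1, lon2, lat2):
    """Great-circle angle in degrees between two directions given in degrees."""
    a1, b1, a2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    cos_angle = (math.sin(b1) * math.sin(b2)
                 + math.cos(b1) * math.cos(b2) * math.cos(a1 - a2))
    # Clamp against rounding errors before taking the arc cosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def count_in_bin(events, lon_lo, lon_hi, lat_lo, lat_hi):
    """Test statistic 1: number of events inside a rectangular direction bin."""
    return sum(lon_lo <= lon < lon_hi and lat_lo <= lat < lat_hi
               for lon, lat in events)

def min_pair_separation(events):
    """Test statistic 2: smallest angular distance over all pairs of events."""
    return min(angular_separation(*a, *b)
               for i, a in enumerate(events) for b in events[i + 1:])

events = [(10.0, 0.0), (11.0, 0.0), (200.0, 45.0)]  # toy directions
n_in_bin = count_in_bin(events, 9.0, 12.0, -1.0, 1.0)
closest = min_pair_separation(events)
```

Neither statistic uses the muon energy, which is the information loss the likelihood approach is meant to avoid.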

Page 6:

The likelihood ratio test statistic

If H1 is completely specified, for example H1 = 'there is a point source of neutrinos at (l,b) = 399, -4 with flux $1 \cdot 10^{-4}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$', then the most powerful test is the Neyman-Pearson test (likelihood ratio test):
Choose: λ = log of the ratio of the probabilities of the data (x) under H1 and H0
... so we have to calculate the probability of the observed data for a given flux.

If H1 has free parameters, for example H1 = 'there is a point source with a power law spectrum somewhere in the sky', then it is usual to
choose the unknown parameters so that they maximise the probability of the data;
with these parameters, do a likelihood ratio test.
This is the 'maximum likelihood ratio' test.

H0: Φ = Φbg
H1: Φ = Φbg + Φsig, where Φsig is either completely specified or unknown (has free parameters)
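Written as formulas, the two cases read as follows. The notation $P(x \mid H)$ for the probability of the data $x$ under a hypothesis and $\theta$ for the free parameters of H1 is assumed here; the slides give these definitions in words.

```latex
\[
\lambda = \log \frac{P(x \mid H_1)}{P(x \mid H_0)}
\qquad \text{(H1 completely specified)}
\]
\[
\lambda = \log \frac{\max_{\theta} P(x \mid H_1(\theta))}{P(x \mid H_0)}
\qquad \text{(H1 with free parameters } \theta \text{)}
\]
```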

Page 7:

Calculation of the likelihood

Likelihood is related to the predicted event rate:

[Equation; annotations: "likelihood of all data for flux Φ", "likelihood of event i", "rate of this kind of event" (reconstructed muon direction and energy), "total number of predicted events", "depends only on the data"]

We know how to calculate event rates, except for this factor.
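One standard way to write the relation between likelihood and predicted event rate is a normalised product over events. It is given here as an assumption and is not the slide's own equation. The symbols $x_i$ (reconstructed muon direction and energy of event $i$), $R$ (predicted number of events per unit $x$ over the observation period) and $N_{\mathrm{tot}}$ are introduced for illustration only.

```latex
\[
P(\text{data} \mid \Phi) = \prod_i P(x_i \mid \Phi),
\qquad
P(x_i \mid \Phi) = \frac{R(x_i \mid \Phi)}{N_{\mathrm{tot}}(\Phi)},
\qquad
N_{\mathrm{tot}}(\Phi) = \int R(x \mid \Phi)\, dx
\]
```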

Page 8:

Calculating the likelihood

Final approximation: split the energy and direction dependence, and parameterise the direction term as a function of reconstructed energy.

The direction term is the probability of measuring a given muon direction when the true neutrino direction is known, for an event with a given reconstructed energy. It is called the point spread function (PSF).

Page 9:

Calculating the likelihood

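The flux is given by atmospheric background and a point source: Φ = Φbg + Φsig.

Integrate over the coordinates (l,b). One term is peaked within a few degrees (the point spread function); the other is only weakly dependent on (l,b) (the background), so it can be evaluated at the reconstructed coordinates.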

Page 10:

Calculating the likelihood: summary

prob. of event i: (background rate at reconstructed coordinates and energy) + (point spread function) x (rate of signal events at reconstructed muon energy)

H1: Φ = Φbg + Φsig
H0: Φ = Φbg (simply set the signal flux to 0)
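The structure of this summary can be sketched in a few lines, under strong simplifications that are assumptions of this example: a flat-sky approximation, a Gaussian point spread function as a stand-in for the Landau parameterisation, a uniform background density, and no energy dependence. All names and numbers are hypothetical.

```python
import math

def gaussian_psf(delta_deg, sigma_deg):
    """Stand-in PSF: 2D Gaussian density per square degree, integrating to 1."""
    return math.exp(-0.5 * (delta_deg / sigma_deg) ** 2) / (2.0 * math.pi * sigma_deg ** 2)

def log_likelihood(deltas_deg, bg_density, n_bg_total, n_sig, sigma_deg):
    """Sum over events of log(rate of this kind of event / total predicted events)."""
    n_total = n_bg_total + n_sig
    return sum(
        math.log((bg_density + n_sig * gaussian_psf(d, sigma_deg)) / n_total)
        for d in deltas_deg
    )

def log_likelihood_ratio(deltas_deg, bg_density, n_bg_total, n_sig, sigma_deg):
    """log L(H1) - log L(H0); H0 is obtained by setting the signal to 0."""
    h1 = log_likelihood(deltas_deg, bg_density, n_bg_total, n_sig, sigma_deg)
    h0 = log_likelihood(deltas_deg, bg_density, n_bg_total, 0.0, sigma_deg)
    return h1 - h0

# Angular distances (degrees) of events to the assumed source position.
near_source = [0.2, 0.3, 0.1]
far_from_source = [50.0, 60.0, 70.0]
lr_near = log_likelihood_ratio(near_source, 0.05, 1000.0, 3.0, 0.5)
lr_far = log_likelihood_ratio(far_from_source, 0.05, 1000.0, 3.0, 0.5)
```

Events close to the assumed source raise the ratio, while events far away lower it slightly because the signal hypothesis predicts more events in total.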

Page 11:

Ingredients: The point spread function

Distribution of the angle between true neutrino and rec. muon for different bins in muon energy (will use rec. muon energy in future).
Parameterised with Landau.

Page 12:

Ingredients: The point spread function II

Now as a function of the difference in angular coordinates

[Figure: point spread function; axes: dl (deg), db (deg)]

Page 13:

Ingredients: neutrino effective area & P(Eμ|Eν)

Used for calculating predicted event rates as function of the neutrino flux

OFF-TOPIC: this table is now implemented in CALCRATE to give the event rate vs muon energy
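As an illustration of how an effective area turns a flux into a predicted number of events, the sketch below integrates flux times effective area over neutrino energy with the trapezoidal rule. The effective-area values, the flux normalisation and the function names are made up for the example. The second ingredient, P(Eμ|Eν), is left out, so this gives a total rate and not the rate vs muon energy.

```python
def power_law_flux(normalisation, spectral_index):
    """Return flux(E) = normalisation * E**(-spectral_index)."""
    return lambda energy: normalisation * energy ** (-spectral_index)

def predicted_events(flux, energies, effective_area, livetime):
    """Trapezoidal integral of flux(E) * A_eff(E) over E, times the livetime."""
    total = 0.0
    for i in range(len(energies) - 1):
        left = flux(energies[i]) * effective_area[i]
        right = flux(energies[i + 1]) * effective_area[i + 1]
        total += 0.5 * (left + right) * (energies[i + 1] - energies[i])
    return total * livetime

# Hypothetical table: neutrino energy (GeV) and effective area (m^2).
energies_gev = [1e2, 1e3, 1e4, 1e5]
area_m2 = [1e-4, 1e-2, 0.3, 3.0]
seconds_per_year = 3.15e7

flux = power_law_flux(normalisation=1e-3, spectral_index=2.0)
n_events = predicted_events(flux, energies_gev, area_m2, seconds_per_year)
```

On a grid this coarse the trapezoidal rule is only a rough estimate; the point is that the predicted rate scales linearly with the flux normalisation.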

Page 14:

Events & event generators

atmospheric neutrinos:
  Bartol+RQPM flux for atmospheric neutrinos (= highest flux)
  Need a realistic sample (cannot use event weighting for this!)
  Have selected a number of events from production with P(select) ∝ weight.
  Can make the program available if you want.

point source neutrinos:
  Written point source mode for GENHEN
  generates events for a specified declination
  available in CVS (GENHEN v5r1)
  Same event selection as atmospheric neutrinos

nb: no atmospheric muons (hopefully negligible w.r.t. atm. neutrinos).

To get a distribution of the test statistic, we need to simulate many full ANTARES experiments (i.e. many years of data taking). For each of them the test statistic must be calculated.
→ use the same sample of events but mix the detection times (event mixing). This can also be done with the real data!
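A minimal sketch of the event-mixing idea, assuming each event is stored with its declination, its hour angle (a detector-frame coordinate) and its detection time expressed as local sidereal time in degrees. The field names are hypothetical. Shuffling the times among the events keeps the detector-frame distribution intact while producing new right ascensions.

```python
import random

def mix_events(events, rng):
    """Build one pseudo-experiment by shuffling detection times among events."""
    times = [event["sidereal_deg"] for event in events]
    rng.shuffle(times)
    mixed = []
    for event, t in zip(events, times):
        mixed.append({
            "dec": event["dec"],                # unchanged: fixed by the local direction
            "hour_angle": event["hour_angle"],  # unchanged detector-frame coordinate
            "sidereal_deg": t,                  # detection time taken from another event
            "ra": (t - event["hour_angle"]) % 360.0,  # RA = sidereal time - hour angle
        })
    return mixed

events = [
    {"dec": -10.0, "hour_angle": 30.0, "sidereal_deg": 100.0},
    {"dec": -45.0, "hour_angle": -60.0, "sidereal_deg": 250.0},
    {"dec": 5.0, "hour_angle": 120.0, "sidereal_deg": 10.0},
    {"dec": -70.0, "hour_angle": 0.0, "sidereal_deg": 330.0},
]
pseudo_experiment = mix_events(events, random.Random(7))
```

Calling `mix_events` repeatedly with the same sample gives many background-only pseudo-experiments.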

Page 15:

Clustering

The search for sources can be restricted to regions of the sky where events are present, therefore:

search for clusters
evaluate the (S+B)/B likelihood ratio for each cluster
the cluster with the largest likelihood ratio is the source candidate

The clustering algorithm is simple:

for each event {find all events within α degrees }

α is e.g. 1.5 degrees

NB: clustering only serves to speed up the computation (not waste time on areas of the sky where there are no events). The search method is not a 'clustering method'.

[Figure: one year of background events]
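The clustering loop above can be sketched as a brute-force search over all pairs. Directions are (longitude, latitude) pairs in degrees. Dropping single-event clusters and duplicate member lists is a choice made for this example and is not stated on the slide.

```python
import math

def angular_separation(lon1, lat1, lon2, lat2):
    """Great-circle angle in degrees between two directions given in degrees."""
    a1, b1, a2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    cos_angle = (math.sin(b1) * math.sin(b2)
                 + math.cos(b1) * math.cos(b2) * math.cos(a1 - a2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def find_clusters(events, alpha_deg=1.5):
    """For each event, collect the indices of all events within alpha_deg."""
    clusters = []
    for seed in events:
        members = tuple(
            j for j, other in enumerate(events)
            if angular_separation(*seed, *other) <= alpha_deg
        )
        # Keep only clusters with more than one event, each member set once.
        if len(members) > 1 and members not in clusters:
            clusters.append(members)
    return clusters

events = [(10.0, 0.0), (10.5, 0.2), (11.0, 0.0), (100.0, -30.0)]
clusters = find_clusters(events)
```

The cost grows with the square of the number of events, which is why speeding up this step is a natural improvement.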

Page 16:

Fitting source parameters

We are looking for point sources with a power law spectrum: the flux contains 4 unknown variables:
source position (2 parameters)
spectral index
flux 'normalisation'

In the test statistic, the maximum likelihood is needed: for each cluster, these parameters are fitted (MINUIT) to the data.
Additional advantage: good (best) estimator of the source position and spectrum.

Tested by making clusters of 3 or 6 events at a known position.
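To illustrate what the fit does, the sketch below replaces MINUIT by a coarse grid scan, which is a swapped-in technique and not the method used in the talk. It fits only the source position and the number of signal events; the spectral index is ignored. The flat-sky coordinates, the Gaussian point spread function, the uniform background and all numbers are assumptions of the example.

```python
import math
from itertools import product

def log_likelihood_ratio(cluster, n_events_total, source, n_sig,
                         bg_density, n_bg_total, sigma_deg):
    """log L(H1) - log L(H0) in a flat-sky approximation.

    Events outside the cluster are assumed to be far from the source, so they
    enter only through the change in the total number of predicted events.
    """
    gain = 0.0
    for x, y in cluster:
        d2 = (x - source[0]) ** 2 + (y - source[1]) ** 2
        psf = math.exp(-0.5 * d2 / sigma_deg ** 2) / (2.0 * math.pi * sigma_deg ** 2)
        gain += math.log(1.0 + n_sig * psf / bg_density)
    penalty = n_events_total * math.log(1.0 + n_sig / n_bg_total)
    return gain - penalty

def fit_cluster(cluster, n_events_total, bg_density=0.05, n_bg_total=1000.0,
                sigma_deg=0.5, step_deg=0.1, half_width_deg=1.0,
                n_sig_grid=(0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0)):
    """Grid scan around the cluster centre; returns (lambda, position, n_sig)."""
    cx = sum(x for x, _ in cluster) / len(cluster)
    cy = sum(y for _, y in cluster) / len(cluster)
    n_steps = int(round(half_width_deg / step_deg))
    best = (0.0, (cx, cy), 0.0)  # lambda = 0 corresponds to no signal
    for ix, iy in product(range(-n_steps, n_steps + 1), repeat=2):
        source = (cx + ix * step_deg, cy + iy * step_deg)
        for n_sig in n_sig_grid:
            lam = log_likelihood_ratio(cluster, n_events_total, source, n_sig,
                                       bg_density, n_bg_total, sigma_deg)
            if lam > best[0]:
                best = (lam, source, n_sig)
    return best

# Toy cluster of 3 events around (5.0, -2.0), in degrees.
cluster = [(5.1, -2.0), (4.9, -1.9), (5.0, -2.1)]
best_lambda, best_source, best_n_sig = fit_cluster(cluster, n_events_total=1000)
```

The maximised ratio is the test statistic of the cluster, and the position at the maximum is the estimate of the source position.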

Page 17:

Fitting source parameters: test results

[Figure: fit results, with the true values indicated]

conclusion: this works (but need to check spectral index)

Page 18:

Test results

Simulated 4 x 1000 x 1 year data-taking periods:
  mix atmospheric event-times and azimuth angles
  add N signal events from weighted sample of point source events at (ra,dec) = (0, -0.5)
  find test statistic:
    find all candidate clusters & calculate test stat.
    test stat. of all data is largest test stat. of all clusters

N = Poisson( Rate(Φ) )
  Φ = 0: background only experiments (H0)
  $\Phi = 1 \cdot 10^{-3}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$ (~3 sig. events/year)
  $\Phi = 2 \cdot 10^{-3}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$ (~6 sig. events/year)
  $\Phi = 4 \cdot 10^{-3}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$ (~12 sig. events/year)
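Two steps of this procedure are easy to sketch: drawing the number of signal events from a Poisson distribution, and taking the largest cluster test statistic as the test statistic of the whole experiment. The Poisson sampler below uses Knuth's multiplication method; the function names are hypothetical and the expected rates are the approximate values quoted above.

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson-distributed integer (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    k = 0
    p = 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def experiment_test_statistic(cluster_statistics):
    """Test statistic of one experiment: the largest value over all clusters."""
    return max(cluster_statistics, default=0.0)

# Approximate expected signal events per year for the simulated fluxes.
expected_signal = {"0": 0.0, "1e-3": 3.0, "2e-3": 6.0, "4e-3": 12.0}

rng = random.Random(3)
n_signal = {label: poisson(rate, rng) for label, rate in expected_signal.items()}
samples = [poisson(3.0, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

Repeating this for many pseudo-experiments per flux value gives the distributions of the test statistic under H0 and under each H1.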

Page 19:

test results

preliminary!

'good' separation between background and signal
better separation if there are more signal events

Page 20:

test results: fitted source position

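[Figure legend: Φ = 0: background only experiments (H0); $\Phi = 1 \cdot 10^{-3}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$ (~3 sig. events/year)]

[Figure panels: $\Phi = 1 \cdot 10^{-3}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$ and $\Phi = 4 \cdot 10^{-3}\, E^{-2}\ \mathrm{GeV}^{-1}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}$; annotation: "cut at 1.4"]

Better resolution if there are more events.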

Page 21:

Conclusions

A statistical test was described using the maximum likelihood ratio as test statistic.
The test 'automatically' takes into account:
  likelihood of the measured muon energy for the source spectrum
  variation of the angular resolution (point spread function) with energy

Work done on event generators to get realistic samples of signal and atmospheric background.
First results look sensible: the method seems to be working!
Lot of work to be done:
  start using reconstructed muon energy instead of true
  check that atmospheric muons are negligible
  check for errors by comparing with binned method
  calculate sensitivity / exclusion power

Additional ideas for improvement:
  loosen selection cuts
  speed up clustering algorithm

The end