Look elsewhere effect

Ofer Vitells

Statistics miniworkshop at CERN, February 2013

LEE Topics

• Introduction
  – Definition of Gaussian & Gaussian-related fields

• Z-dependence of trial factor
  – Variance of m-hat
  – Bayesian comparison

• Different possibilities for critical region
  – Constant LR (“Tevatron” test statistic) curves
  – Leadbetter formula

• Location (“energy-scale”) uncertainties
  – Single channel
  – Combination

• Approximation/estimation problems
  – Sliding window effect on upcrossings counting
  – Uncertainty on observed number of upcrossings (Poisson?)
  – When asymptotic formulae break down in practice

• Gaussian & Gaussian related fields

- The joint distribution of any collection {f(t1), f(t2), …, f(tn)} is multivariate Gaussian
- Gaussian related fields are functions of Gaussian fields, e.g. the chi-squared field

$$\chi^2_k(t) = \sum_{i=1}^{k} f_i^2(t)$$

[Figure: f(t) versus t]

Wilks:

$$q(\theta) = -2\log\frac{L(\mu=0)}{L(\hat\mu,\theta)} \sim \chi^2$$
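As an illustration of the definition above, the following minimal sketch builds a toy chi-squared field from k independent Gaussian fields. The moving-average construction, the function names and all parameter values are assumptions made for this example; they are not taken from the slides.

```python
import math
import random

def gaussian_field(n_points, window, rng):
    """Toy stationary Gaussian field with unit variance.

    Assumption for illustration: the field is a moving average of white
    noise, so any collection of its values is jointly Gaussian.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_points + window - 1)]
    norm = math.sqrt(window)
    return [sum(noise[t:t + window]) / norm for t in range(n_points)]

def chi2_field(k, n_points, window, rng):
    """Chi-squared field: sum of squares of k independent Gaussian fields."""
    fields = [gaussian_field(n_points, window, rng) for _ in range(k)]
    return [sum(f[t] ** 2 for f in fields) for t in range(n_points)]

rng = random.Random(1)
field = chi2_field(k=2, n_points=2000, window=5, rng=rng)
# At each point the field is chi-squared distributed with k degrees of
# freedom, so its average over many points should be close to k.
print(sum(field) / len(field))
```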

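• Z-dependence

$$\text{p-value} = P\left(\max_\theta[q_0(\theta)] > u\right)$$

$$\approx \tfrac{1}{2}P(\chi^2_1 > u) + \mathcal{N}_1 e^{-u/2} = p_{local} + \mathcal{N}_1 e^{-Z^2/2}, \qquad u = Z^2$$

Trial factor $\sim Z\,\mathcal{N}_1$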
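The relation p_global ≈ p_local + N_1 exp(-Z²/2) can be evaluated directly. The sketch below assumes a one-sided local p-value, p_local = ½ P(χ²_1 > Z²); the value of N_1 is hypothetical and chosen only to show the trend of the trial factor with Z.

```python
import math

def local_p_value(z):
    """One-sided Gaussian tail, equal to 0.5 * P(chi2_1 > z**2) for z >= 0."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def global_p_value(z, n1):
    """p_local + N_1 * exp(-Z^2 / 2), with n1 the upcrossing coefficient."""
    return local_p_value(z) + n1 * math.exp(-0.5 * z * z)

def trial_factor(z, n1):
    """Ratio of global to local p-value."""
    return global_p_value(z, n1) / local_p_value(z)

n1 = 5.0  # hypothetical value, for illustration only
for z in (2.0, 3.0, 4.0, 5.0):
    print(z, global_p_value(z, n1), trial_factor(z, n1))
```

Since p_local falls roughly like exp(-Z²/2)/Z at large Z, the printed trial factor grows approximately linearly with Z.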

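• Variance of $\hat m$:

Example: $n_i \sim \mathrm{Poiss}(\mu s_i(m) + b_i)$

$$\mathrm{Var}[\hat m] = \left(E\left[-\frac{\partial^2 \log L}{\partial m^2}\right]\right)^{-1}$$

$$E\left[-\frac{\partial^2 \log L}{\partial m^2}\right] = \mu^2 \sum_i \frac{1}{\mu s_i(m) + b_i}\left(\frac{\partial s_i(m)}{\partial m}\right)^2$$

In the large sample limit ($b_i \gg s_i$):

$$\mathrm{Var}[\hat m] \propto \frac{1}{\hat\mu^2}, \qquad \sigma_{\hat m} \propto \frac{1}{Z}$$

$$TF \sim \frac{\text{range}}{\sigma_{\hat m}} \propto Z$$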
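A quick numerical check of the 1/μ scaling: the sketch below evaluates the expected Fisher information in m, μ² Σ_i (∂s_i/∂m)² / (μ s_i + b_i), for a binned Gaussian peak on a flat background. The binning, resolution, yields and the finite-difference derivative are all assumptions chosen for illustration.

```python
import math

def signal_counts(m, centers, bin_width, resolution, n_sig):
    """Expected signal counts s_i(m) per bin for a Gaussian peak at mass m."""
    norm = n_sig * bin_width / (math.sqrt(2.0 * math.pi) * resolution)
    return [norm * math.exp(-0.5 * ((x - m) / resolution) ** 2) for x in centers]

def sigma_m_hat(mu, m, centers, bin_width, resolution, n_sig, b_per_bin, eps=1e-3):
    """Asymptotic standard deviation of m-hat from the Fisher information in m."""
    s = signal_counts(m, centers, bin_width, resolution, n_sig)
    s_up = signal_counts(m + eps, centers, bin_width, resolution, n_sig)
    s_dn = signal_counts(m - eps, centers, bin_width, resolution, n_sig)
    info = 0.0
    for s_i, up, dn in zip(s, s_up, s_dn):
        ds_dm = (up - dn) / (2.0 * eps)  # central finite difference
        info += mu ** 2 * ds_dm ** 2 / (mu * s_i + b_per_bin)
    return 1.0 / math.sqrt(info)

# Hypothetical setup: 100 unit-width bins on [0, 100], flat background.
centers = [0.5 + i for i in range(100)]
for mu in (1.0, 2.0, 4.0):
    print(mu, sigma_m_hat(mu, 50.0, centers, 1.0, 3.0, 100.0, 1.0e4))
```

With the background much larger than the signal in every bin, doubling μ halves the printed σ of m-hat, which is the 1/Z behaviour that makes the trial factor grow with Z.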

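Bayesian estimate

[Figure: $L(\mu, m)$ versus $m$, with a peak of width $\sigma_m$ at $\hat m$]

$$q = -2\log\frac{L(\mu=0)}{L(\hat\mu)} = Z^2 \quad\Rightarrow\quad \frac{L(\hat\mu)}{L(\mu=0)} = e^{Z^2/2}$$

There is less posterior probability in the peak as it narrows (~1/Z)

With a uniform prior:

$$\frac{P(\mu=0)}{P(\mu\neq 0)} \propto e^{-Z^2/2}\,\frac{\Delta m}{\sigma_m(Z)} \propto Z\,e^{-Z^2/2}$$

where $\Delta m/\sigma_m(Z)$ plays the role of the “trial factor”.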

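Jeffreys Prior:

$$\pi(\mu, m) \propto \sqrt{\det\begin{pmatrix}\sigma_{\hat\mu}^{-2} & 0\\ 0 & \sigma_{\hat m}^{-2}\end{pmatrix}} \propto \frac{1}{\sigma_{\hat m}}$$

Cancels the Z dependence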

Example normalized likelihoods

[Figure: normalized likelihoods versus m for two toy examples, "Background" and "Background + signal"]

Bayesian estimate

[Figure: $L(\mu, m)$ versus $m$]

Integrate over m?

What exactly is meant by 'more impressive than observed in data'? q(PL) vs. qTEV

$$q_{TEV} = -2\log\frac{L(0)}{L(\mu_{SM})} = \frac{2\hat\mu\,\mu_{SM} - \mu_{SM}^2}{\sigma_{\hat\mu}^2}$$

$$q = -2\log\frac{L(0)}{L(\hat\mu)} = \frac{\hat\mu^2}{\sigma_{\hat\mu}^2}$$

SM expected significance (sensitivity): $Z_{SM} = \mu_{SM}/\sigma_{\hat\mu}$

Observed significance: $Z_{obs} = \sqrt{q} = \hat\mu/\sigma_{\hat\mu}$

$$q_{TEV} = 2 Z_{obs} Z_{SM} - Z_{SM}^2$$

At a given mass point the two tests are equivalent (1-to-1 functions of $\hat\mu$), but they give different answers to what is the “best fit” mass: $\max_m[q(m)]$ or $\max_m[q_{TEV}(m)]$.

* Note: qTEV = 0 if ZSM = 0. max[qTEV] is generally not at the point of largest local significance.
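To make the difference between the two "best fit" masses concrete, the sketch below evaluates q = Z_obs² and q_TEV = 2 Z_obs Z_SM − Z_SM² on a small scan. The three mass points and their significances are invented for illustration, and q = Z_obs² assumes a non-negative fitted signal strength.

```python
def q_pl(z_obs):
    """Profile-likelihood statistic, q = Z_obs^2 (assumes Z_obs >= 0)."""
    return z_obs ** 2

def q_tev(z_obs, z_sm):
    """'Tevatron' statistic written in terms of significances."""
    return 2.0 * z_obs * z_sm - z_sm ** 2

# Hypothetical scan: (mass, observed significance, SM expected significance).
scan = [
    (125.0, 3.0, 1.0),
    (140.0, 2.8, 2.5),
    (160.0, 2.5, 4.0),
]

best_pl = max(scan, key=lambda point: q_pl(point[1]))
best_tev = max(scan, key=lambda point: q_tev(point[1], point[2]))
print("max q     at m =", best_pl[0])
print("max q_TEV at m =", best_tev[0])
```

In this toy scan the largest local significance is at the first mass point, while q_TEV peaks at the second, where the sensitivity is higher.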

$$q_{TEV} = 2 Z_{obs} Z_{SM} - Z_{SM}^2 \quad\Longleftrightarrow\quad Z_{obs} = \frac{q_{TEV} + Z_{SM}^2}{2 Z_{SM}}$$

[Figure: $Z_{obs}$ versus $Z_{SM}$]


Curves of constant qTEV

[Figure: curves of constant $q_{TEV}$ in the $(Z_{SM}, Z_{obs})$ plane]

p-value = Prob( max[qTEV] > c )

Can be estimated with Leadbetter’s formula (upcrossings above a curve)

A similar signal at 160 GeV would give much smaller global significance (because less consistent with the SM) - same as local 1σ @ 600 GeV
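A curve of constant q_TEV = c corresponds to Z_obs = (c + Z_SM²) / (2 Z_SM). The sketch below tabulates one such curve; the value of c and the sampled Z_SM points are arbitrary choices for illustration.

```python
def z_obs_on_curve(q_tev, z_sm):
    """Observed significance needed at sensitivity z_sm to reach a given q_TEV."""
    return (q_tev + z_sm ** 2) / (2.0 * z_sm)

c = 9.0  # arbitrary threshold on q_TEV
for z_sm in (0.5, 1.0, 3.0, 6.0):
    print(z_sm, z_obs_on_curve(c, z_sm))
```

The required Z_obs is smallest at Z_SM = √c, where it equals √c, and rises steeply as Z_SM → 0. This is why an excess in a region of low sensitivity needs a much larger local significance to reach the same q_TEV.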

Where to put the critical region

Energy-scale uncertainties

Likelihood at a fixed mass M0

Energy-scale nuisance parameter

“local” LEE (Leadbetter)

ATLAS combined Higgs workspace toy sampling at 126.5 GeV with ES uncertainty

Leadbetter formula with a parabolic curve (Gaussian constraint)

Similar to a LEE in the range defined by $\sigma_{ES}$

Combination

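[Figure: the $(m_1, m_2)$ plane]

$$q(m_0) = \max_{m_1, m_2}\left[q(m_1, m_2) - \frac{(m_1 - m_0)^2}{\sigma_1^2} - \frac{(m_2 - m_0)^2}{\sigma_2^2}\right]$$

$$p \propto Z^{d-1} e^{-Z^2/2} \quad (\text{trial factor} \propto Z^d)$$

2D field: trial factor $\propto Z^2$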

• Sliding windows (mass-dependent cuts) ==> discontinuity in q(m) due to events getting in/out

[Figure: events per unit mass versus m (top); $q_0(m)$ versus m with the level u marked (bottom)]

– Uncertainty on observed number of upcrossings
  • Usually assumed Poisson
  • Effect on significance is logarithmic

– When & how asymptotic formulae break down in practice
  • ?

$$p_{global} = p_{local} + e^{-Z^2/2}\left(\mathcal{N}_1 \pm \Delta\mathcal{N}_1\right)$$
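The logarithmic sensitivity to the upcrossing coefficient can be checked numerically. The sketch below converts p_global = p_local + N_1 exp(-Z²/2) into a global significance and compares two values of N_1 that differ by a factor of two. The numbers are hypothetical, and the conversion uses the one-sided Gaussian convention.

```python
import math
from statistics import NormalDist

def global_significance(z_local, n1):
    """Global Z from p_global = p_local + N_1 * exp(-Z_local^2 / 2)."""
    p_local = 0.5 * math.erfc(z_local / math.sqrt(2.0))
    p_global = p_local + n1 * math.exp(-0.5 * z_local ** 2)
    # One-sided conversion; inv_cdf(p) is negative for p < 0.5.
    return -NormalDist().inv_cdf(p_global)

z_local = 4.0              # hypothetical local significance
for n1 in (10.0, 20.0):    # hypothetical coefficients differing by a factor 2
    print(n1, global_significance(z_local, n1))
```

Doubling N_1 moves the global significance by only a few tenths of a sigma, consistent with a shift of order log(2)/Z.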

Extra slides

Example of combination of channels with different mass resolutions

• Toy combination of two channels (both Gaussian signal + flat bkg):
  - channel 1: σm = 1 GeV
  - channel 2: σm = 10 GeV

Combination example

[Figure: $q_0(m_H)$ and $\hat\mu(m_H)$ versus $m_H$ for channel 1, channel 2 and the combination]

Note the effect of the wide bump on the number of upcrossings at 1

Average number of upcrossings

[Figure: average number of upcrossings versus level (log scale) for channel 1, channel 2 and the combination; 25,000 toy simulations]

$$N(c) = N_0\, e^{-c/2}$$

Note that the average number of upcrossings in the combination is always smaller than in channel 1 alone
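A minimal sketch of how upcrossings might be counted on a sampled q(m) scan and extrapolated to a higher level with N(c) = N_0 exp(-c/2). The scan values are made up, and the counting convention (strictly below the level, then at or above it) is one reasonable choice rather than a prescription from the slides.

```python
import math

def count_upcrossings(q_values, level):
    """Number of times the sampled curve crosses `level` from below."""
    return sum(1 for a, b in zip(q_values, q_values[1:]) if a < level <= b)

def extrapolate_upcrossings(n_ref, c_ref, c):
    """Scale the mean number of upcrossings from level c_ref to level c."""
    return n_ref * math.exp(-0.5 * (c - c_ref))

# Made-up scan of q(m) on a coarse mass grid.
q_scan = [0.0, 2.0, 0.0, 0.5, 3.0, 0.2]
n_low = count_upcrossings(q_scan, level=1.0)
print(n_low, extrapolate_upcrossings(n_low, 1.0, 9.0))
```

In practice the count at the low reference level would be averaged over a set of background-only toys before extrapolating.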

2-D example #2: resonance search with unknown width

• Gaussian signal on exponential background
• Toy model: 0 < m < 100, 2 < σ < 6
• Unbinned likelihood:

$$L = \mathrm{Poiss}(N \mid N_s + N_b)\prod_i \frac{N_s f_s(x_i) + N_b f_b(x_i)}{N_s + N_b}$$

$$f_b(x) = c\,e^{-cx}, \qquad f_s(x; m, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-m)^2}{2\sigma^2}}$$

[Figure: toy data spectrum (log scale), and $q_0$ in the $(m, \sigma)$ plane]
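The extended unbinned likelihood for a Gaussian signal on an exponential background can be written compactly in log form. The sketch below is one possible implementation; the function names and the sample data are hypothetical. It takes f_b(x) = c·exp(-cx) as normalised on [0, ∞), ignoring any truncation to the search range.

```python
import math

def f_signal(x, m, sigma):
    """Gaussian signal density with mean m and width sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def f_background(x, c):
    """Exponential background density c * exp(-c x), normalised on [0, inf)."""
    return c * math.exp(-c * x)

def log_likelihood(xs, n_s, n_b, m, sigma, c):
    """log of Poiss(N | N_s + N_b) * prod_i [N_s f_s + N_b f_b] / (N_s + N_b)."""
    n = len(xs)
    total = n_s + n_b
    log_poisson = n * math.log(total) - total - math.lgamma(n + 1)
    log_shape = sum(
        math.log((n_s * f_signal(x, m, sigma) + n_b * f_background(x, c)) / total)
        for x in xs
    )
    return log_poisson + log_shape

# Hypothetical events and parameter values, for illustration only.
events = [10.0, 50.0, 52.0]
print(log_likelihood(events, n_s=1.0, n_b=2.0, m=51.0, sigma=3.0, c=0.02))
```

A test statistic q_0(m, σ) would then follow from maximising this function over N_s (and any nuisance parameters) and comparing with the N_s = 0 fit; that step is left out here.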

[Figure: excursion sets of $q_0$ in the $(m, \sigma)$ plane at levels u = 0 and u = 1; p-value versus $q_0$ (log scale)]

$$E[\varphi(A_u)] = \tfrac{1}{2}P(\chi^2_1 > u) + e^{-u/2}\left(\mathcal{N}_1 + \sqrt{u}\,\mathcal{N}_2\right)$$

$$E[\varphi(A_0)] = 4.5 \pm 0.2, \quad E[\varphi(A_1)] = 3 \pm 0.16 \quad\Rightarrow\quad \mathcal{N}_1 = 4 \pm 0.2, \quad \mathcal{N}_2 = 0.7 \pm 0.3$$

Excellent approximation above the ~2σ level

(2nd term is dominant for $Z > \mathcal{N}_1/\mathcal{N}_2 \approx 5.7$)
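The two coefficients can be obtained from the mean Euler characteristic at two levels by solving a pair of linear equations. The sketch below assumes the form E[φ(A_u)] = ½ P(χ²_1 > u) + exp(-u/2)(N_1 + √u N_2) and uses the central values quoted on the slide (4.5 at u = 0 and 3 at u = 1), ignoring their uncertainties.

```python
import math

def half_chi2_tail(u):
    """0.5 * P(chi2_1 > u)."""
    return 0.5 * math.erfc(math.sqrt(u / 2.0))

def solve_coefficients(phi_0, phi_u, u):
    """Solve for (N_1, N_2) from the mean Euler characteristic at levels 0 and u."""
    n1 = phi_0 - half_chi2_tail(0.0)
    n2 = ((phi_u - half_chi2_tail(u)) * math.exp(0.5 * u) - n1) / math.sqrt(u)
    return n1, n2

def global_p_value(u, n1, n2):
    """Approximate P(max q_0 > u) from the expected Euler characteristic."""
    return half_chi2_tail(u) + math.exp(-0.5 * u) * (n1 + math.sqrt(u) * n2)

n1, n2 = solve_coefficients(phi_0=4.5, phi_u=3.0, u=1.0)
print(n1, n2, n1 / n2)
print(global_p_value(25.0, n1, n2))  # p-value at a local 5 sigma (u = Z^2 = 25)
```

The central values give N_1 = 4 and N_2 ≈ 0.68, in line with the quoted 4 ± 0.2 and 0.7 ± 0.3.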

[Figure: the $(m_1, m_2)$ plane, both axes from 20 to 200]

$$q(m_0) = \max_{m_1, m_2}\left[q(m_1, m_2) - \frac{(m_1 - m_0)^2}{\sigma_1^2} - \frac{(m_2 - m_0)^2}{\sigma_2^2}\right]$$

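Asymptotic formulae

To have the distribution well defined for $N \to \infty$, take $\mu = \mu'/\sqrt{N}$, since $\sigma_{\hat\mu} \sim 1/\sqrt{N}$

e.g. if $b \to \infty$, take $s \sim \sqrt{b}$, such that $s/\sqrt{b}$ = constant ($s/b \sim 1/\sqrt{b} \to 0$)

In this limit the distribution is independent of $N$ up to $O(1/\sqrt{N})$

e.g. $\frac{s}{\sqrt{s+b}} = \frac{s}{\sqrt{b}}\left(1 + O(s/b)\right)$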