Sept. 12-15, 2005 M. Block, Phystat 05, Oxford
PHYSTAT 05 - Oxford 12th - 15th September 2005
Statistical problems in Particle Physics, Astrophysics and
Cosmology
“Sifting data in the real world”
Martin Block, Northwestern University
“Sifting Data in the Real World”,
M. Block, arXiv:physics/0506010 (2005).
“Fishing” for Data
Hence, minimize $\sum_i \rho(z_i)$, or equivalently, we minimize $\chi^2 \equiv \sum_i \Delta\chi^2_i$.
Why choose normalization constant $\gamma = 0.179$ in the Lorentzian $\Lambda_0^2$?
Computer simulations show that the choice of $\gamma = 0.179$ tunes the Lorentzian so that minimizing $\Lambda_0^2$, using data that are gaussianly distributed, gives the same central values and approximately the same errors for parameters as those obtained by fitting these data with a conventional $\chi^2$ fit.
If there are no outliers, it gives the same answers as a $\chi^2$ fit.
Hence, using the tuned Lorentzian $\Lambda_0^2$, much like using the Hippocratic oath, does “no harm”.
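The "no harm" claim can be tried on a toy case. The sketch below assumes the Lorentzian objective has the form sum_i ln(1 + gamma * dchi2_i) with gamma = 0.179, fits a constant to Gaussian data, and minimizes by iteratively reweighted least squares. The optimizer, the function names, the seed and the toy data are illustrative assumptions, not taken from the talk.

```python
import math
import random

GAMMA = 0.179  # tuning constant quoted on the slide

def fit_constant_chi2(y, sigma):
    """Conventional chi^2 fit of a constant: inverse-variance weighted mean."""
    w = [1.0 / s ** 2 for s in sigma]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def lorentzian_objective(y, sigma, c, gamma=GAMMA):
    """Assumed form: sum_i ln(1 + gamma * dchi2_i), dchi2_i = ((y_i - c)/sigma_i)^2."""
    return sum(math.log(1.0 + gamma * ((yi - c) / s) ** 2)
               for yi, s in zip(y, sigma))

def fit_constant_lorentzian(y, sigma, gamma=GAMMA, iterations=200):
    """Minimize the Lorentzian objective by iteratively reweighted least squares
    (an assumed optimizer, chosen here only because it needs no libraries)."""
    c = fit_constant_chi2(y, sigma)
    for _ in range(iterations):
        # Each chi^2 weight is damped by 1 / (1 + gamma * dchi2_i).
        w = [1.0 / (s ** 2 * (1.0 + gamma * ((yi - c) / s) ** 2))
             for yi, s in zip(y, sigma)]
        c = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return c

# Toy data (assumption): 100 Gaussian points about y = 10 with sigma = 0.5.
random.seed(1)
sigma = [0.5] * 100
y = [random.gauss(10.0, 0.5) for _ in range(100)]
c_chi2 = fit_constant_chi2(y, sigma)
c_lor = fit_constant_lorentzian(y, sigma)

# Add one gross outlier and see how far each estimate moves.
y_out, sigma_out = y + [30.0], sigma + [0.5]
shift_chi2 = abs(fit_constant_chi2(y_out, sigma_out) - c_chi2)
shift_lor = abs(fit_constant_lorentzian(y_out, sigma_out) - c_lor)

print("no outliers: chi2 fit", c_chi2, "Lorentzian fit", c_lor)
print("shift from one outlier: chi2", shift_chi2, "Lorentzian", shift_lor)
```

On clean Gaussian data the two estimates should agree to within the statistical error, while a single distant point moves the $\chi^2$ estimate far more than the Lorentzian one.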
All cross section data for Ecms > 6 GeV,
pp and pbar p, from Particle Data Group
All $\rho$ data (Real/Imaginary of forward scattering amplitude), for Ecms > 6 GeV,
pp and pbar p, from Particle Data Group
We use real analytic amplitudes that saturate the Froissart bound with the term $\ln^2(\nu/m)$, where $\nu$ is the laboratory energy and $m$ is the proton (pion) mass. We simultaneously fit the cross section $\sigma$ and $\rho$ (the ratio of the real to the imaginary portion of the forward scattering amplitude), where:
Fitting the “Sieved” pp and p data with analytic amplitudes
Only 3 Free Parameters
However, only 2, c1 and c2, are needed in cross section fits!
Cross section model fits for Ecms > 6 GeV, anchored at 4 GeV,
pp and pbar p, after applying “Sieve” algorithm to Real World data
$\rho$-value fits for Ecms > 6 GeV, anchored at 4 GeV,
pp and pbar p, after applying “Sieve” algorithm
What the “Sieve” algorithm accomplished for the pp and pbar p data
Before imposing the “Sieve” algorithm:
$\chi^2$/d.f. = 5.7 for 209 degrees of freedom;
Total $\chi^2$ = 1182.3.
After imposing the “Sieve” algorithm:
Renormalized $\chi^2$/d.f. = 1.09 for 184 degrees of freedom, for the $\Delta\chi^2_i > 6$ cut;
Total $\chi^2$ = 201.4.
Probability of fit ~0.2.
The 25 rejected points contributed 981 to the total $\chi^2$, an average $\Delta\chi^2_i$ of ~39 per point.
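The numbers quoted above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the quoted values; the variable names are illustrative, and it assumes the number of fitted parameters is the same before and after the cut, so that the drop in degrees of freedom equals the number of rejected points.

```python
# Values quoted on the slide.
chi2_before, dof_before = 1182.3, 209
chi2_after, dof_after = 201.4, 184

# Same number of fit parameters before and after (assumption),
# so the change in degrees of freedom counts the rejected points.
n_rejected = dof_before - dof_after
chi2_removed = chi2_before - chi2_after
mean_removed = chi2_removed / n_rejected

print("rejected points:", n_rejected)
print("chi2 removed:", round(chi2_removed, 1))
print("average per rejected point:", round(mean_removed, 1))
print("chi2/d.f. before:", round(chi2_before / dof_before, 1))
print("chi2/d.f. after:", round(chi2_after / dof_after, 2))
```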
Similar results were found when fitting $\pi^+p$ and $\pi^-p$ data from the Particle Data Group (not shown due to lack of time!)
Cross section and $\rho$-value predictions for pp and pbar p
The errors are due to the statistical uncertainties in the fitted parameters
LHC prediction
Cosmic Ray Prediction
100 data points, gaussianly distributed on the straight line $y = 1 - 2x$; 20 noise points, randomly distributed, with $\Delta\chi^2_i > 6$.
After the $\Delta\chi^2_i > 6$ cut:
Best fit is $y = 0.998 - 2.014x$; $R\,\chi^2_{\min}/\nu = 1.01$; fit to all data has $\chi^2_{\min}/\nu = 4.8$.
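A study of this kind can be sketched in a few steps: make a robust fit, drop points whose $\Delta\chi^2_i$ exceeds the cut, then refit the survivors with an ordinary $\chi^2$ fit. The sketch below assumes the Lorentzian objective sum_i ln(1 + 0.179 * dchi2_i), uses iteratively reweighted least squares as the robust optimizer, and invents its own noise model (offsets of 4 to 6 sigma with random sign). None of these choices, nor the function names or the seed, come from the talk, and the output is not expected to reproduce the slide's numbers.

```python
import random

GAMMA = 0.179  # Lorentzian tuning constant from the slides
CUT = 6.0      # Delta chi^2_i cut used in this example

def line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    return (sxx * sy - sx * sxy) / det, (sw * sxy - sx * sy) / det

def dchi2(x, y, sigma, a, b):
    """Per-point chi^2 contributions for the line y = a + b*x."""
    return [((yi - a - b * xi) / si) ** 2 for xi, yi, si in zip(x, y, sigma)]

def robust_line_fit(x, y, sigma, iterations=100):
    """Minimize sum ln(1 + GAMMA*dchi2_i) by iteratively reweighted
    least squares (assumed optimizer, not the talk's)."""
    base = [1.0 / s ** 2 for s in sigma]
    a, b = line_fit(x, y, base)
    for _ in range(iterations):
        w = [wi / (1.0 + GAMMA * d)
             for wi, d in zip(base, dchi2(x, y, sigma, a, b))]
        a, b = line_fit(x, y, w)
    return a, b

def sieve_line(x, y, sigma, cut=CUT):
    """Robust fit, reject points with dchi2_i > cut, then plain chi^2 refit."""
    a, b = robust_line_fit(x, y, sigma)
    keep = [i for i, d in enumerate(dchi2(x, y, sigma, a, b)) if d <= cut]
    xk = [x[i] for i in keep]
    yk = [y[i] for i in keep]
    sk = [sigma[i] for i in keep]
    a, b = line_fit(xk, yk, [1.0 / s ** 2 for s in sk])
    return a, b, sum(dchi2(xk, yk, sk, a, b)), len(keep)

# Toy data (assumption): 100 signal points on y = 1 - 2x, 20 noise points.
random.seed(5)
SIGMA = 0.5
x = [random.uniform(0.0, 10.0) for _ in range(120)]
y = [1.0 - 2.0 * xi + random.gauss(0.0, SIGMA) for xi in x[:100]]
y += [1.0 - 2.0 * xi + random.choice((-1, 1)) * random.uniform(4.0, 6.0) * SIGMA
      for xi in x[100:]]
sigma = [SIGMA] * 120

a_all, b_all = line_fit(x, y, [1.0 / s ** 2 for s in sigma])
chi2_all = sum(dchi2(x, y, sigma, a_all, b_all))
a, b, chi2_kept, n_kept = sieve_line(x, y, sigma)

print("all data:   a, b =", a_all, b_all, " chi2/nu =", chi2_all / (120 - 2))
print("after cut:  a, b =", a, b, " chi2/nu =", chi2_kept / (n_kept - 2))
print("points kept:", n_kept)
```

The $\chi^2/\nu$ printed after the cut is not renormalized, so it is expected to come out somewhat below 1 for a good fit.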
100 data points, gaussianly distributed about the constant $y = 10$; 40 noise points, randomly distributed, with $\Delta\chi^2_i > 4$.
After the $\Delta\chi^2_i > 4$ cut:
Best fit is $y = 9.98$; $R\,\chi^2_{\min}/\nu = 1.09$; fit to all data has $\chi^2_{\min}/\nu = 4.39$.
Lessons learned from computer studies of a straight line and a constant model
where $\sigma$ is the parameter error found in the $\chi^2$ fit
$\chi^2_{\rm renorm} = \chi^2_{\rm obs}/R^{-1}$, $\sigma_{\rm renorm} = r_{\chi^2}\,\sigma_{\rm obs}$,
where $\sigma$ is the parameter error
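The slide does not define $R^{-1}$. A minimal sketch, assuming $R^{-1}$ is the mean of $x^2$ for a unit Gaussian truncated at $x^2 <$ cut (that is, the expected $\chi^2$ per retained point once the tails are removed), is given below. The definition and the function names are assumptions made for illustration. The factor $r_{\chi^2}$ is not computed because the transcript gives no formula for it.

```python
import math

def renorm_factor_inverse(cut):
    """Assumed definition of R^{-1}: mean of x^2 for a unit Gaussian
    restricted to x^2 < cut."""
    a = math.sqrt(cut)
    kept_fraction = math.erf(a / math.sqrt(2.0))            # P(|x| < a)
    pdf_at_a = math.exp(-0.5 * cut) / math.sqrt(2.0 * math.pi)
    # Integral of x^2 * pdf over (-a, a) is P(|x| < a) - 2*a*pdf(a).
    return (kept_fraction - 2.0 * a * pdf_at_a) / kept_fraction

def renormalize_chi2(chi2_obs, cut):
    """chi2_renorm = chi2_obs / R^{-1}, as written on the slide."""
    return chi2_obs / renorm_factor_inverse(cut)

for cut in (4.0, 6.0, 9.0):
    print("cut", cut, "R =", 1.0 / renorm_factor_inverse(cut))
```

Under this assumption a cut at 6 gives R of about 1.11 and a cut at 4 gives about 1.29, so the tighter the cut, the larger the upward correction to the observed $\chi^2$.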
What happens when we try to separate two similar distributions?
100 data points, gaussianly distributed about the parabola $y = 1 + 2x + 0.5x^2$; 35 noise points, randomly distributed about the nearby parabola $y = 12 + 2x + 0.2x^2$. We have 13 “inliers”.
After the $\Delta\chi^2_i > 6$ cut: 113 points are kept; best fit is $y = 1.23 + 2.04x + 0.48x^2$.
BONUS: The “Sieve” also seems to work reasonably well in separating two similar distributions!