Statistical Methods: An Introduction for (Astro-)Physicists (uni-muenchen.de)
USM
University Observatory Munich
Statistical Methods: An Introduction for (Astro-)Physicists
Content
Fundamental terms of statistics and data analysis, with examples from physics and astrophysics
• probability: axioms, Bayes' theorem
• probability distribution functions of one or several random variables
• expectation value, variance, moments, characteristic function, variable transformation, variable reduction, covariance
• Chebychev inequality, central limit theorem
• important distributions: binomial, multinomial, Poisson, normal, exponential, chi-squared
• measurement errors, error propagation
• estimation: consistent, unbiased, efficient
• Maximum Likelihood methods, minimum variance bound
• linear regression, chi-square minimization, goodness of fit
• hypothesis testing: errors of the first and second kind, F-test, Student's t-test, Kolmogorov-Smirnov test
Literature
R.J. Barlow, Statistics, John Wiley & Sons, 1989
S. Brandt, Statistical and Computational Methods in Data Analysis, North-Holland, 1976 (2nd ed.)
G. Bohm & G. Zech, Einführung in Statistik und Messwertanalyse für Physiker, Springer, 2012?
G. Zech, Einführung in Statistik und Messwertanalyse für Physiker, Vorlesungsskript, Univ. Siegen, http://personal.ifae.es/jamin/lehre/bayes/Zech04.pdf
I. Probabilities
The majority of predictions is affected by uncertainties ("the only certain things in life are death and taxes"). Thus, dealing with probabilities and statistics is sensible for everybody, and inevitable for the experimental and empirical sciences.
• the accuracy of experiments is restricted by the precision of the devices used
• the underlying processes are often stochastic (random; stochastics, from Greek: the art of conjecture) → estimates for the measured quantities and their accuracy are required
• estimates with errors make it possible to test hypotheses. Results can be improved subsequently, by adding new measurements and suitable averaging prescriptions.
• statistics provides mathematical algorithms to draw conclusions, from a given sample, about the properties of the underlying parent population.
example 1: Polls allow one to predict the distribution of parliament seats. The parent population is the entirety of voters; the sample is a representative selection of them. It is important to know the accuracy of the prediction.

example 2: Determine the mean life-time of an unstable nucleus from the observation of 100 decays. The randomness is induced by quantum mechanical effects. The sample is representative for the entirety of all possible decays if the experimental device is able to measure all decay times (between zero and infinity) with sufficient precision.

example 3: Determine the frequency of a pendulum from 10 observations. The estimate for the actual frequency and its uncertainty are determined by suitable averages. It is assumed that the frequency could be determined with arbitrary precision from an infinite number of observations, and that the finite accuracy is the result of a restricted number of observations. The actual observations are a sample of the infinitely many possible observations.

example 4: Test whether two experimental devices work similarly. Compare samples from both devices and test whether these samples originate from the same parent population.
Difference between observation and measurement: An observation (event) is an element of a sample (with one or more elements). A measurement is a parameter estimate, attributed with an (in)accuracy.
• Example: the decay times of 10 pion decays are observations; the estimate of the decay rate is a measurement.
• Fit of a straight line: the observations are the data points, the slope is the measurement.
Axioms of probability
Let S = {E_1, E_2, E_3, …} be the set of possible results of an experiment (= events). Two events are said to be mutually exclusive if it is impossible that both of them occur in one result. For every event E there is a probability P(E), which is a real number satisfying the axioms of probability (Kolmogorov 1950):
(simplified version of Kolmogorov’s axioms)
I.  P(E) ≥ 0
II. P(E_1 or E_2) = P(E_1) + P(E_2), if E_1 and E_2 are mutually exclusive
III. Σ_i P(E_i) = 1, where the sum is over all mutually exclusive events
random events → probabilities

A + B means A or B;  A·B means A and B
• if P(A·B) = 0, then A and B are mutually exclusive

random events can be described by random variables (= variates); a realization of a variate is an observation (event)

event E → complementary event Ē ("not E"); from axiom III: P(Ē) = 1 − P(E), and thus P(E) ≤ 1
Empirical (classical) probabilities
Frequency definition (frequentists' interpretation): In a large number N of experiments the event A is observed to occur n times. Then
The set of all N cases (N repetitions of the same experiment, or N simultaneous identical experiments) is called the collective or ensemble. In this case, the probability is not only a property of the experiment, but the joint property of experiment and ensemble.
• example (von Mises, 1957): German insurance companies found that the fraction of their male clients dying at the age of 40 is 1.1%
• but this is not the probability that a particular man dies at this age. If the data had been collected from other samples (all Germans, German hang-glider pilots, …), the outcome would have been different. Thus, the probability depends on the collective from which it has been taken.
As well: experiments must be repeatable, under identical conditions. "What is the probability that it will rain tomorrow?" "Will the General Motors shares rise tomorrow?" And: are we allowed to speak about the probability that, e.g., the mass of the Higgs particle lies in the range of 100 to 200 GeV/c^2?
P(A) = lim_{N→∞} n/N
Objective probabilities
Peirce (1910): probability is a property of the device/experiment, e.g. a die. Resurrected by Popper (in connection with quantum mechanics) as objective probability or propensity. Seems reasonable when considering equally likely cases, e.g., due to symmetry (coin, die, etc.), but breaks down for continuous variables (a transformation can make a uniform, symmetric distribution non-uniform, and there is no natural choice for the "best" variable).
Subjective probability ‒ Bayesian statistics
definition: the conditional probability P(A|B) is the probability of A given that B is true,

P(A|B) = P(A·B) / P(B)

This implies P(A·B) = P(A|B) P(B). Reasonable definition: if P(A·B) = P(A) P(B), then the probabilities are independent of each other; in this case, P(A|B) = P(A)!

Bayes' theorem (published posthumously by R. Price, 1763), undisputed:

P(A|B) P(B) = P(B|A) P(A)  [= P(A·B)]

and also

P(A + B) = P(A) + P(B) − P(A·B)

(Venn diagram)
Rule of total probability
A collection of sets E_1, E_2, …, E_k such that E_1 ∪ E_2 ∪ … ∪ E_k = S is said to be exhaustive.

Assume E_1, E_2, …, E_k are k mutually exclusive and exhaustive sets. Then

P(B) = P(B ∩ E_1) + P(B ∩ E_2) + … + P(B ∩ E_k)
     = P(B·E_1) + P(B·E_2) + … + P(B·E_k)
     = P(B|E_1) P(E_1) + P(B|E_2) P(E_2) + … = Σ_i P(B|E_i) P(E_i)
Thus

P(A|B) = P(B|A) P(A) / P(B) = P(B|A) P(A) / Σ_i P(B|A_i) P(A_i)

or

P(A|B) = P(B|A) P(A) / P(B) = P(B|A) P(A) / [P(B|A) P(A) + P(B|Ā) P(Ā)],

with P(Ā) = 1 − P(A)
Examples
example 1: probabilities for drawing certain cards from a well-shuffled deck of 32 cards
P(queen) = 4/32 = 1/8
P(spade) = 8/32 = 1/4
P(queen of spades) = 1/8 · 1/4 = 1/32 (spade and queen)
P(queen or spade) = 1/8 + 1/4 − 1/32 = 11/32 (not mutually exclusive)
P(spade|queen) = 1/4 = P(spade) (independent events)
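As a quick cross-check (not part of the original slides), the probabilities of example 1 can be verified by enumerating the 32-card deck and counting:

```python
from fractions import Fraction

# Enumerate a 32-card deck (ranks 7..10, J, Q, K, A in four suits) and
# verify the probabilities of example 1 by direct counting.
ranks = ["7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [(r, s) for r in ranks for s in suits]      # 32 cards

def prob(event):
    return Fraction(sum(1 for c in deck if event(c)), len(deck))

p_queen = prob(lambda c: c[0] == "Q")                       # 1/8
p_spade = prob(lambda c: c[1] == "spades")                  # 1/4
p_both  = prob(lambda c: c[0] == "Q" and c[1] == "spades")  # 1/32
p_or    = prob(lambda c: c[0] == "Q" or c[1] == "spades")   # 11/32

assert p_or == p_queen + p_spade - p_both   # addition rule for non-exclusive events
assert p_both / p_queen == p_spade          # P(spade|queen) = P(spade): independent
```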
example 2: Calculate the fraction of women among students, from the fractions of students and of women in the population, and from the fraction of students among the female population:
P(A) = 0.05: fraction of students in the population
P(B) = 0.52: fraction of women in the population
P(A|B) = 0.07: fraction of students among the female population

P(B|A) = P(A|B) P(B) / P(A) = 0.07 · 0.52 / 0.05 = 0.728
Examples (cont'd)
example 3: Infected? (from Gigerenzer 2002, realistic numbers)

HIV screening for persons without risky behaviour; positive test result (D) with respect to two modern tests (ELISA, Western blot) in Germany:
H1: one of 10000 men is HIV-infected (non-risk group), i.e., P(H1) = 10^-4
P(D|H1) = 0.999: probability of a positive test (D) if the man is infected
P(D|H2) = 0.0001: probability of a positive test if he is not infected

Calculate P(H1|D), the probability of an actual infection if a man (non-risk group) tests positive:
P(H1|D) = P(D|H1) P(H1) / P(D) = P(D|H1) P(H1) / [P(D|H1) P(H1) + P(D|H2) P(H2)]
        = 0.999 · 10^-4 / [0.999 · 10^-4 + 0.0001 · (1 − 10^-4)] = 0.4998 (!)

approximation: P(D|H1) ≈ 1 ⇒ P(H1|D) ≈ 1 / [1 + P(D|H2)/P(H1)]

• P(D|H2) ≪ P(H1): test OK, probability that one is actually infected ≈ 1
• P(D|H2) ≈ P(H1): probability that one is actually infected ≈ 0.5
• P(D|H2) ≫ P(H1): probability that one is actually infected very low, P(H1|D) ≈ P(H1)/P(D|H2)
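The computation of example 3 can be sketched in a few lines of Python (an added illustration, not part of the original slides):

```python
# Bayes' theorem with the numbers of example 3: posterior probability of an
# HIV infection given a positive test, for a man from the non-risk group.
p_h1 = 1e-4             # prior P(H1): infected
p_d_given_h1 = 0.999    # P(D|H1): positive test if infected
p_d_given_h2 = 0.0001   # P(D|H2): positive test if not infected

p_d = p_d_given_h1 * p_h1 + p_d_given_h2 * (1 - p_h1)  # rule of total probability
p_h1_given_d = p_d_given_h1 * p_h1 / p_d

print(round(p_h1_given_d, 4))   # 0.4998: essentially a coin flip
```

Despite the "99.9%" sensitivity, the small prior makes the posterior only about 0.5.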
Bayesian statistics
So far, so good … (if all probabilities are known, this is not disputed). But Bayesian reasoning is also applied to statements which are regarded as 'unscientific' in the frequency definition: the probability of a theory (it will rain tomorrow, parity is not violated, …) is considered as a subjective 'degree of belief'. Subsequent experimental evidence then modifies this initial degree of belief. Expressed as

P(theory|result) = P(result|theory) / P(result) · P(theory),  where P(theory) is the 'prior'

What is the probability of a theory??? In case of complete ignorance, a uniform distribution is assumed … (see example below, "The first night in paradise")
• otherwise, a suitable choice due to symmetry arguments, laws of nature, empirical knowledge, experts' opinion, …
But: with respect to which parameter? (example: mass or mass^2 give different priors)
example: assume you toss a coin 3 times and obtain "head" every time. Calculate the probability that the coin is a phoney, i.e., has a head on each side.

If you have drawn the coin from your own pocket, the prior should be very small. Let P(phoney) = 10^-6. Then P(phoney|3 heads) = 8·10^-6, i.e., still reasonably small.

Now assume that you have played against the car salesman Honest Adi for a beer, and that Honest Adi has given you the coin. In this case, the a priori probability that the coin is a phoney might be higher; you estimate 5%, and one finds P(phoney|3 heads) = 0.3, which is a considerable chance.
P(phoney|3 heads) = P(3 heads|phoney) P(phoney) / [P(3 heads|phoney) P(phoney) + P(3 heads|not phoney) (1 − P(phoney))]

P(3 heads|phoney) = 1
P(3 heads|not phoney) = (1/2)^3 = 0.125
prior: P(phoney) = ??
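A minimal sketch (added, not from the slides) of the posterior computation for the two priors discussed above:

```python
# Posterior that the coin is two-headed after observing 3 heads,
# as a function of the prior P(phoney).
def posterior_phoney(prior, n_heads=3):
    like_phoney = 1.0            # a two-headed coin always shows head
    like_fair = 0.5 ** n_heads   # a fair coin: (1/2)^n
    return like_phoney * prior / (like_phoney * prior + like_fair * (1 - prior))

print(posterior_phoney(1e-6))   # ≈ 8e-6  (coin from your own pocket)
print(posterior_phoney(0.05))   # ≈ 0.30  (coin from Honest Adi)
```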
The first night in paradise

From G. Gigerenzer 2004, "The evolution of statistical thinking", Unterrichtswissenschaft, 32
The game show problem
© Christian Rieck -- www.spieltheorie.de/Spieltheorie_Anwendungen/ziegenproblem.htm
The goat problem ("Ziegenproblem") is one of those problems that heated tempers for a long time and drove whole crowds of mathematicians to the brink of despair (in particular because they were misled by their intuition and took a long time to notice it). There is probably no game theorist who did not, in some form, think about this problem at the end of the 1980s, even though it is not really a multi-player game. But first the rules, for those who do not know the problem yet.

In an American quiz show, a contestant stands in front of three closed doors; behind one of them is a car, behind the other two a goat. The contestant may now choose one of the doors; the host then opens one of the remaining two doors, always in such a way that a door with a goat is opened, so that the car must be behind one of the still-closed doors. He then offers the contestant the chance to switch doors once more, or to stay with the door chosen first, before it is opened. The contestant receives whatever is behind the door she finally chooses (where we assume that she prefers the car to the goat).

In a column by Marilyn vos Savant (www.marilynvossavant.com/articles/gameshow.html), someone asked whether in this situation it is better to switch or to stay with the original choice. Most people back then thought that it should not matter. Marilyn vos Savant, who has the highest IQ ever measured and is therefore regarded as the most intelligent person in the world, answered tersely with "switching is better" and thereby triggered the discussion, in which it took weeks before humanity could agree on the solution accepted to this day. Before that, however, she received such nice letters as: "You are the goat!", or: "You made a mistake. … If all these PhDs were wrong, our country would be in serious trouble." But at least the most intelligent person in the world became famous as a result.

Yet the problem can be solved by applying Bayes' theorem (the explanation of this theorem can also be found in my game theory book). Here E is the unobservable event (where is the car?) and B is the observation (which door does the host leave closed?). This leads to the following values needed for Bayes' theorem:
The game show problem (cont'd)
P(E) = probability that the car is behind a given door; since there is one car behind three doors, this value is 1/3

P(B) = probability that a given door remains closed after the host's choice; since the host opens one of two closed doors, this value is 1/2

P(B|E) = probability that the host leaves a door closed when the car is behind it; since the host only opens a door if there is no car behind it, this value is 1

P(E|B) = probability that the car is behind the door the host left closed; this is the value we are looking for.

Inserted into Bayes' formula, this yields:

The probability that the car is behind the door the host leaves closed is thus 2/3, whereas it is only 1/3 for the originally chosen door. So it is clear that switching doubles one's chances of getting the car. Vos Savant was right.

I already mentioned above that this is not really a multi-player game. The host here is not a decision-maker who still has to choose between two doors according to his own preferences; rather, he behaves like a purely executing algorithm that only opens a door behind which there is certainly no car. Hence this is a game against nature (that is, against the probability distribution with which cars and goats stand behind the doors).
P(E|B) = P(B|E) · P(E) / P(B) = (1 · 1/3) / (1/2) = 2/3
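The 2/3 result can also be checked empirically. The following Monte Carlo sketch (an added illustration, not from the original text) simulates both strategies:

```python
import random

# Monte Carlo check of the game show problem: estimate the winning
# probability when staying versus switching.
def play(switch, rng):
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # the host opens a goat door that is neither the pick nor the car
    host = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != host)
    return pick == car

rng = random.Random(0)
n = 100_000
p_stay = sum(play(False, rng) for _ in range(n)) / n    # ≈ 1/3
p_switch = sum(play(True, rng) for _ in range(n)) / n   # ≈ 2/3
```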
II. Probability distributions – one random variable
random events are characterized by random variables. Probability distribution functions associate random variables with their corresponding probabilities. There are discrete and continuous random variables (r.v.). In the following, probabilities refer to one r.v. x, i.e., one property which can be quantified.

definition: the (cumulative) distribution function (c.d.f.) F(t) defines the probability of finding a value smaller than t.

from the probability axioms, we obtain the following properties of F(t):
• F(t) increases monotonically with t
• F(-∞)=0
• F(∞) =1
There are discrete and continuous distributions
F(t) = P(x < t), with −∞ < t < ∞
Discrete distributions

describe probabilities for the occurrence of N discrete, different events, with

Σ_i P(x_i) = 1  and  P(x_i) = F(x_i + ε) − F(x_i − ε)

example: die; the probability to throw a certain number x_i is P(x_i) = 1/6, with x_i = i for i = 1, …, 6

discrete distributions can be treated as continuous distributions via the Dirac δ-function

[figure: probability distribution P(x) and distribution function F(x)]
Continuous distributions
instead of a probability distribution, define the probability density f(x) (p.d.f. = probability density function) with

f(x) = dF(x)/dx

and properties

f(−∞) = f(+∞) = 0
∫_{−∞}^{∞} f(x) dx = 1; thus
P(a ≤ x < b) = F(b) − F(a) = ∫_a^b f(x) dx
example: life-times of unstable particles follow an exponential distribution

f(t) = exp(−t/τ)/τ for t ≥ 0 and with mean life-time τ

⇒ F(t) = ∫_{−∞}^{t} f(t′) dt′ = ∫_0^t f(t′) dt′ = 1 − exp(−t/τ),

and the probability that the particle lives longer than τ is

P(t > τ) = F(∞) − F(τ) = exp(−1)
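A small numerical sketch (added, not from the slides) reproduces P(t > τ) = e^-1 by integrating the p.d.f. with a simple midpoint rule (τ = 1 assumed):

```python
import math

# Exponential life-time distribution: check P(t > tau) = exp(-1)
# by numerically integrating the p.d.f. from 0 to tau (tau = 1).
tau = 1.0
f = lambda t: math.exp(-t / tau) / tau   # p.d.f.

def cdf(t, n=100_000):
    # midpoint-rule integration of f over [0, t]
    h = t / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

print(1 - cdf(tau))   # ≈ exp(-1) ≈ 0.3679
```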
Expectation value
note: if x is a r.v., then any function u(x) is a r.v. as well. Distributions have characteristic parameters such as expectation value, width, and asymmetry.

The expectation value or mean of a r.v. x results from averaging over x according to its distribution:
E(x) = μ = ⟨x⟩ = Σ_i x_i P(x_i)  (discrete dist.)
E(x) = μ = ⟨x⟩ = ∫_{−∞}^{∞} x f(x) dx  (continuous dist.)

E(u(x)) = ⟨u⟩ = Σ_i u(x_i) P(x_i)  (discrete dist.)
E(u(x)) = ⟨u⟩ = ∫_{−∞}^{∞} u(x) f(x) dx  (continuous dist.)
calculation rules: let α, β be constants and u and v functions of x

E(α) = α;  E(α u(x)) = α E(u)
E(α u + β v) = α E(u) + β E(v):  the expectation value is a linear operator!
if x, y are independent r.v., then E(u(x) v(y)) = E(u) E(v)  (see Sect. III)

the expectation value is the centre of gravity of the distribution
Central moments of a r.v.
Let's choose especially

u(x) = (x − μ)^n, with E(u(x)) =: μ'_n = E{(x − μ)^n},

which is called the n-th central moment, or the n-th moment about the mean.

The lowest-order central moments are

μ'_0 = 1 and μ'_1 = 0.

The quantity

μ'_2 = σ^2(x) = Var(x) = E{(x − μ)^2}

is the lowest central moment which contains information about the average deviation of x from the mean. It is called the variance of x, and σ is the standard deviation.
Variance
the variance measures the mean quadratic deviation from the mean. The standard deviation σ = √Var has the same units as x and will be identified with the errors of measurements. The mechanical analogue of the variance is the moment of inertia.

calculation rules:

Var(α) = 0,  Var(α x) = α^2 Var(x)
Var(α x + β y) = α^2 Var(x) + β^2 Var(y),  if x, y are independent (see Sect. III)

different representation:

Var(x) = E{(x − μ)^2} = E(x^2 − 2xμ + μ^2) = E(x^2) − 2μ^2 + μ^2 = E(x^2) − μ^2, or
Var(x) = ⟨x^2⟩ − ⟨x⟩^2

The variance (and all other central moments) is invariant to translations of the r.v.!
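Both representations, and the translation invariance, can be checked exactly for a fair die (an added illustration, not part of the original slides):

```python
from fractions import Fraction

# Check Var(x) = E(x^2) - E(x)^2 and translation invariance for a fair die.
xs = [Fraction(i) for i in range(1, 7)]   # die faces, P = 1/6 each
mean = sum(xs) / 6
var = sum((x - mean) ** 2 for x in xs) / 6
var_alt = sum(x ** 2 for x in xs) / 6 - mean ** 2
assert var == var_alt == Fraction(35, 12)

# translation invariance: Var(x + 10) = Var(x)
shifted = [x + 10 for x in xs]
mean_s = sum(shifted) / 6
assert sum((x - mean_s) ** 2 for x in shifted) / 6 == var
```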
Variance of a convolution
We measure a quantity x with p.d.f. g, and the measurement is smeared out according to a p.d.f. h (→ convolution, see below). We look for the variance of the measured x′. Since the variance is translation-invariant, we set E(x) to zero.

f(x′) = ∫ g(x) h(x′ − x) dx

⟨x′^2⟩ = ∫∫ x′^2 g(x) h(x′ − x) dx dx′
       = ∫∫ [(x′ − x)^2 + 2x(x′ − x) + x^2] g(x) h(x′ − x) dx dx′
       = ∫∫ [u^2 + 2xu + x^2] g(x) h(u) dx du   (substitution u = x′ − x)
       = ⟨u^2⟩ + ⟨x^2⟩,  since ⟨x⟩ = 0.

The variance of x′ is the sum of the variances of the distributions g and h. For sequential measurements of a quantity, the individual errors add quadratically (see below).

Otherwise (E(x) ≠ 0), the analogue derivation gives

⟨x′^2⟩ = ⟨u^2⟩ + 2⟨u⟩⟨x⟩ + ⟨x^2⟩
⟨x′⟩ = ⟨u⟩ + ⟨x⟩  ⇒  ⟨x′⟩^2 = ⟨u⟩^2 + 2⟨u⟩⟨x⟩ + ⟨x⟩^2
⇒ Var(x′) = ⟨x′^2⟩ − ⟨x′⟩^2 = Var(u) + Var(x):  identical result!
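The addition of variances under smearing can be checked by a Monte Carlo sketch (added illustration; the choice of an exponential signal with Gaussian smearing is an arbitrary assumption):

```python
import random

# Monte Carlo check that smearing (convolution) adds variances:
# x ~ exponential (Var = tau^2), smearing u ~ Gaussian (Var = sigma^2),
# so Var(x') = Var(x + u) should be tau^2 + sigma^2.
rng = random.Random(1)
tau, sigma, n = 1.0, 0.5, 200_000

xp = [rng.expovariate(1 / tau) + rng.gauss(0.0, sigma) for _ in range(n)]
mean = sum(xp) / n
var = sum((v - mean) ** 2 for v in xp) / n
print(var)   # ≈ tau^2 + sigma^2 = 1.25
```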
Skewness
measures the asymmetry of a distribution

γ_1 = μ'_3 / σ^3 = E{(x − μ)^3}/σ^3 = … = [E(x^3) − 3μσ^2 − μ^3]/σ^3

sometimes one finds β_1 = γ_1^2

skewness is invariant to translations and elongations
a positive skew describes a distribution with a tail which extends to the right.
Curtosis/Kurtosis
measures how pronounced the tails of the distribution are

β_2 = μ'_4 / σ^4 = E{(x − μ)^4}/σ^4 = … = [E(x^4) − 4μ E(x^3) + 6μ^2 E(x^2) − 3μ^4]/σ^4

γ_2 = β_2 − 3 is defined in such a way as to be zero for a normal (= Gaussian) distribution.

A positive γ_2 implies a relatively higher, narrower peak and wider wings than the normal distribution with the same mean and σ^2, and vice versa (wider peak, shorter wings) for negative γ_2.
Examples
3 different p.d.f.s, all with zero mean and unit variance, but different skewness and curtosis. Left: linear scale; right: logarithmic scale.
Let u(x) = (x − μ)/σ. Then E(u) = 0 and Var(u) = 1.
The r.v. u has particularly simple properties and is called a reduced (normalized) variable.
Examples (cont'd)
life-time (exponential) distribution:

⟨t^n⟩ = ∫_0^∞ t^n exp(−t/τ)/τ dt = n! τ^n

⟨t⟩ = τ
⟨t^2⟩ = 2τ^2  ⇒ μ = τ, σ^2 = τ^2
⟨t^3⟩ = 6τ^3  ⇒ γ_1 = 2: skewed, with tail to the right
⟨t^4⟩ = 24τ^4 ⇒ γ_2 = 6: higher peak and wider wings than normal dist.

life-time (blue) and normal (red) distribution, τ = 1; both distributions have identical mean and variance (indicated by dotted lines)
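The quoted values γ_1 = 2 and γ_2 = 6 can be verified numerically (added sketch; midpoint integration with a finite cutoff, τ = 1 assumed):

```python
import math

# Numerical moments of the exponential distribution (tau = 1):
# <t^n> = n!, hence gamma_1 = 2 and gamma_2 = 6.
def moment(n, tau=1.0, tmax=50.0, steps=200_000):
    h = tmax / steps   # midpoint rule; the tail beyond tmax is negligible
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** n * math.exp(-t / tau) / tau
    return total * h

m1, m2, m3, m4 = (moment(n) for n in (1, 2, 3, 4))
mu = m1
var = m2 - m1 ** 2
gamma1 = (m3 - 3 * mu * var - mu ** 3) / var ** 1.5
gamma2 = (m4 - 4 * mu * m3 + 6 * mu ** 2 * m2 - 3 * mu ** 4) / var ** 2 - 3
print(gamma1, gamma2)   # ≈ 2, ≈ 6
```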
Other parameters of a distribution
mode x_m: P(x = x_m) = max
if the distribution has a differentiable probability density, the mode is determined via
df(x)/dx = 0,  d^2 f(x)/dx^2 < 0
if there is one maximum, the distribution is called unimodal, otherwise multimodal

median x_0.5: F(x_0.5) = P(x < x_0.5) = 0.5
for a continuous distribution, ∫_{−∞}^{x_0.5} f(x) dx = 0.5
the median divides the total range of x into two regions of equal probability

lower and upper quartiles: F(x_0.25) = 0.25;  F(x_0.75) = 0.75

full width at half maximum (FWHM): is independent of the tails; for a Gaussian distribution, FWHM = 2.35 σ
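The factor 2.35 follows from solving f(x) = f(μ)/2 for a Gaussian (a one-line check, added for illustration):

```python
import math

# For a Gaussian, f(mu ± x) = f(mu)/2 when exp(-x^2/(2 sigma^2)) = 1/2,
# i.e. x = sigma*sqrt(2 ln 2); hence FWHM = 2*sqrt(2 ln 2)*sigma ≈ 2.35*sigma.
fwhm_over_sigma = 2 * math.sqrt(2 * math.log(2))
print(round(fwhm_over_sigma, 2))   # 2.35
```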
Chebychev's inequality
The values of a r.v. are somewhere in the neighbourhood of the mean μ. Deviations from the mean are less probable the larger they are compared with σ. This fact is expressed by Chebychev's inequality (which is generally very weak):

P(|x − μ| > kσ) < 1/k^2,  k ≥ 1

"The probability of being more than k standard deviations away from the mean is lower than 1/k^2."

Proof for a continuous r.v.:

P := P(|x − μ| > kσ) = P((x − μ)^2 > k^2 σ^2) = ∫_{k^2 σ^2}^∞ g(t) dt, with g(t) the p.d.f. of t = (x − μ)^2

σ^2 = E{(x − μ)^2} = E(t) = ∫_0^∞ t g(t) dt = ∫_0^{k^2 σ^2} t g(t) dt + ∫_{k^2 σ^2}^∞ t g(t) dt

Since the integration is over positive values only and g(t) is positive definite (a p.d.f.), the integrals can be bounded (using ∫_a^b t g(t) dt > a ∫_a^b g(t) dt):

σ^2 > 0 + k^2 σ^2 ∫_{k^2 σ^2}^∞ g(t) dt = k^2 σ^2 P, i.e., P < 1/k^2, q.e.d.
Page 35: Example

$P(|x-\mu| > k\sigma)$ can alternatively be written as

$$P(x > \mu + k\sigma) = \int_{\mu+k\sigma}^{\infty} f(t)\,dt \quad \text{for positive deviations, and}$$
$$P(x < \mu - k\sigma) = \int_{-\infty}^{\mu-k\sigma} f(t)\,dt \quad \text{for negative deviations.}$$

Test of Chebyshev's inequality for the life-time distribution (μ = σ = τ):

$$P(x > \mu + k\sigma) \to P\bigl(x > \tau(1+k)\bigr) = \int_{\tau(1+k)}^{\infty} \frac{\exp(-t/\tau)}{\tau}\,dt = \exp\bigl(-(1+k)\bigr) \ll \frac{1}{k^2}, \quad \text{q.e.d.}$$
$$P(x < \mu - k\sigma) \to P\bigl(x < \tau(1-k)\bigr) = 0 \ \text{for } k = 1, \ \text{else not defined (cf. Fig. page 29)}$$
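The tail probability and the Chebyshev bound can be checked by simulation; a minimal Python sketch (the slides use IDL; the sample size and k chosen here are arbitrary):

```python
import math
import random

random.seed(42)
tau = 1.0          # mean and standard deviation of the life-time distribution
n = 100_000
k = 2.0            # number of standard deviations

# draw exponential variates (mu = sigma = tau)
samples = [random.expovariate(1.0 / tau) for _ in range(n)]

# empirical probability of deviating more than k*sigma from the mean
p_emp = sum(1 for t in samples if abs(t - tau) > k * tau) / n

p_exact = math.exp(-(1.0 + k))   # exact tail probability from the slide
p_cheby = 1.0 / k**2             # Chebyshev bound

print(p_emp, p_exact, p_cheby)   # empirical value near the exact one, both well below the bound
```

For k = 2 the exact tail probability is e⁻³ ≈ 0.05, a factor of five below the bound 1/k² = 0.25, illustrating how weak the inequality is.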
Page 36: Moments of a distribution

remember: central moments (of r.v. or distribution)

$$\mu'_n = E\{(x-\mu)^n\} = \int_{-\infty}^{\infty} (x-\mu)^n f(x)\,dx$$

analogue definition: moments of a distribution

$$\mu_n = E(x^n) = \int_{-\infty}^{\infty} x^n f(x)\,dx \quad \text{or} \quad \mu_n = E(x^n) = \sum_{k=1}^{\infty} x_k^n P(x_k)$$

remember as well: $\mu_1 = \langle x \rangle = E(x) = \mu$ and

$$\mu'_1 = 0, \qquad \mu'_2 = \sigma^2 = \mathrm{Var}(x), \qquad \mu'_3 = \gamma_1 \sigma^3$$

the probability density function is uniquely defined by its moments, as we will show now.
Page 37: Characteristic function

definition: The characteristic function of a p.d.f. f(x) is

$$\phi(t) = E(e^{itx}) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx \quad \text{or} \quad \phi(t) = \sum_{k=1}^{\infty} e^{itx_k} P(x_k)$$

(Note: the lower summation index might also be 0.)

for a continuous distribution, the characteristic function is the Fourier transform of f(x) (note the (missing) normalization). Thus, the transform is invertible ... and the characteristic function defines the p.d.f.
Page 38: Characteristic function and moments

The n-th derivative of the characteristic function is

$$\frac{d^n \phi(t)}{dt^n} = \int_{-\infty}^{\infty} (ix)^n\, e^{itx} f(x)\,dx$$

At t = 0 one obtains

$$\left.\frac{d^n \phi(t)}{dt^n}\right|_{t=0} = i^n \int_{-\infty}^{\infty} x^n f(x)\,dx = i^n \mu_n$$

Thus, the Taylor expansion of φ(t) around t = 0,

$$\phi(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} \left.\frac{d^n \phi(t)}{dt^n}\right|_{t=0} = \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\, \mu_n,$$

delivers all moments of the distribution. Since the Fourier transform can be uniquely inverted and the Taylor expansion of the characteristic function consists of the moments, we conclude that indeed the moments define the p.d.f., as stated above.

For the central moments, we find in analogy

$$\phi'(t) = E\bigl(e^{it(x-\mu)}\bigr) = \int_{-\infty}^{\infty} e^{it(x-\mu)} f(x)\,dx \to \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\, \mu'_n$$

Note in particular that

$$\mu'_2 = \sigma^2 = -\left.\frac{d^2 \phi'(t)}{dt^2}\right|_{t=0}$$
Page 39: Sum of two independent r.v.

The p.d.f. of a sum of two independent r.v. is the (inverse) Fourier transform of the product of the two corresponding characteristic functions!

Let z = x + y with independent r.v. x, y and corresponding p.d.f.s f(x), g(y). Calculate the distribution h(z).

$$\phi_h(t) = E\bigl(e^{it(x+y)}\bigr) = E\bigl(e^{itx} e^{ity}\bigr) \overset{x,y\ \text{independent}}{=} E(e^{itx})\,E(e^{ity}), \quad \text{i.e.,}$$
$$\phi_h(t) = \phi_f(t)\,\phi_g(t)$$

and thus

$$h(z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itz}\, \phi_h(t)\,dt$$
Page 40: Example

characteristic function and moments of the exponential distribution:

$$f(x) = \lambda e^{-\lambda x} \ \text{for } x \ge 0 \quad \left(\lambda = \frac{1}{\tau}, \ \text{e.g., for the life-time distribution}\right)$$

$$\phi(t) = \lambda \int_0^\infty e^{itx}\, e^{-\lambda x}\,dx = \lambda \left.\frac{e^{-(\lambda - it)x}}{-(\lambda - it)}\right|_0^\infty = \frac{\lambda}{\lambda - it}$$

By differentiation,

$$\frac{d\phi(t)}{dt} = \frac{i\lambda}{(\lambda - it)^2}, \qquad \frac{d^n\phi(t)}{dt^n} = \frac{n!\, i^n\, \lambda}{(\lambda - it)^{n+1}}, \qquad \left.\frac{d^n\phi(t)}{dt^n}\right|_{t=0} = \frac{n!\, i^n}{\lambda^n},$$

we obtain the moments

$$\mu_n = n!\,\lambda^{-n} = n!\,\tau^n, \quad \text{e.g.,}$$
$$\mu = \mu_1 = \tau, \qquad \sigma^2 = \mu_2 - \mu_1^2 = \tau^2, \qquad \gamma_1 = (\mu_3 - 3\mu\sigma^2 - \mu^3)/\sigma^3 = 2$$

(compare Fig. page 29), without explicitly calculating the integrals defining the expectation values!
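The result μₙ = n! τⁿ can be cross-checked by direct numerical integration of the defining expectation values; a minimal Python sketch (τ = 0.5, the integration cutoff, and the step count are arbitrary choices):

```python
import math

tau = 0.5                      # assumed value of the life-time parameter
lam = 1.0 / tau

def moment(n, upper=40.0, steps=200_000):
    """n-th moment of f(x) = lam*exp(-lam*x), via the trapezoidal rule."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * x**n * lam * math.exp(-lam * x)
    return total * h

# compare the numerical integrals with n! * tau^n from the slide
for n in range(1, 5):
    print(n, moment(n), math.factorial(n) * tau**n)
```

The agreement confirms the moments obtained from the characteristic function without any analytic integration of xⁿ f(x).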
Page 41: Transformation of variables

given a p.d.f. f(x), we would like to know the p.d.f. g(u), when u is a (uniquely invertible) function of x, u(x).
example: given a distribution of velocities f(v), we want to calculate the distribution of energies, E = ½mv².

for discrete distributions, this is trivial. The probability for the event u(x_k) (where u is a function of x) is the same as for the event x_k itself,

P(u(x_k)) = P(x_k)

for continuous distributions, we have to invoke calculus.
Page 42: Calculation of the transformed p.d.f.

p.d.f. f(x) and a uniquely invertible function u(x) are given. Calculate g(u):

$$P(x_1 < x < x_2) = P(u_1 < u < u_2) \quad \text{with } u_1 = u(x_1) \text{ and } u_2 = u(x_2)$$
$$P = \int_{x_1}^{x_2} f(x)\,dx = \int_{u_1}^{u_2} g(u)\,du \ \Rightarrow \ g(u)\,du = f(x)\,dx, \quad \text{and thus}$$
$$g(u) = f(x)\left|\frac{dx}{du}\right|$$

The absolute sign guarantees that the p.d.f. is positive. Integrating this equation yields F(x) = G(u).

If u(x) is invertible, but no longer uniquely, and thus x(u) is ambiguous, one has to sum over all contributing branches:

$$g(u) = \left\{f(x)\left|\frac{dx}{du}\right|\right\}_{\text{branch 1}} + \left\{f(x)\left|\frac{dx}{du}\right|\right\}_{\text{branch 2}} + \ldots$$

Transformation of a p.d.f. f(x) to g(u) via u(x). The indicated areas are equal.
Transformation via a parabola. The sum of the indicated areas under f(x) is equal to the area under g(u).
Page 43: Examples

example 1 (uniform): calculate the p.d.f. for the area of a circle from a uniform distribution of radii between 0 and r_m (see Sect. IV).

p.d.f. for r: $f(r) = \dfrac{1}{r_m - 0}$ for $0 < r < r_m$; $f(r) = 0$ else.

$$g(A) = f(r)\left|\frac{dr}{dA}\right| \quad \text{with } A = \pi r^2; \qquad \frac{dA}{dr} = 2\pi r = 2\sqrt{\pi A}$$
$$g(A) = \frac{1}{r_m}\,\frac{1}{2\sqrt{\pi A}} = \frac{1}{2 r_m \sqrt{\pi}}\,A^{-1/2}; \qquad \text{Test: } \int_0^{A_m} g(A)\,dA = \frac{1}{2 r_m \sqrt{\pi}} \int_0^{\pi r_m^2} A^{-1/2}\,dA = 1\,!$$

example 2: calculate the distribution for the square of a reduced r.v. which itself should be normally distributed:

$$u = \left[\frac{x-\mu}{\sigma}\right]^2 \quad \text{and} \quad f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{(see Sect. IV)}$$

The function x(u) has two branches!
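Example 1 can be checked by Monte Carlo: sample radii uniformly, transform to areas, and compare the empirical c.d.f. with the one implied by g(A). A minimal Python sketch (r_m = 2 and the test point A₀ = 1 are arbitrary choices):

```python
import math
import random

random.seed(1)
r_m = 2.0                      # assumed maximum radius
n = 200_000

# sample radii uniformly on (0, r_m) and transform to areas A = pi r^2
areas = [math.pi * random.uniform(0.0, r_m) ** 2 for _ in range(n)]

# empirical probability that A < A0, compared with the analytic c.d.f.
# G(A0) = integral of 1/(2 r_m sqrt(pi A)) from 0 to A0 = sqrt(A0/pi)/r_m
A0 = 1.0
p_emp = sum(1 for a in areas if a < A0) / n
p_ana = math.sqrt(A0 / math.pi) / r_m
print(p_emp, p_ana)            # the two values should agree closely
```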
Page 44: Examples (cont'd)

With $x = \mu \pm \sigma\sqrt{u}$,

$$\frac{dx}{du} = \pm\frac{\sigma}{2\sqrt{u}}; \qquad g(u) = \left\{\frac{1}{\sigma\sqrt{2\pi}}\,e^{-u/2}\,\frac{\sigma}{2\sqrt{u}}\right\}_{\text{branch 1}} + \left\{\frac{1}{\sigma\sqrt{2\pi}}\,e^{-u/2}\,\frac{\sigma}{2\sqrt{u}}\right\}_{\text{branch 2}}$$

Since the contributions from branch 1 and 2 are identical, we obtain

$$g(u) = \frac{1}{\sqrt{2\pi u}}\,e^{-u/2},$$

which is the so-called χ²-distribution for one degree of freedom (see Sect. IV).

example 3: kinetic energy for a 1-D ideal gas. The p.d.f. of the velocity of a particle into direction x is

$$f(v) = \sqrt{\frac{m}{2\pi kT}}\,e^{-\frac{mv^2}{2kT}}.$$

Calculate the corresponding energy distribution. As above, $\dfrac{dv}{dE} = \pm\dfrac{1}{\sqrt{2mE}}$; both branches have similar contributions, thus

$$g(E) = 2\sqrt{\frac{m}{2\pi kT}}\,e^{-E/kT}\,\frac{1}{\sqrt{2mE}} = \frac{1}{\sqrt{\pi kTE}}\,e^{-E/kT}$$
Page 45: Calculation of the transformation

Now, the original and the transformed p.d.f., f(x) and g(u), are given, and the transformation u(x) needs to be calculated. This situation is frequently met in Monte-Carlo simulations. Random number generators usually create uniformly distributed r.v., and we look for the transformation law which transforms these uniformly distributed r.v. into others which are distributed following a given p.d.f. (defined by the process to be investigated).

$$\int_{-\infty}^{x} f(x')\,dx' = \int_{-\infty}^{u} g(u')\,du'; \quad \text{integration yields the c.d.f.s}$$
$$F(x) = G(u) \quad \text{and thus} \quad u(x) = G^{-1}\bigl(F(x)\bigr)$$
Page 46

The problem can be solved analytically only if both p.d.f.s f and g can be integrated analytically, and if the inverse of G can be calculated. In other cases (which are the majority), numerical methods have to be applied. Most powerful is the rejection method by von Neumann (see, e.g., "Numerical Recipes" and http://www.usm.uni-muenchen.de/people/puls/lessons/numpraktnew/montecarlo/mc_manual.pdf).

example: In the case of f being a uniform distribution over the unit interval, i.e., f(x) = 1 for 0 ≤ x ≤ 1 and f(x) = 0 else, we obtain F(x) = x and thus G(u) = x; u = G⁻¹(x).

Create exponentially distributed r.v. from a uniform distribution:

$$g(u) = \lambda e^{-\lambda u};$$
$$G(u) = \int_0^u \lambda e^{-\lambda u'}\,du' = 1 - e^{-\lambda u} \overset{!}{=} F(x) = x \quad \text{(uniformly distr. } x \text{ in unit interval)}$$
$$1 - e^{-\lambda u} = x \ \Rightarrow \ u(x) = -\ln(1-x)/\lambda = -\ln(x)/\lambda$$

(since with x also 1 − x is uniformly distributed in the unit interval).
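The transformation u(x) = −ln(1−x)/λ can be sketched directly; a minimal Python version of the method (λ = 2 matches the figure on the next page, the sample size is arbitrary):

```python
import math
import random

random.seed(7)
lam = 2.0                      # rate parameter, as in the figure on the next page
n = 100_000

# inverse transform: u = -ln(1 - x)/lam maps uniform x to exponential u
u = [-math.log(1.0 - random.random()) / lam for _ in range(n)]

mean = sum(u) / n
print(mean)                    # sample mean close to the expectation value 1/lam
```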
Page 47

P.d.f.s for a uniform distribution (black), generated by a random number generator from N=10³ (left) and N=10⁶ (right) subsequent numbers. The corresponding exponential distribution (λ=2, blue) has been created from these numbers using the transformation method as described above. Displayed are histograms with bin size 0.02. Analytical p.d.f.s in green and red. IDL (interactive data language) code below.
Page 48: III. Distributions of several random variables ‒ multivariate p.d.f.s

until now, univariate distributions: one r.v.
generalization to several r.v. "easy": multivariate (also: more-dimensional) distributions
in the following, only continuous distributions

definition of the probability distribution for two r.v., x, y:

$$F(x,y) = P(x' < x,\ y' < y) \quad \text{with} \quad F(-\infty,-\infty) = 0, \quad F(\infty,\infty) = 1$$

corresponding joint p.d.f.:

$$f(x,y) = \frac{\partial^2 F(x,y)}{\partial x\,\partial y} \ \Rightarrow \ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1 \quad \text{and}$$
$$P(a \le x < b,\ c \le y < d) = \int_a^b \int_c^d f(x,y)\,dx\,dy$$
Page 49: Marginal distributions

following problem: sometimes the c.d.f. F(x,y) is approximately determined (by many measurements), but only the probability distribution of x (irrespective of y) is of interest. example: the appearance of a certain disease is known as a function of location and date. For a certain investigation, the dependence on date is without interest. In this case, we marginalize the distribution, i.e., we integrate over the whole range in y:

$$P(a \le x < b,\ -\infty < y < \infty) = \int_a^b \left[\int_{-\infty}^{\infty} f(x,y)\,dy\right] dx = \int_a^b g(x)\,dx$$
Page 50

$$g(x) = \int_{-\infty}^{\infty} f(x,y)\,dy$$

is a p.d.f. of x, called the marginal distribution of x. The corresponding distribution of y is

$$h(y) = \int_{-\infty}^{\infty} f(x,y)\,dx$$

Marginal distributions are "projections" of the joint p.d.f. onto the axes.

Two r.v. x, y are independent if f(x,y) = g(x) h(y).

Now, we can define the conditional probability for y' given that x' is known: P(y ≤ y' < y+dy | x ≤ x' < x+dx). The corresponding p.d.f. is given by

$$f(y|x) = \frac{f(x,y)}{g(x)},$$

and the above probability results as f(y|x) dy. Note: conditional probabilities as defined above are normalized!
Page 51

The rule of total probability (see Sect. I) is then expressed by

$$h(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{-\infty}^{\infty} f(y|x)\,g(x)\,dx.$$

If the variables are independent, then

$$f(y|x) = \frac{f(x,y)}{g(x)} = \frac{g(x)\,h(y)}{g(x)} = h(y).$$

Any constraint on one variable cannot contribute information about the other, if the variables are independent!

Bayes' theorem for two-dimensional distributions:

$$f(x|y)\,h(y) = f(y|x)\,g(x) = f(x,y)$$
Page 52: Example

superposition of two normal distributions, with corresponding marginal and conditional p.d.f.s:

$$g(x) = \int f(x,y)\,dy, \qquad h(y) = \int f(x,y)\,dx$$
$$f(y|x=1) = f(x=1, y)/g(1), \quad \text{with } g(1) = 0.6672.$$

Remember that this conditional p.d.f. is normalized, i.e., $\int f(y|x=1)\,dy = 1$. Note: since f(y|x) depends on x, x and y are not independent!
Page 53: Example (cont'd)
Page 54: Moments

in analogy to univariate distributions, we define (for two r.v. x, y)

$$\mu'_{20} = \sigma_x^2, \qquad \mu'_{02} = \sigma_y^2,$$
$$\mu'_{11} = E\bigl((x-\mu_x)(y-\mu_y)\bigr) = \mathrm{cov}(x,y) \quad \text{(the "covariance")}$$
Page 55

similarly, we define

$$E\bigl(u(x,y)\bigr) = \iint u(x,y)\,f(x,y)\,dx\,dy$$
$$\sigma^2\bigl(u(x,y)\bigr) = E\Bigl\{\bigl[u(x,y) - E(u(x,y))\bigr]^2\Bigr\} = E\bigl(u^2(x,y)\bigr) - \bigl(E(u(x,y))\bigr)^2$$

examples:

i) $u(x,y) = ax + by \ \Rightarrow \ E(ax+by) = aE(x) + bE(y)$ (cf. Sect. II)

$$\sigma^2(ax+by) = E\Bigl[\bigl(ax+by - E(ax+by)\bigr)^2\Bigr] = E\Bigl[\bigl(a(x-\mu_x) + b(y-\mu_y)\bigr)^2\Bigr]$$
$$= E\bigl[a^2(x-\mu_x)^2 + b^2(y-\mu_y)^2 + 2ab(x-\mu_x)(y-\mu_y)\bigr]$$
$$= a^2\sigma^2(x) + b^2\sigma^2(y) + 2ab\,\mathrm{cov}(x,y)$$

ii) $u(x,y) = xy$ and x, y independent, i.e., $f(x,y) = g(x)\,h(y)$ (cf. Sect. III) $\Rightarrow$

$$E(xy) = \iint xy\,f(x,y)\,dx\,dy = \iint xy\,g(x)\,h(y)\,dx\,dy = \int x\,g(x)\,dx \int y\,h(y)\,dy = E(x)\,E(y)$$
$$\mathrm{cov}(x,y) = \iint (x-\mu_x)(y-\mu_y)\,g(x)\,h(y)\,dx\,dy = 0\ !!!$$
Page 56: Covariance, correlation coefficient

from the definition of covariance, we see that
• cov(x,y) is positive if values x>μ_x (x<μ_x) appear preferentially together with values y>μ_y (y<μ_y).
• cov(x,y) is negative if values x>μ_x (x<μ_x) appear preferentially together with values y<μ_y (y>μ_y).
• if the knowledge of x does not give information about the probable position of y, the covariance vanishes (see Fig. below).

if cov(x,y) ≠ 0, the variables x, y are called correlated, otherwise uncorrelated. Correlation is quantified by the dimensionless correlation coefficient

$$\rho(x,y) = \frac{\mathrm{cov}(x,y)}{\sigma(x)\,\sigma(y)}, \qquad -1 \le \rho(x,y) \le 1;$$

the limiting values are reached when y = a + bx and b > 0 (ρ = 1) or b < 0 (ρ = −1).

proof: calculate cov(x,y) = E(xy) − E(x)E(y) with y = a + bx: then cov(x,y) = bσ²(x) and σ(y) = |b|σ(x), so ρ = ±1; alternatively, use cov(x,y) = ½[σ²(x+y) − σ²(x) − σ²(y)].

f(x,y) = const for different correlation coefficients (linear dependency: f(x,y) = f(y|x) f(x) with f(y|x) = δ(y − (a+bx))).
Page 57

Note: for independent (uncorrelated) variables → cov(x,y) = 0. But: cov(x,y) = 0 does not necessarily imply that x, y are independent, since the covariance detects only linear dependencies.
Example: let x be uniformly distributed on [−1,1], and y = x².
• Then: y depends on x, but cov(x,y) = E(x³) − E(x)E(x²) = 0, since the expectation values of odd powers vanish!
In other words: there are cases when cov(x,y) = 0, but the conditional p.d.f. f(y|x) depends on x. Independence is only warranted if f(y|x) = f(y)!
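The counterexample is easy to reproduce numerically; a minimal Python sketch (sample size and seed are arbitrary):

```python
import random

random.seed(3)
n = 200_000
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]          # y is fully determined by x ...

mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
print(cov)                        # ... yet the sample covariance is ~0
```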
Page 58: Transformation of variables

analogous to the 1-D (univariate) case:
• given f(x,y) and u(x,y), v(x,y). Then:

$$g(u,v)\,du\,dv = f(x,y)\,dx\,dy \ \Rightarrow \ g(u,v) = f(x,y)\cdot\left|\frac{\partial(x,y)}{\partial(u,v)}\right|$$

(absolute value of the Jacobi determinant)

example: transform the 2-D normal distribution $\frac{1}{2\pi}\,e^{-(x^2+y^2)/2}$ into polar coordinates, $x = r\cos\varphi$, $y = r\sin\varphi$:

$$\left|\frac{\partial(x,y)}{\partial(r,\varphi)}\right| = \begin{vmatrix} \cos\varphi & -r\sin\varphi \\ \sin\varphi & r\cos\varphi \end{vmatrix} = r$$
$$\Rightarrow \ g(r,\varphi) = \frac{1}{2\pi}\,r\,e^{-r^2/2},$$

with marginal distributions

$$g(\varphi) = \int_0^\infty g(r,\varphi)\,dr = \frac{1}{2\pi} \quad \text{and} \quad g(r) = \int_0^{2\pi} g(r,\varphi)\,d\varphi = r\,e^{-r^2/2},$$

i.e., the distribution factorizes into the marginal distributions (independent variates!)
Page 59: Reduction of variables

problem: we have f(x,y), and need g(u) with u(x,y).
solution: use the standard transformation, by introducing a 2nd variable v(x,y) (usually, choose v = x):

$$f(x,y) \to h(u,v), \quad \text{and marginalize with respect to } v: \quad g(u) = \int h(u,v)\,dv$$

example: given the 2-D uniform distribution

$$f(x,y) = \begin{cases} \dfrac{1}{\Delta^2} & \text{if } x \in [0,\Delta] \text{ and } y \in [0,\Delta] \\[4pt] 0 & \text{else} \end{cases}$$

(Note: f(x,y) already normalized!). Calculate g(u) with u(x,y) = x + y; choose v = x:

$$\left|\frac{\partial(u,v)}{\partial(x,y)}\right| = \begin{vmatrix} 1 & 1 \\ 1 & 0 \end{vmatrix} = 1 \ \Rightarrow \ h(u,v) = f(x,y)\cdot 1 = \frac{1}{\Delta^2}$$

(see next page, left figure)
Page 60

$$g(u) = \int h(u,v)\,dv = \int_{x(u)_{\min}}^{x(u)_{\max}} h(u,x)\,dx = \frac{1}{\Delta^2}\bigl(x_{\max}(u) - x_{\min}(u)\bigr)$$

From the above figure (middle): $u(x) = x + y \in [x, x+\Delta]$, since $y \in [0,\Delta]$.

$$u < \Delta: \ x \in [0, u] \ \Rightarrow \ g(u) = \frac{1}{\Delta^2}(u - 0) = \frac{u}{\Delta^2} \quad \text{(slope } 1/\Delta^2\text{)}$$
$$u > \Delta: \ x \in [u-\Delta, \Delta] \ \Rightarrow \ g(u) = \frac{1}{\Delta^2}\bigl(\Delta - (u-\Delta)\bigr) = \frac{2\Delta - u}{\Delta^2}$$
$$g_{\max} = g(\Delta) = \frac{1}{\Delta}$$

The distribution of the sum of two uniformly distributed quantities is triangular-shaped, see above figure (right).
Note: the distribution of x − y looks similar, when the abscissa is shifted by −Δ.
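The triangular shape and its peak value g_max = 1/Δ can be verified by Monte Carlo; a minimal Python sketch (Δ = 1, the bin width, and the sample size are arbitrary):

```python
import random

random.seed(5)
delta = 1.0
n = 300_000

# sum of two independent uniforms on [0, delta]
u = [random.uniform(0, delta) + random.uniform(0, delta) for _ in range(n)]

# empirical density in a narrow bin around the peak u = delta,
# compared with g_max = 1/delta from the slide
width = 0.05
p_peak = sum(1 for s in u if abs(s - delta) < width / 2) / n / width
print(p_peak)                  # close to 1/delta
```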
Page 61: Calculation of the transformation

as in the 1-D case: integration and inversion of the primitive function.
important example: Box-Muller algorithm to create normally distributed variates from a uniform distribution (random number generator).

remember the 2-D normal distribution in polar coordinates:

$$g(r,\varphi)\,dr\,d\varphi = \frac{1}{2\pi}\,r\,e^{-r^2/2}\,dr\,d\varphi \quad \text{(factorized in } r \text{ and } \varphi)$$

distribution in r:

$$G(r) = \int_0^r r'\,e^{-r'^2/2}\,dr' \overset{!}{=} F(x_1) = x_1 \quad \text{(uniform distribution w.r.t. } [0,1])$$
$$G(r) = \left.-e^{-r'^2/2}\right|_0^r = 1 - e^{-r^2/2} = x_1 \ \Rightarrow \ r = \sqrt{-2\ln(1-x_1)}$$

distribution in φ:

$$H(\varphi) = \int_0^\varphi \frac{1}{2\pi}\,d\varphi' \overset{!}{=} F(x_2) = x_2 \quad \text{(uniform distribution)}$$
$$H(\varphi) = \frac{\varphi}{2\pi} = x_2 \ \Rightarrow \ \varphi = 2\pi x_2$$
Page 62

in Cartesian coordinates:

$$x = r\cos\varphi = \sqrt{-2\ln(1-x_1)}\,\cos(2\pi x_2) = \sqrt{-2\ln(x_1)}\,\cos(2\pi x_2)$$
$$y = r\sin\varphi = \sqrt{-2\ln(1-x_1)}\,\sin(2\pi x_2) = \sqrt{-2\ln(x_1)}\,\sin(2\pi x_2)$$

(x₁, x₂ two uniformly distributed variates). These variables are independent and normally distributed with expectation value zero and unit variance:

$$f(x,y) = \frac{1}{2\pi}\,e^{-(x^2+y^2)/2} = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\cdot\frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}$$

Thus: two normally distributed variates x, y.

P.d.f.s for a uniform distribution (black), generated by a random number generator from N=10⁵ subsequent numbers. The corresponding normal distribution (blue) has been created from these numbers using the Box-Muller algorithm. Displayed are histograms with bin size 0.02. Analytical p.d.f.s in green and red.
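The algorithm can be sketched in a few lines; a minimal Python version (the slides use IDL; seed and sample size are arbitrary):

```python
import math
import random

random.seed(11)

def box_muller():
    """Return two independent standard normal variates from two uniforms."""
    x1, x2 = random.random(), random.random()
    r = math.sqrt(-2.0 * math.log(1.0 - x1))
    return r * math.cos(2.0 * math.pi * x2), r * math.sin(2.0 * math.pi * x2)

n = 100_000
samples = []
for _ in range(n // 2):
    samples.extend(box_muller())

mean = sum(samples) / len(samples)
var = sum(s * s for s in samples) / len(samples) - mean**2
print(mean, var)               # close to expectation value 0 and unit variance
```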
Page 63: Distributions with more than two variables

probability density: $f(x_1, x_2, x_3, \ldots, x_N) = f(\mathbf{x})$ (in vector notation)

expectation value:

$$E\bigl(u(\mathbf{x})\bigr) = \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} u(\mathbf{x})\,f(\mathbf{x})\,\prod_{i=1}^{N} dx_i$$

particularly important is the covariance matrix C (see Sect. V):

$$C_{ij} = \mathrm{cov}(x_i, x_j) = E\bigl\{(x_i - \mu_i)(x_j - \mu_j)\bigr\}$$

The covariance matrix is symmetric, and the diagonal elements are the variances:

$$C_{ii} = \mathrm{Var}(x_i) = \sigma^2(x_i)$$

Matrix notation: with $\mathbf{x}^T = (x_1, x_2, x_3, \ldots, x_N)$ and $\boldsymbol{\mu}^T = E(\mathbf{x}^T) = (\mu_1, \ldots, \mu_N)$:

$$C = E\bigl((\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T\bigr)$$
Page 64

transformation of variables, with Jacobi determinant:

$$g(\mathbf{y}) = f(\mathbf{x})\,\left|\frac{\partial(x_1 \ldots x_N)}{\partial(y_1 \ldots y_N)}\right|$$

independent, identically distributed (i.i.d.) variables (German: u.i.v. = unabhängig, identisch verteilt):
For parameter estimates, a sample of N independent measurements might be used. The p.d.f. for N independent variables which are identically distributed according to f(x) is given by

$$f(\mathbf{x}) = \prod_{i=1}^{N} f(x_i)$$
Page 65: IV. Important distributions

Binomial distribution

experiment with two mutually exclusive outcomes, i.e.,

$$S = A + \bar{A} \quad \text{with} \quad P(A) = p \ \text{and} \ P(\bar{A}) = 1 - p = q$$

calculate the probability that n experiments have k times the outcome A.
• What is the probability to obtain (exactly!) 4 times the six when rolling the die 10 times? [n = 10, k = 4, p(A) = 1/6, p(Ā) = 5/6] Answer: ≈ 0.054
• What is the probability to toss "number" only one time in 20 trials? [n = 20, k = 1, p(A) = p(Ā) = 1/2] Answer: ≈ 1.9·10⁻⁵

let's assign the random variable x_i to the outcome of experiment i: x_i = 1 if the result A occurs, and x_i = 0 if Ā occurs. Our above question can then be rephrased as the question regarding the distribution of the random number

$$x = \sum_{i=1}^{n} x_i,$$

and, particularly, the probability P(x = k).
Page 66

The answer depends on two factors:

i) What is the probability to obtain the result A in the first k experiments and Ā in the remaining n−k? Since the experiments are independent, this probability is given by the product of the probabilities of the individual events, i.e.,

$$p^k (1-p)^{n-k}$$

ii) How many possibilities for the event "k times result A in n experiments" do exist? This is given by the binomial coefficient,

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

Thus, the probability P(x = k) is given by

$$B_p^n(k) = \frac{n!}{k!\,(n-k)!}\,p^k (1-p)^{n-k}$$
Page 67

expectation value and variance of a single experiment i:

$$E(x_i) = 1\cdot p + 0\cdot(1-p) = p$$
$$\mathrm{Var}(x_i) = E(x_i^2) - \bigl(E(x_i)\bigr)^2 = 1^2\cdot p + 0^2\cdot(1-p) - p^2 = p(1-p) = pq$$

The corresponding values for the random variable $x = \sum_{i=1}^{n} x_i$ are (exploiting the calculation rules for independent variates)

$$E(x) = \langle k \rangle = np \quad \text{("mean number of successes")}$$
$$\mathrm{Var}(x) = \sigma^2(x) = np(1-p) = npq$$

(cumulative) distribution function:

$$F(k) = P(k' < k) = \sum_{k'=0}^{k-1} B_p^n(k') = \sum_{k'=0}^{k-1} \frac{n!}{k'!\,(n-k')!}\,p^{k'} (1-p)^{n-k'}$$
Page 68

Binomial distribution $B_p^n(k)$ as a function of k. Top panel: fixed p, different n; middle: fixed n, different p; bottom: different values of n and p, but np = const.
Page 69

Example: Detector efficiency
• spark chambers (95% efficient) are used to measure the tracks of cosmic rays. At least three points are needed to define a track. How efficient is a stack of three chambers? Would using 4 or 5 chambers give significant improvement?

The probability of three hits from three chambers is

$$P(3; 3, 0.95) = B_{0.95}^3(3) = \frac{3!}{3!\,0!}\,p^3 (1-p)^0 = 0.95^3 = 0.857$$

For four chambers, the probability of three or four hits is
P(3; 4, 0.95) + P(4; 4, 0.95) = 0.171 + 0.815 = 0.986

For five chambers, the probability of three, four or five hits is
P(3; 5, 0.95) + P(4; 5, 0.95) + P(5; 5, 0.95) = 0.021 + 0.204 + 0.774 = 0.999!
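The three stack efficiencies above can be reproduced with a short binomial sum; a minimal Python sketch (function name and parameters are illustrative only):

```python
from math import comb

def stack_efficiency(n_chambers, p=0.95, min_hits=3):
    """Probability of at least min_hits hits from n_chambers (binomial)."""
    return sum(comb(n_chambers, k) * p**k * (1 - p)**(n_chambers - k)
               for k in range(min_hits, n_chambers + 1))

for n in (3, 4, 5):
    print(n, round(stack_efficiency(n), 3))   # 0.857, 0.986, 0.999
```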
Page 70: Multinomial distribution

binomial distribution: 2 different outcomes
multinomial: more than 2 different outcomes, mutually exclusive!

$$S = A_1 + A_2 + A_3 + \ldots + A_l \quad \text{with} \quad P(A_j) = p_j \ \text{and} \ \sum_{j=1}^{l} p_j = 1$$

When n experiments are performed, the probability of finding k_j events of type A_j is given by

$$M(k_1, k_2, k_3, \ldots, k_l) = \frac{n!}{\prod_{j=1}^{l} k_j!}\,\prod_{j=1}^{l} p_j^{k_j}$$

We define x_ij = 1 if experiment i yields A_j, and 0 otherwise. Then $x_j = \sum_{i=1}^{n} x_{ij}$ and

$$E(x_j) = n p_j, \quad \text{with covariance matrix}$$
$$C_{ij} = n p_i (\delta_{ij} - p_j) \quad (\delta_{ij}\ \text{Kronecker }\delta), \ \text{i.e.,}$$
$$C_{ii} = n p_i (1 - p_i) \ \text{as before, but nonvanishing, negative correlation:}$$
$$C_{ij} = -n p_i p_j \quad (i \ne j)$$

That there is a correlation was to be expected, since the x_j are not independent due to the constraint Σ p_j = 1. I.e., if there are more successes for class i than expected (E(x_i)), the values of x_j for all other classes j are smaller than E(x_j) ⇒ negative correlation!
![Page 71: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/71.jpg)
71
USM
Frequency; law of large numbers

Probabilities, e.g., $p_j$ in the case of the multinomial distribution, are usually not known a priori but have to be obtained from experiments. The frequency of event $A_j$ in $n$ experiments is given by

$$h_j = \frac{x_j}{n} = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$$

This frequency is a random number, since it depends on the results of the particular $n$ experiments.

$$E(h_j) = E\!\left(\frac{x_j}{n}\right) = \frac{1}{n}\,E(x_j) = p_j,$$

i.e., the expectation value of the frequency of an event is the corresponding probability, and

$$\mathrm{Var}(h_j) = \mathrm{Var}\!\left(\frac{x_j}{n}\right) = \frac{1}{n^2}\,\mathrm{Var}(x_j) = \frac{1}{n}\,p_j(1 - p_j) \;\Rightarrow\; \sigma(h_j) \propto \frac{1}{\sqrt{n}}$$

This is the law of large numbers! For large $n$, the standard deviation of the frequency falls below any given limit, which justifies the frequency definition of probability (cf. Sect. I).
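The $1/\sqrt{n}$ shrinkage of the frequency's scatter can be seen directly in a simulation (the function name is mine):

```python
import random

random.seed(42)

# Estimate the frequency of "success" with p = 0.5 for increasing n;
# the scatter around p shrinks like 1/sqrt(n).
p = 0.5

def frequency(n):
    """Observed frequency of success in n Bernoulli trials."""
    return sum(random.random() < p for _ in range(n)) / n

for n in (100, 10000, 1000000):
    print(n, frequency(n))
```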
![Page 72: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/72.jpg)
72
USM
Poisson distribution

The study of the lower panel of the last figure (binomial distribution) suggests that this distribution approaches a fixed distribution if $n$ tends to infinity while the product (the expectation value) $np = \lambda$ is kept constant. Indeed,

$$P(k; n, p) = P(k; n, \lambda/n) = \frac{n!}{k!\,(n-k)!}\left(\frac{\lambda}{n}\right)^{k}\left(1 - \frac{\lambda}{n}\right)^{n-k}$$

$$\frac{n!}{(n-k)!} = n(n-1)(n-2)\cdots(n-k+1) \;\to\; n^k \quad\text{for}\ n \to \infty$$

$$\left(1 - \frac{\lambda}{n}\right)^{n} \to e^{-\lambda} \quad\text{(definition of the exp function)}, \qquad \left(1 - \frac{\lambda}{n}\right)^{-k} \to 1 \quad\text{for}\ n \to \infty$$

Thus,

$$P(k; n, \lambda/n) \;\to\; P(k, \lambda) = \frac{\lambda^k\,e^{-\lambda}}{k!},$$

which is the Poisson distribution and describes the probability of obtaining $k$ events if the expected number is $\lambda$.

Calculation: start with $P(0) = e^{-\lambda}$, and then successively multiply by $\lambda$ and divide by $1, 2, 3, 4, \dots$ to obtain $P(1)$, $P(2)$, etc.
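The recursive calculation scheme just described can be sketched as follows (function name is mine), with a cross-check against the direct formula:

```python
from math import exp, factorial

def poisson_table(lam, kmax):
    """Build P(0..kmax) recursively: P(k) = P(k-1) * lam / k, starting at exp(-lam)."""
    probs = [exp(-lam)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * lam / k)
    return probs

lam = 2.5
table = poisson_table(lam, 10)
direct = [lam**k * exp(-lam) / factorial(k) for k in range(11)]
print(max(abs(a - b) for a, b in zip(table, direct)))  # rounding-level difference only
```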
![Page 73: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/73.jpg)
73
USM
interpretation: Suppose λ events are expected to occur in some interval. Split up this interval into n very small sections, so that the chance to find two events in one section is negligible. The probability that one section contains one event is then p=λ/n.
The probability of finding k events in the n sections is given by the binomial distribution,
P(k;n,p=λ/n)
which approaches the Poisson distribution for large n.
Note: the Poisson distribution is defined only for integer values of k!
![Page 74: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/74.jpg)
74
USM
Poisson distribution for different expectation values

Total probability:

$$\sum_{k=0}^{\infty} P(k, \lambda) = 1$$

Expectation value and variance:

$$E(k) = \lambda, \qquad \mathrm{Var}(k) = \lambda, \qquad \sigma(k) = \sqrt{\lambda}$$

[This is consistent with the binomial distribution: $E(k) = np = n\,\frac{\lambda}{n} = \lambda$ and $\mathrm{Var}(k) = np(1-p) = \lambda\left(1 - \frac{\lambda}{n}\right) \to \lambda$ for $n \to \infty$.]

Skewness:

$$\mu_3' = \lambda \ \text{(third central moment)}, \qquad \gamma_1 = \frac{\mu_3'}{\sigma^3} = \frac{\lambda}{\lambda^{3/2}} = \lambda^{-1/2} \to 0,$$

i.e., the distribution becomes increasingly symmetric for increasing $\lambda$.
![Page 75: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/75.jpg)
75
USM
application:
The Poisson distribution describes the asymptotic behavior of the binomial distribution with constant λ=np, i.e., with a (very) low probability for the individual process. Thus, it should be applied when there are many trials but only few successes. Since one usually has no idea of the number of trials (only that there are many), it describes the case of discrete events occurring in a continuum.
examples: • the number of flashes of lightning in a thunderstorm (it is meaningless to ask
how often there is no flash)
• the number of clicks in a Geiger counter (meaningless to ask about “non-
clicks”)
![Page 76: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/76.jpg)
76
USM
A historical example

Statistics on the numbers of Prussian soldiers kicked to death by horses: in the 19th century it was reported that there were 122 deaths in ten different army corps over twenty years, i.e., the mean number of deaths per corps and per year is λ=122/200=0.61. The probability of, e.g., no death is then P(0, 0.61)=0.5434 per year and corps. In twenty years and ten corps, there should thus be 108.7 cases in which no death happened. Actually, 109 such cases were reported.

| Number of deaths per year and corps | Actual number reported for 20 years and 10 corps | Prediction from Poisson statistics |
|---|---|---|
| 0 | 109 | 108.7 |
| 1 | 65 | 66.3 |
| 2 | 22 | 20.2 |
| 3 | 3 | 4.1 |
| 4 | 1 | 0.6 |
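The Poisson predictions in the table can be reproduced in a few lines (variable names are mine):

```python
from math import exp, factorial

# 122 deaths in 200 corps-years gives lambda = 0.61; scale the Poisson
# probabilities by the 200 corps-years observed.
lam = 122 / 200
n_corps_years = 200

predicted = [n_corps_years * lam**k * exp(-lam) / factorial(k) for k in range(5)]
print([round(p, 1) for p in predicted])  # [108.7, 66.3, 20.2, 4.1, 0.6]
```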
![Page 77: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/77.jpg)
77
USM
Supernova 1987A

The following table gives the numbers of neutrino events detected in 10 s intervals by the Irvine-Michigan-Brookhaven experiment on Feb. 23rd 1987 (around the time SN1987A was first seen).

The average number of events per interval (ignoring the interval with 9 events) is 0.77. The Poisson predictions agree well with the data, except for the interval with the 9 events. Thus, the background due to random events is Poissonian and well understood, and the nine events cannot be due to fluctuations, but must have come from a different source (the supernova).

| No. of events | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of intervals | 1042 | 860 | 307 | 78 | 15 | 3 | 0 | 0 | 0 | 1 |
| Prediction | 1064 | 823 | 318 | 82 | 16 | 2 | 0.3 | 0.03 | 0.003 | 0.0003 |
![Page 78: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/78.jpg)
78
USM
Two Poisson distributions

If there are two separate types of Poisson distributed events, and we do not distinguish between the two, then the probability of $k = k_1 + k_2$ events is also Poisson, with mean equal to the sum of the two individual means:

$$P(k) = \sum_{k_1=0}^{k} P(k_1, \lambda_1)\,P(k - k_1, \lambda_2) = P(k, \lambda_1 + \lambda_2)$$

Proof via the characteristic function of the Poisson distribution:

$$\phi_P(t) = E\!\left(e^{itk}\right) = \sum_{k=0}^{\infty} e^{itk}\,P(k, \lambda) = \sum_{k=0}^{\infty} e^{itk}\,\frac{\lambda^k e^{-\lambda}}{k!} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{(\lambda e^{it})^k}{k!} = e^{-\lambda}\,e^{\lambda e^{it}} = \exp\!\left[\lambda\,(e^{it} - 1)\right]$$

Remember: the characteristic function of the sum of independent variables is the product of their characteristic functions (Sect. II)

$$\Rightarrow\quad \phi_{\rm sum}(t) = \phi_{P(\lambda_1)}(t)\,\phi_{P(\lambda_2)}(t) = \exp\!\left[\lambda_1(e^{it}-1)\right]\exp\!\left[\lambda_2(e^{it}-1)\right] = \exp\!\left[(\lambda_1 + \lambda_2)(e^{it}-1)\right] = \phi_{P(\lambda_1+\lambda_2)}(t).$$

Thus, the sum of two independent, Poisson distributed variables is Poisson distributed as well, with $\lambda = \lambda_1 + \lambda_2$.
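The convolution identity can also be checked numerically, term by term (my own sketch; the λ values are arbitrary):

```python
from math import exp, factorial

def poisson(k, lam):
    return lam**k * exp(-lam) / factorial(k)

# Convolution of Poisson(lam1) and Poisson(lam2) versus Poisson(lam1 + lam2).
lam1, lam2 = 1.3, 2.1
for k in range(10):
    conv = sum(poisson(k1, lam1) * poisson(k - k1, lam2) for k1 in range(k + 1))
    print(k, conv, poisson(k, lam1 + lam2))  # the two columns agree
```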
![Page 79: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/79.jpg)
79
USM
This can be generalized to any number of Poisson processes.

Example: signal with background
• Expected are S signal events on top of an average background B. The average fluctuation (standard deviation) of the observed number of events k is then

$$\sigma(S + B) = \sqrt{S + B}$$

• If we subtract the average background from the signal, this fluctuation remains, of course.
• If the exact expectation value of the background is not known, the uncertainty is even larger (error propagation).

For an expected signal $S = 100$ and background $B = 50$ we observe on average 150 events, with a standard deviation of $\sqrt{150} \approx 12.2$. After subtracting the background, the average signal is $S = 100 \pm \sqrt{150}$.
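A Monte Carlo sketch of this example (the sampler is Knuth's standard method, chosen by me; the lecture does not prescribe one):

```python
import math, random

random.seed(7)

# Signal-plus-background: observed counts scatter with sigma = sqrt(S + B)
# = sqrt(150) ~ 12.2; subtracting the known mean background shifts the
# counts but does not reduce the scatter.
S, B, trials = 100, 50, 20000

def poisson_sample(lam):
    """Knuth's method; adequate for moderate lambda."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

obs = [poisson_sample(S + B) for _ in range(trials)]
mean = sum(obs) / trials
std = (sum((x - mean) ** 2 for x in obs) / trials) ** 0.5
print(mean - B, std)  # background-subtracted mean near 100, std near 12.2
```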
![Page 80: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/80.jpg)
80
USM
Uniform distribution

So far, only distributions of one or more discrete variables have been discussed; we now turn to continuous distribution functions. The most simple case is the uniform distribution (already mentioned before): constant probability density in a certain interval, and 0 elsewhere:

$$f(x) = c \quad\text{for}\ a \le x < b, \qquad f(x) = 0 \quad\text{for}\ x < a\ \text{or}\ x \ge b$$

From the normalization, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, we obtain

$$c = \frac{1}{b-a},$$

and the distribution function becomes

$$F(x) = \int_a^x \frac{dx'}{b-a} = \frac{x-a}{b-a} \quad\text{for}\ a \le x < b, \qquad F(x) = 0\ \text{for}\ x < a, \qquad F(x) = 1\ \text{for}\ x \ge b$$
![Page 81: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/81.jpg)
81
USM
Uniform distributions with a=0, b=1 are created by random number generators (RNGs). Note: in many RNGs, "0" is not included, i.e., the lowermost value is ε (machine dependent). This is important for Monte Carlo methods; different distributions are obtained from transformation methods (see Sect. II/III).

$$E(x) = \frac{1}{b-a}\int_a^b x\,dx = \frac{1}{2}(a+b), \qquad \mathrm{Var}(x) = \frac{1}{12}(b-a)^2$$
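These moments are easy to confirm for a standard RNG on [0, 1), where $E(x) = 1/2$ and $\mathrm{Var}(x) = 1/12 \approx 0.0833$ (my own sketch):

```python
import random

random.seed(3)

# Sample mean and variance of a uniform RNG on [0, 1).
n = 1000000
xs = [random.random() for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(mean, var)  # near 0.5 and 1/12
```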
![Page 82: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/82.jpg)
82
USM
Gaussian (or normal) distribution

Assume a binomial distribution with random variable $k$:

$$P(k; n, p) = \frac{n!}{k!\,(n-k)!}\,p^k (1-p)^{n-k}$$

Characteristic function:

$$\phi_k(t) = \sum_{k=0}^{n} e^{itk}\,P(k; n, p) = \left[\,p\,e^{it} + (1-p)\,\right]^n \quad\text{(without proof)}$$

Use the reduced variable

$$u = \frac{k - \langle k\rangle}{\sigma} = \frac{k - np}{\sigma}$$

and interpret this as the sum of two independent r.v. (though the 2nd term is constant):

$$\phi_u(t) = \exp\!\left(-\frac{itnp}{\sigma}\right)\left[\,p\exp\!\left(\frac{it}{\sigma}\right) + (1-p)\,\right]^n$$

Take the logarithm of $\phi_u(t)$ and expand $\exp(it/\sigma)$, thereafter expanding $\ln(1 + f(t/\sigma))$:

$$\ln\phi_u(t) = -\frac{1}{2}\,\frac{np(1-p)}{\sigma^2}\,t^2 + O\!\left(\frac{t^3}{\sigma^3}\right)$$
![Page 83: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/83.jpg)
83
USM
Thus, accounting for $\sigma^2 = np(1-p)$ and in the limit $n \to \infty$, we find

$$\phi_u(t) = \exp\!\left(-\frac{1}{2}t^2\right)$$

This is the characteristic function of a binomial distribution, using a reduced variable, in the limit of large $n$. Back-transformation yields the corresponding p.d.f.,

$$f(u) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{1}{2}u^2\right),$$

which is called the Gaussian or normal distribution. Since $u$ is a reduced variable, $E(u)$ should be 0 and $\mathrm{Var}(u)$ should be 1.

Test:

$$E(u) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u\exp\!\left(-\frac{1}{2}u^2\right)du = 0$$

$$\mathrm{Var}(u) = -\left.\frac{d^2\phi_u(t)}{dt^2}\right|_{t=0} + \left(\left.\frac{d\phi_u(t)}{dt}\right|_{t=0}\right)^2 = 1, \quad\text{q.e.d.}$$
![Page 84: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/84.jpg)
84
USM
A more general form of the normal distribution is

$$f(x) = \frac{1}{b\sqrt{2\pi}}\exp\!\left(-\frac{(x-a)^2}{2b^2}\right).$$

Since $E(x) = a$ and $\mathrm{Var}(x) = b^2$, the conventional representation is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

The inflection points of this distribution (zero curvature) are located at $\mu \pm \sigma$. Once again, this is the limit of a binomial distribution with the above expectation value and variance, in the limit $n \to \infty$.

The corresponding characteristic function is

$$\phi(t) = \int e^{itx} f(x)\,dx = \exp(i\mu t)\exp\!\left(-\frac{1}{2}\sigma^2 t^2\right)$$

Theorem: The characteristic function of a normal distribution with zero mean is itself a normal distribution with zero mean. The product of the variances of both distributions is one.
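The characteristic function $\phi(t) = \exp(i\mu t - \sigma^2 t^2/2)$ can be verified by direct numerical integration of $e^{itx} f(x)$; this is my own sketch (integration limits and step count are choices, not from the lecture):

```python
import cmath, math

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def char_fn(t, mu, sigma, lo=-10.0, hi=14.0, n=4000):
    """Trapezoidal approximation of the integral of exp(itx) f(x) dx."""
    h = (hi - lo) / n
    total = 0j
    for i in range(n + 1):
        x = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * cmath.exp(1j * t * x) * normal_pdf(x, mu, sigma)
    return total * h

mu, sigma, t = 2.0, 1.5, 0.7
analytic = cmath.exp(1j * mu * t - 0.5 * sigma ** 2 * t ** 2)
print(abs(char_fn(t, mu, sigma) - analytic))  # close to zero
```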
![Page 85: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/85.jpg)
85
USM
The characteristic function transformed to $y = x - \mu$ is

$$\phi'(t) = \exp\!\left(-\frac{1}{2}\sigma^2 t^2\right)$$

With $\mu_n' = \dfrac{1}{i^n}\left.\dfrac{d^n\phi'(t)}{dt^n}\right|_{t=0}$ (Sect. II), we find the central moments

$$\mu_1' = 0, \qquad \mu_2' = \sigma^2, \qquad \mu_3' = 0, \qquad \mu_4' = 3\sigma^4 \quad\text{(remember kurtosis, Sect. II)},$$

and in general

$$\mu_{2k+1}' = 0, \quad k = 0, 1, 2, 3, \dots, \qquad \mu_{2k}' = \frac{(2k)!\,\sigma^{2k}}{2^k\,k!}$$

The corresponding cumulative distribution functions are

$$\psi_0(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp\!\left(-\frac{1}{2}x'^2\right)dx'$$

$$\psi(x) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}\exp\!\left(-\frac{(x'-\mu)^2}{2\sigma^2}\right)dx' = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{(x-\mu)/\sigma}\exp\!\left(-\frac{1}{2}u^2\right)du = \psi_0\!\left(\frac{x-\mu}{\sigma}\right)$$
![Page 86: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/86.jpg)
86
USM
The probability of observing $u$ within a band of width $2x$ around the expectation value zero is

$$P(|u| \le x) = \int_{-x}^{x} f(u)\,du = 2\int_{0}^{x} f(u)\,du = 2\left[\psi_0(x) - \psi_0(0)\right] = 2\,\psi_0(x) - 1$$

(using $f(-u) = f(u)$), and the probability of a random variable being observed within an integer multiple of the standard deviation from the mean is

$$P\!\left(|x - \mu| \le n\sigma\right) = 2\,\psi_0(n) - 1$$

Compared with the Tchebychev inequality (Sect. II):

$$P(|x-\mu| \le \sigma) = 0.682, \qquad P(|x-\mu| > \sigma) = 0.318 \qquad (\text{Tchebychev: } P(|x-\mu| > \sigma) < 1.0)$$

$$P(|x-\mu| \le 2\sigma) = 0.954, \qquad P(|x-\mu| > 2\sigma) = 0.046 \qquad (\text{Tchebychev: } P(|x-\mu| > 2\sigma) < 0.25)$$

$$P(|x-\mu| \le 3\sigma) = 0.998, \qquad P(|x-\mu| > 3\sigma) = 0.002 \qquad (\text{Tchebychev: } P(|x-\mu| > 3\sigma) < 0.11)$$

"3σ-error"
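The n-σ probabilities follow from $2\psi_0(n) - 1 = \mathrm{erf}(n/\sqrt{2})$, which the standard library can evaluate directly (note the slide rounds 0.6827 down to 0.682):

```python
from math import erf, sqrt

# Probability of |x - mu| <= n*sigma for a normal distribution.
for n in (1, 2, 3):
    print(n, round(erf(n / sqrt(2)), 4))  # 0.6827, 0.9545, 0.9973
```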
![Page 87: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/87.jpg)
87
USM
Multivariate normal distribution

The joint normal distribution of $n$ variables $\mathbf{x} = (x_1, x_2, \dots, x_n)$ is defined as

$$\phi(\mathbf{x}) = k\,\exp\!\left\{-\frac{1}{2}\,(\mathbf{x} - \mathbf{a})^{\rm T} B\,(\mathbf{x} - \mathbf{a})\right\}$$

with $B$ a symmetric $n \times n$ matrix. Since $\phi(\mathbf{x})$ is symmetric about $\mathbf{x} = \mathbf{a}$,

$$\int_{-\infty}^{\infty} (\mathbf{x} - \mathbf{a})\,\phi(\mathbf{x})\,d\mathbf{x} = \mathbf{0}, \quad\text{i.e.,}\quad E(\mathbf{x}) = \mathbf{a} = \boldsymbol{\mu}$$

Differentiating w.r.t. $\mathbf{a}$ (the integral remains 0), we find for the $i$-th component

$$\frac{\partial}{\partial a_i}\int (x_j - a_j)\,\phi(\mathbf{x})\,d\mathbf{x} = \int\left[-\delta_{ij} + (x_j - a_j)\sum_k B_{ik}\,(x_k - a_k)\right]\phi(\mathbf{x})\,d\mathbf{x} = 0,$$

and for all components

$$E\!\left[(\mathbf{x} - \mathbf{a})(\mathbf{x} - \mathbf{a})^{\rm T}\right] B = I \quad (B\ \text{symmetric}), \quad\text{and thus}\quad E\!\left[(\mathbf{x} - \mathbf{a})(\mathbf{x} - \mathbf{a})^{\rm T}\right] = B^{-1} = C$$

The matrix $B$ in the exponent of $\phi(\mathbf{x})$ is just the inverse of the covariance matrix $C$, and the vector $\mathbf{a}$ is the vector formed by the expectation values.
![Page 88: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/88.jpg)
88
USM
Binormal (or bivariate normal) distribution

With

$$C = B^{-1} = \begin{pmatrix} \sigma_1^2 & \mathrm{cov}(x_1, x_2) \\ \mathrm{cov}(x_1, x_2) & \sigma_2^2 \end{pmatrix},$$

we obtain

$$B = \frac{1}{\sigma_1^2\sigma_2^2 - \mathrm{cov}^2(x_1, x_2)}\begin{pmatrix} \sigma_2^2 & -\mathrm{cov}(x_1, x_2) \\ -\mathrm{cov}(x_1, x_2) & \sigma_1^2 \end{pmatrix}$$

Case 1: independent variables

$$B \to \begin{pmatrix} 1/\sigma_1^2 & 0 \\ 0 & 1/\sigma_2^2 \end{pmatrix} \;\Rightarrow\; \phi(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2}\exp\!\left(-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}\right)\exp\!\left(-\frac{(x_2-\mu_2)^2}{2\sigma_2^2}\right),$$

i.e., $\phi$ becomes the product of two normal distributions (the leading factor follows from normalization). For $n$ variables with non-vanishing covariance, one obtains the normalization $k = \sqrt{\det B}\,/\,(2\pi)^{n/2}$.

Case 2: dependent variables

$$\phi(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x_1-\mu_1)^2}{\sigma_1^2} - 2\rho\,\frac{(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\right]\right\}$$

Let's use reduced variables $u_i = \dfrac{x_i - \mu_i}{\sigma_i}$, $i = 1, 2$, and the correlation coefficient

$$\rho = \frac{\mathrm{cov}(x_1, x_2)}{\sigma_1\sigma_2} = \mathrm{cov}(u_1, u_2) \quad\to$$
![Page 89: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/89.jpg)
89
USM
$$\phi(u_1, u_2) = \frac{1}{2\pi\sqrt{\det B^{-1}}}\exp\!\left(-\frac{1}{2}\,\mathbf{u}^{\rm T} B\,\mathbf{u}\right), \quad\text{with}\quad B = \frac{1}{1-\rho^2}\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}$$

Lines of constant probability density result from a constant exponent:

$$\frac{1}{1-\rho^2}\left(u_1^2 - 2\rho\,u_1 u_2 + u_2^2\right) = \text{const}$$

Let const = 1, i.e., the probability density has decreased by a factor of $\exp(-1/2) = 1/\sqrt{e}$ from the maximum, $\phi(0,0)$. (This corresponds to the 1-D case where at $u = \pm 1$ (i.e., $x - \mu = \pm\sigma$) the probability density has decreased by the same factor.)

In the original variables, we then have

$$\frac{(x_1-\mu_1)^2}{\sigma_1^2} - 2\rho\,\frac{(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} = 1 - \rho^2,$$

which is the equation of an ellipse with center at $(\mu_1, \mu_2)$ and is called the ellipse of covariance (die Fehlerellipse). The extreme values of $x_1$ and $x_2$ are located at $\mu_1 \pm \sigma_1$ and $\mu_2 \pm \sigma_2$, i.e., the ellipse fits exactly into the rectangular box between these limits. The total probability of observing a pair $x_1$ and $x_2$ inside the ellipse is $1 - \exp(-1/2)$.
![Page 90: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/90.jpg)
90
USM
[Figure: covariance ellipses centered at (2,2), with σ₁ = 1 and σ₂ = 2, for ρ = 0.7 (green, θ = -31.6°), ρ = 0.0 (red, θ = 0°, σ'₁ = 1, σ'₂ = 2), ρ = -0.3 (blue) and ρ = -0.999 (black); for each case the rotation angle θ and the transformed standard deviations σ'₁, σ'₂ are quoted.]

By a simple rotation, the correlation can be put to zero (diagonalization by orthogonal transformation). The corresponding transformation is

$$\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad\text{with}\quad \tan 2\theta = \frac{2\rho\,\sigma_1\sigma_2}{\sigma_1^2 - \sigma_2^2},$$

and new semi-major and semi-minor axes (corresponding to the variances of the uncorrelated variables $x_1'$ and $x_2'$)

$$\sigma_1'^2 = \frac{(1-\rho^2)\,\sigma_1^2\sigma_2^2}{\sigma_1^2\sin^2\theta - 2\rho\sigma_1\sigma_2\sin\theta\cos\theta + \sigma_2^2\cos^2\theta}$$

$$\sigma_2'^2 = \frac{(1-\rho^2)\,\sigma_1^2\sigma_2^2}{\sigma_1^2\cos^2\theta + 2\rho\sigma_1\sigma_2\sin\theta\cos\theta + \sigma_2^2\sin^2\theta}$$

In the rotated coordinate system, the distribution has the simple form

$$\phi(x_1', x_2') = \frac{1}{2\pi\sigma_1'\sigma_2'}\exp\!\left\{-\frac{1}{2}\left(\frac{x_1'^2}{\sigma_1'^2} + \frac{x_2'^2}{\sigma_2'^2}\right)\right\}$$

(with $x_1'$, $x_2'$ measured from the transformed center).
![Page 91: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/91.jpg)
91
USM
The probability enclosed by the covariance ellipse can be calculated as follows. Consider the rotated coordinate system, and work in reduced variables. In this case, the p.d.f. reads

$$\phi(u_1', u_2') = \frac{1}{2\pi}\exp\!\left(-\frac{1}{2}\left(u_1'^2 + u_2'^2\right)\right),$$

and the total probability inside the covariance ellipse (which in the transformed variables is the unit circle) can be calculated from

$$\iint_{\rm circle}\phi(u_1', u_2')\,du_1'\,du_2' = \frac{1}{2\pi}\int_0^{2\pi}d\varphi\int_0^1 \exp(-r^2/2)\,r\,dr = \int_0^1 \exp(-r^2/2)\,r\,dr = 1 - \exp(-1/2) = 0.393$$

This is the probability that any $(x_1, x_2)$ pair is located within the covariance ellipse, and applies for all binormal distributions, independent of their specific correlation (the distribution in the transformed coordinate system is independent of the correlation).

The area inside the covariance ellipse is called the "1-σ confidence region", since it comprises the region where the p.d.f. has decreased from the maximum by less than a factor of $\exp(-1/2)$, in analogy to the 1-D case (independent of the correlation and the specific $\sigma_{1,2}$).

Similarly, one can calculate the 2-σ confidence region (where the probability density has decreased by a factor of $\exp(-2^2/2) = \exp(-4/2)$), with a total probability inside the corresponding ellipse of $1 - \exp(-4/2) = 0.865$ (in the above integral, replace the upper limit by $r = 2$), and so on for the n-σ interval. Finally, one can generalize this consideration to arbitrary dimensions.
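That the enclosed probability $1 - e^{-1/2} \approx 0.393$ is independent of ρ can be checked by Monte Carlo; this is my own sketch (correlated pairs are built from two independent Gaussians):

```python
import math, random

random.seed(11)

def inside_fraction(rho, n=200000):
    """Fraction of correlated normal pairs falling inside the 1-sigma covariance ellipse."""
    hits = 0
    for _ in range(n):
        z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
        u1 = z1
        u2 = rho * z1 + math.sqrt(1 - rho * rho) * z2   # correlated pair
        if (u1 * u1 - 2 * rho * u1 * u2 + u2 * u2) / (1 - rho * rho) <= 1:
            hits += 1
    return hits / n

for rho in (0.0, 0.5, -0.9):
    print(rho, inside_fraction(rho))  # all near 0.393
```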
![Page 92: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/92.jpg)
92
USM
Generally, the 1-σ confidence region denotes the region where the probability density has decreased by less than the factor exp(-1/2).

[Figure: binormal distribution as before, with ρ = -0.9, and contour plots for the 1-, 2- and 3-σ covariance ellipses. In the lower panel, the coordinate system has been transformed (rotated, stretched) and displays the transformed binormal distribution (with unit variances and ρ = 0) and the corresponding covariance "ellipses" for σ = 1, 2, 3.]

Note that the volume (corresponding to the total probability inside the contour levels) remains preserved under the transformation (e.g., for thin ellipses with large |ρ| the probability densities are larger).
![Page 93: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/93.jpg)
93
USM
[Figure: covariance ellipses for σ=1,2,3 with the corresponding probabilities and the standard deviations (2σ₁, 2σ₂) with respect to the two directions.]

Left: probability inside the n-σ confidence region; right: interval limits in units of σ for a given confidence level (probability).
![Page 94: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/94.jpg)
94
USM
χ²-distribution

Remember from Sect. II ("calculation of the transformed p.d.f.", example 2): calculate the distribution for the square of a reduced r.v. which itself is normally distributed,

$$u = \left[\frac{x-\mu}{\sigma}\right]^2 \quad\text{and}\quad f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)} \;\Rightarrow\; g(u) = \frac{1}{\sqrt{2\pi}}\,\frac{1}{\sqrt{u}}\,e^{-u/2},$$

which is the so-called χ²-distribution for one degree of freedom. For convenience, we denote χ² by $u$ in the following.

$$E(u) = 1, \qquad \mathrm{Var}(u) = 2$$

Now, let's add $f$ independent, normally distributed and reduced random variables:

$$\chi^2 = \sum_{i=1}^{f} u_i^2 = \sum_{i=1}^{f}\frac{(x_i - \mu_i)^2}{\sigma_i^2}$$
![Page 95: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/95.jpg)
95
USM
This results in the χ²-distribution for $f$ degrees of freedom, which plays an important role in the comparison of measurements and theoretical predictions (e.g., linear regression). In this case,

$$g(u) = \frac{1}{\Gamma(f/2)\,2^{f/2}}\,u^{f/2 - 1}\,e^{-u/2} \quad\text{with Gamma function } \Gamma,$$

and (from the definition and using the calculation rules for expectation value and variance)

$$E(u) = f, \qquad \mathrm{Var}(u) = 2f$$

The maximum of the χ²-distribution for $f > 2$ is at $u_{\rm max} = f - 2$. For $f = 2$, we obtain an exponential distribution. For large $f$, the χ²-distribution approaches a normal distribution. The role of the degrees of freedom will be discussed in Sect. x.x.
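The normalization and the moments $E(u) = f$, $\mathrm{Var}(u) = 2f$ can be checked by numerical integration of the pdf above; this is my own sketch (the simple trapezoidal scheme assumes $f > 2$ so the pdf vanishes at $u = 0$):

```python
import math

def chi2_pdf(u, f):
    """Chi-square pdf for f degrees of freedom."""
    return u ** (f / 2 - 1) * math.exp(-u / 2) / (math.gamma(f / 2) * 2 ** (f / 2))

def moments(f, hi=120.0, n=240000):
    """Trapezoidal estimates of normalization, mean and variance on [0, hi]."""
    h = hi / n
    s0 = s1 = s2 = 0.0
    for i in range(1, n + 1):
        u = i * h
        w = 0.5 if i == n else 1.0
        g = w * chi2_pdf(u, f)
        s0 += g
        s1 += g * u
        s2 += g * u * u
    return s0 * h, s1 * h, s2 * h - (s1 * h) ** 2

f = 4
norm, mean, var = moments(f)
print(norm, mean, var)  # near 1, f = 4 and 2f = 8
```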
![Page 96: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/96.jpg)
96
USM
The central limit theorem (CLT)

Remember: the normal distribution was derived as the asymptotic distribution for

$$x = \lim_{n\to\infty}\sum_{i=1}^{n} x_i,$$

when $x_i$ describes the outcome of an experiment with two possible results, $x_i \in \{0, 1\}$. Let's now investigate more general sums of this type.

"Classical" theorem: We assume that the $x_i$ are independent r.v. and originate from the same, arbitrary distribution with mean $\mu$ and variance $\sigma^2$. The characteristic function of this distribution (for $x_i - \mu$) is

$$\phi(t) = E\!\left(e^{it(x_i - \mu)}\right), \quad\text{with}\quad \left.\frac{d\phi(t)}{dt}\right|_{t=0} = 0 \quad\text{and}\quad \left.\frac{d^2\phi(t)}{dt^2}\right|_{t=0} = -\sigma^2$$

Thus, the Taylor expansion is given by ($\phi(0) = 1$)

$$\phi(t) = 1 - \frac{1}{2}\sigma^2 t^2 + O(t^3)$$
![Page 97: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/97.jpg)
97
USM
We now introduce a new variable

$$u_i = \frac{x_i - \mu}{\sigma\sqrt{n}},$$

which simply contracts the scale. The corresponding characteristic function is

$$\phi_{u_i}(t) = E\!\left(e^{itu_i}\right) = E\!\left(\exp\!\left(it\,\frac{x_i - \mu}{\sigma\sqrt{n}}\right)\right) = \phi\!\left(\frac{t}{\sigma\sqrt{n}}\right), \quad\text{and therefore}$$

$$\phi_{u_i}(t) = 1 - \frac{t^2}{2n} + \dots \quad\text{with higher terms at most of order } O(n^{-3/2})$$

Making use of the fact that the characteristic function of the sum of independent r.v. is given by the product of the individual characteristic functions, and going to the limit $n \to \infty$, we find for

$$u = \lim_{n\to\infty}\sum_{i=1}^{n} u_i = \lim_{n\to\infty}\sum_{i=1}^{n}\frac{x_i - \mu}{\sigma\sqrt{n}} \quad\text{that}$$

$$\phi_u(t) = \lim_{n\to\infty}\left[\phi_{u_i}(t)\right]^n = \lim_{n\to\infty}\left(1 - \frac{t^2}{2n} + \dots\right)^n = \exp\!\left(-\frac{1}{2}t^2\right),$$

which is just the characteristic function of the standardized normal distribution, with expectation value 0 and variance 1.
![Page 98: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/98.jpg)
98
USM
In terms of the arithmetic mean of the original variable then,

$$\lim_{n\to\infty}\left(\frac{\sigma}{\sqrt{n}}\,u + \mu\right) = \lim_{n\to\infty}\left[\frac{\sigma}{\sqrt{n}}\sum_{i=1}^{n}\frac{x_i - \mu}{\sigma\sqrt{n}} + \mu\right] = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x},$$

the back-transformed distribution is normal, with

$$E(\bar{x}) = \mu, \qquad \mathrm{Var}(\bar{x}) = \sigma^2/n,$$

i.e., mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Thus, the "classical" central limit theorem reads: If the $x_i$ are a set of $n$ independent r.v., each distributed with mean $\mu$ and variance $\sigma^2$, then in the limit $n \to \infty$ their arithmetic mean

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

is normally distributed with mean $\mu$ and variance $\sigma^2/n$.

Under certain assumptions [see, e.g., Wikipedia: the Lyapunov criterion ("weak" asymmetry) or the even weaker Lindeberg condition], a "generalized" central limit theorem can be formulated. If these conditions apply, the sum of arbitrarily (i.e., not identically) distributed r.v. converges to a normal distribution, with mean $\sum_{i=1}^{n}\mu_i$ and variance $\sum_{i=1}^{n}\sigma_i^2$.
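The classical statement can be illustrated with uniform r.v., for which $\mu = 0.5$ and $\sigma^2 = 1/12$, so the mean of $n$ of them should have variance $1/(12n)$ (my own sketch):

```python
import random

random.seed(5)

# Arithmetic mean of n uniform r.v. on [0, 1): approximately normal with
# mu = 0.5 and variance 1/(12 n); check both moments empirically.
n, trials = 30, 20000
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

m = sum(means) / trials
v = sum((x - m) ** 2 for x in means) / trials
print(m, v)  # near 0.5 and 1/360
```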
![Page 99: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/99.jpg)
99
USM
Examples for the CLT

CLT for several cases:
• upper panel: arithmetic mean of n = 1, 2, 30 uniformly distributed r.v.; overplotted is the corresponding Gaussian with μ = 0.5 and variance 1/(12n)
• middle panel: arithmetic mean of n = 1, 2, 30 exponentially (λ=1) distributed r.v.; overplotted is the corresponding Gaussian with μ = 1 and variance 1/n
• lower panel: sum of n = 1, 2, 30 exponentially (λ=1) plus n = 1, 2, 30 uniformly distributed r.v.; overplotted is the corresponding Gaussian with μ = n·1 + n·0.5 and variance n·1 + n/12

Sample size = 1e6, bin size = 0.005.
![Page 100: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/100.jpg)
100
USM
The CLT in its generalized form is the basis for assuming experimental errors to be normally distributed: each measurement error is assumed to consist of an accumulation of many small individual errors (with unknown distributions), whose sum (the measured error) can then be described by a Gaussian.
![Page 101: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/101.jpg)
101
USM
Log-normal distribution

A single-tailed probability distribution of a random variable whose logarithm is normally distributed:
• If y is a random variable with a normal distribution, then x = exp(y) has a log-normal distribution; likewise, if x is log-normally distributed, then log(x) is normally distributed. (The base of the logarithm does not matter.)
• A variable might be modeled as log-normal if it can be thought of as the product of many independent factors which are positive and close to 1 (see figure next page): log(x) = log of the product = sum of the logs → CLT → log(x) is normally distributed.
• Plays an important role in, e.g., economy, biology, mechanics and astrophysics.

$$f(x; \mu, \sigma) = \frac{1}{x\,\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$$

$$E(x) = e^{\mu + \sigma^2/2}, \qquad \mathrm{Var}(x) = \left(e^{\sigma^2} - 1\right)e^{2\mu + \sigma^2}$$
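The expectation-value formula $E(x) = e^{\mu + \sigma^2/2}$ (note: not $e^{\mu}$) can be verified by sampling $x = \exp(y)$ with Gaussian $y$; this is my own sketch with arbitrarily chosen parameters:

```python
import math, random

random.seed(9)

# x = exp(y) with y ~ N(mu, sigma) is log-normal; compare the sample mean
# with E(x) = exp(mu + sigma^2 / 2).
mu, sigma, n = 0.0, 0.5, 400000
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(n)]
mean = sum(xs) / n
print(mean, math.exp(mu + sigma ** 2 / 2))  # both near 1.133
```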
![Page 102: Statistical Methods - uni-muenchen.de · Statistical Methods An Introduction for (Astro-)Physicists. 2 USM Content Fundamental terms of statistics and data analysis, with examples](https://reader034.fdocuments.in/reader034/viewer/2022050715/5f2ed4ed31bc784f3d65d071/html5/thumbnails/102.jpg)
102
USM
pdf (left) and cumulative distribution function (right) for a log-normal distribution with μ=0 and different σ

Left: simulation of a log-normal distribution from a sample of $10^5$ r.v. which are distributed according to

$$x = \prod_{i=1}^{7} x_i \quad\text{with independent } x_i,$$

where the $x_i$ are uniformly distributed within the interval $[0.4, 1.6]$. The estimators (Sect. VI) for $\mu$ and $\sigma$ are $\hat{\mu} = -0.47$ and $\hat{\sigma} = 1.02$. Overplotted is a theoretical log-normal distribution with these parameters.