Lecture Notes for Phys 114 ”Introduction to data reduction”
Vitaly A. Shneidman
Department of Physics, New Jersey Institute of Technology
(Dated: April 13, 2014)
Abstract
These lecture notes will contain some theoretical material related to Bevington, 3rd ed. (abbreviated BEV.), which is the main textbook. Notes for all lectures will be kept in a single file, and the table of contents will be updated automatically so that each time you can print out only the updated part.
The Mathematica appendix is included for reference; you do not have to print it out since there
will be other files which illustrate how this program works.
Please report any typos to [email protected]
Contents
I. Introduction
II. Fundamental limitations of measurements
A. Diffraction
B. Quantization of light
C. deBroglie wavelength
D. Thermal noise
E. Charge quantization
F. Decay of metastable states
III. The Mathematica program and review of introductory math
A. Error function
B. Gamma function and Stirling formula
C. Incomplete gamma function
IV. Introduction to Mathematica
A. Overview
B. Main commands
C. Lists
V. Data and Histograms
VI. Errors, Parent and Sample distributions
A. Parent and Sample distributions
B. Mean and variance
C. Alternatives to mean and variance
VII. Discrete and continuous distributions
1. Mean and variance
A. Change of variables in continuous distributions
VIII. Binomial distribution
A. Normal distribution
IX. Limits of the binomial distribution
A. Poisson
1. Advanced: sum of two Poisson distributions
B. Normal
C. Advanced: Central limit theorem
D. Physical example: Fluctuations of density in ideal gas
X. Other continuous distributions
1. Exponential
2. Gamma and χ2
A. Lorentz (Cauchy)
XI. ADVANCED: Expectation and moments, addition for Gauss
A. Adding two normal distributions (ND)
XII. Distribution of several variables
1. Multivariate Gaussian distribution
2. Covariance and correlation
3. Statistical analog of covariance and correlation
4. Independent random variables
XIII. Measurements and propagation of errors
A. The coin experiment
1. Fair
2. Unfair
B. Propagation of small errors
1. x = f(u)
2. x = f(u, v)
C. Designing an experiment: Example
XIV. Estimators, χ2 and Kolmogorov-Smirnov tests
A. Maximum likelihood
1. Weighted average
B. χ2
1. Where does it come from?
C. Comparing two distributions
1. χ2-test for data vs theoretical
2. χ2-test for data1 vs data2
D. Kolmogorov-Smirnov
1. Kolmogorov-Smirnov test
XV. Monte Carlo integration
A. Buffon's needle
B. General MC and example
XVI. Generation of random numbers for different distributions
A. exponential distribution
B. Poisson
C. Gauss
XVII. LSA
A. Fitting of data and geometric LSA
B. "Physical" LSA
1. Errors in a and b
XVIII. Trigonometric, polynomial and nonlinear fits
A. a bit of Mathematica
1. Basic elementary commands
2. NUMBERS
3. SYMBOLIC MATH
4. DEFINING YOUR OWN FUNCTIONS
5. Graphics (2D)
6. Modeling and analysis of data
a. Filtering
Dr. Vitaly A. Shneidman, Phys 114, 1st Lecture
I. INTRODUCTION
Homework is an important part of the course. If done by hand, it must be clearly written in pen (black or blue); if done using Mathematica, it must be printed out as a clear hard copy (neat hand-written corrections on this hard copy are ok). No electronic submissions of Mathematica notebooks will be accepted.
II. FUNDAMENTAL LIMITATIONS OF MEASUREMENTS
A. Diffraction
FIG. 1: Fraunhofer diffraction from a circular aperture with diameter D. The arguments are
απD/λ with α ≪ 1 being the angle and λ the wavelength. The angle to see the first minimum is
called the angular resolution, θ ≈ 1.21967λ/D. Note that the intensity is very small beyond the
first minimum (right figure), but still can be easily picked up by the eye (left figure). [Also, note:
in optics literature this circular pattern is sometimes called ”Airy disk”.]
B. Quantization of light
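\[ E = \hbar\omega , \qquad \hbar \approx 10^{-34}\ \mathrm{J \cdot s} \qquad (1) \]
C. deBroglie wavelength
\[ \lambda_{dB} = \frac{h}{p} , \qquad h = 2\pi\hbar \qquad (2) \]
D. Thermal noise
\[ E \sim k_B T , \qquad k_B \simeq 1.38 \cdot 10^{-23}\ \mathrm{J/K} \qquad (3) \]
Einstein (1905): relation between noise intensity and dissipation rate.
Nyquist noise:
\[ V^2 \sim k_B T \cdot R \]
E. Charge quantization
\[ Q/e = 0, \pm 1, \pm 2, \ldots , \qquad e = -1.6 \cdot 10^{-19}\ \mathrm{C} \qquad (4) \]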
F. Decay of metastable states
Quantum:
[Figure: energy versus distance.]
\[ P \sim \exp\left( -\frac{2}{\hbar} |S| \right) \]
\[ |S| = \int_{x_1}^{x_2} |p(x)|\, dx , \qquad p^2/2m = E - U(x) \]
Thermal:
[Figure: W versus r.]
\[ P \sim \exp\left( -\frac{W_*}{k_B T} \right) \]
HW: Give a crude estimate of the resolution limit (in m) when using
• 1 MHz radio waves
• $10^{15}$ Hz light
• $10^{18}$ Hz X-rays
• 1 keV electrons
• thermal neutrons at about 1000 K
Arrange the results in a neat Table.
Dr. Vitaly A. Shneidman, Phys 114, 2nd Lecture
III. THE MATHEMATICA PROGRAM AND REVIEW OF INTRODUCTORY MATH
An introduction to Mathematica is given in a separate file "IntroToMathematica.doc" (see also Appendix A). The examples refer to numbers and elementary functions, which should not cause any difficulties. Non-elementary functions used in data analysis are described below. See also 111−functions.pdf
HW: reproduce the Mathematica output described in the file ”IntroToMathematica.doc” , slightly
changing the input functions compared to what we discussed in class. Print out your notebook.
A. Error function
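\[ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x dt\, \exp(-t^2) , \qquad \mathrm{erf}(\pm\infty) = \pm 1 \qquad (5) \]
\[ \mathrm{erfc}(x) = 1 - \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_x^\infty dt\, \exp(-t^2) \qquad (6) \]
\[ \mathrm{erfc}(-\infty) = 2 , \qquad \mathrm{erfc}(x \to +\infty) \sim \frac{1}{\sqrt{\pi}\, x} \exp(-x^2) \qquad (7) \]
[Figure: erf(x) and erfc(x) for −3 ≤ x ≤ 3.]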
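The properties in Eqs. (5)-(7) can be checked numerically. The following is a minimal sketch, assuming the built-in Mathematica functions Erf and Erfc; the sample points x = 2, 4, 8 are arbitrary choices made here for illustration.

```mathematica
(* Limits from Eq. (5) and the identity erf + erfc = 1 from Eq. (6) *)
Erf[Infinity]
N[Erf[1] + Erfc[1]]

(* Ratio of erfc(x) to its large-x asymptote, Eq. (7); it should approach 1 *)
Table[{x, Erfc[x]/(Exp[-x^2]/(Sqrt[Pi] x))}, {x, {2., 4., 8.}}]

(* Plot of both functions on the same axes *)
Plot[{Erf[x], Erfc[x]}, {x, -3, 3}, PlotLegends -> {"erf(x)", "erfc(x)"}]
```

The ratio in the table approaches 1 only slowly, which is a reminder that Eq. (7) is an asymptotic statement rather than an identity.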
B. Gamma function and Stirling formula
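\[ \Gamma(n+1) = n\Gamma(n) = \int_0^\infty dt\, t^n e^{-t} \equiv n! \qquad (8) \]
\[ n! \simeq \sqrt{2\pi n}\, (n/e)^n \qquad (9) \]
\[ \Gamma(1) = \Gamma(2) = 1 , \qquad \Gamma(0) = \infty , \qquad \Gamma(1/2) = \sqrt{\pi} \]
[Figure: $\log \Gamma(1+x)$ and the logarithm of the Stirling approximation, $\log\left( e^{-x} \sqrt{2\pi}\, x^{1/2+x} \right)$, for 0 ≤ x ≤ 5.]
\[ \ln n! = \ln 1 + \ln 2 + \ldots + \ln n \simeq \int \ln n \, dn = n \ln n - n = n \ln \frac{n}{e} \]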
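To see how quickly the Stirling formula (9) becomes accurate, one can tabulate it against the exact factorial. This is only an illustrative sketch; the helper name stirling and the chosen values of n are assumptions introduced here, not part of the course notebooks.

```mathematica
(* Stirling approximation, Eq. (9); "stirling" is a name chosen for this sketch *)
stirling[n_] := Sqrt[2 Pi n] (n/E)^n

(* Columns: n, exact n!, Stirling value, ratio Stirling/exact *)
TableForm[
 Table[{n, N[n!], N[stirling[n]], N[stirling[n]/n!]}, {n, {1, 5, 10, 50}}],
 TableHeadings -> {None, {"n", "n!", "Stirling", "ratio"}}]
```

The ratio column tends to 1 from below as n grows, consistent with the statement that the approximation is already good for modest n.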
C. Incomplete gamma function
\[ \Gamma(x+1, m) = \int_m^\infty dt\, t^x e^{-t} , \qquad \Gamma(x, 0) \equiv \Gamma(x) \qquad (10) \]
Let us plot a few $g[m](x) = \Gamma(1+x, m)/\Gamma(1+x)$:
[Figure: g(1), g(2), g(3), g(4) as functions of x for 0 ≤ x ≤ 4.]
IV. INTRODUCTION TO MATHEMATICA
A. Overview
TOPICS for several lectures (0.-III. in IntroToMathematica.doc; also see Appendix A):
0. GETTING STARTED
entering a command
help
useful type- and space saving commands
saving files
brackets
equality signs
I. NUMBERS:
exact
approximate
complex (optional)
numbers with dimensions
II. SYMBOLIC MATH
sums and series
integration
algebra
trigonometry
Taylor expansions and limits (optional)
III GRAPHICS
2D
3D (optional)
IV. FUNCTIONS - see 114−functions.pdf
V. LISTS
see 114−intro2.pdf
VI. RANDOM NUMBERS and HISTOGRAMS - see 114−hist.pdf
reading external data (Excel and txt)
reading images
B. Main commands
To know by heart (with options):
(space=multiplication)
(*...*) - comment
;
%
/. - replacement with x -> ... or {x -> ..., y -> ...}
. - dot product
{..., ..., ...} - list (”vector”)
{{...}, {...}} - nested list (matrix)
=
:=
==
...//f - equiv. to f[...] with f any pure function or operation
N
Integrate
D
Solve
Table , Do , If
...[[i]] - i-th element of list ... (or, row of a matrix if nested list); [[i,k]] - element of a matrix
Transpose , Append , Drop , Select
Plot , ListPlot , Show, Export
Graphics , Point , Line , Text
Timing , RandomReal , RandomInteger
Histogram , Mean , StandardDeviation
FindFit
Import
C. Lists
see 114−intro2.pdf/nb
1. How to generate : Table command; nested lists
2. Length and Dimensions; Lists as vectors and matrices
extracting elements of lists - the [[...]]
3. REARRANGING LISTS : Sort, Rotate, etc.
4. Restructuring Lists : Transpose, Flatten, Drop, Select, etc.
5. Combining Lists : Join, Union, Intersection
6. Operating on lists : the Map (or, /@) command
7. Pure functions
8. Graphics
HW: Select any simple function f. Construct mylist, a reasonably long list of values of f plus random corrections. Make a good plot which shows f(x) and mylist together. The plot must include your name and labels. Create a high quality plot outside of Mathematica; print it out.
FIG. 2: A typical histogram after 20 rolls of a single die.
V. DATA AND HISTOGRAMS
see 114−histogram.nb
Suppose you have a long list of data $x_k$, $1 \le k \le N$, with
\[ x_{min} \le x_k \le x_{max} \]
Select a "binSize" $\Delta x$ and group the data into $n$ bins with
\[ n \simeq \frac{x_{max} - x_{min}}{\Delta x} \]
Each bin is distinguished by its index $i$ and content $b_i$. E.g., an element $x_k$ belongs to bin $i$ if
\[ i = [x_k/\Delta x] \]
Here [...] is the "Floor" function. A plot of $b_i$ as a function of $x$ then represents a "Histogram"; a plot normalized to 1 corresponds to the "Probability" option; $b_i/(N \cdot \mathrm{binSize})$ approximates the PDF.
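The binning rule above can be sketched directly in Mathematica. This is a do-it-yourself illustration under stated assumptions: the data are uniform random numbers on [0, 10], and the names data, binSize, idx, counts and pdf are chosen here for the example (they are not taken from 114−histogram.nb).

```mathematica
SeedRandom[114];
data = RandomReal[{0, 10}, 1000];   (* hypothetical data x_k *)
binSize = 0.5;

idx = Floor[data/binSize];          (* bin index i = [x_k/binSize] for each point *)
counts = Sort[Tally[idx]];          (* pairs {i, b_i}, ordered by bin index *)

(* {left edge of bin, b_i/(N binSize)}: the approximation to the PDF *)
pdf = {#[[1]] binSize, #[[2]]/(Length[data] binSize)} & /@ counts;

(* Normalization check: the sum of PDF values times binSize should equal 1 *)
Total[pdf[[All, 2]]] binSize

(* Built-in equivalent for comparison *)
Histogram[data, {binSize}, "PDF"]
```

Doing the binning by hand once makes it clear what the "PDF" option of the built-in Histogram command computes.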
HW:
1. write a function which gives a sum (between 3 and 18) for a single random roll of 3 fair dice
2. repeat the experiment 500 times (not more)
3. use ”Commonest” command to find the most frequent outcome (”mode”)
[Figure: two histograms over the sums 2 to 12; unscaled counts (left) and probabilities (right).]
FIG. 3: Unscaled histogram (red) and scaled (blue) after 1 million rolls of 2 dice. The blue histogram represents the experimental probabilities (which are already close to theoretical expectations) with a total area of 1.
4. plot histograms similar to those in fig. 3; you do not have to use colors and, in fact the
do-it-yourself alternative based on ”myBins” (see notebook) could be better.
VI. ERRORS, PARENT AND SAMPLE DISTRIBUTIONS
• Accuracy - How close the measurements are to the true value (note that we may not
always know the true value).
• Precision - How close repeated measurements are to each other. A measure of the
spread of data points. One can make measurements that are highly accurate (their
mean is close to the true value) even though they may not be very precise (large spread
of measurements). Conversely, one can make very precise measurements that are not
accurate.
• Errors - Deviations of measurements from the true value. Error here does not mean a
blunder! Also referred to as uncertainties.
• Systematic Errors - deviations from the true value that are very reproducible, gener-
ally due to some uncorrected effect of an instrument or measurement technique. An
example is reading a scale slightly off the vertical, which may systematically give a
too-high or too-low reading.
• Statistical, or Random Errors - fluctuations in measurements that result in their being both too high and too low, due to how precisely the measurement can be made, and which are amenable to reduction by doing repeated measurements.
A. Parent and Sample distributions
Suppose we measure a sample with discrete values of x
\[ x_1 , x_2 , \ldots , x_n \]
in $N$ measurements with $n \le N$. Let all $x_i$ be distinct, each repeated $n_i \ge 1$ times. Define the frequency
\[ f_i = \frac{n_i}{N} , \qquad \sum_i f_i = 1 \]
Then the probability is
\[ p_i = \lim_{N \to \infty} f_i \qquad (11) \]
provided the limit exists. The set $p_i \ge 0$ with
\[ \sum_i^n p_i = 1 \qquad (12) \]
determines the discrete Parent probability distribution. The $n$ can be finite or not. EXAMPLE: coin (in class).
If x is continuous, break the x-interval into bins
\[ x_0 < x_1 < x_2 < \ldots < x_n \]
and define $f_i$ for each bin with $x_i \le x < x_{i+1}$ (a "histogram"). The parent distribution now is a continuous probability density function ("PDF") $p(x) \ge 0$ with
\[ \int_{-\infty}^{\infty} p(x)\, dx = 1 , \qquad \int_{x_i}^{x_{i+1}} p(x)\, dx = p_i \qquad (13) \]
The rest is the same. Note the units:
\[ [p_i] = 1 , \qquad [p(x)] = 1/[x] \]
B. Mean and variance
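Sample:
$N$ is the number of experiments (large); $n$ is the number of possible outcomes of a single experiment, which can be large or small (e.g. $n = 2$ for a single coin).
\[ \bar{x} = \frac{1}{N} \sum_i^N x_i = \sum_i^n f_i x_i \qquad (14) \]
\[ s^2 = \frac{1}{N-1} \sum_i^N (x_i - \bar{x})^2 \approx \left\langle (x - \bar{x})^2 \right\rangle \qquad (15) \]
Equivalently
\[ s^2 \approx \sum_i^n f_i (x_i - \bar{x})^2 = \left\langle x^2 \right\rangle - \bar{x}^2 \]
Parent: $N \to \infty$; $n$ can remain finite (or not).
\[ p_i = \lim_{N \to \infty} f_i \qquad (16) \]
\[ \mu = \lim_{N \to \infty} \bar{x} = \sum_i^n p_i x_i \qquad (17) \]
\[ \sigma^2 = \lim_{N \to \infty} s^2 = \sum_i^n p_i (x_i - \mu)^2 \qquad (18) \]
for discrete distributions, or for continuous ones:
\[ \mu = \int_{-\infty}^{\infty} dx\, x\, p(x) \]
\[ \sigma^2 = \int_{-\infty}^{\infty} dx\, (x - \mu)^2 p(x) \]
In both cases
\[ \sigma^2 = \left\langle (x - \mu)^2 \right\rangle = \left\langle x^2 \right\rangle - \mu^2 \qquad (19) \]
$\sigma$ is the "standard deviation"; usually "error" $\simeq \pm\sigma$.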
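As an illustration of the difference between sample and parent quantities, the sketch below uses a single fair die (an example chosen here, not taken from the notes). The built-in Variance uses the 1/(N − 1) normalization of Eq. (15); the parent values follow from Eqs. (17) and (18).

```mathematica
SeedRandom[1];
rolls = RandomInteger[{1, 6}, 10000];   (* sample: N = 10000 rolls of one die *)

(* Sample mean and sample variance, Eqs. (14) and (15) *)
N[{Mean[rolls], Variance[rolls]}]

(* Parent distribution of a fair die: p_i = 1/6 for x_i = 1..6 *)
p = ConstantArray[1/6, 6];
x = Range[6];
mu = p . x;                 (* Eq. (17) *)
sigma2 = p . (x - mu)^2;    (* Eq. (18) *)
{mu, sigma2}                (* exact parent values: 7/2 and 35/12 *)
```

The sample values fluctuate from run to run (change the seed to see this), while the parent values are fixed numbers.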
C. Alternatives to mean and variance
"Mode" $M$:
\[ x = M , \qquad p(x) = \max \]
(command "Commonest" in Mathematica).
"Median" $m$ (also, $\mu_{1/2}$):
\[ \int_{-\infty}^{m} dx\, p(x) = \int_{m}^{\infty} dx\, p(x) = \frac{1}{2} \]
(less sensitive to "outliers" than $\mu$).
"Deviation":
\[ d = \lim_{N \to \infty} \frac{1}{N} \sum_i^N |x_i - \mu| = \langle |x - \mu| \rangle \]
(less sensitive to "outliers" than $\sigma$).
HW: Consider an "experiment": a coin is dropped twice, with heads = 0 and tails = 1 (i.e. the outcome is a number 0, 1 or 2). The experiment is repeated a large number of times, N.
1. find "parent" values of $n$, all $x_i$ and $p_i$
2. find $\mu$ and $\sigma$
3. find $M$, $m$, $d$
4. write a Mathematica code to simulate one experiment
5. repeat N = 10,000 times, creating a sample list of data (don't forget the ";" or your screen will be full).
6. find $\bar{x}$, $s$ for the list, and $M$, $m$ and $d$
7. plot a histogram and compare with the parent distribution.
VII. DISCRETE AND CONTINUOUS DISTRIBUTIONS
READING: Ch.2 + these notes
• Discrete: assume the elements of sample space can be numbered by an integer $j$; $x_j$ are their values while $p_j$ are the probabilities. Cumulative distribution:
\[ F(x) = \sum_{x_j \le x} p_j \qquad (20) \]
• Continuous: assume the elements of sample space can be identified by a continuous variable $x$ with a probability density $p(x)$. The cumulative distribution is given by
\[ F(x) = \int_{-\infty}^{x} dx'\, p(x') \qquad (21) \]
Primitive examples:
Discrete: "binary"
\[ p_0 = p_1 = \frac{1}{2} , \qquad F(x) = 0,\ x < 0 ; \quad F(x) = \frac{1}{2},\ 0 \le x < 1 ; \quad F(x) = 1,\ x \ge 1 \]
Continuous: "uniform"
\[ p(x) = \frac{1}{L},\ 0 \le x \le L ; \qquad F(x) = 0,\ x < 0 ; \quad F(x) = \frac{x}{L},\ 0 \le x \le L ; \quad F(x) = 1,\ x > L \]
1. Mean and variance
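We use equivalently a "bar" for averages and $\langle \ldots \rangle$ for longer expressions.
\[ \bar{x} = \sum p_i x_i \to \int_{-\infty}^{\infty} x\, p(x)\, dx \qquad (22) \]
\[ \sigma^2 = \left\langle (x - \bar{x})^2 \right\rangle = \left\langle x^2 \right\rangle - \langle x \rangle^2 \qquad (23) \]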
FIG. 4: Binomial probability function and approximation of the results by a gaussian curve for
n = 20 , p = 1/2 (left, unbiased), n = 20 , p = 0.3 (middle, biased) and n = 3 , p = 0.6 (right).
The approximation becomes exact for n → ∞ (”Limit theorem of de Moivre and Laplace”) but in
practice is good starting from very modest n.
A. Change of variables in continuous distributions
\[ x \to y(x) , \qquad F[y(x)] = F(x) \]
Thus the new probability density $P(y)$ is derived from
\[ P(y)\, dy = p(x)\, dx \qquad (24) \]
Primitive example: $p(x) = 1$, $0 \le x \le 1$, $y = x/L$
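The primitive example can be worked out explicitly from Eq. (24). The short derivation below assumes $L > 0$, so that $y(x)$ is monotonically increasing and no absolute value is needed for the derivative.

```latex
% Worked example for Eq. (24): p(x) = 1 on 0 <= x <= 1, y = x/L with L > 0
\begin{align*}
  x = L y \quad &\Rightarrow \quad \frac{dx}{dy} = L , \\
  P(y) = p(x)\,\frac{dx}{dy} &= L , \qquad 0 \le y \le \frac{1}{L} , \\
  \int_0^{1/L} P(y)\, dy &= L \cdot \frac{1}{L} = 1 .
\end{align*}
```

The density is rescaled by the factor $L$ while the interval shrinks by the same factor, so the normalization is preserved.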
VIII. BINOMIAL DISTRIBUTION
see 114−BinPoiGa.nb
Consider a "loaded" coin with unequal probabilities of heads and tails, $p$ and $q = 1 - p$. Then the probability to get $m$ heads in $n$ tosses is
\[ P^{bin}_m = C^m_n\, p^m q^{n-m} , \qquad C^m_n = \frac{n!}{m!\,(n-m)!} \qquad (25) \]
see Fig. 4.
HW: (a) verify normalization
\[ \sum_{m=0}^{n} P^{bin}_m = 1 \]
(b) show that the following is true:
\[ \bar{m} \equiv \sum_{m=0}^{n} m P^{bin}_m = pn \qquad (26) \]
\[ \sigma^2 = np(1-p) \qquad (27) \]
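Before proving Eqs. (26) and (27) in general, it can be useful to check them for one particular case. The sketch below does this with exact arithmetic for n = 20 and p = 3/10 (values chosen here for illustration); the function name pbin is an assumption of this example. It is a numerical check, not a substitute for the proof asked for in the homework.

```mathematica
(* Binomial probability, Eq. (25); "pbin" is a name chosen for this sketch *)
pbin[m_, n_, p_] := Binomial[n, m] p^m (1 - p)^(n - m)

With[{n = 20, p = 3/10},
 {Sum[pbin[m, n, p], {m, 0, n}],               (* normalization *)
  Sum[m pbin[m, n, p], {m, 0, n}],             (* mean, compare with n p *)
  Sum[(m - n p)^2 pbin[m, n, p], {m, 0, n}]}]  (* variance, compare with n p (1 - p) *)
(* result: {1, 6, 21/5} *)
```

Here n p = 6 and n p (1 − p) = 21/5, in agreement with Eqs. (26) and (27).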
FIG. 5: Normal probability density (red) and cumulative distribution (blue).
HW: use Stirling formula to approximate the binomial coefficient if n, m and n−m are large
A. Normal distribution
see Fig. 5 and the end of 114−BinPoiGa.nb.
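For
\[ Z = \frac{x - \bar{x}}{\sigma} \]
\[ P(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-Z^2/2} \qquad (28) \]
and
\[ F(Z) \equiv \int_{-\infty}^{Z} P(u)\, du = \frac{1}{2} \left[ 1 + \mathrm{erf}(Z/\sqrt{2}) \right] = \frac{1}{2}\, \mathrm{erfc}(-Z/\sqrt{2}) \qquad (29) \]
with
\[ \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-x^2} dx \equiv 1 - \mathrm{erfc}(z) \qquad (30) \]
\[ F(1) - F(-1) \approx 68\% \qquad (31) \]
\[ F(2) - F(-2) \approx 95\% \qquad (32) \]
HW: Write the distribution in terms of $x$; find $\langle x \rangle$ and $\left\langle x^2 \right\rangle$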
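The percentages in Eqs. (31) and (32) follow directly from Eq. (29), since F(Z) − F(−Z) = erf(Z/√2). A minimal check, assuming only the built-in Erf and the built-in NormalDistribution (the function name f is chosen here):

```mathematica
(* Cumulative distribution of Eq. (29); "f" is a name chosen for this sketch *)
f[z_] := (1 + Erf[z/Sqrt[2]])/2

N[{f[1] - f[-1], f[2] - f[-2]}]   (* approximately 0.6827 and 0.9545 *)

(* The same numbers from the built-in normal distribution *)
N[{CDF[NormalDistribution[0, 1], 1] - CDF[NormalDistribution[0, 1], -1],
   CDF[NormalDistribution[0, 1], 2] - CDF[NormalDistribution[0, 1], -2]}]
```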
FIG. 6: Poisson distribution $P_m$ (black) for $\bar{m} = 10$ and binomial distributions $P^{bin}_m$ (red) with different $n$ and $p = \bar{m}/n$. From top to bottom: n = 20, n = 100 and n = 400 (which practically blends with the Poisson curve).
IX. LIMITS OF THE BINOMIAL DISTRIBUTION
A. Poisson
(note: typos in eq. (2.9) in textbook).
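Consider $n \to \infty$, $p \to 0$ with fixed $\bar{m} = pn$. Then,
\[ P^{bin}_m \approx \frac{n!}{(n-m)!\, n^m}\, \frac{\bar{m}^m}{m!} \left( 1 - \frac{\bar{m}}{n} \right)^n = \frac{\bar{m}^m}{m!}\, e^{-\bar{m}} \qquad (33) \]
which is the Poisson distribution, $P_m$. See Fig. 6 and 114−BinPoiGa.nb.
HW: (a) verify normalization
\[ \sum_{m=0}^{\infty} P_m = 1 \]
(b) verify
\[ \bar{m} \equiv \sum_{m=0}^{\infty} m P_m \]
(c) find
\[ \left\langle m^2 \right\rangle \equiv \sum_{m=0}^{\infty} m^2 P_m \]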
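The approach of the binomial distribution to the Poisson limit, Eq. (33), can be quantified numerically. The sketch below uses the same parameters as Fig. 6 (mean 10; n = 20, 100, 400) and reports the largest pointwise difference for 0 ≤ m ≤ 20; the names mbar, pois and bin are assumptions of this example.

```mathematica
mbar = 10;
pois[m_] := mbar^m Exp[-mbar]/m!                                (* Poisson, Eq. (33) *)
bin[m_, n_] := Binomial[n, m] (mbar/n)^m (1 - mbar/n)^(n - m)   (* binomial with p = mbar/n *)

(* Largest |binomial - Poisson| over m = 0..20, for increasing n *)
Table[{n, N[Max[Table[Abs[bin[m, n] - pois[m]], {m, 0, 20}]]]}, {n, {20, 100, 400}}]
```

The maximum difference decreases as n grows, which is the numerical counterpart of the curves blending together in Fig. 6.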
1. Advanced: sum of two Poisson distributions
see Fig. 7.
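Consider $P_1(m)$ with $\bar{m} = \mu_1$ and $P_2(m)$ with $\bar{m} = \mu_2$. Then
\[ P_{1+2}(n) = \sum_{m=0}^{n} P_1(m) P_2(n-m) = \qquad (34) \]
\[ = \sum_{m=0}^{n} \frac{\mu_1^m}{m!}\, e^{-\mu_1}\, \frac{\mu_2^{n-m}}{(n-m)!}\, e^{-\mu_2} = \qquad (35) \]
\[ = \frac{(\mu_1 + \mu_2)^n}{n!}\, e^{-(\mu_1 + \mu_2)} \sum_{m=0}^{n} C^m_n \left( \frac{\mu_1}{\mu_1 + \mu_2} \right)^m \left( \frac{\mu_2}{\mu_1 + \mu_2} \right)^{n-m} = \qquad (36) \]
\[ = \frac{(\mu_1 + \mu_2)^n}{n!}\, e^{-(\mu_1 + \mu_2)} \qquad (37) \]
since the last sum evaluates to 1.
[Figure: three histograms of counts versus m, for 0 ≤ m ≤ 35.]
FIG. 7: The sum of two Poisson distributions with $\bar{m}_1 = 5$ (orange) and $\bar{m}_2 = 10$ (white) results in another Poisson distribution (blue) with $\bar{m} = \bar{m}_1 + \bar{m}_2$. ("Experiment" with $10^6$ random numbers in each distribution).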
B. Normal
Alternatively, let $m$ be close to the average $n/2$ for $p = q = 1/2$. We will use
\[ x = 2m - n \]
and further switch to the scaled variable
\[ y = x/\sqrt{n} \sim 1 \]
(with this the distribution is multiplied by $\sqrt{n}/2$ to ensure normalization). This leads to a Gaussian. Major steps:
• use the Stirling approximation
\[ n! \simeq \sqrt{2\pi n}\, (n/e)^n , \qquad n \gg 1 \qquad (38) \]
for $n!$, $m!$ and $(n-m)!$
• replace $m$ by $(n + y\sqrt{n})/2$ (and multiply by $\sqrt{n}/2$ to ensure normalization).
• take the limit $n \to \infty$
• the result is
\[ P^{gauss}(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2} \qquad (39) \]
See the end of 114−BinPoiGa.nb; also see this notebook for some Advanced topics: Gauss from Poisson for $\bar{m} \gg 1$, cumulative distribution (CDF) and Addition Theorem for Poisson, etc.
HW: Homework for Ch.2, pp. 34,35 (part of Exam 1):
2(d), 3-5, 11-18
C. Advanced: Central limit theorem
Why are normal distributions so typical?
Let
\[ y_n = X_1 + \ldots + X_n \]
with
\[ \mu = \langle X_k \rangle , \qquad \sigma^2 = \left\langle (X_k - \mu)^2 \right\rangle \]
for any $k$. Then, for $n \to \infty$ the distribution for $y_n$ is asymptotically normal with
\[ \mu_y = n\mu , \qquad \sigma_y^2 = n\sigma^2 \qquad (40) \]
FIG. 8: Illustration of the central limit theorem. A sum of a large number (100) of independent random variables has a gaussian distribution, while each individual variable can have an arbitrary non-gaussian distribution. In the example the individual random variables were uniformly distributed with an average $\mu = 1/2$ and variance $\sigma^2 = 1/4$. Points (blue) correspond to an experimental histogram after 10000 runs; the line (red) is the normal distribution with average $\mu_{100} = 100\mu - 1/2$ and standard deviation $\sigma_{100} = \sigma\sqrt{100}$ (the $-1/2$ is due to introducing a discrete histogram).
Note that we do not need a gaussian $X$, only large $n$(!)
For a "binary" distribution $X = \{0, 1\}$ we could see it above (in that case $y_n$ is binomial and is known exactly). For a different $X$ consider a uniform distribution
\[ p(X) = \frac{1}{\sqrt{3}} , \qquad 1/2 - \sqrt{3}/2 \le X \le 1/2 + \sqrt{3}/2 \]
HW: find $\mu$ and $\sigma^2$
see Fig. 8.
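The statement of the central limit theorem can be tested by direct simulation. The sketch below is an assumption-laden illustration, not the code used for Fig. 8: it sums 100 uniform variables on the interval given above, repeats this 10000 times, and compares the histogram with the normal density whose parameters follow from Eq. (40) and the values of µ and σ² quoted in the caption of Fig. 8.

```mathematica
SeedRandom[8];
a = N[1/2 - Sqrt[3]/2]; b = N[1/2 + Sqrt[3]/2];   (* limits of the uniform distribution *)

(* 10000 runs, each a sum of 100 independent uniform variables *)
sums = Total /@ RandomReal[{a, b}, {10000, 100}];

(* Eq. (40) with mu = 1/2, sigma^2 = 1/4, n = 100 predicts mean 50 and standard deviation 5 *)
{Mean[sums], StandardDeviation[sums]}

Show[Histogram[sums, Automatic, "PDF"],
 Plot[PDF[NormalDistribution[50, 5], y], {y, 30, 70}, PlotStyle -> Red]]
```

Replacing the uniform distribution by any other distribution with finite variance should give the same qualitative picture.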
D. Physical example: Fluctuations of density in ideal gas
see Fig. 9
FIG. 9: 500 non-interacting particles randomly distributed in a large (blue) box. The probability to find exactly $m$ particles in a selected red box is given (exactly) by a binomial distribution with n = 500 and p = 1/50 (so that $\bar{m} = np = 10$). Once the number of red boxes is large, this probability becomes Poissonian. If, in addition, the number of molecules in a red box is still large, the probability becomes Gaussian with the same $\bar{m}$ and with $\sigma^2 = \bar{m}$.
X. OTHER CONTINUOUS DISTRIBUTIONS
1. Exponential
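\[ p(x) = \frac{1}{\mu} \exp\left( -\frac{x}{\mu} \right) \qquad (41) \]
with
\[ \bar{x} = \mu . \]
HW: (a) calculate $\left\langle x^2 \right\rangle$; (b) find $F(x)$
2. Gamma and χ2
\[ f(x) = \lambda \exp(-\lambda x)\, (\lambda x)^{t-1} / \Gamma(t) \qquad (42) \]
With $\lambda = 1/2$ and $t = n/2$ (integer $n$) this is the $\chi^2$ distribution.
\[ F(x) = \qquad (43) \]
HW: (a) calculate $\langle x \rangle$; (b) find $\sigma^2$; (c) find the above $F(x)$ using Mathematica
A. Lorentz (Cauchy)
\[ p(x) = \frac{1}{\pi}\, \frac{\gamma}{\gamma^2 + (x - \mu)^2} \qquad (44) \]
Note: the mean is undefined (error in textbook).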
XI. ADVANCED: EXPECTATION AND MOMENTS, ADDITION FOR GAUSS
\[ E(g) \equiv \langle g(X) \rangle = \sum_i g(x_i)\, p(x_i) \to \int_{-\infty}^{\infty} g(x)\, p(x)\, dx \qquad (45) \]
kth moment:
\[ \mu_k = E\left( X^k \right) , \qquad \mu_1 \equiv \mu \qquad (46) \]
kth central moment:
\[ M_k = E\left( (X - \mu)^k \right) , \qquad M_2 \equiv \sigma^2 \qquad (47) \]
HW: Show that $\mu_0 = M_0 = 1$
Dimensionless central moments:
\[ \gamma_k = M_k/\sigma^k \qquad (48) \]
with $\gamma_3$ the "skewness" and $\gamma_4$ the "kurtosis".
HW: show that for the normal distribution $\gamma_3 = 0$ and $\gamma_4 = 3$ (for which reason $\gamma_4 - 3$ is called "excess kurtosis")
A. Adding two normal distributions (ND)
Let $p_1(x)$, $p_2(x)$ be ND's with means $\mu_1$, $\mu_2$ and SD's $\sigma_1$, $\sigma_2$. Then
\[ P_{1+2}(x) = \int_{-\infty}^{\infty} dy\, p_1(y)\, p_2(x - y) \qquad (49) \]
is also a ND with
\[ \mu = \mu_1 + \mu_2 \qquad (50) \]
\[ \sigma^2 = \sigma_1^2 + \sigma_2^2 \qquad (51) \]
(Proof in class). See the next figure with $\mu_{1,2} = 3$ and 7, and $\sigma_{1,2} = 3$ and 4 (green and white). Solid line: ND with $\mu = 10$ and $\sigma = 5$ (histograms were obtained from "experiment", as before).
[Figure: histograms of counts versus x, for −10 ≤ x ≤ 30.]
XII. DISTRIBUTION OF SEVERAL VARIABLES
READING: these notes
Probability density
\[ p(x, y) \]
which satisfies all axioms of probability (in fact, for 2 variables Venn diagrams are the most instructive).
1. Multivariate Gaussian distribution
With
\[ \vec{r} = (x, y, \ldots) \]
\[ p(\vec{r}) = C \exp\left( -\frac{1}{2}\, \vec{r} \cdot A \cdot \vec{r} \right) , \qquad C = \frac{|\det A|^{1/2}}{(2\pi)^{d/2}} \qquad (52) \]
where $d$ is the dimension of $\vec{r}$ (2 in our case). [Proof in class].
(3D plot removed)
2. Covariance and correlation
\[ \mathrm{Cov}[X, Y] = \sigma^2_{xy} = \langle (X - \mu_x)(Y - \mu_y) \rangle \qquad (53) \]
with
\[ \mathrm{Cov}[X, X] = \sigma_x^2 \qquad (54) \]
Correlation:
\[ \mathrm{Corr}[X, Y] = \frac{\mathrm{Cov}[X, Y]}{\sigma_x \sigma_y} = \frac{\sigma^2_{xy}}{\sigma_x \sigma_y} \qquad (55) \]
and
\[ \mathrm{Corr}[X, X] = \mathrm{Corr}[Y, Y] = 1 \]
(thus, one can introduce a correlation matrix, $V$.)
3. Statistical analog of covariance and correlation
Covariance:
\[ s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \qquad (56) \]
The rest is similar. Note: a good random number generator should give practically uncorrelated results; see 114−2dGa.nb
4. Independent random variables
\[ p(x, y) = p_1(x)\, p_2(y) \]
or
\[ F(x, y) = F_1(x)\, F_2(y) \]
Then
\[ E(XY) = E(X)\, E(Y) \]
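The remark about random number generators can be checked with a few lines. The sketch below (not taken from 114−2dGa.nb; variable names are chosen here) computes the sample covariance of Eq. (56) by hand for two independent uniform samples and compares it with the built-in Covariance and Correlation.

```mathematica
SeedRandom[2];
n = 10000;
x = RandomReal[1, n];
y = RandomReal[1, n];   (* generated independently of x *)

(* Sample covariance, Eq. (56), written out explicitly *)
sxy = Total[(x - Mean[x]) (y - Mean[y])]/(n - 1);

(* The hand-made value, the built-in value, and the correlation coefficient *)
{sxy, Covariance[x, y], Correlation[x, y]}
```

For independent samples the correlation should be small, of the order of 1/√n, but not exactly zero.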
[Figure: six scatter plots. Upper row: x between 0 and 1. Lower row: x between −3 and 3, with covariance matrices {{1, 0}, {0, 4}} (determinant 4), {{1, 1}, {1, 4}}, and {{1, 1.95}, {1.95, 4}} (determinant 0.1975).]
FIG. 10: Change of pattern from non-correlated data (left) to strongly correlated data (right). Upper row: uniform distribution in x. Lower: from ND with indicated covariance matrix (and determinant).
XIII. MEASUREMENTS AND PROPAGATION OF ERRORS
READING: Ch.3+these notes
HW: pp. 49,50: 1,2,5,9-11
A. The coin experiment
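[Figure: two plots versus the number of tosses, up to 100 000.]
1. Fair
heads: +1, tails: −1; $\xi$ is random ("noise"):
\[ \xi = \pm 1 , \qquad \bar{\xi} = 0 , \qquad \langle \xi^2 \rangle = \frac{1}{2}\, 1^2 + \frac{1}{2}\, (-1)^2 = 1 \]
\[ x_n = x_{n-1} + \xi \qquad (57) \]
\[ x_n^2 = x_{n-1}^2 + \xi^2 + 2\xi x_{n-1} \qquad (58) \]
Since $\xi$ is independent of $x_{n-1}$ and $\bar{\xi} = 0$, the cross term averages to zero:
\[ \langle x_n^2 \rangle = \langle x_{n-1}^2 \rangle + \langle \xi^2 \rangle = \langle x_{n-1}^2 \rangle + 1 \qquad (59) \]
\[ \langle x_n^2 \rangle = n \qquad (60) \]
or
\[ x_{RMS} = \sqrt{n} \]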
2. Unfair
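$p$ is the probability to get heads
\[ \xi = \pm 1 , \qquad \bar{\xi} = \mu = 2p - 1 , \qquad \langle \xi^2 \rangle = p \cdot 1^2 + (1-p)(-1)^2 = 1 = \sigma^2 + \mu^2 \]
with $\sigma^2 = 1 - \mu^2 \approx 1$ for small $\mu$.
\[ x_n = x_{n-1} + \xi \qquad (61) \]
\[ \langle x_n \rangle = \langle x_{n-1} \rangle + \mu = \mu n \qquad (62) \]
\[ x_n^2 = x_{n-1}^2 + \xi^2 + 2\xi x_{n-1} \qquad (63) \]
\[ \langle x_n^2 \rangle = \langle x_{n-1}^2 \rangle + \langle \xi^2 \rangle + 2 \langle x_{n-1} \rangle \mu = \langle x_{n-1}^2 \rangle + 1 + 2(n-1)\mu^2 \qquad (64) \]
\[ \sigma_n^2 = \langle x_n^2 \rangle - \mu^2 n^2 = \langle x_{n-1}^2 \rangle - \langle x_{n-1} \rangle^2 + \sigma^2 = \sigma_{n-1}^2 + \sigma^2 \]
or
\[ \sigma_n = \sigma\sqrt{n} , \qquad \mu_n = \mu n \]
Thus, for any $\mu \ne 0$ and large enough $n$ the "signal-to-noise ratio" becomes large,
\[ \mu_n/\sigma_n \sim \sqrt{n} \gg 1 \]
which allows $\mu$ to be measured reliably.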
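Both coin results can be reproduced by simulating many independent walks. In the sketch below the number of walks (2000), the number of steps (1000) and the bias p = 0.55 are arbitrary values assumed for illustration; the variable names are likewise chosen here.

```mathematica
SeedRandom[3];
nSteps = 1000; nWalks = 2000;

(* Fair coin: each row is one walk x_1, ..., x_n built from steps of +1 or -1 *)
fair = Accumulate /@ RandomChoice[{-1, 1}, {nWalks, nSteps}];
N[Sqrt[Mean[fair[[All, -1]]^2]]]      (* x_RMS, expected near Sqrt[1000], about 31.6 *)

(* Unfair coin: heads (+1) with probability p, so mu = 2 p - 1 and sigma^2 = 1 - mu^2 *)
p = 0.55; mu = 2 p - 1;
unfair = Accumulate /@ RandomChoice[{1 - p, p} -> {-1, 1}, {nWalks, nSteps}];
final = unfair[[All, -1]];
{Mean[final], mu nSteps}                                   (* mean versus mu n *)
{StandardDeviation[final], Sqrt[(1 - mu^2) nSteps]}        (* spread versus sigma Sqrt[n] *)
```

With these values the signal µn = 100 is about three times the noise σ√n ≈ 31, so the bias is clearly visible after 1000 tosses.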
B. Propagation of small errors
1. x = f(u)
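\[ \mathrm{SNR}: \quad \frac{\bar{u}}{\sigma_u} \gg 1 \]
\[ x = f(u) , \qquad u = \bar{u} + \xi , \qquad \bar{\xi} = 0 , \qquad \langle \xi^2 \rangle = \sigma_u^2 \qquad (65) \]
\[ x \simeq f(\bar{u}) + f'_u \xi + \frac{1}{2} f''_{uu} \xi^2 + \ldots \qquad (66) \]
\[ \bar{x} = f(\bar{u}) + \frac{1}{2} f''_{uu} \sigma_u^2 \qquad (67) \]
Omitting terms linear in $\xi$, which average to zero,
\[ x^2 = f^2 + f'^2_u \xi^2 + f f''_{uu} \xi^2 + \ldots \qquad (68) \]
\[ \langle x^2 \rangle = f^2 + \left( f'^2_u + f f''_{uu} \right) \sigma_u^2 \qquad (69) \]
\[ \bar{x}^2 = f^2 + f f''_{uu} \sigma_u^2 + \ldots \]
\[ \langle x^2 \rangle - \bar{x}^2 = f'^2_u \sigma_u^2 \qquad (70) \]
or
\[ \sigma_x = |f'_u|\, \sigma_u \qquad (71) \]
2. x = f(u, v)
\[ u = \bar{u} + \xi_u , \qquad v = \bar{v} + \xi_v \]
\[ \langle \xi_u^2 \rangle = \sigma_u^2 , \qquad \langle \xi_v^2 \rangle = \sigma_v^2 , \qquad \langle \xi_u \xi_v \rangle = \sigma^2_{uv} \]
\[ x = f(\bar{u}, \bar{v}) + \xi_u f'_u + \xi_v f'_v + \frac{1}{2} \xi_u^2 f''_{uu} + \frac{1}{2} \xi_v^2 f''_{vv} + \xi_u \xi_v f''_{uv} \qquad (72) \]
\[ \bar{x} = f(\bar{u}, \bar{v}) + \frac{1}{2} \sigma_u^2 f''_{uu} + \frac{1}{2} \sigma_v^2 f''_{vv} + \sigma^2_{uv} f''_{uv} \approx f(\bar{u}, \bar{v}) \qquad (73) \]
and
\[ \sigma_x^2 = \sigma_u^2 f'^2_u + \sigma_v^2 f'^2_v + 2 \sigma^2_{uv} f'_u f'_v \qquad (74) \]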
C. Designing an experiment: Example
\[ T \simeq 2\pi \sqrt{L/g} \]
\[ g = 4\pi^2 \frac{L}{T^2} \]
\[ \sigma_g^2 = g^2 \left( \frac{\sigma_L^2}{L^2} + 4\, \frac{\sigma_T^2}{T^2} \right) \]
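The pendulum formula is a special case of Eq. (74) with uncorrelated errors in L and T. The sketch below lets Mathematica take the derivatives and compares the result with the closed form. The measured values (L = 1 m, T = 2.007 s) and their errors are hypothetical numbers assumed for illustration, as are the function names g and sigmaG.

```mathematica
(* g as a function of pendulum length and period *)
g[len_, per_] := 4 Pi^2 len/per^2

(* Propagated error from Eq. (74), assuming uncorrelated errors in length and period *)
sigmaG[len_, per_, sLen_, sPer_] := Module[{l, t},
  Sqrt[(D[g[l, t], l] sLen)^2 + (D[g[l, t], t] sPer)^2] /. {l -> len, t -> per}]

(* Hypothetical measurement: L = 1.000 +- 0.001 m, T = 2.007 +- 0.005 s *)
{g[1.000, 2.007], sigmaG[1.000, 2.007, 0.001, 0.005]}

(* Closed form for comparison: g Sqrt[(sL/L)^2 + 4 (sT/T)^2] *)
g[1.000, 2.007] Sqrt[(0.001/1.000)^2 + 4 (0.005/2.007)^2]
```

Because of the factor 4, the relative error in T counts twice as much as the relative error in L, which is the main consideration when designing the experiment.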
XIV. ESTIMATORS, χ2 AND KOLMOGOROV-SMIRNOV TESTS
READING: Ch.4 + a bit of Ch.11 + these notes
HW: 4.5, 4.6, 4.9; apply the χ2-test to your data on radioactive decay
A. Maximum likelihood
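\[ P(x_i) = p(x_i ; \alpha)\, \Delta x \]
\[ L = P(x_1) P(x_2) \ldots P(x_n) = \max \qquad (75) \]
\[ dL/d\alpha = 0 \qquad (76) \]
Let, e.g.
\[ p(x_i) \sim \exp\left[ -\frac{(x_i - \mu)^2}{2\sigma^2} \right] \]
with fixed $\sigma$ and adjustable $\mu$. One has
\[ L \sim \exp\left[ -\sum_i \frac{(x_i - \mu)^2}{2\sigma^2} \right] \qquad (77) \]
\[ 0 = d \ln L/d\mu = \sum_i \frac{(x_i - \mu)}{\sigma^2} \qquad (78) \]
\[ \mu = \frac{1}{n} \sum_i x_i \qquad (79) \]
as expected.
Error in $\mu$:
\[ \sigma_\mu^2 = \sigma^2 (d\mu/dx_1)^2 + \sigma^2 (d\mu/dx_2)^2 + \ldots = \frac{1}{n} \sigma^2 \qquad (80) \]
\[ \sigma_\mu = \sigma/\sqrt{n} \qquad (81) \]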
1. Weighted average
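Let $\sigma \to \sigma_i$:
\[ p(x_i) \sim \exp\left[ -\frac{(x_i - \mu)^2}{2\sigma_i^2} \right] \]
with a fixed set of $\sigma_i$ and adjustable $\mu$. One has
\[ L \sim \exp\left[ -\sum_i \frac{(x_i - \mu)^2}{2\sigma_i^2} \right] \qquad (82) \]
\[ 0 = d \ln L/d\mu = \sum_i \frac{(x_i - \mu)}{\sigma_i^2} \qquad (83) \]
\[ \mu = \frac{1}{\sum_i (1/\sigma_i^2)} \sum_k \frac{1}{\sigma_k^2}\, x_k \qquad (84) \]
Note that the best measurements with $\sigma_i \to 0$ win!
Error in $\mu$:
\[ \sigma_\mu^2 = \sigma_1^2 (d\mu/dx_1)^2 + \sigma_2^2 (d\mu/dx_2)^2 + \ldots = \frac{1}{\left[ \sum_i (1/\sigma_i^2) \right]^2} \sum_k \frac{1}{\sigma_k^2} = \qquad (85) \]
\[ = \frac{1}{\sum_i (1/\sigma_i^2)} \qquad (86) \]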
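Equations (84) and (86) translate into a few lines of Mathematica. The three measurements and their errors below are hypothetical numbers assumed for illustration; the variable names are chosen here.

```mathematica
(* Hypothetical measurements of the same quantity with different errors *)
x = {10.2, 9.8, 10.5};
s = {0.1, 0.2, 0.5};

w = 1/s^2;                      (* weights 1/sigma_i^2 *)
mu = Total[w x]/Total[w];       (* weighted average, Eq. (84) *)
sigmaMu = 1/Sqrt[Total[w]];     (* its error, Eq. (86) *)
{mu, sigmaMu}

(* For comparison: the unweighted mean, which ignores the quality of each measurement *)
Mean[x]
```

The weighted average stays close to the most precise measurement, and its error is smaller than the smallest individual error.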
B. χ2
1. Where does it come from?
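Let $\mu = 0$, $\sigma = 1$ and
\[ p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \]
Introduce
\[ y_n = \sum_i x_i^2 \qquad (87) \]
What is $P(y_n)$? E.g., $n = 1$, $y_1 = x^2$, $x = \pm\sqrt{y_1}$
\[ P(y_1) = p[x(y_1)]\, (dx/dy_1) \cdot 2 \sim e^{-y_1/2}/\sqrt{y_1} \qquad (88) \]
etc.
General ($\chi^2 = x$ in the PDF):
\[ \mathrm{PDF}: \quad \frac{2^{-n/2}\, e^{-x/2}\, x^{\frac{n}{2} - 1}}{\Gamma\left( \frac{n}{2} \right)} \qquad (89) \]
\[ \mathrm{CDF}: \quad \frac{1}{\Gamma(n/2)}\, \Gamma\left( \frac{n}{2}, \frac{\chi^2}{2} \right) \qquad (90) \]
(with the incomplete gamma function as defined in Eq. (10), this is the probability of exceeding the given $\chi^2$, i.e. 1 minus the usual cumulative distribution).
\[ x_{mode} = n - 2 \]
For $n \gg 1$
\[ \mathrm{PDF} \to \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2} , \qquad y = \frac{x - x_{mode}}{\sqrt{2 x_{mode}}} \qquad (91) \]
[Figure: four histograms of counts for 0 ≤ x ≤ 10, with panels n = 1, n = 2, n = 3 and n = 4.]
FIG. 11: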
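The origin of the χ2 distribution in Eq. (87) can be illustrated by simulation: sum the squares of n standard normal numbers many times and compare the histogram with Eq. (89). The sketch below assumes n = 3 and 20000 repetitions (arbitrary choices) and uses the built-in ChiSquareDistribution for the theoretical curve.

```mathematica
SeedRandom[4];
n = 3;

(* 20000 values of y_n = x_1^2 + ... + x_n^2 with x_i drawn from the standard normal *)
y = Total[#^2] & /@ RandomVariate[NormalDistribution[0, 1], {20000, n}];

(* PDF of Eq. (89) written out explicitly; "chi2pdf" is a name chosen for this sketch *)
chi2pdf[x_] := 2^(-n/2) Exp[-x/2] x^(n/2 - 1)/Gamma[n/2]

Show[Histogram[y, {0, 10, 0.25}, "PDF"],
 Plot[{chi2pdf[x], PDF[ChiSquareDistribution[n], x]}, {x, 0.01, 10}]]

(* The sample mean should be close to n *)
{Mean[y], n}
```

The two theoretical curves coincide, confirming that Eq. (89) is the density of the built-in χ2 distribution with n degrees of freedom.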
C. Comparing two distributions
1. χ2-test for data vs theoretical
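Let data $X$ be tested against a theory with known PDF $p(x)$ and CDF $F(x)$. Let the data be grouped in $N$ bins, with $1 \le i \le N$ being the number of the bin, and let $R_i$ and $S_i$ be the number of events in the corresponding bin for data and theory, respectively. With $N_{tot}$ denoting the total number of data points (not to be confused with the number of bins $N$),
\[ S_i = N_{tot} \int_{x_i}^{x_{i+1}} dx\, p(x) = N_{tot} \left( F(x_{i+1}) - F(x_i) \right) \]
Then,
\[ \chi^2 = \sum_i \frac{(R_i - S_i)^2}{S_i} \qquad (92) \]
Evaluate
\[ \frac{1}{\Gamma(N/2)}\, \Gamma\left( \frac{N}{2}, \frac{\chi^2}{2} \right) \qquad (93) \]
If this is close to 1 the two distributions are close (see chi2.nb).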
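As an illustration of Eqs. (92) and (93), the sketch below tests 600 simulated rolls of a die against the uniform expectation of 100 events per face. This example is an assumption made here and is not taken from chi2.nb; following the text, the number of bins (6) is used in Eq. (93).

```mathematica
SeedRandom[5];
rolls = RandomInteger[{1, 6}, 600];

r = Count[rolls, #] & /@ Range[6];   (* observed counts R_i in the 6 bins *)
s = ConstantArray[100, 6];           (* expected counts S_i = 600 * (1/6) *)

chi2 = N[Total[(r - s)^2/s]];        (* Eq. (92) *)
nBins = 6;
q = Gamma[nBins/2, chi2/2]/Gamma[nBins/2]   (* Eq. (93) *)

(* The same number from the built-in regularized incomplete gamma function *)
GammaRegularized[nBins/2, chi2/2]
```

A value of q that is not small indicates that the data are compatible with the fair-die hypothesis; a very small q would signal a disagreement.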
2. χ2-test for data1 vs data2
Let $X$ and $Y$ be grouped in $N$ bins each, with $1 \le i \le N$ being the number of the bin, and let $R_i$ and $S_i$ be the number of events in the corresponding bin for $X$ and $Y$, respectively. Then,
\[ \chi^2 = \sum_i \frac{(R_i - S_i)^2}{R_i + S_i} \qquad (94) \]
Evaluate
\[ \frac{1}{\Gamma(N/2)}\, \Gamma\left( \frac{N}{2}, \frac{\chi^2}{2} \right) \qquad (95) \]
If this is close to 1 the two distributions are close. Note:
• the total number of events for $X$ and $Y$ is not required to be the same (otherwise $N \to N - 1$).
• bins should be filled up; 1-2 empty bins can be ok, but the test will not work if a bin is empty for both $X$ and $Y$.
• works well if the number of bins is large
• can be used to compare with a known distribution, with $S_i$ in the numerator replaced by a known $n_i$ and the entire denominator replaced by the same $n_i$
D. Kolmogorov-Smirnov
\[ Q_{ks}(\lambda) = 2 \sum_{i=1}^{\infty} (-1)^{i-1} \exp\left( -2 i^2 \lambda^2 \right) = 1 - \theta_4\left( 0, e^{-2\lambda^2} \right) \qquad (96) \]
[Figure: $Q_{ks}(\lambda)$ versus $\lambda$ for 0 ≤ λ ≤ 2.]
FIG. 12:
[Figure: cumulative distributions F(x) for −3 ≤ x ≤ 3, with d = 0.064 indicated.]
FIG. 13:
1. Kolmogorov-Smirnov test
Consider two unbinned distributions with $N$ points in each.
• Construct the cumulative distributions $S_1(x)$ and $S_2(x)$
• find the maximum distance
\[ D = \max_{-\infty < x < \infty} |S_1(x) - S_2(x)| \]
• Evaluate
\[ Q_{ks}\left( \sqrt{N/2}\, D \right) \]
• If the number is close to 1 the distributions are similar.
Note:
• each distribution must be one-dimensional
• can be used with different numbers of points, with $N/2 \to N_1 N_2/(N_1 + N_2)$
• can be used for comparison with a known distribution, with $N/2 \to N$
[Figure: two panels of cumulative distributions, 20 points (left) and 200 points (right).]
FIG. 14: Cumulative distributions used in the Kolmogorov-Smirnov test for comparing two distributions. Each set of data was generated using the standard uniform RNG (so that the distributions are expected to be identical if the number of points is large). Left: 20 points in each distribution; right: 200 points. The KS test identifies the distributions with each other with confidence 0.33 for the 1st case and confidence 0.987 for the 2nd.
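The steps of the Kolmogorov-Smirnov test can be sketched as follows. This is an illustration under stated assumptions: two samples of 200 uniform random numbers (as in the right panel of Fig. 14, but not the same data), the series of Eq. (96) truncated at 100 terms, and helper names qks and cdf chosen here.

```mathematica
(* Eq. (96), with the infinite sum truncated at 100 terms (adequate unless lam is very small) *)
qks[lam_] := 2 Sum[(-1)^(i - 1) Exp[-2 i^2 lam^2], {i, 1, 100}]

SeedRandom[6];
n = 200;
a = RandomReal[1, n];
b = RandomReal[1, n];

(* Empirical cumulative distribution: fraction of points not exceeding x *)
cdf[data_, x_] := Count[data, u_ /; u <= x]/Length[data]

(* Both cumulative distributions are step functions that change only at data points,
   so the maximum distance D can be found by scanning the data points *)
d = Max[Abs[cdf[a, #] - cdf[b, #]] & /@ Join[a, b]];

{N[d], N[qks[Sqrt[n/2] d]]}
```

The resulting number varies noticeably from seed to seed, which is worth keeping in mind when interpreting a single value of the test.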
XV. MONTE CARLO INTEGRATION
READING: Ch.5 + these notes + 114−MonteCarlo.pdf
HW: 5.6, 7, 9. Using MC, find the area of a triangle with vertices (-1,0), (1,0), (0,1).
A. Buffon’s needle
FIG. 15: Buffon’s needle. Chance to cross a line is 1/π.
Mario Lazzarini, an Italian mathematician, performed the Buffon's needle experiment in 1901. Tossing a needle 3408 times, he attained the well-known estimate 355/113 for $\pi$, which is a very accurate value, differing by about $10^{-7}$. What was wrong? See "MonteCarlo.nb"
B. General MC and example
file MonteCarlo.nb
When to use?
• d ≥ 2
• complicated (”bad”) boundary
• more-or-less smooth integrand (no peaks in small areas)
• not too high accuracy is ok
Note, sometimes you may have a ”good” boundary but a ”bad” integrand.
If change of variables can reverse this, MC will work much better.
Ideas of MC - see Fig. 16. We want to find an area under the black arc
(semicircle in this case) and to locate its center of gravity. Steps:
• surround by a simple boundary (blue box)
• define functions ”ar” (area) and ”mom” (moment) with zero initial value
• generate N points inside the box randomly
• if a point falls under the arc, increase ”ar” and ”mom” accordingly
• calculate averages ar/N , mom/N
Error decays as 1/√N - not too fast, but algorithm is very simple.
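The steps above can be sketched for the unit semicircle. This is an illustration, not the code from MonteCarlo.nb; the bounding box −1 ≤ x ≤ 1, 0 ≤ y ≤ 1, the number of points, and the variable names are assumptions made here. Instead of incrementing counters inside a loop, the points under the arc are selected in one step, which is equivalent.

```mathematica
SeedRandom[7];
n = 100000;

(* n random points inside the bounding box -1 <= x <= 1, 0 <= y <= 1 (area 2) *)
pts = Transpose[{RandomReal[{-1, 1}, n], RandomReal[{0, 1}, n]}];

(* Keep the points that fall under the arc x^2 + y^2 = 1 *)
inside = Select[pts, #[[1]]^2 + #[[2]]^2 <= 1 &];

boxArea = 2;
area = N[boxArea Length[inside]/n]    (* expected near Pi/2, about 1.571 *)
yCenter = Mean[inside[[All, 2]]]      (* height of center of gravity, expected near 4/(3 Pi), about 0.424 *)
```

Repeating the run with n four times larger should reduce the typical error by about a factor of two, in line with the 1/√N decay.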
FIG. 16: Ideas of Monte Carlo integration
XVI. GENERATION OF RANDOM NUMBERS FOR DIFFERENT DISTRIBUTIONS
Usually, the standard RNG gives a uniform density
p(x) = 1 , 0 < x < 1
with the cumulative distribution being
x , 0 ≤ x ≤ 1
Then, for another distribution with a cumulative F (y) one has
y = F−1(x)
Example:
A. exponential distribution
file: 114−MonteCarlo.nb
Transformation method:
\[ P(y) = \lambda \exp(-\lambda y) \qquad (97) \]
\[ F(y) = 1 - \exp(-\lambda y) = x \qquad (98) \]
\[ y = -\frac{1}{\lambda} \ln(1 - x) \qquad (99) \]
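Equation (99) gives a one-line generator of exponentially distributed numbers from the standard uniform RNG. The sketch below assumes λ = 0.1 and 10000 points (values chosen here for illustration) and checks the result against the expected mean 1/λ and the density of Eq. (97).

```mathematica
SeedRandom[9];
lam = 0.1;

x = RandomReal[1, 10000];     (* uniform numbers from the standard RNG *)
y = -Log[1 - x]/lam;          (* transformation method, Eq. (99) *)

{Mean[y], 1/lam}              (* sample mean versus the expected value 1/lambda *)

Show[Histogram[y, {0, 50, 1}, "PDF"],
 Plot[lam Exp[-lam t], {t, 0, 50}, PlotStyle -> Red]]
```

Since 1 − x is uniformly distributed whenever x is, the formula y = −ln(x)/λ would work equally well.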
B. Poisson
The rejection method plus look-up table - see file: 114−MonteCarlo.nb
FIG. 17: ”Experimental” studies of exponential distribution. 10000 data were produced by a do-
it-yourself RNG obtained from a modified built-in RNG for uniform distributions (see text) and
grouped into different bins. From left to right: bin size 1, 5 and 0.2. In each case solid line is
exponential approximation obtained from non-linear fit.
C. Gauss
Transformation method and the Box-Muller algorithm. See file:
114−MonteCarlo.nb
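For reference, the basic form of the Box-Muller algorithm is sketched below. This is a generic textbook version written for illustration, and it is an assumption that the course notebook uses the same variant; the function name boxMuller is chosen here.

```mathematica
(* Box-Muller: two independent uniform numbers u1, u2 give one standard normal number *)
boxMuller[n_] := Module[{u1, u2},
  u1 = 1 - RandomReal[1, n];   (* shifted so that Log is never evaluated at exactly 0 *)
  u2 = RandomReal[1, n];
  Sqrt[-2 Log[u1]] Cos[2 Pi u2]]

SeedRandom[10];
z = boxMuller[10000];
{Mean[z], StandardDeviation[z]}   (* expected near 0 and 1 *)

Show[Histogram[z, Automatic, "PDF"],
 Plot[PDF[NormalDistribution[0, 1], t], {t, -4, 4}, PlotStyle -> Red]]
```

Using Sin in place of Cos gives a second normal number from the same pair u1, u2, independent of the first.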
XVII. LSA
READING: Ch.6 + notes
A. Fitting of data and geometric LSA
Fitting to a straight line:
\[ y = a + bx \]
with
\[ b = \frac{s_{xy}}{\sigma_x^2} , \qquad a = \bar{y} - b\bar{x} \qquad (100) \]
and $s_{xy}$ being the "sample covariance".
In Mathematica linear fit is achieved using the ”FindFit” command.
HW: create a list of 20 points of type y = ax + b+noise. Use ”FindFit” command to find a
linear fit; compare coefficients to a and b.
B. ”Physical” LSA
Notations:
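\[ X = (x_1 , x_2 , \ldots , x_n) , \qquad Y = (y_1 , \ldots , y_n) , \qquad S = (\sigma_1 , \ldots , \sigma_n) \qquad (101) \]
\[ \sigma^2 = 1 \Big/ \sum_{i=1}^{n} \sigma_i^{-2} , \qquad \langle f \rangle = \sigma^2 \sum_{i=1}^{n} \frac{f_i}{\sigma_i^2} \quad \text{for any vector } f \qquad (102) \]
\[ \langle X \rangle = \sigma^2 \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2} , \qquad \langle Y \rangle = \sigma^2 \sum_{i=1}^{n} \frac{y_i}{\sigma_i^2} \qquad (103) \]
\[ \langle XY \rangle = \sigma^2 \sum_{i=1}^{n} \frac{x_i y_i}{\sigma_i^2} , \qquad \langle X^2 \rangle = \sigma^2 \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2} \qquad (104) \]
Units:
\[ [\sigma] = [Y] = [\sigma_i] , \qquad [\langle f \rangle] = [f] \]
\[ y(x) = a + bx \qquad (105) \]
\[ \sum_{i=1}^{n} \frac{[y_i - y(x_i)]^2}{\sigma_i^2} = \min \qquad (106) \]
\[ \frac{\partial}{\partial a}: \quad \sum_{i=1}^{n} \frac{y_i - y(x_i)}{\sigma_i^2} = 0 \]
\[ \langle Y \rangle - a - b \langle X \rangle = 0 \qquad (107) \]
\[ \frac{\partial}{\partial b}: \quad \sum_{i=1}^{n} \frac{[y_i - y(x_i)]\, x_i}{\sigma_i^2} = 0 \]
\[ \langle XY \rangle - a \langle X \rangle - b \langle X^2 \rangle = 0 \qquad (108) \]
Thus,
\[ a = \frac{\langle X^2 \rangle \langle Y \rangle - \langle X \rangle \langle XY \rangle}{\langle X^2 \rangle - \langle X \rangle^2} \qquad (109) \]
\[ b = \frac{\langle XY \rangle - \langle X \rangle \langle Y \rangle}{\langle X^2 \rangle - \langle X \rangle^2} \qquad (110) \]
Poisson:
\[ \sigma_i^2 = y_i > 0 \qquad (111) \]
\[ \sigma^2 = 1 \Big/ \sum (1/y_i) \qquad (112) \]
\[ \langle Y \rangle = n\sigma^2 \qquad (113) \]
\[ \langle X \rangle = \sigma^2 \sum x_i/y_i \qquad (114) \]
\[ \langle XY \rangle = \sigma^2 \sum x_i \qquad (115) \]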
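Equations (102)-(110) can be coded almost literally. The sketch below uses five hypothetical data points with individual errors (numbers invented for illustration) and cross-checks the result against the built-in LinearModelFit with weights 1/σ_i². The variable and function names (xs, ys, ss, sig2, avg, den) are chosen here.

```mathematica
(* Hypothetical data with individual errors sigma_i *)
xs = {1., 2., 3., 4., 5.};
ys = {2.1, 3.9, 6.2, 7.8, 10.1};
ss = {0.1, 0.2, 0.1, 0.3, 0.2};

sig2 = 1/Total[1/ss^2];             (* sigma^2 of Eq. (102) *)
avg[f_] := sig2 Total[f/ss^2]       (* weighted average <f> of Eq. (102) *)

den = avg[xs^2] - avg[xs]^2;
a = (avg[xs^2] avg[ys] - avg[xs] avg[xs ys])/den;   (* Eq. (109) *)
b = (avg[xs ys] - avg[xs] avg[ys])/den;             (* Eq. (110) *)
{a, b}

(* Cross-check with the built-in weighted linear fit *)
lm = LinearModelFit[Transpose[{xs, ys}], x, x, Weights -> 1/ss^2];
lm["BestFitParameters"]
```

The two results should agree to numerical precision, since both minimize the weighted sum of squares in Eq. (106).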
1. Errors in a and b
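\[ \sigma_a^2 = \sum \sigma_i^2 (da/dy_i)^2 \propto \sum \sigma_i^2 \left( \langle X^2 \rangle \frac{\sigma^2}{\sigma_i^2} - \langle X \rangle x_i \frac{\sigma^2}{\sigma_i^2} \right)^2 \propto \qquad (116) \]
\[ \sigma^4 \sum \left( \langle X^2 \rangle^2 + \langle X \rangle^2 x_i^2 - 2 \langle X^2 \rangle \langle X \rangle x_i \right) / \sigma_i^2 \propto \sigma^2 \langle X^2 \rangle \left( \langle X^2 \rangle - \langle X \rangle^2 \right) = \qquad (117) \]
\[ = \sigma^2 \frac{\langle X^2 \rangle}{\langle X^2 \rangle - \langle X \rangle^2} \qquad (118) \]
where the omitted proportionality factor is $1/\left( \langle X^2 \rangle - \langle X \rangle^2 \right)^2$, the square of the denominator in Eq. (109).
Similarly
\[ \sigma_b^2 = \sum \sigma_i^2 (db/dy_i)^2 = \sigma^2 \frac{1}{\langle X^2 \rangle - \langle X \rangle^2} \qquad (119) \]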
XVIII. TRIGONOMETRIC, POLYNOMIAL AND NONLINEAR FITS
READING: Ch.7,8 + orthog.pdf
HW: reproduce examples from orthog.pdf, and in each case find χ2 and χ2/dof .
Theory: in class.
===============================(updated till here)
APPENDIX A: A BIT OF MATHEMATICA
1. Basic elementary commands
HELP:
1) if you know the exact command , but want to refresh what
argument it requires, use ?. E.g.
?Sin
2) if you approximately know the spelling, use ? with * for the
unknown part, e.g.
?*Plot*
gives all commands which have Plot in them
Frequent type- and space-saving commands:
1) % uses the last output as input.
Similarly, %% uses the output before the last one, etc. Or, %12
refers to Out[12].
2) space - can be used instead of * for multiplication, e.g. 2 x
means 2*x.
3) ; at the end of a command suppresses the output on the screen
(but the result is still computed and you can work with it further!)
(the main type-saving tool - defining your own functions, etc. - will
be studied later).
4) /.x->.... replacement. E.g.
3x^3/.x->2 gives
24
Sin[x/y]/.{x->Pi,y->2.} gives 1.
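A short sketch of how replacement differs from assignment; the expression expr is an arbitrary illustration.

```mathematica
Clear[x, y, expr];
expr = 3 x^3 + Sin[x/y];
expr /. x -> 2                (* only x is replaced: 24 + Sin[2/y] *)
expr /. {x -> Pi, y -> 2.}    (* both replaced at once, numerical result *)
expr                          (* unchanged: /. does not assign values to x or y *)
```

The rules act only on the expression they are applied to, so x and y stay free symbols afterwards.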
Saving your work:
There are two ways:
1) Save["filename", symbol] appends definitions associated with
the specified symbol to a file.
If the symbol depends on previous definitions, everything that is
required will be saved as well! "filename" usually ends in .m (for
convenience), but you can be creative. Graphics cannot be saved
this way, but you can save the command used to generate it,
and then recreate the picture upon restarting Mathematica. Files
are in plain text and relatively small.
Example:
In[1]:= fig:=Plot[Sin[x]/x, {x,-8,8}] (we defined a plot function,
fig)
In[2]:= Save["figSinc.m", fig] (saved this function in a file
figSinc.m)
In[3]:= !!figSinc.m (this shows the contents of the file)
fig := Plot[Sin[x]/x, {x, -8, 8}]
Now, if you start a new Mathematica session, you can type
<< figSinc.m
and you will have all saved definitions. The command fig will plot
your picture. Note: this may require a full path on your PC.
2) Save as a notebook, with all graphics you created (and all
the junk). Saved files are BIG, and can quickly fill up your
directory if you are not careful. Use sparingly, and only for work
you really need and which you cannot save using the Save
command.
2. NUMBERS
1) Integer
2) Exact - 1/2, 10^-10, Pi, E, Sqrt[2], EulerGamma, etc.
3) approximate - 2. , 10.^-10, pi= N[Pi,15], e=N[E,7], etc.
4) Complex numbers:
I represents the imaginary unit Sqrt[-1], e.g.
z = 2+3I and then Abs[z]=..., Arg[z]=..., etc.
5) Random numbers, e.g.
Random[]
Note: updated in Mathematica 9. In recent versions Random[] is
superseded by RandomReal[] and RandomInteger[].
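A few lines illustrating the number types listed above. The variable z is just an example.

```mathematica
1/3 + 1/6           (* exact arithmetic: 1/2 *)
1/3. + 1/6          (* one approximate number makes the result approximate: 0.5 *)
N[Pi, 15]           (* Pi with 15 significant digits *)
z = 2 + 3 I;
{Abs[z], Arg[z]}    (* exact: {Sqrt[13], ArcTan[3/2]} *)
N[{Abs[z], Arg[z]}] (* the same, as approximate numbers *)
RandomReal[]        (* uniform random number between 0 and 1 *)
```

Exact numbers stay exact until an approximate number or N enters the calculation.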
3. SYMBOLIC MATH
Sum, e.g.:
In[74]:= Sum[i^2, {i,1,n}]
or
In[74]:= Sum[i^-2, {i, 1, Infinity}]
Derivatives and integration:
In[75]:= D[x^n,x]
Out[75]= n x^(-1 + n)
In[76]:= D[%,x]
Out[76]= (-1 + n) n x^(-2 + n)
In[77]:= Integrate[%,x]
Out[77]= n x^(-1 + n)
Algebraic operations:
Expand, Factor, Collect, Simplify, etc.
Trigonometry:
TrigExpand and TrigReduce
Connection with exponential notations:
In[8]:= ExpToTrig[Exp[I x]]
Out[8]= Cos[x] + I Sin[x]
or
In[9]:= TrigToExp[Cos[x]+I Sin[x]]
Out[9]= E^(I x)
Power series:
In[118]:= Series[Exp[a x], {x, 0, 5}]
To make a polynomial by truncating a series:
In[119]:= Normal[%]
Will give a series even if there is a simple singularity (Laurent
series):
In[1]:= Series[1/Sin[t], {t,0,2}]
Out[1]= 1/t + t/6 + O[t]^3
Limit:
In[123]:= Limit[(1+x/n)^n, n->Infinity]
Out[123]= E^x
4. DEFINING YOUR OWN FUNCTIONS
In[1]:= f[x_]:=Sin[x]
In[3]:=Plot[f[x]/x, {x,-6,6}] (*will give a plot*)
Can define and save a plotting function:
In[23]:= plotf:=Plot[f[x]/x, {x,-6,6}]
Difference between := and =
In[19]:= r=Random[];
In[20]:= Table[r, {i,5}]
Out[20]= {0.307826, 0.307826, 0.307826, 0.307826, 0.307826}
Gives identical numbers since r was assigned a fixed value
but
In[21]:= Clear[r]; r:=Random[]
In[22]:= Table[r, {i,5}]
Out[22]= {0.0592439, 0.981402, 0.944823, 0.0902293, 0.598816}
gives different values each time r is evaluated
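A third variant (dangerous!): = with a pattern on the left-hand side
Clear[x]; f[x_]=Sin[x]
Must use Clear (!): the right-hand side is evaluated immediately, so
any value previously assigned to x would be built into the definition.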
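The danger is easiest to see when Clear is forgotten. In the sketch below x is deliberately given a value first; the names f and g are illustrative.

```mathematica
Clear[f, g, x];
x = 2;                 (* a leftover value of x *)
f[x_] = Sin[x];        (* right-hand side evaluated now: f always returns Sin[2] *)
g[x_] := Sin[x];       (* right-hand side evaluated at each call *)
{f[Pi], g[Pi]}         (* {Sin[2], 0} *)
Clear[x];
```

With := the leftover value of x does no harm, which is why it is the safer default for function definitions.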
5. Graphics (2D)
Main functions: Plot, Show, ListPlot
Options: PlotStyle, AxesLabel, etc.
Text, arrows, etc.
Plot[f, {x, xmin, xmax}] generates a plot of f as a function of x
from xmin to xmax. Plot[{f1, f2, ... }, {x, xmin, xmax}] plots
several functions.
PlotRange is an option for graphics functions that specifies what
points to include in a plot.
Show[graphics, options] displays two- and three-dimensional
graphics, using the options specified. Show[g1, g2, ... ] shows
several plots combined.
Examples:
In[12]:= Clear[plo]
In[13]:= plo[n_]:=Plot[Sin[n x]/x, {x,-2,2}, PlotRange -> {-1,2},
PlotStyle -> Dashing[{0.01*n, 0.02}]]
In[15]:= sho:=Show[Table[plo[n], {n,1,3}]]
In[16]:= sho (*will give graphics*)
AxesLabel -> {"x", "y"} will label the two axes.
Plotting discrete data points:
ListPlot[{y1, y2, ... }] plots a list of values. The x coordinates
for each point are taken to be 1, 2, ... .
ListPlot[{{x1, y1}, {x2, y2}, ... }]
plots a list of values with specified x and y coordinates.
Example:
In[21]:= list=Table[Sin[i/100.]+.1*Random[], {i,100}];
In[22]:= ListPlot[list]
Out[22]= -Graphics-
Main extra options: e.g., PlotStyle -> PointSize[0.02],
or Joined -> True
Parametric plot:
ParametricPlot[{fx, fy}, {t, tmin, tmax}] produces a parametric
plot with x and y coordinates fx and fy generated as a function
of t. Example:
In[2]:=Clear[x,y,phi]; x[phi_]=Cos[phi];
In[3]:= y[phi_]=Sin[phi];
In[4]:= ParametricPlot[{x[phi],y[phi]}, {phi, 0, 2Pi}]
-Graphics- (may not look like a circle on the screen)
(*Plot uses AspectRatio -> 1/GoldenRatio by default; if the curve
looks squashed, try to use Show[%, AspectRatio -> 1]*)
PlotLabel:
can be simple text, PlotLabel -> "mypicture" or
Labels as parameters:
plo[n_] := Plot[x^n, {x, 0, 1}, PlotLabel ->n] (*no quotes
now!!!*)
(*suppose we like what we see and want to create a
postscript file, e.g. t.ps *)
disp:=Export["t.ps", #, "EPS"] &
(*now disp[plo[3]] will create t.ps, a GOOD postscript file
which exists outside of Mathematica and can be further used
independently*)
Graphics Primitives:
Line[{pt1, pt2, ... }] is a graphics primitive which represents a
line joining a sequence of points.
Point[coords] is a graphics primitive that represents a point.
Circle[{x, y}, r] is a two-dimensional graphics primitive that
represents a circle of radius r centered at the point x, y.
Polygon, Arrow, etc. are used in the same way, inside Graphics[...].
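A small sketch combining several primitives in one picture; the coordinates and the name pic are arbitrary.

```mathematica
pic = Graphics[{
    Line[{{0, 0}, {1, 1}, {2, 0}}],      (* broken line through three points *)
    PointSize[0.03], Point[{1, 1}],      (* a single enlarged point *)
    Circle[{1, 0}, 1/2],                 (* circle of radius 1/2 centered at {1, 0} *)
    Arrow[{{0, 0}, {2, 0}}]              (* arrow from {0, 0} to {2, 0} *)
   }, Axes -> True]
```

Directives such as PointSize affect the primitives that follow them in the list.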
=============================================================
===================================================================
FIG. 18: Experimental studies of a normal distribution. 10000 data points were produced using an RNG
from a standard package in Mathematica and grouped into bins. The solid line is the best non-linear fit
by a Gaussian curve.
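A figure of this kind can be produced along the following lines. This is only a sketch: the mean 5, the width 1, the bin size 0.5 and all names are assumptions, not the settings actually used for Fig. 18.

```mathematica
Clear[amp, mu, s, x];
data = RandomVariate[NormalDistribution[5, 1], 10000];  (* assumed parameters *)
width = 0.5;
counts = BinCounts[data, {0, 10, width}];               (* 20 bins from 0 to 10 *)
centers = Range[width/2, 10 - width/2, width];          (* bin centers *)
dens = counts/(Length[data] width);                     (* normalized bin heights *)
pts = Transpose[{centers, dens}];

(* non-linear fit by a Gaussian; starting values are given for each parameter *)
fit = FindFit[pts, amp Exp[-(x - mu)^2/(2 s^2)],
   {{amp, 0.4}, {mu, 5}, {s, 1}}, x]

Show[ListPlot[pts],
 Plot[Evaluate[amp Exp[-(x - mu)^2/(2 s^2)] /. fit], {x, 0, 10}]]
```

The fitted mu and s should come out close to the parameters of the parent distribution.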
FIG. 19: A do-it-yourself Fourier filter which eliminates "noise" (in fact, any signal) with a Fourier
component below a selected level. Red line - hidden deterministic signal, black dots - full signal after
filtering: left - noise cut-off 0.1, middle - noise cut-off 0.6 and right - noise cut-off 0.7
6. Modeling and analysis of data
a. Filtering
will be discussed in class. See −stat.nb and Figs. 19 and 20
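Until then, a minimal sketch of both filters, which is not the code from the course notebook: the hidden signal, the noise width 0.2, the cut-off 0.6 and the averaging window 10 are all assumptions.

```mathematica
n = 400;
signal = Table[Sin[2 Pi i/100.], {i, n}];       (* assumed hidden signal *)
noisy = signal + RandomVariate[NormalDistribution[0, 0.2], n];

(* Fourier filter: drop every component with magnitude below the cut-off *)
ft = Fourier[noisy];
cut = 0.6;
ftCut = Map[If[Abs[#] < cut, 0., #] &, ft];
filtered = Re[InverseFourier[ftCut]];

(* built-in moving average over 10 neighboring points *)
smoothed = MovingAverage[noisy, 10];

ListPlot[{signal, filtered}, Joined -> {True, False}]
```

The moving average returns a shorter list (n - 9 points here), so it has to be shifted accordingly when plotted against the original signal.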
HW (last): (a) create 2 lists of 40 random points each with an exponential
distribution (see −stat.nb); (b) perform the $\chi^2$-test; (c) perform the KS test.
FIG. 20: Same signal as in Fig. 19 processed using the built-in moving average filter.