Notes and figures are based on or taken from materials in the course textbook: Probability, Statistics and Random Processes for Engineers, 4th ed., Henry Stark and John W. Woods, Pearson Education, Inc., 2012.
B.J. Bazuin, Fall 2016 1 of 37 ECE 3800
Henry Stark and John W. Woods, Probability, Statistics, and Random Processes for Engineers, 4th ed.,
Pearson Education Inc., 2012. ISBN: 978-0-13-231123-6
Chapter 6 Statistics: Part 1 Parameter Estimation
Sections
6.1 Introduction
    Independent, Identically Distributed (i.i.d.) Observations
    Estimation of Probabilities
6.2 Estimators
6.3 Estimation of the Mean
    Properties of the Mean-Estimator Function (MEF)
    Procedure for Getting a δ-Confidence Interval on the Mean of a Normal Random Variable When σX Is Known
    Confidence Interval for the Mean of a Normal Distribution When σX Is Not Known
    Procedure for Getting a δ-Confidence Interval Based on n Observations on the Mean of a Normal Random Variable When σX Is Not Known
    Interpretation of the Confidence Interval
6.4 Estimation of the Variance and Covariance
    Confidence Interval for the Variance of a Normal Random Variable
    Estimating the Standard Deviation Directly
    Estimating the Covariance
6.5 Simultaneous Estimation of Mean and Variance
6.6 Estimation of Non-Gaussian Parameters from Large Samples
6.7 Maximum Likelihood Estimators
6.8 Ordering, More on Percentiles, Parametric Versus Nonparametric Statistics
    The Median of a Population Versus Its Mean
    Parametric Versus Nonparametric Statistics
    Confidence Interval on the Percentile
    Confidence Interval for the Median When n Is Large
6.9 Estimation of Vector Means and Covariance Matrices
    Estimation of μ
    Estimation of the Covariance K
6.10 Linear Estimation of Vector Parameters
Summary
Problems
References
Additional Reading
6.1 Introduction
Statistics Definition: The science of assembling, classifying, tabulating, and analyzing data or facts:
Descriptive statistics – collecting, grouping, and presenting data in a way that can be easily understood or assimilated.
Inductive statistics or statistical inference – using data to draw conclusions about, or estimate parameters of, the environment from which the data came.
Theoretical Areas:
Sampling Theory – selecting samples from a collection of data that is too large to be examined completely.
Estimation Theory – concerned with making estimates or predictions based on the data that are available.
Hypothesis Testing – attempts to decide which of two or more hypotheses about the data is true.
Curve fitting and regression – attempt to find mathematical expressions that best represent the data. (Shown in Chap. 4)
Analysis of Variance – attempt to assess the significance of variations in the data and the relation of these variances to the physical situations from which the data arose. (Modern term ANOVA)
We will focus on parameter estimation (Chap. 6) and hypothesis testing (Chap. 7)
Sampling Theory – The Sample Mean
How many samples are required to find a representative sample set that provides confidence in the results?
Defect testing, opinion polls, infection rates, etc.
Definitions
Population: the collection of data being studied. N is the size of the population.
Sample: a random sample is the part of the population selected; all members of the population must be equally likely to be selected. n is the size of the sample.
Sample Mean: the average of the numerical values that make up the sample.
Population size: N
Sample set: $S = \{x_1, x_2, x_3, x_4, x_5, \ldots, x_n\}$
Sample Mean:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
To generalize, describe the statistical properties of arbitrary random samples rather than those of any particular sample.
Sample Mean:
$$\hat{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where the $X_i$ are random variables with a pdf.
Notice that for a pdf the true mean, $\bar{X}$, can be computed, while for a sample data set the sample mean above, $\hat{X}$, is computed.
As may be noted, the sample mean is a combination of random variables and, therefore, can also be considered a random variable. As a result, the hoped for result can be derived as:
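$$E[\hat{X}] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\sum_{i=1}^{n} \bar{X} = \bar{X}$$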
If and when this is true, the estimate is said to be an unbiased estimate.
Though the sample mean may be unbiased, the sample mean may still not provide a good estimate.
What is the “variance” of the computation of the sample mean?
Variance of the sample mean (the mean itself, not the value of X)
You would expect the sample mean to have some variance about the “probabilistic” or actual mean; therefore, it is also desirable to know something about the fluctuations around the mean. As a result, computation of the variance of the sample mean is desired.
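For N >> n or N → ∞ (or even a known pdf), the variance of the sample mean follows from the prior definition of variance: the second moment minus the square of the mean.
$$\begin{aligned}
\mathrm{Var}(\hat{X}) &= E[\hat{X}^2] - \left(E[\hat{X}]\right)^2 = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)^2\right] - \bar{X}^2 \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i \cdot \frac{1}{n}\sum_{j=1}^{n} X_j\right] - \bar{X}^2 \\
&= \frac{1}{n^2} E\left[\sum_{i=1}^{n}\sum_{j=1}^{n} X_i X_j\right] - \bar{X}^2 \\
&= \frac{1}{n^2} \sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j] - \bar{X}^2
\end{aligned}$$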
For $X_i$ independent (measurements are independent of each other):
$$E[X_i X_j] = \begin{cases} E[X_i^2] = E[X^2], & \text{for } i = j \\ E[X_i]\,E[X_j] = \bar{X}^2, & \text{for } i \ne j \end{cases}$$
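As a result, the double summation can be split into the terms where i = j and the terms where i ≠ j:
$$\begin{aligned}
\mathrm{Var}(\hat{X}) &= \frac{1}{n^2}\sum_{i=1}^{n}\left(E[X_i X_i] + \sum_{j=1,\, j\ne i}^{n} E[X_i X_j]\right) - \bar{X}^2 \\
&= \frac{1}{n^2}\left(n\,E[X^2] + (n^2 - n)\,\bar{X}^2\right) - \bar{X}^2 \\
&= \frac{1}{n}E[X^2] + \frac{n^2 - n}{n^2}\bar{X}^2 - \bar{X}^2 \\
&= \frac{1}{n}E[X^2] - \frac{1}{n}\bar{X}^2 = \frac{E[X^2] - \bar{X}^2}{n} = \frac{\sigma^2}{n}
\end{aligned}$$
where $\sigma^2$ is the true (probabilistic) variance of the random variable, X.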
Therefore, as n approaches infinity, this variance in the sample mean estimate goes to zero! Thus a larger sample size leads to a better estimate of the population mean.
Note: this variance is developed based on “sampling with replacement”.
When based on sampling without replacement …
Destructive testing or sampling without replacement in a finite population results in another expression:
$$\mathrm{Var}(\hat{X}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$$
Note that when all the samples are tested (N=n) the variance necessarily goes to 0. And … all the samples have been removed from the population?!
The variance in the mean between the population and the sample set must be zero as the entire population has been measured!
Example: How many samples of an infinitely long time waveform would be required to ensure the mean is within 1% of the true (probabilistic) mean value? Here "within 1%" is taken to mean that the standard deviation of the sample mean is 1% of the true mean. For this relationship, let
$$\mathrm{Var}(\hat{X}) = (0.01\cdot\bar{X})^2$$
The set is infinite, so assume that you use the “with replacement” equation:
$$\mathrm{Var}(\hat{X}) = \frac{\sigma^2}{n}$$
Assume that the true mean is 10 and that the true variance is 9, so that the mean +/- one standard deviation would be $10 \pm 3$. Then,
$$\mathrm{Var}(\hat{X}) = \frac{9}{n} = (0.01\cdot 10)^2$$
$$\frac{9}{n} = 0.1^2 = 0.01$$
$$n = 900$$
A very large sample size is needed to “estimate” the mean within the desired 1% bound!
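The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the function name `samples_for_relative_sigma` and its parameters are illustrative and do not come from the course materials.

```python
def samples_for_relative_sigma(true_mean, true_var, fraction):
    """Number of samples n such that sqrt(true_var / n) equals
    fraction * true_mean (sampling with replacement, Var = sigma^2 / n)."""
    target_var = (fraction * true_mean) ** 2   # desired variance of the sample mean
    return true_var / target_var               # solve sigma^2 / n = target_var for n


# Values from the example: true mean 10, true variance 9, 1% tolerance.
n = samples_for_relative_sigma(true_mean=10.0, true_var=9.0, fraction=0.01)
print(round(n))
```

Because n scales with the inverse square of the tolerance, halving the tolerance to 0.5% would quadruple the required number of samples.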
Central Limit Theorem Estimate
Thinking of the characterization after using a very large number of samples …
Using the central limit theorem (assume a Gaussian distribution) to estimate the probability that the mean is within a prescribed bound (1% from the previous example):
$$\Pr(9.9 \le \hat{X} \le 10.1) = F(10.1) - F(9.9)$$
Assume that the density function of the statistical measurement has become Gaussian, centered around 10 with a standard deviation of 1% of the mean (that is, $\mu = 10$ and $\sigma = 0.1$). We can use Gaussian/Normal tables to determine the probability …
$$\Pr(9.9 \le \hat{X} \le 10.1) = \Phi\left(\frac{10.1 - 10}{0.1}\right) - \Phi\left(\frac{9.9 - 10}{0.1}\right)$$
$$\Pr(9.9 \le \hat{X} \le 10.1) = \Phi(1) - \Phi(-1) = \Phi(1) - \left(1 - \Phi(1)\right) = 2\,\Phi(1) - 1$$
$$\Pr(9.9 \le \hat{X} \le 10.1) = 2\,(0.8413) - 1 = 0.6826$$
This implies that, after taking that many measurements to form an estimate, there is a 68.3% chance the estimate is within 1% of the mean
or
that there is a 1 - 0.6826 = 0.3174, or 31.74%, probability that the estimate of the population mean is more than 1% away from the true population mean.
Summary: as the number of sample measurements increases, the density function of the estimated mean about the true (probabilistic) mean takes on a Gaussian characteristic (based on the central limit theorem). Given the variance of the sample mean (which depends on the number of samples), the probability that the measured mean lies within a given distance of the probabilistic mean can be computed from Gaussian statistics.
We will be dealing with Gaussian/Normal distributions because sums of a large number of random variables have density functions that approach a Gaussian (Central Limit Theorem).
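The table lookup above can be reproduced with Python's standard library, which provides the Gaussian CDF through `statistics.NormalDist`. This is a minimal sketch; the helper name `prob_within` is an illustrative choice, not part of the course materials.

```python
from statistics import NormalDist


def prob_within(mean, sigma_of_mean, half_width):
    """Probability that a Gaussian estimate lies within +/- half_width of mean."""
    dist = NormalDist(mu=mean, sigma=sigma_of_mean)
    return dist.cdf(mean + half_width) - dist.cdf(mean - half_width)


# Example values: mean 10, standard deviation of the sample mean 0.1, bound +/- 0.1.
p = prob_within(10.0, 0.1, 0.1)
print(f"P(9.9 <= X_hat <= 10.1) = {p:.4f}")
print(f"P(outside the 1% bound) = {1 - p:.4f}")
```

The computed value differs from the table-based 0.6826 only in the fourth decimal place, because the table value of Φ(1) is rounded.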
Example #2: A smaller sample size
Population: 100 transistors
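Find the mean value of the current gain, $\beta$. Assume that the true population mean is $\bar{\beta} = 120$ and the true population variance is $\sigma_\beta^2 = 25$.
How large a sample is required to obtain a sample mean that has a standard deviation of 1% of the true mean? Therefore, we want
$$\mathrm{Var}(\hat{X}) = (0.01\cdot 120)^2 = 1.2^2 = 1.44$$
For a smaller (finite) population, the variance of the sample mean can be computed as
$$\mathrm{Var}(\hat{X}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$$
Determining the number of samples needed to meet the tolerance …
$$1.44 = \frac{25}{n}\cdot\frac{100-n}{100-1}$$
$$100 - n = \frac{1.44\cdot 99}{25}\,n = 5.7024\,n$$
$$n = \frac{100}{1 + \frac{1.44\cdot 99}{25}} = \frac{100}{6.7024} = 14.92 \approx 15$$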
A rule of thumb is offered to define “large vs. small” sample sizes: the threshold given is 30. The ultimate goal is to have enough samples to achieve a near-Gaussian probability distribution.
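As a cross-check of Example #2, the finite-population expression can be solved for n in closed form: setting (σ²/n)(N − n)/(N − 1) equal to the target variance gives n = σ²N / (target·(N − 1) + σ²). The sketch below assumes that rearrangement; the function names are illustrative only.

```python
def var_of_sample_mean(sigma2, n, N=None):
    """Variance of the sample mean; N=None means sampling with replacement
    (or an effectively infinite population)."""
    if N is None:
        return sigma2 / n
    return (sigma2 / n) * (N - n) / (N - 1)


def samples_needed(sigma2, target_var, N=None):
    """Solve var_of_sample_mean(sigma2, n, N) == target_var for n."""
    if N is None:
        return sigma2 / target_var
    return sigma2 * N / (target_var * (N - 1) + sigma2)


# Example #2: 100 transistors, variance 25, target variance 1.2**2 = 1.44.
n_exact = samples_needed(25.0, 1.44, N=100)
print(f"n = {n_exact:.2f}, round up to 15")
print(var_of_sample_mean(25.0, 100, N=100))  # whole population sampled: 0.0
```

Rounding n up to 15 gives a variance slightly below the 1.44 target, and sampling all 100 parts drives the variance to zero, as noted earlier for N = n.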
Sampling Theory – The Sample Variance
When dealing with probability, the mean and the variance both provide valuable information: the mean describes the “DC” operating point (the value that is expected), and the variance describes the “AC” behavior (in terms of power or squared value) about that operating point.
Therefore, we are also interested in the sample variance as compared to the true data variance.
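The sample variance of the population (stdevp) is defined as:
$$S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$$
and continuing until (shown in the coming pages)
$$E[S^2] = \frac{n-1}{n}\,\sigma^2$$
where $\sigma^2$ is the true variance of the random variable.
Note: the sample variance is not equal to the true variance; it is a biased estimate!
To create an unbiased estimator, scale by the biasing factor to compute (stdev):
$$\tilde{S}^2 = \frac{n}{n-1}\,S^2 = \frac{n}{n-1}\cdot\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$$
$$E[\tilde{S}^2] = \frac{n}{n-1}\,E[S^2] = \sigma^2$$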
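When the population is not large, the expected value of the biased estimate becomes
$$E[S^2] = \frac{N}{N-1}\cdot\frac{n-1}{n}\,\sigma^2$$
and the unbiased estimate is
$$E[\tilde{S}^2] = \frac{N-1}{N}\cdot\frac{n}{n-1}\,E[S^2] = \sigma^2$$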
Additional notes: MATLAB and MS Excel
Simulation and statistical software packages allow for either biased or unbiased computations.
In MS Excel there are two distinct functions, stdev and stdevp.
stdev uses (n-1) - http://office.microsoft.com/en-us/excel-help/stdev-function-HP010335660.aspx
stdevp uses (n) - https://support.office.com/en-US/article/STDEVP-function-1F7C1C88-1BEC-4422-8242-E9F7DC8BB195
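In MATLAB, there is an additional flag associated with the std function.
$$\mathrm{std}(X) = \sqrt{\mathrm{var}(X)} = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2}\ ,\quad \text{flag implied as 0}$$
$$\mathrm{std}(X,1) = \sqrt{\mathrm{var}(X,1)} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2}\ ,\quad \text{flag specified as 1}$$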
>> help std
 std Standard deviation.
    For vectors, Y = std(X) returns the standard deviation. For matrices,
    Y is a row vector containing the standard deviation of each column. For
    N-D arrays, std operates along the first non-singleton dimension of X.

    std normalizes Y by (N-1), where N is the sample size. This is the
    sqrt of an unbiased estimator of the variance of the population from
    which X is drawn, as long as X consists of independent, identically
    distributed samples.

    Y = std(X,1) normalizes by N and produces the square root of the second
    moment of the sample about its mean. std(X,0) is the same as std(X).
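For readers working outside MATLAB or Excel, Python's standard `statistics` module makes the same distinction: `statistics.pstdev` divides by n (like stdevp, or std(X,1)) and `statistics.stdev` divides by n-1 (like stdev, or std(X)). The sketch below uses an arbitrary ten-point data set chosen purely for illustration.

```python
import statistics

# Illustrative data only; any list of numbers works.
data = [19, 18, 21, 21, 18, 22, 17, 19, 20, 17]
n = len(data)

biased_std = statistics.pstdev(data)    # normalizes by n
unbiased_std = statistics.stdev(data)   # normalizes by n - 1

print(f"pstdev (divide by n):   {biased_std:.4f}")
print(f"stdev  (divide by n-1): {unbiased_std:.4f}")

# The two variances differ exactly by the factor n / (n - 1).
ratio = unbiased_std ** 2 / biased_std ** 2
print(f"variance ratio: {ratio:.4f} vs n/(n-1) = {n / (n - 1):.4f}")
```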
Sampling Theory – The Sample Variance - Proof
The sample variance of the population is defined as
$$S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \frac{1}{n}\sum_{j=1}^{n} X_j\right)^2$$
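Determining the expected value
$$\begin{aligned}
E[S^2] &= E\left[\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \frac{1}{n}\sum_{j=1}^{n} X_j\right)^2\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n}\left(X_i^2 - \frac{2}{n}X_i\sum_{j=1}^{n} X_j + \frac{1}{n^2}\left(\sum_{j=1}^{n} X_j\right)^2\right)\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(E[X_i^2] - \frac{2}{n}E\left[X_i\sum_{j=1}^{n} X_j\right] + \frac{1}{n^2}E\left[\sum_{j=1}^{n} X_j\sum_{k=1}^{n} X_k\right]\right) \\
&= \frac{1}{n}\sum_{i=1}^{n}E[X_i^2] - \frac{2}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}E[X_i X_j] + \frac{1}{n^3}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}E[X_j X_k]
\end{aligned}$$
For independent samples, $E[X_i X_j] = E[X^2]$ when the indices match and $E[X]^2$ when they differ, so each double sum contains $n$ terms equal to $E[X^2]$ and $n^2 - n$ terms equal to $E[X]^2$:
$$\begin{aligned}
E[S^2] &= E[X^2] - \frac{2}{n^2}\left(n\,E[X^2] + (n^2-n)\,E[X]^2\right) + \frac{1}{n^3}\,n\left(n\,E[X^2] + (n^2-n)\,E[X]^2\right) \\
&= E[X^2] - \frac{2}{n}E[X^2] - \frac{2(n-1)}{n}E[X]^2 + \frac{1}{n}E[X^2] + \frac{n-1}{n}E[X]^2
\end{aligned}$$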
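Collecting terms:
$$\begin{aligned}
E[S^2] &= E[X^2]\left(1 - \frac{2}{n} + \frac{1}{n}\right) + E[X]^2\left(-\frac{2(n-1)}{n} + \frac{n-1}{n}\right) \\
&= E[X^2]\,\frac{n-1}{n} - E[X]^2\,\frac{n-1}{n} \\
&= \frac{n-1}{n}\left(E[X^2] - E[X]^2\right)
\end{aligned}$$
Therefore,
$$E[S^2] = \frac{n-1}{n}\,\sigma^2$$
To create an unbiased estimator, scale by the (un-)biasing factor to compute:
$$E[\tilde{S}^2] = \frac{n}{n-1}\,E[S^2] = \sigma^2$$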
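Variance of the variance
As before, the variance of the variance can be computed. (Instead of deriving the value, it is given.) It is defined as
$$\mathrm{Var}(S^2) = \frac{\mu_4 - \sigma^4}{n}$$
where $\mu_4$ is the fourth central moment of the population and is defined by
$$\mu_4 = E\left[\left(X - \bar{X}\right)^4\right]$$
Proof for extra homework credit? …
For the unbiased variance, the result is
$$\mathrm{Var}(\tilde{S}^2) = \left(\frac{n}{n-1}\right)^2\mathrm{Var}(S^2) = \frac{n^2}{(n-1)^2}\cdot\frac{\mu_4 - \sigma^4}{n} = \frac{n\left(\mu_4 - \sigma^4\right)}{(n-1)^2}$$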
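Example: the random time samples problem (first example) used previously, where the true mean is 10 and the true variance is 9. Then,
$$\mathrm{Var}(\hat{X}) = \frac{\sigma^2}{n} = \frac{9}{n}$$
and for n = 900
$$\mathrm{Var}(\hat{X}) = \frac{9}{900} = 0.01$$
$$\mathrm{Var}(\tilde{S}^2) = \frac{n\left(\mu_4 - \sigma^4\right)}{(n-1)^2}$$
For a Gaussian random variable, the 4th central moment is $\mu_4 = 3\sigma^4$. Therefore
$$\mathrm{Var}(\tilde{S}^2) = \frac{n\left(3\sigma^4 - \sigma^4\right)}{(n-1)^2} = \frac{2\,n\,\sigma^4}{(n-1)^2}$$
$$\mathrm{Var}(\tilde{S}^2) = \frac{2\cdot 900\cdot 9^2}{(900-1)^2} = \frac{145800}{808201} = 0.1804$$
$$\sqrt{\mathrm{Var}(\tilde{S}^2)} = 0.4247$$
The variance estimate would then be
$$\sigma^2 \pm \sqrt{\mathrm{Var}(\tilde{S}^2)} = 9 \pm 0.4247 \quad\text{or within}\quad 100\cdot\frac{\sqrt{\mathrm{Var}(\tilde{S}^2)}}{9}\,\% = 4.72\%$$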
While 900 was selected to provide a mean estimate that was within 1%, the variance estimate is not nearly as close at 4.72%. More samples are required to improve the variance estimate.
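The numbers in this example are easy to reproduce. The sketch below assumes the Gaussian fourth central moment (3σ⁴) unless another value is supplied; the function name `var_of_unbiased_variance` is illustrative.

```python
import math


def var_of_unbiased_variance(n, sigma2, mu4=None):
    """Var of the unbiased variance estimate: n * (mu4 - sigma^4) / (n - 1)^2.
    If mu4 is not given, the Gaussian value 3 * sigma^4 is assumed."""
    if mu4 is None:
        mu4 = 3.0 * sigma2 ** 2
    return n * (mu4 - sigma2 ** 2) / (n - 1) ** 2


sigma2 = 9.0
v = var_of_unbiased_variance(900, sigma2)
spread = math.sqrt(v)
percent = 100.0 * spread / sigma2
print(f"Var = {v:.4f}, std = {spread:.4f}, relative spread = {percent:.2f}%")
```

Under the same Gaussian assumption, the relative spread is approximately sqrt(2/n), so a 1% spread on the variance estimate would need on the order of 20,000 samples.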
Statistical Mean and Variance Summary
For taking samples and estimating the mean and variance …
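Mean
    The estimate: $\hat{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
    An unbiased estimate: $E[\hat{X}] = E[X] = \bar{X}$
    Variance of estimate: $\mathrm{Var}(\hat{X}) = \frac{\sigma_X^2}{n}$

Variance (biased)
    The estimate: $S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$
    A biased estimate: $E[S^2] = \frac{n-1}{n}\,\sigma_X^2$
    Variance of estimate: $\mathrm{Var}(S^2) = \frac{\mu_4 - \sigma_X^4}{n}$, with $\mu_4 = E\left[\left(X - \bar{X}\right)^4\right]$

Variance (unbiased)
    The estimate: $\tilde{S}^2 = \frac{n}{n-1}\,S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$
    An unbiased estimate: $E[\tilde{S}^2] = E\left[\left(X - \bar{X}\right)^2\right] = \sigma_X^2$
    Variance of estimate: $\mathrm{Var}(\tilde{S}^2) = \frac{n\left(\mu_4 - \sigma_X^4\right)}{(n-1)^2}$, with $\mu_4 = E\left[\left(X - \bar{X}\right)^4\right]$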
Bounds on the estimates
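Using the Chebyshev Inequality
$$P\left(\left|X - \bar{X}\right| \ge \epsilon\right) \le \frac{\sigma_X^2}{\epsilon^2}$$
Bounding the estimated mean value (whose variance is $\sigma_X^2/n$)
$$P\left(\left|\hat{X} - \bar{X}\right| \ge \epsilon\right) \le \frac{\sigma_X^2}{n\,\epsilon^2}$$
If we let n → ∞,
$$\lim_{n\to\infty} P\left(\left|\hat{X} - \bar{X}\right| \ge \epsilon\right) \le \lim_{n\to\infty}\frac{\sigma_X^2}{n\,\epsilon^2} = 0$$
Therefore, for any value $\epsilon > 0$,
$$\lim_{n\to\infty} P\left(\left|\hat{X} - \bar{X}\right| \ge \epsilon\right) = 0$$
The probability that the estimated (statistical) mean differs from the probabilistic mean by more than any fixed amount goes to zero. Therefore the sample mean converges (in probability) to the true mean in the infinite sample case!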
Building a confidence interval
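From Chapter 4, the Chebyshev Inequality was stated as
$$\Pr\left[\left|X - \mu\right| \ge \epsilon\right] \le \frac{\mathrm{Var}[X]}{\epsilon^2}$$
Let X be an arbitrary R.V. with known mean and variance. Then, for any $\epsilon > 0$,
$$P\left(\left|X - \bar{X}\right| \ge \epsilon\right) \le \frac{\sigma_X^2}{\epsilon^2}$$
Derivation
$$\sigma_X^2 = E\left[\left(X - \bar{X}\right)^2\right] = \int_{-\infty}^{\infty}\left(x - \bar{X}\right)^2 f_X(x)\,dx$$
Then
$$\sigma_X^2 \ge \int_{|x - \bar{X}| \ge \epsilon}\left(x - \bar{X}\right)^2 f_X(x)\,dx$$
and
$$\int_{|x - \bar{X}| \ge \epsilon}\left(x - \bar{X}\right)^2 f_X(x)\,dx \ge \epsilon^2\int_{|x - \bar{X}| \ge \epsilon} f_X(x)\,dx = \epsilon^2\,P\left(\left|X - \bar{X}\right| \ge \epsilon\right)$$
Result #1:
$$P\left(\left|X - \bar{X}\right| \ge \epsilon\right) \le \frac{\sigma_X^2}{\epsilon^2}$$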
It may be convenient to define the bound $\epsilon$ as a multiple of the standard deviation:
$$\epsilon = k\,\sigma_X$$
$$P\left(\left|X - \bar{X}\right| \ge k\,\sigma_X\right) \le \frac{\sigma_X^2}{k^2\sigma_X^2} = \frac{1}{k^2}$$
Flipping the bounds on the inequality
$$P\left(\left|X - \bar{X}\right| < k\,\sigma_X\right) \ge 1 - \frac{1}{k^2}$$
Expanding the absolute value and adding the mean
$$P\left(\bar{X} - k\,\sigma_X < X < \bar{X} + k\,\sigma_X\right) \ge 1 - \frac{1}{k^2}$$
A confidence interval for the statistical average becomes
$$P\left(\bar{X} - k\,\frac{\sigma_X}{\sqrt{n}} < \hat{X} < \bar{X} + k\,\frac{\sigma_X}{\sqrt{n}}\right) \ge 1 - \frac{1}{k^2}$$
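If we assume that the estimated mean has a Gaussian distribution, an exact value can be computed for this probability
$$P\left(-k < \frac{\hat{X} - \bar{X}}{\sigma_X/\sqrt{n}} < k\right) = \Phi(k) - \Phi(-k) = 2\,\Phi(k) - 1$$
A confidence interval often chosen is 95% or 0.95.
$$0.95 = 2\,\Phi(k) - 1$$
$$\Phi(k) = \frac{1 + 0.95}{2} = 0.975$$
$$k = 1.96$$
and
$$P\left(-1.96 < \frac{\hat{X} - \bar{X}}{\sigma_X/\sqrt{n}} < 1.96\right) = 0.95$$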
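To see how much tighter the Gaussian result is than the distribution-free Chebyshev bound, both can be evaluated for the same k. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def k_for_confidence(conf):
    """Two-sided Gaussian multiplier: solve 2*Phi(k) - 1 = conf for k."""
    return STD_NORMAL.inv_cdf((1.0 + conf) / 2.0)


def gaussian_confidence(k):
    """Exact probability of falling within +/- k standard deviations."""
    return 2.0 * STD_NORMAL.cdf(k) - 1.0


def chebyshev_lower_bound(k):
    """Chebyshev guarantee, valid for any distribution: 1 - 1/k^2."""
    return 1.0 - 1.0 / k ** 2


k95 = k_for_confidence(0.95)
print(f"k = {k95:.3f}")
print(f"Gaussian probability: {gaussian_confidence(k95):.4f}")
print(f"Chebyshev bound:      {chebyshev_lower_bound(k95):.4f}")
```

For k = 1.96 the Chebyshev inequality only guarantees about 74%, while the Gaussian assumption gives 95%; the gap is the price of making no assumption about the density.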
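Using Example 6.3-1: effect of sample size on the estimated mean
If the actual mean is 0 and the actual standard deviation is 3, what are the 95% confidence bounds?
For $\bar{X} = 0$ and $\sigma_X = 3$
For a two-sided interval
$$0.95 = 2\,\Phi(k) - 1,\qquad \Phi(k) = \frac{1 + 0.95}{2} = 0.975,\qquad k = 1.96$$
$$P\left(-1.96 < \frac{\hat{X} - 0}{3/\sqrt{n}} < 1.96\right) = 0.95$$
$$P\left(-\frac{5.88}{\sqrt{n}} < \hat{X} < \frac{5.88}{\sqrt{n}}\right) = 0.95$$
If we were hoping to be within +/-0.1, how many samples are needed?
$$P\left(-0.1 < \hat{X} < 0.1\right) = 0.95$$
If and only if
$$\frac{5.88}{\sqrt{n}} = 0.1$$
or $\sqrt{n} = 58.8$, or $n = 3457.44$.
If we only used n = 64 samples, the 95% bounds are much wider:
$$P\left(-\frac{5.88}{\sqrt{64}} < \hat{X} < \frac{5.88}{\sqrt{64}}\right) = 0.95$$
$$P\left(-0.735 < \hat{X} < 0.735\right) = 0.95$$
With n = 64 the standard deviation of the estimate is $3/\sqrt{64} = 0.375$, so the probability of being within the desired +/-0.1 is only
$$P\left(-0.1 < \hat{X} < 0.1\right) = 2\,\Phi\left(\frac{0.1}{0.375}\right) - 1 = 2\,\Phi(0.267) - 1 \approx 0.21$$
We could pick a different confidence interval … say 50%.
$$0.50 = 2\,\Phi(k) - 1,\qquad \Phi(k) = \frac{1 + 0.50}{2} = 0.75,\qquad k = 0.67$$
and
$$P\left(-0.67 < \frac{\hat{X} - 0}{3/\sqrt{n}} < 0.67\right) = 0.50$$
$$P\left(-\frac{2}{\sqrt{n}} < \hat{X} < \frac{2}{\sqrt{n}}\right) = 0.50$$
If we only used n = 64 samples,
$$P\left(-0.25 < \hat{X} < 0.25\right) = 0.50$$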
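The two directions of this example (bounds to sample size, and sample size to probability) can be sketched in Python as follows. The function names are illustrative, and the multiplier k comes from `statistics.NormalDist` rather than a table, so the results differ slightly from the hand calculation that rounds k to 1.96.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def samples_for_half_width(sigma, half_width, conf):
    """n such that the two-sided conf interval on the mean is +/- half_width."""
    k = STD_NORMAL.inv_cdf((1.0 + conf) / 2.0)
    return (k * sigma / half_width) ** 2


def prob_within(sigma, n, half_width):
    """P(|X_hat - mean| < half_width) for a Gaussian sample mean of n samples."""
    z = half_width / (sigma / math.sqrt(n))
    return 2.0 * STD_NORMAL.cdf(z) - 1.0


n_needed = samples_for_half_width(sigma=3.0, half_width=0.1, conf=0.95)
print(f"samples for +/-0.1 at 95%: {n_needed:.1f}")
print(f"n=64, within +/-0.735: {prob_within(3.0, 64, 0.735):.3f}")
print(f"n=64, within +/-0.1:   {prob_within(3.0, 64, 0.1):.3f}")
print(f"n=64, within +/-0.25:  {prob_within(3.0, 64, 0.25):.3f}")
```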
Gaussian Confidence Intervals (CI):
For a “two sided” confidence interval, we want
$$P\left(-k < \frac{\hat{X} - \bar{X}}{\sigma_X/\sqrt{n}} < k\right) = \Phi(k) - \Phi(-k) = 2\,\Phi(k) - 1 = \mathrm{C.I.}$$
(A: textbook steps 1 & 2) Find the appropriate value k for the confidence interval selected.
(B: textbook steps 3 & 4) Based on the known Gaussian mean and variance for the “estimated” R.V. and the known number of samples, compute the bounds on the inequality.
Alternate solution type:
If you know the bounds on the probability inequality and the Gaussian statistics, compute the number of samples needed.
$$k\,\frac{\sigma_X}{\sqrt{n}} = \text{bound} \quad\Rightarrow\quad n = \left(\frac{k\,\sigma_X}{\text{bound}}\right)^2 \quad\text{for}\quad 2\,\Phi(k) - 1 = \mathrm{C.I.}$$
Summary for using a Gaussian C.I. (known mean and variance)
$$P\left(-\text{bounds} = -k\,\frac{\sigma_X}{\sqrt{n}} < \hat{X} - \bar{X} < k\,\frac{\sigma_X}{\sqrt{n}} = \text{bounds}\right) = 2\,\Phi(k) - 1 = \mathrm{C.I.}$$
Compute what you need …
Looking at the numbers …
The higher the confidence level, the wider the bounds.
The tighter the bounds, the lower the confidence level.
Gaussian “confidence”
    +/- one standard deviation      68.27%
    +/- two standard deviations     95.45%
    +/- three standard deviations   99.73%

    90%   k = +/-1.645 standard deviations
    95%   k = +/-1.96 standard deviations
    99%   k = +/-2.58 standard deviations
More Gaussian
Confidence Interval (in %)    Two Tail Bounds       k or z_c : -z_c <= z <= z_c
99.99%                        0.005% to 99.995%     3.89
99.9%                         0.05% to 99.95%       3.29
99%                           0.5% to 99.5%         2.58
95%                           2.5% to 97.5%         1.96
90%                           5% to 95%             1.645
80%                           10% to 90%            1.28
50%                           25% to 75%            0.675
see Sec4_4_Gaussian.m
There are “one-sided” bounds that we have not discussed. For a Gaussian R.V.
$$q = \Phi(z_c) \quad\text{for}\quad z \le z_c$$
$$P\left(\hat{X} - \bar{X} \le k\,\frac{\sigma_X}{\sqrt{n}} = \text{bounds}\right) = \Phi(k) = \mathrm{C.I.}$$
[Figure: Gaussian q values. Standard Gaussian density f(x) plotted for x from -5 to 5, annotated with q = 50.00% (k = 0.674), q = 90.00% (k = 1.645), q = 95.00% (k = 1.960), and q = 99.00% (k = 2.576).]
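The multipliers in the table can be regenerated from the inverse Gaussian CDF rather than read from a printed table. The following sketch uses `statistics.NormalDist.inv_cdf` from the Python standard library; the function names `two_sided_k` and `one_sided_k` are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_k(conf):
    """k such that P(-k <= Z <= k) = conf, i.e. Phi(k) = (1 + conf) / 2."""
    return STD_NORMAL.inv_cdf((1.0 + conf) / 2.0)


def one_sided_k(conf):
    """k such that P(Z <= k) = conf, i.e. Phi(k) = conf."""
    return STD_NORMAL.inv_cdf(conf)


print("conf     two-sided k   one-sided k")
for conf in (0.9999, 0.999, 0.99, 0.95, 0.90, 0.80, 0.75):
    print(f"{conf:<8} {two_sided_k(conf):>10.4f} {one_sided_k(conf):>13.4f}")
```

Note that the one-sided multiplier for a given confidence level is always smaller than the two-sided one, because the whole tail probability is placed on a single side.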
Confidence Intervals when we do not know the actual variance …
We have the statistically computed, non-biased variance estimate.
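Define the new “estimated random variable mean” function as
$$T = \frac{\hat{X} - \bar{X}}{\tilde{S}/\sqrt{n}}$$
and, writing out the unbiased variance estimate,
$$T = \frac{\hat{X} - \bar{X}}{\sqrt{\dfrac{1}{n\,(n-1)}\displaystyle\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2}}$$
To simplify the textbook description involving the chi-squared distribution, this is the basis for Student’s t-distribution with n-1 degrees of freedom.
The Student’s t probability density function (letting v = n-1, the degrees of freedom) is defined as
$$f_T(t) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\,\pi}\;\Gamma\left(\frac{v}{2}\right)}\left(1 + \frac{t^2}{v}\right)^{-\frac{v+1}{2}}$$
where $\Gamma(\cdot)$ is the gamma function.
The gamma function can be computed as
$$\Gamma(k+1) = k\,\Gamma(k) \ \text{ for any } k, \qquad \Gamma(k+1) = k! \ \text{ for an integer } k$$
and
$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$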
(1) Note that when evaluating the Student’s t-density function, all arguments of the gamma function are integers or an integer plus ½.
(2) Note that: The distribution depends on ν, but not μ or σ; the lack of dependence on μ and σ is what makes the t-distribution important in both theory and practice.
http://en.wikipedia.org/wiki/Student's_t-distribution
Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data.
Note that: The distribution depends on ν = n-1, but not μ or σ; the lack of dependence on μ and σ is what makes the t-distribution important in both theory and practice.
T-distribution confidence interval
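For a “two sided” confidence interval, we want
$$P\left(-t < \frac{\hat{X} - \bar{X}}{\tilde{S}/\sqrt{n}} < t\right) = F_{T,n-1}(t) - F_{T,n-1}(-t) = 2\,F_{T,n-1}(t) - 1 = \mathrm{C.I.}$$
(A: textbook steps 1 & 2) Find the appropriate value t based on the value v = n-1 for the confidence interval selected. (Hint: the table says x and n, but you are looking up v = n-1 (not n) and finding t = x based on $F_T$.)
(B: textbook steps 3 & 4) Based on the computed variance for the “estimated” R.V. and the known number of samples, compute the bounds on the inequality.
$$P\left(-t\,\frac{\tilde{S}}{\sqrt{n}} < \hat{X} - \bar{X} < t\,\frac{\tilde{S}}{\sqrt{n}}\right) = \mathrm{C.I.}$$
Or the bounds on the true mean, based on the confidence interval, are
$$P\left(\hat{X} - t\,\frac{\tilde{S}}{\sqrt{n}} < \bar{X} < \hat{X} + t\,\frac{\tilde{S}}{\sqrt{n}}\right) = \mathrm{C.I.}$$
$$\frac{CI}{100} = \int_{-t_c}^{t_c} f_T(t)\,dt = F_T(t_c) - F_T(-t_c) \quad\text{for}\quad -t_c \le t \le t_c,\ \text{2-sided}$$
There are “one-sided” bounds that we have not discussed. For a T-distribution R.V.
$$\frac{CI}{100} = \int_{-\infty}^{t_c} f_T(t)\,dt = F_T(t_c) \quad\text{for}\quad t \le t_c,\ \text{“right-tail”}$$
$$P\left(\hat{X} - \bar{X} \le t\,\frac{\tilde{S}}{\sqrt{n}}\right) = \mathrm{C.I.}$$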
Comparing the density functions: Student’s t and Gaussian
See StudentsT_Plot.m and function students_t.m
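Student’s t
$$f_T(t) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\,\pi}\;\Gamma\left(\frac{v}{2}\right)}\left(1 + \frac{t^2}{v}\right)^{-\frac{v+1}{2}}$$
Gaussian
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X}\exp\left(-\frac{\left(x - \bar{X}\right)^2}{2\,\sigma_X^2}\right)$$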
[Figure: Student's t and Gaussian densities. Density function f_T(t) versus t from -4 to 4, with curves for the Gaussian and for T with v = 1, v = 2, and v = 8.]
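The comparison between the two densities can be explored numerically. The sketch below is not the course's StudentsT_Plot.m or students_t.m; it is an independent Python illustration of the two density formulas above, using `math.gamma` from the standard library, and the function names are chosen here for convenience.

```python
import math


def student_t_pdf(t, v):
    """Student's t density with v degrees of freedom."""
    coeff = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coeff * (1.0 + t * t / v) ** (-(v + 1) / 2)


def gaussian_pdf(x, mean=0.0, sigma=1.0):
    """Gaussian density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


# Compare the peak (t = 0) and the tail (t = 4) for several degrees of freedom.
for v in (1, 2, 8, 100):
    print(f"v={v:<4} f_T(0)={student_t_pdf(0.0, v):.4f}  f_T(4)={student_t_pdf(4.0, v):.5f}")
print(f"Gauss  f(0)  ={gaussian_pdf(0.0):.4f}  f(4)  ={gaussian_pdf(4.0):.5f}")
```

The t densities have a lower peak and heavier tails than the Gaussian, and they approach the Gaussian as v grows, which is why the t-based confidence bounds are wider for small samples.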
HW 4-4.2 A very large population of bipolar transistors has a current gain with a mean value of 120 and a standard deviation of 10. The value of current gain may be assumed to be independent Gaussian random variables.
a) Find the confidence limits for a confidence level of 90% on the sample mean if it is computed from a sample size of 150.
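$$\bar{X} - k\,\frac{\sigma}{\sqrt{n}} \le \hat{X} \le \bar{X} + k\,\frac{\sigma}{\sqrt{n}}$$
Two sided test at 90% means that k = 1.645.
$$k\,\frac{\sigma}{\sqrt{n}} = 1.645\cdot\frac{10}{\sqrt{150}} = 1.343$$
$$120 - 1.343 \le \hat{X} \le 120 + 1.343$$
b) Repeat part (a) if the sample size is 21.
Two sided test at 90% means that k = 1.645.
$$k\,\frac{\sigma}{\sqrt{n}} = 1.645\cdot\frac{10}{\sqrt{21}} = 3.590$$
$$120 - 3.590 \le \hat{X} \le 120 + 3.590$$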
HW 4-4.3 Repeat Problem 4-4.2 for a one-sided confidence interval. Restating the problem …
Find the value of the current gain above which 90% of the sample means would lie.
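$$\bar{X} - k\,\frac{\sigma}{\sqrt{n}} \le \hat{X}$$
(a) 150 sample size
One sided test at 90% means that
$$\Phi(k) = 0.9 \quad\text{or}\quad Q(k) = 1 - 0.9 = 0.1$$
Therefore, k = 1.2816 and
$$k\,\frac{\sigma}{\sqrt{n}} = 1.2816\cdot\frac{10}{\sqrt{150}} = 1.046$$
$$118.95 = \bar{X} - k\,\frac{\sigma}{\sqrt{n}} \le \hat{X}$$
(b) 21 sample size
One sided test at 90% means k = 1.2816 and
$$k\,\frac{\sigma}{\sqrt{n}} = 1.2816\cdot\frac{10}{\sqrt{21}} = 2.797$$
$$117.20 = \bar{X} - k\,\frac{\sigma}{\sqrt{n}} \le \hat{X}$$
Confidence Interval (in %)    One Tail Bounds    k or z_c : z <= z_c
99.99%                        99.99%             3.7190
99.9%                         99.9%              3.0902
99%                           99%                2.3263
95%                           95%                1.6449
90%                           90%                1.2816
80%                           80%                0.8416
75%                           75%                0.6745
50%                           50%                0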
Examples of use:
Exercise 4-4.2
A very large population of resistor values has a true mean of 100 ohms and a sample standard deviation of 4 ohms. Find the confidence interval on the sample mean for a confidence level of 95% if it is computed from:
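a) a sample size of 100.
v = 99. Using v = 60 (no 100 given) and F = 0.975 (2 sided test) on p. G-4, t = 2.00. Therefore
$$\bar{X} - t\,\frac{\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + t\,\frac{\tilde{S}}{\sqrt{n}}$$
$$t\,\frac{\tilde{S}}{\sqrt{n}} = 2.00\cdot\frac{4}{\sqrt{100}} = 0.8$$
$$100 - 0.8 \le \hat{X} \le 100 + 0.8$$
$$99.2 \le \hat{X} \le 100.8$$
Using v = 120 (no 100 given) and F = 0.975 (2 sided test) on p. G-4, t = 1.98.
$$t\,\frac{\tilde{S}}{\sqrt{n}} = 1.98\cdot\frac{4}{\sqrt{100}} = 0.792$$
$$99.208 \le \hat{X} \le 100.792$$
b) a sample size of 9.
v = 8. Using v = 8 and F = 0.975 (2 sided test) on p. G-4, t = 2.306. Therefore
$$\bar{X} - t\,\frac{\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + t\,\frac{\tilde{S}}{\sqrt{n}}$$
$$t\,\frac{\tilde{S}}{\sqrt{n}} = 2.306\cdot\frac{4}{\sqrt{9}} = 3.075$$
$$100 - 3.075 \le \hat{X} \le 100 + 3.075$$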
HW 4-4.2 A very large population of bipolar transistors has a current gain with a mean value of 120 and a standard deviation of 10, The value of current gain may be assumed to be independent Gaussian random variables.
b) Repeat part (a) if the sample size is 21.
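Two sided test at 90% means that k = 1.645.
$$k\,\frac{\sigma}{\sqrt{n}} = 1.645\cdot\frac{10}{\sqrt{21}} = 3.590$$
$$120 - 3.590 \le \hat{X} \le 120 + 3.590$$
If the variance was an estimated variance … instead of a known variance.
v = 20. Using v = 20 and F = 0.95 (2 sided test) on p. G-4, t = 1.725. Therefore
$$\bar{X} - t\,\frac{\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + t\,\frac{\tilde{S}}{\sqrt{n}}$$
$$t\,\frac{\tilde{S}}{\sqrt{n}} = 1.725\cdot\frac{10}{\sqrt{21}} = 3.764$$
$$120 - 3.764 \le \hat{X} \le 120 + 3.764$$
Notice that using an estimated variance results in a wider range of values (3.764 versus 3.590), because the Student's t density has heavier tails than the Gaussian density.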
Skill 17-2 A cereal vendor’s quality control department has just tested a random sample of 10 “20 ounce” boxes of Oat Flakes by weighing them in order to see if their 20 ounce claim is to be believed. Their report, to be forwarded to management, must include a 95% confidence interval as to the population mean.
a) Find the unbiased mean and standard deviation
b) Determine the 95% confidence interval of the mean (by using the Student’s-t table).
c) In general, if the confidence interval becomes tighter (smaller), would the confidence level increase or decrease?
Measurement Data: 19, 18, 21, 21, 18, 22, 17, 19, 20, and 17.
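a) Sample Mean
$$\hat{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where the $X_i$ are random variables with a pdf.
$$\hat{X} = \frac{1}{10}\left(19 + 18 + 21 + 21 + 18 + 22 + 17 + 19 + 20 + 17\right) = \frac{192}{10} = 19.2$$
Unbiased variance
$$\tilde{S}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$$
$$\tilde{S}^2 = \frac{1}{9}\left(0.2^2 + 1.2^2 + 1.8^2 + 1.8^2 + 1.2^2 + 2.8^2 + 2.2^2 + 0.2^2 + 0.8^2 + 2.2^2\right) = \frac{27.6}{9} = 3.067$$
$$\tilde{S} = \sqrt{\frac{27.6}{9}} = \sqrt{3.067} = 1.751$$
b) v = 9. Using v = 9 and F = 0.975 (2 sided test) on p. G-4, t = 2.262. Therefore
$$\hat{X} - t\,\frac{\tilde{S}}{\sqrt{n}} \le \bar{X} \le \hat{X} + t\,\frac{\tilde{S}}{\sqrt{n}}$$
$$t\,\frac{\tilde{S}}{\sqrt{n}} = 2.262\cdot\frac{1.751}{\sqrt{10}} = 1.252$$
$$19.2 - 1.252 \le \bar{X} \le 19.2 + 1.252$$
$$17.948 \le \bar{X} \le 20.452$$
(c) As the confidence interval becomes tighter (smaller), the confidence level (the p%) decreases.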
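The Skill 17-2 calculation can be reproduced in a few lines of Python. This is a sketch under one stated assumption: the Student's t multiplier 2.262 (v = 9, F = 0.975) is copied from the table value quoted in these notes rather than computed, since the Python standard library does not provide an inverse t CDF.

```python
import math
import statistics

# Measured box weights in ounces (from the problem statement).
weights = [19, 18, 21, 21, 18, 22, 17, 19, 20, 17]

# Table value quoted in the notes for v = n - 1 = 9 and F = 0.975.
T_TABLE_V9_F975 = 2.262

n = len(weights)
sample_mean = statistics.mean(weights)      # (1/n) * sum of the data
unbiased_std = statistics.stdev(weights)    # uses the (n - 1) normalization
half_width = T_TABLE_V9_F975 * unbiased_std / math.sqrt(n)

lower = sample_mean - half_width
upper = sample_mean + half_width
print(f"mean = {sample_mean:.1f}, unbiased std = {unbiased_std:.3f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
print("20 oz claim inside interval:", lower <= 20 <= upper)
```

The small differences in the third decimal place relative to the hand calculation come from rounding the standard deviation to 1.751 before multiplying.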
6.2 Definitions
Estimator: a function of the observation vector that estimates a particular parameter.
Unbiased estimator: an estimator is unbiased if the expected value of the estimate equals the true value of the parameter.
Biased estimator: an estimator is biased if the expected value of the estimate differs from the true value, for example by an offset or a gain factor.
Linear estimation: the estimate is a linear combination of the sample points … $\hat{X} = H\,x + b$
Consistent: the estimator is consistent if the estimate converges to the true value as the number of samples goes to infinity.
There are estimators that minimize the variance in the estimate from the sample values.
There are estimators that minimize the mean-squared error for the sample values.
6.6 Estimation of non-Gaussian Parameters.
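Using the Chebyshev Inequality
$$P\left(\left|X - \bar{X}\right| \ge \epsilon\right) \le \frac{\sigma_X^2}{\epsilon^2}$$
The confidence interval for the statistical average became
$$P\left(\bar{X} - k\,\frac{\sigma_X}{\sqrt{n}} < \hat{X} < \bar{X} + k\,\frac{\sigma_X}{\sqrt{n}}\right) \ge 1 - \frac{1}{k^2}$$
But based on the central limit theorem, we determined the probability to be related as a sum RV where the resulting RV becomes Gaussian, with prescribed means and variances based on the original density functions.
$$P\left(-k < \frac{\hat{X} - \bar{X}}{\sigma_X/\sqrt{n}} < k\right) = \Phi(k) - \Phi(-k)$$
or
$$P\left(\hat{X} - k\,\frac{\sigma_X}{\sqrt{n}} < \bar{X} < \hat{X} + k\,\frac{\sigma_X}{\sqrt{n}}\right) = \Phi(k) - \Phi(-k)$$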
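When the initial distributions of the summed random variables are non-Gaussian, the mean and variance may be related, as for example in the exponential distribution.
Example 6.6-1: exponential distribution
$$f_X(x) = \lambda\,e^{-\lambda x}\,u(x)$$
where
$$\bar{X} = \frac{1}{\lambda} \quad\text{and}\quad \sigma_X^2 = \frac{1}{\lambda^2}$$
We wish to estimate bounds for $\lambda$:
$$P\left(\hat{X} - k\,\frac{\sigma_X}{\sqrt{n}} < \bar{X} < \hat{X} + k\,\frac{\sigma_X}{\sqrt{n}}\right) = \Phi(k) - \Phi(-k)$$
$$P\left(-k\,\frac{\sigma_X}{\sqrt{n}} < \hat{X} - \bar{X} < k\,\frac{\sigma_X}{\sqrt{n}}\right) = \Phi(k) - \Phi(-k)$$
$$P\left(-\frac{k}{\sqrt{n}}\cdot\frac{1}{\lambda} < \hat{X} - \frac{1}{\lambda} < \frac{k}{\sqrt{n}}\cdot\frac{1}{\lambda}\right) = \Phi(k) - \Phi(-k)$$
$$P\left(\frac{1}{\lambda}\left(1 - \frac{k}{\sqrt{n}}\right) < \hat{X} < \frac{1}{\lambda}\left(1 + \frac{k}{\sqrt{n}}\right)\right) = \Phi(k) - \Phi(-k)$$
$$P\left(\frac{1}{\hat{X}}\left(1 - \frac{k}{\sqrt{n}}\right) < \lambda < \frac{1}{\hat{X}}\left(1 + \frac{k}{\sqrt{n}}\right)\right) = \Phi(k) - \Phi(-k)$$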
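Example 6.6-2: exponential distribution
Determine the 95% confidence interval for $\lambda$ from 64 samples when the estimated mean of the data is 3.5.
$$P\left(\frac{1}{\hat{X}}\left(1 - \frac{k}{\sqrt{n}}\right) < \lambda < \frac{1}{\hat{X}}\left(1 + \frac{k}{\sqrt{n}}\right)\right) = \Phi(k) - \Phi(-k)$$
Then, with
$$k = 1.96$$
$$P\left(\frac{1}{3.5}\left(1 - \frac{1.96}{\sqrt{64}}\right) < \lambda < \frac{1}{3.5}\left(1 + \frac{1.96}{\sqrt{64}}\right)\right) = 0.95$$
$$P\left(\frac{0.755}{3.5} < \lambda < \frac{1.245}{3.5}\right) = 0.95$$
$$P\left(0.216 < \lambda < 0.356\right) = 0.95$$
Note that the point estimate of $\lambda$, based on a mean of 3.5, is
$$\hat{\lambda} = \frac{1}{\hat{X}} = 0.286$$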
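The interval for λ derived in Examples 6.6-1 and 6.6-2 can be sketched as a small Python function. The name `exponential_rate_interval` is illustrative, and the multiplier k is computed from `statistics.NormalDist` instead of being rounded to 1.96.

```python
import math
from statistics import NormalDist


def exponential_rate_interval(sample_mean, n, conf):
    """Large-sample (CLT-based) interval for the rate lambda of an exponential
    distribution: (1/X_hat) * (1 -/+ k/sqrt(n)), with 2*Phi(k) - 1 = conf."""
    k = NormalDist().inv_cdf((1.0 + conf) / 2.0)
    rate_hat = 1.0 / sample_mean               # point estimate of lambda
    spread = k / math.sqrt(n)
    return rate_hat * (1.0 - spread), rate_hat, rate_hat * (1.0 + spread)


low, rate_hat, high = exponential_rate_interval(sample_mean=3.5, n=64, conf=0.95)
print(f"lambda_hat = {rate_hat:.3f}, 95% interval: {low:.3f} to {high:.3f}")
```

Because the standard deviation of an exponential variable equals its mean, the interval width relative to the estimate depends only on k and n, not on the measured mean.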
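Example 6.6-3 Bernoulli distribution
$$\mathrm{pmf}_X(k) = q\,\delta(k) + p\,\delta(k-1), \quad \text{for } q = 1 - p$$
$$\mu_X = E[X] = p$$
$$\sigma_X^2 = \mathrm{VAR}[X] = p\,(1-p) = p\,q$$
The statistical summation based on CLT should provide
$$E[\hat{\mu}_X] = \mu_X = p \quad \text{and} \quad \sigma_S^2 = \frac{\sigma_X^2}{n} = \frac{p\,q}{n}$$
$$P\left[-\frac{k\,\sigma_X}{\sqrt{n}} \le \hat{\mu}_X - \mu_X \le \frac{k\,\sigma_X}{\sqrt{n}}\right] = \Phi(k) - \Phi(-k)$$
$$P\left[-k\sqrt{\frac{p\,q}{n}} \le \hat{p} - p \le k\sqrt{\frac{p\,q}{n}}\right] = \Phi(k) - \Phi(-k)$$
The range of the bounds becomes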
$$P\left[-k\sqrt{\frac{p\,q}{n}} \le \hat{p} - p \le k\sqrt{\frac{p\,q}{n}}\right] = \Phi(k) - \Phi(-k)$$
Determining the interval bounds and width – solve for p in terms of all the other variables! Remember that q=1-p
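$$(\hat{p} - p)^2 = \frac{k^2\,p\,q}{n}$$
$$\hat{p}^2 - 2\hat{p}\,p + p^2 = \frac{k^2}{n}\,p\,(1-p)$$
$$p^2\left(1 + \frac{k^2}{n}\right) - p\left(2\hat{p} + \frac{k^2}{n}\right) + \hat{p}^2 = 0$$
$$p^2 - p\,\frac{2\hat{p} + k^2/n}{1 + k^2/n} = -\frac{\hat{p}^2}{1 + k^2/n}$$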
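Complete the square on the left (after multiplying numerators and denominators by n)
$$p^2 - 2p\,\frac{2n\hat{p} + k^2}{2(n + k^2)} + \left(\frac{2n\hat{p} + k^2}{2(n + k^2)}\right)^2 = \left(\frac{2n\hat{p} + k^2}{2(n + k^2)}\right)^2 - \frac{n\,\hat{p}^2}{n + k^2}$$
$$\left(p - \frac{2n\hat{p} + k^2}{2(n + k^2)}\right)^2 = \left(\frac{2n\hat{p} + k^2}{2(n + k^2)}\right)^2 - \frac{n\,\hat{p}^2}{n + k^2}$$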
Simplify as best possible
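$$\left(p - \frac{2n\hat{p} + k^2}{2(n + k^2)}\right)^2 = \frac{(2n\hat{p} + k^2)^2 - 4n\hat{p}^2\,(n + k^2)}{4(n + k^2)^2}$$
$$\left(p - \frac{2n\hat{p} + k^2}{2(n + k^2)}\right)^2 = \frac{4n^2\hat{p}^2 + 4nk^2\hat{p} + k^4 - 4n^2\hat{p}^2 - 4nk^2\hat{p}^2}{4(n + k^2)^2} = \frac{k^4 + 4nk^2\,\hat{p}\,(1 - \hat{p})}{4(n + k^2)^2}$$
$$p_{1,2} = \frac{2n\hat{p} + k^2}{2(n + k^2)} \pm \frac{k\sqrt{k^2 + 4n\,\hat{p}\,(1 - \hat{p})}}{2(n + k^2)}$$
$$p_1 = \frac{2n\hat{p} + k^2 + k\sqrt{k^2 + 4n\,\hat{p}\,(1 - \hat{p})}}{2(n + k^2)}$$
and
$$p_2 = \frac{2n\hat{p} + k^2 - k\sqrt{k^2 + 4n\,\hat{p}\,(1 - \hat{p})}}{2(n + k^2)}$$
The distance between the two solutions goes to zero as n increases …
$$p_1 - p_2 = \frac{k\sqrt{k^2 + 4n\,\hat{p}\,(1 - \hat{p})}}{n + k^2}$$
For large n this behaves like $2k\sqrt{\hat{p}\,(1 - \hat{p})/n}$.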
Note: I have no clue what the textbook did …
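As a numerical cross-check on the algebra, the sketch below solves $(\hat{p} - p)^2 = k^2\,p\,(1-p)/n$ for p with the quadratic formula and verifies that both roots satisfy the relation. The function name `bernoulli_interval` is hypothetical; the closed form it evaluates is the one commonly called the Wilson score interval, written here directly from the quadratic and not taken from the textbook.

```python
import math
from statistics import NormalDist


def bernoulli_interval(p_hat: float, n: int, confidence: float = 0.95):
    """Roots p of (p_hat - p)^2 = k^2 * p * (1 - p) / n, returned as (low, high)."""
    k = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    center = (2.0 * n * p_hat + k**2) / (2.0 * (n + k**2))
    half_width = (k * math.sqrt(k**2 + 4.0 * n * p_hat * (1.0 - p_hat))
                  / (2.0 * (n + k**2)))
    return center - half_width, center + half_width


# Illustrative values only: p_hat = 0.47 at two sample sizes.
for n in (100, 1200):
    low, high = bernoulli_interval(0.47, n)
    print(f"n = {n:5d}: {low:.4f} <= p <= {high:.4f}  (width {high - low:.4f})")
```

The width shrinks roughly as $1/\sqrt{n}$, consistent with the statement that the two solutions approach each other as n increases.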
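Example 6.6-4 Is it a fair coin?
If we get 47 heads after tossing a coin 100 times, is it fair within a 95% confidence interval?
$$P\left[-k\sqrt{\frac{0.5 \cdot 0.5}{100}} \le \hat{p} - p \le k\sqrt{\frac{0.5 \cdot 0.5}{100}}\right] = \Phi(k) - \Phi(-k) = 0.95$$
$$k = 1.96$$
$$P\left[-1.96\,\frac{0.5}{10} \le \hat{p} - p \le 1.96\,\frac{0.5}{10}\right] = 0.95$$
$$P\left[-0.098 \le \hat{p} - p \le 0.098\right] = 0.95$$
$$P\left[0.5 - 0.098 \le \hat{p} \le 0.5 + 0.098\right] = 0.95$$
$$P\left[0.402 \le \hat{p} \le 0.598\right] = 0.95$$
Therefore, 0.47 is within the acceptable range. Alternately
$$P\left[\hat{p} - k\sqrt{\frac{0.5 \cdot 0.5}{100}} \le p \le \hat{p} + k\sqrt{\frac{0.5 \cdot 0.5}{100}}\right] = 0.95$$
$$P\left[0.47 - 0.098 \le p \le 0.47 + 0.098\right] = 0.95$$
$$P\left[0.372 \le p \le 0.568\right] = 0.95$$
If the number of coin flips were 1200 with the same proportional results …
$$P\left[-1.96\,\frac{0.5}{34.64} \le \hat{p} - p \le 1.96\,\frac{0.5}{34.64}\right] = 0.95$$
$$P\left[-0.0283 \le \hat{p} - p \le 0.0283\right] = 0.95$$
$$P\left[0.4717 \le \hat{p} \le 0.5283\right] = 0.95$$ (0.47 is not in the range)
Alternately
$$P\left[0.4417 \le p \le 0.4983\right] = 0.95$$ (0.5 is not in the range)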
We would have to say the coin is biased. The values are not within the 95% confidence intervals!
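The two cases can be reproduced with a few lines of code. This is a sketch under the same assumptions as the example: k = 1.96, and the standard deviation evaluated at the hypothesized fair value p = 0.5. The function name `fair_coin_check` is hypothetical, and 564 heads in 1200 tosses stands in for "the same proportional results".

```python
import math


def fair_coin_check(heads: int, n: int, k: float = 1.96, p0: float = 0.5):
    """Return (p_hat, (low, high), inside) for the interval p0 -/+ k*sqrt(p0*(1-p0)/n)."""
    p_hat = heads / n
    half = k * math.sqrt(p0 * (1.0 - p0) / n)
    return p_hat, (p0 - half, p0 + half), abs(p_hat - p0) <= half


for heads, n in ((47, 100), (564, 1200)):
    p_hat, (low, high), inside = fair_coin_check(heads, n)
    verdict = "within" if inside else "outside"
    print(f"n = {n:4d}: p_hat = {p_hat:.2f} is {verdict} [{low:.4f}, {high:.4f}]")
```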
6.7 Maximum Likelihood Estimators
The likelihood function can be “properly” defined as
$$L(\theta; x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i \mid \theta)$$
We are interested in finding the value of theta, $\theta$, that maximizes this function!
To solve … take the derivative, set it to zero, solve, and determine the minima and maxima. Pick the global maximum!
Easy … right ?!
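Example 6.7-1 Bernoulli RV of an unknown probability p.
What is the maximum likelihood estimate of p if after flipping coins n times, we have $k_1$ heads?
$$\Pr[k] = \binom{n}{k}\,p^{k}\,(1-p)^{n-k}, \quad \text{for } k = 0, 1, 2, \ldots, n$$
We define a likelihood function
$$\Pr[Y = k_1 \mid p] = \binom{n}{k_1}\,p^{k_1}\,(1-p)^{n-k_1}$$
Determine the derivative
$$\frac{d}{dp}\Pr[Y = k_1 \mid p] = \binom{n}{k_1}\left[k_1\,p^{k_1-1}(1-p)^{n-k_1} - (n-k_1)\,p^{k_1}(1-p)^{n-k_1-1}\right] = 0$$
$$p^{k_1-1}(1-p)^{n-k_1-1}\left[k_1(1-p) - (n-k_1)\,p\right] = 0$$
$$p^{k_1-1}(1-p)^{n-k_1-1}\left[k_1 - k_1 p - n p + k_1 p\right] = 0$$
$$p^{k_1-1}(1-p)^{n-k_1-1}\left[k_1 - n p\right] = 0$$
The roots are at
$$p = 0,\ 1,\ \frac{k_1}{n}$$
Two are minima (p=0 and p=1), therefore the ML probability is
$$p_{ML} = \frac{k_1}{n}$$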
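A brute-force check of this result: evaluate the binomial likelihood on a grid of p values and confirm that the maximum lands at $k_1/n$. This is a minimal sketch; the function names and the values n = 100, $k_1$ = 47 are illustrative choices, not from the course materials.

```python
import math


def bernoulli_likelihood(p: float, n: int, k1: int) -> float:
    """Pr[Y = k1 | p] = C(n, k1) * p^k1 * (1 - p)^(n - k1)."""
    return math.comb(n, k1) * p**k1 * (1.0 - p) ** (n - k1)


def grid_mle(n: int, k1: int, steps: int = 1000) -> float:
    """Return the grid point in [0, 1] with the largest likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: bernoulli_likelihood(p, n, k1))


n, k1 = 100, 47
print("grid maximum :", grid_mle(n, k1))
print("closed form  :", k1 / n)
print("likelihood at p=0 and p=1:",
      bernoulli_likelihood(0.0, n, k1), bernoulli_likelihood(1.0, n, k1))
```

The likelihood is zero at both endpoints whenever $0 < k_1 < n$, which is why those two roots are discarded.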
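Example 6.7-2 Determine the mean of a Gaussian R.V. with known variance.
For
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X}\exp\left(-\frac{(x - \mu)^2}{2\sigma_X^2}\right)$$
The joint likelihood function for n sample trials becomes
$$L(\mu) = \prod_{i=1}^{n} f_X(x_i \mid \mu) = \left(\frac{1}{\sqrt{2\pi}\,\sigma_X}\right)^{n}\exp\left(-\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{2\sigma_X^2}\right)$$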
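A simplification often performed is to use the log-likelihood function or
$$\log L(\mu) = \log\prod_{i=1}^{n} f_X(x_i \mid \mu) = \sum_{i=1}^{n}\log f_X(x_i \mid \mu)$$
$$\log L(\mu) = n\log\left(\frac{1}{\sqrt{2\pi}\,\sigma_X}\right) + \log\exp\left(-\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{2\sigma_X^2}\right)$$
$$\log L(\mu) = n\log\left(\frac{1}{\sqrt{2\pi}\,\sigma_X}\right) + \sum_{i=1}^{n}\log\exp\left(-\frac{(x_i - \mu)^2}{2\sigma_X^2}\right)$$
$$\log L(\mu) = n\log\left(\frac{1}{\sqrt{2\pi}\,\sigma_X}\right) - \sum_{i=1}^{n}\frac{(x_i - \mu)^2}{2\sigma_X^2} = n\log\left(\frac{1}{\sqrt{2\pi}\,\sigma_X}\right) - \frac{1}{2\sigma_X^2}\sum_{i=1}^{n}(x_i - \mu)^2$$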
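Taking the derivative
$$\frac{d}{d\mu}\log L(\mu) = \frac{d}{d\mu}\left[n\log\left(\frac{1}{\sqrt{2\pi}\,\sigma_X}\right)\right] - \frac{1}{2\sigma_X^2}\,\frac{d}{d\mu}\sum_{i=1}^{n}(x_i - \mu)^2 = 0$$
$$\frac{1}{2\sigma_X^2}\sum_{i=1}^{n} 2\,(x_i - \mu) = 0$$
$$\sum_{i=1}^{n} x_i = \sum_{i=1}^{n}\mu = n\,\mu$$
$$\hat{\mu}_{ML} = \frac{1}{n}\sum_{i=1}^{n} x_i$$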
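The result can be checked numerically by evaluating the log-likelihood at the sample mean and at nearby values. This is a sketch with made-up data: the true mean of 3, $\sigma_X = 2$, the sample size of 50 and the seed are all hypothetical choices used only to make the run repeatable.

```python
import math
import random
from statistics import fmean


def gaussian_log_likelihood(mu: float, samples: list, sigma: float) -> float:
    """log L(mu) = n*log(1/(sqrt(2*pi)*sigma)) - sum((x_i - mu)^2) / (2*sigma^2)."""
    n = len(samples)
    constant = n * math.log(1.0 / (math.sqrt(2.0 * math.pi) * sigma))
    return constant - sum((x - mu) ** 2 for x in samples) / (2.0 * sigma**2)


random.seed(0)                                   # repeatable hypothetical data
sigma = 2.0                                      # known standard deviation
samples = [random.gauss(3.0, sigma) for _ in range(50)]

mu_ml = fmean(samples)                           # closed-form ML estimate: sample mean
for candidate in (mu_ml - 0.1, mu_ml, mu_ml + 0.1):
    value = gaussian_log_likelihood(candidate, samples, sigma)
    print(f"mu = {candidate:.4f}  log L = {value:.4f}")
```

The middle row has the largest log-likelihood, since moving $\mu$ away from the sample mean by $d$ lowers $\log L$ by $n\,d^2/(2\sigma_X^2)$.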
Maximum Likelihood Estimator Properties
Invariance: if an MLE is found for theta, $\theta$, then the MLE of a function of theta is the function of the MLE.
For $\hat{\theta}_{ML}$
Then $y = h(\theta)$ has $\hat{y}_{ML} = h(\hat{\theta}_{ML})$
from https://en.wikipedia.org/wiki/Maximum_likelihood
Consistency: the sequence of MLEs converges in probability to the value being estimated.
Asymptotic normality: as the sample size increases, the distribution of the MLE tends to the Gaussian distribution with mean $\theta$ and covariance matrix equal to the inverse of the Fisher information matrix.
Efficiency: the MLE achieves the Cramér–Rao lower bound when the sample size tends to infinity. This means that no consistent estimator has lower asymptotic mean squared error than the MLE (or other estimators attaining this bound).
Second-order efficiency after correction for bias.
6.8 Ordering, Ranking and Percentiles
Percentile: https://en.wikipedia.org/wiki/Percentile
“A percentile (or a centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value (or score) below which 20 percent of the observations may be found.”
Textbook:
The u-th percentile of X is the number $x_u$ such that $F_X(x_u) = u$.
$$u = F_X(x_u)$$
One would say that the result $x_u$ is in the u-th percentile.
Example 6.8-1 Assume a person’s IQ is distributed as N(100,100). That is, a Gaussian normal with a mean of 100, a variance of 100 and a standard deviation of 10.
Then an IQ of 115 would be at what percentile of the population?
$$z = \frac{115 - 100}{10} = 1.5$$
$$\Phi(1.5) = 0.9332$$
The individual is in the 93rd percentile for IQ.
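The table lookup can be reproduced with the standard library's normal distribution. A minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=10)        # N(100, 100): variance 100, std. dev. 10

z = (115 - iq.mean) / iq.stdev           # standardized score
percentile = iq.cdf(115)                 # F_X(115) = Phi(1.5)

print(f"z = {z:.1f}, F_X(115) = {percentile:.4f}")   # z = 1.5, F_X(115) = 0.9332
```

Going the other way, `iq.inv_cdf(u)` returns the value $x_u$ for a given u.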
Median: The median of a population is defined where half of the population is above and half is below the value.
$$F_X(x_{median}) = 0.5$$
For some of the distributions described, the mean and the median are not equal!
For example, the exponential distribution.
$$x_{median} = \frac{-\ln(0.5)}{\lambda} = \frac{0.69}{\lambda} \quad \text{whereas} \quad \mu_X = \frac{1}{\lambda}$$
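A quick check of the median against the exponential CDF $F_X(x) = 1 - e^{-\lambda x}$ for $x \ge 0$. This is a sketch; the rate $\lambda = 2$ is an arbitrary illustrative value and the function names are hypothetical.

```python
import math


def exponential_cdf(x: float, lam: float) -> float:
    """F_X(x) = 1 - exp(-lam * x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def exponential_median(lam: float) -> float:
    """Solve F_X(x) = 0.5: x = -ln(0.5) / lam = ln(2) / lam."""
    return math.log(2.0) / lam


lam = 2.0                                  # hypothetical rate parameter
median = exponential_median(lam)
mean = 1.0 / lam

print(f"median = {median:.4f}, F_X(median) = {exponential_cdf(median, lam):.2f}")
print(f"mean   = {mean:.4f}, F_X(mean)   = {exponential_cdf(mean, lam):.4f}")
```

The CDF evaluated at the mean is $1 - e^{-1} \approx 0.632$ for any $\lambda$, so the mean always sits above the median for this distribution.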