MIT OpenCourseWare ://ocw.mit.edu/courses/mechanical-engineering/2-830j-control-of... · • Random...
Transcript of MIT OpenCourseWare ://ocw.mit.edu/courses/mechanical-engineering/2-830j-control-of... · • Random...
MIT OpenCourseWare _________http://ocw.mit.edu___
2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303)Spring 2008
For information about citing these materials or our Terms of Use, visit: ________________http://ocw.mit.edu/terms.
1Manufacturing
Control of Manufacturing Processes
Subject 2.830/6.780/ESD.63Spring 2008Lecture #5
Probability Models, Parameter Estimation, and Sampling
February 21, 2008
2Manufacturing
The Normal Distribution
p(x) =1
σ 2πe
− 12
x − μσ
⎛ ⎝ ⎜
⎞ ⎠ ⎟
2
z
0
0.1
0.2
0.3
0.4
-4 -3 -2 -1 0 1 2 3 4
z =x − μ
σ
“Standard normal”
μz=0σz=1
3Manufacturing
Properties of the Normal pdf
• Symmetric about mean• Only two parameters:
μ and σ
• Mean (μ) and Variance ( σ2 ) have well known “estimators” (average and sample variance)
p(x) =1
σ 2πe
− 12
x − μσ
⎛ ⎝ ⎜
⎞ ⎠ ⎟
2
4Manufacturing
Testing the Model: e.g. Is the Process “Normal” ?
• Is the underlying distribution really normal?– Look at histogram– Look at curve fit to histogram– Look at % of data in 1, 2 and 3σ bands
• Confidence Intervals– Look at “kurtosis”
• Measure of deviation from normal– Probability (or qq) plots (see Mont. 3-3.7 or MATLAB
stats toolbox)
5Manufacturing
Kurtosis: Deviation from Normal
k =n(n +1)
(n −1)(n − 2)(n − 3)xi − x
s⎛ ⎝ ⎜
⎞ ⎠ ⎟ ∑
4⎡
⎣ ⎢ ⎢
⎤
⎦ ⎥ ⎥
−3(n −1)2
(n − 2)(n − 3)
For sampled data:
k = 0 - normalk > 0 - more “peaked”k < 0 - more “flat”
6Manufacturing
Kurtosis for Some Common Distributions
D: Laplace (k = 3)L: logistic (k = 1.2)N: normal (k = 0)U: uniform (k = -1.2)
Source: Wikimedia Commons, http://commons.wikimedia.org
7Manufacturing
• Plot– normalized (mean
centered and scaled to s)
vs. – theoretical position
of unit normal distribution for ordered data
• Normal distribution: data should fall along line
Quantile-Quantile (qq) Plots
Source: Wikimedia Commons, http://commons.wikimedia.org
8Manufacturing
Guaranteeing “Normality”The Central Limit Theorem
– If x1, x2 ,x3 ...xN … are N independent observations of a random variable with “moments” μx and σ2
x,
– The distribution of the sum of all the samples will tend toward normal.
9Manufacturing
Example: Uniformly Distributed Data
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
10
20
30
40
50
60
70
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
10
20
30
40
50
60
70
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
10
20
30
40
50
60
70
x1
x2
x100
+
+ . . .
Sum of 100 sets of 1000 points each
35 40 45 50 55 60 650
50
100
150
y = xii=1
100
∑
10Manufacturing
Sampling: Using Measurements (Data) to Model the Random Process
• In general p(x) is unknown• Data can suggest form of p(x)
– e.g.. uniform, normal, weibull, etc.• Data can be used to estimate parameters of distributions
– e.g. μ and σ for normal distribution: p(x) = N(μ ,σ2)
• How to estimate– Sample Statistics
• Uncertainty in estimates– Sample Statistic pdf’s
p(x) =1
σ 2πe
− 12
x − μσ
⎛ ⎝ ⎜
⎞ ⎠ ⎟
2
11Manufacturing
Sample Statistics
Average or sample mean
Sample variance
Sample standard deviation
12Manufacturing
Sample Mean Uncertainty
• If all xi come from a distribution with μx and σ2x, and
we divide the sum by n:
x =1n
xii =1
n
∑
μx = μx σ x2 =
1n
σ x2 or σ x =
1n
σ x
x = c1x1 + c2 x2 + c3x3 + K cn xn
ci =1n
Then: and
13Manufacturing
Manufacturing as Random Processes
• All physical processes have a degree of natural randomness
• We can model this behavior using probability distribution functions
• We can calibrate and evaluate the quality of this model from measurement data
14Manufacturing
Formal Use of Statistical Models
• Discrete Variable Distributions and Uses– Attribute Modeling
• Sampling: Key distributions arising in sampling• Chi-square, t, and F distributions
• Estimation: – Reasoning about the population based on a sample
• Some basic confidence intervals• Estimate of mean with variance known• Estimate of mean with variance not known• Estimate of variance
• Hypothesis tests
15Manufacturing
Discrete Distribution: Bernoulli
Bernoulli trial: an experiment with two outcomes
Probability density function (pdf):
x
f(x)
p1 - p
0 1
¼
¾
16Manufacturing
Discrete Distribution: Binomial
Repeated random Bernoulli trials
• n is the number of trials• p is the probability of “success” on any one trial• x is the number of successes in n trials
17Manufacturing
Binomial DistributionBinomial Distribution
0
0.05
0.1
0.15
0.2
0.25
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29
Number of "Successes"
0
0.2
0.4
0.6
0.8
1
1.2
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29
Series1
e.g. expected number of “rejects”
18Manufacturing
Discrete Distribution: Poisson
• Poisson is a good approximation to Binomial when n is large and p is small (< 0.1)
Example applications:# misprints on page(s) of a book# defects on a wafer
Mean:Variance:
19Manufacturing
Poisson Distribution
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41
Events per unit
c
Poisson Distribution
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
0.2
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41
Events per unit
c
Poisson Distribution
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41
Events per unit
λ=5
λ=20
λ=30
Poisson Distributions
e.g. defects/device
20Manufacturing
Back to Continuous Distributions
• Uniform Distribution• Normal Distribution
– Unit (Standard) Normal Distribution
21Manufacturing
Continuous Distribution: Uniform
xa
1
b
xa b
cdf
22Manufacturing
Standard Questions For a Known cdf or pdf
xa
1
b
xa b• Probability x sits within
• Probability x less than orequal to some value
some range
23Manufacturing
Continuous Distribution: Normal or Gaussian
cdf
0
0.16
0.5
0.84
1
0.9770.99865
0.02270.00135
24Manufacturing
Continuous Distribution: Unit Normal
• Normalization
cdf
Mean
Variance
25Manufacturing
Using the Unit Normal pdf and cdf
• We often want to talk about “percentage points”of the distribution – portion in the tails
0.1
0.5
0.9
1
26Manufacturing
Use of the pdf:Location of Data
• How likely are certain values of the random variable?
• For a “Standard Normal” Distribution:
z = (x − μ)σ
N(0,1)μ=0
σ=1
z =1 ⇒ x =1σz = 2 ⇒ x = 2σz = 3 ⇒ x = 3σ
0
0.1
0.2
0.3
0.4
-4 -3 -2 -1 0 1 2 3 4
27Manufacturing
Location of Data
P(-1≤ z ≤ 1) = P(z ≤1) - P(z ≤-1) = Φ(1) - Φ(-1) (+ 1σ) = 0.841 - (1-0.841) = 0.682
P(-2≤ z ≤ 2) = P(z ≤2) - P(z ≤-2) = 0.977 - (1-0.977) = 0.954(+ 2σ)
P(-3≤ z ≤ 3) = P(z ≤3) - P(z ≤-3) = 0.998 - (1-0.998) = 0.997(+ 3σ)
Φ(z) tabulated (e.g. p. 752 of Montgomery)
29Manufacturing
Statistics
The field of statistics is about reasoning in the face of uncertainty, based on evidence from
observed data
• Beliefs:– Probability distribution or probabilistic model form– Distribution/model parameters
• Evidence:– Finite set of observations or data drawn from a
population (experimental measurements or observations)
• Models:– Seek to explain data wrt a model of their probability
30Manufacturing
Sampling to Determine Parameters of the Parent Probability Distribution
• Assume Process Under Study has a Parent Distribution p(x)• Take “n” Samples From the Process Output (xi)
• Look at Sample Statistics (e.g. sample mean and sample variance)
• Relationship to Parent• Both are Random Variables• Both Have Their Own Probability Distributions
• Inferences about the process (the parent distribution) via Inferences about the derived sampling distribution
31Manufacturing
Moments of the Population vs. Sample Statistics
• Mean
• Variance
• StandardDeviation
• Covariance
• CorrelationCoefficient
Underlying model or Population Probability
Sample Statistics
32Manufacturing
Sampling and Estimation
• Sampling: act of making observations from populations
• Random sampling: when each observation is identically and independently distributed (IID)
• Statistic: a function of sample data; a value that can be computed from data (contains no unknowns) – Average, median, standard deviation– Statistics are by definition also random variables
33Manufacturing
Population vs. Sampling Distribution
Population(“true”)probability density function)
Sample Mean(statistic)
n = 20
n = 10
n = 2
Sample Mean Distribution(sampling distribution)
n = 1
34Manufacturing
Sampling and Estimation, cont.• Sampling• Random sampling• Statistic• A statistic is a random variable, which itself has a
sampling (probability) distribution– I.e., if we take multiple random samples, the value for the
statistic will be different for each set of samples, but will begoverned by the same sampling distribution
• If we know the appropriate sampling distribution, we can reason about the underlying population based on the observed value of a statistic– E.g. we calculate a sample mean from a random sample; in
what range do we think the actual (population) mean sits?
35Manufacturing
Sampling and Estimation – An Example
• Suppose we know that the thickness of a part is normally distributed with std. dev. of 10:
• We sample n = 50 random parts and compute the mean part thickness:
• First question: What is distribution of the mean of T =
• Second question: can we use knowledge of distribution to reason about the actual (population) mean μgiven observed (sample) mean?
36Manufacturing
Estimation and Confidence Intervals• Point Estimation:
– Find best values for parameters of a distribution– Should be
• Unbiased: expected value of estimate should be true value• Minimum variance: should be estimator with smallest variance
• Interval Estimation: – Give bounds that contain actual value with a given
probability– Must know sampling distribution!
37Manufacturing
Confidence Intervals: Variance Known
• We know σ, e.g. from historical data• Estimate mean in some interval to (1-α)100% confidence
0.1
0.5
0.9
1
Remember the unit normal percentage points
Apply to the sampling distribution for the sample mean
38Manufacturing
95% confidence interval, α = 0.05
Example, Cont’d
• Second question: can we use knowledge of distribution to reason about the actual (population) mean μ given observed (sample) mean?
n = 50
~95% of distributionlies within +/- 2σ of mean
39Manufacturing
Summary• Process as Random Variable
– Histograms to pdf’s• Different Distributions for Different Processes
– Discrete or Binary (e.g. Defects)– Continuous (e.g. Dimensional Variation)
• Parent Distributions and Sampling– Estimating the Parent from Data
• Use of Distributions to establish “Confidence” on Parameter Estimates