Economics 105: Statistics
• Review #1 to be handed out Tuesday, due following Tuesday in class. Take-home, closed-book, closed-notes, untimed, must use Excel or calculator (and transfer answers to the exam paper).
• Formula sheet rules: No words, in English or otherwise. Only formulas/equations. No proofs. Symbols like B for Binomial are okay. Front & back of 1 sheet of paper. Excel help is okay.
• Equation editor can be useful
• Go over GH 6, GH 7 & 8 due Tuesday
Probability Distributions
• Discrete Probability Distributions
– Bernoulli
– Binomial
– Hypergeometric
– Poisson
• Continuous Probability Distributions
– Normal
– Uniform
– Exponential
Exponential Distribution
• Graph
• Useful for waiting time, duration, or queuing problems
• Memoryless property
• Find the probability that no student arrives in the next hour.
• Find the probability that a student arrives in the next 5 minutes.
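The two student-arrival questions can be sketched in Python. The arrival rate did not survive in the transcript, so the rate of 2 students per hour below is a made-up value for illustration only, and the function names are likewise hypothetical. The sketch uses the standard exponential results P(T > t) = e^(−λt) and P(T ≤ t) = 1 − e^(−λt).

```python
import math

# Hypothetical arrival rate (students per hour). The slide's actual value
# was not preserved, so this number is an assumption for illustration only.
RATE_PER_HOUR = 2.0


def prob_no_arrival(rate: float, t: float) -> float:
    """P(T > t) for an exponential waiting time T with the given rate."""
    return math.exp(-rate * t)


def prob_arrival_within(rate: float, t: float) -> float:
    """P(T <= t), the exponential cumulative distribution function."""
    return 1.0 - math.exp(-rate * t)


if __name__ == "__main__":
    # No student arrives in the next hour (t = 1 hour).
    print(prob_no_arrival(RATE_PER_HOUR, 1.0))
    # A student arrives in the next 5 minutes (t = 5/60 hour).
    print(prob_arrival_within(RATE_PER_HOUR, 5 / 60))
```

The memoryless property shows up as P(T > s + t) / P(T > s) = P(T > t): having already waited s hours does not change the chance of waiting t more.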
Normal Distribution
• Let X ~ N(μ, σ²)
• The p.d.f. is given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, for $-\infty < x < \infty$
• “The bell curve”, also sometimes called the Gaussian distribution, after Carl Friedrich Gauss
• http://cnx.rice.edu/content/m11161/latest/#java
• Reading the table … pages 915 in BLK, 11th edition. Note that the numbers across the top (i.e., at the top of each column) are the SECOND digit after the decimal.
The Normal Distribution
• ‘Bell Shaped’
• Symmetrical
• Mean, Median and Mode are Equal
• Location is determined by the mean, μ
• Spread is determined by the standard deviation, σ
• The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell-shaped curve of f(X) against X, centered at μ where Mean = Median = Mode, with spread σ]
Many Normal Distributions
By varying the parameters μ and σ, we obtain different normal distributions
Standardized Normal Distribution
• The Z distribution always has mean = 0 and standard deviation = 1
• Values above the mean have positive Z-values, values below the mean have negative Z-values
[Figure: standardized normal curve of f(Z) against Z, centered at 0 with standard deviation 1]
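Example
• Convention: X ~ N(μ, σ²), so the second parameter is the variance
• If X ~ N(100, 2500), then the Z value for X = 200 is $Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$
• This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.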
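As a quick check of the standardization step, here is a minimal Python sketch. The function name `z_score` is a hypothetical helper, and it assumes the N(mean, variance) convention, so the standard deviation is the square root of the second parameter.

```python
import math


def z_score(x: float, mean: float, variance: float) -> float:
    """Standardize x under the N(mean, variance) convention."""
    std_dev = math.sqrt(variance)  # sigma is the square root of the variance
    return (x - mean) / std_dev


# X ~ N(100, 2500), so sigma = 50 and X = 200 sits two sigmas above the mean.
print(z_score(200, 100, 2500))  # 2.0
```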
Comparing X and Z units
[Figure: one normal curve with two horizontal scales. X scale: 100 and 200 marked (μ = 100, σ = 50). Z scale: 0 and 2.0 marked (μ = 0, σ = 1)]
Note that the distribution is the same, only the scale has changed.
Finding Normal Probabilities
Probability is measured by the area under the curve
$P(a \le X \le b) = P(a < X < b)$
(Note that the probability of any individual value is zero)
[Figure: curve of f(X) against X with the area between a and b shaded]
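To make "probability is area" concrete, the sketch below approximates the area under a normal p.d.f. with the midpoint rule. This is a plain numerical integration chosen for illustration, not the table method used in class, and the function names and step count are assumptions.

```python
import math


def normal_pdf(x: float, mean: float, std_dev: float) -> float:
    """Height of the normal curve at x."""
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2.0 * math.pi))


def area_under_curve(a: float, b: float, mean: float, std_dev: float,
                     steps: int = 10_000) -> float:
    """Approximate P(a <= X <= b) by summing thin rectangles (midpoint rule)."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        midpoint = a + (i + 0.5) * width
        total += normal_pdf(midpoint, mean, std_dev)
    return total * width


# X ~ N(100, 2500): area between the mean and two standard deviations above it.
print(round(area_under_curve(100, 200, 100, 50), 4))
# A single point has zero width, so its probability is zero.
print(area_under_curve(150, 150, 100, 50))
```

Because a single value contributes no area, it makes no difference whether the endpoints a and b are included.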
Probability as Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below
[Figure: curve of f(X) against X, split at μ into two areas of 0.5 each]
Empirical Rules
What can we say about the distribution of values around the mean? There are some general rules:
• μ ± 1σ encloses about 68% of X’s
[Figure: curve of f(X) against X with the area between μ−1σ and μ+1σ shaded, 68.26%]
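The Empirical Rule (continued)
• μ ± 2σ covers about 95% of X’s
• μ ± 3σ covers about 99.7% of X’s
[Figure: two curves against x, one with the area within 2σ either side of μ shaded, 95.44%, and one with the area within 3σ either side of μ shaded, 99.73%]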
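The empirical-rule percentages can be reproduced with Python's standard-library `statistics.NormalDist`. This is only a sketch, and `coverage_within` is a hypothetical helper name. The unrounded results are about 68.27% and 95.45%. The slide figures of 68.26% and 95.44% come from subtracting four-decimal table entries (.8413 − .1587 and .9772 − .0228).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def coverage_within(k: float) -> float:
    """P(-k < Z < k): share of values within k standard deviations of the mean."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage_within(k) * 100:.2f}%")
```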
The Standardized Normal Table
• The Cumulative Standardized Normal table in the textbook (Appendix table E.2) gives the probability less than a desired value for Z (i.e., from negative infinity to Z)
Example: P(Z < 2.00) = 0.9772
[Figure: standardized normal curve with the area to the left of Z = 2.00 shaded, 0.9772]
The Standardized Normal Table (continued)
• The row shows the value of Z to the first decimal point
• The column gives the value of Z to the second decimal point
• The value within the table gives the probability from Z = −∞ up to the desired Z value

Z     0.00    0.01    0.02    …
0.0
0.1
.
.
.
2.0   .9772

P(Z < 2.00) = 0.9772
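The row-plus-column lookup can be mimicked in a few lines of Python. This is a sketch that rebuilds table entries from `statistics.NormalDist` rather than reading the textbook's Table E.2, and `table_entry` and `table_rows` are hypothetical names.

```python
from statistics import NormalDist


def table_entry(row: float, col: float) -> float:
    """P(Z < row + col), rounded to four places like a printed table.

    row is Z to the first decimal (e.g. 2.0) and col is the second
    decimal (e.g. 0.00, 0.01, 0.02).
    """
    return round(NormalDist().cdf(row + col), 4)


def table_rows(row_values, col_values):
    """Build a small portion of the cumulative table as a list of rows."""
    return [[table_entry(r, c) for c in col_values] for r in row_values]


print(table_entry(2.0, 0.00))  # the slide's example, P(Z < 2.00)
for row in table_rows([0.0, 0.1, 0.2], [0.00, 0.01, 0.02]):
    print(row)
```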
Finding Normal Probabilities
• Suppose X ~ N(8, 25). Find P(X < 8.6)
• $Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$
[Figure: the same curve on two scales. X scale: 8 and 8.6 marked (μ = 8, σ = 5). Z scale: 0 and 0.12 marked (μ = 0, σ = 1)]
Finding Normal Probabilities (continued)
Solution: Finding P(Z < 0.12)
P(X < 8.6) = P(Z < 0.12) = 0.5478

Standardized Normal Probability Table (Portion)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

[Figure: standardized normal curve with the area to the left of Z = 0.12 shaded, 0.5478]
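The same answer can be checked in Python two ways: standardize first and use the Z distribution, or ask for the probability directly on the X scale. This is a sketch using the standard-library `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

MEAN, VARIANCE = 8.0, 25.0        # X ~ N(8, 25)
STD_DEV = VARIANCE ** 0.5         # sigma = 5.0

# Route 1: standardize, round Z to two decimals as the table requires.
z = round((8.6 - MEAN) / STD_DEV, 2)
p_from_z = NormalDist().cdf(z)

# Route 2: work on the X scale directly.
p_direct = NormalDist(MEAN, STD_DEV).cdf(8.6)

print(z)                    # 0.12
print(round(p_from_z, 4))   # 0.5478
print(round(p_direct, 4))   # 0.5478
```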
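Finding the X value for a Known Probability
Example:
• Suppose X ~ N(8, 25)
• Find the X value so that only 20% of all values are below this X
[Figure: curve with the lower-tail area of 0.2000 shaded. The unknown X lies below 8.0 on the X scale, and the unknown Z lies below 0 on the Z scale]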
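Find the Z value for 20% in the Lower Tail
• 20% area in the lower tail is consistent with a Z value of -0.84

Standardized Normal Probability Table (Portion)
Z      …   .03     .04     .05
-0.9   …   .1762   .1736   .1711
-0.8   …   .2033   .2005   .1977
-0.7   …   .2327   .2296   .2266

[Figure: curve with the lower-tail area of 0.2000 shaded. Z = -0.84 is marked below 0 on the Z scale, and the still unknown X lies below 8.0 on the X scale]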
Finding the X value
1. Find the Z value for the known probability
2. Convert to X units using the formula: $X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$
So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80
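The two steps can be sketched in Python with `NormalDist.inv_cdf`, which plays the role of reading the table backwards. The helper name and the `table_precision` flag are assumptions for illustration. The flag rounds Z to two decimals so the result matches the table-based 3.80, and without it the answer is slightly different.

```python
from statistics import NormalDist


def x_for_lower_tail(p: float, mean: float, std_dev: float,
                     table_precision: bool = True) -> float:
    """Return the X value with probability p below it."""
    # Step 1: find the Z value for the known probability.
    z = NormalDist().inv_cdf(p)
    if table_precision:
        z = round(z, 2)  # a printed table only resolves Z to two decimals
    # Step 2: convert to X units, X = mean + Z * sigma.
    return mean + z * std_dev


print(f"{x_for_lower_tail(0.20, 8.0, 5.0):.2f}")                         # 3.80
print(f"{x_for_lower_tail(0.20, 8.0, 5.0, table_precision=False):.4f}")  # 3.7919
```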
More Examples
• If Z ~ N(0,1), find P(-1 < Z < 1)
• If W ~ N(3,4), find P(-1 < W < 1)
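One way to check answers to these two exercises is the short Python sketch below. It assumes the N(mean, variance) convention used earlier, so W has standard deviation 2, and the helper name `prob_between` is hypothetical. For W, the interval (-1, 1) standardizes to (-2, -1).

```python
from statistics import NormalDist


def prob_between(a: float, b: float, mean: float, variance: float) -> float:
    """P(a < X < b) for X ~ N(mean, variance)."""
    dist = NormalDist(mean, variance ** 0.5)  # NormalDist takes sigma, not variance
    return dist.cdf(b) - dist.cdf(a)


print(round(prob_between(-1, 1, 0, 1), 4))  # Z ~ N(0, 1): 0.6827
print(round(prob_between(-1, 1, 3, 4), 4))  # W ~ N(3, 4): 0.1359
```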
Evaluating Normality
• Construct charts or graphs
– For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
– For large data sets, does the histogram or polygon appear bell-shaped?
• Compute descriptive summary measures
– Do the mean, median and mode have similar values?
– Is the interquartile range approximately 1.33 σ?
– Is the range approximately 6 σ?
Evaluating Normality (continued)
• Observe the distribution of the data set
– Do approximately 2/3 of the observations lie within mean ± 1 standard deviation?
– Do approximately 80% of the observations lie within mean ± 1.28 standard deviations?
– Do approximately 95% of the observations lie within mean ± 2 standard deviations?
• Evaluate normal probability plot
– Is the normal probability plot approximately linear with positive slope?
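Several of the numeric checks above can be computed directly. The sketch below is one possible way to do it in Python. The function name `normality_summary`, the dictionary keys, and the simulated sample are all assumptions for illustration, not course data.

```python
import statistics
from statistics import NormalDist


def normality_summary(data):
    """Compute the rule-of-thumb ratios used to judge normality."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points

    def share_within(k):
        # Fraction of observations within k standard deviations of the mean.
        return sum(abs(x - mean) <= k * sd for x in data) / len(data)

    return {
        "iqr_over_sd": (q3 - q1) / sd,                  # slide's rule: about 1.33
        "range_over_sd": (max(data) - min(data)) / sd,  # slide's rule: about 6
        "within_1_sd": share_within(1.0),               # about 2/3
        "within_1.28_sd": share_within(1.28),           # about 80%
        "within_2_sd": share_within(2.0),               # about 95%
    }


# Simulated normal data with a fixed seed, for illustration only.
sample = NormalDist(mu=100, sigma=50).samples(5000, seed=1)
for name, value in normality_summary(sample).items():
    print(f"{name}: {value:.3f}")
```

The range-to-σ ratio grows with sample size, so a very large normal sample will typically show a range wider than 6σ. Treat that check as a rough guide.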
The Normal Probability Plot
• Normal probability plot
– Arrange data into ordered array
– Find corresponding standardized normal quantile values
– Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
– Evaluate the plot for evidence of linearity
The Normal Probability Plot (continued)
A normal probability plot for data from a normal distribution will be approximately linear:
[Figure: normal probability plot of X (vertical axis, ticks at 30, 60, 90) against Z (horizontal axis, -2 to 2), with the points close to a straight line]
The Normal Probability Plot
Data: n = 9 observations, so the cumulative area increases in steps of 1/(9+1) = 1/10

X     Order   Cumulative area   Corresponding Z score
1     1       0.1               -1.281551939
4     2       0.2               -0.841621042
12    3       0.3               -0.524400458
23    4       0.4               -0.253347241
55    5       0.5               5.47142E-10
67    6       0.6               0.253347241
75    7       0.7               0.524400458
87    8       0.8               0.841621042
112   9       0.9               1.281551939
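The Z scores in this table can be regenerated with a short Python sketch: order the data, give the i-th value a cumulative area of i/(n+1), and take the standard normal quantile of that area. The function name `normal_plot_points` is hypothetical, and `NormalDist.inv_cdf` stands in for whatever spreadsheet function produced the original column.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Pair each ordered observation with the Z quantile for area i/(n+1)."""
    ordered = sorted(data)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered, start=1):
        cumulative_area = i / (n + 1)
        z = NormalDist().inv_cdf(cumulative_area)
        points.append((z, x))  # Z on the horizontal axis, X on the vertical
    return points


data = [1, 4, 12, 23, 55, 67, 75, 87, 112]
for z, x in normal_plot_points(data):
    print(f"{z:+.6f}  {x}")
```

The middle observation gets a cumulative area of exactly 0.5, so its Z score is 0. The tiny value 5.47142E-10 in the table is numerical round-off.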
Normal Probability Plot (continued)
[Figure: three normal probability plots of X (vertical axis) against Z (horizontal axis), with panels Left-Skewed, Right-Skewed and Rectangular]
Nonlinear plots indicate a deviation from normality
Other Continuous Distributions
Source: wikipedia pages