Chapter 3: Selected Basic Concepts in Statistics. Expected Value, Variance, Standard Deviation; Numerical summaries of selected statistics; Sampling distributions


  • Slide 1
  • Slide 2
  • Chapter 3: Selected Basic Concepts in Statistics. Topics: Expected Value, Variance, Standard Deviation; Numerical summaries of selected statistics; Sampling distributions.
  • Slide 3
  • Expected Value: a weighted average. Not the value of y you expect; a long-run average.
  • Slide 4
  • E(y) Example 1: Toss a fair die once. Let y be the number of dots on the upper face. y: 1, 2, 3, 4, 5, 6; p(y): 1/6 for each value.
  • Slide 5
  • E(y) Example 2: Green Mountain Lottery. Choose 3 digits between 0 and 9. Repeats are allowed, and the order of the digits counts. If your 3-digit number is selected, you win $500. Let y be your winnings (assume the ticket costs $0). y: $0, $500; p(y): 0.999, 0.001.
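The weighted-average idea behind the two E(y) examples can be checked with a few lines of Python. This is a minimal sketch: the helper name `expected_value` is an illustrative choice, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def expected_value(dist):
    """Return E(y) = sum of y * p(y) for a dict mapping value -> probability."""
    return sum(y * p for y, p in dist.items())

# Example 1: fair die, each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die)

# Example 2: lottery winnings in dollars, with the probabilities from the slide.
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}
lottery_mean = expected_value(lottery)

print(die_mean, lottery_mean)  # 7/2 and 1/2
```

Neither 3.5 dots nor $0.50 can occur on a single play, which is the sense in which E(y) is a long-run average rather than the value you expect to see.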
  • Slide 6
  • US Roulette Wheel and Table. The roulette wheel has alternating black and red slots numbered 1 through 36. There are also 2 green slots numbered 0 and 00. A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is, if you bet $1 on the winning number, you receive $36, so your winnings are $35. [Figure: American roulette, 0 - 00. The European version has only one 0.]
  • Slide 7
  • US Roulette Wheel: Expected Value of a $1 bet on a single number. Let y be your winnings resulting from a $1 bet on a single number; y has 2 possible values. y: -1, 35; p(y): 37/38, 1/38. E(y) = -1(37/38) + 35(1/38) = -2/38 ≈ -0.05. So on average the house wins about 5 cents on every such bet. A fair game would have E(y) = 0. The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in
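The roulette calculation can be sketched the same way. The function name `single_number_ev` is hypothetical, and the European comparison assumes that a single-number bet there also pays 35:1, which the slides do not state.

```python
from fractions import Fraction

def single_number_ev(n_slots, payout=35, stake=1):
    """Expected winnings of a bet on one number on a wheel with n_slots slots."""
    p_win = Fraction(1, n_slots)
    return payout * p_win - stake * (1 - p_win)

us_ev = single_number_ev(38)        # slots 1-36 plus 0 and 00
european_ev = single_number_ev(37)  # ASSUMPTION: same 35:1 payout, single 0

print(us_ev, float(us_ev))
print(european_ev, float(european_ev))
```

The exact US value is -1/19, which the slide rounds to -0.05.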
  • Slide 8
  • Slide 9
  • Variance and Standard Deviation: measure spread around the middle, where the middle is measured by μ = E(y).
  • Slide 10
  • Variance Example: Toss a fair die once. Let y be the number of dots on the upper face. y: 1, 2, 3, 4, 5, 6; p(y): 1/6 for each value. Recall μ = 3.5.
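The variance for the die is not preserved in the transcript, so the following is a sketch of the standard definition V(y) = Σ (y - μ)² p(y) applied to this example, not a copy of the slide's working.

```python
from fractions import Fraction
from math import sqrt

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mu = sum(y * p for y, p in die.items())                    # expected value
variance = sum((y - mu) ** 2 * p for y, p in die.items())  # V(y)
sd = sqrt(variance)                                        # SD(y)

print(mu, variance, round(sd, 4))
```

The variance comes out as the exact fraction 35/12, so the standard deviation is a little over 1.7 dots.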
  • Slide 11
  • V(y) Example 2: Green Mountain Lottery. y: $0, $500; p(y): 0.999, 0.001. Recall μ = $0.50.
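For the lottery, a sketch using the shortcut V(y) = E(y²) - μ², which is algebraically equivalent to the definition of variance. The shortcut is a swapped-in technique chosen for brevity; the transcript does not show which form the slide used.

```python
from fractions import Fraction
from math import sqrt

# Winnings in dollars with the probabilities from the slide.
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}

mu = sum(y * p for y, p in lottery.items())
second_moment = sum(y ** 2 * p for y, p in lottery.items())  # E(y^2)
variance = second_moment - mu ** 2                           # shortcut formula
sd = sqrt(variance)

print(mu, variance, round(sd, 2))
```

The standard deviation is roughly $15.80, far larger than the $0.50 mean, reflecting the rare large prize.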
  • Slide 12
  • Estimators for μ, σ², σ. s²: average squared deviation from the middle. Automate these calculations. Examples.
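One way to automate these calculations is Python's standard `statistics` module. The data below are hypothetical, made up for illustration. Note that `statistics.variance` divides by n - 1, the usual convention for the sample variance s².

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, not from the slides

y_bar = statistics.mean(data)     # estimator of the population mean
s2 = statistics.variance(data)    # sample variance, divisor n - 1
s = statistics.stdev(data)        # sample standard deviation

print(y_bar, s2, s)
```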
  • Slide 13
  • Linear Transformations of Random Variables and Sample Statistics. Random variable y with E(y) and V(y): for the linear transformation y* = a + by, what are E(y*) and V(y*) in terms of the original E(y) and V(y)? Data y_1, y_2, ..., y_n with mean ȳ and standard deviation s: for the linear transformation y* = a + by, with new data y_1*, y_2*, ..., y_n*, what are ȳ* and s* in terms of ȳ and s?
  • Slide 14
  • Linear Transformations. Rules for E(y*), V(y*) and SD(y*): E(y*) = E(a + by) = a + bE(y); V(y*) = V(a + by) = b²V(y); SD(y*) = SD(a + by) = |b|SD(y). Rules for ȳ*, s*², and s*: ȳ* = a + bȳ; s*² = b²s²; s* = |b|s.
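The sample-statistic rules can be checked numerically. This sketch uses hypothetical data and a negative b, chosen to show why the standard deviation scales by |b| rather than b.

```python
import statistics
from math import isclose

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
a, b = 10, -3                     # illustrative constants; b is negative on purpose

transformed = [a + b * y for y in data]

y_bar, s = statistics.mean(data), statistics.stdev(data)
y_bar_star = statistics.mean(transformed)
s_star = statistics.stdev(transformed)
s2_star = statistics.variance(transformed)

print(y_bar_star, a + b * y_bar)   # means agree
print(s_star, abs(b) * s)          # SDs agree only with |b|
```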
  • Slide 15
  • Expected Value and Standard Deviation of Linear Transformation a + by. Let y = number of repairs a new computer needs each year. Suppose E(y) = 0.20 and SD(y) = 0.55. The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair. What are the mean and standard deviation of the yearly cost of the service contract? Cost = $100 + $25y. E(cost) = E($100 + $25y) = $100 + $25E(y) = $100 + $25*0.20 = $100 + $5 = $105. SD(cost) = SD($100 + $25y) = SD($25y) = $25*SD(y) = $25*0.55 = $13.75.
  • Slide 16
  • Addition and Subtraction Rules for Random Variables. E(X + Y) = E(X) + E(Y); E(X - Y) = E(X) - E(Y). When X and Y are independent random variables: 1. Var(X + Y) = Var(X) + Var(Y). 2. SD(X + Y) = √(Var(X) + Var(Y)); SDs do not add: SD(X + Y) ≠ SD(X) + SD(Y). 3. Var(X - Y) = Var(X) + Var(Y). 4. SD(X - Y) = √(Var(X) + Var(Y)); SDs do not subtract: SD(X - Y) ≠ SD(X) - SD(Y), and SD(X - Y) ≠ SD(X) + SD(Y).
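These rules can be verified exactly for a simple independent pair. The sketch below uses two independent fair dice, an example chosen here for illustration rather than taken from this slide, and enumerates all 36 equally likely outcomes.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

faces = range(1, 7)
p_joint = Fraction(1, 36)   # independence: each (x, y) pair has probability 1/36

def variance(values_with_probs):
    """Variance of a discrete distribution given (value, probability) pairs."""
    pairs = list(values_with_probs)
    mean = sum(v * p for v, p in pairs)
    return sum((v - mean) ** 2 * p for v, p in pairs)

var_x = variance((x, Fraction(1, 6)) for x in faces)
var_sum = variance((x + y, p_joint) for x, y in product(faces, faces))
var_diff = variance((x - y, p_joint) for x, y in product(faces, faces))

print(var_x, var_sum, var_diff)           # variances add for sum AND difference
print(sqrt(var_sum), 2 * sqrt(var_x))     # SD of the sum vs. sum of the SDs
```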
  • Slide 17
  • Example: random variables NOT independent. X = number of hours a randomly selected student from our class slept between noon yesterday and noon today. Y = number of hours the same randomly selected student from our class was awake between noon yesterday and noon today. Y = 24 - X. What are the expected value and variance of the total hours that a student is asleep and awake between noon yesterday and noon today? Total hours that a student is asleep and awake between noon yesterday and noon today = X + Y. E(X + Y) = E(X + 24 - X) = E(24) = 24. Var(X + Y) = Var(X + 24 - X) = Var(24) = 0. We don't add Var(X) and Var(Y) since X and Y are not independent.
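A small numerical illustration of the dependent case, using hypothetical sleep hours for five students and treating that list as the whole population (hence `pvariance`):

```python
import statistics

sleep = [6, 7.5, 8, 5, 9]                     # hypothetical hours slept (X)
awake = [24 - x for x in sleep]               # Y = 24 - X, fully determined by X
total = [x + y for x, y in zip(sleep, awake)]

var_sleep = statistics.pvariance(sleep)
var_awake = statistics.pvariance(awake)
var_total = statistics.pvariance(total)

print(var_sleep, var_awake, var_total)
```

Both X and Y vary, yet their sum is always 24, so its variance is 0 rather than Var(X) + Var(Y).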
  • Slide 18
  • Pythagorean Theorem of Statistics for Independent X and Y. [Figure: right triangle with legs a, b and hypotenuse c.] a² + b² = c², but a + b ≠ c. With a = SD(X), b = SD(Y), c = SD(X + Y): Var(X) + Var(Y) = Var(X + Y), but SD(X) + SD(Y) ≠ SD(X + Y).
  • Slide 19
  • Pythagorean Theorem of Statistics for Independent X and Y. [Figure: right triangle with legs 3, 4 and hypotenuse 5.] 3² + 4² = 5² (9 + 16 = 25), but 3 + 4 ≠ 5. With SD(X) = 3, SD(Y) = 4, SD(X + Y) = 5: Var(X) + Var(Y) = Var(X + Y), but SD(X) + SD(Y) ≠ SD(X + Y).
  • Slide 20
  • Example: meal plans. Regular plan: X = daily amount spent; E(X) = $13.50, SD(X) = $7. What are the expected value and standard deviation of the total spent in 2 consecutive days (assume independent)? E(X_1 + X_2) = E(X_1) + E(X_2) = $13.50 + $13.50 = $27. SD(X_1 + X_2) ≠ SD(X_1) + SD(X_2) = $7 + $7 = $14; instead, SD(X_1 + X_2) = √(7² + 7²) ≈ $9.90.
  • Slide 21
  • Example: meal plans (cont.). Jumbo plan for football players: Y = daily amount spent; E(Y) = $24.75, SD(Y) = $9.50. The amount by which football players' spending exceeds regular student spending is Y - X. E(Y - X) = E(Y) - E(X) = $24.75 - $13.50 = $11.25. SD(Y - X) ≠ SD(Y) - SD(X) = $9.50 - $7 = $2.50; instead, assuming X and Y are independent, SD(Y - X) = √(9.50² + 7²) ≈ $11.80.
  • Slide 22
  • For random variables, X + X ≠ 2X. Let X be the annual payout on a life insurance policy. From mortality tables, E(X) = $200 and SD(X) = $3,867. 1) If the payout amounts are doubled, what are the new expected value and standard deviation? The double payout is 2X. E(2X) = 2E(X) = 2*$200 = $400. SD(2X) = 2SD(X) = 2*$3,867 = $7,734. 2) Suppose insurance policies are sold to 2 people. The annual payouts are X_1 and X_2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout? E(X_1 + X_2) = E(X_1) + E(X_2) = $200 + $200 = $400. The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.
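The transcript does not preserve the standard deviation for the two-policy case. A sketch of that step, applying the rule that variances add for independent random variables:

```python
from math import sqrt

mean_x, sd_x = 200.0, 3867.0   # E(X) and SD(X) in dollars, from the slide

# 1) One policy with doubled payout: 2X.
mean_double = 2 * mean_x
sd_double = 2 * sd_x

# 2) Two independent policies: X_1 + X_2 (variances add, SDs do not).
mean_two = mean_x + mean_x
sd_two = sqrt(sd_x ** 2 + sd_x ** 2)

print(mean_double, sd_double)
print(mean_two, round(sd_two, 2))
```

Both scenarios have the same expected payout of $400, but the two-policy standard deviation is smaller by a factor of √2, which is the sense in which the risks differ.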
  • Slide 23
  • Estimator of population mean: ȳ will vary from sample to sample. What are the characteristics of this sample-to-sample behavior?
  • Slide 24
  • Numerical Summary of Sampling Distribution of ȳ. Unbiased: a statistic is unbiased if it has expected value equal to the population parameter.
  • Slide 25
  • Numerical Summary of Sampling Distribution of ȳ
  • Slide 26
  • Standard Error: the square root of the estimated variance of a statistic; an important building block for statistical inference.
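As an illustration for the sample mean, the sketch below assumes the usual estimated variance of ȳ, namely s²/n, with no finite population correction. That formula is not preserved in the transcript, and the data are hypothetical.

```python
import statistics
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
n = len(data)

s2 = statistics.variance(data)            # sample variance, divisor n - 1
estimated_variance_of_mean = s2 / n       # ASSUMPTION: V-hat(y-bar) = s^2 / n
standard_error = sqrt(estimated_variance_of_mean)

print(round(standard_error, 4))
```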
  • Slide 27
  • Shape? We have numerical summaries of the sampling distribution of ȳ. What about the shape of the sampling distribution of ȳ?
  • Slide 28
  • THE CENTRAL LIMIT THEOREM: "The World is Normal" Theorem
  • Slide 29
  • The Central Limit Theorem (for the sample mean ȳ): If a random sample of n observations is selected from a population (any population), then when n is sufficiently large, the sampling distribution of ȳ will be approximately normal. (The larger the sample size, the better will be the normal approximation to the sampling distribution of ȳ.)
  • Slide 30
  • The Importance of the Central Limit Theorem. When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is approximately normal when n is sufficiently large. The shape of the population is irrelevant.
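A quick simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is exponential with mean 1 and SD 1 (strongly skewed), and the sample size, number of repetitions, and seed are arbitrary illustrative choices.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)   # fixed seed so the run is reproducible
n, reps = 50, 2000       # sample size and number of repeated samples

# Draw many samples from a skewed population and record each sample mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

center = statistics.mean(sample_means)    # should be near the population mean 1
spread = statistics.stdev(sample_means)   # should be near 1 / sqrt(n)
theory_sd = 1 / sqrt(n)

# For a normal distribution, about 95% of values fall within 2 SDs of the mean.
within_2sd = sum(abs(m - 1.0) <= 2 * theory_sd for m in sample_means) / reps

print(center, spread, theory_sd, within_2sd)
```

Even though individual observations are far from normal, the share of sample means within two theoretical standard deviations of the population mean should land close to the normal benchmark.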
  • Slide 31
  • Estimating the population total
  • Slide 32
  • Expected value
  • Slide 33
  • Estimating the population total: variance, standard deviation, standard error
  • Slide 34
  • Finite population case. Example: sampling w/ replacement to estimate the population total
  • Slide 35
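  • Finite population case. Example: sampling w/ replacement to estimate the population total.
      Sample | Prob. of sample | Estimate | V(estimate)
      {1, 2} | .02 | 15 | 25.0
      {1, 3} | .08 | 35/4 | 1.5625
      {1, 4} | .08 | 10 | 0
      {2, 3} | .08 | 55/4 | 39.0625
      {2, 4} | .08 | 15 | 25.0
      {3, 4} | .32 | 35/4 | 1.5625
      {1, 1} | .01 | 10 | 0
      {2, 2} | .01 | 20 | 0
      {3, 3} | .16 | 15/2 | 0
      {4, 4} | .16 | 10 | 0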
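The table values can be reproduced by enumeration. The sketch below rests on assumptions that the transcript does not state but that match every row: population values 1, 2, 3, 4 with selection probabilities 0.1, 0.1, 0.4, 0.4; n = 2 draws with replacement; the estimate is the average of y_i / δ_i over the draws; and the variance column is the usual estimated variance of that average. The names `delta`, `estimate`, and `estimated_variance` are illustrative.

```python
from fractions import Fraction
from itertools import product

y = [1, 2, 3, 4]
# ASSUMPTION: selection probabilities inferred from the table, not given in the text.
delta = [Fraction(1, 10), Fraction(1, 10), Fraction(4, 10), Fraction(4, 10)]
n = 2

def estimate(sample):
    """Average of y_i / delta_i over the sampled indices."""
    return sum(y[i] / delta[i] for i in sample) / n

def estimated_variance(sample):
    """Estimated variance of the estimate, computed from the sample itself."""
    t = estimate(sample)
    return sum((y[i] / delta[i] - t) ** 2 for i in sample) / (n * (n - 1))

mean_est = second_moment = mean_vhat = 0
for sample in product(range(len(y)), repeat=n):   # ordered draws, with replacement
    p = delta[sample[0]] * delta[sample[1]]
    mean_est += p * estimate(sample)
    second_moment += p * estimate(sample) ** 2
    mean_vhat += p * estimated_variance(sample)
var_est = second_moment - mean_est ** 2

print(mean_est, var_est, mean_vhat)
```

Under these assumptions the estimate averages to the true total 1 + 2 + 3 + 4 = 10, and the estimated variance averages to the true variance of the estimate, so both are unbiased in this example.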
  • Slide 36
  • Finite population case. Example: sampling w/ replacement to estimate the population total. From the table:
  • Slide 37
  • Finite population case. Example: sampling w/ replacement to estimate the population total
  • Slide 38
  • Finite population case. Example: sampling w/ replacement to estimate the population total. Example Summary
  • Slide 39
  • Finite population case. Sampling w/ replacement to estimate pop. total. In general
  • Slide 40
  • Finite population case. Sampling w/ replacement to estimate pop. total
  • Slide 41
  • Finite population case. Sampling w/ replacement to estimate pop. total. In reality, we do not know the value of y_i for every item in the population, BUT we can choose the selection probability of item i proportional to a known measurement highly correlated with y_i.
  • Slide 42
  • Finite population case. Sampling w/ replacement to estimate pop. total. Example: we want to estimate the total number of job openings in a city by sampling industrial firms. Many small firms employ few workers; a few large firms employ many workers. Large firms influence the number of job openings, so large firms should have a greater chance of being in the sample to improve the estimate of total openings. Firms can be sampled with probabilities proportional to each firm's total work force, which should be correlated with the firm's job openings.
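A sketch of sampling with probabilities proportional to size, using five hypothetical firms. The estimator shown, the average of y_i divided by its selection probability, is an assumption here because the general formula is not preserved in the transcript; it is often called the Hansen-Hurwitz estimator. In this toy data the openings are exactly proportional to work force, an idealized case.

```python
import random

workforce = [20, 30, 50, 400, 1200]   # hypothetical known size measure per firm
openings = [2, 3, 5, 40, 120]         # hypothetical y_i, exactly 10% of work force

# Selection probability of each firm, proportional to its work force.
probs = [w / sum(workforce) for w in workforce]

def estimate_total(sample_indices, y, probs):
    """Average of y_i / (selection probability of i) over the sampled firms."""
    n = len(sample_indices)
    return sum(y[i] / probs[i] for i in sample_indices) / n

rng = random.Random(1)
# random.choices draws WITH replacement, with chances proportional to the weights.
sample = rng.choices(range(len(workforce)), weights=workforce, k=3)
estimate = estimate_total(sample, openings, probs)

print(sample, estimate, sum(openings))
```

When y_i is exactly proportional to the size measure, every possible sample gives the true total, so the estimator has zero variance. Real data are only roughly proportional, which is why a size measure highly correlated with y_i is what matters.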
  • Slide 43
  • Finite population case. Sampling without replacement to estimate pop. total. Thus far we have assumed a population that does not change when the first item is selected; that is, we sampled with replacement. When sampling without replacement this is not true. Example: population {1, 2, 3, 4}; n = 2; suppose all items are equally likely. The prob. of selecting 3 on the first draw is 1/4. The prob. of selecting 3 on the second draw depends on the first draw (the probability is 0 or 1/3).
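The draw probabilities in this example can be confirmed by listing all ordered samples of size 2 without replacement. The helper name `p_second_is_3_given_first` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

population = [1, 2, 3, 4]
draws = list(permutations(population, 2))   # 12 equally likely ordered samples

# Probability that the first draw is 3.
p_first_is_3 = Fraction(sum(1 for first, _ in draws if first == 3), len(draws))

def p_second_is_3_given_first(first_value):
    """Conditional probability that the second draw is 3, given the first draw."""
    matching = [d for d in draws if d[0] == first_value]
    return Fraction(sum(1 for d in matching if d[1] == 3), len(matching))

print(p_first_is_3)                    # 1/4
print(p_second_is_3_given_first(3))    # 0
print(p_second_is_3_given_first(1))    # 1/3
```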
  • Slide 44
  • Finite population case. Sampling without replacement to estimate pop. total. Worksheet
  • Slide 45
  • End of Chapter 3