1. Homework #2 2. Inferential Statistics 3. Review for Exam
![Page 1: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/1.jpg)
1. Homework #2 2. Inferential Statistics 3. Review for Exam
![Page 2: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/2.jpg)
HOMEWORK #2: Part A
Sanitation Eng.: Z = .53; area = .2019 + .5000 = .7019
F.C.: Z = .67; area = .2486 + .5000 = .7486
Of the 5 GPAs, which are in the top 10%? GPAs of 3.00 and 3.20 are not:
Z = (3.00 - 2.78)/.33 = .67; area beyond = .2514 (25.14%)
Z = (3.20 - 2.78)/.33 = 1.27; corresponds to .8980 (.3980 + .5000); area beyond = .1020 (10.20%)
By contrast, for 3.21: Z = (3.21 - 2.78)/.33 = 1.30; corresponds to .9032 (.4032 + .5000), so the area beyond is .0968 (9.68%), which is inside the top 10%
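The GPA check above can be reproduced without a printed Z table. The following is a minimal sketch, assuming the mean (2.78) and standard deviation (.33) from the homework; the helper names `z_score` and `area_below` are illustrative and not part of the course materials. The normal-curve area comes from Python's standard `math.erf`.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def area_below(z):
    """Proportion of the standard normal curve that lies below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

MEAN_GPA, SD_GPA = 2.78, 0.33  # values taken from the homework problem

for gpa in (3.00, 3.20, 3.21):
    # Round Z to two decimals, as you would before using a Z table
    z = round(z_score(gpa, MEAN_GPA, SD_GPA), 2)
    beyond = 1.0 - area_below(z)
    in_top_10 = beyond <= 0.10
    print(f"GPA {gpa:.2f}: Z = {z:.2f}, area beyond = {beyond:.4f}, top 10%: {in_top_10}")
```

A GPA counts as "top 10%" here when the area beyond its Z score is .10 or less, which matches the reasoning on the slide.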
![Page 3: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/3.jpg)
HOMEWORK #2: Part B
Question 1
a. Mean=18.87; median=15; mode=4
b. The mean is higher because the distribution is positively skewed (several large cities with high percents)
c. When you remove NYC, the mean drops to 16.43 and the median goes from 15 to 14.5. Removing NYC’s high value from the distribution reduces the skew. The mean decreases more than the median because the value of the mean is influenced by outlying values; the median is not, since it only moves one case over.
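The effect of a single outlying case on the mean versus the median can be seen in a small sketch. The numbers below are hypothetical and are NOT the homework data (which is not reproduced in these slides); they are only meant to show the pattern described in part c.

```python
import statistics

# Hypothetical percents for nine cities; the last value plays the role of an outlier
values = [4, 4, 10, 14, 15, 16, 20, 25, 60]
without_outlier = values[:-1]  # drop the outlying case

mean_drop = statistics.mean(values) - statistics.mean(without_outlier)
median_drop = statistics.median(values) - statistics.median(without_outlier)

print(f"Mean falls by {mean_drop:.2f}")
print(f"Median falls by {median_drop:.2f}")
```

With these made-up values the mean falls by several points while the median shifts by only half a point, because the median just moves one case over.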
![Page 4: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/4.jpg)
HOMEWORK #2: Part B Question 2
For this problem, there are two measures of central tendency (indicating the “typical” score). The mean per student expenditure was almost $2,000 higher in 2003 ($9,009) than in 1993 ($7,050). The median also increased, but not nearly as much (from $7,215 to $7,516).
The spread of the scores, as indicated by the standard deviation, was more than twice as large in 2003 (1,960) as it was in 1993 (804).
Shape: For 1993, the distribution of scores has a slight negative skew; this distribution is essentially normal (bell-shaped), as the mean ($7,050) and median ($7,215) are similar. By contrast, for 2003, the mean is much greater than the median; this distribution has a strong positive skew.
![Page 5: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/5.jpg)
HOMEWORK #2: Part B Q3
a. 53.28%: scores on opposite sides of the mean, so add the 2 areas together
b. 6.38%: both scores on the right side of the mean, so subtract the areas
c. 10.56%: “Column C” area for Z = 1.25 is .1056
d. 69.15%: “Column B” area for Z = -0.5 is .1915, + .5000 (for other half of normal curve)
e. 99.38%: Z = 2.5; Column B (for area between 2.5 & 0) = .4938, + .5000 (for other half of normal curve)
f. 6.68%: Z = -1.5; Column C for area beyond -1.5 = .0668
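The "Column B" and "Column C" lookups used in these answers can be sketched as two small functions. This assumes the usual textbook layout, where Column B is the area between the mean and Z and Column C is the area beyond Z; the function names are illustrative only.

```python
import math

def area_below(z):
    """Proportion of the standard normal curve that lies below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def column_b(z):
    """Area between the mean and z (the same for +z and -z)."""
    return abs(area_below(z) - 0.5)

def column_c(z):
    """Area beyond z, in the tail farther from the mean."""
    return 0.5 - column_b(z)

print(f"c. {column_c(1.25):.4f}")        # area beyond Z = 1.25
print(f"d. {column_b(-0.5) + 0.5:.4f}")  # Column B for Z = -0.5 plus the other half
print(f"e. {column_b(2.5) + 0.5:.4f}")   # Column B for Z = 2.5 plus the other half
print(f"f. {column_c(-1.5):.4f}")        # area beyond Z = -1.5
```

Because the curve is symmetric, both functions ignore the sign of Z; you decide from the problem whether to add the other half of the curve.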
![Page 6: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/6.jpg)
HOMEWORK #2: Part B Q4
a. .9953: Column B area (.4953) + .5000 (for other half of normal curve)
b. .5000: 50% of area on either side of mean (47)
c. .6826: “Column B” for both, .3413 + .3413
d. .9997: Column B area (.4997) + .5000 (for other half of normal curve)
e. .0548: “Column C” area for Z = 1.6
f. .3811: scores on opposite sides of mean, so add “Col. B” areas
![Page 7: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/7.jpg)
HOMEWORK #2: Part C
SPSS: All the info needed to answer these questions is contained in this output
Statistics: HOURS PER DAY WATCHING TV
N Valid 1426
Missing 618
Mean 3.03
Median 2.00
Mode 2
Std. Deviation 2.766
Percentiles 10 1.00
20 1.00
25 1.00
30 2.00
40 2.00
50 2.00
60 3.00
70 3.00
75 4.00
80 4.00
90 6.00
![Page 8: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/8.jpg)
Distribution (Histogram) for TV Hours
![Page 9: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/9.jpg)
Sibs Distribution
![Page 10: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/10.jpg)
College Science Credits
![Page 11: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/11.jpg)
Sampling Terminology
Element: the unit of which a population is comprised and which is selected in the sample
Population: the theoretically specified aggregation of the elements in the study (i.e., all elements)
Parameter: description of a variable in the population (σ = standard deviation, µ = mean)
Sample: the aggregate of all elements taken from the population
Statistic: description of a variable in the sample, an estimate of a parameter (X̄ = mean, s = standard deviation)
![Page 12: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/12.jpg)
Non-probability Sampling
Elements have unknown odds of selection
Examples: snowballing, available subjects…
Limits/problems:
Cannot generalize to the population of interest (the sample doesn’t adequately represent the population, i.e., bias)
Have no idea how biased your sample is, or how close you are to the population of interest
![Page 13: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/13.jpg)
Probability Sampling
Definition: Elements in the population have a known (usually equal) probability of selection
Benefits of Probability Sampling:
Avoid bias, both conscious and unconscious
More representative of population
Use probability theory to: estimate sampling error; calculate confidence intervals
![Page 14: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/14.jpg)
Sampling Distributions
Link between sample and population
DEFINITION 1: IF a large (infinite) number of independent, random samples are drawn from a population, and a statistic is plotted from each sample….
DEFINITION 2: The theoretical, probabilistic distribution of a statistic for all possible sample outcomes of a certain size (N)
![Page 15: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/15.jpg)
The Central Limit Theorem I
IF REPEATED random samples are drawn from the population, the sampling distribution of the sample means will be approximately normally distributed, as long as N is sufficiently large (>100)
The mean of the sampling distribution will equal the mean of the population
WHY? Because the most common sample mean will be the population mean
Other common sample means will cluster around the population mean (near misses), and so forth
Some “weird” sample findings will occur, though rarely
![Page 16: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/16.jpg)
The Central Limit Theorem II
Again, WITH REPEATED RANDOM SAMPLES, the standard deviation of the sampling distribution = σ/√N
This critter (the population standard deviation divided by the square root of N) is “The Standard Error”
It tells how far the “typical” sample statistic falls from the true population parameter
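Both parts of the Central Limit Theorem can be checked with a small simulation. This is a sketch under stated assumptions: the population is a made-up, strongly skewed set of 50,000 values, the sample size N = 100 and the 2,000 repeated samples are arbitrary choices, and none of it comes from the course data.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical, positively skewed population (exponential with mean near 10)
population = [random.expovariate(1 / 10) for _ in range(50_000)]
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

N = 100  # size of each sample
sample_means = [
    statistics.fmean(random.sample(population, N))  # one random sample, one mean
    for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
standard_error = sigma / math.sqrt(N)  # the formula from the slide

print(f"Population mean {mu:.2f} vs. mean of sample means {mean_of_means:.2f}")
print(f"sigma/sqrt(N) {standard_error:.2f} vs. SD of sample means {sd_of_means:.2f}")
```

Even though the population itself is skewed, the sample means pile up around the population mean, and their spread comes out close to the population standard deviation divided by √N.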
![Page 17: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/17.jpg)
The KICKER
Because the sampling distribution is normally distributed…. Probability theory dictates the percentage of sample statistics that will fall within a given number of standard errors of the population parameter:
1 standard error = 34%, or +/- 1 standard error = 68%
+/- 1.96 standard errors = 95%
+/- 2.58 standard errors = 99%
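These percentages follow directly from the shape of the normal curve. A minimal sketch, using Python's standard `math.erf` (the function name `area_within` is illustrative):

```python
import math

def area_within(z):
    """Proportion of a normal distribution within +/- z standard errors of the center."""
    return math.erf(z / math.sqrt(2.0))

for z in (1.0, 1.96, 2.58):
    print(f"+/- {z} standard errors: {area_within(z):.2%}")
```

The slide's figures (68%, 95%, 99%) are these values rounded to whole percents.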
![Page 18: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/18.jpg)
The REAL KICKER
From what happens (probability theory) with an infinite # of samples… to making a judgment about the accuracy of statistics generated from a single sample
Any statistic generated from a single random sample has a 68% chance of falling within one standard error of the population parameter, OR roughly a 95% CHANCE OF FALLING WITHIN 2 STANDARD ERRORS
![Page 19: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/19.jpg)
EXAM
Closed book
BRING CALCULATOR
You will have full class to complete
Format:
Output interpretation
Z-score calculation problems (memorize Z formula; Z-score area table provided)
Short answer/scenarios
Multiple choice
![Page 20: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/20.jpg)
Review for Exam
Variables vs. values/attributes/scores
variable: trait that can change values from case to case (example: GPA)
score (attribute): an individual case’s value for a given variable
Concepts Operationalize Variables
![Page 21: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/21.jpg)
Review for Exam
Short-answer questions, examples:
What is a strength of the standard deviation over other measures of dispersion?
Multiple choice question examples:
Professor Pinhead has an ordinal measure of a variable called “religiousness.” He wants to describe how the typical survey respondent scored on this variable. He should report the ____.
a. median b. mean c. mode d. standard deviation
On all normal curves the area between the mean and +/- 2 standard deviations will be
a. about 50% of the total area b. about 68% of the total area c. about 95% of the total area d. more than 99% of the total area
![Page 22: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/22.jpg)
EXAM
Covers chapters 1 through (part of) 6:
Chapter 1
Levels of measurement (nominal, ordinal, I-R)
Any I-R variable could be transformed into an ordinal or nominal-level variable
Don’t worry about discrete-continuous distinction
Chapter 2
Percentages, proportions, rates & ratios
Review HWs to make sure you’re comfortable interpreting tables
![Page 23: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/23.jpg)
EXAM
Chapter 3: Central tendency
ID-ing the “typical” case in a distribution
Mean, median, mode: appropriate for which levels of measurement?
Identifying skew/direction of skew
Skew vs. outliers
Chapter 4: Spread of a distribution
R & Q
s² = variance (mean of squared deviations)
s = standard deviation: uses every score in the distribution; gives the typical deviation of the scores
DON’T need to know IQV (section 4.2)
![Page 24: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/24.jpg)
Keep in mind…
All measures of central tendency try to describe the “typical case”
Preference is given to statistics that use the most information
For interval-ratio variables, unless you have a highly skewed distribution, the mean is the most appropriate
For ordinal, the median is preferred
If the mean is not appropriate, neither is “s”
s = how far cases typically fall from the mean
![Page 25: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/25.jpg)
EXAM
Chapter 5
Characteristics of the normal curve
Know areas under the curve (Figure 5.3)
KNOW Z score formula
Be able to apply Z scores: finding areas under curve; Z scores & probability
Frequency tables & probability
![Page 26: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/26.jpg)
EXAM
Chapter 6
Reasons for sampling
Advantages of probability sampling
What does it mean for a sample to be representative?
Definition of probability (random) sampling
Sampling error
Plus… types of nonprobability sampling
![Page 27: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/27.jpg)
Interpret
Total IQ Score
N Valid 1826
Missing 9092
Mean 88.98
Median 91.00
Mode 94
Std. Deviation 20.063
Minimum 0
Maximum 160
Percentiles 10 63.00
20 74.00
25 78.00
30 80.00
40 86.00
50 91.00
60 95.00
70 100.00
75 103.00
80 105.00
90 112.00
1. Number of cases used to calculate mean?
2. Most common IQ score?
3. Distribution skewed? Direction?
4. Q?
5. Range?
6. Is standard deviation appropriate to use here?
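The arithmetic behind several of these questions can be sketched from the output alone. The snippet below copies its numbers from the Total IQ Score output; it assumes Q means the interquartile range (75th minus 25th percentile) and uses the mean-versus-median comparison as the skew check. It deliberately leaves question 6 as a judgment call.

```python
# Values copied from the "Total IQ Score" output
iq = {
    "n_valid": 1826,
    "mean": 88.98,
    "median": 91.00,
    "mode": 94,
    "minimum": 0,
    "maximum": 160,
    "p25": 78.00,
    "p75": 103.00,
}

q = iq["p75"] - iq["p25"]                  # interquartile range (Q)
value_range = iq["maximum"] - iq["minimum"]  # range (R)

# Mean below median suggests a negative skew; mean above median, a positive skew
if iq["mean"] < iq["median"]:
    skew = "negative"
elif iq["mean"] > iq["median"]:
    skew = "positive"
else:
    skew = "none"

print(f"Cases used for the mean: {iq['n_valid']}")
print(f"Most common score (mode): {iq['mode']}")
print(f"Skew direction: {skew}")
print(f"Q = {q}, Range = {value_range}")
```

Whether the standard deviation is appropriate depends on how strong you judge the skew to be, since the mean has to be appropriate before "s" is.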
![Page 28: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/28.jpg)
Scenario
Professor Scully believes income is a good predictor of the size of a person’s house
IV? DV?
Operationalize DV so that it is measured at all three levels (nominal, ordinal, I-R)
Repeat for IV
![Page 29: 1. Homework #2 2. Inferential Statistics 3. Review for Exam](https://reader035.fdocuments.in/reader035/viewer/2022062321/56813319550346895d99dbab/html5/thumbnails/29.jpg)
Express the answer in the proper format
Percent Proportion Ratio Probability