Ways to look at the data
![Page 1: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/1.jpg)
Ways to look at the data
Number of hurricanes that occurred each year from 1944 through 2000, as reported by Science magazine
[Figure: Ch04_Hurricanes shown three ways: histogram, dot plot, and box plot; horizontal axis: Hurricanes, 0 to 8]
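A dot plot stacks one mark per observation above each value. The sketch below builds a text version; the function name `dot_plot` and the sample counts are hypothetical and are not the hurricane data from the slide.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one row per value, one '*' per observation."""
    counts = Counter(values)
    rows = []
    for v in range(min(values), max(values) + 1):
        # Values that never occur still get a row, so gaps stay visible.
        rows.append(f"{v} | {'*' * counts[v]}".rstrip())
    return "\n".join(rows)

# Hypothetical yearly counts, for illustration only.
print(dot_plot([0, 2, 2, 3]))
```

The same counts, drawn as bars instead of stacked marks, give a histogram.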
![Page 2: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/2.jpg)
3 Characteristics of data
Shape
Center
Spread
![Page 3: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/3.jpg)
Shape of the data – Symmetric
The age of all US Presidents at the time they took office
Notice that this distribution has only one mode
[Figure: Presidents histogram; horizontal axis: Age (years), 40 to 75]
![Page 4: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/4.jpg)
Shape of the data – Bimodal
[Figure: Ch04_Kentucky_Derby histogram; horizontal axis: Time, 110 to 180]
The winning times in the Kentucky Derby from 1875 to the present. Why two modes?
![Page 5: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/5.jpg)
Shape of the data – Bimodal
The winning times in the Kentucky Derby from 1875 to the present. Why two modes?
The length of the track was reduced from 1.5 miles to 1.25 miles in 1896. The race officials thought that 1.5 miles was too far.
![Page 6: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/6.jpg)
Shape of the data – skewed
Data for two different variables for all female heart attack patients in New York state in one year. One is skewed left; the other is skewed right. Which is which?
[Figure: two histograms, labeled LEFT and RIGHT]
![Page 7: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/7.jpg)
Center and Spread of Data
Maximum = 100th percentile
Q3 = 75th percentile
Median = 50th percentile
Q1 = 25th percentile
Minimum = 0th percentile
[Figure: Ch04_Hurricanes box plot; axis: 0 to 8]
These numbers are called the 5 number summary.
The median measures the center of the data.
Q3 - Q1 = Interquartile range (IQR) measures the spread.
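As a rough illustration, the five numbers and the IQR can be computed with Python's standard library. The function names and the sample values are hypothetical, and the "inclusive" quartile method is one convention among several; textbooks and calculators may give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def interquartile_range(values):
    """Return Q3 - Q1, the spread of the middle half of the data."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1

# Hypothetical values, not the hurricane counts from the slide.
sample = [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(five_number_summary(sample))   # (0, 2.0, 4.0, 6.0, 8)
print(interquartile_range(sample))   # 4.0
```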
![Page 8: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/8.jpg)
Symbols:
• s² = Sample Variance
• s = Sample Standard Deviation
• σ² = Population Variance (Pop. St. Dev. Squared)
• σ = Population Standard Deviation (Sq. Root of Variance)
• REMEMBER: The Variance is the SD squared! And the SD is the Sq. root of the Variance!
• x̄ = Mean
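A minimal sketch of these quantities using Python's `statistics` module. The data values are hypothetical, chosen so the population figures come out to round numbers.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, for illustration only

mean = statistics.mean(data)            # the mean
sample_var = statistics.variance(data)  # sample variance (divides by n - 1)
sample_sd = statistics.stdev(data)      # sample standard deviation
pop_var = statistics.pvariance(data)    # population variance (divides by n)
pop_sd = statistics.pstdev(data)        # population standard deviation

# The variance is the SD squared, and the SD is the square root of the variance.
print(math.isclose(sample_sd ** 2, sample_var))  # True
print(mean, pop_var, pop_sd)
```

The sample versions divide by n - 1 instead of n, so they come out slightly larger than the population versions on the same data.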
![Page 9: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/9.jpg)
The normal distribution and standard deviations
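In a normal distribution:
The total area under the curve is 1.
[Figure: normal curve split into bands one standard deviation wide; areas from left to right: 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%]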
![Page 10: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/10.jpg)
The normal distribution and standard deviations
In a normal distribution:
Approximately 68% of scores will fall within one standard deviation of the mean
![Page 11: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/11.jpg)
The normal distribution and standard deviations
In a normal distribution:
Approximately 95% of scores will fall within two standard deviations of the mean
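The 68% and 95% figures can be checked numerically. This is a small sketch using `statistics.NormalDist` from the standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    standard = NormalDist()  # mean 0, standard deviation 1
    return standard.cdf(k) - standard.cdf(-k)

print(round(share_within(1), 4))  # 0.6827
print(round(share_within(2), 4))  # 0.9545
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal distribution.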
![Page 12: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/12.jpg)
The number of points that one standard deviation equals varies from distribution to distribution. On one math test, a standard deviation may be 7 points. If the mean were 45, then we would know that about 68% of the students scored from 38 to 52.
[Figure: normal curve, Points on Math Test; axis: 24 31 38 45 52 59 66; areas: 2.35% 13.5% 34% 34% 13.5% 2.35%]
On another test, a standard deviation may equal 5 points. If the mean were 45, then about 68% of the students would score from 40 to 50 points.
[Figure: normal curve, Points on a Different Test; axis: 30 35 40 45 50 55 60; areas: 2.35% 13.5% 34% 34% 13.5% 2.35%]
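The axis marks on such a curve are just the mean plus or minus whole numbers of standard deviations. A short sketch, with the hypothetical helper name `sd_marks`:

```python
def sd_marks(mean, sd, k_max=3):
    """Return the scores at -k_max ... +k_max standard deviations from the mean."""
    return [mean + k * sd for k in range(-k_max, k_max + 1)]

print(sd_marks(45, 7))  # [24, 31, 38, 45, 52, 59, 66]
print(sd_marks(45, 5))  # [30, 35, 40, 45, 50, 55, 60]
```

The middle three marks of each list bracket the range that holds about 68% of the scores: 38 to 52 on the first test, 40 to 50 on the second.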
![Page 13: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/13.jpg)
Using standard deviation units to describe individual scores
Here is a distribution with a mean of 100 and standard deviation of 10:
[Figure: normal curve; axis: 80 (-2 sd), 90 (-1 sd), 100 (mean), 110 (1 sd), 120 (2 sd)]
What score is one sd below the mean? 90
What score is two sd above the mean? 120
![Page 14: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/14.jpg)
Using standard deviation units to describe individual scores
Here is a distribution with a mean of 100 and standard deviation of 10:
[Figure: normal curve; axis: 80 (-2 sd), 90 (-1 sd), 100 (mean), 110 (1 sd), 120 (2 sd)]
How many standard deviations below the mean is a score of 90? 1
How many standard deviations above the mean is a score of 120? 2
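Converting a score to standard deviation units is one subtraction and one division. In this sketch the function name `sd_units` is hypothetical; a negative result means the score is below the mean.

```python
def sd_units(score, mean, sd):
    """Return how many standard deviations a score lies from the mean."""
    return (score - mean) / sd

print(sd_units(90, 100, 10))   # -1.0, one sd below the mean
print(sd_units(120, 100, 10))  # 2.0, two sd above the mean
```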
![Page 15: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/15.jpg)
Using standard deviation units to describe individual scores
Here is a distribution with a mean of 100 and standard deviation of 10:
[Figure: normal curve; axis: 80 (-2 sd), 90 (-1 sd), 100 (mean), 110 (1 sd), 120 (2 sd)]
What percent of your data points are < 80? 2.50%
What percent of your data points are > 90? 84%
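The slide answers come from adding up the rounded band percentages. As a cross-check, the sketch below computes the same areas from the normal curve itself, assuming the scores are exactly normal; the helper names are hypothetical.

```python
from statistics import NormalDist

scores = NormalDist(mu=100, sigma=10)

def percent_below(x):
    """Percent of the distribution lying below x."""
    return 100 * scores.cdf(x)

def percent_above(x):
    """Percent of the distribution lying above x."""
    return 100 * (1 - scores.cdf(x))

print(round(percent_below(80), 2))  # 2.28
print(round(percent_above(90), 2))  # 84.13
```

The exact areas are close to the rule-of-thumb answers of 2.5% and 84%; the small differences come from rounding 95.45% to 95% and 68.27% to 68%.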
![Page 16: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/16.jpg)
Types of Sampling: Self-selected Sample
• This method allows the members of the sample to choose themselves by responding to a general appeal (volunteering to be surveyed).
• Examples of Self-selected Sample: a call-in radio poll, an internet poll on a website
• Problems with Self-selected samples: bias – because people with strong opinions on the topic (especially negative opinions) are most likely to respond.
![Page 17: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/17.jpg)
Convenience Sampling
• In a convenience sample individuals are chosen because they are easy to reach.
• Example: People conducting a survey go to the mall and stop people who are shopping. This is convenient for the person doing the survey but does not guarantee that the sample is representative of the population of the study.
• Convenience sampling also involves bias on the part of the interviewer.
![Page 18: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/18.jpg)
Random Samples
• A random sample of size “n” consists of “n” individuals from the population, chosen in such a way that every set of “n” individuals has an equal chance to be the sample selected.
• Example: Putting everyone’s name in a hat and drawing 3 names to participate in the study.
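The names-in-a-hat example can be sketched with `random.sample`, which draws without replacement so that every group of n names is equally likely. The function name `draw_names` and the names themselves are hypothetical.

```python
import random

def draw_names(names, n, seed=None):
    """Draw n distinct names at random, like pulling slips from a hat."""
    rng = random.Random(seed)  # a fixed seed is only for repeatable demos
    return rng.sample(names, n)

hat = ["Ana", "Ben", "Cal", "Dee", "Eli", "Fay"]  # hypothetical population
print(draw_names(hat, 3))
```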
![Page 19: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/19.jpg)
Systematic Sample
• In a systematic sample, a rule is used to select members of the population.
• Ex. Every third person on an alphabetized list
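The every-third-person rule can be written as a slice over the sorted list. This sketch assumes counting starts with the third name; the function name `every_kth` and the names are hypothetical.

```python
def every_kth(people, k=3):
    """Alphabetize the list, then pick the k-th, 2k-th, 3k-th, ... person."""
    ordered = sorted(people)
    return ordered[k - 1::k]

roster = ["Dana", "Al", "Cy", "Bo", "Eve", "Fay"]  # hypothetical population
print(every_kth(roster))  # ['Cy', 'Fay']
```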
![Page 20: Ways to look at the data](https://reader036.fdocuments.in/reader036/viewer/2022062520/56815c31550346895dca11b4/html5/thumbnails/20.jpg)
Stratified Random Sample
To select a stratified random sample, first divide the population into groups of similar individuals, called STRATA. Then choose a separate random sample from each stratum and combine these to form the full sample. A common example would be separating by gender or race first, then selecting samples from each group.
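A minimal sketch of the two steps, grouping first and then sampling within each group. The function name, the `stratum_of` parameter, and the grade-level data are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, per_stratum, seed=None):
    """Split people into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Never ask for more people than the stratum contains.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population of (name, grade) pairs, stratified by grade.
students = [("Ana", 9), ("Ben", 9), ("Cal", 9), ("Dee", 10), ("Eli", 10), ("Fay", 10)]
print(stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=2))
```

Here each stratum contributes the same number of people; drawing in proportion to stratum size is another common choice.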