Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.
![Page 1: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/1.jpg)
Variability
Introduction to Statistics, Chapter 4
Jan 22, 2009, Class #4
![Page 2: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/2.jpg)
Describing Variability
• Variability describes, as an exact quantitative measure, how spread out or clustered together the scores are
• Variability is usually defined in terms of distance:
  • How far apart scores are from each other
  • How far apart scores are from the mean
  • How representative a score is of the data set as a whole
![Page 3: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/3.jpg)
Describing Variability: the Range
• The range is the simplest and most obvious way of describing variability
• Range = Highest - Lowest (using real limits: the upper real limit of the highest score minus the lower real limit of the lowest score)
• The range only takes into account the two extreme scores and ignores any values in between. To counter this, the distribution is divided into quarters (quartiles): Q1 = 25th percentile, Q2 = 50th percentile, Q3 = 75th percentile
• The interquartile range: the distance between the first and third quartiles (Q3 - Q1)
• The semi-interquartile range: one half of the interquartile range
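The range and semi-interquartile range described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the real limits are assumed to extend half a measurement unit beyond each score (scores measured in whole units), and the quartile values 33 and 93 are borrowed from the IQR example later in these slides.

```python
def range_with_real_limits(scores, unit=1.0):
    """Upper real limit of the highest score minus lower real limit of the lowest.

    Assumes scores are measured to the nearest `unit`, so each real limit
    lies half a unit beyond the recorded score.
    """
    half = unit / 2
    return (max(scores) + half) - (min(scores) - half)


def semi_interquartile_range(q1, q3):
    """One half of the interquartile range (Q3 - Q1)."""
    return (q3 - q1) / 2


scores = [18, 33, 58, 67, 73, 93, 147]
score_range = range_with_real_limits(scores)   # (147 + 0.5) - (18 - 0.5)
semi_iqr = semi_interquartile_range(33, 93)    # (93 - 33) / 2
print(score_range, semi_iqr)
```

Note that the range depends only on 18 and 147, while the semi-interquartile range ignores both of them.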
![Page 4: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/4.jpg)
The most common percentiles are quartiles. Quartiles divide data sets into fourths or four equal parts.
• The 1st quartile, denoted Q1, divides the bottom 25% of the data from the top 75%. Therefore, the 1st quartile is equivalent to the 25th percentile.
• The 2nd quartile divides the bottom 50% of the data from the top 50% of the data, so that the 2nd quartile is equivalent to the 50th percentile, which is equivalent to the median.
• The 3rd quartile divides the bottom 75% of the data from the top 25% of the data, so that the 3rd quartile is equivalent to the 75th percentile.
![Page 5: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/5.jpg)
Interquartile range (IQR)
• The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile
• The IQR is essentially the range of the middle 50% of the data
• Because it uses the middle 50%, the IQR is not affected by outliers (extreme values)
![Page 6: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/6.jpg)
Interquartile range (IQR)
Example: Compute the interquartile range for the sorted data 18, 33, 58, 67, 73, 93, 147
• The 25th and 75th percentiles are at positions 0.25 × (7 + 1) = 2 and 0.75 × (7 + 1) = 6, i.e. the 2nd and 6th observations (33 and 93), respectively
• IQR = 93 - 33 = 60
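The position rule used in this example, p × (n + 1), can be written out as a small Python sketch. The helper name `percentile_by_position` is hypothetical, and the linear interpolation for non-integer positions is an assumption added for completeness; the slide's example only needs whole-number positions.

```python
def percentile_by_position(sorted_scores, p):
    """Value at 1-based position p * (n + 1) in already-sorted data.

    Non-integer positions are linearly interpolated between neighbours
    (an assumption; the slide example lands on whole positions).
    """
    n = len(sorted_scores)
    pos = min(max(p * (n + 1), 1), n)   # clamp to a valid 1-based position
    lower = int(pos)                    # whole part of the position
    frac = pos - lower                  # fractional part
    if lower >= n:
        return sorted_scores[-1]
    below = sorted_scores[lower - 1]
    above = sorted_scores[lower]
    return below + frac * (above - below)


data = [18, 33, 58, 67, 73, 93, 147]
q1 = percentile_by_position(data, 0.25)   # position 2
q3 = percentile_by_position(data, 0.75)   # position 6
iqr = q3 - q1
print(q1, q3, iqr)
```

Other software may use a different percentile convention, so quartiles from a statistics package can differ slightly from this hand method on other data sets.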
![Page 7: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/7.jpg)
Describing Variability: Deviation in a Population
• A more sophisticated measure of variability is one that shows how scores cluster around the mean
• Deviation is the distance of a score from the mean: $X - \mu$, e.g. $11 - 6.35 = 4.65$, $3 - 6.35 = -3.35$
• A measure representative of the variability of all the scores would be the mean of the deviation scores: add all the deviations and divide by $N$
$$\frac{\sum (X - \mu)}{N}$$
• However, the deviation scores always add up to zero (the mean serves as the balance point for the scores), so the mean deviation is always zero
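A quick check of the claim that deviations from the mean sum to zero, sketched in Python. The data set [3, 4, 5, 6, 7] is the one used in the variance example in these slides; the variable names are illustrative only.

```python
scores = [3, 4, 5, 6, 7]
mu = sum(scores) / len(scores)            # population mean

# Signed distance of every score from the mean
deviations = [x - mu for x in scores]

# Positive and negative deviations cancel exactly
sum_of_deviations = sum(deviations)
mean_deviation = sum_of_deviations / len(scores)
print(deviations, sum_of_deviations, mean_deviation)
```

Because the result is zero for any data set, the mean deviation carries no information about spread, which is why the deviations are squared in the next step.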
![Page 8: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/8.jpg)
Describing Variability: Variance in a Population
• To remove the +/- signs, we simply square each deviation before finding the average. This is called the variance:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{106.55}{20} = 5.33$$
• The numerator is referred to as the Sum of Squares (SS), as it is the sum of the squared deviations around the mean value
• SS is a basic component of variability
![Page 9: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/9.jpg)
Variability: Variance in a Population
• Let X = [3, 4, 5, 6, 7]
• Mean = 5
• (X - Mean) = [-2, -1, 0, 1, 2]: subtract the mean from each number in X
• (X - Mean)² = [4, 1, 0, 1, 4]: squared deviations from the mean
• Σ(X - Mean)² = 10: sum of squared deviations from the mean (SS)
• Σ(X - Mean)² / N = 10/5 = 2: average squared deviation from the mean
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
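The same steps can be followed line by line in Python. This is a minimal sketch that mirrors the slide's worked example; the variable names are illustrative.

```python
X = [3, 4, 5, 6, 7]
N = len(X)

mean = sum(X) / N                          # 5.0
deviations = [x - mean for x in X]         # subtract the mean from each score
squared = [d ** 2 for d in deviations]     # squared deviations from the mean
SS = sum(squared)                          # sum of squares
variance = SS / N                          # average squared deviation

print(deviations, squared, SS, variance)
```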
![Page 10: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/10.jpg)
Variability: Variance in a Population
• Let X = [1, 3, 5, 7, 9]
• Mean = 5
• (X - Mean) = [-4, -2, 0, 2, 4]: subtract the mean from each number in X
• (X - Mean)² = [16, 4, 0, 4, 16]: squared deviations from the mean
• Σ(X - Mean)² = 40: sum of squared deviations from the mean (SS)
• Σ(X - Mean)² / N = 40/5 = 8: average squared deviation from the mean
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
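The two worked examples have the same mean but different spread. A short Python sketch makes the comparison explicit; `population_variance` is a hypothetical helper, cross-checked here against `statistics.pvariance` from the standard library.

```python
import statistics


def population_variance(scores):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / n


narrow = [3, 4, 5, 6, 7]
wide = [1, 3, 5, 7, 9]

var_narrow = population_variance(narrow)
var_wide = population_variance(wide)

# Both sets have mean 5, but the more spread-out set has the larger variance
print(var_narrow, var_wide)
print(statistics.pvariance(narrow), statistics.pvariance(wide))
```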
![Page 11: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/11.jpg)
Variability: Variance in a Population
Population variance can be calculated as the sum of squares (SS) divided by N:
$$\sigma^2 = \frac{SS}{N} = \frac{\sum (X - \mu)^2}{N}$$
![Page 12: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/12.jpg)
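Variability: Variance in a Sample
Variance in a sample:
$$s^2 = \frac{SS}{n - 1} = \frac{\sum (X - \bar{X})^2}{n - 1}$$
• n is the number of scores
• SS is the sum of squared deviations from the mean: $SS = \sum (X - \bar{X})^2$
• So the sample variance ($s^2$) is approximately the average squared deviation from the mean, with n - 1 rather than n in the denominator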
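A minimal Python sketch of the sample variance, assuming the scores [3, 4, 5, 6, 7] from the earlier population example are now treated as a sample. The helper name `sample_variance` is hypothetical; `statistics.variance` is the standard-library function that also divides by n - 1.

```python
import statistics


def sample_variance(scores):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / (n - 1)


sample = [3, 4, 5, 6, 7]
s2 = sample_variance(sample)        # SS = 10, n - 1 = 4
print(s2, statistics.variance(sample))
```

With SS = 10, dividing by n - 1 = 4 gives a larger value than dividing by N = 5.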
![Page 13: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/13.jpg)
Describing Variability: Population and Sample Variance
• Population variance is designated by σ²:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{SS}{N}$$
• Sample variance is designated by s²
• Samples tend to be less variable than the populations they come from, so dividing SS by n would give a biased (too small) estimate of population variability
• Degrees of freedom (df): the number of independent (free to vary) scores. In a sample, the sample mean must be known before the variance can be calculated, so once the mean is fixed the final score is determined by the other scores: df = n - 1
$$s^2 = \frac{\sum (X - M)^2}{n - 1} = \frac{SS}{n - 1} = \frac{106.55}{20 - 1} = 5.61$$
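The slides report SS = 106.55 for 20 scores but do not list the scores themselves, so the sketch below starts from those two numbers only. It simply compares the two denominators; the variable names are illustrative.

```python
SS = 106.55   # sum of squared deviations reported on the slides
n = 20        # number of scores

population_variance = SS / n        # divide by N: treats the 20 scores as a population
sample_variance = SS / (n - 1)      # divide by df = n - 1: treats them as a sample

print(f"SS / N       = {population_variance:.4f}")
print(f"SS / (n - 1) = {sample_variance:.4f}")
```

Dividing by n - 1 always gives the larger value, which compensates for the tendency of samples to understate population variability.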
![Page 14: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/14.jpg)
Describing Variability: the Standard Deviation
• Variance is a measure based on squared distances
• In order to get around this, we can take the square root of the variance, which gives us the standard deviation
• Population (σ) and sample (s) standard deviation:
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
$$s = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$$
![Page 15: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/15.jpg)
Variability: Standard Deviation of a Sample
The square root of the variance is called the standard deviation.
Variance:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
Standard Deviation:
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
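As a sketch of the relationship between variance and standard deviation, the following Python uses the standard-library `statistics` module on the scores [3, 4, 5, 6, 7] from the earlier example (treating them first as a sample, then as a population, purely for illustration).

```python
import math
import statistics

scores = [3, 4, 5, 6, 7]

# Sample versions: divide SS by n - 1
s2 = statistics.variance(scores)
s = statistics.stdev(scores)

# Population versions: divide SS by N
sigma2 = statistics.pvariance(scores)
sigma = statistics.pstdev(scores)

# In both cases the standard deviation is the square root of the variance
print(s2, s, math.sqrt(s2))
print(sigma2, sigma, math.sqrt(sigma2))
```

Taking the square root returns the measure to the original units of the scores rather than squared units.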
![Page 16: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/16.jpg)
Variability: Standard Deviation
“The Standard Deviation tells us approximately how far the scores vary from the mean on average”
• It is approximately the average deviation of scores from the mean
![Page 17: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/17.jpg)
The Standard Deviation and the Normal Distribution
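• There are known percentages of scores above or below any given point on a normal curve
• About 34% of scores fall between the mean and 1 SD above the mean, and another 34% between the mean and 1 SD below it
• An additional 14% of scores fall between 1 and 2 SDs above the mean, and another 14% between 1 and 2 SDs below it
• Thus, about 96% of all scores are within 2 SDs of the mean (34% + 34% + 14% + 14% = 96%; the sum of these rounded figures slightly overstates the exact value, which is closer to 95%)
• Note: the 34% and 14% figures can be useful to remember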
[Figure: normal curve; vertical axis: Probability Density]
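The 34% and 14% figures are rounded areas under the standard normal curve. As a hedged check, they can be recomputed with `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)   # standard normal: distances measured in SD units

mean_to_1sd = z.cdf(1) - z.cdf(0)      # area between the mean and 1 SD above it
one_to_2sd = z.cdf(2) - z.cdf(1)       # area between 1 SD and 2 SDs above the mean
within_2sd = z.cdf(2) - z.cdf(-2)      # area within 2 SDs on either side

print(f"mean to 1 SD: {mean_to_1sd:.4f}")
print(f"1 SD to 2 SD: {one_to_2sd:.4f}")
print(f"within 2 SDs: {within_2sd:.4f}")
```

The unrounded areas are roughly 0.341 and 0.136, so the total within 2 SDs comes out a little under the 96% obtained by adding the rounded percentages.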
![Page 18: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/18.jpg)
Describing Variability
• The standard deviation is the most common measure of variability, but the others can be used
• A good measure of variability must be stable and reliable, i.e. not greatly affected by little details in the data, such as:
  • Extreme scores
  • Multiple sampling from the same population
  • Open-ended distributions
• Both the variance and SD are related to other statistical techniques
![Page 19: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/19.jpg)
SS Computational Formula
• Note this formula on page 93. In later chapters, we will be using this alternate SS formula.
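The transcript does not reproduce the formula from page 93. The sketch below assumes it is the usual computational form, SS = ΣX² - (ΣX)²/N, and checks it against the definitional form used in the earlier slides. Both function names are hypothetical.

```python
def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def ss_computational(scores):
    """SS from raw scores: sum of X squared minus (sum of X) squared over N.

    Assumed form of the textbook's computational formula; the slide itself
    only points to the page.
    """
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return sum_x_squared - sum_x ** 2 / n


X = [3, 4, 5, 6, 7]
print(ss_definitional(X), ss_computational(X))   # both give SS for the same data
```

The computational form avoids computing each deviation, which is convenient when the mean is not a whole number.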
![Page 20: Variability Introduction to Statistics Chapter 4 Jan 22, 2009 Class #4.](https://reader036.fdocuments.in/reader036/viewer/2022082501/5a4d1b1e7f8b9ab059994518/html5/thumbnails/20.jpg)
Credits
• http://www.le.ac.uk/pc/sk219/introtostats1.ppt#259,4,Plotting Data: describing spread of data
• http://math.usask.ca/~miket/Sullivan_PP/Chapter_3/sec3_4.ppt#24