Stat203, Fall 2011 – Week 3, Lecture 3


Topics for Today

- Inter-Quartile Range
- Observing Variability in Figures
- The Standard Deviation
- Review of Assignment #1


Recap of Quartiles

Quartiles are the percentiles that divide the distribution into 4 equal parts:

Q1 = 1st Quartile = 25th Percentile

Q2 = 2nd Quartile = 50th Percentile = Median

Q3 = 3rd Quartile = 75th Percentile

Remember, the 25th percentile is the value below which 25% of all individuals fall.
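To make the percentile idea concrete, here is a minimal sketch using Python's standard library. The `ages` list is made up for illustration (it is not the GSS data), and the "inclusive" quartile method is only one of several conventions, so other software such as SPSS may report slightly different values.

```python
import statistics

# Hypothetical ages, NOT the GSS data; already sorted for readability.
ages = [18, 22, 25, 29, 33, 38, 41, 47, 52, 58, 63, 71]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# Share of individuals whose age is strictly below Q1.
share_below_q1 = sum(a < q1 for a in ages) / len(ages)

print("Q1 (25th percentile):", q1)
print("Q2 (50th percentile, the median):", q2)
print("Q3 (75th percentile):", q3)
print("Share of individuals below Q1:", share_below_q1)
```

With this small sample, a quarter of the individuals fall below Q1, and Q2 matches the median.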

Let's look at the quartiles for Age in the GSS data.


Now mark approximately where these points are on the histogram…


Some similar exercises can be found at:

http://simon.cs.vt.edu/SoSci/converted/Dispersion_I/activity.html

Check out the Quartile Finder Applet


Interpreting the IQR

The IQR covers the middle 50% of all values of a variable (IQR = Q3 − Q1). 50% of all individuals are between Q1 and Q3.

What does it mean if the IQR is small?

IQR small = the middle 50% of the data is very close to the median

IQR large = the middle 50% is farther from the median

From the GSS, the middle 50% of the data is within 25 years; IQR = 25 years for the GSS. Let's look at this graphically.
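As a small numeric illustration of the small-versus-large IQR contrast, the sketch below compares two made-up samples that share the same median. The helper name `iqr` and both data sets are assumptions for illustration, and the "inclusive" quartile convention is just one reasonable choice.

```python
import statistics

def iqr(values):
    """Inter-quartile range, Q3 - Q1, using the 'inclusive' quartile convention."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Two hypothetical samples, both with median 40.
tight = [36, 38, 39, 40, 41, 42, 44]    # middle 50% hugs the median
spread = [10, 20, 30, 40, 50, 60, 70]   # middle 50% is far from the median

print("median (tight): ", statistics.median(tight), " IQR:", iqr(tight))
print("median (spread):", statistics.median(spread), " IQR:", iqr(spread))
```

Both samples are centred at the same place, but the second has a much larger IQR, so its middle 50% sits farther from the median.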


Boxplots

Boxplots are a convenient way to display the distribution of data, and include the quartiles.

These 5 numbers (Min, Q1, Median, Q3, Max) are also known as the five-number summary.
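The five numbers a boxplot displays (Min, Q1, Median, Q3, Max) can be computed directly. This is a rough sketch with a hypothetical helper, `five_number_summary`, and made-up ages. Statistical packages may use a different quartile convention, so their boxplot hinges can differ slightly.

```python
import statistics

def five_number_summary(values):
    """Return Min, Q1, Median, Q3, Max (the numbers a boxplot draws)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

# Hypothetical ages, not the GSS data.
ages = [18, 22, 25, 29, 33, 38, 41, 47, 52, 58, 63, 71]

summary = five_number_summary(ages)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```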


In SPSS …


Boxplots are related to Histograms

Boxplots and Histograms are both graphical methods to examine the distribution of data. The image below shows how they relate:

http://simon.cs.vt.edu/SoSci/converted/Dispersion_I/box_n_hist.gif


Let's now look at the Comparing Measures of Dispersion applet:

http://simon.cs.vt.edu/SoSci/converted/Dispersion_I/activity.html

(Please note that the RANGE is not properly defined on this page!)


One quick note on Figures

Consider the following figure:

http://timelyportfolio.blogspot.com/2011/04/bond-market-as-casino-game-part-1.html


1. Note the correspondence between the histogram and the boxplot (most of the bars are between Q1 and Q3).

2. Note that there is an additional mark on the boxplot for the mean.

3. Note that the smooth line '____________' the histogram … we will often just draw the smooth line instead of drawing the histogram.


(Standard) Deviation

Take a look at these two histograms.


(data: http://journals.iucr.org/d/issues/2004/12/01/ba5071/index.html )

Which do you think has greater ___________? We could define this as, ______ IQR.

Individuals from both (a) and (b) have a mean of zero. If we were to randomly select an individual from (a), then an individual from (b), each individual would have a value between -4 and 4 …

But … which individual, (a) or (b), is likely to be closer to zero?


The distance from an observation to the mean is called a deviation.

So … let's go back to the GSS data. The mean Age was _____. The deviation for someone who is 20 is then:

Deviation = __________ = ______

So … that individual is _____ years below the mean.
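The deviation calculation can be sketched in a couple of lines. The mean used here, `ASSUMED_MEAN_AGE = 45.0`, is a placeholder chosen purely for illustration and is not the GSS mean, which is left blank above to be filled in from the class data.

```python
def deviation(value, mean):
    """Signed distance from an observation to the mean (negative = below the mean)."""
    return value - mean

# Placeholder value for illustration only; NOT the actual GSS mean age.
ASSUMED_MEAN_AGE = 45.0

d = deviation(20, ASSUMED_MEAN_AGE)
print(f"Deviation = 20 - {ASSUMED_MEAN_AGE} = {d}")
print(f"That individual is {abs(d)} years below the (assumed) mean.")
```

The sign carries the direction: a negative deviation means the observation is below the mean, a positive one means it is above.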


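The Standard Deviation

The standard deviation is simply a way of averaging all the individual deviations. Let's do this for a sample of 5 people from class and see what we're talking about:

         Weight    Deviation    Absolute Deviation    Squared Deviation
         ______    _________    __________________    _________________
         ______    _________    __________________    _________________
         ______    _________    __________________    _________________
         ______    _________    __________________    _________________
         ______    _________    __________________    _________________
Mean:    ______    _________    __________________    _________________

Note that the mean deviation will ALWAYS be zero.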
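The table can be filled in programmatically to see why the plain mean deviation is not a useful measure of spread. The five weights below are invented for illustration (they are not the in-class values), and the column layout simply mirrors the table above.

```python
import statistics

# Hypothetical weights (kg) for 5 people; NOT the in-class sample.
weights = [60, 70, 80, 90, 100]
mean_weight = statistics.fmean(weights)

deviations = [w - mean_weight for w in weights]
abs_deviations = [abs(d) for d in deviations]
sq_deviations = [d ** 2 for d in deviations]

print(f"{'Weight':>8} {'Deviation':>10} {'Absolute':>10} {'Squared':>10}")
for w, d, a, s in zip(weights, deviations, abs_deviations, sq_deviations):
    print(f"{w:>8} {d:>10.1f} {a:>10.1f} {s:>10.1f}")

mean_dev = statistics.fmean(deviations)
mean_abs_dev = statistics.fmean(abs_deviations)
mean_sq_dev = statistics.fmean(sq_deviations)
print(f"{'Mean:':>8} {mean_dev:>10.1f} {mean_abs_dev:>10.1f} {mean_sq_dev:>10.1f}")
```

The positive and negative deviations cancel, so their mean is zero. Taking absolute values or squaring removes the sign, which is why those columns give a usable average.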


And … the standard deviation is the square root of the mean squared deviation.

$$ s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} $$

Exercise: check this for the in-class example with weights.

We will not be calculating standard deviations by hand … that is what computers are for. What is important is that you understand what it is and what it means.
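Since the computer does the arithmetic, here is a minimal sketch of the formula with N in the denominator, as on this slide. The weights are invented for illustration, and the function name `standard_deviation` is an assumption. The result is cross-checked against `statistics.pstdev`, which also divides by N.

```python
import math
import statistics

def standard_deviation(values):
    """Square root of the mean squared deviation (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

# Hypothetical weights (kg); NOT the in-class sample.
weights = [60, 70, 80, 90, 100]

s = standard_deviation(weights)
print("by formula:       ", round(s, 3))
print("statistics.pstdev:", round(statistics.pstdev(weights), 3))
```

Note that `statistics.stdev` (without the `p`) divides by N − 1 instead of N, so it gives a slightly larger value for the same data.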


Handy feature of the Standard Deviation

When the distribution looks kind of symmetric (approximately like below):
• about _ of the distribution is within 1 standard deviation of the mean
• about ___ is within 2 standard deviations of the mean
• about ___ is within 3 standard deviations of the mean
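This rule of thumb can be checked by simulation. The sketch below assumes an exactly normal (bell-shaped) distribution, which is stronger than "kind of symmetric". The seed, the sample size, and the helper name `share_within` are arbitrary choices for illustration. For a normal distribution the shares come out near 68%, 95%, and 99.7%.

```python
import random
import statistics

random.seed(203)  # arbitrary seed so the run is repeatable

# Simulated bell-shaped data with mean 0 and standard deviation 1.
values = [random.gauss(0.0, 1.0) for _ in range(100_000)]
mean = statistics.fmean(values)
sd = statistics.pstdev(values)

def share_within(k):
    """Fraction of values within k standard deviations of the mean."""
    return sum(abs(v - mean) <= k * sd for v in values) / len(values)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```

For strongly skewed or bimodal data these shares can be quite different, which is why the rule is stated only for roughly symmetric, bell-shaped distributions.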


Today's Topics

Inter-Quartile Range
- Captures the middle 50% of the data
- Large = data is more spread out
- Small = most individuals are close to the median

Variability in Figures
- Boxplots: including Min, Max, Q1, Q3, Median (sometimes the Mean and outlying observations)
- Boxplots and Histograms give similar information
- Can 'eyeball' the center, and determine whether the distribution is skewed, or symmetric, or bimodal

Standard Deviation
- Tells, on average, how far an individual is expected to be away from the mean
- Large = data is spread out and lots of individuals are far from the mean
- Small = tightly grouped (steeply peaked histogram) with most individuals close to the mean


Reading for next lecture: Chapter 5