H OW T ALL A RE YOU ?. 2.1A D ESCRIBING L OCATION IN A D ISTRIBUTION Using percentiles Making an...

35
HOW TALL ARE YOU?

Transcript of H OW T ALL A RE YOU ?. 2.1A D ESCRIBING L OCATION IN A D ISTRIBUTION Using percentiles Making an...

HOW TALL ARE YOU?

2.1A DESCRIBING LOCATION IN A DISTRIBUTION

Using percentiles Making an Ogive Calculating and

Interpreting a z-score

What is a percentile?

Wins in Major League BaseballThe stemplot below shows the number of wins for each of the 30 Major League Baseball teams in 2009.

Key: 5|9 represents a team with 59 wins. 5 9

6 2455 7 00455589 8 0345667778 9 12355710 3

Key: 5|9 represents a team with 59 wins.

Calculate and interpret the percentiles for the Colorado Rockies (92 wins), the New York Yankees (103 wins) and the Cleveland Indians (65 wins).

5 9 6 2455 7 00455589 8 0345667778 9 12355710 3

Ogive?

Age of Senators in the 103rd Congress (n=100)

Freq Rel Freq Cumul. Rel. Freq

30 −< 40 1 0.01

40 −< 50 16 0.16

50 −< 60 49 0.49

60 −< 70 22 0.22

70 −< 80 11 0.11

80 −< 90 1 0.01

Age of Senators in the 103rd Congress (n=100)

Freq Rel Freq Cumul. Rel. Freq

30 −< 40 1 0.01 0.01

40 −< 50 16 0.16 0.17

50 −< 60 49 0.49 0.66

60 −< 70 22 0.22 0.88

70 −< 80 11 0.11 0.99

80 −< 90 1 0.01 1.00

Age of Representatives in the 103rd Congress (n=435)

Freq Rel Freq Cumul. Rel. Freq

30 −< 40 47

40 −< 50 153

50 −< 60 131

60 −< 70 89

70 −< 80 12

80 −< 90 3

0.0

0.2

0.4

0.6

0.8

1.0

35 40 45 50 55 60 65 70Median_Household_Income

State Median Household IncomesBelow is a cumulative relative frequency graph showing the distribution of median household incomes for the 50 states and the District of Columbia.

0.0

0.2

0.4

0.6

0.8

1.0

35 40 45 50 55 60 65 70Median_Household_Income

a) California, with a median household income of $57,445, is at what percentile? Interpret this value.

0.0

0.2

0.4

0.6

0.8

1.0

35 40 45 50 55 60 65 70Median_Household_Income

b) What is the 25th percentile for this distribution? What is another name for this value?

0.0

0.2

0.4

0.6

0.8

1.0

35 40 45 50 55 60 65 70Median_Household_Income

c) Where is the graph the steepest? What does this indicate about the distribution?

z-score?

Macy, a 3-year-old female is 100 cm tall. Brody, her 12-year-old brother is 158 cm tall. Obviously, Brody is taller than Macy—but who is taller,

relatively speaking? That is, relative to other kids of the same ages, who

is taller?

(According to the CDCP, the heights of three-year-old females have a mean of 94.5 cm and a standard

deviation of 4 cm. The mean height for 12-year-olds males is 149 cm with

a standard deviation of 8 cm.)

Bellwork: 9/17/15

What information can be obtain from an Ogive?

*Percentiles- Ms. McDonald screwed up!

Year Player HR Mean SD1927 Babe Ruth 60 7.2 9.71961 Roger Maris 61 18.8 13.41998 Mark McGwire 70 20.7 12.72001 Barry Bonds 73 21.4 13.2

To make a fair comparison, we should see how these performances rate relative to

others hitters during the same year. Calculate the standardized score for each

player and compare.

Year Player HR Mean SD1927 Babe Ruth 60 7.2 9.71961 Roger Maris 61 18.8 13.41998 Mark McGwire 70 20.7 12.72001 Barry Bonds 73 21.4 13.2

In 2001, Arizona Diamondback Mark Grace’s home run total

had a standardized score of z = –0.48. Interpret this value and calculate the number of home

runs he hit.

2.1B DESCRIBING LOCATION IN A DISTRIBUTION

TRANSFORM data

DEFINE and DESCRIBE density curves

0 10 20 30 40 50 60 70

Here is a dotplot of Kobe Bryant’s point totals for each of the 82 games in the 2008-2009 regular

season. The mean of this distribution is 26.8 with a standard deviation of 8.6 points.

In what percentage of games did he score within one standard deviation of his mean?

Within two standard deviations?

0 2 4 6 8 10 12 14 16

Here is a dotplot of Tim Lincecum’s strikeout totals for each of the 32 games he pitched in during the

2009 regular season. The mean of this distribution is 8.2 with a standard deviation of 2.8.

In what percentage of games were his strikeouts within one standard deviation of his mean?

Within two standard deviations?

Effect of Adding (or Subtracting) a Constant

Adding the same number a (either positive, zero, or negative) to each observation:

• adds a to measures of center and location (mean, median, quartiles, percentiles), but

• Does not change the shape of the distribution or measures of spread (range, IQR, standard deviation).

Effect of Multiplying (or Dividing) by a ConstantMultiplying (or dividing) each observation by the same number b (positive, negative, or zero):

• multiplies (divides) measures of center and location by b

• multiplies (divides) measures of spread by |b | , but

• does not change the shape of the distribution

In 2010, Taxi Cabs in New York City charged an initial fee of $2.50 plus $2 per mile.

In equation form: fare = 2.50 + 2(miles).

At the end of a month a businessman collects all of his taxi cab receipts and analyzed the

distribution of fares. The distribution was skewed to the right with a mean of $15.45 and a

standard deviation of $10.20.

a) What are the mean and standard deviation of the lengths of his cab rides in miles?

Density Curves

In Chapter 1, we developed a kit of graphical and numerical tools for describing distributions. Now, we’ll add one more step to the strategy.

Exploring Quantitative Data

1. Always plot your data: make a graph.2. Look for the overall pattern (shape, center,

and spread) and for striking departures such as outliers.

3. Calculate a numerical summary to briefly describe center and spread.

4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.

Definition:

A density curve is a curve that• is always on or above the horizontal axis, and• has area exactly 1 underneath it.

A density curve describes the overall pattern of a distribution. The area under the curve

and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval.

In this section, we learned that…

There are two ways of describing an individual’s location within a distribution – the percentile and z-score.

A cumulative relative frequency graph allows us to examine location within a distribution.

It is common to transform data, especially when changing units of measurement. Transforming data can affect the shape, center, and spread of a distribution.

We can sometimes describe the overall pattern of a distribution by a density curve (an idealized description of a distribution that smooths out the irregularities in the actual data).