Statistical Measures

Mrs. Watkins
AP Statistics
Chapters 5, 6

MEASURES OF CENTER

Mean: arithmetic average of all data values

population mean: $\mu$
sample mean: $\bar{x}$
Formula: $\bar{x} = \frac{\sum x}{n}$

Mode: the most common value in a data set

Median: the middle value in a data set

Midrange: average of the extremes

Trimmed Mean: the mean of a data set after a certain percentage of data values has been trimmed off the ends of the distribution

Ex:
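A short Python sketch of the measures of center above. The data values are made up for illustration, and the trimmed mean here assumes the given percentage is cut from EACH end (some textbooks define the percentage differently).

```python
from statistics import mean, median, multimode

def midrange(values):
    """Average of the smallest and largest values."""
    return (min(values) + max(values)) / 2

def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from EACH end (assumed convention)."""
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100)  # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])

data = [2, 3, 3, 4, 5, 6, 7, 8, 9, 53]  # made-up data with one extreme value

print(mean(data))              # 10
print(median(data))            # 5.5
print(multimode(data))         # [3]
print(midrange(data))          # 27.5
print(trimmed_mean(data, 10))  # 5.625
```

The single extreme value (53) pulls the mean and midrange up, while the median and trimmed mean stay near the bulk of the data.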

5 number summary

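5 important numbers in data set:
Min: smallest value
Q1: first quartile (25th percentile)
Med: median (50th percentile)
Q3: third quartile (75th percentile)
Max: largest value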

Q1, Med, Q3, may not be actual data values
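A minimal sketch of a 5 number summary in Python. It assumes the "median of each half" method for quartiles (the median itself is left out of both halves when the count is odd); other software may use a different quartile rule and give slightly different Q1 and Q3. The data are made up.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]        # values below the median position
    upper_half = ordered[(n + 1) // 2 :]  # values above the median position
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

data = [1, 3, 4, 7, 8, 10, 12, 15]  # made-up data
print(five_number_summary(data))  # (1, 3.5, 7.5, 11.0, 15)
```

Here Q1 = 3.5, Med = 7.5, and Q3 = 11 are not values in the data set.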

BOXPLOT

graphical display of data using 5 number summary (if outliers shown, called “modified box plot”)

OUTLIERS

Outliers:

IQR Test for Outliers:
(IQR)(1.5) = multiplier M
Q1 - M = outlier lower bound
Q3 + M = outlier upper bound

If values fall outside these bounds, they are outliers
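The IQR test can be sketched in Python as follows. The data set and its quartiles (found by the median-of-halves method) are made up for illustration.

```python
def iqr_outlier_bounds(q1, q3):
    """Return (lower bound, upper bound) from the 1.5 x IQR rule."""
    m = 1.5 * (q3 - q1)  # multiplier M
    return q1 - m, q3 + m

def find_outliers(values, q1, q3):
    """List the values that fall outside the outlier bounds."""
    low, high = iqr_outlier_bounds(q1, q3)
    return [x for x in values if x < low or x > high]

data = [1, 3, 4, 7, 8, 10, 12, 40]  # made-up data
q1, q3 = 3.5, 11.0                  # quartiles of this data set

print(iqr_outlier_bounds(q1, q3))   # (-7.75, 22.25)
print(find_outliers(data, q1, q3))  # [40]
```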

RESISTANCE

Resistant Measures:

Non-resistant Measures:

Mean, Midrange: non-resistant
Median, IQR, Trimmed Mean: resistant

MEASURES OF SPREAD

Range: the spread between high and low

Resistant?

IQR (Interquartile Range): Q3 - Q1

Resistant?

STANDARD DEVIATION

a measure of the average amount of deviation from the mean among the data values

Population St. Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Sample St. Deviation: $s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

We generally use $s_x$ because we usually do not have the entire population.
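A quick comparison of the two standard deviations using Python's `statistics` module, where `pstdev` divides by N (population) and `stdev` divides by n - 1 (sample). The data are made up.

```python
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean = 5

population_sd = pstdev(data)  # treats the data as the whole population
sample_sd = stdev(data)       # treats the data as a sample

print(round(population_sd, 3))  # 2.0
print(round(sample_sd, 3))      # 2.138
```

The sample value is always a little larger because it divides by n - 1 instead of N.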

VARIANCE

the square of the standard deviation; what you get before taking the square root

Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample Variance: $s_x^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

This measure is not used much in elementary statistics, but you need to know what it is.
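A by-hand sketch showing that the variance is the step just before the square root. The data are made up, and the same list is treated first as a population and then as a sample purely for comparison.

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
n = len(data)
x_bar = sum(data) / n                          # mean = 5.0
sum_sq = sum((x - x_bar) ** 2 for x in data)   # sum of squared deviations = 32.0

population_variance = sum_sq / n        # divide by N
sample_variance = sum_sq / (n - 1)      # divide by n - 1

print(population_variance, sqrt(population_variance))  # 4.0 2.0
print(round(sample_variance, 3), round(sqrt(sample_variance), 3))  # 4.571 2.138
```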

Coefficient of Variation

measure of how large a st. dev. is relative to the mean: CV = (st. dev. / mean) × 100%

Ex: St. deviation of IQ = 15, Mean 100

St. deviation of height = 3 in, Mean 69
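The two examples above can be worked out in Python. The function name is just an illustrative choice; the numbers are the ones given in the example.

```python
def coefficient_of_variation(st_dev, mean):
    """St. deviation as a percentage of the mean."""
    return 100 * st_dev / mean

iq_cv = coefficient_of_variation(15, 100)    # IQ: st. dev. 15, mean 100
height_cv = coefficient_of_variation(3, 69)  # height: st. dev. 3 in, mean 69 in

print(round(iq_cv, 1))      # 15.0
print(round(height_cv, 1))  # 4.3
```

Relative to its mean, IQ varies more than height does (about 15% versus about 4.3%).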

“Comment on the distribution”

You now have numbers to support your statements, rather than just graphs.

SHAPE:
OUTLIERS:
CENTER:
SPREAD: how widely does the data vary?
Unusual Features: gaps, clusters

SHAPE

If the mean > median, then data distribution is skewed ________. The mean is in the tail.

If the mean < median, then data distribution is skewed ________. The mean is in the tail.

If the mean ≈ median, then data distribution is approximately ____________.

SHAPE

Symmetric if mean = median

SKEWNESS

Skewed left if mean < median
Skewed right if mean > median

[Figure: left-skewed and right-skewed curves, labeled Left and Right]

Mean is in the tail of the data
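A small Python illustration of how the mean and median compare for each shape. The three data sets are made up to show a long right tail, a long left tail, and a symmetric pattern.

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]        # long tail on the high side
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]   # long tail on the low side
symmetric = [1, 2, 3, 4, 5, 6, 7]

print(mean(right_skewed), median(right_skewed))  # 4.75 3.0
print(mean(left_skewed), median(left_skewed))    # 16.375 18.5
print(mean(symmetric), median(symmetric))        # 4 4
```

In each skewed set, the mean is pulled toward the tail while the median stays with the bulk of the data.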

OTHER SHAPES

Uniform distribution: all values relatively evenly distributed across interval

Bimodal distribution: two peaks

TRANSFORMATIONS TO DATA

What would happen to the statistical measures if a constant were added to or subtracted from each data value?

Mean:
Standard Deviation:
Median:
IQR:

What would happen to the statistical measures if each data value were multiplied or divided by a constant?

Mean:
Standard Deviation:
Median:
IQR:
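One way to explore these questions is to transform a small data set and recompute the measures. This is a sketch with made-up data; the constants 10 and 2 are arbitrary, and the quartiles assume the median-of-halves method.

```python
from statistics import mean, median, stdev

def summary(values):
    """Mean, sample st. deviation, median, and IQR (median-of-halves quartiles)."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])
    q3 = median(ordered[(n + 1) // 2 :])
    return {"mean": mean(values), "st_dev": stdev(values),
            "median": median(values), "iqr": q3 - q1}

data = [1, 3, 4, 7, 8, 10, 12, 15]   # made-up data
shifted = [x + 10 for x in data]     # add a constant to every value
scaled = [x * 2 for x in data]       # multiply every value by a constant

for label, values in [("original", data), ("plus 10", shifted), ("times 2", scaled)]:
    print(label, summary(values))
```

For this data set, adding 10 moves the mean and median up by 10 and leaves the standard deviation and IQR alone; multiplying by 2 doubles all four measures.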

TRANSFORMATIONS TO DATA SET

What would happen to the statistical measures if one very low or very high data value was added to the set?

Mean:
Standard Deviation:
Median:
IQR:
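A sketch of what adding one extreme value does, using made-up data and an arbitrary extreme value of 100. The IQR helper assumes the median-of-halves method for quartiles.

```python
from statistics import mean, median, stdev

def iqr(values):
    """Q3 - Q1 using the median-of-halves method (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return median(ordered[(n + 1) // 2 :]) - median(ordered[: n // 2])

data = [1, 3, 4, 7, 8, 10, 12, 15]  # made-up data
with_extreme = data + [100]         # one very high value added

print(round(mean(data), 2), round(mean(with_extreme), 2))    # 7.5 17.78
print(round(stdev(data), 2), round(stdev(with_extreme), 2))  # 4.75 31.15
print(median(data), median(with_extreme))                    # 7.5 8
print(iqr(data), iqr(with_extreme))                          # 7.5 10.0
```

The mean and standard deviation change a great deal, while the median and IQR move only a little. This is what "resistant" means.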

MEASURES OF POSITION

Give a numerical approximation of where a single data value stands compared to the whole distribution

Quartiles:
Percentiles:
Z Scores:

Z SCORES

standardized score

how a single value compares to entire data set in terms of position in distribution

$z = \frac{x - \mu}{\sigma}$
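A z score is the data value minus the mean, divided by the standard deviation. A minimal Python sketch, using the IQ and height figures quoted earlier (mean 100, st. dev. 15; mean 69 in, st. dev. 3 in) with made-up individual values:

```python
def z_score(x, mean, st_dev):
    """Number of st. deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / st_dev

print(z_score(130, 100, 15))  # 2.0  (hypothetical IQ of 130)
print(z_score(66, 69, 3))     # -1.0 (hypothetical height of 66 in)
```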

How unusual are you?

Compute your z score for height?

Compute your z score for Math SAT?

Compute your z score for IQ?

NORMAL MODEL

shows how data is distributed symmetrically along an interval according to empirical rule

Empirical Rule:
about 68% of data within 1 st. deviation of μ
about 95% of data within 2 st. deviations of μ
about 99.7% of data within 3 st. deviations of μ
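The empirical rule percentages can be checked against the standard normal model with Python's `statistics.NormalDist`. This is only a check of the theoretical model, not of any real data set.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, st. deviation 1

def share_within(k):
    """Percent of a normal distribution within k st. deviations of the mean."""
    return 100 * (std_normal.cdf(k) - std_normal.cdf(-k))

for k in (1, 2, 3):
    print(k, round(share_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```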

ANOTHER OUTLIER TEST

Using Empirical Rule:

Data values more than 2 st. deviations away from the mean (|z| > 2) are mild outliers

Data values more than 3 st. deviations away from the mean (|z| > 3) are extreme outliers
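A sketch of this test in Python. It assumes the rule applies in both directions (far above or far below the mean), so it compares the absolute value of z with the cut-offs 2 and 3; the labels are the ones used above.

```python
def classify_by_z(z):
    """Label a z score using the 2 and 3 st. deviation cut-offs."""
    if abs(z) > 3:
        return "extreme outlier"
    if abs(z) > 2:
        return "mild outlier"
    return "not an outlier"

print(classify_by_z(1.4))   # not an outlier
print(classify_by_z(-2.5))  # mild outlier
print(classify_by_z(3.2))   # extreme outlier
```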

NORMAL CURVE

a theoretical ideal about how traits/characteristics are distributed

Many human traits are approximately normally distributed such as height, body temp, IQ, pulse

Avoid using “normal” when describing data—say “approximately normal or symmetric” unless clearly mound-shaped, bell-shaped

NORMAL CURVE

Normal curve—symmetric, mound-shaped

Area under curve = 1

A z score can be used to establish what % of the curve is less or more than the z score, and establish probability of a data value being in that position.

FINDING PERCENTILE/PROBABILITY USING NORMAL CURVE

1. Calculate z score for data value
2. Use calculator: normalcdf under DISTR key

Looking for area > z score: normalcdf(z, ∞)
Looking for area < z score: normalcdf(-∞, z)
Looking for area between z scores: normalcdf(z1, z2)
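For anyone without the calculator, the same three areas can be sketched in Python with `statistics.NormalDist`. The function names here are illustrative, not calculator commands.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, st. deviation 1

def area_below(z):
    """Area to the left of z, like normalcdf(-∞, z)."""
    return std_normal.cdf(z)

def area_above(z):
    """Area to the right of z, like normalcdf(z, ∞)."""
    return 1 - std_normal.cdf(z)

def area_between(z1, z2):
    """Area between two z scores, like normalcdf(z1, z2)."""
    return std_normal.cdf(z2) - std_normal.cdf(z1)

print(round(area_below(1.0), 4))          # 0.8413
print(round(area_above(1.0), 4))          # 0.1587
print(round(area_between(-1.0, 1.0), 4))  # 0.6827
```

A z score of 1 is therefore at about the 84th percentile.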

FINDING CUT OFF SCORES

If you are given a percentile or probability, and need to determine the “cut off score”

1. Sketch curve to determine where z score is located.
2. Determine if you want area above or below this percentile
3. Use INVNORM on calculator: invnorm(percentile) = z score
4. Use z score formula to solve for x.
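The steps above can be sketched in Python, where `NormalDist().inv_cdf` plays the role of invnorm and takes the area to the LEFT of the cut-off as a decimal. The example uses the IQ figures quoted earlier (mean 100, st. dev. 15); the 90th percentile is an arbitrary choice.

```python
from statistics import NormalDist

def cutoff_score(area_below, mean, st_dev):
    """Data value x that has the given area (as a decimal) to its left."""
    z = NormalDist().inv_cdf(area_below)  # step 3: area -> z score
    return mean + z * st_dev              # step 4: solve z = (x - mean) / st_dev for x

z_90 = NormalDist().inv_cdf(0.90)
print(round(z_90, 3))                         # 1.282
print(round(cutoff_score(0.90, 100, 15), 1))  # 119.2
```

If the problem gives the area above the cut-off instead, subtract it from 1 first.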

Does the data fit a normal model?

1. Check mean and median

2. Make a NORMAL PROBABILITY PLOT: a roughly linear pattern suggests the data are approximately normal

3. Make a BOXPLOT on calculator.

AVOID using histograms on calculator to check.
