Descriptive Statistics and the Normal Distribution

Post on 10-Feb-2016


HPHE 3150
Dr. Ayers

Introduction Review

• Terminology
• Reliability
• Validity
• Objectivity
• Formative vs Summative evaluation
• Norm- vs Criterion-referenced standards

Scales of Measurement

• Nominal: name or classify
  • Major, gender, year in college

• Ordinal: order or rank
  • Sports rankings

• Continuous
  • Interval: equal units, arbitrary zero
    • Temperature, SAT/ACT score
  • Ratio: equal units, absolute zero (total absence of the characteristic)
    • Height, weight

Summation Notation

• ∑ is read as "the sum of"

• X is an observed score

• N = the number of observations

• Complete ( ) operations first

• Exponents then * and / then + and -

Operations Orders

[Practice expressions not preserved; answers shown on the slide: 65; 26; -5; 2 -34]

Summation Notation Practice: Mastery Item 3.2

Scores: 3, 1, 2, 2, 4, 5, 1, 4, 3, 5

Determine:

∑X = 30

(∑X)² = 900

∑X² = 110
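The three quantities in Mastery Item 3.2 differ only in when the squaring happens. A minimal Python sketch of that difference (the variable names are illustrative, not from the slides):

```python
# Scores from Mastery Item 3.2
scores = [3, 1, 2, 2, 4, 5, 1, 4, 3, 5]

sum_x = sum(scores)                            # ∑X: add the scores
sum_x_squared = sum_x ** 2                     # (∑X)²: add first, then square the total
sum_of_squares = sum(x ** 2 for x in scores)   # ∑X²: square each score, then add

print(sum_x, sum_x_squared, sum_of_squares)    # 30 900 110
```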

Percentile

• The percent of observations that fall at or below a given point

• Ranges from 0% to 100%

• Allows normative performance comparisons

If I am at the 90th percentile, how many folks did better than me?

Test Score Frequency Distribution Figure 3.1 (p.42 explanation)

Score   Frequency   Percent   Valid Percent   Cumulative Percent
41      1           1.5       1.5             1.5
43      3           4.6       4.6             6.2
44      3           4.6       4.6             10.8
45      5           7.7       7.7             18.5
46      5           7.7       7.7             26.2
47      7           10.8      10.8            36.9
48      11          16.9      16.9            53.8
49      8           12.3      12.3            66.2
50      7           10.8      10.8            76.9
51      6           9.2       9.2             86.2
52      3           4.6       4.6             90.8
53      3           4.6       4.6             95.4
54      2           3.1       3.1             98.5
55      1           1.5       1.5             100.0
Total   65          100.0     100.0
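The Cumulative Percent column is the percentile definition applied to each score: the percent of observations at or below that point. A short Python sketch that rebuilds the column from the frequencies (the function name `percentile_rank` is an illustrative choice, not from the slides):

```python
# Score -> frequency, copied from the frequency distribution above
freq = {41: 1, 43: 3, 44: 3, 45: 5, 46: 5, 47: 7, 48: 11,
        49: 8, 50: 7, 51: 6, 52: 3, 53: 3, 54: 2, 55: 1}

n = sum(freq.values())  # total number of observations

def percentile_rank(score):
    """Percent of observations at or below `score`."""
    at_or_below = sum(f for s, f in freq.items() if s <= score)
    return 100 * at_or_below / n

# Print score, frequency, and cumulative percent for each row
for s in sorted(freq):
    print(s, freq[s], round(percentile_rank(s), 1))
```

For example, a score of 52 sits at roughly the 91st percentile, so only about 9% of the 65 scores are higher.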

Central Tendency

• Mean: sum of scores / number of scores

• Median (P50): exact middle of ordered scores

• Mode: most frequent score

Where do the scores tend to center?

Raw scores: 2, 7, 5, 5, 1

Rank order: 1, 2, 5, 5, 7

• Mean: 4 (20/5)
• Median (P50): 5
• Mode: 5
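The same three measures can be checked with Python's standard `statistics` module. This is only a sketch of the slide's example, not part of the course materials:

```python
import statistics

raw_scores = [2, 7, 5, 5, 1]

mean = statistics.mean(raw_scores)      # sum of scores / number of scores = 20 / 5
median = statistics.median(raw_scores)  # middle of the rank-ordered scores 1, 2, 5, 5, 7
mode = statistics.mode(raw_scores)      # most frequent score

print(mean, median, mode)
```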

Distribution Shapes Figure 3.2

So what? OUTLIERS

Direction of tail = +/-

Distribution of Initial CRF

[Figure: histogram with normal density superimposed. x-axis: CRF at Initial Examination (METs), 5 to 19; y-axis: Density. Mean = 11.7, SD = 2.0. Based on 15,242 maximal GXT.]

Kampert, MSSE, Suppl. 2004, p. S135

Histogram of Skinfold Data

[Figure: histogram; x-axis from 10 to 80, y-axis from 0 to 60.]

Three Symmetrical Curves Figure 3.3

The difference here is the variability:

• Fully normal

• More heterogeneous

• More homogeneous

Descriptive Statistics I

• What is the most important thing you learned today?

• What do you feel most confident explaining to a classmate?

Descriptive Statistics I REVIEW

• Measurement scales
  • Nominal, Ordinal, Continuous (interval, ratio)

• Summation Notation: 3, 4, 5, 5, 8. Determine: ∑X, (∑X)², ∑X²
  • ∑X = 25
  • (∑X)² = 625
  • ∑X² = 9 + 16 + 25 + 25 + 64 = 139

• Percentiles: so what?

• Measures of central tendency
  • 3, 4, 5, 5, 8
  • Mean (?), median (?), mode (?)

• Distribution shapes

Variability

• Range: high score minus low score (least reliable measure; uses 2 scores only)

• Variance (s²), inferential stats: spread of scores based on the squared deviation of each score from the mean. Most stable measure of variability

• Standard Deviation (S), descriptive stats: square root of the variance. Most commonly used measure of variability

[Figure: total variance divided into true variance and error]

$S = \sqrt{S^2}$

Variance (Table 3.2)

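The didactic formula

$S^2 = \frac{\sum (X - M)^2}{n - 1}$

$\sum (X - M)^2 = 4 + 1 + 0 + 1 + 4 = 10$, so $S^2 = \frac{10}{5 - 1} = \frac{10}{4} = 2.5$

The calculating formula

$S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$

$S^2 = \frac{55 - \frac{225}{5}}{4} = \frac{55 - 45}{4} = \frac{10}{4} = 2.5$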
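Both variance formulas give the same result and differ only in the bookkeeping. The sketch below assumes the scores 1, 2, 3, 4, 5. That data set is an assumption: it reproduces the sums in the worked numbers (squared deviations 4, 1, 0, 1, 4; ∑X² = 55; (∑X)² = 225), but the transcript does not list the raw scores from Table 3.2.

```python
# Hypothetical scores chosen to match the sums in the worked example
scores = [1, 2, 3, 4, 5]
n = len(scores)
mean = sum(scores) / n

# Didactic formula: sum of squared deviations from the mean, divided by n - 1
sum_sq_dev = sum((x - mean) ** 2 for x in scores)
var_didactic = sum_sq_dev / (n - 1)

# Calculating formula: uses only ∑X and ∑X², no deviations needed
sum_x = sum(scores)
sum_x2 = sum(x ** 2 for x in scores)
var_calculating = (sum_x2 - sum_x ** 2 / n) / (n - 1)

print(var_didactic, var_calculating)  # 2.5 2.5
```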

Standard Deviation

The square root of the variance

Nearly 100% of the scores in a normal distribution are captured by the mean ± 3 standard deviations

M ± S: 100 ± 10

$S = \sqrt{S^2}$

The Normal Distribution

M ± 1s = 68.26% of observations
M ± 2s = 95.44% of observations
M ± 3s = 99.74% of observations
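These percentages can be checked against the normal curve directly. The sketch below uses the standard identity that the share of a normal distribution within k standard deviations of the mean is erf(k/√2). The computed values (about 68.27, 95.45, 99.73) differ from the slide in the last digit, most likely because the slide doubles rounded table entries such as 34.13.

```python
import math

def percent_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"M ± {k}s: {percent_within(k):.2f}%")
```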

Calculating Standard Deviation

Raw scores (X)   (X-M)   (X-M)²
3                -1      1
7                 3      9
4                 0      0
5                 1      1
1                -3      9
∑ 20              0      20

Mean: 4

$S = \sqrt{\frac{\sum (X - M)^2}{N}}$

$S = \sqrt{\frac{20}{5}}$

$S = \sqrt{4}$

S = 2
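The same steps can be written in Python as a check. Note that this example divides by N, which matches `statistics.pstdev`; `statistics.stdev` divides by n - 1 instead and would give a larger value.

```python
import math
import statistics

raw_scores = [3, 7, 4, 5, 1]
n = len(raw_scores)
mean = sum(raw_scores) / n                   # 20 / 5 = 4

deviations = [x - mean for x in raw_scores]  # (X - M) column; sums to 0
squared = [d ** 2 for d in deviations]       # (X - M)² column; sums to 20

s = math.sqrt(sum(squared) / n)              # square root of 20 / 5
print(s)                                     # 2.0

# Library equivalent with the same N denominator
print(statistics.pstdev(raw_scores))
```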

Coefficient of Variation (V): Relative variability

V = S / M: relative variability around the mean, OR a way to determine the homogeneity of two data sets with different units

Relative variability accounted for by the mean when units of measure are different (ht, hr, running speed, etc.)

Helps more fully describe different data sets that have a common std deviation (S) but unique means (M)

Lower V = mean accounts for most variability in scores. .1 to .2 = homogeneous; > .5 = heterogeneous
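A small sketch of V = S / M for two data sets that share a standard deviation but have different means. The numbers are hypothetical and chosen only to illustrate the comparison:

```python
def coefficient_of_variation(s, m):
    """Relative variability: standard deviation divided by the mean."""
    return s / m

# Hypothetical data sets: same S, different M
v_a = coefficient_of_variation(10, 50)    # 0.2
v_b = coefficient_of_variation(10, 100)   # 0.1

print(v_a, v_b)
```

Both values fall in the .1 to .2 range the slide labels homogeneous, but the second set is relatively less variable even though S is identical.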

Descriptive Statistics II

• What is the “muddiest” thing you learned today?

Descriptive Statistics II REVIEW

Variability
• Range
• Variance: spread of scores based on the squared deviation of each score from the mean. Most stable measure
• Standard deviation: most commonly used measure

Coefficient of variation
• Relative variability around the mean (homogeneity of scores)
• Helps more fully describe relative variability of different data sets

50 ± 10: What does this tell you?

Standard Scores: Z or T

$Z = \frac{X - M}{S}$

• Set of observations standardized around a given M and standard deviation

• Score transformed based on its magnitude relative to other scores in the group

• Converting scores to Z scores expresses a score's distance from its own mean in sd units

• Use of standard scores: determine composite scores from different measures (bball: shoot, dribble); weight?

Standard Scores

• Z-score: M = 0, s = 1

  $Z = \frac{X - M}{S}$

• T-score: M = 50, s = 10

  $T = 50 + 10Z = 50 + 10\left(\frac{X - M}{S}\right)$

• Percentile: p = 50 + Z (percentile)

Conversion to Standard Scores

Raw scores (X)   X-M   Z
3                -1    -.5
7                 3    1.5
4                 0    0
5                 1    .5
1                -3    -1.5

• Mean: 4
• St. Dev: 2

$Z = \frac{X - M}{S}$

Allows the comparison of scores using different scales to compare "apples to apples"

SO WHAT? You have a Z score but what do you do with it? What does it tell you?
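The conversion above takes one line per score in Python. This sketch also converts each Z to a T score with T = 50 + 10Z, using the mean and standard deviation given for this example:

```python
raw_scores = [3, 7, 4, 5, 1]
m, s = 4, 2   # mean and standard deviation from the example

z_scores = [(x - m) / s for x in raw_scores]   # distance from the mean in sd units
t_scores = [50 + 10 * z for z in z_scores]     # same information rescaled to M=50, s=10

print(z_scores)   # [-0.5, 1.5, 0.0, 0.5, -1.5]
print(t_scores)   # [45.0, 65.0, 50.0, 55.0, 35.0]
```

A Z of 1.5 says the score of 7 lies one and a half standard deviations above its own group's mean, whatever the original units were.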

Normal distribution of scores Figure 3.6


Descriptive Statistics II REVIEW

Standard Scores
• Converting scores to Z scores expresses a score's distance from its own mean in sd units
• Value?

Coefficient of variation
• Relative variability around the mean (homogeneity of scores)
• Helps more fully describe relative variability of different data sets

100 ± 20: What does this tell you? Between what values do 95% of the scores in this data set fall?

Normal-curve Areas Table 3.4

• Z scores are on the left and across the top
  • Z = 1.64: 1.6 on the left, .04 on the top = 44.95
  • Since 1.64 is positive, add 44.95 to 50 (the mean) to get the 95th percentile

• Values in the body of the table are the percentage of scores between the mean and a given standard deviation distance
  • Half of the scores fall below the mean, so add the table value to 50 if Z is positive and subtract it from 50 if Z is negative

• The "reference point" is the mean
  • +Z = better than the mean
  • -Z = worse than the mean

p. 51

Area of normal curve between 1 and 1.5 std dev above the mean

Figure 3.7
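The table lookup and the area in Figure 3.7 can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch for checking table values, not a replacement for Table 3.4, and the helper names are illustrative:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_mean_to_z(z):
    """Percent of the curve between the mean and z (the value in the table body)."""
    return 100 * abs(std_normal.cdf(z) - 0.5)

def percentile(z):
    """Percent of the curve at or below z: 50 plus or minus the table value."""
    return 100 * std_normal.cdf(z)

print(round(area_mean_to_z(1.64), 2))   # 44.95, the table entry for Z = 1.64
print(round(percentile(1.64), 2))       # 50 + 44.95

# Area between 1 and 1.5 standard deviations above the mean
print(round(area_mean_to_z(1.5) - area_mean_to_z(1.0), 2))   # about 9.18
```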

Normal curve practice

• Z score: Z = (X-M)/S
• T score: T = 50 + 10 * (Z)
• Percentile: P = 50 + Z percentile (+: add to 50, -: subtract from 50)

• Raw scores

• Hints
  • Draw a picture
  • What is the z score?
  • Can the z table help?

• Assume M=700, S=100

Percentile T score z score Raw score

64 53.7 .37 737

43

–1.23

618

17

68

68

835

.57
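One way to check answers for the practice table is to convert whichever value is given to a z score first, then derive the other three columns from it. The sketch below assumes M = 700 and S = 100 as stated; the function names are illustrative, and percentiles come from the normal CDF rather than a printed table, so the last digit may differ slightly from table-based answers.

```python
from statistics import NormalDist

M, S = 700, 100
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def from_z(z):
    """Fill in one row of the table from a z score."""
    return {
        "percentile": round(100 * std_normal.cdf(z)),
        "T": round(50 + 10 * z, 1),
        "z": round(z, 2),
        "raw": round(M + z * S),
    }

def from_raw(x):
    return from_z((x - M) / S)

def from_t(t):
    return from_z((t - 50) / 10)

def from_percentile(p):
    # inv_cdf needs a proportion strictly between 0 and 1
    return from_z(std_normal.inv_cdf(p / 100))

# The completed first row: percentile 64, T 53.7, z .37, raw 737
print(from_z(0.37))

# Rows where the given column is unambiguous
print(from_z(-1.23))
print(from_raw(618))
```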

Descriptive Statistics III

• Explain one thing that you learned today to a classmate

• What is the “muddiest” thing you learned today?