Introduction to Descriptive Statistics
Objectives:
1. Explain the general role of statistics in assessment & evaluation
2. Explain three methods for describing a data set: shape, center, and spread
3. Explain the relationship between the standard deviation and the normal curve
"Data have a story to tell. Statistical analysis is detective work in which we apply our intelligence and our tools to discover parts of that story."
-Hamilton (1990)
Levels of Measurement
– Nominal
– Ordinal
– Interval
– Ratio
Determining what statistics are appropriate
Nominal
– Naming things.
– Creating groups that are qualitatively different or unique…
– But not necessarily quantitatively different.
– Placing individuals or objects into categories.
– Making mutually exclusive categories.
– Numbers assigned to categories are arbitrary.
Sample variables:
– Gender
– Race
– Ethnicity
– Geographic location
– Hair or eye color
Ordinal
– Rank ordering things.
– Creating groups or categories when only rank order is known.
– Numbers imply order but not exact quantity of anything.
– The difference between individuals with adjacent ranks, on relevant quantitative variables, is not necessarily the same across the distribution.
Sample variables:
– Class rank
– Place of finish in a race (1st, 2nd, etc.)
– Judges' ratings
– Responses to Likert scale items (for example: SD, D, N, A, SA)
Interval
– Orders observations according to the quantity of some attribute.
– Arbitrary origin.
– Equal intervals.
– Equal differences expressed as equal distances.
Sample variables:
– Test scores: SAT, GRE, IQ tests
– Temperature: Celsius, Fahrenheit
Ratio
– Quantitative measurement.
– Equal intervals.
– True zero point.
– Ratios between values are useful.
Sample variables:
– Financial variables
– Finish times in a race
– Number of units sold
– Test scores scaled as percent correct or number correct
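To see why a true zero point matters, the sketch below expresses the same two temperatures on the Celsius and Kelvin scales. It is an illustration only: the function name `celsius_to_kelvin` and the two sample readings are assumptions made for the example, not part of the slides.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to kelvins (0 K is a true zero point)."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0  # hypothetical daily highs in Celsius

# Differences survive the change of scale (equal intervals) ...
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# ... but ratios do not, because 0 on the Celsius scale is an arbitrary origin.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"difference: {diff_c:.1f} (Celsius) vs {diff_k:.1f} (Kelvin)")
print(f"ratio: {ratio_c:.2f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

The Celsius ratio suggests the warmer day is "twice as hot", while the Kelvin ratio shows the two readings differ by only a few percent. Only the Kelvin ratio is meaningful.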
Levels of Measurement Review
What level of measurement?
– Today is a fall day.
– Today is the third hottest day of the month.
– The high today was 70° Fahrenheit.
– The high today was 20° Celsius.
– The high today was 294 Kelvin.
What level of measurement?
– Student #1256 is:
– a male
– from Lawrenceville, GA.
– He came in third place in the race today.
– He scored 550 on the SAT verbal section.
– He has turned in 8 out of the 10 homework assignments.
What level of measurement?
– Student #3654 is:
– in the third reading group.
– Nominal?
– Ordinal?
– Interval?
– Ratio?
Descriptive Statistics
Used to describe the basic features of a batch of data. Uses graphical displays and descriptive quantitative indicators.
The purpose of descriptive statistics is to organize and summarize data so that the data is more readily comprehended. That is, descriptive statistics describes distributions with numbers.
Five Descriptive Questions
– Middle: What is the middle of the set of scores?
– Spread: How spread out are the scores?
– Rank or Relative Position: Where do specific scores fall in the distribution of scores?
– Shape: What is the shape of the distribution?
– Correlation: How do different variables relate to each other?
Middle
– Mean
– Median
– Mode
Examples of these measures
Mean of: 2, 3, 6, 7, 3, 5, 10
(2 + 3 + 6 + 7 + 3 + 5 + 10) / 7 = 36 / 7 ≈ 5.14
Mode of: 2, 3, 6, 7, 3, 5, 10 is 3
Median of: 2, 3, 6, 7, 3, 5, 10
First the data are ordered: 2, 3, 3, 5, 6, 7, 10. The middle value is 5, so that is the median.
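The three worked examples above can be checked with a minimal sketch using Python's standard `statistics` module. The variable name `scores` is chosen here for illustration.

```python
from statistics import mean, median, mode

scores = [2, 3, 6, 7, 3, 5, 10]

print(round(mean(scores), 2))  # 36 / 7, rounded to two decimals
print(median(scores))          # middle value once the scores are sorted
print(mode(scores))            # most frequent value
```

Note that `median` sorts the values itself, so the data do not need to be ordered first.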
Some Important Points
– The mode is the only measure of center that can be used for nominal data.
– The median is resistant: it is largely unaffected by extreme observations.
– The mean (average) is affected by extremely small or large values. We say that it is sensitive, or nonresistant, to the influence of extreme observations. The mean is the balance point of the distribution.
– In symmetric distributions the mean and median are close together.
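As a small illustration of resistance, the sketch below takes the scores from the earlier example and replaces the largest one with a far more extreme value. The replacement value of 100 is an assumption made for the example.

```python
from statistics import mean, median

scores = [2, 3, 3, 5, 6, 7, 10]
with_extreme = [2, 3, 3, 5, 6, 7, 100]  # hypothetical: largest score replaced by 100

# The mean moves a long way; the median does not move at all.
print("mean:  ", round(mean(scores), 2), "->", mean(with_extreme))
print("median:", median(scores), "->", median(with_extreme))
```

One extreme observation more than triples the mean here, while the median stays at 5.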
More Important Points
– In skewed data the mean is pulled toward the tail of the distribution.
– The median is not necessarily preferred over the mean just because it is resistant. However, if the data are known to be strongly skewed, the median is preferable.
– Finally, the mean is usually the measure of central tendency of choice because it is stable from sample to sample.
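The pull of the mean toward the tail can be seen by comparing it with the median. The sketch below uses two made-up batches of scores. The helper `mean_minus_median` is a hypothetical name, and the comparison is only a rough indicator of skew direction, not a formal skewness statistic.

```python
from statistics import mean, median

def mean_minus_median(values):
    """Positive when the mean sits above the median, negative when below."""
    return mean(values) - median(values)

# Hypothetical batches: one with a long right tail, one with a long left tail.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
left_tailed = [1, 17, 18, 18, 18, 19, 19, 20]

print(mean_minus_median(right_tailed))  # mean above median: positive skew
print(mean_minus_median(left_tailed))   # mean below median: negative skew
```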
Spread
– Standard Deviation
– Variance
– Range
– IQR
[Figure: three distributions with the same mean (X = 50) and a large (S = 20), average (S = 10), and small (S = 5) standard deviation. X = Mean, S = Standard Deviation.]
How do measures of variability differ when distributions are spread out?
Describing Data: Center & Spread
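One way to explore that question is to compute the measures on batches that share a mean but differ in spread. The three tiny batches below are invented for illustration, chosen so that their sample standard deviations match the 5, 10, and 20 shown in the figure. The IQR is left out because three values are too few for it to be informative.

```python
from statistics import mean, stdev, variance

# Hypothetical batches, all with mean 50.
batches = {
    "small S": [45, 50, 55],
    "average S": [40, 50, 60],
    "large S": [30, 50, 70],
}

for label, values in batches.items():
    value_range = max(values) - min(values)
    print(
        f"{label}: mean={mean(values)}, S={stdev(values):.1f}, "
        f"variance={variance(values)}, range={value_range}"
    )
```

The mean stays at 50 throughout, while the standard deviation, variance, and range all grow as the scores spread out. `stdev` and `variance` here are the sample versions, which divide by n - 1.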
Rank or Relative Position
– Five number summary: Min, 25th, 50th, and 75th percentiles, Max
– Identifying specific values that have interpretive meaning
– Identifying where they fall in the set of scores
– Box plots
– Outliers
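A five number summary can be sketched with the standard library as follows. The function name `five_number_summary` is an assumption for this example, and the quartiles depend on the convention used: `statistics.quantiles` defaults to its "exclusive" method, and other software may give slightly different values.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a batch of scores."""
    q1, q2, q3 = quantiles(values, n=4)  # default 'exclusive' method
    return min(values), q1, q2, q3, max(values)

scores = [2, 3, 6, 7, 3, 5, 10]
low, q1, med, q3, high = five_number_summary(scores)

print("five number summary:", low, q1, med, q3, high)
print("IQR:", q3 - q1)
```

These five values are what a box plot draws: the box runs from the 25th to the 75th percentile, with a line at the median.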
Shape
– Positive Skewness
– Negative Skewness
– Normality
– Histograms
Shape - Normality
[Figure: histogram and box plot of Scanning. Mean = 38.0, Std. Dev = 4.84, N = 344.]
Shape - Positive Skewness
[Figure: histogram and box plot of Total for IIP. Mean = 2.10, Std. Dev = .56, N = 344.]
Shape - Negative Skewness
[Figure: histogram and box plot of PREACT. Mean = 3.32, Std. Dev = .42, N = 154.]
When a distribution of data resembles a normal distribution (or normal curve):
– About 68% of the data lie within + or – 1 standard deviation of the mean
– About 95% of the data lie within + or – 2 standard deviations of the mean
– About 99.7% of the data lie within + or – 3 standard deviations of the mean
Relating the Standard Deviation (S) to the normal distribution: the "68-95-99.7% Rule"
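The percentages in the rule can be recovered from the normal curve itself. The sketch below uses `statistics.NormalDist` from the standard library. The helper name `share_within` is hypothetical, and applying the rule to the Scanning scores assumes those scores are close to normal, which the slides suggest but do not test.

```python
from statistics import NormalDist

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")

# Assumption: treat the Scanning scores (Mean = 38.0, Std. Dev = 4.84) as normal.
scan_mean, scan_sd = 38.0, 4.84
for k in (1, 2, 3):
    low, high = scan_mean - k * scan_sd, scan_mean + k * scan_sd
    print(f"about {share_within(k):.1%} of scores between {low:.2f} and {high:.2f}")
```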
Outliers
[Figure: box plot of BDI Total, N = 344.]
[Figure: histogram of BDI Total (vertical axis: Frequency). Mean = 7.1, Std. Dev = 7.10, N = 344.]
Statistics: BDI Total
N (Valid)          344
N (Missing)        0
Mean               7.12
Median             5.00
Mode               0
Std. Deviation     7.101
Variance           50.426
Minimum            0
Maximum            40
Percentiles  25    2.00
             50    5.00
             75    10.00
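The BDI Total summary values are enough to check for outliers without the raw scores. The sketch below applies the common 1.5 × IQR fence convention for box plots. That convention is an assumption here, since the slides do not state which rule was used to flag outliers.

```python
# Summary values copied from the BDI Total statistics table.
q1, median, q3 = 2.0, 5.0, 10.0
mean, minimum, maximum = 7.12, 0.0, 40.0

iqr = q3 - q1
# Assumption: the 1.5 * IQR fence convention, which the slides do not state.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print("fences:", lower_fence, upper_fence)
print("scores above the upper fence exist:", maximum > upper_fence)
print("scores below the lower fence exist:", minimum < lower_fence)
print("mean above median:", mean > median)
```

Under that convention the maximum of 40 lies well beyond the upper fence, so at least one high outlier exists, while no score can fall below the lower fence. The mean sitting above the median is consistent with a distribution pulled toward its upper tail.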