1
Biostatistics
Measures of Central Tendency
2
Central Tendency
3
Measures of Central Tendency
Central Tendency – Definition: The most common value (for nominal variables) or the value around which cases tend to cluster (for ordinal and interval-ratio variables)
Central Tendency – Simplified Definition: A number that represents what is “typical”, “average”, or “in the middle”
Measures of Central Tendency: Mode, Median, Mean
Which one to use depends on the situation
4
Mode
Definition: The most frequently occurring value of a variable
Levels of Measurement: Nominal, Ordinal, Interval-ratio
Comment: The mode is a value, not a frequency!
5
Mode in a Frequency Distribution
The mode is the value with the largest frequency or percentage
Colour of eyes  Frequency (f)  Percentage (%)
Brown  247  50.2%
Black  145  29.5%
Grey  37  7.5%
Blue  63  12.8%
Total  492  100.0%
Mode = “brown”
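A minimal Python sketch of reading the mode off a frequency table, using the eye-colour counts from this slide. The variable names are illustrative only. It also shows the difference between the mode (a value) and its frequency.

```python
# Frequency table from the slide: eye colour -> frequency (f)
eye_colour_freq = {"Brown": 247, "Black": 145, "Grey": 37, "Blue": 63}

# The mode is the category with the largest frequency,
# not the frequency itself.
mode_value = max(eye_colour_freq, key=eye_colour_freq.get)
mode_frequency = eye_colour_freq[mode_value]

print(mode_value)      # Brown
print(mode_frequency)  # 247 (a frequency, which is not the mode)
```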
6
Mode in a Bar Graph
The mode is the value with the tallest bar
Mode = “brown”
[Bar graph of eye colour: bars for brown, black, grey, and blue; frequency axis marked 100, 200, 300]
7
Mode in a Pie Chart
The mode is the value with the largest slice
Mode = “brown”
[Pie chart of eye colour: slices for brown, black, grey, and blue]
8
Mode in a Histogram
The mode is the value with the tallest bar
Mode = 0
[Histogram; surviving labels: Colour, Brown, Black, Grey, Blue]
9
Mode: Potential Problems
Problem 1: The mode might not fall near the center of the distribution for an interval-ratio variable
[Figure annotations: “The mode is here”; “We’d like the mode to be here”]
10
Mode: Potential Problems
Problem 2: There might be more than one mode
Bimodal: Two modes
Colour of eyes  Frequency (f)  Percentage (%)
Brown  155  31.5%
Grey  121  24.6%
Blue  61  12.4%
Black  155  31.5%
Total  492  100.0%
Two modes: “brown” and “black”
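A short Python sketch of how ties for the largest frequency can be detected, using the counts from the table on this slide. The helper name `find_modes` is hypothetical.

```python
def find_modes(freq):
    """Return every value whose frequency equals the largest frequency."""
    highest = max(freq.values())
    return [value for value, f in freq.items() if f == highest]


# Frequencies from the slide: two colours share the top count of 155.
bimodal_freq = {"Brown": 155, "Grey": 121, "Blue": 61, "Black": 155}
modes = find_modes(bimodal_freq)

print(modes)       # ['Brown', 'Black']
print(len(modes))  # 2, so the distribution is bimodal
```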
11
Median
Definition: The middle number in the distribution of a variable when its values are placed in order
Levels of Measurement: Ordinal, Interval-ratio
Comment: The median divides the distribution of a variable in half
Half of the cases will be above the middle number
Half of the cases will be below the middle number
12
Determining the Median: Interval-Ratio Variable
Odd Number of Cases: The median is the middle number
Even Number of Cases: The median is the average of the two middle numbers
13
Median: Example Data
Table 1. Age of people
Person number  Age in years
1  43
2  57
3  45
4  68
5  51
6  26
7  13
8  1
9  24
Table 2. Duration of study per student
Student number  Hours / week
1  0
2  10
3  12
4  12
5  5
6  12
7  1
8  12
9  5
10  11
14
Median: Interval-Ratio Variable, Odd Number of Cases
Example: age of people (Table 1)
Steps (illustrated on the next slide):
Step 1: Arrange the values in order from smallest to largest
Step 2: Assign “case numbers” from 1 to N
Step 3: Find the middle case by adding 1 to N and dividing by 2
Here, (N+1)/2 = (9+1)/2 = 10/2 = 5
The fifth case (not the number 5!) has the median
Step 4: Find the value corresponding to the middle case
Here the 5th case has a value of 43
Note: The median of 43 divides the distribution in half
Four cases are above the median
Four cases are below the median
15
Median: Interval-Ratio Variable, Odd Number of Cases
Step 1: Arrange the values in order from smallest to largest
Step 2: Number the values from 1 to 9
Step 3: Find the middle case: (9+1)/2 = 10/2 = 5
Step 4: Find the value for the 5th case – this is the median = 43
Note: 4 cases are above, and 4 are below, the median
Variable Value 1 13 24 26 43 45 51 57 68
Case Number 1 2 3 4 5 6 7 8 9
16
Median: Interval-Ratio Variable, Even Number of Cases
Example: hours of study per week (Table 2)
Steps (illustrated on the next slide):
Step 1: Arrange the values in order from smallest to largest
Step 2: Assign “case numbers” from 1 to N
Step 3: Find the middle case by adding 1 to N and dividing by 2
Here, (N+1)/2 = (10+1)/2 = 11/2 = 5.5
The median is the average of the 5th and 6th cases
Step 4: Find the average of the two middle cases
Here the 5th case has a value of 10 and the 6th case has a value of 11
The median is (10+11)/2 = 10.5
Note: The median of 10.5 divides the distribution in half
Five cases are above the median
Five cases are below the median
17
Median: Interval-Ratio Variable, Even Number of Cases
Step 1: Arrange the values in order from smallest to largest
Step 2: Number the values from 1 to 10
Step 3: Find the middle case: (10+1)/2 = 11/2 = 5.5 (average of the 5th and 6th cases)
Step 4: Find average of the 5th and 6th cases – the median is (10+11)/2 = 10.5
Note: 5 cases are above, and 5 are below, the median
Variable Value 0 1 5 5 10 11 12 12 12 12
Case Number 1 2 3 4 5 6 7 8 9 10
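The four steps for both the odd and the even case can be sketched in Python as follows. The function name `median_by_case_number` is hypothetical; the data are the values from Table 1 and Table 2, and the standard library's `statistics.median` is used only as a cross-check.

```python
import statistics


def median_by_case_number(values):
    """Follow the slide's steps: sort, number the cases, find case (N+1)/2."""
    ordered = sorted(values)           # Step 1: smallest to largest
    n = len(ordered)                   # Step 2: cases are numbered 1..n
    middle = (n + 1) / 2               # Step 3: middle case number
    if middle.is_integer():            # odd n: one middle case
        return ordered[int(middle) - 1]    # Step 4 (case k is index k-1)
    lower = ordered[int(middle) - 1]   # even n: e.g. 5.5 -> 5th case
    upper = ordered[int(middle)]       # ... and 6th case
    return (lower + upper) / 2         # Step 4: average the two middle cases


ages = [43, 57, 45, 68, 51, 26, 13, 1, 24]       # Table 1, N = 9
hours = [0, 10, 12, 12, 5, 12, 1, 12, 5, 11]     # Table 2, N = 10

print(median_by_case_number(ages))    # 43
print(median_by_case_number(hours))   # 10.5
print(statistics.median(hours))       # 10.5 (cross-check)
```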
18
Median in a Frequency Distribution
Median: The value of the variable at which the cumulative percentage first reaches 50%
Here the cumulative percentage first reaches 50% at a value of 3 (it rises from 40.8% to 53.5%)
So the median number of hours of study is 3
Hours of study Frequency Percentage Cumulative Percentage
0 137 23.2% 23.2%
1 56 9.5% 32.7%
2 48 8.1% 40.8%
3 75 12.7% 53.5%
4 42 7.1% 60.6%
5 15 2.5% 63.1%
6 68 11.5% 74.6%
7 150 25.4% 100.0%
Total 591 100.0%
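The cumulative-percentage rule can be sketched in Python using the frequencies from this table. The function name is hypothetical, and the sketch assumes the simple rule stated on the slide (the first value whose cumulative percentage reaches 50%); it does not handle the special case where the cumulative percentage lands on exactly 50%, for which some textbooks average two adjacent values.

```python
# (hours of study, frequency) pairs from the slide, sorted by hours
hours_freq = [(0, 137), (1, 56), (2, 48), (3, 75),
              (4, 42), (5, 15), (6, 68), (7, 150)]


def median_from_frequencies(table):
    """Return the first value whose cumulative percentage reaches 50%."""
    total = sum(f for _, f in table)
    cumulative = 0
    for value, f in table:             # table must be sorted by value
        cumulative += f
        if 100 * cumulative / total >= 50:
            return value
    raise ValueError("empty frequency table")


print(median_from_frequencies(hours_freq))  # 3
```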
19
Mean
Definition: The average obtained by summing the values of a variable and dividing by the number of cases
Level of Measurement: Interval-ratio
Comments:
It incorporates all values of a variable, unlike the mode and median
It can be misleading when there are outlying (extreme) values
20
Mean: Formula for a Data Table
$\bar{X} = \frac{\sum x}{n}$
$\bar{X}$ represents the mean
$\sum$ tells us to sum or add up
$x$ represents each value of the variable
$n$ represents the number of cases
21
Mean: Calculating for a Data Table
Example: hours of study
$\bar{X} = \frac{\sum x}{n} = \frac{0+1+5+5+10+11+12+12+12+12}{10} = \frac{80}{10} = 8.0$
People in this sample study for an average of 8.0 hours
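As an arithmetic check, the same calculation can be sketched in Python with the ten hours-of-study values. The ten values sum to 80, so the mean for this sample is 8.0. Variable names are illustrative.

```python
# Hours of study per week for the ten students (sorted values from Table 2)
hours = [0, 1, 5, 5, 10, 11, 12, 12, 12, 12]

total = sum(hours)   # sum of x
n = len(hours)       # number of cases
mean = total / n     # mean = (sum of x) / n

print(total)  # 80
print(n)      # 10
print(mean)   # 8.0
```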
22
Mean: Formula for a Frequency Distribution
$\bar{X} = \frac{\sum f \cdot x}{n}$
$\bar{X}$ represents the mean
$\sum$ tells us to sum or add up
$f$ represents the frequency for each value of the variable
$x$ represents each value of the variable
$f \cdot x$ tells us to multiply the frequency ($f$) by the value ($x$)
$n$ represents the number of cases
23
Mean: Calculating for a Frequency Distribution
Example: hours of study
Hours of study (x)  Frequency (f)  f·x
0 137 137·0 = 0
1 56 56·1 = 56
2 48 48·2 = 96
3 75 75·3 = 225
4 42 42·4 = 168
5 15 15·5 = 75
6 68 68·6 = 408
7 150 150·7 = 1,050
N = 591
24
Mean: Calculating for a Frequency Distribution
Example: hours of study (continued)
$\bar{X} = \frac{\sum f \cdot x}{n} = \frac{0+56+96+225+168+75+408+1050}{591} = \frac{2078}{591} \approx 3.52$
People in this sample study for an average of 3.52 hours
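The frequency-distribution formula can be sketched in Python using the hours-of-study table. Each value is multiplied by its frequency, the products are summed, and the sum is divided by the number of cases. Variable names are illustrative.

```python
# Hours of study (x) -> frequency (f), from the slide's table
hours_freq = {0: 137, 1: 56, 2: 48, 3: 75, 4: 42, 5: 15, 6: 68, 7: 150}

n = sum(hours_freq.values())                       # number of cases
sum_fx = sum(f * x for x, f in hours_freq.items())  # sum of f * x
mean = sum_fx / n

print(n)               # 591
print(sum_fx)          # 2078
print(round(mean, 2))  # 3.52
```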
25
Outlying Value
Definition: A value that is very small or large relative to other values of the variable
Effect of Outlying Value
Mode: Usually has no effect
Median: Usually has no effect
Mean: May have an effect
26
Outlying Value: Potential Effect on Mean
Example: hours of study
Suppose one student studies for 112 (instead of 12) hours
$\bar{X} = \frac{\sum x}{n} = \frac{0+1+5+5+10+11+12+12+12+112}{10} = \frac{180}{10} = 18.0$
The mean is now 18.0
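A small Python sketch comparing all three measures before and after the change from 12 to 112 hours. It uses the hours-of-study values and the standard library's `statistics.median` and `statistics.multimode` (Python 3.8+). Only the mean moves; the median and mode are unchanged for this sample.

```python
import statistics

original = [0, 1, 5, 5, 10, 11, 12, 12, 12, 12]
with_outlier = [0, 1, 5, 5, 10, 11, 12, 12, 12, 112]  # one 12 becomes 112


def summarize(values):
    """Return (mean, median, modes) for a list of numbers."""
    mean = sum(values) / len(values)
    median = statistics.median(values)
    modes = statistics.multimode(values)  # list of most frequent values
    return mean, median, modes


print(summarize(original))      # (8.0, 10.5, [12])
print(summarize(with_outlier))  # (18.0, 10.5, [12])
```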
27
Determining Skewness: Using the Mean and Median
Procedure: Compare the mean and median, or subtract the median from the mean (Mean – Median)
Symmetric Distribution
Comparison: Mean equals median
Mean – Median: Difference is zero
28
[Figure labels: Mean, Median, Mode]
29
Determining Skewness: Using the Mean and Median
Positively Skewed Distribution
Comparison: Mean greater than median
Mean – Median: Difference is positive (greater than zero)
Negatively Skewed Distribution
Comparison: Mean smaller than median
Mean – Median: Difference is negative (less than zero)
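The Mean – Median comparison can be sketched in Python as below. The function name `skew_direction` and its `tolerance` parameter are hypothetical; the tolerance is an added assumption to avoid treating tiny floating-point differences as skew. This follows the slide's rule of thumb only and is not a formal test of skewness.

```python
import statistics


def skew_direction(values, tolerance=1e-9):
    """Classify skew from the sign of (mean - median), per the slide's rule."""
    mean = sum(values) / len(values)
    median = statistics.median(values)
    difference = mean - median
    if abs(difference) <= tolerance:
        return "symmetric"
    return "positive" if difference > 0 else "negative"


print(skew_direction([1, 2, 3, 4, 5]))                          # symmetric
print(skew_direction([0, 1, 5, 5, 10, 11, 12, 12, 12, 112]))    # positive
print(skew_direction([0, 1, 5, 5, 10, 11, 12, 12, 12, 12]))     # negative
```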
30
Choosing a Measure of Central Tendency
Nominal Variable: Mode only
Ordinal Variable: Median is best; mode is also possible
Interval-Ratio Variable
Symmetric Distribution: Any will work; the mean is typically used
Positively or Negatively Skewed Distribution: Median is best
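These guidelines can be written as a small lookup in Python. The function name, the level labels, and the `skewed` flag are all hypothetical; the sketch returns only the preferred measure from this slide, not every measure that is permissible.

```python
def recommended_measure(level, skewed=False):
    """Return the preferred measure of central tendency for a variable.

    level  -- "nominal", "ordinal", or "interval-ratio"
    skewed -- only used for interval-ratio variables
    """
    if level == "nominal":
        return "mode"       # mode is the only option
    if level == "ordinal":
        return "median"     # mode is also possible
    if level == "interval-ratio":
        # Skewed distributions: median; symmetric: mean is typically used
        return "median" if skewed else "mean"
    raise ValueError(f"unknown level of measurement: {level!r}")


print(recommended_measure("nominal"))                      # mode
print(recommended_measure("interval-ratio"))               # mean
print(recommended_measure("interval-ratio", skewed=True))  # median
```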