Descriptive Statistics Prepared By Masood Amjad Khan GCU, Lahore.
Descriptive Statistics
Prepared By
Masood Amjad Khan
GCU, Lahore
Index
Statistics (Definitions); Descriptive Statistics; Inferential Statistics; Examples of Descriptive and Inferential Statistics
Data and Levels of Measurement; Variable; Discrete Variable; Continuous Variable
Frequency Distribution; Constructing a Frequency Distribution; Example
Displaying the Data; Bar Chart and Pie Chart; Stem and Leaf Plot; Graphs; Histogram; Frequency Polygon; Cumulative Frequency Polygon
Summary Measures; Goals; Arithmetic Mean; Characteristics of the Mean; Examples; Weighted Mean; Geometric Mean; Median; Properties of the Median; Mode; Positions of Mean, Median and Mode
Dispersion; Range and Mean Deviation; Variance; Moments
Skewness; Types of Skewness; Coefficient of Skewness; Empirical Rule; Exercise
STATISTICS
Numerical Facts (Common Usage)
1. No. of children born in a hospital in some specified time.
2. No. of students enrolled in GCU in 2007.
3. No. of road accidents on the motorway.
4. Amount spent on Research and Development in GCU during 2006-2007.
5. No. of shutdowns of the computer network on a particular day.
Field or Discipline of Study (Definition)
The science of collecting, presenting, analyzing and interpreting data to make decisions and forecasts.
Statistics is divided into Descriptive Statistics and Inferential Statistics. Probability provides the transition between Descriptive and Inferential Statistics.
Descriptive Statistics
Consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.
Data
A data set is a collection of observations on one or more variables.
Types of Data
Tables
Frequency Table
A grouping of qualitative data into mutually exclusive classes showing the number of observations in each class.
Example: Preference of four types of beverage by 100 customers.
Beverage      Number
Cola-Plus     40
Coca-Cola     25
Pepsi         20
7-UP          15
Frequency Distribution
A grouping of quantitative data into mutually exclusive classes showing the number of observations in each class.
Example: Selling price of 80 vehicles.
Vehicle Selling Price    Number of Vehicles
15000 to 24000           48
24000 to 33000           30
33000 to 42000           2
Organizing the Data
Construction of Frequency Distribution
Displaying the Data
Diagrams/Charts: Bar Chart, Pie Chart
Graphs: Histogram, Frequency Polygon
Stem and Leaf Plot
Variable
A characteristic under study that assumes different values for different elements (e.g. height of persons, no. of students in GCU).
Qualitative or Categorical Variable
A variable that cannot assume a numerical value but can be classified into two or more non-numeric categories is called a qualitative or categorical variable.
Examples: Educational achievements. Marital status. Brand of PC.
Quantitative Variable
A variable that can be measured numerically is called a quantitative variable. A quantitative variable is either a discrete variable or a continuous variable.
Continuous Variable
A variable whose observations can assume any value within a specific range.
Examples: Amount of income tax paid. Weight of a student. Yearly rainfall in Murree. Time elapsed between successive network breakdowns.
Discrete Variable
A variable that can assume only certain values, with gaps between the values.
Examples: Children in a family. Strokes on a golf hole. TV sets owned. Cars arriving at GCU in an hour. Students in each section of a statistics course.
Inferential Statistics
Consists of methods that use sample results to help make decisions or predictions about a population.
Estimation: Point Estimation, Interval Estimation
Testing of Hypothesis
Sample
1. A portion of the population selected for study.
2. A subset of data selected from a population.
Population
1. Consists of all elements (individuals, items, or objects) whose characteristics are being studied.
2. Collection of data that describe some phenomenon of interest.
Examples
Finite Population: Length of fish in a particular lake. No. of students of the Statistics course in BCS. No. of traffic violations on some specific holiday.
Infinite Population: Depth of a lake from any conceivable position. Length of life of a certain brand of light bulb. Stars in the sky.
Examples of Descriptive and Inferential Statistics
Descriptive
1. At least 5% of all fires reported last year in Lahore were deliberately set.
2. Next to colonial homes, more residents in a specified locality prefer a contemporary design.
Inferential
1. As a result of a recent poll, most Pakistanis are in favor of an independent and powerful parliament.
2. As a result of recent cutbacks by the oil-producing nations, we can expect the price of gasoline to double in the next year.
Types of Data: Level of Measurement
Data can be classified according to level of measurement. The level of measurement dictates the calculations that can be done to summarize and present the data. It also determines the statistical tests that should be performed.
Nominal: Data may only be classified. Examples: Jersey numbers of football players. Make of car.
Ordinal: Data are ranked; no meaningful difference between values. Examples: Your rank in class. Team standings.
Interval: Meaningful difference between values. Examples: Temperature. Dress size.
Ratio: Meaningful 0 point and ratio between values. Examples: No. of patients seen. No. of sales calls made. Distance students travel to class.
Diagrams/Charts
Bar Chart
A graph in which the classes are reported on the horizontal axis and the class frequencies on the vertical axis. The heights of the bars are proportional to the class frequencies.
Pie Chart
A chart that shows the proportion or percent that each class represents of the total number of frequencies.
Example: Covers for Cell Phones
[Bar chart: Cover Color (variable of interest) on the horizontal axis; No. of Covers (class frequency) on the vertical axis, 0 to 600]
[Pie chart: Orange 35%, Lime 25%, Red 22%, White 10%, Black 8%]
Angle = (f/n) × 360, with n = 1300
Cover Color    f       Angle
White          130     36
Black          104     29
Lime           325     90
Orange         455     126
Red            286     79
Total          1300    360
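The pie chart angles follow directly from Angle = (f/n) × 360. A minimal sketch in Python, using the cover frequencies from the slide; the function name `pie_angles` is only illustrative:

```python
def pie_angles(frequencies):
    """Return the pie-chart angle in degrees for each class: (f / n) * 360."""
    total = sum(frequencies.values())
    return {label: 360 * f / total for label, f in frequencies.items()}

# Class frequencies of cell phone covers, as given on the slide (n = 1300).
covers = {"White": 130, "Black": 104, "Lime": 325, "Orange": 455, "Red": 286}
angles = pie_angles(covers)

for color, angle in angles.items():
    print(f"{color}: {angle:.1f} degrees")
```

Rounded to whole degrees, these values reproduce the Angle column above.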
Graphs
Histogram
Frequency Polygon
Cumulative Frequency Polygon
Describing the Data: Summary Measures
Goals
Measures of Location: Arithmetic Mean, Weighted Arithmetic Mean, Geometric Mean, Median, Mode
Measures of Dispersion: Range, Mean Deviation, Variance, Standard Deviation
Moments: Moments about Origin, Moments about Mean
Skewness
Summary Measures: Goals
1. Calculate the arithmetic mean, weighted mean, median, mode, and geometric mean.
2. Explain the characteristics, uses, advantages, and disadvantages of each measure of location.
3. Identify the position of the mean, median, and mode for both symmetric and skewed distributions.
4. Compute and interpret the range, mean deviation, variance, and standard deviation.
5. Understand the characteristics, uses, advantages, and disadvantages of each measure of dispersion.
6. Understand Chebyshev’s theorem and the Empirical Rule as they relate to a set of observations.
Characteristics of the Mean
The arithmetic mean is the most widely used measure of location. It requires the interval scale.
Its major characteristics are:
1. All values are used.
2. It is unique.
3. The sum of the deviations from the mean is 0.
4. It is calculated by summing the values and dividing by the number of values.
In more detail:
1. Every set of interval-level and ratio-level data has a mean.
2. All the values are included in computing the mean.
3. A set of data has a unique mean.
4. The mean is affected by unusually large or small data values.
5. The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.
Selecting a Sample
Use of Tables of Random Numbers
Random numbers are randomly produced digits from 0 to 9. A table of random numbers contains rows and columns of these randomly produced digits. To use the table:
1. Choose the starting point at random.
2. Read off the digits in groups of one, two, three, or more digits in any predetermined direction (along rows or columns).
Example
Choose a sample of size 7 from a group of 80 objects. Label the objects 01, 02, 03, …, 80 in any order. Arbitrarily enter the table on any line and read out the pairs of digits in any two consecutive columns. Ignore numbers that recur, those greater than 80, and 00 (which labels no object).
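The reading rule in the example can be sketched in Python. The digit string below is a hypothetical line from a random number table, not taken from any real table, and `sample_from_digits` is an illustrative name:

```python
def sample_from_digits(digits, population_size, sample_size):
    """Read two-digit labels left to right, skipping repeats and invalid labels."""
    chosen = []
    for start in range(0, len(digits) - 1, 2):
        label = int(digits[start:start + 2])
        # Keep labels 01..population_size that have not been drawn before.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Hypothetical row of random digits (assumption, for illustration only).
row = "8147093281056614920381477523"
sample = sample_from_digits(row, population_size=80, sample_size=7)
print(sample)
```

Here the pairs 81 and 92 are skipped because they exceed 80, and the second 81 would be skipped in any case as a repeat.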
Construction of Frequency Distribution
Step 1: Decide on the number of groups (classes).
Use just enough classes to reveal the shape of the distribution. Let k be the desired no. of classes. k should be such that $2^k > n$. If n = 80 and we choose k = 6, then $2^6 = 64$, which is < 80, so k = 6 is not desirable. If we take k = 7, then $2^7 = 128$, which is > 80, so the no. of classes should be 7.
Step 2: Determine the class interval (width).
The class interval should be the same for all classes. The formula to determine the class width is
$$i \ge \frac{H - L}{k}$$
where i is the class width, H is the highest observed value, L is the lowest observed value, and k is the number of classes.
Step 3: Set the individual class limits.
Class limits should be very clear and should not overlap. Sometimes the class width is rounded up, so the classes together cover a range (k × i) larger than H - L. Make the lower limit of the first class a multiple of the class width.
Step 4: Make a tally of the observations falling in each class.
Step 5: Count the number of items in each class (the class frequency).
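Steps 1 and 2 can be checked with a few lines of Python. This is a sketch only; the values n = 80, H = 35925 and L = 15546 are those of the vehicle selling price example in this deck, and the function names are illustrative:

```python
def number_of_classes(n):
    """Smallest k such that 2**k > n (the rule used in Step 1)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def minimum_class_width(highest, lowest, k):
    """Lower bound for the class width i: (H - L) / k (Step 2)."""
    return (highest - lowest) / k

k = number_of_classes(80)
width = minimum_class_width(35925, 15546, k)
print(k, round(width))
```

The computed width is only a lower bound; in practice it is rounded up to a convenient figure.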
Construction of Frequency Distribution (Example)
Raw Data (Ungrouped Data)
20445 19251 26613 22817 25449 24571 22374 21740
23613 17266 27443 35925 27896 26285 22845 20962
18890 29237 15546 32277 19331 21722 21442 20356
20633 20642 35851 23657 18263 15794 25799 24052
17968 32492 27453 24533 26661 25783 23765 20203
21981 19766 20818 28670 19688 20155 17357 20004
20895 17399 28337 23169 28034 25277 25251 19873
19889 20642 29076 26651 24609 24324 24285 20047
15935 24296 21639 21558 19587 30872 28683 18021
17891 22442 30655 24220 23591 20454 23372 23197
Construction of Frequency Distribution (Example Continued)
Following Step 1, with n = 80, k should be 7.
Following Step 2, the class width should be at least (35925 - 15546)/7 ≈ 2911. The width is usually rounded up to a multiple of 10 or 100, so the width is taken as i = 3000.
Following Step 3, with i = 3000 and k = 7, the classes cover a range of 7 × 3000 = 21000, whereas the actual range is H - L = 35925 - 15546 = 20379. The lower limit of the first class should be a multiple of the class width, so the lower limit of the first class is taken as 15000.
Following Step 4 and Step 5:
Selling Price          Frequency
15000 up to 18000      8
18000 up to 21000      23
21000 up to 24000      17
24000 up to 27000      18
27000 up to 30000      8
30000 up to 33000      4
33000 up to 36000      2
Total                  80
Histogram
A graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.
Example 1, k = 6
Group        H      cf     f
1.6 - 2.2    2.1    2      2
2.2 - 2.8    2.7    6      4
2.8 - 3.4    3.3    19     13
3.4 - 4.0    3.9    32     13
4.0 - 4.6    4.5    38     6
4.6 - 5.2    5.1    40     2
[Figure: Histogram (Example 1), k = 6; horizontal axis: Groups, 1.60 to 5.20]
Example 1, k = 7
Group        H      cf     f
1.5 - 2.0    2      2      2
2.0 - 2.5    2.5    4      2
2.5 - 3.0    3      9      5
3.0 - 3.5    3.5    24     15
3.5 - 4.0    4      32     8
4.0 - 4.5    4.5    38     6
4.5 - 5.0    5      40     2
[Figure: Histogram (Example 1), k = 7; horizontal axis: Groups, 1.5 to 5.0; vertical axis: Percent]
Frequency Polygon
A graph in which the points formed by the intersections of the class midpoints and the class frequencies are connected by line segments.
Mid point = (L_i + H_i)/2
Example 1, k = 6
Group        Mid pt    cf     f
1.6 - 2.2    1.9       2      2
2.2 - 2.8    2.5       6      4
2.8 - 3.4    3.1       19     13
3.4 - 4.0    3.7       32     13
4.0 - 4.6    4.3       38     6
4.6 - 5.2    4.9       40     2
[Figure: Frequency Polygon (Example 1), k = 6; points at the midpoints 1.90 to 4.90; vertical axis: Percent]
Frequency Polygon (Continued)
Example 1, k = 7
Group        Mid pt    cf     f
1.5 - 2.0    1.75      2      2
2.0 - 2.5    2.25      3      1
2.5 - 3.0    2.75      7      4
3.0 - 3.5    3.25      22     15
3.5 - 4.0    3.75      32     10
4.0 - 4.5    4.25      37     5
4.5 - 5.0    4.75      40     3
[Figure: Frequency Polygon (Example 1), k = 7; points at the midpoints 1.75 to 4.75; vertical axis: Percent]
Cumulative Frequency Polygon
A graph in which the points formed by the intersections of the upper class limits and the class cumulative frequencies are connected by line segments.
A cumulative frequency polygon (ogive) portrays the number or percent of observations below a given value.
Example 1, k = 6
Group        Mid pt    cf     f
1.6 - 2.2    1.9       2      2
2.2 - 2.8    2.5       6      4
2.8 - 3.4    3.1       19     13
3.4 - 4.0    3.7       32     13
4.0 - 4.6    4.3       38     6
4.6 - 5.2    4.9       40     2
[Figure: Ogive (Example 1), k = 6; points at the upper limits 2.20 to 5.20; vertical axis: Cumulative Percent, 0 to 100]
Cumulative Frequency Polygon (Continued)
Example 1, k = 7
Group        Mid pt    cf     f
1.5 - 2.0    1.75      2      2
2.0 - 2.5    2.25      3      1
2.5 - 3.0    2.75      7      4
3.0 - 3.5    3.25      22     15
3.5 - 4.0    3.75      32     10
4.0 - 4.5    4.25      37     5
4.5 - 5.0    4.75      40     3
[Figure: Ogive (Example 1), k = 7; points at the upper limits 2.00 to 5.00; vertical axis: Cumulative Percent, 0 to 100]
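The cf column and the cumulative percents plotted in an ogive are running totals of the class frequencies. A minimal sketch for the k = 6 table of Example 1, pairing each cumulative value with the upper limit of its class (variable names are illustrative):

```python
from itertools import accumulate

# Class frequencies and upper class limits for Example 1 with k = 6.
freqs = [2, 4, 13, 13, 6, 2]
upper_limits = [2.2, 2.8, 3.4, 4.0, 4.6, 5.2]

cum_freq = list(accumulate(freqs))            # running totals (the cf column)
total = cum_freq[-1]
cum_percent = [100 * c / total for c in cum_freq]

for limit, cf, pct in zip(upper_limits, cum_freq, cum_percent):
    print(f"below {limit}: cf = {cf}, {pct:.1f}%")
```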
Stem and Leaf Plot
What Is a Stem and Leaf Plot?
A Stem and Leaf Plot is a type of graph that is similar to a histogram but shows more information.
It summarizes the shape of a set of data.
It provides extra detail regarding individual values.
The data are arranged by place value.
Stem and Leaf Plots are great organizers for large amounts of information.
The digits in the largest place are referred to as the stem.
The digits in the smallest place are referred to as the leaf.
The leaves are always displayed to the right of the stem.
What Are They Used For?
Series of scores on sports teams, series of temperatures or rainfall over a period of time, and series of classroom test scores are examples of when Stem and Leaf Plots could be used.
Constructing a Stem and Leaf Plot
Make a Stem and Leaf Plot with the following temperatures for June.
77 80 82 68 65 59 61 57 50 62 61 70 69 64 67 70 62 65 65 73 76 87 80 82 83 79 79 71 80 77
Stem (Tens) and Leaf (Ones)
1. Begin with the lowest temperature. The lowest temperature of the month was 50. Enter the 5 in the tens column and a 0 in the ones.
2. The next lowest is 57. Enter a 7 in the ones. Next is 59; enter a 9 in the ones.
3. Find all of the temperatures that were in the 60's, 70's and 80's. Enter the rest of the temperatures sequentially until your Stem and Leaf Plot contains all of the data.
Temperature
Stem (Tens)    Leaf (Ones)
5              0 7 9
6              1 1 2 2 4 5 5 5 7 8 9
7              0 0 1 3 6 7 7 9 9
8              0 0 0 2 2 3 7
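The construction above amounts to sorting the values and splitting each into its tens digit (stem) and ones digit (leaf). A short Python sketch for the June temperatures; `stem_and_leaf` is an illustrative helper, not a library function:

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted two-digit values by tens digit (stem); ones digits are leaves."""
    plot = defaultdict(list)
    for value in sorted(values):
        plot[value // 10].append(value % 10)
    return dict(plot)

temps = [77, 80, 82, 68, 65, 59, 61, 57, 50, 62, 61, 70, 69, 64, 67,
         70, 62, 65, 65, 73, 76, 87, 80, 82, 83, 79, 79, 71, 80, 77]

plot = stem_and_leaf(temps)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```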
Stem and Leaf Plot: Example
Make a Stem and Leaf Plot for the following data.
1.7 1.2 2.1 2.5 1.2
1.9 2.0 0.2 6.3 5.3
2.0 5.9 1.1 3.9 1.7
1.4 3.5 2.8 0.4 2.7
1.8 2.6 1.3 2.4 1.8
4.3 1.5 2.3 2.1 0.4
2.5 2.3 3.4 0.9 4.6
0.3 3.1 1.8 3.5 3.2
2.1 3.7 2.6 2.9 1.6
1.3 2.8 3.9 0.7 2.4
Freq    Stem    Leaf
6       0       2 3 4 4 7 9
14      1       1 2 2 3 3 4 5 6 7 7 8 8 8 9
17      2       0 0 1 1 1 3 3 4 4 5 5 6 6 7 8 8 9
8       3       1 2 4 5 5 7 9 9
2       4       3 6
2       5       3 9
1       6       3
50
Stem and Leaf Plot: Example
Following are the car battery life data. Make a Stem and Leaf Plot.
2.2 4.1 3.5 4.5 3.2 3.7 3 2.6
3.1 1.6 3.1 3.3 3.8 3.1 4.7 3.7
2.5 4.3 3.4 3.6 2.9 3.3 3.9 3.1
3.3 3.1 3.7 4.4 3.2 4.1 1.9 3.4
4.7 3.8 3.2 2.6 3.9 3 4.2 3.5
f       S       L
2       1       6 9
5       2       2 5 6 6 9
25      3       0 0 1 1 1 1 1 2 2 2 3 3 3 4 4 5 5 6 7 7 7 8 8 9 9
8       4       1 1 2 3 4 5 7 7
40
The same data with each stem split in two (leaves 0 to 4 and leaves 5 to 9):
Frequency    Stem    Leaf
2            1       6 9
1            2       2
4            2       5 6 6 9
15           3       0 0 1 1 1 1 1 2 2 2 3 3 3 4 4
10           3       5 5 6 7 7 7 8 8 9 9
5            4       1 1 2 3 4
3            4       5 7 7
40
Measures of Location: Arithmetic Mean
Ungrouped Data
Population: N observations X1, X2, …, XN in the population.
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \dots + X_N}{N}$$
Sample: n observations X1, X2, …, Xn in the sample.
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Grouped Data
Population: Let X_i and f_i be the midpoint and frequency respectively of the i-th group in the population, with k groups in all. The mean is defined as
$$\mu = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
Sample: Let X_i and f_i be the midpoint and frequency respectively of the i-th group in the sample, with k groups in all. The mean is defined as
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
Numerical Examples of Arithmetic Mean: Ungrouped Data
Example of Sample Mean
Following is a random sample of 12 clients showing the number of minutes used by each client on a particular cell phone plan last month. What is the mean number of minutes used?
90 110 89 113
91 94 100 112
77 92 119 83
$$\bar{X} = \frac{\sum X}{n} = \frac{90 + 91 + 77 + \dots + 83}{12} = \frac{1170}{12} = 97.5$$
Example of Population Mean
There are 12 automobile manufacturing companies in the U.S.A. Listed below is the no. of patents granted by the US Government to each company. Is this information a sample or a population?
Company            Number of Patents Granted
General Motors     511
Nissan             385
DaimlerChrysler    275
Toyota             257
Honda              249
Ford               234
Mazda              210
Chrysler           97
Porsche            50
Mitsubishi         36
Volvo              23
BMW                13
$$\mu = \frac{\sum X}{N} = \frac{511 + 385 + \dots + 13}{12} = \frac{2340}{12} = 195$$
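Both ungrouped means can be checked with Python's standard `statistics` module. The variable names are illustrative; the data are the minutes and patent counts listed in the two examples:

```python
from statistics import mean

# Sample of 12 clients: minutes used last month.
minutes = [90, 110, 89, 113, 91, 94, 100, 112, 77, 92, 119, 83]

# All 12 companies: patents granted (treated as a population).
patents = [511, 385, 275, 257, 249, 234, 210, 97, 50, 36, 23, 13]

sample_mean = mean(minutes)
population_mean = mean(patents)
print(sample_mean, population_mean)
```

The arithmetic is the same in both cases; only the interpretation (sample mean versus population mean) differs.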
Numerical Examples of Arithmetic Mean: Grouped Data
Following is the frequency distribution of selling prices of vehicles at Whitner Autoplex last month. Find the arithmetic mean.
Selling Price ($ thousands)    Frequency f    Midpoint X    fX
15 - 18                        8              16.5          132.0
18 - 21                        23             19.5          448.5
21 - 24                        17             22.5          382.5
24 - 27                        18             25.5          459.0
27 - 30                        8              28.5          228.0
30 - 33                        4              31.5          126.0
33 - 36                        2              34.5          69.0
Total                          80                           1845.0
$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{1845}{80} \approx 23.1$$
So the mean vehicle selling price is about $23,100.
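The grouped mean is a frequency-weighted average of the class midpoints. A minimal sketch using the Whitner Autoplex table (midpoints in $ thousands); `grouped_mean` is an illustrative name:

```python
def grouped_mean(midpoints, freqs):
    """Mean of grouped data: sum(f * X) / sum(f)."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

midpoints = [16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5]  # $ thousands
freqs = [8, 23, 17, 18, 8, 4, 2]

mean_price = grouped_mean(midpoints, freqs)
print(round(mean_price, 1))
```

The unrounded value is 23.0625, which the slide reports as 23.1.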
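Point of Equilibrium
Let the values X_1, X_2, …, X_6 carry the frequencies (weights) f_1, f_2, …, f_6, with X_1, X_2, X_3 lying below $\bar{X}$ and X_4, X_5, X_6 lying above it. An object is balanced at $\bar{X}$ when
$$(\bar{X} - X_1) f_1 + (\bar{X} - X_2) f_2 + (\bar{X} - X_3) f_3 = (X_4 - \bar{X}) f_4 + (X_5 - \bar{X}) f_5 + (X_6 - \bar{X}) f_6$$
$$-f_1 X_1 - f_2 X_2 - f_3 X_3 + (f_1 + f_2 + f_3)\bar{X} = -(f_4 + f_5 + f_6)\bar{X} + f_4 X_4 + f_5 X_5 + f_6 X_6$$
$$f_1 X_1 + f_2 X_2 + f_3 X_3 + f_4 X_4 + f_5 X_5 + f_6 X_6 = (f_1 + f_2 + f_3 + f_4 + f_5 + f_6)\bar{X}$$
$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + f_3 X_3 + f_4 X_4 + f_5 X_5 + f_6 X_6}{f_1 + f_2 + f_3 + f_4 + f_5 + f_6} = \frac{\sum_{i=1}^{6} f_i X_i}{\sum_{i=1}^{6} f_i}$$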
Summary Measures: Weighted Mean
A special case of the arithmetic mean. It applies when the values of the variable are associated with a certain quality, e.g. the prices of medium, large, and big soft drinks.
Soft Drink    Price    Weights
Medium        $0.90    3
Large         $1.25    4
Big           $1.50    3
The weighted mean of a set of numbers X1, X2, ..., Xn, with corresponding weights w1, w2, ..., wn, is computed from the following formula:
$$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n} = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
EXAMPLE: Weighted Mean
The Carter Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per hour. There are 26 hourly employees, 14 of whom are paid at the $16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00 rate. What is the mean hourly rate paid to the 26 employees?
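The slide leaves the Carter Construction question open. A minimal Python sketch of the weighted mean formula applied to it, with the number of employees at each rate as the weights (`weighted_mean` is an illustrative name):

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * X) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Carter Construction: hourly rates and number of employees at each rate.
rates = [16.50, 19.00, 25.00]
employees = [14, 10, 2]

mean_rate = weighted_mean(rates, employees)
print(f"${mean_rate:.2f} per hour")
```

The weighted total is 14 × 16.50 + 10 × 19.00 + 2 × 25.00 = 471, so the mean hourly rate is 471/26, about $18.12.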
Summary Measures: Geometric Mean
The geometric mean of a set of n positive numbers is defined as the nth root of the product of the n values. The formula for the geometric mean is written:
$$GM = \sqrt[n]{X_1 X_2 \cdots X_n}$$
The geometric mean used as the average percent increase over time is calculated as:
$$GM = \sqrt[n]{\frac{\text{Value at the end of period}}{\text{Value at the start of period}}} - 1$$
It is useful in finding the average change of percentages, ratios, indexes, or growth rates over time.
It has a wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the GDP, which compound or build on each other.
The geometric mean will always be less than or equal to the arithmetic mean.
Example of Geometric Mean
The return on investment by a certain company for four successive years was 30%, 20%, -40%, and 200%. Find the geometric mean rate of return on investment.
Solution: The 1.3 represents the 30 percent return on investment, i.e. the original investment of 1.0 plus the return of 0.3. So
$$GM = \sqrt[4]{(1.3)(1.2)(0.6)(3.0)} = 1.294$$
which shows that the average return is 29.4 percent.
If you earned $30000 in 1997 and $50000 in 2007, what is your annual rate of increase over the period?
$$GM = \sqrt[n]{\frac{\text{Value at the end of period}}{\text{Value at the start of period}}} - 1 = \sqrt[10]{\frac{50000}{30000}} - 1 = 0.0524$$
The annual rate of increase is 5.24 percent.
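Both geometric mean calculations can be reproduced in Python. This is a sketch; the function names are illustrative, and n = 10 is an assumption read off the ten years between 1997 and 2007:

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

def average_rate_of_increase(start, end, periods):
    """Average percent increase per period: (end / start) ** (1 / n) - 1."""
    return (end / start) ** (1 / periods) - 1

# Growth factors for returns of 30%, 20%, -40% and 200%.
gm = geometric_mean([1.3, 1.2, 0.6, 3.0])

# Earnings of $30000 in 1997 and $50000 in 2007 (n = 10 assumed).
rate = average_rate_of_increase(30000, 50000, 10)

print(round(gm, 3), round(rate, 4))
```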
Median
The median is the midpoint of the values after they have been ordered from the smallest to the largest, or from the largest to the smallest.
If the number of observations n is odd, the median is the (n+1)/2-th observation. If n is even, the median is the average of the n/2-th and (n/2+1)-th observations.
Example: Determine the median for each set of data.
(1) 41 15 39 54 31 15 33
(2) 15 16 27 28 41 42
Arrange each set of data in order:
(1) 15 15 31 33 39 41 54
(2) 15 16 27 28 41 42
1) n = 7, so the median is the 4th observation, that is 33.
2) n = 6, so the median is the average of the 3rd and 4th observations, that is (27 + 28)/2 = 27.5.
Median for Grouped Data
The median is obtained by using the formula:
$$\tilde{X} = L_m + \frac{I_m}{f_m}\left(\frac{n}{2} - cf_{m-1}\right)$$
where m is the group containing the n/2-th observation; L_m, I_m and f_m are the lowest value, class width and frequency respectively of the m-th group, and cf_{m-1} is the cumulative frequency of the group before it.
Example (Median)
Find the median for the following data (Example 1).
L       H       f     cf
1.60 <  2.20    2     2
2.20 <  2.80    4     6
2.80 <  3.40    13    19
3.40 <  4.00    13    32
4.00 <  4.60    6     38
4.60 <  5.20    2     40
n/2 = 20, so the median group is 3.40 - 4.00.
L_m = 3.40, I_m = 0.6, f_m = 13, cf_{m-1} = 19
$$\tilde{X} = 3.40 + \frac{0.6}{13}(20 - 19) = 3.45 \approx 3.5$$
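The median calculations can be sketched in Python. The ungrouped cases use the standard `statistics.median`; `grouped_median` is an illustrative helper that assumes equal class widths, as in the table for Example 1:

```python
from statistics import median

def grouped_median(lower_limits, width, freqs):
    """Median of grouped data: L_m + (I_m / f_m) * (n/2 - cf_{m-1})."""
    half = sum(freqs) / 2
    cumulative = 0  # cumulative frequency of the groups before the current one
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= half:
            return lower + (width / f) * (half - cumulative)
        cumulative += f

print(median([41, 15, 39, 54, 31, 15, 33]))   # odd n
print(median([15, 16, 27, 28, 41, 42]))       # even n

lower_limits = [1.6, 2.2, 2.8, 3.4, 4.0, 4.6]
freqs = [2, 4, 13, 13, 6, 2]
grouped = grouped_median(lower_limits, 0.6, freqs)
print(round(grouped, 2))
```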
Properties of the Median
1. There is a unique median for each data set.
2. It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
3. It can be computed for ratio-level, interval-level, and ordinal-level data.
4. It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.
Mode
The mode is the value of the observation that appears most frequently.
Mode (Example)
Region             No. of Seniors
New England        524
Middle Atlantic    818
E.N. Central       815
W.N. Central       367
S. Atlantic        679
E.S. Central       196
W.S. Central       436
Mountain           346
Pacific            783
[Bar chart: Regions on the horizontal axis; No. of Seniors on the vertical axis, 0 to 900]
Mode: Grouped Data
Calculating the mode for grouped data:
$$Mode = L_m + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times I_m$$
Calculate the mode of the following distribution.
Group        f
1.6 - 2.2    2
2.2 - 2.8    4
2.8 - 3.4    14
3.4 - 4.0    12
4.0 - 4.6    6
4.6 - 5.2    2
Solution:
The modal group is 2.8 - 3.4.
f_m = 14, f_{m-1} = 4, f_{m+1} = 12 and I_m = 0.6
$$Mode = 2.8 + \frac{14 - 4}{(14 - 4) + (14 - 12)} \times 0.6 = 3.3$$
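The grouped mode formula is easy to check numerically. A minimal sketch with the values of the solution above; `grouped_mode` is an illustrative name:

```python
def grouped_mode(lower, width, f_m, f_prev, f_next):
    """Mode of grouped data from the modal group and its two neighbours."""
    d1 = f_m - f_prev   # excess over the group before the modal group
    d2 = f_m - f_next   # excess over the group after the modal group
    return lower + d1 / (d1 + d2) * width

mode = grouped_mode(lower=2.8, width=0.6, f_m=14, f_prev=4, f_next=12)
print(round(mode, 1))
```

Because the group below (f = 4) is much smaller than the group above (f = 12), the mode is pulled toward the upper end of the modal group.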
The Relative Positions of the Mean, Median and the Mode
Dispersion
Why Study Dispersion?
A measure of location, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data.
For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth.
A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.
Studying dispersion through display.
Range and Mean Deviation
Range
Range = Largest value - Smallest value
Mean Deviation
$$M.D. = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$$
Example
The number of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the mean deviation for the number of cappuccinos sold.
Range = Largest value - Smallest value = 80 - 20 = 60
Mean Deviation: Example
Solution for the cappuccino example (20, 40, 50, 60, and 80 cappuccinos sold on the 5 sampled days), with mean 50:
Number of Cappuccinos Sold Daily (X)    X - Mean         Absolute Deviation
20                                      20 - 50 = -30    30
40                                      40 - 50 = -10    10
50                                      50 - 50 = 0      0
60                                      60 - 50 = 10     10
80                                      80 - 50 = 30     30
Total                                                    80
$$M.D. = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} = \frac{80}{5} = 16$$
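The range and mean deviation of the cappuccino data can be verified with a short Python sketch (`mean_deviation` is an illustrative name):

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)

sold = [20, 40, 50, 60, 80]

data_range = max(sold) - min(sold)
md = mean_deviation(sold)
print(data_range, md)
```

On average, daily sales differ from the mean of 50 cappuccinos by 16.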
Mean Deviation (Grouped Data)
$$MD = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{\sum_{i=1}^{k} f_i}$$
Selling Price ($ thousands)    Frequency f    Midpoint X    X - Mean    f|X - Mean|
15 - 18                        8              16.5          -6.6        52.8
18 - 21                        23             19.5          -3.6        82.8
21 - 24                        17             22.5          -0.6        10.2
24 - 27                        18             25.5          2.4         43.2
27 - 30                        8              28.5          5.4         43.2
30 - 33                        4              31.5          8.4         33.6
33 - 36                        2              34.5          11.4        22.8
Total                          80                                       288.6
$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{1845}{80} \approx 23.1$$
$$MD = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{\sum_{i=1}^{k} f_i} = \frac{288.6}{80} = 3.61$$
Variance and Standard Deviation
Population variance and standard deviation
Let X1, X2, …, XN be N observations in the population. The variance is defined as:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
The standard deviation is defined as:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Sample variance and standard deviation
Let X1, X2, …, Xn be n observations in the sample. The variance is defined as:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
The standard deviation is defined as:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Example: Variance and Standard Deviation
The number of traffic citations issued during the last five months in Beaufort County, South Carolina, is 38, 26, 13, 41, and 22. What is the population variance?
The hourly wages for a sample of part-time employees at Home Depot are: $12, $20, $16, $18, and $19. What is the sample variance?
Hourly Wage $ (X)    X - Mean    (X - Mean)^2
12                   -5          25
20                   3           9
16                   -1          1
18                   1           1
19                   2           4
Total 85             0           40
$$\bar{X} = \frac{\sum X}{n} = \frac{85}{5} = 17.0$$
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{40}{5 - 1} = 10.0$$
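Both questions can be answered with Python's standard `statistics` module, where `pvariance` divides by N (population) and `variance` divides by n - 1 (sample). The slide only works out the wage example; the citation result below is computed here, not quoted from the slide:

```python
from statistics import pvariance, variance

citations = [38, 26, 13, 41, 22]   # all five months: population
wages = [12, 20, 16, 18, 19]       # sample of part-time employees

pop_var = pvariance(citations)     # divides by N
samp_var = variance(wages)         # divides by n - 1
print(pop_var, samp_var)
```

For the citations the mean is 28 and the squared deviations sum to 534, giving a population variance of 534/5 = 106.8.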
Example: Grouped Data
The sample standard deviation for grouped data is defined as:
$$s = \sqrt{\frac{\sum f (X - \bar{X})^2}{\left(\sum f\right) - 1}}$$
Example:
For the following frequency distribution of prices of vehicles, compute the standard deviation of the prices.
Example (continued)
An alternate method of computing the variance is:
$$s^2 = \frac{1}{n - 1}\left(\sum fX^2 - \frac{\left(\sum fX\right)^2}{n}\right)$$
Example
Group        Mid pt (X)    f     fX       fX^2
1.5 - 2.0    1.75          2     3.5      6.125
2.0 - 2.5    2.25          2     4.5      10.13
2.5 - 3.0    2.75          5     13.75    37.81
3.0 - 3.5    3.25          15    48.75    158.4
3.5 - 4.0    3.75          8     30       112.5
4.0 - 4.5    4.25          6     25.5     108.4
4.5 - 5.0    4.75          2     9.5      45.13
Total                      40    135.5    478.5
$$s^2 = \frac{1}{40 - 1}\left(478.5 - \frac{(135.5)^2}{40}\right) = 0.5$$
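The alternate (computational) formula can be checked in Python with the midpoints and frequencies from the table; `grouped_sample_variance` is an illustrative name:

```python
def grouped_sample_variance(midpoints, freqs):
    """s^2 = (sum(f X^2) - (sum(f X))^2 / n) / (n - 1), with n = sum(f)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

midpoints = [1.75, 2.25, 2.75, 3.25, 3.75, 4.25, 4.75]
freqs = [2, 2, 5, 15, 8, 6, 2]

s2 = grouped_sample_variance(midpoints, freqs)
print(round(s2, 4))
```

The unrounded result is about 0.4998, which the slide reports as 0.5.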
Moments
Moments about Origin
The r-th moment about origin ‘a’ is defined as:
$$m'_r = \frac{\sum (X - a)^r}{n}$$
Moments about Mean
The r-th moment about the mean is defined as:
$$m_r = \frac{\sum (X - \bar{X})^r}{n}$$
The first moment about the mean is zero.
Moments of Grouped Data
The r-th moment about origin ‘a’ is defined as:
$$m'_r = \frac{\sum f (X - a)^r}{\sum f}$$
The r-th moment about the mean is defined as:
$$m_r = \frac{\sum f (X - \bar{X})^r}{\sum f}$$
The first moment about the mean is zero.
Example of Moments
Moments about Mean
Group        Mid pt (X)    f     fX       f(X - Mean)^2    f(X - Mean)^3    f(X - Mean)^4
1.5 - 2.0    1.75          2     3.5      5.445            -8.98425         14.824013
2.0 - 2.5    2.25          2     4.5      2.645            -3.04175         3.4980125
2.5 - 3.0    2.75          5     13.75    2.1125           -1.373125        0.8925313
3.0 - 3.5    3.25          15    48.75    0.3375           -0.050625        0.0075937
3.5 - 4.0    3.75          8     30       0.98             0.343            0.12005
4.0 - 4.5    4.25          6     25.5     4.335            3.68475          3.1320375
4.5 - 5.0    4.75          2     9.5      3.645            4.92075          6.6430125
Total                      40    135.5    19.5             -4.50125         29.11725
$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{135.5}{40} \approx 3.4$$
$$m_r = \frac{\sum f (X - \bar{X})^r}{\sum f}$$
$$m_2 = \frac{19.5}{40} \approx 0.5, \qquad m_3 = \frac{-4.50125}{40} = -0.1125, \qquad m_4 = \frac{29.11725}{40} = 0.7279$$
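The moments about the mean of grouped data can be computed for any order r with one small function. This is a sketch with an illustrative name; it uses the unrounded mean 3.3875, whereas the table above works with the rounded mean 3.4, so the results differ slightly from the slide's figures:

```python
def central_moment(midpoints, freqs, r):
    """r-th moment about the mean for grouped data: sum(f (X - mean)^r) / sum(f)."""
    n = sum(freqs)
    center = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * (x - center) ** r for x, f in zip(midpoints, freqs)) / n

midpoints = [1.75, 2.25, 2.75, 3.25, 3.75, 4.25, 4.75]
freqs = [2, 2, 5, 15, 8, 6, 2]

m1 = central_moment(midpoints, freqs, 1)
m2 = central_moment(midpoints, freqs, 2)
m3 = central_moment(midpoints, freqs, 3)
print(m1, round(m2, 4), m3 < 0)
```

The first moment comes out as zero, and the third moment is negative, in line with the sign of the total in the table.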
Example of Moments (Continued)
Example
Class        f     X      fX       X - Mean    f(X - Mean)    f(X - Mean)^2    f(X - Mean)^3    f(X - Mean)^4
0.0 - 0.8    5     0.4    2        -1.97       -9.84          19.37            -38.11           75.00
0.8 - 1.6    9     1.2    10.8     -1.17       -10.51         12.28            -14.34           16.75
1.6 - 2.4    15    2      30       -0.37       -5.52          2.03             -0.75            0.28
2.4 - 3.2    10    2.8    28       0.43        4.32           1.87             0.81             0.35
3.2 - 4.0    6     3.6    21.6     1.23        7.39           9.11             11.22            13.82
4.0 - 4.8    2     4.4    8.8      2.03        4.06           8.26             16.78            34.10
4.8 - 5.6    1     5.2    5.2      2.83        2.83           8.02             22.71            64.32
5.6 - 6.4    2     6      12       3.63        7.26           26.38            95.82            348.03
Total        50           118.4                0              87.31            94.14            552.65
$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{118.4}{50} = 2.37$$
$$m_2 = \frac{\sum f (X - \bar{X})^2}{\sum f} = \frac{87.31}{50} = 1.75$$
$$m_3 = \frac{\sum f (X - \bar{X})^3}{\sum f} = \frac{94.14}{50} = 1.88$$
$$m_4 = \frac{\sum f (X - \bar{X})^4}{\sum f} = \frac{552.65}{50} = 11.05$$
Skewness
The mean, median and mode are measures of central location for a set of observations, and the range and the standard deviation are measures of dispersion.
Another characteristic of a set of data is its shape.
There are four shapes commonly observed: symmetric, positively skewed, negatively skewed, and bimodal.
The coefficient of skewness can range from -3 up to 3. A value near -3, such as -2.57, indicates considerable negative skewness.
A value such as 1.63 indicates moderate positive skewness.
A value of 0, which will occur when the mean and median are equal, indicates the distribution is symmetrical and that there is no skewness present.
Types of Skewness
Coefficient of Skewness
The Pearson coefficient of skewness is defined as:
$$sk = \frac{3(\bar{X} - \text{Median})}{s}$$
Example
Following are the earnings per share for a sample of 15 software companies for the year 2005. The earnings per share are arranged from smallest to largest.
Compute the mean, median, and standard deviation. Find the coefficient of skewness using Pearson’s estimate. What is your conclusion regarding the shape of the distribution?
Solution
$$\bar{X} = \frac{\sum X}{n} = \frac{\$74.26}{15} = \$4.95$$
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{(\$0.09 - \$4.95)^2 + \dots + (\$16.40 - \$4.95)^2}{15 - 1}} = \$5.22$$
$$sk = \frac{3(\bar{X} - \text{Median})}{s} = \frac{3(\$4.95 - \$3.18)}{\$5.22} = 1.017$$
The shape is moderately positively skewed.
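Pearson's coefficient needs only the mean, median and standard deviation. A minimal sketch using the summary figures of the earnings-per-share solution (the individual earnings are not listed in this transcript, so the summary values are taken as given); `pearson_skewness` is an illustrative name:

```python
def pearson_skewness(mean, median, sd):
    """Pearson's coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean - median) / sd

sk = pearson_skewness(mean=4.95, median=3.18, sd=5.22)
print(round(sk, 3))
```

A positive value means the mean lies above the median, i.e. the long tail is on the right.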
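Example of Skewness (Continued)
Example
Class        f     cf    fX       f(X - Mean)^2
0.0 - 0.8    5     5     2        20.65
0.8 - 1.6    8     13    9.6      12.14
1.6 - 2.4    14    27    28       2.61
2.4 - 3.2    11    38    30.8     1.49
3.2 - 4.0    7     45    25.2     9.55
4.0 - 4.8    2     47    8.8      7.75
4.8 - 5.6    1     48    5.2      7.66
5.6 - 6.4    2     50    12       25.46
Total        50          121.6    87.31
$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{121.6}{50} = 2.43$$
$$s = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{87.31}{49}} = 1.3348$$
$$\tilde{X} = L_m + \frac{I_m}{f_m}\left(\frac{n}{2} - cf_{m-1}\right) = 1.6 + \frac{0.8}{14}\left(\frac{50}{2} - 13\right) = 2.29$$
$$sk = \frac{3(\bar{X} - \tilde{X})}{s} = \frac{3(2.43 - 2.29)}{1.3348} = 0.3147$$
The shape is slightly positively skewed.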
Example: Skewness
The skewness can also be measured with moments as:
$$b = \frac{m_3^2}{m_2^3}$$
For the example above, m2 = 1.75, m3 = 1.62 and b = 0.492.
[Figure: Histogram of the example data; horizontal axis: Data, 0.00 to 6.40; vertical axis: Percent, 0 to 30; positions of the mode, median and mean marked]
Empirical Rule
For a symmetrical, bell-shaped frequency distribution:
1. Approximately 68% of the observations will lie within plus and minus one standard deviation of the mean (mean ± s.d.).
2. About 95% of the observations will lie within plus and minus two standard deviations of the mean (mean ± 2 s.d.).
3. Practically all (99.7%) will lie within plus and minus three standard deviations of the mean (mean ± 3 s.d.).
Let the mean of a symmetric distribution be 100 and the standard deviation be 10; then the empirical rule is as follows:
68% of the observations lie between 90 and 110.
95% of the observations lie between 80 and 120.
99.7% of the observations lie between 70 and 130.
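The intervals for a mean of 100 and a standard deviation of 10 follow from mean ± k × s.d. for k = 1, 2, 3. A small Python sketch; `empirical_intervals` is an illustrative name, and the percentages are the approximate figures of the rule, not computed quantities:

```python
def empirical_intervals(mean, sd):
    """Intervals mean +/- k*sd for k = 1, 2, 3 with the empirical-rule coverage."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}   # approximate percentages
    return {k: (mean - k * sd, mean + k * sd, pct) for k, pct in coverage.items()}

intervals = empirical_intervals(100, 10)
for k, (low, high, pct) in intervals.items():
    print(f"mean ± {k} s.d.: {low} to {high} (about {pct}%)")
```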
Example: Empirical Rule
Consider the following distribution and check the empirical rule.
Raw data:
1.6 2.5 3 3.4 3.8
1.8 2.6 3.2 3.5 4.1
2 2.6 3.2 3.6 4.1
2.3 2.6 3.2 3.6 4.2
2.3 2.8 3.3 3.6 4.3
2.3 2.8 3.3 3.7 4.3
2.4 2.9 3.4 3.7 4.5
2.5 3 3.4 3.8 4.6
Mean = 3.2, s.d. = 0.75
Mean ± s.d. = (2.45 – 3.95) (67.5%)
Mean ± 2 s.d. = (1.70 – 4.70) (97.5%)
Mean ± 3 s.d. = (0.95 – 5.45) (100%)
Grouped data:
Group        f     X       fX      fX^2
1.5 - 2.0    2     1.75    3.5     6.13
2.0 - 2.5    5     2.25    11.3    25.3
2.5 - 3.0    8     2.75    22      60.5
3.0 - 3.5    10    3.25    32.5    106
3.5 - 4.0    8     3.75    30      113
4.0 - 4.5    5     4.25    21.3    90.3
4.5 - 5.0    2     4.75    9.5     45.1
Total        40            130     446
Mean = 3.25, s.d. = 0.77
Mean ± s.d. = (2.48 – 4.02) (67.5%)
Mean ± 2 s.d. = (1.71 – 4.79) (97.5%)
Mean ± 3 s.d. = (0.94 – 5.56) (100%)
Exercise
For the following data of examination marks find the mean, median, mode, mean deviation and variance. Also find the skewness.
Marks      No. of students
30 – 39    8
40 – 49    87
50 – 59    190
60 – 69    304
70 – 79    211
80 – 89    85
90 – 99    20
The following is the distribution of wages per thousand employees in a certain factory.
Daily Wages    No. of Employees
22             3
24             13
26             43
28             102
30             175
32             220
34             204
36             139
38             69
40             25
42             6
44             1
Calculate the modal and median wages. Why is there a difference between the two?