Business+Statistics
carla-tate -
7/28/2019 Business+Statistics
Business Statistics
Contents
1. Meaning and Scope
2. Collection of Data
3. Classification and Tabulation
4. Diagrammatic and Graphic Representation
5. Averages
6. Dispersion
7. Skewness and Kurtosis
8. Correlation
9. Linear Regression Analysis
10. Index Numbers
11. Time Series Analysis
12. Theory of Probability
13. Random Variable, Probability Distribution and Mathematical Expectation
14. Theoretical Distributions
15. Sampling Theory and Design of Sample Surveys
16. Interpolation and Extrapolation
Quantitative Decision Making
Learning Objectives
Basic statistics and its application in the day-to-day life of a manager
Various aspects of quantitative techniques and their application in decision making
Frequently used models of statistical analysis
Understand:
Complexity of Managerial decisions
Quantitative Techniques
Need of using Quantitative approach in decisions
Role of statistical methods in data analysis
Brief idea of various statistical methods
Know the areas of application of the quantitative approach in business and management.
Introduction
Individual businesses prior to the Industrial Revolution and their need for information: decisions were based on past experience and intuition.
Marketing of products
Test marketing of products
The manager (also the owner)
Progress of work
Any other fact the owner needed to know
Intuition alone has no place in decision making
It becomes highly questionable when decisions involve a choice among several courses of action, each of which can achieve several management objectives simultaneously.
Statistical methods are used in Marketing, Finance, Production and Personnel
Also in:
Regional planning
Transportation
Public health
Communication
Military
Agriculture
QT (Quantitative Techniques): a group of statistical and OR (Operations Research, i.e. programming) techniques
QT approach in decision making:
Problems are defined, analyzed and solved in a conscious, rational, systematic and scientific manner based on data, facts, information and logic (not on whims and guesses)
QT provides the decision maker with a scientific method, based on quantitative data, for identifying a course of action that achieves the optimal value of the predetermined objective or goal.
Numbers, symbols or mathematical formulae are used to represent models of reality.
Statistics and different senses
Statistical Data
Numerical or quantitative aspects
Statistical Methods
Collect, organize /classify, present, analyze and
interpret
Functions of Statistical Methods
Data Collection
Organize: segregate/condense
Presentation: orderly manner: graphs/charts
Analysis
Interpretation
examples
Statistics:
Characteristics of data: it is common to refer to data in quantitative form as statistical data.
Not all numerical data is statistical.
For a numerical description to count as statistics, it must be:
An aggregate of facts
Affected to a marked extent by a multiplicity of causes
(controllable/uncontrollable)
Enumerated or estimated according to reasonable standard of
accuracy.
Collected in a systematic manner for a pre-determined purpose.
Placed in relation to each other
Numerically expressed
Types of Statistical data
Secondary
Primary
OR: uses a mathematical model to represent the situation under study.
Helps either to:
Predict the performance of a system, or
Determine the action or control needed to optimize the performance.
Classification of Statistical Methods into three categories
Descriptive Statistics
Data Collection
Presentation
Inductive statistics
Statistical inference
Estimation
Statistical decision Theory
Analysis of business Decision
Descriptive Statistics
Used for re-arranging, grouping, and summarizing
sets of data
Changes in price index
Yield of wheat
Using different charts and graphs to present large quantities of numerical data for easy understanding
Various types of averages, central tendency and dispersion, trends, index numbers.
Inductive Statistics
The development of criteria that can be used to derive information about the nature of the entire population or universe from the nature of a small sample.
Includes: probability, probability distributions, sampling and sampling distributions,
various methods of testing hypotheses: correlation, regression, factor analysis, time series analysis.
Statistical Decision Theory: 4 different states of decision environment
State of decision and consequence
Certainty: Deterministic
Risk: Probabilistic
Uncertainty: Unknown
Conflict: Influenced by an opponent
Subjective approach (uses probabilities)
Also known as the Bayesian approach
Models in OR
Based on Purpose:
Descriptive: behavior of a system (behavior of demand for an inventory item)
Explanatory: explain behavior with relationships (wages, promotion policy)
Predictive: predict stock prices for any given level of earnings per share
Prescriptive (normative): norms for comparison of alternate solutions (allocation)
Based on Degree of Abstraction:
Physical, Graphic, Schematic, Analog, Mathematical
Based on Degree of Certainty and Risk:
Deterministic: linear programming, transportation and assignment models
Probabilistic: simulation models, decision theory
Based on Specified Behavior Characteristics:
Static, Dynamic, Linear, Non-linear
Based on Procedure (method) of Solution:
Analytical, Simulation
Classification of models helps in understanding the nature and role of models
Abstract or physical
Static: linear programming
Dynamic model
Linear or non-linear
Stable,
unstable
unstable( Constrained)
Unstable (explosive)
Transient steady state,
Transient (non existent)
Ref:
Various Statistical Techniques
Measure of Central tendency
Measure of Dispersion:
Correlation
Regression analysis:
Time Series Analysis
Index Numbers
Sampling and Statistical Inference
Measure of Central tendency
Mean: common arithmetic average
Divide the sum of the values of the observations by the number of items observed.
Median:
The item that lies exactly halfway between the lowest and highest values when they are arranged in ascending or descending order. It depends on the position of the observations, not on their magnitudes, so it is not affected by extreme values.
Divides the number of households into two equal parts.
(50% of all households have income below the median income)
Mode:
The category that has the maximum number of observations (the value that occurs most frequently)
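As a minimal sketch of the three measures above, the following uses Python's standard `statistics` module. The income figures are made-up illustrative values, not data from the course.

```python
import statistics

# Hypothetical monthly household incomes (illustrative values only)
incomes = [12, 15, 15, 18, 20, 22, 80]

mean_income = statistics.mean(incomes)      # sum of values / number of values
median_income = statistics.median(incomes)  # middle value of the sorted list
mode_income = statistics.mode(incomes)      # most frequently occurring value

print(mean_income, median_income, mode_income)
```

Note how the single large value (80) pulls the mean well above the median, while the median and mode are unaffected by it.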
Measure of Dispersion:
Spread of the data away from the central tendency (mean/mode/median):
Range, mean deviation, standard deviation.
Whether the data spread in a symmetrical or asymmetrical pattern: skewness
Peakedness of the frequency distribution: measured by kurtosis
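The three measures of spread named above can be sketched as follows. The observations are hypothetical, and the standard deviation shown is the population form (dividing by the number of observations), which is an assumption about the convention intended here.

```python
import statistics

values = [4, 8, 6, 10, 12]  # hypothetical observations

mean = statistics.mean(values)

# Range: difference between the largest and smallest observation
value_range = max(values) - min(values)

# Mean deviation: average absolute distance from the mean
mean_deviation = sum(abs(x - mean) for x in values) / len(values)

# Standard deviation (population form): root of the mean squared deviation
std_dev = statistics.pstdev(values)

print(value_range, mean_deviation, std_dev)
```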
Correlation
A dependent variable is associated with changes in another, independent variable.
Example: sales as the dependent variable and advertising budget as the independent variable.
The relationship could be casual (coincidental) or causal
Regression analysis:
Determining the causal relationship between two variables
Use of multivariate statistical techniques for determining causal relationships involving two or more variables:
Multiple regression analysis, discriminant analysis, factor analysis
Time Series Analysis
A set of data (arranged in some desired manner) recorded either at successive points in time or over successive periods of time.
The changes are considered to be the resultant of the combined effect of several forces
The force components:
Editing time series data
Secular trend
Periodic changes (cyclical/seasonal variations)
Irregular or random variations.
Examples: cost of living, growth of agricultural/food production, seasonal requirements of items, impact of war, strikes
Index Numbers: a relative number representing the net result of change in a group of variables
Stated in percentages
Compares a given or current year with a base year
Examples: production, sales price, volume of employment
Sampling and Statistical Inference
Sampling for reasons
Schemes for drawing samples are classified as:
Random sampling schemes
Every element has an equal chance (probability) of being selected
Non-random sampling schemes
Drawing samples based on the choice or purpose of the selectors
Sampling analysis using various tests:
Z (normal) distribution
Student's t distribution
F distribution
Chi-square (χ²) distribution
Advantages to Management
Definiteness
Condensation
Comparison
Formulation of policies
Formulating and testing hypothesis
Prediction
Application of techniques in Business and Management
Management
Marketing
Production
Finance, accounting and Investment
Personnel
Economics
Research and Development
Natural science
Marketing
Marketing research info
Building and maintaining an extensive
market
Sales forecasting
Production
PPC and analysis
Machine performance evaluation
QC
Inventory control
Finance, accounting and Investments
Financial forecasts, budget preparation
Financial investment decisions
Selection of securities
Auditing function
Credit policies, credit risk, delinquent accounts
Personnel
Labour turnover rate
Employment trends
Performance appraisal
Wage rates and incentive plans
Economics
Measurement of Gross National Product and input-output analysis
Determination of business cycles, seasonal fluctuations
Comparison of market price, cost and profit of individual firms
Analysis of population
Operational studies of public utilities
Formulation of appropriate economic policies and evaluation of their effects
Research and Development
Development of new product lines
Optimal use of resources
Evaluation of existing products
Natural science
Diagnosing based on inputs
Efficacy of certain drugs
Study of plant life
Exercise/ Assignments
1. Comment on the statement: "Statistics are numerical statements of facts, but all facts numerically stated are not statistics."
2. Explain the distinction between descriptive and prescriptive models.
Presentation topic:
1. Formulate a business problem and analyze it by applying the major phases of statistics.
Functions and Progressions
Learning Objectives:
Insight into different aspects of the types of functional
relationships among business variables
Their applications in various fields of management
Need to Identify/define relationships among business
variables
Define functional relationships
Various types of functional relationships
Use of graph to depict functional relationships
Managerial applicability
Progression and application.
Introduction
For decision problems that use mathematical tools, the first requirement is to identify or formally define all significant interactions or relationships among the primary factors (also called variables). The relationships are usually stated in the form of an equation or inequality.
Study mathematical problems in the context of managerial problems
Definitions
Variables: A variable is something whose magnitude can vary or which can assume various values. Represented by symbols (first letter of the name)
Discrete variables: subject to counting (houses, machines)
Continuous variables: subject to measurement (temperature, height)
Constants and Parameters:
A constant remains fixed in the context of a given problem or situation
An absolute (or numerical) constant retains the same value in all problems
The absolute (or numerical) value of b is denoted by |b| regardless of its algebraic sign: |b| = |-b|
An arbitrary (or parametric) constant, or parameter, retains the same value throughout any particular problem, but may assume different values in different problems
P21 (ex1)
Types of Function
Linear Functions:
The power of the independent variable is 1
A function with only one independent variable is called a single variable function. (P21 (1))
A single variable function can be linear or non-linear. (p 22)
A linear function of one variable can always be graphed in a two-dimensional plane (or space). The graph of such a function is always a straight line. (P22 ex2)
Polynomial functions:
A polynomial function of degree 1 is called a linear function
A polynomial function of degree 2 is called a quadratic function (p23-ab)
Absolute Value Functions: (p23 (3))
Inverse Function: (P 23)
Step function: For different values of the independent variable x within an interval, the dependent variable y = f(x) takes a constant value, but it takes different values in different intervals. (p24-5)
Algebraic and Transcendental functions
Activity
P 25 activity B -1a&b assignment
Business Application
Linear function (P27 ex3, assignment)
Quadratic function (P27 ex4, assignment)
Activity D (Page 28-b, assignment)
Sequence and Series
If for every positive integer n, -------- related to some number ----- sequence
Installment buying
Simple and compound interest problems
Annuities and present values
Arithmetic progression (AP)
Arithmetic progression: a sequence whose terms increase or decrease by a constant number, called the common difference of the AP and denoted by d
P29 ex6 assignment
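A short sketch of how the common difference d generates an AP, using the standard formulas for the n-th term, a + (n - 1)d, and the sum of the first n terms, n(2a + (n - 1)d)/2. The symbol a for the first term, the function names and the instalment figures are assumptions introduced for illustration.

```python
def ap_term(a, d, n):
    """n-th term of an AP with first term a and common difference d."""
    return a + (n - 1) * d


def ap_sum(a, d, n):
    """Sum of the first n terms of the same AP."""
    return n * (2 * a + (n - 1) * d) / 2


# Hypothetical instalment plan: first payment 100, rising by 20 each month
payments = [ap_term(100, 20, k) for k in range(1, 6)]
total_paid = ap_sum(100, 20, 5)

print(payments, total_paid)
```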
Geometric progression (GP)
A geometric progression: a sequence whose terms increase or decrease by a constant ratio, called the common ratio of the GP and denoted by r
P29 ex7 assignment
P31 ex 8
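The sketch below illustrates a GP with the standard formulas for the n-th term, a·r^(n-1), and the sum of the first n terms, a(r^n - 1)/(r - 1). It assumes the conventional symbols a (first term) and r (common ratio); the compound-interest figures are hypothetical.

```python
def gp_term(a, r, n):
    """n-th term of a GP with first term a and common ratio r."""
    return a * r ** (n - 1)


def gp_sum(a, r, n):
    """Sum of the first n terms of the same GP."""
    if r == 1:
        return a * n
    return a * (r ** n - 1) / (r - 1)


# Compound interest as a GP: a balance of 1000 growing at 10% per year
# forms a GP with common ratio 1 + 0.10 (illustrative figures only)
balances = [gp_term(1000, 1.10, k) for k in range(1, 5)]

print(balances, gp_sum(2, 3, 4))
```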
Concept of Maxima and Minima with managerial applications
Page 55 ex18 assignment
Descriptive Statistics
Data Collection and analysis
Contents
Collection of data:
Need and significance of data collection
Primary and secondary data
Different methods of collecting primary data
Edit primary data and know sources of secondary data and its use
Census versus sample
Classification and presentation of collected data
Treatment of data through central tendency measurements,
Deviations and different measures of variation.
Introduction
The need for data collection
Statistical data is a set of facts expressed in quantitative form.
The use of facts expressed as measurable quantities can help a decision maker arrive at better decisions.
Primary and Secondary Data
Distinguish between Primary and------
Methods of collecting Primary Data
Observation
Questionnaire
Personal interview
Mail
Telephone
Designing/Preparing questionnaire
Pre-testing a questionnaire
Editing the primary data.
Important points in Designing a questionnaire
Covering letter
Number of questions to be kept to a minimum (15-40)
Simple, short and unambiguous questions
Questions of a sensitive and personal nature to be avoided
Answers to the questionnaire should not require calculations
Logical arrangement
Cross-checks and footnotes
Editing Primary Data to ensure:
Completeness
Consistency
Accuracy
Homogeneity
Sources of secondary data
Published Sources
Unpublished Sources
Precautions in use of secondary Data
Secondary data may suffer from bias, inadequate sample size, errors of definition and computational errors
Hence consider:
Suitability
Reliability
Adequacy
Census (complete enumeration) and Sample
Advantages and disadvantages of census
(Physical destruction)
Exercises/Assignments
1. Distinguish between primary and secondary data. Indicate the situations in which each of these----?
2. Distinguish between census and sampling methods of data collection. Compare their merits and demerits. Why is sampling unavoidable in certain situations?
Presentation of Data
Learning objectives
Understand the need and significance of presentation of data
Necessity of classifying data and various types of classification
Construct frequency distributions of discrete and continuous data
Present frequency distributions in the form of bar diagrams, histograms, frequency polygons and ogives
Classification
Discrete frequency distribution
Continuous frequency distribution
Choosing the classes
Cumulative and Relative frequencies
Charting data
Introduction
After understanding the various ways of data collection:
The successful use of the data collected depends on the manner in which it is arranged, displayed and summarized.
Data can be presented either in tabular form or through charts
For tabular form, it is necessary to classify the data before it is tabulated.
Hence to understand:
classification,
tabulation and
charting of data.
Classification of data
After the data has been systematically collected and edited, the first step in presentation of data is classification
Classification is the process of arranging the data according to points of similarity and dissimilarity
Principal objectives of classification
To condense the mass of data in such a way that
salient features can be easily noticed
To facilitate comparisons between attributes of
variables
To prepare data to be presented in tabular form
To highlight significant features of data at a glance
Some Common Types of Classification
Geographical Classification: production of wheat state-wise
Chronological Classification: sales figures of a company for the last six years
Qualitative Classification
Dichotomous Classification: an attribute divided into two classes, one possessing it and the other not possessing it (basis of employment)
Manifold Classification: divided into several classes (educational level)
Quantitative Classification: according to characteristics that can be measured (employees as per monthly salaries)
Discrete: limited to certain numerical values of a variable
Continuous: can take all values of the variable
Examples
Chronological classification
Discrete frequency distribution
Continuous frequency distribution
P14,15
Construction of a Discrete Frequency distribution
Place all possible values of the variable in ascending order in one column
Then prepare another column of tally marks to count the number of times a particular value of the variable is repeated
To facilitate counting, use blocks of 5 tally marks with a space left between blocks
The frequency column gives the number of tally marks that a particular class contains
p15
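The construction steps above can be sketched in a few lines. The data set (number of children in 15 families) and the helper name `tally` are hypothetical, introduced only for illustration.

```python
from collections import Counter

# Hypothetical data: number of children in 15 families
children = [2, 0, 1, 3, 2, 2, 1, 0, 4, 2, 1, 3, 2, 1, 0]

# Count how many times each value of the variable is repeated
frequency = Counter(children)


def tally(count):
    """Render a count as tally marks in blocks of five separated by spaces."""
    blocks, rest = divmod(count, 5)
    return " ".join(["|||||"] * blocks + (["|" * rest] if rest else []))


# One row per value, in ascending order: value, tally marks, frequency
for value in sorted(frequency):
    print(value, tally(frequency[value]), frequency[value])
```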
Construction of a Continuous Frequency distribution
Class limits: e.g. 60-69; the lower and upper limits are the lowest and highest values of the class
Class interval: width, span or size of a class, e.g. 20-10=10
Class frequency: the number of observations falling within a particular class is called the class frequency, or simply the frequency. The total frequency (sum of all frequencies) indicates the total number of observations considered in a given frequency distribution.
Class mid-point: sum of the lower limits of two successive classes divided by 2.
Assignments
1. What do you understand by classification of data?
2. Why is classification of data required?
3. Illustrate the difference between qualitative and quantitative data.
Types of class interval: Methods
Exclusive and Inclusive (depending on whether the upper limit is excluded or included) ----(p16)
Open-end (p17)
Generally opt for the exclusive method
If the inclusive method is used, minor adjustments are required to determine the class interval
Correction factor: (lower limit of second class - upper limit of first class) divided by 2
Deduct the correction factor from each lower limit and add it to each upper limit
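A minimal sketch of the correction-factor adjustment described above, applied to a hypothetical set of inclusive classes (the class limits are illustrative, not taken from the course text).

```python
# Hypothetical inclusive classes: 10-19, 20-29, 30-39
inclusive = [(10, 19), (20, 29), (30, 39)]

# Correction factor = (lower limit of second class - upper limit of first class) / 2
correction = (inclusive[1][0] - inclusive[0][1]) / 2

# Deduct the correction factor from each lower limit and add it to each upper limit
exclusive = [(lower - correction, upper + correction) for lower, upper in inclusive]

print(correction, exclusive)
```

After the adjustment, the upper boundary of each class coincides with the lower boundary of the next, so every class has the same width of 10.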
Guidelines for choosing the class
The number of classes should be neither too small nor too large (5 to 15)
If possible, the widths of the intervals should be numerically simple, like 5, 10, 25 (values like 3, 7, 9 should be avoided)
It is desirable to have classes of equal width (classes with unequal class intervals can be formed, as in income distribution)
The starting point of a class should be 0, 5, 10 or a multiple thereof (e.g. 3-13 not allowed)
The class interval should be determined by considering the minimum and maximum values and the number of classes to be formed
(p18)
Activity
Distinguish between:
1. Discrete and continuous frequency distribution
2. Class limits and class intervals
3. Inclusive and exclusive methods
Cumulative and Relative frequencies
Rather than listing the actual frequency of each class, it may be appropriate to list cumulative frequencies, relative frequencies, or both.
Cumulative frequencies: cumulate the frequencies, starting from either the lowest or the highest value. (p18-19)
Relative frequencies: very often, the frequencies in a frequency distribution are converted to relative frequencies to show the percentage for each class. The frequency of a class is divided by the total number of observations (total frequency). To get the percentage for each class, multiply the relative frequency by 100. (p19)
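The following sketch computes cumulative and relative frequencies for a hypothetical frequency distribution; the classes and counts are assumptions chosen for illustration.

```python
from itertools import accumulate

# Hypothetical frequency distribution (illustrative classes and counts)
classes = ["0-10", "10-20", "20-30", "30-40"]
frequencies = [5, 8, 12, 5]
total = sum(frequencies)

# Cumulating from the lowest class upward
cumulative_from_lowest = list(accumulate(frequencies))

# Cumulating from the highest class downward
cumulative_from_highest = list(accumulate(reversed(frequencies)))[::-1]

# Relative frequency = class frequency / total frequency; times 100 for a percentage
relative = [f / total for f in frequencies]
percentages = [100 * r for r in relative]

for row in zip(classes, frequencies, cumulative_from_lowest, percentages):
    print(row)
```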
Important advantages in looking at
Relative frequencies (percentages)
1. Facilitates a comparison of two or more
sets of data.
2. Constitutes the basis for understanding the concept of probability.
Activity
Explain the concept of relative frequency
Charting of Data
Bar diagram
Most popular
Examples: population, per capita income, sales and profits
A bar is a thick line whose width is shown to attract the viewer.
A bar diagram may be either vertical or horizontal.
DRAWING A BAR DIAGRAM:
Take the characteristic (or attribute) under consideration on the X-axis and the corresponding value on the Y-axis. It is desirable to mention the value depicted by the bar on top of the bar.
The gap between one bar and the next is kept equal.
The widths of the bars are also the same.
The only difference is in the length of the bars.
That is why diagrams of this type are known as one-dimensional.
(P20)
Histograms
One of the most commonly used and easily understood
methods of graphic representation of frequency distribution.
A histogram is a series of rectangles having areas that are in
the same proportion as the frequencies of a frequency
distribution
CONSTRUCTING HISTOGRAM:
On horizontal axis or X-axis, we take class limits of variables, and on
vertical axis or Y-axis, we take frequencies of class intervals shown on
horizontal axis
If the class intervals are of equal width, then the vertical bars are of equal width. (P20-21)
On the other hand, if the class intervals are unequal, the frequencies have to be adjusted according to the width of the class interval (P 21-22)
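One common way to make the adjustment for unequal class intervals is to scale each frequency by the ratio of the narrowest class width to the width of its own class, so that the areas of the rectangles stay proportional to the frequencies. The sketch below assumes that convention; the classes and frequencies are hypothetical.

```python
# Hypothetical classes with unequal widths: (lower limit, upper limit, frequency)
classes = [(0, 10, 6), (10, 20, 10), (20, 40, 12), (40, 80, 8)]

# Use the narrowest class width as the standard width
narrowest = min(upper - lower for lower, upper, _ in classes)

# Adjusted height = frequency * (standard width / class width)
adjusted = [freq * narrowest / (upper - lower) for lower, upper, freq in classes]

for (lower, upper, freq), height in zip(classes, adjusted):
    print(f"{lower}-{upper}: frequency {freq}, bar height {height}")
```

With these heights, the area of each rectangle (height times class width) equals the frequency times the standard width, which is what keeps the areas in proportion to the frequencies.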
Activity
Draw a sketch of a histogram and a bar
diagram and explain the difference between
the two.
Frequency Polygon
A graphical presentation of frequency distribution
A polygon is a many-sided closed figure.
A frequency polygon is constructed by:
taking the mid-points of the upper horizontal side of each rectangle on the histogram and
connecting these mid-points by straight lines.
In order to close the polygon, an additional class with zero frequency is assumed at each end.
(p22-23)
The histogram is usually associated with discrete data and a frequency polygon is appropriate for continuous data. (But the distinction is not always followed)
The frequency polygon and frequency curve have a special advantage over the histogram, particularly when comparing two or more frequency distributions
Activity
What is the procedure for making a
frequency polygon? Illustrate.
Ogives or Cumulative frequency Curve
A graphical presentation of a cumulative frequency distribution.
There are two methods:
Less than ogive:
The upper limits of the various classes are taken on the X-axis, and the frequencies obtained by cumulating the preceding frequencies on the Y-axis. By joining these points we get the less than ogive.
More than ogive:
The lower limits are taken on the X-axis and the cumulative frequencies on the Y-axis. By joining these points we get the more than ogive.
The less than ogive is a rising curve,
whereas the more than ogive is a falling one
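The points to be plotted for the two ogives can be computed as sketched below. The class limits and frequencies are hypothetical; joining each list of (x, y) points with straight lines or a smooth curve gives the corresponding ogive.

```python
from itertools import accumulate

# Hypothetical continuous frequency distribution
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]  # (lower limit, upper limit)
frequencies = [5, 8, 12, 5]

# Less than ogive: upper limits against frequencies cumulated from the first class
less_than_points = list(zip((upper for _, upper in classes),
                            accumulate(frequencies)))

# More than ogive: lower limits against frequencies cumulated from the last class
cumulated_from_top = list(accumulate(reversed(frequencies)))[::-1]
more_than_points = list(zip((lower for lower, _ in classes), cumulated_from_top))

print(less_than_points)
print(more_than_points)
```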
Activity
With the help of an example, explain the concept of the less than ogive and the more than ogive.
Types of Data
Data refers to known facts or things used as basis for
inference or reckoning.
Types of Data:
Qualitative: concerned with qualities and non-numerical
characteristics.
Quantitative: concerned with numerical characteristics.
Discrete: take only one of a range of distinct values (number of employees).
Continuous: take any value within a given range (time, length)
(P160-161BR)
The Concept of Level of Measurements
Scales of Measurement
Nominal level (Classificatory/ named) Data:
Ordinal level (Ranking/ordered) data:
Interval level (Numerical) data
Ratio level (Numerical) data: represents the highest level of precision.
Nominal level (Classificatory/named) Data:
And Implications for Data handling Methodologies
Classification of data: Statements of equality or differences
(according to variable occupation)
Although mode could be used, very few statistics can be
applied to data collected in this form
Ordinal level (Ranking/ordered) data:
And Implications for Data handling
Methodologies
Can be classified in terms of equality or differences
Permit you to order individual data and make decisions such as
this score is greater or lesser than another. (employee grades or
choices ranked)
Since arithmetic mean cannot be calculated , the use of many
other statistics are also excluded.
Interval level (Numerical) data
And Implications for Data handling
Methodologies
Has the characteristics of both nominal and ordinal scales, but also provides additional information regarding the degree of difference between individual data items within a set or group.
Most measures of human characteristics have interval properties. (Interval between IQ scores/assignment marks)
However, precision in an interval scale is limited. Also, some statistics, such as the geometric mean, are excluded from use with data collected in this form.
Ratio level (Numerical) data: represent highest level of precision.
And Implications for Data handling Methodologies
A mathematical number system (height, weight, time)
A ratio scale allows ratio as well as interval decisions (allowing us to say something is so many times as big/bright/heavy)
Any statistics can be used on data collected in this form.
(Some scales, such as temperature, may appear to have ratio properties, but in fact are only interval scales) (Centigrade)
Parametric and non-parametric methods
(assumptions about parameters of the data)
Associated with every data analytic method, there is a set of assumptions that underlie the use of that method.
t-test (to compare the means of two samples of data) as one of the most popular (p133-RM)
Non-parametric methods:
Developed with research in the social sciences in mind
Valid for use with nominal or ordinal level data.
Suitable for very small samples (n less than 10), though the power of any test weakens with very small samples.
Measures of central Tendency
Learning objectives:
Concept and significance of measures of central
tendency.
Computing: arithmetic mean, weighted arithmetic mean,
median, mode, geometric mean, and harmonic mean.
Computing several quantiles: quartiles, deciles, and
percentiles
Relationships among various averages.
Significance of measure of central
tendency
The objective is to find one representative value
which can be used to locate and summarize the
entire set of varying values.
To find some central value around which the data
tend to cluster
Average income
Average sales figure may be compared with that of
another
Properties of a Good measure of central tendency
Easy to understand
Simple to compute
Based on all observations
Uniquely defined
Capable of further algebraic treatment
It should not be unduly affected by extreme
values.
Important measures of central tendency
commonly used by Business and Industry.
arithmetic mean,
weighted arithmetic mean,
median,
quantiles
mode,
geometric mean,
harmonic mean.
Arithmetic Mean
(or Mean or Average)
In statistics, the term average refers to any of the measures of central tendency
The arithmetic mean is defined as the sum of the numerical values of each and every observation divided by the total number of observations.
E.g. average monthly salary (ungrouped data)
When observations are classified into a frequency distribution, the midpoint of a class interval is treated as the representative average value of that class.
(P-31)
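For grouped data, the midpoint rule above leads to the usual calculation: multiply each class midpoint by its frequency, add the products, and divide by the total frequency. The sketch below uses a hypothetical frequency distribution.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]

total_frequency = sum(freq for _, _, freq in classes)

# Each class is represented by its midpoint, weighted by its frequency
weighted_sum = sum(((lower + upper) / 2) * freq for lower, upper, freq in classes)

grouped_mean = weighted_sum / total_frequency
print(grouped_mean)
```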
Mathematical properties of Arithmetic mean
The sum of deviations of observations from the AM is always zero
The sum of squared deviations of observations from the mean is a minimum
Arithmetic means of several sets of data may be combined into a single AM for the combined sets of data.
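The three properties can be checked numerically on small made-up data sets, as in the sketch below. The combined mean uses the standard formula (n1·mean1 + n2·mean2) / (n1 + n2); the group values and helper name are assumptions for illustration.

```python
import statistics

group_a = [10, 20, 30]          # hypothetical data, mean 20
group_b = [40, 50, 60, 70, 80]  # hypothetical data, mean 60

mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# Property 1: deviations from the mean sum to zero
deviation_sum = sum(x - mean_a for x in group_a)


# Property 2: squared deviations are smallest when taken about the mean
def squared_deviations(data, centre):
    return sum((x - centre) ** 2 for x in data)


# Property 3: means of several sets combine into one mean, weighted by set size
combined_mean = (len(group_a) * mean_a + len(group_b) * mean_b) / (
    len(group_a) + len(group_b)
)

print(deviation_sum, combined_mean)
```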
AM
Advantages:
Easily computed
Readily understood
Has almost all the properties of a good measure of central tendency.
Disadvantages:
Distorted by extreme values
In an open-end distribution, a midpoint value has to be assigned to the open class.
Weighted Arithmetic mean
Arithmetic mean gives equal importance (or weight) to each observation. In some cases, all observations do not have the same importance
Useful in problems relating to construction of index
numbers.
P33,34
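As a rough sketch, a weighted arithmetic mean multiplies each observation by its weight and divides by the total weight. The price relatives and weights below are hypothetical, in the spirit of an index-number calculation.

```python
def weighted_mean(values, weights):
    """Sum of (value * weight) divided by the sum of weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical price relatives for three commodities and their weights.
price_relatives = [110, 120, 150]
weights = [5, 3, 2]
index_value = weighted_mean(price_relatives, weights)
print(index_value)
```

With equal weights the result reduces to the ordinary arithmetic mean.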
Median
Divides the distribution into two equal parts.
50% of the observations in distribution are above the
value of median -------
The median is the value of the middle observation
when the series is arranged in
P34,35
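A minimal sketch of locating the middle observation. It assumes the series is sorted in ascending order and, for an even number of observations, takes the average of the two middle values, which is the usual convention.

```python
def median(values):
    """Value of the middle observation of the ordered series."""
    ordered = sorted(values)  # assumption: ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 1, 5]))
print(median([7, 3, 9, 1]))
```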
Mathematical Property of Median
Sum of absolute deviations about the median is minimum
Advantages:
Easy to determine and easy to explain
Affected by the number of observations and not by the value of each observation, hence less distorted as a representative value than AM
It may be computed for an open-end distribution
Disadvantages:
Less familiar than AM
As a positional average, its value is not determined by each and every observation.
Not capable of algebraic treatment
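The minimum property and the resistance to extreme values can be checked numerically. The data set is hypothetical and includes one deliberately extreme observation.

```python
import statistics

data = [10, 12, 13, 15, 100]  # hypothetical data with one extreme value
med = statistics.median(data)
mean = statistics.mean(data)


def sum_abs_deviations(center):
    return sum(abs(x - center) for x in data)


# The sum of absolute deviations is smaller about the median than the mean.
print(sum_abs_deviations(med), sum_abs_deviations(mean))

# Making the extreme value ten times larger leaves the median unchanged,
# while the mean shifts considerably.
stretched = [10, 12, 13, 15, 1000]
print(statistics.median(stretched), statistics.mean(stretched))
```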
Quantiles
Related positional measures of central tendency
The most familiar quantiles are
Quartiles:
Values which divide the total data into 4 equal parts. Since 3 points divide the distribution into 4 equal parts, we have 3 quartiles, denoted as Q1 (25% of observations are smaller and ----), Q2, Q3
Deciles:
Values which divide the total data into 10 equal parts. Since 9 points divide the distribution into 10 equal parts, we have 9 deciles, denoted as D1, D2----D9
Percentiles:
Values which divide the total data into 100 equal parts. Since 99 points divide the distribution into 100 equal parts, we have 99 percentiles, denoted as P1, P2----P99
P36,37
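The counts of cut points (3, 9 and 99) can be seen with Python's `statistics.quantiles`. The data are hypothetical, and the library's default interpolation rule is only one of several conventions, so individual values may differ slightly from a hand calculation using the textbook's method.

```python
import statistics

# Hypothetical set of 20 observations.
data = [12, 15, 17, 18, 21, 22, 24, 25, 27, 30,
        31, 33, 35, 36, 38, 40, 42, 45, 47, 50]

quartiles = statistics.quantiles(data, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(data, n=10)       # D1 ... D9
percentiles = statistics.quantiles(data, n=100)  # P1 ... P99

print(len(quartiles), len(deciles), len(percentiles))
# Q2, D5 and P50 all coincide with the median.
print(quartiles[1], deciles[4], percentiles[49], statistics.median(data))
```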
Locating Quantiles graphically:
To locate the median graphically, draw the less-than ogive (cumulative frequency curve).
Take the variable on the X axis and cumulative frequency on the Y axis.
Locate the N/2th observation on the Y axis.
From that point, draw a horizontal line to the cumulative frequency curve.
From where it meets the curve, draw a perpendicular to the X axis.
The point where the perpendicular meets the X axis is the median value.
In the same way, values of Q1---, D1---, P1---, etc. can be found.
p38
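Reading a value off a less-than ogive drawn with straight segments is equivalent to linear interpolation within the class that contains the target cumulative frequency. The sketch below follows the steps above under that assumption; the class boundaries and frequencies are hypothetical.

```python
def quantile_from_ogive(boundaries, frequencies, fraction):
    """Interpolate the value at which the less-than cumulative
    frequency reaches fraction * N (0.5 gives the median)."""
    total = sum(frequencies)
    target = fraction * total  # position on the Y axis, e.g. N/2
    cumulative = 0
    for i, freq in enumerate(frequencies):
        if freq > 0 and cumulative + freq >= target:
            lower, upper = boundaries[i], boundaries[i + 1]
            # Horizontal line meets the ogive inside this class;
            # drop a perpendicular to the X axis.
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    return boundaries[-1]


# Hypothetical grouped data: classes 0-10, 10-20, ..., 40-50.
boundaries = [0, 10, 20, 30, 40, 50]
frequencies = [5, 8, 12, 9, 6]

median_value = quantile_from_ogive(boundaries, frequencies, 0.5)
q1_value = quantile_from_ogive(boundaries, frequencies, 0.25)
print(median_value, q1_value)
```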
MODE
Most commonly observed value in a set of data-----
P39
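For ungrouped data, the most commonly observed value can be found by counting occurrences. This is a minimal sketch with hypothetical data; it returns every value tied for the highest count, since a data set may have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 3, 5, 7, 3, 5]))
print(modes([1, 1, 2, 2, 3]))
```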
Locating the mode graphically
Construct a histogram
p40
Relationship among Mean, Median
and Mode
A distribution in which mean, median and mode coincide is known as a symmetrical (bell-shaped) distribution.
If a distribution is skewed (not symmetrical), then mean, median and mode are not equal.
In a moderately skewed distribution, the distance between mean and median is approximately one third of the distance between mean and mode:
Mode = 3 Median - 2 Mean
p41
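The empirical relation follows from Mean - Median = (Mean - Mode) / 3, rearranged for the mode. A short sketch with hypothetical values of mean and median:

```python
def empirical_mode(mean, median):
    """Approximate mode of a moderately skewed distribution,
    from Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


mean, median = 50, 48  # hypothetical summary values
mode = empirical_mode(mean, median)
print(mode)
# The mean-median distance is one third of the mean-mode distance.
print(mean - median, (mean - mode) / 3)
```

The relation is only an approximation and should not be relied on for strongly skewed data.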
Geometric Mean
Geometric mean, like arithmetic mean, is a calculated average.
Very useful in averaging ratios and percentages.
Also useful in determining the rate of increase or decrease.
Also capable of further algebraic treatment.
GM is more difficult to compute and interpret.
Cannot be computed if any observation is zero or negative.
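A sketch of the geometric mean as the n-th root of the product of n observations, computed through logarithms. The growth factors are hypothetical (yearly changes of +10%, +20% and -5%), and the function name is illustrative.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive observations."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical yearly growth factors: +10%, +20%, -5%.
growth_factors = [1.10, 1.20, 0.95]
gm = geometric_mean(growth_factors)
average_rate = gm - 1  # average rate of increase per year
print(gm, average_rate)
```

Compounding the geometric mean over the three years reproduces the overall growth, which is why it is preferred over the arithmetic mean for rates of change.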
Harmonic Mean
A measure of central tendency for data expressed as rates (km/hr, tonnes/day, km/litre)
Defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations.
Harmonic mean, like arithmetic mean and geometric mean, is computed from each and every observation.
It is specially used for averaging rates.
Cannot be computed when one or more observations have zero value or when there are both positive and negative observations.
Rarely used in dealing with business problems.
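The definition can be sketched directly. The example assumes a vehicle covering two equal distances at hypothetical speeds of 40 km/hr and 60 km/hr; the average speed for the whole trip is the harmonic mean, not the arithmetic mean.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("undefined when an observation is zero")
    if min(values) < 0 < max(values):
        raise ValueError("not used with both positive and negative values")
    return len(values) / sum(1 / v for v in values)


speeds = [40, 60]  # km/hr over two equal distances (hypothetical)
average_speed = harmonic_mean(speeds)
print(average_speed)
```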
Measure of Variation (Dispersion)
A measure of variation (dispersion) describes the
spread or scattering of the individual values around
the central value.
Illustration (p47)
Significance of Measuring variation
1. Determines the reliability of an average by pointing out how far the average is representative of the entire data.
2. Determines the nature and cause of variation in order to control the variation itself.
3. Enables comparison of two or more distributions with regard to their variability.
4. Measuring variability is of great importance to advanced statistical analysis (such as sampling or statistical inference).
Properties of a Good measure of variation
Should possess, as far as possible same properties as
those of a good measure of central tendency.
Some of the well-known measures of variation which provide a numerical index of the variability of the given data are:
Range
Average or mean deviation
Quartile Deviation or Semi-Interquartile range
Standard deviation
Absolute and Relative measures of variation
Measures of Absolute variation are expressed in
terms of the original data.
When two sets of data are expressed in different units of measurement, the absolute measures of variation are not comparable. In such cases measures of relative variation are used.
They are also used when comparing two sets of data having the same unit of measurement but different means.
Range
Difference between the highest (numerically largest) value and the lowest value in a set of data.
R = H - L
Range is very easy to calculate and gives us some idea about the variability of data.
However, the range is a crude measure of variation, as it uses only two extreme values.
The concept of range is utilized in statistical quality control (SQC) and in studying variations in prices of shares, debentures and other commodities that are very sensitive to price changes from one period to another. It is also a good indicator in weather forecasts.
For grouped data, the range may be approximated as the difference between the upper limit of the largest class and the lower limit of the lowest class.
The relative measure corresponding to range, called the coefficient of range, is obtained by applying the formula (H - L) / (H + L).
P48,49
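A short sketch of the range and its relative counterpart. The share prices are hypothetical, and the coefficient of range is taken in its standard textbook form (H - L) / (H + L), which is an assumption here because the slide's own formula did not survive.

```python
def value_range(values):
    """R = H - L, the highest value minus the lowest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure: (H - L) / (H + L)."""
    highest, lowest = max(values), min(values)
    return (highest - lowest) / (highest + lowest)


share_prices = [42, 45, 39, 50, 48]  # hypothetical daily closing prices
print(value_range(share_prices))
print(coefficient_of_range(share_prices))
```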
Quartile deviation or
Semi-interquartile range
Computed as half the difference between the third quartile and the first quartile: QD = (Q3 - Q1) / 2.
The relative measure corresponding to quartile deviation is called the coefficient of quartile deviation: (Q3 - Q1) / (Q3 + Q1).
QD is superior to range as it is not based on two extreme values, but rather on the middle 50% of observations.
Another advantage of QD is that it is the only measure of variability which can be used for an open-end distribution.
The disadvantage is that it ignores the first and last 25% of observations.
P49,50
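A minimal sketch of the quartile deviation and its coefficient, using the standard forms (Q3 - Q1) / 2 and (Q3 - Q1) / (Q3 + Q1). The data are hypothetical, and the quartiles come from `statistics.quantiles`, whose default interpolation rule may differ slightly from the textbook's.

```python
import statistics


def quartile_deviation(q1, q3):
    """Semi-interquartile range: half the distance between Q3 and Q1."""
    return (q3 - q1) / 2


def coefficient_of_qd(q1, q3):
    """Relative measure corresponding to the quartile deviation."""
    return (q3 - q1) / (q3 + q1)


data = list(range(1, 12))  # hypothetical observations 1, 2, ..., 11
q1, q2, q3 = statistics.quantiles(data, n=4)
qd = quartile_deviation(q1, q3)
coeff_qd = coefficient_of_qd(q1, q3)
print(q1, q3, qd, coeff_qd)
```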
Average Deviation
or Mean Deviation
Is an improvement over the previous two measures in that it considers all observations in the given set of data.
This measure is computed as the mean of the absolute deviations from the mean or the median.
All deviations are treated as positive regardless of sign.
Theoretically, there is an advantage in taking the deviations from the median, because the sum of absolute deviations from the median is minimum. However, in actual practice, the arithmetic mean is more popular.
The relative measure corresponding to the average deviation, called the coefficient of average deviation, is obtained by dividing the average deviation by the particular average (mean or median) used in computing it.
p51
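The calculation can be sketched for both choices of average. The data are hypothetical; the function takes the chosen average as a parameter so the same code serves deviations from the mean and from the median.

```python
import statistics


def average_deviation(values, center):
    """Mean of the absolute deviations from the chosen average."""
    return sum(abs(x - center) for x in values) / len(values)


data = [2, 4, 6, 8, 30]  # hypothetical observations
mean = statistics.mean(data)
median = statistics.median(data)

ad_mean = average_deviation(data, mean)
ad_median = average_deviation(data, median)

# Coefficient of average deviation: AD divided by the average used.
coeff_mean = ad_mean / mean
coeff_median = ad_median / median
print(ad_mean, ad_median, coeff_mean, coeff_median)
```

The deviation about the median is never larger than the deviation about the mean, which reflects the minimum property noted above.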
Advantages and disadvantages
(of Average Deviation)
Though a good measure of variability, its use is limited.
If the aim is only to measure and compare variability among several sets of data, the AD may be used.
Its major disadvantage is its lack of mathematical properties, chiefly because ignoring signs in its calculation makes it algebraically inconsistent.
Standard Deviation
Most widely used and important measure of variation.
In computing the average deviation, the signs are ignored. The standard deviation overcomes this problem by squaring the deviations, which makes them all positive.
The standard deviation is also known as the root mean square deviation.
The square of the standard deviation is called the variance.
The standard deviation and variance become larger as the variability or spread within the data becomes greater.
It is readily comparable with other standard deviations: the greater the standard deviation, the greater the variability.
The standard deviation is commonly used to measure variability, while other measures have special uses.
It is the only measure possessing the necessary mathematical properties to make it useful for advanced statistical work.
p53
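A sketch of the root mean square deviation, dividing by the number of observations N. The data are hypothetical; for sample data many textbooks divide by N - 1 instead, which is not shown here.

```python
import math


def variance(values):
    """Mean of the squared deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Square root of the variance (root mean square deviation)."""
    return math.sqrt(variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
print(variance(data), standard_deviation(data))
```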
Coefficient of Variation (C.V)
Frequently used relative measure of variation.
This measure is simply the ratio of the standard deviation to the mean, expressed as a percentage.
p54
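A small sketch comparing two hypothetical series measured in different units. It uses the standard deviation with divisor N; the variable names and figures are illustrative only.

```python
import math


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, as a percentage."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return math.sqrt(variance) / mean * 100


weights_kg = [50, 55, 60, 65, 70]       # hypothetical weights
heights_cm = [150, 160, 170, 180, 190]  # hypothetical heights

cv_weight = coefficient_of_variation(weights_kg)
cv_height = coefficient_of_variation(heights_cm)
print(cv_weight, cv_height)
```

In this example the heights have the larger standard deviation in absolute terms, yet the weights are relatively more variable, which is the kind of comparison the C.V. is meant for.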
Skewness
The measure of central tendency and variation do
not reveal all characteristics of a given set of data
Two distributions having same mean and Std
deviation, may differ widely in the shape of their
distribution.
The distribution of data may be symmetrical or not (asymmetrical or skewed).
Thus skewness refers to the lack of symmetry in a distribution.
Method of detection of skewness is to consider the tail of distribution
Symmetrical distribution:
No extreme values in a particular direction, so that low and high values balance each other.
Mean = median = mode
Negatively skewed distribution:
Longer tail towards lower values, or the left-hand side; the skewness is negative. The mean is decreased by some extremely low values.
Positively skewed distribution:
Longer tail towards higher values, or the right-hand side; the skewness is positive. The mean is increased by some unusually high values.
p55
Relative skewness
In order to make comparisons between the skewness in two or more distributions, the coefficient of skewness is used (Karl Pearson's method, Bowley's method).
In practice the value of the coefficient of skewness, SK, usually lies between -1 and +1.
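The two methods named above can be sketched with their standard textbook formulas, which are assumed here because the slide does not state them: Karl Pearson's median-based coefficient 3 (Mean - Median) / SD, and Bowley's quartile-based coefficient (Q3 + Q1 - 2 Q2) / (Q3 - Q1). The data sets are hypothetical.

```python
import statistics


def pearson_skewness(values):
    """Karl Pearson's coefficient: 3 * (mean - median) / SD."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)  # divisor N
    return 3 * (mean - median) / sd


def bowley_skewness(values):
    """Bowley's coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 20]    # a few unusually high values
left_tail = [1, 17, 18, 19, 20]  # a few extremely low values

for series in (symmetric, right_tail, left_tail):
    print(pearson_skewness(series), bowley_skewness(series))
```

Both coefficients come out as zero for the symmetric series, positive for the series with a longer right tail and negative for the series with a longer left tail.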