Business+Statistics

  • 7/28/2019 Business+Statistics

    1/123

    Business Statistics

  • 7/28/2019 Business+Statistics

    2/123

    Contents

    1. Meaning and Scope

    2. Collection of Data

    3. Classification and tabulation

    4. Diagrammatic and Graphic Representation

    5. Averages

    6. Dispersion

    7. Skewness and Kurtosis

    8. Correlation

    9. Linear Regression Analysis

    10. Index Numbers

    11. Time series Analysis

    12. Theory of Probability

    13. Random Variable, Probability Distribution and Mathematical expectation

    14. Theoretical Distributions

    15. Sampling Theory and Design of sample Surveys

    16. Interpolation and Extrapolation

  • 7/28/2019 Business+Statistics

    3/123

    Quantitative Decision Making

  • 7/28/2019 Business+Statistics

    4/123

    Learning Objectives

    Basic Statistics and its application in the day-to-day life of a Manager

    Various aspects of quantitative techniques and their application in Decision making

    Also frequently used models of Statistical analysis

    Understand:

    Complexity of Managerial decisions

    Quantitative Techniques

    Need of using Quantitative approach in decisions

    Role of statistical methods in data analysis

    Brief idea of various statistical methods

    Know the areas of application of the quantitative approach in business and management.

  • 7/28/2019 Business+Statistics

    5/123

    Introduction

    Individual business prior to the Industrial Revolution and need for info: decisions based on past experience and intuition.

    Marketing of products

    Test marketing of products

    The manager (also the owner)

    Progress of work

    Any other fact the owner needed to know

  • 7/28/2019 Business+Statistics

    6/123

    Intuition alone has no place in decision making

    Becomes highly questionable when decisions involve the choice among several courses of action,

    each of which can achieve several management objectives simultaneously.

  • 7/28/2019 Business+Statistics

    7/123

    Statistical methods used in

    Marketing, Finance, Production and Personnel

    Also in:

    Regional planning

    Transportation

    Public health

    Communication

    Military

    agriculture

  • 7/28/2019 Business+Statistics

    8/123

    QT: A group of statistical and OR (programming) techniques

    QT approach in decision making:

    Problems are defined, analyzed and solved in a conscious, rational, systematic, scientific manner based on:

    Data, facts, info, and logic (and not whims and guesses)

    QT provides the decision maker a scientific method based on quantitative data for identifying a course of action to achieve the optimal value of the predetermined objective or goal.

    Numbers, symbols or mathematical formulae are used to represent the models of reality.

  • 7/28/2019 Business+Statistics

    9/123

    Statistics in different senses

    Statistical Data

    Numerical or quantitative aspects

    Statistical Methods

    Collect, organize/classify, present, analyze and interpret

  • 7/28/2019 Business+Statistics

    10/123

    Functions of Statistical Methods

    Data Collection

    Organize: segregate/condense

    Presentation: orderly manner: graphs/charts

    Analysis

    Interpretation

    examples

  • 7/28/2019 Business+Statistics

    11/123

    Statistics:

    Characteristics of Data: It is common to refer to data in quantitative form as Data.

    Not all numerical data is statistical.

    For a numerical description to be statistics:

    Aggregate of facts

    Affected to a marked extent by multiplicity of causes

    (controllable/uncontrollable)

    Enumerated or estimated according to reasonable standard of

    accuracy.

    Collected in a systematic manner for a pre-determined purpose.

    Placed in relation to each other

    Numerically expressed

  • 7/28/2019 Business+Statistics

    12/123

    Types of Statistical data

    Secondary

    Primary

  • 7/28/2019 Business+Statistics

    13/123

    OR: a mathematical model to represent the situation under study.

    Helps either to:

    Predict the performance of a system

    Or determine the action or control needed to optimize the performance.

  • 7/28/2019 Business+Statistics

    14/123

    Classification of Statistical Methods into three categories

    Descriptive Statistics

    Data Collection

    Presentation

    Inductive statistics

    Statistical inference

    Estimation

    Statistical decision Theory

    Analysis of business Decision

  • 7/28/2019 Business+Statistics

    15/123

    Descriptive Statistics

    Used for re-arranging, grouping, and summarizing

    sets of data

    Changes in price index,

    Yield of wheat, using different charts and graphs

    having large quantities of numerical data for easy understanding

    Various types of averages, central tendency and dispersion, trends, index numbers.

  • 7/28/2019 Business+Statistics

    16/123

    Inductive Statistics

    The development of some criteria which can be used to derive info about the nature of the entire population

    or universe from the nature of a small sample.

    Include: probability, probability distribution, sampling and sampling distribution,

    various methods of testing hypothesis: correlation, regression, factor analysis, time series analysis.

  • 7/28/2019 Business+Statistics

    17/123

    Statistical Decision Theory: 4 different states of decision environment

    State of decision and Consequence

    Certainty: Deterministic

    Risk: Probabilistic

    Uncertainty: Unknown

    Conflict: Influenced by an opponent

    Subjective approach (uses probabilities)

    Also known as Bayesian approach

  • 7/28/2019 Business+Statistics

    18/123

    Models in OR

    Based on Purpose:

    Descriptive: behavior of a system (behavior of demand of an inventory item)

    Explanatory: explain behavior with relationships (wages, promotion policy)

    Predictive: predict stock prices for any given level of earnings per share.

    Prescriptive (normative): norms for comparison of alternate solutions (allocation).

    Based on Degree of Abstraction: Physical, Graphic, Schematic, Analog, Mathematical

    Based on Degree of certainty and risk:

    Deterministic: linear programming, transportation and assignment models

    Probabilistic: simulation models, decision theory

    Based on Specified behavior characteristics: Static, Dynamic, Linear, Non-linear

    Based on Procedure (method) of solution: Analytical, Simulation

  • 7/28/2019 Business+Statistics

    19/123

    Classification of models helps in understanding the nature and role of models

    Abstract or Physical

    Static: linear programming

    Dynamic model

    Linear or non-linear

    Stable, unstable

    Unstable (constrained)

    Unstable (explosive)

    Transient steady state,

    Transient (non-existent)

    Ref:

  • 7/28/2019 Business+Statistics

    20/123

    Various Statistical Techniques

    Measure of Central tendency

    Measure of Dispersion:

    Correlation

    Regression analysis:

    Time Series Analysis

    Index Numbers

    Sampling and Statistical Inference

  • 7/28/2019 Business+Statistics

    21/123

    Measure of Central tendency

    Mean: common arithmetic average

    Divide the sum of the values of observations by the number of items observed.

    Median:

    The item that lies exactly half way through the observations when they are arranged in ascending/descending order.

    Not affected by extreme values of observations

    Divides the number of households into two equal parts.

    (50% of all households have income below median income)

    Mode:

    Category that has the maximum number of observations (the one that occurs most frequently)
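
    The three measures above can be sketched with Python's standard library. The income figures are hypothetical and chosen only to show how one extreme value pulls the mean but leaves the median and mode where they are.

```python
from statistics import mean, median, mode

# Hypothetical household incomes (illustrative values, not from the slides)
incomes = [12, 15, 15, 18, 20, 22, 95]

average = mean(incomes)      # sum of observations / number of observations
middle = median(incomes)     # middle item of the sorted observations
most_common = mode(incomes)  # value that occurs most frequently

print(average, middle, most_common)
```

    Here the single large income (95) raises the mean well above the median, which is why the median is often preferred for income data.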

  • 7/28/2019 Business+Statistics

    22/123

    Measure of Dispersion:

    spread away from central tendency (mean/mode/median):

    Range, mean deviation, standard deviation.

    Whether the data spread in a symmetrical or asymmetrical pattern: skewness

    Peakedness of the frequency distribution: measure called kurtosis
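
    A minimal sketch of the three dispersion measures named above, assuming a small hypothetical data set and the population form of the standard deviation (divide by n); the mean deviation is taken about the mean.

```python
from statistics import mean, pstdev

values = [4, 8, 6, 5, 7]  # hypothetical observations
center = mean(values)

# Range: highest value minus lowest value
data_range = max(values) - min(values)

# Mean deviation: average absolute distance from the mean
mean_deviation = sum(abs(v - center) for v in values) / len(values)

# Standard deviation (population form): root of the mean squared deviation
std_dev = pstdev(values)

print(data_range, mean_deviation, std_dev)
```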

  • 7/28/2019 Business+Statistics

    23/123

    Correlation

    Dependent variable associated with changes in another, independent variable.

    Sales as dependent variable and advertising budget as an independent one.

    Could be casual or causal relationships
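
    One common way to quantify such an association is Pearson's correlation coefficient. The slides do not name a specific coefficient, so this choice is an assumption, and the sales and advertising figures are hypothetical.

```python
from math import sqrt

advertising = [10, 20, 30, 40, 50]  # independent variable (hypothetical)
sales = [25, 45, 65, 85, 105]       # dependent variable (hypothetical)

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(advertising, sales)
print(r)
```

    A value near +1 or -1 indicates a strong linear association; by itself it does not show whether the relationship is causal.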

  • 7/28/2019 Business+Statistics

    24/123

    Regression analysis:

    determining causal relationship between two variables

    Use of multivariate statistical techniques for determining causal relationships involving two or more variables:

    Multiple regression analysis, discriminant analysis, factor analysis
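
    For the two-variable case, a straight line fitted by least squares is the usual starting point. The sketch below assumes that method (the slides do not specify one) and uses hypothetical data; `fit_line` is an illustrative helper name.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

x = [1, 2, 3, 4, 5]    # independent variable (hypothetical)
y = [3, 5, 7, 9, 11]   # dependent variable (hypothetical)
slope, intercept = fit_line(x, y)

# Predict the dependent variable for a new value of x
predicted = intercept + slope * 6
print(slope, intercept, predicted)
```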

  • 7/28/2019 Business+Statistics

    25/123

    Time Series Analysis

    A set of data (arranged in some desired manner) recorded either at successive points in time or over successive periods of time.

    The changes are considered the resultant of the combined effect of several forces

    The force components:

    Editing time series data

    Secular trend

    Periodic changes (cyclical/seasonal variations)

    Irregular or random variations.

    Cost of living, growth of agricultural/food production, seasonal requirements of items, impact of war, strikes

  • 7/28/2019 Business+Statistics

    26/123

    Index Numbers: a relative number representing the net result of change in a group of variables

    Stated in percentages

    given or current year, and base year

    production, sales price, volume of employment
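
    As an illustration of comparing a current year with a base year, the sketch below computes a simple aggregate price index. The choice of this particular index and the prices are assumptions made for the example.

```python
# Hypothetical prices of the same items in the base year and the current year
base_prices = {"wheat": 20.0, "rice": 30.0, "sugar": 50.0}
current_prices = {"wheat": 25.0, "rice": 33.0, "sugar": 62.0}

def simple_aggregate_index(current, base):
    """Total of current-year prices as a percentage of base-year prices."""
    return sum(current.values()) / sum(base.values()) * 100

index = simple_aggregate_index(current_prices, base_prices)
print(round(index, 1))
```

    An index above 100 means the group of prices has risen relative to the base year; below 100 means it has fallen.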

  • 7/28/2019 Business+Statistics

    27/123

    Sampling and Statistical Inference

    Sampling for reasons

    Schemes for drawing samples are classified as:

    Random Sampling Schemes

    Every element has an equal chance (probability) of being selected

    Non-random sampling schemes

    Drawing samples based on choice or purpose of selectors

    Sampling analysis using various tests :

    Z normal distribution

    Student's t distribution,

    F distribution

    χ² (chi-square) distribution

  • 7/28/2019 Business+Statistics

    28/123

    Advantages to Management

    Definiteness

    Condensation

    Comparison

    Formulation of policies

    Formulating and testing hypothesis

    Prediction

  • 7/28/2019 Business+Statistics

    29/123

    Application of techniques in Business and Management

    Management

    Marketing

    Production

    Finance, accounting and Investment

    Personnel

    Economics

    Research and Development

    Natural science

  • 7/28/2019 Business+Statistics

    30/123

    Marketing

    Marketing research info

    Building and maintaining an extensive

    market

    Sales forecasting

  • 7/28/2019 Business+Statistics

    31/123

    Production

    PPC and analysis

    Machine performance evaluation

    QC

    Inventory control

  • 7/28/2019 Business+Statistics

    32/123

    Finance, accounting and Investments

    Financial forecast, budget preparation

    Fin Investment decision

    Selection of securities

    Auditing function

    Credit policies, credit risk, delinquent

    account

  • 7/28/2019 Business+Statistics

    33/123

    Personnel

    Labour turnover rate

    Employment trends

    Performance appraisal

    Wage rates and incentive plans

  • 7/28/2019 Business+Statistics

    34/123

    Economics

    Measurement of Gross National Product and input-output analysis

    Determination of business cycles, seasonal fluctuations

    Comparison of market price, cost and profit of individual firm

    Analysis of population

    Operational studies of Public utilities

    Formulation of appropriate economic policies and

    evaluation of their effects

  • 7/28/2019 Business+Statistics

    35/123

    Research and Development

    Development of new product lines

    Optimal use of resources

    Evaluation of existing products

  • 7/28/2019 Business+Statistics

    36/123

    Natural science

    Diagnosing based on inputs

    Efficacy of certain drugs

    Study of plant life

  • 7/28/2019 Business+Statistics

    37/123

    Exercise/ Assignments

    1. Comment on the statement: Statistics are numerical statements of facts, but all facts numerically stated are not statistics

    2. Explain the distinction between: Descriptive and Prescriptive models

    Presentation topic:

    1. Formulate a business problem and analyze it by applying the major phases of statistics

  • 7/28/2019 Business+Statistics

    38/123

    Functions and Progressions

  • 7/28/2019 Business+Statistics

    39/123

    Learning Objectives:

    Insight into different aspects of the types of functional

    relationships among business variables

    Their applications in various fields of management

    Need to Identify/define relationships among business

    variables

    Define functional relationships

    Various types of functional relationships

    Use of graph to depict functional relationships

    Managerial applicability

    Progression and application..

  • 7/28/2019 Business+Statistics

    40/123

    Introduction

    For decision problems which use mathematical tools, the first requirement is to identify or formally define all significant interactions or relationships among primary factors (also called variables). The relationships are usually stated in the form of an equation or inequation.

    Study mathematical problems in the context of managerial problems

  • 7/28/2019 Business+Statistics

    41/123

    Definitions

    Variables: A variable is something whose magnitude can vary or which can assume various values. Represented by symbols (first letter of the name)

    Discrete variable: subject to counting (houses, machines)

    Continuous variable: subject to measurement (temp, height)

    Constants and Parameters:

    A constant: Remains fixed in the context of a given problem or situation

    An Absolute (or numerical) Constant retains the same value in all problems

    The absolute (or numerical) value of b is denoted by |b| regardless of its algebraic sign: |b| = |-b|

    An Arbitrary (or parametric) constant or parameter retains the same value throughout any particular problem, but may assume different values in different problems

    P21 (ex1)

  • 7/28/2019 Business+Statistics

    42/123

    Types of Function

    Linear Functions:

    The power of the independent variable is 1

    A function with only one independent variable is called a single variable function. (P21(1))

    A single variable function can be linear or non-linear. (p 22)

    A linear function with one variable can always be graphed in a two dimensional plane (or space). The graph of such a function is always a straight line.

    (P22 ex2)

    Polynomial functions:

    Polynomial function of degree 1 is called a linear function

    Polynomial function of degree 2 is called a quadratic function (p23-ab)

    Absolute Value Functions: (p23(3))

    Inverse Function: (P 23)

    Step function: For different values of an independent variable x in an interval, the dependent variable y=f(x) takes a constant value, but takes different values in different intervals. (p24-5)

    Algebraic and Transcendental functions

  • 7/28/2019 Business+Statistics

    43/123

    Activity

    P 25 activity B -1a&b assignment

  • 7/28/2019 Business+Statistics

    44/123

    Business Application

    Linear Function (P27-ex3 assignment)

    Quadratic function (P27-ex4 assignment)

    Activity D (Page 28-b assignment)

  • 7/28/2019 Business+Statistics

    45/123

    Sequence and Series

    If for every positive integer, n, -------- related to some number ----- sequence

    Installment buying,

    simple and compound interest problems

    Annuities and present values

    Mortgage payments

  • 7/28/2019 Business+Statistics

    46/123

    Arithmetic progression (AP)

    Arithmetic progression: A sequence whose terms increase or decrease by a constant number, called the common difference of the AP and denoted by d

    P29 ex6 assignment
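
    A small sketch of an AP using the standard formulas for the n-th term, a + (n - 1)d, and the sum of the first n terms, n(2a + (n - 1)d)/2. The function names and the first term a are illustrative; the slides only define d.

```python
def ap_term(a, d, n):
    """n-th term of an AP with first term a and common difference d."""
    return a + (n - 1) * d

def ap_sum(a, d, n):
    """Sum of the first n terms of the AP."""
    return n * (2 * a + (n - 1) * d) / 2

# Example: 5, 8, 11, 14 (a = 5, d = 3)
terms = [ap_term(5, 3, n) for n in range(1, 5)]
total = ap_sum(5, 3, 4)
print(terms, total)
```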

  • 7/28/2019 Business+Statistics

    47/123

    Geometric progression (GP)

    A geometric progression: A sequence whose terms increase or decrease by a constant ratio, called the common ratio of the GP and denoted by r

    P29 ex7 assignment

    P31 ex 8
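
    A matching sketch for a GP using the standard formulas: n-th term a·r^(n-1) and sum of the first n terms a(r^n - 1)/(r - 1) when r is not 1. The function names, the symbol r for the common ratio, and the numbers are illustrative assumptions; compound interest is a typical case, since each year's balance is the previous one multiplied by a constant ratio.

```python
def gp_term(a, r, n):
    """n-th term of a GP with first term a and common ratio r."""
    return a * r ** (n - 1)

def gp_sum(a, r, n):
    """Sum of the first n terms of the GP."""
    if r == 1:
        return a * n  # every term equals a
    return a * (r ** n - 1) / (r - 1)

# Example: 2, 6, 18, 54 (a = 2, r = 3)
terms = [gp_term(2, 3, n) for n in range(1, 5)]
total = gp_sum(2, 3, 4)
print(terms, total)
```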

  • 7/28/2019 Business+Statistics

    48/123

    Concept of Maxima and Minima with managerial applications

    Page 55 ex18 assignment

  • 7/28/2019 Business+Statistics

    49/123

    Descriptive Statistics

    Data Collection and analysis

  • 7/28/2019 Business+Statistics

    50/123

    Contents

    Collection of data:

    Need and significance of data collection

    Primary and secondary data

    Different methods of collecting primary data

    Edit primary data and know sources of secondary data and its use

    Census versus sample

    Classification and presentation of collected data

    Treatment of data through central tendency measurements,

    Deviations and different measures of variation.

  • 7/28/2019 Business+Statistics

    51/123

    Introduction

    The need for data collection

    Statistical data is a set of facts expressed in quantitative form.

    The use of facts expressed as measurable quantities can help a decision maker to arrive at better decisions.

  • 7/28/2019 Business+Statistics

    52/123

    Primary and Secondary Data

    Distinguish between Primary and------

  • 7/28/2019 Business+Statistics

    53/123

    Methods of collecting Primary Data

    Observation

    Questionnaire

    Personal interview

    Mail

    Telephone

    Designing/Preparing questionnaire

    Pre-testing a questionnaire

    Editing the primary data.

  • 7/28/2019 Business+Statistics

    54/123

    Important points in Designing a questionnaire

    Covering letter

    Number of questions to be kept to a minimum (15-40)

    Simple, short, and unambiguous

    Sensitive and personal questions to be avoided

    Answers to questions should not require calculations

    Logical arrangement

    Crosscheck and footnotes

  • 7/28/2019 Business+Statistics

    55/123

    Editing Primary Data to ensure:

    completeness

    Consistency

    Accuracy

    Homogeneity

  • 7/28/2019 Business+Statistics

    56/123

    Sources of secondary data

    Published Sources

    Unpublished Sources

  • 7/28/2019 Business+Statistics

    57/123

    Precautions in use of secondary Data

    Because of bias, inadequate sample size, errors of definitions, computational errors

    Hence to consider:

    Suitability

    Reliability

    Adequacy

  • 7/28/2019 Business+Statistics

    58/123

    Census (complete enumeration) and Sample

    Advantages and disadvantages of census

    (Physical destruction)

  • 7/28/2019 Business+Statistics

    59/123

    Exercises/Assignments

    1. Distinguish between Primary and Secondary data. Indicate the situations in which each of these----?

    2. Distinguish between census and sampling methods of data collection. Compare merits/demerits. Why is sampling unavoidable in certain situations?

  • 7/28/2019 Business+Statistics

    60/123

    Presentation of Data

  • 7/28/2019 Business+Statistics

    61/123

    Presentation of Data

    Learning objectives

    Understand the need and significance of presentation of data

    Necessity of classifying data and various types of classification

    Construct frequency distribution of discrete and continuous data

    Frequency distribution in the form of: bar diagrams, histograms, frequency polygon, and ogives

    Classification

    Discrete frequency distribution

    Continuous frequency distribution

    Choosing the classes

    Cumulative and Relative frequencies

    Charting data

  • 7/28/2019 Business+Statistics

    62/123

    Introduction

    After understanding the various ways of data collection:

    The successful use of data collected depends on:

    The manner in which it is arranged, displayed and summarized.

    Data can be presented either in tabular form or through charts

    In tabular form, it is necessary to classify the data before the data is tabulated. Hence to understand:

    classification ,

    tabulation and

    charting of data.

  • 7/28/2019 Business+Statistics

    63/123

    Classification of data

    After the data has been systematically collected and edited,

    The first step in presentation of data is Classification

    Classification is the process of arranging the data according to points of similarities and dissimilarities

  • 7/28/2019 Business+Statistics

    64/123

    Principal objectives of classification

    To condense the mass of data in such a way that

    salient features can be easily noticed

    To facilitate comparisons between attributes of

    variables

    To prepare data to be presented in tabular form

    To highlight significant features of data at a glance

  • 7/28/2019 Business+Statistics

    65/123

    Some Common Types of Classification

    Geographical Classification: Production of wheat state-wise

    Chronological Classification: Sales figures of a company for last six years

    Qualitative Classification

    Dichotomous Classification: An attribute divided into two classes, one possessing and the other not possessing it (basis of employment)

    Manifold Classification: divided into several classes (educational level)

    Quantitative Classification: according to characteristics that can be measured (employees as per monthly salaries)

    Discrete: limited to certain numerical values of a variable

    Continuous: takes all values of the variable

  • 7/28/2019 Business+Statistics

    66/123

    Examples

    Chronological classification

    Discrete frequency distribution

    Continuous frequency distribution

    P14,15

  • 7/28/2019 Business+Statistics

    67/123

    Construction of a Discrete Frequency

    distribution

    Place all possible values of the variable in ascending order in one column

    Then prepare another column of tally marks to count the number of times a particular value of the variable is repeated

    To facilitate counting, use blocks of 5 tally marks with a space left in between blocks

    The frequency column refers to the number of tally marks a particular class will contain

    p15
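
    The tally procedure above can be sketched as follows. The data (number of children per family) are hypothetical, and `tally_marks` is an illustrative helper that groups marks in blocks of five.

```python
from collections import Counter

# Hypothetical observations of a discrete variable
children = [2, 0, 1, 2, 3, 1, 2, 0, 2, 1]

def tally_marks(count):
    """Tally marks in blocks of 5, with a space between blocks."""
    blocks = ["|||||"] * (count // 5)
    if count % 5:
        blocks.append("|" * (count % 5))
    return " ".join(blocks)

frequency = Counter(children)  # value -> number of occurrences

# One row per value, in ascending order: value, tally column, frequency column
for value in sorted(frequency):
    print(value, tally_marks(frequency[value]), frequency[value])
```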

  • 7/28/2019 Business+Statistics

    68/123

    Construction of a Continuous

    Frequency distribution

    Class limits: 60-69: lower and upper limits, lowest and highest

    Class intervals: width, span or size; 20-10=10

    Class frequency: The number of observations falling within a particular class is called class frequency or frequency. Total frequency (sum of all frequencies) indicates the total number of observations considered in a given frequency distribution.

    Class mid-point: sum of the lower limits of two successive classes, divided by 2.
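
    A minimal sketch of building a continuous frequency distribution, assuming exclusive classes of width 10 (the upper limit belongs to the next class) and hypothetical marks. With exclusive classes, the mid-point is the same whether computed from a class's own limits or from two successive lower limits.

```python
marks = [12, 25, 37, 41, 18, 29, 33, 45, 22, 39]  # hypothetical observations

lowest, width, n_classes = 10, 10, 4
classes = [(lowest + i * width, lowest + (i + 1) * width) for i in range(n_classes)]

# Class frequency: number of observations with lower <= value < upper
frequency = {c: 0 for c in classes}
for m in marks:
    for lower, upper in classes:
        if lower <= m < upper:
            frequency[(lower, upper)] += 1
            break

# Class mid-point of each class
midpoints = {(lower, upper): (lower + upper) / 2 for lower, upper in classes}

total_frequency = sum(frequency.values())
print(frequency, midpoints, total_frequency)
```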

  • 7/28/2019 Business+Statistics

    69/123

    Assignments

    1. What do you understand by classification of data?

    2. Why is classification of data required?

    3. Illustrate the difference between qualitative and quantitative data.

  • 7/28/2019 Business+Statistics

    70/123

    Types of class interval: Methods

    Exclusive and Inclusive (on whether upper limit is included or excluded) ----(p16)

    Open-end (p17)

    Generally opt for exclusive method

    But if Inclusive is suggested, minor adjustments are required to determine the class interval

    Correction factor: (lower limit of second class - upper limit of first class), divided by 2

    Deduct the correction value from the lower limit and add it to the upper limit
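
    The correction-factor adjustment can be illustrated with hypothetical inclusive classes such as 60-69, 70-79 and 80-89:

```python
# Hypothetical inclusive classes (both limits belong to the class)
inclusive = [(60, 69), (70, 79), (80, 89)]

# Correction factor: (lower limit of second class - upper limit of first class) / 2
correction = (inclusive[1][0] - inclusive[0][1]) / 2

# Deduct the correction from each lower limit and add it to each upper limit
adjusted = [(lower - correction, upper + correction) for lower, upper in inclusive]
print(correction, adjusted)
```

    After the adjustment the classes meet without gaps (59.5-69.5, 69.5-79.5, ...), and each has a width of 10.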

  • 7/28/2019 Business+Statistics

    71/123

    Guidelines for choosing the class

    The number of classes should not be too small or too large (5 to 15)

    If possible, widths of intervals should be numerically simple, like 5, 10, 25 (values like 3, 7, 9 to be avoided)

    It is desirable to have classes of equal width (classes with unequal class intervals can be formed, like in income distribution)

    The starting point of a class should be 0, 5, 10, or multiples thereof (e.g. 3-13 not allowed)

    Class interval should be determined considering the min and max values and the number of classes to be formed

    (p18)

  • 7/28/2019 Business+Statistics

    72/123

    Activity

    Distinguish between:

    1. Discrete and continuous frequency distribution

    2. Class limits and class intervals

    3. Inclusive and exclusive methods

  • 7/28/2019 Business+Statistics

    73/123

    Cumulative and Relative frequencies

    Rather than listing the actual frequency opposite each class, it may be appropriate to list either cumulative frequencies or relative frequencies or both.

    Cumulative frequencies: cumulate the frequencies, starting from either the lowest or the highest value. (p18-19)

    Relative frequencies: Very often, the frequencies in a frequency distribution are converted to relative frequencies to show the percentage for each class. The frequency of a class is divided by the total number of observations (total frequency). To get the percentage for each class, multiply the relative frequency by 100. (p19)
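
    Both conversions can be sketched from a list of class frequencies. The frequencies are hypothetical, listed from the lowest class to the highest.

```python
from itertools import accumulate

frequencies = [2, 3, 3, 2]  # hypothetical class frequencies, lowest class first
total = sum(frequencies)

# Cumulating from the lowest class upward
cumulative_from_lowest = list(accumulate(frequencies))

# Cumulating from the highest class downward
cumulative_from_highest = list(accumulate(reversed(frequencies)))[::-1]

# Relative frequency = class frequency / total frequency; times 100 gives percent
relative = [f / total for f in frequencies]
percentages = [r * 100 for r in relative]

print(cumulative_from_lowest, cumulative_from_highest, relative)
```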

  • 7/28/2019 Business+Statistics

    74/123

    Important advantages in looking at

    Relative frequencies (percentages)

    1. Facilitates a comparison of two or more sets of data.

    2. Constitutes the basis for understanding the concept of probability.

  • 7/28/2019 Business+Statistics

    75/123

    Activity

    Explain the concept of relative frequency

  • 7/28/2019 Business+Statistics

    76/123

    Charting of Data

  • 7/28/2019 Business+Statistics

    77/123

    Bar diagram

  • 7/28/2019 Business+Statistics

    78/123

    Bar diagram

    Most popular

    Example: Population, per capita income, sales and profits

    A bar is a thick line whose width is shown to attract the viewer.

    A bar diagram may be either vertical or horizontal.

    DRAWING A BAR DIAGRAM:

    Take the characteristic (or attribute) under consideration on the X-axis and the corresponding value on the Y-axis. It is desirable to mention the value depicted by the bar on top of the bar.

    The gap between one bar and the next is kept equal.

    The widths of the bars are also the same.

    The only difference is in the length of the bars.

    That is why diagrams of this type are known as one dimensional.

    (P20)

  • 7/28/2019 Business+Statistics

    79/123

    Histograms

    One of the most commonly used and easily understood

    methods of graphic representation of frequency distribution.

    A histogram is a series of rectangles having areas that are in

    the same proportion as the frequencies of a frequency

    distribution

    CONSTRUCTING HISTOGRAM:

    On the horizontal axis or X-axis, we take class limits of the variable, and on the vertical axis or Y-axis, we take frequencies of the class intervals shown on the horizontal axis

    If class intervals are of equal width, then the vertical bars are of equal width. (P20-21)

    On the other hand, if the class intervals are unequal, the frequencies have to be adjusted according to the width of the class interval (P 21-22)

  • 7/28/2019 Business+Statistics

    80/123

    Activity

    Draw a sketch of a histogram and a bar

    diagram and explain the difference between

    the two.

  • 7/28/2019 Business+Statistics

    81/123

    Frequency Polygon

    A graphical presentation of frequency distribution

    A polygon is a many sided closed figure.

    A frequency polygon is constructed by:

    taking the mid-points of the upper horizontal sides of each rectangle on the histogram and

    connecting these mid-points by straight lines.

    In order to close the polygon, an additional class is assumed at each end, having zero frequency.

    (p22-23)

    The histogram is usually associated with discrete data and a frequency polygon

    is appropriate for continuous data. (But the distinction is not always followed)

    The frequency polygon and frequency curve have a special advantage over the histogram, particularly when comparing two or more frequency distributions
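
    The construction steps for a frequency polygon can be sketched as a list of points to join, assuming equal-width classes and hypothetical frequencies. Plotting is left out; any charting tool can connect the points with straight lines.

```python
# Hypothetical equal-width classes and their frequencies
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [2, 3, 3, 2]

width = classes[0][1] - classes[0][0]
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Mid-point of the top of each histogram rectangle
points = list(zip(midpoints, frequencies))

# Close the polygon with an extra zero-frequency class at each end
polygon = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]
print(polygon)
```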

  • 7/28/2019 Business+Statistics

    82/123

    Activity

    What is the procedure for making a

    frequency polygon? Illustrate.

  • 7/28/2019 Business+Statistics

    83/123

    Ogives or Cumulative frequency Curve

    A graphical presentation of a cumulative frequency distribution.

    There are two methods:

    Less than ogive:

    The upper limits of the various classes are taken on the X-axis, and the frequencies obtained by cumulating the preceding frequencies on the Y-axis. By joining these points we get the less than ogive.

    More than ogive:

    By taking lower limits on the X-axis and cumulative frequencies on the Y-axis and joining these points we get the more than ogive.

    The shape of the less than ogive curve will be a rising one,

    whereas the shape of the more than ogive curve would be a falling one
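
    The points for both ogives can be sketched as below, using hypothetical exclusive classes. For the more than ogive, the cumulative frequency at a lower limit is assumed to count all observations in that class and above.

```python
from itertools import accumulate

# Hypothetical classes and frequencies
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [2, 3, 3, 2]

# Less than ogive: (upper limit, frequency cumulated from the lowest class)
less_than_points = list(zip((upper for _, upper in classes),
                            accumulate(frequencies)))

# More than ogive: (lower limit, frequency cumulated from the highest class)
more_than_cumulative = list(accumulate(reversed(frequencies)))[::-1]
more_than_points = list(zip((lower for lower, _ in classes),
                            more_than_cumulative))

print(less_than_points)
print(more_than_points)
```

    The y-values rise along the less than ogive and fall along the more than ogive, matching the shapes described above.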

  • 7/28/2019 Business+Statistics

    84/123

    Activity

    With the help of an example , explain the

    concept of less than ogive and more than ogive.

  • 7/28/2019 Business+Statistics

    85/123

    Types of Data

    Data refers to known facts or things used as basis for

    inference or reckoning.

    Types of Data:

    Qualitative: concerned with qualities and non-numerical

    characteristics.

    Quantitative: concerned with numerical characteristics.

    Discrete: take only one of a range of distinct values (no of employees).

    Continuous: take any value within a given range (time, length)

    (P160-161BR)

  • 7/28/2019 Business+Statistics

    86/123

    The Concept of Level of Measurements

    Scales of Measurement

    Nominal level (Classificatory/ named) Data:

    Ordinal level (Ranking/ordered) data:

    Interval level (Numerical) data

    Ratio level (Numerical) data: represent highest level of precision.

  • 7/28/2019 Business+Statistics

    87/123

    Nominal level (Classificatory/named) Data:

    And Implications for Data handling Methodologies

    Classification of data: Statements of equality or differences

    (according to variable occupation)

    Although mode could be used, very few statistics can be

    applied to data collected in this form

  • 7/28/2019 Business+Statistics

    88/123

    Ordinal level (Ranking/ordered) data:

    And Implications for Data handling

    Methodologies

    Can be classified in terms of equality or differences

    Permits you to order individual data and make decisions such as this score is greater or lesser than another. (employee grades or choices ranked)

    Since the arithmetic mean cannot be calculated, the use of many other statistics is also excluded.

  • 7/28/2019 Business+Statistics

    89/123

    Interval level (Numerical) data

    And Implications for Data handling

    Methodologies

    Has characteristics of both Nominal and Ordinal scales, but also provides additional info regarding the degree of difference between individual data items within a set or group.

    Most measures of human characteristics have interval properties. (Interval between IQ scores/assignment marks)

    However, precision in an interval scale is limited. Also some statistics, such as the geometric mean, are excluded from use with data collected in this form.

    Ratio level (Numerical) data: represents the highest level of precision.

    And Implications for Data Handling Methodologies

    A mathematical number system (height, weight, time).

    A ratio scale allows ratio as well as interval decisions (allowing us to say something is so many times as big/bright/heavy).

    Any statistic can be used on data collected in this form. (Some scales, such as temperature in Centigrade, may appear to have ratio properties but are in fact only interval scales.)

    Parametric and non-parametric methods


    (assumptions about parameters of the data)

    Associated with every data analytic method, there is a set of assumptions that underlie the use of that method.

    The t-test (used to compare the means of two samples of data) is one of the most popular parametric methods (p133-RM).

    Non-parametric methods: developed with research in the social sciences in mind. Valid for use with nominal or ordinal level data.

    Also suitable for very small samples (n less than 10), though the power of any test weakens with very small samples.


    Measures of central Tendency


    Learning objectives:

    Concept and significance of measures of central tendency.

    Computing: arithmetic mean, weighted arithmetic mean, median, mode, geometric mean, and harmonic mean.

    Computing several quantiles: quartiles, deciles, and percentiles.

    Relationships among various averages.


    Significance of measures of central tendency

    The objective is to find one representative value which can be used to locate and summarize the entire set of varying values.

    To find some central value around which the data tend to cluster.

    Average income

    Average sales figure may be compared with that of another

    Properties of a good measure of central tendency

    Easy to understand

    Simple to compute

    Based on all observations

    Uniquely defined

    Capable of further algebraic treatment

    It should not be unduly affected by extreme values.


    Important measures of central tendency commonly used by Business and Industry.

    arithmetic mean,

    weighted arithmetic mean,

    median,

    quantiles,

    mode,

    geometric mean,

    harmonic mean.


    Arithmetic Mean

    (or Mean or Average)

    In statistics, the term average refers to any of the measures of central tendency.

    The arithmetic mean is defined as the sum of the numerical values of each and every observation divided by the total number of observations.

    E.g. average monthly salary (ungrouped data).

    When observations are classified into a frequency distribution, the midpoint of a class interval is treated as the representative average value of that class.

    (P-31 .)
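
    The two cases above (ungrouped data, and a frequency distribution represented by class midpoints) can be sketched as follows. The salary figures and class intervals are made-up illustrations, not data from the source.

```python
# Hypothetical monthly salaries (ungrouped data); the figures are made up.
salaries = [18000, 22000, 25000, 30000, 55000]

# Arithmetic mean = sum of all observations / number of observations.
mean_ungrouped = sum(salaries) / len(salaries)

# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 20, 4), (20, 30, 10), (30, 40, 6)]

# Each class is represented by its midpoint, weighted by its frequency.
total_freq = sum(freq for _, _, freq in classes)
mean_grouped = sum(((lo + hi) / 2) * freq for lo, hi, freq in classes) / total_freq

print(mean_ungrouped)  # 30000.0
print(mean_grouped)    # 26.0
```

    The grouped mean is only an approximation, because every observation in a class is assumed to sit at the class midpoint.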


    Mathematical properties of Arithmetic Mean

    The sum of the deviations of the observations from the AM is always zero.

    The sum of the squared deviations of the observations from the mean is a minimum.

    The arithmetic means of several sets of data may be combined into a single AM for the combined sets of data.
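
    A small numerical check of these three properties, using made-up observations and made-up group sizes (all values here are assumptions for illustration):

```python
data = [4, 8, 15, 16, 23, 42]          # hypothetical observations
mean = sum(data) / len(data)           # 108 / 6 = 18.0

# Property 1: deviations from the mean sum to zero.
sum_dev = sum(x - mean for x in data)

# Property 2: the sum of squared deviations is smallest about the mean.
def sum_sq_dev(centre):
    return sum((x - centre) ** 2 for x in data)

other_centres = [mean - 1, mean + 1, 10, 25]
is_minimum = all(sum_sq_dev(mean) < sum_sq_dev(c) for c in other_centres)

# Property 3: combined mean of two groups from their sizes and means.
n1, mean1 = 40, 50.0                   # hypothetical group 1
n2, mean2 = 60, 60.0                   # hypothetical group 2
combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

print(sum_dev, is_minimum, combined_mean)  # 0.0 True 56.0
```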


    AM

    Advantages:

    Easily computed.

    Readily understood.

    Has almost all the properties of a good measure of central tendency.

    Disadvantages:

    Distorted by extreme values.

    In an open-end distribution a midpoint value has to be assigned to the open class, so the AM cannot be computed exactly.


    Weighted Arithmetic mean

    The arithmetic mean gives equal importance (or weight) to each observation. In some cases the observations do not all have the same importance.

    Useful in problems relating to the construction of index numbers.

    P33,34
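
    A minimal sketch of a weighted arithmetic mean, assuming three hypothetical price relatives with hypothetical weights (as might arise when constructing an index number):

```python
# Hypothetical price relatives and their weights (importance of each item).
values = [110, 120, 150]
weights = [5, 3, 2]

# Weighted mean = sum(weight * value) / sum(weights).
weighted_mean = sum(w * x for x, w in zip(values, weights)) / sum(weights)

# The simple mean treats every observation as equally important.
simple_mean = sum(values) / len(values)

print(weighted_mean)  # 121.0
```

    Here the heavily weighted low value pulls the weighted mean below the simple mean.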

    Median


    Divides the distribution into two equal parts.

    50% of the observations in the distribution are above the value of the median and 50% are below it.

    The median is the value of the middle observation when the series is arranged in order of magnitude (ascending or descending).

    P34,35
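
    A short sketch of the definition for ungrouped data. Taking the mean of the two middle values when the number of observations is even is the usual convention and is assumed here; the sample values are made up.

```python
import statistics

def median(values):
    """Middle value of the ordered series (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_series = [7, 3, 9, 1, 5]        # ordered: 1 3 5 7 9
even_series = [7, 3, 9, 1, 5, 11]   # ordered: 1 3 5 7 9 11

print(median(odd_series))   # 5
print(median(even_series))  # 6.0

# The standard library gives the same results.
same_as_stdlib = (median(odd_series) == statistics.median(odd_series)
                  and median(even_series) == statistics.median(even_series))
```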

    Mathematical Property of Median


    Sum of absolute deviations about the median is minimum.

    Advantages:

    Easy to determine and easy to explain.

    Affected by the number of observations and not by the values of the observations, hence less distorted as a representative value than the AM.

    It may be computed for an open-end distribution.

    Disadvantages:

    Less familiar than the AM.

    As a positional average, its value is not determined by each and every observation.

    Not capable of algebraic treatment.

    Quantiles


    Related positional measures of central tendency.

    The most familiar quantiles are:

    Quartiles: Values which divide the total data into 4 equal parts. Since 3 points divide the distribution into 4 equal parts, we have 3 quartiles: Q1 (25% of observations are smaller and 75% are larger), Q2, Q3.

    Deciles: Values which divide the total data into 10 equal parts. Since 9 points divide the distribution into 10 equal parts, we have 9 deciles, denoted D1, D2, ..., D9.

    Percentiles: Values which divide the total data into 100 equal parts. Since 99 points divide the distribution into 100 equal parts, we have 99 percentiles, denoted P1, P2, ..., P99.

    P36,37
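
    The counts of cut points (3, 9 and 99) can be illustrated with Python's standard `statistics.quantiles`. The data are made up, and the library's default "exclusive" interpolation is only one of several conventions, so hand calculations from a textbook formula may differ slightly.

```python
import statistics

# Hypothetical ungrouped observations.
data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40, 45]

quartiles = statistics.quantiles(data, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(data, n=10)       # D1 .. D9
percentiles = statistics.quantiles(data, n=100)  # P1 .. P99

print(len(quartiles), len(deciles), len(percentiles))  # 3 9 99

# Q2, D5 and P50 all coincide with the median.
median = statistics.median(data)
print(quartiles[1], deciles[4], percentiles[49], median)
```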


    Locating Quantiles graphically:

    To locate the median graphically, draw a less-than ogive (cumulative frequency curve).

    Take the variable on the X axis and the cumulative frequency on the Y axis.

    Locate the N/2-th observation on the Y axis.

    From that point, draw a horizontal line to the cumulative frequency curve.

    From where it meets the curve, draw a perpendicular to the X axis.

    The point where the perpendicular meets the X axis is the median value.

    The values of Q1---, D1---, P1---, etc. can be found in the same way.

    p38
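
    Reading a value off a less-than ogive drawn with straight line segments is equivalent to linear interpolation within the class that contains the required observation. The sketch below does that numerically; the helper name `grouped_quantile` and the frequency distribution are hypothetical.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]

def grouped_quantile(classes, fraction):
    """Interpolate along the less-than ogive at cumulative frequency fraction * N."""
    total = sum(freq for _, _, freq in classes)
    target = fraction * total            # e.g. N/2 for the median
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Horizontal line meets the ogive inside this class.
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    return classes[-1][1]

median_value = grouped_quantile(classes, 0.50)
q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)

print(round(median_value, 2), q1, q3)  # 25.83 16.25 35.0
```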

    MODE

    Most commonly observed value in a set of data.

    P39
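
    For ungrouped data the mode can be found by counting occurrences. A minimal sketch with made-up values; `statistics.multimode` is used to show that a data set may have more than one mode.

```python
from collections import Counter
import statistics

data = [2, 3, 3, 5, 3, 7, 5]   # hypothetical observations

# most_common(1) returns the value with the highest count and that count.
mode_value, mode_count = Counter(data).most_common(1)[0]
print(mode_value, mode_count)  # 3 3

# A series with two equally frequent values is bimodal.
modes_of_tied_series = statistics.multimode([1, 1, 2, 2])
print(modes_of_tied_series)    # [1, 2]
```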

    Locating the mode graphically

    Construct a histogram

    p40


    Relationship among Mean, Median and Mode

    A distribution in which the mean, median and mode coincide is known as a symmetrical (bell-shaped) distribution.

    If a distribution is skewed (not symmetrical), then the mean, median and mode are not equal.

    In a moderately skewed distribution, the distance between the mean and the median is approximately one third of the distance between the mean and the mode:

    Mode = 3 Median - 2 Mean

    p41
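
    The formula for the mode follows directly from the one-third relationship stated above. A short derivation, followed by an illustration with made-up figures (mean 50, median 48):

```latex
\begin{aligned}
\text{Mean} - \text{Median} &\approx \tfrac{1}{3}\,(\text{Mean} - \text{Mode}) \\
3\,\text{Mean} - 3\,\text{Median} &\approx \text{Mean} - \text{Mode} \\
\text{Mode} &\approx 3\,\text{Median} - 2\,\text{Mean} \\[4pt]
\text{e.g. } \text{Mode} &\approx 3(48) - 2(50) = 44
\end{aligned}
```

    In the illustration the mean exceeds the median by 2 and the mode by 6, which is the one-third ratio.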

    Geometric Mean


    The geometric mean, like the arithmetic mean, is a calculated average.

    Very useful in averaging ratios and percentages.

    Also useful in determining the rate of increase or decrease.

    Also capable of further algebraic treatment.

    The GM is more difficult to compute and interpret.

    Cannot be computed if any observation is zero or negative.
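
    A minimal sketch of the geometric mean applied to an average rate of increase. The definition used (the n-th root of the product of n observations, computed here through logarithms) is the standard one; the growth factors are made up.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n strictly positive observations."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive observations")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical yearly growth factors: +5%, +10%, +20%.
growth_factors = [1.05, 1.10, 1.20]

gm = geometric_mean(growth_factors)
average_growth_rate = (gm - 1) * 100          # average % increase per year
arithmetic = sum(growth_factors) / len(growth_factors)

# Compounding the GM for three years reproduces the total growth.
total_growth = 1.05 * 1.10 * 1.20
print(round(average_growth_rate, 2), math.isclose(gm ** 3, total_growth))
```

    The arithmetic mean of the same factors is slightly larger and would overstate the compounded growth.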

    Harmonic Mean


    A measure of central tendency for data expressed as rates (km/hr, tonnes/day, km/litre).

    Defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations.

    The harmonic mean, like the arithmetic mean and the geometric mean, is computed from each and every observation.

    It is specially used for averaging rates.

    Cannot be computed when one or more observations have zero value or when there are both positive and negative observations.

    Rarely used in dealing with business problems.
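
    A sketch of the definition above, using the classic case of averaging two speeds over equal distances. The speeds are made up, and for simplicity this sketch accepts only strictly positive rates.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v <= 0 for v in values):
        # Simplifying assumption of this sketch: positive rates only.
        raise ValueError("this sketch handles strictly positive rates only")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: same distance out at 40 km/hr and back at 60 km/hr.
speeds = [40, 60]

hm = harmonic_mean(speeds)
am = sum(speeds) / len(speeds)

print(round(hm, 2), am)  # 48.0 50.0
```

    The true average speed for the round trip is 48 km/hr, not the arithmetic mean of 50 km/hr, because more time is spent travelling at the lower speed.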


    Measure of Variation (Dispersion)

    A measure of variation (dispersion) describes the spread or scattering of the individual values around the central value.

    Illustration (p47)

    Significance of Measuring variation


    1. Determines the reliability of an average by pointing out how far the average is representative of the entire data.

    2. Determines the nature and cause of variation in order to control the variation itself.

    3. Enables comparison of two or more distributions with regard to their variability.

    4. Measuring variability is of great importance to advanced statistical analysis (as in sampling or statistical inference).

    Properties of a Good measure of variation


    Should possess, as far as possible, the same properties as those of a good measure of central tendency.

    Some of the well-known measures of variation which provide a numerical index of the variability of the given data are:

    Range

    Average or mean deviation

    Quartile Deviation or Semi-Interquartile range

    Standard deviation

    Absolute and Relative measures of variation

    Measures of absolute variation are expressed in the same units as the original data.

    When two sets of data are expressed in different units of measurement, the absolute measures of variation are not comparable. In such cases measures of relative variation are used.

    Relative measures are also used when comparing two sets of data that have the same unit of measurement but different means.

    Range


    Difference between the highest (numerically largest) value and the lowest value in a set of data: R = H - L.

    The range is very easy to calculate and gives us some idea about the variability of the data.

    However, the range is a crude measure of variation, as it uses only the two extreme values.

    The concept of range is utilized in SQC, and in studying variations in the prices of shares, debentures and other commodities that are very sensitive to price changes from one period to another. It is also a good indicator in weather forecasts.

    For grouped data, the range may be approximated as the difference between the upper limit of the highest class and the lower limit of the lowest class.

    The relative measure corresponding to the range, called the coefficient of range, is obtained by applying the formula: Coefficient of range = (H - L) / (H + L).

    P48,49
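
    A minimal sketch of the range and its relative measure. The share prices are made up, and the coefficient is computed with the standard textbook form (H - L) / (H + L), which is assumed to be the formula the slide refers to.

```python
# Hypothetical daily closing prices of a share.
prices = [42, 45, 39, 50, 48]

highest = max(prices)   # H
lowest = min(prices)    # L

value_range = highest - lowest                                  # R = H - L
coefficient_of_range = (highest - lowest) / (highest + lowest)  # relative measure

print(value_range)                     # 11
print(round(coefficient_of_range, 4))  # 0.1236
```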

    Quartile deviation or Semi-interquartile range

    Computed by taking half of the difference between the third quartile and the first quartile: QD = (Q3 - Q1) / 2.

    The relative measure corresponding to the quartile deviation is called the coefficient of quartile deviation.

    QD is superior to the range as it is not based on the two extreme values, but rather on the middle 50% of the observations.

    Another advantage of QD is that it is the only measure of variability which can be used for an open-end distribution.

    The disadvantage is that it ignores the first and last 25% of the observations.

    P49,50
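
    A sketch of the quartile deviation and its coefficient. The data are made up, the quartiles come from `statistics.quantiles` with its default method (one of several conventions), and the coefficient uses the standard textbook form (Q3 - Q1) / (Q3 + Q1), which the slide does not spell out.

```python
import statistics

# Hypothetical ungrouped observations.
data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40, 45]

q1, q2, q3 = statistics.quantiles(data, n=4)

quartile_deviation = (q3 - q1) / 2          # semi-interquartile range
coefficient_of_qd = (q3 - q1) / (q3 + q1)   # relative measure (assumed standard form)

print(q1, q3)                        # 18.0 35.0
print(quartile_deviation)            # 8.5
print(round(coefficient_of_qd, 4))   # 0.3208
```

    Changing the smallest or largest observation leaves both results unchanged, which is why QD is less sensitive to extremes than the range.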

    Average Deviation

    or Mean Deviation


    It is an improvement over the previous two measures in that it considers all observations in the given set of data.

    This measure is computed as the mean of the deviations from the mean or the median.

    All deviations are treated as positive regardless of sign.

    Theoretically, there is an advantage in taking the deviations from the median, because the sum of absolute deviations from the median is a minimum. However, in actual practice, the arithmetic mean is more popular.

    The relative measure corresponding to the average deviation, called the coefficient of average deviation, is obtained by dividing the average deviation by the particular average used in computing it (mean or median).

    p51
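
    A small sketch comparing the average deviation taken about the mean and about the median, with the corresponding coefficients. The observations are made up and deliberately include one extreme value.

```python
import statistics

data = [1, 2, 3, 4, 20]   # hypothetical observations with one extreme value

mean = statistics.mean(data)       # 6
median = statistics.median(data)   # 3

def average_deviation(values, centre):
    """Mean of the absolute deviations from the chosen average."""
    return sum(abs(x - centre) for x in values) / len(values)

ad_mean = average_deviation(data, mean)       # 28 / 5
ad_median = average_deviation(data, median)   # 21 / 5

# Coefficient = average deviation / the average it was measured from.
coef_mean = ad_mean / mean
coef_median = ad_median / median

print(ad_mean, ad_median)  # 5.6 4.2
```

    The deviation about the median is the smaller of the two, in line with the minimum property noted above.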

    Advantages and disadvantages

    (of Average Deviation)

    Though a good measure of variability, its use is limited.

    If the aim is only to measure and compare variability among several sets of data, the AD may be used.

    Its major disadvantage is its lack of mathematical properties, chiefly because ignoring the signs in its calculation makes it algebraically inconsistent.

    Standard Deviation


    Most widely used and important measure of variation.

    In computing the average deviation, the signs are ignored. The standard deviation overcomes this problem by squaring the deviations, which makes them all positive.

    The standard deviation is also known as the root mean square deviation.

    The square of the standard deviation is called the variance.

    The standard deviation and the variance become larger as the variability or spread within the data becomes greater.

    It is readily comparable with other standard deviations: the greater the standard deviation, the greater the variability.

    The standard deviation is commonly used to measure variability. While other measures have special uses, it is the only measure possessing the necessary mathematical properties to make it useful for advanced statistical work.

    p53
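
    A sketch of the "root mean square deviation" computed step by step on made-up data. It uses the population form (dividing by N); a sample standard deviation would divide by N - 1 instead.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

mean = sum(data) / len(data)                        # 5.0

# Square each deviation so that all terms are positive.
squared_deviations = [(x - mean) ** 2 for x in data]

variance = sum(squared_deviations) / len(data)      # mean of squared deviations
std_dev = math.sqrt(variance)                       # root mean square deviation

print(variance, std_dev)  # 4.0 2.0

# Cross-check against the standard library's population standard deviation.
matches_stdlib = math.isclose(std_dev, statistics.pstdev(data))
```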


    Coefficient of Variation (C.V)

    Frequently used relative measure of variation.

    This measure is simply the ratio of the standard deviation to the mean, expressed as a percentage.

    p54
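
    A sketch showing why a relative measure is needed: the two made-up series below have the same standard deviation but very different means, so their coefficients of variation differ. The helper name is hypothetical and the population standard deviation is assumed.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (population form)."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

series_a = [2, 4, 4, 4, 5, 5, 7, 9]                   # mean 5, SD 2
series_b = [102, 104, 104, 104, 105, 105, 107, 109]   # mean 105, SD 2

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print(round(cv_a, 2), round(cv_b, 2))  # 40.0 1.9
```

    Series A is far more variable relative to its mean, even though the absolute spread of the two series is identical.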

    Skewness


    Measures of central tendency and variation do not reveal all the characteristics of a given set of data.

    Two distributions having the same mean and standard deviation may differ widely in the shape of their distribution.

    A distribution of data is either symmetrical or not (asymmetrical or skewed).

    Thus skewness refers to the lack of symmetry in a distribution.

    Method of detection of skewness is to consider the tail of the distribution

    Symmetrical distribution: No extreme values in a particular direction, so that low and high values balance each other. Mean = Median = Mode.

    Negatively skewed distribution: Longer tail towards lower values (the left-hand side); the skewness is negative. The mean is decreased by some extremely low values.

    Positively skewed distribution: Longer tail towards higher values (the right-hand side); the skewness is positive. The mean is increased by some unusually high values.

    p55


    Relative skewness

    In order to make comparisons between the skewness in two or more distributions, the coefficient of skewness is used (Karl Pearson's method, Bowley's method).

    In practice, the value of the coefficient of skewness, SK, usually lies between -1 and +1.
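
    The slide names the two methods without giving formulas. The sketch below assumes the standard textbook definitions: Karl Pearson's coefficient as (Mean - Mode) / SD and Bowley's coefficient as (Q3 + Q1 - 2 Median) / (Q3 - Q1). The data are made up, and the quartiles use the default convention of `statistics.quantiles`.

```python
import statistics

# Hypothetical positively skewed data (longer tail towards high values).
data = [2, 3, 3, 4, 4, 4, 5, 6, 8, 11]

mean = statistics.mean(data)       # 5
mode = statistics.mode(data)       # 4
std_dev = statistics.pstdev(data)  # population standard deviation

# Karl Pearson's coefficient of skewness (assumed standard form).
pearson_sk = (mean - mode) / std_dev

# Bowley's coefficient of skewness, based on quartiles (assumed standard form).
q1, q2, q3 = statistics.quantiles(data, n=4)
bowley_sk = (q3 + q1 - 2 * q2) / (q3 - q1)

print(round(pearson_sk, 3), round(bowley_sk, 3))  # 0.389 0.429
```

    Both coefficients are positive, consistent with the longer right-hand tail, and both fall within the -1 to +1 band mentioned above.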
