Quantitative Technique New
Uploaded by manishkhandal88
8/11/2019 Quantitative Technique New
QM
Brajaballav Kar
B. Tech (Electrical, CET), PGDM (XIMB)
DESCRIPTIVE STATISTICS
Descriptive statistics quantitatively describe the main features of a collection of data. Unlike inferential (or inductive) statistics, they aim to summarize a sample rather than to learn about the population that the sample represents. This generally means that descriptive statistics, unlike inferential statistics, are not developed on the basis of probability theory.
Descriptive statistics include measures of central tendency (mean, median and mode) and measures of variability or dispersion (standard deviation or variance, the minimum and maximum values of the variables, kurtosis and skewness).
Entries in an analysis of variance table can also be regarded as summary statistics.
Data array: Arrange the values in ascending or descending order. Advantages: you can quickly notice the largest and smallest values, divide the data into sections, see whether a value appears more than once, and observe the distance between succeeding values.
Frequency distribution: A table that organizes data into classes (groups) and shows the number of observations from the data set that fall into each class.
Relative frequency distribution: Expresses each class frequency as a fraction or percentage of the total number of observations: relative frequency = class frequency / total number of observations.
Mutually exclusive: No data point falls into more than one category.
All inclusive: Every data point falls into some class, so the sum of all the relative frequencies equals 1.
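The definitions above can be sketched in a few lines of Python. The ages and class limits below are hypothetical values chosen only for illustration; they do not come from the slides.

```python
from collections import Counter

# Hypothetical observations and class limits (illustration only).
ages = [12, 15, 22, 23, 27, 31, 34, 38, 39, 45]
classes = [(11, 20), (21, 30), (31, 40), (41, 50)]

def class_of(value):
    """Return the (lower, upper) class containing value, inclusive at both ends."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    raise ValueError(f"{value} falls outside every class")

# Frequency distribution: number of observations in each class.
frequency = Counter(class_of(a) for a in ages)

# Relative frequency distribution: class frequency / total observations.
relative = {c: frequency[c] / len(ages) for c in classes}

for c in classes:
    print(f"{c[0]}-{c[1]}: frequency={frequency[c]}, relative={relative[c]:.2f}")
```

Because the classes are mutually exclusive and all inclusive, the frequencies add up to the number of observations and the relative frequencies add up to 1.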
Open-ended class: A class that allows the upper or the lower end of a quantitative classification to be limitless (e.g. Age: 11-20, 21-30, 31-40, 41-50, 51-60, 61 and older).
Discrete classes: Separate entities that do not progress from one class to the next without a break.
Continuous classes: Progress from one class to the next without a break (e.g. weights of cans of tomatoes).
The range must be divided into equal classes; that is, the width of the interval from the beginning of one class to the beginning of the next must be the same for every class.
Number of classes: rule of thumb, 6 to 15 classes.
Width of class interval = (next unit value after the largest value - smallest value) / number of class intervals
Ogives: A cumulative frequency distribution lets us see how many observations lie above or below certain values, rather than only recording the number of items within each interval. The graph of a cumulative frequency distribution is called an ogive.
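As a rough sketch of the two calculations above, the snippet below computes a class width and the cumulative frequencies that an ogive would plot. The class limits and frequencies are hypothetical, and rounding the width up to a whole number is an assumption about how the rule of thumb is applied.

```python
import math
from itertools import accumulate

# Class width: (next unit value after largest - smallest) / number of classes,
# rounded up to a convenient whole number (assumed convention).
smallest, next_after_largest, desired_classes = 0, 20, 6
width = math.ceil((next_after_largest - smallest) / desired_classes)

# Hypothetical class upper limits and frequencies (illustration only).
upper_limits = [20, 30, 40, 50]
frequencies = [2, 3, 4, 1]

# "Less than or equal to" cumulative frequencies: the points of an ogive.
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_limits, cumulative))
print(width, ogive_points)
```

The last cumulative value always equals the total number of observations.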
CHAPTER 2: MEASURES OF CENTRAL TENDENCY
Summary statistics: e.g. central tendency and dispersion, which describe the characteristics of the data set.
Central tendency
Dispersion
Skewness: The opposite of symmetry. A distribution is skewed when its frequencies are lopsided rather than concentrated in the middle. Positively skewed: frequencies concentrated at the beginning (lower values); negatively skewed: frequencies concentrated at the end (higher values).
Kurtosis (peakedness)
CENTRAL TENDENCY:
Arithmetic mean:
(Characteristics of a sample are called statistics; those of a population are called parameters.)
Population mean: $\mu = \sum x / N$
Sample mean: $\bar{x} = \sum x / n$
Grouped data: $\bar{x} = \sum (f \cdot x) / n$, where $n = \sum f$ and $x$ is the class midpoint. If the class intervals are x1-x2, x3-x4, ..., the midpoint of the first class is taken as (x1 + x3)/2. This is an assumption and an approximation.
Disadvantages: (a) the mean is affected by extreme values; (b) if a class is open-ended, the mean cannot be computed. All data points enter the calculation, except for grouped data, where class midpoints stand in for the individual values.
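A minimal sketch of the grouped-data mean, using the class limits and frequencies of the account-balance example that appears later in these notes. Equal class widths of 50 are assumed, so the midpoint rule (lower limit + next lower limit)/2 reduces to lower limit + width/2.

```python
# Lower class limits and frequencies from the account-balance example.
lower_limits = [0, 50, 100, 150, 200, 250, 300, 350, 400, 450]
frequencies = [78, 123, 187, 82, 51, 47, 13, 9, 6, 4]
width = 50  # assumed equal class width

# Midpoint of each class: (lower limit + next class's lower limit) / 2.
midpoints = [lower + width / 2 for lower in lower_limits]

n = sum(frequencies)  # n = sum of f
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
print(n, grouped_mean)
```

The result is an approximation, because every observation in a class is treated as if it sat at the class midpoint.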
CENTRAL TENDENCY
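Weighted mean: $\bar{x}_w = \sum (w \cdot x) / \sum w$
Geometric mean: G.M. = n-th root of (product of all x values)
Where to use: averaging growth factors, i.e. taking the n-th root of the compounded growth. Example: cube root of (1.1 * 1.15 * 1.2).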
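Both formulas are short enough to check directly. The sketch below reuses the growth factors from the example above; the values and weights passed to the weighted mean are hypothetical.

```python
import math

def weighted_mean(values, weights):
    """Sum of w * x divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def geometric_mean(values):
    """n-th root of the product of the values."""
    return math.prod(values) ** (1 / len(values))

# Average growth factor over three periods (example from the notes).
average_growth = geometric_mean([1.1, 1.15, 1.2])
print(round(average_growth, 4))

# Hypothetical values 10 and 20 with weights 1 and 3.
example_weighted = weighted_mean([10, 20], [1, 3])
print(example_weighted)
```

Compounding the geometric mean three times reproduces the total growth 1.1 * 1.15 * 1.2, which is why it is the appropriate average for growth rates.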
CENTRAL TENDENCY-MEDIAN
The median is the middle-most (most central) item.
Median, ungrouped data:
Array the data in ascending or descending order; the median is the ((n + 1)/2)-th item, for both odd and even n.
If the data set has an odd number of items, the middle item is the median.
If the data set has an even number of items, the median is the average of the two middle items.
Median, grouped data:
Median class: the class in which the cumulative frequency first reaches (n + 1)/2.
The assumption is then that the data points are evenly spread over the entire class interval.
EXAMPLE
Account Balance     Frequency
0-49.99             78
50.00-99.99         123
100.00-149.99       187
150.00-199.99       82
200.00-249.99       51
250.00-299.99       47
300.00-349.99       13
350.00-399.99       9
400.00-449.99       6
450.00-499.99       4
Total               600
MEDIAN EXAMPLE
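Median class: 100.00-149.99
The median is item (600 + 1)/2 = 300.5, i.e. the average of the 300th and 301st items.
The first two classes hold 78 + 123 = 201 items, so the 300th item is the 99th item of the median class (300 - 201 = 99) and the 301st item is the 100th.
Spacing between items in the median class: (150.00 - 100.00)/187 = 0.267
The 1st item of the class is at 100.00, so the 99th item = 100.00 + 98 * 0.267 = 126.17
The 100th item = 126.17 + 0.267 = 126.44
Median = (126.17 + 126.44)/2 = 126.30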
MEDIAN FORMULA
Median formula: $\text{Median} = \left[ \dfrac{(n+1)/2 - (F+1)}{f_m} \right] w + L_m$
n = total number of items
F = sum of all the class frequencies up to, but not including, the median class
f_m = frequency of the median class
w = class interval width
L_m = lower limit of the median class interval
For the example above, the formula gives a median of 126.35; the difference from 126.30 is due to rounding.
Advantages of the median: extreme values do not affect it; it can be calculated for open-ended grouped data, unless the median falls in an open-ended class; it can be calculated for qualitative data (excellent, very good, good, average, bad: find the frequencies and then the median).
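The median formula can be sketched as a small function and applied to the account-balance example. The function name and argument layout are illustrative choices, not part of the notes. Without intermediate rounding the result is about 126.34, slightly different from the rounded figures quoted in the notes.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data: [((n+1)/2 - (F+1)) / f_m] * w + L_m."""
    n = sum(frequencies)
    target = (n + 1) / 2
    cumulative_before = 0  # F: frequencies of all classes before the current one
    for lower, f in zip(lower_limits, frequencies):
        if cumulative_before + f >= target:  # this is the median class
            return ((target - (cumulative_before + 1)) / f) * width + lower
        cumulative_before += f
    raise ValueError("empty distribution")

lower_limits = [0, 50, 100, 150, 200, 250, 300, 350, 400, 450]
frequencies = [78, 123, 187, 82, 51, 47, 13, 9, 6, 4]
median = grouped_median(lower_limits, frequencies, width=50)
print(round(median, 2))
```

Here F = 201, f_m = 187, w = 50 and L_m = 100, matching the values used in the worked example.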
CENTRAL TENDENCY-MODE
Mode: The value that is most often repeated in the data set.
The mode of ungrouped data is rarely used, because chance can cause an unrepresentative value to be the most frequent one.
Data set: 0, 0, 1, 1, 2, 2, 4, 4, 5, 5, 6, 6, 7, 7, 8, 12, 15, 15, 15, 19 => the mode is 15, but it is unrepresentative of the data set, since most of the values are below 10.
Number of data points = 20
Class interval: (20 - 0)/6 = 3.3 => 4; number of classes = 20/4 = 5
Class       0-3   4-7   8-11   12-15   16-19
Frequency   6     8     1      4       1
=> Modal class is 4-7
$Mo = L_{Mo} + \left[ \dfrac{d_1}{d_1 + d_2} \right] w$
L_Mo = lower limit of the modal class
d_1 = frequency of the modal class minus the frequency of the class directly below it
d_2 = frequency of the modal class minus the frequency of the class directly above it
w = width of the modal class interval
MODE EXAMPLE
Account Balance     Frequency
0-49.99             78
50.00-99.99         123
100.00-149.99       187
150.00-199.99       82
200.00-249.99       51
250.00-299.99       47
300.00-349.99       13
350.00-399.99       9
400.00-449.99       6
450.00-499.99       4
Total               600
L_Mo = 100, d_1 = 187 - 123 = 64, d_2 = 187 - 82 = 105, w = 50
=> Mo = 100 + [64/(64 + 105)] * 50 ≈ 119
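The grouped-mode calculation can be sketched as follows. The function name is illustrative, and treating a missing neighbouring class as having frequency 0 is an assumption made here so that the first or last class can also be the modal class.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of grouped data: L_Mo + [d1 / (d1 + d2)] * w."""
    i = max(range(len(frequencies)), key=frequencies.__getitem__)  # modal class
    below = frequencies[i - 1] if i > 0 else 0  # assumed 0 if no class below
    above = frequencies[i + 1] if i + 1 < len(frequencies) else 0  # assumed 0 if none above
    d1 = frequencies[i] - below
    d2 = frequencies[i] - above
    return lower_limits[i] + d1 / (d1 + d2) * width

# Account-balance example.
balance_mode = grouped_mode(
    [0, 50, 100, 150, 200, 250, 300, 350, 400, 450],
    [78, 123, 187, 82, 51, 47, 13, 9, 6, 4],
    width=50,
)
# The 20-item data set grouped into classes 0-3, 4-7, 8-11, 12-15, 16-19.
small_mode = grouped_mode([0, 4, 8, 12, 16], [6, 8, 1, 4, 1], width=4)
print(round(balance_mode, 2), round(small_mode, 2))
```

For the account balances the unrounded value is about 118.93, which the notes report as 119.00.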
ADVANTAGES OF THE MODE
Advantages: Like the median, the mode can be used as a central location for qualitative as well as quantitative data; it is not affected by extreme values; it can also be used for open-ended classes.
Disadvantages: If every value occurs with the same frequency, there is no mode and it cannot be used; when there are multiple modes, it is difficult to interpret and compare.
MEAN MEDIAN-MODE
Mean, median and mode are identical in a symmetrical distribution with a single mode.
In a positively skewed distribution (skewed to the right), the mode is at the highest point of the distribution, the median is to the right of that, and the mean is to the right of both the median and the mode.
In a negatively skewed distribution (skewed to the left), the mode is at the highest point of the distribution, the median is to the left of that, and the mean is to the left of both the median and the mode.
When the population is skewed positively or negatively, the median is often the best measure of location because it is always between the mean and the mode. The median is not as highly influenced by the frequency of occurrence of a single value as the mode is, nor is it pulled by extreme values as the mean is.
DISPERSION:
Variability
Why dispersion?
It gives additional information that enables us to judge the reliability of our measure of central tendency. Example: mean age 26 (case 1: Age 1 = 2, Age 2 = 50; case 2: Age 1 = 24, Age 2 = 28). If the data are widely spread, the mean is less representative.
It lets us compare the dispersion of different samples.
Usage: financial earnings (more dispersed => more risk); quality parameters; drug purity.
MEASURES OF DISPERSION
Range: The difference between the highest and lowest observed values. Easy to understand and find, but its usefulness is limited. It is heavily influenced by extremes; open-ended distributions do not have a range.
Interfractile range: In a frequency distribution, a given fraction or proportion of the data lies at or below a fractile. The median, for example, is the 0.5 fractile, because half the data set is less than or equal to this value.
The interfractile range is a measure of the spread between two fractiles in a frequency distribution, i.e. the difference between the values of the two fractiles.
Fractiles: If they divide the data into 10 equal parts they are called deciles; if 4, quartiles; if 100, percentiles.
The interquartile range is the difference between the values of the third and first quartiles (Q3 - Q1).
Other measures: Variance and standard deviation; both indicate the average distance of any observation in the data set from the mean of the distribution.
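A short sketch of the range and the interquartile range, using the 20-item data set from the mode discussion. It relies on `statistics.quantiles` from the Python standard library (Python 3.8 or later); textbooks use several slightly different quartile conventions, so the exact Q1 and Q3 may differ from a hand calculation.

```python
import statistics

data = [0, 0, 1, 1, 2, 2, 4, 4, 5, 5, 6, 6, 7, 7, 8, 12, 15, 15, 15, 19]

# Range: highest minus lowest observed value.
data_range = max(data) - min(data)

# Quartiles split the ordered data into four parts; Q2 is the median.
q1, q2, q3 = statistics.quantiles(data, n=4)
interquartile_range = q3 - q1

print(data_range, q1, q2, q3, interquartile_range)
```

The range reacts to the single extreme value 19, while the interquartile range only describes the spread of the middle half of the data.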
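VARIANCE
Variance: $\sigma^2$
Population variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$, which is equivalent to $\dfrac{\sum x^2}{N} - \mu^2$ (the second form is used when the $x$ values are large and the $x - \mu$ values are small). The square of a unit of measure is not intuitive.
Variance of grouped data: $\sigma^2 = \dfrac{\sum f (x - \mu)^2}{N} = \dfrac{\sum f x^2}{N} - \mu^2$
Sample variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{\sum x^2}{n - 1} - \dfrac{n \bar{x}^2}{n - 1}$
Standard deviation: The square root of the variance; only the positive root is considered.
Population standard deviation = square root of the population variance: $\sigma = \sqrt{\sigma^2}$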
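The equivalence of the two population-variance forms, and the n - 1 divisor of the sample variance, can be checked numerically. The data set below is hypothetical; `statistics.pvariance` and `statistics.variance` are standard-library functions used here only as a cross-check.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
N = len(data)
mu = sum(data) / N

# Definitional form: average squared distance from the mean.
population_variance = sum((x - mu) ** 2 for x in data) / N

# Shortcut form: mean of the squares minus the square of the mean.
shortcut_variance = sum(x * x for x in data) / N - mu ** 2

# Sample variance divides by n - 1 instead of N.
sample_variance = sum((x - mu) ** 2 for x in data) / (N - 1)

# Standard deviation: positive square root of the variance.
population_sd = math.sqrt(population_variance)

print(population_variance, shortcut_variance, sample_variance, population_sd)
```

For this data set the mean is 5, the population variance is 4 and the population standard deviation is 2.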
CHEBYSHEV'S THEOREM:
Chebyshev's theorem says that, NO MATTER what the shape of the distribution, at least 75% of the values will fall within ±2 standard deviations of the mean of the distribution, and at least 89% of the values will lie within ±3 standard deviations of the mean.
However, for a symmetrical, bell-shaped (normal) distribution we can be more precise:
About 68% of the values lie within ±1 standard deviation
About 95% of the values lie within ±2 standard deviations
About 99% of the values lie within ±3 standard deviations
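The 75% and 89% figures follow from the general statement of Chebyshev's theorem: at least 1 - 1/k^2 of the values lie within k standard deviations of the mean, for any k greater than 1. A minimal sketch (the function name is illustrative):

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

bound_2sd = chebyshev_lower_bound(2)  # at least 75%
bound_3sd = chebyshev_lower_bound(3)  # at least about 89%
print(bound_2sd, round(bound_3sd, 4))
```

These are guaranteed minimums for any distribution, which is why they are lower than the bell-shaped figures of roughly 95% and 99%.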
STANDARD SCORE: COEFFICIENT OF VARIATION
Standard score:
The standard score gives the number of standard deviations a particular observation lies below or above the mean.
Population standard score = $(x - \mu) / \sigma$
Relative dispersion: the coefficient of variation
1. The standard deviation is an absolute measure of dispersion that expresses the variation in the same units as the original data.
2. Standard deviations alone cannot be compared. To compare, we need to know (a) the mean, (b) the standard deviation, and (c) how the standard deviation compares with the mean.
So to compare we need a relative measure, the coefficient of variation = $\sigma / \mu$
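Both measures are one-line calculations. The numbers below are hypothetical and chosen only to show why the coefficient of variation is needed: the second data set has the larger standard deviation but the smaller relative dispersion.

```python
def standard_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def coefficient_of_variation(mu, sigma):
    """Standard deviation relative to the mean."""
    return sigma / mu

z = standard_score(130, mu=100, sigma=15)

# Two hypothetical data sets: (mean 40, sd 5) and (mean 160, sd 15).
cv_first = coefficient_of_variation(mu=40, sigma=5)
cv_second = coefficient_of_variation(mu=160, sigma=15)
print(z, cv_first, cv_second)
```

The coefficient of variation is often multiplied by 100 and quoted as a percentage.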
PROBABILITY
Probability is the chance that something will happen, expressed as a fraction or a percentage.
Event: One or more of the possible outcomes of doing something.
Experiment: An activity that produces the events.
Sample space: The set of all possible outcomes of an experiment.
Mutually exclusive: Events are mutually exclusive if one and only one of them can take place at a time.
Collectively exhaustive: When a list of the possible events that can result from an experiment includes every possible outcome, the list is called collectively exhaustive.
TYPES OF PROBABILITY:
Classical approach
Relative frequency approach
Subjective approach (not discussed here)
Classical approach: A priori, symmetrical, assumed (fair coin, unbiased dice); we can know the probability beforehand.
Relative frequency approach:
In this approach, the probability is defined as either
1. the observed relative frequency of an event in a very large number of trials (e.g. CA pass percentage), or
2. the proportion of times that an event occurs in the long run when conditions are stable.
This method uses the relative frequencies of past occurrences as probabilities. Relative frequencies become stable as the number of tosses becomes large (under uniform conditions).
RULES:
Single = marginal = unconditional probability => only one event can take place.
Mutually exclusive events: add the probabilities (either/or events): P(A or B) = P(A) + P(B)
Proportion of families having this many children:
Number of children   0      1      2      3      4      5      6 or more
Proportion           0.05   0.10   0.30   0.25   0.15   0.10   0.05
What is P(4 or more children)? P = 0.15 + 0.10 + 0.05 = 0.30
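The addition rule for mutually exclusive events amounts to summing the relevant entries of the table. The sketch below uses the family-size proportions from the example; the label "6 or more" for the last category is an assumption about the garbled table header.

```python
# Proportion of families by number of children (from the example above).
proportions = {
    "0": 0.05, "1": 0.10, "2": 0.30, "3": 0.25,
    "4": 0.15, "5": 0.10, "6 or more": 0.05,
}

# The categories are mutually exclusive, so P(4 or more) is a plain sum.
p_four_or_more = sum(proportions[k] for k in ("4", "5", "6 or more"))
print(round(p_four_or_more, 2))
```

Because the categories are also collectively exhaustive, all the proportions together sum to 1.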
NOT MUTUALLY EXCLUSIVE EVENT
Events that are not mutually exclusive; addition rule: P(A or B) = P(A) + P(B) - P(A and B)
Person   Sex      Age
1        Male     30
2        Male     32
3        Female   45
4        Female   20
5        Male     40
Choose one person who is either female or over 35:
P(female or over 35) = P(female) + P(over 35) - P(female and over 35) = 2/5 + 2/5 - 1/5 = 3/5
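The general addition rule can be verified by counting directly in the five-person example. Exact fractions are used so that the result can be compared with 3/5 without rounding; the helper name `probability` is an illustrative choice.

```python
from fractions import Fraction

# The five people from the example: (sex, age).
people = [("Male", 30), ("Male", 32), ("Female", 45), ("Female", 20), ("Male", 40)]

def probability(predicate):
    """Fraction of people satisfying predicate, each person equally likely."""
    return Fraction(sum(1 for p in people if predicate(p)), len(people))

def is_female(person):
    return person[0] == "Female"

def is_over_35(person):
    return person[1] > 35

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
by_rule = (
    probability(is_female)
    + probability(is_over_35)
    - probability(lambda p: is_female(p) and is_over_35(p))
)

# Direct count of people who are female or over 35.
direct = probability(lambda p: is_female(p) or is_over_35(p))
print(by_rule, direct)
```

Subtracting P(A and B) removes the double counting of the 45-year-old woman, who belongs to both events.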
PROBABILITIES UNDER STATISTICAL INDEPENDENCE:
Statistical independence: The occurrence of one event has no effect on the probability of occurrence of any other event.
Rolling a die: Getting a 6 the first time and getting a 6 the second time are independent.
But: Getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trials is 8 are not independent.
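Both claims about the die can be checked by enumerating the 36 equally likely outcomes of two rolls and testing whether P(A and B) equals P(A) * P(B). This is a sketch assuming a fair six-sided die; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair die twice.
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(event_a, event_b):
    """True when P(A and B) equals P(A) * P(B)."""
    joint = probability(lambda o: event_a(o) and event_b(o))
    return joint == probability(event_a) * probability(event_b)

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def sum_is_eight(o):
    return o[0] + o[1] == 8

six_then_six = independent(first_is_six, second_is_six)
six_and_sum_eight = independent(first_is_six, sum_is_eight)
print(six_then_six, six_and_sum_eight)
```

A 6 on the first roll and a sum of 8 occur together only for the outcome (6, 2), so the joint probability is 1/36, whereas the product of the marginal probabilities is (1/6) * (5/36) = 5/216.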
3 TYPES OF PROBABILITIES UNDER STATISTICAL INDEPENDENCE:
1. Marginal
2. Joint
3. Conditional
Marginal probabilities of independent events are simple probabilities (e.g. for a fair coin toss P(H) = 0.5; for an unfair coin with P(H) = 0.8, it is 0.8 on every toss).
Joint probability of two independent events: P(AB) = P(A) * P(B)
P(AB) = probability of events A and B occurring together or in succession (the joint probability)
P(A) = marginal probability of event A occurring
P(B) = marginal probability of event B occurring
(Examples: two heads in succession; dice: first a 1 and then a 6.)
P(H1) = P(H2) = P(H3) = 0.5 (marginal or absolute probability), but the joint probability of heads on all three tosses is P(H1 H2 H3) = 0.5 * 0.5 * 0.5 = 0.125
CONDITIONAL PROBABILITY UNDER STATISTICAL INDEPENDENCE:
Conditional probability of independent events: The conditional probability of event B, given that event A has occurred, is simply the probability of B (because they are independent, by definition).
P(B|A) = P(B)
Example: The probability of a head on the second toss, given that the first toss resulted in a head, is 0.5.
PROBABILITY UNDER STATISTICAL INDEPENDENCE:
Type of Probability   Symbol    Formula
Marginal              P(A)      P(A)
Joint                 P(AB)     P(A) * P(B)
Conditional           P(B|A)    P(B)
SIMPLE REGRESSION
Regression and correlation => nature and strength of the relationship between two variables.
Regress: to go back to the mean.
Regression analysis: estimating equation (mathematically relating the variables).
Types of relationship
Dependent and independent variables
One dependent -> multiple independent variables
Direct relationship: X increases, Y increases; slope +ve
Inverse relationship: X increases, Y decreases; slope -ve
REGRESSION - CAUSE & EFFECT?
Differentiate cause-effect from dependent-independent variables.
Not all relationships are cause and effect.
A relationship found by regression is one of association, not of cause and effect.
Cause and effect:
The cause should precede the effect in time.
Presence of the cause indicates presence of the effect.
Presence of the effect indicates presence of the cause.
SCATTER DIAGRAM:
Transform tabular information into a graph.
Visually observe.
Draw a fit. How?
The line need not touch each point; roughly equal numbers of points should lie on either side of the line.
Relationships could be linear or curvilinear.
TOTAL COST
It is known that the total cost is the sum of the variable cost and the fixed cost. A businessman knows that when he incurs a raw material cost of 5 crore, the total cost comes to 8.5 crore, and for a raw material (RM) cost of 8 crore, the total cost is 10.6 crore. The businessman assumes a linear relationship between the costs involved.
If he plans his raw material cost to be 10 crore, what total cost should he be ready to incur?
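Under the linear assumption stated in the problem, the two known points determine the line. The sketch below takes the slope as the change in total cost per unit of raw material cost and the intercept as the remaining fixed part; the variable names are illustrative.

```python
# Known points: (raw material cost, total cost), both in crore.
x1, y1 = 5.0, 8.5
x2, y2 = 8.0, 10.6

# Slope: extra total cost per extra crore of raw material.
slope = (y2 - y1) / (x2 - x1)

# Intercept: total cost when the raw material cost is zero.
intercept = y1 - slope * x1

planned_raw_material = 10.0
estimated_total_cost = intercept + slope * planned_raw_material
print(round(slope, 2), round(intercept, 2), round(estimated_total_cost, 2))
```

With a slope of 0.7 and an intercept of 5, a raw material cost of 10 crore corresponds to a total cost of about 12 crore.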
REGRESSION LINE
We will only examine linear relationships.
Y (dependent variable) = a (Y-intercept) + b (slope) * X (independent variable)
b = (Y2 - Y1) / (X2 - X1)
Estimating equation: $\hat{Y} = a + bX$
Choosing the best-fitting line:
Add the errors (take the lowest sum): individual differences may be +ve or -ve and will cancel out.
Add the absolute values of the errors (take the lowest sum): this does not single out a large individual deviation, so it does not stress the magnitude of the error.
So, square the errors: this penalizes large absolute deviations; take the line with the least sum of squares.
Mathematically: $b = \dfrac{\sum XY - n \bar{X} \bar{Y}}{\sum X^2 - n \bar{X}^2}$
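A minimal sketch of the least-squares slope formula above. The intercept a = Y-bar - b * X-bar is the standard companion formula and is an addition here, since the notes break off after the slope; the data points are hypothetical and lie exactly on the line Y = 1 + 2X.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y-hat = a + b * X that minimizes squared error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: (sum XY - n * X-bar * Y-bar) / (sum X^2 - n * X-bar^2).
    b = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (
        sum(x * x for x in xs) - n * x_bar ** 2
    )
    # Intercept (assumed companion formula, not shown in the notes).
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data lying exactly on Y = 1 + 2X.
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

When the points do not fall on a single line, the same function returns the line whose sum of squared errors is smallest.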